TY - GEN
T1 - LLMarking
T2 - 12th ACM Conference on Learning @ Scale, L@S 2025
AU - Wang, Hanling
AU - Chi, Banghao
AU - Wu, Yufei
AU - Chen, Kexin
AU - Wu, Di
AU - Liu, Songning
AU - Li, Yiwei
AU - Niu, Hanyan
AU - Zhu, Xiaohui
N1 - Publisher Copyright:
© 2025 Owner/Author.
PY - 2025/7/17
Y1 - 2025/7/17
N2 - With the advancement of educational technology, automatic assessment systems are becoming increasingly essential, particularly for grading short-answer questions. However, due to the inherent ambiguity and complexity of language, automatic grading of short-answer questions remains a challenge. Traditional grading methods are often time-consuming and subjective, highlighting the need for efficient, objective, and feedback-driven solutions. This paper proposes an innovative approach to automatic short answer grading (ASAG) utilizing large language models (LLMs). We introduce a specialized design for crafting questions and corresponding answers, named the Key Point Scoring Framework (KPSF), which significantly enhances the model's performance in ASAG tasks and improves the flexibility and objectivity of assessments. Moreover, we incorporate Prompt Dynamic Adjustment (PDA), which continuously refines the grading process, effectively handling ambiguous student responses while ensuring reliable results. To evaluate our approach, we develop a multidisciplinary dataset and incorporate a real-world dataset from actual exams. The experimental results demonstrate that our ASAG approach provides educators with a highly efficient, flexible, and accurate tool for short-answer assessments, indicating a significant advancement in automatic grading technology.
AB - With the advancement of educational technology, automatic assessment systems are becoming increasingly essential, particularly for grading short-answer questions. However, due to the inherent ambiguity and complexity of language, automatic grading of short-answer questions remains a challenge. Traditional grading methods are often time-consuming and subjective, highlighting the need for efficient, objective, and feedback-driven solutions. This paper proposes an innovative approach to automatic short answer grading (ASAG) utilizing large language models (LLMs). We introduce a specialized design for crafting questions and corresponding answers, named the Key Point Scoring Framework (KPSF), which significantly enhances the model's performance in ASAG tasks and improves the flexibility and objectivity of assessments. Moreover, we incorporate Prompt Dynamic Adjustment (PDA), which continuously refines the grading process, effectively handling ambiguous student responses while ensuring reliable results. To evaluate our approach, we develop a multidisciplinary dataset and incorporate a real-world dataset from actual exams. The experimental results demonstrate that our ASAG approach provides educators with a highly efficient, flexible, and accurate tool for short-answer assessments, indicating a significant advancement in automatic grading technology.
KW - automatic grading
KW - key point scoring
KW - large language models
KW - natural language processing
KW - prompt engineering
UR - https://www.scopus.com/pages/publications/105013071281
U2 - 10.1145/3698205.3729551
DO - 10.1145/3698205.3729551
M3 - Conference Proceeding
AN - SCOPUS:105013071281
T3 - L@S 2025 - Proceedings of the 12th ACM Conference on Learning @ Scale
SP - 105
EP - 115
BT - L@S 2025 - Proceedings of the 12th ACM Conference on Learning @ Scale
PB - Association for Computing Machinery, Inc.
Y2 - 21 July 2025 through 23 July 2025
ER -