Empirical Evidence that Formative Assessments Improve Final Exams

By Barbara Lentz, Wake Forest University Law School

Article: “Empirical Evidence that Formative Assessments Improve Final Exams” by Carol Springer Sargent and Andrea A. Curcio, 61 J. Legal Educ. 379 (Feb. 2012).

Are you integrating formative assessment in your courses to satisfy ABA Standard 314, but wondering whether the time you spend developing, implementing, and evaluating student work will improve students’ learning? In the article “Empirical Evidence that Formative Assessments Improve Final Exams,” Professors Carol Springer Sargent and Andrea A. Curcio report that shifting away from a single summative assessment to a combination of formative assessments over the term, coupled with a cumulative final exam, produced a measurable performance improvement for 70% of the law students in the course. The authors describe their study, review types of and benefits from formative assessments, list practices to improve the effectiveness of feedback, and seek to explain why the term-long formative feedback boosted performance only for students in the top 70% of the cohort by LSAT score or undergraduate GPA (UGPA), regardless of first-year law school grades.

Their study was based on data collected in two consecutive fall terms of a large, doctrinal Evidence course taught by the same experienced professor. The control section was taught using a problem method with case analysis, and students were assessed solely by a cumulative final exam that comprised the full course grade. In the following year, the intervention section was taught using the same problem method with case analysis, but students also received a series of formative assessments: five ungraded quizzes, a graded midterm, model answers, grading rubrics, and a self-reflective exercise. The final exam in the intervention section comprised 83% of the course grade (allocated so that the final course grade would correspond primarily to the final summative assessment). The authors compared the scores of students in the two sections on eleven common final exam questions. The difference in common-question scores between the sections, 3.02466 points out of 50 points, is about half a letter grade (6.048 percent). Id. at 391. The effect for the top 70% of the intervention class, however, was 4.595 points out of 50, or almost a full letter grade (9.19 percent). Id.
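The conversion from raw points to percentages and letter-grade fractions can be checked with a few lines of arithmetic. The sketch below is illustrative only: the function names are hypothetical, and the 10-percentage-points-per-letter-grade scale is an assumption used to make sense of "half a letter grade," not something stated in this review.

```python
# Illustrative arithmetic only; names and the grade scale are assumptions.

# Assumption (not from the article): one letter grade spans 10 percentage points.
PERCENT_PER_LETTER_GRADE = 10.0


def points_to_percent(points: float, max_points: float = 50.0) -> float:
    """Express a raw point difference as a percentage of the maximum score."""
    return points * 100.0 / max_points


def letter_grade_fraction(points: float, max_points: float = 50.0) -> float:
    """Express a raw point difference as a fraction of one letter grade."""
    return points_to_percent(points, max_points) / PERCENT_PER_LETTER_GRADE


if __name__ == "__main__":
    # Point differences reported in the review (out of 50 common-question points).
    for label, points in [("whole class", 3.02466), ("top 70%", 4.595)]:
        pct = points_to_percent(points)
        frac = letter_grade_fraction(points)
        print(f"{label}: {pct:.2f} percent, about {frac:.2f} of a letter grade")
```

Under that assumed scale, the whole-class difference lands a little above half a letter grade and the top-70% effect lands just under a full letter grade, which matches the characterization above.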

Before discussing the implications of their results, the authors reviewed best practices for implementing formative assessments to improve learning. Formative assessments improve learning by helping students identify misconceptions and knowledge gaps and by motivating or refocusing studying. In their view, the most effective formative feedback explains to students why an answer is correct instead of merely showing the correct answer. According to the authors, other strategies that make feedback more effective are providing suggestions to improve performance and delivering feedback close in time to the assessment. Id. at 381-82.

The professors and students valued types of formative assessment differently. The authors posit that ungraded feedback may be more helpful in improving learning because it focuses the student on suggestions for improvement rather than solely upon the grades. Id. at 382. However, student evaluations showed that while many students found all the formative materials to be helpful, the graded midterm was valued more highly. Students commented that model answers, grading rubrics, and professor comments were the most helpful feedback, while peer edits and self-reflections were the least useful. Indeed, more than one-third of students surveyed found the self-reflective exercises unhelpful. Id. at 392.

Sargent and Curcio write that the demonstrated benefit from formative assessment disproportionately accrued to the top 70% of students (arranged by LSAT or UGPA). Students in the intervention section with below-median first-year law school grades did show improved performance compared to the control group, but only if those students were in the top two-thirds of the class on UGPA or LSAT scores. Id. at 400. By correlating LSAT and UGPA with performance on the final exam, this study showed that the bottom 30% of students by LSAT or UGPA (regardless of first-year grades) either did not or could not use the information from formative assessments to monitor and improve the quality of their work as measured by performance on the final exam. Likewise, a prior study of similar design (comparing civil procedure courses taught by different instructors in the same term) showed that practice essays helped improve final exam performance only for students with above-median LSAT scores and UGPAs.

The authors presented potential explanations for the disproportionate allocation of benefit from formative assessment. First, it is possible that not all students are able to use feedback to improve. Students’ ability to calibrate what they know and don’t know is a metacognitive skill. Id. at 395-96. There may be a difference in students’ metacognitive abilities. If the top 70% of students possessed stronger metacognitive skills, they would be better able to process and apply information gleaned from formative assessment to improve their learning and subsequent performance on final exams. Id. at 384. The top 70% cohort may also have higher confidence in their abilities to effectively use feedback, which improves their abilities to better self-monitor and calibrate their comprehension. Id. at 400. Additionally, it is possible that students in the top 70% of the LSAT or UGPA cohort are more motivated by grades. Id. at 396.

While the article provides evidence that formative assessment improves performance, the authors disclose three caveats. First, they are not able to identify which types of formative assessment led to higher exam scores in the intervention section. While students did not report the self-reflective exercises to be helpful, student perceptions “are not a direct measure of the actual helpfulness of the materials.” Id. at 395. Second, it is possible that students perform better when they know their performance is being measured (the Hawthorne effect). Id. at 398. Finally, the authors questioned whether the formative assessments, which they described as practice materials, might inadvertently encourage performance-oriented goals rather than encouraging deeper mastery learning. “In other words, do [formative] practice materials support those whose main goal is to get higher course grades rather than assisting those who wish to truly comprehend and master the content?” Id. at 399.

The authors did not believe that the shift to formative assessment unreasonably burdened faculty, particularly when faculty time over the entire term was considered. “While drafting the questions, model answers, rubrics, and self-reflective exercises initially takes a few hours, those materials do not need updating each term. Grading a short midterm also takes a few hours, but may result in faster final exam grading due to better quality responses. Alternately, giving a midterm may justify a shorter final exam, thereby reducing time spent grading final exams.” Id. at 400. Because formative feedback improves performance by explaining why an answer is incorrect, it may be possible to produce the same learning effect without administering and individually grading a midterm, further minimizing instructor effort. The authors suggest a future study to discern whether providing a model answer to an ungraded midterm might produce a learning effect similar to that of individually graded midterms.

Finally, they observed that completing formative exercises and providing feedback reduced the amount of traditional instruction time relative to the control section. Thus, for the approach to pay off, the feedback must improve learning by more than the lost class time for covering material would have. Id. at 398. However, the substantial improvement shown by the intervention group suggests that reducing time for traditional instruction in favor of formative assessment may improve student learning, at least as measured by final exam scores, for the majority of students.