Research Indicates No Relationship between Student Standardized Test Scores and Quality of Teacher Performance

By Stewart Brekke
Recent research has shown that there is little or no correlation between teacher performance and student standardized test scores, even when statistical enhancements are applied.
Recent research from the University of Southern California has shown a “weak or nonexistent” relationship between state-administered value-added model (VAM) scores and the content and quality of teacher instruction. The study questions whether VAM data are helpful in evaluating teacher performance and influencing teacher instruction.
Value-added modeling (VAM), according to Wikipedia, is a “method of teacher evaluation that measures the teacher’s contribution in a given year by comparing current test scores of their students to the scores of those same students in previous school years, as well as to scores of other students in the same grade.” In this way, VAM tries to isolate the contribution each teacher provides in a given year. It also tries to separate the teacher’s contribution to test scores from factors outside the teacher’s control that strongly affect student test performance, such as the students’ general intelligence, poverty, and parental involvement. The study also found no relationship between multiple-measure teacher effectiveness ratings, which combine value-added measures with survey and observational grades, and the content of teacher instruction in the school classroom.
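The basic arithmetic behind a value-added score can be illustrated with a deliberately simplified sketch (this is not any state’s actual model, and all scores below are invented for illustration): regress students’ current scores on their prior-year scores, then treat the average residual of a teacher’s students as that teacher’s “value added.”

```python
# Simplified, illustrative value-added calculation. Real state models are
# far more elaborate; this only shows the core idea. All data are made up.
import numpy as np

prior   = np.array([62.0, 71.0, 55.0, 80.0, 66.0, 74.0])  # last year's scores
current = np.array([65.0, 75.0, 54.0, 86.0, 70.0, 73.0])  # this year's scores
teacher = np.array(["A", "A", "A", "B", "B", "B"])         # teacher assignment

# Fit current = a + b * prior by ordinary least squares.
X = np.column_stack([np.ones_like(prior), prior])
coef, *_ = np.linalg.lstsq(X, current, rcond=None)
predicted = X @ coef
residuals = current - predicted  # growth beyond what prior scores predict

# A teacher's "value added" is the mean residual of his or her students.
value_added = {t: float(residuals[teacher == t].mean())
               for t in np.unique(teacher)}
print(value_added)
```

Even in this toy version, the residuals absorb everything the prior-score regression cannot explain, including the out-of-school factors discussed below, which is precisely the critics’ objection.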
Policy initiatives such as the Race to the Top Fund and No Child Left Behind have created pressure to improve the methods by which teachers are evaluated. Many of these methods focus on teacher contributions to student learning as measured by standardized test scores. One of the study’s researchers stated that the findings show that comparing test scores does not illustrate “good teaching” and that it is “troubling” that more and more states are using students’ test scores to make “a wide array of decisions” relating to teachers. In other words, the study indicated that state value-added performance measures do not reflect the content or quality of teachers’ instruction. Notably, the study was funded by the Bill and Melinda Gates Foundation, a well-known advocate of using test scores in teacher evaluations.
Certainly, every classroom should have a professionally trained and well-educated teacher, and school systems should recruit, prepare, and retain teachers who are well qualified to teach their students. Principals, or their designees, usually evaluate teachers primarily by observing them in their classrooms. Over 90% of teachers are given satisfactory ratings, but critics of this method maintain that because reading and mathematics scores are low, especially in minority neighborhoods, the fault must lie with the teachers, and that a better method of evaluation must be found.
The problem of evaluating teachers was thought to be partially solved by giving great weight to students’ standardized test scores, especially in mathematics and reading, when evaluating, rewarding, or removing a teacher. A few states are now considering giving a 50% weight in teacher evaluation and compensation decisions to student scores in mathematics and reading. Some school districts, such as those of New York City, the District of Columbia, and Chicago, have used the scores to identify teachers for bonuses, further training, or retention. Louisiana has enacted value-added modeling as a method of identifying strong teachers and effective pedagogical techniques, as well as weaker teachers.
Based on this study, these evaluation schemes using student test scores may prove erroneous and unfair. Indeed, there has been broad agreement among statisticians, those who measure test performance, and economists that student test scores alone are not sufficiently reliable and valid indicators of teacher effectiveness, and that they should not be used in high-stakes personnel decisions. As the study has shown, this is true even when sophisticated methods such as value-added modeling are used.
Other factors besides the student’s immediate teacher have been identified as having great influence on student learning gains. One factor is the student’s other teachers, such as previous teachers and, in high school, current teachers of other subjects. Tutors and other instructional specialists have also been found to produce large achievement gains. School conditions that affect learning gains include the quality of curriculum materials, specialist or tutoring supports, and class size.
Student test score gains are also greatly affected by school attendance and by many out-of-school learning experiences: in the home, with peers, at museums and libraries, in summer programs, online, and in the community. Parents who are well educated and supportive can help their children with homework and secure other advantages for them. In contrast, other children have parents who, for a variety of reasons such as income, health, resources, and family mobility, cannot support them academically, and such factors may also influence test score gains. Neighborhood influences, such as violence in the community and peers who may be more or less advantaged, may likewise affect test scores that supposedly measure learning improvements.
Methods used to statistically adjust student test scores for demographic factors and school differences, such as value-added modeling, do not produce better teacher effectiveness scores. Teachers remain disadvantaged by these measures if they teach new English learners, special education students, or low-income students rather than the children of wealthier parents and educationally advantaged students.
Further, the non-random assignment of students to classrooms and schools, often based on natural test-taking ability such as innate intelligence, makes it difficult to accurately judge one teacher against another on the basis of students’ test scores, despite the use of value-added methods. Also, tying teacher evaluation to test scores may discourage teachers from wanting to work in schools or classrooms with the neediest students, and the great variation in such results, along with their perceived unfairness, will certainly undermine teacher morale. In fact, various surveys have found that teacher attrition and demoralization are associated with student test-based accountability efforts, especially in high-need schools.
Value-added scores appear to capture teacher effects more strongly for mathematics than for language. This may be due to widely used, poorly constructed tests of reading and language skills. However, the difference may also reflect the fact that teachers simply have less influence on language development: language skills are learned from a number of sources, mainly the family, while mathematics skills are learned primarily in school. Moreover, if all teachers are to be judged on their students’ test scores, how can one produce standardized tests in subjects such as art, music, physical education, and library work to evaluate teachers of those subjects?
Finally, using student test scores to evaluate teachers, and connecting those scores to rewards and sanctions, will probably produce inaccurate personnel decisions and demoralize teachers. Such use of scores will cause talented teachers to avoid high-need schools and classrooms, may drive teachers out of the profession entirely, and may discourage many potentially effective teachers from entering it. At this point, legislatures, such as those in Louisiana and Illinois, should avoid implementing a teacher evaluation system that is unproven, and even disproved by the recent research mentioned in this article. Such evaluations, based on student test scores, will probably harm not only teachers but also their students.
1. Polikoff, M. “Instructional Alignment as a Measure of Teacher Quality.” Educational Evaluation and Policy Analysis (May 2014).
2. Baker, E. L., et al. “Problems with the Use of Student Test Scores to Evaluate Teachers.” Briefing Paper No. 278. Washington, DC: Economic Policy Institute (August 2010).
3. Wikipedia: “Value added modeling” (online).
4. American Educational Research Association. “Study: State Value Added Performance Measures Do Not Reflect the Content or Quality of Teachers’ Instruction” (May 13, 2014) (online).
Stewart Brekke is a retired high school physics, chemistry, and mathematics teacher who taught primarily in the inner-city high schools of the Chicago Public Schools. He holds a PhD from the International University for Graduate Studies in Arts and Sciences, and spends his retirement presenting scientific papers on nuclear physics and astrophysics and writing articles on educational subjects.