The central feature of the Obama administration’s $5 billion “Race to the Top” program was sharply refuted last week by the American Statistical Association, one of the nation’s leading scholarly organizations. Spurred on by the administration’s combination of federal cash and mandates, most states are now using student test scores to rank and evaluate teachers, a method called value-added measurement, or VAM. Teachers’ compensation, tenure, bonuses and other rewards and sanctions are tied directly to the rise or fall of their students’ test scores, which the Obama administration considers a good measure of teacher quality.
Secretary Arne Duncan believes so strongly in VAM that he has threatened to punish Washington state for refusing to adopt this method of evaluating teachers and principals. In New York, a state court fined New York City $150 million for failing to agree on a VAM plan.
The ASA issued a short but stinging statement that strongly warned against the misuse of VAM. The organization neither condemns nor promotes the use of VAM, but its warnings about the limitations of this methodology clearly demonstrate that the Obama administration has committed the nation’s public schools to a policy fraught with error. ASA warns that VAMs are “complex statistical models” that require “high-level statistical expertise” and awareness of their “assumptions and possible limitations,” especially when they are used for high-stakes purposes as is now common. Few, if any, state education departments have the statistical expertise to use VAM models appropriately. In some states, like Florida, teachers have been rated based on the scores of students they never taught.
The ASA points out that VAMs are based on standardized tests and “do not directly measure potential teacher contributions toward other student outcomes.” They typically measure correlation, not causation. That means that the rise or fall of student test scores attributed to the teacher might actually be caused by other factors outside the classroom, not under the teacher’s control. The VAM rating of teachers is so unstable that it may change if the same students are given a different test.
The ASA’s most damning indictment of the policy promoted so vigorously by Secretary of Education Arne Duncan is:
“Most VAM studies find that teachers account for about one percent to 14 percent of the variability in test scores, and that the majority of opportunities for quality improvement are found in the system-level conditions. Ranking teachers by their VAM scores can have unintended consequences that reduce quality.”
The ASA points out:
“This is not saying that teachers have little effect on students, but that variation among teachers accounts for a small part of the variation in scores. The majority of the variation in test scores is attributable to factors outside of the teacher’s control such as student and family background, poverty, curriculum, and unmeasured influences.”
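The distinction the ASA draws here, between how much a teacher matters and how much of the spread in scores teachers explain, can be made concrete with a small simulation. The sketch below is illustrative only and is not the ASA's model or any state's VAM: the function names, the 3-point spread in teacher effects, the 10-point spread from everything else, and the class sizes are all assumptions chosen so that the teacher share lands inside the 1 to 14 percent range the statement cites.

```python
import random
import statistics


def teacher_variance_share(teacher_sd: float, other_sd: float) -> float:
    """Share of score variance due to teachers, assuming the teacher
    effect and all other influences are independent."""
    return teacher_sd**2 / (teacher_sd**2 + other_sd**2)


def simulate_share(n_teachers=200, class_size=25,
                   teacher_sd=3.0, other_sd=10.0, seed=0):
    """Simulate scores as baseline + teacher effect + everything else,
    then return var(teacher effect) / var(score) across all students.
    All parameter values are hypothetical, not estimates from real data."""
    rng = random.Random(seed)
    effects = []  # teacher effect attached to each student
    scores = []   # simulated test score for each student
    for _ in range(n_teachers):
        effect = rng.gauss(0.0, teacher_sd)
        for _ in range(class_size):
            effects.append(effect)
            # "Everything else": family background, poverty, curriculum, noise.
            scores.append(70.0 + effect + rng.gauss(0.0, other_sd))
    return statistics.pvariance(effects) / statistics.pvariance(scores)


if __name__ == "__main__":
    # Analytic share: 3^2 / (3^2 + 10^2) = 9 / 109
    print(round(teacher_variance_share(3.0, 10.0), 3))  # 0.083
    # Simulated share; it varies with the seed but stays near the analytic value.
    print(round(simulate_share(), 3))
```

Under these assumed numbers, a teacher one standard deviation above average still lifts every student in the class by about three points, a real effect, yet teachers as a group explain only about 8 percent of the variation in scores because the other influences vary so much more.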
As many education researchers have explained, including in a joint statement by the American Educational Research Association and the National Academy of Education, the VAM ratings of those who teach children with disabilities and English language learners will be low, because these children have greater learning challenges than their peers. So will the ratings of those who teach gifted students, because that group has already reached a ceiling. Those two organizations, like the ASA, agreed that test scores are affected by many factors besides the teacher: not only the family, but also the school’s leadership, its resources, class size and curriculum, as well as the student’s motivation, attendance and health. Yet the Obama administration and most of our states are holding teachers alone accountable for student test scores.