Frankly, I’ve always perceived such efforts as more than a little inadequate to the task at hand. Shit, after all, even on a pinecone, is still shit. Likewise, there’s a good reason why the makers of incense don’t market a patchouli and crap stick. As we say in the South, you can “pretty up” a pig by slapping a dress on it, but in the end, it’s still a pig.
Despite their insistence that the new SAT will better predict student ability while reducing unfairness by eliminating culture-bound items like analogies, the announced changes actually overlook the largest problems with standardized tests.
In fact, whatever cultural bias the ETS has eliminated with the ban on analogies will likely be reintroduced with the addition of a “writing” section, whose graders will no doubt reward stylistically and grammatically Standard English and mark down students whose writing employs idioms, phrases, or merely word patterns more common to communities of color. Poetic license will have no place, one suspects, on the SAT writing test.
The fact is, even if such biased items are removed from the SAT, the unequal educational experience of the students taking the test—especially in terms of class and race—all but guarantees a persistent scoring gap between whites on the one hand, and blacks, American Indians or Latinos on the other.
But indeed, even tracking isn’t the biggest issue here. Oh sure, it matters. On the one hand it means that certain students of color will be underexposed to the kind of material found on a test like the SAT; and on the other hand it means that certain students (especially whites and many Asians who are presumed to be “good at math” early on, and thus tracked accordingly) will have an edge going into the test. But still, tracking is not the clincher that makes the SAT inherently problematic.
The first is what Claude Steele, Chair of the Psychology department at Stanford University, has called “stereotype threat.” As Steele and his colleagues have noted in a number of ingenious experiments, black students take standardized tests under a cloud of group suspicion that hinders performance: suspicion on the part of the larger society that they are less intelligent and capable than others.
As proof that it is stereotype threat, and not inherent differences in ability, that explains racial gaps on standardized admissions tests, Steele notes what happens when the same test questions are given to whites and blacks in experimental settings, but the students are told that the results are not indicative of ability and will not be graded: the stereotype threat dissipates, and black students perform as well as their white counterparts.
And finally, racial gaps are ultimately a function of the way that tests like the SAT are developed. Indeed, the gaps are all but built-in.
But as ETS concedes, questions chosen for future use must produce (in the pre-test phase) gaps between test-takers similar to those that existed on the overall test taken at that time. In other words, questions are rarely if ever selected for future use if students who received lower scores overall answer that particular question correctly as often as, or more often than, those who scored higher overall.
In practice, questions answered correctly more often by blacks than by whites have been routinely excluded from future use on the SAT. Although questions that whites answer correctly 30% more often than blacks are allowed to remain on the test, questions answered correctly even 7% more often by blacks than whites have been thrown out.
Essentially, the company’s position is that for any question to have “predictive validity,” it should be answered correctly or incorrectly in rough proportion to the overall number of correct or incorrect answers given by test-takers. But since the general scores have tended to exhibit a racial gap, such logic results in the virtual guarantee of maintaining that gap, as a function of test development itself.
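To make that logic concrete, here is a minimal sketch of the kind of screening rule described above. It is an illustration only, not ETS's actual procedure: the upper-half versus lower-half comparison, the function names, and the six-person pre-test data are all assumptions made up for the example.

```python
# Hypothetical screening rule: keep a pre-test question only if test-takers
# with higher overall scores answer it correctly more often than those with
# lower overall scores. All names and data here are invented for illustration.

def split_by_total(total_scores):
    """Return (low_half, high_half) as lists of test-taker indices."""
    ranked = sorted(range(len(total_scores)), key=lambda i: total_scores[i])
    half = len(ranked) // 2
    return ranked[:half], ranked[half:]

def discrimination(item_correct, total_scores):
    """Correct-answer rate of the high scorers minus that of the low scorers."""
    low, high = split_by_total(total_scores)

    def rate(indices):
        return sum(item_correct[i] for i in indices) / len(indices)

    return rate(high) - rate(low)

def select_items(pretest, total_scores, min_gap=0.0):
    """Keep only questions whose high-minus-low gap exceeds min_gap."""
    return [
        name
        for name, answers in pretest.items()
        if discrimination(answers, total_scores) > min_gap
    ]

# Six made-up test-takers, listed from lowest to highest overall score.
total_scores = [400, 450, 500, 600, 650, 700]

# 1 = answered correctly, 0 = answered incorrectly.
pretest = {
    "item_a": [0, 0, 1, 1, 1, 1],  # high scorers do better: kept
    "item_b": [1, 1, 1, 1, 0, 1],  # low scorers do better: dropped
    "item_c": [1, 0, 1, 1, 0, 1],  # no difference: dropped
}

selected = select_items(pretest, total_scores)
```

Under these assumptions only `item_a` survives. If the lower-scoring half of the pool is disproportionately black, a question like `item_b` never makes it onto a future test, whatever its difficulty.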
Interestingly, as testing expert Jay Rosner has demonstrated, the makers of the SAT could reduce the racial gap between whites and blacks while still maintaining the same level of overall test difficulty by choosing questions that, although equally tough, produce less differentiation between white and black test-takers. That they instead maximize these differences through the questions they choose, and that ETS is offering no reforms of this nature, indicates how unconcerned they truly are about test fairness.
SAT gaps of as many as 300 points between two students (or groups of students) can be completely insignificant in terms of indicating actual ability differences. Gaps of 125 points between students are considered random by the test-makers themselves and say nothing about the different abilities of the students in question.
If ETS wants to promote fairness—and indeed they insist that they are committed to changing the unequal educational system that helps produce scoring gaps—they must first stop promoting a test battery that replicates and reinforces that inequity. If they wish to provide tests purely for the purpose of gauging how much is being taught and learned in K-12 schools, so be it.