Why the Metaphors Matter

Glenn Arbery, Ph.D.

No one in this room doubts the central place of testing in contemporary education. Regardless of what you think about it-whether, for example, you believe that state-mandated testing (one of many kinds) gives every teacher and every school a strong incentive not to neglect a single child’s mastery of essentials, or whether you believe that such testing unduly emphasizes measurable outcomes for political reasons-the fact of testing is the elephant in every classroom: impossible to miss and hard to get out the door.

When I was flying to Indianapolis a couple of weeks ago, I picked up the in-flight magazine and found an article called “Ace the Test,” instructing parents about the ins and outs of getting their children ready for standardized assessment tests. It’s easy to understand why such an article would be there: a vast parental anxiety is abroad, not least among parents at that very moment flying away from home. But as I was reading, a few sentences at the beginning of the article struck what seemed to me an especially ominous note for this conference. I kept the magazine so I could quote them: “Educational experts and parents are debating the efficacy of standardized tests and their effect on classrooms. But philosophical sparring won’t stave off testing day, not with federal funding riding on the imposition of standardized assessments. So while the talking heads talk, parents need to do their homework, say educators and testing experts.”

What bothered me, of course, was the ease with which thinking about testing was dismissed. “Philosophical sparring won’t stave off testing day.” There it was-the reason not to have this conference at all. Notice, for example, the great, impersonal feel of “testing day”-the varieties of foreshadowing dread it summons up. You glimpse for an instant, reading this sentence, the overwhelming momentum with which testing day rolls down the calendar toward your unsuspecting child (and here you are on an airplane). Notice, too, the word “sparring,” which replaces the word “debating.” In boxing, sparring is what you do in practice or an exhibition, more gesture than substance-not the real thing, in other words. You don’t really do any damage sparring. All these people talking about whether testing is good or bad are just gesturing at each other. Tests are the real thing, this article is saying, and to spend time thinking about them instead of getting ready for them is to be a “talking head.” The phrase originally described the close-up shot of TV newsmen and pundits-but here talking heads are unattached and ineffectual intellects, sparring without bodies. Don’t be like that, this article is saying. You need to keep your child from being crushed on testing day, and here’s how.

Fear is a pretty powerful thing-or maybe it stops just short of fear, at a level of general anxiety. Testing is by now a large, impersonal social force, identifiable with no one person, affecting not just schoolchildren and their parents, but teachers whose livelihoods depend upon the performances of their students on tests, principals whose schools are judged by their rankings on test scores, and whole districts that either look shabby or sleek next to other districts and their performances. Funding rides on compliance with this system. Private schools escape state-mandated tests, but they often develop an even more competitively intense matrix of testing, encouraging their students to take the SAT in the seventh grade, for example, to familiarize them with it before they even begin their work in high school. So the question at the outset is this: can we think seriously about testing, or is it too pervasive, too overpowering a reality, requiring accommodation but not deliberation? Are we too deeply implicated in it to consider what we’re doing? The most compelling way that I know to put the question is this: does it make any difference to think about testing, or is it like thinking about, say, the nature of taxation, an exercise that might involve philosophic sparring but that won’t stave off tax day?

Obviously, we think that the issue is worth bringing up again, because, as Tocqueville said in Democracy in America, political communities are not merely subject to “inflexible providence or to a blind fatality.” We are free to deliberate. In this conference, we want to address the real climate of education, which is fear or anxiety, but not by treating this issue as an exercise in problem-solving. As those of us on the faculty have talked about the topic over the past few months, what has emerged is that we need to make at least a beginning toward thinking through our situation as seriously and radically-that is, in its roots-as we can, and it has seemed to us that the issue is about more than testing as the way of measuring achievement: it is about measurement itself and about what measurement means when it is applied to the human mind, somehow treated as a distinct entity separated from emotional life, from familial and cultural connectedness, from soul and spirit. Thinking about measurement naturally leads to the idea of what the measure or the standard of measure is or should be, not just for a product of the educational system, but for who we are as human beings. The measure of who we are as human beings ought to govern the standards of measurement used in education.

So the question is who we are-individually, as communities, as citizens of a democracy committed to self-government. If the immediate, almost visceral answer is that we are not any one thing, then the next question is whether standardized testing tries to make students into one thing by using a common measure. Not necessarily, because a good measure can deal with very different things in that one respect. Think of what you can measure with a ruler, for example: a desktop, a penciled line on a piece of paper, a hand. But if there were a test containing twelve algebra problems I should solve, and I answered all twelve but missed the second, fifth, and eighth, would that be like nine inches on a ruler? In that case, you would have to imagine something that was nine inches long only because the second, fifth, and eighth inches were disqualified as inches. If the test isn’t like a measure of surfaces, then what about something like volume? If each answer were a quantity like a fluid ounce, let’s call it a knowledge-ounce, and the test were a container that potentially held twelve of these, the total of my right answers would fill it to the level of nine knowledge-ounces. If it were simply a matter of how many knowledge-ounces of algebra I could pour into the test, then this would be an adequate measure. The test would assume that everyone had been taught the same things and that everyone ought to be able to answer the questions, and it would measure how well each different person could fill the test container with knowledge-ounces from what all had been given in common. On the basis of this measure, one could compare my knowledge of algebra and someone else’s knowledge without any reference to who we were, on the assumption that all knowledge-ounces of algebra were the same. One could average the results.
If I had nine, somebody else had four, and somebody else all twelve, then the average amount poured into the test would be a little over eight knowledge-ounces per student out of a possible twelve.

The difficulty with this metaphor is that the act of thought represented by an answer is not really in itself a quantity or volume, any more than remembering to buy coffee on the way home from work is a quantity. It can be treated as a quantity-one act of thought-but that is not its nature. We know that no act of thought ever takes place without reference to the whole situation of the person doing the thinking, which is why we distinguish artificial intelligence from human intelligence. We know that all kinds of things, especially emotions, not only influence but even give rise to the act of thought. The problem with the metaphor of a container that measures knowledge-ounces is that wrong answers, which are also acts of thought, are simply absent from the container, and therefore they can’t reveal anything but their absence. The particularities of wrongness might be especially revealing for particular students; any teacher knows that very well. But in a test understood as this measurement of acts of thought treated as quantities, individual particularities are not the point. In this respect, testing is univocal; it assumes that the knowledge in each person can be treated in exactly the same way. This impartiality is itself an implicit or unstated metaphor of knowledge as quantifiable units of thought-acts distributed by means of teaching. It seems to me that the larger metaphor is something like this: the knowledge and skills essential for the continued efficient functioning of the workforce constitute a publicly owned utility, supported by taxes, and understood to be as vital for the well-being of the populace as a whole as electricity or water. Under the social contract, the essential knowledge and skills are legally mandated and must be provided to every child. They are delivered more or less efficiently through the schools, teacher by teacher, and the efficiency of the delivery is metered student by student through testing.
The knowledge-volume in each student belongs to the students in something like the way that water from the public water supply belongs to the consumer, but the students owe the state a return on the service; otherwise, it could not be exacted in this way on tests and metered. For example, I actually owe all twelve algebra knowledge-ounces; nine is understood as acceptable, given a certain predictable level of waste, but the job of the teacher is to minimize spillage, in part by getting me to care more about it.

Maybe I don’t have this implicit metaphor right, but education imagined or felt as a giant utility like electricity or water seems to me to accompany state-mandated testing and to be internalized in the tested student as the most comprehensive metaphor for what he or she learns in school. First of all, though, what is a metaphor? In the Greek roots of the word, it is a carrying across or over. The word transfer has exactly the same root meaning in Latin. A metaphor is a transfer. The Oxford English Dictionary (an enormous collaborative enterprise published between 1884 and 1928, most recently dramatized in Simon Winchester’s book The Professor and the Madman) defines it as “The figure of speech in which a name or descriptive term is transferred to some object different from, but analogous to, that to which it is properly applicable.” Describing testing as the metering process for a giant public utility would be such a transfer. Whether it’s accurate is something we need to consider. But metaphors are not only revealing figures of speech; they can also be comparisons concealed in the way we think, and they can be felt in ways that govern what we say and do long before they are brought to the light and examined. In the past few days, I’ve found that cognitive science has a whole branch devoted to metaphor, associated with George Lakoff’s work in what is called conceptual metaphor theory. Lakoff speaks instead of the source domain of a metaphor and the target domain. In his usage, the source domain is “mapped” onto a target domain. If the test is the elephant in the classroom, then what we know about elephants is the source domain that is mapped onto the target domain, which is the effect of testing on the classroom.
The assumption of this conference is that finding and addressing implicit metaphors, especially metaphors of measure, is crucially important, because otherwise they will occupy a determining but unexamined role in shaping thought about education and its ends.

Perhaps new metaphors can lead to new ways of thinking. Or perhaps by returning to the original ones we can clarify where we are and how we got here. Ninety years ago, the poet and critic T. E. Hulme wrote that “Every word in the language originates as a live metaphor, but gradually of course all visual meaning goes out of them and they become a kind of counters.” He gives this example: “If I say the hill is clothed with trees your mind simply runs past the word clothed, it is not pulled up in any way to visualize it. You have no distinct image of the trees covering the hill as garments clothe the body.” In other words, clothed is a dead metaphor, used as a conceptual meaning like a bead on an abacus. But, Hulme says, “even this word clothed . . . was probably, the first time it was employed, an attempt on the part of a poet to convey over [to meta-phor, to trans-fer] the vivid impression which the scene gave him.” What I want to do is to explore what happens when we try to bring the dead metaphors in two of the key words we’ll be talking about this weekend-test and assessment-back to life. The philosopher Paul Ricoeur, writing about “the rejuvenation of dead metaphors” in such thinkers as Martin Heidegger, says that “re-animated metaphor once again functions as fable and as redescription, which characterize living metaphor.” But re-animating the metaphors hidden in the words test or assessment seems almost hopeless. The metaphors in them are so dead that no one, certainly including me, would ever think of some original poetry present in them. But it might not be as hopeless as it looks at first glance. I’m teaching a colloquium course in Heidegger’s late essays right now, and his way of getting at the metaphors through the roots of words is one that I will be using, mostly through the etymologies and quotations provided by the OED.

Let me start with test: where did the word come from? What was shocking to me, when I first looked it up in the OED, was the very first definition, dating back to Chaucer in the Canon’s Yeoman’s Tale in 1386: it was originally from the word for a pot, related to the Latin word meaning an earthen vessel. It meant “the cupel used in treating gold or silver alloys or ore.” A cupel, just in case you are like me in not having the slightest idea, is “a small flat circular porous vessel, with a shallow depression in the middle, made of pounded bone-ash pressed into shape by a mould, and used in assaying gold or silver with lead.” Originally, a test was this bone-ash vessel used in assaying. But what did you do when you assayed gold or silver? To assay means to make trial of something, to put it to proof to determine its degree of purity; to assay metals means to put them to the test, literally. Since I have never seen a test used, I had to look up what is done with it in assaying metals. Here’s a recent online description of the process from a contemporary refiner:

The fire assay… has been used since the days of the ancient Egyptians to determine precious metal content. The fire assay begins by combining your sample with pure silver and pure lead in a process called cupelling. The silver / lead sample combination is then placed in a cupel [a test, in other words] (a small bone-ash bowl) where it’s heated in a temperature-controlled furnace until the non-precious metals are absorbed, leaving a small button of precious metal. This button [a small sphere] is combined with nitric acid and de-ionized water in a beaker and heated. The acid slowly dissolves all metals, except gold. (my emphasis)

For anyone who sees this process, it must be very striking, especially since it reveals the almost miraculous natural qualities of a substance that is purified by means of the test that actually absorbs the non-precious metals. It becomes more and more itself. “Of 1031 lbs. weight of lead they had from the taest 14 lbs. weight of silver” reads a citation from 1552, twelve years before the birth of Shakespeare. But Shakespeare must have been the first poet to seize upon the test and testing as a metaphor, as he did with so many things. In his play entitled-very interestingly for this conference-Measure for Measure, he has the puritanical Angelo use the word when Duke Vincentio leaves him in charge of Vienna:

Now, good my lord,

Let there be some more test made of my metal,

Before so noble and so great a figure

Be stamp’d upon it.

We read or hear the word test and pass right over it, as Hulme says we do with the word clothed. But for Shakespeare’s audience it would have been a vivid metaphor; it would bring to mind the image of gold emerging from the bone-ash test in a furnace as a comparison to an assay of Angelo’s character. Angelo, as it turns out, does not stand the test; he is not the gold worthy to have “so noble and so great a figure”-notice the idea of coinage-stamped upon it. Shakespeare uses the word still more metaphorically in Hamlet, when Queen Gertrude says that the sight of the ghost is the “very coinage” of Hamlet’s brain, and Hamlet replies, “it is not madness / That I have utter’d: bring me to the test, / And I the matter will re-word.” If his words were mad, their non-precious quality would be absorbed into the test, the cupel, but he imagines instead his mind emerging in the test as gold that he can refashion into new words, as if into coins with a different stamp. He imagines the source of his language as tested gold. In The Tempest, Prospero tells his future son-in-law Ferdinand, after inflicting various harsh measures on him, “Thou hast strangely stood the test”-that is, he has held up under severe trials to an exceptional degree; he is pure gold; he has a golden nature. Again, this description of someone as golden does not pull us up, because we have almost lost the source domain for the metaphor of gold, even though the finest athletes still compete for it.

What does being golden mean? Gold is the noblest of noble metals, meaning that it will not combine with oxygen-that is, it is practically indestructible, never rusting or corroding, and it therefore naturally symbolizes a lasting glory. “Best of all things is water,” the Greek poet Pindar wrote in the fifth century B.C., “but gold, like a gleaming fire / by night outshines all pride of wealth beside.” It keeps its luster-(“The gold death-mask in the tomb of Tutankhamen looked as brilliant when it was unearthed in 1922 as when it was entombed in 1352 BC.”), and at the same time it is the most ductile and malleable of metals. Almost unbelievably, one ounce can be drawn into a wire more than five miles long or beaten out to 100 square feet; this is what John Donne means when he speaks of “gold to airy thinness beat.” But truly to summon up the world of the metaphor, we have to try to imagine the experience of its qualities, not simply as economic value, but as a complex image; we have to experience it with wonder as a kind of revelation from within the natural order. Obviously gold can be coveted; it is like one of the primal words that Freud says in a famous essay have exactly opposite meanings: think of drug, for example, which means either a medicine or a narcotic. But I want to focus on it not simply understood as something that men value, like oil for the past century, but as a beautiful substance given in nature whose whole quality reveals itself only when it is brought to the test-as if it were meant to be brought to the test, in other words, as if it were intended in nature that the human art of refining gold bring out its full range of qualities.
It is not a mere element among other elements on the periodic table; rather, it is felt as a rich analogy to something best in us, not divine because it occurs in nature but analogous to divinity, something that “outshines all pride of wealth beside” and that has to be brought into its true luster and malleability by an art that puts it to the test, the revealing cupel.

Three hundred years later the word test turns up in the Journal of Educational Psychology sounding like this: “The following chart . . . throws into relief the periods of rapid, normal and slow mental growth as measured by the Binet scale. The Binet tests of 1908 are the only set hitherto devised covering any considerable variety of functions” (my emphasis). Over the course of three centuries, the word completely lost the meaning of cupel and it retained only what had been a figurative or target sense, but now brought into the usage of social science. By now it meant “the process or an instance of testing the academic, mental, physiological, or other qualities and conditions of a human subject,” as the OED puts it. Something about that phrase “human subject” is worrisome. After the faculty panel later this morning, it should be no surprise that the first use of the word in this sense was in 1910, just as Alfred Binet’s intelligence tests were beginning to be appropriated in England and the United States to replace-seriously-measuring heads as the index of intelligence. The word test since then obviously covers everything from driving tests to pregnancy tests. Perhaps the closest we get to the original sense of test-as-cupel is the test tube, which has very different connotations.

What would change in our thinking if the original metaphor for testing were as fresh as it was 400 years ago? We would be thinking of assaying-or essaying (the same word)-students to see whether metaphorical gold or silver emerges in the test and how much of it there is relative to the metaphorical alloy or ore. We would not be sentimental about the refining fires and a certain necessary level of exertion and suffering, but we would have to make some decisions about the nature of the test. Since it would no longer make sense to meter the return on what was put into the students by the schools, we would have to decide how to consider this naturally occurring substance we are trying to find. Is gold as I have described it (as the source domain) sufficiently summed up in the word intelligence, for example, when you map it onto the target domain of human qualities, or does it seem analogous to considerably more than that? And how do we look for this gold? Do we agree that assaying applies only to individuals, or do we agree that it applies to whole groups? At the beginning of his book The Mismeasure of Man, Stephen Jay Gould reminds his readers of the “noble lie” in Plato’s Republic-that human natures, from birth, are gold, silver, or brass and iron. In the imaginary republic, those with golden natures should be rulers, those with silver natures auxiliaries, and those with bronze and iron natures craftsmen or workers. If we think of testing as a way of distinguishing these inborn natures, then it becomes the basis for establishing a hierarchy of merit in a democratic society. IQ testing, as we will see later this morning, was used early in the 20th century in precisely this way. If this is what we as a free political community decide to do, then our assay yields from each group a promising few, the gifted ones; then those somewhat less gifted; then all the rest. 
The test is essentially competitive, and golden individuals emerge with greater and greater clarity in the test while the others are more or less absorbed into the background.

That would be one way we could go. But suppose we agree instead that testing groups to rank them makes no sense, because the test can really apply only to an individual’s qualities, as in Shakespeare’s metaphors. Angelo and Hamlet aren’t talking about how they are ranked but about who they are. The assay is the art of testing the gleaming, indestructible, ductile, and malleable gold in that particular nature. If this is our decision, then we agree in advance that every human being has this gold in some degree and that education should be structured in such a way that everybody is always looking for it, and if one teacher’s assaying cannot bring it out, another’s can. The aim is the disclosure of some excellence. Even though some will have more tested gold than others, even a little, properly worked, as we have seen, goes a very long way. But again, if the qualities of gold are the “source domain,” what do we understand as the “target” here? Gold, goldenness, in a person is not a substance but an activity, a particular act of existence in which each person is most distinctly that person and no other, and it is strangely transforming to be recognized in that particularity, to be transferred, metaphored, back into yourself by a teacher who has found the precious metal in you. Testing, re-imagined with all its metaphorical intensity, would be the way of revealing the malleable excellence-the workable gold-in each particular nature, instead of the way of ranking students, or teachers, schools, and districts, on a scale. Testing in this sense would no longer be a means of measurement, but of bringing forth. Practically speaking, a test would not be a way of metering return on input, but an intense way of allowing the student to experience what can happen with the materials of a course under the intensity of a demanding opportunity to think beyond the given.
In the test, mere repetition of what has been taught would sink into the background, and what would remain in the test-as-cupel is the true measure.

I am getting close to the five-mile mark on my ounce of gold, so let me turn more briefly to the idea of assessment. Barbara Khirallah from the University of Dallas will speak to us tomorrow morning about some of the current ideas about assessment, but I want to say a little now about the metaphor. We use the word “assessment” these days for evaluation of all kinds, but its first use in education occurred only in the mid-1950’s and burgeoned over the past fifteen years. From the mid-1400’s until now, the verb assess has always meant, on the one hand, to fix a fine or a tax owed by a person or a community, or on the other hand, to estimate something’s value for purposes of taxation-for instance, your house compared to your neighbors’ houses in an assessment of who should pay how much property tax. From 1934 on, however, the meaning was also transferred to mean “to evaluate (a person or thing); to estimate (the quality, value, or extent of), to gauge or judge.” At some point in this transference, it was a live metaphor whose source domain was judgment or evaluation of worth for purposes of taxation and whose target domain was student performance, or teacher performance, or school performance. When we use the word, we think that it simply means to gauge or judge, but something of this sense of evaluating for purposes of taxation lingers. Through it, I want to return to our current situation and what seems to me the economic metaphor that dominates our thinking about education.

What could someone have meant-not a Shakespeare-in transferring the word assessment from taxation to education? What was the particular felt sense of it when someone first spoke or thought of assessing a student’s work? I’m going to imagine a purely hypothetical situation. Suppose that someone who had been working as a property tax assessor became a teacher. She was used to evaluating property-the desirability of the house and property in itself, whether improvements had been made since the last assessment, how the neighborhood affected its value, how the market was doing at the time, and so on, all this to determine how much the owner should pay for having the protections and services of the city for that property. If she had thought through it, she would have seen taxes as the price, like military service, for participation in the social contract. Now, in her new job, she was evaluating student work, and she suddenly recognized that she was doing the same thing: assessing. She was trying to discover how much each student should be taxed, that is, what level of performance each one owed for enjoying the services of the educational system under the social contract. What were this boy’s natural gifts? Had he improved on them since the last report period? How much did the other students affect how he did? What was the general level of performance in the school at the time? She called it assessment. It was a judgment of the value of what the student was really capable of doing, given all the circumstances-and not just whether he was doing it, but whether he was giving back what he should give.

Assessment obviously has to take many more things than test scores into account. Nevertheless, what begins to emerge for me once again is the question of why taxation has become the metaphor for education. Certainly the author of the article “Ace the Test” understood taxation as the economic reality supporting the whole system of public education and therefore coloring what schools do with its metaphor or transfer. When taxation is literally the source domain for funding, does it necessarily map the target domain of the classroom and the school? To ask about measure in any truly radical way might mean that we have to step back, not just into the words, but into the whole metaphor of accounting that underlies accountability. But I will leave that to our next speaker.

© The Dallas Institute of Humanities and Culture – Permission is granted to copy and redistribute this lecture on the condition that the content remains complete and full credit is given to the author.

