I was at dinner with a colleague this week, midterm week. Predictably, talk turned to the scourge of all professors: grading essays. There are few tasks in the life of a college professor less fulfilling than grading student essays. Every once in a while a really good essay jolts me to consciousness. I am elated by such encounters. To be honest, however, reading essays is for the most part stultifying. This is not the fault of the students, many of whom are brilliant and exuberant writers. I find it trying to wade through 25 essays discussing the same book, offering varying opinions and theories, while keeping my attention and interest. How many different ways can one ask for a thesis, talk about the importance of transition sentences, and correct grammar? For some time it is fun, in a way. One learns new things and is captivated by comparing how bright young minds see things. But after years, grading essays becomes simply the worst part of a great job.
So how might my colleagues and I react to news that EdX—the influential Harvard-MIT led consortium offering online courses—has developed software that will grade college student essays? I imagine it is sort of like how people felt when the dishwasher was invented. You mean we can cook and feast and don’t have to scrub pots and wash dishes? It promises to allow us to focus on teaching well without having to do that part of our job that we truly dread.
The appeal of computer grading is obvious and broad. Not only will many professors and teachers be freed from unwanted tedium, but also it may help our students. One advantage of computer grading is that it is nearly instantaneous. Students can hand in their work and get a grade and feedback seconds later. Too often essays are handed back days or even weeks after they are submitted. By then the students have lost interest in their paper and forgotten the inspiration that breathed life into their writing. To receive immediate feedback will allow students to see what they did wrong and how they could improve while the generative impulse underlying the paper is still fresh. Computer grading might encourage students to turn in numerous drafts of a paper; it may very well help teach students to write better, something that professorial comments delivered after a week rarely accomplish.
Another putative advantage of computer grading is its objectivity and consistency. Every professor knows that it matters when we read essays and in what order. Some essays find us awake and attentive. Others meet my eyes as they struggle to remain open. As much as I try to ignore the names on the top of the page, I can’t deny that my reading and grading are personalized to the students. I teach at a small liberal arts college where I know the students. If I read a particularly difficult sentence by a student I have come to trust, I often make a second effort. My personal attention has advantages but it is of course discriminatory. The computer will not do that, which may be seen by some as more fair. What is more, the computer doesn’t get tired or need caffeine.
Perhaps the most important advantage for administrators considering these programs is the cost savings. If computers relieve professors from the burden of grading, that means professors can teach more. It may also mean that fewer TAs are necessary in large lecture courses, thus saving money for strapped universities. There may even be a further side benefit to these programs. If universities need fewer TAs to grade papers, they may admit fewer graduate students to their programs, thus going some way towards alleviating the extraordinary and irresponsible over-production of young professors that is swelling the ranks of unemployable Ph.D.s.
There are, of course, real worries about computer grading of essays. My concern is not that the computers will make mistakes (so do I); or that we lack studies that show that computers can grade as well as human professors—for I doubt professors are on the whole excellent graders. The real issue is elsewhere.
According to the group “Professionals Against Machine Scoring of Student Essays in High-Stakes Assessment,” the problem with computer grading of essays is simple: Machines cannot read. Here is what the group says in a statement:
Let’s face the realities of automatic essay scoring. Computers cannot ‘read.’ They cannot measure the essentials of effective written communication: accuracy, reasoning, adequacy of evidence, good sense, ethical stance, convincing argument, meaningful organization, clarity, and veracity, among others.
What needs to be taken seriously is not that computers can’t grade as well as humans. In many ways they grade better. More consistently. More honestly. With less grade inflation. And more quickly. But computer grading will be different from human grading. It will be less nuanced and aspire to clearly defined criteria. Are sentences grammatical? Is there a clear statement of the thesis? Are there examples given? Is there a transition between sentences? All of these are important parts of good writing and the computer can be trained to look for these characteristics in an essay. What this means, however, is that computers will demand the kind of clear, precise, and logical writing that computers can understand and that many professors and administrators demand from students. It also means that writing will become more mechanical.
There is much to be learned here from an analogy with the rise of computer chess. The great grandmaster Garry Kasparov, who famously lost to Deep Blue, has perceptively argued that machines have changed the ways chess is played and redefined what a good chess move and a well-played chess game look like. As I have written before:
The heavy use of computer analysis has pushed the game itself in new directions. The machine doesn’t care about style or patterns or hundreds of years of established theory. It counts up the values of the chess pieces, analyzes a few billion moves, and counts them up again. (A computer translates each piece and each positional factor into a value in order to reduce the game to numbers it can crunch.) It is entirely free of prejudice and doctrine and this has contributed to the development of players who are almost as free of dogma as the machines with which they train. Increasingly, a move isn’t good or bad because it looks that way or because it hasn’t been done that way before. It’s simply good if it works and bad if it doesn’t. Although we still require a strong measure of intuition and logic to play well, humans today are starting to play more like computers. One way to put this is that as we rely on computers and begin to value what computers value and think like computers think, our world becomes more rational, more efficient, and more powerful, but also less beautiful, less unique, and less exotic.
Much the same might be expected from the increasing use of computers to grade (and eventually to write) essays. Students will learn to write in ways expected from computers, just as they today try to learn to write in ways desired by their professors. The difference is that different professors demand and respond to varying styles. Computers will consistently and logically drive writing towards a more mechanical and logical style. Writing, like chess playing, will likely become more rational, more efficient, and more effective, but also less beautiful, less unique, and less eccentric. In other words, writing will become less human.
It turns out that many secondary school districts already use computers to grade essays. But according to John Markoff in The New York Times, the EdX software promises to bring the technology into college classrooms as well as online courses.
It is quite possible that in the near future, my colleagues and I will no longer have to complain about grading essays. But that is unlikely at Bard. More likely is that such software will be used in large university lecture courses. In such courses with hundreds of students, professors already shorten questions or replace essays with multiple-choice tests. Or they use armies of underpaid graduate students to grade these essays. It is quite likely that software will actually augment the educational value of writing assignments at college in these large lecture halls.
In seminars, however, and in classes at small liberal arts colleges like Bard where I teach, such software will not likely free my colleagues and me from reading essays. The essays I assign are not simple responses to questions in which there are clear criteria for grading. I look for elegance, brevity, insight, and the human spark (please no comments on my writing). Whether or not I am good at evaluating writing or at teaching writing, that is my aspiration. I seek to encourage writing that is thoughtful rather than writing that is simply accurate. When I have time to make meaningful comments on papers, they concern structure, elegance, and depth. It is not only a way to grade an essay, but also a way to connect with my students and help them to see what it means to write and think well.
And yet, I can easily imagine making use of such a computer-grading program. I rarely have time to grade essays as well or as quickly as I would like. I would love to have my students submit drafts of their essays to the EdX computer program.
If they could repeatedly submit their essays and receive such feedback and use the computer to catch not only grammatical errors but also poor sentences, redundancies, repetitions, and whatever other mistakes the computer can be trained to recognize, that would allow them to respond and rework their essays many times before I see them. Used well, I hope, such grading programs might really augment my capacities as a professor and their experiences as students.
I have real fears that grading technology will rarely be used well. Rather, it will too often replace human grading altogether and, in large lectures, high schools, and standardized tests, will impose a new and inhuman standard on the way we write and thus the way we think. We should greet such new technologies enthusiastically and skeptically. But first, we should try to understand them. Towards that end, it is well worth reading John Markoff’s excellent account of the new EdX computer grading software in The New York Times. It is your weekend read.
Controversy is raging around Thomas Friedman’s column today advising the presumptive Secretary of State John Kerry to “break all the rules.”
In short, Friedman—known for his faithful belief that technology is making the world flat and changing things for the better—counsels that the U.S. ignore hostile governments and appeal directly to the people. Here’s the key paragraph:
Let’s break all the rules. Rather than negotiating with Iran’s leaders in secret — which, so far, has produced nothing and allows the Iranian leaders to control the narrative and tell their people that they’re suffering sanctions because of U.S. intransigence — why not negotiate with the Iranian people? President Obama should put a simple offer on the table, in Farsi, for all Iranians to see: The U.S. and its allies will permit Iran to maintain a civil nuclear enrichment capability — which it claims is all it wants to meet power needs — provided it agrees to U.N. observers and restrictions that would prevent Tehran from ever assembling a nuclear bomb. We should not only make this offer public, but also say to the Iranian people over and over: “The only reason your currency is being crushed, your savings rapidly eroded by inflation, many of your college graduates unemployed and your global trade impeded and the risk of war hanging overhead, is because your leaders won’t accept a deal that would allow Iran to develop civil nuclear power but not a bomb.” Iran wants its people to think it has no partner for a civil nuclear deal. The U.S. can prove otherwise.
Foreign policy types like Dan Drezner respond with derision.
Friedman's "break all the rules" strategy is as transgressive as those dumb-ass Dr. Pepper commercials. Worse, he's recommending a policy that would actually be counter-productive to any hope of reaching a deal with Iran. This is the worst kind of "World is Flat" pablum, applied to nuclear diplomacy. God forbid John Kerry were to read it and follow Friedman's advice.
I’ll leave the debate to others. But look at the central assumption in Friedman’s logic. If the leaders of a country don’t agree with us, go to the people. Tell them our plan. They’ll love it. But why is that so? For Friedman and so many of his brothers and sisters on the left and the right in the commentariat, the answer is: because our proposals are rational. Whether it is Friedman on Iran or Brooks on the economy or liberals on gun control or conservatives on the budget, there is an assumption that if everyone would just get together and talk this through like rational individuals, we would agree on a workable and rational solution. This is of course the basic view of President Obama. He sees himself as the most rational person in the room and wonders why people don’t agree with him.
This rationalist assumption is a fallacy. Neuroscientists tell us that people respond to emotional and non-rational inputs. But long ago Hannah Arendt understood and argued that the essence of politics is neither truth nor reason. It is plurality and opinion. The basic condition of politics is plurality, which means people need to come together and pursue a common good in spite of their disagreements and differences.
For Arendt, Western history has seen politics come under the sway of philosophy, and thus of the pursuit of rational truth, instead of remaining what it was: a space for the public engagement of different opinions. The tragedy of the last 50 years is that philosophical rationality has now been supplanted by technocratic rationality, so that politics is increasingly about neither opinion nor common truths, but technocracy.
One lesson Arendt took from her fundamental distrust of unity and rationality was the importance of diffusing power rather than centralizing it. Her embrace of American Constitutional Federalism was neither conservative nor liberal; it was born from her insistence that politics cannot and should not seek to replace opinions with truths.
Friedman wants rational truth to win out and believes that if we just talk to the people, the veils will fall from their eyes. Well, it doesn’t work here at home, because people really do disagree and see the world differently. There is no reason to think it will work around the world either. A thoughtful foreign policy, as opposed to a rational one, would begin with the fact of true plurality. The question is not how to make others agree with us, but rather how we who disagree can still live together meaningfully in a common world.