“Grades are so subjective”

Actually, they’re probably less subjective than you think. And to the degree that there is still some subjectivity, it probably works in your favor, not against you.

First, in many classes these days grades may be almost completely objective, as multiple-choice tests are sadly common in overcrowded, underfunded classrooms. History is one of the subjects that’s less likely to do this at all, or at least not exclusively. Most of our assignments are usually written essays, or some other form of project that you may think is graded subjectively, because the professor reads your work and then slaps a grade on it, and you may have so little idea what happens between those two steps that it might as well be random.

It’s not random. (Well, everyone has heard an anecdote to the contrary, but those are mostly jokes, made in the throes of the abject torture profs actually go through when they grade.)

It’s increasingly common these days for essays to be graded according to a rubric. Rubrics break down an assignment into component parts, often attaching some point value to each part. These are mainly intended to communicate more clearly to the student what the instructor is looking for, and to show relative strengths and weaknesses in different areas of an assignment. But there’s still a judgment being made on how to assign points — a number — to something like your writing style, argument, factual accuracy, creativity, etc. None of these things can really be reduced entirely to a number, so there is a certain amount of arbitrariness involved. But not very much, because the instructor is grading your essay compared to others by students in the same class. The quality of your argument may not be fully represented by, say, the number 18 out of 20. But if the quality of argument across a class of 30 students ranges from 20 to 5, and you’re an 18, you know that you’re doing very well, significantly above the median, but not quite at the top of the class. That’s real, though limited, information.
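To make that comparison concrete, here is a minimal sketch in Python. The list of scores is entirely made up for illustration (30 hypothetical students with argument scores from 5 to 20), and the helper name `class_standing` is my own invention, not part of any real gradebook tool. It only shows how a single score gains meaning once it is placed against the rest of the class.

```python
from statistics import median

def class_standing(score, class_scores):
    """Return (class median, students scoring strictly higher, percent at or below)."""
    higher = sum(1 for s in class_scores if s > score)
    at_or_below = sum(1 for s in class_scores if s <= score)
    return median(class_scores), higher, 100 * at_or_below / len(class_scores)

# Hypothetical argument scores (out of 20) for a class of 30 students.
scores = [5, 7, 8, 9, 10, 10, 11, 11, 12, 12,
          12, 13, 13, 13, 14, 14, 14, 15, 15, 15,
          16, 16, 16, 17, 17, 18, 18, 19, 19, 20]

mid, higher, pct = class_standing(18, scores)
print(f"median={mid}, students above={higher}, at or below={pct:.0f}%")
```

With these invented numbers, an 18 sits well above the class median with only a few students ahead, which is the kind of real but limited information described above.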

How does an instructor judge your work? How can she be sure yours ranks at 18, and that the handful of students who got 19 or 20 definitely did turn in work measurably more successful than yours? For one thing, things that may still feel really amorphous to you, like what an “arguable thesis topic” looks like or the level of specificity in your language choice, are not at all amorphous to a professor who has written and read these kinds of statements many thousands of times, day in and day out for years on end. Examples:

“The Bolsheviks won the Civil War because of their geo-strategic advantages” is an arguable topic, and therefore acceptable.

(That thesis statement should be followed by a detailed explanation of the specific geo-strategic advantages that the Bolsheviks did have, and that the Whites did not have. See? Arguable.)

“Stalin’s purges were caused by his lust for power” is not arguable, and therefore not an acceptable thesis statement.

(What would you follow this statement with? A series of repetitive statements that all essentially say, “Stalin was a bad man. Real bad.” Believe me, I’ve seen it. But that’s not evidence supporting the thesis; it’s a circular restatement of the thesis over and over. Because that’s all you can do with something as amorphous as “Stalin was bad because Stalin was bad”: it’s not arguable.)

[Note that whether or not your prof agrees with your thesis is totally irrelevant here — your prof is reading THOUSANDS of pages all saying basically the same things. She really just doesn’t care either way what your thesis is, only that it is actually a workable thesis, demonstrating that you understand fundamental concepts. She wants to be done grading already.]

How do we distinguish between “specific” language choices and vague ones?

It’s easy to see that the sentence “Lenin was a ruthless leader” is vague when another essay states, “Lenin’s NEP was an ideological compromise that divided the Party and made Stalin’s manipulation of factions possible.” The second sentence is not only better writing, and more convincing as part of an argument, but also tells your prof that you actually know what happened and how and why it mattered.

Do you see how the difference between those two statements is both obvious, and objective? Multiply that by a million little judgments of exactly the same kind, and that’s how we can grade fairly.

Also, remember that your grade is not an absolute value that sums up everything about your work (let alone about you — this is not you as a person under judgment, but the words on a page that you turned in). It simply ranks the relative quality of your work compared to that of the other students on a few basic criteria that the instructor deems most significant (hopefully, your instructor told you what these criteria are — if not, ask).

If you read the same set of papers from your class that your professor got, you too would be able to roughly rank them in terms of clarity, accuracy, and how convincing they were as arguments. Most likely, your ranking would actually come pretty close to that of your professor (I say this because I often have students grade themselves and their peers in exercises, and they’re always right on in their assessments). The professor’s experience allows her to do this much more quickly than you probably would, and her expertise allows her to catch the errors. But otherwise, grading is not all that mysterious and most people would do it in a very similar way in most cases.

Each professor reading each essay does ultimately make some degree of holistic assessment (“this essay is cogent and careful, but doesn’t go out of the box; that one is creative but doesn’t fully support its claims; this other one blows my mind; and this one here makes me wonder if the student even knows what course they’re in”). But when multiple instructors read the same essays, they nearly always end up with very similar assessments (I’ve seen this from experience as a TA in large courses where multiple people do read the same essays, and I’ve seen studies concluding the same thing).

This general agreement on relative success comes from three things: (1) the more specifically one defines what one is looking for in an essay, the easier it is to see where those goals are reached and where they aren’t; (2) experience reading lots and lots of essays makes these things much simpler to spot than seems possible to the novice who is writing this kind of essay for the first time; and (3) the differences usually are pretty stark: in an average class of 30 with a grade spread from A to F, the difference between A work, C work, and F work is blindingly obvious. The tricky part is distinguishing between, say, a B and a B+. Those judgements are very fine, and it is true that two experienced readers may disagree at that level. Luckily, those kinds of fine distinctions aren’t really significant in the long run.

(In my own case, I tend to use pluses and minuses as signals — a B+ tells that student that the essay is not A work, but it’s coming close, and would need only a small amount of revision to get there. On the other hand, a B- tells the student that while their essay was essentially accurate and complete and therefore belongs in the B category, it just barely reached that level in some respects, so that the student knows s/he would need to revise quite thoroughly to reach A-level work.)

Finally, there is the issue of bias. Students talk a lot (or so I overhear on campus) about this or that prof having favorites, or “not liking” them. The first point to make here is that professors are insanely busy people who usually see hundreds of students every semester. Honestly, most of us don’t have time to form actual opinions about individual students. But of course it’s true that a student who comes frequently to office hours and turns in excellent work is going to build a good reputation with faculty, and students who don’t show up to class, turn in late and/or shoddy work or don’t turn in work at all, and then beg for a higher grade because they “need” it are going to lose the respect of faculty. But either way, that reputation is far less likely to be reflected in grades than students think (it does enormously affect things like recommendation letters and how willing a professor is to spend time chatting and giving advice — which ultimately may matter more). Simply because grades are much less subjective than students realize, there’s really no need and little opportunity to manipulate grades in this way. Even if we assume a truly ill-willed instructor who has the time to bother artificially inflating some grades and deflating others, the chances are that sooner or later complaints about this practice will accrue with the department chair and deans, and eventually there will be consequences for the faculty member, which would discourage those few who would ever bother with such asinine and pointless manipulation anyway.

But there is one way in which the relatively more subjective process of grading an essay is different from the wholly objective process of grading a multiple-choice exam, and that works entirely in favor of the student, in my experience.

I experimented briefly with multiple-choice exams once, in a class in which students also did a lot of writing. My notion was that since students had mostly been assessed by multiple-choice in the past (I did a survey to confirm this), I could eliminate the anxiety involved in learning a new format of demonstrating their knowledge, and just find out what they actually knew. Then in separate written essays I could focus more on teaching them how to write well. As it turned out, the entirely objective grades from the exams were abysmal, far lower than I usually see on essay exams or written short-answer exam questions also aimed at testing content knowledge. I did some surveying to find out why, and while I can’t be sure, the problem seems to have been a combination of two things. First, because there was less anxiety about a multiple-choice exam, students studied less. Second, and most relevant here, when I grade an essay, I am more flexible in how I award credit to the student. For example, if the student answers a multiple-choice question and gets it wrong, it’s wrong, period. But in an essay on the same subject, it may be clear that the student is confused about one factual detail, but does fundamentally understand the concepts under discussion, and has analyzed the material well. In that case I’ll dock a small point value for the one bit of confusion, but give credit for the general understanding. “Objective” assessment does not give me the leeway to do that.
