## Specifications Grading, Revisited

The great specifications grading craze of 2014 continues, with Evelyn Lamb joining in and Robert Talbert going so far as to actually design a course using specs grading.

I have now actually read the book, so all of my misunderstandings have been updated to ‘informed misunderstandings.’ The book contained a lot of useful references to the literature on assessment, and I am planning on reading a couple of Nilson’s other books soon.

I will write a second post soon about the ways the book is challenging me to improve my courses.

tl;dr Executive Summary

Most of the examples of specifications in the book are, in my opinion, very shallow. This makes me skeptical that specifications grading is useful in a problem-solving classroom. The one example that Nilson gives from a computer science course seems to be isomorphic to accumulation grading: it seems that Leff gives 10 points for each demonstration, which is equivalent to simply counting the number of successes, as in accumulation grading, and then multiplying by 10. It also seems closer to my description of accumulation grading than to Nilson’s description of specifications grading (unless a problem template is equivalent to a set of specifications, which seems reasonable to me for some, but not all, types of problems).

Barriers to Implementing in a Mathematics Classroom

This system is called “specifications grading” because each assignment comes with a set of detailed specifications to guide the students in creating it. I think that this is a great idea, and I will say more about how this idea may influence my teaching in the next section.

My concern is that almost all of the examples of specifications from the book are “mechanistic.” “Mechanistic” is actually Nilson’s word from page 63. She was only referring to one particular set of specs, although this set does not seem to me to be much different from the other examples. Here are all of the examples of specs from the “Setting Specs” section of Chapter 5 that I found from skimming:

1. Do what the directions say.
2. Be complete and provide answers to all of the questions.
3. The assignment must contain at least $n$ words.
4. The assignment must be a good-faith effort.
5. All of the problems must be set up and attempted.
6. Focus on a couple ideas from the reading; explain how they relate to your everyday life.
7. Briefly summarize the article.
8. Describe in three or four sentences.
9. Read and answer the following two questions.
10. Read the article and summarize what you learned in five to eight sentences.
11. Write an essay of the following length.
12. Write an essay that is at least 1,250 words, answer the questions, include four resources (at most two can be from the internet), a personal reflection, and evidence of how the topic from the reading impacts society.
13. Adhere to the following requirements on length, format, deadlines, and submission via turnitin.com, and also summarize the essential points of the article and “provide your reaction to those essential points, including a thorough and thoughtful assessment of the implications for doing business, particularly as related to concepts and discussions from class” (page 60).
14. Write the specified number of pages (or words).
15. Cite references correctly.
16. Use recent references.
17. Organize this literature review around this controversy (or problem, or question).
18. The first paragraph should be about X. The second paragraph should be about Y. The paper should conclude with Z.
19. Use the following logical conjunctions to “highlight the relationships among the different works cited” (p 61).
20. Write according to a certain length/for a certain purpose/for a certain audience.
21. Have the following citations.
22. Respond to the comments on the weblog.
23. Include at least one image.
26. Include 10 major concepts.
27. It must be at least 1,200 words.
28. The concept map must be at least four levels deep.
29. The performance must be at least three minutes long.
30. Research a topic and formulate a policy statement.
31. Create a persuasive recommendation.
32. Assess the accuracy of negative press and prepare a press release response.
33. “Submit a 12-line biography that highlights your professional strengths while still conveying some sense of your personality” (page 63).
34. Write 1,000 or 1,200 words.
35. “Explain your solution (policy stance, recommendation, press release) in the first paragraph” (p 63).
36. Make a three-point argument about why your idea is the best possible.
37. Use at least $n$ references, and the references must be of the following types.
38. Write with at most $n$ grammar/spelling/etc. errors.
39. Spend at least four hours working on this assignment.

Nilson then writes, “Then these are the only features you look for in your students’ work and the only criteria on which you grade” (page 64). That sounds reasonable, since that is the point of specs grading. However, although Nilson at one point writes, “These criteria are not all low level” (page 61), I have to disagree. It seems to me that these examples help students to, say, write a particular type of paper; it does not seem to me that these promote any actual learning goals like critical thinking, taking other people’s perspectives, etc. I would have hoped for some specifications like, “Use the speculative method for analyzing this text.”

Perhaps I am underestimating the power of simply doing the assignment properly (with respect to specs like page counts) in helping students learn—I definitely have no idea about how this would help students outside of mathematics learn. But within mathematics, I imagine that I would get a lot of proofs where the variables are properly defined, the proper symbols are used, students use “therefore/thus/etc.” correctly, but the student does not demonstrate much of any understanding of what the ideas of the proof are.

In short, I think that these specifications could be fine for, say, a humanities class (although I do not know enough about how to effectively teach a humanities course to be sure), but I have little confidence that they would be useful in a problem-solving class.

Now, Nilson did provide examples from Lawrence Leff’s and Steve Stevenson’s computer science classes. Here is a quote from page 113:

Leff uses a point system…He defines several “genres” of points in which each genre represents one of the education goals (content mastery or cognitive skills) or performance goals (amount of work)…In Leff’s area, one major performance goal is writing a minimal number of lines of code. So he defines a genre for each essential piece of content mastery or skill (e.g. bit-diddling and arrays) and another for lines of code. Each assessment is worth so many points toward meeting one or more educational goals and one or more performance goals, and he sets a minimum number of points in each genre that students must accumulate to earn a passing grade for the course. This minimum number ensures that all passing students have done an acceptable job on at least one assessment of every required educational and performance goal.

Here is my take (I will use the ‘education goals’ and ‘performance goals’ vocabulary for the next several paragraphs): if you allow for partial credit, this last bit is just a traditional grading situation within a specifications grading wrapper. You get some, but not all, of the benefits of specs grading, and you might get most of the drawbacks of traditional grading. Worse yet, this is essentially traditional grading on the part of the course that I am most interested in: the education goals.

If you do not allow for partial credit (which Leff doesn’t), then this system is isomorphic to accumulation grading. But I am not convinced that this is specifications grading, since I am not certain that actual specifications are provided. Leff does provide his students with templates for the C-level problems; B-level problems require some modification of the template; the A-level problems require independent reading (often of computer manuals) to complete, and I imagine they might deviate more from the template.

So perhaps the template is the best we can do for specifications grading for problem-solving courses. I am not sure if I like this, though, since one of my goals is usually for a student to evaluate which method to use. For example, a D-level goal I had for my calculus students was to identify problems that can be solved with an integral (they literally had to just say, “This problem can be solved with an integral” to get credit; actually, they just needed to write “D8”). I do not see how a template could cover this learning goal, since the template would be doing all of the work for them!
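To make the "isomorphic to accumulation grading" claim concrete, here is a minimal sketch in Python. It is only my reading of the system, not Leff's actual implementation: the 10 points per demonstration is my guess from the book, and the student record, the minimums, and all function names are made up for illustration. With no partial credit, checking point minimums per genre gives the same verdict as counting successes per genre.

```python
from collections import Counter

# Assumption: every successful demonstration is worth the same fixed
# number of points (the post guesses 10; the book may differ).
POINTS_PER_SUCCESS = 10


def points_by_genre(results):
    """Point-system tally: a passed assessment adds fixed points to its genre."""
    totals = Counter()
    for genre, passed in results:
        if passed:  # no partial credit: all of the points or none of them
            totals[genre] += POINTS_PER_SUCCESS
    return totals


def successes_by_genre(results):
    """Accumulation tally: count the passed assessments in each genre."""
    counts = Counter()
    for genre, passed in results:
        if passed:
            counts[genre] += 1
    return counts


def passes_on_points(results, min_points):
    """True if every genre reaches its minimum number of points."""
    totals = points_by_genre(results)
    return all(totals[g] >= m for g, m in min_points.items())


def passes_on_counts(results, min_successes):
    """True if every genre reaches its minimum number of successes."""
    counts = successes_by_genre(results)
    return all(counts[g] >= m for g, m in min_successes.items())


# Hypothetical student record: (genre, passed?) for each assessment.
results = [
    ("arrays", True),
    ("arrays", False),
    ("bit-diddling", True),
    ("arrays", True),
]

# Hypothetical minimums in points, and the same minimums converted to
# success counts (rounded up, so the two checks agree for any threshold).
min_points = {"arrays": 20, "bit-diddling": 10}
min_successes = {
    g: (p + POINTS_PER_SUCCESS - 1) // POINTS_PER_SUCCESS
    for g, p in min_points.items()
}

print(passes_on_points(results, min_points))       # True
print(passes_on_counts(results, min_successes))    # True
```

The equivalence depends on every success being worth the same number of points. If different assessments carried different point values, or if partial credit were allowed, the two tallies could disagree.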

Also, I am frankly less concerned with the performance goals and, in many cases, I think that the performance goals might actually work against the education goals. For instance, there are many cases where 20 good lines of code can completely replace 100 crappy lines of code. Having such performance goals could actually discourage students from trying to find the 20 good lines. Similarly with word counts/page number requirements: my take is that it is more difficult to write a good short paper than a good long paper, yet every spec that I list above required longer papers for the higher grades.

My purpose is not to question the writing and computer science instructors’ judgment here—they definitely know more about teaching writing and computer science than I ever will. Moreover, I could solve this by reversing the specs (e.g. requiring short proofs to get the A).

But my main point is this: when it comes down to it, I just don’t think that I care a lot about performance goals. I would rather just measure the education goals. If a student can demonstrate my education goals in a three-page paper, I don’t want to give them a grade of “fail” because they did not meet the performance goals. Worse yet, I don’t want the more conscientious students to take an excellent three-page paper, realize it does not meet my specs, and then include two pages of fluff so that it does meet my specs.

One quick comment: I fully understand that, to meet the education goals, one must put in a certain number of reps. One takeaway that I have is that I might not be doing enough in my courses to support my students in putting these reps in. I will definitely consider whether I should add performance goals to encourage my students to get the reps in so that they can meet the education goals. But before I do this, I need to make completely sure that I am not going to be adding a bunch of busywork for many of my students.

Conclusion: My word count is already over 1700, so I have done enough for an A. So I am going to stop here and put my report on the “good” things about the textbook in a separate post.

Final questions:

1. Am I underestimating how much students can learn by just adhering to the mechanistic specs?
2. Am I wrong about equating Leff’s system with accumulation grading?
3. Does a template constitute a set of specifications?
4. How would one set up specifications for, say, a typical calculus assignment?

### 9 Responses to “Specifications Grading, Revisited”

1. Pinky Says:

Was Specifications Grading created for higher education? It doesn’t look like it consists of any of the standards or benchmarks from the new Common Core standards.

2. Andy "SuperFly" Rundquist Says:

I really appreciate how thorough your comments are here. Having not read the book, I think for the moment that I agree with you. On Twitter the pushback to specs grading seems to be that we really should be giving all kinds of feedback to students and that “just putting hoops in front of them” is antithetical to what we want to do as teachers. I don’t think that’s a fair description of spec grading, but your post has me thinking again along those lines.

In my 40-person class this semester, I had to give students all kinds of feedback to show them that their beautifully written solutions weren’t enough to demonstrate command of the subject. I would make a scast asking them to really show why a particular negative sign was warranted, or asking them to plot a particular function in addition to calculating it at the point the book was interested in. It didn’t surprise me that the students didn’t do that the first time. Instead, I assumed that would happen and I built in the feedback cycle to deal with it. This spec grading approach makes me nervous because I’d feel like I’d have to really work hard on the specs to make sure there weren’t any loopholes. My system has no loops or holes. Just conversation. And the conversation keeps going as long as the student wants it to (usually it’s when I finally rate their work a “4”).

Just some late night (for me) thoughts. Thanks again for the great post.

• bretbenesh Says:

Hi Andy,

I need to think about the “just put hoops in front of them” idea. I wonder whether both specs grading AND my current grading system do that. Can you say why you don’t think it is a fair description of specs grading (and my other systems)? It would make me feel better if you had a convincing reason.

I have mainly been focusing on how to grade courses with a lot of client disciplines (calculus, linear algebra), and I have been trying to develop a grading system for these classes. For my courses for mathematics majors, I envision more of a conversation, as you describe. I need to think about whether it is even correct to grade differently in these two types of courses, but I also feel some pressure to make sure that my students are able to jump through the various hoops that my client disciplines want (and I suppose that mathematics is its own client discipline, since my colleagues complain about mathematics majors not being able to differentiate polynomials. So I am not trying to blame other departments here).

How much time did your 40-person class take? It sounds like it would take a pretty dedicated professor (i.e. “you”) to create screencasts for 40 students every class/week/whatever.

• Andy "SuperFly" Rundquist Says:

Fair description: You establish the type of work you’re looking for, including length, type, and quality (though that one’s hard) and you make different types of assignments for different types of grades. I guess I’m mostly stuck on the quality part and what appears like a loss of conversation.

In my “big” class, I did weekly quizzes for the first assessment and they did vids, office visits, and oral exams (2 of those scheduled during the semester) for reassessments.

• bretbenesh Says:

“Quality” is exactly where I am stuck, too! Well said. This is the part that is missing from specs grading for me.
