## Posts Tagged ‘assessment’

### Talbert was right; I was wrong.

January 21, 2015

I was thinking about specifications grading over break, and I came to realize that Robert Talbert was right and I was wrong.

My particular complaint about specifications grading for mathematics classes—that it is unrealistic to expect students to be able to judge that their work is mathematically correct—still holds. But Robert came up with a very slight modification that I was too quick to write off.

Robert’s solution was to move from two possible grades per assignment (PASS or NO PASS) to three: PASS, NO PASS, and PROGRESSING. The idea is that you can create all of the specifications you want—including whether the work is mathematically correct—and grade according to whether students have met the specifications. The one difference is that you split your specifications into two groups. The first group contains the specifications that students can easily check themselves, such as “There are no spelling mistakes” or “All variables are defined prior to use.” Failure to meet any of these specifications leads to a grade of NO PASS.

The second group of specifications are ones that students cannot necessarily judge for themselves, such as determining whether the work is mathematically correct. If a student satisfies all of the specifications in the first group but misses any in this group, the student is assigned a grade of PROGRESSING for the assignment (the student receives a grade of PASS if she meets all of the specifications).

The only difference between NO PASS and PROGRESSING is how easily students can re-do the assignment. If the student receives a NO PASS, the student needs to spend a token to re-do the assignment; a student who receives a PROGRESSING may re-do the assignment without any cost.
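Robert’s scheme amounts to a short decision procedure, so here is a minimal sketch in Python. Everything here is my own invention for illustration—the function names, the dictionary representation of a submission, and the toy specs—not anything from Robert’s actual description:

```python
# A hypothetical sketch of the three-grade scheme. The split into
# "self-checkable" and "instructor-judged" specs follows the description
# above; all names and the submission format are invented.

def grade(work, self_checkable, instructor_judged):
    """Return PASS, NO PASS, or PROGRESSING for one assignment."""
    if not all(spec(work) for spec in self_checkable):
        return "NO PASS"        # re-doing this costs a token
    if not all(spec(work) for spec in instructor_judged):
        return "PROGRESSING"    # re-doing this is free
    return "PASS"

# Toy versions of the example specifications mentioned above.
def no_spelling_mistakes(work):
    return "teh" not in work["text"]

def variables_defined(work):
    return work["variables_defined"]

def mathematically_correct(work):
    return work["correct"]

self_checkable = [no_spelling_mistakes, variables_defined]
instructor_judged = [mathematically_correct]
```

The asymmetry lives entirely in what happens next: a NO PASS costs a token to re-do, while a PROGRESSING can be re-done for free.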

I initially did not like this system because I thought it simply added the complex token system on top of allowing unlimited re-dos—I preferred simply letting students do an unlimited number of re-dos. But I have changed my mind. I now think that raising expectations on specifications that students can easily evaluate themselves is a completely reasonable thing to do, and penalizing students for simply not doing it does not seem so unreasonable.

The benefits are that students get in the habit of evaluating as much of their work as they possibly can, I get to grade higher-quality work, and the students get used to creating higher quality work. The costs are implementing the mildly complex token system (although keeping track shouldn’t be too hard) and some potential loss of goodwill after penalizing students. But the higher quality work argument wins in the end for me.

I am rather pleased with Talbert’s plan, because it gives me a grading plan that I can look forward to using in proof-based classes (I think that I am planning on sticking with accumulation grading for the non-proof classes).

So—thanks, Robert. I am sorry I didn’t realize how good your plan was right away.

December 18, 2014

I have really enjoyed our discussions of Specifications Grading. I have learned a lot from them, and I have enjoyed the conversations (which I will continue to engage in). I particularly want to thank Theron Hitchman, Robert Talbert, and Andy Rundquist for helping me think through this. I feel like I kept asking the same questions, and everyone was very patient with me. In this post, I will share the answers I eventually arrived at for those questions.

My conclusion: my grading is going to become more like specifications grading, but I am not going to fully use it. I want to give my students specifications on how to do a good assignment; that is a great idea. But one of my specifications absolutely has to be “the mathematics is correct”—I cannot live with less than that.

But putting a correctness requirement in the specifications is problematic. Here is how Nilson introduces specifications (page 57, all emphasis is hers):

For assignments and tests graded pass/fail, students have to understand the specs. They must know exactly what they need to do to make their work acceptable. Therefore, we have to take special care to be clear in laying out our expectations, and we cannot change or add to them afterward.

The problem is that the point of mathematics classes is arguably to teach the students when mathematics is correct and when it isn’t. This is obviously a huge simplification, but it would be ridiculous to expect students coming into a mathematics class to already know what is correct—it is our job to help the students learn this. As such, I think that it is not in the spirit of Nilson’s specifications grading to include a correctness specification (the same may be true of requiring that writing be clear).

Now I do not think that Nilson’s book was written on stone tablets, and Nilson herself has suggested that it may need to be modified for mathematics. I am happy to adapt specifications grading to make it work, but there is another issue: the tokens.

Viewed one way, the tokens are a way of allowing students a chance to reassess. I like that thought, but I can’t help but view things the opposite way: tokens are a way of limiting reassessment chances. [Late edit: I think that specifications grading is a huge improvement over traditional grading, since it allows for reassessments. I just think that there are already better grading systems out there for mathematics courses. Thanks to Theron Hitchman for reminding me that I should say this.]

So we have a correctness specification that students do not understand, and they will not receive a passing grade if the work is not correct. Yet they only have limited chances to reassess due to the token system. So here is the situation I fear:

1. A student comes to the course without knowing how to create correct mathematics.
2. The student is given an assessment that says they are required to write correct mathematics to get a passing grade.
3. The student, still in the process of learning, turns in an incorrect assignment and receives a failing grade on the assignment.
4. The student uses a token to reassess; they may or may not get the mathematics correct on the reassessment, because mathematics is hard. Maybe the student needs to use a second token to re-reassess.
5. This process repeats 3–4 times until the student is out of tokens.
6. The student never gets to reassess again, and therefore does not learn as much.

This is very similar to the reasons why Robert Talbert is considering moving from a PASS/FAIL specifications grading system to a PASS/PROGRESSING/FAIL system, where a grade of PROGRESSING is allowed to reassess without costing a token.

Here are a couple of other modifications that could avoid this:

1. Give students a lot of feedback-only assignments prior to the graded assignments to help students learn what it means to be correct.
2. Give students a lot of tokens so that they can get the feedback they need.

But if I give a lot of feedback-only assignments, why not give students credit if they demonstrate mastery? And if there are a lot of tokens, I think you may as well just allow unlimited reassessments; you will probably come out ahead, time-wise, because you will not need to do the bookkeeping to keep track of the tokens (my opinion is that it is probably better to give unlimited reassessment opportunities over a PASS/PROGRESSING/FAIL system, too).

One clarification: when I say “unlimited” opportunities for reassessment, I do not literally mean “unlimited.” For one, the students will be limited by the calendar—there should not usually be reassessments once the semester ends. I am also fine limiting reassessments to class time, and not every class period needs to be for reassessment.

So I think that it is unfair to require a student’s mathematics be correct to pass an assignment, but then limit the number of reassessments. This is why I am not going to use specifications grading in my mathematics classes (I will just take some of the ideas of specifications grading and graft them onto accumulation grading).

That said, I like the general idea, and I would likely use it if and when I teach our First Year Seminar class. This is the class that Nilson mainly wrote about in the book, and I think that specifications grading could be fantastic for that class. But not for one of my mathematics classes.

Questions for you:

1. Is there a way that we can break down the “correct” specification so that the student can know the work is correct prior to handing it in? This is reasonable for computational questions (use Wolfram Alpha!), but I don’t see how to do it for any other type of question.
2. Are there alternatives to the “lots of feedback-only assignments”/”lots of tokens”/”more than two possible grades” solutions to the issues above?

### How Specs Grading Is Influencing Me

December 17, 2014

I hope I have not come off too negatively about specs grading. Reflecting on what I have written, it could seem like I am trying to discourage people from using it. I hope that is not the case. I am engaging in this conversation so much because I am very hopeful about it.

So when I say that the examples of specs given in the book are “shallow,” I do not intend this to say that specs grading is bad. Rather, what I mean (but say poorly) is that the examples of specs do not capture what I would want in a mathematics class. To put a word count requirement on a proof would be a very shallow way to grade, but I do not necessarily think that word counts are bad for other subjects (at the very least, I don’t know enough about how to teach other subjects to make a judgment).

So this whole process is mainly to help me figure out how to make specifications grading work in my courses. I apologize if it sounds like complaining.

So I am going to switch gears to describe the positive things I learned from the book.

1. I should include specifications. I see no reason not to explicitly tell students what my expectations are; I just need to stop being lazy and do it.

For instance, I collected Daily Homework in my linear algebra class last spring. It was graded only on completion, but some students did not know what to do when they got stuck or didn’t understand the question. If I had explicitly given them a set of specifications for Daily Homework that included something like, “If you cannot solve the problem, you should show me how the problem relates to $\mathbb{R}^2$” (we often worked in abstract vector spaces), I think that I would have been much happier with the results.

Similarly, I gave my students templates (as Lawrence Leff does) for optimization and $\delta$-$\epsilon$ proofs in calculus, but I could be doing more of that.

The one catch is that I do not know how to specify for “quality” (thanks, Andy!). I think I have been annoying people on Google Plus trying to figure out how to solve this—sorry. But this is essential for my proofs-based courses. If I can’t figure out how to specify for quality in those courses, I will likely have to modify specs grading beyond recognition if I am going to use it in those courses.

2. To get a higher grade in my course, I have been requiring students to master more learning goals. This is fine, but the book suggested that I could also consider having students meet the same learning goals, but have students try harder problems if they want a higher grade. Nilson’s metaphor is that the former is “more hurdles,” whereas the latter is “higher hurdles.”

I really like this idea, and I can sort of imagine how that could work. In my non-tagging system, I could give three versions of the same problem: C-level, B-level, and A-level. For optimization in calculus, I could imagine that a C-level problem would give the function to be optimized, a B-level question wouldn’t, and an A-level question would just be a trickier version of a B-level question.

This would require me to write more questions AND it would require me to be able to accurately judge the relative difficulty of problems. But I think that both are doable, and I like the idea.

3. Specs grading requires that students spend tokens before being allowed to reassess. The thinking is that if reassessments are scarce, students will put forth more effort the first time. The drawback is that each assessment has higher stakes.

I definitely want to keep things low-stakes, but I am also finding that students aren’t working as hard as they should until the end of the semester. Using a token-like system could be a partial solution to that.

4. The book reminds me that I should be assigning things that are not directly related to course content; the book calls them meta-assignments. Here is a relevant quotation:

Other fruitful activities to attach to standard assignments and tests are wrappers, also called meta-assignments, that help students develop metacognition and become self-regulated learners…Or to accompany a standard problem set, he might assign students some reflective writing on their confidence before and after solving each problem or have them do an error analysis of incorrect solutions after returning their homework (Zimmerman, Moylan, Hudesman, White, & Flugman, 2011).

One such idea that I had to help the students start working earlier in the semester (see my previous item) is to have students develop a plan of action for the semester: determine a study schedule, set goals for when to demonstrate learning goals, and (if they want to) determine penalties for missing those goals.

5. I should consider including some “performance specs” (which simply measure the amount of work, not the quality of the work) in my grading. I don’t like this philosophically, but I think that it might help my students to practice more.

So even if I don’t convert to specifications grading, I have already learned a lot from it.

December 16, 2014

The great specifications grading craze of 2014 continues, with Evelyn Lamb joining in and Robert Talbert going so far as to actually design a course using specs grading.

I have now actually read the book, so all of my misunderstandings have been updated to ‘informed misunderstandings.’ The book contained a lot of useful references to the literature on assessment, and I am planning on reading a couple of her other books soon.

I will write a second post soon about the ways the book is challenging me to improve my courses.

tl;dr Executive Summary

Most of the examples of specifications in the book are, in my opinion, very shallow. This makes me skeptical that specifications grading is useful in a problem-solving classroom. The one example that Nilson gives from a computer science course seems to be isomorphic to accumulation grading (it seems like Leff gives 10 points for each demonstration, which is equivalent to simply counting the number of successes, as in accumulation grading, and then multiplying by 10), and it seems closer to my description of accumulation grading than to Nilson’s description of specifications grading (unless a problem template is equivalent to a set of specifications, which seems reasonable to me for some, but not all, types of problems).

Barriers to Implementing in a Mathematics Classroom

The reason why this system is called “specifications grading” is because each assignment comes with a set of detailed specifications to guide the students in creating it. I think that this is a great idea, and I will say more about how this idea may influence my teaching in the next section.

My concern is that almost all of the examples of specifications from the book are “mechanistic.” (“Mechanistic” is actually Nilson’s word from page 63; she was only referring to one particular set of specs, although this set does not seem to me to be much different from the other examples.) Here are all of the examples of specs from the “Setting Specs” section of Chapter 5 that I found from skimming:

1. Do what the directions say.
2. Be complete and provide answers to all of the questions.
3. The assignment must contain at least $n$ words.
4. The assignment must be a good-faith effort.
5. All of the problems must be set up and attempted.
6. Focus on a couple ideas from the reading; explain how they relate to your everyday life.
7. Briefly summarize the article.
8. Describe in three or four sentences.
10. Read the article and summarize what you learned in five to eight sentences.
11. Write an essay of the following length.
12. Write an essay that is at least 1,250 words, answer the questions, include four resources (at most two can be from the internet), a personal reflection, and evidence of how the topic from the reading impacts society.
13. Adhere to the following requirements on length, format, deadlines, and submission via turnitin.com, and also summarize the essential points of the article and “provide your reaction to those essential points, including a thorough and thoughtful assessment of the implications for doing business, particularly as related to concepts and discussions from class” (page 60).
14. Write the specified number of pages (or words).
15. Cite references correctly.
16. Use recent references.
17. Organize this literature review around this controversy (or problem, or question).
18. The first paragraph should be about X. The second paragraph should be about Y. The paper should conclude with Z.
19. Use the following logical conjunctions to “highlight the relationships among the different works cited” (p 61).
20. Write according to a certain length/for a certain purpose/for a certain audience.
21. Have the following citations.
22. Respond to the comments on the weblog.
23. Include at least one image.
26. Include 10 major concepts.
27. It must be at least 1,200 words.
28. The concept map must be at least four levels deep.
29. The performance must be at least three minutes long.
30. Research a topic and formulate a policy statement.
31. Create a persuasive recommendation.
32. Assess the accuracy of negative press and prepare a press release response.
33. “Submit a 12-line biography that highlights your professional strengths while still conveying some sense of your personality” (page 63).
34. Write 1,000 or 1,200 words.
35. “Explain your solution (policy stance, recommendation, press release) in the first paragraph” (p 63).
36. Make a three-point argument about why your idea is the best possible.
37. Use at least $n$ references, and the references must be of the following types.
38. Write with at most $n$ grammar/spelling/etc. errors.
39. Spend at least four hours working on this assignment.

Nilson then writes, “Then these are the only features you look for in your students’ work and the only criteria on which you grade” (page 64). That sounds reasonable, since that is the point of specs grading. However, although Nilson at one point writes, “These criteria are not all low level” (page 61), I have to disagree. It seems to me that these examples help students to, say, write a particular type of paper; it does not seem to me that they promote any actual learning goals like critical thinking, taking other people’s perspectives, etc. I would have hoped for some specifications like, “Use the speculative method for analyzing this text.”

Perhaps I am underestimating the power of simply doing the assignment properly (with respect to specs like page counts) in helping students learn—I definitely have no idea about how this would help students outside of mathematics learn. But within mathematics, I imagine that I would get a lot of proofs where the variables are properly defined, the proper symbols are used, and “therefore/thus/etc.” is used correctly, but where the student does not demonstrate much understanding of the ideas of the proof.

In short, I think that these specifications could be fine for, say, a humanities class (although I do not know enough about how to effectively teach a humanities course to be sure), but I have little confidence that they would be useful in a problem-solving class.

Now, Nilson did provide examples from Lawrence Leff’s and Steve Stevenson’s computer science classes. Here is a quote from page 113:

Leff uses a point system…He defines several “genres” of points in which each genre represents one of the education goals (content mastery or cognitive skills) or performance goals (amount of work)…In Leff’s area, one major performance goal is writing a minimal number of lines of code. So he defines a genre for each essential piece of content mastery or skill (e.g. bit-diddling and arrays) and another for lines of code. Each assessment is worth so many points toward meeting one or more educational goals and one or more performance goals, and he sets a minimum number of points in each genre that students must accumulate to earn a passing grade for the course. This minimum number ensures that all passing students have done an acceptable job on at least one assessment of every required educational and performance goal.
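As I read this quote, Leff’s passing check reduces to comparing accumulated points against a minimum in every genre. Here is a minimal sketch of that bookkeeping; the genre names and minimums are my own inventions for illustration, not Leff’s actual numbers:

```python
# Hypothetical bookkeeping for the genre system described in the quote above.
# Genre names and minimums are invented, not Leff's actual values.

def passes_course(earned, minima):
    """earned: points accumulated per genre; minima: required minimum per genre."""
    return all(earned.get(genre, 0) >= need for genre, need in minima.items())

# Invented example: two education-goal genres and one performance-goal genre.
minima = {"bit-diddling": 30, "arrays": 30, "lines of code": 50}
```

Falling short in even one genre fails the whole check, which is what makes this a bundle of minimums rather than a single weighted total.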

Here is my take (I will use the ‘education goals’ and ‘performance goals’ vocabulary for the next several paragraphs): if you allow for partial credit, this last bit is just a traditional grading situation within a specifications grading wrapper. You get some, but not all, of the benefits of specs grading, and you might get most of the drawbacks of traditional grading. Worse yet, this is essentially traditional grading on the part of the course that I am most interested in: the education goals.

If you do not allow for partial credit (which Leff doesn’t), then this system is isomorphic to accumulation grading. But I am not convinced that this is specifications grading, since I am not certain that actual specifications are provided. Leff does provide his students with templates for the C-level problems; B-level problems require some modification of the template; the A-level problems require independent reading (often of computer manuals) to complete, and I imagine they might deviate more from the template.

So perhaps the template is the best we can do for specifications grading for problem-solving courses. I am not sure if I like this, though, since one of my goals is usually for a student to evaluate which method to use. For example, a D-level goal I had for my calculus students was to identify problems that can be solved with an integral (they literally had to just say, “This problem can be solved with an integral” to get credit; actually, they just needed to write “D8”). I do not see how a template could cover this learning goal—the template would be doing all of the work for them!

Also, I am frankly less concerned with the performance goals and, in many cases, I think that the performance goals might actually work against the education goals. For instance, there are many cases where 20 good lines of code can completely replace 100 crappy lines of code. Having such performance goals could actually discourage students from trying to find the 20 good lines. Similarly with word counts/page number requirements: my take is that it is more difficult to write a good short paper than a good long paper, yet every spec that I list above required longer papers for the higher grades.

My purpose is not to question the writing and computer science instructors’ judgment here—they definitely know more about teaching writing and computer science than I ever will. Moreover, I could solve this by reversing the specs (e.g., requiring short proofs to get the A).

But my main point is this: when it comes down to it, I just don’t think that I care a lot about performance goals. I would rather just measure the educational goals. If a student can demonstrate my education goals in a three-page paper, I don’t want to give her a grade of “fail” because she did not meet the performance goals. Worse yet, I don’t want a more conscientious student to take an excellent three-page paper, realize it does not meet my specs, and then include two pages of fluff so that it does meet my specs.

One quick comment: I fully understand that, to meet the education goals, one must put in a certain number of reps. One takeaway that I have is that I might not be supporting my students enough in putting these reps in. I will definitely consider whether I should add performance goals to help encourage my students to get the reps in so that they can meet the education goals. But before I do this, I need to make completely sure that I am not going to be adding a bunch of busywork for many of my students.

Conclusion: My word count is already over 1700, so I have done enough for an A. So I am going to stop here and put my report on the “good” things about the textbook in a separate post.

Final questions:

1. Am I underestimating how much students can learn by just adhering to the mechanistic specs?
2. Does a template constitute a set of specifications?
3. How would one set up specifications for, say, a typical calculus assignment?

December 8, 2014

Thursday, Robert Talbert and Theron Hitchman discussed the book Specifications Grading: Restoring Rigor, Motivating Students, and Saving Faculty Time by Linda Nilson on Google Plus (go watch the video of the discussion right now!).

First, I would like to say that using Google Hangouts like this is not done enough. Robert and Theron wanted to discuss the book, but live in different states. Using Skype or Google Hangouts is the obvious solution, but not enough people make the conversation public, as Robert and Theron did. I learned a lot from it, and I hope that people start doing it more (including me). Additionally, I think that two people having a conversation is about the right number. I found it more compelling than when I have watched panel-type discussions of 4–6 people on Google Hangouts.

As some of you know, I have pompously started referring to my grading system as Accumulation Grading. When Robert first introduced me to Nilson’s book, I ordered it through Interlibrary Loan immediately. It has not arrived yet, so I probably should wait until I read it before I start comparing Specification Grading to Accumulation Grading.

But I am not going to wait. The people are interested in Specification Grading now, and so I am going to compare the two now. Just know that my knowledge of Specification Grading is based on 30 minutes of Googling and 52 minutes and 31 seconds of listening to two guys talk about it on the internet. I will read the book as soon as it arrives, but feel free to correct any misconceptions about Specification Grading that I have (there WILL be misconceptions).

Here is how to implement Specification Grading in a small, likely misconceived nutshell:

1. Create learning goals for the course.
2. Design assignments that give the students opportunities to demonstrate they have met the learning goals.
3. Create detailed “specifications” on what it means to adequately do an assignment. These specifications will be given to the students to help them create the assignment.
4. “Bundle” the assignments according to grade. That is, determine which assignments a B-level student should do, label them as such, and then communicate this to the students. This has the result that a student aiming for a B might entirely skip the A-level assignments.
5. Grade all assignments according to the specifications. If all of the specifications are met, then the student “passes” that particular assignment. If the student fails to meet at least one of the specifications, the student fails the assignment. There is no partial credit.
6. Give each student a number of “tokens” at the beginning of the semester that can be traded for second tries on any assignment. So if a student fails a particular assignment, the student can re-submit it for potentially full credit. You may give out extra tokens throughout the semester for students who “earn” them (according to your definition of “earn”).
7. Give the student the highest grade such that the student passed all of the assignments for that particular grade “bundle.”
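Steps 5 through 7 of the nutshell above amount to a small amount of bookkeeping, which can be sketched in code. This is my own hypothetical rendering; the class, the starting token count, and the bundle format are all invented:

```python
# Hypothetical sketch of pass/fail grading with tokens and grade bundles.
# All names and the default token count are invented for illustration.

class Student:
    def __init__(self, tokens=3):
        self.tokens = tokens
        self.passed = set()

    def submit(self, assignment, meets_all_specs):
        """All-or-nothing: every spec must be met, or the assignment fails."""
        if meets_all_specs:
            self.passed.add(assignment)
            return "PASS"
        return "FAIL"

    def retry(self, assignment, meets_all_specs):
        """A re-submission costs one token; no tokens means no more tries."""
        if self.tokens == 0:
            return "NO TOKENS"
        self.tokens -= 1
        return self.submit(assignment, meets_all_specs)

def final_grade(student, bundles):
    """bundles: [(grade, set of required assignments), ...], best grade first."""
    for grade, required in bundles:
        if required <= student.passed:
            return grade
    return "F"
```

Note that the token counter is the only state beyond pass/fail records, which is the bookkeeping I complain about elsewhere in these posts.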

Recall that Accumulation Grading essentially counts the number of times a student has successfully demonstrated that she has achieved a learning goal (students accumulate evidence that they are proficient at the learning goals). My sense is that Accumulation Grading is a type of Specifications Grading with two major differences: in Accumulation Grading, the specifications are at the learning-goal level rather than the assignment level, and the token system is replaced with a policy of giving students a lot of chances to reassess.
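To make the contrast concrete, here is a minimal sketch of Accumulation Grading as just described: count the successful demonstrations of each learning goal, then award the best grade whose per-goal thresholds are all met. The goal names and thresholds are invented for illustration:

```python
# Hypothetical sketch of Accumulation Grading: the unit of record is the
# learning goal, not the assignment. Goal names and thresholds are invented.

from collections import Counter

def accumulation_grade(demonstrations, bundles):
    """demonstrations: (goal, success) pairs, e.g. from quiz questions.
    bundles: [(grade, {goal: required successes}), ...], best grade first."""
    counts = Counter(goal for goal, success in demonstrations if success)
    for grade, requirements in bundles:
        if all(counts[goal] >= n for goal, n in requirements.items()):
            return grade
    return "F"
```

A failed quiz question costs nothing here; only accumulated successes matter, which is the “many chances to reassess” policy in code form.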

Let’s compare the two point-by-point (the Specification Grading ideas are in bold):

1. **Create learning goals for the course.**
This is exactly the same as in Accumulation Grading.

2. **Design assignments that give the students opportunities to demonstrate they have met the learning goals.**
This is exactly the same as in Accumulation Grading. In Accumulation Grading, this mostly takes the form of regular quizzes.

3. **Create detailed “specifications” on what it means to adequately do an assignment. These specifications will be given to the students to help them create the assignment.**
This is slightly different. In Accumulation Grading, the assignment does not matter except to give the student an opportunity to demonstrate a learning goal. So whereas Specifications Grading is focused on the assignments, Accumulation Grading is focused on the learning goals.

To compare: in Specifications Grading, students might be assigned to write a paper on the history of calculus. One specification might be that the paper has to be at least six pages long.

In Accumulation Grading, this would not matter: a four-page paper that legitimately meets some of the learning goals would get credit for those learning goals. If you wanted students to write a six-page paper, you would create a learning goal that says, “I can write a paper that is at least six pages long.”

4. **“Bundle” the assignments according to grade. That is, determine which assignments a B-level student should do, label them as such, and then communicate this to the students. This has the result that a student aiming for a B might entirely skip the A-level assignments.**

This technically happens in Accumulation Grading, as you can see at the end of my syllabus:

However, something else is going on, too: the learning goals are really the things that are “bundled,” as you can see in the list of learning goals below:

I love this flexibility. Every student (at least every student who wishes to pass) needs to know that a derivative tells you the slopes of tangent lines and/or instantaneous rates of change, but only students who wish to get an A need to figure out how to do $\delta-\epsilon$ proofs on quadratic functions.

5. **Grade all assignments according to the specifications. If all of the specifications are met, then the student “passes” that particular assignment. If the student fails to meet at least one of the specifications, the student fails the assignment. There is no partial credit.**

This is similar to Accumulation Grading, but not exactly the same. In both, there is no partial credit. The difference is that—since the main unit of Accumulation Grading is the learning goal, not the assignment—students will have multiple ‘assignments’ (really, quiz questions) that get at the same learning goal. Students can fail many of these ‘assignments’ as long as they demonstrate mastery of the learning goals eventually.

6. **Give each student a number of “tokens” at the beginning of the semester that can be traded for second tries on any assignment. So if a student fails a particular assignment, the student can re-submit it for potentially full credit. You may give out extra tokens throughout the semester for students who “earn” them (according to your definition of “earn”).**

There are no tokens in Accumulation Grading. Rather, students get many chances at demonstrating a particular learning goal.

7. **Give the student the highest grade such that the student passed all of the assignments for that particular grade “bundle.”**

This is exactly the same in both grading systems.
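The bundle rule can be sketched in code, too. Note that I am assuming higher grades require passing all lower bundles, which is how I run things; the bundle contents below are made up:

```python
# A sketch of step 7: award the highest grade whose bundle (and every
# lower bundle) is fully passed. Bundle contents are hypothetical.

BUNDLES = {  # grade -> assignments required at that level
    "C": {"hw1", "quiz1"},
    "B": {"hw2", "quiz2"},
    "A": {"proof1"},
}

def final_grade(passed: set[str]) -> str:
    grade = "F"
    for letter in ["C", "B", "A"]:      # check lowest bundle first
        if BUNDLES[letter] <= passed:   # every assignment in the bundle passed
            grade = letter
        else:
            break                       # higher grades require the lower bundles
    return grade
```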

So the fundamental difference seems to be that Accumulation Grading focuses on how well students do at the learning goals, while Specifications Grading focuses on how well students do on the assignments. As long as the assignments are very carefully constructed and specified, I don’t really see one as being “better” than the other. However, it seems more natural to focus on learning goals rather than assignments, as the assignments are really just proxies for the learning goals; I would rather focus on the real thing than the proxy.

Another major difference is that Specifications Grading uses a token system while Accumulation Grading automatically gives students many, many chances at demonstrating proficiency. One system’s advantage is the other’s disadvantage here:

• Accumulation Grading requires creating a lot of assignments (which have mostly been quiz questions for me), whereas Specifications Grading requires fewer assignments. Moreover, Accumulation Grading requires that a lot of time be spent on reassessment—either in class or out (this is probably a positive in terms of learning, but definitely a negative with respect to me having a lot of class time available for non-reassessment activities and getting home for dinner on time).
• Accumulation Grading ideally requires some time for students to learn each learning goal between when it is introduced and when the semester ends. This is because the student needs to demonstrate proficiency multiple times (usually four times) during the semester. So either the last learning goal must be taught well before the end of the semester, or the Accumulation Grading format must be tweaked for some subset of the learning goals (you could use a traditional grading system just for the learning goals at the end of the semester). I do not think that this is an issue for Specifications Grading. On the other hand, I do not think that Specifications Grading would give the same level of confidence in a student’s grade, as it does not necessarily require multiple demonstrations of each learning goal.
• I am concerned that the token system could hurt the professor-student relationship, whereas freely giving reassessments helps it. Specifically, I am concerned that it might seem overly arbitrary and harsh to deny a tokenless student a chance to reassess—I could see a student being frustrated with the professor toward the end of the term for not allowing a reassessment. On the other hand, the professor in Accumulation Grading is the hero, since she allows students as many chances as possible to reassess.

That last sentence is a half-truth, since there are limitations. For instance, I only allow reassessments in class now, so that immediately limits the number of possible reassessments (my life got really crazy when I allowed out-of-class reassessments). But that seems to me to be more reasonable than the token system, since class days are not arbitrarily set by the professor, but the tokens are.

The main thing working against Accumulation Grading is that one must figure out how to reassess in a reasonable way. I have been compressing my semester to fit more quizzes in at the end of the semester, and that has worked well for me. Other people may be fine doing reassessments outside of class.

Please correct me on where I am wrong on any detail of Specifications Grading. Right now, I am still leaning toward Accumulation Grading, although I hope that Specifications Grading blows me away—I am always looking for a better system, and I will gladly switch if I find it better.

### Quiz-Video Combination Instead of Lecture

November 14, 2014

Here is a reminder of how I have been organizing my classes: I create learning goals for the course, and spend roughly two-thirds of the semester teaching the content. The grading system is set up so that students have to demonstrate proficiency of each learning goal $n$ times, where $n \approx 4$. The last third of the semester is spent 50/50 on quizzes and review.

I have felt a tiny bit guilty about this format for two reasons. First, I was concerned that I was depriving the students of 1/3 of the traditional instruction time. Second, I felt like a slacker because I don’t usually have to prep much for classes in the last third of the semester (I am writing this post during one of their quizzes, and I am slightly uncomfortable that they are working so hard on the class while I am not).

But I don’t feel all that bad about things now, because I realized a couple of things.

First, taking quizzes is about as active as learning gets (and maybe there are Testing Effect-type benefits, especially since I purposefully spread out the learning goals on the quizzes). Students are very actively thinking about the material during the quizzes, so I am definitely giving them learning experiences, which goes a long way toward alleviating my first source of guilt.

Also, I spent a lot of time creating solutions for every quiz problem. These are posted right after the quizzes so that students can get immediate feedback. This makes me feel better about my current lack of prep time—especially since I am still spending a decent amount of time writing the quizzes.

This also makes me feel a bit better about my students’ learning experience in the last third of the semester. One of the ways I compress the material down to two-thirds of the semester is that I go lighter on the number of examples I give in the first part of the semester. However, my students probably have at least as many examples from the videos by this point in the semester as they would have gotten under a more usual course structure, and they have the added benefit of having had to attempt the problem first before viewing the solution (I am thinking about trying to make this the norm as much as possible. Ideally, things would go: try a problem on your own, try the problem with your team, see me do the problem, then try a similar problem on your own. That is a different blog post, though).

Finally, my overall impression is that the course is going well. I think that students are learning, and they are probably learning more than previous times I have taught the course.

So how much am I simply rationalizing here, and how much of my reasoning is sound?

### Assessment Idea for Calculus I: Near Final Draft

August 18, 2014

Sorry about the two month hiatus—Dana Ernst sucked me into a great research project about games with finite groups.

I previously wrote about my plan for calculus I. Basically, it is this:

1. I give the students a list of learning goals. These are much finer than I have done in the past, which means that there are many more of them.
2. I give students quizzes in class.
3. For each quiz question, the student solves the problem as best as she can.
4. Here is the important part: after solving the problem, the student reviews her work and determines which learning goals she has met.
5. She indicates exactly where she met each learning goal. If she does not claim a learning goal, she does not get credit for the learning goal.

This basic idea has not changed; I have decided to go for this to see how it works. I have made a couple of changes since last time, though:

1. I changed my learning goals (see below for a list).
2. I am only requiring that they demonstrate mastery of each learning goal four times, rather than the six that I previously had. There just is not enough time to assess that much, considering that I try to give my students at least twice as many attempts as is required. I am able to cut from six to four by scaling down homework: I previously required at least three demonstrations on a quiz and up to three demonstrations on homework, but I have changed this to requiring at least three demonstrations on a quiz and up to one demonstration on homework.
3. I changed my quiz template to include a margin on the left side. This is where students will write their code for each achieved learning goal. They then need to circle exactly where the learning goal is met, and connect that circle to the code. This should make the quizzes easier to grade and easier to read (less messy). I think that I am not going to require that this be done in a different colored pen, either.
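If it helps, the counting rule in item 2 can be sketched as follows (the names and structure are hypothetical; the rule itself—four demonstrations, at most one from homework—is the one described above):

```python
# Sketch of the demonstration-counting rule: each learning goal needs
# four demonstrations, and homework can supply at most one of them
# (so at least three must come from quizzes). Names are made up.

REQUIRED = 4
HOMEWORK_CAP = 1

def goal_met(quiz_demos: int, homework_demos: int) -> bool:
    counted = quiz_demos + min(homework_demos, HOMEWORK_CAP)
    return counted >= REQUIRED
```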

I think that is mainly it. I have included drafts of my learning goals and syllabus (sorry for being three weeks late on this, Robert) below. Please see my previous post to get an idea of what students will do with their quizzes.

As always: feedback is welcome.

View this document on Scribd
View this document on Scribd

### Assessment Idea for Calculus I: Feedback desperately wanted!

June 25, 2014

I am planning an overhaul of Calculus I for the fall. I used a combination of Peer Instruction and student presentations in Fall 2012, and I was not completely happy with it.

So I am starting from scratch. I am following the backwards design approach, and I feel like I am close to being done with my list of goals for the students. Here is my draft of learning goals, sorted by the letter grades they are associated with:

View this document on Scribd

I previously had lists of “topics” (essentially “Problem Types”). These lists had 10–20 items, and tended to be broad (e.g. “Limits,” “Symbolic derivatives,” “Finding and classifying extrema”). This list will give me (and, I hope, the students) more detailed feedback on what they know.

This differs from how I did things in the past, in that I used to list “learning goals” as very broad topics (so they weren’t learning goals at all, but rather “topics” or “types of problem”). Students would then need to demonstrate their ability to do these goals on label-less quizzes.

The process would be this:

1. A student does a homework problem or quiz problem.
2. The student then “tags” every instance of where she provided evidence of a learning goal.
3. The student hands in the problem.
4. The grader grades it in the following way: the grader scans for the tags. If a tag corresponds to correct, relevant work AND if the tag points to the specific relevant part of the solution, the student gets credit for demonstrating that she understands that learning goal. Otherwise, no.
5. Repeat for each tag.
6. Students need to demonstrate understanding/mastery/whatever for every learning goal $n$ times throughout the semester.
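Steps 4–6 amount to a simple tally, something like this sketch (the tag representation is hypothetical—I am just modeling a tag as a learning goal plus the two yes/no judgments the grader makes):

```python
from collections import Counter

# A rough sketch of steps 4-6: a tag is credited only if the work is
# correct and relevant AND the tag points to the specific part of the
# solution; credited tags accumulate per learning goal, and a goal is
# done once it has n credited demonstrations.

N = 4  # demonstrations required per learning goal (n here is assumed to be 4)

def tally(tags, prior_counts=None):
    """tags: list of (goal, correct_and_relevant, points_to_work)."""
    counts = Counter(prior_counts or {})
    for goal, correct_and_relevant, points_to_work in tags:
        if correct_and_relevant and points_to_work:
            counts[goal] += 1   # credit only correct, specifically-tagged work
    return counts

def goal_done(counts, goal):
    return counts[goal] >= N
```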

Below are three examples of how this might be done on a quiz. The first example is work by an exemplary student: the student would get credit for every tag here (In all three of the examples, the blue ink represents the student work and the red ink indicates the tag).

View this document on Scribd

The second example has the same work and the same tags, but the student would not get credit due to lack of specificity; the student should have pointed out exactly where each learning goal was demonstrated.

View this document on Scribd

The third example (like the first) was tagged correctly. However, there are mistakes and omissions. In the third example, the student failed to claim credit for the “FToCI” and the “Sum/Difference Rule for Integrals.” Because of this, the student would not get credit for these two goals (even though the student did them; the point is to get students reflecting on what they did).

Additionally, the student incorrectly took the “antiderivative of the polynomial,” which caused the entire solution to the “problem of motion” to be wrong. Again, the student would not get credit for these two goals.

However, the student does correctly indicate that she knows “when to use an integral,” could apply the “Constant Multiple Rule for integrals,” and “wrote in complete sentences.” The student would get credit for these three.

View this document on Scribd

I like this method over my previous method because (1) I can have finer grained standards and (2) students will not only “do,” but also reflect on what they did. I do not like this method because it is more cumbersome than other grading schemes.

My current idea (after talking a lot to my wife and Robert Campbell, and then stealing an idea from David Clark) is to require that each student show that he/she can do each learning goal six times, but up to three of them can be done on homework (so at least three have to be done on quizzes). I usually have not assigned any homework, save for the practice that students need to do to do well on the quizzes. This is a change in policy that (1) frees up some class time, (2) helps train the students on how to think about what the learning goals mean, (3) forces some extra review of the material, (4) provides an additional opportunity to collaborate with other students, and (5) provides an opportunity for students to practice quiz-type problems.

My basic idea is that I will ask harder questions on the homework, but grade them more leniently (which implies that I will ask easier questions on the quizzes, but grade them more strictly).

I have been relying solely on quizzes for the past several years, so grading homework will be something that I haven’t done for a while. I initially planned on only allowing quizzes for this system, too, but it seemed like things would be overwhelming for everyone: we would likely have daily quizzes (rather than maybe twice per week); I would likely not give class time to “tag” quizzes, so students would do this at home (creating a logistical nightmare); and I would probably have to spend a lot more time coaching students on how to tag (whereas they now get to practice it on the homework with other people).

Let’s end this post, Rundquist-style, with some starters for you.

1. This is an awesome idea because …
2. This is a terrible idea because …
3. This is a good idea, but not worth the effort because …
4. This is not workable as it is, but it would be if you changed …
5. Homework is a terrible idea because …
6. You are missing this learning goal …
7. My name is TJ, and you are missing this process goal …

### The Importance of Feedback

May 22, 2014

My semester has ended, and now is the time to write some post-mortem entries into this weblog. The first idea is something that is probably obvious, but I over-thought it. I have been putting more of the course’s assessment at the end of the semester lately, thinking that that is when students are most prepared to do well.

And I was correct, but I took it too far. I did not give my students enough regular feedback during the first part of the semester this spring. My education students actually pointed this out to me—I realized that they were correct as soon as they said it (it also reinforced that they are pretty on top of education issues). Fortunately, I get to teach that course for education majors again this fall; I will make things right this time.

Additionally, I am working on ways of getting students immediate feedback. Clickers are one way of doing this, but I also might have students start grading their own quizzes (I would provide a couple of solution keys and a marker for them) and doing more computer-graded stuff.

### Students Figure Out Which Standards They Meet

April 16, 2014

I am starting to think about planning for Calculus I for next year, and there is an idea I would like to try: I want to stop labelling problems according to the corresponding standard, and put the burden on the students to determine which standards they met. I have tried this before (as have other people), but I would implement it differently from how I did it last time.

So each quiz would go like this: I give them several (unlabelled) quiz problems. The students do what they can. When they are done, they submit their work. However, when they submit, we make some sort of a copy (perhaps a paper copy, perhaps just a picture taken with a smartphone), and then the student takes one copy home.

At home, the student tries to figure out which standards she met on the quiz. For each standard, she writes up an argument as to why she met that standard. Specificity is key—the student would need to explicitly say where and how she met the standard. She submits this at the next class period, and this is graded as I usually do.

Here are some potential advantages:
1. Students have to reflect on their work in order to get credit. This could lead to higher quality writing.
2. Students would have to take ownership of their learning. They need to be aware of the standards they are missing, and make a concerted attempt to learn them well enough to be able to apply them on a quiz (including recognizing where it makes sense to apply them).
3. Students can solve problems any way they like. As long as they can solve the problem using a standard, it counts. For instance, a linear algebra student might get “eigenvalue” and “determinant” credit for finding the eigenvalues of a matrix.
4. Students are forced to really think about what the standards are and mean. There could be metacognitive benefits.
5. I can ask more synthesis questions on quizzes; I do not need to isolate ideas for each question.
6. Students no longer get the hint that the label provides (if the quiz question is labelled as corresponding to the “Tangent line” standard, then the student has a pretty good idea that he should find a tangent line at some point).
7. It might give me room to have more standards (and more specific standards of the “I can do this” variety, rather than standards that are really topics, as in “Tangent lines”; David Clark encouraged me to make this transition last weekend).

Here are some potential problems:

1. If the problems are too synthesis-y, then students won’t be able to do very many on each quiz. This might be fine, but it would be bad for a student who gets stuck and does not know where to start (on the other hand, maybe it would help teach students to start with something?).
2. Students may try to shoehorn standards where they do not belong. This is what I would do if I were missing a small subset of standards.
3. I am not certain I can write quiz problems that will give everyone the opportunities they need at the end of the semester. Students need different things, so I would have to have a lot of questions (note: this actually doesn’t need to be any different than how it is now; I can just provide straightforward, say, “Tangent lines” problems to quizzes if I need to. So this actually isn’t much of a problem).
4. It forces students to be aware of what they have not yet demonstrated; this might be asking too much of some first-years.

I am on the fence about this, although I would really like to try it. Perhaps I could do both: keep the old way (with the labels) and do the new way. I could make that work.

What am I missing? What other advantages, disadvantages, and difficulties would this have?