Grading is Better Than Ranking

I have written elsewhere about what I see as the profound weaknesses of the various systems of university ranking that have become so popular in recent years (see N1, below), and reading recent negative accounts of the employee ranking system at Microsoft (N2, N3, below) reminded me that individuals are also ranked by many organizations. For that matter, so too are countries, as indicated by a recent report of comparative ranks amongst Southeast Asian education systems (N4). 

I want to argue that all these ranking systems are misguided and frequently generate negative consequences. If we must evaluate on a comparative basis (as sometimes I think we must), then grading of the kind I will describe is a much better system than ranking.

A. Let me start with my preferred system of grading. It is based on my experience and practice of teaching students, but I see no reason why we can’t extend it to grading universities, employees or national education systems.

-1. First, establish clear and public criteria for each grade (i.e. to get an ‘A’, such-and-such must be achieved; to get an ‘F’ requires such-and-such, etc.).

-2. Next, grade your students (or universities, employees, education systems) accordingly. Do NOT mark on a curve: if everyone is brilliant, everyone gets an ‘A’; if no one has a clue, everyone gets an ‘F’. Double-check the results of those who are on the borderline between two grades.

-3. Discuss with all students (universities, employees, etc.) individually their strengths and weaknesses. What should they be doing to get a higher grade? How can they become stronger?

-4. Never be surprised if students who retake an exam the next trimester or year do much better than they did the first time. The capacity for individual improvement is enormous.

-5. Do NOT reify the grades. You are not everlastingly defined by last year’s ‘D’ in Mathematics. It indicated a weakness that you may be able to remedy through hard work and good tuition. Your goal should be a higher grade next year. As the psychologist Alfred Binet observed when he first developed the system of intelligence testing, even weaker children can improve: their scores are not fixed (N5).
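
Steps 1 and 2 can be sketched in a few lines of Python. This is only an illustration of grading against fixed public criteria rather than on a curve. The thresholds, the borderline margin and all the names in it are hypothetical, chosen for the example rather than taken from any real marking scheme.

```python
from typing import Dict, List, Tuple

# Hypothetical public criteria: the minimum score needed for each grade,
# listed from the highest grade down. Anything below the last minimum is an 'F'.
CRITERIA: List[Tuple[str, float]] = [("A", 80.0), ("B", 65.0), ("C", 50.0), ("D", 40.0)]

# Hypothetical margin: scores this close to a grade boundary get a second look.
BORDERLINE_MARGIN = 2.0


def grade(score: float) -> str:
    """Return the grade earned by a score, judged only against the fixed criteria."""
    for letter, minimum in CRITERIA:
        if score >= minimum:
            return letter
    return "F"


def is_borderline(score: float) -> bool:
    """True if the score lies within the margin of any grade boundary."""
    return any(abs(score - minimum) <= BORDERLINE_MARGIN for _, minimum in CRITERIA)


def grade_all(scores: Dict[str, float]) -> Dict[str, str]:
    """Grade everyone independently; nobody's grade depends on anyone else's score."""
    return {name: grade(score) for name, score in scores.items()}


if __name__ == "__main__":
    strong_class = {"Jones": 90.0, "Green": 85.0, "Smith": 81.0}
    print(grade_all(strong_class))  # everyone can earn an 'A'
    print([name for name, s in strong_class.items() if is_borderline(s)])
```

Because each score is compared only with the criteria, a class in which everyone is brilliant earns all ‘A’s, and the borderline check flags the scripts that deserve a second reading.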

B. Ranking systems. It is comparatively easy to put individual students, universities, employees, education systems, national human rights records or whatever into grading categories (A, B, C, D and F if you want, or OK, brilliant and ‘needs improvement’ if you prefer), but it is normally extremely difficult to rank accurately unless the ranking system is very clear-cut and simple. For example, it is fairly easy to rank competitors in a running competition (Jones came in first, Green was second, etc.), but to say that Cambridge University is better than Oxford, or that Harvard is better than Yale, seems to me rather silly, though that doesn’t stop international ranking organizations from doing it. I am sure that all four of these universities are amongst the best in the world (I will give them all ‘As’ in my system 🙂 ), and I suspect that for particular subjects one may be better than another, but given that they are all enormous and complex educational structures of high quality, I doubt that much is to be gained by ranking them. If I have understood Nassim Nicholas Taleb correctly (Black Swan, Fooled by Randomness), he reminds us that at the extremes of any distribution (the high and low ends of a distribution curve) quite tiny differences can seem to make an enormous difference. We end up attributing what I have termed a ‘false facticity’ to minor or unimportant details.
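
A small worked example may make the point about tiny differences concrete. The sketch below uses invented scores for four hypothetical institutions (P, Q, R and S) and an assumed ‘A’ threshold of 80. None of the figures come from a real ranking.

```python
from typing import Dict


def ranks(scores: Dict[str, float]) -> Dict[str, int]:
    """Rank names by score, highest first (1 = top)."""
    ordered = sorted(scores, key=lambda name: scores[name], reverse=True)
    return {name: position for position, name in enumerate(ordered, start=1)}


def grade(score: float, a_threshold: float = 80.0) -> str:
    """Hypothetical single criterion: at or above the threshold earns an 'A'."""
    return "A" if score >= a_threshold else "below A"


# Invented composite scores for two consecutive years.
year_1 = {"P": 91.2, "Q": 91.0, "R": 90.9, "S": 90.7}
year_2 = {"P": 90.8, "Q": 91.0, "R": 91.1, "S": 91.3}

if __name__ == "__main__":
    print(ranks(year_1))
    print(ranks(year_2))
    print({name: grade(score) for name, score in year_2.items()})
```

In this invented data P's score moves by less than half a point, yet P falls from first place to last, while all four institutions earn an ‘A’ in both years. The rank reports a dramatic change where the grade reports, more plausibly, none.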

As I note in my paper on University Ranking (N1), other problems of ranking systems include the biases introduced by the initial ranking criteria chosen, the implausible changes that occur in a particular university’s rank from year to year, the unreliability of some of the data used and the very concept of a single ranking score.

C. Negative consequences of ranking. As I have also noted in my Ranking paper, ranking systems encourage an obsession with short-term and narrowly focused ‘production targets’ rather than wider and more general concerns with quality and excellence. Following the dictates of a prevailing ‘audit culture’, metrics of evaluation are sought or invented. As Einstein is alleged to have said, ‘Not everything that can be counted counts, and not everything that counts can be counted’. Modern academics, for example, are increasingly evaluated on their publication output and a supposedly objective metric for the work’s ‘impact factor’. Apart from encouraging academics to write articles rather than books (to the horror of most people in the Humanities), this procedure downgrades the value of good teaching and long-term hard-to-measure importance. Many great thinkers of the past (Descartes, Locke and Adam Smith spring to mind) had really poor publication records, of course, and would probably find it difficult to get promotion at a modern university.

Again, I note that ranking systems also encourage a vast and time-consuming bureaucratic apparatus of data collection, as with the now infamous British ‘Research Assessment Exercise’. They also encourage universities to ‘game the system’ (I give several examples in my paper, and recently came across another from Britain where lecturers were trying to tutor their students to give answers that would raise their institution’s national prominence [N6]).

Ranking systems also lead to false comparisons and often quite facile and bitter competitions for prestige (‘ranking envy’). University presidents, for example, may fret that their institution has been downgraded from 71 to 81 in some ranking system, and rather than question the validity of the two scores, or realize that their university hasn’t actually changed that much over the past year, embark on some quick-fix solution to improve their metrics. Or they may be forced from their job for having failed to maintain their university’s rank. The same evidently happens between individuals subject to a divisively competitive annual ranking process as at Microsoft (N2, N3). Such competitions are more likely to demotivate than to improve quality.

The competition for ranking prestige can also lead to a false conception of quality. Each ranking organization privileges a particular set of metrics and implicitly devalues others. Is a remedial language centre, for example, any less valuable than a prestigious research centre? In terms of university ranking, research is obviously more important than remedial language, but in an era of internationalization, with ever larger numbers of students migrating to other countries for their education, adequate language ability becomes ever more important for a vast number of students. As with the ancient Greek concept of Areté (see my blogpost), there are a variety of excellences, and all are worth striving for. The ‘excellence’ of a research centre is not the same as that of a remedial language centre, and they cannot be measured in the same way, but both are important.

D. What is to be done?

-1. Critics of ranking systems face the difficulty that the ranking systems are enormously popular and are unlikely to end any time soon. Merely pointing out the weaknesses and deleterious consequences of ranking systems is unlikely to persuade participating organizations to end their involvement in the competition for prestige. What is needed is a practical alternative, which I think a simple grading system such as I have outlined above provides.

-2. If ranking systems were abandoned or given less attention and importance, then universities and other organizations could devote more time to asking what they really mean by ‘quality’ or ‘excellence’. The answers that they got might not always be easily measured, but they might be more valid than the present metrics.









