“Life is not a quiz show”
One question, several alternative answers: Multiple-choice tests are widely used assessment tools. LMU psychologist Markus Bühner tells us why, and points out their limitations.
Schools and universities, quiz shows, and even language tests for refugees all make use of the multiple-choice format to test a candidate’s knowledge. Why is this form of assessment so popular?
Markus Bühner: Because it is economical. Every candidate’s performance can be measured with the same template, and it can now be done automatically by a scanner. With other forms of test, such as the question-and-answer format, the examiner may have to decipher the handwriting, and it is not always easy to decide whether the response is right or wrong. Multiple-choice tests have the great advantage of being, in large measure, objective.
Is the format more suitable for some subjects than others?
That is difficult to say. It all depends on the purpose of the exam concerned. The crucial question is always the same: What do my students need to know? Multiple-choice tests basically involve the recognition of knowledge. One can pose more complex problems in order to test comprehension, analysis and application of knowledge, but that still does not tell me whether candidates can reproduce what they know – in their own words.
Medical students refer to it as their ‘make-or-break’ exam: After their fourth term, they are confronted with 320 multiple-choice questions, and students of pharmacy have to answer no fewer than 360. Are marathon exams like these really informative?
The number and the types of questions are always determined by what the course was intended to teach, what the students are expected to have learned. If that cannot be done by means of a single exam, then the subject-matter should be divided up.
How long should such a test take?
I would say that the maximum period anyone can be expected to concentrate is 2 or 3 hours – maybe less, because answering multiple-choice questions is a very monotonous task. A multiple-choice test should not be a test of speed, of quick thinking. The problems set should be soluble within the time allowed. If only completed papers are marked, and 50% of the candidates fail to finish, then something is wrong, unless learning success in the subject is defined as work performed divided by the time taken. Otherwise, the result should not depend on how long individuals can keep their wits about them.
A multiple-choice exam is not meant to be a test of concentration. Performance should depend solely on one’s level of knowledge. Everything that inadvertently complicates the exercise means that other elements are being measured.
What else could be tested in this way?
One popular variant is to provide four or five propositions and ask the candidate to choose the correct combination. So A, B, C and D may be right, but not E, or D, E and F, but not B. That is a test of working memory.
What other features can make things difficult for candidates?
The defining feature of multiple-choice tests is the set of alternative answers, the so-called distractors. Even candidates who know the right answer may literally be distracted, and so led astray by the format. Then there are those who don’t know the answer, but can work it out.
So intelligence triumphs over factual knowledge?
Intelligence, for example, yes. But, of course, tests often contain unintentional clues. Sometimes the preceding questions may point toward the correct response, or some of the alternatives may not be grammatically correct responses to the question posed – that shouldn’t happen, but it does. Mistakes are often made in the placement of the right answers. Studies have shown that the authors seldom put the right answer first or last in the list. If, as a candidate, you know that, you already have an advantage.
Are there other features that lead candidates astray?
Negatively posed questions can easily be misunderstood. Little words like ‘one’/‘none’ or ‘true’/‘untrue’ are often misconstrued. And long answers or answers that differ markedly in length can unwittingly make tests more difficult.
What criteria should a good multiple-choice test meet?
In my opinion, formulating all the items in a grammatically correct and unambiguous manner is the greatest challenge. The alternative responses should make sense and themselves be instructive. In addition, they should include alternatives – or, rather, putative options – that would have serious repercussions if true. These provide an extra boost to the learning effect. The tests should also have something to do with real life. There are indications in the literature that sets of three alternative responses are a good idea, but that would also make guessing easier. Moreover, it’s not just a matter of the number of distractors. It also depends on how plausible they seem. At all events, the questions should be framed in such a way that they can be answered unambiguously, and care must be taken to ensure that everyone understands them to mean the same thing. That’s why it is important for the test developers to show the questions and distractors to others, to check that everyone interprets them in the same way. Otherwise, the choice between alternative interpretations, and not the level of the candidate’s knowledge, may determine the outcome of the test.
Does it make sense to prepare for such a test by using multiple-choice questions as a guide?
It’s always a good idea to familiarize oneself with the formal design of an exam beforehand, and the best way is to do so under realistic conditions. In other words, try to tackle the same number of items in the allotted time, in order to get a feel for what’s ahead.
What effect does all this have on the ability to reason scientifically, and might it have negative consequences in one’s later career?
That depends on how you prepare for an exam. If you only have to place crosses in boxes, one may indeed ask how much of what you know will stick in the mind. Of course, that is the problem with all forms of assessment. Nevertheless, learning as recognition of what you’ve already encountered is not the best possible kind of learning. Whether that reservation is sufficient reason for rejecting the use of multiple-choice tests is something their designers must decide. Whether or not the advantage of economy outweighs the disadvantages in terms of knowledge acquisition depends on the context, such as the number of candidates to be tested. In subjects where thousands of students must be tested, there is probably no alternative. After all, who has the time to mark thousands of exam papers? It also depends on the content. Does the exam assess theoretical knowledge that will sink into the background later on, or knowledge that is really important or might even be life-saving? Life is not a quiz show. What are the chances of encountering, in everyday life, someone who asks: A, B, C or D – which would you prefer? Perhaps there are careers in which such situations repeatedly turn up. Having to choose from a computer-generated list of possible actions is perhaps the closest analog to a multiple-choice exam. But in situations that require the ability to recall and organize knowledge at will, the multiple-choice paradigm is not very helpful.
The design of multiple-choice tests itself sounds very complicated.
That is indeed the case. I have therefore decided not to use the format any longer, and to pose open-ended questions instead. But then, I have – at most – only 250 candidates to deal with. The effort required to construct a multiple-choice test is just as great as that needed for the correction of answers to direct questions. Apart from that, I don’t want my students simply to reproduce things they have already come across, because I don’t think that reflects reality – at least not in my own discipline.
Prof. Dr. Markus Bühner holds the Chair of Psychological Methodology and Assessment at LMU.
Interview: Nicola Holzapfel