Is the marshmallow test still valid?
LMU economist Fabian Kosse has re-assessed the results of a replication study which questioned the interpretation of a classic experiment in developmental psychology. The new analysis reaffirms the conclusions of the original study.
Very few experiments in psychology have had such a broad impact as the marshmallow test developed by Walter Mischel at Stanford University in the 1960s. The test appeared to show that the degree to which young children are capable of exercising self-control is significantly correlated with their subsequent level of educational achievement and professional success. In the test, each child is given a treat – the eponymous marshmallow – and told that if she leaves it on the table until the experimenter returns, she will receive a second marshmallow as a reward. The ability to delay gratification of the desire to enjoy the treat serves as a measure of the child’s level of self-control.
By its very nature, Mischel’s test is a prospective experiment, and he followed his experimental subjects over several decades. The results showed that the longer his 4- and 5-year-olds were able to resist the temptation presented by the first marshmallow, the better they performed in subsequent tests of educational attainment. The Mischel experiment has since become an established tool in the developmental psychologist’s repertoire.
In 2018, the results of a new study designed to replicate Mischel’s experiment appeared in the journal Psychological Science. The report produced quite a stir in the media, as its conclusions appeared to be in conflict with those reached by Mischel.
Now a team led by Fabian Kosse, Professor of Applied Economics at LMU, has reassessed the data on which this interpretation is based, and the new analysis contradicts the authors’ conclusions. “The replication study essentially confirms the outcome of the original study. In fact it demonstrates that the marshmallow test retains its predictive power when the statistical sample is more diverse and, unlike the original work, includes children of parents who do not have university degrees. In our view, the interpretation of the new data overshoots the mark. The result actually points in the same direction as the study by Mischel and colleagues, but the effect itself is somewhat less pronounced.”
In collaboration with professors Armin Falk and Pia Pinger at the University of Bonn, Kosse has now reanalyzed the data reported in the replication study. In doing so, the team noticed two potentially significant methodological discrepancies between the experimental designs. In the Mischel experiment, the period during which the children could decide to eat the marshmallow was 15 minutes long. In the 2018 study, the duration of ‘temptation’ was shortened to 7 minutes. “Of course, whether one has to wait for 7 or for 15 minutes makes a big difference to a 4-year-old. Indeed, our statistical analysis suggests that this difference alone accounts for one-third of the difference in outcomes between the Mischel experiment and the replication study,” says Kosse.
The second criticism of the methodology relates to the choice of variables which the authors of the replication study used in their attempts to control for exogenous factors that could have distorted the relationship between self-control and subsequent educational attainment. “Children who waited for longer before eating their marshmallows differ in numerous respects from those who consumed the treat immediately. This makes it very difficult to decide which traits are causally linked to later educational success. In their efforts to isolate the effect of self-control, the authors of the replication study conducted an analysis which suffers from what is known as ‘the bad control problem’. They tried to account for so many effects that it becomes impossible to interpret what these effects are telling us about the real relation between early self-control and later success.” Falk, Kosse and Pinger have now performed a similar analysis. Crucially, however, they controlled only for confounding factors that could be clearly interpreted as such. Their re-examination of the data suggests that the replication study actually reveals a relatively strong correlation between readiness to delay gratification and subsequent scholastic success.
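The general idea behind a ‘bad control’ can be illustrated with a small simulation. The sketch below is not the analysis performed by Falk, Kosse and Pinger and uses no data from either study. It assumes a purely hypothetical world in which self-control raises later attainment partly through an intermediate trait (here called `mediator`), with all effect sizes invented for illustration. Controlling for a variable that is itself a consequence of self-control removes part of the very effect one is trying to measure.

```python
import random


def centered(values):
    """Subtract the mean so that regressions implicitly include an intercept."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def slope_simple(y, x):
    """OLS slope of y on x alone."""
    xc, yc = centered(x), centered(y)
    return dot(xc, yc) / dot(xc, xc)


def slope_with_control(y, x, control):
    """OLS coefficient on x when `control` is added as a second regressor."""
    xc, cc, yc = centered(x), centered(control), centered(y)
    s_xx, s_cc, s_xc = dot(xc, xc), dot(cc, cc), dot(xc, cc)
    s_xy, s_cy = dot(xc, yc), dot(cc, yc)
    return (s_cc * s_xy - s_xc * s_cy) / (s_xx * s_cc - s_xc ** 2)


def simulate(n=20000, seed=0):
    """Hypothetical data: all coefficients below are invented assumptions."""
    rng = random.Random(seed)
    self_control = [rng.gauss(0, 1) for _ in range(n)]
    # The mediator is caused by self-control, so it is an outcome, not a confounder.
    mediator = [0.8 * s + rng.gauss(0, 1) for s in self_control]
    # True total effect of self-control on attainment: 0.1 + 0.8 * 0.5 = 0.5
    attainment = [
        0.1 * s + 0.5 * m + rng.gauss(0, 1)
        for s, m in zip(self_control, mediator)
    ]
    return self_control, mediator, attainment


self_control, mediator, attainment = simulate()
total = slope_simple(attainment, self_control)
controlled = slope_with_control(attainment, self_control, mediator)

print(f"without the bad control: {total:.2f}")
print(f"with the bad control:    {controlled:.2f}")
```

In this invented setup the first estimate lands close to the true total effect of 0.5, while the second falls to roughly the direct effect of 0.1, even though nothing about the underlying relationship has changed. A genuine confounder, by contrast, is a factor that influences both self-control and attainment, and controlling for it is appropriate.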
The results obtained by Fabian Kosse and his colleagues appear in the journal Psychological Science. “The new study provides an exemplary demonstration of how science should work. Everyone who deals with the marshmallow test in the future must take both the replication study and our commentary upon it into consideration, and can form her own opinion in relation to their implications,” says Kosse. “The team that performed the replication study, which was led by Tyler Watts, has made an important contribution by providing new data for discussion, which will allow other groups to analyze the predictive power of the marshmallow test on the basis of a large and highly diverse sample of individuals. In our view, the new data confirm that personality differences that emerge very early in life are important indicators of later professional success. The children who succeed in delaying gratification in the experiment do significantly better in a test of educational attainment administered 10 years later than do those subjects who gobbled up the marshmallow immediately. Now we need to explore what determines whether children are capable of postponing gratification or not.”
Psychological Science 2019