I gave this talk at Research Ed 2015 on 5 September, the latest in a series of three national conferences organized by Tom Bennett.
Research Ed has grown into a vital event in the annual calendar for teachers interested in the theory of teaching. Nevertheless, my impression is that the centre of gravity of many of the talks at ResearchEd has drifted away from an agenda that promotes sound, quantitative research, towards a softer account of both the role and the methodology of research, as is suggested by the language of “action research” and “research-informed” teaching.
I believe that the problems with research are systemic and not just the result of incompetence. I argue in this piece that these systemic problems can be solved (and can only be solved) by seeing teaching as a business which has a larger technical element than we commonly admit, and one that is less dependent on personal intuition (or what some call “tacit knowledge”). Such a realignment of our views on what teaching is, and on how research into teaching should be conducted, will also underpin a radical reevaluation of the role of technology in the classroom.
A lot of education research really isn’t very good.
That was what Rob Coe said in his keynote at ResearchEd 2013. And he went on to say:
I don’t know how you improve the quality of a whole discipline by pulling it up from the bootstraps. How do you do that? Any ideas would be very welcome.
In this talk, I’m going to give you my ideas. I’m going to argue that the only way to pull up education research by the bootstraps is by the appropriate use of education technology.
I am going to start by making the case for quantitative research in education because if we don’t agree that it matters, we won’t be bothered to improve it.
Second, I will list the problems with research in education because unless we agree on the problems, it’s unlikely that we will agree on the solutions.
Third, I will describe what I mean by edtech. We are unlikely to agree that it is the solution to anything unless we agree what it is.
And finally, I will match edtech solutions to research problems.
So why do I need to defend the principle of quantitative research at the ResearchEd conference? Maybe because Dylan Wiliam gave a keynote last year called
Why teaching will never be a research-based profession and why that’s a good thing.
Maybe because it is so often repeated that teaching might be research-informed but not research-based. And because this phrase is so clearly meant to downplay the importance of research.
Why do people make this distinction and why do I think they are wrong? I am going to cover seven arguments.
Michael Barber of Pearson says that it is because it is people who make the decisions in education, not the research itself.
Well of course. And that is true of every profession. But it doesn’t mean that people can’t justify their decisions by research, which is what we mean by the expression research-based.
Michael Barber and Dylan Wiliam both make the point that you can’t research what hasn’t been done yet, so innovation always requires an element of imagination and faith. True again. True for every type of technology and every profession. But when you make that leap of imagination and faith, you do so on the basis of existing empirical evidence. You want to have some reason to think that your leap might be successful. And after the leap has been taken, it becomes empirically testable. And in teaching, as in any profession, innovation occurs at the margins and does not represent the bread and butter of daily practice, which needs to be justified by the evidence of effectiveness that inevitably comes from repeated use.
Dylan Wiliam’s keynote last year listed a whole series of reasons why research in education is difficult: he considered
• the number of variables that have to be isolated;
• the difficulty of randomizing samples given that the population is clustered in classrooms and schools;
• how data could be distorted by cherry-picking,
• by over-claiming for conclusions based on under-powered surveys,
• by over-generalising from very particular circumstances,
• or by aggregating incompatible surveys into bogus meta-analyses.
It was a fascinating talk, well worth listening to again online. But just because something might be difficult or just because something might be done badly is no reason why it cannot be done well. The content of Dylan’s keynote did not support the case that he asserted in his title.
People constantly downplay quantitative research by disparaging its fundamental method: correlation. They repeat the popular aphorism “correlation does not imply causation”. Popular but untrue.
What people mean to say is that a correlation between A and B does not imply that A causes B. But it does imply either that A causes B, or that B causes A, or that a third variable, C, causes both A and B. A correlation always implies some sort of causation – you have just got to be careful to establish what sort. And the way that you can do that is by triangulating further correlations. If you read David Hume’s Enquiry Concerning Human Understanding, you will realise, not only that correlation does imply causation, but that correlation is ultimately the only evidence we ever have for causation.
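The point about confounding variables can be made concrete with a small simulation (a sketch in Python, using invented data): a hidden common cause C produces a strong correlation between A and B, and partialling out C, one way of triangulating further correlations, makes that correlation vanish, revealing that neither A nor B causes the other.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10_000

# C is a hidden common cause; A and B are each driven by C plus noise.
# Neither A nor B has any direct causal influence on the other.
C = rng.normal(size=n)
A = C + 0.5 * rng.normal(size=n)
B = C + 0.5 * rng.normal(size=n)

def corr(x, y):
    return np.corrcoef(x, y)[0, 1]

print(corr(A, B))  # strong correlation, despite no direct causal link

# Triangulate: correlate what is left of A and B after a least-squares
# fit on C has removed C's influence from each.
def residuals(y, x):
    slope = np.cov(x, y)[0, 1] / np.var(x)
    return y - slope * x

print(corr(residuals(A, C), residuals(B, C)))  # near zero
```

The correlation is real and it does indicate causation, just causation of the third kind: C causes both A and B.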
If you are convinced by websites which purport to show all sorts of spurious correlations, such as between US spending on technology and deaths by strangulation, then you don’t know what a correlation is. You don’t establish a correlation between two single data points. You don’t establish a correlation between two trend lines. You don’t establish a correlation between datasets that have been doctored or selected to make them fit. You establish a correlation between large, random datasets which have not been vetted or pre-selected.
The objection against the significance of correlation, made so commonly by educationalists, is based on a fundamental misunderstanding of the nature of scientific knowledge.
Many people say that the question that is asked by quantitative research—what works?— is not a valid question because education is about values and values are subjective. If I say that I want to go to Edinburgh, who are you to say that I should really want to go to Cardiff instead?
But given that I do want to go to Edinburgh, you would be justified in telling me that I had better head north up the M1, rather than south down the M3. “What works?” is a question that is appropriate to ask of our means but not of our ends—and means only exist relative to defined ends.
Many people argue that we can’t define our aims in education because they are a personal, value-laden matter for different teachers to decide. I disagree. It is not the job of any service provider to determine the purpose of the service they provide. I go to the doctor because I want to be healthy, not because the doctor tells me I ought to be; I employ a plumber because I want a new bathroom, not because the plumber tells me I ought to; and I employ a teacher – or the state employs a teacher on my behalf – because I or the state or society more generally has a clear idea of how it wants young people to be educated. And in the light of those objectives, “what works?” is the key question that teachers – like any other service provider – have to answer.
Many people argue that the act of measurement is reductive, that the most important outcomes in education can’t be quantified—and if you can’t quantify, then you can’t do quantitative research. Again, I don’t agree. We can easily express any sort of capability—even what are called soft skills like teamwork or creativity—on a five point scale, A to E. Nothing fancy there. The difficulty lies not in the act of quantification but in agreeing exactly what we mean by teamwork or creativity, in describing precisely the standard of performance that we expect, and in applying our A to E scale consistently. These are not problems with quantification: they belong to the previous point about how we describe our learning objectives. Our inability to do this is certainly a problem: it is something that we find difficult—and something I will be coming back to—but it is not something that can’t be done.
The last reason why people distrust quantitative research in education is in my view the only one that holds any weight. It is that the job of the teacher cannot be described precisely—and that means that it cannot be researched precisely—because it relies on personal intuition.
At the ITTE conference in July, I spoke after Miles Berry and we both used exactly the same slide as each other, showing a quote from Diana Laurillard. And we both asked the same question: what does Diana mean when she says that teaching is a design science? Miles said she meant that teaching was a craft. I said she meant that teaching was a technology. And the difference, I think, between our two answers is that “craft” is based on private knowledge while “technology” is explicit and replicable.
But if you believe that teaching is based on private knowledge and that what matters is the personal experience of the teacher, then you must face up to the very serious implications of that position.
Dylan Wiliam quotes research showing that the best teachers teach in six months what the worst teachers teach in two years. If you transferred that inconsistency of performance to the National Health Service, you would end up with excess death rates of about 200%. The excess death rate at Mid Staffordshire, which caused a national scandal, was a trivial 25%. Because the relationship between capability and output follows the law of diminishing returns…
…such highly inconsistent levels of performance imply that the mean capability of teachers is low.
Dylan Wiliam also shows that the rate of improvement of teachers over a twenty-five year career, based on personal experience, is very small indeed.
The time taken to teach a year’s worth of learning is reduced on average by 2 weeks – that’s a 4% improvement—over 25 years, which is tiny compared to the huge disparity of performance that we face.
The problem has become worse as we have tried to scale up our education service.
Ever since Socrates, we have known how to do education really well for a few. It has never been hard to set up exemplary, model schools. The problem is always scaling out. We face chronic shortages of well-qualified teachers in many key subjects, and the good teachers we have too often suffer excessive stress and workload, caused by having to give personal attention and feedback to large numbers of students. Performance suffers, government responds with inspection and bureaucratic controls, and these only make things worse.
The underlying problem was spotted by Kim Taylor, Director of the Nuffield Institute, who wrote this in 1970:
Schools are like the very earliest factories: simple materials, walls, workers and overseers. The tools of the trade, the machinery and equipment, are rudimentary. There is not much to counterbalance the skill, or lack of skill, of the individual teacher. Teaching is a job almost wholly dependent on manpower, and in the foreseeable future…the craftsmen we need are going to be in scant supply.
If you think that personal experience and private intuition is what really counts in teaching, then you must accept wildly inconsistent performance, poor levels of mean capability, the inability to improve that capability, and the inability to scale provision. It is a doctrine with catastrophic consequences for the long-term effectiveness of our education service.
But is it true?
I think it is true in part. Human relationships are obviously very important in the classroom. If for no other reason than that we learn, and we are motivated to learn, by imitation and role modelling. It is also only the human teacher that is able to interpret and manage the contingencies of the classroom, dominated as it is by those unpredictable human relationships.
But human relationships are not enough and motivation is not enough. We need to teach as well as to inspire and motivate. And even when you unpack what it takes to inspire and motivate, you will find that research suggests that the most important factors are:
• a sense of purpose
• a sense of autonomy
• a sense of increasing mastery.
The first and last of those suggest that one of the best ways of motivating children is to teach them really well. And to teach consistently at scale, we need systematic instruction grounded in justified theory.
We cannot measure, evaluate and replicate inspiring relationships – but we can measure, evaluate and replicate different forms of systematic instruction that are targeted at well-defined educational objectives.
There are aspects of teaching that can be called craft and there are aspects that can be called technology – and as technology is the application of science and science is based on empirical research, this part of our practice needs not only to be informed by research, it needs to be justified by research.
And if we are to improve the overall quality and consistency of our teaching, it is vital that we do just that.
Having made the case for quantitative research at some length, I will briefly list some of the problems that it faces.
1. It is widely distrusted both by teachers and—bizarrely enough—by researchers themselves.
2. We have not defined our educational objectives clearly (and without clear objectives, we cannot assess the effectiveness of our means). This is partly because people do not believe it is possible, partly because we have just not worked out how to do it. I don’t believe that there is anything wrong with “teaching to the test” if it is a good test – but this phrase encapsulates a justified concern that the objectives that are encoded into our present assessments, particularly in formulaic mark-schemes, are reductive. We need to describe all our educational objectives, including the softer skills. The abolition of levels recognizes that the criterion referencing on which we have been depending for nearly 30 years was ineffective because everyone interpreted the criteria differently.
3. There is too much variability in the way that a particular pedagogy, a particular instructional technique, is applied to allow for consistent analysis of its effect. This is because different teachers implement the same pedagogy in different ways.
4. We do not have data in sufficient quantity or quality to construct effective RCTs. Educational research is data-hungry, for reasons that Dylan Wiliam explained last year, and schools are data-poor.
5. Educational research is subject to strong observation biases, such as the Hawthorne Effect, leading to large numbers of false positives.
6. The academic community has poor quality control, illustrated by what the Tooley Report called its game of “academic Chinese whispers”. If something was said by someone, somewhere—that is very commonly taken to be evidence that it must be true. The Education Endowment Foundation, which was meant to be the answer to poor quality research, has recently produced some very poor quality research of its own—illustrating nicely the systemic nature of the problem and the difficulty, as Rob Coe put it, of bootstrapping a whole discipline.
7. Finally, it is difficult to disseminate and apply the results, even of well-founded research such as Assessment for Learning.
My third task is to define “education technology”, which is a widely misunderstood term.
The primary definition of “technology” given by the Oxford English Dictionary is:
the application of scientific knowledge for practical purposes.
Technology is about the means by which we achieve our ends: means that are based on science or in other words research.
Science and technology are two sides of the same coin: one is about the inference of abstract principles from empirical observations of the real world; the other is about the application of those principles back in the real world.
I have already argued that quantitative research into the question “what works?” depends on having clearly defined objectives.
Technology also depends on having clearly defined objectives. Technology looks very different, depending on what our objectives are: military technology is not the same as horticulture; education technology should look quite different from telecoms or generic web-browsing.
But it doesn’t.
The edtech community commonly uses the term “technology” as if it referred to a generic commodity: iPads, mobile phones, whiteboards, digital “stuff”.
People talk about “teaching with technology” in the same way that they might ask you whether you wanted more peanut butter on your toast.
I will use “education technology” to refer to the evidence-based methods by which we teach—in other words, to pedagogy…
…and in particular to the ways in which we use digital technology as a pedagogical tool to support those forms of systematic instruction.
Because people use “edtech” to talk about stuff, without reference to purpose, it is not surprising that they go on to confuse the very different purposes for which you might want to bring digital hardware into school.
When the Education Technology Action Group—ETAG— was asked last year how technology could be used to improve education, it concluded that
the use of digital technology in education is not optional. Competence with digital technology to find information, to create, to critique and share knowledge, is an essential contemporary skill set.
They answered the wrong question. They were asked a question about the use of technology, about how we should teach; and they gave an answer which focused on what we should teach.
For me, “edtech” should always be used to refer to the use of digital technology as a pedagogical tool, and terms like “computing” or “digital skills” should be used to refer to the teaching of technology.
Not surprisingly, the existing approach to edtech, which was widely implemented and generously funded during the Becta years, has been completely ineffective.
Study after study has shown minimal benefits: the ICT Testbed project, which ran from 2006 to 2008, gave an average of £1 million to spend on digital technology to each of 30 schools and found that there was no measurable impact at Key Stages 3 or 4 and only a very small impact at Key Stage 2—and even this can be explained by a combination of the Hawthorne effect and regression to the mean.
A study published by the LSE in May of this year, based on the exam results of 130,000 students at 90 schools, shows that allowing students to bring their own devices (a policy which was strongly recommended by the ETAG report) actually reduces attainment for all but the strongest students.
A review of the evidence conducted in 2012 for the Education Endowment Foundation concludes that
the correlational and experimental evidence does not offer a convincing case for the general impact of digital technology on learning outcomes.
But the ETAG report dismisses this overwhelming body of evidence with what I find extraordinary chutzpah:
“Evidence”, it says, “is a problematic concept in education”.
The advocates of technology in education appear to have renounced the scientific principles by which technology is defined. People who do not concern themselves with evidence are not technologists but techno-zealots, promoting an ideology which has been used to subvert our educational objectives rather than meeting them more effectively.
This is hugely damaging —because it undermines the credibility of technology, which has transformed almost every sector of our professional lives except education. Only in education is technology seen as a preoccupation of cranky enthusiasts. Education desperately needs better technology – both in terms of general approach and in terms of the specific tools of the trade that we need if we are ever to deliver systematic instruction consistently and at scale.
Information technologists often talk about a technology stack.
At the bottom are the generic, infrastructural elements –hardware, operating systems, networks— at the top are the application-specific elements, and in IT, applications mean software.
People in education often say that it’s not about the technology, it’s about how it is used.
This suggests that you would be just as well tightening a nut with a screwdriver as with a spanner, or banging in a nail with a pair of pliers as with a hammer. It suggests that our advanced standard of living means that we are incomparably more skilled than, let’s say, the Romans were. But in fact it is not the average level of skill that has changed, it is the technology—and the technology matters. The Allies won the second world war not because they were braver or more skilled than their opponents—but ultimately because they had the atomic bomb and their opponents did not.
This is another popular edtech aphorism which is complete nonsense. The tools matter, the tools must be developed to serve our particular purposes; and in education…
…these tools, these top layers of the technology stack, have never been developed.
The reason why edtech has been so ineffective is that we don’t have any. As Diana Laurillard put it in 2011,
What education has done has been to appropriate everyone else’s technologies.
…technologies developed for different purposes or more generic purposes to our own.
And that is why, nearly fifty years after Kim Taylor wrote for the Nuffield Institute, teachers still do not have the tools of the trade that they need to teach consistently and at scale.
So what will these tools of the trade look like?
Computer technology has three characteristics that suit it to the educational requirement.
First it is interactive, and we learn by doing and by receiving feedback.
In the early years, we interact with our physical environment, but as our learning becomes more abstract, we become more dependent on interaction with teachers. Students are dependent on teachers for feedback, not, as is so often implied by people who advocate independent learning, for information. And it is managing this feedback cycle which is so labour intensive and difficult to scale.
There is therefore enormous potential to create software that supports or mediates different sorts of learning activity, either providing feedback directly or supporting the provision of feedback from teacher or peers.
Some software platforms will support generic activities that are useful across the curriculum:
• flash-cards for learning simple facts,
• robo-markers for short answer questions,
• card games requiring the sorting or prioritising of different points;
• tools for textual analysis;
• structured debates or wikis to support useful groupwork.
Other sorts of activity will be subject-specific:
• timeline editors in history,
• laboratory simulations in chemistry,
• intelligent language laboratories,
• or equation editors in maths.
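To make the first item on the generic list concrete, here is a minimal sketch, in Python, of a flash-card platform built around a simple Leitner-style spaced-repetition scheme, in which correct answers promote a card to a box that is reviewed less often and wrong answers demote it. The class names, box counts and review intervals are all illustrative, not a specification of any real product.

```python
from dataclasses import dataclass, field

@dataclass
class Card:
    prompt: str
    answer: str
    box: int = 1  # boxes 1..3; a higher box means a longer review interval

@dataclass
class Deck:
    cards: list = field(default_factory=list)

    def due(self, session: int):
        # Box 1 is reviewed every session, box 2 every 2nd, box 3 every 4th.
        interval = {1: 1, 2: 2, 3: 4}
        return [c for c in self.cards if session % interval[c.box] == 0]

    def record(self, card: Card, correct: bool):
        # The feedback loop: promote on success, demote to box 1 on failure.
        card.box = min(card.box + 1, 3) if correct else 1

deck = Deck([Card("7 x 8", "56"), Card("9 x 6", "54")])
card = deck.due(session=1)[0]
deck.record(card, correct=True)  # promoted to box 2, so reviewed less often
```

Even a toy like this shows the two things the software must own: the immediate feedback to the student and the scheduling decision that no teacher could feasibly track by hand for thirty students at once.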
You may say that these tools already exist. I don’t have the time to do a detailed analysis of current software—but the great majority of it suffers from at least some of the following problems:
• much lacks good user interfaces that make it quick and easy to use in the classroom;
• much lacks good feedback, which is critical but technically challenging to provide: most of the time it is either of poor quality or simply doesn’t exist;
• much offers shallow interaction – often the interactive element boils down to multiple choice questions, while all the development effort has gone into decoration or irrelevant gameplay;
• much cannot be used flexibly by teachers, but locks the student into a scheme of work devised by the software developer, preventing it from being used in different contexts or blended with classroom teaching.
A recent survey sponsored by the Gates Foundation into the use of digital courseware in HE, based on nearly 3,000 responses, found widespread dissatisfaction with the quality of courseware.
Of those who had used digital courseware, the great majority would not recommend its use to other people.
It’s not that the technology doesn’t matter—quite the reverse. It’s not that teachers are technophobic. The problem is that the general quality of education-specific technology is poor.
The second benefit of computers is that they are excellent at managing complex logistical problems—as the sort of systematic instruction that I am advocating requires.
We need to control the sequencing of learning activities, monitor the success with which students complete those activities, and manage the feedback that the students receive.
This requires good quality classroom management software that can interface automatically with all sorts of different third-party activity software, so that many different activities can be controlled by the same management platform.
The crucial lever for change will be the right interoperability standards that will allow any conformant management platform to interface to any conformant activity platform. There have been various attempts to produce such standards over the last twenty years: BESA’s Open Integrated Learning System standard in the 1990s; the US Department of Defense’s SCORM standard in 2000, and their more recent Experience API; IMS Global Learning Consortium’s Learning Tools Interoperability specification. None has quite worked—but again, that is no reason why proper interoperability standards cannot be achieved, if government and industry work together to make it happen.
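To illustrate the kind of record such a standard defines, here is a minimal activity statement in the style of the Experience API mentioned above: an actor, a verb, an object and a result. The identifiers, names and scores are invented for illustration. The point of the standard is that any conformant activity platform could emit records of this shape, and any conformant management platform could consume them without bespoke integration work.

```python
import json

# A minimal xAPI-style "actor / verb / object / result" statement.
# All URLs, names and scores below are invented placeholders.
statement = {
    "actor": {"mbox": "mailto:student@example.org", "name": "A Student"},
    "verb": {
        "id": "http://adlnet.gov/expapi/verbs/completed",
        "display": {"en-GB": "completed"},
    },
    "object": {
        "id": "http://example.org/activities/fractions-quiz-3",
        "definition": {"name": {"en-GB": "Fractions quiz 3"}},
    },
    "result": {"score": {"scaled": 0.85}, "success": True},
}

# Serialise for transport between activity and management platforms.
print(json.dumps(statement, indent=2))
```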
Once we achieve this interoperability, we will find that we can express different pedagogies—the underlying technological approach to education—in the way that we select and sequence different learning activities.
The third benefit that computers can offer to education is data analytics, shown here as part of the management process loop but also providing the basis for research and monitoring processes. Analytics will sniff out correlations automatically, showing what activities—and what sequences of activities—work best in different situations.
But it is important to appreciate that you can’t do analytics without data—large quantities of semantically meaningful data—and at the moment, schools are data-poor. The Achilles’ heel of any information system is data input—and the only sustainable solution to this problem is nearly always some sort of automatic capture. The supremacy of the supermarkets has been based on digitally driven logistics systems—which have in turn depended for data capture on the bar-code reader. In education, Learning Analytics systems will only be viable when they can access good quality data that is generated automatically by learning activity software and automatically captured by learning management software.
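As a toy illustration of what an analytics layer might do with automatically captured records (all the data here is invented), consider the crudest possible first step: aggregating outcomes by activity type and ranking them, before any of the controlled comparison that serious research would then require.

```python
from collections import defaultdict
from statistics import mean

# Records captured automatically by activity software:
# (student, activity type, outcome on a 0..1 scale). Invented data.
records = [
    ("s1", "worked-example", 0.90), ("s1", "quiz", 0.80),
    ("s2", "worked-example", 0.70), ("s2", "quiz", 0.75),
    ("s3", "video", 0.40),          ("s3", "quiz", 0.50),
    ("s4", "video", 0.45),
]

# Group outcomes by activity type and rank by mean outcome.
by_activity = defaultdict(list)
for _, activity, outcome in records:
    by_activity[activity].append(outcome)

ranking = sorted(by_activity, key=lambda a: mean(by_activity[a]), reverse=True)
print({a: round(mean(by_activity[a]), 2) for a in ranking})
```

A real system would of course need to control for student intake, sequence effects and cluster structure; the sketch only shows why the raw material has to be captured automatically, record by record, rather than entered by hand.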
The education technology that we need but do not yet have will very largely consist of software—software that can be thought of in three layers.
The bottom instructional layer will be made up of software that supports different learning activities.
This will plug into the second management layer, that will assign, sequence and track the outcomes of those learning activities.
And the third is the analytics layer, that will make sense of all that data, helping teachers to provide good feedback and make good sequencing decisions, helping researchers build a proper theoretical base for teaching practice; and helping government monitor the education service for which it is ultimately responsible, with a light touch, in a way that does not constantly impinge on the professional autonomy of teachers.
My final task is to suggest how the sort of edtech that I have described will revolutionize education research, which appears to have such a low status, and which suffers from so many problems. And I will do that by cross referencing the benefits that such edtech will bring with the problems that I have already listed:
• that research is mistrusted;
• that we lack clearly described objectives;
• that pedagogies – the subject of research – are inconsistently applied;
• that we lack sufficient quality or quantity of data;
• that research suffers from observation bias;
• that the outcomes of research are not properly contested;
• and that the conclusions of good research are hard to disseminate.
Learning activity software will encapsulate different pedagogical approaches in replicable form, helping to ensure that pedagogies are consistently and faithfully applied, rather than being customised and interpreted by every teacher that uses them. This addresses problem 3, which was that if a pedagogy is not consistently applied, it is almost impossible to research its effects. It also addresses problem 7, by allowing the conclusions of good research to be more easily disseminated, not in the form of academic papers but in the form of useable tools. I carry in my pocket the ability to apply huge quantities of research that I have never read.
Computer-mediated activity will automatically harvest outcome data in the quantities required to overcome the clustering effect and to isolate the many different variables that are present in classrooms. This will address the prevalence of underpowered research studies (problem 4).
It will also allow student performance to be analysed without either student or teacher being constantly reminded that they are the subject of research, helping to reduce observation biases like the Hawthorne Effect (problem 5).
The constant monitoring of outcomes is essential to the implementation of adaptive pedagogies – and doing this systematically will help improve the dissemination of research findings, which will rarely offer one-size-fits-all solutions (problem 7).
The market will help discriminate between products that work and products that don’t in an environment in which products stand proxy for theories. The market will help ensure that theories that don’t work fail fast. Supported by a newly invigorated professional press as well as user reviews, what is at present so often left uncontested in academia will be very effectively contested in the marketplace.
Analytics systems will automate the early stages of research, giving real-time feedback to teachers and school management on what is working. This will enable teachers to optimize their programmes of study through rapid iteration of application and testing, further circumventing many of the problems in the professional research community (problem 6). By producing robust research findings that are centred on the problems of the classroom, analytics systems will restore teachers’ confidence in the principle of quantitative research (problem 1).
There is one problem that remains unaddressed—our difficulty in describing clearly our educational objectives. Perhaps this is the most fundamental problem of all. If technology is the means by which we address our purposes, it is a serious challenge to my account of teaching as a technology if we cannot describe clearly what our purposes are.
I agree with what Daisy Christodoulou has been saying on her blog, that we need to describe our learning objectives by reference to exemplars of student performance, rather than by abstract descriptors, which everyone interprets differently. But there are also difficulties in this approach. You must not provide just a few, authoritative exemplars, because students will then tick the box by copying the exemplar, just as they ticked the box by memorizing the rubric of the descriptor.
We need a great multiplicity of exemplars—and this brings a further problem: how do you ensure that your exemplars are consistent, one with another?
The answer is again by analytics. By correlating both student performances and teacher evaluations, analytics systems can help show the internal consistency of a set of exemplars. And by creating internally consistent sets of exemplars, these systems can help communicate not just mechanical educational objectives, like learning vocabulary or performing simple algebra, but fuzzier objectives like creativity and teamwork, which we all value as the really important outcomes of education.
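A minimal sketch of the idea, with invented gradings: if several teachers grade the same exemplars on the A to E scale (coded here as 5 down to 1), then a simple measure of spread across graders flags the exemplars on which they disagree, and which therefore sit inconsistently within the set and need moderation. A real analytics system would use richer correlational methods, but the principle is the same.

```python
import statistics

# Hypothetical gradings of four "teamwork" exemplars by four teachers,
# on the A..E scale coded as 5..1. All figures invented.
grades = {
    "exemplar-1": [5, 5, 4, 5],
    "exemplar-2": [4, 4, 4, 5],
    "exemplar-3": [2, 5, 1, 4],  # teachers disagree sharply
    "exemplar-4": [3, 3, 2, 3],
}

# Flag any exemplar whose grades spread more widely than a chosen
# (illustrative) disagreement threshold.
flagged = [
    name for name, gs in grades.items()
    if statistics.stdev(gs) > 1.0
]
print(flagged)
```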
Do not imagine that the sort of systematic instruction that I am advocating is all focusing on what is easily measurable. Better technology will help us measure those things that we find difficult to measure at the moment, and so resolve so much of the frustration that teachers feel about reductive mark schemes and teaching to the test.
Teachers often talk with great passion about their professional status. But there are two major problems with that claim to professionalism. First, the performance—not “outcomes”, which are affected by intake of course, but “performance”—the performance of teachers is wildly inconsistent. Second, teachers do not have a defensible theoretical basis for their practice. Teachers can be a profession or a collection of craftsmen – they cannot be both.
The sort of technological approach to teaching for which I have argued does not deny the importance of the human dimension to teaching, just as the many technological aspects of broadcasting do not deny the importance of the “talent” on which the success of the schedules depend. But I have argued that inspiration, personality and motivation are not enough. We also need to teach. Teaching is essentially a technical business, and digital technology has a huge role in providing teachers with what Kim Taylor called the “tools of the trade”, that are needed to manage that technical business consistently and at scale.
Such tools of the trade will not replace or deprofessionalise teachers — they will do the very opposite. The professionalism of doctors is not undermined but is rather enhanced by their access to modern medicines and well-equipped operating theatres. They achieve better outcomes. The quality of their working environment and interest of their work improves. And it will only be the effective deployment of technology in education that will bootstrap the quality of research and provide the defensible, evidence-based body of theory, without which practice will continue to be based on guesswork and without which the claim to professionalism will continue to ring hollow.