Why the views of our leading educationalists on the curriculum don’t add up
This is an expanded version of the talk that I gave at ResearchEd on 9 September 2017. In it I argue that Tim Oates, Dylan Wiliam and Daisy Christodoulou, all educationalists whom I admire, have nevertheless got much wrong in their account of the curriculum. 14,000 words. You can bookmark individual slides by right-clicking on the “SLIDE X” caption and selecting “Copy link address”. Slides can be enlarged by clicking on the slide.
SLIDE 1

In this talk, I am going to challenge some of the key orthodox positions that you will hear widely when you go to conferences such as this. In particular, I am going to challenge some of the positions taken by Tim Oates, Dylan Wiliam and Daisy Christodoulou. These are key thought-leaders in our discourse about education at the moment. Although there is much in what they have been saying that I agree with, I believe that the key conclusions that emerge from their arguments are flawed.
SLIDE 2

I am going to start by discussing the context in which we find ourselves.
SLIDE 3

Nearly twenty years ago, Ofsted published the Tooley Report, a survey of all the papers published in one year in four of the leading educational journals. The report’s findings were one of the key sources cited by Tom Bennett as a reason for setting up the ResearchEd conferences.
Tooley concluded that almost none of the academic research on education was of any use to the classroom teacher. Only 10% of the papers sampled asked how we could improve teaching and learning, only 15% were based on robust quantitative data, and only 36% were free from major methodological flaws. If you ask what proportion of education research is about improving teaching and learning AND based on robust data AND free from major methodological flaws, and if you treat these as three independent variables, then the answer is about ½%. And it got worse. Tooley reported that studies were almost never repeated or contested: researchers tended to work “in a vacuum, unnoticed and unheeded” by anyone else. Views became accepted by a process of “academic Chinese whispers” – a sort of distorted rumour mill – and not through robust debate.
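Treating the three proportions as independent filters (an assumption, but it is the one on which the estimate rests), the arithmetic is:

```latex
P(\text{useful}) \approx 0.10 \times 0.15 \times 0.36 \approx 0.0054 \approx \tfrac{1}{2}\%
```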
It may be that the number of quantitative studies into teaching and learning has increased. We have seen a lot of talk recently about effect sizes, based on the work of people like John Hattie and the Education Endowment Foundation – but I shall be arguing today that the significance of the quantitative data that we have is still uncertain and that the quality of the debate has not changed very much.
SLIDE 4

Another part of the context in which we find ourselves is that the performance of our education service is extremely inconsistent. Research by Eric Hanushek, often quoted by Dylan Wiliam, suggests that the best teachers teach in 6 months what the worst teachers take 2 years to teach. That suggests a massive level of inconsistency and under-performance in the system, of a magnitude that would never be tolerated in business or in the health system. Such conclusions have been corroborated recently by the mediocre position of the UK and US in international league tables.
SLIDE 5

And a third element of the context that I want to start with is what I am calling the current orthodoxy with regard to pedagogy, expressed here by Dylan Wiliam in a recent keynote at the Bryanston Festival of Education. The two things that Professor Wiliam thinks we need to do in order to improve the quality of our education are to focus on our curriculum and on the amount of formative assessment.
The findings of the Tooley report were bad. The inconsistency of our provision is bad. But these two prescriptions sound plausible at the very least. Stated at this high level, the argument is one I agree with. But what these high-level statements actually mean, and why the curriculum matters, is the subject of this talk.
SLIDE 6

Before I talk about the curriculum, I want to talk about formative assessment, also known as responsive teaching or assessment for learning.
SLIDE 7

This is Professor Rob Coe talking on the subject of Assessment for Learning at the first ResearchEd conference in 2013, saying that in spite of all the research and all the money spent on promoting assessment for learning, it has had, and I quote, “no impact at all” on educational outcomes. So we cannot continue to advocate assessment for learning without also acknowledging that for 10 or 15 years, assessment for learning has been promoted and funded to no effect. We cannot continue to take the argument for formative assessment seriously unless we also seek to explain this failure of implementation. So in the next section, I will do just that.
SLIDE 8

Although it is not entirely clear what formative assessment refers to (an issue that I will address in my next post), a fairly safe starting point is to accept Wikipedia when it says: “Feedback is the central function of formative assessment”. If you are going to give feedback to students, you have first to stimulate them to do something. You have to ask them a question, set them a problem, or give them a creative opportunity. In other words, you have to design a learning activity. And this is not always easy.
SLIDE 9

Tim Oates reports that when the Expert Panel for the National Curriculum Review consulted teachers in 2011, there was widespread opposition to the idea of more practice in the Maths curriculum because most teachers assumed that practice was all about dull repetition. But it isn’t. In Singapore, Oates suggests, the practice provided in textbooks is challenging and interesting. The reason why UK Maths teachers oppose the idea of practice is that they are themselves so bad at designing interesting exercises. And without interesting practice, without activity, there is no possibility of giving students useful feedback or assessing what they need to do next.
SLIDE 10

The second problem with assessment for learning is that assessing students accurately is difficult.
SLIDE 11

The 2015 Commission for Assessment without Levels, on which Daisy Christodoulou sat, recommended that summative and formative assessment should be treated quite separately. And in making this argument, the report makes an assumption that accuracy matters for summative assessment but not so much for formative assessment. It says, for example, that standardized tests – by which it means summative exams – “can offer very accurate and reliable information”. In fact standardization does nothing to improve accuracy; it merely ensures the comparability of results. And in fact, our summative exams are not accurate at all: it is commonly estimated that between one fifth and one quarter of our SATs level allocations are wrong.
The way you improve reliability (which concerns the consistency of results and is an important prerequisite of accuracy) is by repetition. Any political pollster will tell you that you will not get a very accurate poll by asking the opinion of only one person – you need to aggregate the responses of a large number of respondents. Our summative examinations are inherently unreliable because they depend on single-sample snapshots of performance.
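To put the polling analogy in standard statistical terms (this is the textbook result for independent measurements, not anything specific to school exams): if each assessment of a student is an independent measurement with standard deviation σ, the error of the mean of n such measurements shrinks as

```latex
\mathrm{SE}(\bar{x}) = \frac{\sigma}{\sqrt{n}}
```

Aggregating four independent assessments halves the error of a single snapshot; aggregating a hundred cuts it by a factor of ten. This is why pollsters aggregate, and why a single-sample snapshot is inherently noisy.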
When it comes to formative assessment, the Commission makes no mention whatsoever of the need for accuracy, while at the same time claiming that results need not be recorded. If you do not record results, you cannot aggregate or corroborate results – you are again committing yourself to a highly unreliable, single-shot form of assessment. The Commission says that the only thing that matters about formative assessment is that its results are acted on. This is an absurd statement. If the results are inaccurate, then the action will be wrong. Surely they aren’t saying that any action will do? Surely they want teachers to modify their teaching in ways that will improve learning and not reduce it? But if that is the case, then the accuracy of formative assessment matters.
The Commission, and the profession more widely, is wrong to assume that accuracy matters more for summative assessment than for formative. If my summative assessment is wrong, I will walk away with the wrong bit of paper at the end of my course. But if my formative assessment is systematically wrong, then the quality of my teaching and learning itself will be damaged. Inaccurate summative assessment creates paper errors; inaccurate formative assessment does real damage to people’s real education. The only reason why people are able to maintain that this is not so important is that the damage is invisible. It is not recorded. It is, in the language of the intelligence agencies, deniable.
The reason that the Commission for Assessment without Levels seeks to separate formative and summative assessment is that it perceives that what is commonly called the “validity” of summative assessment – whether it attempts to assess the right things – is weakened by its attempt to compensate for its inherent lack of reliability. It does this by becoming increasingly formulaic and reductive, and the Commission does not want formative assessment to be contaminated by poorly conceived summative assessment practice.
But this is a weak argument. First, it is unrealistic. Do we really expect teachers not to teach to the test? Are they to tell their students, “I don’t care if you fail your exams – I will teach you something much more important”?
Second, if summative exams are formulaic and unreliable, then this is in itself a problem. Shouldn’t we be trying to solve the root problem with summative assessment, rather than merely to contain it?
SLIDE 12

Tim Oates recognises the problem with summative assessment. Cambridge Assessment “could make GCSEs incredibly reliable”, he says (implicitly admitting that they are not very reliable at the moment) – but only by making them “long and incredibly expensive”. This argument acknowledges the importance of the repetition and corroboration that I mentioned a moment ago. Oates assumes that it would not be realistic to make exams 40 hours long. But he would be wrong if we derived summative results from formative assessments, ignoring the recommendation of Wiliam, Christodoulou and the Commission for Assessment without Levels that formative data should not be recorded. If we created really good practice activities and constantly monitored student performance on those activities, we would improve the learning of students, we would improve the opportunities for formative assessment, and we would improve the validity and reliability of our summative assessment, all at the same time. 40 hours of practice is not too long, because practice is the very stuff of a good education. Teachers do not have the expertise or resources to produce and manage such extensive programmes of formative assessment – but that is no reason why specialist providers could not do it, if we gave them the opportunity.
So the second difficulty with formative assessment is that it is difficult to achieve accuracy in our assessments and that, contrary to the general perception of educationalists, we need our assessments to be accurate if our interventions are going to be effective.
SLIDE 13

A third difficulty with formative assessment is that it implies personalisation, which is difficult to manage. The need for personalisation is regularly denied by the proponents of Assessment for Learning, such as Dylan Wiliam, who suggest that all that is required is a couple of oral questions to a class, allowing the teacher to derive a general impression of whether the class has “got it” or not. But actions are personal, misconceptions are personal, and it follows that feedback and progression need to be personal as well.
SLIDE 14

One of the key recommendations of the Commission for Assessment without Levels was for Mastery Learning. It did not explain what it meant by “Mastery Learning”. But whatever it was, it said that “high quality research” had found that such an approach produced “consistent and positive impacts”. It supported that statement with references to an Encyclopedia by Guskey and a meta-analysis by Kulik.
SLIDE 15

Mastery learning was an idea publicised from the 1960s, principally by Benjamin Bloom (of “Bloom’s taxonomy” fame). It involved setting a threshold test at the end of every topic and ensuring that the class did not move on until everyone had passed the test. Those who failed the test at the first attempt were given remedial teaching and the chance to re-take. Kulik published a meta-analysis in 1990, which claimed that the method was very successful – but this conclusion was very seriously challenged by Robert Slavin’s 1996 article, Mastery learning revisited.
SLIDE 16

Slavin pointed out that Mastery Learning required “enormous amounts of corrective instruction” for students who did not pass the threshold test – and this involved two fundamental problems. First, this amount of remediation “could never be applied in real classrooms”. Second, it invalidated the research that Kulik was reporting. It is not surprising that students who were instructed according to Mastery Learning principles did better than those who were not, as “the total instruction time provided was sometimes two or three times more in experimental groups than control groups”.
SLIDE 17

If the Commission for Assessment without Levels was not aware of the Slavin article, it only had to refer to Wikipedia to learn that the technique had generally been abandoned because of “the difficulty of managing the classroom when each student is following an individual course of learning”. The Commission for Assessment without Levels recommended Mastery Learning without explaining what it is, without acknowledging that it is almost impossible to implement, and while justifying its recommendation by reference to discredited research.
SLIDE 18

I am not saying that the principles of Mastery Learning are not worth looking at carefully, but like all formative assessment, it implies personalisation, and personalisation is very difficult to manage.
SLIDE 19

A fourth problem with formative assessment is the need for timeliness of feedback. Take a basic learning interaction, like riding a bicycle. The student gets on the bicycle and pushes off – that is the action. The bicycle wobbles and the student falls off – that is the reaction that closes the feedback loop. The student reflects on what happened, and tries again. This basic feedback loop echoes what Professor Diana Laurillard has called the conversational framework – though in the case of a bicycle, the conversation is of a non-verbal kind. The learning happens because of the misalignment between the reaction and the student’s expectation, and because of the association that the student is able to make between exactly what was done and what happened as a result. There may be several iterations of reflection, some occurring immediately and some being delayed (a point I will come back to later); some sorts of reflection and consolidation may happen in your sleep, or months later in the middle of the summer holidays. But if there is a long delay between action and reaction, then it will be difficult for the student to make an association between the two and the learning opportunity will be lost. The opportunity for a quick iteration of attempts – to get back on the bicycle and try again – is also lost. Rapid feedback is therefore required – yet this requires individual, real-time interaction between the student and the instructive other with which the interaction is occurring. In a class of 30 students learning abstract concepts or thinking skills, the instructive other is likely to be a teacher, and achieving that speed of response becomes almost impossible.
SLIDE 20

The fifth problem is the sensitivity of learning to different types of feedback.
SLIDE 21

Dylan Wiliam often cites the Education Endowment Foundation research to make the case that feedback is one of the most effective interventions that a teacher can make. At the same time, he acknowledges that the effectiveness of feedback varies widely: sometimes it is super-effective; but in nearly 40% of studies, feedback was actually counter-productive. Feedback is not a commodity that comes in a vat and can be sprayed around the room indiscriminately. To be useful, not only do you need to assess the student’s current state of knowledge accurately, you also need to choose the right sort of feedback to provide. That is difficult to do, especially when you might be having to give different types of feedback to 30 different students in a class.
SLIDE 22

The best sort of feedback, according to Dylan Wiliam, is something that he does not even call feedback, although I would. It is when the teacher provides, not an evaluation of the student’s performance, or criticism, or advice on how to do it better – but another activity, specially chosen to allow the student to focus on whatever is causing difficulty. You could call it remediation; in edtech circles, it might be called adaptive sequencing; Professor Wiliam tends to call it responsive teaching. According to Professor Wiliam, the research suggests that it is twice as effective as evaluation or criticism.
It is worth noting that better use of adaptive sequencing would also resolve many of the problems that have been pointed out with differentiation. This isn’t about putting people in the top set or the bottom set, with all the harmful messaging that that involves. It simply says to the student, “this is what you did last” and it therefore follows that “this is what you need to do next”.
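To make the “what you did last determines what you do next” idea concrete, here is a minimal sketch of adaptive sequencing in Python. The activity names and the decision table are hypothetical, invented purely for illustration; they are not taken from any real scheme of work:

```python
# A minimal sketch of adaptive sequencing: the student's last activity
# and its outcome jointly determine the next activity to offer.
# All activity names here are hypothetical.
SEQUENCE = {
    ("compare_like_fractions", "pass"): "compare_unlike_denominators",
    ("compare_like_fractions", "fail"): "fractions_on_a_number_line",
    ("compare_unlike_denominators", "pass"): "order_three_fractions",
    ("compare_unlike_denominators", "fail"): "find_equivalent_fractions",
}

def next_activity(last_activity: str, outcome: str) -> str:
    """'This is what you did last', therefore 'this is what you do next'."""
    # Fall back to teacher referral for any (activity, outcome) pair
    # the table does not cover.
    return SEQUENCE.get((last_activity, outcome), "refer_to_teacher")

print(next_activity("compare_like_fractions", "fail"))
# -> fractions_on_a_number_line
```

Note that nothing here labels the student as “top set” or “bottom set”: the only state driving the decision is the last action and its outcome.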
SLIDE 23

If we recognise the value of adaptive sequencing as a way of implementing formative assessment and ask how difficult it would be to put into practice, we will find that we are brought back to the first of our challenges, which was to provide a well-designed learning activity. Not only do you need a well-designed learning activity to provoke the student to her first action; you need an extensive collection of well-designed learning activities to allow multiple, individualised learning pathways to be provided to different students, depending on their different needs.
So if there is a single thing on which almost everyone in education circles seems to be agreed at the moment, it is that formative assessment is our motherhood and apple pie – and yet no-one seems to be addressing the fact that in practice it doesn’t work, and no-one seems to be asking why it doesn’t work or trying to do anything about it. The truth is that to implement formative assessment at scale represents a massive logistical challenge.
SLIDE 24

Another bit of motherhood and apple pie, here voiced by Michael Wilshaw, is that “a school is only ever as good as its teachers”. The teacher is the essential, irreducible unit of supply in our current educational system.
I suspect that the reason that politicians say this so often is that it appears flattering to teachers, and it is always a good idea to flatter those on the front line, on whom all the top brass ultimately depend. But in this case the flattery is a poisoned chalice because, when things go wrong (as they often do), you have to conclude from this premise that it’s all the teachers’ fault. There is no-one else to blame.
SLIDE 25

The assumption that all you need is a good teacher goes back to Socrates. This is Raphael’s School of Athens and Socrates is the ugly guy in the green cloak, top left, talking to a rather coy and effeminate young man, watched by a crowd of onlookers. The Socratic method consisted of a conversation between an expert tutor and a very few students – normally one or two: a highly responsive, all-absorbing dialogue with a single teacher. It is a method that ticks all the boxes for formative assessment: it is highly interactive, the feedback is timely, it is personal, and it challenges the student to further intellectual efforts.
As far as instructional methodologies go, the Socratic dialogue is still the gold standard. Four days ago, Oxford and Cambridge were rated the best two universities in the world. They both stick to a tutorial system that faithfully implements the Socratic method: one or two students in conversation with a leading expert. There is only one thing wrong with the Socratic method – there are not enough leading experts, and certainly not enough to deliver a comprehensive education system with a set size of 1 or 2. The method doesn’t scale and is therefore intrinsically elitist.
SLIDE 26

Tim Oates recently said that the best learning resources were still Nuffield Combined Science and SMP Maths, both of which were created in the 1970s as a response to some of the problems that I have been discussing. It is worth noting, in passing, this extraordinary statement, which suggests that in those fifty years, in which publishing has experienced its most important revolution for five hundred years, the education system has not managed to make any significant improvement to its educational resources at all. It is another indicator that there is a serious dysfunction in our education system.
But the point I want to make about Nuffield comes from Kim Taylor, the headmaster of my old school, Sevenoaks, who went on to become Director of Learning Resources at Nuffield and wrote a book, Resources for Learning, as a justification of the Nuffield programme.
Taylor argued that the real problem with comprehensive education had nothing to do with the principle of selection: the challenge was how you were going to handle scale. He observed that education was an extremely labour-intensive business. It is almost entirely dependent on “workers and overseers” and hardly at all on “machinery and equipment” or the “tools of the trade” that could help teachers do their job.
SLIDE 27

The problem with this model is that “the craftsmen we need are going to be in scant supply”. And in this he was prophetic – we have ever since suffered from a chronic shortage of teachers, particularly in the most economically valuable subjects, and these shortages are only going to get worse, as will the problem of workload for those who remain. This is yet another symptom of our dysfunctional system.
SLIDE 28

Worst of all, the way that we organise our scarce and expert teachers is inefficient. Individual teachers are essentially left to get on with it themselves. As Taylor puts it, “There is not much to counterbalance the skill, or lack of skill, of the individual teacher”…
SLIDE 29

…who will almost certainly fail to achieve the almost impossible task that is expected of him, requiring as it does “too many performances…prodigies of co-ordination, busking his restive audience like a one-man band”.
SLIDE 30

While the medical service relies on a series of overlapping roles between consultant, registrar, doctor, nurse, technician and pharmacist, allowing supervision, teamwork and on-the-job training, when it comes to teaching, “It is hard to think of any other trade in which such isolation persists”. It was in an attempt to improve this model of the isolated and unsupported teacher that Nuffield produced its programme of course materials, intended to provide teachers with the “tools of the trade” that they needed to do their job consistently and at scale.
SLIDE 31

This slide summarizes the position that Taylor describes. At the top, government has overall responsibility for the education service. In the middle are the systematic elements of the service, what Tim Oates refers to as policy instruments: to illustrate this concept I have chosen textbooks, assessment and training – though there are others too, like inspection and curriculum.
At the bottom are teachers, who in our present system are isolated and unsupported, generally being expected to get on with things themselves, determining their own objectives, devising their own course materials, their own formative assessments, being the main source in the classroom of authoritative feedback, and often being resentful of what they see as unhelpful interventions in their private domain by education authorities. As we have already seen, these expectations are unrealistic, given the logistical complexity of the task that teachers face. It is these unrealistic expectations that result in a service with an extraordinarily inconsistent performance, as well as high workload and stress for teachers, which aggravates the wastage of staff and the overall under-performance of the system.
Tim Oates, like Kim Taylor and the Nuffield Foundation in the 1970s, argues that we should place more emphasis on improving these systematic elements, and particularly textbooks, not in order to replace teachers but to give them more support.
SLIDE 32

Oates makes this argument by looking at the international comparisons. When asked whether they base their teaching on a textbook, only 4% of UK Science teachers and 10% of UK Maths teachers answer yes. But in jurisdictions at the top of the PISA tables, the results are very different. In Singapore, the equivalent answers are 68% and 70%, and in Finland – a country often cited for the quality of its teachers – the answers are 94% and 95%.
This slide shows not just an association between the use of textbooks and high-performing education systems – it also indicates the prejudice against textbooks in the British profession. A teacher who relies heavily on the textbook, it is widely assumed, is a bad teacher. And if it is a bad textbook, they might be right.
SLIDE 33

This deprecation of textbooks is supported by the widespread view, strongly advocated by Dylan Wiliam, that the teacher’s expertise is at heart a matter of intuition, which cannot therefore be systematized.
SLIDE 34

Wiliam bases his argument on three pillars.
First is what in ancient Greek is called phronesis, translated as “practical wisdom”. This is taken from a recent interpretation of Aristotle by Bent Flyvbjerg, which is used to argue that:
- first, that the expertise of teachers consists in the determination of their own objectives (that’s why this is about wisdom and not just technical ability);
- second, that this can only be done on the basis of their own private experience, which has primacy over any abstract theory.
The second pillar is Polanyi’s theory of tacit knowledge. This is the sort of knowledge that we can’t read in a book but that we have to develop ourselves, again, through our own private experience. Polanyi talks about maxims. These can be read in books but they only make sense to those who are already possessed of the knowledge of the art. Maxims can help codify or reinforce our understanding, but only when we have already acquired the basic knowledge through private experience.
Third is Csíkszentmihályi’s [pronounced “cheek-sent-me-high’s”] theory of flow, or “being in the zone”. This is the idea that to be really good at something, we often need to stop trying so hard. We need to relax into a state of proficiency.
All three of these theories come together, as I see it, to give a good account of performance art. A concert pianist does not think about what all her fingers are doing on the keyboard: that is tacit knowledge that she has already acquired by a lifetime of practice. At the time of performance, she forgets the mechanics and focuses instead on the end goal – the particular sort of artistic expression that she is aiming for. She depends on automatic, non-conscious skills, focuses on ends rather than means, and relaxes into a state of proficiency.
This might be a helpful way to think about teaching if you are a teacher working in the current system, in which you are almost entirely dependent on your performance in the classroom. And it will continue to be a helpful way to think about teaching if you think that teaching will remain a performance art. If you think that Michael Wilshaw is right, that it is all about the individual, front-line teacher and her performance. If you are happy with our highly decentralized organisational model, which depends on isolated and largely unsupported teachers. If, in placing such an emphasis on individual teachers, you are prepared to tolerate the extremely inconsistent performance of the system as a whole. In that case, you will be happy to accept an account of teaching which sees it as a kind of performance art.
But if you agree with me that we should be aiming for more consistent outcomes; if you agree that formative assessment makes logistical demands that isolated and unsupported teachers are unable to meet, then you may think that such a focus on the performance of teachers is unhelpful. Then you may agree with Kim Taylor that we place an expectation on teachers of “too many performances” and “prodigies of co-ordination, busking his restive audience like a one-man band” – and that these demands for “too many performances” do not always make teaching a very attractive proposition for practitioners, either.
These are the reasons for rejecting Professor Wiliam’s emphasis on intuition: it doesn’t work; it doesn’t lead to consistent outcomes; it cannot meet the logistical requirements of teaching at scale; and it doesn’t create the sort of satisfying working environment that will attract and retain high quality teachers.
SLIDE 35

It is also relevant that Flyvbjerg’s interpretation of Aristotle, on which much of the theory depends and which has been taken up by a series of contemporary educationalists, including Frank Furedi and Gert Biesta, is based on a serious misunderstanding of Aristotle. I have dealt with this issue in two of my blog posts, Aristotle’s phronesis misunderstood and Flyvbjerg, phronesis and the expertise of teachers. If you ask who I am to say that I am right and half of the professors of education in this country are wrong, then my argument has at least been endorsed by Professor Kristján Kristjánsson, Professor of Virtue Ethics at Birmingham University and an Aristotle expert, who commented on my blog that “I am very impressed with your rebuttal of common misunderstandings of phronesis in educational circles…I have made many of the same points myself”.
It is not that anyone here is necessarily that interested in Aristotle. The reason that our understanding of Aristotle matters is that what Aristotle said was right. He uses the example of a saddle-maker. The saddle-maker’s expertise lies in how to make a good saddle. It doesn’t consist in knowing what constitutes a good saddle – that is part of the expertise of the cavalryman, the saddle-maker’s customer, who knows what is needed to ride a horse into battle. And while the cavalryman’s expertise lies in how to fight from horseback, it is not the cavalryman but the military general whose job it is to know what constitutes an effective cavalry charge. At each stage, the expertise of the supplier of a service lies in how to deliver that service; it is not for the supplier to determine why. And it is the same for teachers. What we should teach in an academic school’s maths curriculum is for the university maths professor to say – the next link in the supply chain; what skills are needed to boost the employability of students is for employers to say – the ends of education are for others to decide, not teachers. To argue from Aristotle’s theory of phronesis that every teacher should decide on objectives independently is to turn Aristotle’s theory on its head. And it is to turn common sense on its head too.
SLIDE 36

So we have seen that the emphasis on tacit knowledge and intuition, which Dylan Wiliam advocates, justifies a highly decentralized model of provision that is incapable of meeting the logistical demands of effective forms of formative assessment. We have also seen that the conflation of means and ends, that such a doctrine of intuition implies, is inconsistent with basic ethical principles and a proper understanding of the nature of expertise.
This brings me at last to consider curriculum, which is, after all, the subject of this talk.
SLIDE 37

A good place to start is with this true and important statement by Professor Wiliam, that “the word curriculum has no generally agreed meaning”. I started by saying that our educational system was dysfunctional – and I think this is another good indicator of that dysfunctionality. Saying that educationalists have not yet agreed what they mean by “curriculum” is a bit like saying that physicists have not yet agreed what they mean by “acceleration”. How can you build any coherent body of theory when you have not even defined your most elementary terminology?
SLIDE 38

I would suggest that there are at least four common meanings which people give to the word “curriculum”. When you hear politicians use the word, perhaps in House of Commons Committee rooms, they are generally referring to what is on the school timetable. If you ask “is ‘citizenship education’ in the curriculum?”, you mean “do students get any lessons in citizenship?”. That is fairly uncontroversial, but it deals with the issue at a high level. And it reinforces the model in which education is delivered by isolated and unsupported teachers. Yes, Jo Bloggs gets one period a week on citizenship with Mrs Jones – and he goes to the right classroom at the right time and we shut the classroom door, and at that point the school authorities wash their hands of the whole business: what happens after that is up to the teacher, relying on intuitive judgements and personal experience, to decide both the detailed ends and the methodology that will be deployed to attain those ends.
If we delve any deeper, we find there are two radically different understandings of what the curriculum is.
The first I would define as “an aggregation of learning objectives” and when I talk about “learning objectives”, I am talking about the knowledge, skills, and understandings that we want the students to acquire. In my view, this is the most useful definition and the one that matches most closely our common-sense understanding. The curriculum is “the stuff that you are being taught” and by “stuff”, most people mean “knowledge, skills and understandings”.
SLIDE 39

But in the last fifteen or twenty years, this definition has been rejected by educationalists. According to Wiliam, it started with Stenhouse, who argued that setting predetermined learning objectives was reductive. It prevented students acquiring skills of originality and creativity and developing according to their own lights. We should instead give students experiences and let them make of those experiences what they like. In a phrase that is often repeated by Professor Wiliam, Stenhouse argued that setting predetermined objectives “made the teacher an intellectual navvy, knowing where to dig trenches without knowing why”. In fact, the opposite is the case: it is only by setting pre-determined objectives that you can communicate why you are doing anything at all. Without objectives, there is no reason to dig any trenches at all and there is no criterion by which you can judge whether the trenches you have dug are the right ones or not. Without objectives, the whole of education becomes, quite literally, aimless.
The argument about intellectual navvies also conflates the matter of understanding the objective and deciding on the objective. It is very important that everyone in a business understands why they are doing what they are doing, so that they can respond to unforeseen events and optimize their performance. But this is not the same as saying that it is for the supplier, still less for the individual worker, to decide the purpose of the work being done. That would be like telling everyone to dig trenches wherever they like, which is not a good way to build a railway.
The upshot of this very flaky argument, according to Wiliam, is that we should “specify content rather than objectives”.
The frequently-used word “content” is another indicator of a poorly defined terminology. It is almost completely meaningless. The contents of a handbag are quite different from the contents of a speech; the contents of a book are different from the contents of a lesson, which are different from the contents of an assessment, which may or may not be different from the contents of a curriculum (depending on what “curriculum” is taken to mean: according to my definition, curriculum content is an aggregation of objectives). Never talk about content unless it is clear what the stuff you are talking about is contained in. But let us assume that by “content”, Stenhouse means “content of a lesson” and that this means some sort of activity.
SLIDE 40

And so it is that the modern, official meaning of “curriculum” has morphed from an “aggregation of learning objectives” – the knowledge, skills and understandings that we want students to acquire – into a “programme of planned activities”. This is the sense in which Ofsted uses the term when it inspects schools on their curricula. And it is the sense in which the term is used by the 1999 National Curriculum, which states [p10] that “The School Curriculum comprises all the learning and other experiences that each school plans for its pupils”. This extraordinary statement goes so far as to suggest that learning itself is a type of experience and does not therefore describe the acquisition of knowledge, skills and understanding. Such is the Alice-in-Wonderland world into which the whole education system is plunged when we are not clear about the meaning of the words that we use.
SLIDE 41

The change in meaning of the word “curriculum” has left everyone confused. Dylan Wiliam and Tim Oates, who were both on the expert panel advising Michael Gove on the curriculum in 2012, believe that “the National Curriculum is not really a curriculum at all” – because the National Curriculum is a statement of knowledge, skills and understandings, and Wiliam and to some degree Oates both believe that a real curriculum should be a programme of planned activities. Their report to the Secretary of State used the word “curriculum” in both senses, almost indiscriminately. And it is clear, when you listen to the House of Commons Education Select Committee taking evidence from educationalists, that for most of the time the politicians have not got a clue what the educationalists are saying, because they are talking in a convoluted, inconsistent, private language.
In supporting a definition of curriculum as a “programme of planned activities”, Dylan Wiliam emphasizes that “curriculum is pedagogy”, pedagogy in this context meaning the method of instruction (in contrast to many Marxist educationalists, who understand pedagogy as an approach to education that is based on political ideology). It is important to note that this formula conflates the ends and means of education. We don’t really have a word any more that describes our educational objectives – all we talk about is what we do, the activities we plan, the means we employ. As the purpose of these activities is undefined – or at best, determined by the individual intuition of hundreds of thousands of different teachers – there is no way of determining whether those activities are well chosen or effective. There is no answer to the question “what works?” because there is no shared statement of what we are trying to achieve.
SLIDE 42

So the really important confusion is created by definitions 2 and 3: an aggregation of learning objectives and a programme of planned activities.
Tim Oates, whose view on this matters because he was the Chairman of the Expert Panel for the National Curriculum Review in 2011, references a fourth definition, according to which the curriculum is an aggregation of policy instruments – and by policy instruments, I mean that middle layer in my organisation diagram: training, textbooks, assessments, inspection and so on. The curriculum, according to Oates, is more than just a programme of planned activities; it is everything we do to deliver our educational services.
That is what he explained in a talk at ResearchEd 2016 called “Why curriculum matters” – to which this talk is a direct response.
SLIDE 43

A lot of Tim Oates’s argument about the curriculum hinges on the term “curriculum coherence”, which he has taken from Professor Bill Schmidt, who worked on the data from the TIMSS Maths tests in the 1990s. You will see a lot of talk about “Bill Schmidt’s work on curriculum coherence” not just in what Oates writes but also, for example, in the Commission for Assessment without Levels. But it is clear that, just like the games of academic Chinese whispers that Tooley observed in 1998, no one actually understands what Schmidt meant by “curriculum coherence” because everyone is copying the explanation given by Oates, who got it wrong.
Describing “curriculum coherence” as “a highly precise technical term” [Could do better, p4], Oates explains that it has two key characteristics.
- First, “content” (whatever that is) is arranged in a way that matches age-related progression. Six-year-olds should either learn the things that six-year-olds are ready to learn or do the sorts of activities that six-year-olds are ready to do.
In passing, I will note that there has been a lot of criticism of this sort of approach, which rests in part on the work of Jean Piaget and his idea of readiness. Maybe some people can do at 6 what other people have to wait until they are 12 before they can manage. Any idea of age-related progression seems to have the effect of holding back our brightest students and giving an excuse for low expectations.
- Second, Oates observes, the coherent curriculum is one in which all the elements of the system, or policy instruments, line up. This is where we get the fourth definition of curriculum: an aggregation of policy instruments which, insofar as they are coherent, will be aligned.
To justify this definition, Oates references Schmidt’s 2006 paper, Curriculum coherence and national control of education.
SLIDE 44

But in this paper, Schmidt specifically denies that curriculum coherence is principally about the alignment of instruments. “Most of the studies”, he says, “have defined curriculum coherence as ‘alignment’. This is an important criterion, but we argue that…it is not a sufficient one”. You might say that Schmidt is allowing that alignment is at least part of the story – but this would over-interpret Schmidt’s begrudging concession that alignment is “an important criterion”. At best, I think that he is suggesting that alignment is a necessary precondition of any discussion about curriculum, because in the absence of any clear statement of educational objectives, those objectives can be inferred by observing what people are actually doing. If you visit a rifle range and cannot see what the targets are, you might lean over someone’s shoulder, look along their sights, and see what they are aiming at. In that way, educational objectives might be inferred by looking at textbooks, on the assumption that the two are aligned. If you read the whole of this paper, which is the one that is always cited by Oates and those he has influenced, you will understand that it is not concerned to define what curriculum coherence is, but only to ask whether central state control is necessary in order to achieve it. In this sense, Oates is referencing the wrong paper. It is not in the 2006 paper that Schmidt gives his account of curriculum coherence, but in two earlier papers.
SLIDE 45

In his 2002 A Coherent Curriculum, Schmidt explicitly defines “standards and curricula as coherent if they are articulated as a sequence of topics and performances that are logical and reflect the sequential or hierarchical nature of the content”; and he makes clear that he is not using coherence to mean alignment by referring to the issue of alignment separately, complaining, for example, that “American students and teachers are greatly disadvantaged by our country’s lack of a common, coherent curriculum and the texts, materials, and training that match it” [i.e. are aligned to it].
Note that in America, “standards” are what we would call criterion references or statements of attainment – in other words, learning objectives. So what Schmidt means by “curriculum” is not activities and certainly not policy instruments, but learning objectives – the knowledge, skills and understanding that is revealed by the student’s performances. Note too, Schmidt’s use of the word “performances”, which I shall come back to.
There is no mention here, nor anywhere else that I can find in Schmidt’s papers on this subject, about “age-related progression”. What he is really talking about is the intrinsic structure of the knowledge itself.
SLIDE 46

In this 2002 paper, and his earlier 1997 paper, A Splintered Vision, Schmidt criticised the American maths curricula as comprising “long laundry lists of unrelated topics”, “a mile wide and an inch deep”. The point about this, the best-known of Schmidt’s analogies, is that a thin film of water has no structure. Curriculum coherence is all about the way that your different learning objectives are structured and how they relate to each other.
SLIDE 47

And the best way to start building that structure is to identify the really important, organising principles.
SLIDE 48

Dylan Wiliam gives what to my mind is the correct explanation of curriculum coherence. It is, he says in his Principled Curriculum Design, about “the internal logic of each discipline or subject”, whose structure will best be revealed if you identify the “big ideas” of that subject. So far, so good. And he goes on to say that identifying these big ideas is “a very difficult task, requiring profound subject knowledge” as well as “substantial teaching experience”. I completely agree. But, first, let’s note that we are now talking about the curriculum as an aggregation of learning objectives and not, as Wiliam earlier argued in the same book, as a programme of planned activities. So one of my principal criticisms of Wiliam is inconsistency, in the way that he defines a word to mean one thing and then uses it to mean something else.
Second (and remembering Wiliam’s emphasis on private intuition), why do we expect hundreds of thousands of individual teachers to come up with their own different accounts of the curriculum when it is clear from what Wiliam says that they are not qualified to do this? How can it be reasonable to expect the Maths teacher who is being asked to teach Computing on the basis of a pretty superficial understanding of the subject – or the history teacher who is covering a new topic by staying a few pages ahead of the class in the textbook – to complete this extremely challenging task, which must be done on the basis of “profound subject knowledge”?
And why do we need to ask tens of thousands of front-line teachers to reinvent this wheel, probably badly, over and over again? We do not need students in Hull to learn a different sort of Maths from students in Croydon. There is no good reason at all to adopt such an inefficient way of devising coherent curricula. These processes need to be centralised.
SLIDE 49

The difficulty we have if we want to centralise the process of curriculum design (understanding curriculum as an aggregation of learning objectives) is that the way that we have attempted to describe our learning objectives through criterion references, or rubrics, has been shown to be unreliable.
Anyone who has been attending these conferences will probably have heard from Dylan or Daisy about the inadequacy of rubrics. This argument has frequently been made by reference to an imaginary rubric which might require students to “compare two fractions and identify which is larger”. Research from the early 1980s shows what is predictable enough: that the difficulty of this task is entirely dependent on which fractions the student is asked to compare. 90% of 14-year-olds can tell which is the larger of 3/7 and 5/7 – but only 15% of 14-year-olds can tell which is the larger of 5/7 and 5/9. The research doesn’t say what proportion of 14-year-olds can tell which is the larger of 3/7 and 5/9, where both numerator and denominator are different – perhaps only 1% or 2%. The point is that the rubric “compare two fractions and identify which is larger” gives no indication whatsoever of the difficulty of the task being prescribed or the level of understanding required of the student.
The first thing to be said about this argument is that the rubric “compare two fractions and identify which is larger” does not come from any real curriculum. It was invented by Dylan Wiliam specifically for the purpose of illustrating what a really bad rubric might look like. That is fine if you are illustrating a potential problem with rubrics. But to argue from that example that rubrics are intrinsically inadequate is a bit like saying that because you can’t fly to New York on that table, it is pointless to try to fly to New York at all. It is a really bad argument. If this rubric is so bad, we should be asking ourselves how we write a better rubric – or better still, what else we can do to describe and communicate our learning objectives.
Both Dylan and Daisy will give you the answer if you ask them. They will say that learning objectives need to be exemplified.
SLIDE 50

And this was the original recommendation of the 1987 Task Group on Assessment and Testing (TGAT), chaired by Professor Paul Black, which informed the 1988 Education Reform Act and the introduction of the first National Curriculum. TGAT said that all attainment targets should be carefully exemplified. The problem with the National Curriculum and our recent history of criterion referencing is that the recommendations of the TGAT report were not followed. And the problem with the argument being made by Dylan and Daisy is that they do not seem to recognise that exemplification is a way of describing and communicating learning objectives across the education system. Professor Wiliam recognises the importance of exemplars as a way for teachers to communicate a learning objective to their own students, and recommends that they always try to offer at least a couple of exemplars to their class. But this just gives yet another task to the isolated and unsupported teacher to complete and doesn’t help the consistency of standards across the system. Exemplification is not recognised as a systematic response that will allow for the central definition of learning objectives. That, it is assumed, can only be done by rubrics, which they dismiss as inadequate.
SLIDE 51

Instead of suggesting that we can clarify rubrics by the use of exemplars, Daisy Christodoulou argues that exemplars should replace rubrics. She proposes that a way of moving beyond the rubric might be a method of assessment called comparative judgement. This requires that teachers develop their own tacit appreciation of their own educational objectives (what Dylan Wiliam calls their “nose for quality”) by comparing a series of pairs of student work and in each case deciding which of the two is better. After using this technique on a series of different pairs, teachers will have created a rank order of quality across the group, which in turn provides a set of exemplars of what good looks like and what it doesn’t look like.
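There is well-established statistics behind turning pairwise judgements into a rank order: the comparative judgement literature generally fits something like a Bradley–Terry model to the judgement data. The sketch below, with invented scripts and judgements, shows the shape of that computation; it is an illustration of the technique, not a description of any particular tool:

```python
from collections import defaultdict

def bradley_terry(comparisons, n_iter=100):
    """Estimate a strength score per item from pairwise judgements.

    comparisons: list of (winner, loser) tuples, one per judgement.
    Uses the standard MM update for the Bradley-Terry model, in which
    P(i beats j) = p_i / (p_i + p_j).
    """
    items = {i for pair in comparisons for i in pair}
    wins = defaultdict(int)   # total wins per item
    n = defaultdict(int)      # judgements per unordered pair
    for winner, loser in comparisons:
        wins[winner] += 1
        n[frozenset((winner, loser))] += 1

    p = {i: 1.0 for i in items}  # initial strengths
    for _ in range(n_iter):
        p_new = {}
        for i in items:
            denom = sum(
                n[frozenset((i, j))] / (p[i] + p[j])
                for j in items if j != i
            )
            p_new[i] = wins[i] / denom if denom else p[i]
        total = sum(p_new.values())
        p = {i: v / total for i, v in p_new.items()}  # normalise
    return p

# Hypothetical example: four scripts, six judgements.
judgements = [("A", "B"), ("A", "C"), ("B", "C"),
              ("A", "D"), ("C", "D"), ("D", "B")]
scores = bradley_terry(judgements)
print(sorted(scores, key=scores.get, reverse=True))
# A ranks first; B, C and D, which beat each other in a cycle, tie.
```

Note what the model gives you: a position on a single latent dimension of quality. If two judges are applying different implicit criteria, they will produce different rank orders, which is exactly the problem raised below.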
I am not against comparative judgement. Nor do I disagree with Professor Wiliam’s concept of a “nose for quality” – a phrase that sums up the idea that teachers need to internalize their appreciation of “what good looks like”. My problem with this campaign is that comparative judgement is being presented as an alternative to criterion referencing and the explicit communication of learning objectives between teachers.
What happens if one teacher comes up with a different rank order from another? It will probably be because different teachers are implicitly ranking the work according to different criteria.
At this point, I think I need to acknowledge that there is a political or ideological resonance to Christodoulou’s campaign – and one that I am sympathetic to. The implementation of the TGAT report in the late 1980s was influenced by left-wing education advisers, particularly in the Inner London Education Authority, who interpreted criterion references as binary checkboxes: either you had “got it” or you hadn’t. And if you were very clear about what your criteria were and you made sure that the criteria closely matched the natural age-related development processes of children and you were half-way competent in your teaching, there was no reason why almost all of your students should not end up “getting it”. This promised an egalitarian education system in which there was a realistic hope of prizes for all.
By insisting that you cannot understand what good looks like until you have put your students in rank order, Daisy is challenging this egalitarian ethic. To that extent, I agree with her. Education is intrinsically inegalitarian because it is about making people better. Any value-system that places undue weight on avoiding inequality will always tend to lower expectations and be corrosive to educational endeavour.
But this political point is not a reason to throw out criterion referencing, which was misinterpreted when it was assumed that it was about binary check-boxes. If you were judging a dance in Strictly Come Dancing, you might analyse a performance against several different criteria such as technical footwork, accurate timing, fluency of movement and artistic expression; and none of those criteria can be represented as binary check-boxes: they are all represented by marks out of 10, in which some might excel while others are merely competent.
Without any generally accepted definitions of learning objectives, comparative judgement creates a system in which teachers develop entirely private understandings of what good looks like. This reinforces all our current problems of inconsistent performance, isolated and unsupported teachers, and a failure to develop centralised responses to educational requirements.
SLIDE 52

When the outputs from such a system need to be accepted as evidence for the purposes of statutory teacher assessment, Daisy’s answer is that they must be based on inter-school moderation that will establish standardised measures of performance. But moderation is potentially time-consuming and needs to be approached with caution.
SLIDE 53

That was the conclusion of the 1994 Dearing Review into what had by then been recognised as the car crash of the early implementation of the 1988 National Curriculum: “Great care must be taken to assess the ‘opportunity costs’ of any moderation system. We must balance the need for objective scrutiny of the marking standards in individual schools against the very considerable cost in teachers’ time that such a system inevitably involves”. That advice should be taken particularly seriously in the light of our current workload crisis.
SLIDE 54

Daisy addresses the issue of workload by promoting the Sharing Standards scheme, a systematic process of online moderation, managed by digital analytics systems. In this diagram, taken from a sample report sent to a subscribing school, the range of standards of work submitted as evidence for KS2 writing is compared to the standards of writing examples produced by other schools in the scheme. In general terms, I think this represents exactly the right way to go: automated, centralised systems that support front-line teachers while minimizing workload.
But if you acknowledge that this is a system based on exemplars, you have to ask “exemplars of what?” And the answer is that we are exemplifying only the very highest-level learning objectives: in this case “writing ability”. The system will not tell you what aspects of writing ability teachers in other schools might value more highly than you do, nor will it help you ensure that the order in which you rank your students is more closely aligned to the way that other teachers would rank those same examples. It is not going to help you improve the quality of your formative assessment. It is not going to help classify different types of intervention designed, for example, to teach better paragraph structure or the use of metaphor. To paraphrase Dylan Wiliam and Stenhouse, it will tell you how many marks to give to your students but not why.
At the other end of Christodoulou’s spectrum of recommendations is the setting up of national banks of question items. This also became a recommendation of the Commission for Assessment without Levels.
SLIDE 55

Associated with the promotion of national banks of question items is Christodoulou’s advice that teachers track student performance against their answers to such individual questions and not against meaningless rubrics. These two screen grabs are taken from the online video of her presentation to ResearchEd 2015.
Christodoulou tells teachers not to track students against criteria such as whether they are able to ask and answer questions or whether they can make inferences when reading texts; instead, they should track whether their students can answer question item 10,341 in the national item bank: “What is the verb in the sentence ‘I run to the shops’?”
So we are left in a situation in which we can measure whether a student is judged to be good at writing and whether they can answer question 10,341, but nothing in between.
SLIDE 55A

I believe it is useful to envisage the curriculum as a hierarchy of learning objectives, with concrete & specific objectives at the bottom (knowing which is the verb in the sentence “I run to the shops”) and high-level objectives at the top (being good at writing). In this model, the intermediate objectives allow a smooth progression from the mastery of tightly defined procedures and factual knowledge at the bottom up to the general skills and dispositions that allow people to respond to real-world requirements. They provide the definitions that Bill Schmidt argues we need in order to provide a structured, coherent curriculum, and the means to understand how our programmes of study need to be sequenced.
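For those who like to think in data structures, the hierarchy can be sketched as a simple tree. The objective names below are my own inventions for the purpose of illustration, not taken from any actual curriculum.

```python
from dataclasses import dataclass, field

@dataclass
class Objective:
    """One node in a hierarchy of learning objectives."""
    name: str
    children: list["Objective"] = field(default_factory=list)

# A hypothetical fragment of the hierarchy: a high-level objective at the
# root, an intermediate objective in the middle, and concrete, specific
# objectives at the leaves.
writing = Objective("Is a good writer", [
    Objective("Controls sentence grammar", [
        Objective("Identifies the verb in 'I run to the shops'"),
        Objective("Distinguishes main and subordinate clauses"),
    ]),
    Objective("Structures paragraphs coherently"),
])

def show(objective: Objective, depth: int = 0) -> None:
    """Print the hierarchy, indenting one level per tier of objectives."""
    print("  " * depth + objective.name)
    for child in objective.children:
        show(child, depth + 1)

show(writing)
```

It is the intermediate tier of this tree that supports sequencing and aggregation; strip it out and only the root and the leaves remain.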
SLIDE 55B
But it seems to me that in Daisy Christodoulou’s model, in which we assess general writing skills on one hand and responses to individual questions on the other but disregard any intermediate statement of objectives, the whole of this important middle piece of the curriculum is lost. In this model, the curriculum has no structure, no chance of coherence, and there is no opportunity to create digital tools that will help us manage progression and sequencing, or aggregate individual results to improve the reliability of our assessments. It is a model to make any data scientist weep.
I acknowledge, of course, that the criteria that Daisy cites, such as “Can ask and answer questions”, are meaningless. But just because we have created bad criteria in the past – shockingly bad criteria at that – is no reason why we cannot create good and meaningful criteria in the future. Just because we can’t fly to New York on that table is no reason why we can’t fly to New York in a 747.
This raises the question of how we are going to do in the future what we have failed to do properly in the past.
We have already established that we are going to create meaningful learning objectives by exemplification. That raises the next question: what is the nature of the examples we are going to use? And the answer is that we are going to produce examples of student performances – “performance” being the same word that Bill Schmidt used when talking about the specification of curricula in terms of knowledge, skills and understanding.
This gets us into another misleading argument made by Daisy, Dylan, Warwick Mansell and many other educationalists: that we should not rely on performance as an indication of knowledge, skills and understanding.
SLIDE 56

The argument that is most commonly used is taken from Robert Bjork, who points out that “instructors frequently misinterpret short-term performance as a guide to long-term learning”.
SLIDE 57

Dylan Wiliam makes the point by saying that “psychologists tend to…[distinguish] between performance and learning. Performance is what we see when someone is being taught how to do something, while learning is defined…as ‘a change in long-term memory’”. This distinction is clearly false. It is simply not true that performance “is what we see when someone is being taught how to do something”: performance can occur at any time.
SLIDE 58

Indeed, there are many reasons why we need to require students to produce repeat performances, many of them summed up by the phrase “spaced learning”: the likelihood that mastery will decay over time; the need to learn to apply principles in a variety of different contexts; the need to “put it all together” by applying isolated skills in more complex and what are often called “authentic” situations; and the need progressively to withdraw teacher support or “scaffolding”. An additional reason, highlighted by the work of Bjork and other psychologists, is that learning can occur through processes of consolidation and delayed reflection, which may occur long after the receipt of instruction or participation in a learning activity – and the effect of those delayed processes needs to be measured, and probably needs to be reinforced, by delayed or spaced performances. Where student performances are regarded as opportunities for assessment as well as for teaching, it is also relevant that we need to build up reliability by repeat sampling.
All this emphasizes the fact that performance does not only occur during instruction, but can and should be repeated over extended periods, often well after instruction; or, to put the same point another way, that instruction should not be regarded as a short episode, but rather as an extended process, often revisiting familiar territory from slightly different perspectives.
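The practical consequence can be sketched in a few lines. In the hypothetical record below (all dates, task types and scores are invented), no single snapshot is trusted on its own: it is the aggregate of spaced performances that estimates the enduring capability.

```python
from datetime import date

# A hypothetical record of one student's repeat performances against a
# single objective, spaced over several months and across different
# task types (all data invented).
observations = [
    (date(2017, 9, 12), "class quiz", 0.60),
    (date(2017, 10, 3), "homework", 0.75),
    (date(2017, 11, 20), "delayed retest", 0.70),   # after a gap, testing retention
    (date(2018, 1, 15), "applied task", 0.80),      # "putting it all together"
]

scores = [score for _, _, score in observations]
print(f"latest single performance: {scores[-1]:.2f}")
print(f"aggregate of {len(scores)} spaced performances: {sum(scores) / len(scores):.2f}")
```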
SLIDE 59

It is also of no help to define learning as “a change in long-term memory”, claiming that this is something different from performance, because we cannot observe long-term memory directly – we do not even really understand what it is. Everything we know about someone’s long-term memory is inferred from our observation of their performances.
SLIDE 60

Bjork is right that we should not confuse short-term or isolated performances with long-term memory, and he is right that the relationship between an individual performance and the internal process of learning may very well be complex and counter-intuitive. But this is not the same as saying that repeat performances made over an extended period of time are not a very good guide to long-term memory. Indeed, they are the only guide that we ever have to long-term memory: that is, to knowledge, skills and understanding.
SLIDE 61

And that is why Bill Schmidt defines the curriculum – the knowledge, skills and understandings that we wish our students to attain – in terms of the performances that we expect to see if students have mastered those learning objectives. And it is by the exemplification of performance that we can describe that curriculum.
Let me conclude this discussion of performance by summarising the two very important reasons why we should not mistrust performance.
SLIDE 62

First, because it is by performance that we learn. Practice is a form of performance and we learn new skills by practice. It is true that our active practice needs to be accompanied by internal processes that we can call consolidation or reflection, and that these internal processes may occur at different times. But these internal processes are the consequence of performance and practice, not an alternative to them.
SLIDE 63
Second, because the stuff that we want to teach – knowledge, skills and understanding – is invisible. There is no way that we can get inside people’s heads and determine what they know, what they understand and what they can do. The only way that we can know any of these things is by observing their performance. And so all of the different ways that we characterise our learning objectives – knowledge, skill, understanding, perhaps even attitude and physical development – can be reduced to one word: capability. Capability is a disposition to perform. When we attribute a capability to someone, we are predicting the types and standards of performance that they will produce in certain situations; and it is only by observing those performances over an extended period that we can infer that they have a certain enduring capability. If that is the only way we can infer knowledge, skills, understanding and attitude, then that is the only way that we can describe and define knowledge, skills, understanding and attitude (and if that conclusion seems odd, have a look at my discussion of logical positivism in The elephant in the room and Choose your paradigm).
And yet, in spite of the central importance of performance to any coherent theory of education, its importance is regularly denied by our leading educationalists.
SLIDE 64

There are several important conclusions that need to be drawn from this discussion. I have already made the argument that the attack on criterion referencing is unfounded because those making the attack have discounted the importance of exemplification – by which I mean the referencing of performance – as a way of describing learning objectives accurately. If you take Dylan Wiliam’s example of the ability to compare fractions, the problems with this objective would be revealed immediately, and the means to resolve those problems would at the same time be suggested, by observing the inconsistency of student performance across a comprehensive body of exemplars.
Second, we have established that our observations of different performances need to be aggregated over time across different but related contexts. And it is by the accumulation and corroboration of different observations of performance that we achieve reliability in our assessment data. When Daisy Christodoulou bases her argument against our current methods of testing on Daniel Koretz’s book, Measuring Up, neither Christodoulou nor Koretz makes any mention at all of the importance of building reliability by aggregating results. This is a significant intellectual failure on the part of the educational establishment in general: they have missed the most important point. Relying on single-shot, high-stakes summative exams is like running a political poll by asking a single person how they are going to vote. And it is the intrinsic unreliability of single-shot exams that forces the exam boards to try to compensate by building formulaic and reductive exams, which gets us into the difficulty of teaching to the test. No-one is addressing this problem.
The aggregation of assessment data is entirely dependent on our ability to compare apples to apples. Averaging out a student’s ability to perform trigonometry and to write a moving poem will not be very useful. We need to classify the different tasks and questions against the learning objectives that they address. The retreat from criterion referencing means that such aggregation of data becomes impossible. At the same time, the recommendation of Christodoulou, Wiliam and the Commission for Assessment without Levels that the results of regular formative tests should be discarded means that those results will not be available in any case to build greater reliability into our assessment system. The conclusions of the Commission for Assessment without Levels are precisely – 180 degrees – wrong.
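The apples-to-apples point can also be made concrete. In this sketch (all scores invented), each result is tagged with the learning objective that the task addressed; that tagging is precisely what the retreat from criterion referencing throws away.

```python
from collections import defaultdict

# Hypothetical results, each tagged with the learning objective that the
# task addressed. The tagging is what makes aggregation meaningful.
results = [
    ("trigonometry", 0.90),
    ("trigonometry", 0.85),
    ("poetry writing", 0.40),
    ("poetry writing", 0.45),
]

by_objective = defaultdict(list)
for objective, score in results:
    by_objective[objective].append(score)

for objective, scores in by_objective.items():
    print(f"{objective}: mean {sum(scores) / len(scores):.2f} over {len(scores)} performances")

# Averaging across unrelated objectives hides both the strength and the
# weakness: this single pooled figure tells us almost nothing.
print(f"pooled average: {sum(s for _, s in results) / len(results):.2f}")
```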
Third, perhaps the chief reason why teachers don’t understand the vital importance of aggregating results from different performances is the logistical complexity that it requires, over and above all the logistical complexity that we have already discussed in the case of formative assessment. What is the point of talking about what is beyond the capabilities of isolated and unsupported teachers to achieve? We don’t have the right tools of the trade; and the academics and thought-leaders in the profession, who base their views on their observations of existing practice, do not understand what these tools might be or why they are needed.
SLIDE 65

So why is “curriculum” in particular such a problem?
First, because we talk about it all the time without knowing what it means.
Second, if we accept (as in my view we ought to do) that the curriculum should describe our learning objectives, then we have lost confidence in our ability to describe those learning objectives.
Third, the reason we have lost confidence in our ability to describe our learning objectives is that we have lost sight of the fact that learning is closely related to performance: both because it is through practice that we learn, and because it is through the observation of performance that we define our objectives and measure our attainment.
And that is why the discourse of educationalists constantly falls back on a sort of romantic, mystical mumbo-jumbo about teacher judgement, intuition and tacit knowledge – which amounts in practice to an apology for the dysfunctional system in which teachers are currently trapped.
SLIDE 66

It is not enough to hand out tips and tricks to teachers on how to survive in a dysfunctional system. We need to fix the system. And systems can only work when we have clearly defined objectives. And the curriculum is what ought to define our objectives.
SLIDE 67

This is the slide which I used earlier to visualize our isolated and unsupported teachers at the bottom, and our inadequate and uncoordinated policy instruments in the middle. And I suggested that calling these elements “policy instruments” was itself unhelpful, because it suggested a top-down bureaucracy in which they were levers of power, rather than elements in what ought to be a self-regulating system, not dependent on constant tinkering from Whitehall.
I agree with Tim Oates that if we are to achieve the sort of system reform that we need, then the alignment of these different elements is important. But alignment is not enough. When we are dead, we will have no heartbeat, no breathing, no brain activity, no reflexes: all our body functions will be perfectly aligned, but we’ll be dead. Our education systems, as well as being aligned, also need to be well designed in their own right, and that means that there must be an opportunity to innovate. Innovation means change, and when you start to introduce change, things can often fall out of alignment. So there is at the very least a tension between innovation and alignment. We need to ask not just whether our textbooks are aligned with our assessment and our assessment is aligned with our curriculum, but how this alignment is achieved and whether it is helping us to design good individual processes and technologies.
SLIDE 68

One way of achieving alignment is by handing the whole shooting match to a single person or organization, who will make sure that the official textbooks, the official assessments, the official curriculum and the official training are all perfectly aligned. It could be the DfE or, more likely, some sort of outsourcer. It could be Cambridge Assessment, Tim Oates’ employer.
The problem with this solution, and the problem with monopolies in general, is that they suppress innovation and therefore reduce quality. There is no room for someone to come along and create a new curriculum that better serves the needs of the modern world, or new textbooks that are more effective at supporting teachers in teaching an existing curriculum. Alignment in this model creates a sort of gridlock. And it tends to make it difficult, maybe impossible, even to perceive that there is a problem in the system, because there is only a single source of truth. It would be a bit like living in the Middle Ages, when it was impossible to challenge what was said by the church because the church owned all the books.
SLIDE 69

If you think I am exaggerating the problem here, listen to Tim Oates talking on the DfE’s YouTube channel about assessment without levels. He says that the Expert Panel found three coexisting models of assessment, and he runs through the very significant problems with each of them. The compensation model – in other words, the SATS test – awards a level 3 based on the number of marks you get in the test, even though you might have got all your marks from the level 2 and level 4 questions and got all the level 3 questions wrong. Best fit has similar problems: you might award a student level 3 because she is too good for level 2 and not good enough for level 4, even though she has serious gaps in her attainment judged against the level 3 criteria. And threshold, where you award level 3 because you can find evidence of level 3 responses, is even worse, because you tend to award level 3 when a student is only just over the threshold of level 3 and might still have very large gaps in level 3 understanding.
All three of these assessment models are highly inaccurate in their different ways.
Now, if all three models were accurate, they would produce the same result. But because all three are inaccurate and they are all inaccurate in different ways, they produce different results.
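The disagreement is easy to demonstrate. The sketch below implements one plausible version of each rule; Oates describes the models only in prose, so the mark boundaries and scoring rules here are my own inventions for the purposes of illustration.

```python
# All three models applied to one student who answered every level 2 and
# level 4 question correctly and every level 3 question wrong. The mark
# boundaries and scoring rules are invented for illustration.
marks = {
    2: [1, 1, 1, 1],   # level 2 questions: all correct
    3: [0, 0, 0, 0],   # level 3 questions: all wrong
    4: [1, 1, 1, 1],   # level 4 questions: all correct
}

def compensation(marks: dict[int, list[int]]) -> int:
    """Award the highest level whose mark boundary the raw total clears,
    regardless of where the marks came from."""
    boundaries = {2: 4, 3: 7, 4: 11}
    total = sum(sum(questions) for questions in marks.values())
    return max((lvl for lvl, cut in boundaries.items() if total >= cut), default=1)

def best_fit(marks: dict[int, list[int]]) -> int:
    """Award the level that matches the profile 'on balance': too good for
    level 2, not good enough for level 4, so level 3 despite the gaps."""
    secure = [lvl for lvl, qs in marks.items() if sum(qs) / len(qs) >= 0.5]
    return (min(secure) + max(secure)) // 2 if secure else 1

def threshold(marks: dict[int, list[int]]) -> int:
    """Award the highest level for which any evidence at all can be found."""
    evidenced = [lvl for lvl, qs in marks.items() if any(qs)]
    return max(evidenced, default=1)

print(compensation(marks), best_fit(marks), threshold(marks))  # prints: 3 3 4
```

Run on this profile, the three rules return levels 3, 3 and 4: each result defensible by its own logic, and each concealing the same gaping hole at level 3.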
SLIDE 70
Extraordinarily enough, Tim Oates argues that the inaccuracy of our assessment models isn’t really the problem. The only problem, on his account, is that the models differ from one another and so produce disagreement: the problem is the coexistence of different models.
This is dangerous talk. The fact that the different models produce different results tells you that your assessment data is inaccurate – and that is an important thing to know. In my view, all assessment data should be accompanied by an indication of confidence. The main reason why this does not happen is that the authorities do not want to admit how low those confidence levels would be. But Tim Oates would be perfectly happy with this level of inaccuracy, so long as nobody can see that it is inaccurate – which would be the case if there weren’t any alternative assessment models to compare it with.
This attitude illustrates the danger of a doctrine that emphasizes alignment above quality and ignores the vital importance of achieving reliability by aggregating and corroborating data, rather than trusting a single dataset, just because it is deemed to be authoritative.
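To show what an indication of confidence might mean in practice, here is a deliberately crude sketch using invented scores. A real system would model the sources of error far more carefully; the principle is simply that the margin of error is reported rather than hidden.

```python
from math import sqrt
from statistics import mean, stdev

# Invented repeat performances by one student against a single objective.
scores = [0.62, 0.71, 0.55, 0.68, 0.74, 0.59]

estimate = mean(scores)
standard_error = stdev(scores) / sqrt(len(scores))

# Report the aggregated estimate together with a crude 95% margin,
# rather than presenting a bare figure as if it were exact.
print(f"attainment: {estimate:.2f} +/- {1.96 * standard_error:.2f} (n={len(scores)})")
```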
SLIDE 71

Perhaps we shouldn’t be too shocked by Tim Oates’ totalitarian approach to assessment because this is what already happens. When it comes to public exams, there is a single source of truth which we know is very unreliable but we just have to put up with it. And the market for textbooks is uncompetitive because the most important thing to do if you are a textbook publisher is to get your textbooks endorsed by the main exam board. So the whole thing is run as a cabal and the excuse produced by all monopolists and all totalitarian regimes is that everything is very well aligned.
What we need is a system that encourages alignment while at the same time encouraging innovation and competition.
SLIDE 72

The only way, it seems to me, to achieve this is to define our educational objectives – in other words, our curriculum – and then allow a free market in the provision of all the other elements of the system.
The proper way to achieve alignment is for our means to be well aligned to our ends. The measure of quality for the market will be the ability of these other elements to assist students in the attainment of these predetermined objectives. By being aligned with the objectives, all the elements of the system would tend to align with each other. At the same time, there would be no restriction on innovation.
We have already seen that the way to define learning objectives clearly is by exemplification; that what we are exemplifying is performance; and that performance is measured by assessment.
SLIDE 73

Assessment is one of the key system elements; we require our assessment to be aligned with our learning objectives; and our learning objectives are themselves defined by reference to assessment outcomes. There is therefore also a need to ensure that the assessments by which students’ attainment is measured are well aligned with the exemplars of assessment by which our objectives are described.
SLIDE 74

This can be achieved by digital analytics systems, which can compare assessment data at scale and can ensure that consistent standards of performance are being maintained in respect of multiple learning objectives.
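A toy version of such a calibration check might look like the following (all figures invented): compare one school’s marking on a shared objective against the distribution across the whole scheme, and flag systematic severity or leniency for moderation.

```python
from statistics import mean, stdev

# Invented data: one mean mark per school on a shared objective, plus
# the marks awarded by the school being checked.
scheme_means = [0.55, 0.60, 0.58, 0.62, 0.57, 0.61, 0.59, 0.63]
our_school = [0.72, 0.70, 0.75]

reference_mean = mean(scheme_means)
reference_sd = stdev(scheme_means)
z = (mean(our_school) - reference_mean) / reference_sd

# A school far from the scheme-wide standard is flagged for moderation
# rather than moderated blindly by hand.
if abs(z) > 2:
    print(f"flag for moderation: {z:+.1f} standard deviations from the scheme-wide standard")
else:
    print("marking consistent with the scheme-wide standard")
```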
SLIDE 75

Instead of this very hierarchical, top-down model of our education system, with isolated and unsupported teachers on the front-line…
SLIDE 76

We would instead have what is essentially a self-regulating and self-optimizing system, which the government can supervise from a distance.
Once we have the means to describe our learning objectives clearly, then we can devise processes to ensure that we can create the different curricula that different students require, responding to a wide range of inputs from different societal stakeholders. And in this way the democratic accountability of the education service and its relevance to the requirements of the modern world will be enhanced.
Once those objectives are more clearly described, we can have an open competition between educational suppliers to provide the different system elements – the training, the textbooks, the digital technology, the assessments – providing the tools of the trade that front-line teachers need in order to cope with the problem of scale, which is the key challenge for modern education systems. Teachers will not be marginalized or de-skilled by this provision of technology, any more than a surgeon is de-professionalised by being given a well-equipped operating theatre in which to work. Teachers will be put in a more powerful position to manage the education of their students, well supported by the tools of the trade that they need, and with the time to focus on the building of relationships with their students and maintaining mastery of their subjects.
SLIDE 77

In order to achieve this transformation, we must start by defining our terms in a way that is clear and consistent. There is nothing more revealing about the intellectual poverty of our education theory than the Babel of inconsistent meanings that is given to some of the most basic terminology that teachers need to understand if they are to start to develop a coherent theory of practice.
We must find the means to describe our objectives which, as I have argued above, will depend on the development of digital technologies that are able to ensure the validity of data that is gathered against different statements of capability.
Contrary to the advice of Dylan Wiliam that curriculum is pedagogy, we must decouple pedagogy, defined as the means of achieving our goals, from curriculum, defined as the statement of those goals.
SLIDE 78
As one of my heroes, Brian Simon, said in his well-known 1981 essay, Why no pedagogy in England?, “Attempts to define common objectives for all pupils across the main subjects is the first necessary condition for identifying effective pedagogic means [to achieve them]”.
SLIDE 79

Curriculum matters because without clear learning objectives, there is no pedagogy, no measure of educational quality, no objective research, no pooling of resources to address provision at a systemic level. Without these things, we continue to be forced to rely on the efforts of isolated, unsupported and increasingly overworked, undervalued and demoralized teachers; with the government trying to compensate for the inadequacy of that delivery model with ineffective and frequently counter-productive bureaucratic controls.
But because no-one is recognising these problems, because no-one in the current system has the experience or perspective required to address them, and because our main thought leaders are promoting educational theories which actively prevent any progress in solving them – I can see no reason why our current dysfunctional education system should not survive for many decades to come.
SLIDE 80

If we are to have any chance of seeing an improvement, then the first thing that we need is to raise the quality of the academic discourse.
I have chosen to critique these three educationalists because in all three cases, they are thinkers that I rate. There is much that they have said that I agree with. I agree with Tim Oates that we need more systematic responses to our educational problems, in particular from learning materials such as textbooks, and I agree with him that curriculum matters. I agree with Dylan Wiliam that we need to improve our feedback and the extent to which our teaching can be adaptive. And on the whole I agree with Daisy’s critique of progressive education theory and support her interest in using technology to address the problems we face. I have chosen these three people to critique, not because I regard them as particularly egregious examples of deluded educationalists, but on the contrary because I think they are among the most interesting and important of our educational thinkers. But even so, I think there is an awful lot that they have got wrong. And when even our best and most influential thinkers have got so much wrong, there is clearly something wrong at a systemic level.
Tooley, with whom I started, said in 1998 that the work of most academics went “unnoticed and unheeded by anyone else”. Perhaps that cannot be said of my three subjects, who are all high-profile and influential figures. But what Tooley was really talking about in this phrase was the low level of contestation between academics. In that respect, little appears to have changed.
If you accept at least some of the criticisms that I have made in this article, you need also to ask why no-one else has made these arguments. Why, when there is no shortage of professional, academic educationalists, is it left to an amateur blogger to point out that the emperor is so scantily clad? I suspect that the world of bloggers and social media has made a situation that was bad enough in 1998 even worse, as people build their own echo chambers of support and ignore those criticisms that might compromise their online reputations. This matters because it is only through contestation that positions can be clarified, errors corrected, and new solutions to old problems suggested.
Of all the different ways in which I think our current education system can be described as dysfunctional, I think that this lack of serious, contested debate is perhaps the root of all evil. I hope that this essay might make a very small contribution to addressing this problem. And to that end, I hope that Tim Oates, Dylan Wiliam and Daisy Christodoulou will respond to the points that I have made, either in the comments section below, in guest posts that I shall be happy to provide on this blog, or on other platforms of their choosing.
I’m still working my way through this brilliant piece – but I have noticed a gorgeously ironic error in the slide 17 paragraph 😉
Many thanks for pointing that out, Adam. I referred to the “Commission for Assessment without Levels” as the “Commission for Assessment without Learning”…
Following on from Twitter, I’m posting my thoughts here (as requested). Firstly, great blog post – really good to challenge thinking and to help reflect on what we mean when we (educationalists) talk about these terms.
It is slide 11 where I couldn’t follow the logic and the argument for the need to record formative assessments to improve accuracy. As formative assessments inform teaching and learning, they represent a snapshot of a particular issue or point in a child’s learning. This means the assessment made by the teacher, like most assessments in reality, rests on a professional judgement dependent on the question and/or intervention the teacher gives, alongside their own subject and teaching knowledge. I believe well-trained teachers make accurate judgements during that process, and the requirement to record comes from a desire to moderate something which could be a unique situation dependent on the context the formative assessment is in. I don’t see what recording adds that offsets the risks it presents in making formative assessments effectively summative.
Slide 12 is where my journey analogy comes in, as you imply recorded formative assessments could be used to support summative judgements (and in many ways teachers do this already, e.g. during report writing). However, by formalising it I can see teachers recording outcomes from formative processes, with each record just giving a summative snapshot of one particular moment in the learning journey. Lots of these don’t mean the child has reached a point where a final summative outcome can be assured. Knowing all the steps to an experiment, for instance, doesn’t mean you are secure in it. Process doesn’t equal outcome (at least not always), which is how I interpreted that particular part of the blog.
Thanks very much for the comment, Marc. It is exactly the same criticism of my position that was made by Harry Fletcher-Wood after the talk in September (https://twitter.com/HFletcherWood/status/906990421443846147). So it is clearly a point on which I need to make my argument more clearly.
For that reason, I am not going to answer here, but instead write up a (shorter than this) blog-post. I’ll try and get round to it as soon as I can, but it will probably be a week at least.
Crispin.
Hello Marc,
This is a very delayed second response, to let you know that my full explanation of why formative assessment results should be aggregated appeared in my essay “keep teaching to the test” at https://edtechnow.net/2018/07/15/assessment/.
To respond directly to your comments.
1. Any assessment (formative or summative) is based on the student’s performance in a particular context. But we are never really interested in how the student actually responded in that particular context. We are more interested in how they would respond more generally in other, similar contexts. This inference might be made by some rather mysterious process of professional judgement, but it can more reliably be made (I argue) by aggregating the results of the student’s performance in many similar circumstances and observing a pattern of consistent performance. To do this, you need to record the individual performances in order to aggregate your observations across time and across different contexts.
2. I think your use of the journey analogy makes an unreasonable presumption that the preparatory stages of the journey are of little value and that all that matters is the final destination. This might be the case when the student performs an experiment in the lab. In most teaching situations, I think, the student is practising and rehearsing knowledge and skills all the way through the course, and these are of lasting value at the end of the course. The intermediate outcome is closely correlated with the terminal outcome, and measuring it therefore helps to corroborate (i.e. improve the accuracy of) the terminal judgement. I distinguish between “terminal” as in “occurring at the end of the course” and “summative” as in “summarising the whole course”.
But if those direct responses don’t do it for you, have a look at my longer essay on assessment, particularly from this point: https://edtechnow.net/2018/07/15/assessment/#bry18_72.
Thanks again for your comment. Crispin.
We’ve covered much the same ground in a paper we published earlier this year, but we looked at AfL from the perspective of the teacher. Dylan Wiliam notwithstanding, AfL has widely been interpreted as a means of personalising learning, which as you say has horrendous workload implications. No doubt this is why it has never been implemented with any conviction. In reference to a 2008 Ofsted evaluation of the implementation of AfL in 43 schools, a 2013 report by the CfBT commented: “It was noted that only a few schools of those inspected were actually putting AfL into practice and found that of those that were implementing it, many were not introducing it in accordance with the way advocates suggested was most desirable. That is why the practical feasibility of the approach has become an issue requiring further research in relation to this way of understanding assessment.”
There’s a lot to like in your arguments, but I think we will always be struggling so long as we use essays for high-stakes exams. Knowledge is easily taught and easily tested, yet far too many teachers are still trying to teach writing skills when there are huge voids at the bottom of Bloom’s pyramid. See http://parliamentstreet.org/wp-content/uploads/2017/02/Free-Schools-For-A-Free-Society.pdf
I am very grateful for your link to this excellent paper which, as you say, covers very similar ground (especially at the start) from a slightly different angle. We share a very similar analysis of the problem, including how unhelpful has been teachers’ identification with the views e.g. of Sir Ken Robinson (cf my https://edtechnow.net/2012/01/20/sir-ken-robinson/ and guest post from Scott Goodman at https://edtechnow.net/guest-posts/ken-robinson-rebuttal/).
I’d be interested to know more about the reaction that your paper has received. My comments about the lack of dialogue in my article are matched to some degree by the lack of engagement by politicians at the moment with the sort of radical ideas that are required. As Tim Stirrup recently posted, education authorities are not nearly bold enough (https://twitter.com/timstirrup/status/930760125454929920).
I like your image of our implementation of Bloom’s taxonomy being hollowed out. I am not so sold on removing essay questions altogether from our repertoire. Nor am I a keen supporter of David Didau’s recent work. It seems to me that proponents of the knowledge-based curriculum started from E. D. Hirsch’s position, which said that skills and facts were (at least in academic domains) interdependent (I agree), but now seem to have moved on to saying that only the facts need to be taught and the skills will emerge by themselves (or are a function of genetic inheritance). I find this implausible and potentially damaging. 1) The argument that domain knowledge is a necessary precondition of skills acquisition is perfectly compatible with the possibility of skills transfer between two domains, in both of which there is appropriate domain knowledge; 2) David’s estimation of the importance of genetic inheritance seems to ignore the point made by Dan Willingham that small differences in genetic predisposition are greatly amplified by the effects of environment, given that brighter students choose more educationally productive environments (though this divergence can be reduced by non-elective education); 3) the argument about biologically primary skills like language ignores the fact that exposure to these skills is ubiquitous in our normal, everyday environment, so the inference that what does not need to be taught is therefore a matter of genetic inheritance does not seem to me to be justified.
For these reasons, I do not accept the argument that skills do not need to be taught. I do think that the model of Bloom’s taxonomy (which is rejected by Christodoulou and Didau), or at least a rough interpretation of it, helps highlight the need to ensure a smooth progression from the concrete and factual to the complex and applied, and that it is a waste of time and effort to try to do the second before the first.
As you will have gleaned from my post, I am sceptical of the capability of teachers (on which I think you propose to rely) to create the sort of instructional materials (aka tests) and assignment / sequencing / tracking tools that we need, especially when targeting the more complex and applied forms of learning. I think I am more accepting of the importance of feedback (and hence, a certain sort of personalisation) than you. I also think that it is through the centralisation of data, collected automatically by digitally-mediated instructional software and processed by digital analytics systems, that we can achieve with relative ease a much greater degree of objectivity in the assessment of more complex and applied tasks than at present. The secret is that what might be dismissed as subjective judgments by teachers and examiners need to be corroborated against each other and against more objective (probably prerequisite) measures, and the reliability or unreliability of those judgments acknowledged and appropriate compensation made.
Although I criticise the current politicians for not being bold enough, I think one also has to consider the question of political risk and feasibility. The fact that you admit that your proposal to remove essays would attract massive opposition is, I suspect, to recognise that in political terms it is likely to be a non-starter; while removing Ofsted’s role entirely would get a similar reaction from those concerned (not without reason) about the dysfunction of our education system. Could your Ofsted-free status survive its first scandal, when some terrible abuse was exposed in one of these uninspected schools? For that reason, I think that there is the possibility of a slightly more slowly-slowly approach. The remit of Ofsted can be reduced to sweeping at the rear for unsatisfactory schools, placing more weight on data in its reports, and giving well-performing schools more leeway. That seems at least to be the direction of travel under Amanda Spielman, so there is perhaps room for hope.
My own proposals, which focus on creating an environment that will support a more efficient market for edtech, have the benefit of being non-invasive from the perspective of political risk. Given clear educational objectives and effective means of achieving those objectives while at the same time reducing workload, teachers will willingly suck up the tools that a reinvigorated supply industry will offer them. If the industry doesn’t come up with anything that floats their boat, no-one need buy anything or take to the streets in protest.
I don’t mean to pick arguments unnecessarily – but I think it is useful to explore where we differ as well as where our views overlap. I think the key point that we both share is the fundamental need to introduce objective measures of performance, if the steady drift and decay is to be halted and the question “what works?” is to be given any meaning. And I am also very happy to join you in saying that the best place to start on that effort is by looking at objectively measurable capabilities based on knowledge and simply procedural learning.
Thanks again – I really enjoyed your paper – and I hope that we have an opportunity some time to meet and to continue the conversation. Crispin.
Many thanks for your detailed response. You’ve raised a lot of points, and I’ve got a busy day in front of me, so I’ll just comment on a few of them.
In the first instance, I acknowledge that abolishing essay questions from GCSEs is not now in the realm of political reality, but by the same token we are making progress in that direction. I would contend that we really don’t know how to teach children to write well, and this certainly isn’t for a lack of trying. However, the main objections remain: essay questions focus teaching on unproductive activities, and they give us an extremely limited picture of what pupils have learned. I allow that in English Literature and to a lesser extent in History (my first degree subject) it’s rather difficult to fully test learning with objective questions, but in STEM subjects essay questions serve little purpose.
On the question of skills, I like to quote Heather Fearn, who taught her own children maths: “Understanding is just not a biggy. Identify the knowledge necessary to calculate the component parts of a problem and get fluency in those and generally activities for understanding become a (crucial but) small part the maths diet.” In any case, the division between knowledge and skills is to some extent a red herring: knowledge is in itself hierarchical; a well-structured curriculum is far more than rote memorisation of facts. Facts in themselves are meaningless out of context.
However, the greatest advantage of Michaela-style knowledge tests is that they motivate pupils to a degree that is simply not understood by teachers who’ve grown up with the belief that competition demotivates the least-able pupils. My own teaching experience (other than training in the building trades and the Army) is exclusively in teaching literacy skills to SEN pupils, and every single lesson I’ve ever delivered ended with a competition or test of some sort. And my pupils thrived on this – if my lesson ran a couple of minutes late, a hand was guaranteed to shoot up: ‘Please, sir, can we have our competition?’. We should bear in mind that teams in the relegation zone try even harder than those in the middle of the league standings.
My co-author, Colin McKenzie, has initiated weekly knowledge checks in Science at his comprehensive in Burnley, and they have transformed the classroom climate to such a degree that they are now being rolled out across the curriculum. Far from being a reductionist approach, as pupils accumulate knowledge, their interest in the subject increases. This should be common sense: the more we know about a subject, the more likely it is that new information will connect with what we already know, thereby building increasingly rich schemata. From the teachers’ perspective, life is vastly easier – instead of the complex actions required by AfL, pupils can be relied upon to test each other on the relevant material in their knowledge books. They really are that motivated.
Just about enough time for a comment on Ofsted – in my 2009 CPS paper on School Quangos, I argued that Ofsted should only inspect teaching and learning if exam results fell below a certain level, if parents requested it, or if there were any evidence of impropriety. Although most of my other recommendations were acted on, I knew at the time that this was a bridge too far. Nick Gibb told me as much – ministers view Ofsted as their main instrument for initiating change, and I was unable to convince him that the result would be to make a good idea unpopular with the profession. I don’t share your optimism about Amanda Spielman; although I certainly agree with most of her vision for what schools should be doing, only serving teachers understand how these things play out in schools when Ofsted takes up the cudgel. I’d like to post Colin’s reaction to her ideas, but it’s largely unprintable!
Thanks Tom,
Likewise, I am also rushing today, but time for some quick responses.
I completely agree with you on the motivating effect of knowledge tests, particularly on those who struggle with the more complex objectives. That is also my experience as a History teacher, as was the need to set a knowledge test before letting students write an essay: otherwise you would completely waste your time reading essays written by students who knew nothing about the subject. Finally, I was always struck by the willingness of other teachers to give reasonable marks to students for essays, simply on the basis that they had written something that might possibly be construed as relevant to some rubric, even when it was clear they had not even begun to answer the question.
Nevertheless, my own experience of reading History was a continual dirge of teachers telling me to “answer the question” and not simply to regurgitate a lot of facts that might possibly be relevant. In my introspective view, which also coloured my own teaching, this was the most valuable lesson that I think I learnt from the subject, and I would suggest that there is a very clear difference between an essay that carries through a clear argument and one that is merely a catalogue of facts. So, even though I accept that we have not been successful in teaching students how to write this sort of essay (probably partly because the majority of History teachers at secondary level have a pretty hazy idea of it themselves), I do not agree that it is not worth teaching. But as disagreements go, I do not really think that this one matters. Given the political impossibility of removing essays from the curriculum anyway, I would be completely aligned with your recommended actions, to codify and emphasise facts, at least as a first step up the Bloom ladder.
Peer-to-peer conversations and tests are a great way to scale interaction in the classroom (something that Dylan has focused on) and I agree that with clearly defined objectives, then this can work well, especially (as you say) when everyone is motivated. Eric Mazur’s peer instruction offers a structured approach to leveraging peer-to-peer interaction on more applied problems – though much of the experience of groupwork is that peer-to-peer interaction quickly goes off the rails if the performance of the group is not assessed fairly quickly by a more authoritative figure.
My own argument for edtech is that it can support rapid, authoritative feedback of a relatively simple kind, and is therefore well adapted to supporting the practice of this sort of factual and simple procedural knowledge. With a bit of imagination (as used by Mazur), it can also be incorporated into other sorts of interaction to keep them on track, without it becoming a matter of everyone strapping on their virtual reality headsets and disappearing into their own private universe. I think that Ministers are (or ought to be) highly receptive to these ideas, but are understandably cautious about edtech because of the large amounts of guff that have been produced about it, its association with progressive theories of independent learning, and the amount of money that government wasted under New Labour on tech projects in education and elsewhere. That is why I think they need to be made to understand that market-based approaches allow the government to provide much needed leadership in this area, without taking major political or financial risk.
As for Amanda Spielman, I am not necessarily optimistic, nor have I been following her progress very closely. I merely recognise that lighter-touch inspection and more reliance on data seem to be the ostensible direction of travel. My last contact with her was during a Select Committee conference on the purpose of education, about a year ago, when she interrupted my question from the audience, in which I was saying that single-shot exams were intrinsically unreliable, to say that I was well behind the times if I thought that exams were continuing to emphasize reliability at the expense of validity. I certainly thought it was not a good sign that she had so completely misunderstood the point that I was making.
Thanks again for the interesting discussion. Crispin.
Very impressive work. And lines up very coherently with my own experience. I will be re-reading… Great job.
Hi Crispin,
This is the comment I tried to post off the back of our Twitter exchange, but which got lost in the ether. I hope it gets through this time.
Best,
JL
While there’s much to agree with here in terms of your critique of curriculum, the fundamental flaw in your argument from my point of view is your easy dismissal of Socratic dialogue as unable to scale to the demands of the modern classroom.
This idea of upscaling underpins and undermines your argument for far greater command and control of curriculum and pedagogy – better rubrics, tighter scaffolding, more exemplars – which feed into your belief in marketisation, and your comparison of the teacher to the saddlemaker.
You repeat a number of times that teachers operate in isolation – something we can both agree is fundamentally detrimental to their practice – yet your model goes on to reiterate their isolation, albeit on a different level. While you would have them less isolated in their practice, you validate their isolation from their purpose. The result will be the same.
Going back to Socrates, it is clear that his dialogues did him no favours either. While they have left a lasting imprint on our culture – and that only thanks to Plato, who abandoned the dialogical approach in favour of instruction – they led only to his death sentence. But why?
I’ve never heard this question asked, which is probably down to my lack of erudition, but I have my own theory. Socrates was put to death by his community. Offered the opportunity to escape this fate, he rejected it specifically on the grounds that to run away would be to acquiesce to their guilty judgment.
Accused of corrupting the youth and impiety to the gods of Athens – yet innocent, as far as he was concerned – to then flout Athens’s laws by evading the hemlock would have been to make himself guilty. Instead, he chose to make himself an example, but an example of what?
Our Christian heritage leads us to think of martyrdom, of personal sacrifice in the name of truth. And yet, Socrates and the very notion of dialogue weren’t about truth at all, but undermining anyone who pompously thought they had privileged access to it.
Socrates wasn’t much of a pragmatist. He didn’t solve problems, but he was very good at positing them. Our tradition, unfortunately (and ironically only this one time!), seems to have focused more on what he did well than on what he didn’t. While this has led, eventually, to prizing the scientific method, it has meant that we have by and large failed to learn his lesson.
Had Socrates been more of a pragmatist, he might have realised earlier that in order to question and challenge social norms and assumptions, in order to educate children, specifically, to be critical thinkers (a requisite of democratic society), you have to have their parents on side. You have to bring along the whole community for the ride. Socrates didn’t, and his chickens came home to roost.
We still don’t, and teachers are continually sentenced to career death because of it. Your model, in the end, can only be another iteration of that. Isolated from their purpose, which in the end IS community, teachers can continue to be saddlemakers, headteachers to be cavalry, and EdSecs to be generals.
But in this centralised model of democracy, where manifesto commitments on education are furthest from most voters’ minds, the General Melchetts will continue to enjoy their filet mignons while the drafted saddlemakers and the occasional cavalryman go over the top of the trenches never to return.
Teachers working in isolation is a sin. But isolation from whom? Other teachers? Experts? Yes, but these are the easy answers. We might add policy makers. Your saddlemaker analogy doesn’t preclude that. The cavalryman needs to know what is possible at least, so that he doesn’t waste resources on flights of fancy.
The harder answer – absent entirely from your post – is parents. Our educational thinking ignores them altogether, or else denigrates them. At best, it includes them tokenistically in a Parent Forum or asks their opinion in a patronisingly limited way like Ofsted’s ParentView survey.
Independent schools are vaunted for doing it better because parents pay, and can choose not to. As if this transactional relationship is a meaningful one. In the end, their parental model is based on the same principle: trust. And it’s amazing how much trust we afford to the things we pay a lot for, and unfathomable its effect!
But why should parents trust us? An Ofsted banner? A catchment full of similar people whose trust seems rewarded? Exam results? All of these are very poor substitutes for building relationships. They’re shortcuts. In fact, they might even be morally dubious if we take the time to analyse them.
So before I endorse a model that wants to relegate me to a saddlemaker, I’d like to know who’s riding in the saddles. Because if I don’t like the cut of his gib(b), I’ll retrain to be a plumber.
And before I accept a market of tools, I want to know who I’m deploying them for, and whether buying them from this or that seller is commensurate with mine and my ‘customer’s’ long-term interests.
And before I dismiss Socratic dialogue, I’d like to try it, with parents and carers on side creating and supporting a culture of learning. With them on the team, the ratio changes from 1:30 to at least 31:30. Now THAT is scaling up.
And before we make any other changes to curriculum, perhaps we could start by making changes to accountability incentives. Because teachers and schools rarely fail students alone. The community has failed somewhere too. Just like it did for Socrates and his students.
Hi, thanks for the comment.
1. You say I am wrong to argue that the Socratic dialogue doesn’t scale – but it seems to me to be obvious that one teacher talking to a couple of students for a long period (ie. a tutorial) is very expensive. So I don’t understand your point here.
2. I am not arguing for “command and control” of pedagogy and not necessarily of curriculum. Even if we centralise the production of pedagogical tools, it is for teachers to select those that they believe will work. As for curriculum, for the most part this needs to be set by organisations representing a variety of stakeholder interests, not the DfE.
3. I don’t understand what you mean by teachers being isolated from their purpose. Their practice would be driven by clearly described purpose, which is not at all to be isolated from it.
4. I don’t agree that the validity of Socrates’ work was undermined by his execution. Socrates went cheerfully to his death because he did not value his life, nor the unjustified views of the community precisely because, as you say, truth is not about who you are – and that includes the majority or community. I certainly reject your view that edu purpose is community.
5. I am not arguing for a hierarchy of (often incompetent) authority as your Blackadder analogy suggests, but for respect for expertise. In some degree, parents themselves need to have more of a say in determining the purpose of schools, which is what is achieved by some degree of marketisation of schools (though I think this is only achievable and desirable to a limited degree) and parents are not the only customers here. What I am arguing for is more of a social market for curricula, driven by common expertise and public debate.
6. Engaging parents as assistant pedagogues seems to me to be entirely beneficial and I think a lot of schools are quite rightly placing more emphasis on parental engagement. But I don’t think it is helpful to blame the parents for being disengaged when what we can change and what is at issue is the effectiveness of schools. Instead we should be thinking of parental engagement as part of the pedagogy of the school.
7. Socrates, by the way, was very aware of the “problem of parents”. His answer was to take children away from parents at birth and bring them up by experts who knew what they were doing. I don’t agree with him, but it is at least an response to the problem!