Why progress on edtech is dependent on a better understanding of educational purpose
If my last post was a light-hearted love story, this one is more of an attempt to write a Tractatus Logico-Philosophicus. But I hope that the reader will be compensated by finding the argument to be original. I make the case that the recommendations of the recent Commission on Assessment without Levels are fundamentally mistaken.
It is a response I submitted to the current House of Commons Select Committee’s enquiry into the purpose of education. When announced, this enquiry was dismissed by Andrew Old as a waste of time: the purpose of education, said Andrew, was simply to make people cleverer. While I know where Andrew is coming from (who needs another waffle-fest with a lot of high-faluting rhetoric?) and agree with Andrew most of the time, I disagree with him that this is a pointless enquiry, if the question is understood in the way that I will suggest.
As my previous posts have argued (especially How technology will revolutionize education), technology is not about generic kit, but about the systematic means by which we pursue our ends. If no-one can be sure what our ends (or purposes) are, then any technological approach to education is doomed to fail. If we are not taking a technological approach to the business of teaching itself, then what hope is there in applying digital technology to education productively? Judging by the success we have had so far, the answer has to be “not a lot”.
My contention in this piece is therefore that the development of systematic pedagogy will never be achieved until (a) we know how to describe our educational objectives more clearly and consistently, and (b) until we understand the role of digital technology, both in supporting the description of educational objectives and in implementing the pedagogies required to attain those objectives.
Education has many purposes and few of them will be directly defined by central government. Key academic objectives are best set by academic specialists and workplace skills by employers’ organisations, while schools may offer supplementary curricula to create a distinctive appeal to learners and parents.
Central Government’s principal responsibility is to ensure that all these objectives can be defined clearly, in ways that allow for the performance of students, education providers, learning materials and instructional techniques to be measured against those objectives. The Committee is right to bracket purpose and quality. “Quality” is a measure of the effectiveness of the means by which a purpose is achieved. Without a clearly defined purpose, no objective measure of quality is possible.
The 1988 National Curriculum tried for the first time to define learning objectives at a level of granularity that would allow for the effective management of the education service. It was a fundamental assumption of the 1987 Task Group on Assessment and Testing (TGAT) that:
A school can function effectively only if it has adopted clear aims and objectives.
By the scrapping of so-called “levels” in 2014, it was implicitly recognised that the 1988 National Curriculum had failed in its aspiration to map these objectives. Against the background of a retreat from “levels”, there has been a general assumption that the purpose of education should be established by more informal and decentralised processes, with every school enjoying a substantial degree of autonomy.
This doctrine has appeared to be consistent with the finding of the PISA reports, which showed the benefits of local autonomy in high-performing education systems. It was also justified by supposing that different schools were likely to have significantly different educational priorities.
This argument has become confused. First, the meaning of key words like “curriculum” and “pedagogy” has become muddled. It is not clear whether “curriculum” is used to refer to an aggregation of educational objectives (i.e. the ends of education), or to programmes of study (i.e. the means by which such objectives are to be achieved).
This conflation of ends and means is a long-standing problem amongst educationalists and has been deliberately encouraged by many academics who oppose any “instrumental” view of education. In 1981, Professor Brian Simon wrote a seminal article, “Why no pedagogy in England?”, complaining of a discourse about education “reflecting deep confusion of thought, and of aims and purposes, relating to learning and teaching – to pedagogy.” In thirty-five years, little has changed.
This paper will argue that:
- the benefits of local autonomy, shown by PISA, apply mainly to the selection of appropriate instructional techniques (i.e. the means of education), rather than to the selection of key learning objectives (i.e. the ends of education);
- while different types of student may wish or need to pursue different educational pathways, the degree of variability that this introduces is not so great that every school needs to work out its own key educational objectives—the default position is that schools will share most of their educational objectives, certainly up to GCSE;
- the development of a coherent curriculum, the effective sequencing of educational objectives, and the development of valid and reliable assessments all require considerable time and expertise, which is a powerful argument for a greater centralization of these functions;
- we now have an opportunity to use new digital technologies to manage complex curriculum mappings and to track student progress against those objectives—allowing us to handle the considerable complexity that such an approach entails;
- many of the assumptions made by recent government advisory groups, which emphasise the autonomous role of the classroom teacher acting on the basis of private intuition, will hinder the development of these opportunities.
The nature of educational objectives
Before suggesting how educational objectives can be clearly described, it is necessary to agree on their nature. Describing something as an objective tells us that it is the object of an aspiration to attain it, but it does not tell us whether the nature of that aspiration is to walk to a particular map reference, attract a mate, get a promotion, or attain eternal salvation. The term “objective” describes an intent but paradoxically says nothing about the object of that intent. What, then, is the nature of educational objectives?
The debate over the 1988 National Curriculum was characterised by a perceived dichotomy between “knowledge” and “skills”. Similar terms like “understanding”, “attitude”, or “cleverness” might also be used to describe different educational objectives. All these terms suffer from three problems:
- none of them are sufficiently abstract to be able to describe all educational objectives: some objectives might be described in terms of knowledge, some in terms of skill, some in terms of attitude and values;
- there is an increasing consensus that these different characteristics are interdependent and so any attempt to discriminate between knowledge and skill will often turn out to be problematic;
- they all describe internal psychological states that cannot be directly observed or evaluated.
It follows from the third point that the only way that we can define the aims of education is by observing external performance. This does not mean that we are interested in the performance itself (which is a short-term phenomenon that is soon finished), but in the disposition to produce similar performances in the future. The best term for the disposition to perform in particular ways is “capability”.
I therefore propose that:
- educational objectives should be considered as referring to the acquisition and improvement of different capabilities;
- that capability should be described in terms of the different sorts of performance with which that capability is associated;
- that performance refers to a set of actions that is executed by the actor in the expectation of achieving a certain type or quality of outcome;
- and that “learning” should refer to an increase in capability, where an increase in capability is associated with an increase in the quality of corresponding types of performance.
The diagram below shows that by the observation of performance, we are able to infer capability, while the attribution of capability to a performer can be checked by the accuracy with which it predicts future performances.
Several conclusions may be drawn from these relationships between performance, capability and learning.
- The inference of capability is a probabilistic process and should therefore be accompanied by an indication of the confidence with which that inference has been made.
- It is unlikely that reliable inferences of capability can be made without repeated sampling. Depending on the nature of the capability being inferred, effective repetition of sampling will often require that performances are observed in a range of different contexts. In most situations, the capability being assessed will not be tied to a single criterion of performance, but to a certain sort of performance.
- Where repeated sampling is performed by different assessors and/or processes, the aggregation of those samples will need to take into account not only the confidence with which individual inferences are asserted, but the reliability that is imputed to the attributer, on the basis of the predictive reliability of the inferences that the attributer is commonly observed to make.
- The best way to describe capability in the abstract is to describe the performances with which the capability is associated. While the “can do…” rubrics that are commonly associated with criterion referencing may appear to describe such performances, they are often subject to different interpretations, or are found to include performances of very different levels of accomplishment. It is therefore desirable that capability descriptions are accompanied by clear exemplars of the sort of performance with which the capability representation is associated.
- An exemplar is not a reference: its purpose is to elucidate an understanding of the capability as an abstract concept. The capability does not consist in the ability to ape a particular exemplar. Exemplars should be multiple and replaceable. Those working with a capability representation ought to be able to create new exemplars that are accepted by other professionals. Aggregate data showing the quality of student performances across a body of exemplars ought to be consistent. These processes illustrate how the coherence of a capability representation can be tested.
- Unlike criteria or competencies, the expression of capability is generally scalar and not binary (true/false). A capability may be attributed to a performer at different degrees of proficiency.
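The way in which repeated inferences might be aggregated can be sketched in a few lines of Python. This is a minimal illustration and not a proposed standard: the `Inference` record, its 0 to 1 scales, and the rule of weighting each inference by its confidence multiplied by the reliability of the assessor are all assumptions made for the purpose of the example.

```python
from dataclasses import dataclass


@dataclass
class Inference:
    """One inference of capability drawn from one observed performance."""
    score: float                 # inferred proficiency, 0 to 1 (scalar, not pass/fail)
    confidence: float            # confidence attached to this inference, 0 to 1
    assessor_reliability: float  # track record of the assessor or process, 0 to 1


def aggregate(inferences):
    """Combine repeated inferences into (estimate, total_weight).

    Each inference is weighted by confidence * assessor_reliability, so a
    hesitant judgement from an unproven assessor counts for less.
    """
    weights = [i.confidence * i.assessor_reliability for i in inferences]
    total = sum(weights)
    if total == 0:
        raise ValueError("no usable evidence")
    estimate = sum(w * i.score for w, i in zip(weights, inferences)) / total
    return estimate, total
```

The total weight returned alongside the estimate gives only a rough indication of how much evidence lies behind it. A production system would need a properly justified statistical model in place of this simple weighted mean.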
Many teachers and educationalists condemn any focus on “performative” measures, which are taken to give an unsatisfactory account of true understanding. What is generally meant by this criticism is that a durable mastery cannot be demonstrated by a single performance. The underlying capability might degrade quickly after being learnt or might be closely linked to a particular context, often being the same context in which the capability is first learnt. Although the underlying insight is valid, it is not accurate to express this point by criticising performative measures: as already discussed, understanding cannot be developed, described or assessed except through performance. But although learning must be defined and measured in terms of performance, it is the disposition to perform, rather than the performance itself, which is of real interest—and the disposition or capability to perform can only be measured by repeated performances in varied contexts and often over an extended period of time.
The phrasing of educational objectives as binary criteria or competencies tends to encourage a tick-box approach to assessment, in which a single performance is taken as satisfactory evidence of mastery. This is an important reason for the failure of criterion-based educational objectives, being cited by the Commission on Assessment Without Levels as a major reason why the system of “levels” led to superficial mastery of individual criteria, and premature progression to more difficult topics.
In summary, the purpose of education is to increase the capability of students. Individual educational objectives are best represented by capability representations that describe a certain sort of performance. To be meaningful and coherent, a capability representation should normally be backed by appropriate exemplars and supported by processes to ensure that the different exemplars are consistent and interchangeable. Capability representations should also be supported by guidance on how to assess and express the quality of associated performances. Such expressions of the quality of performance will normally require a more complex syntax than the sort of binary true/false metric that has traditionally been associated with criterion-referencing.
All of these desirable qualities of educational objectives can be represented by a well-conceived data standard. Such a standard could provide the basis for powerful new digital systems that will be needed to manage learning and assessment at this level of precision.
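To give a sense of what a record in such a data standard might contain, here is a minimal sketch in Python. Every field name, the identifier scheme and the 0 to 1 proficiency scale are hypothetical, chosen only to mirror the qualities listed above (an issuing authority, a description, a scalar scale, and multiple replaceable exemplars). It does not describe any existing standard.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class Exemplar:
    """A sample task or piece of work that illustrates the capability."""
    identifier: str
    description: str


@dataclass
class CapabilityRepresentation:
    """One educational objective as published by one authority."""
    identifier: str          # hypothetical identifier scheme
    authority: str           # awarding body, publisher, school, etc.
    title: str
    description: str
    scale_min: float = 0.0   # proficiency is scalar, not true/false
    scale_max: float = 1.0
    exemplars: list = field(default_factory=list)  # multiple and replaceable


def to_record(capability):
    """Serialise a capability representation as a JSON data record."""
    return json.dumps(asdict(capability), indent=2)


# Illustrative data only.
fractions = CapabilityRepresentation(
    identifier="example:adding-fractions",
    authority="Example Awarding Body",
    title="Adding fractions",
    description="Adds fractions with unlike denominators.",
)
fractions.exemplars.append(Exemplar("ex-1", "Work out 1/3 + 1/4."))
print(to_record(fractions))
```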
Relationships between different capability representations
Capability representations may be created by many different authorities (“authority” in this context may refer to an awarding body, a national curriculum authority, a publisher, or an individual education provider). Such capability representations may have different sorts of relationship, one with another.
Overlap and equivalence
Capability representations created by different authorities will frequently overlap and/or have degrees of equivalence. For example, one authority may have a slightly different understanding to another of a capability in “Elementary Mathematics”. To some extent the two capability representations may be similar, to some extent they may be different. Such differences should not be excluded (for example by an attempt to create a single set of definitions) because some representations of basic objectives and the frameworks in which those individual representations are embedded, may turn out to be educationally more valuable than others. The premature standardisation of a single curriculum framework (as happened in the case of the 1988 National Curriculum) may prevent the development of superior frameworks that will turn out to be more conducive to good pedagogy.
At the same time, equivalence needs to be recognised, to enable the same underlying capability to be recognised when measured against symbolic representations that have been published by different authorities. Such equivalence can be asserted by data records that capture relationships, and such assertions can be tested (given sufficient data about student performance) by demonstrating the correlation of performance of different students, when measured against different capability representations.
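A test of asserted equivalence might look something like the following sketch, which correlates the scores of the same students against two capability representations. The function names, the input format (a mapping from student identifier to score) and the threshold of 0.8 are assumptions made for illustration. A high correlation is evidence consistent with equivalence and not proof of it.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length score lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length, at least 2 items")
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        raise ValueError("scores must vary for a correlation to be defined")
    return cov / sqrt(vx * vy)


def equivalence_supported(scores_a, scores_b, threshold=0.8):
    """Compare students measured against both representations.

    scores_a, scores_b: {student_id: score}. Returns (r, r >= threshold).
    The threshold is an arbitrary illustrative value.
    """
    shared = sorted(set(scores_a) & set(scores_b))
    r = pearson([scores_a[s] for s in shared], [scores_b[s] for s in shared])
    return r, r >= threshold
```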
High level capability representations (such as “Arithmetic” or “Teamwork”) will often need to be disaggregated into component capabilities. Such hierarchical frameworks can often help clarify objectives which might otherwise be extremely vague and subject to widely differing interpretation.
In these cases, the meaning of the parent capability will be shown not only by its own description and exemplars, but also by an aggregation of the capability descriptions provided by its component (or child) capabilities. Each framework publisher must be able to retain control over the declaration of child (or component) relationships, which affect the meaning of the parent.
The mastery of some capabilities will often be found to be a necessary prerequisite—or at least a useful preparation—for the mastery of other capabilities. The identification of such prerequisite or intermediate educational objectives will make an important contribution to the proper sequencing of instruction and to the development of good pedagogy in schools.
The identification of discrete educational objectives (in the form of capability representations) will allow the description of relationships between those objectives in ways that will help to improve our curriculum frameworks over time, will help define the meaning of high-level objectives, and will help improve the sequencing of instruction in our schools. Such integrated curriculum frameworks will form a necessary foundation for the development of effective digital management systems for education.
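Once prerequisite relationships have been recorded as data, a valid teaching sequence can be derived mechanically. The sketch below uses the `graphlib` module from the Python standard library (Python 3.9 or later). The capability names and their prerequisite links are invented for the example and carry no curricular authority.

```python
from graphlib import TopologicalSorter

# Each capability maps to the set of capabilities that must be mastered first.
# These entries are illustrative only.
prerequisites = {
    "adding fractions": {"equivalent fractions", "whole-number addition"},
    "equivalent fractions": {"multiplication facts"},
    "multiplication facts": {"whole-number addition"},
    "whole-number addition": set(),
}


def teaching_sequence(prereqs):
    """Return an ordering in which every prerequisite precedes its dependants.

    Raises graphlib.CycleError if the declared relationships are circular.
    """
    return list(TopologicalSorter(prereqs).static_order())


print(teaching_sequence(prerequisites))
```

A circular declaration (A requires B, B requires A) is reported as an error, which is one simple way in which a digital system could test the coherence of a published framework.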
The problem with assessment
The practice of assessment in UK and US schools has recently come under criticism, for example by Daniel Koretz in Measuring Up, by the 2011 Expert Panel for the National Curriculum Review, by the 2015 Commission on Assessment Without Levels, and by the teaching profession more widely. This criticism has often focused on the distorting effect of “teaching to the test” and can be summarised as follows.
- Tests tend to focus on what is easily measured rather than what is important, so distorting the curriculum.
- The association of tests with high-stakes consequences, both for teacher and student, tends further to distort teaching priorities and to reduce the reliability of assessment to the point, at its most extreme, of encouraging teachers and students to cheat.
- The confusion of formative and summative purposes of assessment inhibits teachers from realising the pedagogical advantages that research shows to be produced by the proper use of formative assessment.
- The demands of data collection for summative purposes have increased teachers’ workload.
The conclusions drawn by the Commission on Assessment Without Levels are:
- that formative and summative assessments should be kept separate,
- that data from formative assessments should not be kept for any longer than is required to make a single proceed/do-not-proceed decision,
- that summative assessments should be—as far as possible—light-touch and infrequent.
It is the contention of this paper that these conclusions are misguided in every respect.
The purpose of assessment
Central to the conclusions of the Commission on Assessment Without Levels is the view that teachers should not confuse the formative and summative purposes of assessment. In fact, assessment has only a single purpose, which is to discover the level of capability of the student in relation to the learning objective. That same knowledge, which is reliably achieved only through repeated assessments, is subsequently used for a variety of purposes, such as modifying further instruction, certificating the achievement of the student, improving the quality of pedagogy, and holding the education provider accountable for their service.
In other words, the process of assessment occurs in two stages:
- assess performance in order to infer capability (a single purpose);
- use this capability attribution in a variety of different ways.
It is argued that high-stakes uses of assessment results tend to distort the assessment process itself. That might be true—but it is a problem for summative assessment just as much as for formative assessment and is not, therefore, an argument for discriminating between different forms of assessment on the grounds of their different purpose. Such distortion will be mitigated by basing summative results on the aggregation of frequent, low-stakes assessment, as opposed to basing the summative conclusion on a single end-of-course exam. This suggests the opposite approach to assessment to the one recommended by the Commission.
It is argued that summative assessment must necessarily fall at the end of a period of instruction (in order to show the student’s final level of achievement) while formative assessment will need to occur during instruction (in order to have a chance to influence subsequent instruction). While it is true that summative assessment needs to reveal the capability of the student at the end of instruction, the reliability of that measurement will be improved if the capability of the student can be tracked throughout instruction. The result of the final exam can then be corroborated against a well-informed expectation of what any particular student’s final position is likely to be. A traditional navigator might use a similar principle when implementing a “running fix”, by which the position of the vessel is determined by a series of observations made over a period of time.
It might be argued that summative assessment only needs to show whether the student has mastered the educational objective, while formative assessment needs also to show the type of misconceptions that the student holds, so that appropriate corrective action can be taken. Similarly, it might be argued that summative assessment needs only to produce a single result (such as “A” for Maths), while formative assessment needs to collect information at a much more detailed level of granularity. These arguments stem in part from the assumption that mastery of the capability is expressed as a binary metric (true/false). In fact, the types of misconception held by the student are an important consideration when expressing their overall capability for summative purposes. Some misconceptions might be shown by aggregated data to be more difficult to correct than others, and therefore to suggest a lower level of mastery. To the extent that summative assessment produces metrics at a more generalised granularity than formative assessment, this is no reason why data collected for formative purposes cannot be aggregated to produce authoritative summative metrics.
The assumption by the Commission on Assessment Without Levels that assessment is performed for different summative and formative purposes ignores the views of many authoritative voices in the literature. The TGAT report, for example, states on p.25 that:
It is possible to build up a comprehensive picture of the overall achievements of a pupil by aggregating, in a structured way, the separate results of a set of assessments designed to serve formative purposes.
Nor is any rationale offered by the Commission’s report to support its rejection of this position, proposed by TGAT, among others.
The conclusion of the Commission on Assessment Without Levels would force the education service to continue to rely on end-of-course formal assessments as its source of summative data. Such exams suffer from several serious drawbacks:
- they rely on a single sample of student work, resulting in poor reliability (in spite of their formulaic nature, SATs exams are commonly estimated to give between 20% and 25% of students the wrong level);
- they increase stress levels on the student;
- given the Commission’s recommendation for the use of informal types of formative assessment (often verbal question-and-answer) whose results are neither preserved nor disseminated beyond the classroom, their position would perpetuate a model of teaching which is heavily dependent on isolated and unsupported classroom teachers, rather than focusing on the production of better learning resources or a move to more collaborative forms of team teaching;
- the recommendation that the data from formative assessments should not be preserved or aggregated indicates a willingness to tolerate high levels of inaccuracy in formative assessment.
Teaching to the test
In Measuring Up, Daniel Koretz uses political polling as an analogy for assessment: in this analogy the sample selected by a pollster for questioning represents the test, while the complete population whose views are of interest to the pollster represents the whole subject domain in the case of assessment. If the sample/test is not truly representative of the population/subject domain, then the results of the assessment will be invalid.
There are three reasons why the test may not be representative of the whole subject domain:
- because some aspects of the subject domain are difficult to assess reliably;
- because in order to reduce the possibility that students might misunderstand what the question is asking them, a certain amount of predictability is deliberately built into the questioning;
- because the high-stakes consequences of summative assessment incentivize both awarding bodies and teachers to increase the marks awarded to students, so that questions are made more predictable in the interests of improving perceived levels of performance.
In order to increase the fairness and reliability of single-shot, high-stakes exams, these exams have become increasingly formulaic. It is this trend that has been responsible for distorting the curriculum, not the principle of testing itself. Recent literature on this subject, such as Koretz, the Commission on Assessment Without Levels, and Dylan Wiliam’s Principled Assessment Design, all focus almost exclusively on the reliability of single assessments, at the same time as advising that the results of single assessments should be treated with considerable caution. Very little attention is given to the opportunities to improve reliability by the aggregation of results of repeated assessments that target a single, well-defined objective.
It is commonly observed, however, that there is a trade-off between reliability (or consistency) of assessments and the validity of the assessment (or its alignment with a recognised education objective). If repeated assessment can be used to improve reliability, so too can it be used to improve validity—or our ability to measure those aspects of the subject domain that might at first sight appear to be more difficult to measure.
The problem of “teaching to the test” can be solved by a combination of the following measures.
- Questioning should be made less predictable. This requirement is implied by the position (taken above) that capability is predictive of performance in a wide variety of contexts. It is also consistent with the need to ensure that students cannot be prepared for the test, without at the same time being taught all the aligned educational objectives (i.e. the entire domain). Such unpredictability will—at the level of individual assessments—reduce the reliability of the assessment by introducing a greater element of luck. Not having been forewarned of the context of a question, some students will find themselves well prepared and others will not. At the same time, the higher level of unpredictability will increase the validity of the assessment.
- Assessment should be prepared to use judgements, for example about creative work, that might normally be regarded as subjective. As with the first point, this will reduce the reliability of the assessment but increase its validity, by allowing for the assessment of creative processes and complex forms of problem solving.
- The decrease in reliability that will result from the first two points should be remedied by:
  - an increased frequency of sampling;
  - (as noted above) the use of analytics systems to evaluate the reliability of the assessors.
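The extent to which more frequent sampling can compensate for less reliable individual assessments can be estimated with the Spearman-Brown formula from classical test theory. That formula is not drawn from the sources discussed in this paper, and it assumes that the repeated assessments are “parallel” (they target the same capability with comparable error), which unpredictable questioning will satisfy only approximately. The sketch below should be read on that basis.

```python
def spearman_brown(single_reliability, n):
    """Predicted reliability of the aggregate of n parallel assessments.

    single_reliability: reliability of one assessment, between 0 and 1.
    n: number of assessments aggregated (at least 1).
    """
    if not 0 <= single_reliability <= 1:
        raise ValueError("reliability must lie between 0 and 1")
    if n < 1:
        raise ValueError("n must be at least 1")
    r = single_reliability
    return n * r / (1 + (n - 1) * r)
```

On these assumptions, ten assessments that each have a reliability of only 0.4 would give an aggregate reliability of about 0.87.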
Until now, there have been a number of reasons to avoid an increased frequency of sampling:
- it is prohibitively expensive;
- it relies on the judgement of classroom teachers, who may have an inconsistent understanding of the educational objective being assessed and/or a conflict of interest in respect of their students who are the subject of the assessment;
- the continuous collection of assessment data increases teacher workload;
- teachers complain that too much assessment represents a distraction from teaching.
The first three of these problems can be resolved if:
- summative and formative purposes are served by the same assessments, which are performed continuously throughout instruction;
- these assessments are centrally created and if digital means are used for their dissemination and administration, and for the collection of marks data;
- marking is either automated or, where teacher judgements are required for the assessment of higher-order thinking, the reliability of these judgements is automatically assessed by digital analytics systems, which can show how consistent the judgements are. This process, known as “calibration”, has long been recognised as an alternative and a complement to moderation as a way of ensuring consistency in assessment.
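A very simple form of calibration can be sketched as follows, on the assumption that several assessors have marked some of the same scripts. The function name, the input format and the use of the mean mark as the consensus are all illustrative choices. A real analytics system would need a more robust model, but the principle of comparing each assessor against the others is the same.

```python
from statistics import mean


def calibrate(marks):
    """Compare each assessor with the consensus on shared scripts.

    marks: {assessor: {script_id: mark}}
    Returns {assessor: (bias, spread)} where bias is the mean signed
    deviation from the consensus mark (positive means generous) and
    spread is the mean absolute deviation (lower means more consistent).
    """
    by_script = {}
    for assessor_marks in marks.values():
        for script, mark in assessor_marks.items():
            by_script.setdefault(script, []).append(mark)
    # Consensus here is simply the mean of all marks given to a script.
    consensus = {script: mean(ms) for script, ms in by_script.items()}

    report = {}
    for assessor, assessor_marks in marks.items():
        deviations = [m - consensus[s] for s, m in assessor_marks.items()]
        report[assessor] = (mean(deviations), mean([abs(d) for d in deviations]))
    return report
```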
The complaint by teachers that assessment is a distraction from useful instruction reflects:
- the formulaic nature of many current assessments (which would be remedied by the proposals above);
- a failure of many teachers to understand that practice (which, when outcome results are automatically recorded, is indistinguishable from assessment) is the principal means by which students learn.
It follows from these points that the frequent repetition of appropriately varied and unpredictable practice exercises, aligned with precisely described educational objectives, will not only improve the validity and reliability of assessment (subsequently used for both formative and summative purposes) but will also improve pedagogy by encouraging more frequent and more interesting practice in classrooms.
Autonomy, accountability and politicisation
Much of the hostility of teachers to the current testing regime stems from the use of summative results to hold teachers and schools to account in ways that are widely seen to be unfair. It is clear that the most important causal factor in test results is not the quality of instruction but the prior attainment and socio-economic background of the students.
Value-add measures attempt to compensate for these confounding variables by measuring the progress made between the start and end of a period of instruction. These measures are also unreliable, because, as Dan Willingham argues in Why students don’t like school, learning follows an exponential pathway: the more you know, the easier it is to learn. As Dylan Wiliam notes in Principled Assessment Design, if the assessment of the student’s starting position and finishing position are both unreliable, any vector plotted between the two positions will be doubly unreliable.
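The compounding of unreliability in a value-add measure can be illustrated numerically. The sketch below assumes that the errors in the starting and finishing measurements are independent, in which case their standard errors combine in quadrature. The function name and the figures in the usage note are invented for the example and are not taken from Wiliam.

```python
from math import sqrt


def value_added(start, end, se_start, se_end):
    """Return (gain, standard error of the gain).

    Assumes the measurement errors at the start and end are independent,
    so the variance of the difference is the sum of the two variances.
    """
    gain = end - start
    se_gain = sqrt(se_start ** 2 + se_end ** 2)
    return gain, se_gain
```

If a student moves from 40 to 50 marks, with standard errors of 3 and 4 marks respectively, the gain of 10 marks carries a standard error of 5 marks, which is larger than the error in either measurement taken alone.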
The resentment of teachers at being held to account by unfair and unreliable metrics is exacerbated by the perception that the awarding bodies and inspectors on whose judgements the performance of teachers is assessed are not themselves held to account for the reliability of those judgements.
A central argument of this submission is that:
- the reliability of assessment would be dramatically increased if it were continuous,
- that such continuous and reliable assessment is only practical if it is mediated by digital systems,
- that all inferences of capability should be accompanied by metrics that record the confidence with which the inference was made,
- that by the use of digital analytics systems, the reliability of the agencies that make such inferences can itself be assessed and taken into account when those inferences are aggregated.
The implications of this argument are profound. Agencies such as awarding bodies, on whose judgements the fate of teachers may often hang and who are currently perceived to operate in an unaccountable “black box”, would themselves become accountable. The question quis custodiet ipsos custodes? would have been answered.
It is likely that the reliability of value-add measures would increase at the same time as the confidence with which they are asserted would be reduced. In these circumstances, a data-driven accountability regime would be better justified, more authoritative, and more humanely applied.
A further resentment felt by teachers is that their practice is often subject to arbitrary pronouncements by politicians—a tendency that might be labelled as the “politicisation” of education. Ministers, conversely, may with some justification feel frustrated by the inconsistency of performance that the education service often appears to deliver.
In arguing for a clear separation between the ends and means of education, this paper assumes that the proper autonomy of teachers lies in their freedom to select the most effective means of education and not in deciding on the proper ends of education. By proposing better ways of describing educational objectives clearly, this paper assumes that educational objectives will be set by a wide range of consultative processes, which may in some circumstances involve Ministers along with other stakeholder groups.
Although it is not for any service provider to determine the purpose of his service, teachers will also have an important role in these consultative processes, working in two slightly different capacities:
- on the assumption that many education providers operate in a market, it is reasonable that schools should be able to create and advertise supplementary, non-mandatory curricula that are designed to give that provider a distinctive appeal to parents and students;
- even where ultimate objectives are set by more central authorities, it is a basic part of the job of the instructor to determine how a student can best progress towards these ultimate objectives—this will require the setting of “intermediate” objectives as part of what is an essentially pedagogical role.
In both these respects, some degree of curriculum autonomy will be required and will encourage innovation. The ability to express educational objectives in a consistent manner will help the sharing of such innovations between practitioners, where they are successful.
The combination of the clear definition of objectives, the reliable assessment of performance measured against those objectives, and the ability to correlate the importance of mastery achieved in respect of different objectives, will allow a more informed conversation to occur about the aims of education. This will avoid the perception that the objectives of the national education system are being unduly influenced either by the personal whim of individual Ministers or by the interests of the teachers.
The clear description of educational objectives will encourage the centralization of pedagogical services (such as the design of compelling instructional materials and assessments). Such centralization will help to increase the quality of instruction while at the same time reducing the workload of front-line teachers.
Why the National Curriculum failed
Many of the arguments made in this paper would have been familiar to the authors of the 1988 National Curriculum. The aspiration of the authors of the TGAT report was to create a single, national framework of educational objectives that would provide the foundation of a more systematic approach to pedagogy in the classroom, achieved to some extent by the centralization of the production of learning materials.
It is now generally accepted that the system of “levels” that characterised the National Curriculum has failed. It may be worth offering an explanation for that failure, in order to ensure that the proposals made in this paper would not be likely to repeat the same failures.
The first version of the National Curriculum proposed multiple Attainment Targets, which were conceived as reflecting at granular level what Dylan Wiliam has called the “big ideas” in each subject area. This system was quickly replaced in the early 1990s, on account of it being thought too complicated for teachers to manage. Similar complaints were made about the QCA’s Assessing Pupils’ Progress (APP) scheme, introduced in 2008.
This paper argues for a return to the representation of different educational objectives at granular level, even though such complex systems have proved unworkable in the past. The difference is that, given the creation of appropriate data formats, it is now possible to manage such a systematic approach to education using digital learning management systems. The potential for these systems is described in more detail below.
Prescription of unproven progression model
The second problem with the 1988 National Curriculum was that in proposing a scale of 10 (later 8) levels, it attempted to express progress following an unproven model of child development. It was assumed that children would naturally meet the criteria specified for level 3 before they could meet the criteria specified for level 4, with these levels of mastery being connected to certain age ranges. This model was often shown to be incorrect.
The requirement for curriculum coherence is a powerful reason not to specify a single National Curriculum but instead to provide the data formats by which different agents can prove and improve different curriculum models. This provides for a resilient, self-optimizing system that will benefit from the wealth of evidence that will be collected by digital systems and the power of modern analytics.
Insufficiently detailed specification
The third problem was that the “levels” were provided at such a high level of progression (on average, it was assumed that children would progress by one level every two years) that they were of little use for formative assessment. The assumption of this proposal is that individual capabilities will be defined at much finer levels of progression and will be associated with different metrics that would be more appropriate for formative use in the classroom.
Just as the levels of progression were provided at a very high level, so too was the number of attainment targets significantly reduced in 1993. The description of educational objectives at high level makes them less useful as guides to instruction, and means that the capability of students needs to be averaged out, making it more likely that particular misconceptions and weaknesses will be overlooked.
Poorly defined criteria
The fourth problem was that the criteria by which levels were defined were often vague and were not associated with appropriate exemplars. As well as providing such exemplars, this proposal would rely on digital analytics systems to show that the exemplars associated with a particular capability were consistent.
The need for a new approach to education technology
The approach to curriculum and pedagogy proposed in this paper can only be implemented in association with digital technology. This is required:
- to track the multiple capability representations and the relationships between them and to correlate their equivalences and interdependences;
- to ensure the consistency of the different exemplars by which the meaning of a capability representation is shown;
- to assess the performance and track the capability of individual students against those representations, doing so in a way that improves the reliability and validity of assessment, at the same time as encouraging frequent practice and the formative use of assessment results;
- to track the confidence with which different inferences of capability are made, the reliability of different assessing agents, and to aggregate assessment results that originate from different sources;
- to create instructional software engines that will support the increased frequency of assessment that this paper proposes, engaging students in instructional activity that promotes practice while at the same time automatically monitoring performance;
- to control the assignment, sequencing and tracking of those digitally-mediated instructional activities.
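The first and last of these requirements can be sketched together: a capability representation that records its dependencies, and a sequencing step derived from those dependencies. The `Capability` fields, the identifiers and the `teaching_sequence` function are hypothetical. The sketch assumes Python 3.9 or later and uses the standard library's `graphlib` module to order capabilities so that prerequisites come first.

```python
from dataclasses import dataclass, field
from graphlib import TopologicalSorter

@dataclass
class Capability:
    id: str                                                  # hypothetical identifier
    description: str
    prerequisites: list[str] = field(default_factory=list)   # ids this capability depends on
    exemplars: list[str] = field(default_factory=list)       # references to exemplar tasks

def teaching_sequence(capabilities):
    """Return capability ids ordered so that prerequisites come first."""
    graph = {c.id: set(c.prerequisites) for c in capabilities}
    return list(TopologicalSorter(graph).static_order())

# Hypothetical fragment of a curriculum model.
curriculum = [
    Capability("multiplication", "Multiply whole numbers", ["addition"]),
    Capability("addition", "Add whole numbers", ["counting"]),
    Capability("counting", "Count objects reliably"),
]
order = teaching_sequence(curriculum)
```

A real system would need far richer relationships than simple prerequisites, such as equivalences between capabilities defined by different agents, but the same principle of holding the relationships as data applies.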
The development of the suite of instructional and management software that will be required to implement such a vision might be thought by many to be unrealistically ambitious. The answer to this criticism is twofold.
- The development of these systems will be undertaken by the industry and not directly by government, meaning that the risk to government is relatively small. The main role of government will be to work with industry to develop appropriate data formats and open standards in order to stimulate the development of an open and efficient market in the new technology. The creation of the digital architecture that is outlined above is wholly dependent on the creation of appropriate open standards for the sharing of data.
- The objective should be achieved by small steps. This paper does not recommend that government should make another attempt at an all-encompassing national curriculum. Working on a data format that would allow curriculum objectives to be clearly expressed will allow a process of cautious advancement, starting by addressing appropriate “low-hanging fruit”. An example of such low-hanging fruit might be the recently announced ambition to focus on the mastery of multiplication tables by the age of 11.
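To give a sense of what a small first step might look like, the sketch below expresses the multiplication tables objective as a shareable record and checks a student's results against it. Every field name, the identifier and the mastery threshold are assumptions made for illustration; no existing standard is implied.

```python
import json

# Hypothetical objective record; the field names and threshold are assumed.
objective = {
    "id": "maths.multiplication-tables",
    "description": "Mastery of multiplication tables",
    "target_age": 11,
    "mastery_threshold": 0.95,  # assumed proportion of correct answers
}

def is_mastered(objective, correct, attempted):
    """True if the proportion of correct answers meets the threshold."""
    if attempted == 0:
        return False
    return correct / attempted >= objective["mastery_threshold"]

# Serialised form that different software suppliers could exchange.
serialised = json.dumps(objective, indent=2)
```

Because the record is plain data, the same objective could be read by instructional software, assessment engines and management systems from different suppliers.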
The ability to describe educational objectives clearly is fundamental to any conception of quality in education. There has been a long-standing failure to do this. The 1988 Education Act represents a radical attempt to address this problem—but it is an attempt that failed. The reaction to the abolition of levels by the Coalition Government has been driven by a determination to do something different—but without providing a clear account of what the National Curriculum was attempting to achieve or why it failed, and without proposing a clearly argued solution to those problems.
The fundamental reason for the failure of the 1988 Education Act was the inability of manual systems to manage the complexity that such a system, properly implemented, would entail. This problem was exacerbated by the lack of a clearly articulated theoretical basis for teaching practice and what were, as a result, poorly described objectives. Without access to data-driven analytics, the National Curriculum did not have the means to identify or rectify its own deficiencies.
The development in recent years of new digital technologies, capable of handling complex, data-driven systems, provides the opportunity to implement business systems that would improve the reliability, validity and transparency of education provision. Such systems would support teachers, reduce their workload, and provide them with the tools of the trade that they need to do their job consistently and at scale.
The government’s role in effecting such a transformation is not to embark on any high-risk grands projets, but rather to work with the industry, supporting the development of the open data formats that the industry needs if it is to address these problems in a coherent way, and supporting the right proof-of-concept trials to demonstrate how such a new approach could work.