NAPLAN

The good, the bad and the pretty good actually

Every year headlines proclaim the imminent demise of the nation due to terrible, horrible, very bad NAPLAN results. But if we look at variability and results over time, it’s a bit of a different story.

I must admit, I’m thoroughly sick of NAPLAN reports. What I am most tired of, however, are moral panics about the disastrous state of Australian students’ school achievement that are often unsupported by the data.

A cursory glance at the headlines since the NAPLAN 2022 results were released on Monday shows several classics in the genre of “picking out something slightly negative to focus on so that the bigger picture is obscured”.

A few examples (just for fun) include:

Reading standards for year 9 boys at record low, NAPLAN results show 

Written off: NAPLAN results expose where Queensland students are behind 

NAPLAN results show no overall decline in learning, but 2 per cent drop in participation levels an ‘issue of concern’ 

And my favourite (and a classic of the ‘yes, but’ genre of tabloid reporting):

‘Mixed bag’ as Victorian students slip in numeracy, grammar and spelling in NAPLAN 

The latter contains the alarming news that “In Victoria, year 9 spelling slipped compared with last year from an average NAPLAN score of 579.7 to 576.7, but showed little change compared with 2008 (576.9). Year 5 grammar had a “substantial decrease” from average scores of 502.6 to 498.8.”

If you’re paying attention to the numbers, not just the hyperbole, you’ll notice that these ‘slips’ are in the order of 3 scale scores (Year 9 spelling) and 3.8 scale scores (Year 5 grammar). Perhaps the journalists are unaware that the NAPLAN scale ranges from 1-1000? It might be argued that a change in the mean of 3 scale scores is essentially what you get with normal fluctuations due to sampling variation – not, interestingly, a “substantial decrease”. 

The same might be said of the ‘record low’ reading scores for Year 9 boys. The alarm is caused by a 0.2 score difference between 2021 and 2022. When compared with the 2008 average for Year 9 boys the difference is 6 scale score points, but this difference is not noted in the 2022 NAPLAN Report as being ‘statistically significant’ – nor are many of the changes up or down in means or in percentages of students at or above the national minimum standard.

Even if differences are reported as statistically significant, it is important to note two things: 

1. Because we are ostensibly collecting data on the entire population, it’s arguable whether we should be using statistical significance at all.

2. As sample sizes increase, very small differences can be “statistically significant” even when they are not practically meaningful (see the sketch below).
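To make these points concrete, here is a minimal simulation sketch. The mean of 550 and standard deviation of 70 are illustrative assumptions, not actual NAPLAN parameters: school-sized cohorts drawn from an identical population still produce means that wobble by several scale points, while with census-sized samples a 3-point difference is flagged as “statistically significant” even though it amounts to only a few hundredths of a student-level standard deviation.

```python
# Illustrative sketch only: the mean (550) and SD (70) are assumptions,
# not actual NAPLAN parameters.
import numpy as np

rng = np.random.default_rng(0)

# (a) School-sized cohorts drawn from an identical population: the mean
#     still moves around by several scale points from "year" to "year".
for year in range(2018, 2023):
    cohort = rng.normal(loc=550, scale=70, size=100)   # ~one school's year-level cohort
    print(year, round(cohort.mean(), 1))

# (b) Census-sized samples: a 3-point difference (on a roughly 1,000-point scale)
#     comes out "statistically significant", yet is only ~0.04 student-level SDs.
n = 250_000                                            # roughly a national year level
a = rng.normal(550, 70, n)
b = rng.normal(553, 70, n)
diff = b.mean() - a.mean()
se = np.sqrt(a.var(ddof=1) / n + b.var(ddof=1) / n)    # standard error of the difference
print(f"difference = {diff:.1f}, z = {diff / se:.1f}, effect size = {diff / 70:.2f} SD")
```

The exact numbers don’t matter; the shape of the argument does: tiny average differences can clear the significance bar without being practically meaningful.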

Figure 1. NAPLAN Numeracy test mean scale scores for nine cohorts of students at Year 3, 5, 7 and 9.

The practical implications of reported differences in NAPLAN results from year to year (essentially the effect sizes) are not often canvassed in media reporting. This is an unfortunate omission and tends to enable narratives of large-scale decline, particularly because the downward changes are trumpeted loudly while the positives are roundly ignored.

The NAPLAN reports themselves do identify differences in terms of effect sizes – although the reasoning behind what magnitude delineates a ‘substantial difference’ in NAPLAN scale scores is not clearly explained. Nonetheless, moving the focus to a consideration of practical significance helps us ask: If an average score changes from year to year, or between groups, are the sizes of the differences something we should collectively be worried about? 

Interestingly, Australian students’ literacy and numeracy results have remained remarkably stable over the last 14 years. Figures 1 and 2 show the national mean scores for numeracy and reading for the nine cohorts of students who have completed the four NAPLAN years, starting in 2008 (notwithstanding the gap in 2020). There have been no precipitous declines, no stunning advances. Average scores tend to move around a little bit from year to year, but again, this may be due to sampling variability – we are, after all, comparing different groups of students. 

This is an important point for school leaders to remember too: even if schools track and interpret mean NAPLAN results each year, we would expect those mean scores to go up and down a little bit over each test occasion. The trick is to identify when an increase or decrease is more than what should be expected, given that we’re almost always comparing different groups of students (relatedly see Kraft, 2019 for an excellent discussion of interpreting effect sizes in education). 

Figure 2. NAPLAN Reading test mean scale scores for nine cohorts of students at Year 3, 5, 7 and 9.

Plotting the data in this way, it seems evident to me that, since 2008, teachers have been doing their work of teaching, and students by and large have been progressing in their skills as they grow up, go to school and sit their tests in Years 3, 5, 7 and 9. It’s actually a pretty good news story – notably not an ongoing and major disaster.

Another way of looking at the data, and one that I think is much more interesting – and instructive – is to consider the variability in achievement between observed groups. This can help us see that just because one group has a lower average score than another group, this does not mean that all the students in the lower average group are doomed to failure.

Figure 3 shows just one example: the NAPLAN reading test scores of a random sample of 5000 Year 9 students who sat the test in NSW in 2018 (this subsample was randomly selected from data for the full cohort of students in that year, N=88,958). The red dots represent the mean score for boys (left) and girls (right). You can see that girls did better than boys on average. However, the distribution of scores is wide and almost completely overlaps (the grey dots for boys and the blue dots for girls). There are more boys at the very bottom of the distribution and a few more girls right at the top of the distribution, but these data don’t suggest to me that we should go into full panic mode that there’s a ‘huge literacy gap’ for Year 9 boys. We don’t currently have access to the raw data for 2022, but it’s unlikely that the distributions would look much different for the 2022 results.  

Figure 3. Individual scale scores and means for Reading for Year 9 boys and girls (NSW, 2018 data).

So what’s my point? Well, since NAPLAN testing is here to stay, I think we can do a lot better on at least two things: 1) reporting the data honestly (even when it’s not bad news), and 2) critiquing misleading or inaccurate reporting by pointing out errors of interpretation or overreach. These two aims require a level of analysis that goes beyond mean score comparisons to look more carefully at longitudinal trends (a key strength of the national assessment program) and variability across the distributions of achievement.

If you look at the data over time, NAPLAN isn’t a story of a long, slow decline. In fact, it’s a story of stability and improvement. For example, I’m not sure that anyone has reported that the percentage of Indigenous students at or above the minimum standard for reading in Year 3 has stayed pretty stable since 2019 – at around 83%, up from 68% in 2008. In Year 5 it’s the highest it’s ever been, with 78.5% of Indigenous students at or above the minimum standard – up from 63% in 2008.

Overall the 2022 NAPLAN report shows some slight declines, but also some improvements, and a lot that has remained pretty stable. 

As any teacher or school leader will tell you, improving students’ basic skills achievement is difficult, intensive and long-term work. Like any task worth undertaking, there will be victories and setbacks along the way. Any successes should not be overshadowed by the disaster narratives continually fostered by the 24/7 news cycle. At the same time, overinterpreting small average fluctuations doesn’t help either. Fostering a more nuanced and longer-term view when interpreting NAPLAN data, and recalling that it gives us a fairly one-dimensional view of student achievement and academic development would be a good place to start.

Sally Larsen is a Lecturer in Learning, Teaching and Inclusive Education at the University of New England. Her research is in the area of reading and maths development across the primary and early secondary school years in Australia, including investigating patterns of growth in NAPLAN assessment data. She is interested in educational measurement and quantitative methods in social and educational research. You can find her on Twitter @SallyLars_27

AERO’s writing report is causing panic. It’s wrong. Here’s why.

If ever there was a time to question public investment in developing reports using ‘data’ generated by the National Assessment Program, it is now, with the release of the Australian Education Research Organisation’s report ‘Writing development: What does a decade of NAPLAN data reveal?’

I am sure the report was meant to provide reliable diagnostic analysis for improving the function of schools. 

It doesn’t. Here’s why.

There are deeply concerning technical questions about both the testing regime which generated the data used in the current report, and the functioning of the newly created (and arguably redundant) office which produced this report.

There are two lines of technical concern which need to be noted. These concerns reveal reasons why this report should be disregarded – and why the media response is a beat-up.

The first technical concern for all reports of NAPLAN data (and any large scale survey or testing data) is how to represent the inherent fuzziness of estimates generated by this testing apparatus.  

Politicians and almost anyone outside of the very narrow fields reliant on educational measurement would like to talk about these numbers as if they are definitive and certain.

They are not. They are just estimates – as are all of the summary statistics in these reports.

The fact these are estimates is not apparent in the current report.  There is NO presentation of any of the estimates of error in the data used in this report. 

Sampling error is important and, as ACARA itself has noted (see, e.g., the 2018 NAPLAN technical report), must be taken into account when comparing the different samples used for analyses of NAPLAN. This form of error is the estimate used to generate confidence intervals and calculations of ‘statistical difference’.

Readers who recall seeing survey results or polling estimates being represented with a ‘plus or minus’ range will recognise sampling error. 

Sampling error is a measure of the probability of getting a similar result if the same analyses were done again, with a new sample of the same size, with the same instruments, etc.  (I probably should point out that the very common way of expressing statistical confidence often gets this wrong – when we say we have X level of statistical confidence, that isn’t a percentage of how confident you can be with that number, but rather the likelihood of getting a similar result if you did it again.)  
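As a rough illustration of what that ‘plus or minus’ range looks like (using made-up numbers rather than anything from the report, and assuming a student-level standard deviation of 70 scale points), the conventional calculation is a margin of about 1.96 standard errors around the mean – it shrinks as the sample grows, but never reaches zero:

```python
# Rough illustration with made-up numbers: the SD of 70 scale points is an
# assumption, not a value taken from the report.
import math

sd = 70                                   # assumed spread of individual scale scores
for n in (500, 5_000, 50_000, 500_000):
    se = sd / math.sqrt(n)                # standard error of the mean
    margin = 1.96 * se                    # half-width of a 95% confidence interval
    print(f"n = {n:>7,}: mean ± {margin:.2f} scale points")
```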

In this case, we know about 10% of the population do not sit the NAPLAN writing exam, so we already know there is sampling error.  

This is also the case when trying to infer something about an entire school from the results of a couple of year levels. The problem here is that we know the sampling error introduced by test absences is not random, and accounting for it can very much change trend analyses, especially for sub-populations. So, what does this persuasive writing report say about sampling error?

Nothing. Nada. Zilch. Zero. 

Anyone who knows basic statistics knows that when you have very large samples, the amount of error is far less than with smaller samples.  In fact, with samples as large as we get in NAPLAN reports, it would take only a very small difference to create enough ripples in the data to show up as being statistically significant.  That doesn’t mean, however, the error introduced is zero – and THAT error must be reported when representing mean differences between different groups (or different measures of the same group).

Given the size of the sampling here, you might think it OK to let that slide. However, that isn’t the only shortcut taken in the report. The second most obvious measure ignored in this report is measurement error. Measurement error exists any time we create some instrument to estimate a ‘latent’ variable – i.e. something you can’t see directly. We can’t SEE achievement directly – it is an inference based on measuring several things we can theoretically argue are valid indicators of that thing we want to measure.

Measurement error is by no means a simple issue, but it directly impacts the validity of any one individual student’s NAPLAN score and any aggregate based on those individual results. In ‘classical test theory’ a measured score is made up of what is called a ‘true score’ and error (+/-). In more modern measurement theories error can become much more complicated to estimate, but the general conception remains the same. Any parent who has looked at NAPLAN results for their child and queried whether or not the test is accurate is implicitly questioning measurement error.
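For readers unfamiliar with the shorthand, the classical test theory decomposition referred to above is usually written as follows. This is the generic textbook formulation (assuming the error term is uncorrelated with the true score), not anything specific to NAPLAN’s scaling:

```latex
% Classical test theory: an observed score X is a true score T plus error E.
% If E is uncorrelated with T, the observed variance splits into true-score
% variance and error variance, and reliability is the true-score share.
X = T + E, \qquad
\operatorname{Var}(X) = \operatorname{Var}(T) + \operatorname{Var}(E), \qquad
\rho_{XX'} = \frac{\operatorname{Var}(T)}{\operatorname{Var}(X)}
```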

Educational testing advocates have developed many very mathematically complicated ways of dealing with measurement error – and have developed new testing techniques for improving their tests. The current push for adaptive testing is precisely one of those developments, in the local case rationalised on the grounds that adaptive testing (where the specific test items asked of the person being tested change depending on prior answers) does a better job of differentiating those at the top and bottom end of the scoring range (see the 2019 NAPLAN technical report for this analysis).

This bottom/top of the range problem is referred to as a floor or ceiling effect. When a large proportion of students either don’t score anything or get everything correct, there is no way to differentiate those students from each other – adaptive testing is a way of dealing with floor and ceiling effects better than a predetermined set of test items. This adaptive testing has been included in the newer deliveries of the online form of the NAPLAN test.

Two important things to note. 

First, the current report claims the scores of high-performing students have shifted down – despite new adaptive testing regimes obtaining very different patterns of ceiling effect. Second, the test is not identical for all students (it never has been).

The process used for selecting test items is based on ‘credit models’ generated by testers. Test items are determined to have particular levels of ‘difficulty’ based on the probability of correct answers being given from different populations and samples, after assuming population-level equivalence in prior ‘ability’ AND creating difficulty scores for items while assuming individual student ‘ability’ measures are stable from one time period to the next. That’s how they can create these 800-point scales that are designed for comparing different year levels.

So what does this report say about any measurement error that may impact the comparisons they are making?  Nothing.

One of the ways ACARA and politicians have settled their worries about such technical concerns as accurately interpreting statistical reports is by introducing the reporting of test results in ‘Bands’. Now these bands are crucial for qualitatively describing rough ranges of what the number might mean in curriculum terms – but they come with a big consequence. Using ‘Band’ scores is known as ‘coarsening’ data – taking a more detailed scale and summarising it in a smaller set of ordered categories – and that process is known to increase any estimates of error. This latter problem has received much attention in the statistical literature, with new procedures being recommended for how to adjust estimates to account for that error when conducting group comparisons using that data.
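A minimal simulation sketch of that coarsening effect follows. The score distribution, the amount of measurement noise and the ten equal-width ‘bands’ are all invented for illustration (they are not ACARA’s band definitions); the point is simply that collapsing a fine-grained scale into ordered categories discards information, which is what inflates error in any comparison built on the banded scores.

```python
# Minimal sketch of "coarsening": the score distribution, the measurement noise
# and the ten equal-width bands are invented for illustration, not ACARA's
# band definitions.
import numpy as np

rng = np.random.default_rng(1)
true_score = rng.normal(500, 70, 100_000)                     # latent achievement (assumed)
observed = true_score + rng.normal(0, 25, true_score.size)    # scale score with error

# Coarsen the observed score into 10 equal-width bands and replace each
# student's score with the midpoint of their band.
edges = np.linspace(observed.min(), observed.max(), 11)
band = np.digitize(observed, edges[1:-1])                     # band index 0..9
banded = ((edges[:-1] + edges[1:]) / 2)[band]

# The banded version tracks the underlying score less closely than the full scale.
print("full scale :", round(np.corrcoef(true_score, observed)[0, 1], 3))
print("band scores:", round(np.corrcoef(true_score, banded)[0, 1], 3))
```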

As before, the amount of reporting of that error issue? Nada.

 This measurement problem is not something you can ignore – and yet the current report is worse than careless on this question.

It takes advantage of readers not knowing about it. 

When the report attempts to diagnose which components of the persuasive writing task were of most concern, it does not bother reporting that each of the separate measures of those ten dimensions of writing carries far more error than the total writing score, simply because the number of marks for each is a fraction of the total. The smaller the number of indicators, the more error (and the less reliability).

Now all of these technical concerns simply raise the question of whether or not the overall findings of the report will hold up to robust tests and rigorous analysis – there is no way to assess that from this report. But there is an even bigger reason to question why it was given as much attention as it was. That is, for any statistician, there is always a challenge in translating numeric conclusions into some form of ‘real life’ scenario.

To explain why AERO has significantly dropped the ball on this last point, consider its headline claim that Year 9 students’ persuasive writing scores have declined, and its representation of that as a major new concern.

First note that the ONLY reporting of this using the actual scale values is a vaguely labelled line graph showing scores from 2011 until 2018 – skipping 2016, since the writing task that year wasn’t a persuasive writing task (p. 26 of the report has this graph). Of those year-to-year shifts, the only two that may be statistically significant, and are readily visible, are from 2011 to 2012, and then again from 2017 to 2018. Why speak so vaguely? Because, from the report, we can’t tell you the numeric value of that drop: there is no reporting of the actual numbers represented in that line graph.

Here is where the final reality check comes in.  

If this data matches the data reported in the national reports from 2011 and 2018, the reported mean values on the writing scale were 565.9 and 542.9 respectively. So that is a drop between those two time points of 23 points. That may sound like a concern, but recall those scores are based on 48 marks given for writing. In other words, that 23-point difference is no more than one mark difference (it could be far less, since each different mark carries a different weighting in the formulation of that 800-point scale).

Consequently, even if all the technical concerns were adequately addressed and the pattern still held, a realistic headline for the Year 9 claim would be: ‘Year 9 students in the 2018 NAPLAN writing test scored one less mark than the Year 9 students of 2011.’

Now, assuming that 23-point difference has anything to do with the students at all, start thinking about all the plausible reasons why students in that last year of NAPLAN may not have been as attentive to detail as they were when NAPLAN was first getting started. I can think of several, not least being the way my own kids did everything possible to ignore the Year 9 test – since the Year 9 test had zero consequences for them.

Personally, I find these reports troubling for many reasons, including the use of statistics to assert certainty without good justification, but also because saying student writing has declined belies the obvious fact that it hasn’t been all that great for decades. This is where I am totally sympathetic to the issues raised by the report – we do need better writing among the general population. But using national data to produce a report of this calibre, by an agency beholden to government, really does little more than provide click-bait and knee-jerk diagnosis from all sides of a debate we don’t really need to have.

James Ladwig is Associate Professor in the School of Education at the University of Newcastle.  He is internationally recognised for his expertise in educational research and school reform.  Find James’ latest work in Limits to Evidence-Based Learning of Educational Science, in Hall, Quinn and Gollnick (Eds) The Wiley Handbook of Teaching and Learning published by Wiley-Blackwell, New York. James is on Twitter @jgladwig

AERO’s response to this post

ADDITIONAL COMMENTS FROM AERO, provided on November 9: for information about the statistical issues discussed, a more detailed Technical Note is available from AERO.

On Monday, EduResearch Matters published the above post by Associate Professor James Ladwig, which critiqued the Australian Education Research Organisation’s report Writing development: what does a decade of NAPLAN data reveal?

AERO’s response is below, with additional comments from Associate Professor Ladwig. 

AERO: This article makes three key criticisms about the analysis presented in the AERO report, which are inaccurate.

Ladwig claims that the report lacks consideration of sampling error and measurement error in its analysis of the trends of the writing scores. In fact, those errors were accounted for in the complex statistical method applied. AERO’s analysis used both simple and complex statistical methods to examine the trends. While the simple method did not consider error, the more complex statistical method (referred to as the ‘Differential Item Analysis’) explicitly considered a range of errors (including measurement error, and cohort and prompt effects).

Associate Professor Ladwig: AERO did not include any of that in its report, nor in any of the technical papers. There is no over-time DIF analysis of the full score – and I wouldn’t expect one. All of the DIF analyses rely on data that itself carries error (more below). There is no way for the educated reader to verify these claims without expanded and detailed reporting of the technical work underpinning this report. This is lacking in transparency, falls short of the standards we should expect from AERO and makes it impossible for AERO to be held accountable for its specific interpretation of its own results.

AERO: Criticism of the perceived lack of consideration of ‘ceiling effects’ in AERO’s analysis of the trends of high-performing students’ results, omits the fact that AERO’s analysis focused on the criteria scores (not the scaled measurement scores). AERO used the proportion of students achieving the top 2 scores (not the top score), for each criterion, as the matrix to examine the trends. Given only a small proportion of students achieved a top score for any criterion (as shown in the report statistics), there is no ‘ceiling effect’ that could have biased the interpretation of the trends.

Associate Professor Ladwig made his ‘ceiling effect’ comments while explaining how the NAPLAN writing scores are designed, not in relation to the AERO analysis.

AERO: The third major inaccuracy relates to the comments made about the ‘measurement error’ around the NAPLAN bands and the use of adaptive testing to reduce error. These are irrelevant to AERO’s analysis because the main analysis did not use scaled scores, it did not use bands, and adaptive testing is not applicable to the writing assessment.

Associate Professor Ladwig’s comment was about the scaling in relation to explaining the score development, not about the AERO analysis.

In relation to AERO’s use of NAPLAN criterion score data in the writing analysis, however, please note that those scores are created either through scorer moderation processes or (increasingly, where possible) text-interpretative algorithms. Here again, the reliability of these raw scores was not addressed, apart from one declared limitation, noted in AERO’s own terms:

Another key assumption underlying most of the interpretation of results in this report is that marker effects (that is, marking inconsistency across years) are small and therefore they do not impact on the comparability of raw scores over time. (p. 66)

This is where AERO has taken another shortcut, with an assumption that should not be made. ACARA has reported reliability estimates that could be included in the analysis of these scores. It is readily possible to report those and use them in trend analyses.

AERO: A final point: the mixed-methods design of the research was not recognised in the article. AERO’s analysis examined the skills students were able to achieve at the criterion level against curriculum documents. Given the assessment is underpinned by a theory of language, we were able to complement quantitative with a qualitative analysis that specifically highlighted the features of language students were able to achieve. This was validated by analysis of student writing scripts.

Associate Professor Ladwig says this is irrelevant to his critique. The logic of this is also a concern: using multiple methods and methodologies does not correct for any that are technically lacking. In relation to the overall point of concern, we have a clear example of an agency reporting statistical results in a manner that evades external scrutiny, accompanied by an extreme media positioning. Any of the qualitative insights into the minutiae these numbers represent will probably be very useful for teachers of writing – but whether or not they are generalisable, big, or shifting depends on those statistical analyses themselves.

Is the NAPLAN results delay about politics or precision?

The decision announced yesterday by ACARA to delay the release of preliminary NAPLAN data is perplexing. The justification is that the combination of concerns around the impact of COVID-19 on children and the significant flooding that occurred across parts of Australia in early 2022 contributed to many parents deciding to opt their children out of participating in NAPLAN. The official account explains:

“The NAPLAN 2022 results detailing the long-term national and jurisdictional trends will be released towards the end of the year as usual, but there will be no preliminary results release in August this year as closer analysis is required due to lower than usual student participation rates as a result of the pandemic, flu and floods.”

The media release goes on to say that this decision will not affect the release of results to schools and to parents, which have historically occurred at similar times of the year. The question that this poses, of course, is why the preliminary reporting of results is affected, but student and school reports will not be. The answer is likely to do with the nature of the non-participation. 

The most perplexing part of this decision is that NAPLAN has regularly had participation rates below 90% at various times among various cohorts. That has never prevented preliminary results being released before.

What are the preliminary results?

Since 2008, NAPLAN has been a controversial feature of the Australian school calendar for students in Years 3, 5, 7 and 9. The ‘pencil-and-paper’ version of NAPLAN was criticised for how statistical error impacts its precision at the student and school level (Wu, 2016), the impact that NAPLAN has had on teaching and learning (Hardy, 2014), and the time it takes for the results to come back (Thompson, 2013). Since 2018, NAPLAN has gradually shifted to an online, adaptive design whose tests, ACARA claims, “are better targeted to students’ achievement levels and response styles”, meaning that they “provide more efficient and precise estimates of students’ achievements than do fixed form paper based tests”. 2022 was the first year that the tests were fully online.

NAPLAN essentially comprises four levels of reporting. These are student reports, school level reports, preliminary national reports and national reports. The preliminary reports are usually released around the same time as the student and school results. They report on broad national and sub-national trends, including average results for each year level in each domain across each state and territory and nationally. Closer to the end of the year, a National Report is released which contains deeper analysis on how characteristics such as gender, Indigenous status, language background other than English status, parental occupation, parental education, and geolocation impact achievement at each year level in each test domain.

Participation rates

The justification given in the media release concerns participation rates. To understand this better, we need to understand how participation impacts the reliability of test data and the validity of inferences that can be made as a result (Thompson, Adie & Klenowski, 2018). NAPLAN is a census test. This means that in a perfect world, all students in Years 3, 5, 7 & 9 would sit their respective tests. Of course, 100% participation is highly unlikely, so ACARA sets a benchmark of 90% for participation. Their argument is that if 90% of any given cohort sits a test we can be confident that the results of those sitting the tests are representative of the patterns of achievement of the entire population, even sub-groups within that population. ACARA calculates the participation rate as “all students assessed, non-attempt and exempt students as a percentage of the total number of students in the year level”. Non-attempt students are those who were present but either refused to sit the test or did not provide sufficient information to estimate an achievement score. Exempt students are those exempt from  one or more of the tests on the grounds of English language proficiency or disability.
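Read literally, ACARA’s quoted definition amounts to the following simple calculation; the function name, parameter names and example figures here are mine, for illustration only:

```python
# A literal rendering of ACARA's quoted definition of the participation rate.
# The function name, parameter names and example figures are mine, for
# illustration only.
def participation_rate(assessed: int, non_attempt: int, exempt: int, total_enrolled: int) -> float:
    """Assessed, non-attempt and exempt students as a percentage of the year level."""
    return 100 * (assessed + non_attempt + exempt) / total_enrolled

# e.g. 85,000 assessed, 1,500 non-attempts and 1,000 exemptions in a cohort of 95,000
print(round(participation_rate(85_000, 1_500, 1_000, 95_000), 1))   # -> 92.1
```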

The challenge, of course, is that non-participation introduces error into the calculation of student achievement. Error is a feature of standardised testing: it doesn’t mean mistakes in the test itself; rather, it is an estimate of the various ways that uncertainty emerges in predicting how proficient a student is in an entire domain based on the relatively small sample of questions that make up a test. The greater the error, the less precise (i.e. less reliable) the tests are. With regard to participation, the greater the non-participation, the more uncertainty is introduced into that prediction.

The confusing thing in this decision is that NAPLAN has regularly had participation rates below 90% at various times among various cohorts. This participation data can be accessed here. For example, in 2021 the average participation rates for Year 9 students were slightly below the 90% threshold in every domain, yet this did not impact the release of the Preliminary Report.

Table 1: Year 9 Participation in NAPLAN 2021 (generated from ACARA data)

These 2021 results are not an anomaly; they are part of a trend that has emerged over time. For example, in pre-pandemic 2018 the jurisdictions of Queensland, South Australia, the ACT and the Northern Territory did not reach the 90% threshold in any of the Year 9 domains.

Table 2: Year 9 Participation in NAPLAN 2018 (generated from ACARA data)

Given the results above, the question remains: why has participation affected the reporting of the 2022 results, when the Year 9 results in 2018 and 2021 were not similarly affected?

At the outset, I am going to say that there is a degree of speculation in answering this question. Primarily, this is because even if participation declines to 85%, this is still a very large sample with which to predict the achievement of the population in a given domain, so it must be that something has not worked when they have tried to model the data. I am going to suggest three possible reasons:

  1. The first is likely, given that it is hinted at in the ACARA press release. If we return to the relationship between participation, error and the validity of inferences, the most likely way that an 85% participation rate could be a problem is if non-participation is not randomly spread across the population. If non-participation was shown to be systematic – that is, heavily biased towards particular subgroups – then, depending upon the size of that bias, the ability to make valid inferences about achievement in different jurisdictions or amongst different sub-groups could be severely impacted. One effect of this is that it might become difficult to reliably equate 2022 results with previous years. This could explain why lower-than-90% Year 9 participation in 2021 was not a problem – the non-participation was relatively randomly spread across the sub-groups.
  2. Second, and related to the above, is that the non-participation has something to do with the material and infrastructural requirements for an online test that is administered to all students across Australia. There have long been concerns about the infrastructure requirements of NAPLAN online such as access to computers, reliable internet connections and so on particularly in regional and remote areas of Australia. If these were to influence results, such as through an increased number of students unable to attempt the test, this could also influence the reliability of inferences amongst particular sub-groups. 
  3. The final possibility is political. It has been obvious for some time that various Education Ministers have become frustrated with aspects of the NAPLAN program. The most prominent example of this was the concern expressed by the Victorian Education Minister in 2018 about the reliability of the equating of the online and paper tests (see The Guardian’s report, ‘Education chiefs have botched Naplan online test, says Victoria minister’). During 2018, ACARA was criticised for showing a lack of responsible leadership in releasing results that seemed to show a mode effect, that is, a difference between students who sat the online versus the pen-and-paper test not related to their capacity in literacy and numeracy. It may be that ACARA has grown cautious as a result of the 2018 ministerial backlash and feels that any potential problems with the data need to be thoroughly investigated before jurisdictions are named and shamed based on their average scores.

Ultimately, this leads us to perhaps one of the more frustrating things: we may never know. Where problems emerge around NAPLAN, the tendency is for ACARA and/or the Federal Education Minister to whom ACARA reports to try to limit criticism by denying access to the data. In 2018, at the height of the controversy over the differences between the online and pencil-and-paper modes, I formed a team with two internationally eminent psychometricians to research whether there was a mode effect between the online and pencil-and-paper versions of NAPLAN. The request to ACARA to access the dataset was denied, with the explanation that ACARA could not release item-level data for the 2018 online items, presumably because they were provided by commercial entities. In the end, we just have to trust ACARA that there was no mode effect. If we have learnt anything from recent political scandals, perfect opaqueness remains a problematic governance strategy.

Greg Thompson is a professor in the Faculty of Creative Industries, Education & Social Justice at the Queensland University of Technology. His research focuses on the philosophy of education and educational theory. He is also interested in education policy, and the philosophy/sociology of education assessment and measurement with a focus on large-scale testing and learning analytics/big data.

Why appeasing Latham won’t make our students any more remarkable

Are our schools making the kids we think they should? The tussle between politics and education continues, and Latham is just the blunt end of what is now the assumed modus operandi of school policy in Australia.

Many readers of this blog will no doubt have noticed a fair amount of public educational discussion about NSW’s School Success Model (SSM) which, according to the Department flyer, is ostensibly new. For background NSW context, it is important to note that this policy was released in the context of a new Minister for Education who has openly challenged educators to ‘be more accountable’, alongside an entire set of parliamentary educational inquiries set up to appease Mark Latham, who chairs a portfolio committee with a very clear agenda motivated by the populism of his political constituency.

This matters because there are two specific logics used in the political arena that have been shifted into the criticisms of schools: public dissatisfaction leading to the accountability question (so there’s a ‘public good’ ideal somewhere behind this), and the general rejection of authorities and elitism (alternatively, easily labelled anti-intellectualism). Both of these political concerns are connected to the School Success Model. The public dissatisfaction is motivating the desire for measures of accountability that the public believes can be free of tampering, and that ‘matter’. Test scores dictate students’ futures, so they matter, and so on. The rejection of elitism is also embedded in the accountability issue. That is due to a (not always unwarranted) lack of trust. That lack of trust often gets openly directed at specific people.

Given the context, while the new School Success Model (SSM) is certainly well intended, it also represents one of the more direct links between politics and education we typically see. The ministerialisation of schooling is clearly alive and well in Australia. This isn’t the first time we have seen such direct links – the politics of NAPLAN was, after all, straight from the political intent of its creators. It is important to note that the logic at play has been used by both major parties in government. Implied in that observation is that the systems we have live well beyond election cycles.

Now in this case, the basic political issue is how to ‘make’ schools rightfully accountable and, at the same time, push for improvement. I suspect these are at least popular sentiments, if not overwhelmingly accepted as a given by the vast majority of the public. So alongside general commitments to ‘delivering support where it is needed’ and ‘learning from the past’, the model is most notable for its main driver – a matrix of ‘outcome’ targets. In the public document, those targets are set at the system level and the school level – aligned. NAPLAN, Aboriginal Education, HSC, Attendance, Student growth (equity), and Pathways are the main areas specified for naming targets.

But, like many of the other systems created with the same good intent before it, this one really does invite the growing criticism already noted in public commentary. Since, with luck, public debate will continue, here I would like to put some broader historical context around these debates and take a look under the hood of these measures, to show why they really aren’t fit for school accountability purposes without a far more sophisticated understanding of what they can and cannot tell you.

In the process of walking through some of this groundwork, I hope to show why the main problem here is not something a reform here or there will change. The systems are producing pretty much what they are designed to produce.

On the origins of this form of governance

Anyone who has studied the history of schooling and education (shockingly few in the field these days) would immediately see the target-setting agenda as a ramped-up version of scientific management (see Callahan, 1962), blended with a bit of Michael Barber’s methodology for running government (Barber, 2015), using contemporary measurements.

More recently, at least since the then-labelled ‘economic rationalist’ radical changes brought to the Australian public service and government structures in the late 1980s and early 1990s, the notion of measuring the outcomes of schools as a performance issue has matured, in tandem with the past few decades of increasing dominance of the testing industry (which also grew throughout the 20th century). The central architecture of this governance model would be called neo-liberal these days, but it is basically a centralised ranking system based on pre-defined measures determined by a select few, and those measures are designed to be palatable to the public. Using such systems to instil a bit of group competition between schools fits very well with those who believe market logic works for schooling, or those who like sport.

The other way of motivating personnel in such systems is, of course, mandate, such as the now-mandated Phonics Screening Check announced in the flyer.

The devil in the details

Now when it comes to school measures, there are many types, and we actually know a fair amount about most if not all of them – as most are generated from research somewhere along the way. There are some problems of interpretation that all school measures face, which relate to the basic problem that most measures are actually measures of individuals (and not the school), or vice versa. Relatedly, we also often see school-level measures which are simply the aggregate of the individuals. In all of these cases, there are many good intentions that don’t match reality.

For example, it isn’t hard to make a case for saying schools should measure student attendance. The logic here is that students have to be at school to learn school things (aka achievement tests of some sort). You can simply aggregate individual students’ attendance to the school level and report it publicly (as on MySchool), because students need to be in school. But it would be a very big mistake to assume that the school-level aggregated mean of the student attendance data is at all related to school-level achievement. It is often the case that what is true for the individual is not true for the collective to which the individual belongs. Another case in point here is the policy argument that we need expanded educational attainment (which is ‘how long you stay in schooling’) because if more people get more education, that will bolster the general economy. Nationally, that is a highly debatable proposition (among OECD countries there isn’t even a significant correlation between average educational attainment and GDP). Individually, it does make sense – educational attainment and personal income, or individual status attainment, are generally quite positively related. School-level attendance measures that are simple aggregates are not related to school achievement (Ladwig & Luke, 2013). This may be why the current articulation of the attendance target is a percentage of students attending more than 90% of the time (surely a better articulation than a simple average – but still an aggregate of untested effect). The point is more direct – often these targets are motivated by a goal that has been based on some causal idea, but the actual measures often don’t reflect that idea directly.
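A contrived simulation can make that aggregation point concrete. All the numbers below are invented: within each simulated school, students who attend more achieve more, yet the school-level averages of attendance and achievement are unrelated, because they are generated by independent school-level factors.

```python
# A contrived simulation (all numbers invented): within each school, students
# who attend more achieve more, but the school-level averages of attendance
# and achievement are generated independently of one another.
import numpy as np

rng = np.random.default_rng(2)
n_schools, n_students = 200, 150

school_attend = rng.normal(88, 4, n_schools)      # each school's mean % attendance
school_achieve = rng.normal(500, 30, n_schools)   # each school's mean scale score

attend_rows, achieve_rows = [], []
for s in range(n_schools):
    attend = school_attend[s] + rng.normal(0, 6, n_students)
    # Within the school, extra attendance is associated with higher achievement.
    achieve = school_achieve[s] + 3 * (attend - school_attend[s]) + rng.normal(0, 40, n_students)
    attend_rows.append(attend)
    achieve_rows.append(achieve)

attend_all = np.concatenate(attend_rows)
achieve_all = np.concatenate(achieve_rows)
print("student-level correlation:", round(np.corrcoef(attend_all, achieve_all)[0, 1], 2))

school_mean_attend = attend_all.reshape(n_schools, n_students).mean(axis=1)
school_mean_achieve = achieve_all.reshape(n_schools, n_students).mean(axis=1)
print("school-level correlation :", round(np.corrcoef(school_mean_attend, school_mean_achieve)[0, 1], 2))
```

This is the familiar ecological-fallacy pattern: an individual-level relationship does not license conclusions about relationships between aggregates, or vice versa.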

Another general problem, especially for the achievement data, is the degree to which all of the national (and state) measures are in fact estimates, designed to serve specific purposes. The degree to which this is true varies from test to test. Almost all design options in assessment systems carry trade-offs. There is a big difference between an HSC score – where the HSC exams and syllabuses are very closely aligned and the student performance is designed to reflect that – and NAPLAN, which is designed not to relate directly to syllabuses but overtly to estimate achievement on an underlying scale derived from the populations. For HSC scores, it makes some sense to set targets, but notice those targets come in the form of percentages of students in a given ‘Band’.

Now these bands are tidy and no doubt intended to make interpretation of results easier for parents (that’s the official rationale). However, both HSC Bands and NAPLAN bands represent ‘coarsened’ data, which means that they are calculated on the basis of some more finely measured scale (HSC raw scores, NAPLAN scale scores). There are two known problems with coarsened data: 1) in general, it increases measurement error (almost by definition), and 2) the bands are not static over time. Of these two systems, the HSC would be much more stable over time, but even there much development occurs, and the actual qualitative descriptors of the bands change as syllabuses are modified. So these band scores, and the number of students in each, really need to be understood as far less precise than counting kids in those categories implies. For more explanation, and an example of one school which decided to change its spelling programs on the basis of needing one student to get one more test item correct in order to meet its goal of having a given percentage of students in a given band, see Ladwig (2018).

There is a lot of detail behind this general description, but the point is made very clearly in the technical reports, such as when ACARA shifted how it calibrated its 2013 results relative to previous test years – where you find the technical report explaining that ACARA would need to stop assuming previous scaling samples were ‘secure’. New scaling samples have been drawn each year since 2013. When explaining why it needed to estimate sampling error in a test that was given to all students in a given year, ACARA was forthright and made it very clear:

‘However, the aim of NAPLAN is to make inference about the educational systems each year and not about the specific student cohorts in 2013’ (p24).

Here you can see overtly that the test was NOT designed for the purposes to which the NSW Minister wishes to put it.

The slippage between any credential (or measure) and what it is supposed to represent has a couple of names. When it comes to testing and achievement measurements, it’s called error. There’s a margin within which we can be confident, so analysis of any of that data requires a lot of judgement, best made by people who know what and who is being measured. But that judgement cannot be exercised well without a lot of background knowledge that is not typically in the extensive catalogue of background knowledge needed by school leaders.

At a system level, the slippage between what’s counted and what it actually means is called decoupling. And any of the new school-level targets are ripe for such slippage. The number of Aboriginal students obtaining an HSC is clear enough – but does it reflect the increasing numbers of alternative pathways used by an increasingly wide array of institutions? Counting how many kids continue to Year 12 makes sense, but it also is motivation for schools to count kids simply for that purpose.

In short, while the public critics have spotted potential perverse unintended consequences, I would hazard a prediction that they’ve only scratched the surface. Australia already has ample evidence of NAPLAN results being used as the basis of KPI development with significant problematic side effects – there is no reason to think this would be immune from misuse, and in fact it invites more (see Mockler and Stacey, 2021).

The challenge we need to take up is not how to make schools ‘perform’ better or teachers ‘teach better’ – all of those are well intended, but this is a good time to point out that common sense really isn’t sensible once you understand how the systems work. To me it is the wrong question to ask how we make this or that part of the system do something more or better.

In this case, it’s a question of how we can build systems in which schools and teachers are rightfully and fairly accountable and in which schools, educators and students are all growing. And THAT question cannot be reached until Australia opens up bigger questions about curriculum that have been locked into what has been a remarkably resilient structure ever since the early 1990s attempts to create a national curriculum.

Figure 1. Taken from the NAPLAN 2013 Technical Report, p. 19

This extract shows the path from a raw score on a NAPLAN test to what eventually becomes a ‘scale score’ – per domain. It is important to note that the scale score isn’t a count – it is based on a set of interlocking estimations that align (calibrate) the test items. That ‘logit’ score is based on the overall probability of test items being correctly answered.
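For readers wondering where the ‘logit’ comes from, scaling of this kind is typically built on a Rasch-type item response model, in which the log-odds of a correct answer is the difference between a student’s ability and an item’s difficulty, both expressed on the same logit scale. This is the generic textbook form, not ACARA’s exact specification:

```latex
% Rasch (one-parameter logistic) item response function: theta_j is student j's
% ability and b_i is item i's difficulty, both expressed on the same logit scale.
P(X_{ij} = 1 \mid \theta_j, b_i) = \frac{\exp(\theta_j - b_i)}{1 + \exp(\theta_j - b_i)},
\qquad
\log \frac{P(X_{ij} = 1)}{1 - P(X_{ij} = 1)} = \theta_j - b_i
```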

James Ladwig is Associate Professor in the School of Education at the University of Newcastle and co-editor of the American Educational Research Journal.  He is internationally recognised for his expertise in educational research and school reform.  Find James’ latest work in Limits to Evidence-Based Learning of Educational Science, in Hall, Quinn and Gollnick (Eds) The Wiley Handbook of Teaching and Learning published by Wiley-Blackwell, New York (in press). James is on Twitter @jgladwig

References

Barber, M. (2015). How to Run A Government: So that Citizens Benefit and Taxpayers Don’t Go Crazy: Penguin Books Limited.

Callahan, R. E. (1962). Education and the Cult of Efficiency: University of Chicago Press.

Ladwig, J., & Luke, A. (2013). Does improving school level attendance lead to improved school level achievement? An empirical study of indigenous educational policy in Australia. The Australian Educational Researcher, 1-24. doi:10.1007/s13384-013-0131-y

Ladwig, J. G. (2018). On the Limits to Evidence-Based Learning of Educational Science. In G. Hall, L. F. Quinn, & D. M. Gollnick (Eds.), The Wiley Handbook of Teaching and Learning (pp. 639-658). New York: Wiley and Sons.

Mockler, N., & Stacey, M. (2021). Evidence of teaching practice in an age of accountability: when what can be counted isn’t all that counts. Oxford Review of Education, 47(2), 170-188. doi:10.1080/03054985.2020.1822794

Main image: Australian politician Mark Latham at the 2018 Church and State Summit, 15 January 2018. Source: “Mark Latham – Church And State Summit 2018”, YouTube (screenshot). Author: Pellowe Talk YouTube channel (Dave Pellowe).

Q: Which major party will fully fund public schools? A: None. Here’s what’s happening

You would be forgiven for thinking that policy related to schooling is not a major issue in Australia. In the lead up to the federal election, scant attention has been paid to it during the three leaders’ debates. One of the reasons could be because the education policies of the major parties have largely converged around key issues.

Both Labor and the Coalition are promising to increase funding to schools, but neither is prepared to fully fund government schools to the Schooling Resource Standard (SRS). Under a Coalition government public schools will get up to 95 per cent of the Schooling Resource Standard by 2027; under a Labor government they will get 97 per cent by 2027. Either way, we are talking two elections away, and about the degree to which public schools will remain underfunded.

Both the Coalition and Labor plan to fully fund all private schools to the Schooling Resource Standard by 2023. Some private schools are already fully funded and many are already overfunded.

Yes, Labor is promising to put equality and redistribution back on the agenda in areas such as tax reform and childcare policy, but its Fair funding for Australian Schools policy fails to close the funding gap between what government schools get and what they need. And yes, Labor is promising to put back the $14 billion cut from public schools by the Coalition’s Gonski 2.0 plan and will inject $3.3 billion of that during its 2019-22 term, if elected.

The point I want to make is neither major party is prepared to fully fund government schools to the level that is needed according to the Schooling Resource Standard.

I find this deeply disappointing.

There are certainly differences between Coalition and Labor education policies, the main being that Labor will outspend the Coalition across each education sector from pre-schools to universities.

However, as I see it, neither major party has put forward an education policy platform. Instead, they have presented a clutch of ideas that fail to address key issues of concern in education, such as dismantling the contrived system of school comparison generated by NAPLAN and the MySchool website, and tackling Australia’s massive and growing equity issues.

Both major parties believe that the best mechanism for delivering quality and accountability is by setting and rewarding performance outcomes. This approach shifts responsibility for delivering improvements in the system down the line.

And let’s get to standardised testing. There is a place for standardised tests in education. However, when these tests are misused they have perverse negative consequences including narrowing the curriculum, intensifying residualisation, increasing the amount of time spent on test preparation, and encouraging ‘gaming’ behaviour.

Labor has promised to take a serious look at how to improve the insights from tests like NAPLAN, but this is not sufficient to redress the damage they are doing to the quality of schooling and the schooling experiences of young people.

These tests can be used to identify weaknesses in student achievement on a very narrow range of curriculum outcomes, but there are cheaper, more effective and less problematic ways of finding this out. And the tests are specifically designed to produce a range of results, so it is intended for some children to do badly – a fact missed entirely by the mainstream media coverage of NAPLAN results.

National testing, NAPLAN, is supported by both Labor and the Coalition. Both consistently tell us that inequality matters, but both know the children who underperform are more likely to come from communities experiencing hardship and social exclusion. These are the communities whose children attend those schools that neither major party is willing to fund fully to the Schooling Resource Standard.

Consequently, teachers in underfunded government schools are required to do the ‘heavy lifting’ of educating the young people who rely most on schooling to deliver the knowledge and social capital they need to succeed in life.

The performance of students on OECD PISA data, along with NAPLAN, shows the strength of the link between low achievement and socio-economic background in Australia – a stronger link than in many similar economies. This needs to be confronted with proper and fair funding, plus redistributive funding on top of that.

A misuse of standardised tests by politicians, inflamed by mainstream media, has resulted in teachers in our public schools being blamed for the persistent low achievement of some groups of children and, by extension, initial teacher education providers being blamed for producing ‘poor quality’ teachers.

There is no educational justification for introducing more tests, such as the Coalition’s proposed Year 1 phonics test. Instead, federal politicians need to give up some of the power that standardised tests have afforded them to intervene in education. They need to step away from constantly using NAPLAN results to steer education for their own political purposes. Instead they need to step up to providing fair funding for all of Australia’s schools.

I believe when the focus is placed strongly on outputs, governments are let ‘off the hook’ for poorly delivering inputs through the redistribution of resources. Improved practices at the local level can indeed help deliver system quality, but not when that system is facing chronic, eternal underfunding.

Here I must comment on Labor’s proposal to establish a  $280 million Evidence Institute for Schools.  Presumably, this is Labor’s response to the Productivity Commission’s recommendation to improve the quality of existing education data. Labor is to be commended for responding to this recommendation. The Coalition is yet to say how they will fund the initiative.

However, what Labor is proposing is not what the Productivity Commission recommended. The Commission argued that performance benchmarking and competition between schools alone are insufficient to achieve gains in education outcomes. It proposed a broad-ranging approach to improving the national education evidence base, including the evaluation of policies and building an understanding of how to turn what we know works into common practice on the ground.

Labor claims that its Evidence Institute for Schools will ensure that teachers and parents have access to ‘high quality’ ‘ground breaking’ research, and it will be ‘the right’ research to assist teachers and early educators to refine and improve their practice.

As an educational researcher, I welcome all increases in funding for research, but feel compelled to point out that, according to the report on Excellence in Research for Australia recently completed by the Australian Research Council, the vast majority of education research institutions in Australia are already producing educational research assessed to be at or above world standard.

The problem is not a lack of high quality research, or a lack of the right kind of research. Nor is it the case that teachers do not have access to research to inform their practice. Without a well-considered education platform developed in consultation with key stakeholders, this kind of policy looks like a solution in search of a problem, rather than a welcome and needed response to a genuine educational issue.

Both major parties need to do more to adequately respond to the gap in the education evidence base identified by the Productivity Commission. This includes a systematic evaluation of the effects of education policies, particularly the negative effects of standardised tests.

The people most affected by the unwillingness of the major parties to imagine a better future for Australia’s schools are our young people, the same young people who are demanding action on the climate crisis. They need an education system that will give them the best chance to fix the mess we are leaving them. Until we fully fund the schools where the majority of them are educated, we are failing them there too.

Dr Debra Hayes is Head of School and Professor, Education & Equity at the Sydney School of Education and Social Work, University of Sydney. She is also the President of the Australian Association for Research in Education. Her next book, co-authored with Craig Campbell, will be available in August – Jean Blackburn: Education Feminism and Social Justice (Monash University Press). @DrDebHayes

NAPLAN is not a system-destroying monster. Here’s why we should keep our national literacy and numeracy tests

Australia’s national literacy and numeracy testing of students in Years 3, 5, 7 and 9 is a fairly bog-standard literacy and numeracy assessment. It is also a decent, consistent, reliable and valid assessment process. I believe the National Assessment Program – Literacy and Numeracy (NAPLAN) is a solid and useful assessment.

Education experts in Australia have carefully designed the testing series. It has good internal consistency among its assessment items, has been shown to produce consistent results across different time points, and is predictive of later student achievement outcomes.
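
For readers unfamiliar with these psychometric terms, here is a minimal sketch, in Python, of the two reliability checks mentioned above: internal consistency (Cronbach’s alpha) and consistency across two sittings (a test-retest correlation). Every number, student and item response below is invented purely for illustration; nothing here reflects actual NAPLAN items or results.

import numpy as np

# Hypothetical item-level results: 8 students x 5 items (1 = correct, 0 = incorrect).
# All numbers are invented for illustration only.
scores = np.array([
    [1, 1, 1, 0, 1],
    [0, 1, 0, 0, 0],
    [1, 1, 1, 1, 1],
    [1, 0, 1, 0, 1],
    [0, 0, 0, 0, 1],
    [1, 1, 1, 1, 0],
    [1, 1, 0, 1, 1],
    [0, 1, 1, 0, 0],
])

def cronbach_alpha(items):
    # Internal consistency: how strongly the items hang together as one scale.
    k = items.shape[1]
    item_variance = items.var(axis=0, ddof=1).sum()
    total_variance = items.sum(axis=1).var(ddof=1)
    return (k / (k - 1)) * (1 - item_variance / total_variance)

print(f"Cronbach's alpha: {cronbach_alpha(scores):.2f}")

# Consistency over time: correlate total scores with totals from a second sitting.
retest_totals = np.array([4, 2, 5, 3, 1, 4, 4, 2])  # invented second-sitting totals
print(f"Test-retest correlation: {np.corrcoef(scores.sum(axis=1), retest_totals)[0, 1]:.2f}")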

However, there are special characteristics of NAPLAN that make it a target for criticism.

Special characteristics of NAPLAN

What is particularly special about NAPLAN is that most students around the country do it at the same time and the results (for schools) are published on the MySchool website. Also, unlike usual in-house Maths and English tests, it was developed largely by the Australian Government (in consultation with education experts), rather than being something that was developed and implemented by schools.

These special characteristics have meant that NAPLAN has been under constant attack since its inception about 10 years ago. The main criticisms are quite concerning.

Main criticisms of NAPLAN

  • NAPLAN causes a major distortion of the curriculum in schools in a bad way.
  • NAPLAN causes serious distress for students, and teachers.
  • NAPLAN results posted on MySchool are inappropriate and are an inaccurate way to judge schools.
  • NAPLAN results are not used to help children learn and grow.
  • NAPLAN results for individual children are associated with a degree of measurement error that makes them difficult to interpret.

The above criticisms have led to calls to scrap the testing altogether. This is a rather drastic suggestion. However, if all the criticisms above were true then it would be hard to deny that this should be a valid consideration.

Missing Evidence

A problem here is that, at present, there is no solid evidence to properly justify and back up any of these criticisms. The Centre for Independent Studies published an unashamedly pro-NAPLAN paper that does a fair job of summarising the lack of current research literature. However, as the CIS has a clear political agenda, the paper needs to be read with a big pinch of salt.

My Criticisms  

Rather than completely dismissing the criticisms for lack of evidence, as the CIS paper mentioned above does, I would, based on my own research and knowledge of the literature, revise the criticisms to:

  • In some schools (how many is at present indeterminate) some teachers get carried away with over-preparation for NAPLAN, which unnecessarily takes time away from teaching other important material.
  • NAPLAN causes serious distress for a small minority of students, and teachers.
  • Some people incorrectly interpret NAPLAN results posted on MySchool as a single number that summarises whole school performance. In fact school performance is a multi-faceted concept and NAPLAN is only a single piece of evidence.
  • It is currently unclear to what extent NAPLAN results are used to help individual children as one piece of evidence for performance within the multi-faceted approach (that is, multiple measurements of multiple things) generally taken by schools.
  • While NAPLAN results are associated with a degree of measurement error, so too is any other assessment, and it is unclear whether NAPLAN’s measurement error is any greater or smaller than that of other tests (a worked example of what measurement error means for an individual score is sketched below).
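
To make ‘measurement error’ concrete, here is a small illustrative calculation in Python. The reliability, score spread and student score used below are assumed values chosen for the example, not actual NAPLAN parameters; the point is simply how a given reliability translates into an uncertainty band around one observed score.

import math

# Assumed, illustrative values only; not actual NAPLAN parameters.
reliability = 0.90      # assumed reliability of the test
score_sd = 70.0         # assumed spread of scale scores across the cohort
student_score = 520.0   # one hypothetical student's scale score

# Standard error of measurement: SEM = SD * sqrt(1 - reliability)
sem = score_sd * math.sqrt(1 - reliability)

# Approximate 90% confidence band around the observed score (z = 1.645)
low = student_score - 1.645 * sem
high = student_score + 1.645 * sem
print(f"SEM is about {sem:.1f} scale points; 90% band roughly {low:.0f} to {high:.0f}")

Under these assumed numbers the 90% band spans roughly 70 scale points, which is why an individual student’s score is best read as a range rather than a precise point; the same caveat applies to any other test.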

I realise my views are not provocative compared with the sensationalized headlines we constantly see in the news. In my (I believe more sober) view, NAPLAN looks more like any other literacy and numeracy test than some education-system-destroying monster.

NAPLAN has been going for about 10 years now and yet there is no hard evidence in the research literature for the extreme claims we constantly hear from some academics, politicians, and journalists.

My views on why NAPLAN has been so demonised

From talking to educators about NAPLAN, reviewing the literature, and conducting some research myself, it is clear to me that many educators don’t like how NAPLAN results are reported by the media. So I keep asking myself: why do people misreport things about NAPLAN so dramatically? I have given it some thought and believe it might come down to a simple and very human reason: people like to communicate what they think other people want to hear.

But this led me to question whether people really do interpret the MySchool results in an inappropriate way. There is no solid research to answer this question. I would hypothesize, however, that when parents are deciding on a school for their beloved child, they aren’t making that extremely important decision based on a single piece of information. Nor would I expect that even your everyday Australian without kids really thinks that a school’s worth should be judged solely on some (often silly) NAPLAN league table published by a news outlet.

I also think that most people who are anti-NAPLAN wouldn’t really believe that is how people judge schools either. Rather, it is more the principle of the matter that is irksome. That the government would be so arrogant as to appear to encourage people to use the data in such a way is hugely offensive to many educators. Therefore, even if deep down educators know that people aren’t silly enough to use the data in such an all-or-none fashion, they are ready to believe in such a notion, as it helps to rationalize resentment towards NAPLAN.

Additionally, the mantra of ‘transparency and accountability’ is irksome to many educators. They do so much more than teach literacy and numeracy (and even more than what is specifically assessed by NAPLAN). The attention given to NAPLAN draws focus away from all that additional important, hard work. The media constantly draws attention to isolated instances of poor NAPLAN results while mostly ignoring all the other, positive, things teachers do.

I will also point out that schools are already accountable to parents. So, in a way, government scrutiny and control sends a message to teachers that they cannot be trusted and that the government must keep an eye on them to make sure they are doing the right thing.

I can understand why many educators might be inclined to have an anti-NAPLAN viewpoint. And why they could be very ready to believe in any major criticisms about the testing.

NAPLAN has become the assessment that people love to hate. The exaggerated negative claims about it are therefore not particularly surprising, even if, technically, things might not be so bad, or even bad at all.

My experience with the people who run the tests

In the course of carrying out my research I met face-to-face with some of the people running the tests. I wanted to get some insight into their perspective. I tried my best to go into the meeting with an open mind, but what I wasn’t anticipating was an impression of weariness. I found myself feeling sorry for them more than anything else. They did not enjoy being perceived as creepy government officials looking over the fence at naughty schools.

Rather, they communicated a lot of respect for schools and the people that work in them and had a genuine and passionate interest in the state of education in our country. They saw their work as collecting some data that would be helpful to teachers, parents and governments.

They pointed out the MySchool website does not produce league tables. A quote from the MySchool website is: “Simple ‘league tables’ that rank and compare schools with very different student populations can be misleading and are not published on the My School website”.

Personally, I think it is a shame that the NAPLAN testing series has not been able to meet its full potential as a useful tool for teachers, parents, schools, researchers and governments (for tracking students, reporting on progress, providing extra support, researching assessment, literacy and numeracy issues, and allocating resources).

Value of NAPLAN to educational researchers

Where NAPLAN has huge potential, generally not well recognized, is in its role in facilitating educational research conducted in schools. Schools are very diverse, with diverse practices, whereas NAPLAN is a common experience: a thread of commonality that can be utilized to conduct and facilitate research across different schools and across different time points. The testing can support new research and understanding into all manner of important factors surrounding assessment and literacy and numeracy issues. We have an opportunity to better map out the dispositional and situational variables that are associated with performance, with test anxiety, and with engagement with school. The number of research studies making use of NAPLAN is increasing and looks set to keep increasing in the coming years (as long as NAPLAN is still around). There is real potential for some very important work, including some genuinely impressive longitudinal research, to come out of Australian universities.
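
As a toy illustration of the kind of longitudinal analysis this common thread makes possible, the short Python sketch below computes growth in scale scores between two NAPLAN occasions for a handful of invented, hypothetical students. Real linked NAPLAN records are only available to researchers through education authorities under strict access conditions; everything here is made up for illustration.

import pandas as pd

# Invented records: the same hypothetical students observed at Year 3 and Year 5.
records = pd.DataFrame({
    "student_id": [1, 1, 2, 2, 3, 3],
    "year_level": [3, 5, 3, 5, 3, 5],
    "numeracy_score": [410, 492, 455, 510, 380, 470],
})

# One row per student, one column per testing occasion.
wide = records.pivot(index="student_id", columns="year_level", values="numeracy_score")

# Growth between occasions is the kind of quantity longitudinal studies build on.
wide["growth_3_to_5"] = wide[5] - wide[3]
print(wide)
print(f"Mean Year 3 to Year 5 growth: {wide['growth_3_to_5'].mean():.1f} scale points")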

Another positive aspect that is not widely recognized, but was mentioned by parents in research I have conducted, is that NAPLAN might help create familiarity with standardized testing, which is useful for students who go on to sit Year 12 standardized university entrance exams. Without NAPLAN, students would go into that experience cold. It makes sense that prior NAPLAN experience should make the Year 12 tests feel more familiar and so help alleviate some anxiety, although I must acknowledge that this has not yet received specific research attention.

Perhaps focusing on the importance of NAPLAN to research that will benefit schooling (teachers, parents, schools) in Australia might help change the overall narrative around NAPLAN.

However, there are definitely political agendas at work here and I would not be surprised if NAPLAN is eventually abandoned if the ‘love to hate it’ mindset continues. So I encourage educators to think for themselves around these issues rather than getting caught up in political machinations. If you find yourself accepting big claims about how terrible NAPLAN supposedly is, please ask yourself: Do those claims resonate with me? Or is NAPLAN just one small aspect of what I do? Is it just one single piece of information that I use as part of my work? Would getting rid of NAPLAN really make my job any easier? Or would I instead lose one of the pieces of the puzzle that I can use when helping to understand and teach my students?

If we lose NAPLAN I think we will, as a country, lose something special that helps us better understand our diverse schools and better educate the upcoming generations of Australian students.

 

Dr Shane Rogers is a Lecturer in the School of Arts and Humanities at Edith Cowan University. His recent publications include Parent and teacher perceptions of NAPLAN in a sample of Independent schools in Western Australia in The Australian Educational Researcher online, and he is currently involved in research on What makes a child school ready? Executive functions and self-regulation in pre-primary students.

 

How school principals respond to government policies on NAPLAN (you may be surprised how some are resisting)

School principals in Australia are increasingly required to find a balance between improving student achievement on measurable outcomes (such as NAPLAN) and focusing energies on things that can’t as easily be measured, such as how well a school teaches creative and critical thinking, how it connects with its local community or how collaboratively the teachers on its staff work together.

Governments and systems would expect a school leader to deliver across all of these policy areas, and many others.

It is a significant part of the work of school principals to continually take policies designed to apply to an often-vast number of schools, and find ways to make them work with their specific local community and context. Different policies can often have conflicting influences and pressures on different schools.

This is an issue of ‘policy enactment’. That is, how principals implement, or carry out, policy in their schools. It is of particular interest to me.

Policy Enactment Studies

My research takes up the idea of policy enactment. This approach to studying the effects of policy starts from the idea that school leaders don’t just neatly apply policy as-is to their schools.

Instead, they make a huge number of decisions. They ‘decode’ policy. This involves considering the resources, relationships and local expertise available to them. They also consider the local needs of their children, parents, teachers and school community. They consider the ‘histories, traditions, and communities’ that exist in their school.

It is a complex process that takes leadership expertise and requires wide collaboration within a school community and the principal’s network. Research in this area might seek to understand the local conditions that influence principals’ policy enactment processes.

My recent research had a particular focus on how principals enacted school improvement policies. This was a specific push by the Australian Government to improve student outcomes on measures including NAPLAN testing. I wanted to better understand how traditions, histories, and communities (and other factors) influenced the decisions principals made.

How did local contexts, and the things principals and their wider school communities valued, influence what they focused on? How did principals and schools respond to pushes for ‘urgent improvement’ on NAPLAN testing?

Context

The reforms I studied stemmed from the Rudd/Gillard/Rudd government’s ‘Education Revolution’. These reforms were referred to at the time by the government as some of the largest-scale reforms in Australia’s recent history. They involved the introduction of NAPLAN testing and of the MySchool website, which enabled publication of school data and easier comparison of schools, and they spurred on local improvement agendas such as Queensland’s United in our Pursuit of Excellence.

My Case Study

My research involved a longitudinal study that spanned three school years. I worked closely with three public school principals, interviewing them throughout this period, and analysing documents (including school strategic plans, school data, policy documents, and school improvement agenda documents). The principals were all experienced and had been leading their schools for some time. They were seen as high performing principals and were confident in their approaches towards leading their rural and regional schools. One of the principals, ‘Anne’, was particularly interesting because she was emphatic about valuing the things that could not be easily measured on NAPLAN and the other tools being used to measure improvement and achievement.

Shift away from the focus on NAPLAN and other measurement tools

While research has shown the ways testing such as NAPLAN can narrow the focus of education to that which can be measured, Anne emphasised a more holistic view of education. She was able to resist some of the potential narrowing effects of school improvement. She prioritised the arts, musicals, social and interpersonal development, and individual student wellbeing and learning journeys. She had less of a focus on the data being ‘red or green’ on MySchool and focused instead on the distance travelled for her students. She was confident that unlocking student confidence and fostering a love of schooling engaged those students who were less confident in the areas being measured on improvement data – and she articulated the ways their engagement and confidence translated into improved learning outcomes, with school data that supported her comments.

How did the principal shift the school focus away from testing?

So how did she achieve this? My study found two main ways that she managed to resist the more performative influences of school improvement policies. Firstly, the school had a collaboratively-developed school vision that focused on valuing individual students and valuing the aspects of education that can’t be easily measured. The power of the vision was that it served as a filter for all policy enactment decisions made at the school. If it didn’t align with their vision, it didn’t happen. There was also agreement in this vision from the staff, students, and community members, who kept that vision at the forefront of their work with the school.

The second key aspect was that Anne had developed a strong ‘track record’ with her supervisors, which engendered trust in her judgment as a leader. Because of this trust, she was given more autonomy to make her policy enactment decisions. It was trust developed over a long time in the same school, and in the same region before that. To develop her track record, Anne worked hard to comply with departmental requirements (deadlines, paperwork, and other basic compliance requirements). In addition, the school’s data remained steady or continued to improve. Anne was emphatic that this was due to the school’s holistic approach to education and its long-term focus on individual learning journeys rather than reacting to data with quick fixes.

Case study shows a contrast to trends – what can we learn?

This case study stands in contrast to wider trends in which “teaching to the test”, and NAPLAN in particular, is narrowing the school curriculum. This is important because research presented on this blog in the past has shown us how testing regimes can impact on students, can give less precise results than they appear to, and can further marginalise students and communities.

The school pushed for a wider picture of education to be emphasised, resisting some of the possible unintended effects of testing cultures. We can learn some lessons from this case study. It shows us that communities can collaboratively articulate what is important to them, and work together to maintain a focus on that. This shows us one way that schools can enact policy rhetoric about having autonomy to meet local needs and make local decisions.

The case study also shows us the power of a ‘track record’ for principals when they want to enact policies in unexpected or unusual ways. When they are trusted to make decisions to meet their local communities’ needs, the policy rhetoric about leadership and autonomy is further translated into practice.

These are just some of the insights these case studies were able to provide. Other findings related to how school data was guiding principals’ practices, how the work of principals had been reshaped by school improvement policies, and how principals felt an increased sense of pressure in recent years due to the urgency of these reforms.

If you’d like to read more about these issues, please see my paper The Influence of Context on School Improvement Policy Enactment: An Australian Case Study in the International Journal of Leadership in Education.

 

Dr Amanda Heffernan is a lecturer in Leadership in the Faculty of Education at Monash University. Having previously worked as a school principal and principal coach and mentor for Queensland’s Department of Education, Amanda’s key research interests include leadership, social justice, and policy enactment.

Amanda also has research interests in the lives and experiences of academics, including researching into the changing nature of academic work. She can be found on Twitter @chalkhands

 

NAPLAN testing begins for 2018 and here’s what our children think about it

Australia’s national literacy and numeracy testing program, NAPLAN, for 2018 begins today, on Tuesday 15th May. Classrooms have been stripped of all literacy and numeracy charts and posters, and chairs and tables set out for testing. Our news feeds will be full of adults talking about the program, especially what they think is going wrong with it.

I am much more interested in what children think about NAPLAN.

I know from my research that many children do not like the tests and it is not because ‘not many children like taking tests at any time’ as the Australian Curriculum, Assessment and Reporting Authority (ACARA), which oversees the program, has told us.

Sitting tests is just one form of assessment and as such is a normal part of the rhythms and patterns of everyday school life: children go to school and now and then the type of assessment they do is a test.

But to claim NAPLAN is just another test is a simplistic adult perspective. Some children see it very differently.

I asked the children about assessments at school

I asked 105 children in Years 3, 5 and 7, as well as their parents, teachers and principals, about their experiences and views of NAPLAN. While they cannot speak for every child, their accounts give us insights into how children actually experience the tests.

When I spoke to the Year 7 children about which type of assessment they prefer, some favoured assignments, while others explained that ‘I prefer tests because if I get something wrong, I can see where I’ve gone wrong easier’, and ‘if I get a lot wrong it’s easier to talk to the teacher about it’.

So, what is it about NAPLAN that makes it such a negative experience for some children, even for those who normally prefer tests as a form of assessment?

I have written previously about why some children construct NAPLAN as high-stakes, even though it has been designed to be a low-stakes test. However, there are other major differences between NAPLAN and the usual type of school-based tests. There are big differences in the test papers as well as in the testing protocols, or conditions, under which the tests are taken.

The NAPLAN test papers

NAPLAN’s distinctive format causes confusion for some children which leads to mistakes that are unrelated to the skills being tested. For example, when colouring the bubbles related to gender, one Year 3 girl in my study mistakenly coloured the bubble marked ‘boy’.

Level of difficulty

While some children described NAPLAN as ‘easy’, with some equating ‘easy’ with ‘boring’, others found it difficult; with one Year 3 child saying, ‘People should think if children can do it’. For some children, especially in Year 3, this related to unfamiliar vocabulary, which was clear in their questions, ‘What is roasted?’, ‘what is a journal?’, and ‘what does extract mean?’ during practice tests. Others, particularly in Year 7, found the test difficult because the content was unfamiliar: ‘I got annoyed with some of the questions because I hadn’t heard it before’ and ‘some parts of the maths we had not learned about’.

Feedback

Some children do prefer tests to other types of assessment, as I mentioned before, because they find it easier to talk through their answers with their teachers. However, NAPLAN results are simply indicated by a dot positioned within reported bands for their year level, with no substantive feedback. And the results arrive far too late, months after the testing, to be of use anyway.

The testing conditions

NAPLAN does not only involve the tests themselves, but the conditions under which the children take them. In addition to the change in the teachers’ role from a mentor, or helper, to a supervisor who reads scripted directions, NAPLAN’s testing protocols produce a very different classroom atmosphere to that which would be usual for a class or group test – particularly in primary school.

Isolation

During NAPLAN, the room must be stripped of all displays and stimulus, and the students must sit in isolation so that they cannot talk with other students or see their work. Only the Year 7 children had experience in taking similar extended tests, which raises the issue of NAPLAN’s suitability for younger children. For the children in my study, this isolation was not usually a part of taking school-based tests; they simply completed their tests at their desks which stayed in the usual classroom configuration.

Time

The Year 7 children were also encouraged to read a novel or continue with an assignment when they had finished school-based tests, to give all children enough time to finish. This is a sharp contrast to NAPLAN’s strict testing protocols, where such behaviour would be seen as cheating.

Other children found NAPLAN difficult because of insufficient time: ‘I hate being rushed by the clock. When I am being rushed I feel like … I will run out of time which makes it super hard to get it done’ (Year 7 child), and ‘I felt a little worried because I didn’t get a few questions and there wasn’t much time left, so I didn’t know if I was going to do them all’ (Year 3 child).

Test preparation: The spillover from the testing week to everyday school life

These differences between NAPLAN and everyday school life, including school-based tests, mean that many teachers consider test preparation necessary. While most of these teachers did not agree with test preparation, they felt they had little choice, as ‘the kids need to be drilled on how the questions are going to be presented and to fill in the bubbles and all the jargon that goes with that’, and ‘to give them the best chance, to be fair to them’. As a result, the negative effects of the testing week spilled over into everyday school life in the months leading up to the tests; albeit to varying degrees within the different classrooms.

The daily ‘classroom talk’ which helped the children to clarify and refine their understandings was conspicuously absent. The students’ learning context shifted from tasks requiring higher order thinking skills, such as measuring the lengths and angles of shadows at different times during the day; pretending to be reporters to research the history of the local community; or developing and proposing a bill for the Year 7 Parliament; to isolated test practice which included colouring bubbles, ‘because if you don’t do it properly they won’t mark it’.

Some children found this shift frustrating, which affected student-teacher relationships, with some Year 7 children reporting that ‘[she gets] more cranky’ and ‘[he is] more intense’ as NAPLAN approached. For children with psychological disabilities, this shift was particularly difficult; with outbursts and ‘meltdowns’ resulting in negative consequences that deepened their alienation from their teacher and peers.

NAPLAN goes against everything we try to do in class

The separated desks and stripped walls not only make the classroom look different, but feel alien in comparison to the children’s everyday school life. This was reflected in some students’ reports that ‘It’s scary having all our desks split up and our teacher reading from a script and giving us a strict time limit’. This was supported by one of the teachers:

NAPLAN goes against everything we try to do in class. You’re getting the kids to talk to each other and learn from each other, and learn from their peers and challenge their peers, and yet they’ve got to sit on their own, isolated for such a period of time. It’s not even a real-life scenario.

ACARA maintains that the primary purpose of NAPLAN is to ‘identify whether all students have the literacy and numeracy skills and knowledge to provide the critical foundation for other learning and for their productive and rewarding participation in the community’ (ACARA, 2013). Further, that the testing environment must be tightly controlled to ensure that the tests are fair.

However, the issues I found in my research raise critical questions regarding NAPLAN’s ability to achieve the government’s primary goals of: (1) promoting equity and excellence in Australian schools; and (2) all young Australians becoming successful learners, confident and creative individuals, and active, informed citizens, as outlined in the Melbourne Declaration on Educational Goals for Young Australians.

Many Year 7 students in my study reported that NAPLAN was a waste of time that hindered their learning; with some children reporting that as a result, they had disengaged from the test and any associated preparation. This raises significant questions about the extent to which NAPLAN can do the job it was designed to do.

As we embark on another year of NAPLAN testing, it is time to rethink the test, and this requires authentic conversations with, rather than about, students and their teachers.

 

Dr Angelique Howell is a casual academic at The University of Queensland. She is working on several research projects relating to students’ engagement in meaningful learning and exploring how young people, schools and communities can work together to enhance student engagement. An experienced primary teacher, her research interests include social justice; counting children and young people in, together with the other stakeholders in educational research; and apprenticing students as co-researchers.

 

Learning to write should not be hijacked by NAPLAN: New research shows what is really going on

You couldn’t miss the headlines and page one stories across Australia recently about the decline of Australian children’s writing skills. The release of the results of national tests in literacy and numeracy meant we were treated to a range of colour-coded tables and various infographics that highlighted ‘successes’ and ‘failures’ and that dire, downward trend. A few reports were quite positive about improved reading scores and an improvement in writing in the early years of schooling. However, most media stories delivered the same grim message: Australian students have a ‘major problem’ with writing.

Of course politicians and media commentators got on board, keen to add their comments about it all. The release of NAPLAN (National Assessment Program – Literacy and Numeracy) results every year in Australia offers a great media opportunity for many pundits. Unfortunately the solutions suggested were predictable to educators: more testing, more data-based evidence, more direct instruction, more ‘accountability’.

These solutions seem to have become part of ‘common sense’ assumptions around what to do about any perceived problem we have with literacy and numeracy. However, as a group of educators involved in literacy learning, especially writing, we know any ‘problem’ the testing uncovers will be complex. There are no simple solutions. Certainly more testing or more drilling of anything will not help.

What worries us in particular about the media driven responses to the test results is the negative way in which teachers, some school communities and even some students are portrayed. Teachers recognise it as ‘teacher bashing’, with the added ‘bashing’ of certain regions and groups of schools or school students.  This is what we call ‘deficit talk’ and it is incredibly damaging to teachers and school communities, and to the achievement of a quality education for all children and young people.

Providing strong teaching of literacy is an important component of achieving quality outcomes for all students in our schools. There’s little doubt that such outcomes are what all politicians, educators, students and their families want to achieve.

As we are in the process of conducting a large research project into learning to write in the early years of schooling in Australia, we decided to have a say. We have a deep understanding of the complexities involved in learning to write. In particular, our research is significant in that it shows teachers should be seen as partners in any solution to a writing ‘problem’, and not as the problem.

Our project is looking at how young children are learning to write as they participate in producing both print and digital texts with a range of tools and technologies. While the project is not complete, our work is already providing a fresh understanding of how the teaching of writing is enacted across schools at this time. We thought we should tell you about it.

What we did

Our research was carried out in two schools situated in low socio-economic communities across two states. The schools were purposefully selected from communities of high poverty that service children from diverse cultural and/or linguistic backgrounds in Australia.  Schools like these often achieve substantially below the national average in writing as measured by NAPLAN. These two schools are beginning to demonstrate that this does not need to be the case.

We looked at how, when, where, with what, and with whom children are learning to write in early childhood classrooms. We wanted to know what happens when writing, and other text production, is understood to be a collaborative, shared practice rather than an individual task, and when teaching and learning focuses on print and digital tools, texts, resources and devices. We worked collectively with the schools to think about the implications for teaching and learning.

Spending time in these schools has given us a deeper understanding of how poverty and access to resources impact on student outcomes. We found many positive things, for example the way the teachers, researchers, children, their families and communities work together enthusiastically to plan and implement high quality literacy curriculum and teaching for all students.

As part of our study, we audited the current practices of teaching and learning writing. We interviewed teachers and children to gather their perspectives on what learning to write involves, asking them about when they write, where they write, who they write with and the resources they use when writing. By combining teachers’ and children’s perspectives, we aim to understand how children learn to write from a variety of vantage points.

What we found (so far)

This is just the first step in sharing the results of our research (there is much more to come) but we thought this was a good time to start telling you about it. It might help with an understanding of what is happening in schools with writing and where NAPLAN results might fit in.

We identified four vital areas. Each is important. This is just an overview, but we think you’ll get the idea.

Teaching skills and time to write

Teachers are indeed teaching basic print-based skills to their students, despite what you might be told by the media. What teachers and children have told us is that they need more time to practise writing texts. Our observations and discussions with teachers and children suggest that the current crowded curriculum, and the way schools now expect to use a range of bought systems, tools, kits and programs to teach the various syllabuses, are leaving less time for children to actually write and produce texts. We believe this has significant implications for how well children write texts.

Technology and writing

We captured the close and networked relationship between texts, technologies, resources and people as young children learn to write. In summary, we believe print-based and digital resources need to come together in writing classrooms rather than be taught and used separately.

Another important point is that there is a problem with equity related to access to technology and digital texts. Children in certain communities and schools have access while those in other communities do not. This is not something teachers can solve. It is a funding issue and only our governments can address it.

Writing as a relational activity

We know that teachers and children understand that learning to write is a relational process. It needs to be a practice that people do together – including in classrooms when the learners and the teacher and other adults work on this together. When asked, children represented themselves as active participants in the writing process. This is a positive outlook to have. They talked about being able to bring their ideas, preferences, and emotions, not just their knowledge of basic skills, to the mix. They represented writing as an enjoyable activity, particularly when they were able to experience success.

Who is helping children to learn to write?

Children saw other children and family members, as well as their teachers, as key human resources they could call upon when learning to write. Children perceived these people as being knowledgeable about writing and as being able to help them. Again this is a positive finding and has many implications for the way we teach writing in our schools, and the way we engage with parents.

We know that learning to write should not be considered an individual pursuit where the goal is to learn sets of composite skills, even if these skills are easy to test. Rather, it is a process where the goal should always be to learn how to produce texts that communicate meaning.

We hope our work can help you to see that learning to write is not a simple process and that any problems encountered won’t have simple solutions.

For schools in communities of poverty, the aim to achieve improvements in how well students write will be impacted upon by a variety of complex social, economic, political and material issues. Teachers do play an important role. However, while teachers are held accountable for student outcomes, so too should systems be held accountable for balancing the policy levers to enable teachers to do their job.

If the latest NAPLAN results mean that standards in writing in Australia are declining (and we won’t go into how that could be contestable) it is unlikely that any of the simple solutions recently offered by media commentary or politicians will help. More testing leading to more box ticking means less time to learn to write and less time to write.

We will have more to tell you about our research into young children learning to write in the future. Watch out for our posts.

————————————————————————–

**The blog is drawn from the ARC funded project, Learning to write: A socio-material analysis of text production (DP150101240 Woods, Comber, & Kervin). In the context of increased calls for improved literacy outcomes, intense curriculum change and the rapidly increasing digitisation of communication, this project explores the changing practices associated with learning to write in contemporary Early Childhood classrooms. We acknowledge the support of the Australian Research Council and our research partners who are the leaders, teachers, children and their families who research with us on this project.

 

Annette Woods is a professor in the Faculty of Education at Queensland University of Technology. She researches and teaches in school reform, literacies, curriculum, pedagogy and assessment. She leads the Learning to write in the early years project (ARC DP150101240).

 

 

Aspa Baroutsis is a senior research fellow in the Faculty of Education at Queensland University of Technology. She is currently working on the Learning to write in the early years project (ARC DP150101240). Her research interests include media, policy, social justice, science education, digital technologies and literacies.

 

 

Lisa Kervin is an associate professor in language and literacy in the Faculty of Social Sciences and a researcher at the Early Start Research Institute at the University of Wollongong. Lisa’s current research interests are focused on young children and how they engage with literate practices. She is a chief investigator on the Learning to write in the early years project (ARC DP150101240).

 

 

Barbara Comber is a professor in education at the University of South Australia. Barbara researches and teaches in literacies, pedagogy and socioeconomic disadvantage. She is a chief investigator on the Learning to write in the early years project (ARC DP150101240).

 

The dark side of NAPLAN: it’s not just a benign ‘snapshot’

The release of the latest NAPLAN results this week identified a problem with student performance in writing. This prompted the federal minister for education, Simon Birmingham, to state these results “are of real concern”. And the CEO of Australian Curriculum, Assessment and Reporting Authority, Robert Randall, added that “we’ll have a conversation with states and territories” to pinpoint the exact problem.

You get the message: there is a problem. As I see it we have a much bigger problem than the one the minister and ACARA are talking about.

At the moment, we have two concurrent and competing ‘systems’ of education operating in Australia, and particularly in NSW: one is the implementation of the state-authorised curriculum and the other, the regime of mass tests which includes NAPLAN and the Higher School Certificate.

The bigger problem

NAPLAN results get everyone’s attention: not just mainstream media and parents, but also teachers and school communities. Attention is effectively diverted from curriculum implementation. That means that resources, teacher attention and class time are soaked up with attempts to improve the results of under-performing students. It means that the scope and depth of the curriculum is often ignored in favour of drills and activities aimed at improving student test performance.

In a way, this is sadly ironic for NSW, given that new syllabuses rolled out across 2014-2015 have the development of literacy and numeracy skills as two of seven general capabilities. Specific content in these syllabuses has been developed to strengthen and extend student skills in these two areas. 

Before teachers had the chance to fully implement the new syllabuses and assess student learning, the NSW government jumped in and imposed a ‘pre-qualification’ for the HSC: that students would need to achieve a Band 8 in the Year 9 NAPLAN reading, writing and numeracy test. Yet another requirement in the heavily monitored NSW education system.

And if the federal education minister has his way, we’ll see compulsory national testing of phonics for Year 1 students, in addition to the NAPLAN tests administered in Years 3, 5, 7 and 9; and then in NSW, students will have to deal with the monolithic HSC.

So the ongoing and worsening problem for schools will be finding the space for teaching and learning based on the NSW curriculum.

Similar things are happening in other states and territories.

The dark side of national testing

As we know, mass testing has a dark side. Far from being a reasonable, benign ‘snapshot’ of a child’s skills at a point in time, the publication of these test results increases their significance so that they become high-stakes tests, where parental choice of schools, the job security of principals and teachers, and school funding are all affected.

And here I will add a horror story of how this can be taken to extremes. In Florida in 2003, the Governor, Jeb Bush, called the rating of schools with a letter grade from A to F, based on test results, a “key innovation”. Using this crude indicator, schools in this US state were subsequently ‘labelled’ in a simplistic approach that glossed over numerous complex contextual features such as attendance rates, student work samples, the volume and types of courses offered and extracurricular activities.

Already in Australia NAPLAN results have a tight grip on perceptions of teacher and school effectiveness. And quite understandably, schools are concentrating their efforts in writing on the ‘text types’ prescribed in the NAPLAN tests: imaginative writing – including narrative writing, informative writing and persuasive writing.

So what might be going wrong with writing?

As I see it, the pressure of NAPLAN tests is limiting our approaches to writing by rendering types of writing as prescriptive, squeezing the spontaneity and freshness out of students’ responses. I agree it is important for students to learn about the structural and language features of texts and to understand how language works. However it appears that schools are now drilling students with exercises and activities around structural and language features of text types they’ll encounter in the test.

Has the test, in effect, replaced the curriculum?

Again taking NSW as an example, writing has always been central, dating back over a century to the reforms in both the primary and secondary curriculum in 1905 and 1911 respectively. The then Director of Education, Peter Board, ensured that literature and writing were inextricably linked so that the “moral, spiritual and intellectual value of reading literature” for the individual student was purposeful, active and meaningful. In addition to this, value and attention was assigned to the importance of personal responses to literature.

This kind of thinking was evident in the 1971 NSW junior secondary school English syllabus, led by Graham Little, which emphasised students using language in different contexts for different purposes and audiences. In the current English K-10 Syllabus, the emphasis is on students planning, composing, editing and publishing texts in print or digital forms. These syllabus documents value students engaging with and composing a wide range of texts for imaginative, interpretive and analytical purposes. And not just to pass an externally-imposed test.

In a recent research project with schools in south-west Sydney, participating teachers, like so many talented teachers around Australia, improved student writing skills and strengthened student enjoyment of writing by attending to pedagogical practices, classroom writing routines and strategies: providing students with choice in writing topics and forms of writing; implementing a measured and gradated approach to writing; using questioning techniques to engage students in higher order thinking; and portraying the teacher as co-writer.

These teachers reviewed the pressures and impact of mass testing on their teaching of writing, and like so many around Australia, looked for ways to develop the broad range of skills, knowledge and understandings necessary for all students, as well as ways to satisfy the accountability demands like NAPLAN.

Without the yoke of constant mass testing I believe teachers would be able to get on with implementing the curriculum and we’d see an improvement not only in writing, but also across the board.

Don Carter is senior lecturer in English Education at the University of Technology Sydney. He has a Bachelor of Arts, a Diploma of Education, Master of Education (Curriculum), Master of Education (Honours) and a PhD in curriculum from the University of Sydney (2013). Don is a former Inspector, English at the Board of Studies, Teaching & Educational Standards and was responsible for a range of projects including the English K-10 Syllabus. He has worked as a head teacher English in both government and non-government schools and was also an ESL consultant for the NSW Department of Education. Don is the secondary schools representative in the Romantic Studies Association of Australasia and has published extensively on a range of issues in English education, including The English Teacher’s Handbook A-Z (Manuel & Carter) and Innovation, Imagination & Creativity: Re-Visioning English in Education (Manuel, Brock, Carter & Sawyer).