Sally Larsen

The good, the bad and the pretty good actually

Every year headlines proclaim the imminent demise of the nation due to terrible, horrible, very bad NAPLAN results. But if we look at variability and results over time, it’s a bit of a different story.

I must admit, I’m thoroughly sick of NAPLAN reports. What I am most tired of, however, are moral panics about the disastrous state of Australian students’ school achievement that are often unsupported by the data.

A cursory glance at the headlines since the NAPLAN 2022 results were released on Monday shows several classics in the genre of “picking out something slightly negative to focus on so that the bigger picture is obscured”.

A few examples (just for fun) include:

Reading standards for year 9 boys at record low, NAPLAN results show 

Written off: NAPLAN results expose where Queensland students are behind 

NAPLAN results show no overall decline in learning, but 2 per cent drop in participation levels an ‘issue of concern’ 

And my favourite (and a classic of the “yes, but” genre of tabloid reporting):

‘Mixed bag’ as Victorian students slip in numeracy, grammar and spelling in NAPLAN 

The latter contains the alarming news that “In Victoria, year 9 spelling slipped compared with last year from an average NAPLAN score of 579.7 to 576.7, but showed little change compared with 2008 (576.9). Year 5 grammar had a “substantial decrease” from average scores of 502.6 to 498.8.”

If you’re paying attention to the numbers, not just the hyperbole, you’ll notice that these ‘slips’ are in the order of 3 scale scores (Year 9 spelling) and 3.8 scale scores (Year 5 grammar). Perhaps the journalists are unaware that the NAPLAN scale ranges from 1 to 1000? It might be argued that a change in the mean of 3 scale scores is essentially what you get with normal fluctuations due to sampling variation – not, interestingly, a “substantial decrease”.

The same might be said of the ‘record low’ reading scores for Year 9 boys. The alarm is caused by a 0.2 score difference between 2021 and 2022. When compared with the 2008 average for Year 9 boys the difference is 6 scale score points, but this difference is not noted in the 2022 NAPLAN Report as being ‘statistically significant’ – nor are many of the changes up or down in means or in percentages of students at or above the national minimum standard.

Even if differences are reported as statistically significant, it is important to note two things: 

1. Because we are ostensibly collecting data on the entire population, it’s arguable whether we should be using statistical significance at all.

2. As sample sizes increase, even very small differences can be “statistically significant” even if they are not practically meaningful.
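The second point can be made concrete with a quick back-of-the-envelope calculation. The numbers below are assumed for illustration (a 3-point mean difference, a within-cohort standard deviation of 70 scale points, cohorts of 60,000 students) – they are not actual NAPLAN figures:

```python
import math

# Illustrative, assumed values – not actual NAPLAN statistics.
mean_diff = 3.0   # difference between two cohort means, in scale points
sd = 70.0         # assumed within-cohort standard deviation
n = 60_000        # assumed cohort size per year

# Effect size (Cohen's d): the difference expressed in SD units
d = mean_diff / sd

# z statistic for a two-sample comparison with equal n and equal SD
se = sd * math.sqrt(2 / n)
z = mean_diff / se

print(f"Cohen's d = {d:.3f}")   # ~0.04: negligible by any common benchmark
print(f"z = {z:.1f}")           # ~7.4: far past conventional significance cutoffs
```

A standardised mean difference of about 0.04 is trivially small, yet with cohorts this large the test statistic sails past any conventional significance threshold – which is exactly why statistical significance alone is a poor guide to whether a change matters.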

Figure 1. NAPLAN Numeracy test mean scale scores for nine cohorts of students at Year 3, 5, 7 and 9.

The practical implications of reported differences in NAPLAN results from year to year (essentially the effect sizes) are not often canvassed in media reporting. This is an unfortunate omission and tends to enable narratives of large-scale decline, particularly because the downward changes are trumpeted loudly while the positives are roundly ignored.

The NAPLAN reports themselves do identify differences in terms of effect sizes – although the reasoning behind what magnitude delineates a ‘substantial difference’ in NAPLAN scale scores is not clearly explained. Nonetheless, moving the focus to a consideration of practical significance helps us ask: If an average score changes from year to year, or between groups, are the sizes of the differences something we should collectively be worried about? 

Interestingly, Australian students’ literacy and numeracy results have remained remarkably stable over the last 14 years. Figures 1 and 2 show the national mean scores for numeracy and reading for the nine cohorts of students who have completed the four NAPLAN years, starting in 2008 (notwithstanding the gap in 2020). There have been no precipitous declines, no stunning advances. Average scores tend to move around a little bit from year to year, but again, this may be due to sampling variability – we are, after all, comparing different groups of students. 

This is an important point for school leaders to remember too: even if schools track and interpret mean NAPLAN results each year, we would expect those mean scores to go up and down a little bit over each test occasion. The trick is to identify when an increase or decrease is more than what should be expected, given that we’re almost always comparing different groups of students (relatedly, see Kraft, 2019, for an excellent discussion of interpreting effect sizes in education).
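A small simulation shows how much a school’s mean can wobble for purely statistical reasons. The numbers here are invented for illustration (a cohort of about 50 students drawn each year from an unchanging ability distribution), not real school data:

```python
import math
import random
import statistics

random.seed(1)  # reproducible illustration

# Hypothetical school (assumed numbers, not real data): about 50 students
# sit the test each year, all drawn from the SAME underlying ability
# distribution (mean 500, SD 70 scale points) – i.e. nothing about the
# school's teaching changes from year to year.
true_mean, sd, n_students = 500, 70, 50

yearly_means = [
    statistics.mean(random.gauss(true_mean, sd) for _ in range(n_students))
    for _ in range(10)
]

# Expected standard error of a yearly mean: sd / sqrt(n) ≈ 9.9 points,
# so swings of plus or minus 10-20 points are unremarkable.
print("standard error:", round(sd / math.sqrt(n_students), 1))
print("yearly means:  ", [round(m, 1) for m in yearly_means])
```

Even with nothing changing under the hood, the simulated school’s mean drifts by around ten scale points between test occasions – roughly the size of the year-to-year movements that generate headlines.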

Figure 2. NAPLAN Reading test mean scale scores for nine cohorts of students at Year 3, 5, 7 and 9.

Plotting the data in this way, it seems evident to me that, since 2008, teachers have been doing their work of teaching, and students, by and large, have been progressing in their skills as they grow up, go to school and sit their tests in years 3, 5, 7 and 9. It’s actually a pretty good news story – notably not an ongoing and major disaster.

Another way of looking at the data, and one that I think is much more interesting – and instructive – is to consider the variability in achievement between observed groups. This can help us see that just because one group has a lower average score than another group, this does not mean that all the students in the lower average group are doomed to failure.

Figure 3 shows just one example: the NAPLAN reading test scores of a random sample of 5000 Year 9 students who sat the test in NSW in 2018 (this subsample was randomly selected from data for the full cohort of students in that year, N=88,958). The red dots represent the mean score for boys (left) and girls (right). You can see that girls did better than boys on average. However, the distribution of scores is wide and almost completely overlaps (the grey dots for boys and the blue dots for girls). There are more boys at the very bottom of the distribution and a few more girls right at the top of the distribution, but these data don’t suggest to me that we should go into full panic mode that there’s a ‘huge literacy gap’ for Year 9 boys. We don’t currently have access to the raw data for 2022, but it’s unlikely that the distributions would look much different for the 2022 results.  
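For readers who like to quantify this kind of overlap: assuming two normal distributions with equal spread, the overlapping coefficient can be computed directly from the standardised mean difference. The figures below (a 20-point gap between group means, common SD of 70) are hypothetical stand-ins, not the actual NSW 2018 values:

```python
import math

def phi(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1 + math.erf(x / math.sqrt(2)))

# Hypothetical, assumed numbers – not the actual NSW 2018 statistics.
mean_boys, mean_girls, sd = 570.0, 590.0, 70.0

d = (mean_girls - mean_boys) / sd   # standardised mean difference
overlap = 2 * phi(-abs(d) / 2)      # overlapping coefficient for two
                                    # normal distributions with equal SD

print(f"d = {d:.2f}, distribution overlap ≈ {overlap:.0%}")
```

On these assumed numbers, a gap that sounds meaningful in a headline still leaves the two distributions sharing the vast majority of their area – consistent with what the dot plot in Figure 3 shows.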

Figure 3. Individual scale scores and means for Reading for Year 9 boys and girls (NSW, 2018 data).

So what’s my point? Well, since NAPLAN testing is here to stay, I think we can do a lot better on at least two things: 1) reporting the data honestly (even when it’s not bad news), and 2) critiquing misleading or inaccurate reporting by pointing out errors of interpretation or overreach. These two aims require a level of analysis that goes beyond mean score comparisons to look more carefully at longitudinal trends (a key strength of the national assessment program) and variability across the distributions of achievement.

If you look at the data over time, NAPLAN isn’t a story of a long, slow decline. In fact, it’s a story of stability and improvement. For example, I’m not sure that anyone has reported that the percentage of Indigenous students at or above the minimum standard for reading in Year 3 has stayed pretty stable since 2019 – at around 83%, up from 68% in 2008. In Year 5 it’s the highest it’s ever been, at 78.5% of Indigenous students at or above the minimum standard – up from 63% in 2008.

Overall the 2022 NAPLAN report shows some slight declines, but also some improvements, and a lot that has remained pretty stable. 

As any teacher or school leader will tell you, improving students’ basic skills achievement is difficult, intensive and long-term work. Like any task worth undertaking, there will be victories and setbacks along the way. Any successes should not be overshadowed by the disaster narratives continually fostered by the 24/7 news cycle. At the same time, overinterpreting small average fluctuations doesn’t help either. Fostering a more nuanced and longer-term view when interpreting NAPLAN data, and recalling that it gives us a fairly one-dimensional view of student achievement and academic development, would be a good place to start.

Sally Larsen is a Lecturer in Learning, Teaching and Inclusive Education at the University of New England. Her research is in the area of reading and maths development across the primary and early secondary school years in Australia, including investigating patterns of growth in NAPLAN assessment data. She is interested in educational measurement and quantitative methods in social and educational research. You can find her on Twitter @SallyLars_27

Everything you never knew you wanted to know about school funding

Book review: Waiting For Gonski: How Australia Failed its Schools, by Tom Greenwell and Chris Bonnor

With the 2022 federal election now in the rear-view mirror and a new Labor government taking office, discussions about the Education portfolio have already begun. As journalists and media commentators noted, education did not feature prominently in the election campaign, notwithstanding the understandable public interest in this area. One of the enduring topics of education debates – and the key theme of Waiting For Gonski: How Australia Failed its Schools, by Tom Greenwell and Chris Bonnor – is school funding.

It is easy, and common, to view the school funding debate as a partisan issue. Inequities in school funding are often presumed to be an extension of conservative government policies going back to the Howard government. Waiting for Gonski shows how inaccurate this perception is, and how far governments of any political persuasion have to go before true reform is achieved. 

The first part of the book is an analysis of the context that gave rise to the Review of Funding for Schooling in 2011, commonly known as the Gonski Report. Greenwell and Bonnor devote their first chapter to an overview of the policy arguments and reforms that consumed much of the 20th century, leading to the Gillard government establishing the review. This history is written in a compelling, detailed and interesting way, and contains many eye-opening revelations. First, the parallels between the 1973 Karmel report and the 2011 Gonski version are somewhat demoralising for those who feel that school funding reform should be attainable in our lifetimes. Second, the integral role that Catholic church authorities have played in the structure of funding distributions that continue to the present day is, I think, a piece of 20th century history that is very little known. Julia Gillard’s establishment of the first Gonski review is thus situated as part of a longer narrative that is as much a part of Australia’s cultural legacy as are questions around national holidays, or whether or not Australia should become a republic.

Several subsequent chapters detail the findings of the 2011 Gonski review, its reception by governments, lobby groups, and the public, and the immediate rush to build in exceptions when interest groups (particularly independent and Catholic school bodies) saw they would “lose money”. The extent to which federal Labor governments are equally responsible for the inequitable state of school funding is made more and more apparent in the first half of the book. Greenwell and Bonnor sought far and wide for comments and recollections from many of the major players in this process, including politicians of both colours, commentators, lobbyists, and members of the review panel itself. This certainly shows in the rich detail and description of this section.

Rather than representing a true champion of equity and fairness, the Gonski report is painted as one built on flawed assumptions, burdened with legacies that were not properly unpacked, and marred by a multitude of compromises designed to appease the loudest proponents of public funding for private and Catholic schools. The second Gonski review, officially titled Through Growth to Achievement: Report of The Review to Achieve Educational Excellence in Australian Schools, is given less emphasis, perhaps because this second review was less about equity and funding and more about teacher quality and instructional reform – a book-length subject in itself.

Waiting for Gonski is most certainly an intriguing and entertaining read (a considerable achievement, given its fairly dry subject matter), and is highly relevant for those of us working towards educational improvements of any description in Australia. My main criticism of the book is that it tends to drag a little in the middle third. While the details of machinations between political leaders and Catholic and independent school lobbyists are certainly interesting, the arguments in these middle chapters are generally repetitions from earlier chapters, with reiterated examples of specific funding inequities between schools.

A second concern I have is the uncritical focus on Programme for International Student Assessment (PISA) data to support claims of widespread student academic failure. While it’s true that PISA shows long-term average declines in achievement amongst Australian school students, these assessments are not the only standardised tests of student achievement in this country. The National Assessment Program: Literacy and Numeracy (NAPLAN) is briefly touched upon in Chapter 8, but not emphasised. The reality is that while average student achievement on NAPLAN literacy and numeracy tests has not increased – after an initial boost between 2008 and 2009 – nor have students’ results suffered large-scale declines. Figure 1 demonstrates this graphically, showing the mean scores for all cohorts who have completed four NAPLAN assessments (up until 2019).

Figure 1. Mean NAPLAN reading achievement for six cohorts in all Australian states and territories. Calendar years indicate Year 3. (Data sourced from the National Assessment Program: Results website) 

It seems somewhat disingenuous to focus so wholeheartedly on one standardised assessment regime at the expense of another to support claims that schools and students are ‘failing’. For example, in Chapter 3 the authors argue that,

 “…the second unlevel playing field [i.e. the uneven power of Australian schools to attract high performing students] is a major cause of negative peer effects and, therefore, the decline in the educational outcomes of young Australians witnessed over the course of the 21st century” (p.93) 

In my view, claims such as these are overreach, not least because arguments of a decline in educational outcomes rely solely on PISA results. Furthermore, the notion that the scale and influence of peer effects are established facts is also not necessarily supported by the research literature. Other claims made about student achievement growth are similarly unsupported by longitudinal research. In this latter case, not because the claims overinterpret existing research, but because there is very little truly longitudinal research in Australia on patterns of basic skills development – despite the fact that NAPLAN is a tool capable of tracking achievement over time.

Using hyperbole to reinforce a point is not a crime, of course. However, the endless repetition of similar claims in the public sphere in Australia tends to reify ideas that are not always supported by empirical evidence. While these may simply be stylistic criticisms, they also throw into sharp relief the research gaps in the Australian context that could do with addressing from several angles (not just reports produced by the Australian Curriculum, Assessment and Reporting Authority [ACARA], which are liberally cited throughout).

I hope that the overabundance of detail, and the somewhat repetitive nature of the examples in the middle section of the book, don’t deter readers from the final chapter: Leveling the playing field. To the credit of Greenwell and Bonnor, rather than outlining all the problems and leaving readers with a sense of despair, the final chapter spells out several compelling policy options for future reform. While the structures of education funding in Australia may seem intractable, the suggestions give concrete and seemingly achievable options which would work, presuming all players are equally interested in educational equity. The authors also tackle the issue of religious schools with sensitivity and candour. It is true that some parents want their children to attend religious schools. How policy can ensure that these schools don’t move further and further along the path of excluding the poorest and most disadvantaged – arguably those whom churches have the greatest mission to help – should be fully considered, without commentators tying themselves in knots over the fact that a proportion of Australia’s citizens have religious convictions.

Questions around school funding, school choice and educational outcomes are perennial topics in public debate in Australia. However, claims about funding reform should be underpinned by a good understanding of how the system actually works, and why it is like this in the first place. This is the great achievement of Greenwell and Bonnor in Waiting for Gonski. The ways schools obtain government funding are obscure, to say the least, and there is a perception that private schools are not funded to the same extent as public schools. Waiting for Gonski clearly shows how wrong this idea is. As the book so powerfully argues, what Australia’s school funding system essentially does is allow children from already economically advantaged families to have access to additional educational resources via the school fee contributions these families are able to make. The book is a call to action to all of us to advocate for a rethink of the system.

Education is at the heart of public policy in many nations, not least in Australia. Waiting for Gonski is as much a cautionary tale for other nations as it is a comprehensive and insightful evaluation of what’s gone wrong in Australia, and how we might go about fixing it. 

Waiting for Gonski: How Australia Failed its Schools by Tom Greenwell & Chris Bonnor. 367pp. UNSW Press. RRP $39.99
