PISA and media reporting

PISA-shock: how we are sold the idea our PISA rankings are shocking and the damage it is doing to schooling in Australia

February 19, 2018 | PISA-shock | Aspa Baroutsis, Bob Lingard, PISA and media reporting, PISA rankings, PISA-shock, PISA-shock in Australia

When the first PISA results were released in 2001, there was a reaction in Germany that is now referred to as ‘PISA-shock’. It was likened to a tsunami-like impact where the perceived poor performance of German children compared with those in other countries participating in the international rankings dominated the news in Germany for weeks. Germans had believed they had one of the best schooling systems in the world and this first round of PISA results seriously challenged their perception. The shock led to major changes in education policy that Germany is still dealing with today.

Part of Germany’s PISA-shock was also precipitated by the fact that Finland was the outstanding performer in all the PISA tests in 2000. Historically, Finland had looked to other nations, including Germany, to learn about how schooling might be improved.

The term PISA-shock is now used widely within education circles. We would define PISA-shock as the impact of PISA results when those results are disjunctive with a nation’s self-perception of the quality of the schooling system.

We believe Australia also experienced PISA-shock in 2009 and this was subsequently compounded in 2012. Education policy changed here too as a result of PISA-shock. As with Germany, Australia is still dealing with the fallout of those changes.

In this blog post we want to look at what happened with that PISA-shock. Specifically, we want to look at how it played out politically and educationally in Australia, the role the Australian media played and, most importantly, what Australia should be doing about its PISA-shock.

What is PISA?

The OECD’s PISA was first administered in 2000 and then every three years. PISA tests a sample of 15 year-olds in all participating nations on measures of reading, mathematical and scientific literacies. The number of nations participating has increased substantially since 2000 with 71 nations participating in 2015, including the 35 OECD member countries. The PISA results are reported in December of the year after the test is administered.

The test reports results on two dimensions, namely quality and equity. Quality refers to a nation’s performance on each of the tests, which usually have a mean score of 500, and documents the comparative performance against all other participating nations. Equity refers to the strength of the correlation between students’ socio-economic backgrounds and performance. Interestingly and importantly in policy terms, PISA results have shown that high performing nations tend to have more equitable schooling systems.

PISA-shock around the world

This PISA shock had real policy impact in Germany, leading to a large number of reform measures, both at national and Länder (states) levels, aimed at improving Germany’s subsequent PISA performance. We note here that Germany, like Australia, has a federal political structure and that some of the states did well on PISA 2000, but others did poorly. However, the aggregated German results demonstrated overall poor comparative performance.

We believe the German PISA shock in 2001 and its significant policy impact were important factors in ensuring the legitimacy and significance of the PISA testing regime.

From the time of the first PISA, more nations have participated giving even greater significance to PISA in national policy reforms. As more nations have participated and as PISA has continued to provoke PISA-shocks, there has been enhanced media coverage in national and metropolitan newspapers of a nation’s comparative performance.

In 2009, several cities and provinces in China participated in PISA for the first time. Yet the Chinese government intervened and only allowed the publication of Shanghai’s results. We stress here that Shanghai is not representative of China and that indeed access to the results of all participating systems suggests that at an aggregated level, China did much worse than Australia in 2009. However, it was Shanghai’s stellar performance on all the test measures that precipitated a PISA-shock in Australia.

PISA-shock in Australia

Political context

There is a specific context to Australia’s PISA-shock. Since the time of the Hawke/Keating governments, Australia has been seeking to reorient its economic policies towards Asia. There has been much talk as well of the 21st century being the Asian Century with the socio-political and economic rise of China. Australia’s response to Shanghai’s results must be seen in this context. The federal Labor government had commissioned the Henry Review on Asia and Australia’s economic future.

2009 and the beginning of our PISA-shock

There was a great deal of media coverage in 2010 in Australia of Australia’s poor and declining comparative performance on PISA 2009. We had our own ‘tsunami-like impact’ of media coverage. All major news services covered our ‘declining’ rankings and broadcasters and media commentators offered much advice as to why Australian schooling was ‘failing’.

Also contributing to this PISA shock was the fact that four of the top performing nations in PISA 2009 were located in East Asia (Shanghai, South Korea, Hong Kong, Singapore).

2012 and our PISA-shock deepens

Contributing further to Australia’s PISA shock was the extensive media coverage given in January 2012 to a report produced by an independent think tank, the Grattan Institute, Catching Up: Learning from the best school systems in East Asia.

The Prime Minister at the time, Julia Gillard, along with Australian and East Asian education system leaders, Andreas Schleicher from the OECD, and a number of academics had all attended a seminar convened by the Grattan Institute in late 2011 focusing on the nature of East Asian schooling systems that had performed so well in PISA 2009. The Grattan report and this meeting caused another spike in media coverage. This occurred in January 2012 and it could be described as a media ‘frenzy’ about Australia’s PISA performance.

We note that ‘research reports’ produced by think tanks like the Grattan Institute are written with a media audience in mind. They are purposefully produced to impact on politicians and policy makers, and the broader public through media. They utilise the genre of a high-quality media story rather than an academic research report. Think tank usage of publicly available PISA data has real media effects.

In the front-page story in The Australian on the 24th January 2012 the headline read: We risk losing education race, PM warns. In this story the then Prime Minister, Julia Gillard, was quoted as saying:

Four of the top five performing school systems in the world are in our region and they are getting better and better … On average, kids at 15 in those nations are six months ahead of Australian kids at 15 and they are a year in front of the OECD mean … If we are talking about today’s children – tomorrow’s workers – I want them to be workers in a high-skill, high-wage economy where we are still leading the world. I don’t want them to be workers in an economy where we are kind of the runt of the litter in our region.

Flawed use of mean scores

When framing counts and comparisons, the press frequently utilised mean scores to rank participating countries as a mode of evidence regarding performance. In reading, Australia went from a mean score of 528 in 2000 to 512 in 2012, a drop of sixteen points, with a drop of seven points in science literacy, from 528 in 2000 to 521 in 2012. The worst change was in mathematics literacy where the country fell 29 points from a mean score of 533 in 2000 to 504 in 2012. This enabled dramatisation-style media coverage (with visuals such as graphs) as a downward trend and provided greater opportunity for sensationalism. For example, using mean scores and country ranks, Australia’s performance in mathematics shows a downward trend, with a significant decline starting in 2003 and the country falling out of the top 10 by 2006.

We would suggest that discussions about a country’s performance, based solely on mean scores and averages, are flawed. Focusing on Australia’s mean scores hides the substantial disparities between the performance of the States and Territories. The ACT, for example, always does well, while Tasmania and the Northern Territory always do poorly. All of the subsequent league tables and visual representations that continue relentlessly from the media in Australia are therefore flawed.

There is limited coverage of the equity measure, which shows a strengthening correlation between socio-economic background and performance and a substantial Socio Economic Status (SES) impact on performance. Furthermore, the number of 15 year old Australians from the bottom quartile of socio-economic background who perform in the top categories on each of the tests has declined sharply since the first test was administered in 2000.

The education fallout from PISA-shock in Australia

An upshot of this Australian PISA shock was the Gillard government legislating, through amendments to the Education Act, the goal that Australia would be back in the top 5 in PISA by 2025.

We see this as classic ‘goal displacement’. We believe what is required is better quality and more equitable outcomes for all young Australians. That needs to be the target; it needs to be the goal of policy. Improved performance on PISA would flow from policy interventions aimed at achieving that goal. What we need is redistributive and targeted funding, along with research-informed interventions for classroom and school change.

Following Shanghai’s stellar performance on PISA 2009 and the extensive media coverage of Australia’s declining comparative performance, Australia joined the nations that have responded very seriously in political and policy terms to PISA-shocks.

Very different results if we go back to the original set of countries

However, we point out that the results would look very different if we went back to the original set of countries that participated in PISA in 2000 and compared Australia’s results with this particular set of countries.

Only 43 nations participated in the 2000 PISA. However, the number of participating countries has grown substantially since that time, with 65 nations participating in 2012 and a further increase of about 9% in 2015, to 71 participating countries. Many of the additional countries are East Asian with Confucian traditions. Four countries in the top five ranks in 2009 were Australia’s East Asian neighbours.

These increases in the number of participating countries are rarely acknowledged in the press when discussing Australia’s position in global rankings. But this is a fundamental piece of information. Simple mathematics would suggest that ranks are more likely to change and decrease when the number of participants changes, irrespective of changes in performance.
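The effect of a growing field on rank can be illustrated with a minimal sketch. The country labels and scores below are hypothetical, invented purely for illustration, and are not PISA data; `rank_of` is a helper written for this example only.

```python
def rank_of(country, scores):
    """Return the 1-based rank of `country` by descending mean score."""
    target = scores[country]
    # Rank is one more than the number of systems scoring strictly higher.
    return 1 + sum(1 for s in scores.values() if s > target)


# Hypothetical mean scores for an early test cycle (NOT real PISA data).
original_field = {"A": 540, "B": 530, "Home": 520, "C": 510, "D": 500}

# A later cycle: "Home" scores exactly the same, but three new systems
# join, two of which score higher than "Home".
later_field = dict(original_field, E=550, F=535, G=495)

print(rank_of("Home", original_field))  # 3
print(rank_of("Home", later_field))     # 5
```

In this made-up case the country's mean score does not move at all, yet its rank slips from 3rd to 5th simply because the field has grown.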

Furthermore, it is probably only statistically reliable to compare longitudinal changes in performance across the years when one of the test domains was the major focus (e.g. science in 2006 and 2015). This is neglected in media coverage.

We conducted a subsequent analysis of Australia’s PISA rank using only participating countries that were represented in all five test years (2000-2012). Only 32 countries participated in PISA each year with data being available across the three literacies. Our analysis illustrates the arbitrary nature of using mean scores to rank countries and not taking into account the increases in numbers of countries participating over the years.

In each of the literacies, Australia is ranked higher in 2009 and 2012 when analysed against the 32 countries than when compared with the participating countries of a particular year, making the changes in position less dramatic. For example, in mathematics, Australia is placed 12th rather than 19th; in reading, 9th rather than 13th; and in science, 10th rather than 16th.

Our comparisons, like those of the newspapers, were conducted longitudinally across independent data-sets (year of test). However, the difference was that the number of participating countries was consistent, thereby eliminating this variance and producing a very different result in the rankings.
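A fixed-cohort comparison of this kind can be sketched as follows. This is only an illustration under stated assumptions: the data are hypothetical (not PISA results), the helper `rank_in` is invented for this example, and the sketch is not the procedure used in the full report.

```python
def rank_in(country, scores, cohort=None):
    """1-based rank of `country`, optionally restricted to `cohort`."""
    if cohort is not None:
        scores = {c: s for c, s in scores.items() if c in cohort}
    target = scores[country]
    return 1 + sum(1 for s in scores.values() if s > target)


# Hypothetical mean scores by test year (NOT real PISA data).
results = {
    2000: {"Home": 530, "A": 540, "B": 520, "C": 510},
    2012: {"Home": 515, "A": 535, "B": 518, "C": 505, "X": 560, "Y": 545},
}

# Keep only the systems that took part in every test year.
cohort = set.intersection(*(set(r) for r in results.values()))

for year, scores in sorted(results.items()):
    # Rank against everyone that year, then against the fixed cohort.
    print(year, rank_in("Home", scores), rank_in("Home", scores, cohort))
```

With these invented numbers the 2012 rank is 5th against all participants but 3rd within the consistent cohort, so part of the apparent slide comes from new entrants rather than from the change in score.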

Importance of sociocultural and socio-political differences

Except for Finland, all countries in the top five on the 2009 and 2012 PISA are Asian. Each of these nations is significantly different from Australia in sociocultural and socio-political terms, but they are still identified as reference societies for Australian educational reforms. Subsequently, a nation’s referential position is no longer conditioned and legitimated by similarities with a society and a schooling system (for example, in the past, the UK), but by its placement in the global rankings on PISA.

Media constructions also emphasise policy, rather than structural inequality explanations of national performance. While the Australian press did not stop referencing Finland, coverage also included Asian nations, especially Shanghai in 2009. The Australian reported, ‘Shanghai, which joined the international testing movement in 2009 and ousted Finland from the top spot it had occupied for almost 10 years’ with the Sydney Morning Herald adding, ‘Australian policy makers could learn much from China’. The Grattan Institute Report (mentioned above) sought to draw on the high performing East Asian nations to make policy suggestions for Australia.

Major cultural, demographic and political differences between Finland and Australia, and Shanghai and Australia, and the erroneous treatment of Shanghai as all of China, did not prevent media constructions of Shanghai as a suitable reference system for Australian schooling.

Talking about ‘Australian’ performance hides the large disparities within Australia

The media speak of Australia’s performance more than they speak of say New South Wales’ performance or Western Australia’s performance on PISA. This approach hides quite large disparities in performance across the various state schooling systems in Australia. Yet Australia oversamples on PISA (we have more children sit the tests than required) so that the results can be disaggregated to school system levels (other countries, such as the US, do not do this). The media rarely acknowledge these disparities in their PISA reporting.

On the analyses for 2012 PISA, Western Australia and the Australian Capital Territory did very well, while the Northern Territory and Tasmania performed comparatively poorly. This went largely unreported and what we saw instead was the media’s fixation on national average scores and international comparisons within league tables.

What we should be doing with PISA results

As suggested above, PISA provides important data for policy makers on the quality and equity of schooling systems. As we have already noted, the media fail to report the increasing inequities in Australian schooling. There is a deafening media silence about this situation; indeed, almost no media coverage of equity in respect of PISA.

The PISA test is administered every three years (beginning in 2000). The results for each PISA are released in December of the year following. In the subsequent year after the publication of the results, the OECD releases very detailed secondary analyses of the PISA data, with these reports usually running to about 1200 pages.

While there is always huge media frenzy over the initial release of results of international rankings there is seldom any media coverage of the subsequent detailed reports. In our view, it is these analyses that should inform policy makers and indeed the Australian people.

The PISA-shock type media coverage has huge policy effects. Governments make decisions that have lasting fallout on our education systems as a result of this coverage. However, the deep inequities of performance based on socio-economic background that show up in detailed PISA results, and the differences between the jurisdictional schooling systems, are where the media should be shining the spotlight. This is where the real story of what is happening in school education in Australia can be uncovered. This is where policy makers should be searching for policy changing data.

There is a pressing policy need for the inequities uncovered by PISA testing to be addressed by federal and state governments, in both funding and policy ways. We think these inequities are symbiotic with broader structural inequalities and historical legacies, which also need to be addressed by a range of new social policies.

As with all tests, PISA should be used for the purposes for which it was constructed, that is, to help policy makers to make informed decisions about schooling to ensure we have high quality and equitable schooling systems.

Full report: Counting and comparing school performance: an analysis of media coverage of PISA in Australia, 2000–2014

Aspa Baroutsis is a senior research fellow in the Faculty of Education at Queensland University of Technology. She is currently working on the Learning to write in the early years project (ARC DP150101240). Her research interests include media, policy, social justice, science education, digital technologies and literacies.

Bob Lingard is a Professorial Research Fellow in the School of Education at The University of Queensland, where he researches in the sociology of education. His most recent books include: Globalizing Educational Accountabilities (Routledge, 2016), co-authored with Wayne Martino, Goli Rezai-Rashti and Sam Sellar, National Testing in Schools (Routledge, 2016) (the first book in the AARE series Local/Global Issues in Education), co-edited with Greg Thompson and Sam Sellar, and The Handbook of Global Education Policy (Wiley, 2016), co-edited with Karen Mundy, Andy Green and Antonio Verger. Bob is a Fellow of the Australian Academy of Social Sciences and Co-Editor of the journal, Discourse: Studies in the Cultural Politics of Education. You can follow him on Twitter @boblingard86.

Serious flaws in how PISA measured student behaviour and how Australian media reported the results

October 2, 2017 | PISA | Alan Reid, PISA, PISA and classroom discipline, PISA and media reporting, PISA rankings, PISA tests

International student performance test results can spark media frenzy around the world. Results and rankings published by the Organisation for Economic Co-operation and Development (OECD) are scrutinized with forensic intensity and any ranking that is not an improvement is usually labelled a ‘problem’ by the politicians and media of the country involved. Much time, energy and media space is spent trying to find solutions to such problems.

It is a circus that visits Australia regularly.

We saw it all last December when the latest Programme for International Student Assessment (PISA) results were published. We were treated to headlines such as ‘Pisa results: Australian students’ science, maths and reading in long-term decline’ from the Australian edition of the Guardian.

In March a follow-up report was published by the Australian Council for Educational Research (ACER) highlighting key aspects of the test results from an Australian perspective.

Australian mainstream media immediately zeroed in on one small part of the latter report dealing with classroom ‘disciplinary climate’. The headlines once again damned Australian schools, for example, Education: Up to half of students in Australian classrooms unable to learn because of ‘noise and disorder’ from The Daily Telegraph and Australian students among worst behaved in the developed world from The Australian.

This is pretty dramatic stuff. Not only do the test results apparently tell us the standard of Australian education is on the decline, but they also show that Australian classrooms are in chaos.

As these OECD test results inform our policy makers and contribute to the growing belief in our community that our education system is in crisis, I believe the methods used to derive the information should be scrutinised carefully. I am also very interested in how the media reports OECD findings.

Over the past few years, many researchers have raised questions about whether the PISA tests really do tell us much about education standards. In this blog I want to focus on the efficacy of some of the research connected to the PISA tests, specifically that relating to classroom discipline, and examine the way our media handled the information that was released.

To start we need to look closely at what the PISA tests measure, how the testing is done and how classroom discipline was included in the latest results.

What is PISA and how was classroom discipline included?

PISA is an OECD administered test of the performance of students aged 15 years in Mathematical Literacy, Science Literacy and Reading Literacy. It has been conducted every three years since 2000, with the most recent tests being undertaken in 2015 and the results published in December 2016. In 2015, 72 countries participated in the tests which are two hours in length. They are taken by a stratified sample of students in each country. In Australia in 2015 about 750 schools and 14,500 students were involved in the PISA tests.

How ‘classroom disciplinary climate’ was involved in PISA testing

During the PISA testing process, other data are gathered for the purpose of fleshing out a full picture of some of the contextual and resource factors influencing student learning. Thus in 2015, Principals were asked to respond to questions about school management, school climate, school resources, etc; and student perspectives were gleaned from a range of questions and responses relating to Science, which was the major domain in 2015. These questions focused on such matters as classroom environment, truancy, classroom disciplinary climate, motivation and interest in Science, and so on.

All these data are used to produce ‘key findings’ in relation to school learning environment, equity, and student attitudes to Science. Such findings emerge after multiple cross correlations are made between PISA scores, student and schools’ socio-economic status, and the data drawn from responses to questionnaires. They are written up in volumes of OECD reports, replete with charts, scatter plots and tables.

In 2015 students were asked to respond to statements related to classroom discipline. They were asked: ‘How often do these things happen in your science classes?’

Students don’t listen to what the teacher says
There is noise and disorder
The teacher has to wait a long time for the students to quieten down
Students cannot work well
Students don’t start working for a long time after the lesson begins.

Then, for each of the five statements, students had to tick one of the boxes on a four point scale from (a) never or hardly ever; (b) in some lessons; (c) in most lessons; and (d) in all lessons.

Problems with the PISA process and interpretation of data

Even before we look at what is done with the results of the questions posed in PISA about classroom discipline, alarm bells would be ringing for many educators reading this blog.

No rationale for what is a good classroom environment

For a start, the five statements listed above are based on some unexplained pedagogical assumptions. They imply that a ‘disciplined’ classroom environment is one that is quiet and teacher directed, but there is no rationale provided for why such a view has been adopted. Nor is it explained why the five features of such an environment have been selected above other possible features. They are simply named as the arbiters of ‘disciplinary climate’ in schools.

Problem of possible interpretation

However, let’s accept for the moment that the five statements represent a contemporary view of classroom disciplinary climate. The next problem is one of interpretation. Is it not possible that students from across 72 countries might understand some of these statements differently? Might it not be that the diversity of languages and cultures of so many countries produces some varying interpretations of what is meant by the statements, for example that:

for some students, ‘don’t listen to what the teacher says’, might mean ‘I don’t listen’ or for others ‘they don’t listen’; or that students have completely different interpretations of ‘not listening’;
what constitutes ‘noise and disorder’ in one context/culture might differ from another;
for different students, a teacher ‘waiting a long time’ for quiet might vary from 10 seconds to 10 minutes;
‘students cannot work well’ might be interpreted by some as ‘I cannot work well’ and by others as ‘they cannot work well’; or that some interpret ‘work well’ to refer to the quality of work rather than the capacity to undertake that work; and so on.

These possible difficulties appear not to trouble the designers. From this point on, certainty enters the equation.

Statisticians standardise the questionable data gathered

The five questionnaire items are inverted and standardised with a mean of 0 and a standard deviation of 1, to define the index of disciplinary climate in science classes. Students’ views on how conducive classrooms are to learning are then combined to develop a composite index – a measurement of the disciplinary climate in their schools. Positive values on this index indicate more positive levels of disciplinary climate in science classes.
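To make the "invert and standardise" step concrete, here is a minimal sketch. It assumes responses are coded 1 to 4 (1 = never or hardly ever, 4 = in all lessons), averages the five inverted items per student, and converts the result to z-scores. The coding, the averaging step, the function name `climate_index` and the sample responses are all assumptions made for illustration; the OECD's actual scaling procedure is more elaborate and is not reproduced here.

```python
from statistics import mean, pstdev


def climate_index(responses):
    """Return a z-scored climate index, one value per student.

    `responses` is a list of students, each a list of five item codes
    from 1 (never or hardly ever) to 4 (in all lessons).
    """
    # Invert each item so that a higher value means a better climate:
    # 1 -> 4, 2 -> 3, 3 -> 2, 4 -> 1. Then average the five items.
    raw = [mean([5 - r for r in student]) for student in responses]
    # Standardise to mean 0 and standard deviation 1 across students.
    mu, sigma = mean(raw), pstdev(raw)
    return [(x - mu) / sigma for x in raw]


# Hypothetical answers from three students (NOT real PISA data).
sample = [
    [1, 1, 2, 1, 1],  # reports few disruptions
    [3, 4, 3, 3, 4],  # reports frequent disruptions
    [2, 2, 2, 3, 2],  # somewhere in between
]
index = climate_index(sample)
print([round(v, 2) for v in index])
```

Because of the inversion, the first student (few reported disruptions) receives a positive index value and the second a negative one. Note that the standardisation only rescales whatever the students ticked; it cannot repair ambiguity in how the statements were understood.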

Once combined, the next step is to construct a table purporting to show the disciplinary climate in the science classes of 15 year olds in each country. The table comprises an alphabetical list of countries, with the mean index score listed alongside each country, so allowing for easy comparison. This is followed by a series of tables containing overall disciplinary climate scores broken down by each of the disciplinary ‘problems’, correlated with such factors as performance in the PISA Science test, schools’ and students’ socio-economic profile, type of school (e.g. public or private), location (urban or rural) and so on.

ACER reports the results ‘from an Australian perspective’

The ACER report summarises these research findings from an Australian perspective. First, it compares Australia’s ‘mean disciplinary climate index score’ to selected comparison cities/countries such as Hong Kong, Singapore, Japan, and Finland. It reports that:

Students in Japan had the highest levels of positive disciplinary climate in science classes with a mean index score of 0.83, followed by students in Hong Kong (China) (mean index score: 0.35). Students in Australia and New Zealand reported the lowest levels of positive disciplinary climate in their science classes with mean index scores of –0.19 and –0.15 respectively, which were significantly lower than the OECD average of 0.00 (Thomson, Bortoli and Underwood, 2017, p. 277).

Then the ACER report compares scores within Australia by State and Territory; by ‘disciplinary problem’; and by socio-economic background. The report concludes that:

Even in the more advantaged schools, almost one third of students reported that in most or every lesson, students don’t listen to what the teacher says. One third of students in more advantaged schools and one half of the students in lower socioeconomic schools also reported that there is noise and disorder in the classroom (Thomson et al, 2017, p. 280).

What can we make of this research?

You will note from the description above that there would need to be a number of caveats placed on the research outcomes. First, the data relate to a quite specific student cohort who are 15 years of age, and are based only on science classes. That is, the research findings cannot be used to generalise about other subjects in the same year level, let alone about primary and/or secondary schooling.

Second, there are some questions about the classroom disciplinary data that call into question the certainty with which the numbers are calculated and compared. These relate to student motivation in answering the questions, and to the differing interpretations by people from many different cultures about the meaning of the same words and phrases.

Third, there are well-documented problems related to the data with which the questionnaire responses are cross-correlated, such as the validity of the PISA test scores.

In short, it may well be that discipline is a problem in Australian schools, but this research cannot provide us with that information. Surely the most one can say is that the results might point to the need for more extended research. But far from a measured response, the media fed the findings into the continuing narrative about falling standards in Australian education.

The media plays a pivotal role

When ACER released its report, the headlines and associated commentary once again damned Australian schools. Here is the daily paper from my hometown of Adelaide.

Disorder the order of the day for Aussie schools (Advertiser, 15/3/2017)

‘Australian school students are significantly rowdier and less disciplined than those overseas, research has found. An ACER report, released today, says half the students in disadvantaged schools nationally, and a third of students in advantaged schools, reported ‘noise and disorder’ in most or all of their classes…. In December, the Advertiser reported the (PISA) test results showed the academic abilities of Australian students were in ‘absolute decline’. Now the school discipline results show Australian schools performed considerably worse than the average across OECD nations…. Federal Education Minister Simon Birmingham said the testing showed that there was ‘essentially no relationship between spending per student and outcomes. This research demonstrates that more money spent within a school doesn’t automatically buy you better discipline, engagement or ambition’, he said’ (Williams, Advertiser 15/3/17).

Mainstream newspapers all over the country repeated the same messages. Once again, media commentators and politicians had fodder for a fresh round of teacher bashing.

Let’s look at what is happening here:

The mainstream press have broadened the research findings to encompass not just 15 year old students in science classrooms, but ALL students (primary and secondary) across ALL subject areas;
The research report findings have been picked up without any mention of some of the difficulties associated with conducting such research across so many cultures and countries. The numbers are treated with reverence, and the findings as the immutable ‘truth’;
The mainstream press have cherry picked negative results to get a headline, ignoring such findings in the same ACER report that, for example, Australia is well above the OECD average in terms of the interest that students have in their learning in Science, and the level of teacher support they receive;
Key politicians begin to use the research findings as a justification for not having to spend more money on education, and to blame schools and students for the ‘classroom chaos’.

These errors and omissions reinforce the narrative being promulgated in mainstream media and by politicians and current policy makers that standards in Australian education are in serious decline. If such judgments are being made on the basis of flawed data reported in a flawed way by the media, they contribute to a misdiagnosis of the causes of identified problems, and to the wrong policy directions being set.

The information that is garnered from the PISA process every three years may have the potential to contribute to policy making. But if PISA is to be used as a key arbiter of educational quality, then we need to ensure that its methodology is subjected to critical scrutiny. And politicians and policy makers alike need to look beyond the simplistic and often downright wrong media reporting of PISA results.

Alan Reid is Professor Emeritus of Education at the University of South Australia. Professor Reid’s research interests include educational policy, curriculum change, social justice and education, citizenship education and the history and politics of public education. He has published widely in these areas and gives many talks and papers to professional groups, nationally and internationally. These include a number of named Lectures and Orations, including the Radford Lecture (AARE); the Fritz Duras Memorial Lecture (ACHPER); the Selby-Smith Oration (ACE); the Hedley Beare Oration (ACE – NT); the Phillip Hughes Oration (ACE – ACT); the Garth Boomer Memorial Lecture (ACSA); and the national conference of the AEU.