March 7, 2022

What we now know about student evaluations is much more depressing than you thought

By Katharine Gelber

Many in the education sector believe Student Evaluations of Teaching (SETs) exhibit a gender bias. There have been decades of research into whether this is the case, but the results are often inconclusive. Although a recent large study at UNSW, drawing on over 500,000 survey results across seven years, found evidence of bias against teachers identifying as women and those from non-English speaking backgrounds, other studies have been inconclusive, or have found that gender bias does not exist.

This disagreement can sometimes be explained by looking more closely at the research design used in each study. For example, qualitative studies that look beyond the scores teachers achieve tend to suggest that gender may lead students to reward different kinds of behaviour in male and female identified teachers, including in my field, political science.

Given this disagreement, our team of researchers decided to undertake a new study focussed on the comments that students wrote in their evaluations. In our paper, Gendered mundanities: gender bias in student evaluations of teaching in political science, we looked at all the evaluations in the School of Political Science and International Studies at the University of Queensland from 2015 to 2018, and examined the students’ answers to the standard qualitative questions in the SETs: “What aspects of this teacher’s approach best helped your learning?” and “What would you have liked this teacher to have done differently?”

The University has an internal procedure for removing egregiously offensive comments from the surveys before they are passed on to staff. This is important: evidence from other Australian universities shows that some do allow such comments to reach staff, with a significant negative impact on their wellbeing and safety at work. In the dataset we worked with, only 0.15% of comments had been redacted, a very small proportion.

It is important to note that the actual scores in these evaluations were high. They also showed no evidence of gender bias, insofar as there was no statistically significant difference between the scores achieved by male and female identified teachers. This enabled us to focus on whether the exact same set of data – showing that both male and female identified teachers achieve similarly high teaching scores – might produce a different result under a qualitative research design. We undertook a qualitative content analysis of the students’ answers to the two open-ended questions.

Our first finding was that both male and female identified students evaluated female identified teachers in similar ways, but that male and female identified students evaluated male identified teachers in different ways. This implies that gender is doing some work, because otherwise the results would be similar for both groups of teachers. So we needed to look further to find out what kind of work that was.

We delved more closely into the comments about female identified teachers, and found that the most prominent traits associated with these teachers (who had achieved high numerical scores on the evaluations) were: approachable, questions, discussion, helpful, encouraged, input, time, friendly, ideas and feedback. Both male and female identified students evaluated female identified teachers consistently. This led to our second finding: that when students comment on what they find most helpful about the teaching they receive, the traits most rewarded in female identified teachers are those related to stereotypically gendered expectations of women. Female identified teachers were described as helping students’ learning when they were approachable, encouraged questions and discussion, allowed for student input, gave time, were friendly, and gave more feedback out of class time. These activities are time consuming and emotionally burdensome.

We also delved more closely into the comments about male identified teachers (who had also achieved high numerical scores on the evaluations). There was greater variability in how students evaluated male identified teachers. Male identified students evaluated male identified teachers with a focus on knowledge, knowledgeable, inspiring, excellent, theoretical, passionate and best. Female identified students evaluated male identified teachers with a focus on funny, knows, and fun. Both male and female identified students evaluated male identified teachers based on their enthusiasm, passion and teaching style. This led to our third finding: that the traits most commonly associated with male identified teachers are likely to be related to stereotypically gendered masculine expectations. These are traits such as being knowledgeable, theoretical, engaging, and passionate. Notably, exhibiting these traits is unlikely to require additional time beyond normal preparation for teaching, or to constitute additional, burdensome emotional labour.

Overall, our study showed that analysis of students’ comments can, and does, reveal a gender bias that may be invisible when one focusses solely on the scores achieved. We showed that the ways in which gender bias presents itself can be mundane – we termed them gendered mundanities: harmful expectations of gendered behaviour that are invisible because of their everyday nature. The patterns we identified constituted regular reminders of what behaviour is required from male identified and female identified teachers if students are to see them as good at their teaching role.

This means that SETs may be rewarding female and male staff for behaviours that conform to gender stereotypes. It may also mean that female and male staff are rewarded for behaviours that have differentiated impacts on the amount of time and energy they have available for other activities, including, of course, research.

It is clear that SETs do not only measure the quality of teaching performance. They interact in gendered ways with students’ expectations of their male and female teachers. Universities still need to evaluate teaching performance, but they need to find a range of ways to do so, and be attentive to the gendered mundanities of students’ expectations of their teachers when doing so.

Katharine Gelber is a Professor in the School of Political Science and International Studies at the University of Queensland, a Fellow of the Academy of Social Sciences Australia, and a former ARC Future Fellow (2012-2015). 


One thought on “What we now know about student evaluations is much more depressing than you thought”

  1. These results on student evaluations are only depressing for those who thought such surveys were an objective measure of teaching quality. Like any consumer survey, the results will reflect the biases of those surveyed. That the quantitative results are consistent is a positive result. That should be enough to use the surveys as an early warning indicator of a problem with a course or teacher. If the numbers for one course or teacher are very low, then there is reason to check why. But sifting through what students write on surveys to decide if one teacher is slightly better than another is nuts. Also, such a survey should have nothing to do with the amount of time a teacher spends teaching. Teachers should spend the time needed to meet learning outcomes for students, which have nothing to do with a popularity poll.
