REFLECTION IN THE DEVELOPMENT OF TEACHING SKILLS: PRE-POST TESTING

Peter Galbraith
The University of Queensland

Koehler (1985) has claimed that research into the effectiveness of teacher education programs is often weak because the absence of control groups or pretests means that long term effects are not measured. In this study both a control group and pre-post testing formed part of the design, and the purpose of this paper is to report on this aspect of the project.

Instruments

1. Priorities in Teaching Questionnaire (PITQ)

Initial beliefs of prospective teachers are held to be a significant influence in their development as professionals (Wittrock, 1986; Clandinin and Connelly, 1991; Calderhead, 1989). Zeichner and his associates (e.g. Zeichner and Liston, 1987; Tabachnick and Zeichner, 1984) refer to the stability of beliefs among student teachers, and their apparent imperviousness to change.

    Students came into the program with initial perspectives and beliefs about the role of teacher and the curriculum, and left with those beliefs essentially intact.
    Tabachnick and Zeichner (1984)

The pre-post administration of the beliefs questionnaire (50 items) tested these claims for the student cohort across a set of categories selected for their relevance to the project. Of the categories previously listed for this instrument, constructivist approaches to teaching, transmissive approaches to teaching, and feedback followed directly from the theoretical basis of the project. The remaining categories are consistent with the theoretical basis, and with research articulating problem areas for beginning teachers (see, for example, the review by Veenman, 1989, of 83 empirical studies conducted in Western Europe, U.S.A., Canada, and Australia).

The nine scales of the PITQ were defined following a factor analysis of a bank of items using responses from a previous cohort of students. Some additional items were added to ensure a minimum of four for each scale.
Sample items for the PITQ are given below for each of the nine categories.

A. Constructivist Approaches to Teaching
   Gets students to generate their own individual summaries.
   Helps students to construct and work out knowledge for themselves.

B. Transmissive Approaches to Teaching
   Provides summaries for students to take down and learn.
   Selects the material and methods to be used by students.

C. Academic Organisation
   Maintains effective use of OHP/blackboard/computer.
   Allocates time to different lesson phases.

D. Management
   Enforces his/her own prescribed directives.
   Develops and maintains shared principles of classroom behaviour.

E. Teacher Self-Presentation
   Exhibits patience and understanding.
   Stresses academic values.

F. Social/Affective Learner Characteristics
   Takes account of students' attitude to learning.
   Knows students as individuals.

G. Cognitive Learner Characteristics
   Takes account of prior knowledge and skills of individual students in the subject matter area.
   Takes account of the reading and language level of individual students.

H. Feedback
   Helps students reflect on the quality of their work.
   Sets tasks to provide diagnostic feedback for learners.

I. Motivation
   Uses curiosity as a means of gaining and maintaining interest.
   Sets challenging tasks to encourage high commitment.

Likert type data were obtained from responses rating (i) the importance of the respective items, and (ii) the confidence the student teachers felt in their ability to understand and utilise them.

2. Sources of Knowledge Questionnaire (SOKQ)

This 62 item questionnaire sought information about personal motivations towards teaching, and about sources of knowledge beginning teachers use in developing expertise. Nine categories were defined on the basis of a factor analysis conducted on the responses of an earlier cohort of students (Evans, 1991). Sample items illustrating the categories are given below.

A. Motivation and Attitude
   Motivation to be a good teacher.
   Enthusiasm for teaching so far.

B. Personality and Values
   Personal value system.
   Personal ambition.

C. Prior Knowledge and Experience
   Experience as a student.
   Modelling own past teachers.

D. School Models, Advice, Feedback
   Modelling supervising teachers.
   Advice and feedback from other teachers.

E. Non-school Models, Advice, Feedback
   Modelling previous University teachers.
   Advice and feedback from DipEd lecturers.

F. Reflective and Control Processes
   Feedback from students being taught.
   Reflecting on what went right and wrong in previous lessons.

G. Theory and Practice
   Applying theories to see how they work.
   Trying continually to put own ideas of teaching into practice.

H. Library Resources
   Help from school library.
   Help from DipEd A-V resources.

I. Non-library Resources
   Help from class texts.
   Materials prepared by self.

Likert scale data were obtained from student responses estimating how important the factors were in helping them to teach well.

3. Study Process Questionnaire (SPQ)

The 42 item Study Process Questionnaire (Biggs, 1987) assesses motive and strategy along the dimensions of surface, deep, and achievement-oriented approaches to studying. This instrument was used only as a post-test measure as indicated in the earlier symposium paper. The instrument has a 2 x 3 factor structure which is illustrated by the sample items (paraphrased) provided below.

A. Motive (surface)
   Present course chosen with view to job situation rather than intrinsic interest.
   Poor marks are discouraging and a source of worry for the next test.

B. Strategy (surface)
   Browsing is a waste of time. Only material given in class or course outlines is worth studying.
   Preferred subjects contain a lot of factual content rather than theory.

C. Motive (deep)
   Studying gives a feeling of deep personal satisfaction.
   Virtually any topic can be highly interesting once involved in it.

D. Strategy (deep)
   Reading new material continually enables known material to be viewed in a new light.
   Most new topics are interesting and extra time is often spent trying to obtain more information about them.

E. Motive (achievement-oriented)
   Success in studies and career is more important than popularity with fellow students.
   Capacity to excel in a course is a most important consideration in choosing it.

F. Strategy (achievement-oriented)
   Look at most of the suggested readings that go with lectures.
   Work consistently throughout the course and review regularly when assessment is close.

The response format for the SPQ provides Likert scale data.

4. Lesson Observation Task (LOT)

This task differed from all other pre and post measures in that it obtained performance data rather than statements of opinions and perceptions. As described in the earlier symposium paper the task required subjects to record observations, relate inferences to theoretical concepts, and categorise lesson excerpts as examples of specific teaching approaches or principles. Each student teacher responded to two five minute videotaped lesson segments - one a general teaching situation, and the other within the specific subject area of the participant (Mathematics or Science or Social Education). The subjects were asked to respond as fully as possible to the following questions.

1. Describe what you observed in the lesson excerpt.
2. What inferences can you make from your observations?
3. Relate your observations and inferences to any theoretical concepts about teaching that you think are relevant.
4. What general teaching situation, approach, or principle might this excerpt illustrate?

The segments were replayed twice, and the subjects allowed as much time as they needed to respond.

Scoring performance on the task

It was necessary to develop a scoring procedure which retained the richness inherent in the responses, and which identified differences, both quantitative and qualitative, between each subject's pre-test and post-test responses.
To achieve this, the questionnaires were first read to gauge the general level and scope of the responses, and to obtain a feel for consistencies/similarities that occurred across the questionnaires. In performing this task it became evident that the classes of response types resembled closely those contained in the priorities in teaching (beliefs) questionnaire (PITQ). Consequently the corresponding categories were used as a basis for the first stage of the analysis. That is, statements made by subjects in responding to the lesson excerpts were labelled in terms of the category of the beliefs questionnaire to which they corresponded. A summary sheet was created for each subject, on which was recorded the category (A to I) of each statement, and whether it was made pre or post. Further analysis was then conducted using two different procedures.

(1) On each summary sheet note was made of any general pre-post changes that seemed interesting, relevant, or important. Such changes were identifiable from a direct comparison of pre-post responses (e.g. more observations, more evaluative comment), or from the categorisation present in the summary sheet (e.g. more emphasis on social context, more emphasis on student learning processes). A numbered list of pre-post changes was then developed. This iterative process involved adding new entries to the progressive list, and then reviewing previously analysed scripts for evidence of the newly identified change. The outcome was an empirically based set of categories that could be used to identify theoretically coherent similarities and differences between the experimental and control groups. In all, eleven categories of pre-post change were identified as listed below.

1. More inferences made about the existence of pre-established rules and routines.
2. Excerpt identified as an example of a specific teaching principle, situation, or approach.
3. Inferences integrated from the evidence of several observations (rather than separate inferences for each observation).
4. More inferences made about the participants in the teaching segment e.g. about the teachers' beliefs, students' behaviour.
5. More evaluative/normative comment provided.
6. More emphasis on the social context e.g. classroom climate, individualisation, teacher-student relationships.
7. More observations in total.
8. A higher theoretical content evident in the responses.
9. More emphasis on student learning processes e.g. involvement, participation, use of prior knowledge.
10. Observations grouped according to theoretical concepts rather than made in isolation.
11. Generation of alternative perspectives or interpretations (going beyond what-is to what-could-be).

On the basis of this classification, two types of data were recorded for each subject:
(a) the number of pre-post changes exhibited.
(b) the type of pre-post changes exhibited.

For (a) the counts were coded as GENVID and CURVID. They refer respectively to the general and specific curriculum lesson videos. For (b) the relative frequencies (f) of each of the eleven pre-post changes were calculated for the experimental and control groups.

f = (number of subjects in group who exhibited pre-post change) / (total number of students in the group)

This analysis therefore produced two numerical outcome measures for group comparisons on the basis of
(a) the most frequent types of pre-post changes for each group.
(b) the differences between the groups occurring in respect of the types of change identified.

(2) A second analysis procedure was generated from the data entered on the summary sheets. Two sets of entries were recorded for each subject:
(a) the number of pre and post statements made in each of the priorities in teaching categories (A - I).
(b) the total number of pre and post statements made for the general lesson video (PREGEN/POSTGEN), and for the specific curriculum videos (PRECUR/POSTCUR).
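As an illustration only, the tallies described so far can be sketched in Python. The function names (relative_frequency, tally_statements) and the sample summary sheet are hypothetical and are not part of the original analysis. The two worked values, 4/16 and 10/26, are taken from Table 4, where each group's denominator is the number of subjects multiplied by the two videotapes.

```python
from collections import Counter
from fractions import Fraction


def relative_frequency(exhibited: int, group_total: int) -> Fraction:
    """Return f = (subjects exhibiting the change) / (subjects in the group)."""
    if group_total <= 0:
        raise ValueError("group_total must be positive")
    if not 0 <= exhibited <= group_total:
        raise ValueError("exhibited must lie between 0 and group_total")
    return Fraction(exhibited, group_total)


def tally_statements(labels):
    """Count one subject's statements per PITQ category (A to I)."""
    return Counter(labels)


# Hypothetical summary sheet for one subject: the category label assigned
# to each statement, recorded separately for pre-test and post-test.
pre_labels = ["C", "C", "E"]
post_labels = ["A", "C", "F", "G", "H"]

pre_counts = tally_statements(pre_labels)
post_counts = tally_statements(post_labels)
print(len(pre_labels), len(post_labels))   # total statements: 3 5
print(post_counts["C"] - pre_counts["C"])  # change in category C: -1

# Values reported in Table 4 (both videotapes combined).
print(f"{float(relative_frequency(4, 16)):.2f}")   # Group D, change 1: 0.25
print(f"{float(relative_frequency(10, 26)):.2f}")  # Group E, change 8: 0.38
```

Keeping f as an exact fraction until it is printed avoids rounding differences when group frequencies are compared.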
A further set of composite variables was defined to capture the total number of statements that were respectively student focussed (PREVIDSF/POSVIDSF) and teacher focussed (PREVIDTF/POSVIDTF). For this purpose the counts on the general and specific curriculum videos were combined. Student focussed statements were defined to be those corresponding to the priorities in teaching categories A, D, F, G, H, I. Teacher focussed statements were defined as those corresponding to the categories B, C, E.

Instrument Reliabilities

Three of the instruments are based on interval scales. Prior testing with an earlier cohort of students took place during the construction of the Priorities in Teaching Questionnaire (PITQ), and resulted in scale reliabilities between α = 0.61 and α = 0.76. The Sources of Knowledge Questionnaire (SOKQ) as initially derived had a comparable range of scale reliabilities, and the Study Process Questionnaire (SPQ) is a well researched publicly available instrument with verified reliability. Reliability measures were also calculated using the data obtained within the current project. In general reliabilities of the above order were obtained, although each instrument had an occasional scale with an inexplicably low reliability. In view of the stability achieved during prior testing no action was taken as a result of these isolated phenomena.

Statistical Analysis

Data from the pre-post testing were used to address a set of specific questions derived from the general research aims.

(1) To identify changes in priorities and confidence that occurred with respect to the categories in the priorities in teaching questionnaire (PITQ).
(2) To identify changes that occurred in the emphasis assigned to sources of knowledge used by student teachers to inform their practice (SOKQ).
(3) To investigate whether surface, deep, and achievement oriented approaches to learning have counterparts in learning to teach (SPQ).
(4) To identify changes in the ability of student teachers to recognise and apply skills of analysis to a lesson observation task (LOT).

As indicated in the earlier symposium paper, three groups of students provided data for analysis.

A - interview group (N=6)
D - reflective intervention group (N=8)
E - control group (N=13)

For some analyses A, D, E were compared with one another. In other analyses the experimental groups were combined so that A+D was compared with E. In a few analyses the experimental groups A, D were compared with each other. As described previously the groups contained a representative selection from three curriculum areas, and were approximately balanced with respect to gender. The small total numbers meant that interaction studies (by gender and curriculum area) were not a meaningful proposition.

All experimental intervention took place within the practice school setting. However subjects in the experimental groups shared with the control group participation in all other aspects of the Diploma in Education program. Consequently they received tuition, and engaged in a variety of other activities, that had purposes related to the project aims. For example, the importance of individual differences, motivation, diagnostic feedback, and constructivist approaches to learning are addressed in all curriculum areas, so that both experimental and control group subjects would receive exposure to these issues outside the research project. Furthermore all students completed an 11 week period of practice teaching. Hence the experimental interventions may best be viewed as seeking to provide a 'value added' component rather than adding a dimension unknown to the control group.

Results

1. Priorities in teaching questionnaire (PITQ)

Two questions were addressed:
(1) What changes in priorities and confidence occur within the nine categories of teaching emphasis?
(2) What differences between the experimental and control groups can be attributed to the intervention programs?

Subjects in the experimental groups might be expected to show greater sensitivity to the importance of constructivist approaches to teaching, and the importance of feedback, than those in the control group, and to be less enamoured with the transmissive approach. Other categories in the PITQ might be expected to rate well for all groups in terms of the general values conveyed in a teacher training program. In relation to confidence it was conjectured that assistance provided through the school based interventions could make students in the experimental groups more confident in using feedback and promoting constructivist teaching methods. Any such effect was expected to be stronger in the reflective intervention group (D) than in the interview group (A).

For (1) t-test results using pre-post data are first reported for all students in the program. Differential effects between groups are considered in (2) below. On the importance criterion significant differences (p<.01) were obtained for the scales of (D) Management, (F) Social/affective learner characteristics, and (G) Cognitive learner characteristics. A significant difference (p<.05) was obtained for (I) Motivation. On the confidence criterion significant differences (p<.01) were obtained for all nine scales. These results indicate that the importance attached to some aspects of teaching had been substantially affected by the teacher training program. It was comforting to find that the subjects demonstrated increased confidence on every dimension.

For (2) ANCOVA analyses were undertaken using pre-test data as covariates and post-test data as dependent variables. The results indicated no significant differences between the experimental and control groups, as far as self-reported priorities and confidence are concerned.
Such significant differences as emerged in the analyses were always attributable to the pre-test data entered as covariates.

In summary, these results indicate that the teacher education program was effective in promoting confidence in all the defined categories, and some value shifts were achieved. The experimental research program, however, cannot claim special credit for these changes as expressed through the perceptions and beliefs of participants.

2. Sources of knowledge questionnaire (SOKQ)

Subjects in the experimental groups might be expected to re-evaluate the worth of the sources of knowledge available as resources for lesson preparation and execution. Reflection after one lesson and before the next, such as promoted by both Special Reflection Sessions and Reflective Interventions, should result in more conscious and deliberate uses of such sources. Of the nine categories in the SOKQ, scales D, E, F, G were felt to be relevant to the type of interventions introduced by the experimental program (refer to previous listing). Analysis of variance by group was conducted using the post-test data as dependent variables, with pre-test data entered as covariates. The results of interest are shown in Table 3.

Scale                                    Group                       Mean   Significance level
E. Non-school models, advice, feedback   Experimental (A+D) (N=14)   3.13   0.07
                                         Control (N=13)              2.76
F. Reflective and control processes      Experimental (A+D) (N=14)   3.68   0.08
                                         Control (N=13)              3.32
G. Theory and practice                   Experimental (A+D) (N=14)   3.45   0.03
                                         Control (N=13)              3.15

Table 3. ANOVA by group (SOKQ)

These results suggest that the interventions had a substantial effect on the way student teachers sourced their knowledge. The pattern matches the design of the experimental program: it was conducted by non-school personnel, and the interventions focussed on reflective and control processes, a major aspect being the consideration of theory based practice.
No significant differences were identified between the two experimental groups, suggesting that, in this respect, the Special Reflection Session interviews were as effective as the more intensive Reflective Interventions.

3. Study Process Questionnaire (SPQ)

This instrument was given only as a post-test. There is interest in exploring the extent to which surface, deep, and achievement oriented approaches to learning (Biggs, 1987) may be mirrored in the approach of beginning teachers to the task of teaching. To this end responses to the SPQ were analysed in the following ways.

(1) Correlations were obtained between the six dimensions of the Study Process Questionnaire (SPQ) and
    (a) the scales of the Sources of Knowledge Questionnaire (SOKQ) that were concerned with theory, reflection, and feedback
    (b) the importance scales of the Priorities in Teaching Questionnaire (PITQ).
(2) ANOVAs by group were conducted using the SPQ scale scores as dependent variables.

For (1) significant correlations were obtained between the SOKQ category (use of school models, advice, and feedback), and the SPQ scales of motive-deep (r=.489, p=.015), strategy-deep (r=.440, p=.032), and motive-achievement (r=.400, p=.053). These results suggest that the student teachers saw the feedback during practice teaching as both substantial and important for achievement. There were no significant correlations with either of the surface scales of the SPQ, suggesting that the subjects do not see any particular sources of knowledge as quick recipes for success.

For (2) no significant results were obtained - there is no evidence to suggest that the experimental program produced effects that could be reliably associated with the SPQ scales.

4. Lesson Observation Task (LOT)

Data collected for this task measured the capacity of subjects to observe and label significant classroom events, and to make inferences based on theoretical concepts and teaching/learning principles.
Both the quantity and quality of observations are important measures of enhanced capacity at this task. Since the experimental program sought specifically to enhance reflective abilities and theoretical orientations, it was predicted that the relative amounts of student focussed comment would follow the same pattern.

Analyses of variance by group were conducted using the variables previously defined as measures on this task. The following results pertain to the predicted outcomes.

GENVID by group                   D>A>E   p = .014
CURVID by group                   D>A>E   p = .075
POSGEN with PREGEN by group       D>A>E   p = .019
POSCUR with PRECUR by group       D>A,E   not sig
POSVIDSF with PREVIDSF by group   D>A>E   p = .082

The direction of the effects is as predicted. Where the differences do not reach conventional levels of significance, the results remain consistent with the predictions. Thus with respect to both the quantity of observations recorded and the nature of the pre-post changes identified, the experimental groups have demonstrated analytical and theoretical abilities not present to the same extent in the control group. Both the Reflective Interventions and the Special Reflection Session interviews have achieved a degree of success in enhancing this aspect of student teacher performance.

Table 4 contains descriptive data illustrating the nature and extent of pre-post changes across the eleven categories previously defined from responses to the LOT.

Categories of pre-post change (relative frequencies)

Group              n      1      2      3      4      5      6      7      8      9      10     11
D (Intervention)   8x2    4/16   8/16   4/16   8/16   4/16   3/16   4/16   5/16   3/16   0/16   2/16
                          0.25   0.50   0.25   0.50   0.25   0.19   0.25   0.31   0.19   -      0.12
A (Interview)      6x2    0/12   6/12   2/12   2/12   3/12   2/12   3/12   3/12   4/12   2/12   0/12
                          -      0.50   0.17   0.17   0.25   0.17   0.25   0.25   0.33   0.17   -
E (Control)        13x2   3/26   2/26   3/26   3/26   4/26   2/26   3/26   10/26  5/26   0/26   0/26
                          0.12   0.08   0.12   0.12   0.15   0.08   0.12   0.38   0.19   -      -

Note: Frequencies are calculated as (No. of subjects in the Group (A, D or E) who exhibit the pre-post change) / (Total No. of subjects in the Group (A, D or E)).

Table 4: Combined Pre-Post Changes - Both Videotapes (LOT)

In summary the pre-post test results suggest the following implications.

Priorities in teaching have shown some movement across all groups, although changes cannot be attributed to the experimental interventions. In general the items in the PITQ involve propositions that most sensitised beginning teachers might be expected to support, so that the effect of the Diploma in Education program as a whole could be expected to mask influences of the specific interventions. Similarly all groups indicated substantial increases in confidence across the attributes of the nine scales of the instrument.

By contrast, however, by their performance on the lesson observation task, the students in the experimental groups demonstrated an increased capacity to analyse a teaching segment, to adopt theoretical perspectives, and to focus on student learning processes. Furthermore the reflective intervention group was more effective in this respect than the interview group. Given that this analysis used the categories in the PITQ as performance criteria, the results suggest that while, at the level of perceptions, no significant differences occur in the values and confidences expressed across groups, the capacity to actually perform on these criteria has been enhanced by the interventions.

Consistent with this finding the subjects in the experimental groups also demonstrated an increased ability to re-evaluate sources of knowledge used in preparing to teach, giving greater subsequent value to reflective processes and to links between theory and practice. The two experimental groups did not differ from each other on these dimensions.
Hence the emphasis on reflection and feedback processes that characterised the interventions seems to have enhanced the capacity of these subjects with respect to intentions, goals, and plans and with respect to knowledge of methods and models of teaching.

Note: There are three papers in this symposium. References for this paper are to be found at the end of the final paper.