The Canadian Journal of Higher Education, Vol. XI-1, 1981
La revue canadienne d'enseignement supérieur, Vol. XI-1, 1981

The Relationships Between Student Ratings and Instructor Behavior: Implications for Improving Teaching

PATRICIA A. CRANTON*, WILLIAM HILLGARTNER**

ABSTRACT

Although students' ratings of instruction have been examined in detail by educational researchers, the relationship between ratings and actual classroom behavior has not often been investigated. This study explores the relationship between student ratings and classroom observations. Twenty-eight professors from a wide range of academic disciplines participated in the study. Mean student ratings and frequencies of behavior in several categories were obtained for each professor. It was found that instructor behavior significantly predicted student questionnaire responses in three general areas. (1) When instructors spent time structuring classes and explaining relationships, students gave higher ratings on logical organization items. (2) When professors praised student behavior, asked questions and clarified or elaborated on student responses, ratings on the effectiveness of discussion leading were higher. (3) When instructor time was spent in discussions, praising student behavior, and silence (waiting for answers), students tended to rate the classroom atmosphere as being one which encourages learning.

RÉSUMÉ

Si les évaluations faites par les étudiants sur le mode d'enseignement ont été étudiées en détail par les chercheurs des sciences de l'éducation, le rapport entre ces évaluations et le comportement réel en salle de classe a rarement retenu l'attention des chercheurs. Notre étude porte sur le rapport entre les évaluations faites par les étudiants et les observations faites en classe. Vingt-huit professeurs représentant une vaste gamme de disciplines universitaires ont participé à cette étude.
Pour chaque professeur, nous avons établi la moyenne des évaluations faites par les étudiants et la fréquence des divers comportements dans plusieurs catégories. Nous avons trouvé que c'est le comportement des enseignants qui expliquait largement les réponses aux questionnaires remplis par les étudiants dans trois domaines principaux. (1) Lorsque les enseignants passaient leur temps à structurer les classes et à expliquer les rapports, les étudiants accordaient des notes élevées aux items portant sur l'organisation logique. (2) Lorsque les professeurs louangeaient les étudiants sur leur comportement, posaient des questions et élucidaient ou développaient les réponses des étudiants, les notes sur l'efficacité de la conduite des discussions étaient plus élevées. (3) Lorsque l'enseignant passait son temps à discuter avec les étudiants, à faire l'éloge du comportement de ceux-ci et à marquer des silences (pour attendre les réponses), les étudiants tendaient à juger que l'atmosphère qui régnait en classe favorisait l'acquisition des connaissances.

* Centre for Teaching and Learning Services, McGill University
** Instructional Communications Centre, McGill University

The university or college instructor who decides to improve his teaching almost inevitably uses student questionnaires to uncover the strengths and weaknesses in his classroom performance, then attempts to make changes in the weak or low-rated areas. Although extensive research has been done on the reliability and validity of student ratings (cf. Kulik & Kulik, 1974; Meredith, 1976), few attempts have been made to determine which teacher behaviors actually yield high student ratings. Consequently, the instructor who receives low ratings in an area often does not know what changes to make in order to improve those ratings.
Instructors who participate in a teaching improvement program will, at best, "feel more positive" about their teaching (Erickson & Erickson, 1979); they do not otherwise use student ratings to actually change their behaviors (Pambookian, 1974; Centra, 1973). This study explores the relationship between student ratings of instruction and observed classroom behaviors for professors who are attempting to improve their teaching by using student evaluations. Once such relationships are established, the next step would be to determine whether changes in these classroom behaviors produce corresponding changes in student ratings and other student outcomes (Cranton, Note 1).

INSTRUMENTS

The questionnaire used in this study was the Teaching Analysis by Students (TABS) questionnaire, an instrument designed to be an integral part of a teaching improvement process (cf. Anon, 1973). The basic questionnaire consists of 38 items, but additions and deletions may be made by the professor in consultation with the teaching improvement specialist. For a list of the actual items, the reader should consult Bergquist and Phillips (1975, pp. 80-82). Observation data were collected using a category observation system developed by Shulman (Note 2) and based on Flanders Interaction Analysis (Flanders, 1970). In this system, an instructor's class is video-taped, with a digital clock providing a time reference on the tapes. A trained rater, unaware of questionnaire results, enters a category number on a score sheet every five seconds, yielding a record of all classroom interactions. The fifteen categories of behavior are described in Table 1. Inter-rater reliability coefficients of .86 and .87 have been found; test-retest coefficients are reported to be .90, .88 and .94.

PROCEDURE

Twenty-eight professors were involved in the study. All were interested in evaluating their teaching for the purpose of improvement and were working with a teaching improvement specialist to this end.
Professors were from a variety of academic disciplines: law, engineering, management studies, library science, physics, biology, education, psychology, arts, anthropology and continuing education. Class sizes ranged from 20 to 100 students, with an average size of about 50 students. Courses were both graduate and undergraduate (the majority being the latter), both half-year and full-year courses, and included a wide variety of teaching styles and methods. During the fourth or fifth week of the semester, the TABS questionnaire was administered and the class video-taped.

Table 1
Descriptions of Categories

1. Data Lecturing: giving facts or opinions about content; expressing one's own ideas; asking rhetorical questions; includes problem solving.
2. Data A.V.: presenting data with the aid of audio-visual materials. Includes using the blackboard.
3. Data Illustration: illustrating data with personal anecdotes, real case presentations and role playing.
4. Data Linking: in presenting data, using the specific skills of generalizing (relating content to other academic disciplines and identifying connections between concepts), summarizing (reviewing data), or providing connections between student interest and the data.
5. Management: administrative tasks; statements or questions dealing with schedules, deadlines, reading lists, etc. Includes the act of handing out or collecting materials; giving quizzes or written exercises.
6. Structuring: contracting and organizing the class in regard to content and procedure. Includes briefly summarizing past material and activities, setting objectives, and giving commands and directions to be followed.
7. Silence: pauses, short periods of silence. Indicates confusion or laughter when scored simultaneously with another category.
8. Questions: asking a question about content with the intent that someone answer.
9. Discussion: encouraging or facilitating interaction and discussion between students. For example, asking class members to respond to a student's comment.
10. Clarifying: statements and questions by the instructor designed to encourage a student to elaborate an idea or question initiated by the student. Includes paraphrasing which attempts to clarify another point of view.
11. Crediting: praising ideas, performance or work patterns.
12. Criticizing: direct or indirect criticizing, in a destructive manner, of ideas, performance or work patterns.
13. Demand: making a demand for work. Includes constructive criticism and insisting on focus.
14. Monitoring: calling attention to process in order to identify and explore blocks or potential blocks to effective classroom work. Includes periodically checking for attention, comprehension, etc.
15. Affect: clarifying the feeling of others in the classroom. Offering one's own feelings. Feelings may be positive or negative. Includes predicting or recalling feelings.
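The five-second interval coding described under INSTRUMENTS reduces each videotape to a per-category percentage of class time. The arithmetic can be sketched as follows (a minimal illustration; the code sequence shown is hypothetical, and the category numbers follow Table 1):

```python
from collections import Counter

# Category labels from Table 1, keyed by the numbers the rater records.
CATEGORIES = {
    1: "Data Lecturing", 2: "Data A.V.", 3: "Data Illustration",
    4: "Data Linking", 5: "Management", 6: "Structuring",
    7: "Silence", 8: "Questions", 9: "Discussion", 10: "Clarifying",
    11: "Crediting", 12: "Criticizing", 13: "Demand",
    14: "Monitoring", 15: "Affect",
}

def category_percentages(codes):
    """Convert a rater's sequence of five-second interval codes into
    the percentage of total class time spent in each category."""
    counts = Counter(codes)
    total = len(codes)
    return {CATEGORIES[c]: 100.0 * counts.get(c, 0) / total
            for c in CATEGORIES}

# Hypothetical ten-interval (50-second) excerpt: mostly lecturing,
# with one question, one silence, and one instance of crediting.
codes = [1, 1, 1, 1, 1, 8, 7, 11, 1, 1]
pcts = category_percentages(codes)
# pcts["Data Lecturing"] is 70.0 (seven of ten intervals)
```

Each professor's fifteen "scores" in the study are exactly such percentages, computed over a full class period rather than a 50-second excerpt.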
One trained rater, unaware of questionnaire results, viewed all tapes. The number of occurrences of each behavior category was determined for each professor. Data were then recorded in the form of percentages of the total time spent on each category. A professor would thus have 15 category "scores", each one being the percentage of time spent showing the behavior (lecturing, structuring, etc.). Questionnaire data for each professor were recorded in terms of averaged student ratings on each of the 38 TABS items.

Table 2
Predicted and Significant* Relationships Between Behaviors and Ratings

Item 1 (Explanation of objectives): 5 (Structuring), 6 (Management); no significant predictors.
Item 4 (Explanation of work expected): 5 (Structuring*), 6 (Management*); R² = .21, .24.
Item 5 (Relationship between content and objectives): 4 (Data Linking*), 6 (Structuring); R² = .14.
Item 6 (Relationships among topics): 4 (Data Linking*), 6 (Structuring); R² = .21.
Item 7 (Distinction between major and minor topics): 4 (Data Linking*), 6 (Structuring*); R² = .23, .25.
Item 8 (Pacing): 14 (Monitoring); no significant predictors.
Item 9 (Ability to clarify material): 2 (Data A.V.), 3 (Data Illustration), 4 (Data Linking*), 10 (Clarifying*); R² = .17, .22.
Item 11 (Asking easily understood questions): 7 (Silence), 8 (Questions), 10 (Clarifying); no significant predictors.
Item 12 (Asking thought-provoking questions): 7 (Silence), 8 (Questions), 10 (Clarifying); no significant predictors.
Item 14 (Effectiveness as a discussion leader): 8 (Questions*), 9 (Discussion*), 10 (Clarifying*), 11 (Crediting*), 13 (Demand*), 14 (Monitoring), 15 (Affect); R² = .36.
Item 15 (Ability to get students to participate): 8 (Questions), 9 (Discussion*), 10 (Clarifying), 11 (Crediting*), 13 (Demand*), 14 (Monitoring*), 15 (Affect); R² = .33.
Item 16 (Facilitating discussion among students): 8 (Questions), 9 (Discussion*), 10 (Clarifying), 11 (Crediting), 13 (Demand), 14 (Monitoring), 15 (Affect*); R² = .16, .24.
Item 27 (Management of administrative details): 5 (Management); no significant predictors.
Item 28 (Flexibility in offering options): 10 (Clarifying), 15 (Affect*); R² = .16.
Item 29 (Taking action when students are bored): 3 (Data Illustration), 4 (Data Linking), 8 (Questions), 14 (Monitoring), 15 (Affect*); R² = .16.
Item 32 (Atmosphere to encourage learning): 2 (Data A.V.), 4 (Data Linking), 7 (Silence*), 9 (Discussion*), 11 (Crediting*); R² = .44.
Item 33 (Ability to inspire interest): 2 (Data A.V.*), 4 (Data Linking*), 7 (Silence*), 9 (Discussion), 11 (Crediting*); R² = .35.
Item 36 (Getting students to challenge): 9 (Discussion); no significant predictors.
Item 37 (Relationship between personal values and course content): 9 (Discussion); no significant predictors.
Item 38 (Making students aware of value issues): 9 (Discussion); no significant predictors.

* Asterisks mark behaviors accounting for a significant proportion of variance; R² values are cumulative as significant predictors entered the equation.

The hypotheses were tested by a series of stepdown multiple regression analyses, with rating items used as the dependent variables and percentages of time spent showing the behaviors as the independent variables.

RESULTS

In Table 2, the behaviors marked by an asterisk are those which accounted for a significant proportion of the variance in predicting the student questionnaire item. Categories 5 and 6 (Structuring and Management) accounted for a significant amount of variance (R² = .24) in their prediction of Questionnaire Item 4 (explanation of work expected). The professor who spent a higher proportion of time in structuring and management behaviors was rated higher by students on ability to explain work expected.
However, this relationship did not appear for Item 1 (explanation of course objectives). It is likely that behavior related to this item took place earlier in the semester and was not observed in our videotapes. Questionnaire items concerned with the clarification of relationships among topics (#5, 6 and 7) were predicted by the Data Linking category (R² = .14, .21, .23). The instructor who spends time generalizing, summarizing and providing connections will tend to be rated higher on ability to clarify relationships. No relationship was found between adjusting the rate of presentation (Item 8) and monitoring behavior (Category 14). Instructors who checked for attention, comprehension, etc. did not necessarily adjust their pacing, or students did not necessarily perceive them as adjusting their pacing. Student ratings of ability to clarify material which needed elaboration (Item 9) were predicted by the Data Linking and Clarifying categories of behavior (R² = .22). That is, when a higher percentage of time was spent generalizing, summarizing or providing connections, students perceived clarification skills more positively. Questionnaire Items 11 and 12 (asking questions) were not related to the Questioning, Silence or Clarifying behavior categories. A professor spending a higher proportion of time in these activities did not necessarily receive a higher rating on his or her questioning skill. Questionnaire items concerned with the instructor's effectiveness as a discussion leader were predicted by a number of behavior categories. Discussion, Crediting and Demand came first into the regression equations for Items 14 and 15, accounting for 36% and 33% of the variance, respectively. Other significant predictors were Categories 8 and 10 (Questions and Clarifying) for Questionnaire Item 14 and Category 14 (Monitoring) for Item 15.
Item 16, on the other hand, was concerned with facilitating discussion among students, and was best predicted by the Affect category (R² = .16) followed by the Discussion category (R² = .24). Contrary to expectations, Item 27 on management of administrative detail was not related to time spent in Management activities. It is possible that students perceive higher percentages of time used for administrative tasks as an indication that the instructor is poorly organized and is wasting class time. A professor who was perceived as being flexible in offering options for individual students (Item 28) was also one who spent a higher proportion of time exhibiting behavior in the Affect category (R² = .16). Item 29, taking action when students are bored, was also predicted by the Affect category (R² = .16) but not by the other expected categories (Data Illustration, Questions, Monitoring, etc.). The questionnaire items concerned with creating a learning atmosphere were related to several categories. For Item 32, the Discussion, Crediting and Silence categories were the first to enter the equation (R² = .44); for Item 33, the Data Linking, Crediting and Data A.V. categories were the best predictors (35% of the variance was accounted for). In other words, students saw time spent on discussion as contributing to an atmosphere which encouraged learning. However, when rating the instructor's ability to inspire interest in the content of the course, more time spent on generalizing, summarizing, providing connections, praising students, and using audiovisual aids was related to higher questionnaire ratings. Finally, items concerned with value issues and getting students to challenge points of view were not predicted by the Discussion category: professors spending more time in discussion activities did not necessarily raise these issues.
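The stepdown regression analyses reported above can be illustrated in miniature. The sketch below runs a backward-elimination ("stepdown") selection on synthetic data for 28 hypothetical professors; the predictor names, the R²-loss threshold, and the data are invented for illustration and do not reproduce the study's computations:

```python
import numpy as np

def r_squared(X, y):
    """Ordinary least-squares R^2, with an intercept column added."""
    A = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    return 1.0 - float(resid @ resid) / float(((y - y.mean()) ** 2).sum())

def stepdown(X, y, names, max_loss=0.01):
    """Backward elimination: start with every predictor, repeatedly drop
    the one whose removal costs the least R^2, and stop once any further
    removal would cost more than max_loss."""
    kept = list(range(X.shape[1]))
    current = r_squared(X[:, kept], y)
    while len(kept) > 1:
        trials = [(r_squared(X[:, [k for k in kept if k != j]], y), j)
                  for j in kept]
        best_r2, drop_j = max(trials)
        if current - best_r2 > max_loss:
            break                      # every remaining predictor matters
        kept.remove(drop_j)
        current = best_r2
    return [names[j] for j in kept], current

# Synthetic example: ratings driven by two of three behavior percentages.
# The outcome is noise-free so the elimination order is deterministic.
rng = np.random.default_rng(0)
X = rng.uniform(0, 40, size=(28, 3))        # 28 professors, 3 behaviors
y = 0.05 * X[:, 0] + 0.03 * X[:, 1]         # "silence" plays no role
kept, r2 = stepdown(X, y, ["structuring", "crediting", "silence"])
# "silence" is eliminated; "structuring" and "crediting" are retained
```

The study's analyses differ in scale (15 behavior categories predicting each of 38 rating items) and use significance tests rather than a raw R²-loss threshold, but the pruning logic is the same.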
DISCUSSION

The instructor who is embarking on a program to improve his teaching often has very little information as to what changes in classroom behavior should be made in response to poorly rated questionnaire items. This research has attempted to establish some relationships between behavior and ratings, to begin to answer the question, "What does the highly rated instructor actually do?" For those teaching skills related to "structuring" (clarifying expectations and clarifying relationships), it was found that the proportion of time spent on the activities is relevant to student ratings of ability in those areas. The relationship is straightforward: devoting time to the area is probably a good first step in improving. This also appears to be true for clarification of material: generalizing, summarizing, providing connections and paraphrasing are relevant to student ratings of the instructor's ability to clarify. Other teaching skills are not so directly related to time spent on the apparently relevant behaviors. Monitoring student behavior, for example, does not necessarily result in the ability to adjust the rate of presentation. Spending time questioning students may not lead to students' satisfaction with questioning ability. Dedicating class time to administrative details does not seem to be related to high ratings of the instructor's management skills. In each of these areas, students probably use other indications of effectiveness. For example, perception of pacing ability may be based on the student's own understanding of the material, and therefore would be influenced by student characteristics. It may be that class means are not the appropriate unit of analysis for items where individual student differences are so relevant to the rating (Cranton, Note 3).
The same issue arises for questioning ability: students judge individually whether they understand the questions, or find them thought-provoking, and the amount of time spent asking questions does not seem relevant. Further research is required before questionnaire items in these areas can become useful in improving instruction. In yet another group of teaching skills, the rating-behavior relationships provide some interesting and useful information. The effective discussion leader appears to be the professor who exhibits more crediting behavior (praising ideas, performance and work patterns), asks questions, and clarifies (in terms of encouraging a student to elaborate an idea or question). The professor who is rated highly on ability to get students to participate in discussion tends to show behavior in the Demand category (including constructive criticism and focus) and the Monitoring category (exploring blocks in classroom work, checking for attention, comprehension). Being an effective discussion leader tends to be related not only to the presence of discussion in the classroom, but also to providing feedback, learning what the students know, encouraging, focusing, and criticizing (constructively). The professor who facilitates discussion among students, as opposed to between the students and himself, spends a higher proportion of time in the Affect category: clarifying feelings, offering feelings, predicting or recalling feelings. Similarly, the professor who is perceived as flexible in offering options to individual students, and the professor who is seen to take appropriate action when students are bored, both spend a higher proportion of time expressing feelings. The ability to encourage learning and to create interest in the course content is a priority of most instructors; students must be motivated before other goals can be achieved. A wide variety of behaviors predict these ratings, as would be expected.
In creating an atmosphere that encourages learning, discussion, crediting (praising student ideas), and waiting for responses are relevant. Three of the four Data categories are also related to ratings in this area; although significant contributors, they account for only an additional 7% of the variance. In other words, when students feel that the classroom activities are conducive to learning, the emphasis is on interpersonal interactions: there is discussion, the instructor reinforces student participation, and the instructor waits for students to respond. Inspiring interest or excitement is related to somewhat different types of behavior: discussion is not as relevant, and data linking and crediting are the first contributors. In order to interest students, as most instructional design texts suggest, it is important to relate course concepts to other disciplines and to students' experiences. Crediting is also clearly relevant: students are encouraged to present their own ideas, and these ideas can then be related to the topic being discussed. Silence is again significant, adding 8% to the variance. The importance of silence probably lies in giving students time to respond.

SUMMARY

The study was able to isolate some specific classroom behaviors that are related to student ratings of instruction. These relationships can provide guidance for the instructor who is using student ratings to improve instruction. Research should continue this type of investigation: a larger sample size would permit a more thorough analysis of relationships, follow-up studies (i.e., post-improvement) are needed, and other student outcomes (e.g., student behavior, student ratings of their own learning) should be included.

REFERENCES

Anon. Teaching Analysis by Students (TABS). Amherst, Mass.: Clinic to Improve University Teaching, University of Massachusetts, 1973.
Bergquist, W., & Phillips, S.
A handbook for faculty development. Washington: Council for the Advancement of Small Colleges, 1975.
Centra, J.A. Effectiveness of student feedback in modifying college instruction. Journal of Educational Psychology, 1973, 65, 395-401.
Erickson, G.L., & Erickson, B.L. Improving college teaching: An evaluation of a teaching consultation procedure. The Journal of Higher Education, 1979, 50(5), 670-683.
Flanders, N.A. Analyzing teaching behaviour. Reading, Mass.: Addison-Wesley, 1970.
Kulik, J.A., & Kulik, C.C. Student ratings of instruction. Teaching of Psychology, 1974, 1, 51-57.
Meredith, I.M. Structure of the teaching analysis by students. The Journal of Psychology, 1976, 92, 235-242.
Pambookian, H.A. Initial level of student evaluation of instruction as a source of influence on instructor change after feedback. Journal of Educational Psychology, 1974, 66, 52-56.

Reference Notes

1. Cranton, P.A. The effect of evaluation for improvement on teaching performance and student outcomes. Montreal: McGill University, Centre for Teaching and Learning Services, 1980.
2. Shulman, L. Development of a category observation system for the analysis of video-taped class sessions. Montreal: McGill University, Centre for Learning and Development, 1975.
3. Cranton, P.A. The evaluation and interpretation of student ratings of instruction. Montreal: McGill University, Centre for Teaching and Learning Services, 1979.