The Canadian Journal of Higher Education
La revue canadienne d'enseignement supérieur
Volume XXX, No. 2, 2000, pages 143-164

Student Evaluations of College Professors: Identifying Sources of Bias*

KENNETH M. CRAMER & LOUISE R. ALEXITCH
University of Saskatchewan

ABSTRACT

Previous studies have found that students' evaluations of their professors' teaching ability may be affected by such factors as students' expectations and gender stereotypes. The present study examined how students' evaluations of faculty may be affected by students' gender, professors' gender, discipline, and a variety of demographic and social variables. Undergraduate students (N = 910) evaluated one of their professors on sensitivity to students' needs, quality of teaching, course structure, and treatment of designated group members (e.g., visible minorities). Results showed that female students rated their professor higher on sensitivity to students' needs and treatment of designated groups than male students. Science students rated their professor lower on teaching quality and treatment of designated groups than either Social Science or Fine Arts/Humanities students. In addition, students' ratings correlated with how often a professor met with students outside of class, when the class was scheduled, and class size. The implications for using student evaluations to accurately assess professors' teaching ability are discussed.

* This study was funded by a President's Social Science and Humanities Research Council Grant (#7-70405) awarded to both authors. The sequence of authorship was randomly determined and thus both authors deserve equal merit for this work. Special thanks are given to both R. Paola Lake and Steven Lake for data collection.

RÉSUMÉ

Previous studies have shown that students' evaluations of their professors' teaching ability can be affected by factors such as students' expectations and gender stereotypes. This study examines how students' evaluation of their faculty can be affected by their gender, that of the professor, the discipline, and a variety of other demographic and social variables. Undergraduate students (N = 910) evaluated one of their professors on his or her sensitivity to their needs, on the quality of teaching, on the organization of the course, and on the treatment of members of certain designated groups (e.g., visible minorities). Results showed that female students, compared with male students, rated their professors higher on sensitivity to their needs and on treatment of designated group members. Science students, compared with those in Fine Arts and Humanities, rated the quality of teaching and the treatment of designated group members lower. In addition, students' evaluations were correlated with the number of times a professor met with students outside of class, the place of the course in the timetable, and the size of the group. The consequences of using student evaluations to accurately measure professors' teaching ability are discussed.

Some review articles (e.g., Marsh & Roche, 1997; McKeachie, 1997) examining the validity of student evaluations of their professors have determined that students' ratings reflect a multidimensional construct, that variables such as expected/actual grade, class size, and instructor's rank have mixed effects on students' ratings, and that (for the most part) students' ratings are valid indicators of professors' teaching.
While the present authors do not dispute these conclusions, it is conceivable that many of these extraneous variables may have interactional effects on students' evaluations (e.g., professor gender and student gender), and that students' perceptions of the overall institutional climate may affect their ratings of professors. According to Baird (1990) and Pascarella (1984), the interplay among people, processes, and institutions creates a climate that guides the behaviour of its constituents. In either college or university settings, the climate includes the perceptions, expectations, satisfactions, and dissatisfactions of the people who make up the campus community, such that an individual may behave uniquely or feel treated distinctly in such a climate.

Early investigations into campus climate assessed the nature of student gender bias (Harvey & Hergert, 1986; Williams, 1990), and revealed a learning environment less favorable to women (Carelli, 1988; Heller, Puff, & Mills, 1985). For instance, Constantinople, Cornelius, and Gray (1988) assessed teacher and student gender, type of curriculum, and time of semester in first and second year courses at Vassar College, and found that male students spoke up more frequently in female-led classes, initiated more classroom discussion, and participated more in class overall. Studies of students at larger universities, or of large samples of college students, show similar sex differences (Brady & Eisler, 1995; Crawford & MacLeod, 1990).

Researchers now acknowledge that classroom climate both affects and is affected by more than student gender. In fact, the reported student gender differences may be artifacts of variables that have largely gone unstudied, such as teacher gender and the student-teacher interaction. Brady and Eisler (1995, p.
14) write that "the mere presence of a male instructor, not his behavior, may have contributed to an intimidating environment for females." Given that female instructors remain under-represented in college faculty, and given that the ratio of male to female instructors is not consistent across academic disciplines, students' classroom experience may be dictated by the instructor's gender, the course material taught, and the students' gender-related expectations (Deaux & Major, 1987). "Because students expect female professors to be more personal, supportive and motherly than male professors, it is possible that the students' gender-role expectations influenced their relative amount of participation" (Brady & Eisler, 1995, p. 14).

In short, if the presence of female instructors activates more gender stereotypes (and unique evaluations) among students in disciplines occupied less by female faculty (e.g., sciences) compared to those occupied more (e.g., social sciences), then evidence of a chilly college climate would be revealed through differential instructor evaluations, a critical tool utilized by institutions to help in decisions concerning personnel retention, promotion, tenure, and salary increases (Cashin & Downey, 1992; Divoky & Rothermel, 1988). As a result, how favorably professors are rated could depend on the interaction of many variables, such as student sex, professor sex, and discipline. For example, Divoky and Rothermel (1988) assessed students' instructor ratings based on the importance students held for the dimensions of teaching effectiveness (e.g., delivery, depth of knowledge, interpersonal skills).
Results showed that: (a) instructor delivery was more important to students in non-major courses than to students in major courses; (b) instructor depth of knowledge was more important for students in major elective courses than for students in either non-major elective or required courses; and (c) instructor interpersonal skills were more important for students in major required courses than for students in major elective courses. The authors concluded (p. 45) that "instructors should not be compared by their mean ratings on each item on an evaluation form but rather a weighted average rating could be calculated for an instructor in each type of course he or she teaches."

Overall, one may ask whether student bias has created unequal evaluation criteria for male and female faculty. Whereas most research indicates that female professors receive higher evaluations than male professors (Feldman, 1993; Freeman, 1994; Horn, DeNisi, Kinicki, & Bannister, 1982; Marsh, 1980), these studies often fail to account for many important intervening variables such as student sex, professor rank and discipline, and student personality traits. When researchers account for the influence of these variables, results show female faculty receiving lower ratings than male faculty, especially from male students (Basow & Silberg, 1987; Murray, Rushton, & Paunonen, 1990).

Recently, Basow (1995) examined gender bias in faculty evaluations based on students' assessments in three disciplines (humanities, natural sciences, and social sciences) for each of four years at Lafayette College. In addition to professor sex, student sex, and discipline, Basow assessed the hour the class met, professor rank and teaching experience, and student year, grade point average, and expected final grade.
Across the four years, results consistently showed that male faculty were rated similarly by their female and male students, regardless of discipline affiliation, whereas female faculty were rated highest by female humanities students but lowest by male social science students.

Whereas Basow delineated many of the influences on instructor evaluation, there are several concerns with the research that merit discussion. First, by using over 2000 subjects in a multivariate analysis, Basow achieved an astoundingly high level of statistical power, which yielded many significant but largely trivial effects. Second, students were sampled from a small liberal arts college in the Northeastern United States, which had begun to admit female students only a few years earlier. This bias does not invalidate the findings, but does invite a similar investigation at a larger, more typical North American institution. As Brady and Eisler (1995) recommend, a large sample of students and teachers taken from multiple departments would help to increase the representativeness and generalizability of the sample drawn. Finally, because Basow conducted the research in the United States, it warrants a similar (although modified) investigation in Canada for the purposes of comparison, standardization, and validation. Given the typical multicultural student profile at Canadian universities, faculty evaluations may also be influenced by social characteristics such as ethnicity, disability, and Aboriginal status (see Chambers, Lewis, & Kerezsi, 1995; D'Augelli & Hershberger, 1993) that have remained largely unstudied.

Present Study

To address the issues outlined above, the present study examines several basic questions concerning students' differential evaluation of faculty instruction as affected by student gender, instructor gender, discipline, and a variety of demographic and social variables.
Is faculty evaluation significantly and noticeably influenced by variables such as faculty gender, student gender, and/or discipline? Is such an influence affected by student and faculty social variables such as minority status? As a modified replication, the present study evaluated Canadian university professors from the College of Arts and Science using the same measurement instrument as Basow. Additionally, other sources of bias, such as Aboriginal, disability, and minority status, were assessed to draw out potential covariates. Are the interaction effects reported by Basow (1995) replicable in a Canadian sample, and if so, do the effects remain small enough to be practically unimportant?

Past research has been mixed in outlining the relationship between faculty and student variables in the evaluation of course instruction (Basow, 1995; Feldman, 1993; Murray et al., 1990). Whereas Basow (1995) found significant effects of faculty, student, and discipline variables on evaluations, the findings are qualified by the fact that the significant results were practically unimportant and were obtained at an atypical institution. By modifying Basow's original work at a Canadian institution, results from the present study should: (1) help researchers understand the extent of student bias in faculty evaluations; (2) help administrators better evaluate their teaching faculty; (3) better guide teaching faculty to improve their quality of instruction to students; and (4) evaluate the degree of student and faculty sensitivity to critical social issues.

METHOD

Participants

Professor respondents. Overall, the profile of the 293 tenured and tenure-track professors in the College of Arts and Science at the University of Saskatchewan appears as follows: 244 males (83%) and 49 females (17%); 164 full professors (56%), 95 associate professors (32%), and 34 assistant professors (12%).
By discipline, there are 78 male and 28 female professors in Fine Arts/Humanities, 64 male and 13 female professors in Social Sciences, and 102 male and 8 female professors in Sciences. Using these characteristics, classes were stratified according to discipline and year to maximize both sample-to-population similarity and generalization of results across the College. Furthermore, in order to sample across different class sizes and still recruit between 800 and 1200 student respondents, 32 professors (approximately 11%) were randomly selected and both they and their students were invited to participate in the investigation. Of those 32 professors approached, 28 (8 females, 20 males) agreed to participate (1 female and 3 males refused), resulting in 910 student respondents. Overall, the characteristics of this sample demonstrated adequate similarity to the overall profile in the College of Arts and Science.

Student respondents. There were 910 undergraduate students (399 males, 509 females, and 2 unspecified) enrolled in the College of Arts and Science at the University of Saskatchewan who participated in this study. Half of the sample (n = 455) was 19 to 21 years old; 151 students (17%) were younger than 19 years, 177 students (20%) were 22 to 24 years old, and the remainder (n = 125, 14%) was older than 24 years. Ninety-six students (11%) were members of a visible minority, 25 students (3%) were Aboriginal, and 43 students (5%) had a disability. There were 302 (33%) first-year students, 223 (25%) second-year students, 166 (18%) third-year students, and 135 (15%) fourth-year students. The remaining 84 students (9%) either did not indicate their year or were beyond their fourth year in university.
Students were enrolled in Science (n = 335, 37%), Fine Arts and Humanities (n = 205, 23%), or Social Science (n = 192, 21%) programs; 178 students (20%) were enrolled in "Other" programs (e.g., combined majors).

Students reported their current and expected final marks in the course they were evaluating for this study. For their current marks, 240 students (26%) reported an average between 80% and 89%, 315 students (35%) reported an average between 70% and 79%, and 205 students (23%) reported an average between 60% and 69%. The remaining 141 students (16%) reported averages that were either above 89% or below 60%. Nine students did not report their current average mark. Students' expected final course marks were distributed similarly: 303 students (33%) expected a final mark between 80% and 89%, 354 students (39%) expected a final mark between 70% and 79%, and 143 students (16%) expected a final mark between 60% and 69%. Seventy-five students (8%) expected a final mark that was either above 89% or below 60%, and 35 students (4%) did not answer this question.

Measures

Students' Teaching Needs Questionnaire. This three-part scale, based on 19 items developed by Basow (1995), asks students to evaluate their professor's teaching style in a given course. Specifically, students indicate the extent to which they believe their professor is enthusiastic, fair, helpful, knowledgeable, organized, and sensitive to their needs. Furthermore, students rate to what extent the professor treats them with respect, stimulates thinking, expresses ideas well, avoids unnecessary repetition, speaks in an appropriate manner, provides clear explanations and objectives, gives good feedback and grades fairly, and assigns appropriate readings and tests/papers. Finally, students provide an overall evaluation of both course and professor.
The present scale includes three additional items on professors' sensitivity to targeted student groups such as visible minorities or disabled students (e.g., "Your instructor is sensitive to the needs of students with disabilities"). In the first section (16 items), students rate their professor on helpfulness, sensitivity to students' learning needs, teaching style, and teaching effectiveness using a 5-point Likert scale format (1 = Strongly Disagree, 5 = Strongly Agree). In the second section (6 items), students rate the overall course structure, requirements, and objectives using a 5-point Likert scale format (1 = Poor, 5 = Excellent). In the third and final section (9 items), students indicate their gender, age, designated group status (e.g., visible minority, Aboriginal, disabled), academic major, years in university, and current and expected final marks in the course.

Professor's Questionnaire. Professors in each of these courses provided information about their course and about their teaching experience on a six-item questionnaire. The questions asked for the professor's gender and years of teaching, as well as the number of hours the professor met with students outside of class, the number of students in the class, when the class was held (Morning, Afternoon, or Evening), and how often the class met per week.

For the 28 courses used in this study, the 8 female and 20 male professors had an average of 15.3 years (SD = 11.5; range from 3 to 40 years) of teaching experience. The average class size was 59 students (SD = 76), ranging from 8 to 330 students. Twenty courses were held in the morning, seven in the afternoon, and one in the evening. Just over half of the courses (n = 16, 57.1%) met three times a week, four courses met twice a week, and eight courses met once a week.
Professors reported meeting with students an average of 4.2 hours per week (SD = 4.0) outside of class time.

Procedure

Courses were randomly selected from three discipline areas (Fine Arts and Humanities, Social Sciences, and Sciences) in the College of Arts and Science. The researchers contacted professors of the selected courses to ask for their participation. For those professors who agreed to participate in the study, a research assistant administered a consent form and the Students' Teaching Needs Questionnaire to each student during class time. While students were completing their questionnaire package, their professor completed a consent form and filled out the Professor's Questionnaire. Both the student and professor consent forms emphasized the confidentiality of responses and the voluntary nature of participation. Professors and students were debriefed and thanked for their participation.

RESULTS

Evaluation Scale Structure

Using principal components analysis with oblique rotation, the 910 student responses on the 22-item questionnaire were reduced to four factors whose eigenvalues before rotation exceeded unity. The solution accounted for approximately 64% of the available factor space. The first factor, labelled Course Structure (accounting for 42% of the factor space; eigenvalue = 9.22), consisted of 6 items of students' impressions of the course tests, readings, objectives, etc. The second factor, labelled Student Needs (10%, 2.18), consisted of 5 items of students' impressions of how sensitive the professor was to their own academic experience (e.g., respect, sensitivity, helpfulness). The third factor, labelled Teaching Quality (7%, 1.56), consisted of 6 items of students' impressions of the professor's ability to stimulate thinking, present clear and organized ideas, etc.
The fourth factor, labelled Treatment of Target Groups (5%, 1.19), consisted of 3 items tapping students' impressions of their professors' sensitivity to the issues of minority groups, women's issues, and the disabled. Based on these results, four subscales were constructed as the sum of their component items (e.g., the Target Group Treatment subscale equalled the sum of the three component items). Two items were excluded because they did not load significantly on any of the four factors. The four subscales correlated significantly with one another, ranging from .32 for Course Structure and Target Group Treatment, to .67 for Student Needs and Teaching Quality. In general, the overall evaluation scale and component subscales indicated that students gave moderately positive ratings to their courses and professors (see Table 1 for means, standard deviations, and internal consistency reliability estimates).

Because of the sizeable number of participants (N = 910) used in this correlational analysis, even modest correlations were statistically significant. Therefore, only correlations that accounted for greater than 10% of the variance (r > .32) were considered important and interpretable. None of the correlations met this criterion. In fact, the relation between the subscales and each of Aboriginal, minority, and disabled status was not significant.

Tests for Gender and Discipline Differences

The evaluation scale and subscales (Overall Evaluation, Course Structure, Student Needs, Teaching Quality, Treatment of Target Groups) served as dependent variables in a series of 2 × 2 × 3 analyses of variance (ANOVAs), with Student Gender (male vs. female), Professor Gender (male vs. female), and Discipline (Science, Social Science, Fine Arts/Humanities) as the independent variables (see Table 2 for means, standard deviations, and group sizes; see Table 3 for an ANOVA summary).
Because group sizes were not equal (i.e., a nonorthogonal design), Type III unique sums of squares were utilized to assess all main effects and interactions independently of one another. In addition, given the large sample size and comparably high statistical power, the ω² statistic was used to estimate the proportion of variance explained by the factors (Howell, 1996). This statistic is useful in identifying significant but trivial effects, where values below .03 are deemed unimportant.

[Table 1: means, standard deviations, and internal consistency reliability estimates for the evaluation scale and subscales. Table contents not recoverable.]

[Table 2: means, standard deviations, and group sizes by Student Gender, Professor Gender, and Discipline. Table contents not recoverable.]

Table 3
Analysis of Variance Summary Table by Subscale and Effect

Subscale           Main Effect/Interaction             F      dfs      p      MSE      ω²
                     Simple Effects
Overall            Discipline                          2.82   2,700   .060   164.80   .005
                   Discipline x Student Gender         3.26   2,700   .039   164.80   .006
                     Student Gender (Science only)     3.98   1,324   .047   190.79   .013
Student Needs      Student Gender                      4.53   1,715   .034    16.90   .005
Teaching Quality   Discipline                          4.09   2,717   .017    20.42   .002
                   Student Gender                      3.57   1,717   .059    20.42   .003
                   Discipline x Student Gender         4.13   2,717   .017    20.42   .008
                     Discipline (Males only)           8.31   2,328   .001    23.11   .042
                     Discipline (Females only)         3.69   2,325   .026    18.13   .013
Targets            Discipline                          5.62   2,705   .004    27.89   .013
                   Student Gender                      5.25   1,705   .022    27.89   .006
                   Professor Gender                    4.19   1,705   .041    27.89   .013

Note: Nonsignificant main effects and interactions are not listed.

The ANOVA of students' overall professor evaluations showed a marginal main effect for Discipline, but a significant Student Gender × Discipline interaction. Inspection of the means showed that Science students gave a slightly lower overall evaluation compared to Social Science and Fine Arts/Humanities students. Simple effects tests of the interaction showed that for Science students, males gave significantly lower overall evaluations than female students (Ms = 72.98 and 76.02, respectively). There were no significant differences by Student Gender for Fine Arts/Humanities or Social Sciences.
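The effect-size screening described above can be illustrated with a short sketch. It is a minimal example, not the authors' computation: it assumes the common partial ω² approximation from a reported F ratio, df_effect(F - 1) / (df_effect(F - 1) + N), rather than the sums-of-squares formula in Howell (1996), so it will not reproduce every entry in Table 3. The function names and the sample size of 331 (inferred from dfs = 2,328 for a one-way simple effect) are assumptions made for illustration.

```python
def partial_omega_squared(f_value, df_effect, n_total):
    """Approximate partial omega-squared from a reported F ratio.

    Assumption: between-subjects design, using
    df_effect * (F - 1) / (df_effect * (F - 1) + N).
    Negative estimates (F < 1) are truncated to zero.
    """
    numerator = df_effect * (f_value - 1.0)
    if numerator <= 0:
        return 0.0
    return numerator / (numerator + n_total)


def is_trivial(omega_sq, cutoff=0.03):
    """Apply the paper's rule: effects with omega-squared below .03 are unimportant."""
    return omega_sq < cutoff


# Simple effect of Discipline on Teaching Quality for male students
# (F = 8.31, dfs = 2,328); N = 331 is inferred, not reported.
estimate = partial_omega_squared(8.31, 2, 331)
print(round(estimate, 3), is_trivial(estimate))
```

Under this approximation the male-only Discipline effect is the one case that clears the .03 cutoff, which agrees with the value listed in Table 3.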
Although there were no significant differences in Course Structure, there were significant differences in Student Needs by Student Gender, whereby female students believed their professors were more sensitive to their needs than male students did.

For Teaching Quality, there was a marginal main effect by Student Gender, a significant main effect by Discipline, and a significant Student Gender × Discipline interaction. Inspection of the means revealed that female students gave slightly higher ratings than male students. As a test of the pairwise differences among the three levels of Discipline, the REGWQ multiple comparison procedure was selected because it offers the most powerful test of mean differences when the number of groups is small, while still controlling the familywise error rate (Howell, 1996, p. 381). In addition, the Games-Howell correction method (Toothaker & Miller, 1996) was used to account for unequal group sizes. Results showed that Science students gave significantly lower ratings on teaching style than both Social Science and Fine Arts/Humanities students, whose ratings did not differ. To assess the nature of the interaction, a simple effects test of Discipline (controlling for Student Gender) showed a significant effect for male students, whereby male Science students were more critical of their professor's teaching style than male Fine Arts/Humanities students (Ms = 22.97 and 25.69). The test was also significant for female students, whereby female Science students were more critical than female Social Science students (Ms = 23.96 and 25.39).

An ANOVA for Treatment of Target Groups showed significant main effects for Student Gender, Professor Gender, and Discipline. Inspection of the means revealed that female students believed their professors were more sensitive to target group issues than male students did.
In addition, female professors were rated as more sensitive to target group issues than male professors. REGWQ tests showed that professors were judged to be less sensitive to target group needs by students in the Sciences than by students in Social Sciences or Fine Arts/Humanities. Class Analyses Since there was a considerable range in class size (8 to 330 students), the responses from students in the larger classes would have biased the findings concerning individual courses or professors; therefore, mean evaluation scores for each class were calculated and used in analyses with course and professor variables to give equal weighting to all courses. Table 4 shows the mean class evaluation scores and the correlation matrix for the evaluation scales with course variables (class size, number of times the class meets per week, time of day the class meets) and with professor variables (years of teaching, the number of hours the professor meets with students outside of class, professor gender). Results showed that professor sensitivity to Student Needs was correlated positively with time of day in which the class was held (morning versus afternoon/evening). This finding may be an indicator of students' reactions to class size since morning classes tended to be larger classes, and Student Needs were somewhat negatively correlated with class size. Course Structure correlated positively with professor's years of teaching, and Teaching Quality correlated positively with the number of hours a professor met with students outside of class. In addition, professor gender was correlated significantly with Target Group Treatment, and correlated somewhat with Overall Evaluation (i.e., students rated female professors more positively than male professors). Lastly, Overall Evaluation was somewhat negatively correlated with class size, but somewhat positively correlated with the number of hours a professor met with students outside of class. 
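The class-level analysis described above can be sketched as follows. This is an illustrative outline under stated assumptions, not the authors' procedure: the function names and the tiny data set are hypothetical. It averages student scores within each class so that every class counts once, then computes a Pearson correlation; when one variable is coded 0/1 (e.g., professor gender), the same formula yields the point-biserial correlation.

```python
from collections import defaultdict
from math import sqrt
from statistics import mean


def class_means(ratings):
    """Collapse (class_id, score) pairs to one mean score per class."""
    by_class = defaultdict(list)
    for class_id, score in ratings:
        by_class[class_id].append(score)
    return {class_id: mean(scores) for class_id, scores in by_class.items()}


def pearson_r(xs, ys):
    """Pearson correlation; point-biserial when xs is coded 0/1."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Hypothetical student-level ratings: class "A" is large, "B" and "C" are small.
ratings = [("A", 20), ("A", 22), ("A", 21), ("A", 19), ("B", 24), ("C", 18)]
means = class_means(ratings)

# Hypothetical class-level predictor: hours met with students outside class.
hours = {"A": 4.0, "B": 6.0, "C": 1.0}
classes = sorted(means)
r = pearson_r([hours[c] for c in classes], [means[c] for c in classes])
print(means, round(r, 2))
```

Averaging first means that the four students in class "A" contribute one data point, the same weight as the single respondents in "B" and "C", which is the equal-weighting rationale given in the text.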
Table 4
Correlations of Evaluation Subscales with Professor and Course Variables

                                                    Course                         Professor
                                          Class   Meetings   Time              Years of   Outside
Evaluation Scale          Mean     SD     Size    per week   of day   Gender   Teaching   Class Hours
Student Needs             20.26   2.21    -.25     -.23       .38*     .24      .44*       .29
Teaching Quality          24.77   2.34    -.22     -.11       .03      .25      .26        .49*
Target Group Treatment    11.01   1.22    -.22     -.38*      .06      .45*     .28        .30
Course Structure          22.17   2.36    -.09     -.03      -.05      .15      .38*       .25
Total                     78.20   7.10    -.22     -.19       .02      .29      .40*       .38*

Note: N = 28; * p < .05. Correlations by time of day (morning vs. afternoon/evening) and gender (male, female) are point-biserial; the remainder are Pearson.

DISCUSSION

Both the strengths and weaknesses of the present study should be considered when interpreting the results. The study expanded upon Basow's (1995) work by using a more generalizable sample of students, and by including additional variables that may be relevant to students' evaluations of their professors (e.g., student's designated group status, the number of hours a professor met with students outside class). However, only cautious interpretations can be drawn from the results because (as in Basow's study) the student and professor variables still only explained a small amount of the variance in evaluation scores, and there were too few female professors to adequately test differences in evaluation scores based on professor gender.

Results showed that students in this sample gave moderately positive ratings to many qualities of their professors' teaching, including style, sensitivity to student needs, and aspects of the course itself.
Although one might anticipate variation in student ratings based on student age, year in university, or target group designation (e.g., female, Aboriginal, disability), the only notable effect occurred with respect to course structure, which was moderately related to students' current and expected final grades (i.e., students performing better academically were more likely to believe that their course had clear objectives, relevant readings, and fair examinations). However, students' gender and field of study were closely related to how students evaluated their professor and course. Female students perceived professors to be more sensitive to their academic needs and to the needs of target groups than did male students. In addition, female students felt that their professors were better teachers than did male students. Specifically, female Social Science students and male Fine Arts/Humanities students were more positive than either female or male Science students about their professor's teaching style. In general, Science students were more critical of their professor's teaching style and sensitivity to target group needs than were either Social Science or Fine Arts/Humanities students. These findings are consistent with Holdaway and Kelloway (1987), who found that first-year Science students were significantly less satisfied with the university experience than first-year Arts students. Similarly, Worth, Crombie, and Rinholm (1991) found that Arts students reported their professors engaged in more personalized classroom behaviour (e.g., called on students by name) and had more interactive exchanges with students than Science students reported experiencing. It may be that there is little opportunity for discussion of target group issues in Science classes when compared to Social Sciences or Fine Arts/Humanities classes.
In fact, courses in the latter two disciplines often include presentations of women's issues, Native issues, and topics of racism or sexism as part of their curriculum. Thus, students may perceive that their Social Science and Humanities professors are more sensitive to such issues than their Science professors. Furthermore, the Sciences are viewed as traditionally male-dominated fields and this may further contribute to students' less positive reactions to their Science professors' sensitivity and teaching.

Characteristics of professors and their courses were also associated with students' evaluations of their professors. For instance, professors of larger classes were more likely to be rated less positively than professors of smaller classes, especially in terms of meeting the academic needs of students. Crawford and MacLeod (1990) also found that class size affected many aspects of classroom climate such as participation rates and perceptions of professors' behaviour. In addition, professors with more teaching experience, and who scheduled more office hours with students, received higher evaluations of their teaching ability and course structure. Lastly, female professors were rated as more sensitive to target group issues than their male counterparts.

Implications and Directions for Future Research

As in Basow (1995), the variables included in the present study explained only a minimal amount of the variance in students' evaluation scores; however, the findings of both this and Basow's (1995) studies indicate there are multiple influences on students' ratings of their professors, operating either as main effects or as interactions at any one time. That is, the interaction of students' gender, current and expected final grades, area of study, and additional (as yet undetermined) factors may play a significant role in students' evaluation of their professors and courses. It should be noted that although these variables have little to do with a professor's actual teaching ability, they nonetheless may be affecting students' reactions to their professors.

Previous research (e.g., Alexitch, 1994; Divoky & Rothermel, 1988; Morstain, 1977; Remigio & Page, 1991) has indicated that students' values, motivational orientation, and their reasons for choosing courses or for pursuing a post-secondary education may affect students' expectations and satisfaction with their courses, programs, and professors. For instance, some researchers (Morstain, 1977; Remigio & Page, 1991) have noted that the congruence between students' values and views about the purpose of education and those of faculty may affect students' level of satisfaction with their university education. In addition, there is evidence indicating that students' academic aspirations and satisfaction may be significantly influenced by informal (outside class) rather than formal (in-class) faculty-student contact (Lamport, 1993; Pascarella, 1984; Pascarella, Terenzini, & Hibel, 1978; Theophilides, Terenzini, & Lorang, 1984). The campus climate for students is largely composed of many of these aspects: students' perceptions of higher education, academic expectations, programmatic demands and restrictions, and their degree of satisfaction with the people who make up the campus community (Baird, 1990; Pascarella, 1984). Since some or all of these variables may affect students' ratings of professors, future research examining students' evaluations of their professors should endeavour to incorporate additional student, programmatic, and campus-related variables.
D'Apollonia and Abrami (1997) advised that while students' ratings of professors were moderately valid, they alone were not sufficient to gauge teaching ability in professors. Therefore, administrators and faculty should consider administrative and programmatic variables (e.g., class size, program requirements) and both professors' and students' gender when using teaching evaluations for tenure and promotion decisions. This focus would effectively increase the validity of students' ratings of professors, and thereby provide a more accurate reflection of a professor's teaching. More importantly, students should benefit from higher quality teaching offered by professors who receive more constructive teaching evaluations.

References

Alexitch, L.R. (1994). Undergraduate student expectations and perceptions of a university education in the 1990s. Unpublished doctoral dissertation, University of Windsor, ON.

Baird, L.L. (1990). Campus climate: Using surveys for policy-making and understanding. New Directions for Institutional Research, 68, 35-45.

Basow, S.A. (1995). Student evaluations of college professors: When gender matters. Journal of Educational Psychology, 87, 656-665.

Basow, S.A., & Silberg, N.T. (1987). Student evaluation of college professors: Are female and male professors rated differently? Journal of Educational Psychology, 74, 170-179.

Brady, K.L., & Eisler, R.M. (1995). Gender bias in the college classroom: A critical review of the literature and implications for future research. Journal of Research and Development in Education, 29, 9-19.

Carelli, A.O. (1988). Sex equity in education: Readings and strategies. Springfield, IL: Charles C. Thomas.

Cashin, W.E., & Downey, R.G. (1992). Using global student rating items for summative evaluation. Journal of Educational Psychology, 84, 563-572.

Chambers, T., Lewis, J., & Kerezsi, P. (1995). African American faculty and White American students: Cross-cultural pedagogy in counselor preparation programs. Counseling Psychologist, 23, 43-62.

Constantinople, A., Cornelius, R., & Gray, J. (1988). The chilly climate: Fact or artifact? Journal of Higher Education, 59, 527-550.

Crawford, M., & MacLeod, M. (1990). Gender in the college classroom: An assessment of the 'chilly climate' for women. Sex Roles, 23, 101-122.

D'Apollonia, S., & Abrami, P.C. (1997). Navigating student ratings of instruction. American Psychologist, 52, 1198-1208.

D'Augelli, A.R., & Hershberger, S.L. (1993). African American undergraduates on a predominantly white campus: Academic factors, social networks, and campus climate. Journal of Negro Education, 62, 67-81.

Deaux, K., & Major, B. (1987). Putting gender into context: An interactive model of gender-related behavior. Psychological Review, 94, 369-389.

Divoky, J.J., & Rothermel, M.A. (1988). Student perceptions of the relative importance of dimensions of teaching performance across type of class. Educational Research Quarterly, 12, 40-45.

Feldman, K. (1993). College students' views of male and female college teachers: Part II - Evidence from students' evaluations of their classroom teachers. Research in Higher Education, 34, 151-211.

Frable, D.E.S. (1989). Sex typing and gender ideology: Two facets of the individual's gender psychology that go together. Journal of Personality and Social Psychology, 56, 95-108.

Freeman, H.R. (1994). Student evaluations of college instructors: Effects of type of course taught, instructor gender and gender role, and student gender. Journal of Educational Psychology, 86, 627-630.

Harvey, G., & Hergert, L.F. (1986). Strategies for achieving sex equity in education. Theory Into Practice, 25, 290-299.

Heller, J.F., Puff, C.R., & Mills, C.J. (1985). Assessment of the chilly college climate for women. Journal of Higher Education, 56, 446-461.

Holdaway, E.A., & Kelloway, K.R. (1987). First year at university: Perceptions and experiences of students. The Canadian Journal of Higher Education, 17, 47-63.

Horn, P.W., DeNisi, A.S., Kinicki, A.J., & Bannister, B. (1982). Effectiveness of performance feedback from behaviourally anchored ratings scales. Journal of Applied Psychology, 67, 568-576.

Howell, D.C. (1996). Statistical methods for psychology (4th ed.). Toronto, ON: ITP Nelson.

Lamport, M.A. (1993). Student-faculty informal interaction and the effect on college student outcomes: A review of the literature. Adolescence, 28, 971-990.

Marsh, H.W. (1980). The influence of student, course, and instructor characteristics in evaluations of university teaching. American Educational Research Journal, 14, 441-447.

Marsh, H.W., & Roche, L.A. (1997). Making students' evaluations of teaching effectiveness effective: The critical issues of validity, bias, and utility. American Psychologist, 52, 1187-1197.

McKeachie, W.J. (1997). Student ratings: The validity of use. American Psychologist, 52, 1218-1225.

Morstain, B.R. (1977). An analysis of students' satisfaction with their academic program. Journal of Higher Education, 48, 1-16.

Murray, H.G., Rushton, J.P., & Paunonen, S.V. (1990). Teacher personality traits and student instructional ratings in six types of university courses. Journal of Educational Psychology, 82, 250-261.

Pascarella, E. (1984). College environmental influences on students' educational aspirations. Journal of Higher Education, 55, 751-771.

Pascarella, E., Terenzini, P., & Hibel, J. (1978). Student-faculty interactional settings and their relationship to academic performance. Journal of Higher Education, 49, 450-463.

Remigio, J., & Page, S. (1991). Value orientations in Canadian university undergraduates. Journal of Psychology and Behavioural Sciences, 6, 160-166.

Theophilides, C., Terenzini, P.T., & Lorang, W. (1984). Relation between freshman year experience and perceived importance of four major educational goals. Research in Higher Education, 20, 235-253.

Toothaker, L.E., & Miller, L. (1996). Introductory statistics for the behavioral sciences (2nd ed.). New York, NY: Brooks/Cole.

Williams, D. (1990). Is the post-secondary classroom a chilly one for women? A review of the literature. The Canadian Journal of Higher Education, 23, 29-42.

Worth, D.A., Crombie, G., & Rinholm, J. (1991, June). Students' perceptions: Gender differences in the university classroom experience. Paper presented at the annual meeting of the Canadian Psychological Association, Calgary, AB.