The committee made two initial assumptions, which we feel should be made explicit in our report. Firstly, we took as given that there is widespread dissatisfaction, varying in degree, with the current FCQ instruments among faculty and students. We make no attempt to document or provide evidence of this, since one has only to ask a few faculty before a picture of discontentment emerges. Some feel that we should start this report with a litany of complaints against the current instruments, but we are reluctant to do so. A very long list could be compiled on the basis of reviewed research alone. However, many would feel that such a list did not address their particular concerns, as the opinions and recommendations put forth on this issue are as varied and diverse as the faculty itself. Possibly the only clear statement that can be made is that there is general dissatisfaction, not merely with the forms, but also with the way in which the results have been used. However, the root causes remain elusive.
Secondly, we accept as fact that student input in the faculty evaluation process is necessary and vital to that process. It is inconceivable to us that a valid evaluation can be obtained without factoring in student perceptions. However, we reiterate that an overall evaluation of teaching on the basis of student opinion alone is inadequate. Effective evaluation must include assessments of multiple components such as rigor, content, student learning, and good pedagogical practices, in addition to student opinion.
Mandated by the Board of Regents in 1986, faculty and course assessment is to serve three functions as indicated in this section of the enabling motion:
"RESOLVED: That beginning with the fall semester, 1986, Faculty-Course Evaluation shall be implemented at the University of Colorado for all courses and their sections offered by any of the University of Colorado campuses. Each campus shall design an evaluation form that meets that individual campus's specific needs so long as such forms are uniform for that campus, include evaluation of individual faculty, and are adaptable to either the individual campus's research and testing services or such services that exist at the Boulder campus. The evaluation system shall be designed to provide published information to students, faculty, departmental administration, and the University's administration. ..."
This mandate appears to assume that the three constituencies (students, faculty, and administration) can all obtain the information they need and/or want from the same form. We call this assumption into question, and believe that it is the root cause of the most significant problems with the FCQs. In what follows, we will use the term "faculty" to refer to all course instructors, be they full-time faculty, part-time faculty, graduate students, etc. We believe this use of the term in its full generality is in keeping with the spirit of the Regental mandate.
The administration's need is to have student input into the evaluative processes required for merit raises, promotions and tenure. This is the chief concern of the primary units, but higher levels of administration also look at this data. The desire is to have simple quantifiable results that can be used to compare faculty members to each other. In the current jargon, the information required is "summative" in nature, used in the process of evaluation. The faculty perspective is quite different. The faculty would like information from their students that would specifically indicate what the students liked and disliked about both the course and their own instructional performance. This information needs to be specific and detailed in order to identify what needs to be changed and/or improved. Such information requires a questionnaire with a different or "formative" quality. Finally, from the student perspective, there is a need to obtain specific information about both the course and the instructor which will be useful in making decisions about what courses and/or instructors to take in future semesters. The issues here center around items such as clarity of presentation, the amount of work required for the credit obtained, the "value" of the course, and the fairness of the instructor in dealing with students. For lack of a better term, we shall call this the "informational" nature of student evaluations. Based on our experiences and supporting research literature, we have concluded that the information needed by each of the three constituencies is not satisfactorily generated by the current FCQ instruments.
While each aspect of the evaluation has its primary audience, there is no doubt that all constituents have a keen interest in all aspects. However, this interest should not override the primary functions. If a constituency is not getting what it needs from the evaluation process, there will be general discontent with that process among that constituency. We believe that the discontent that we are seeing with the current instruments stems from the fact that they are not adequately providing what the administration, faculty, and students need them to provide.
This brings us to our first and most fundamental recommendation.
The summative and formative functions of student assessments should be separated. That is to say, there need to be two separate evaluations by students. The first is a formative evaluation of the faculty member carried out at least once a year and the second a combined summative evaluation and gathering of information on the course and instructor.
The nature of a formative evaluation dictates certain procedural changes. First of all, it makes no sense to administer such an evaluation at the end of a course, when the feedback obtained cannot be acted on. It is much more reasonable to administer such an instrument near midterm. It is also not necessary to have this done in every class, nor in every semester. The research indicates that a well-constructed formative instrument produces a fingerprint which varies little from one class to another. We would suggest that it would be appropriate to have each faculty member administer such a formative instrument in at least one class per academic year. Because this approach specifically asks students for constructive criticism, and because such comments are sometimes very extreme, it is essential that the results of this evaluation be a strictly private communication from the students to the instructor. As such, they should not be publicly available or accessible in any form, nor used in any way by anyone other than the faculty member. It has been shown (McKeachie and Kaplan, 1996) that the most effective results occur when the formative evaluation is done in a consultational situation, that is, when there is another faculty member, or better yet, a specialist in teaching effectiveness, who is not involved in any summative activity and can help with the selection of the evaluation instrument, the administration of it, and the review of its results. We would highly recommend that such a procedure be the norm.
This approach is open to the criticism that a separate formative evaluation increases the time and money devoted to the evaluation process. We do not deny that this is true, but as in all policy decisions, one must weigh the possible benefits against the total cost. We see truly significant benefits arising from a purely formative evaluation, benefits that are well documented in the literature and in actual cases on our own campuses, and that are not adequately delivered by the current FCQ. The aim of the formative evaluation is the improvement of teaching; surely this is worth the small investment of class time and resources that we are suggesting.
Since most faculty members can currently avail themselves of such a formative instrument, but few do (although many more presumably collect formative assessments from students via their own devices), it is clear that something more than suggestion is needed to improve teaching. We would like to see an atmosphere developed on all campuses in which formative evaluations would be taken as the norm, where faculty would desire them and students would come to expect them.
We firmly believe that the implementation of the above proposal would be the single most effective action that can be taken which would lead to a radical improvement of the quality of teaching at the University of Colorado.
The concept of using a separate formative evaluation is not very common and many will resist the idea because it involves change. They should be reassured that if this change is not effective, it should be dropped. To that end we propose that:
After a five-year trial period, the entire evaluation program should be carefully reinvestigated. If marked improvements are not apparent, alternative measures should be taken.
Since 1986, the form and processes of student assessments of instruction have been the same at Boulder, Denver, and Colorado Springs, via the FCQ, with Boulder providing processing services to Denver and Colorado Springs. This practice has led many to assume that student assessments of instruction must be, or should be, or inevitably will be, the same on all campuses, or all general campuses. We find this misperception to be a root cause of significant problems with the current system, because it has hampered review, revision, and ownership by individual campuses. At the time that the current FCQ system was put in place in the mid-80's, there was no real choice except to use Boulder's facilities and forms. It is now possible for the other campuses to handle this administrative chore themselves. With a redesigned form, considerable savings in ongoing operating costs can be made this way.
In order to carry out this task it will be necessary for the individual campuses to establish or re-establish appropriate student-faculty-administration standing committees. This was called for in the Regental mandate, but at present there seem to be only a few ad hoc groups dealing with issues surrounding student assessments of courses and instructors. Also, whereas the original motion called for a system-wide committee as well, we see no clear ongoing role for such a group and therefore suggest that whatever duties arise should be delegated to EPUS. We therefore recommend that:
Each campus should immediately act to be in compliance with the Regents' motion concerning campus committees, to wit:
"Individual campus committees shall also be established to oversee the design, implementation, and information distribution process of the Faculty-Course Evaluation for each campus. The campus committees shall consist of three students appointed by the campus student government, three faculty members appointed by the campus faculty assembly, one member from the campus office of teaching effectiveness or campus equivalent, and one member from the campus Chancellor's office. Student and faculty appointments will give special consideration to draw on the broadest representation among the schools on the campuses."
Furthermore,
The CU system office should provide the four campuses with sufficient start-up or retooling funds to establish their own collection, analysis, and reporting systems.
Note that the 1986 mandate directs that "monies shall be made available from the general fund allocation [for] establishment and operating funds for ... campus ... committees [and for] expansion and operation ... of the campus offices of research and testing or campus equivalents."
The 1986 Regental mandate does not mention the purpose(s) of student ratings of instruction, saying only that assessment systems should "meet the individual campus's specific needs." It does mention all three constituencies, thereby implying summative, formative, and informational uses.
While it is certainly true that there is overlap among the summative, formative, and informational components, we do not believe that they are all satisfactorily represented in the current FCQ instrument. Furthermore, the research literature indicates that it is very difficult to serve all three in one instrument.
We therefore recommend that campus committees design assessment systems so as to ensure that all three components -- formative (diagnostic, for improvement), summative (evaluative), and informational -- are handled effectively. It will be vital to this effort to use the information that is currently available in the research literature (see Appendix 1).
2. What assessments should be required of or offered to which courses and instructors, with what frequency and timing?
The Regental mandate specifies implementation "for all courses and their sections offered by any of the University of Colorado campuses." While it does not speak to formative, summative, and informational assessments, its use of the word "evaluation" implies that it was probably intended to cover summative assessments at least.
Despite the common use of the FCQ, the general campuses have different current policies on coverage. Boulder requires use of the FCQ in all primary sections every fall and spring, but requires only "systematic student input" in secondary sections such as labs and recitations. Some of these use the FCQ, some do not. Use in summer is at the option of the department or instructor. Individual instruction (thesis, independent study) is excluded entirely. Policies at Denver and Colorado Springs are similar but require FCQs in summer as well as fall and spring.
Each campus should consider the utility of summative and informational assessments of very small courses (e.g., enrollment five or under, where confidentiality is difficult to protect and the informational value is small); of summer courses; and of courses taught by visiting faculty, TAs, and perhaps others. Each campus should also consider how continuing education courses and alternate delivery courses (tele-courses, on-line courses, etc.) should be covered.
The campus committees must also acknowledge both logistical concerns and students' interest in instructional improvement as they consider issues of coverage. For example, complex assessment cycles can be difficult to administer, and students in courses with no formative assessment may feel slighted.
3. Structure collection methods to enhance data integrity and to take advantage of new technologies such as the web.
Sloppy or laissez-faire administration (all too common at present) leads to poorer results. In-class administration procedures need to be attended to.
While there are currently technological difficulties concerning accessibility and anonymity in a web-based collection system, these will eventually be solved. The advantages of having a web-based system, already used in some courses at Denver and Boulder, will make this a very attractive prospect in the future. Campuses should start thinking about moving in this direction and preparing pilot programs, while monitoring response rates carefully.
4. Design instrument(s) with careful thought to the wording of any questions calling for ratings, to the rating scales themselves, and to collection and use of student comments.
In designing assessment processes and instrument(s), campus committees should be cognizant of research done on the types of questions used (see Appendix 1). They must balance the need for comparability across courses (and the mandate for forms "uniform" for the campus) with the need for different questions in different kinds of courses. The appendix lists several concise documents which address question construction; commercial instruments are also available.
When an instrument asks students to make ratings, the rating scale itself should be carefully considered. The FCQ uses a "grade" scale, A to F, defined as "very good" to "very poor." Use of the grade scale may contribute to the "top-heaviness" of current responses, where the two top values (A and B) are selected 70% of the time, the remaining three values only 30% total. Boulder is planning to experiment with a revised rating scale in some fall 1999 courses.
The current FCQ collects comments, as well as ratings, from students. The comments are not seen by students, but only by the instructor and (in some cases) administrators. In contrast, many private institutions, including Northwestern, Harvard, and Brown, publish student comments on selected items of interest to students contemplating taking the course. The comments to be published are generally selected, or sampled, or summarized, and/or edited, by administrators and/or student committees. This could obviously be both time consuming and contentious. However, comments can provide a vividness to the results that ratings cannot match no matter how they are presented. Web collection would solve the problem of transcribing comments, making this an option that campus committees should consider.
5. In reporting results, provide context for interpretation and use vivid, user-friendly presentation methods.
Current FCQ reports list distributions, means, standard deviations, and percentile comparisons to the department and campus. Listings for students are limited to means only. Campus committees should consider methods of a) reporting summary measures other than the mean (e.g., medians, trimmed means); b) reporting distributions to students in addition to means; c) providing material to help users understand the effects of contextual variables such as course size, course level, reasons for taking the course, and grades assigned; d) presenting results graphically; and e) making it easier for students to compare results from different instructors and courses.
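As an illustration of item (a), the following is a minimal sketch of how a mean, a median, and a trimmed mean might be computed from one course's distribution of responses. It is not a description of how FCQ reports are actually produced. The numeric coding of the A to F scale, the function names, the 10% trimming proportion, and the example counts are all assumptions made up for this sketch.

```python
from statistics import mean, median

# Hypothetical numeric coding of the A-F rating scale (an assumption for illustration).
SCALE = {"A": 4, "B": 3, "C": 2, "D": 1, "F": 0}


def expand(counts):
    """Turn a {grade: count} distribution into a sorted list of numeric ratings."""
    values = []
    for grade, n in counts.items():
        values.extend([SCALE[grade]] * n)
    return sorted(values)


def trimmed_mean(values, proportion=0.1):
    """Mean after dropping `proportion` of the ratings from each end of the sorted list."""
    if not 0 <= proportion < 0.5:
        raise ValueError("proportion must be in [0, 0.5)")
    ordered = sorted(values)
    k = int(len(ordered) * proportion)  # number of ratings cut from each end
    return mean(ordered[k:len(ordered) - k])


def summarize(counts, proportion=0.1):
    """Report several summary measures side by side for one distribution."""
    values = expand(counts)
    return {
        "n": len(values),
        "mean": mean(values),
        "median": median(values),
        "trimmed_mean": trimmed_mean(values, proportion),
    }


# Made-up response counts for a single item in a single course.
example = {"A": 8, "B": 6, "C": 3, "D": 2, "F": 1}
summary = summarize(example)
print(summary)
```

In a top-heavy distribution such as this invented one, the median and trimmed mean are less affected by a few very low ratings than the mean is, which is one reason a committee might choose to report them alongside it.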
6. Consider dropping the name "Faculty Course Questionnaire."
Although it may seem like an insignificant point, we feel that it would have a great psychological impact if the name of the FCQ were changed. A name change may be more descriptive and serve as a constant reminder to those who use the data that it is only one component of a larger teaching evaluation process. Different names on each campus would help emphasize that student assessments are done in a campus context. Names used commercially and at other institutions generally include something about "teaching," "instruction," "courses," "instructors," and/or "faculty," the word "student," and something about "ratings" or "assessments" or "satisfaction" or "evaluation."
Cashin (1995) recommends using "ratings" rather than "evaluation" to emphasize that students simply say what they think, and others use the information for evaluation.
7. Provide training for administrative users of student assessments of instruction.
Major conclusions from the literature on student ratings of instruction should be provided to evaluators, and perhaps to instructors themselves as well. A "myths, facts" format might be appropriate. Boulder has started development of a series of web documents to be supplemented with and announced by periodic brief emails. The web documents will be modelled after, and borrow from, similar documents used at the University of Michigan.
Campus committees should also consider conducting regular workshops for deans, chairs, and/or evaluation committee members on how to interpret student ratings and comments, and on appropriate use of contextual information such as class size and grades, and on the proper role of student assessments in the evaluation of teaching.
8. Design faculty development structures that assist and support faculty in successfully meeting the standards of any evaluation schemes.
We should not forget that the ultimate aim is the improvement of teaching. Evaluation by itself does not do this; it can at best point out areas that need work. The best evaluation system imaginable, if not coupled with a development system, will remain ineffective. In a Total Learning Environment, one should expect to see evaluation and developmental systems working hand in glove.
We have kept this report and its list of recommendations short in order to focus on what is really essential: reconstruction of the campus committees and a decoupling of the formative and summative aspects of student assessments of courses and instructors. These recommendations are consistent with those in EPUS's Report on Multiple Means of Teaching Evaluation (1991). At this time, with a narrower focus and a more fleshed out proposal, we would hope that some headway can be made. Clearly, the mandating of a separate formative evaluation will not be a popular idea amongst the faculty, so EPUS must ensure that its message gets out in a very clear form and that ample time is provided to discuss the idea. It should find faculty who have undergone such formative evaluations and who would be willing to talk about their experiences to other faculty members. Campus "town meetings" should also be a part of a campaign to sell the idea to the general faculty.
Respectfully submitted,
Bill Cherowitzo
Chair, EPUS FCQ subcommittee
James Murphy, HSC Faculty
Bruce Neumann, UCD Faculty
Michael Grant, UCB Faculty
Mark Malone, UCCS Faculty
Ed Nuhfer, UCD Faculty
Lou McClelland, UCB Administration
Michael Quashigah, UCD Student
Tara Friedman, UCB Student
Scott Allan Zakon, UCB Student
Michelle McWhinney, UCCS Student; Chair, Intercampus Forum
Cashin, W.E., 1995. Student ratings of teaching: The research revisited. IDEA Paper No. 32, September 1995, Center for Faculty Evaluation and Development, Kansas State University.
McKeachie, W.J., and Kaplan, M., 1996. Persistent problems in evaluating college teaching. AAHE Bulletin, February 1996, pp. 5-8.