Module Evaluation Learning Circle (on-line resources)

Academic literature

Key journals are:

Sections below group some relevant articles. Links are Warwick-only access links that should allow you to view the full article (you'll be prompted to sign-in).

Relevant Special issues

Relevant literature

List of relevant academic literature very roughly organised into topic areas.

General module/course evaluation

TODO. Need to identify more overviews of state of the art.


Many authors have studied the different modes of evaluation, whether paper or on-line. Issues such as selection bias and measurement error arise in the literature covering the shift to on-line, with many authors exploring the potential consequences. There is evidence that response rates fall when evaluations are moved on-line, but many of these are small-scale studies with questionable experimental design. Even though much of the literature concerns small-scale studies, it nevertheless contains useful suggestions about managing the transition or dual delivery, e.g. Berk (2013) and Ravenscroft & Enyeart (2009). Nulty (2008) also gives useful practical advice on boosting response rates.

Authors such as Kordts-Freudinger & Geithner (2013) and Treischl & Wolbring (2017) emphasise that most studies confound the change of mode with the change of context (in class/out of class).

Questionnaires and questions

Several articles listed below, such as Kember and Leung (2008), analyse the questions and types of questions that can or should be asked for validity and reliability. Huxham et al. (2008) compare the questionnaire against other strategies for gathering student feedback such as focus groups, rapid feedback and reflective diaries.

Student perspectives and perceptions

Several studies have identified a link between student attitudes towards the evaluation and the success of the system. Most studies state that students are motivated to participate when they expect to be able to provide meaningful feedback and to see the impact of that feedback.

Chen and Hoshower (2003) found that students consider an improvement in teaching quality the preferred outcome of an evaluation process. Improvements to course content and format were considered the second most preferred outcome. Using the evaluation to influence tenure/promotion/salary, or making the results available for students' decisions on course and instructor selection, was considered less important by students.

Beran et al. (2009) have a useful discussion beginning on page 524 about what students consider important in rating courses.

Staff perspectives and perceptions

Rienties (2014) gives a useful perspective on the Open University's transition to an on-line evaluation system and in particular staff attitudes. Rienties conducts a series of interviews with staff and contrasts their cognitive understanding of why evaluation is now taking place on-line with their emotional reaction and aversion to the transition.

Edström (2008) discusses the perception of course evaluation as a 'fire alarm' function, rather than having a course development role.

Moskal, Stein and Golding examine to what extent the technology influences staff engagement with evaluation and show how the practical elements of the solution influence overall engagement.


Student evaluation of modules and courses is multidimensional. The SEEQ (Students' Evaluation of Educational Quality) instrument was first published in the early 1980s and has been researched extensively. The instrument aims to measure several dimensions including learning, enthusiasm, organisation, group interaction, individual rapport and breadth. Much of the published literature refers to SEEQ or the Experiences of Teaching and Learning Questionnaire.


Many authors consider potential sources of bias including gender, expected grades & outcomes, class size, prior knowledge, difficulty and workload. Marsh (2007) has a good summary of findings but also notes "The voluminous literature on potential biases in SETs is frequently atheoretical, methodologically flawed, and not based on well-articulated operational definitions of bias, thus continuing to fuel (and to be fuelled by) myths about bias".

Marsh (1987) identifies four factors that were important to predicting a student's evaluation: prior interest of the student in the subject, expected grades, perceived workload and rationale for selecting the module.

There is a body of evidence claiming that a number of factors bias or otherwise affect evaluations. Here are some examples. Mode of evaluation is also considered relevant but is covered in an earlier section.

Factors related to teachers:

Factors related to students:

Likert scales

Revilla et al. (2014) test and discuss the impact of increasing the number of categories on validity and reliability. Their results show that agree-disagree scales should be offered with 5 options rather than 7 or 11, which yield poorer-quality results. Lozano et al. (2008) claim the optimum number is between four and seven options: with fewer than four, validity and reliability decrease, and with more than seven, reliability does not increase significantly.

A general scan of the related research indicates a consensus that four options are the absolute minimum to ensure reliability, but some disagreement over whether five or seven are preferable. In practice the difference in validity between five and seven options is minimal.

Middle option: Kalton et al. (1980) consider the impact of including a middle response and conclude that the presence of the middle option reduces the number of extreme responses given. Sturgis et al. (2014) look at whether a respondent's selection of the middle option represents a neutral response or a lack of cognitive choice/no opinion. Their study follows up with respondents who chose the middle option, concluding that most often it represents a "don't know" response. Including or excluding middle responses that represent no cognitive choice can then significantly affect the analysis of results.
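
As a rough illustration of the last point, the sketch below compares a summary statistic computed with and without the middle responses. The data, the 5-point coding and the function name are made up for illustration and are not taken from any of the studies cited; treating every middle response as "don't know" is an assumption, not a recommendation.

```python
from statistics import mean

# Hypothetical responses on a 5-point agree-disagree scale
# (1 = strongly disagree, 5 = strongly agree). Invented data.
responses = [5, 4, 3, 3, 3, 4, 2, 3, 5, 3]

MIDDLE = 3  # the middle option on a 5-point scale


def summarise(scores, drop_middle=False):
    """Return (count, mean) of the scores, optionally excluding the middle option."""
    kept = [s for s in scores if not (drop_middle and s == MIDDLE)]
    return len(kept), mean(kept)


# Treat middle responses as genuine neutral answers (keep them).
n_all, mean_all = summarise(responses)

# Assume middle responses mean "don't know" (drop them before analysis).
n_sub, mean_sub = summarise(responses, drop_middle=True)

print(f"all responses:   n={n_all}, mean={mean_all}")
print(f"middle excluded: n={n_sub}, mean={mean_sub}")
```

With this invented data, half of the responses fall on the middle option, so excluding them both halves the sample and shifts the mean upwards. Real evaluation data would need a follow-up question, as in the approach described above, to decide which middle responses to exclude.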

Ordering: Hartley and Betts (2010) show how the order of the options impacts the outcome. A descending scale (10 to 0, agree to disagree) results in consistently higher ratings compared to an ascending scale.