At the programme level: how to assess?

Whether you start from the top down (programme level) or the bottom up (module level), having an overview of the chosen assessment tasks at the programme level is important for a number of reasons:

  • It will help to ensure a diversity of assessment tasks across the programme. This is important not for its own sake but for allowing for development and demonstration of a range of skills. It can also help to “promote equity and offer an inclusive assessment culture relevant to the needs of a diverse student body”. It is important, though, that each tool or type of activity allows for valid assessment of the learning, and is properly introduced and framed, so that the student is familiar enough with it to be able to use it effectively (Bloxham and Boyd 2007, 166).
  • As with diversity of teaching and learning techniques, diversity of assessment practice can also help students to recognise future learning opportunities. Outside the formal education environment it will be rare for students to always have their performance assessed in the same way. Experience with a range of methods supports the development of assessment literacy.
  • It will help to ensure that the right learning activities are placed throughout the course in support of the assessment journey. Specific activities around understanding expectations, discussion, feedback and reflection are all beneficial to learning and achievement, particularly if they are planned in at the most timely points across a programme (Race and Brown 2005).

At the module level: choosing an assessment task

“The logic is stunningly obvious: say what you want students to be able to do, teach them to do it and then see if they can, in fact, do it” (Biggs 2003, p207).

Defining what will be demonstrated and measured in each assessment activity will already provide some pointers towards the most appropriate assessment activity: for example, demonstrating the knowledge and skills required to build a great web page might best be done by building a web page, which may be accompanied by an explanatory log describing how theories were implemented and decisions made. In practice though, the selection process is rarely so straightforward. Selecting the best task to assess the right things, at the right level and in the right way is often a complex challenge.

Any assessment design needs to consider what has been described as the “validity/reliability/manageability equation” (Price et al. 2012, p51).

Validity

A chosen task needs to be valid; that is, appropriate to the demonstration of the specified learning outcomes, and measuring what it claims to measure. As Biggs noted, you don’t use a lecture to teach a toddler to tie shoes, nor do you test their skill with a multiple choice exam (Biggs, ibid.). It is important to draw a distinction here between genuine validity: ‘this assessment works for this learning’, and disciplinary traditions or signature pedagogies: ‘this is how this subject area is assessed’. The latter are almost always informed by validity, but the causal relationship is not guaranteed and assumptions should be tested against the learning outcomes in question.

Some key questions to help you check the validity of a task:

  • Does it encourage time on task? Ideally, the fact that assessment drives student learning would be a positive thing: setting the right task would generate the right learning activities in preparation (Gibbs and Simpson 2004, p15). This is not always the case, though. Question-response assessments can lead to cue-seeking behaviour. Providing a choice of essay titles might lead to selective coverage of the curriculum. A presentation task for a distance learner might require the use of unfamiliar technology, so that the mechanics of the task take up valuable time better spent on the content. Ask yourself: what will my learners be doing to prepare for this assessment?
  • Is it relevant? A multiple choice test might demonstrate that the student can memorise by rote and produce responses to prompts. But is this relevant – not just in terms of content, but in terms of process – to how they will learn and demonstrate competence in the future? Ask yourself: how would my learners be expected to convey this information after graduation?
  • Is it engaging? This may affect the amount of time and effort your learners are prepared to spend on the task. Although there is debate about the relative values of ‘deep’, ‘surface’ and ‘strategic’ learning (see Race 2005, pp.68-9), there is also evidence that students who have an interest in or care about the issue will use different learning strategies, which may lead to better learning (Bloxham and Boyd 2007 p.17). Using ‘real world’ data, scenarios and problems can help to engage students beyond the intellectual level. Ask yourself: if I were the learner, and I wasn’t being graded, would I still want to know more about this?
  • Does it discriminate on ability, or on disability? Some students will do better at certain kinds of tasks; others will do better at others. But repetition of the same type of task may disadvantage some learners based not on their knowledge and understanding, but on their ability to convey the information in the prescribed format (Race 2005, pp.75-6). Ask yourself: if I had a disability, would I find this task more challenging than if I didn’t?

Reliability

The second part of the equation is reliability; that is, there must be consistency for students and staff about the expected standards, and about how these will be measured. This is a tricky thing to demonstrate, given that the grading of all assessment depends to some extent on the academic judgement of staff who, while possibly very experienced, are still human. Demonstrating reliability can be more of a challenge for some tasks than others; it is harder to demonstrate the equal application of grading standards for a presentation or a performance than for a multiple choice test, for example. There are a range of procedures available to mitigate the subjectivity of judgement, including the specification of grading criteria and standards, double marking and moderation. Transparency around these processes can help to evidence that an assessment is ‘fair’.

Manageability

In a recent study of new lecturers’ approaches to assessment, practical factors such as cost, time and high student numbers were perceived to be the biggest constraints to innovation (Norton et al. 2013). In an ideal world, any task that was deemed valid and reliable could be used to assess student learning. In reality, though, choices are constrained by other factors, such as the availability of rooms, equipment and markers, and the restrictions on assessment hours specified in the modular framework. Tasks need to be practicable and scalable.

There is no such thing as the perfect assessment task, but it is possible to reach a balance between these (sometimes conflicting) requirements, to provide a sound basis for assessment. The first step is to recognise which of these elements is the most important for the assessment at hand; this may vary depending on the purpose of the assessment and the learning goal. For assessment of medical procedures, financial and staff time concerns may be secondary to ensuring validity and reliability because of the implications related to success or failure based on these judgements. Yet for assessment of design, reliability may be “less imperative” and so more emphasis will be placed on validity and manageability (examples taken from Price et al. 2012, p51).

Analysis of a proposed option against these requirements may well reveal a conflict, where a task that is strong on one criterion is potentially weak on another. If this can be recognised, then additional measures can be included in the design to mitigate or address it; for example, framing tasks more towards application to increase validity, using second marking or rubrics to increase reliability, or providing alternative options for accessibility (Bloxham and Boyd 2007 pp44-46). This leads to consideration of other dimensions of assessment – with a sound valid, reliable and manageable foundation in place, assessment design can look to other factors discussed above, like authenticity and the incorporation of self and peer assessment, to add value to the assessment experience and help address the challenges described in section 1.1. Analysis and planning tools, like the CAIeRO process or the Assessment Design tool from the University of Exeter, can help with this.

How to assess: the process of assessment design

It is important to remember that assessment design is an iterative process and that any design that is implemented should be evaluated and adjusted periodically. See the assessment lifecycle outlined on the MMU website for more information.

Creating assessment briefs

“Teaching is about effective communication, not playing word games” (Race 2005, p.94)

Effective communication of expectations is crucial to any assessment task. Students should be focusing time and effort on fulfilling the assessment brief, and not on working out what it means. In addition to clarity about the nature of the task, it is important to be clear about the purpose – why is it important that a graduate of this programme be able to do this to the required standard? How will this task help you as a learner to reach that standard?

Setting and communicating expectations for assessment should be a dialogic process, involving more than the simple transmission of an explanatory document. Discussing the nature and purpose of the task with students can help them to engage with it and feel a sense of ownership. Asking them to suggest other ways that might be appropriate for measuring the learning, and discussing why you have chosen the task you have, can help them to feel more like partners in the process, rather than simply being subject to it (McDowell and Sambell, 2014). Feedback from students can help you to evaluate both the task itself and the guidance you provide for it.

Assessment documentation in UK institutions “remains quite technical and procedural in nature and is slow to catch up with the shift in thinking about assessment as a more developmental process for as well as of learning” (Ferrell 2013, p6). Assessment briefs should explain the why and the how, and also how this assessment fits in with learning and feedforward from previous assessment(s), with skills development, and with the learner’s development as a professional. The Assessment Brief Design project from Oxford Brookes gives detailed guidance on writing clear and targeted briefs.

Writing grading criteria

Grading criteria form the “aspects of an assessment task which the assessor will take into account when making their judgement”, and should “follow directly from the learning outcomes” (Bloxham and Boyd 2007, 60-1). This means that if your learning outcomes state that the student should know or be able to do x, then your assessment criteria should examine the demonstration or evidencing of x (and not y or z).

Criteria here are distinct from standards, which form the measure of how well the student demonstrates x. Standards are “established by authority, custom or consensus” (Sadler 2005, 189), and are usually determined with reference to the Framework for Higher Education Qualifications, as well as relevant subject benchmarks and professional standards where applicable. Marking schemes or rubrics usually show a combination of the two, and break down broad qualitative criteria into (as far as possible) objective descriptors of levels of achievement, in order to provide transparency into the process of academic judgement, both before and after grading (Sadler, ibid.).

Provision of marking schemes to students in advance of the assessment can help them to engage with and understand the grading process; however, as with any other kind of learning (and as with the assessment brief guidance above), development of understanding should be an active and dialogic process:

“Bring the benefits of tone of voice, body language, and eye contact to bear upon the clarity of the marking criteria. Explain them in lectures and tutorials, face-to-face with students. Ask students to ask you questions about how the marking criteria work in practice.” (Race and Pickford 2007, p.138)

Discussion or even negotiation of grading criteria can be a useful learning exercise, but it is possible to go beyond this. Asking students to actively use the marking scheme, either in self- and peer-assessment or in marking model answers, can help learners to translate the “necessarily abstract or generic” language of a rubric into grading practice, and also help to challenge their internal assumptions about quality (Hendry and Anderson 2013, 764; Nicol 2010a, 505-6).

Designing for uncertainty

Having emphasised the importance of clarity and understanding of expectations, it is important to note that this is not necessarily equivalent to designing a ‘closed’ task or limiting the potential range of responses. Students may need to be able to demonstrate that they can use new and unexpected information effectively, and that they can apply what they have learned in new contexts; if these are required by the learning outcomes, and developed through learning and teaching and formative activities, they can be a valid part of the assessment process. In these cases, where “flexibility and adaptability are key outcomes”, it is more important than ever that this expectation is made clear (QAA 2012a, p23).

Enquiry-based or problem-based approaches, working with “incomplete information” or being asked to produce “multiple solutions” may all form part of a valid task to assess these skills. All of these approaches are designed to create uncertainty, but may also provoke anxiety for inexperienced students. Dialogue, along with formative opportunities, can help to address this.

Inclusivity

Although discrimination has been mentioned briefly in the section on validity, inclusivity is worth calling out as a separate consideration in the process of assessment design.