Developing robust assessment in the light of Generative AI developments

Project summary

The Open University has been awarded just under £45,000 by the NCFE Assessment Innovation Fund to develop evidence-based guidance on the strengths and weaknesses of different assessment types in the light of Generative AI (GAI) tools. By testing 55 assessments, covering a variety of assessment types and subject disciplines, the project aims to discover which assessment types are most robust, and which are easiest, for learners to answer with the assistance of GAI. The project will also test whether providing markers with simple VLE-based training on detecting AI-generated answers improves their detection abilities.

Project description

Methodology 

Our research questions are: 

  1. Which assessment types are most robust, and which are easiest, for learners to answer with the assistance of GAI? 
  2. Can simple training improve markers' ability to identify AI-generated assessment solutions? 

This project builds on a smaller-scale study conducted in the School of Computing and Communications (Richards et al., 2024). That study established proof of concept that markers assessing a mixture of genuine student scripts and AI-generated scripts could provide evidence on the robustness of assessment types. It also pointed towards characteristics of AI-generated answers that could be used in generalised training to improve all markers' ability to detect such answers. 

The project will test around 50 assessments, covering 27 different assessment types from all faculties and at all undergraduate OU levels (including Access). For the selected questions, a number of answers will be generated by ChatGPT-4. These will be mixed with anonymised authentic student answers selected at random from presentations prior to 22B (the point at which the use of generative AI became widespread). Associate Lecturers (ALs) will be recruited from the relevant modules to mark the scripts. They will initially allocate a grade to half of the scripts and indicate whether they consider any to have been generated by AI. The markers will then complete online (VLE-based) training on detecting AI-generated answers. After the training, they will allocate a grade to the remaining half of the answers and again indicate whether they consider any to have been generated by AI. 
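For illustration only (this sketch is not part of the project's stated methodology, and the function and field names are hypothetical), the blinding step could be implemented roughly as follows in Python: AI-generated and anonymised authentic answers are pooled, shuffled, and split into a pre-training and a post-training batch, with the true source of each script kept hidden from markers.

import random

def build_marking_batches(ai_answers, student_answers, seed=42):
    # Pool AI-generated and authentic answers, tagging each with its
    # true source so markers' detection judgements can be scored later.
    pool = ([{"text": a, "source": "ai"} for a in ai_answers]
            + [{"text": a, "source": "student"} for a in student_answers])
    # Shuffle reproducibly so markers see the scripts in a blind order.
    random.Random(seed).shuffle(pool)
    # Split into the batch marked before training and the batch marked after.
    midpoint = len(pool) // 2
    return pool[:midpoint], pool[midpoint:]

pre_training_batch, post_training_batch = build_marking_batches(
    ai_answers=["Answer generated by ChatGPT-4 ..."],
    student_answers=["Anonymised authentic student answer ..."],
)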

The research has been approved by the Human Research Ethics Committee (HREC) and the staff research panel. 

Outputs

Analysis of the results from approximately 950 scripts will determine which assessment types are strongest or weakest, in terms of whether AI-generated answers receive high or low marks, and, to some extent, whether the outcomes differ across disciplines and levels. It will also enable the project team to determine whether a short piece of training material improves markers' ability to identify AI-generated answers. 
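As a hedged illustration of how the before-and-after detection comparison might be analysed (the project's actual statistical approach may differ, and the counts below are invented), a simple two-proportion test on correct identifications of AI-generated scripts could look like this:

from statsmodels.stats.proportion import proportions_ztest

# Hypothetical counts of AI-generated scripts correctly flagged by markers
# before and after the VLE-based training, out of the scripts seen in each phase.
correct_flags = [60, 95]    # before training, after training
scripts_seen = [200, 200]

# One-sided test: is the pre-training detection rate lower than the post-training rate?
z_stat, p_value = proportions_ztest(count=correct_flags, nobs=scripts_seen,
                                    alternative="smaller")
print(f"z = {z_stat:.2f}, p = {p_value:.4f}")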

The project team hopes to report by summer 2024 and to provide guidance that can be used by module teams across the OU and externally to inform future assessment design. If the training is effective, we hope to make it available to all OU tutors and possibly to external institutions. 

Impact

This research will enable the development of guidance, applicable across further and higher education institutions and awarding bodies, to promote meaningful, robust, evidence-based assessment. It will benefit learners by giving institutions the confidence to continue using a variety of assessment types rather than examination-only assessment, and to adapt learning, teaching and assessment in the light of GAI. It will also support educators in designing assessment that is fair and inclusive. 

The development of a short, evidence-based training resource to help markers detect GAI-generated answers will give educators greater confidence in the continued use of coursework. 

For further information, please contact the following: 

FBL: liz.hardie@open.ac.uk 

FASS: jonquil.lowe@open.ac.uk 

WELS: mychelle.pride@open.ac.uk 

STEM: mike.richards@open.ac.uk    

References

Richards, M., Waugh, K., Slaymaker, M., Petre, M., Woodthorpe, J. and Gooch, D. (2024) ‘Bob or Bot: Exploring ChatGPT’s Answers to University Computer Science Assessment’, ACM Transactions on Computing Education, vol. 24, no. 1. DOI: 10.1145/3633287. 

Funding body

Assessment Innovation Fund (NCFE)
