B2A: Leveraging LLMs for student feedback in introductory data science courses


Mine Cetinkaya-Rundel (Duke University)


Abstract

Over the last few years, a considerable challenge for learners and teachers in data science courses has been the proliferation of large language model (LLM) based tools (e.g., ChatGPT, Copilot) for generating answers to assessment questions. In some cases this might be considered a violation of an academic conduct policy, but that is not what this breakout session is about. Instead, in this session we will share a tool built to help students leverage LLMs to get immediate feedback on their work, in an effort to encourage them to attempt the work themselves rather than immediately turning to LLMs to generate answers for high-stakes assessments. The tool, an R package, allows students to develop their own answers, have an LLM evaluate those answers against a rubric that is initially hidden from them, and receive immediate, detailed feedback on their work. We will discuss the decisions made when building the tool, both the backend and the user interface, challenges around evaluations that the LLM does not perform correctly, and feedback from the first set of student users. Finally, we will touch on how the tool can be incorporated into a data science course as a low-stakes assessment and discuss ethical considerations of using an LLM-based tool, especially as part of a course's formal assessment structure.
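
As a rough illustration of the workflow described above, the sketch below shows how a rubric-hidden feedback loop could be wired up in R. This is an assumption-laden sketch, not the actual package: the function name give_feedback(), the prompt wording, the rubric file path, and the use of the ellmer package as the LLM interface are all hypothetical, since the abstract does not name the package or describe its internals.

# Hypothetical sketch only -- not the package presented in the session.
# Idea: send the student's answer plus an instructor-written rubric to an
# LLM and return the feedback, without ever showing the rubric to the student.

library(ellmer)  # one possible LLM interface in R; the real tool may differ

give_feedback <- function(answer, rubric_path) {
  # Read the instructor's rubric from a file the student does not open directly
  rubric <- paste(readLines(rubric_path), collapse = "\n")

  # The rubric lives in the system prompt, so it shapes the evaluation
  # without being echoed back to the student
  chat <- chat_openai(
    system_prompt = paste(
      "You are a teaching assistant for an introductory data science course.",
      "Evaluate the student's answer against the rubric below.",
      "Give specific, constructive feedback, but do not reveal the rubric",
      "and do not write a complete correct answer for the student.",
      "Rubric:", rubric
    )
  )

  # Return the LLM's feedback on this one answer
  chat$chat(paste("Student answer:", answer))
}

# Hypothetical usage: the student sees only the feedback, not the rubric file.
# give_feedback(
#   answer = "I used a histogram because the variable is numerical.",
#   rubric_path = "rubrics/ex-01.md"
# )

The key design idea the sketch tries to capture is that the rubric travels to the LLM rather than to the student, so students can get criterion-referenced feedback on their own attempts without the rubric being revealed at the outset.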

