Conversation


@rnichi1 rnichi1 commented Feb 6, 2025

This PR is part of the Graded-By-AI Master's thesis and enables scoring of submissions through LLMs.

The relevant changes are:

  1. Adding LLM configuration through TOML files in exercises
  2. CourseService calling the separate LLM service to score submissions
  3. Adding database migrations for the new LLM fields
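The per-exercise TOML configuration mentioned in point 1 might look something like the following. This is a hypothetical sketch: the section and key names (`[llm]`, `prompt`, `file`) are assumptions for illustration, not the actual schema; the PR discussion only confirms that a prompt and a file path are configurable.

```toml
# Hypothetical excerpt of a task's config file (key names assumed)
[llm]
prompt = "Grade the student's answer against the rubric below."
file = "answer.txt"  # path of the single submission file sent to the LLM service
```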


// Polling loop
var attempts = 0
val maxAttempts = 20 // Adjust as needed
Collaborator
Can you make some recommendation for a reasonable range of values from your experience?

Author

A call to the service can take up to 2-3 seconds for longer messages. I think 20 should be enough for most cases, but for an exam with more students a higher number would be better (maybe 30 or 40). The service then polls for around a minute before giving up, which should only be an issue if more than 30 students submit at exactly the same time and the model is under high load and therefore slower. From my experience, GPT-4o is a bit slower than smaller models like GPT-4o-mini.
In the end this is just an upper limit; it makes no difference if we set it to 100, the service will simply keep checking the status of the evaluation task for longer.
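The bounded polling described above can be sketched as follows. This is not the actual CourseService code; the method and parameter names are hypothetical, and the status check is abstracted behind a `Supplier<Boolean>` so the retry logic is visible on its own.

```java
import java.util.function.Supplier;

public class PollingSketch {
    // Polls checkStatus until it reports completion or maxAttempts is
    // exhausted; returns whether the evaluation task finished in time.
    static boolean pollUntilDone(Supplier<Boolean> checkStatus,
                                 int maxAttempts,
                                 long delayMillis) throws InterruptedException {
        int attempts = 0;
        while (attempts < maxAttempts) {
            if (checkStatus.get()) {
                return true;           // evaluation task finished
            }
            attempts++;
            Thread.sleep(delayMillis); // wait before the next status check
        }
        return false;                  // gave up after maxAttempts polls
    }

    public static void main(String[] args) throws InterruptedException {
        // Simulated task that completes on the 5th status check.
        final int[] calls = {0};
        boolean done = pollUntilDone(() -> ++calls[0] >= 5, 20, 1);
        System.out.println(done);     // true
        System.out.println(calls[0]); // 5
    }
}
```

Because `maxAttempts` is only an upper bound, raising it costs nothing in the common case: the loop returns as soon as the status check succeeds.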

val assistantResponse = evaluateSubmissionWithAssistant(
    AssistantDTO(
        question = task.llmPrompt ?: task.instructions ?: "No instructions provided",
        answer = studentCode,
Collaborator

This I don't understand. It looks like the LLM service is given submission.files, which would include both the code and the text-answer file. I thought it should contain only the text answer, since that is what the rubrics etc. are targeting. Or is it better to include the student's code? Wouldn't wrong code confuse the model? Also, I think the ACCESS frontend will need to send the text answer separately, e.g. not as part of submission.files but as a newly added submission.textAnswer, which we would need to add to the Submission model, the DTO, and the frontend. That is, the frontend needs to:

  • check if the task involves a text answer
  • remove the text answer from the files submitted as "regular submission files"
  • add the text answer as an extra LLM text-answer field to the submission

Author

Yes, this is a mistake. I have now changed it to take only the file specified by the file path in the config. This way we always send just the relevant file.
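The fix described here, sending only the file named in the config rather than all of submission.files, amounts to a simple filter. The sketch below uses hypothetical names (`SubmissionFile`, `selectAnswerFile`); it is an illustration of the idea, not the actual code.

```java
import java.util.List;
import java.util.Optional;

public class FileSelection {
    // Minimal stand-in for a submitted file (path + content).
    record SubmissionFile(String path, String content) {}

    // Returns the content of the one file whose path matches the task's
    // LLM config, instead of forwarding every submitted file to the service.
    static Optional<String> selectAnswerFile(List<SubmissionFile> files,
                                             String configuredPath) {
        return files.stream()
                .filter(f -> f.path().equals(configuredPath))
                .map(SubmissionFile::content)
                .findFirst();
    }
}
```

Returning `Optional.empty()` when the configured path is missing also gives the caller a natural place to fail the evaluation early instead of sending an empty answer.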

modelMapper.typeMap(TaskDTO.class, Task.class)
    .addMappings(mapping -> mapping.skip(TaskDTO::getFiles, Task::setFiles))
    .addMappings(mapper -> {
        mapper.skip(Task::setLlmSubmission);
Collaborator

Why was it necessary to skip all of these? Normally, CourseConfigImporter can rely on the ModelMapper to take care of most fields, but you implemented the mapping manually in CourseConfigImporter instead. Is that necessary?

Author

This is because the Task class contains each field separately, while the TaskDTO holds the LLM config in a single nested DTO object, so the ModelMapper cannot map them automatically.
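The shape mismatch being described, a nested config object on the DTO versus flat fields on the entity, is the classic reason to skip automatic mapping and copy by hand. The sketch below illustrates the flattening with hypothetical field names (`llmPrompt`, `llmFilePath`); the real Task/TaskDTO classes may differ.

```java
public class LlmMappingSketch {
    // Nested config as it arrives in the DTO (e.g. parsed from TOML).
    static class LlmConfigDTO {
        String prompt;
        String filePath;
    }

    static class TaskDTO {
        LlmConfigDTO llm;   // one nested object on the DTO side
    }

    static class Task {
        String llmPrompt;   // flat fields on the entity side
        String llmFilePath;
    }

    // Manual flattening that a field-by-field mapper cannot derive on its
    // own, hence the skip() calls in the ModelMapper configuration above.
    static Task toEntity(TaskDTO dto) {
        Task task = new Task();
        if (dto.llm != null) {
            task.llmPrompt = dto.llm.prompt;
            task.llmFilePath = dto.llm.filePath;
        }
        return task;
    }
}
```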

@rnichi1 rnichi1 requested a review from sealexan February 15, 2025 22:42