This repository was archived by the owner on Dec 15, 2025. It is now read-only.

Data processing: primary data processing

Anastasiia.Birillo edited this page Dec 7, 2020 · 3 revisions

Description

We use the TaskTracker plugin and the activity tracker plugin to gather the source data. The data consists of code snapshots and actions collected while students solve various programming tasks. It also contains the student profile (age, programming experience, and so on) and the task that the student is currently solving.

At this stage, the test files that were created during the testing phase are deleted; they have the value ON in the test mode column of the TaskTracker file. A student can also send several files with the history of solving a task, each of which may include the previous ones. Such redundant files are deleted as well, so that only one file with the unique history of solving the current task remains. In addition, a separate activity tracker file is sent for each TaskTracker file. In this step, all activity tracker files are combined into one.
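The clean-up of TaskTracker files described above can be sketched roughly as follows. This is a minimal illustration, not the code from preprocessing.py: the column name `testMode`, the helper names, and the assumption that every non-test file is a prefix of the longest one are all hypothetical.

```python
from typing import Dict, List

Row = Dict[str, str]

# Hypothetical column name; the real TaskTracker header may differ.
TEST_MODE_COLUMN = "testMode"


def is_test_file(rows: List[Row]) -> bool:
    """A file counts as a test file if any row has test mode ON."""
    return any(row.get(TEST_MODE_COLUMN) == "ON" for row in rows)


def is_prefix(shorter: List[Row], longer: List[Row]) -> bool:
    """True if `longer` starts with exactly the rows of `shorter`."""
    return len(shorter) <= len(longer) and longer[:len(shorter)] == shorter


def select_unique_history(files: List[List[Row]]) -> List[Row]:
    """Drop test files, then keep the longest history that contains the rest."""
    candidates = [rows for rows in files if not is_test_file(rows)]
    if not candidates:
        return []
    longest = max(candidates, key=len)
    # Assumption: every remaining file is a prefix of the longest one.
    if not all(is_prefix(rows, longest) for rows in candidates):
        raise ValueError("histories diverge; a manual merge is needed")
    return longest
```

If the histories sent by a student do not nest in this way, the sketch raises an error instead of guessing how to merge them.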

TaskTracker plugin

The TaskTracker plugin collects the sequence of code snapshots and asks the user about their age, programming experience, gender, and country. TaskTracker files can have any name and contain the following information:

  • date of an action;
  • timestamp of an action;
  • name of the edited file;
  • hash code of the edited file;
  • the current code fragment;
  • the current chosen programming task;
  • test mode;
  • student's id;
  • student's age. Available values: 1-100;
  • student's programming experience in years. Available values: >= 0;
  • student's programming experience in months (if programming experience in years is zero). Available values: 0-11;
  • student's country;
  • student's gender.

An example of the TaskTracker file can be found here.
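The value ranges listed for the student profile can be checked with a small validator. The sketch below is only an illustration: the function name and parameter names are hypothetical, and the plugin or preprocessing.py may not perform such a check.

```python
from typing import List, Optional


def validate_profile(age: int, experience_years: int,
                     experience_months: Optional[int] = None) -> List[str]:
    """Return a list of problems; an empty list means the values are valid."""
    problems = []
    if not 1 <= age <= 100:
        problems.append("age must be in 1-100")
    if experience_years < 0:
        problems.append("experience in years must be >= 0")
    # Months are only meaningful when the experience in years is zero.
    if experience_years == 0 and experience_months is not None:
        if not 0 <= experience_months <= 11:
            problems.append("experience in months must be in 0-11")
    return problems
```

Returning a list of problems, rather than raising on the first one, makes it easier to report every invalid field of a profile at once.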

Activity tracker plugin

The activity tracker plugin tracks and records IDE user activity. The list of the columns and activities can be found here.

An example of the activity tracker file can be found here.
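Combining several activity tracker files into one could look like the sketch below. It assumes that the files have no header row and that simple concatenation in the given order is enough; both are assumptions, and the function name is hypothetical.

```python
import csv
from pathlib import Path
from typing import Iterable


def combine_csv_files(sources: Iterable[Path], target: Path) -> int:
    """Append the rows of every source file to `target`; return the row count."""
    count = 0
    with target.open("w", newline="", encoding="utf-8") as out:
        writer = csv.writer(out)
        for source in sources:
            with source.open(newline="", encoding="utf-8") as src:
                # Assumption: no header row, so every row is copied as-is.
                for row in csv.reader(src):
                    writer.writerow(row)
                    count += 1
    return count
```

If the files do have a header row, the header of every file after the first would need to be skipped.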

Usage

Use the preprocess_data method from preprocessing.py.

Argument    Description
path        Path to the directory with files

The root directory must have the following structure:

-root
--user_N1
---task1
----user_N1_files
--user_N2
---task1
----user_N2_files

The requirements for the user's files:

  • All files have to be in the .csv format.
  • Activity tracker files should have the prefix ide-events.
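Before running the preprocessing, it can help to check that a directory follows this layout. The sketch below walks root/user/task and splits the .csv files by the ide-events prefix. The function name and the group labels are hypothetical, and `rglob` is used so that the files are found whether they sit directly in the task directory or one level deeper.

```python
from pathlib import Path
from typing import Dict, List, Tuple

ACTIVITY_TRACKER_PREFIX = "ide-events"


def collect_files(root: Path) -> Dict[Tuple[str, str], Dict[str, List[Path]]]:
    """Group the .csv files under root/<user>/<task>/ by user and task."""
    result = {}
    for user_dir in sorted(p for p in root.iterdir() if p.is_dir()):
        for task_dir in sorted(p for p in user_dir.iterdir() if p.is_dir()):
            groups = {"activity_tracker": [], "task_tracker": []}
            # Files in other formats are ignored.
            for file in sorted(task_dir.rglob("*.csv")):
                if file.name.startswith(ACTIVITY_TRACKER_PREFIX):
                    groups["activity_tracker"].append(file)
                else:
                    groups["task_tracker"].append(file)
            result[(user_dir.name, task_dir.name)] = groups
    return result
```

The result maps each (user, task) pair to its activity tracker files and its remaining .csv files, which are assumed here to be TaskTracker files.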
