This repository was archived by the owner on Dec 15, 2025. It is now read-only.
Data processing
Anastasiia.Birillo edited this page Dec 7, 2020 · 3 revisions
- The source data has to be in the .csv format.
- Activity-tracker files have the prefix ide-events. We use the activity-tracker plugin.
- TaskTracker files can have any name, as long as it starts with the key of the task whose data is collected in the file. We use the TaskTracker plugin together with the activity-tracker plugin.
- Columns for the activity-tracker files can be found in the const file (the ACTIVITY_TRACKER_COLUMN const).
- Columns for the task-tracker files can be found in the const file (the TASK_TRACKER_COLUMN const).
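The naming conventions above can be sketched as a small classifier. This is only an illustration, not the repository's own code: the function name `classify_file` and the example task keys are hypothetical, and the only facts taken from this page are the .csv format, the ide-events prefix, and the task-key prefix.

```python
from pathlib import Path

# Prefix of activity-tracker files, as stated on this page.
ACTIVITY_TRACKER_PREFIX = "ide-events"


def classify_file(path, task_keys):
    """Return 'activity-tracker', 'task-tracker', or None for a source file.

    `task_keys` is a hypothetical collection of task keys; the real keys
    are defined by the project, not by this sketch.
    """
    file = Path(path)
    # The source data has to be in the .csv format.
    if file.suffix.lower() != ".csv":
        return None
    if file.name.startswith(ACTIVITY_TRACKER_PREFIX):
        return "activity-tracker"
    # TaskTracker files start with the key of the task they belong to.
    if any(file.name.startswith(key) for key in task_keys):
        return "task-tracker"
    return None
```

For example, with the made-up task keys `["pies", "brackets"]`, a file named `ide-events_1.csv` would be classified as activity-tracker data and `pies_1.csv` as task-tracker data.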
The correct order for data preprocessing is:
- Primary data processing. See documentation.
- Merge activity-tracker and task-tracker files. See documentation.
- Find test results for the tasks. See documentation.
- Reorganize the file structure. See documentation.
- [Optional] Remove intermediate diffs. See documentation.
- [Optional, Python only] Remove inefficient statements. See documentation.
- [Optional] Add an int experience column. See documentation.
Note: you can run the actions independently, but the data for the Nth step must already have passed all the steps before it.
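The ordering rule can be sketched as a simple prerequisite check. This is a minimal illustration under stated assumptions, not part of the project: the step names are shortened from the list above, `missing_prerequisites` is a hypothetical helper, and the sketch assumes that steps marked [Optional] may be skipped without blocking later steps.

```python
# Steps in the order given on this page; the flag marks optional steps.
STEPS = [
    ("primary data processing", False),
    ("merge activity-tracker and task-tracker files", False),
    ("find test results", False),
    ("reorganize file structure", False),
    ("remove intermediate diffs", True),
    ("remove inefficient statements", True),  # Python data only
    ("add int experience column", True),
]


def missing_prerequisites(step, completed):
    """Return the required earlier steps that the data has not passed yet.

    Assumption of this sketch: optional steps are never prerequisites.
    """
    names = [name for name, _ in STEPS]
    index = names.index(step)  # raises ValueError for an unknown step
    return [
        name
        for name, optional in STEPS[:index]
        if not optional and name not in completed
    ]
```

An empty result means the step can be run on the data; otherwise the returned steps have to be applied first, in the listed order.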
- C++
- Java
- Kotlin
- Python