This project contains all the scripts used to conduct the experiments presented in the TSE paper titled "An Empirical Study of Type-Related Defects in Python Projects".
The paper conducts an empirical evaluation of 400 defects in 211 unique Python projects. The following table lists, for each project, the GitHub owner/repo link, the number of contributors, the number of stars, and the project size in lines of code (LOC).
- The sql_queries folder contains the SQL queries used to fetch the data from the GitHub Archive, which is hosted on Google's BigQuery platform.
- The experience folder contains all the code used to calculate the experience counts of the authors.
- The project_statistics folder contains the code used to extract the table above.
- The entire_corpus_python_patched folder contains all of the GitHub issues with at least one Python file in the pull request. This is the corpus from which we select our sample.
- The final list of GitHub issues and their analysis can be found at https://docs.google.com/spreadsheets/d/1iJ9WZ0v5eE1U7YYttKKetNIFLpSGARVf7-4y0LxhB34/edit?usp=sharing
- The entire dataset, along with the annotations, this GitHub repo, the sample issues, and their analyses, can be found on Zenodo at https://doi.org/10.5281/zenodo.4052466
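
The selection criterion behind entire_corpus_python_patched (a pull request that touches at least one Python file) can be illustrated with a minimal sketch. The function name, the `.py` suffix check, and the example data below are assumptions made for illustration; they are not taken from the scripts in this repository.

```python
from typing import Iterable


def touches_python_file(changed_paths: Iterable[str]) -> bool:
    """Return True if at least one changed path ends in .py (assumed criterion)."""
    return any(path.lower().endswith(".py") for path in changed_paths)


# Hypothetical example data: pull request number -> paths changed in that PR.
pull_requests = {
    101: ["src/app.py", "README.md"],
    102: ["docs/index.rst"],
    103: ["setup.cfg", "tests/test_core.py"],
}

# Keep only the pull requests that change at least one Python file.
selected = [
    number
    for number, paths in pull_requests.items()
    if touches_python_file(paths)
]
print(selected)
```

Here only pull requests 101 and 103 would be kept, since 102 changes no Python file.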