This is the project material for the thesis "Analysis of the relationship between RNA and RBPs using machine learning", a study conducted in collaboration with Umeå University for the course 2DT00E (DiVA: http://lnu.diva-portal.org/smash/record.jsf?pid=diva2%3A1596786). It consists of the Python scripts implemented, the dataset used, and the results obtained during the project.
To run the Python scripts, the following libraries must be installed:
- random (part of the Python standard library)
- PyTorch
- NumPy
- Matplotlib
- scikit-learn
- Numba
Anaconda comes pre-installed with most of these libraries except PyTorch and Numba.
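To see which of the listed libraries are still missing from an environment, a small check along the following lines may help. It is only a sketch and not part of the project scripts: the function name `find_missing` and the `REQUIRED` table are made up for illustration, and the import names (`torch` for PyTorch, `sklearn` for scikit-learn) are the standard ones for those packages.

```python
import importlib.util

# Display name -> import name for the libraries listed above.
# PyTorch is imported as "torch" and scikit-learn as "sklearn".
REQUIRED = {
    "random": "random",
    "PyTorch": "torch",
    "NumPy": "numpy",
    "Matplotlib": "matplotlib",
    "scikit-learn": "sklearn",
    "Numba": "numba",
}


def find_missing(modules=REQUIRED):
    """Return the display names of libraries that cannot be located."""
    return [
        name
        for name, module in modules.items()
        if importlib.util.find_spec(module) is None
    ]


if __name__ == "__main__":
    missing = find_missing()
    if missing:
        print("Missing libraries:", ", ".join(missing))
    else:
        print("All required libraries are available.")
```

Any library reported as missing can then be installed with the package manager of the environment in use.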
The scripts themselves do not need to be installed. However, the Bag of Words data and the UTR sequence 3 k-mer data need to be unzipped beforehand. The directory structure must also stay unchanged.
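The unzipping step can also be done from Python. The sketch below is an assumption about how the archives are laid out, not a description of the project itself: it assumes the compressed data are ordinary .zip files somewhere under the project directory and that each should be extracted next to its archive, so that the directory structure stays as it is. The helper name `unzip_all` is hypothetical.

```python
import zipfile
from pathlib import Path


def unzip_all(project_dir="."):
    """Extract every .zip archive under project_dir into its own folder.

    Each archive is extracted into the directory that contains it, so the
    relative layout of the project is left unchanged.
    Returns the names of the archives that were extracted.
    """
    extracted = []
    # sorted() materializes the list before anything is extracted.
    for archive in sorted(Path(project_dir).rglob("*.zip")):
        with zipfile.ZipFile(archive) as zf:
            zf.extractall(archive.parent)
        extracted.append(archive.name)
    return extracted
```

Calling `unzip_all(".")` from the top-level project folder would extract all archives in place; the returned list shows which files were processed.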
The scripts can be executed from the command line. However, running them in an IDE is recommended, since it gives a better user experience and makes it easier to modify the model settings.
Mattias Wassbjer