The Result Assessment Tool (RAT) is a software toolkit that allows researchers to conduct large-scale studies based on results from (commercial) search engines and other information retrieval systems. It is developed by the Search Studies research group at the Hamburg University of Applied Sciences in Germany. The RAT project is funded by the German Research Foundation (DFG – Deutsche Forschungsgemeinschaft) from 8/2021 until 10/2024, project number 460676551.
- For detailed information about the research project and additional resources, visit: https://searchstudies.org/research/rat/
- Information about how to contribute: https://searchstudies.org/rat-how-to-contribute/
- An installation of RAT can be accessed at: https://rat-software.org/
- Datasets generated using RAT and supplementary documentation can be found at: https://osf.io/t3hg9/
- Videos from the RAT Community Meeting are available at: https://www.youtube.com/watch?v=K2Gev8C7Xxw&list=PLiTHQpIQWsZwRaDAgFTANPvI3fHMncXUO
- Overview of the technical implementation: https://osf.io/5v48w
- Project Lead: Professor Dirk Lewandowski - https://github.com/dirklew
- Lead Software Engineer and Developer: Sebastian Sünkler - https://github.com/sebsuenkler
- Current Frontend Developer and Assistant: Tuhina Kumar - https://github.com/tuhinak
- Former Frontend Developer: Nurce Yagci - https://github.com/yagci
- Usability and User Experience Specialist: Sebastian Schultheiß - https://github.com/SebastianSchultheiss
- Student Assistant for Software Engineering: Sophia Bosnak - https://github.com/kyuja
- Developers who created extensions for RAT: https://github.com/rat-extensions
- Sünkler, S.; Yagci, N.; Schultheiß, S.; von Mach, S.; Lewandowski, D. (2024). Result Assessment Tool: Software to Support Studies Based on Data from Search Engines. In: Lecture Notes in Computer Science. https://link.springer.com/chapter/10.1007/978-3-031-56069-9_19
- Sünkler, S.; Yagci, N.; Sygulla, D.; von Mach, S.; Schultheiß, S.; Lewandowski, D. (2023). Result Assessment Tool (RAT): A Software Toolkit for Conducting Studies Based on Search Results. In: Proceedings of the Association for Information Science and Technology. https://doi.org/10.1002/pra2.972
- Schultheiß, S.; Lewandowski, D.; von Mach, S.; Yagci, N. (2023). Query sampler: generating query sets for analyzing search engines using keyword research tools. In: PeerJ Computer Science 9(e1421). http://doi.org/10.7717/peerj-cs.1421
- Schultheiß, S.; Sünkler, S.; Yagci, N.; Sygulla, D.; von Mach, S.; Lewandowski, D. (2023). Simplify your Search Engine Research: wie das Result Assessment Tool (RAT) Studien auf der Basis von Suchergebnissen unterstützt. In: Proceedings des 17. Internationalen Symposiums für Informationswissenschaft (ISI 2023), 429-437. https://zenodo.org/records/10009338
- Sünkler, S.; Yagci, N.; Sygulla, D.; von Mach, S.; Schultheiß, S.; Lewandowski, D. (2023). Result Assessment Tool (RAT): Software-Toolkit für die Durchführung von Studien auf der Grundlage von Suchergebnissen. In: Proceedings des 17. Internationalen Symposiums für Informationswissenschaft (ISI 2023), 438-444. https://zenodo.org/records/10009338
- Sünkler, S.; Yagci, N.; Sygulla, D.; von Mach, S.; Schultheiß, S.; Lewandowski, D. (2022). Result Assessment Tool (RAT). Informationswissenschaft im Wandel. Wissenschaftliche Tagung 2022 (IWWT22), Düsseldorf. https://zenodo.org/records/7092079
The repository provides an overview of extensions created by our developer community: https://github.com/rat-extensions
- Imprint Crawler: A web crawler that is able to automatically extract legal notice information from websites while taking German legal aspects into account: https://github.com/rat-extensions/imprint-crawler. Developed by Marius Messer - https://github.com/MnM3
- Readability Score: A Python tool that extracts the main text content of a web document and analyzes its readability: https://github.com/rat-extensions/readability-score. Developed by Mohamed Elnaggar - https://github.com/mohamedsaeed21
- Forum Scraper: An extension to extract comments from German online news services: https://github.com/rat-software/forum-scraper. Developed by Paul Kirch - https://github.com/g1thub-4cc0unt
- EI_Logger_BA: A browser extension for conducting interactive information retrieval studies. With this extension, study participants can work on search tasks with search engines of their choice and both the search queries and the clicks on search results are saved: https://github.com/rat-extensions/EI_Logger_BA. Developed by Hossam Al Mustafa - https://github.com/Samustafa
- Identifying affiliate links in webpages: https://github.com/rat-extensions/Identifying-affiliate-links-in-webpages. Developed by Philipp Krueger - https://github.com/PhilippUDE
- App Reviews Scraper: https://github.com/rat-extensions/app-reviews-scraper. Developed by Tanveer Ahmed - https://github.com/PhilippUDE
- Visualizations of IR measures: https://github.com/rat-extensions/ir-evaluation. Developed by Ritu Suhas Shetkar - https://github.com/ritushetkar
- Scraping News Articles: https://github.com/rat-extensions/NewsArticlesScraper. Developed by Esther von der Weiden - https://github.com/EstherKuerbis/
The source code consists of two individual applications:
- Web interface (frontend)
- Server backend (backend)
RAT runs on Python with a PostgreSQL database; the web interface is a Flask app. You can install both applications on one server or split them across machines to share the workload, e.g. two backend instances for scraping on one server and the Flask app on another.
To set up your own version of RAT, you need to clone the repository and follow these steps:
- Download and install PostgreSQL
- Import the database:

```shell
createdb -T template0 dbname
psql dbname < install_database/rat-db-install.sql
```

- Install Python
- Create a virtual environment and activate it:

```shell
python -m venv venv_rat
source venv_rat/bin/activate
```

- Install the Python packages from the `requirements.txt` in the root folder:

```shell
python -m pip install --no-cache-dir -r requirements.txt
```
Access the documentation for the frontend at: https://searchstudies.org/rat-frontend-documentation/
- Create a virtual environment and activate it:

```shell
python -m venv venv_rat_frontend
source venv_rat_frontend/bin/activate
```

- Install the Python packages from `/frontend/requirements.txt`:

```shell
python -m pip install --no-cache-dir -r requirements.txt
```
- Add your own data to the config file `config.py`:
| Setting | Example |
|---|---|
| SQLALCHEMY_DATABASE_URI | 'postgresql://USERNAME:PASSWORD@SERVER/DBNAME' |
| SECRET_KEY | How to generate |
| SECURITY_PASSWORD_SALT | How to generate |
| MAIL_SERVER | server.domain.de |
| MAIL_USERNAME | name@mail.de |
| MAIL_PASSWORD | password |
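As an illustration, the settings in the table above could be collected in a minimal `config.py` sketch. The variable names follow the table; the placeholder values are assumptions, and generating the two secret values with Python's `secrets` module is one common approach, not necessarily the project's own:

```python
# Hypothetical sketch of config.py based on the settings table above;
# replace the placeholder values with your own data.
import secrets

SQLALCHEMY_DATABASE_URI = "postgresql://USERNAME:PASSWORD@SERVER/DBNAME"

# The secret values can be generated once (e.g. with the secrets module)
# and then stored as fixed strings.
SECRET_KEY = secrets.token_hex(32)
SECURITY_PASSWORD_SALT = secrets.token_hex(16)

MAIL_SERVER = "server.domain.de"
MAIL_USERNAME = "name@mail.de"
MAIL_PASSWORD = "password"
```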
- Google Mail no longer allows third-party apps to send mail; if you have no other mail address available, you can use Mailtrap.
- Start Flask:

```shell
export FLASK_APP=rat.py
flask run
```
Access the documentation for the backend at: https://searchstudies.org/rat-backend-documentation/
- Install Google Chrome: ensure that Google Chrome is installed on your system.
- Copy the backend files: transfer all files from the `backend` directory to your server.
- Set up a virtual environment: it is highly recommended to run the backend in a virtual environment. Create and activate it with:

```shell
python -m venv venv_rat_backend
source venv_rat_backend/bin/activate
```

- Install dependencies: install the required packages from the `requirements.txt` file located in the `backend` directory:

```shell
python -m pip install --no-cache-dir -r requirements.txt
```

- Initialize SeleniumBase: run the script `initialize_seleniumbase.py` to download the latest WebDriver:

```shell
python initialize_seleniumbase.py
```
The RAT backend application consists of three sub-applications, which can be installed separately for better resource management. However, installing all sub-applications on one server is generally recommended.
- classifier: A toolkit for using and adding classifiers based on data provided by RAT.
- scraper: A library for scraping search engines.
- sources: A library for scraping content from URLs.
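To give a rough idea of what the `sources` sub-application does, here is a purely illustrative sketch that extracts the title from an HTML document using only the standard library. It is not the actual library's API; the real `sources` code scrapes full content from URLs and is far more capable:

```python
# Illustrative only: pull the <title> out of an HTML document with the
# standard-library HTML parser. The real RAT sources library does much more.
from html.parser import HTMLParser

class TitleParser(HTMLParser):
    """Collect the text inside the first <title> element."""

    def __init__(self):
        super().__init__()
        self.in_title = False
        self.title = ""

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self.in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self.in_title = False

    def handle_data(self, data):
        if self.in_title:
            self.title += data

parser = TitleParser()
parser.feed("<html><head><title>Example result</title></head><body></body></html>")
print(parser.title)  # → Example result
```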
All applications share the /config/ folder, which contains the files for configuring:
- Database connection: `config_db.ini`
- Scraping options: `config_sources.ini`
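The INI files can be read with Python's built-in `configparser`. The section and key names below are illustrative assumptions, not the actual schema of `config_db.ini`:

```python
# Sketch: reading an INI configuration file with the standard library.
# Section/key names are hypothetical; a real call would be
# config.read("config/config_db.ini").
import configparser

sample = """
[db]
host = localhost
port = 5432
dbname = rat
user = rat_user
password = secret
"""

config = configparser.ConfigParser()
config.read_string(sample)

db = config["db"]
print(db["dbname"], db.getint("port"))  # → rat 5432
```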
- The backend applications use `apscheduler` to run in the background. To start all services simultaneously, use:

```shell
nohup python backend_controller_start.py &
```

- Alternatively, each application has its own controller if you prefer to run them separately on different machines.
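The alternative of starting each controller separately could be scripted along these lines; the controller script names below are placeholders for illustration, not the repository's actual file names:

```python
# Sketch: launching each backend sub-application as its own background
# process instead of using backend_controller_start.py. The script names
# are hypothetical placeholders.
import subprocess

CONTROLLERS = [
    "classifier_controller_start.py",
    "scraper_controller_start.py",
    "sources_controller_start.py",
]

def build_commands():
    """Assemble the launch command for every controller."""
    return [["python", script] for script in CONTROLLERS]

def start_all():
    """Spawn each controller as a detached background process."""
    return [subprocess.Popen(cmd) for cmd in build_commands()]

# Inspect the commands without actually starting anything:
for cmd in build_commands():
    print(" ".join(cmd))
```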