The API being developed as part of the Facilities Data Pipeline (a kind of IDS replacement)


Datastore-API

The Datastore API accepts requests for the archival or retrieval of experimental data. These trigger subsequent requests to create the corresponding metadata in ICAT, and schedule the transfer of the data using FTS3.
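In outline, an archival request fans out into a metadata request and a transfer request. The sketch below illustrates that flow only; all the function names are hypothetical and the real calls would go to ICAT and FTS3 rather than returning fabricated IDs:

```python
# Sketch of the fan-out described above; all names here are hypothetical.
def create_icat_metadata(dataset: dict) -> str:
    # Would create the corresponding metadata entities in ICAT;
    # here we just fabricate an identifier.
    return f"icat-{dataset['name']}"

def schedule_fts3_transfer(source: str, destination: str) -> str:
    # Would submit a transfer job to FTS3 and return its job ID.
    return f"fts3-job-for-{source}"

def archive(dataset: dict) -> dict:
    # One archival request triggers both downstream actions.
    metadata_id = create_icat_metadata(dataset)
    job_id = schedule_fts3_transfer(dataset["path"], "tape://archive")
    return {"metadata_id": metadata_id, "transfer_job": job_id}

print(archive({"name": "run42", "path": "/data/run42"}))
```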

Development

Environment setup

To develop the API, Python development tools will need to be installed. The exact command will vary; for example, on Rocky 9:

sudo yum install "@Development Tools" python3.11-devel python3.11 python3.11-setuptools openldap-devel swig gcc openssl-devel xrootd-client pipx

Configuration is handled via the config.yaml and logging.ini config files.

cp logging.ini.example logging.ini
cp config.yaml.example config.yaml

If installing on other distros (e.g. Rocky 8), pipx is not as easy to install, so it has to be installed separately; the pipx documentation goes over the steps. The @Development Tools command can be run as normal:

sudo yum install "@Development Tools" python3.11-devel python3.11 python3.11-setuptools openldap-devel swig gcc openssl-devel xrootd-client cmake3 libuv-1:1.41.1 libuuid-devel

In order to run against an external S3 storage instance (as opposed to using a Dockerised MinIO), you will also need to generate and set access and secret keys. This can be achieved using the STFC Cloud in combination with Echo:

  1. Generate credentials for the OpenStack CLI
  2. Generate credentials for Echo
  3. Set the access and secret keys, along with the URL of the S3 storage, in a valid settings location (such as config.yaml).
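
As illustration only, the resulting S3 settings might look like the following; the key names here are hypothetical, so consult config.yaml.example for the actual schema:

```yaml
# Hypothetical keys for illustration -- see config.yaml.example for the real schema.
s3:
  url: https://<s3-endpoint>
  access_key: <ACCESS_KEY>
  secret_key: <SECRET_KEY>
```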

In order to authenticate against the FTS storage endpoints, you will need an X509 certificate. This can be generated by request, and then set in the location expected by the API's default settings (hostcert.pem and hostkey.pem).

Note that in order to run the full suite of tests, these settings will need to be present. The S3 keys are not specified in pytest.ini, so Pydantic will attempt to load them from another location, such as config.yaml. If they are not set anywhere then validation will fail, and only tests which mock outgoing requests will be possible.
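As a sketch of what "mocking outgoing requests" means here (the function names below are hypothetical stand-ins, not the API's actual code), a test can patch the client function so no credentials or external services are needed:

```python
import sys
from unittest import mock

# Hypothetical stand-in for an outgoing call to S3/FTS/ICAT.
def upload_to_s3(bucket: str, key: str, data: bytes) -> str:
    raise RuntimeError("would contact external storage")

def archive(bucket: str, key: str, data: bytes) -> str:
    # Code under test: delegates the transfer to the client function.
    return upload_to_s3(bucket, key, data)

def test_archive_without_external_services():
    # Patch the outgoing call so the test runs with no S3 settings present.
    with mock.patch.object(
        sys.modules[__name__], "upload_to_s3", return_value="etag-123"
    ) as m:
        assert archive("bucket", "key", b"payload") == "etag-123"
        m.assert_called_once_with("bucket", "key", b"payload")

test_archive_without_external_services()
```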

Poetry

Poetry is used to manage the dependencies of this API. Note that, to prevent conflicts, Poetry should not be installed in the environment used for the project dependencies; several recommended installation methods are available. (Note that pipx may not install the latest version.)

The official documentation should be referred to for the management of dependencies, but to create a Python development environment:

pipx install poetry
poetry install

Nox

Nox is used to run tests and other tools in reproducible environments. As with Poetry, Nox is not a direct dependency of the project and so should be installed outside the Poetry-managed virtual environment. Nox can run sessions for the following tools:

  • black - this uses Black to format Python code to a pre-defined style.
  • lint - this uses flake8 with a number of additional plugins (see the included noxfile.py to see which plugins are used) to lint the code to keep it Pythonic. .flake8 configures flake8 and the plugins.
  • safety - this uses safety to check the dependencies (pulled directly from Poetry) for any known vulnerabilities. This session gives the output in a full ASCII style report.
  • tests - this uses pytest to execute the automated tests in tests/. Note that this will require a running ICAT (see below).
    • unit_tests - as above but only runs tests in tests/unit, which mock dependencies on other classes and external packages.
    • integration_tests - as above but only runs tests in tests/integration, which will not mock and therefore requires services such as ICAT and FTS to be running.

To install:

pipx install nox

To run one or more of the above sessions:

nox -s [SESSIONS ...]

Docker

A full ICAT installation with Payara can be used by following the standard installation tutorial. Alternatively, Docker can be used to create and manage isolated environments for the services needed to test the API.

To install Docker for the RHEL operating system from the rpm repository, run:

sudo yum install -y yum-utils
sudo yum-config-manager --add-repo https://download.docker.com/linux/rhel/docker-ce.repo

This will set up the repository and install the yum-utils package. To install the latest version of Docker, run:

sudo yum install docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose-plugin

(Other installation methods can be found in the official documentation).

Start the Docker daemon:

sudo systemctl start docker

To start the services on which the API depends:

sudo docker compose --profile=dependencies up

To run the API:

sudo docker compose run datastore-api

To run all code checks (note that these do not require the dependency containers to be running):

sudo docker compose --profile=checks up

To run the tests:

sudo docker compose run -v /path/to/config.yaml:/app/config.yaml tests

ICAT Setup

First we need to make sure the virtual environment is set up correctly. List Poetry's virtual environments to find its name:

ls ~/.cache/pypoetry/virtualenvs/

After getting the name of the directory (of the form datastore-api-XXXXXXX_-py3.11), activate the environment:

source ~/.cache/pypoetry/virtualenvs/datastore-api-XXXXXXX_-py3.11/bin/activate

The Docker commands above will create containers for ICAT and the underlying database, but not any data. Requests to the API implicitly assume that certain high-level entities exist, so these should be created:

icatingest.py -i datastore_api/scripts/metadata/epac/example.yaml -f YAML --duplicate IGNORE --url http://localhost:18080 --no-check-certificate --auth simple --user root --pass pw

If desired, the entities in example.yaml can be modified or extended following the python-icat documentation. There are other files which will create multiple entities, if needed.

To verify that the entities are created correctly, the usual methods are possible: inspecting the database (either from the command line within its container or via DB inspection software), or running commands against ICAT (via curl or a full stack including a frontend). For these use cases, however, the ICAT admin web app offers a simple and quick way of verifying that the entities were created. The full address and port (e.g. http://localhost:18080 if using Docker to forward the container port to the host machine as described above) are needed, along with the credentials used in the tests:

auth: simple
username: root
password: pw
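
As a quick check that these credentials work, a session can also be requested from ICAT's REST interface. The snippet below only builds the login payload, assuming ICAT's standard REST login format (the "json" form field carrying a plugin name and a list of single-entry credential objects); the POST itself would go to http://localhost:18080/icat/session:

```python
import json
import urllib.parse

# Login payload in ICAT's REST format: the "json" form field carries the
# plugin name and a list of single-entry credential objects.
payload = {
    "plugin": "simple",
    "credentials": [{"username": "root"}, {"password": "pw"}],
}
body = urllib.parse.urlencode({"json": json.dumps(payload)})

# A POST of `body` (content type application/x-www-form-urlencoded) to
# http://localhost:18080/icat/session would return a session ID on success.
print(body)
```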

Deployment

To run the API (while sourcing the virtual environment):

uvicorn --host=127.0.0.1 --port=8000 --log-config=logging.ini --reload datastore_api.main:app

To run from outside of the virtual environment, add poetry run to the beginning of the above command, changing the optional arguments as needed. Documentation can be found by navigating to /docs.
