HDX JOB PROCESSOR, formerly known as "GISRestLayer" or "Gislayer"
This tool is evolving from a way to asynchronously run the Geopreview process into a more general tool for asynchronously executing tasks for HDX (CKAN). For now it handles just 2 tasks:
The tool is based on Redis Queue (RQ)
This manages the transformation of GIS data files ( geojson, kml, kmz, zipped shapefiles ) into a Postgres / PostGIS database. From there, the GIS data can be served by the PGRestAPI server ( aka Spatial Server ). The advantage is that PGRestAPI can serve:
- Mapbox Vector Tiles
- simplified geometries depending on the zoom level
- The user uploads a GIS data file to CKAN
- HDX JOB PROCESSOR is notified about the new file
- GISRestLayer then delegates all the import process tasks to Redis Queue ( and its workers ). The name of the queue is geo_q . The worker code is part of this repository in importapi.tasks.create_preview
- The worker downloads the data
- Uses ogr2ogr to import it into PostGIS
- After this is finished, PGRestAPI can already start serving tiles
- The worker notifies CKAN that the task is complete. If successful, CKAN can show the user a preview of the data
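The ogr2ogr step above can be sketched as follows. This is an illustrative helper, not the project's actual worker code: the connection string, table name, and flag selection are assumptions about a typical ogr2ogr-to-PostGIS import.

```python
import subprocess

def ogr2ogr_import_command(src_file, table_name, pg_conn="host=localhost dbname=gis"):
    """Build the ogr2ogr argument list that loads a GIS file
    (GeoJSON, KML, shapefile, ...) into a PostGIS table.
    Illustrative sketch; the real worker lives in importapi.tasks.create_preview."""
    return [
        "ogr2ogr",
        "-f", "PostgreSQL",      # output driver: PostgreSQL/PostGIS
        "PG:" + pg_conn,         # PostGIS connection string (assumed value)
        src_file,                # input file; ogr2ogr auto-detects the format
        "-nln", table_name,      # name of the target table
        "-overwrite",            # replace the table if it already exists
    ]

# Example: the command a worker might run for an uploaded GeoJSON file
# subprocess.run(ogr2ogr_import_command("upload.geojson", "dataset_123"), check=True)
```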
- We're using a slightly modified fork of PGRestAPI ( https://bitbucket.org/agartner/hdx-pgrestapi ) that allows serving new layers without restarting the server
The command below needs to be run in the folder containing the Python code:

./hdxrq.py worker --url redis://redis_ip:redis_port/1 --worker-ttl 600 geo_q

Please note that rqworker was replaced by ./hdxrq.py worker. More info can be found here
This manages the sending of data to analytics servers like Mixpanel or Google Analytics.
- CKAN sends all the event metadata to HDX JOB PROCESSOR
- HDX JOB PROCESSOR then creates tasks in Redis Queue, in a queue called analytics_q .
- Workers then send the information to analytics servers.
The worker code is part of this repository in:
analyticsapi.tasks.send_event
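The kind of payload such a worker forwards can be sketched as below. This is a hypothetical illustration only: the real task is analyticsapi.tasks.send_event, and the field names here are assumptions, not the actual event schema.

```python
import json

def build_analytics_event(event_name, resource_id, extra=None):
    """Assemble a hypothetical event payload that a worker could forward
    to an analytics server such as Mixpanel or Google Analytics.
    Field names are illustrative, not HDX's actual schema."""
    event = {
        "event": event_name,
        "properties": {"resource_id": resource_id},
    }
    if extra:
        event["properties"].update(extra)  # merge any additional metadata
    return json.dumps(event)

# Example payload for a (hypothetical) dataset-download event
payload = build_analytics_event("dataset-download", "abc-123")
```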
The command below needs to be run in the folder containing the Python code:

./hdxrq.py worker --url redis://redis_ip:redis_port/1 --worker-ttl 600 analytics_q

Please note that rqworker was replaced by ./hdxrq.py worker. More info can be found here
The "events" API is used for detecting changes in datasets and transforming them into events which are pushed to the event bus. Read more about this here
The HDX Job Processor has support for scheduling (though it's unused at the moment). To read more about it go here
To read more about how logging is configured, look here