This repository was archived by the owner on Apr 19, 2019. It is now read-only.

MathWebSearch/legacy-tema-docker

TEMA Docker

This project contains the configuration for setting up a Docker instance of the Text and Math (TEMA) search system. The basic setup provides search for an MMT archive, but it should be straightforward to apply it to other compatible archives by changing the configuration files.

To install and run this version directly, see Installation. To learn how to configure TEMA search for your own archive, see the detailed documentation in Setup Details below. The Dockerfile and the start-tema start-up script contain the actual shell commands that correspond to the documentation.

Installation

After installing Docker (see https://docs.docker.com/installation/), you can get the fully set-up version from Docker Hub by running:
docker pull kwarc/tema_search

Then you can start the web server, which will be reachable on localhost:9999 (host port 9999 is mapped to port 8888 inside the container), with:
docker run -d -p 9999:8888 kwarc/tema_search start-tema
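If port 9999 is already taken on your machine, the host side of the port mapping can be changed. The following is a minimal sketch of a wrapper around the run command above; the function names and the DRY_RUN variable are illustrative and not part of this project, while the image name, the container port 8888 and the start-tema command are taken from the commands above.

```shell
#!/bin/sh
# Sketch only: tema_run_cmd, start_tema and DRY_RUN are hypothetical names.
IMAGE="kwarc/tema_search"
CONTAINER_PORT=8888   # port the web server listens on inside the container

# Print the docker command for a given host port (defaults to 9999).
tema_run_cmd() {
    host_port="${1:-9999}"
    echo "docker run -d -p ${host_port}:${CONTAINER_PORT} ${IMAGE} start-tema"
}

# Start the container, or only print the command when DRY_RUN is set.
start_tema() {
    cmd="$(tema_run_cmd "${1:-}")"
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$cmd"
    else
        # Intentionally unquoted so the string is split into command and arguments.
        $cmd
    fi
}
```

For example, calling start_tema 8080 would serve the frontend on localhost:8080 instead, and running it with DRY_RUN=1 set only prints the command.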

For internal configuration or access (e.g. a bash shell inside the container), you can run:
docker run -t -i kwarc/tema_search /bin/bash

Don't forget to run docker commit afterwards to save the state; changes made inside a container are not written back to the image otherwise. For more information on how to configure it, see the documentation below.
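The configure-then-commit workflow can be sketched roughly as follows. The container name tema-config and the image tag configured are assumptions made for illustration, as are the helper functions and the DRY_RUN variable; only the image name and the two docker run invocations come from the text above.

```shell
#!/bin/sh
# Sketch only: the names below are hypothetical and can be chosen freely.
CONTAINER_NAME="tema-config"
CONFIGURED_IMAGE="kwarc/tema_search:configured"

# Run a command, or only print it when DRY_RUN is set.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

configure_and_save() {
    # 1. Open an interactive shell in a named container; make your changes, then exit.
    run docker run -t -i --name "$CONTAINER_NAME" kwarc/tema_search /bin/bash &&
    # 2. Snapshot the stopped container as a new image.
    run docker commit "$CONTAINER_NAME" "$CONFIGURED_IMAGE" &&
    # 3. Serve from the saved image instead of the original one.
    run docker run -d -p 9999:8888 "$CONFIGURED_IMAGE" start-tema
}
```

Naming the container up front avoids having to look up its generated ID with docker ps -a before committing.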

Setup Details

Starting from a bare-bones Linux installation (this instance uses Debian 8.1, jessie), the setup consists of the following steps:

  1. Installing dependencies

     1. Installing plain MWS and miscellaneous dependencies, mainly g++, cmake and a number of libraries. This basically follows the documentation in the MWS readme.

     2. Installing dependencies for the TEMA frontend: elasticsearch (for indexing and searching the text), php (for serving the frontend) and curl (for setting up two proxies, one to the query processor (e.g. LaTeXML) and one to mwsd, to avoid issues with e.g. blocked ports).

     3. Installing npm for yet another proxy (to elasticsearch, I think) that uses nodejs instead of php.

  2. Cloning and installing MWS and friends: mws, mws-frontend, and tema-proxy.

  3. Cloning the relevant MMT archive (the one for which you want to set up the search engine). The current scripts assume that the MathML-enriched HTML is in export/planetary/narration/ and the TEMA config in lib/tema-config.json. Additionally, you can (in fact, need to) provide a query processor that converts math text queries from the input syntax to MathML, then provide a JavaScript module in the MWS frontend that posts to your converter, and enable it. The default query processor is based on LaTeXML and handles TeX-style input. The parametrization of query processors should probably be improved in the MWS frontend project. This is the part you need to change to make a different instance.

  4. Setting up generated content

     1. Generate the MWS harvests (via mws/bin/docs2harvest).

     2. Generate the MWS index (via mws/bin/mws-index).

     3. Generate the elasticsearch annotated documents (via mws/bin/harvests2json).

  5. Starting everything up (see the separate start-up script start-tema)

     1. Start the elasticsearch service.

     2. Load the generated index (the annotated documents) into elasticsearch, using the scripts run-setup and then run-bulk from mws/scripts/elasticsearch/.

     3. Start the MWS daemon to serve the index.

     4. Start tema-proxy with nodejs.

     5. Serve mws-frontend (php will take care of the other two proxies).
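Because the scripts assume a fixed archive layout and a built MWS checkout, it can help to verify both before generating any content. The sketch below is a hypothetical preflight check, not part of this project: the function names are made up for illustration, while the paths are the ones named in the list above. It only tests that the paths exist and does not invoke any of the MWS tools.

```shell
#!/bin/sh
# Sketch only: check_archive_layout and check_mws_tools are hypothetical helpers.

# Verify that an MMT archive has the layout the current scripts assume.
check_archive_layout() {
    archive_dir="$1"
    status=0
    if [ ! -d "$archive_dir/export/planetary/narration" ]; then
        echo "missing directory: $archive_dir/export/planetary/narration/" >&2
        status=1
    fi
    if [ ! -f "$archive_dir/lib/tema-config.json" ]; then
        echo "missing file: $archive_dir/lib/tema-config.json" >&2
        status=1
    fi
    return "$status"
}

# Verify that the MWS tools and elasticsearch scripts named above are present.
check_mws_tools() {
    mws_dir="$1"
    status=0
    for tool in bin/docs2harvest bin/mws-index bin/harvests2json \
                scripts/elasticsearch/run-setup scripts/elasticsearch/run-bulk; do
        if [ ! -e "$mws_dir/$tool" ]; then
            echo "missing: $mws_dir/$tool" >&2
            status=1
        fi
    done
    return "$status"
}
```

A call such as check_archive_layout /path/to/archive && check_mws_tools /path/to/mws (both paths are placeholders) returns a non-zero status and lists what is missing if either check fails.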

About

Legacy Docker for Temasearch