This is the testing repository for https://github.com/jonasbovyn/comunica-crop (the CROP implementation is found in the `feat/crop` branch). To generate the WebAssembly binary, see the https://github.com/jonasbovyn/crop-cpp repository.

- The `datasets/` folder contains 3 subfolders, which must be filled with data generated using https://github.com/comunica/watdiv-docker.
  - For `datasets/dataset100k`, use `-s 1 -q 5` as command line arguments, which results in around 100k triples.
  - For `datasets/dataset1M`, use `-s 10 -q 5`, which results in around 1M triples.
  - For `datasets/dataset10M`, use `-s 100 -q 5`, which results in around 10M triples.
- Each folder `datasets/X/` should now have a subfolder `queries/` and a file `dataset.nt`.
- In addition, use the https://github.com/rdfhdt/hdt-cpp tool to generate `dataset.hdt` files from the `dataset.nt` files.
  - The command is `./rdf2hdt dataset.nt dataset.hdt`
- Run `yarn install` in the project directory.
- Create a soft link named `@comunica-crop/` in `node_modules/`, pointing to the custom Comunica repository.
  - On Linux/macOS, the command for this is `ln -s [source] [destination]`
  - Make sure Comunica is built and checked out on the right branch. Use the `mem-benchmark` branch when testing optimization memory usage, otherwise use the `feat/crop` branch.

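
Before running any tests, it can help to confirm that every dataset folder has the layout described above. The following is a minimal sketch of such a check, not part of this repository: the names `findMissing` and `REQUIRED` are hypothetical, and the script only assumes that each subfolder of `datasets/` should contain `queries/`, `dataset.nt` and `dataset.hdt`.

```javascript
const fs = require('fs');
const path = require('path');

// Entries every datasets/X/ folder should contain after the setup steps.
// (Hypothetical constant, introduced only for this sketch.)
const REQUIRED = [
  { name: 'queries', type: 'dir' },
  { name: 'dataset.nt', type: 'file' },
  { name: 'dataset.hdt', type: 'file' },
];

// Returns the relative paths of all missing entries.
// An empty array means every dataset folder is complete.
function findMissing(datasetsRoot) {
  const missing = [];
  // Inspect whatever subfolders exist, in a stable (sorted) order.
  const folders = fs.readdirSync(datasetsRoot, { withFileTypes: true })
    .filter((entry) => entry.isDirectory())
    .map((entry) => entry.name)
    .sort();
  for (const folder of folders) {
    for (const { name, type } of REQUIRED) {
      const full = path.join(datasetsRoot, folder, name);
      const stat = fs.existsSync(full) ? fs.statSync(full) : null;
      const ok = stat !== null &&
        (type === 'dir' ? stat.isDirectory() : stat.isFile());
      if (!ok) missing.push(path.posix.join(folder, name));
    }
  }
  return missing;
}
```

Calling `findMissing('datasets')` from the project directory returns an empty array when the layout is complete, and otherwise lists entries such as `dataset1M/dataset.hdt` that still need to be generated.
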
If a test requires a dataset to be deployed using `Server.js`, first run `node runServer.js`. The other test files (`benchmark<XXX>.js`) each have their own purpose and configuration; any additional command line arguments they require are specified in the file itself. Node v16.13.0 was used.
Additionally, the data can be analyzed in `results/`, which contains a Python notebook (Python v3.10 was used).