This is code for studying quantitative syntax using dependency corpora.
It is written for Python 3.5, and has been tested to work in Python 2.7.
pip install -r requirements.txt for basic functionality.
Additionally, pip install -r optrequirements.txt for optional dependencies used for parallelization and visualization.
Supposing you have your Universal Dependencies treebanks, as downloaded from the UD website, in a directory $UD_DIR. Copy the file process_ud.sh into $UD_DIR and run it with sh process_ud.sh. This will rename the directories into the format that cliqs expects. Then modify the path in cliqs/corpora.py to reflect your UD path.
The list of langs can be found at corpora.ud_langs.
To compare dependency length in some languages to random and minimal baselines, run:
python run_mindep.py run lang1 lang2 ... langn > result_raw.csv.
Then postprocess the resulting csv:
python run_mindep.py postprocess result_raw.csv > result.csv.
Then you can run the various R scripts starting in mindep_ to analyze the results and generate figures.