Distributed Hierarchical Clustering with Importance Sampling
These instructions will get you a copy of the project up and running on your local machine for development and testing purposes. See deployment for notes on how to deploy the project on a live system.
Hadoop & Java
A step-by-step series of examples that tell you how to get a development environment running.
src/main/java/HCdist/
-->HC.java
-->HCMapper.java
-->HCReducer.java
-->attributeMean.java
Arguments required for the main Java class:
HC <inputAttrPath> <inputDataPath> <outputPath> <NumOfInstances> <NumOfPartitions>
Example command for running the jar:
hadoop jar HC-Distributed.jar HC-Distributed.HC /attribute.arff /instances.arff /output 20 2
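The five positional arguments above could be validated before a job is submitted. The following is only a minimal sketch of that idea, not the project's actual driver code: the class name `HCArgs`, its fields, and the rule that both counts must be positive are assumptions made for illustration, and no Hadoop classes are used so the snippet runs on its own.

```java
// Hypothetical helper (not part of the repository): parses the five
// positional arguments documented for HC and fails fast on bad input.
public class HCArgs {
    public final String inputAttrPath;   // e.g. /attribute.arff
    public final String inputDataPath;   // e.g. /instances.arff
    public final String outputPath;      // e.g. /output
    public final int numOfInstances;     // e.g. 20
    public final int numOfPartitions;    // e.g. 2

    private HCArgs(String attr, String data, String out, int instances, int partitions) {
        this.inputAttrPath = attr;
        this.inputDataPath = data;
        this.outputPath = out;
        this.numOfInstances = instances;
        this.numOfPartitions = partitions;
    }

    public static HCArgs parse(String[] args) {
        if (args.length != 5) {
            throw new IllegalArgumentException(
                "Usage: HC <inputAttrPath> <inputDataPath> <outputPath> "
                + "<NumOfInstances> <NumOfPartitions>");
        }
        // Integer.parseInt throws NumberFormatException (a subclass of
        // IllegalArgumentException) when a count is not a valid integer.
        int instances = Integer.parseInt(args[3]);
        int partitions = Integer.parseInt(args[4]);
        // Assumption: both counts must be positive to be meaningful.
        if (instances <= 0 || partitions <= 0) {
            throw new IllegalArgumentException(
                "NumOfInstances and NumOfPartitions must be positive");
        }
        return new HCArgs(args[0], args[1], args[2], instances, partitions);
    }

    public static void main(String[] ignored) {
        // Demo using the values from the example command above.
        HCArgs a = parse(new String[] {
            "/attribute.arff", "/instances.arff", "/output", "20", "2"});
        System.out.println(a.numOfInstances + " instances, "
            + a.numOfPartitions + " partitions, output to " + a.outputPath);
    }
}
```

Checking the arguments up front lets a malformed command fail with a usage message before any cluster resources are requested.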
//TODO
Professor Haimonti Dutta
Akshat Sehgal
Mihir Chauhan