This repository was archived by the owner on Nov 26, 2020. It is now read-only.
16 changes: 8 additions & 8 deletions index.html
@@ -52,15 +52,15 @@ <h2>
<li>SVM (regression &amp; classification)</li>
<li>K-Means </li>
<li>LDA </li>
</ol><p>Refer <a href="http://doc.madlib.net/v0.5/">MADlib User Docs</a> for MADlib's user documentation. Please note that PyMADlib as of now is only compatible with MADlib v0.5. You can obtain MADlib v0.5 from <a href="https://github.com/madlib/madlib/archive/v0.5.tar.gz">MADlib v0.5</a>. We might add support to more recent versions of MADlib depending on adoption rate. Please email me if you have a strong case for an upgrade.</p>
</ol><p>Refer <a href="https://madlib.apache.org/docs/v0.5/">MADlib User Docs</a> for MADlib's user documentation. Please note that PyMADlib as of now is only compatible with MADlib v0.5. You can obtain MADlib v0.5 from <a href="https://github.com/madlib/madlib/archive/v0.5.tar.gz">MADlib v0.5</a>. We might add support to more recent versions of MADlib depending on adoption rate. Please email me if you have a strong case for an upgrade.</p>

<h2>
<a name="dependencies" class="anchor" href="#dependencies"><span class="octicon octicon-link"></span></a>Dependencies</h2>

<ol>
<li>You'll need the python extension <em><strong>psycopg2</strong></em> to use PyMADlib.</li>
<li>If you have matplotlib installed, you'll see Matplotlib visualizations for Linear Regression demo.</li>
<li>If you have installed <a href="http://networkx.github.com/download.html">networkx</a>, you'll see a visualization of the k-means demo</li>
<li>If you have installed <a href="https://networkx.github.com/download.html">networkx</a>, you'll see a visualization of the k-means demo</li>
<li>
<a href="https://github.com/marcelcaraciolo/PyROC">PyROC</a> is included in the source of this distribution with permission from its developer. You'll see a visualization of the ROC curves for Logistic Regression.</li>
</ol><h2>
@@ -95,7 +95,7 @@ <h2>
<a name="build-environment-setup-on-mac-os-x-108" class="anchor" href="#build-environment-setup-on-mac-os-x-108"><span class="octicon octicon-link"></span></a>Build Environment Setup on Mac OS X 10.8</h2>

<ul>
<li><p>Download &amp; install <a href="http://repo.continuum.io/archive/Anaconda-1.9.0-MacOSX-x86_64.pkg">Anaconda-1.9.0-MacOSX-x86_64.pkg</a></p></li>
<li><p>Download &amp; install <a href="https://repo.continuum.io/archive/Anaconda-1.9.0-MacOSX-x86_64.pkg">Anaconda-1.9.0-MacOSX-x86_64.pkg</a></p></li>
<li><p>Open a terminal and check if you have Anaconda Python &amp; the package manager conda</p></li>
</ul><blockquote>
<pre><code>vatsan-mac$ which python
@@ -106,7 +106,7 @@ <h2>
</blockquote>

<ul>
<li>If you haven't installed PostgreSQL on your Mac already, you'll have to download &amp; install <code>PostGreSQL</code> for Mac. This is so that we get some required libraries to compile the SQL Engine: psycopg2. The easiest way to install <code>PostGreSQL</code> on Mac is via <code>http://postgresapp.com/</code>. Once you've downloaded and installed PostGreSQL on Mac, it should typically be found under <code>/Library/PostgreSQL</code>
<li>If you haven't installed PostgreSQL on your Mac already, you'll have to download &amp; install <code>PostGreSQL</code> for Mac. This is so that we get some required libraries to compile the SQL Engine: psycopg2. The easiest way to install <code>PostGreSQL</code> on Mac is via <code>https://postgresapp.com/</code>. Once you've downloaded and installed PostGreSQL on Mac, it should typically be found under <code>/Library/PostgreSQL</code>
</li>
</ul><blockquote>
<pre><code>vatsan-mac$ ls /Library/PostgreSQL/9.2/
@@ -162,7 +162,7 @@ <h2>
<h2>
<a name="usage-tutorial" class="anchor" href="#usage-tutorial"><span class="octicon octicon-link"></span></a>Usage Tutorial</h2>

<iframe src="http://nbviewer.ipython.org/gist/vatsan/dd88abb47c2fbd9e16bd" height=2000px width=600px></iframe>
<iframe src="https://nbviewer.ipython.org/gist/vatsan/dd88abb47c2fbd9e16bd" height=2000px width=600px></iframe>
Also visit <a href="https://gist.github.com/vatsan/dd88abb47c2fbd9e16bd">PyMADlib IPython NB</a> to download the IPython NB tutorial</p>

<h2>
@@ -202,9 +202,9 @@ <h2>
<p>PyMADlib packages publicly available datasets from the UCI machine learning repository and other sources.</p>

<ol>
<li><a href="http://archive.ics.uci.edu/ml/datasets/Wine+Quality">Wine quality dataset from UCI Machine Learning repository</a></li>
<li><a href="http://archive.ics.uci.edu/ml/datasets/Auto+MPG">Auto MPG dataset from UCI ML repository from UCI Machine Learning repository</a></li>
<li><a href="http://archive.ics.uci.edu/ml/datasets/Wine+Quality">Wine quality dataset from UCI Machine Learning repository</a></li>
<li><a href="https://archive.ics.uci.edu/ml/datasets/Wine+Quality">Wine quality dataset from UCI Machine Learning repository</a></li>
<li><a href="https://archive.ics.uci.edu/ml/datasets/Auto+MPG">Auto MPG dataset from UCI ML repository from UCI Machine Learning repository</a></li>
<li><a href="https://archive.ics.uci.edu/ml/datasets/Wine+Quality">Wine quality dataset from UCI Machine Learning repository</a></li>
<li>Obama-Romney second presidential debate (2012) transcripts</li>
</ol><h2>
<a name="questions" class="anchor" href="#questions"><span class="octicon octicon-link"></span></a>Questions</h2>
2 changes: 1 addition & 1 deletion params.json
@@ -1 +1 @@
{"name":"PyMADlib","tagline":"A Python wrapper for MADlib - an open source library for scalable in-database machine learning algorithms","body":"A Python wrapper for MADlib - an open source library for scalable in-database machine learning algorithms\r\n\r\n## Algorithms\r\n\r\nPyMADlib currently has wrappers for the following algorithms in MADlib (version 0.5).\r\n\r\n1. Linear regression\r\n1. Logistic Regression\r\n1. SVM (regression & classification)\r\n1. K-Means \r\n1. LDA \r\n\r\nRefer [MADlib User Docs](http://doc.madlib.net/v0.5/ ) for MADlib's user documentation. Please note that PyMADlib as of now is only compatible with MADlib v0.5. You can obtain MADlib v0.5 from [MADlib v0.5](https://github.com/madlib/madlib/archive/v0.5.tar.gz). We might add support to more recent versions of MADlib depending on adoption rate. Please email me if you have a strong case for an upgrade.\r\n\r\n\r\n## Dependencies\r\n\r\n1. You'll need the python extension _**psycopg2**_ to use PyMADlib.\r\n1. If you have matplotlib installed, you'll see Matplotlib visualizations for Linear Regression demo.\r\n1. If you have installed [networkx](http://networkx.github.com/download.html), you'll see a visualization of the k-means demo\r\n1. [PyROC](https://github.com/marcelcaraciolo/PyROC) is included in the source of this distribution with permission from its developer. You'll see a visualization of the ROC curves for Logistic Regression.\r\n\r\n\r\n\r\n \r\n\r\n## Configurations\r\n\r\nTo configure your DB Connection parameters\r\nYou should create a file in your home directory\r\n\r\n> ~/.pymadlib.config \r\n\r\nthat should look like so :\r\n\r\n\r\n> [db_connection] \r\n> user = gpadmin \r\n> password = XXXXX \r\n> hostname = 127.0.0.1 (or the IP of your DB server) \r\n> port = 5432 (the port# of your DB) \r\n> database = vatsandb (the database you wish to connect to) \r\n\r\n\r\n\r\n\r\n\r\n\r\n## Installation Instructions\r\n\r\nPyMADlib depends on `MADlib`, `psycopg2` and `Pandas`. 
It is easiest to work with PyMADlib if you have `Anaconda Python`.\r\n\r\n## Build Environment Setup on Mac OS X 10.8\r\n\r\n* Download & install [Anaconda-1.9.0-MacOSX-x86_64.pkg] (http://repo.continuum.io/archive/Anaconda-1.9.0-MacOSX-x86_64.pkg)\r\n\r\n* Open a terminal and check if you have Anaconda Python & the package manager conda\r\n\r\n> vatsan-mac$ which python\r\n> /Users/vatsan/anaconda/bin/python\r\n> vatsan-mac$ which conda\r\n> /Users/vatsan/anaconda/bin/conda \r\n\r\n* If you haven't installed PostgreSQL on your Mac already, you'll have to download & install `PostGreSQL` for Mac. This is so that we get some required libraries to compile the SQL Engine: psycopg2. The easiest way to install `PostGreSQL` on Mac is via `http://postgresapp.com/`. Once you've downloaded and installed PostGreSQL on Mac, it should typically be found under `/Library/PostgreSQL`\r\n\r\n> vatsan-mac$ ls /Library/PostgreSQL/9.2/\r\n> Library include pg_env.sh uninstall-postgresql.app\r\n> bin installer scripts\r\n> data lib share\r\n> doc pgAdmin3.app stackbuilder.app\r\nI don't think the version of the `PostGreSQL` matters (9.1 or above is fine). 
\r\n\r\n* You may need to create some symlinks to `libpq` & `libssl` so that `psycopg2` is able to find it:\r\n\r\n> vatsan-mac$ sudo ln -s /Users/vatsan/anaconda/lib/libssl.1.0.0.dylib /usr/lib\r\n> vatsan-mac$ sudo ln -s /Users/vatsan/anaconda/lib/libcrypto.1.0.0.dylib /usr/lib\r\n\r\n* Install `Psycopg2` \r\n\r\n> vatsan-mac$ conda install distribute\r\n> vatsan-mac$ pip install psycopg2\r\n\r\n* Now we're ready to test if the installations of the required libraries were successful.\r\n\r\n> vatsan-mac$ python -c 'import psycopg2'\r\nIf the above command did not error out, then installation was successful.\r\n\r\n* You may install `PyMADlib` by downloading the source (from PyPI) and then run the following\r\n\r\n> sudo python setup.py build\r\n> sudo python setup.py install\r\n\r\n* If you use easy_install or pip, simply run :\r\n\r\n> sudo easy_install pymadlib\r\n\r\n\r\n## Usage Tutorial\r\n\r\nVisit [PyMADlib Tutorial](http://nbviewer.ipython.org/gist/vatsan/dd88abb47c2fbd9e16bd) for a tutorial on using PyMADlib\r\nAlso visit [PyMADlib IPython NB](https://gist.github.com/vatsan/dd88abb47c2fbd9e16bd) to download the IPython NB tutorial\r\n\r\n\r\n## Running the Demos\r\n\r\nYou may run the demo from the extracted directory of pymadlib like so :\r\n\r\n> python example.py\r\n\r\n \r\nIf you installed PyMADlib using instructions in the previous section, then simply run\r\n\r\n> python -c 'from pymadlib.example import runDemos; runDemos()'\r\n\r\nRemember to close the Matplotlib windows that pop-up to continue with the rest of the demo.\r\n\r\n\r\n\r\n\r\n## Gallery\r\n\r\n![K-Means Cluster Visualization](https://lh3.googleusercontent.com/-bXz3gCrnQFo/UTu3lXFKbeI/AAAAAAAAKgI/Hpjsqzb_GTQ/w776-h714-p-o-k/kmeans_networkx_viz.png)\r\n\r\n![Scatter Plot - Linear Regression (numeric attributes only)](https://lh3.googleusercontent.com/-esbS5NTl58E/UTu3lfBqUXI/AAAAAAAAKgE/tawiqnTgYLQ/w470-h353-o-k/linear_reg_scatter_1.png)\r\n\r\n![Scatter Plot - Linear Regression (with 
categorical attributes)](https://lh6.googleusercontent.com/-vNTw5Q6d0pg/UTu3lVjBIzI/AAAAAAAAKgA/pbiLfGiYisw/w470-h353-o-k/linear_reg_scatter_2.png)\r\n\r\n![ROC Curve - Logistic Regression](https://lh3.googleusercontent.com/-ymBoJ7qQo-o/UTu3l9RUBvI/AAAAAAAAKgU/_Mc0jiM_Yq0/w470-h353-o-k/logistic_reg_pyroc.png)\r\n\r\n![Random graph visualization - Networkx](https://lh6.googleusercontent.com/-H-3h0bV8EDQ/UTu3lyED9YI/AAAAAAAAKgY/CcoJ2oSme2M/s353-c-o-k/random_networkx_viz.png)\r\n\r\n \r\n\r\n\r\n## Datasets packaged with this installation\r\n\r\nPyMADlib packages publicly available datasets from the UCI machine learning repository and other sources.\r\n\r\n1. [Wine quality dataset from UCI Machine Learning repository](http://archive.ics.uci.edu/ml/datasets/Wine+Quality)\r\n1. [Auto MPG dataset from UCI ML repository from UCI Machine Learning repository](http://archive.ics.uci.edu/ml/datasets/Auto+MPG)\r\n1. [Wine quality dataset from UCI Machine Learning repository](http://archive.ics.uci.edu/ml/datasets/Wine+Quality)\r\n1. Obama-Romney second presidential debate (2012) transcripts\r\n\r\n\r\n\r\n\r\n## Questions\r\n\r\n<vatsan.cs@utexas.edu>\r\n","google":"UA-39168204-1","note":"Don't delete this file! It's used internally to help with page regeneration."}
{"name":"PyMADlib","tagline":"A Python wrapper for MADlib - an open source library for scalable in-database machine learning algorithms","body":"A Python wrapper for MADlib - an open source library for scalable in-database machine learning algorithms\r\n\r\n## Algorithms\r\n\r\nPyMADlib currently has wrappers for the following algorithms in MADlib (version 0.5).\r\n\r\n1. Linear regression\r\n1. Logistic Regression\r\n1. SVM (regression & classification)\r\n1. K-Means \r\n1. LDA \r\n\r\nRefer [MADlib User Docs](https://madlib.apache.org/docs/v0.5/ ) for MADlib's user documentation. Please note that PyMADlib as of now is only compatible with MADlib v0.5. You can obtain MADlib v0.5 from [MADlib v0.5](https://github.com/madlib/madlib/archive/v0.5.tar.gz). We might add support to more recent versions of MADlib depending on adoption rate. Please email me if you have a strong case for an upgrade.\r\n\r\n\r\n## Dependencies\r\n\r\n1. You'll need the python extension _**psycopg2**_ to use PyMADlib.\r\n1. If you have matplotlib installed, you'll see Matplotlib visualizations for Linear Regression demo.\r\n1. If you have installed [networkx](https://networkx.github.com/download.html), you'll see a visualization of the k-means demo\r\n1. [PyROC](https://github.com/marcelcaraciolo/PyROC) is included in the source of this distribution with permission from its developer. 
You'll see a visualization of the ROC curves for Logistic Regression.\r\n\r\n\r\n\r\n \r\n\r\n## Configurations\r\n\r\nTo configure your DB Connection parameters\r\nYou should create a file in your home directory\r\n\r\n> ~/.pymadlib.config \r\n\r\nthat should look like so :\r\n\r\n\r\n> [db_connection] \r\n> user = gpadmin \r\n> password = XXXXX \r\n> hostname = 127.0.0.1 (or the IP of your DB server) \r\n> port = 5432 (the port# of your DB) \r\n> database = vatsandb (the database you wish to connect to) \r\n\r\n\r\n\r\n\r\n\r\n\r\n## Installation Instructions\r\n\r\nPyMADlib depends on `MADlib`, `psycopg2` and `Pandas`. It is easiest to work with PyMADlib if you have `Anaconda Python`.\r\n\r\n## Build Environment Setup on Mac OS X 10.8\r\n\r\n* Download & install [Anaconda-1.9.0-MacOSX-x86_64.pkg] (https://repo.continuum.io/archive/Anaconda-1.9.0-MacOSX-x86_64.pkg)\r\n\r\n* Open a terminal and check if you have Anaconda Python & the package manager conda\r\n\r\n> vatsan-mac$ which python\r\n> /Users/vatsan/anaconda/bin/python\r\n> vatsan-mac$ which conda\r\n> /Users/vatsan/anaconda/bin/conda \r\n\r\n* If you haven't installed PostgreSQL on your Mac already, you'll have to download & install `PostGreSQL` for Mac. This is so that we get some required libraries to compile the SQL Engine: psycopg2. The easiest way to install `PostGreSQL` on Mac is via `https://postgresapp.com/`. Once you've downloaded and installed PostGreSQL on Mac, it should typically be found under `/Library/PostgreSQL`\r\n\r\n> vatsan-mac$ ls /Library/PostgreSQL/9.2/\r\n> Library include pg_env.sh uninstall-postgresql.app\r\n> bin installer scripts\r\n> data lib share\r\n> doc pgAdmin3.app stackbuilder.app\r\nI don't think the version of the `PostGreSQL` matters (9.1 or above is fine). 
\r\n\r\n* You may need to create some symlinks to `libpq` & `libssl` so that `psycopg2` is able to find it:\r\n\r\n> vatsan-mac$ sudo ln -s /Users/vatsan/anaconda/lib/libssl.1.0.0.dylib /usr/lib\r\n> vatsan-mac$ sudo ln -s /Users/vatsan/anaconda/lib/libcrypto.1.0.0.dylib /usr/lib\r\n\r\n* Install `Psycopg2` \r\n\r\n> vatsan-mac$ conda install distribute\r\n> vatsan-mac$ pip install psycopg2\r\n\r\n* Now we're ready to test if the installations of the required libraries were successful.\r\n\r\n> vatsan-mac$ python -c 'import psycopg2'\r\nIf the above command did not error out, then installation was successful.\r\n\r\n* You may install `PyMADlib` by downloading the source (from PyPI) and then run the following\r\n\r\n> sudo python setup.py build\r\n> sudo python setup.py install\r\n\r\n* If you use easy_install or pip, simply run :\r\n\r\n> sudo easy_install pymadlib\r\n\r\n\r\n## Usage Tutorial\r\n\r\nVisit [PyMADlib Tutorial](https://nbviewer.ipython.org/gist/vatsan/dd88abb47c2fbd9e16bd) for a tutorial on using PyMADlib\r\nAlso visit [PyMADlib IPython NB](https://gist.github.com/vatsan/dd88abb47c2fbd9e16bd) to download the IPython NB tutorial\r\n\r\n\r\n## Running the Demos\r\n\r\nYou may run the demo from the extracted directory of pymadlib like so :\r\n\r\n> python example.py\r\n\r\n \r\nIf you installed PyMADlib using instructions in the previous section, then simply run\r\n\r\n> python -c 'from pymadlib.example import runDemos; runDemos()'\r\n\r\nRemember to close the Matplotlib windows that pop-up to continue with the rest of the demo.\r\n\r\n\r\n\r\n\r\n## Gallery\r\n\r\n![K-Means Cluster Visualization](https://lh3.googleusercontent.com/-bXz3gCrnQFo/UTu3lXFKbeI/AAAAAAAAKgI/Hpjsqzb_GTQ/w776-h714-p-o-k/kmeans_networkx_viz.png)\r\n\r\n![Scatter Plot - Linear Regression (numeric attributes only)](https://lh3.googleusercontent.com/-esbS5NTl58E/UTu3lfBqUXI/AAAAAAAAKgE/tawiqnTgYLQ/w470-h353-o-k/linear_reg_scatter_1.png)\r\n\r\n![Scatter Plot - Linear Regression 
(with categorical attributes)](https://lh6.googleusercontent.com/-vNTw5Q6d0pg/UTu3lVjBIzI/AAAAAAAAKgA/pbiLfGiYisw/w470-h353-o-k/linear_reg_scatter_2.png)\r\n\r\n![ROC Curve - Logistic Regression](https://lh3.googleusercontent.com/-ymBoJ7qQo-o/UTu3l9RUBvI/AAAAAAAAKgU/_Mc0jiM_Yq0/w470-h353-o-k/logistic_reg_pyroc.png)\r\n\r\n![Random graph visualization - Networkx](https://lh6.googleusercontent.com/-H-3h0bV8EDQ/UTu3lyED9YI/AAAAAAAAKgY/CcoJ2oSme2M/s353-c-o-k/random_networkx_viz.png)\r\n\r\n \r\n\r\n\r\n## Datasets packaged with this installation\r\n\r\nPyMADlib packages publicly available datasets from the UCI machine learning repository and other sources.\r\n\r\n1. [Wine quality dataset from UCI Machine Learning repository](https://archive.ics.uci.edu/ml/datasets/Wine+Quality)\r\n1. [Auto MPG dataset from UCI ML repository from UCI Machine Learning repository](https://archive.ics.uci.edu/ml/datasets/Auto+MPG)\r\n1. [Wine quality dataset from UCI Machine Learning repository](https://archive.ics.uci.edu/ml/datasets/Wine+Quality)\r\n1. Obama-Romney second presidential debate (2012) transcripts\r\n\r\n\r\n\r\n\r\n## Questions\r\n\r\n<vatsan.cs@utexas.edu>\r\n","google":"UA-39168204-1","note":"Don't delete this file! It's used internally to help with page regeneration."}
2 changes: 1 addition & 1 deletion stylesheets/normalize.css
@@ -1,4 +1,4 @@
/* normalize.css 2012-02-07T12:37 UTC - http://github.com/necolas/normalize.css */
/* normalize.css 2012-02-07T12:37 UTC - https://github.com/necolas/normalize.css */
/* =============================================================================
HTML5 display definitions
========================================================================== */
2 changes: 1 addition & 1 deletion stylesheets/styles.css
@@ -42,7 +42,7 @@ by Matt Graham
font-style: normal;
}

/* normalize.css 2012-02-07T12:37 UTC - http://github.com/necolas/normalize.css */
/* normalize.css 2012-02-07T12:37 UTC - https://github.com/necolas/normalize.css */
/* =============================================================================
HTML5 display definitions
========================================================================== */
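The hunks above all apply the same kind of edit: links on a handful of hosts move from `http://` to `https://`, and the MADlib docs link moves to a new host. As a rough illustration only (this is not the tool that produced the diff), the rewrite could be sketched in Python as below. The helper name `upgrade_links` and the two tables are assumptions made for this sketch; the hosts and URLs themselves are copied from the changed lines.

```python
import re

# Hosts whose links the diff switches from http to https
# (copied from the changed lines; the set itself is an assumption of this sketch).
HTTPS_HOSTS = {
    "networkx.github.com",
    "repo.continuum.io",
    "postgresapp.com",
    "nbviewer.ipython.org",
    "archive.ics.uci.edu",
    "github.com",
}

# Links that moved to a different host instead of only changing scheme.
MOVED = {
    "http://doc.madlib.net/v0.5/": "https://madlib.apache.org/docs/v0.5/",
}

# Matches "http://" followed by a host name; the path is left untouched.
URL_RE = re.compile(r"http://([A-Za-z0-9.-]+)")


def upgrade_links(text: str) -> str:
    """Hypothetical helper: apply the same link rewrites as the diff."""
    # Handle relocated links first so the scheme pass does not see them.
    for old, new in MOVED.items():
        text = text.replace(old, new)

    def _swap(match: re.Match) -> str:
        host = match.group(1)
        # Only upgrade hosts on the allow-list; leave everything else as-is.
        if host in HTTPS_HOSTS:
            return "https://" + host
        return match.group(0)

    return URL_RE.sub(_swap, text)
```

An allow-list is used here rather than a blanket `http://` to `https://` replacement, since a host that does not serve HTTPS would otherwise end up with a broken link.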