Distributed Data Science Colony #112

@mduske

Colony Hackathon Submission

Project Title
Distributed Data Science Colony


Project Description
A decentralized way to organize Data Science teams around publicly available data sets (www.data.gov and other initiatives) and to reward their work accordingly. It covers the five basic phases of a data science project: Question, Exploratory Data Analysis, Formal Modeling, Interpretation, and Communication.
A single colony can manage various Data Science Projects (defined as Domains).
Some of the issues tackled by the solution:

  1. Establishing clear goals (Question)
  2. Avoiding data dredging (trying everything under the Sun for correlation), since each data source is added as a task and requires approval
  3. Providing random seeds for reproducible results while preventing peers from selecting seeds that yield specific results when applying non-deterministic algorithms (e.g., K-Means Clustering)
  4. Documenting not only positive results but negative ones too. There is currently an enormous bias towards publishing only positive results, although negative ones are equally valid and useful. All research is public from the start, and so are the results
  5. Data is more widely available today than ever before, at unprecedented speed and openness. Only a fully distributed solution will allow geographically dispersed teams (mostly Civic Hackers) to coordinate work and contribute research.
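
To illustrate item 3, here is a minimal sketch of one way a seed could be fixed so that no peer picks it by hand. It is an assumption for illustration, not the project's actual mechanism. The seed is derived by hashing two public values, here a hypothetical `task_id` and a hypothetical `block_hash`. The helper names `derive_seed` and `initial_centroids` are likewise made up for this example.

```python
import hashlib
import random


def derive_seed(task_id: str, block_hash: str) -> int:
    """Derive a 32-bit seed from public values (both inputs are hypothetical).

    Anyone who knows the task identifier and the block hash can recompute
    the same seed, so the result is reproducible and not hand-picked.
    """
    digest = hashlib.sha256(f"{task_id}:{block_hash}".encode("utf-8")).digest()
    # Use the first 4 bytes so the seed fits ranges accepted by common libraries.
    return int.from_bytes(digest[:4], "big")


def initial_centroids(points, k, seed):
    """Pick k starting centroids for K-Means using a seeded, private RNG."""
    rng = random.Random(seed)
    return rng.sample(points, k)


if __name__ == "__main__":
    seed = derive_seed("task-42", "0xabc123")  # placeholder values
    points = [(float(i), float(i % 3)) for i in range(10)]
    print(seed, initial_centroids(points, 3, seed))
```

Because the seed depends only on public inputs, any reviewer can rerun the clustering and check that the reported starting centroids match.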

** Very Late submission, I know **
I only got a chance to start serious work on Friday the 22nd, but I plan on continuing the work and eventually publishing the fully operational solution. It uses Electron to allow browser-free use, relying only on the Blockchain client of choice.

Project Repository
https://github.com/mduske/colonyHackathon

Team Members and Contact info
https://github.com/mduske
https://twitter.com/MarkDuske
https://www.linkedin.com/in/markduske/
