Autocrisp is a highly accessible data mining tool that lets users clean and analyze datasets and run predictive ML models without prior coding knowledge. Its AI agents leverage Large Language Models (LLMs) for language understanding and generation. The system combines several frameworks and tools so that users can build domain knowledge through meta-analyses and identify relevant datasets during the business understanding phase. In the subsequent data understanding phase, users can perform exploratory analyses on datasets intuitively, using natural language. The AI agents then handle data extraction and processing, including the training and evaluation of machine learning models, to enable predictive analysis at scale. A human-in-the-loop approach is employed: generated code is reviewed before it is executed. The core technology stack consists of:
- N8N: Workflow automation for domain exploration and dataset discovery.
- AG2 (Autogen): Agentic framework to automate the data mining process.
- Streamlit: User interface for interacting with the agents.
- Supabase: Cloud data storage, agent memory, and encrypted user authentication (see the sketch below).
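As a rough illustration of how the Supabase layer might be wired up with the official supabase-py client, consider the sketch below. The table name, column names, and environment variables are hypothetical placeholders, not Autocrisp's actual schema.

```python
# Minimal sketch of a Supabase persistence layer, assuming the supabase-py client.
# Table, column, and environment-variable names are illustrative only.
import os
from supabase import create_client, Client

supabase: Client = create_client(
    os.environ["SUPABASE_URL"],       # project URL from the Supabase dashboard
    os.environ["SUPABASE_ANON_KEY"],  # public anon key; auth is handled per user
)

def sign_in(email: str, password: str):
    """Authenticate a user against Supabase's built-in auth."""
    return supabase.auth.sign_in_with_password({"email": email, "password": password})

def save_agent_memory(session_id: str, role: str, content: str):
    """Append one conversation turn to a hypothetical 'agent_memory' table."""
    return supabase.table("agent_memory").insert(
        {"session_id": session_id, "role": role, "content": content}
    ).execute()

def load_agent_memory(session_id: str):
    """Load the stored conversation history for a session."""
    return (
        supabase.table("agent_memory")
        .select("*")
        .eq("session_id", session_id)
        .order("created_at")
        .execute()
    )
```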
Autocrisp is built on the principles of CRISP-DM (Cross-Industry Standard Process for Data Mining), an open standard process model that describes the common approaches used by data mining experts. It is the most widely used analytics process model.
Autocrisp uses N8N for the business understanding phase: the user first interacts with an orchestration agent that can answer direct questions or brainstorm ideas. The orchestration agent may delegate research tasks to a set of specialized agents that automate comprehensive literature research, identify relevant datasets through targeted web searches, and query datasets using natural-language-to-SQL.
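One plausible way such an N8N orchestration workflow can be triggered from application code is through an N8N webhook node. The endpoint URL, payload fields, and response shape below are assumptions for illustration, not Autocrisp's actual interface.

```python
# Hedged sketch: calling an N8N orchestration workflow via its webhook node.
# URL, payload fields, and response shape are assumptions for illustration.
import requests

N8N_WEBHOOK_URL = "https://n8n.example.com/webhook/orchestration-agent"  # hypothetical

def ask_orchestration_agent(question: str, session_id: str) -> str:
    """Send a user question to the orchestration workflow and return its answer."""
    response = requests.post(
        N8N_WEBHOOK_URL,
        json={"question": question, "sessionId": session_id},
        timeout=120,
    )
    response.raise_for_status()
    # The workflow is assumed to respond with a JSON body like {"answer": "..."}.
    return response.json()["answer"]

if __name__ == "__main__":
    print(ask_orchestration_agent(
        "Which public datasets cover hospital readmission rates?", "demo-session"
    ))
```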
The remaining CRISP-DM phases (data understanding, data preparation, modeling, and evaluation) are supported by agents built on the AG2 (Autogen) framework. These agents handle tasks such as the following (a minimal AG2 sketch follows the list):
- Exploratory data analysis
- Data cleaning
- Model training
- Model evaluation
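A minimal sketch of what one of these agent pairs could look like, assuming AG2's AssistantAgent/UserProxyAgent pattern. The model choice, file name, and prompt are placeholders; `human_input_mode="ALWAYS"` reflects the human-in-the-loop code review described above, since it asks for human feedback before the proxy replies and before generated code runs.

```python
# Minimal AG2 (Autogen) sketch of a data-analysis agent pair with human-in-the-loop
# code review. Model, file name, and prompt are placeholders, not Autocrisp's config.
import os
from autogen import AssistantAgent, UserProxyAgent

llm_config = {
    "config_list": [{"model": "gpt-4o", "api_key": os.environ["OPENAI_API_KEY"]}],
}

# The assistant proposes Python code for EDA, cleaning, training, or evaluation.
analyst = AssistantAgent(
    name="data_analyst",
    llm_config=llm_config,
    system_message="Write Python code to explore, clean, and model the given dataset.",
)

# The user proxy executes proposed code locally, but human_input_mode="ALWAYS"
# requests human feedback before each reply, so code is reviewed before it runs.
user_proxy = UserProxyAgent(
    name="user",
    human_input_mode="ALWAYS",
    code_execution_config={"work_dir": "workspace", "use_docker": False},
)

user_proxy.initiate_chat(
    analyst,
    message="Load data.csv, report missing values, and train a baseline classifier.",
)
```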
All components are integrated into a Streamlit-based application, providing users with an interactive way to engage with the agents.
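As a rough sketch of how such a Streamlit front end could relay user messages to the agents, consider the chat loop below. The `run_agents` helper is a hypothetical stand-in for the AG2/N8N calls shown earlier, not part of Autocrisp's actual code.

```python
# Hedged sketch of a Streamlit chat front end for the agents.
# run_agents() is a hypothetical placeholder for the AG2/N8N calls shown earlier.
import streamlit as st

def run_agents(prompt: str) -> str:
    """Placeholder: route the prompt to the orchestration or AG2 agents."""
    return f"(agent response to: {prompt})"

st.title("Autocrisp")

# Keep the chat history across Streamlit reruns.
if "messages" not in st.session_state:
    st.session_state.messages = []

for msg in st.session_state.messages:
    with st.chat_message(msg["role"]):
        st.markdown(msg["content"])

if prompt := st.chat_input("Ask the agents about your data..."):
    st.session_state.messages.append({"role": "user", "content": prompt})
    with st.chat_message("user"):
        st.markdown(prompt)

    answer = run_agents(prompt)
    st.session_state.messages.append({"role": "assistant", "content": answer})
    with st.chat_message("assistant"):
        st.markdown(answer)
```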