GitHub - Solaris5959/SkySweep: A sweeping data acquisition program designed to present current hotspots of activity among publicly traded companies, conglomerates, and ETF members.

Purpose

A sweeping data acquisition program designed to present current hotspots of activity among publicly traded companies, conglomerates, and ETF members.

Tech Stack

Python
- Data Processing
  - PyArrow (In-memory Format)
  - PySpark (Data Processing)
  - Delta Lake (Durable Format [Parquet])
  - Hive (Query Layer)
  - Airflow (Orchestration)
- NetworkX (Graphs)
Scala
- Kafka (Real-time Streaming)
Monitoring
- Prometheus -> Grafana

ML Inference Line

Kafka --> Spark (Arrow In-Memory) --> ML Model (Direct Predictions)
    ↘ Prometheus --> Grafana (Real-Time Monitoring)
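The Kafka-to-Spark-to-model path could be sketched roughly as follows. This is a minimal sketch and not the repository's code: the broker address, the topic name `ticks`, and the helper names `kafka_source_options`, `open_tick_stream`, and `timed_predict` are assumptions made for illustration, and the model is treated as any Python callable. Opening the stream requires an existing SparkSession with the Kafka connector available.

```python
import time


def kafka_source_options(bootstrap_servers, topic):
    """Options for Spark's built-in Kafka source (option keys are Spark's own)."""
    return {
        "kafka.bootstrap.servers": bootstrap_servers,
        "subscribe": topic,
        "startingOffsets": "latest",  # only new messages, for live inference
    }


def open_tick_stream(spark, options):
    """Return a streaming DataFrame of raw Kafka records.

    `spark` is an existing SparkSession. Nothing is imported from PySpark
    here, so this module loads even where PySpark is not installed.
    """
    # Use Arrow for JVM <-> pandas transfers (Spark 3.x setting).
    spark.conf.set("spark.sql.execution.arrow.pyspark.enabled", "true")
    return spark.readStream.format("kafka").options(**options).load()


def timed_predict(model, features):
    """Apply a model callable to each feature row and measure wall time."""
    start = time.perf_counter()
    predictions = [model(row) for row in features]
    elapsed = time.perf_counter() - start
    return predictions, elapsed
```

The elapsed time returned by `timed_predict` is the kind of value the Telemetry Line reports as model inference time.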

Historical Data Line

Kafka --> Spark (Batch) --> Delta Lake (Parquet + ACID)
    ↘ Hive Metastore
    ↘ Batch ML Retraining
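The batch path from Kafka into Delta Lake might look something like the sketch below. It is an illustration under stated assumptions, not the project's implementation: the function names `read_kafka_batch` and `append_to_delta` are hypothetical, and the SparkSession passed in is assumed to be configured with both the Kafka connector and the Delta Lake package.

```python
def read_kafka_batch(spark, bootstrap_servers, topic):
    """Read everything currently in a topic as a bounded (batch) DataFrame."""
    return (
        spark.read.format("kafka")
        .option("kafka.bootstrap.servers", bootstrap_servers)
        .option("subscribe", topic)
        .option("startingOffsets", "earliest")  # from the start of the topic
        .option("endingOffsets", "latest")      # up to what exists right now
        .load()
    )


def append_to_delta(df, path):
    """Append a DataFrame to the Delta table stored at `path`."""
    # Delta stores the data as Parquet files plus a transaction log,
    # which is what provides the ACID guarantees.
    df.write.format("delta").mode("append").save(path)
    return path
```

Writing with `saveAsTable` instead of `save` would register the table in the session's metastore, which is one way a Hive-style query layer could find it.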

Telemetry Line

Kafka --> Prometheus --> Grafana (Kafka lag, Spark times, Model inference time)
Spark --> Prometheus --> Grafana (Batch processing time, failure rate)
Delta --> Airflow --> Optimize Delta, Retrain Model
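To make the Prometheus step concrete, the sketch below renders a few gauges in the Prometheus text exposition format, which is what a scrape endpoint returns. It uses only the standard library; the metric names and the `render_gauges` helper are hypothetical, and a real deployment would more likely rely on a client library such as `prometheus_client` than on hand-built strings.

```python
def render_gauges(gauges):
    """Render {name: (help_text, value)} as Prometheus exposition text."""
    lines = []
    for name, (help_text, value) in gauges.items():
        lines.append(f"# HELP {name} {help_text}")
        lines.append(f"# TYPE {name} gauge")
        lines.append(f"{name} {value}")
    return "\n".join(lines) + "\n"


# Hypothetical metric names mirroring the quantities listed above.
sample_metrics = {
    "skysweep_kafka_consumer_lag": ("Messages not yet consumed.", 42),
    "skysweep_batch_seconds": ("Duration of the last Spark batch.", 3.5),
    "skysweep_inference_seconds": ("Duration of the last model call.", 0.02),
}

if __name__ == "__main__":
    print(render_gauges(sample_metrics), end="")
```

Grafana would then chart these series by querying Prometheus rather than by reading the endpoint directly.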

RL/DL/NLP

To Start VENV

venv\Scripts\activate
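The command above is the Windows form and assumes the environment lives in a directory named `venv`. As a small illustration, the hypothetical helper below returns the matching activation command for Windows or for a POSIX shell; the function name and its parameters are not part of the repository.

```python
import os


def activation_command(env_dir="venv", os_name=os.name):
    """Return the shell command that activates a virtual environment."""
    if os_name == "nt":  # Windows
        return rf"{env_dir}\Scripts\activate"
    return f"source {env_dir}/bin/activate"  # POSIX shells such as bash or zsh


if __name__ == "__main__":
    print(activation_command())
```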

