sameerhussai230/README.md

Hi, I'm Sameer Hussain 👋

Senior Technical Lead (Data & AI) | M.Tech Environmental Engineering

Bridging the gap between Earth System Sciences, Industrial AI, and Scalable Engineering.

I am a Scientific AI Researcher and Data Architect with a hybrid profile: academic training in Environmental Systems Modeling (M.Tech) combined with 5+ years of industrial experience building Digital Twins, GenAI Architectures, and Big Data Pipelines.

My goal is to leverage Scientific Machine Learning and Cloud Engineering to solve complex physical domain challenges.


🛠️ Tech Stack & Expertise



🚀 Featured Research & Engineering Projects

1. PrivacyVision GDPR: Enterprise Secure Analytics

Domain: Computer Vision, Data Privacy, GDPR Compliance
Tech: YOLO11, ByteTrack, Kafka, Docker

A "Zero-Trust" hybrid anonymization pipeline. Unlike standard blurring, this system anonymizes video streams before they are stored or transmitted, supporting strict GDPR compliance while preserving zone occupancy analytics.
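The "anonymize before storage" ordering can be illustrated with a minimal sketch. It is not taken from the repository: the `Box` type, the rectangular zones, and the choice to overwrite pixels are all assumptions made for illustration, and in the real pipeline the boxes would come from the detector and tracker rather than being passed in by hand.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Axis-aligned rectangle in pixel coordinates (hypothetical helper type)."""
    x1: int
    y1: int
    x2: int
    y2: int


def anonymize(frame, boxes, fill=0):
    """Return a copy of `frame` with every pixel inside a box overwritten."""
    out = [row[:] for row in frame]
    height = len(out)
    width = len(out[0]) if out else 0
    for b in boxes:
        for y in range(max(b.y1, 0), min(b.y2, height)):
            for x in range(max(b.x1, 0), min(b.x2, width)):
                out[y][x] = fill
    return out


def zone_occupancy(boxes, zones):
    """Count boxes whose bottom-center point lies inside each rectangular zone."""
    counts = {name: 0 for name in zones}
    for b in boxes:
        foot_x = (b.x1 + b.x2) / 2
        foot_y = b.y2
        for name, z in zones.items():
            if z.x1 <= foot_x <= z.x2 and z.y1 <= foot_y <= z.y2:
                counts[name] += 1
    return counts


def process_frame(frame, boxes, zones):
    """Anonymize and compute analytics; the raw frame is never returned."""
    return anonymize(frame, boxes), zone_occupancy(boxes, zones)
```

Only the anonymized frame and the per-zone counts leave `process_frame`, so any downstream storage or messaging step never sees the original pixels.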

Demo: Original vs. Processed

View Full Repository & Documentation


2. AI SQL Architect: RAG for Complex Databases

Domain: Generative AI, Knowledge Retrieval, NLP
Tech: LLMs, LangChain, ChromaDB, Azure OpenAI

A production-grade Text-to-SQL system. It uses Retrieval-Augmented Generation (RAG) to inject relevant database schema context into the LLM prompt, which reduces hallucination and lets users query complex relational databases in natural language.
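The retrieval step can be sketched as follows. This is an illustration under stated assumptions, not the project's code: the table descriptions in `SCHEMA_DOCS` are hypothetical, and simple word overlap stands in for the vector search that a store such as ChromaDB would perform.

```python
import re

# Hypothetical schema snippets; a real system would index its own tables.
SCHEMA_DOCS = {
    "customers": "customers(id, name, city): one row per customer",
    "orders": "orders(id, customer_id, total, created_at): one row per order",
    "products": "products(id, title, price): catalog of products",
}


def tokenize(text):
    """Lowercase the text and split it into a set of word-like tokens."""
    return set(re.findall(r"[a-z_]+", text.lower()))


def retrieve_schema(question, docs, k=2):
    """Rank schema snippets by word overlap with the question; keep the top k."""
    q = tokenize(question)
    ranked = sorted(
        docs.items(),
        key=lambda item: len(q & tokenize(item[1])),
        reverse=True,
    )
    return [text for _, text in ranked[:k]]


def build_prompt(question, docs, k=2):
    """Assemble the LLM prompt from the retrieved schema and the question."""
    context = "\n".join(retrieve_schema(question, docs, k))
    return (
        "Use only the tables below to write one SQL query.\n"
        f"Schema:\n{context}\n"
        f"Question: {question}\nSQL:"
    )
```

Restricting the prompt to the retrieved tables is what keeps the model from inventing table or column names that do not exist in the database.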

Architecture Diagram

View Full Repository & Documentation


3. AI Visitor Data Extractor

Domain: Applied AI, OCR, Automation
Tech: Llama Vision, FastAPI, React, Docker

An AI-powered data entry automation system. It uses multimodal LLMs to extract structured data from physical ID cards and business cards with high accuracy, and provides a React frontend for human-in-the-loop validation.

View Full Repository & Documentation


4. Superset Embedded Analytics with RLS

Domain: Data Visualization, Security, Web Engineering
Tech: Apache Superset, FastAPI, Docker, Row-Level Security

A secure embedded analytics architecture. It implements dynamic Row-Level Security (RLS) via Superset's guest token API to enforce strict multi-tenant data segregation, so that users viewing a single shared dashboard see only the data relevant to their role.
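As a rough sketch of the idea, the backend can build a guest token request whose `rls` clause is derived from the signed-in user's tenant. The `tenant_id` column name and the dashboard ID below are hypothetical, and the repository may structure this differently.

```python
def guest_token_payload(username, dashboard_id, tenant_id):
    """Build the JSON body for a Superset guest token request.

    `tenant_id` is cast to int so that a non-numeric string can never be
    spliced into the SQL clause.
    """
    tenant = int(tenant_id)  # raises ValueError for non-numeric strings
    return {
        "user": {"username": username},
        "resources": [{"type": "dashboard", "id": dashboard_id}],
        # The clause is appended as a filter to queries run for this guest.
        "rls": [{"clause": f"tenant_id = {tenant}"}],
    }


payload = guest_token_payload("alice", "abc-123", "42")
```

The backend would POST this body to Superset's `/api/v1/security/guest_token/` endpoint using its own service credentials and hand only the returned token to the browser, so the client never chooses its own filter.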

View Full Repository & Documentation


5. Azure Synapse Analytics: Serverless & Spark

Domain: Cloud Data Engineering, Big Data
Tech: Azure Synapse, Cosmos DB, Serverless SQL, Synapse Link

A real-time analytics architecture built on Synapse Link. It optimizes queries over massive NYC Taxi datasets using OPENROWSET and data pruning techniques, bridging the gap between operational NoSQL data and analytical SQL pools.

View Full Repository & Documentation


6. Dynamic ETL Workflow: Databricks & Delta Lake

Domain: Big Data, ETL, Lakehouse Architecture
Tech: Azure Databricks, PySpark, Delta Lake, ADF

A scalable Lakehouse pipeline designed for flexibility. It features parameterized notebooks for schema enforcement and automated incremental loading using Delta Lake's upsert capabilities, orchestrated via Azure Data Factory.
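The upsert semantics behind incremental loading can be shown in plain Python: rows whose key already exists are replaced, and new keys are appended. This is only an in-memory illustration with a hypothetical `id` key; in Delta Lake the same effect is achieved transactionally with a `MERGE` (for example `whenMatchedUpdateAll` and `whenNotMatchedInsertAll` on a `DeltaTable` merge builder).

```python
def upsert(target, batch, key="id"):
    """Merge `batch` into `target`: update rows with a matching key, insert the rest."""
    merged = {row[key]: row for row in target}
    for row in batch:
        # A matching key replaces the whole row; an unseen key is inserted.
        merged[row[key]] = dict(row)
    return list(merged.values())


current = [{"id": 1, "v": "a"}, {"id": 2, "v": "b"}]
increment = [{"id": 2, "v": "B"}, {"id": 3, "v": "c"}]
result = upsert(current, increment)
# result: [{"id": 1, "v": "a"}, {"id": 2, "v": "B"}, {"id": 3, "v": "c"}]
```

Because the operation is idempotent for a given batch, re-running a failed load does not create duplicate rows.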

View Full Repository & Documentation


7. Analysis of Particulate Matter Levels in Delhi

Domain: Environmental Science, Statistical Modeling
Tech: Python (Pandas), SARIMA, Statistical Smoothing

A domain-specific environmental study. It validates high-frequency air pollution data against public meteorological records, using statistical smoothing to model seasonal trends and pollutant variations in New Delhi.
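The study does not state here which smoothing method it uses, so the sketch below assumes a centered moving average, one common choice for damping short spikes before looking at seasonal structure. The readings are made-up values, not data from the project.

```python
def moving_average(values, window=3):
    """Centered moving average; the window shrinks at the edges of the series."""
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd number")
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        lo = max(0, i - half)
        hi = min(len(values), i + half + 1)
        chunk = values[lo:hi]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed


# Hypothetical hourly PM2.5 readings with one short spike.
readings = [10, 20, 60, 20, 10]
smooth = moving_average(readings, window=3)
```

A wider window suppresses more noise but also flattens genuine short pollution episodes, so the window length is a modeling decision rather than a fixed constant.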

View Full Repository & Documentation


📜 Certifications & Achievements

| Credential | Issuer | Verification |
| --- | --- | --- |
| Qualified GATE Exam 2021 (AIR 596) | Indian Institute of Technology (IIT) | View Scorecard |
| Microsoft Certified: Fabric Data Engineer (DP-700) | Microsoft | Verify |
| Microsoft Certified: Azure Data Engineer (DP-203) | Microsoft | Verify |
| Microsoft Certified: Power BI Data Analyst (PL-300) | Microsoft | Verify |
| Fundamentals of GIS | UC Davis (Coursera) | Verify |
| AI For Everyone | DeepLearning.AI (Andrew Ng) | Verify |
| Python for Time Series Data Analysis | Udemy | Verify |
| Azure Databricks & Spark for Data Engineers | Udemy | Verify |
| Databases and SQL for Data Science | IBM | Verify |

Pinned

  1. PrivacyVision-GDPR-Enterprise-Secure-Analytics (Python)

    PrivacyVision is a scalable, end-to-end computer vision architecture built for Smart Cities, Construction Sites, and High-Security Zones. It bridges the critical gap between Operational Analytics a…

  2. AI-SQL-Architect-Integrating-Large-Language-Models-and-Vector-Search (Python)

    AI SQL Architect: Integrating Large Language Models and Vector Search

  3. Dynamic_ETL_Workflow_with_Azure_DataBricks_Delta_Lake_and_ADF (Python)

    Dynamic ETL Workflow with Azure DataBricks, Delta Lake and ADF

  4. Azure_Data_Pipeline_for_COVID_Analytics (TSQL)

  5. Analysis-of-PM-10-and-2.5-Levels_in_Delhi (Jupyter Notebook)

  6. PowerBI-Dashboard-for-AdventureWorks-Data-Analysis