Data Sinergy Solutions

Group 62 Data BI Repository: Data Analysis Project with Microsoft Fabric (Azure Data Stack)

Overview 📝

This project aims to develop a comprehensive framework addressing key aspects of data engineering, data analysis, visualization, and machine learning model development within the Steam Community, using Microsoft Fabric (Azure Data Stack). The framework is designed to optimize the collection, processing, and analysis of data related to user behavior, gaming trends, and interactions within the Steam community. Additionally, it includes a comparative analysis of gaming data across other popular platforms such as Nintendo and PlayStation. The insights derived from this analysis will provide valuable support for strategic decision-making, including new game development, marketing strategy optimization, and enhancing user experience on the platform.

Project Structure

Folder/File	Description
/data	Folder that stores datasets and files used by the Analysis, Dashboard and ML models.
/Notebooks	Folder containing Jupyter notebooks used for ETL, EDA, and feature engineering processes.
/Images	Folder containing relevant and illustrative images for the analysis project.
requirements.txt	File listing dependencies and libraries required to run the project.
.gitignore	File specifying folders and files to be ignored by version control (git).
LICENSE	MIT License. File specifying the terms under which the source code is shared.
functions.py	Python file with the functions called from the main file 'app.py'.
app.py	Main Python file serving as an entry point for the application, defining Model configuration and execution
README.md	Main project documentation in English.
README_ESP.md	Main project documentation in Spanish.

Authors

Name	Role
Leonardo Cortés	Project Manager (PM), Data Engineer, Data Analyst	leocortes85	Leonardo Cortés Zambrano
Beverly Gonzalez	ML Engineer and Data Scientist	licette32	Beberly Gonzalez

Key Features

Technology Stack:
- Utilized Microsoft Fabric, which encompasses the full Azure Data Stack, to develop a complete end-to-end data solution.

Data Architecture:
- Implemented a Medallion Architecture to optimize data access and maintain a continuous workflow, ensuring the data remains accessible, manageable, and ready for downstream processes.
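
As an illustration of the Medallion idea (raw "bronze" data refined into validated "silver" data and then aggregated "gold" data), here is a minimal sketch in plain Python. The sample rows, field names, and the `to_silver` / `to_gold` helpers are hypothetical and are not taken from the project's notebooks or its Fabric lakehouse setup.

```python
from collections import defaultdict

# Bronze: raw records kept as ingested (hypothetical sample rows).
bronze = [
    {"game": "Game A", "platform": "Steam", "hours": "12.5"},
    {"game": "Game A", "platform": "Steam", "hours": None},   # missing value
    {"game": "Game B", "platform": "steam ", "hours": "3"},   # untidy label
    {"game": "Game B", "platform": "Steam", "hours": "7"},
]


def to_silver(rows):
    """Silver: validated, typed, and standardized rows."""
    silver_rows = []
    for row in rows:
        if row["hours"] is None:
            continue  # drop rows that fail validation
        silver_rows.append({
            "game": row["game"],
            "platform": row["platform"].strip().title(),
            "hours": float(row["hours"]),
        })
    return silver_rows


def to_gold(rows):
    """Gold: business-level aggregate (total hours per game)."""
    totals = defaultdict(float)
    for row in rows:
        totals[row["game"]] += row["hours"]
    return dict(totals)


silver = to_silver(bronze)
gold = to_gold(silver)
print(gold)
```

Each layer only reads from the one before it, so the raw data stays untouched and any layer can be rebuilt if a cleaning rule changes.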

Data Transformations:
- Performed Extract, Transform, and Load (ETL) operations using the Pandas library, automating data loading from client-provided folders.
- Applied strategies to handle nested data structures and eliminated irrelevant or highly null columns to optimize the data for further use.
- Conducted an incremental load of information, using external APIs, web scraping, and custom functions to complement the dataset.
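
The flattening and null-column steps above might look like the following sketch with Pandas. The sample records, the column names, and the 50% null threshold are assumptions made for illustration; the project's notebooks may use different rules.

```python
import pandas as pd

# Hypothetical nested records, similar in shape to JSON game data.
records = [
    {"id": 1, "name": "Game A", "details": {"genre": "RPG", "price": 19.99}, "notes": None},
    {"id": 2, "name": "Game B", "details": {"genre": "Action", "price": 9.99}, "notes": None},
    {"id": 3, "name": "Game C", "details": {"genre": "Indie", "price": None}, "notes": "promo"},
    {"id": 4, "name": "Game D", "details": {"genre": "RPG", "price": 4.99}, "notes": None},
]

# Flatten nested dictionaries into columns such as "details_genre".
df = pd.json_normalize(records, sep="_")

# Share of missing values per column (0.0 = complete, 1.0 = all null).
null_share = df.isna().mean()

# Keep only columns at or below the assumed null threshold.
NULL_THRESHOLD = 0.5  # assumed cut-off, not taken from the project
df_clean = df.loc[:, null_share <= NULL_THRESHOLD]

print(sorted(df_clean.columns))
```

Here the `notes` column is 75% null and is dropped, while `details_price` (25% null) is kept for later imputation or filtering.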

Feature Engineering:
- Conducted extensive feature engineering to ensure the data was fully consumable, cleaned, and prepped for machine learning processes and data analysis.

Dimensional Structure and Semantic Model:
- Built a dimensional structure stored in a semantic model to enable insightful analysis.
- Developed a Power BI dashboard that provides visual analytics and insights into the video game market.
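
To make the dimensional structure more concrete, the sketch below splits a flat table into one dimension table and one fact table with Pandas. The table names (`dim_game`, `fact_sales`), the columns, and the values are hypothetical; the actual semantic model lives in the project's Power BI file and may be organized differently.

```python
import pandas as pd

# Hypothetical flat table mixing descriptive attributes and measures.
flat = pd.DataFrame({
    "game": ["Game A", "Game A", "Game B"],
    "genre": ["RPG", "RPG", "Action"],
    "platform": ["Steam", "PlayStation", "Steam"],
    "units_sold": [100, 40, 75],
})

# Dimension table: one row per game, with a surrogate key.
dim_game = flat[["game", "genre"]].drop_duplicates().reset_index(drop=True)
dim_game["game_key"] = dim_game.index + 1

# Fact table: measures plus a foreign key pointing at the dimension.
fact_sales = flat.merge(dim_game, on=["game", "genre"])[
    ["game_key", "platform", "units_sold"]
]

# Example analysis: units sold per genre, resolved through the dimension.
by_genre = (
    fact_sales.merge(dim_game, on="game_key")
    .groupby("genre")["units_sold"]
    .sum()
)
print(by_genre)
```

Keeping descriptive attributes in dimensions and numeric measures in facts is what lets a BI tool slice the same measure by genre, platform, or any other attribute.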

Recommendation Models:
- Developed recommendation models using machine learning techniques, specifically leveraging cosine similarity for user and item recommendations.
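
A minimal sketch of a cosine-similarity user recommendation is shown below, in plain Python. The `ratings` data and the `cosine_similarity` and `recommend_for_user` helpers are hypothetical and written for illustration only; they are not the functions defined in the project's functions.py.

```python
import math

# Hypothetical user-item data: hours played per game.
ratings = {
    "alice": {"Game A": 10.0, "Game B": 2.0},
    "bob": {"Game A": 8.0, "Game B": 1.0, "Game C": 6.0},
    "carol": {"Game B": 9.0, "Game D": 7.0},
}


def cosine_similarity(u, v):
    """Cosine of the angle between two sparse vectors stored as dicts."""
    dot = sum(u[k] * v[k] for k in u.keys() & v.keys())
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def recommend_for_user(user, ratings, top_n=3):
    """Rank unseen items by the similarity-weighted hours of other users."""
    scores = {}
    for other, items in ratings.items():
        if other == user:
            continue
        sim = cosine_similarity(ratings[user], items)
        for item, value in items.items():
            if item not in ratings[user]:
                scores[item] = scores.get(item, 0.0) + sim * value
    return sorted(scores, key=scores.get, reverse=True)[:top_n]


print(recommend_for_user("alice", ratings))
```

The same similarity function can compare item vectors instead of user vectors, which gives item-to-item recommendations.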

Model Testing and Deployment:
- Conducted tests of the machine learning models using Azure ML tools.
- Created a functions.py file that stores all the functions to be executed during the deployment phase.

Streamlit Deployment: (You can deploy the live app HERE)
- Deployed the entire project via Streamlit through the app.py file, allowing users to:
  - View the interactive dashboard.
  - Interact with the machine learning models, including item and user recommendations, showcasing the project's full capabilities.

No-Country-simulation/c20-62-ft-data-bi

Folders and files

Latest commit

History

Repository files navigation

Data Sinergy Solutions

Project Structure

Authors

Key Features

About

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Contributors 2

Uh oh!

Languages

Packages