diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md
new file mode 100644
index 0000000..01d44f3
--- /dev/null
+++ b/CONTRIBUTING.md
@@ -0,0 +1,54 @@
+# How to Contribute to D4D Tutorials
+
+We're very happy you want to contribute! This repository is a community effort, and your help is essential to keeping it relevant and useful.
+
+There are two main ways to contribute:
+
+1. **Request a Tutorial**
+2. **Add a Tutorial**
+
+---
+
+## 💡 1. Request a Tutorial
+
+If you want to learn something that isn't covered here, or have a great idea for a tutorial, the best way to suggest it is:
+
+1. Go to the [**Issues**](https://github.com/Data4Democracy/tutorials/issues) tab of our repository.
+
+2. Check whether someone has already made the same suggestion.
+
+3. Click on "New Issue".
+
+4. Give it a clear title (e.g., `[REQUEST] Web Scraping Tutorial with Scrapy`).
+
+5. Describe what you would like to see in the tutorial.
+6. Add the `tutorial-request` label, if possible.
+
+## ✍️ 2. Add a Tutorial
+
+If you have written a tutorial or found an excellent external resource that fits here, the way to add it is through a Pull Request.
+
+### The GitHub Workflow
+
+We use the standard "Fork & Pull" workflow:
+
+1. **Fork:** Create a "fork" (copy) of this repository in your own GitHub account.
+
+2. **Clone:** Clone *your* fork to your local machine (`git clone ...`).
+
+3. **Branch:** Create a new "branch" for your changes (`git checkout -b my-new-tutorial`).
+
+4. **Add Your Tutorial:**
+* Create a new folder with a descriptive name (e.g., `Web_Scraping/`).
+* Add your files (preferably `.ipynb` or `.md`).
+* **Important:** Go back to the main `README.md` and add a link to your new tutorial in the appropriate section.
+
+5. **Commit:** Commit your changes (`git commit -m "Add Scrapy tutorial"`).
+
+6. **Push:** Send your changes to *your* fork (`git push origin my-new-tutorial`).
+
+7. **Pull Request:** Go back to the original `Data4Democracy/tutorials` repository on GitHub. You will see a button for "Compare & Pull Request". Click it, give it a title and a clear description of what you did, and submit.
+
+One of the project maintainers will review your contribution and, if everything is correct, will merge it.
+
+Thank you for helping build Data for Democracy!
diff --git a/README.md b/README.md
index 8cb4d6f..e66a525 100644
--- a/README.md
+++ b/README.md
@@ -1,58 +1,75 @@
-# Tutorials
-
-**Slack:** [#tutorials](https://datafordemocracy.slack.com/messages/tutorials/)
-
-**Project Description:** A place for tutorials relevant to D4D projects.
-
-**Project Leads:**
-* [@alarcj](https://datafordemocracy.slack.com/messages/@alarcj/)
-* [@grichardson](https://datafordemocracy.slack.com/messages/@grichardson/)
-
-# List of Tutorials
-## External Resources
-* [External Learning Resources](https://github.com/Data4Democracy/tutorials/blob/master/External%20Resources/learning-resources.md) - Reference of resources that members of D4D have found useful in learning some of the languages, platforms, libraries, and methods applicable to D4D projects.
-
-## AWS
-* [AWS S3 using boto3](https://github.com/Data4Democracy/tutorials/blob/master/aws/AWS_Boto3_s3_intro.ipynb) - How to interact with S3 bucket from python using boto3 library.
- - helpful tutorial for setting up first IAM user and adding permissions before beginning tutorial above: (https://linuxacademy.com/howtoguides/posts/show/topic/14209-automating-aws-with-python-and-boto3)
-
-## Twitter
-* [Intro Collecting Tweets](https://github.com/Data4Democracy/tutorials/blob/master/Twitter/Intro_Collecting_Tweets.ipynb) - Getting started using the Twitter API for downloading Tweets.
-* [Twitter Getting Past the 32K Limit](https://github.com/Data4Democracy/tutorials/blob/master/Twitter/Twitter_Gettingpast_32K_Limit.ipynb) - How to obtain all the Tweets from a given user.
-* [Streaming Tweets From Twitter](https://github.com/Data4Democracy/tutorials/blob/master/Twitter/StreamingTweetsFromTwitter.ipynb) - Great example on how to Stream (listen to live events) Tweets.
-* [Basic Twitter Analysis](https://github.com/Data4Democracy/tutorials/blob/master/Twitter/Basic_Twiter_Analysis.ipynb) - Simple word frequency analysis and tokenization.
-* [Interactive Maps with Python's Folium](https://github.com/Data4Democracy/tutorials/blob/master/Twitter/Python_and_maps.ipynb) - Interactive Maps of Tweets (excuse the sample size for this tutorial).
-* [Clustering Twitter](https://github.com/Data4Democracy/tutorials/blob/master/Twitter/Clustering_twitter.ipynb) - Clustering a user's followers by using KMeans.
-* [Building a Graph with Twitter](https://github.com/Data4Democracy/tutorials/blob/master/Twitter/Building_a_Graph_Twitter.ipynb) - Short tutorial on how to build a Graph of Twitter friends and followers using Networkx.
-
-## Tutorials by Project
-* [Assemble](https://github.com/Data4Democracy/assemble)
- * [Twitter](https://github.com/Data4Democracy/tutorials/tree/master/Twitter) All of it!
-* [Internal Displacement](https://github.com/Data4Democracy/internal-displacement)
- * [Interactive Maps with Python's Folium](https://github.com/Data4Democracy/tutorials/blob/master/Twitter/Python_and_maps.ipynb)
-* [USA Dashboard](https://github.com/Data4Democracy/usa-dashboard)
- * [Interactive Maps with Python's Folium](https://github.com/Data4Democracy/tutorials/blob/master/Twitter/Python_and_maps.ipynb)
-* [Immigration Connect](https://github.com/Data4Democracy/immigration-connect)
- * [Interactive Maps with Python's Folium](https://github.com/Data4Democracy/tutorials/blob/master/Twitter/Python_and_maps.ipynb)
-
-## git and GitHub
-* If you're new to using git and GitHub, [download git](https://git-scm.com) and create a GitHub account, then head over to our [GitHub playground repo](https://github.com/Data4Democracy/github-playground). Follow the instructions in the `README` to learn how it's done!
-
-# Looking for Something?
-* If you have some cool ideas for tutorial or want to see a tutorial that is not yet present please feel free to add it to our [Wish List](https://docs.google.com/spreadsheets/d/1o_821rVkR-8yz_dMBEN6Srl7tgXzrw-K8Nsqk-xkAmU/edit#gid=0)
-
-# Contributing
-* If you want to contribute your own tutorials or have any comments please open up an [issue](https://github.com/Data4Democracy/tutorials/issues).
-This way we can avoid duplicates.
-
-* If you you happen to have a cool idea for a tutorial or want to request a tutorial on a given subject/tool please feel free to open up an [issue](https://github.com/Data4Democracy/tutorials/issues) requesting it.
+# Data for Democracy (D4D) - Tutorials
+
+Welcome to the central repository of Data for Democracy tutorials! Our mission is to empower volunteers with the data skills needed to make a positive impact on civic projects and promote transparency.
+
+This repository is a living, community-maintained resource designed to help new and existing members learn the tools, techniques, and platforms we use in our projects.
+
+## 📌 Table of Contents
+
+* [How to Get Started (For New Members)](#how-to-get-started-for-new-members)
+* [How to Contribute](#how-to-contribute)
+* [Tutorial Topics](#tutorial-topics)
+* [External Learning Resources](#external-learning-resources)
+* [AWS (Amazon Web Services)](#aws-amazon-web-services)
+
+* [Twitter API](#twitter-api)
+* [Tutorials by D4D Project](#tutorials-by-d4d-project)
+
+---
+
+## 👋 How to Get Started (For New Members)
+
+If you are new to Data for Democracy or collaborative development, we recommend starting here:
+
+1. **Learn Git and GitHub:** If you're not familiar with Git, we have a [**"playground" repository**](https://github.com/Data4Democracy/github-playground) where you can learn the basics of forks, commits, and pull requests without fear of breaking anything.
+
+2. **Explore External Resources:** We've compiled a [**learning resource list**](https://github.com/Data4Democracy/tutorials/blob/master/External%20Resources/learning-resources.md) that D4D members have found useful for learning about the languages, platforms, and methods we use.
+
+---
+
+## 🙋 How to Contribute
+
+This repository is made by *you*! There are two main ways to contribute:
+
+* **Request a Tutorial:** Want to learn something that's not here? Have an idea for a new tutorial? [**Open an Issue**](https://github.com/Data4Democracy/tutorials/issues/new?template=tutorial_request.md) and use the `tutorial-request` label.
+
+* **Add a Tutorial:** You created a tutorial? Fantastic! Please read our [**Contribution Guide (CONTRIBUTING.md)**](CONTRIBUTING.md) for formatting guidelines and how to submit a Pull Request.
+
+---
+
+## 📚 Tutorial Topics
+
+### External Learning Resources
+* [Useful Resources](https://github.com/Data4Democracy/tutorials/blob/master/External%20Resources/learning-resources.md): A reference list of resources that D4D members have found useful for learning the languages, platforms, and methods applicable to D4D projects.
+
+### AWS (Amazon Web Services)
+* [AWS S3 using boto3](https://github.com/Data4Democracy/tutorials/blob/master/aws/AWS_Boto3_s3_intro.ipynb): How to interact with an S3 bucket using Python and the `boto3` library.
+
+*Note:* This [guide on IAM configuration](https://linuxacademy.com/howtoguides/posts/show/topic/14209-automating-aws-with-python-and-boto3) is a good prerequisite.
+
+### Twitter API
+* [Introduction to Collecting Tweets](https://github.com/Data4Democracy/tutorials/blob/master/Twitter/Intro_Collecting_Tweets.ipynb): Getting started with the Twitter API to download Tweets.
+
+* [Bypassing the 32K Limit](https://github.com/Data4Democracy/tutorials/blob/master/Twitter/Twitter_Gettingpast_32K_Limit.ipynb): How to get *all* the Tweets from a given user.
+
+* [Tweet Streaming](https://github.com/Data4Democracy/tutorials/blob/master/Twitter/StreamingTweetsFromTwitter.ipynb): A great example of how to "listen" to live (streaming) Tweet events.
+
+* [Basic Twitter Analysis](https://github.com/Data4Democracy/tutorials/blob/master/Twitter/Basic_Twiter_Analysis.ipynb): Simple word frequency analysis and tokenization.
+
+* [Interactive Maps with Folium](https://github.com/Data4Democracy/tutorials/blob/master/Twitter/Python_and_maps.ipynb): Creating interactive Tweet maps (sample size is small).
+* [Clustering Twitter (KMeans)](https://github.com/Data4Democracy/tutorials/blob/master/Twitter/Clustering_twitter.ipynb): Grouping a user's followers with KMeans.
+
+## 🔗 Tutorials by D4D Project
+
+Want to contribute to a specific project? See which tutorials are most relevant for each one:
+
+* **[Assemble](https://github.com/Data4Democracy/assemble)**
+
+* All tutorials in the [Twitter](#twitter-api) section are highly relevant.
+
+* **[Internal Displacement](https://github.com/Data4Democracy/internal-displacement)**
+* **[USA Dashboard](https://github.com/Data4Democracy/usa-dashboard)**
+* **[Immigration Connect](https://github.com/Data4Democracy/immigration-connect)**
+
+* [Building a Graph with Networkx](https://github.com/Data4Democracy/tutorials/blob/master/Twitter/Building_a_Graph_Twitter.ipynb): A brief tutorial on how to build a graph of Twitter friends and followers.
+
+---
diff --git a/streamlit_dashboard/README.md b/streamlit_dashboard/README.md
new file mode 100644
index 0000000..758cefa
--- /dev/null
+++ b/streamlit_dashboard/README.md
@@ -0,0 +1,32 @@
+# Tutorial: From Jupyter to Interactive Dashboard with Streamlit
+
+**Objective:** To teach how to transform a static data analysis (such as a Jupyter Notebook) into an interactive web application that anyone can use, without needing knowledge of HTML/CSS/JavaScript.
+
+**Practical Example:** Use a municipal spending dataset to create a dashboard where the user can filter by municipality and see spending over time.
+
+---
+
+### 1. Concept: Why a Dashboard?
+
+* **The Problem:** A Jupyter Notebook is great for *analysts*, but terrible for the *public*. Nobody will download your notebook, install Jupyter, and run your cells.
+
+* **The Solution:** A web dashboard allows the end user to **interact** with the data (filtering, selecting) and see the results (graphs) in real time, directly in the browser.
+
+**The Tool (Streamlit):** It's a Python library that transforms Python scripts into web applications. You write pure Python; Streamlit magically draws the buttons, menus, and graphics.
+
+### 2. Installation and Execution
+
+For this tutorial, you will need the following libraries:
+
+```bash
+pip install streamlit pandas plotly
+```
+
+To run the application: Open your terminal, navigate to this folder and run:
+
+
+```bash
+streamlit run app.py
+```
+
+Your browser will automatically open with the dashboard.
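Before running `streamlit run app.py`, it can help to confirm that the three libraries from the install step are actually importable. The following is a minimal sketch, not part of the patch: the helper name `missing_packages` is hypothetical, and it assumes the import names match the pip package names (`streamlit`, `pandas`, `plotly`).

```python
from importlib.util import find_spec

# Assumption: each import name is the same as the pip package name used above.
REQUIRED = ["streamlit", "pandas", "plotly"]


def missing_packages(names):
    """Return the top-level module names that cannot be found by the import system."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages(REQUIRED)
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages are installed.")
```

If anything is reported missing, rerunning the `pip install` command in the same environment that will run Streamlit is the likely fix.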
diff --git a/streamlit_dashboard/app.py b/streamlit_dashboard/app.py
new file mode 100644
index 0000000..86075db
--- /dev/null
+++ b/streamlit_dashboard/app.py
@@ -0,0 +1,90 @@
+import streamlit as st
+import pandas as pd
+import plotly.express as px
+
+# --- Page Configuration ---
+# Must be the first Streamlit command in the script
+st.set_page_config(
+    page_title="Municipal Spending Dashboard",
+    page_icon="📊",
+    layout="wide"
+)
+
+
+
+
+# --- Data Loading ---
+@st.cache_data
+def load_data():
+    # Tell pandas to ignore lines that start with '#'
+    df = pd.read_csv(
+        "municipal_spending.csv",
+        comment='#'  # <--- This is the magic line
+    )
+    df['date'] = pd.to_datetime(df['date'])
+    return df
+
+# Load the data
+df = load_data()
+
+# --- Dashboard Title ---
+st.title("📊 Municipal Spending Dashboard")
+
+# --- Sidebar for Filters ---
+st.sidebar.header("Filters")
+
+# Get the list of unique municipalities
+municipalities_unique = sorted(df['municipality'].unique())
+
+# Create the selectbox in the sidebar
+selected_municipality = st.sidebar.selectbox(
+    "Select a Municipality:",
+    municipalities_unique
+)
+
+# --- Data Filtering ---
+# Filter the main dataframe based on user selection
+df_filtered = df[df['municipality'] == selected_municipality]
+
+# --- Main Layout (Charts) ---
+st.header(f"Analysis for: {selected_municipality}", divider="gray")
+
+# Create two columns for the charts
+col1, col2 = st.columns(2)
+
+# --- Chart 1: Spending Over Time (Line Chart) ---
+with col1:
+    st.subheader("Spending Over Time")
+    fig_line = px.line(
+        df_filtered,
+        x="date",
+        y="amount",
+        color="category",  # Shows different lines for Health, Education, etc.
+        title="Spending Evolution"
+    )
+    # Use streamlit's theme and fit to column width
+    st.plotly_chart(fig_line, use_container_width=True, theme="streamlit")
+
+# --- Chart 2: Distribution by Category (Pie Chart) ---
+with col2:
+    st.subheader("Distribution by Category")
+    # Group data for the pie chart
+    df_grouped_category = df_filtered.groupby('category')['amount'].sum().reset_index()
+
+    fig_pie = px.pie(
+        df_grouped_category,
+        names="category",
+        values="amount",
+        title="Percentage of Spending by Category"
+    )
+    st.plotly_chart(fig_pie, use_container_width=True, theme="streamlit")
+
+
+# --- Show raw data (optional, with an expander) ---
+with st.expander("View Filtered Raw Data"):
+    st.dataframe(df_filtered)
+
+# To run this application:
+# 1. Save this file as 'app.py'
+# 2. Make sure 'municipal_spending.csv' is in the same folder
+# 3. Open your terminal and run: streamlit run app.py
diff --git a/streamlit_dashboard/municipal_spending.csv b/streamlit_dashboard/municipal_spending.csv
new file mode 100644
index 0000000..99d8a01
--- /dev/null
+++ b/streamlit_dashboard/municipal_spending.csv
@@ -0,0 +1,33 @@
+# ⚠️ Disclaimer: Sample Data
+# The municipal_spending.csv file included in this tutorial contains fictitious data.
+# This data was created synthetically to ensure a clean format...
+id,municipality,date,category,amount
+1,Lisbon,2023-01-15,Health,150000
+2,Lisbon,2023-01-20,Education,220000
+3,Lisbon,2023-01-25,Transport,95000
+4,Porto,2023-01-17,Health,120000
+5,Porto,2023-01-22,Education,180000
+6,Porto,2023-01-28,Culture,50000
+7,Lisbon,2023-02-15,Health,165000
+8,Lisbon,2023-02-20,Education,230000
+9,Lisbon,2023-02-25,Transport,110000
+10,Porto,2023-02-17,Health,135000
+11,Porto,2023-02-22,Education,190000
+12,Porto,2023-02-28,Culture,55000
+13,Lisbon,2023-03-15,Health,170000
+14,Lisbon,2023-03-20,Education,240000
+15,Lisbon,2023-03-25,Transport,100000
+16,Porto,2023-03-17,Health,140000
+17,Porto,2023-03-22,Education,205000
+18,Porto,2023-03-28,Culture,60000
+19,Coimbra,2023-01-20,Health,80000
+20,Coimbra,2023-01-25,Education,110000
+21,Coimbra,2023-02-20,Health,85000
+22,Coimbra,2023-02-25,Education,115000
+23,Coimbra,2023-03-20,Health,90000
+24,Coimbra,2023-03-25,Education,120000
+25,Coimbra,2023-03-28,Transport,40000
+26,Lisbon,2023-04-15,Health,180000
+27,Lisbon,2023-04-20,Education,250000
+28,Porto,2023-04-17,Health,150000
+29,Coimbra,2023-04-20,Health,95000
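To see what the dashboard computes for a selected municipality without installing Streamlit or pandas, the filter-then-group step from `app.py` can be approximated with the standard library. This is a rough sketch under stated assumptions, not the code in the patch: `SAMPLE_CSV`, `load_rows`, and `spending_by_category` are hypothetical names, the rows are a small subset copied from `municipal_spending.csv`, and comment handling is simplified to skipping lines that begin with `#` (pandas' `comment='#'` also truncates a line at a `#` that appears mid-line).

```python
import csv
from collections import defaultdict

# A subset of rows copied from municipal_spending.csv, plus one '#' comment line.
SAMPLE_CSV = """\
# Disclaimer: sample data
id,municipality,date,category,amount
1,Lisbon,2023-01-15,Health,150000
2,Lisbon,2023-01-20,Education,220000
3,Lisbon,2023-01-25,Transport,95000
4,Porto,2023-01-17,Health,120000
7,Lisbon,2023-02-15,Health,165000
8,Lisbon,2023-02-20,Education,230000
9,Lisbon,2023-02-25,Transport,110000
"""


def load_rows(text):
    """Parse CSV text into dicts, skipping lines that start with '#'."""
    data_lines = [line for line in text.splitlines() if not line.startswith("#")]
    return list(csv.DictReader(data_lines))


def spending_by_category(rows, municipality):
    """Sum 'amount' per category for one municipality (the pie chart's input)."""
    totals = defaultdict(int)
    for row in rows:
        if row["municipality"] == municipality:
            totals[row["category"]] += int(row["amount"])
    return dict(totals)


rows = load_rows(SAMPLE_CSV)
print(spending_by_category(rows, "Lisbon"))
# → {'Health': 315000, 'Education': 450000, 'Transport': 205000}
```

The dictionary plays the role of `df_grouped_category` in `app.py`: one total per category for the municipality chosen in the sidebar.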