Neighborhood analysis using k-means and Foursquare's API. Last assignment for the IBM Data Science Professional Certificate.

PhinanceScientist/k-means_economic-vulnerability

Merida Neighbourhoods Clustered by Economic Vulnerability due to the COVID-19 Outbreak


Introduction


For this project I use data prepared from a public postal-code web page, because postal and geographic data for Mérida, Yucatán, México is scarce. The goal is to learn about the economic vulnerability of Mérida's neighborhoods from the venue information returned by the Foursquare API. k-means is used to group the neighborhoods, and the Folium library is used to visualize the results. The approach maps the main neighborhoods of Mérida and clusters them to identify the most economically vulnerable areas as COVID-19 spreads.

Note: to render this Jupyter notebook with the Folium maps visible, you can use https://nbviewer.jupyter.org/

Data


The data needed for this project comes from a local postal services web page, Heraldo.com.mx, which lists postal codes for México. Here we focus on Mérida's postal codes.
The CSV file is based on the first 100 postal codes of Mérida (in ascending order, which by common local knowledge starts from the downtown area), each linked to a latitude and longitude obtained from a Google Maps search.
The Foursquare API is used to retrieve information about the venues in each neighborhood. The type of each venue is used to judge how crowded it is likely to be and, from that, the overall vulnerability of the surrounding area.
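As a minimal sketch of how venue types can be summarized per neighborhood, the snippet below turns a table of venues into per-neighborhood category shares with pandas. The table contents, the column names (`Neighborhood`, `Venue Category`) and the helper `category_frequencies` are assumptions made for illustration; they are not taken from the notebook or from the Foursquare response format.

```python
import pandas as pd

# Made-up placeholder rows standing in for venues returned by the API.
venues = pd.DataFrame({
    "Neighborhood": ["Area A", "Area A", "Area A", "Area B", "Area B", "Area B"],
    "Venue Category": ["Restaurant", "Restaurant", "Park", "Gym", "Gym", "Restaurant"],
})


def category_frequencies(venues: pd.DataFrame) -> pd.DataFrame:
    """Return one row per neighborhood holding the share of each venue category."""
    # normalize="index" divides each row by its total, so every row sums to 1.
    return pd.crosstab(
        venues["Neighborhood"], venues["Venue Category"], normalize="index"
    )


freq = category_frequencies(venues)

# Most common venue category in each neighborhood.
top_category = freq.idxmax(axis=1)
print(freq)
print(top_category)
```

A table like `freq` is the kind of numeric input a clustering step can work with, since every neighborhood is described by the same set of columns.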

Results and discussion

I decided to use the first 100 postal codes of Mérida for this exercise because of their coverage in Foursquare: people in these areas are the most likely to use the application for tips and reviews. A brief description of the expected behaviour of each cluster is written next to it.

The clusters were defined by the most common type of venue:

  • Cluster 0, sport-oriented venues: If these places remain closed as expected, the area should show a moderate economic downturn and a low risk of contagion.
  • Cluster 1, venues for high-income customers: Places commonly known for nightlife. Economic downturn and low expected risk of contagion, as all of these venues are closed.
  • Cluster 2, mostly restaurants and food venues: High economic downturn. Most venues will probably take a hard hit to their operations and cash flow, and the smallest venues in this cluster are expected to shut down. Low risk of contagion.
  • Cluster 3, big-box stores: Places that remain open because they distribute food and basic necessities. The risk of contagion is moderate, as people gather to shop.
  • Cluster 4, parks and recreational venues such as movie theaters and shopping malls: High economic downturn, low risk of contagion.
  • Cluster 5, residential areas with no parks nearby but many convenience stores: Expected economic downturn and moderate contagion risk from people gathering at the convenience stores.
  • Cluster 6, restaurants close to parks or gyms: High economic downturn, low risk of contagion as the gyms remain closed.
  • Cluster 7, outside the city, with stables nearby: Expected economic downturn, low risk of contagion.
  • Cluster 8, highly touristic places, with gyms as the most common venue: Very high economic downturn caused by the lack of international tourism, food venues mostly closed, moderate risk of contagion.
  • Cluster 9, outside the city, with sports clubs as the most common venue: Expected economic downturn, low risk of contagion.
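The grouping step behind a list like the one above can be sketched with scikit-learn's `KMeans`. This is an illustration under stated assumptions, not the notebook's code: the feature matrix `X` is made up, and it uses two clusters to keep the toy example readable, whereas the list above has ten (which would correspond to `n_clusters=10`).

```python
import numpy as np
from sklearn.cluster import KMeans

# Made-up feature matrix: one row per neighborhood, one column per
# venue-category share (for example restaurant, park, gym).
X = np.array([
    [0.9, 0.1, 0.0],
    [0.8, 0.2, 0.0],
    [0.1, 0.0, 0.9],
    [0.0, 0.1, 0.9],
])

# random_state fixes the initialization so repeated runs give the same labels.
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)

# One cluster label per neighborhood, in the same order as the rows of X.
labels = kmeans.labels_
print(labels)
```

k-means only assigns numeric labels. Descriptions such as "big-box stores" or "restaurants close to parks or gyms" come from inspecting the most common venue types within each cluster afterwards.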

    Conclusion

    In conclusion, regardless of the cluster, the most frequent venues are restaurants. This deserves special attention, since this kind of venue appears to take the most economic damage during the pandemic.
    In Mexico, 97% of food-related venues are classified as micro or small companies with 10 or fewer employees (CANIRAC, 2014), which means that the sector is highly affected by situations such as the COVID-19 outbreak.
    As a society, we and the government should take special care of this business sector in an attempt to prevent the loss of jobs created by restaurant entrepreneurs.

    Bibliography

    https://molekule.science/places-to-avoid-flu-virus/
    https://www.babymed.com/health-news/8-public-places-avoid-during-cold-and-flu-season
    https://www.nhs.uk/conditions/coronavirus-covid-19/
    https://www.health.gov.au/news/health-alerts/novel-coronavirus-2019-ncov-health-alert/what-you-need-to-know-about-coronavirus-covid-19
    https://www.who.int/emergencies/diseases/novel-coronavirus-2019/advice-for-public
    https://www.healthline.com/health-news/public-places-and-the-coronavirus-what-to-know#Coronavirus-can-spread-through-contact-with-contaminated-surfaces,-too
    https://www.cdc.gov/coronavirus/2019-ncov/prepare/transmission.html
    https://www.bbc.com/future/article/20200317-covid-19-how-long-does-the-coronavirus-last-on-surfaces
    https://canirac.org.mx/images//files/TODO%20SOBRE%20LA%20MESA%20ESTUDIOS%20DE%20LA%20INDUSTRIA.pdf


    This notebook is the final capstone from week 5 of the Applied Data Science Capstone track of the IBM Professional Certificate, made by Luis Novelo.
