engabaadir/log-analysis-project

Log Analysis Project

This is a project for Udacity's Full Stack Web Developer Nanodegree.

Why this Project?

  1. To enhance students' SQL database skills.
  2. To get practice interacting with a live database both from the command line and from your code.
  3. To explore a large database with over a million rows.
  4. To build and refine complex queries and use them to draw business conclusions from data.

Questions to Answer:

  1. What are the three most popular articles of all time? Which articles have been accessed the most? Present this information as a sorted list with the most popular article at the top.
  2. Who are the most popular article authors of all time? That is, when you sum up all of the articles each author has written, which authors get the most page views? Present this as a sorted list with the most popular author at the top.
  3. On which days did more than 1% of requests lead to errors?
    The log table includes a column status that indicates the HTTP status code that the news site sent to the user's browser.
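
The three questions can be sketched as SQL queries. The example below is a minimal sketch under stated assumptions, not the project's actual code. It uses Python's built-in sqlite3 module as a stand-in for PostgreSQL so that it runs without the VM. The table layout (authors, articles, and log with path, status, and time columns), the '/article/' path prefix, the status strings such as '200 OK', and the demo rows are all assumptions; this README only confirms that the log table has a status column.

```python
import sqlite3

# Hypothetical schema: every name here except log.status is an assumption.
SCHEMA = """
CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE articles (id INTEGER PRIMARY KEY, author INTEGER, title TEXT, slug TEXT);
CREATE TABLE log (id INTEGER PRIMARY KEY, path TEXT, status TEXT, time TEXT);
"""

# Question 1: count successful requests per article, keep the top three.
TOP_ARTICLES = """
SELECT articles.title, COUNT(*) AS views
FROM articles JOIN log ON log.path = '/article/' || articles.slug
WHERE log.status LIKE '200%'
GROUP BY articles.title
ORDER BY views DESC
LIMIT 3;
"""

# Question 2: same join, but summed per author instead of per article.
TOP_AUTHORS = """
SELECT authors.name, COUNT(*) AS views
FROM authors
JOIN articles ON articles.author = authors.id
JOIN log ON log.path = '/article/' || articles.slug
WHERE log.status LIKE '200%'
GROUP BY authors.name
ORDER BY views DESC;
"""

# Question 3: per-day error percentage, keeping days above 1%.
ERROR_DAYS = """
SELECT day, 100.0 * errors / total AS pct
FROM (
    SELECT date(time) AS day,
           COUNT(*) AS total,
           SUM(CASE WHEN status LIKE '4%' OR status LIKE '5%' THEN 1 ELSE 0 END) AS errors
    FROM log
    GROUP BY date(time)
) AS daily
WHERE 100.0 * errors / total > 1.0
ORDER BY day;
"""


def build_demo_db():
    """Create an in-memory database with a few made-up rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany("INSERT INTO authors (id, name) VALUES (?, ?)",
                     [(1, "Ann"), (2, "Bob")])
    conn.executemany(
        "INSERT INTO articles (id, author, title, slug) VALUES (?, ?, ?, ?)",
        [(1, 1, "Alpha", "alpha"), (2, 2, "Beta", "beta"), (3, 1, "Gamma", "gamma")])
    conn.executemany(
        "INSERT INTO log (path, status, time) VALUES (?, ?, ?)",
        [("/article/alpha", "200 OK", "2016-07-01 10:00:00"),
         ("/article/alpha", "200 OK", "2016-07-01 11:00:00"),
         ("/article/beta", "200 OK", "2016-07-01 12:00:00"),
         ("/article/gamma", "200 OK", "2016-07-01 13:00:00"),
         ("/article/alpha", "200 OK", "2016-07-02 10:00:00"),
         ("/article/beta", "200 OK", "2016-07-02 11:00:00"),
         ("/article/missing", "404 NOT FOUND", "2016-07-02 12:00:00")])
    return conn


conn = build_demo_db()
top_articles = conn.execute(TOP_ARTICLES).fetchall()
top_authors = conn.execute(TOP_AUTHORS).fetchall()
error_days = conn.execute(ERROR_DAYS).fetchall()
print(top_articles)
print(top_authors)
print(error_days)
```

Against the real news database the same query strings would be sent through psycopg2 rather than sqlite3, and they may need small adjustments for PostgreSQL and for the actual column names.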

Core Tools Used for this Project:

  1. PostgreSQL database
  2. Python 3.7.3
  3. psycopg2

Project Requirements:

This project runs in a virtual machine managed by Vagrant. To set it up, follow the steps below.

Installing the Prerequisites:

  1. Install Vagrant
  2. Install VirtualBox
  3. Download the Vagrant setup files from Udacity's GitHub. These files will configure the virtual machine and install all the tools needed to run this program.
  4. Download the database file: sql data
  5. Unzip the data folder to get the newsdata.sql file.
  6. Move the newsdata.sql file into the vagrant directory
  7. Download the project: log analysis project
  8. Unzip it and copy all the files into a folder named log_analysis_project inside the vagrant directory.

Starting the Virtual Machine:

  1. Open a terminal and navigate to the project folders set up above.
  2. cd into the vagrant directory
  3. Run vagrant up to build the VM for the first time.
  4. Once it is built, run vagrant ssh to connect.
  5. cd into the correct project directory: cd /vagrant/log_analysis_project

Importing the data into the database:

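  1. Import the data using the following command: psql -d news -f /vagrant/newsdata.sql (the newsdata.sql file was placed in the vagrant directory, not in log_analysis_project, so the full path is given here).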
  2. Running this command will connect to your installed database server and execute the SQL commands in the downloaded file, creating tables and populating them with data.

Run the project:

  1. If you aren't in the log_analysis_project directory, cd into it: cd /vagrant/log_analysis_project
  2. Run python log_analysis.py
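
For orientation, a script of this kind usually opens a connection, runs each query, and fetches the rows. The sketch below shows one possible shape for that step and is not the contents of log_analysis.py. The helper names run_query and connect_news are hypothetical, and connecting with only dbname="news" assumes the VM's default local PostgreSQL setup. The helper accepts any DB-API 2.0 connection, so the demo uses sqlite3 as a stand-in to stay runnable without the VM.

```python
import sqlite3


def run_query(conn, sql):
    """Run one SELECT on a DB-API 2.0 connection and return all rows."""
    cursor = conn.cursor()
    try:
        cursor.execute(sql)
        return cursor.fetchall()
    finally:
        cursor.close()


def connect_news():
    """Open the 'news' database (hypothetical helper; needs psycopg2 in the VM)."""
    import psycopg2  # imported lazily so the demo below runs without it
    return psycopg2.connect(dbname="news")


# Stand-in demo: an in-memory SQLite table instead of the real news database.
demo = sqlite3.connect(":memory:")
demo.execute("CREATE TABLE t (n INTEGER)")
demo.executemany("INSERT INTO t VALUES (?)", [(1,), (2,)])
demo_rows = run_query(demo, "SELECT n FROM t ORDER BY n")
print(demo_rows)
```

Inside the VM, the stand-in connection would be replaced by connect_news(), and the connection should be closed once all queries have run.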

Expected Output:

[=========PROCESSING OUTPUT===========]


MOST POPULAR THREE ARTICLES OF ALL TIME:
[1] "Candidate is jerk, alleges rival" — 338647 views
[2] "Bears love berries, alleges bear" — 253801 views
[3] "Bad things gone, say good people" — 170098 views

MOST POPULAR ARTICLE AUTHORS OF ALL TIME:
[1] Ursula La Multa — 507594 views
[2] Rudolf von Treppenwitz — 423457 views
[3] Anonymous Contributor — 170098 views
[4] Markoff Chaney — 84557 views

DAYS WITH MORE THAN 1% OF ERRORS:
July 17, 2016 — 2.2% errors
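
The report layout above can be reproduced with a couple of small formatting helpers. This is a sketch only: the function names format_ranked and format_error_day are hypothetical, and writing the day of the month without a leading zero is an assumption, since the single sample date does not settle it.

```python
from datetime import date

DASH = "\u2014"  # the long dash that separates name and value in the output


def format_ranked(rows, quote=False):
    """Turn (name, views) rows into numbered lines; quote=True wraps titles."""
    lines = []
    for rank, (name, views) in enumerate(rows, start=1):
        label = f'"{name}"' if quote else name
        lines.append(f"[{rank}] {label} {DASH} {views} views")
    return lines


def format_error_day(day, percent):
    """Format one (date, error percentage) pair, with one decimal place."""
    # The day number is taken from day.day so it carries no leading zero.
    return f"{day.strftime('%B')} {day.day}, {day.year} {DASH} {percent:.1f}% errors"


article_lines = format_ranked([("Candidate is jerk, alleges rival", 338647)], quote=True)
author_lines = format_ranked([("Ursula La Multa", 507594)])
error_line = format_error_day(date(2016, 7, 17), 2.2)
print("\n".join(article_lines + author_lines + [error_line]))
```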
