Description
# Import necessary libraries
import pandas as pd
import matplotlib.pyplot as plt
# Step 1: Load the dataset
# Replace 'your_dataset.csv' with the path to your actual dataset file
df = pd.read_csv('your_dataset.csv')
# Step 2: Data exploration
# Display the first few rows of the dataset
print("First 5 rows of the dataset:")
print(df.head())
# Display basic statistics of the dataset
print("\nBasic statistics of the dataset:")
print(df.describe())
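
The exploration step can be extended with column types and missing-value counts, since `describe()` summarizes only numeric columns by default when the data is mixed. The following is a minimal sketch, not part of the original script; the `sample` DataFrame and its values are hypothetical stand-ins for the dataset loaded in Step 1.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the dataset loaded in Step 1.
sample = pd.DataFrame({
    "numerical_column": [1.5, 2.0, np.nan, 4.5],
    "categorical_column": ["a", "b", "a", None],
})

# Column types show which columns describe() will cover by default.
print(sample.dtypes)

# Number of missing values in each column.
missing_counts = sample.isna().sum()
print(missing_counts)

# include="all" adds non-numeric columns (count, unique, top, freq) to the summary.
full_summary = sample.describe(include="all")
print(full_summary)
```

Checking missing values at this point helps explain odd results later, for example a histogram built from fewer rows than expected.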
# Step 3: Basic data analysis
# Example analysis: count the number of unique values in a specific column
unique_values = df['column_name'].nunique()
print(f"\nNumber of unique values in 'column_name': {unique_values}")
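
One detail worth knowing about the unique-value count: `nunique()` ignores missing values unless `dropna=False` is passed. The sketch below illustrates the difference on a small hypothetical column (the `sample` data is an assumption made for illustration) and uses `value_counts()` to show how often each value occurs.

```python
import pandas as pd

# Hypothetical column containing one missing value.
sample = pd.DataFrame({"column_name": ["x", "y", "x", None, "z", "x"]})

# Default behaviour: missing values are not counted as a distinct value.
unique_excluding_missing = sample["column_name"].nunique()

# With dropna=False the missing value counts as one additional distinct value.
unique_including_missing = sample["column_name"].nunique(dropna=False)

# Frequency of each value, including the missing one.
value_frequencies = sample["column_name"].value_counts(dropna=False)

print(unique_excluding_missing, unique_including_missing)
print(value_frequencies)
```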
# Step 4: Visualizations
# Example: plot a histogram of a numerical column
plt.figure(figsize=(10, 6))
# Drop missing values first; NaN entries can make the histogram binning fail
plt.hist(df['numerical_column'].dropna(), bins=30, edgecolor='k', alpha=0.7)
plt.title('Histogram of Numerical Column')
plt.xlabel('Value')
plt.ylabel('Frequency')
plt.show()
# Example: plot a bar chart of categorical column counts
category_counts = df['categorical_column'].value_counts()
plt.figure(figsize=(10, 6))
category_counts.plot(kind='bar', color='skyblue')
plt.title('Bar Chart of Categorical Column')
plt.xlabel('Category')
plt.ylabel('Count')
plt.show()
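
When the script runs somewhere without a display (a server or a CI job, for instance), `plt.show()` has nothing to open. One possible alternative, sketched here under the assumption that a PNG file is an acceptable output, is to select the non-interactive `Agg` backend and save the figure. The file name `category_counts.png` and the `sample` data are hypothetical.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # non-interactive backend, selected before any figure is created
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical categorical data standing in for df['categorical_column'].
sample = pd.DataFrame({"categorical_column": ["a", "b", "a", "c", "a", "b"]})
category_counts = sample["categorical_column"].value_counts()

# Draw on an explicit Axes object so the figure can be saved and closed cleanly.
fig, ax = plt.subplots(figsize=(10, 6))
category_counts.plot(kind="bar", color="skyblue", ax=ax)
ax.set_title("Bar Chart of Categorical Column")
ax.set_xlabel("Category")
ax.set_ylabel("Count")
fig.tight_layout()

# Write the chart to a temporary directory instead of showing it.
output_path = os.path.join(tempfile.gettempdir(), "category_counts.png")
fig.savefig(output_path, dpi=150)
plt.close(fig)
print(f"Saved chart to {output_path}")
```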
# Step 5: Findings and observations
# Print findings and observations based on your analysis
print("\nFindings and Observations:")
print("1. Example finding: Describe your observation here.")
print("2. Example finding: Describe another observation here.")
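
Instead of typing the findings by hand, they can be derived from the data so the printed text stays in sync with the dataset. This is a minimal sketch under stated assumptions: `summarize_findings` is a hypothetical helper, not part of the original script, and the `sample` DataFrame is made-up data.

```python
import pandas as pd


def summarize_findings(df, numeric_col, categorical_col):
    """Return short, human-readable findings computed from two columns."""
    numeric = df[numeric_col].dropna()
    counts = df[categorical_col].value_counts()
    return [
        f"'{numeric_col}' ranges from {numeric.min()} to {numeric.max()} "
        f"with a mean of {numeric.mean():.2f}.",
        f"The most frequent value in '{categorical_col}' is '{counts.idxmax()}' "
        f"({counts.max()} of {len(df)} rows).",
    ]


# Hypothetical data standing in for the real dataset.
sample = pd.DataFrame({
    "numerical_column": [10, 20, 30, 40],
    "categorical_column": ["a", "b", "a", "a"],
})

findings = summarize_findings(sample, "numerical_column", "categorical_column")

print("\nFindings and Observations:")
for number, finding in enumerate(findings, start=1):
    print(f"{number}. {finding}")
```

The same helper could be called with `df`, `'numerical_column'`, and `'categorical_column'` once those placeholders are replaced with real column names.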