
A collection of Python foundations covering core concepts, syntax, and hands-on examples to build a strong programming base.


gsaini/python-foundations


🐍 Python Foundations

Python Pandas NumPy Matplotlib

The objective is to build a holistic view of how business problems are solved through analytics, and to lay the foundations of the skills required to work with data, using Python.



🎯 Overview

This repository contains hands-on case studies and projects designed to build a strong foundation in Python for data analysis. Each case study focuses on real-world business scenarios, helping you develop practical skills in:

  • Data manipulation and cleaning
  • Exploratory Data Analysis (EDA)
  • Statistical analysis
  • Data visualization
  • Deriving actionable business insights

📚 Case Studies

1. 🍽️ Tips Case Study

Location: case-studies/tips/

Analyze tipping patterns at Chef's Kitchen restaurant in San Diego to understand customer behavior and identify trends in revenue and tips across different demographics.

| Aspect | Details |
|---|---|
| Business Domain | Restaurant / Hospitality |
| Dataset Size | 244 records, 8 features |
| Key Variables | total_bill, tip, day, time, size, smoker, sex |
| Analysis Focus | Tipping behavior, customer demographics, time-based patterns |

Key Questions Answered:

  • What is the relationship between bill amount and tip?
  • How do tips vary by day of the week and time of day?
  • Does gender or smoking status affect tipping behavior?
  • How does group size impact tipping patterns?
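As a rough illustration of how the first two questions might be approached, the sketch below computes a tip percentage, averages it per day, and measures the correlation between bill and tip. The column names come from the Key Variables above; the four rows are invented for the example and are not taken from tips.csv, and `tip_pct` is a hypothetical helper column.

```python
import pandas as pd

# Invented sample rows for illustration only; the real data is in
# case-studies/tips/tips.csv and would be loaded with pd.read_csv.
tips = pd.DataFrame({
    "total_bill": [20.0, 10.0, 40.0, 30.0],
    "tip": [3.0, 2.0, 8.0, 6.0],
    "day": ["Sat", "Sat", "Sun", "Sun"],
    "size": [2, 2, 4, 3],
})

# Hypothetical helper column: tip as a percentage of the bill.
tips["tip_pct"] = tips["tip"] / tips["total_bill"] * 100

# Average tip percentage for each day of the week.
tip_pct_by_day = tips.groupby("day")["tip_pct"].mean()

# Pearson correlation between bill amount and tip.
bill_tip_corr = tips["total_bill"].corr(tips["tip"])

print(tip_pct_by_day)
print(round(bill_tip_corr, 3))
```

The same `groupby` pattern could be applied to `time`, `sex`, `smoker`, or `size` to explore the remaining questions.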

📖 View Full Documentation


2. πŸ” FoodHub Case Study

Location: case-studies/food-hub/

Analyze order data from FoodHub, a food aggregator company in New York, to understand restaurant demand and enhance customer experience.

| Aspect | Details |
|---|---|
| Business Domain | Food Delivery / E-commerce |
| Dataset Size | Large-scale order data |
| Key Variables | order_id, restaurant_name, cuisine_type, cost, rating, delivery_time |
| Analysis Focus | Restaurant performance, delivery efficiency, customer satisfaction |

Key Questions Answered:

  • Which restaurants and cuisine types are most popular?
  • How do order costs vary across different restaurants?
  • What factors affect delivery time and customer ratings?
  • Are there patterns in weekday vs weekend orders?
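A minimal sketch of how popularity and cost questions like these could be answered with pandas is shown below. It assumes the column names listed under Key Variables; the restaurant names and all values are placeholders, not rows from foodhub_order.csv.

```python
import pandas as pd

# Placeholder orders for illustration; the real data is in
# case-studies/food-hub/foodhub_order.csv.
orders = pd.DataFrame({
    "order_id": [1, 2, 3, 4, 5, 6],
    "restaurant_name": ["Restaurant A", "Restaurant A", "Restaurant B",
                        "Restaurant C", "Restaurant B", "Restaurant A"],
    "cuisine_type": ["American", "American", "Japanese",
                     "Italian", "Japanese", "American"],
    "cost": [12.0, 30.0, 18.0, 25.0, 10.0, 15.0],
    "delivery_time": [20, 25, 30, 22, 28, 24],
})

# Popularity: number of orders per cuisine type, most frequent first.
orders_per_cuisine = orders["cuisine_type"].value_counts()

# Cost: average order cost per restaurant, highest first.
mean_cost_by_restaurant = (
    orders.groupby("restaurant_name")["cost"].mean().sort_values(ascending=False)
)

# Delivery efficiency: average delivery time per cuisine type.
mean_delivery_by_cuisine = orders.groupby("cuisine_type")["delivery_time"].mean()

print(orders_per_cuisine)
print(mean_cost_by_restaurant)
print(mean_delivery_by_cuisine)
```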

📖 View Full Documentation


3. 🍯 Honey Production Case Study

Location: case-studies/honey-production/

Explore the decline of honey production in the United States from 1998 to 2016, investigating the impact of Colony Collapse Disorder (CCD) and analyzing trends in production, pricing, and state-level performance.

| Aspect | Details |
|---|---|
| Business Domain | Agriculture / Environmental Science |
| Dataset Size | 786 records, 8 features (19 years: 1998-2016) |
| Key Variables | state, numcol, yieldpercol, totalprod, stocks, priceperlb, prodvalue, year |
| Analysis Focus | Production trends, colony decline, pricing dynamics, state-level patterns |

Key Questions Answered:

  • How has honey production yield changed from 1998 to 2016?
  • What are the major production trends across states over time?
  • Are there patterns between total honey production and value of production each year?
  • Which states are the largest honey producers and which produce the most expensive honey?

Key Findings:

  • 📉 Overall honey production in the US has been decreasing over the years
  • 🐝 Decline attributed to both decreasing colonies AND decreasing yield per colony
  • 🏆 Top producers: North Dakota, California, South Dakota, Florida, Montana
  • 💰 Virginia produces the costliest honey; Oklahoma produces the cheapest
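Findings of this kind typically come from year-level and state-level aggregations. The sketch below shows one possible way to compute them, assuming the column names from Key Variables. The states "X" and "Y" and every number are made up for the example, so the printed results say nothing about the real dataset.

```python
import pandas as pd

# Made-up rows for two placeholder states; the real data is in
# case-studies/honey-production/honeyproduction1998-2016.csv.
honey = pd.DataFrame({
    "state": ["X", "Y", "X", "Y"],
    "year": [1998, 1998, 2016, 2016],
    "numcol": [100, 50, 80, 40],
    "yieldpercol": [70, 60, 50, 50],
    "totalprod": [7000, 3000, 4000, 2000],
})

# Year-level totals: production, colony count, and average yield per colony.
yearly = honey.groupby("year").agg(
    total_production=("totalprod", "sum"),
    total_colonies=("numcol", "sum"),
    mean_yield=("yieldpercol", "mean"),
)

# Percentage change in total production between the first and last year.
first, last = yearly["total_production"].iloc[0], yearly["total_production"].iloc[-1]
pct_change = (last / first - 1) * 100

# States ranked by total production across all years.
top_states = honey.groupby("state")["totalprod"].sum().sort_values(ascending=False)

print(yearly)
print(round(pct_change, 1))
print(top_states)
```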

📖 View Full Documentation


4. 📱 Google Play Store Case Study (Zoom Ads)

Location: case-studies/google-play-store/

Analyze Google Play Store data for Zoom Ads, an advertising agency looking to identify trending Android applications for targeted advertisement promotion to maximize profit.

| Aspect | Details |
|---|---|
| Business Domain | Digital Advertising / Mobile App Market |
| Dataset Size | App store data with 12 features |
| Key Variables | App, Category, Rating, Reviews, Size, Installs, Price, Content Rating, Ad Supported |
| Analysis Focus | App trends, market analysis, advertising opportunities, user engagement patterns |

Context:

Android is Google's mobile operating system and holds about 69% of the worldwide market share. The Google Play Store is the Android app store used to install Android apps. Zoom Ads wants to understand app trends so that it can focus its advertising on trending applications that are likely to yield the most profit.

Key Questions Answered:

  • Which app categories are most popular on the Google Play Store?
  • What is the relationship between app ratings and number of installs?
  • How do free vs paid apps compare in terms of user engagement?
  • Which apps support advertisements and have high user engagement?
  • What content ratings attract the most users?

Analysis Guidelines:

  • 📊 Univariate analysis to understand individual variable distributions
  • 🔗 Bivariate analysis to explore correlations between variables
  • 📈 Visualizations to extract actionable insights for advertising strategy

Data Features:

| Feature | Description |
|---|---|
| App | Application name |
| Category | Category the app belongs to |
| Rating | Overall user rating of the app |
| Reviews | Number of user reviews for the app |
| Size | Size of the app in kilobytes |
| Installs | Number of user downloads/installs for the app |
| Price | Price of an app in dollars |
| Paid/Free | Whether an app is paid or free (Yes/No) |
| Content Rating | Age group the app is targeted at |
| Ad Supported | Whether an app supports an ad or not (Yes/No) |
| In App Purchases | Whether the app contains an in-app purchase feature (Yes/No) |
| Editors Choice | Whether rated as Editor's Choice (Yes/No) |
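Using the features described above, a hedged sketch of one univariate and one bivariate step might look like the following. It assumes that `Installs` is stored as a number and that `Ad Supported` uses the Yes/No encoding from the table. The app names, category labels, and values are placeholders rather than rows from google_play_store.csv.

```python
import pandas as pd

# Placeholder apps for illustration; the real data is in
# case-studies/google-play-store/google_play_store.csv.
apps = pd.DataFrame({
    "App": ["App 1", "App 2", "App 3", "App 4"],
    "Category": ["GAME", "GAME", "TOOLS", "TOOLS"],
    "Rating": [4.5, 3.9, 4.2, 4.8],
    "Installs": [1_000_000, 50_000, 500_000, 10_000],
    "Ad Supported": ["Yes", "No", "Yes", "Yes"],
})

# Ad-supported apps ranked by installs: candidate targets for advertising.
ad_supported = apps[apps["Ad Supported"] == "Yes"]
top_ad_apps = ad_supported.sort_values("Installs", ascending=False).head(2)

# Univariate view: total installs per category, largest first.
installs_by_category = (
    apps.groupby("Category")["Installs"].sum().sort_values(ascending=False)
)

# Bivariate view: correlation between rating and installs.
rating_installs_corr = apps["Rating"].corr(apps["Installs"])

print(top_ad_apps[["App", "Installs"]])
print(installs_by_category)
print(round(rating_installs_corr, 3))
```

If `Installs` turns out to be stored as text in the actual file, it would need to be converted to a numeric type before sorting or summing.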

📖 View Full Documentation


5. 🚗 Austo Automobile Case Study

Location: case-studies/austo/

Analyze customer data for Austo, a UK-based automobile company looking to expand into the US market by understanding buyer profiles and car purchase behavior.

| Aspect | Details |
|---|---|
| Business Domain | Automobile / Market Research |
| Dataset Size | Customer data with 14 features |
| Key Variables | Age, Gender, Profession, Salary, Total_salary, Price, Make, Personal_loan, etc. |
| Analysis Focus | Customer profiling, purchase behavior, market segmentation, demographic analysis |

Context:

In the 21st century, cars are essential for personal mobility. Research shows more than 76% of people limit their travel when they don't have a car. Austo has successfully established itself in the European market and now aims to understand US customer preferences for three major car types: Hatchback, Sedan, and SUV.

Key Questions Answered:

  • What are the demographics of buyers for each car type?
  • How do income levels (personal and household) influence car purchase decisions?
  • What is the relationship between loan behavior and car pricing?
  • How does profession (Salaried vs Business) affect car preferences?
  • What customer profiles emerge for Hatchback, Sedan, and SUV buyers?

Data Features:

| Feature | Description |
|---|---|
| Age | Age of the customer |
| Gender | Gender of the customer |
| Profession | Salaried or Business person |
| Marital_status | Marital status (Single/Married) |
| Education | Highest education level (Graduate/Post Graduate) |
| No_of_Dependents | Number of dependents |
| Personal_loan | Whether customer availed a personal loan (Yes/No) |
| House_loan | Whether customer availed a house loan (Yes/No) |
| Partner_working | Whether partner is working (Yes/No) |
| Salary | Annual salary of the customer |
| Partner_salary | Annual salary of partner |
| Total_salary | Annual household income |
| Price | Price of the car purchased |
| Make | Car type: Hatchback, Sedan, or SUV |
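Customer profiling per car type can be sketched as a group-wise summary plus a cross-tabulation. The example below assumes the feature names from the table above; the six customers are invented for illustration and do not come from austo_automobile.csv.

```python
import pandas as pd

# Invented customers for illustration; the real data is in
# case-studies/austo/austo_automobile.csv.
customers = pd.DataFrame({
    "Age": [25, 32, 45, 51, 28, 40],
    "Profession": ["Salaried", "Business", "Salaried",
                   "Business", "Salaried", "Salaried"],
    "Total_salary": [60000, 90000, 150000, 170000, 65000, 110000],
    "Price": [20000, 33000, 60000, 65000, 22000, 35000],
    "Make": ["Hatchback", "Sedan", "SUV", "SUV", "Hatchback", "Sedan"],
})

# Average age, household income, and car price for each car type.
profile_by_make = customers.groupby("Make")[["Age", "Total_salary", "Price"]].mean()

# Number of buyers for each combination of profession and car type.
profession_by_make = pd.crosstab(customers["Profession"], customers["Make"])

print(profile_by_make)
print(profession_by_make)
```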

📖 View Full Documentation


πŸ› οΈ Skills Covered

| Skill Category | Topics |
|---|---|
| Python Basics | Data types, loops, conditionals, functions, list comprehensions |
| Data Manipulation | Pandas DataFrames, data cleaning, filtering, grouping, aggregation |
| Data Visualization | Matplotlib, Seaborn (histograms, boxplots, scatter plots, heatmaps) |
| Statistical Analysis | Descriptive statistics, correlation, distribution analysis |
| Business Analytics | Deriving insights, identifying patterns, making recommendations |

💻 Technologies Used

| Technology | Purpose |
|---|---|
| Python 3.x | Core programming language |
| Pandas | Data manipulation and analysis |
| NumPy | Numerical computations |
| Matplotlib | Basic plotting and visualization |
| Seaborn | Statistical data visualization |
| Jupyter Notebook | Interactive development environment |

πŸ“ Project Structure

python-foundations/
├── README.md                          # This file
└── case-studies/
    ├── tips/
    │   ├── README.md                  # Tips case study documentation
    │   ├── Tips_Case_Study.ipynb      # Jupyter notebook with analysis
    │   └── tips.csv                   # Dataset
    ├── food-hub/
    │   ├── README.md                  # FoodHub case study documentation
    │   ├── foodhub.ipynb              # Jupyter notebook with analysis
    │   └── foodhub_order.csv          # Dataset
    ├── honey-production/
    │   ├── README.md                  # Honey production case study documentation
    │   ├── Session_Notebook_Honey_Production_Case_Study.ipynb  # Jupyter notebook
    │   └── honeyproduction1998-2016.csv  # Dataset (1998-2016)
    ├── google-play-store/
    │   ├── README.md                  # Google Play Store case study documentation
    │   ├── Google_Play_Store_Case_Study.ipynb  # Jupyter notebook with analysis
    │   └── google_play_store.csv      # Google Play Store dataset
    └── austo/
        ├── README.md                  # Austo case study documentation
        ├── austo_project.ipynb        # Jupyter notebook with analysis
        └── austo_automobile.csv       # Customer and car purchase dataset

🚀 Getting Started

Prerequisites

Make sure you have Python 3.x installed along with the following libraries:

pip install pandas numpy matplotlib seaborn jupyter

Running the Case Studies

  1. Clone the repository:

    git clone <repository-url>
    cd python-foundations
  2. Launch Jupyter Notebook:

    jupyter notebook
  3. Navigate to a case study and open the .ipynb file

  4. Run all cells to reproduce the analysis
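Before running the notebooks, it can help to confirm that the analysis libraries are importable in the active environment. The snippet below is a small optional check using only the standard library; `missing_packages` and `REQUIRED` are hypothetical names introduced for this example, and Jupyter is left out because it is launched from the command line rather than imported by the notebooks.

```python
from importlib.util import find_spec

# Analysis libraries named in the Prerequisites section.
REQUIRED = ["pandas", "numpy", "matplotlib", "seaborn"]


def missing_packages(names):
    """Return the names that cannot be located as importable top-level modules."""
    return [name for name in names if find_spec(name) is None]


missing = missing_packages(REQUIRED)
if missing:
    print("Missing libraries:", ", ".join(missing))
else:
    print("All required libraries are installed.")
```

Any library reported as missing can be installed with the `pip install` command shown under Prerequisites.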


📖 Reference Material

Here is a curated list of resources to deepen your Python data analysis skills:

Data Visualization

Exploratory Data Analysis

Data Cleaning & Preprocessing

Python Programming

Libraries & Tools

Text Processing


πŸ“ License

This project is for educational purposes.


🤝 Contributing

Contributions are welcome! Feel free to:

  • Add new case studies
  • Improve existing documentation
  • Fix bugs or enhance code quality
  • Add additional reference materials
