🛒 Zepto Products Analysis using Python

This project performs an inventory data analysis on product listings from Zepto using Python. The analysis includes data cleaning, exploratory data analysis (EDA), and visualizations to uncover insights related to product pricing, discounts, and stock availability.

📁 Dataset Overview

The dataset contains detailed product-level data, including:

Category
Product name
MRP and Discounted Price (in paise)
Discount Percent
Available Quantity
Weight (in grams)
Stock Status (In Stock or Out of Stock)
Quantity per pack

🧹 Data Cleaning (using pandas)

Key steps in data cleaning:

Checked shape, info, and null values using .shape, .info(), .isnull().sum()
Removed rows with MRP = 0 and weight = 0
Converted the price columns (mrp, discountedSellingPrice) from paise to rupees by dividing by 100
Identified and removed duplicate categories like:
- Personal Care = Paan Corner
- Cooking Essentials = Munchies
- Ice Cream & Desserts = Chocolates & Candies
- Dairy, Bread & Butter = Beverages
Dropped exact duplicates using .drop_duplicates()
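
The row-level steps above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: `weightInGms` is an assumed column name (only `mrp` and `discountedSellingPrice` are named in this README), the sample rows are hypothetical, and "MRP = 0 and weight = 0" is read here as dropping a row when either value is zero. The duplicate-category step is not covered.

```python
import pandas as pd

def clean_products(df: pd.DataFrame) -> pd.DataFrame:
    """Apply the row-level cleaning steps to a raw product table."""
    # Keep only rows with a non-zero MRP and a non-zero weight
    # (assumption: a zero in either column marks a bad record).
    cleaned = df[(df["mrp"] != 0) & (df["weightInGms"] != 0)].copy()
    # Prices are stored in paise; divide by 100 to get rupees.
    for col in ("mrp", "discountedSellingPrice"):
        cleaned[col] = cleaned[col] / 100
    # Remove rows that are exact copies of an earlier row.
    return cleaned.drop_duplicates().reset_index(drop=True)

# Hypothetical sample rows, for illustration only.
raw = pd.DataFrame({
    "name": ["A", "A", "B", "C"],
    "mrp": [5000, 5000, 0, 2000],
    "discountedSellingPrice": [4000, 4000, 0, 1500],
    "weightInGms": [100, 100, 50, 0],
})

print(raw.shape)           # rows and columns before cleaning
print(raw.isnull().sum())  # null count per column
cleaned = clean_products(raw)
print(cleaned)
```

Returning a new DataFrame instead of modifying the input keeps the raw data available for comparison after cleaning.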

📊 Exploratory Data Analysis (using matplotlib)

1️⃣ Top 10 products with highest discounts

Bar chart showing product names vs. discount %
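
One way to produce such a chart is sketched below. The column names `name` and `discountPercent` are assumptions, and the sample rows are made up for illustration.

```python
import matplotlib.pyplot as plt
import pandas as pd

def top_discounts(df: pd.DataFrame, n: int = 10) -> pd.DataFrame:
    """Return the n rows with the largest discount percentage."""
    return df.nlargest(n, "discountPercent")[["name", "discountPercent"]]

def plot_top_discounts(top: pd.DataFrame):
    """Draw a horizontal bar chart of product name vs. discount %."""
    fig, ax = plt.subplots(figsize=(8, 5))
    ax.barh(top["name"], top["discountPercent"])
    ax.invert_yaxis()  # put the largest discount at the top
    ax.set_xlabel("Discount (%)")
    ax.set_title("Top products by discount")
    fig.tight_layout()
    return ax

# Hypothetical sample rows, for illustration only.
sample = pd.DataFrame({
    "name": ["Wafers", "Liquid Masala", "Saffron", "Face Cream"],
    "discountPercent": [50, 45, 5, 10],
})
top = top_discounts(sample, n=3)
ax = plot_top_discounts(top)
```

Horizontal bars are used here because long product names are easier to read on the y-axis.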

2️⃣ Stock status analysis

Pie chart showing count of products that are in stock vs out of stock
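
A possible sketch of this step, assuming the stock status is stored as a boolean column named `outOfStock` (the name and the sample values are assumptions):

```python
import matplotlib.pyplot as plt
import pandas as pd

def stock_counts(df: pd.DataFrame) -> pd.Series:
    """Count products per stock status, with readable labels."""
    labels = df["outOfStock"].map({False: "In Stock", True: "Out of Stock"})
    return labels.value_counts()

def plot_stock_status(counts: pd.Series):
    """Draw a pie chart of in-stock vs. out-of-stock product counts."""
    fig, ax = plt.subplots()
    ax.pie(counts.values, labels=list(counts.index), autopct="%1.1f%%")
    ax.set_title("Stock status")
    return ax

# Hypothetical sample rows, for illustration only.
sample = pd.DataFrame({"outOfStock": [False, False, False, True]})
counts = stock_counts(sample)
ax = plot_stock_status(counts)
```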

3️⃣ Products with high MRP but low weight

Bar chart identifying luxury/small items (like cosmetics)
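
"High MRP but low weight" can be made concrete as price per gram. The sketch below assumes columns named `mrp` (in rupees) and `weightInGms`, and assumes zero-weight rows have already been removed; the sample values are hypothetical.

```python
import pandas as pd

def price_per_gram(df: pd.DataFrame, n: int = 10) -> pd.DataFrame:
    """Return the n products with the highest MRP per gram."""
    priced = df.assign(pricePerGram=df["mrp"] / df["weightInGms"])
    return priced.nlargest(n, "pricePerGram")

# Hypothetical sample rows, for illustration only.
sample = pd.DataFrame({
    "name": ["Saffron", "Rice", "Face Serum"],
    "mrp": [500.0, 100.0, 600.0],
    "weightInGms": [1, 1000, 30],
})
ranked = price_per_gram(sample, n=2)
print(ranked[["name", "pricePerGram"]])
```

The result can be charted with `ranked.plot.bar(x="name", y="pricePerGram")`.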

4️⃣ Cheapest products after applying discount

Bar chart of final price (in ₹)

5️⃣ Relationship between Discount % and Final Price

Line chart showing trend between discount % and average final price
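
A minimal sketch of how this trend might be computed: group products by discount percentage, then average the final price in each group. The column name `discountPercent` and the sample values are assumptions.

```python
import matplotlib.pyplot as plt
import pandas as pd

def avg_price_by_discount(df: pd.DataFrame) -> pd.Series:
    """Average final price (in rupees) for each discount percentage."""
    # groupby sorts the group keys, so the index is in ascending order.
    return df.groupby("discountPercent")["discountedSellingPrice"].mean()

# Hypothetical sample rows, for illustration only.
sample = pd.DataFrame({
    "discountPercent": [10, 10, 50, 50],
    "discountedSellingPrice": [400.0, 500.0, 10.0, 20.0],
})
trend = avg_price_by_discount(sample)

fig, ax = plt.subplots()
ax.plot(trend.index, trend.values, marker="o")
ax.set_xlabel("Discount (%)")
ax.set_ylabel("Average final price (₹)")
ax.set_title("Discount % vs. average final price")
```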

🗝️ Key Insights

The analysis shows that most products are well stocked. Heavy discounts are mostly offered on low-cost everyday ready-to-eat items like wafers and liquid masalas to attract more customers. Luxury or premium products like saffron and skincare items, on the other hand, have a significantly higher price per gram, which points to niche value. The relationship between discount percentage and final price is not linear: a high discount does not always mean a large saving in rupees, because high discounts are often applied to lower-priced products.
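
The last point is simple arithmetic, illustrated below with made-up prices (these are not values from the dataset): a large percentage off a cheap item can save fewer rupees than a small percentage off an expensive one.

```python
def rupee_saving(mrp: float, discount_percent: float) -> float:
    """Absolute saving in rupees for a given MRP and discount %."""
    return mrp * discount_percent / 100

# Hypothetical prices, for illustration only.
cheap_item = rupee_saving(mrp=20, discount_percent=50)     # 50% off a ₹20 item
premium_item = rupee_saving(mrp=500, discount_percent=10)  # 10% off a ₹500 item

print(cheap_item)    # 10.0
print(premium_item)  # 50.0
```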

📌 Tools Used

Python
Pandas – data cleaning and transformation
Matplotlib – data visualization
Jupyter Notebook – code and analysis

Visualization📈

✅ Final Output

This project helped explore:

Discount patterns
Inventory stock status
Product pricing behavior
Data cleaning on real-world messy data

💡 How to Run

Clone this repo
Open the Jupyter Notebook
Install required packages:
```
pip install pandas matplotlib
```
Run all cells to see the full analysis

Sayantanidalui/Zepto-Product-Analysis-Using-Python
