Add files via upload #3

Merged
Ritik574-coder merged 1 commit into main from Ritik574-coder-patch-1
Feb 4, 2026

@Ritik574-coder (Owner)

NYC Yellow Taxi Trip Data Analysis (2025)

Overview

This project analyzes the 2025 Yellow Taxi Trip dataset from New York City.
The raw dataset is not included in this repository due to GitHub file size limitations.


Why the Dataset Is Not Uploaded to GitHub

GitHub enforces strict file size limits:

  • Maximum 25MB per file via web upload
  • Maximum 100MB per file via git push
  • Large repositories are discouraged for performance and cloning efficiency

Each monthly Yellow Taxi Trip dataset file (Parquet format) ranges from approximately 200MB to 800MB.
Therefore, uploading raw data files directly to this repository is not feasible or recommended.
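To check a file against the limits listed above before committing it, a minimal sketch like the following could be used. The function names and the returned labels are hypothetical, and the limits are treated as decimal megabytes for simplicity.

```python
from pathlib import Path

# Limits quoted in this README (treated here as decimal megabytes; an assumption).
WEB_UPLOAD_LIMIT_MB = 25
GIT_PUSH_LIMIT_MB = 100


def classify_size(size_bytes: int) -> str:
    """Return which GitHub upload route, if any, a file of this size fits."""
    size_mb = size_bytes / 1_000_000
    if size_mb <= WEB_UPLOAD_LIMIT_MB:
        return "web upload or git push"
    if size_mb <= GIT_PUSH_LIMIT_MB:
        return "git push only"
    return "too large for GitHub"


def check_file(path: str) -> str:
    """Hypothetical helper: classify a file on disk by its size."""
    return classify_size(Path(path).stat().st_size)
```

For example, calling check_file("yellow_tripdata_2025-01.parquet") on a downloaded file reports which route, if any, its size would allow.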

This repository contains only:

  • Data analysis scripts
  • Transformation logic
  • Aggregations and queries
  • Documentation

Raw datasets are intentionally excluded.


Official Data Source (2025)

The dataset is publicly available from:

New York City Taxi and Limousine Commission (TLC)
Website: https://www.nyc.gov/site/tlc/about/tlc-trip-record-data.page

Cloud-hosted Parquet files:
https://d37ci6vzurychx.cloudfront.net/trip-data/

Example (January 2025 dataset):

yellow_tripdata_2025-01.parquet
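
The file name above suggests a yellow_tripdata_YYYY-MM.parquet pattern. Assuming that pattern holds for the other months, a small sketch for building the file name and its URL might look like this. The helper names are hypothetical; only the base URL and the January 2025 file name come from this README.

```python
# Base URL taken from the "Cloud-hosted Parquet files" entry in this README.
BASE_URL = "https://d37ci6vzurychx.cloudfront.net/trip-data/"


def yellow_trip_filename(year: int, month: int) -> str:
    """Build a monthly file name, assuming the pattern of the 2025-01 example."""
    if not 1 <= month <= 12:
        raise ValueError(f"month must be between 1 and 12, got {month}")
    return f"yellow_tripdata_{year:04d}-{month:02d}.parquet"


def yellow_trip_url(year: int, month: int) -> str:
    """Join the base URL and the monthly file name."""
    return BASE_URL + yellow_trip_filename(year, month)
```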

How to Download the Data

Option 1: Direct Download

Visit the TLC website and download the required monthly Parquet file.

Option 2: Using wget

wget https://d37ci6vzurychx.cloudfront.net/trip-data/yellow_tripdata_2025-01.parquet

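Option 3: Using Python

After downloading the file with Option 1 or 2, load it with pandas. Reading Parquet requires the pyarrow or fastparquet package.

import pandas as pd

# Load the downloaded monthly file into a DataFrame
df = pd.read_parquet("yellow_tripdata_2025-01.parquet")
# Preview the first five rows
print(df.head())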
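
The Python option reads a file that is already on disk. To fetch the file from Python as well, a minimal sketch using only the standard library could look like the following. The function name download_if_missing and the ".part" temporary suffix are hypothetical choices, not part of this repository.

```python
import shutil
import urllib.request
from pathlib import Path


def download_if_missing(url: str, dest: str) -> Path:
    """Download url to dest unless dest already exists; return the path."""
    target = Path(dest)
    if target.exists():
        # Skip the download so repeated runs do not re-fetch a large file.
        return target
    # Write to a temporary name first so an interrupted download
    # never leaves a truncated file under the final name.
    tmp = target.with_name(target.name + ".part")
    with urllib.request.urlopen(url) as response, open(tmp, "wb") as out:
        shutil.copyfileobj(response, out)  # stream in chunks, not all in memory
    tmp.replace(target)
    return target


# Example call (not executed here):
# download_if_missing(
#     "https://d37ci6vzurychx.cloudfront.net/trip-data/yellow_tripdata_2025-01.parquet",
#     "yellow_tripdata_2025-01.parquet",
# )
```

Skipping files that already exist keeps repeated runs cheap, which matters when each monthly file is large.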

Ritik574-coder merged commit 9e14fd6 into main on Feb 4, 2026
1 check failed