FinalGroupProject-Data

Zip-pot-ify is an amazing new music streaming company with listeners all over the country.

We, the management team, want you, the data engineering team, to create a cool new dashboard that takes in historical listening data and shows us cool things. What cool things?

Regional differences, popularity, and other metrics. Show them to us by what? Artist? Song? Genre? Time? What else?

Pipeline for Data Processing of a Music Streaming Company

The project will stream events that are created with EventSim. We can clean, convert, and aggregate the data using data engineering techniques. Cleanup and aggregation can be done with the various technologies you have learned. The processed data are saved in a database (MySQL?).
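As a rough illustration of the clean, convert, and aggregate steps, here is a minimal sketch in plain Python. The field names (`page`, `song`, `artist`, `userId`, `location`, `ts`) and the `"NextSong"` page value are assumptions about what the generated events look like, and `clean_event` and `plays_per_artist` are hypothetical helper names, so check them against the events you actually generate.

```python
from collections import Counter
from datetime import datetime, timezone


def clean_event(raw):
    """Return a normalized play event, or None if the record is unusable.

    Assumes each raw event is a dict with EventSim-style keys; adjust the
    key names if your generated events differ.
    """
    # Keep only song plays; other page views are not listening data.
    if raw.get("page") != "NextSong":
        return None
    song = (raw.get("song") or "").strip()
    artist = (raw.get("artist") or "").strip()
    if not song or not artist or raw.get("ts") is None:
        return None
    # Assumed location shape: "City, REGION". Take the part after the last comma.
    region = (raw.get("location") or "").rsplit(",", 1)[-1].strip() or "unknown"
    return {
        "song": song,
        "artist": artist,
        "user_id": str(raw.get("userId", "")),
        "region": region,
        # Convert an assumed epoch-milliseconds timestamp to a UTC datetime.
        "played_at": datetime.fromtimestamp(raw["ts"] / 1000, tz=timezone.utc),
    }


def plays_per_artist(raw_events):
    """Aggregate cleaned events into a play count per artist."""
    cleaned = (clean_event(event) for event in raw_events)
    return Counter(event["artist"] for event in cleaned if event is not None)
```

The same shape (filter, normalize, then group and count) carries over to whichever tool the team picks for this stage.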

Then make use of this data by consuming it, applying transformations to it, and creating the tables that are needed for our dashboard so that analytics may be generated. We are going to try to conduct an analysis of indicators such as the most played songs, active users, user demographics, regional differences etc.
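One way to picture the dashboard tables is the sketch below. It uses Python's built-in `sqlite3` module purely as a stand-in for the project database (the brief suggests MySQL), and the table names `plays`, `top_songs`, and `active_users_by_region`, the column layout, and the function names are all hypothetical.

```python
import sqlite3


def build_dashboard_tables(conn, plays):
    """Load cleaned plays and derive two summary tables for a dashboard.

    `plays` is an iterable of (user_id, song, artist, region) tuples.
    """
    conn.execute(
        "CREATE TABLE plays (user_id TEXT, song TEXT, artist TEXT, region TEXT)"
    )
    conn.executemany("INSERT INTO plays VALUES (?, ?, ?, ?)", plays)
    # Most played songs: one row per (song, artist) with its play count.
    conn.execute(
        """CREATE TABLE top_songs AS
           SELECT song, artist, COUNT(*) AS play_count
           FROM plays GROUP BY song, artist"""
    )
    # Active users: distinct listeners per region.
    conn.execute(
        """CREATE TABLE active_users_by_region AS
           SELECT region, COUNT(DISTINCT user_id) AS active_users
           FROM plays GROUP BY region"""
    )
    conn.commit()


def top_songs(conn, limit=10):
    """Return the most played songs, ties broken alphabetically by song."""
    query = (
        "SELECT song, artist, play_count FROM top_songs "
        "ORDER BY play_count DESC, song LIMIT ?"
    )
    return conn.execute(query, (limit,)).fetchall()
```

Precomputing summary tables like these keeps dashboard queries cheap; the trade-off is that they must be rebuilt or refreshed whenever new plays arrive.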

You can generate a sample dataset for this project using EventSim and the Million Song Dataset. Apache Kafka and Apache Spark are two examples of streaming technologies used for processing data in (somewhat) real time. The processed data are uploaded to a database, where they are cleaned, converted, and aggregated with your tools so that they are ready for analysis. The data are then sent to a data warehouse, and visualization tools are used to present them. Apache Airflow is used for orchestration, and Docker is the tool of choice for containerization.
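The stage ordering described above can be sketched as a small dependency graph. This is not Airflow code: it uses `graphlib.TopologicalSorter` from the Python standard library (3.9+) only to show the idea of an orchestrator running each stage after the stages it depends on. The stage names are hypothetical.

```python
from graphlib import TopologicalSorter

# Each stage maps to the set of stages that must finish before it starts.
# Stage names are illustrative, not taken from any real DAG definition.
PIPELINE = {
    "generate_events": set(),
    "stream_ingest": {"generate_events"},
    "clean_and_aggregate": {"stream_ingest"},
    "load_database": {"clean_and_aggregate"},
    "transform_for_warehouse": {"load_database"},
    "refresh_dashboard": {"transform_for_warehouse"},
}


def run_order(pipeline):
    """Return the stages in an order that respects every dependency."""
    return list(TopologicalSorter(pipeline).static_order())


if __name__ == "__main__":
    for stage in run_order(PIPELINE):
        print(stage)
```

In a real deployment each stage would be a task in the orchestrator, with the same dependency edges expressed in that tool's own syntax.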

eventsim

ZCW-Spring25/FinalGroupProject-Data
