Classifies the genre of music files (.wav) using ML models trained on the GTZAN dataset: https://www.kaggle.com/datasets/andradaolteanu/gtzan-dataset-music-genre-classification
Repo link: https://github.com/break-through-19/Music-Genre-Classification
-
Download the dataset from Kaggle: https://www.kaggle.com/datasets/andradaolteanu/gtzan-dataset-music-genre-classification?resource=download
-
From the downloaded dataset, move the contents of the genres_original folder (one subfolder of .wav files per genre) into the Music-Genre-Classification/dataset folder.
-
In MATLAB, navigate to Home -> Add-Ons -> Search for "Audio Toolbox" -> Install.
-
To extract mel-spectrogram features from the dataset into a CSV, run the MATLAB script:
scripts/generate_mel_feature_csv.m
-
To normalize the extracted feature CSV and generate 3D scatter plots:
scripts/normalize_and_plot_feature_space.m
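The normalization step can be illustrated with a minimal sketch. It assumes per-column z-scoring with the sample standard deviation; the toy matrix and variable names are made up for illustration, and the repo's script may differ in detail.

```matlab
% Toy feature matrix: rows are segments, columns are features (made-up values).
X = [1 10 100;
     2 20 300;
     3 30 200;
     4 40 400];

mu = mean(X, 1);            % per-column mean
sigma = std(X, 0, 1);       % per-column sample standard deviation (N-1)
sigma(sigma == 0) = 1;      % guard: constant columns become 0 instead of NaN

Z = (X - mu) ./ sigma;      % mean-center, then scale to unit variance

disp(mean(Z, 1));           % approximately 0 for every column
disp(std(Z, 0, 1));         % approximately 1 for every column
```

The subtraction relies on implicit expansion, which needs MATLAB R2016b or later.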
-
Navigate to Home -> Add-Ons -> Search for "Statistics and Machine Learning Toolbox" -> Install.
-
To write PCA transformed features and train/test SVM models on original, PCA, and curated feature variants:
scripts/iteration4_svm.m
-
Navigate to Home -> Add-Ons -> Search for "Deep Learning Toolbox" -> Install.
-
To train/test CRNN models on original, PCA, and curated feature variants:
scripts/crnn.m
The MATLAB pipeline is modular and organized as:
- scripts/generate_mel_feature_csv.m: main entry point.
- src/pipeline/processDatasetToMelFeatureTable.m: dataset traversal + orchestration.
- src/io/listGenreAudioFiles.m: class/file discovery from folder structure.
- src/audio/splitAudioIntoFixedSegments.m: 30s audio -> 3s segments.
- src/features/extractMelSpectrogramFeatureVector.m: extract 16 important features (4 per category).
- src/features/extractHarmonicFeatureVector.m: compute harmonic feature block.
- src/features/getImportantMelFeatureNames.m: fixed names for the 16 features.
- src/io/buildFeatureRowTable.m: metadata + feature row construction.
- src/io/alignColumnsToExampleCsv.m: optional schema alignment with example CSV.
- src/preprocessing/readFeatureDataset.m: read and validate the feature CSV.
- src/preprocessing/meanCenterAndNormalizeFeatureTable.m: mean-center and z-score normalize feature columns.
- src/visualization/getFeatureCategoryDefinitions.m: define the four feature categories.
- src/visualization/generateCategory3DScatterPlots.m: write category-wise 3D scatter plots.
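As a rough illustration of the 30s -> 3s segmentation step, the sketch below splits a synthetic mono signal into fixed-length, non-overlapping segments. The 22050 Hz sample rate, the variable names, and the choice to drop a trailing partial segment are assumptions, not necessarily what splitAudioIntoFixedSegments.m does.

```matlab
fs = 22050;                         % assumed sample rate in Hz
clipSeconds = 30;
segmentSeconds = 3;

% Synthetic 30 s mono signal standing in for a WAV file read from disk.
t = (0:clipSeconds * fs - 1)' / fs;
x = sin(2 * pi * 440 * t);

samplesPerSegment = segmentSeconds * fs;
numSegments = floor(numel(x) / samplesPerSegment);   % drop any trailing partial segment

% Each column of the result is one segment.
segments = reshape(x(1:numSegments * samplesPerSegment), samplesPerSegment, numSegments);

fprintf('%d segments of %d samples each\n', numSegments, samplesPerSegment);
```

Column k of `segments` would then correspond to segment_index k for that file.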
Output CSV:
data/features/mel_spectrogram_features.csv
data/features/mel_spectrogram_features_normalized.csv
Generated plot folder:
data/features/plots_3d_scatter
CRNN output folder:
data/model_outputs/crnn
Included metadata columns:
- class_name (from folder name)
- wav_file_name (from WAV file name)
- segment_index (1-based segment id within each WAV file)
Feature columns:
- Spectral/timbral level features (4):
timbral_log_mel_mean, timbral_log_mel_median, timbral_band_energy_spread, timbral_frame_peak_mean
- Spectral variability features (4):
variability_log_mel_std_global, variability_band_std_mean, variability_frame_energy_std, variability_band_range_mean
- Temporal dynamics features (4):
temporal_frame_energy_delta_abs_mean, temporal_frame_energy_delta_std, temporal_zero_crossing_rate_mean, temporal_rms_std
- Harmonic features (4):
harmonic_voiced_frame_ratio, harmonic_f0_mean_hz, harmonic_f0_std_hz, harmonic_energy_ratio_mean
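To give a feel for how two of the temporal features could be computed, here is a minimal sketch of a frame-wise zero-crossing rate and RMS, summarized by their mean and standard deviation. The frame length, the non-overlapping framing, and the exact definitions are assumptions; the repo's extractor may define these features differently.

```matlab
fs = 22050;                          % assumed sample rate in Hz
t = (0:3 * fs - 1)' / fs;            % one 3 s segment
x = sin(2 * pi * 220 * t);           % synthetic tone standing in for audio

frameLength = 1024;                  % assumed frame size in samples
numFrames = floor(numel(x) / frameLength);
frames = reshape(x(1:numFrames * frameLength), frameLength, numFrames);

% Zero-crossing rate: fraction of adjacent sample pairs whose sign differs.
signChanges = abs(diff(sign(frames), 1, 1)) > 0;
zcrPerFrame = sum(signChanges, 1) / (frameLength - 1);

% RMS energy per frame.
rmsPerFrame = sqrt(mean(frames .^ 2, 1));

zcrMean = mean(zcrPerFrame);         % analogous to temporal_zero_crossing_rate_mean
rmsStd = std(rmsPerFrame);           % analogous to temporal_rms_std
```

For a pure 220 Hz tone, the zero-crossing rate should sit near 2 * 220 / fs, and the per-frame RMS should stay close to 1/sqrt(2).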
Input CSV:
data/features/mel_spectrogram_features_normalized.csv
Generated outputs from scripts/iteration4_svm.m:
- PCA transformed feature CSV:
data/transformed_features/mel_spectrogram_features_pca_r4.csv
- Confusion matrix plots:
Iteration 4 plots/SVM_Phase1_Baseline.png
Iteration 4 plots/SVM_Phase2_PCA.png
Iteration 4 plots/SVM_Phase3_Curated.png
SVM experiment phases:
- Phase 1: Baseline SVM on all 16 normalized features.
- Phase 2: SVM on top-4 PCA components.
- Phase 3: SVM on curated 14-feature set (drops timbral_log_mel_median and variability_band_range_mean).
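The feature variants used in Phases 2 and 3 can be sketched on a toy matrix as follows. This uses base-MATLAB `svd` rather than the toolbox `pca` function, which scripts/iteration4_svm.m may well use instead; the data, the feature names f1..f5, and the dropped columns are hypothetical.

```matlab
% Toy stand-in for the normalized feature matrix: 6 segments x 5 features.
X = [ 0.5 -1.2  0.3  2.0 -0.7;
     -0.4  0.8 -1.1  0.1  1.5;
      1.3  0.2  0.9 -1.4  0.0;
     -1.0 -0.3  1.7  0.6 -0.9;
      0.2  1.1 -0.6 -0.8  0.4;
     -0.6 -0.6 -1.2 -0.5 -0.3];
featureNames = {'f1', 'f2', 'f3', 'f4', 'f5'};   % hypothetical names

% Phase 2 style: project onto the top-4 principal components.
Xc = X - mean(X, 1);                  % PCA requires mean-centered columns
[~, S, V] = svd(Xc, 'econ');          % columns of V are principal directions
numComponents = 4;
scores = Xc * V(:, 1:numComponents);  % segments expressed in PCA coordinates
explained = diag(S) .^ 2 / sum(diag(S) .^ 2);   % variance fraction per component

% Phase 3 style: drop named columns to form a curated feature set.
dropNames = {'f2', 'f5'};
keepMask = ~ismember(featureNames, dropNames);
Xcurated = X(:, keepMask);
```

Either `scores` or `Xcurated` would then replace the full feature matrix as classifier input, with the class labels unchanged.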