This repository offers a hands-on exploration of key tasks in Spoken Language Processing (SLP), including dialogue act recognition, emotion classification, and speech feature extraction. Each module demonstrates practical applications using state-of-the-art tools and machine learning techniques.
Dialogue_Act_Recognition
Implements models that classify dialogue utterances from the Switchboard Corpus by communicative intent (e.g., question, statement), using acoustic features, lexical features, and BERT encoder embeddings.
Emotion_Classification
Detects emotional states from speech data using prosodic and acoustic features and a Multilayer Perceptron (MLP) classifier.
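As a rough illustration of the approach described above, the sketch below trains a scikit-learn `MLPClassifier` on synthetic stand-in features. The feature columns (mean pitch, mean intensity, speaking rate), the two labels (`calm`, `angry`), and all numeric values are assumptions made up for this example; they are not taken from the module's data or code.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Synthetic stand-in features (hypothetical): [mean pitch (Hz), mean intensity (dB), speaking rate]
calm = rng.normal(loc=[120.0, 55.0, 3.5], scale=[10.0, 2.0, 0.3], size=(100, 3))
angry = rng.normal(loc=[220.0, 70.0, 5.5], scale=[10.0, 2.0, 0.3], size=(100, 3))

X = np.vstack([calm, angry])
y = np.array(["calm"] * 100 + ["angry"] * 100)

# Hold out a stratified test split so both classes appear in it.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Standardize features before the MLP; the columns have very different ranges.
clf = make_pipeline(
    StandardScaler(),
    MLPClassifier(hidden_layer_sizes=(16,), max_iter=1000, random_state=0),
)
clf.fit(X_train, y_train)

accuracy = clf.score(X_test, y_test)
print(f"Held-out accuracy on synthetic data: {accuracy:.2f}")
```

With real data, the synthetic arrays would be replaced by a feature matrix extracted from audio (for example with Parselmouth or openSMILE) and the corresponding emotion labels.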
Speech_Feature_Extraction
Provides tools for extracting relevant features from raw audio signals, such as MFCCs and pitch, for use in downstream tasks.
The project uses the following tools and libraries:
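To make the idea of pitch extraction concrete, here is a minimal sketch that estimates the fundamental frequency of a single frame by autocorrelation, using only the Python standard library. This is a simplified stand-in and not the module's actual method, which relies on libraries such as Parselmouth and openSMILE. The function names, the synthetic 200 Hz tone, and the 75-500 Hz search range are all assumptions chosen for illustration.

```python
import math


def synth_tone(freq_hz, sample_rate, n_samples):
    """Generate a pure sine tone (a stand-in for one voiced speech frame)."""
    return [math.sin(2.0 * math.pi * freq_hz * i / sample_rate) for i in range(n_samples)]


def estimate_pitch(frame, sample_rate, fmin=75.0, fmax=500.0):
    """Estimate F0 (Hz) as sample_rate / lag, for the lag with the highest autocorrelation."""
    min_lag = int(sample_rate / fmax)  # shortest period considered
    max_lag = int(sample_rate / fmin)  # longest period considered
    best_lag, best_score = min_lag, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        # Unnormalized autocorrelation of the frame with itself shifted by `lag`.
        score = sum(frame[i] * frame[i + lag] for i in range(len(frame) - lag))
        if score > best_score:
            best_lag, best_score = lag, score
    return sample_rate / best_lag


sample_rate = 8000
frame = synth_tone(200.0, sample_rate, n_samples=400)  # 50 ms frame
f0 = estimate_pitch(frame, sample_rate)
print(f"Estimated F0: {f0:.1f} Hz")
```

Real speech is noisier than a sine tone, so practical pitch trackers add windowing, normalization, and voicing decisions on top of this basic idea.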
- **Programming Language:** Python
- **Data Manipulation:** pandas, NumPy
- **Audio Processing:** Parselmouth, openSMILE
- **Natural Language Processing:** NLTK, TextBlob
- **Machine Learning:** scikit-learn, MLP, BERT
- **Visualization:** matplotlib, seaborn
- **Time Series Forecasting:** Prophet
- **Hyperparameter Optimization:** scikit-optimize