Multimodal feature extraction and Machine Learning Classification

cjd2186/Spoken-Language-Processing

🗣️ Spoken Language Processing

This repository offers a hands-on exploration of key tasks in Spoken Language Processing (SLP), including dialogue act recognition, emotion classification, and speech feature extraction. Each module demonstrates practical applications using state-of-the-art tools and machine learning techniques.


📂 Repository Structure

Dialogue_Act_Recognition
Implements models that classify Switchboard Corpus utterances by communicative intent (e.g., question, statement) using acoustic and lexical features together with BERT encoder embeddings.

Emotion_Classification
Focuses on detecting emotional states from speech data using prosodic and acoustic features with a multilayer perceptron (MLP) classifier.

Speech_Feature_Extraction
Provides tools for extracting relevant features from raw audio signals, such as MFCCs and pitch, for use in downstream tasks.
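
As a rough illustration of what pitch extraction involves, here is a minimal NumPy-only sketch that estimates the fundamental frequency of a single frame from the peak of its autocorrelation. This is not the repository's code and not the algorithm used by Parselmouth/Praat; the function name `estimate_pitch_autocorr`, the 75-500 Hz search range, and the synthetic 200 Hz test tone are all assumptions made for the example.

```python
import numpy as np


def estimate_pitch_autocorr(frame, sample_rate, fmin=75.0, fmax=500.0):
    """Estimate F0 in Hz for one frame via the autocorrelation peak.

    Hypothetical helper for illustration only. Returns 0.0 when the frame
    is silent or too short to cover the requested pitch range.
    """
    frame = np.asarray(frame, dtype=float)
    frame = frame - np.mean(frame)  # remove DC offset

    # Full autocorrelation, keeping only non-negative lags.
    ac = np.correlate(frame, frame, mode="full")[len(frame) - 1:]

    # Convert the pitch search range into a lag range (in samples).
    lag_min = int(sample_rate / fmax)
    lag_max = min(int(sample_rate / fmin), len(ac) - 1)
    if ac[0] <= 0 or lag_max <= lag_min:
        return 0.0

    # The lag with the strongest self-similarity approximates one period.
    best_lag = lag_min + int(np.argmax(ac[lag_min:lag_max + 1]))
    return sample_rate / best_lag


# Synthetic check: a 40 ms frame of a pure 200 Hz tone sampled at 16 kHz.
sample_rate = 16000
t = np.arange(int(0.04 * sample_rate)) / sample_rate
tone = np.sin(2 * np.pi * 200.0 * t)
f0 = estimate_pitch_autocorr(tone, sample_rate)
print(f"Estimated F0: {f0:.1f} Hz")
```

On real speech, a per-frame estimate like this would typically be computed over a sliding window and summarized (for example mean, range, and variance of F0) before being passed to a classifier. A dedicated tool such as Parselmouth handles voicing decisions and octave errors far more robustly than this sketch.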


🧰 Technologies & Libraries

The project uses the following tools and libraries:

  • Programming Language: Python
  • Data Manipulation: pandas, NumPy
  • Audio Processing: Parselmouth, openSMILE
  • Natural Language Processing: NLTK, TextBlob
  • Machine Learning: scikit-learn, MLP, BERT
  • Visualization: matplotlib, seaborn
  • Time Series Forecasting: Prophet
  • Hyperparameter Optimization: scikit-optimize
