Comprehensive Data Science & Machine Learning Course Materials
Diogo Ribeiro
ESMAD - Escola Superior de Média Arte e Design
Lead Data Scientist, Mysense.ai
This repository contains a comprehensive collection of professional academic presentations covering advanced topics in statistics, machine learning, deep learning, and data science. The materials are designed for:
- Graduate-level courses in Data Science, Statistics, and Computer Science
- Research seminars and academic conferences
- Professional training programs in industry
- Self-study for advanced learners
Key features:
- 15+ comprehensive presentations with 100+ hours of content
- Production-ready code in Python and R (27,000+ lines)
- 140+ curated references with DOIs
- Professional LaTeX theme with consistent styling
- Hands-on exercises and assessments
- Automated PDF generation via GitHub Actions
02-deep-learning/deep-learning-fundamentals/
Learning Objectives:
- Understand the mathematical foundations of neural networks
- Implement backpropagation and gradient descent from scratch
- Master modern optimization techniques (SGD, Adam, AdamW)
- Design and train CNN architectures for computer vision
- Build RNN/LSTM models for sequential data
- Understand Transformer architecture and attention mechanisms
- Apply regularization techniques (dropout, batch normalization)
Topics Covered:
- Perceptron and multilayer networks
- Activation functions (ReLU, sigmoid, tanh, Swish)
- Loss functions and optimization
- Convolutional Neural Networks (LeNet, AlexNet, VGG, ResNet)
- Recurrent Neural Networks and LSTM
- Transformers and self-attention
- Training best practices
Prerequisites: Linear algebra, calculus, Python programming
Level: Intermediate to Advanced
Duration: 3-4 weeks (graduate course)
02-deep-learning/reinforcement-learning/
Learning Objectives:
- Formulate problems as Markov Decision Processes
- Derive and apply Bellman equations
- Implement value iteration and policy iteration
- Understand Monte Carlo and TD learning methods
- Build Q-learning and SARSA agents
- Apply function approximation with neural networks
- Implement modern deep RL algorithms (DQN, PPO, A3C)
- Design multi-agent systems
Topics Covered:
- Markov Decision Processes and dynamic programming
- Monte Carlo methods
- Temporal Difference learning (SARSA, Q-learning)
- Function approximation and deep Q-networks
- Policy gradient methods (REINFORCE, Actor-Critic, PPO)
- Multi-agent reinforcement learning
- Applications (games, robotics, resource allocation)
Prerequisites: Probability, linear algebra, Python
Level: Advanced
Duration: 4-5 weeks (graduate course)
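As a rough illustration of the value-iteration objective listed above, the sketch below repeatedly applies the Bellman optimality backup V(s) = max_a [r(s, a) + gamma * V(s')] to a tiny deterministic chain. The four-state chain, the reward of 1 for reaching the last state, and the names `step`, `value_iteration`, and `greedy_policy` are assumptions made for this example; they are not taken from the course code.

```python
GAMMA = 0.9        # discount factor (assumed for illustration)
N_STATES = 4       # states 0..3; state 3 is terminal
ACTIONS = (-1, 1)  # move left or move right


def step(state, action):
    """Deterministic transition on the chain: returns (next_state, reward)."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    reward = 1.0 if next_state == N_STATES - 1 else 0.0
    return next_state, reward


def value_iteration(tol=1e-10):
    """Apply the Bellman optimality backup until values change by less than tol."""
    values = [0.0] * N_STATES
    while True:
        delta = 0.0
        for s in range(N_STATES - 1):  # the terminal state keeps value 0
            candidates = []
            for a in ACTIONS:
                s_next, r = step(s, a)
                candidates.append(r + GAMMA * values[s_next])
            best = max(candidates)
            delta = max(delta, abs(best - values[s]))
            values[s] = best
        if delta < tol:
            return values


def greedy_policy(values):
    """Pick, in each non-terminal state, the action with the highest backed-up value."""
    policy = []
    for s in range(N_STATES - 1):
        def backed_up(a, s=s):
            s_next, r = step(s, a)
            return r + GAMMA * values[s_next]
        policy.append(max(ACTIONS, key=backed_up))
    return policy


if __name__ == "__main__":
    v = value_iteration()
    print("values:", v)
    print("greedy policy:", greedy_policy(v))
```

In this toy chain the greedy policy moves right in every state, and the values decay by a factor of gamma with each step of distance from the goal.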
01-foundations/statistical-modeling/
Learning Objectives:
- Understand bias-variance tradeoff
- Master regularization techniques (Ridge, Lasso, Elastic Net)
- Apply cross-validation and model selection
- Implement ensemble methods (bagging, boosting, stacking)
- Understand kernel methods and SVMs
- Perform dimensionality reduction (PCA, t-SNE, UMAP)
- Evaluate models using appropriate metrics
Topics Covered:
- Supervised learning fundamentals
- Linear and logistic regression
- Regularization and model selection
- Tree-based methods (CART, Random Forests, XGBoost)
- Support Vector Machines
- Gaussian Processes
- Model evaluation and validation
Prerequisites: Statistics, linear algebra, programming
Level: Intermediate
Duration: 4-5 weeks
01-foundations/feature-engineering/
Learning Objectives:
- Design effective feature engineering pipelines
- Handle missing data with advanced imputation techniques
- Encode categorical variables appropriately
- Create polynomial and interaction features
- Apply feature scaling and normalization
- Perform feature selection using multiple methods
- Build end-to-end ML pipelines with scikit-learn
Topics Covered:
- Missing value imputation (mean, median, KNN, MICE)
- Categorical encoding (one-hot, ordinal, target, entity embeddings)
- Feature scaling (standard, min-max, robust)
- Polynomial features and interactions
- Feature selection (filter, wrapper, embedded methods)
- Dimensionality reduction
- Pipeline construction
Prerequisites: Basic Python, pandas, scikit-learn
Level: Beginner to Intermediate
Duration: 2-3 weeks
06-advanced-topics/explainable-ai/
Learning Objectives:
- Understand the interpretability-accuracy tradeoff
- Explain model predictions using SHAP values
- Apply LIME for local explanations
- Compute and interpret permutation importance
- Visualize partial dependence and ICE plots
- Detect and mitigate algorithmic bias
- Implement fairness metrics and constraints
- Use modern XAI tools (SHAP, LIME, InterpretML)
Topics Covered:
- Global vs local explanations
- Model-agnostic methods (SHAP, LIME, permutation importance)
- Model-specific interpretability (linear models, trees, neural networks)
- Attention mechanisms and gradient-based explanations
- Algorithmic fairness and bias detection
- Fairness definitions and impossibility results
- Practical implementation with Python tools
Prerequisites: Machine learning basics, Python
Level: Intermediate to Advanced
Duration: 2-3 weeks
03-bayesian-methods/mcmc/
Learning Objectives:
- Understand Bayesian inference and posterior distributions
- Derive Metropolis-Hastings acceptance probability
- Implement MCMC algorithms from scratch
- Apply Hamiltonian Monte Carlo for efficient sampling
- Use No-U-Turn Sampler (NUTS) for automatic tuning
- Diagnose convergence using R-hat and ESS
- Apply MCMC to real Bayesian models
Topics Covered:
- Bayesian inference fundamentals
- Metropolis-Hastings algorithm
- Hamiltonian Monte Carlo and leapfrog integration
- No-U-Turn Sampler (NUTS)
- Convergence diagnostics (trace plots, R-hat, ESS)
- Applications (Bayesian regression, hierarchical models)
Prerequisites: Probability theory, calculus, Python
Level: Advanced
Duration: 3-4 weeks
Code: Complete Python implementations (8,000+ lines)
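To make the Metropolis-Hastings objective above concrete, here is a minimal random-walk sampler. It is a sketch under stated assumptions rather than the repository's implementation: the target is a standard normal chosen purely for illustration, and `log_target` and `metropolis_hastings` are hypothetical names. Because the Gaussian proposal is symmetric, the acceptance probability reduces to min(1, pi(x') / pi(x)).

```python
import math
import random


def log_target(x):
    """Unnormalized log-density of a standard normal (illustrative target)."""
    return -0.5 * x * x


def metropolis_hastings(n_samples, step=1.0, x0=0.0, seed=0):
    """Random-walk Metropolis-Hastings; returns (samples, acceptance_rate)."""
    rng = random.Random(seed)
    x = x0
    samples = []
    accepted = 0
    for _ in range(n_samples):
        # Symmetric proposal: the proposal densities cancel in the ratio.
        proposal = x + rng.gauss(0.0, step)
        log_alpha = min(0.0, log_target(proposal) - log_target(x))
        if rng.random() < math.exp(log_alpha):
            x = proposal
            accepted += 1
        samples.append(x)  # a rejected proposal repeats the current state
    return samples, accepted / n_samples


if __name__ == "__main__":
    draws, rate = metropolis_hastings(20_000)
    mean = sum(draws) / len(draws)
    var = sum((d - mean) ** 2 for d in draws) / len(draws)
    print(f"mean={mean:.3f} var={var:.3f} acceptance={rate:.2f}")
```

Working with log-densities avoids numerical underflow. The `step` parameter trades acceptance rate against how far each move travels, which is what diagnostics such as ESS help to assess.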
03-bayesian-methods/bayesian-machine-learning/
Learning Objectives:
- Apply Bayesian inference to machine learning problems
- Build Bayesian linear and logistic regression models
- Implement Gaussian Processes for regression
- Understand Bayesian neural networks
- Perform approximate inference (VI, EP)
- Apply Bayesian optimization for hyperparameter tuning
- Quantify predictive uncertainty
Topics Covered:
- Bayesian linear regression
- Gaussian Processes
- Bayesian neural networks
- Variational inference
- Bayesian optimization
- Uncertainty quantification
Prerequisites: Bayesian statistics, machine learning, Python
Level: Advanced
Duration: 3-4 weeks
04-causal-inference/causal-inference-fundamentals/
Learning Objectives:
- Understand potential outcomes framework
- Draw and interpret causal DAGs
- Implement Instrumental Variables (IV/2SLS)
- Apply Regression Discontinuity Design
- Use Difference-in-Differences methods
- Estimate propensity scores and perform matching
- Apply synthetic control methods
- Identify and address confounding
Topics Covered:
- Potential outcomes and causal graphs
- Instrumental Variables and weak instruments
- Regression Discontinuity (sharp and fuzzy)
- Difference-in-Differences and event studies
- Propensity score methods
- Synthetic controls
- Modern methods (Callaway-Sant'Anna, Sun-Abraham)
Prerequisites: Statistics, econometrics, R or Python
Level: Advanced
Duration: 4-5 weeks
Code: Python & R implementations (11,000+ lines)
05-time-series/time-series-forecasting/
Learning Objectives:
- Analyze time series components (trend, seasonality)
- Test for and achieve stationarity
- Build ARIMA and SARIMA models
- Implement VAR models for multivariate series
- Apply state space models and Kalman filter
- Use LSTM and Transformers for forecasting
- Evaluate forecasting accuracy
- Apply hybrid methods (Prophet, N-BEATS)
Topics Covered:
- Stationarity and unit root tests
- ARMA, ARIMA, SARIMA models
- Vector Autoregression (VAR)
- State space models and Kalman filter
- Forecasting and evaluation
- Deep learning for time series (LSTM, GRU)
- Transformer models (TFT, Autoformer, Informer)
- Hybrid approaches (ES-RNN, N-BEATS, Prophet)
Prerequisites: Statistics, linear algebra, Python
Level: Intermediate to Advanced
Duration: 3-4 weeks
01-foundations/optimization/
Learning Objectives:
- Formulate optimization problems
- Understand convexity and its implications
- Derive and apply KKT conditions
- Implement gradient descent variants
- Apply momentum and adaptive methods (Adam, AdamW)
- Solve constrained optimization problems
- Use evolutionary algorithms for black-box optimization
- Apply Bayesian optimization for hyperparameter tuning
- Optimize neural network training
Topics Covered:
- Convex optimization fundamentals
- Gradient descent (batch, SGD, mini-batch)
- Momentum methods and Nesterov acceleration
- Adaptive learning rates (AdaGrad, RMSProp, Adam)
- Constrained optimization (Lagrangian, KKT, penalties)
- Evolutionary algorithms (GA, ES, PSO, CMA-ES)
- Bayesian optimization
- Multi-objective optimization
Prerequisites: Calculus, linear algebra, Python
Level: Intermediate to Advanced
Duration: 3-4 weeks
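The momentum methods listed above can be sketched in a few lines. The example below applies the classical heavy-ball update v <- beta * v - lr * grad f(x), x <- x + v to a convex quadratic. The objective f(x, y) = 0.5 * (x^2 + 10 * y^2), the step size, and the names `grad` and `momentum_descent` are assumptions chosen for illustration only.

```python
def grad(point):
    """Gradient of f(x, y) = 0.5 * (x**2 + 10 * y**2), an illustrative convex quadratic."""
    x, y = point
    return x, 10.0 * y


def momentum_descent(start, lr=0.05, beta=0.9, n_steps=500):
    """Heavy-ball gradient descent; beta=0 recovers plain gradient descent."""
    x, y = start
    vx, vy = 0.0, 0.0  # velocity starts at rest
    for _ in range(n_steps):
        gx, gy = grad((x, y))
        vx = beta * vx - lr * gx
        vy = beta * vy - lr * gy
        x, y = x + vx, y + vy
    return x, y


if __name__ == "__main__":
    print("with momentum:", momentum_descent((5.0, 5.0)))
    print("plain GD:     ", momentum_descent((5.0, 5.0), beta=0.0))
```

Both runs approach the minimizer at the origin. Setting `beta=0.0` makes the velocity equal to the negative scaled gradient, so the same function doubles as a plain gradient-descent baseline.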
04-causal-inference/ab-testing/
Learning Objectives:
- Design statistically rigorous A/B tests
- Calculate required sample sizes
- Perform hypothesis testing correctly
- Control for multiple comparisons
- Understand statistical power and effect sizes
- Apply sequential testing methods
- Analyze experimental results
- Avoid common pitfalls (peeking, p-hacking)
Topics Covered:
- Experimental design
- Hypothesis testing and p-values
- Sample size calculations
- Multiple testing corrections
- Bayesian A/B testing
- Sequential analysis
- Common pitfalls and best practices
Prerequisites: Statistics, probability
Level: Intermediate
Duration: 1-2 weeks
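As one possible illustration of the sample-size objective above, the sketch below uses the common normal-approximation formula for a two-sided test comparing two conversion rates: n per group = (z_{1-alpha/2} + z_{power})^2 * [p1(1-p1) + p2(1-p2)] / (p2 - p1)^2. The function name `sample_size_per_group` and the example rates are assumptions for this example, and other formulas (for instance with a pooled variance) give slightly different answers.

```python
import math
from statistics import NormalDist


def sample_size_per_group(p_control, p_treatment, alpha=0.05, power=0.80):
    """Approximate per-group sample size for a two-sided two-proportion z-test."""
    if p_control == p_treatment:
        raise ValueError("The two rates must differ to define an effect size.")
    z_alpha = NormalDist().inv_cdf(1.0 - alpha / 2.0)  # critical value for the test
    z_power = NormalDist().inv_cdf(power)              # quantile for the desired power
    variance_sum = p_control * (1.0 - p_control) + p_treatment * (1.0 - p_treatment)
    effect = p_treatment - p_control
    return math.ceil((z_alpha + z_power) ** 2 * variance_sum / effect ** 2)


if __name__ == "__main__":
    # Detecting a lift from a 10% to a 12% conversion rate.
    print(sample_size_per_group(0.10, 0.12))
```

With these inputs the formula asks for roughly 3,800 users per group. Larger effects need fewer users, and higher power or a stricter alpha needs more.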
academic-presentations/
├── README.md                          # This file
├── CONTRIBUTING.md                    # Contribution guidelines
├── CHANGELOG.md                       # Version history
├── LICENSE                            # CC BY-SA 4.0 for content
│
├── .github/                           # GitHub Actions automation
│   ├── workflows/
│   │   ├── compile-latex.yml          # Auto-compile PDFs
│   │   ├── check-links.yml            # Verify all URLs
│   │   └── generate-previews.yml      # Create PDF previews
│   ├── dependabot.yml                 # Dependency updates
│   └── markdown-link-check-config.json
│
├── shared/                            # Shared resources
│   ├── theme/                         # Professional LaTeX theme
│   │   ├── esmad_beamer_theme.sty     # Custom Beamer theme
│   │   ├── esmad_beamer_theme_highcontrast.sty
│   │   ├── STYLE_GUIDE.md             # Theme documentation
│   │   └── template_presentation.tex  # Example template
│   └── bibliographies/                # Reference libraries (140+ papers)
│       ├── mcmc_references.bib                  # MCMC methods (30+ refs)
│       ├── causal_inference_references.bib      # Causal inference (50+ refs)
│       └── statistical_learning_references.bib  # ML/Stats (60+ refs)
│
├── 00-programming-fundamentals/       # Programming Basics
│   └── r-programming/
│       └── presentation/
│           └── R_programming.tex
│
├── 01-foundations/                    # Core Foundations
│   ├── statistical-modeling/
│   │   └── presentation/              # Statistical Learning Theory
│   ├── feature-engineering/
│   │   └── presentation/              # Feature Engineering
│   ├── pca/
│   │   └── presentation/              # Principal Component Analysis
│   └── optimization/
│       └── presentation/              # Optimization for Data Science
│
├── 02-deep-learning/                  # Deep Learning
│   ├── deep-learning-fundamentals/
│   │   └── presentation/              # Deep Learning Fundamentals
│   └── reinforcement-learning/
│       └── presentation/              # Reinforcement Learning
│
├── 03-bayesian-methods/               # Bayesian Statistics
│   ├── mcmc/
│   │   ├── presentation/              # MCMC Methods
│   │   └── exercises/                 # MCMC Exercises
│   └── bayesian-machine-learning/
│       └── presentation/              # Bayesian ML
│
├── 04-causal-inference/               # Causal Methods
│   ├── causal-inference-fundamentals/
│   │   ├── presentation/              # Causal Inference Fundamentals
│   │   └── exercises/                 # Causal Inference Exercises
│   └── ab-testing/
│       └── presentation/              # A/B Testing & Experimentation
│
├── 05-time-series/                    # Time Series
│   └── time-series-forecasting/
│       └── presentation/              # Time Series Analysis
│
├── 06-advanced-topics/                # Advanced Topics
│   ├── explainable-ai/
│   │   └── presentation/              # Explainable AI
│   └── computer-science/
│       └── presentation/              # OOP & Streaming Pipelines
│
├── 07-capstone-projects/              # Projects
│   ├── industry-focus/                # Industry applications
│   ├── project-guides/                # Project guidelines
│   └── prerequisites/                 # Prerequisites
│
└── 08-data-science-applications-course/   # Applied Course
    ├── presentation/                  # Full course materials
    └── assessments/                   # Course assessments
All presentations use the ESMAD Beamer Theme for a consistent, professional appearance:
- Professional color palette (ESMAD Blue, accents)
- Custom environments (theorems, definitions, examples, alerts)
- Mathematical notation helpers (\Normal, \E, \Var, etc.)
- Code listing styles with syntax highlighting
- Author information with ORCID integration
- Slide templates (title, TOC, contact, references)
\documentclass[aspectratio=169]{beamer}
\usepackage{../../../shared/theme/esmad_beamer_theme}
% Author info
\authorname{Your Name}
\authoremail{your.email@university.edu}
\authororcid{0000-0000-0000-0000}
\title{Your Presentation}
\date{\today}
\begin{document}
\begin{frame}
\titlepage
\end{frame}
% Your content...
\contactslide
\end{document}
See shared/theme/STYLE_GUIDE.md for complete documentation.
LaTeX Distribution:
# Ubuntu/Debian
sudo apt-get install texlive-full
# macOS
brew install --cask mactex
# Windows
# Download and install MiKTeX or TeX Live
Python Environment (for code examples):
pip install numpy scipy matplotlib seaborn pandas scikit-learn statsmodels
pip install torch tensorflow # For deep learning examples
pip install shap lime # For XAI examples
R Environment (for R examples):
install.packages(c(
"AER", "rdrobust", "fixest", "did", # Causal inference
"caret", "recipes", "mice", # Feature engineering
"forecast", "vars", "fable" # Time series
))
Manual compilation:
cd 02-deep-learning/deep-learning-fundamentals/presentation/
pdflatex deep_learning_beamer.tex
pdflatex deep_learning_beamer.tex # Run twice for references
Using latexmk (recommended):
cd 02-deep-learning/reinforcement-learning/presentation/
latexmk -pdf rl_beamer.tex
Automated compilation:
- Push to GitHub → GitHub Actions automatically compiles all PDFs
- Download compiled PDFs from Actions artifacts or Releases
Python:
# MCMC examples (if code/ directory exists with implementations)
# Example references are embedded in presentation materials
Exercises:
# MCMC exercises
cd 03-bayesian-methods/mcmc/exercises/
pdflatex mcmc_exercises.tex
# Causal inference exercises
cd 04-causal-inference/causal-inference-fundamentals/exercises/
pdflatex causal_inference_exercises.tex
Path 1: Machine Learning Fundamentals
- Statistical Learning (4 weeks)
- Feature Engineering (2 weeks)
- Optimization (3 weeks)
- Explainable AI (2 weeks)
Path 2: Deep Learning Specialization
- Deep Learning Fundamentals (4 weeks)
- Optimization (focus on neural networks)
- Reinforcement Learning (4 weeks)
- Time Series Analysis (focus on deep methods)
Path 3: Causal & Bayesian Methods
- Causal Inference (5 weeks)
- Bayesian ML (4 weeks)
- MCMC Methods (3 weeks)
- A/B Testing (2 weeks)
- Start with slides to understand concepts
- Run code examples to see methods in action
- Complete exercises to test understanding
- Read references for deeper knowledge
- Join discussions (create GitHub issues)
These materials can be integrated into:
- Graduate courses in Data Science/Statistics/CS
- Professional training programs
- Workshop series
- Seminar courses
- Fork this repository
- Customize presentations for your needs
- Add your own examples and exercises
- Maintain attribution (CC BY-SA 4.0)
Use the materials in assessments/:
- Quizzes for each topic
- Midterm and final exams
- Grading rubrics
- Project ideas
If you use these materials in your research or teaching, please cite:
@misc{ribeiro2025academic,
author = {Ribeiro, Diogo},
title = {Academic Presentations: Comprehensive Data Science Course Materials},
year = {2025},
publisher = {GitHub},
url = {https://github.com/diogoribeiro7/academic-presentations},
note = {ESMAD \& Mysense.ai}
}
All presentations reference comprehensive BibTeX files:
\usepackage[backend=bibtex]{biblatex}
\addbibresource{../../../shared/bibliographies/mcmc_references.bib}
% In document
\cite{metropolis1953}
\cite{hoffman2014}
% At end
\printbibliography
Available:
- shared/bibliographies/mcmc_references.bib: 30+ MCMC papers
- shared/bibliographies/causal_inference_references.bib: 50+ causal inference papers
- shared/bibliographies/statistical_learning_references.bib: 60+ ML/stats papers
All include DOIs for easy access.
We welcome contributions! See CONTRIBUTING.md for guidelines.
- Fork the repository
- Create a feature branch
- Make your changes
- Test compilation and code
- Submit a pull request
- Fix errors in presentations
- Add new presentations
- Improve existing content
- Enhance documentation
- Add code examples
- Create exercises
- Improve theme/styling
- compile-latex.yml: Auto-compiles all LaTeX on push
- check-links.yml: Verifies all URLs and DOIs weekly
- generate-previews.yml: Creates PDF preview gallery
- dependabot.yml: Keeps dependencies updated
View slide previews at: https://diogoribeiro7.github.io/academic-presentations/
- 15+ comprehensive presentations
- 27,000+ lines of code (Python & R)
- 140+ curated references with DOIs
- 14 pages of exercises (2 comprehensive problem sets)
- 1 professional LaTeX theme with full documentation
- Fully automated PDF compilation and testing
Licensed under Creative Commons Attribution-ShareAlike 4.0 International
You are free to:
- Share: copy and redistribute the material
- Adapt: remix, transform, and build upon the material
Under the following terms:
- Attribution required
- ShareAlike for derivatives
Code examples licensed under MIT License
- Email: dfr@esmad.ipp.pt
- Institution: ESMAD - Escola Superior de Média Arte e Design
- Company: Mysense.ai (Lead Data Scientist)
- ORCID: 0009-0001-2022-7072
- Markov Chain Monte Carlo and Bayesian computation
- Machine learning and deep learning
- Causal inference and econometrics
- Financial risk modeling
- Time series analysis and forecasting
- Guest lectures and workshops
- Corporate training programs
- Research collaborations
- Joint publications
- Conference presentations
- ESMAD for institutional support
- Mysense.ai for industry applications and insights
- Students and colleagues for valuable feedback
- Open source community for tools and inspiration
- Academic community for rigorous peer review
See CHANGELOG.md for detailed version history.
Last Updated: January 2025
Repository Maintainer: Diogo Ribeiro
Status: Actively maintained
Latest Release: View releases