This repository contains analysis and machine learning projects focused on the education process, including student performance analysis, skill clustering, and course recommendations. The work is based on real student grade data and skill requirements extracted from job vacancies.
The projects apply data science and machine learning techniques to better understand the education process and student outcomes. The main goals are:
- Analyzing student grades and performance trends.
- Clustering skills from job vacancy data to define relevant courses.
- Predicting student success using machine learning models.
- Supporting research in educational data analytics (e.g., the paper presented at ICL 2023).
The datasets include:
- Student Grades: Final grades for students graduating in 2022–2025.
- Skill Data: Skills extracted from job vacancy descriptions, used to cluster skills and define relevant courses.
Note: Data is anonymized to preserve privacy.
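As a rough illustration of the kind of per-cohort summary the grade data supports, here is a minimal sketch using only the Python standard library. The record layout (`student_id`, graduation year, final grade), the sample values, and the function name `summarize_by_cohort` are all assumptions made for this example; the actual column names and grading scale are defined in the notebooks, not here.

```python
from collections import defaultdict
from statistics import mean, median

# Hypothetical anonymized records: (student_id, graduation_year, final_grade).
# These values are made up for illustration and are not taken from the dataset.
records = [
    ("s001", 2022, 78.0),
    ("s002", 2022, 91.0),
    ("s003", 2023, 64.0),
    ("s004", 2023, 85.0),
    ("s005", 2023, 73.0),
]


def summarize_by_cohort(rows):
    """Group final grades by graduation year and compute basic statistics."""
    grades = defaultdict(list)
    for _student_id, year, grade in rows:
        grades[year].append(grade)
    return {
        year: {"n": len(vals), "mean": mean(vals), "median": median(vals)}
        for year, vals in sorted(grades.items())
    }


summary = summarize_by_cohort(records)
for year, stats in summary.items():
    print(year, stats)
```

Grouping by graduation year first keeps each cohort's statistics separate, which is the natural unit when comparing performance across the 2022–2025 graduates.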
The repository contains the following notebooks:
- grades-analysis.ipynb
  - Exploratory analysis of student grades across multiple cohorts.
  - Includes statistics, distributions, and visualizations of performance.
- skills-courses.ipynb
  - Clustering of skills extracted from job vacancies.
  - Supports course definition and curriculum planning.
- student-success-analysis.ipynb
  - Research-oriented notebook presented at ICL 2023.
  - Includes student clustering, performance prediction, and machine learning modeling.
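To give a feel for what clustering skills from vacancy text can look like, the sketch below groups skill phrases by word overlap. It is not the method used in skills-courses.ipynb, which is not described here. It is a deliberately simple stand-in: Jaccard similarity on word tokens with single-link grouping. The skill phrases, the `threshold` value, and the function names are assumptions for this example.

```python
def tokens(skill):
    """Split a skill phrase into a set of lowercase word tokens."""
    return set(skill.lower().split())


def jaccard(a, b):
    """Jaccard similarity of two token sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster_skills(skills, threshold=0.3):
    """Single-link grouping: skills whose token overlap reaches the
    threshold end up in the same cluster (tracked with union-find)."""
    parent = list(range(len(skills)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    token_sets = [tokens(s) for s in skills]
    for i in range(len(skills)):
        for j in range(i + 1, len(skills)):
            if jaccard(token_sets[i], token_sets[j]) >= threshold:
                parent[find(j)] = find(i)

    clusters = {}
    for i, skill in enumerate(skills):
        clusters.setdefault(find(i), []).append(skill)
    return list(clusters.values())


# Hypothetical skill phrases, not taken from the repository's data.
skills = [
    "python programming",
    "python scripting",
    "sql databases",
    "nosql databases",
    "project management",
]
clusters = cluster_skills(skills)
for group in clusters:
    print(group)
```

Each resulting group could then be reviewed as a candidate course topic. In practice, text embeddings or TF-IDF vectors with a standard clustering algorithm would likely handle synonyms and longer phrases better than raw word overlap.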
This repository is provided for educational and research purposes.