A results-driven data professional passionate about transforming raw, complex, and unstructured datasets into scalable data pipelines, predictive models, and strategic business insights.
- Graduate with a Bachelor's degree in Economics, taught entirely in English at University Carlos III.
- Graduate with a Master's degree in Data Science, specializing in end-to-end analytical pipelines, statistical modeling, and data engineering architectures.
- My core expertise lies in designing robust data curation workflows, managing database schemas, and building interactive business intelligence systems.
- Currently exploring advanced distributed computing workflows, automated deployment of pipelines, and production-level machine learning architectures.
| Category | Technologies & Tools |
|---|---|
| Data engineering | |
| Databases & query | |
| Programming languages | |
| Analytics & BI | |
| Environments & versioning | |
Stack:
Apache Hop•MySQL•Tableau
Design of a complete data warehouse infrastructure using a medallion architecture (bronze, silver, gold) to isolate data extraction and monitor service level agreements (SLA) alongside internal backlog dynamics.
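
As a rough illustration of the bronze, silver, gold layering described above, the sketch below walks a few made-up support tickets through the three stages in plain Python. The project itself is built with Apache Hop and MySQL, so the record fields, the 24-hour SLA target, and the function names `to_silver` and `to_gold` are all assumptions chosen for the example.

```python
from datetime import datetime, timedelta

# Bronze: hypothetical raw records, kept exactly as extracted (all strings).
BRONZE = [
    {"id": "1", "opened": "2024-01-01T09:00", "closed": "2024-01-01T15:00"},
    {"id": "2", "opened": "2024-01-01T10:00", "closed": "2024-01-03T10:00"},
    {"id": "3", "opened": "2024-01-02T08:00", "closed": ""},  # still open
]

SLA = timedelta(hours=24)  # assumed SLA target, purely illustrative


def to_silver(rows):
    """Silver: parse types and turn empty strings into None."""
    silver = []
    for row in rows:
        closed = datetime.fromisoformat(row["closed"]) if row["closed"] else None
        silver.append({
            "id": int(row["id"]),
            "opened": datetime.fromisoformat(row["opened"]),
            "closed": closed,
        })
    return silver


def to_gold(rows):
    """Gold: business-level aggregates (SLA compliance and open backlog)."""
    closed = [r for r in rows if r["closed"] is not None]
    within = sum(1 for r in closed if r["closed"] - r["opened"] <= SLA)
    return {
        "closed": len(closed),
        "within_sla": within,
        "backlog": len(rows) - len(closed),
    }


gold = to_gold(to_silver(BRONZE))
print(gold)
```

Keeping the bronze records untouched means the silver and gold layers can be rebuilt at any time without going back to the source system.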
Stack:
R•Caret•Glmnet•pROC
Application of statistical inference, logistic regression, and regularized models (Ridge and Lasso), tuned with cross-validation, to predict and help prevent user attrition and handle corporate dataset balancing.
Stack:
Python•Apache Spark•Parquet
Cloud-ready big data engineering pipeline targeting compressed columnar metadata and nested arrays to track international streaming production hubs and release timeline trends.
Stack:
Python•Pandas•Seaborn
Detailed exploratory data analysis (EDA) and data cleansing pipeline on automotive marketplace transactions, implementing domain-specific outlier filtering and mathematical stabilization.
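
One possible reading of "domain-specific outlier filtering and mathematical stabilization" is a plausibility rule followed by an interquartile-range filter and a log transform. The sketch below shows that combination using only the standard library rather than Pandas; the prices, the 500 to 200,000 plausibility bounds, and the 1.5 × IQR rule are assumptions for illustration, not values taken from the project.

```python
import math
import statistics

# Made-up listing prices; the last one mimics a data-entry error.
prices = [4500, 5200, 6100, 7000, 7300, 8800, 9500, 12000, 15000, 990000]

# Domain rule (assumed): prices outside this range are not real car listings.
plausible = [p for p in prices if 500 <= p <= 200_000]

# Statistical rule: keep values within 1.5 * IQR of the quartiles.
q1, _, q3 = statistics.quantiles(plausible, n=4)
iqr = q3 - q1
low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
kept = [p for p in plausible if low <= p <= high]

# Stabilization: log1p compresses the right tail of the price distribution.
log_prices = [math.log1p(p) for p in kept]

print(len(prices), "->", len(kept), "listings kept")
```

Applying the domain rule first keeps extreme errors from inflating the quartiles used by the statistical filter.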
Stack:
R•Tidyverse•Plotly
Statistical transformation, custom multi-variable reshaping, and data cleaning using advanced functional pivoting techniques to isolate specific environmental pollution metrics by industry.
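
The multi-variable reshaping mentioned above is done in R with the Tidyverse. As a language-neutral sketch of the same idea, the Python snippet below turns a wide table into a long one, splitting each column name into a metric and a year. The column naming scheme (`metric_year`), the figures, and the helper `pivot_longer` are hypothetical.

```python
# Made-up wide table: one row per industry, one column per metric and year.
wide = [
    {"industry": "Energy", "co2_2020": 120.0, "co2_2021": 115.0,
     "nox_2020": 8.0, "nox_2021": 7.5},
    {"industry": "Transport", "co2_2020": 90.0, "co2_2021": 94.0,
     "nox_2020": 11.0, "nox_2021": 10.2},
]


def pivot_longer(rows, id_col):
    """Reshape wide rows into one row per (id, metric, year) combination."""
    long_rows = []
    for row in rows:
        for col, value in row.items():
            if col == id_col:
                continue
            # Split "co2_2020" into the metric name and the year.
            metric, year = col.rsplit("_", 1)
            long_rows.append({
                id_col: row[id_col],
                "metric": metric,
                "year": int(year),
                "value": value,
            })
    return long_rows


long_rows = pivot_longer(wide, "industry")

# Isolating a single pollution metric is now a simple filter.
co2 = [r for r in long_rows if r["metric"] == "co2"]
print(co2)
```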
Stack:
Python•Pandas•Seaborn
Consolidation and deep data cleaning of a multi-year municipal open data corpus encompassing over 312,000 records to identify temporal and seasonal trends in road safety.
Are you interested in my profile or looking to discuss data architecture and analytics? Feel free to reach out!
- LinkedIn: linkedin.com/in/aliciasantamariaroman
- Email: your-email@example.com