- Based in DMV
- Recent CS Graduate from University of Maryland
- Contact me @: dberhane@terpmail.umd.edu
- Building data pipelines and analytics solutions with cloud technologies
- Passionate about data engineering, ML operations, and business intelligence
- Always exploring new ways to optimize data workflows

[Automated Invoice Processing Pipeline](https://github.com/Daniel21b/invoice_pipeline)
Tech Stack: Python, PostgreSQL, Streamlit, AWS (Lambda, S3, RDS), OCR (Textrac
- Designed an event-driven architecture that eliminates manual data entry by automatically triggering extraction pipelines on file upload, reducing processing time by 95%.
- Migrated real-time invoice processing to a scheduled Airflow batch architecture, optimizing compute resource usage by processing uploads in hourly micro-batches.
- Implemented unstructured-to-structured data transformation using AI-based OCR to parse PDF invoices and normalize them into a relational PostgreSQL schema.
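
A minimal sketch of the upload-triggered entry point, assuming the trigger is a standard S3 event notification delivered to a Lambda function. The names `extract_uploads` and `handler` and the returned `queued` field are hypothetical and are not taken from the repository; the actual OCR call is left out.

```python
from urllib.parse import unquote_plus


def extract_uploads(event):
    """Return (bucket, key) pairs for each object in an S3 event notification."""
    uploads = []
    for record in event.get("Records", []):
        s3 = record.get("s3", {})
        bucket = s3.get("bucket", {}).get("name")
        key = s3.get("object", {}).get("key")
        if bucket and key:
            # Object keys arrive URL-encoded in S3 notifications ("+" for spaces).
            uploads.append((bucket, unquote_plus(key)))
    return uploads


def handler(event, context):
    """Hypothetical Lambda entry point: select uploaded PDFs for extraction."""
    pdfs = [(b, k) for b, k in extract_uploads(event) if k.lower().endswith(".pdf")]
    # A real pipeline would start OCR extraction here; this sketch only
    # reports which objects would be processed.
    return {"queued": [f"s3://{b}/{k}" for b, k in pdfs]}
```

Keeping the event parsing in its own function makes it testable without any AWS dependency.
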
Tech Stack: Python, Pandas, Seaborn, Plotly, Looker
- Processed 2M+ trips to identify usage patterns across DC metro stations
- Analyzed weather-ridership correlations to quantify seasonal demand fluctuations
- Built interactive Looker dashboards visualizing peak hours and route trends
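
One way to quantify a weather-ridership relationship is a Pearson correlation between a daily weather measure and daily trip counts. The sketch below uses only the standard library for portability, and the function name `pearson` is illustrative; the project itself lists Pandas, where `Series.corr` would serve the same purpose.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)
```

A value near 1 or -1 indicates a strong linear relationship; it does not by itself separate seasonal effects from day-to-day weather.
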
Tech Stack: Python, Pandas, BeautifulSoup, Matplotlib
- Scraped and analyzed job postings to track company hiring patterns
- Standardized data with Pandas for accurate trend analysis
- Visualized month-over-month hiring trends with Matplotlib
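
A rough sketch of the month-over-month calculation, assuming each scraped posting carries a posting date. It uses only the standard library, and the names `monthly_counts` and `month_over_month` are hypothetical; in Pandas the same idea is typically a monthly group-by followed by `pct_change`.

```python
from collections import Counter
from datetime import date


def monthly_counts(post_dates):
    """Count postings per calendar month, keyed as 'YYYY-MM' in sorted order."""
    counts = Counter(d.strftime("%Y-%m") for d in post_dates)
    return dict(sorted(counts.items()))


def month_over_month(counts):
    """Percent change in postings relative to the previous month present in the data."""
    months = sorted(counts)
    changes = {}
    for prev, cur in zip(months, months[1:]):
        base = counts[prev]
        # Months with no postings are absent from `counts`, so each month is
        # compared with the previous month that has data.
        changes[cur] = (counts[cur] - base) / base * 100 if base else None
    return changes
```
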







