CodeCut Blog Articles

Visit CodeCut Blog

About CodeCut

These notebooks are from CodeCut. CodeCut features open-source Python data science tools explained in clear, digestible tutorials. Subscribe to get:

  • Weekly articles with step-by-step guides
  • Newsletters 3x per week (2-minute digests)

Repository Overview

This repository contains 45+ comprehensive technical articles covering data science, MLOps, and AI tools.

Here are some examples of what you'll find in this repository:

Data Engineering

  • PySpark SQL - DataFrames, window functions, aggregations
  • DuckDB - Fast analytical queries for data scientists
  • DVC - Data versioning and experiment tracking
  • Delta Lake - Production lakehouses with delta-rs

Machine Learning

LLM Applications

Data Visualization

Data Utilities

  • Faker - Generate realistic test data
  • PRegEx - Human-readable regex patterns
  • Loguru - Simplified Python logging
  • Hydra - Configuration management

Setup

Prerequisites: Python 3.9+
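
To confirm that your interpreter meets this prerequisite before installing anything, a minimal sketch like the following could be used. It relies only on the standard library. The helper name `meets_minimum` and the constant `MINIMUM_PYTHON` are hypothetical and are not part of this repository.

```python
import sys

# Minimum version taken from the Prerequisites line above (hypothetical constant name).
MINIMUM_PYTHON = (3, 9)


def meets_minimum(version_info, minimum=MINIMUM_PYTHON):
    """Return True if the (major, minor, ...) version is at least `minimum`."""
    # Compare only major and minor so patch releases do not affect the result.
    return tuple(version_info[:2]) >= tuple(minimum)


if __name__ == "__main__":
    current = ".".join(str(part) for part in sys.version_info[:3])
    if meets_minimum(sys.version_info):
        print(f"Python {current} meets the 3.9+ prerequisite.")
    else:
        print(f"Python {current} is older than 3.9; upgrade before running the notebooks.")
```

Run it with the same interpreter you plan to use for the notebooks, since several Python versions may be installed side by side.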

Quick Start:

# Clone repository
git clone https://github.com/khuyentran1401/codecut-blog.git
cd codecut-blog

# Install dependencies (listed at the top of each notebook;
# package1 and package2 are placeholders for those names)
pip install package1 package2

For faster installs, use uv: uv pip install package1 package2

License

All articles are copyright © Khuyen Tran. Code examples within articles are MIT licensed for reuse.