Learning Machine

When you learn machine, the machine also learns you.

Document

Learning Machine is a library that helps with data processing and model construction for machine learning. Quickly build and version-control your data processing pipeline using an easy-to-read, easy-to-edit config file. Inspired by Detectron.

Supported features

  • Build data processing pipelines (engines)
  • Create an engine from a readable config file (YAML)
  • Support for widely used data processing engines (e.g. scikit-learn scalers)

Install

As package (local)

git clone https://github.com/devhoodit/learning-machine.git
cd learning-machine
pip install -e .

As directory

git clone https://github.com/devhoodit/learning-machine.git

Quick Start

Build an engine with code

import pandas as pd
from learning_machine.engine import SequentialEngine, StringToDatetime, FillNa, StandardScaler

# Each engine is one processing step applied to named columns
string_to_datetime_engine = StringToDatetime(col="datetime")
fill_na_engine = FillNa(cols=["age"], fillwith=10)
standard_scaler_engine = StandardScaler(cols=["income"])

# Chain the steps; they run in the order listed
engine = SequentialEngine([
    string_to_datetime_engine,
    fill_na_engine,
    standard_scaler_engine
])

# Load the data and run it through the pipeline
data = pd.read_csv("data.csv")
data = engine(data)
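
To make the idea of chaining steps concrete, here is a minimal sketch of the pattern, not the library's actual implementation. It assumes each step is simply a callable that takes a table and returns a new one. The names `Sequential`, `FillMissing` and `add_one_to_age` are hypothetical, and plain lists of dicts stand in for pandas DataFrames so the example has no dependencies.

```python
from typing import Callable, Dict, List

Row = Dict[str, object]
Table = List[Row]
Step = Callable[[Table], Table]


class FillMissing:
    """Hypothetical FillNa-style step: replace None in the given columns."""

    def __init__(self, cols: List[str], fillwith: object) -> None:
        self.cols = cols
        self.fillwith = fillwith

    def __call__(self, table: Table) -> Table:
        # Build new rows so the caller's input is left untouched.
        return [
            {
                key: (self.fillwith if key in self.cols and value is None else value)
                for key, value in row.items()
            }
            for row in table
        ]


class Sequential:
    """Hypothetical composite step: apply each step in the order given."""

    def __init__(self, steps: List[Step]) -> None:
        self.steps = list(steps)

    def __call__(self, table: Table) -> Table:
        for step in self.steps:
            table = step(table)
        return table


def add_one_to_age(table: Table) -> Table:
    """Any plain function with the same shape can also serve as a step."""
    return [{**row, "age": row["age"] + 1} for row in table]


pipeline = Sequential([FillMissing(cols=["age"], fillwith=10), add_one_to_age])
result = pipeline([{"age": None}, {"age": 30}])
```

Because the composite is itself a callable with the same shape as a single step, pipelines built this way can be nested inside other pipelines.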

Build an engine with a config file

# config.yaml

# preload custom engines from directory
projects:
    - "projects"

data_engine:
    - StringToDatetime:
        col: datetime
    - FillNa:
        fillwith: 10
        cols:
            - age
    - StandardScaler:
        cols:
            - income

Parse the config file and create the engine

from learning_machine import create_from_config

bundle = create_from_config("config.yaml")
engine = bundle.data_engine
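
As a rough illustration of how a `data_engine` section like the one above could be turned into objects, the sketch below starts from the structure a YAML loader would produce for it (a list of single-key mappings) and looks each name up in a registry. This is an assumption about the general approach only; `create_from_config` may work differently. `REGISTRY`, `RecordedStep` and `build_steps` are hypothetical names, and the placeholder class just records its arguments instead of processing data.

```python
from typing import Any, Callable, Dict, List

# What a YAML loader would return for the data_engine section,
# written out as a Python literal to keep the sketch dependency-free.
parsed = {
    "data_engine": [
        {"StringToDatetime": {"col": "datetime"}},
        {"FillNa": {"fillwith": 10, "cols": ["age"]}},
        {"StandardScaler": {"cols": ["income"]}},
    ]
}


class RecordedStep:
    """Hypothetical placeholder that only remembers how it was configured."""

    def __init__(self, name: str, kwargs: Dict[str, Any]) -> None:
        self.name = name
        self.kwargs = kwargs


def make_factory(name: str) -> Callable[..., RecordedStep]:
    # Bind the name now so each factory builds its own kind of step.
    return lambda **kwargs: RecordedStep(name, kwargs)


# Hypothetical registry mapping config names to constructors.
REGISTRY: Dict[str, Callable[..., RecordedStep]] = {
    name: make_factory(name)
    for name in ("StringToDatetime", "FillNa", "StandardScaler")
}


def build_steps(entries: List[Dict[str, Any]]) -> List[RecordedStep]:
    """Turn each {name: kwargs} entry into a configured step, in order."""
    steps = []
    for entry in entries:
        if len(entry) != 1:
            raise ValueError(f"expected one engine per entry, got {entry!r}")
        (name, kwargs), = entry.items()
        if name not in REGISTRY:
            raise KeyError(f"unknown engine: {name}")
        # An entry with no arguments parses as None, so fall back to {}.
        steps.append(REGISTRY[name](**(kwargs or {})))
    return steps


steps = build_steps(parsed["data_engine"])
```

Keeping the order of the list entries is what makes the config file describe the same sequence as the code version.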

Read a CSV with pandas and apply the engine

import pandas as pd

data = pd.read_csv("data.csv")
data = engine(data)