michael-0acf4/anitag2vec

anitag2vec

anitag2vec is a vector embedding model for anime-style tags, such as those found on Danbooru, Sakugabooru, Pixiv, and MAL.

Why?

If you maintain your own local gallery or index of things you like (which, to be fair, you most likely don't), building a recommendation system for it is laborious without a fuzzy matching component.

Sure, you can do tag-based statistics, but then you have to manually group similar tags and somehow account for spelling variation. With a vector embedding, problem solved: just pin something you like and get recommended cosine-similar stuff.

There are many off-the-shelf vector embeddings, but they are primarily designed for general-purpose tasks such as sentence embeddings. While you can still adapt them for other use cases, many models are sensitive to token order and the exact phrasing of inputs.
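As a sketch of the recommendation flow described above (the function name and data here are hypothetical; it assumes you already have embeddings for your catalog, e.g. from this model):

```python
import numpy as np

def recommend(pinned: np.ndarray, catalog: np.ndarray, top_k: int = 3) -> list[int]:
    """Rank catalog rows by cosine similarity to a pinned embedding."""
    # Normalize so that a plain dot product equals cosine similarity.
    pinned = pinned / np.linalg.norm(pinned)
    catalog = catalog / np.linalg.norm(catalog, axis=1, keepdims=True)
    scores = catalog @ pinned
    return np.argsort(-scores)[:top_k].tolist()

# Toy data: 100 entries with 128-dim embeddings, plus a near-duplicate of entry 42.
rng = np.random.default_rng(0)
catalog = rng.normal(size=(100, 128))
pinned = catalog[42] + 0.01 * rng.normal(size=128)
print(recommend(pinned, catalog)[0])  # entry 42 ranks first
```

No manual tag grouping or spelling normalization is needed; similarity falls out of the geometry.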

Setup

The model checkpoints, including ONNX ports, are available HERE.

Python

pip install torch tokenizers tqdm asciichartpy

See the notebook python/ranked_inference.ipynb for a concrete inference example.

You can also explore the model's capabilities by composing embeddings using +, *, -, /.

python python/interactive.py

For example, here we look for the entries closest to a composed expression within a MAL-style dataset.

Tag Algebra

Inference in Rust

The Rust implementation relies on the ONNX port of the PyTorch model.

# old: rust ort Backend (heavy)
# you will most likely run into issues on Windows
# it requires a very specific compiler setup
cargo add anitag2vec@0.1.0

# Or:
# new: tract backend
cargo add anitag2vec@0.3.0-dev
use anitag2vec::{
    downloader::{ModelDownloader, KnownModel},
    model::Anitag2Vec,
    tagtok::TagSet
};

fn main() {
    println!("Downloading models...");
    let model_path = ModelDownloader::from_known(KnownModel::Anitag2VecV1, false).download().unwrap();
    let tokenizer_path = ModelDownloader::from_known(KnownModel::Anitag2VecTokenizerV1, false).download().unwrap();
    println!("Done!");

    let mut anitag2vec = Anitag2Vec::load_from_file_v1(model_path, tokenizer_path).unwrap();
    let example = vec![
        TagSet::new(["transcend", "uma musume", "imageset", "japanese"]),
        TagSet::new(["Comedy", "TV", "Anime", "Romance"]),
    ];
    let emb = anitag2vec.run_inference(example).unwrap();
    println!("{:?}", emb.shape()); // [2, 128]

    // Similar to emb.map(|nd| ..)
    // This representation allows various math operations
    println!("{}", emb.ndarray());

    // or alternatively as Vec<Vec<f32>>
    // println!("{:?}", emb.to_vec());
}

Architecture

Refer to my blog post, in which I detail the design decisions and how the model works.