minicache

mirror of https://github.com/wassname/minicache.git synced 2026-06-27 15:15:59 +08:00

minicache — tiny disk cache for ML / research code.

minicache wraps function calls and stores their return values on disk (gzip + cloudpickle). It addresses pain points that the stdlib functools.lru_cache + pickle and existing function-cache libraries (anycache, cachier) hit on ML code:

Loaded models can't be hashed, so we use an argument blacklist (exclude=["model", "tok"]). Excluded args pass through to the function but never enter the cache key.
Tensors / pandas objects / closures can't be pickled → we use cloudpickle, which can serialize many more object types.
Pickle files grow large → gzip on disk saves 20-50%.
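
To make the three points above concrete, here is a minimal sketch of how an exclude-based disk cache could work. It is not minicache's actual implementation: `toy_cached`, `CACHE_DIR`, the SHA-256 key scheme and the `.pkl.gz` file layout are all assumptions made up for illustration, and the sketch falls back to the stdlib pickle when cloudpickle is not installed.

```python
import functools
import gzip
import hashlib
import inspect
import pickle
import tempfile
from pathlib import Path

try:
    import cloudpickle as pickler  # serializes lambdas, closures, many ML objects
except ImportError:  # fallback so the sketch runs without extra installs
    pickler = pickle

# Hypothetical cache location for this sketch only.
CACHE_DIR = Path(tempfile.mkdtemp(prefix="toy_cache_"))


def toy_cached(exclude=()):
    """Toy disk cache: excluded args reach the function but not the cache key."""
    def decorator(fn):
        sig = inspect.signature(fn)

        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            bound = sig.bind(*args, **kwargs)
            bound.apply_defaults()
            # Only non-excluded arguments are allowed to influence the key.
            key_args = {k: v for k, v in bound.arguments.items() if k not in exclude}
            payload = pickler.dumps((fn.__qualname__, sorted(key_args.items())))
            path = CACHE_DIR / f"{hashlib.sha256(payload).hexdigest()}.pkl.gz"
            if path.exists():  # cache hit: decompress and unpickle
                with gzip.open(path, "rb") as f:
                    return pickle.load(f)
            result = fn(*args, **kwargs)  # cache miss: compute, then store gzipped
            with gzip.open(path, "wb") as f:
                f.write(pickler.dumps(result))
            return result
        return wrapper
    return decorator


calls = []  # records every real (non-cached) execution


@toy_cached(exclude=["model"])
def run_eval(model, *, model_id, batch_size):
    calls.append(model_id)
    return {"model_id": model_id, "score": model(batch_size)}


first = run_eval(lambda n: n * 2, model_id="demo", batch_size=16)
second = run_eval(lambda n: n * 2, model_id="demo", batch_size=16)  # served from disk
# A different model under the same model_id still hits the old entry.
stale = run_eval(lambda n: n * 3, model_id="demo", batch_size=16)
fresh = run_eval(lambda n: n * 3, model_id="demo-v2", batch_size=16)  # new key
```

In this sketch the excluded `model` is never pickled or hashed, so the key depends only on `model_id` and `batch_size`. That is the trade-off of an argument blacklist: the caller is responsible for passing an identifier that actually changes whenever the excluded object does.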

Quick use

Install

uv add git+https://github.com/wassname/minicache.git

from minicache import cached, cache_call

@cached(exclude=["model", "tok"])  # model and tok can't be hashed; model_id stands in for them in the cache key
def run_eval(model, tok, *, model_id, name, batch_size):
    return tinymfv_evaluate(model, tok, name=name, batch_size=batch_size)

report = run_eval(model, tok, model_id="qwen-27b", name="classic", batch_size=16)
# first call: ~30 minutes, result is written to disk

report = run_eval(model, tok, model_id="qwen-27b", name="classic", batch_size=16)
# second call: ~0 minutes, returns the saved result