LLM inference engine from scratch: paged KV cache, continuous batching, chunked prefill, prefix caching, speculative decoding, CUDA graphs, tensor parallelism, and OpenAI-compatible serving
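To illustrate the first technique in the list, here is a minimal sketch of the bookkeeping behind a paged KV cache: sequences map logical token positions to fixed-size physical cache blocks, allocated on demand from a free pool. All names (`PagedKVCache`, `append_token`, `free`) are hypothetical and not taken from this engine's code; a real implementation would back each block with GPU tensor storage.

```python
class PagedKVCache:
    """Toy block manager for a paged KV cache (illustrative only).

    Each sequence gets a block table mapping its logical positions to
    physical blocks; a new block is claimed only when the current one
    fills up, so memory is allocated in block_size-token granules.
    """

    def __init__(self, num_blocks: int, block_size: int):
        self.block_size = block_size
        self.free_blocks = list(range(num_blocks))  # pool of physical block ids
        self.block_tables = {}                      # seq_id -> [physical block ids]
        self.seq_lens = {}                          # seq_id -> tokens stored so far

    def append_token(self, seq_id: int) -> tuple[int, int]:
        """Reserve a cache slot for one new token; returns (block, offset)."""
        table = self.block_tables.setdefault(seq_id, [])
        n = self.seq_lens.get(seq_id, 0)
        if n % self.block_size == 0:                # current block full, or first token
            if not self.free_blocks:
                raise MemoryError("KV cache exhausted")
            table.append(self.free_blocks.pop())
        self.seq_lens[seq_id] = n + 1
        return table[-1], n % self.block_size

    def free(self, seq_id: int) -> None:
        """Return a finished sequence's blocks to the free pool."""
        self.free_blocks.extend(self.block_tables.pop(seq_id, []))
        self.seq_lens.pop(seq_id, None)
```

Because blocks are reclaimed as soon as a sequence finishes, fragmentation stays bounded to at most one partially filled block per live sequence, which is the core memory-efficiency argument for paging the KV cache.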