DistiLLM-2: A Contrastive Approach Boosts the Distillation of LLMs Paper • 2503.07067 • Published Mar 10, 2025 • 32
PRIMA.CPP: Speeding Up 70B-Scale LLM Inference on Low-Resource Everyday Home Clusters Paper • 2504.08791 • Published Apr 7, 2025 • 139
[mixed] Image Generation Stack Collection — The stuff we actually use, pruned on an ongoing basis. • 11 items • Updated about 17 hours ago • 1
UniReason 1.0: A Unified Reasoning Framework for World Knowledge Aligned Image Generation and Editing Paper • 2602.02437 • Published 1 day ago • 71
FineInstructions: Scaling Synthetic Instructions to Pre-Training Scale Paper • 2601.22146 • Published 5 days ago • 8
Self-Distilled Reasoner: On-Policy Self-Distillation for Large Language Models Paper • 2601.18734 • Published 8 days ago • 2
Self-Improving Pretraining: using post-trained models to pretrain better models Paper • 2601.21343 • Published 6 days ago • 14
Cache What Lasts: Token Retention for Memory-Bounded KV Cache in LLMs Paper • 2512.03324 • Published Dec 3, 2025 • 1