Rethinking LLM-as-a-Judge: Representation-as-a-Judge with Small Language Models via Semantic Capacity Asymmetry • Paper • 2601.22588 • Published 5 days ago • 4
Adaptive Ability Decomposing for Unlocking Large Reasoning Model Effective Reinforcement Learning • Paper • 2602.00759 • Published 4 days ago • 5
PromptRL: Prompt Matters in RL for Flow-Based Image Generation • Paper • 2602.01382 • Published 2 days ago • 6
How Well Do Models Follow Visual Instructions? VIBE: A Systematic Benchmark for Visual Instruction-Driven Image Editing • Paper • 2602.01851 • Published 2 days ago • 15
Latent Chain-of-Thought as Planning: Decoupling Reasoning from Verbalization • Paper • 2601.21358 • Published 6 days ago • 6
DenseGRPO: From Sparse to Dense Reward for Flow Matching Model Alignment • Paper • 2601.20218 • Published 7 days ago • 15
FP8-RL: A Practical and Stable Low-Precision Stack for LLM Reinforcement Learning • Paper • 2601.18150 • Published 9 days ago • 6
AACR-Bench: Evaluating Automatic Code Review with Holistic Repository-Level Context • Paper • 2601.19494 • Published 8 days ago • 15
Linear representations in language models can change dramatically over a conversation • Paper • 2601.20834 • Published 6 days ago • 21
Harder Is Better: Boosting Mathematical Reasoning via Difficulty-Aware GRPO and Multi-Aspect Question Reformulation • Paper • 2601.20614 • Published 7 days ago • 115
ECO: Quantized Training without Full-Precision Master Weights • Paper • 2601.22101 • Published 5 days ago • 6
FROST: Filtering Reasoning Outliers with Attention for Efficient Reasoning • Paper • 2601.19001 • Published 8 days ago • 4
Self-Improving Pretraining: using post-trained models to pretrain better models • Paper • 2601.21343 • Published 6 days ago • 14
Beyond Imitation: Reinforcement Learning for Active Latent Planning • Paper • 2601.21598 • Published 6 days ago • 9
Scalable Power Sampling: Unlocking Efficient, Training-Free Reasoning for LLMs via Distribution Sharpening • Paper • 2601.21590 • Published 6 days ago • 12
Language-based Trial and Error Falls Behind in the Era of Experience • Paper • 2601.21754 • Published 6 days ago • 16