
M Saad Salman

MSS444


AI & ML interests

None yet

Organizations

None yet

upvoted 4 papers 1 day ago

Rethinking LLM-as-a-Judge: Representation-as-a-Judge with Small Language Models via Semantic Capacity Asymmetry

Paper • 2601.22588 • Published 5 days ago • 4

Adaptive Ability Decomposing for Unlocking Large Reasoning Model Effective Reinforcement Learning

Paper • 2602.00759 • Published 4 days ago • 5

PromptRL: Prompt Matters in RL for Flow-Based Image Generation

Paper • 2602.01382 • Published 2 days ago • 6

How Well Do Models Follow Visual Instructions? VIBE: A Systematic Benchmark for Visual Instruction-Driven Image Editing

Paper • 2602.01851 • Published 2 days ago • 15

upvoted 2 papers 2 days ago

Latent Chain-of-Thought as Planning: Decoupling Reasoning from Verbalization

Paper • 2601.21358 • Published 6 days ago • 6

DenseGRPO: From Sparse to Dense Reward for Flow Matching Model Alignment

Paper • 2601.20218 • Published 7 days ago • 15

upvoted 14 papers 5 days ago

Self-Distillation Enables Continual Learning

Paper • 2601.19897 • Published 7 days ago • 24

Persona Prompting as a Lens on LLM Social Reasoning

Paper • 2601.20757 • Published 7 days ago • 3

FP8-RL: A Practical and Stable Low-Precision Stack for LLM Reinforcement Learning

Paper • 2601.18150 • Published 9 days ago • 6

AACR-Bench: Evaluating Automatic Code Review with Holistic Repository-Level Context

Paper • 2601.19494 • Published 8 days ago • 15

Linear representations in language models can change dramatically over a conversation

Paper • 2601.20834 • Published 6 days ago • 21

DeepSeek-OCR 2: Visual Causal Flow

Paper • 2601.20552 • Published 7 days ago • 50

Reinforcement Learning via Self-Distillation

Paper • 2601.20802 • Published 7 days ago • 35

Harder Is Better: Boosting Mathematical Reasoning via Difficulty-Aware GRPO and Multi-Aspect Question Reformulation

Paper • 2601.20614 • Published 7 days ago • 115

Scalable Power Sampling: Unlocking Efficient, Training-Free Reasoning for LLMs via Distribution Sharpening

Paper • 2601.21590 • Published 6 days ago • 12

Language-based Trial and Error Falls Behind in the Era of Experience

Paper • 2601.21754 • Published 6 days ago • 16
