medr-bench

Name: arXiv/medr-bench
Author: arXiv

Description

MedR-Bench is a benchmark for evaluating reasoning-enhanced LLMs in clinical settings. It comprises 1,453 structured patient cases spanning 13 body systems and 10 specialties, each annotated with reasoning references derived from clinical case reports. It evaluates the full patient care journey (examination recommendation, diagnostic decision-making, and treatment planning) and includes a novel automated Reasoning Evaluator that scores free-text reasoning on efficiency, factuality, and completeness.
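
The three reasoning scores can be illustrated with a minimal sketch. This listing does not define the metrics, so the definitions here are assumptions: efficiency is taken as the share of reasoning steps that are effective, factuality as the share of effective steps that are correct, and completeness as the share of reference reasoning steps the model covered. The `Step` class, its labels, and `score_reasoning` are hypothetical names for illustration, not part of the MedR-Bench code; in practice the per-step labels would come from the Reasoning Evaluator rather than being supplied by hand.

```python
from dataclasses import dataclass


@dataclass
class Step:
    # Hypothetical per-step labels; assumed to be produced by an evaluator.
    effective: bool  # step advances the reasoning (not filler or repetition)
    correct: bool    # step is factually right


def score_reasoning(steps, reference_hits):
    """Compute assumed versions of the three reasoning scores.

    steps: list of labeled model reasoning steps.
    reference_hits: one bool per reference reasoning step,
        True if the model's reasoning covered that step.
    """
    effective = [s for s in steps if s.effective]
    # Assumed definition: effective steps / all steps.
    efficiency = len(effective) / len(steps) if steps else 0.0
    # Assumed definition: correct effective steps / effective steps.
    factuality = (
        sum(s.correct for s in effective) / len(effective) if effective else 0.0
    )
    # Assumed definition: covered reference steps / all reference steps.
    completeness = (
        sum(reference_hits) / len(reference_hits) if reference_hits else 0.0
    )
    return {
        "efficiency": efficiency,
        "factuality": factuality,
        "completeness": completeness,
    }
```

Under these assumptions, a response with four steps, three of them effective and two of those correct, that covers one of two reference steps would score 0.75 on efficiency, about 0.67 on factuality, and 0.5 on completeness.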

Implementations (1)

Environment	Stars	Last Updated
Pengcheng/MedR-Bench	1	3 months ago