CliBench
Description
CliBench is a benchmark built from the MIMIC-IV dataset for comprehensive, realistic assessment of LLMs' clinical diagnosis capabilities across diverse specialties and patient-specific cases. Alongside diagnosis, it includes three further tasks: treatment procedure identification, lab test ordering, and medication prescription. Each task uses a structured output ontology, which allows precise evaluation at multiple levels of granularity. The benchmark supports zero-shot evaluation of LLMs.
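As a rough illustration of what multi-granular evaluation over a structured ontology can look like, the sketch below scores a set of predicted codes against gold codes at several levels of a code hierarchy. It is a minimal sketch under stated assumptions, not CliBench's actual metric or code: it assumes hierarchical codes whose ancestors can be obtained by prefix truncation, the function names `truncate` and `f1_at_level` are hypothetical, and the ICD-10-CM-style codes are toy data.

```python
def truncate(codes, level):
    """Map each code to its ancestor by keeping the first `level` characters.

    Assumption: the ontology is prefix-hierarchical, so a shorter prefix
    denotes a coarser category. This is an illustrative simplification.
    """
    return {code[:level] for code in codes}


def f1_at_level(predicted, gold, level):
    """Set-based F1 between predicted and gold codes at one granularity."""
    pred = truncate(predicted, level)
    ref = truncate(gold, level)
    true_positives = len(pred & ref)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(pred)
    recall = true_positives / len(ref)
    return 2 * precision * recall / (precision + recall)


# Toy data only: codes written without the dot, as in many coded datasets.
gold = {"I5021", "E119"}
predicted = {"I5022", "E119", "J189"}

for level in (3, 4, 5):
    print(level, round(f1_at_level(predicted, gold, level), 3))
```

On this toy input the score is higher at the coarse levels (3 and 4 characters) than at the full-code level, because the prediction `I5022` matches the gold code `I5021` in its category but not in its final character. Reporting scores per level in this way separates near misses from predictions that are wrong at every granularity.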