medcogeval

Name: arXiv/medcogeval
Author: arXiv

Description

A multi-cognitive-level evaluation framework, inspired by Bloom’s Taxonomy, for assessing large language models in the medical domain at three cognitive levels: preliminary knowledge grasp, comprehensive knowledge application, and scenario-based problem solving. It integrates existing medical datasets into tasks targeted at each level and is used to systematically evaluate state-of-the-art general and medical LLMs from six model families. The evaluation reveals sharp performance declines as cognitive complexity increases, and model size matters more at the higher levels.
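
To make the per-level reporting concrete, the following is a minimal sketch of how accuracy could be aggregated separately for each of the three cognitive levels. The level names come from the description above; the record layout, the function name `accuracy_by_level`, and the toy outcomes are hypothetical assumptions for illustration, not the framework's actual data format, scoring code, or results.

```python
from collections import defaultdict

# Level names follow the description; everything else here is an assumption.
LEVELS = (
    "preliminary knowledge grasp",
    "comprehensive knowledge application",
    "scenario-based problem solving",
)


def accuracy_by_level(records):
    """Return {level: accuracy} for (level, is_correct) records (hypothetical layout)."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for level, is_correct in records:
        if level not in LEVELS:
            raise ValueError(f"unknown cognitive level: {level!r}")
        total[level] += 1
        correct[level] += int(bool(is_correct))
    # Report only levels that actually have examples, in the fixed level order.
    return {level: correct[level] / total[level] for level in LEVELS if total[level]}


# Made-up outcomes purely to exercise the function; not results from the paper.
toy_records = [
    (LEVELS[0], True), (LEVELS[0], True), (LEVELS[0], True), (LEVELS[0], False),
    (LEVELS[1], True), (LEVELS[1], False),
    (LEVELS[2], False), (LEVELS[2], True), (LEVELS[2], False), (LEVELS[2], False),
]
scores = accuracy_by_level(toy_records)
for level, acc in scores.items():
    print(f"{level}: {acc:.2f}")
```

Keeping the three levels as separate scores, rather than averaging them into one number, is what would let a decline across levels be visible at all.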
