inseva

Description

INSEva is a comprehensive Chinese benchmark for evaluating AI systems' knowledge and capabilities in the insurance domain. It organizes its evaluation along a multi-dimensional taxonomy covering business areas, task formats, difficulty levels, and cognitive-knowledge dimensions. The benchmark comprises 38,704 authoritative examples and uses tailored evaluation methods to assess both the faithfulness and the completeness of open-ended responses.

