INSEva
Description
INSEva is a comprehensive Chinese benchmark for evaluating AI systems' knowledge and capabilities in the insurance domain. It features a multi-dimensional evaluation taxonomy spanning business areas, task formats, difficulty levels, and cognitive-knowledge dimensions. The benchmark comprises 38,704 authoritative examples and implements tailored evaluation methods that assess both the faithfulness and the completeness of open-ended responses.