climaqa
Description
ClimaQA-Gold is an expert-annotated benchmark dataset for evaluating the quality and scientific validity of LLM outputs on climate science question answering. It consists of QA pairs derived from graduate textbooks, generated by the ClimaGen adaptive framework with climate scientists in the loop. It is complemented by ClimaQA-Silver, a large-scale synthetic QA dataset.