2023/07/04 · We present a benchmark, CARE-MI, for evaluating LLM misinformation in: 1) a sensitive topic, specifically the maternity and infant care domain; and 2) a ...
This paper introduces a new Chinese benchmark, CARE-MI, designed for evaluating LLM misinformation in the maternity and infant care subfields.
2024/04/07 · It contains 1,612 expert-checked questions, accompanied by human-selected references. Using our benchmark, we conduct extensive experiments ...
2023/10/26 · ... CARE-MI aims solely at evaluating the misinformation in long-form generation tasks for Chinese LLMs on the topic of maternity and infant care.
The benchmark is intended solely for evaluating misinformation in long-form (LF) generation by Chinese Large Language Models (LLMs) in the maternity and infant ...
A benchmark for evaluating LLM misinformation in a sensitive topic, specifically the maternity and infant care domain; and a language other than English, ...
2024/07/09 · CARE-MI: Chinese Benchmark for Misinformation Evaluation in Maternity and Infant Care ... Evaluation Benchmark for Chinese Large Language Models.
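2023/10/26 · Detailed information for the article "CARE-MI: Chinese Benchmark for Misinformation Evaluation in Maternity and Infant Care [JST / Kyoto University machine translation]".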
Lu Wei. Latest. CARE-MI: Chinese benchmark for misinformation evaluation in maternity and infant care. Institute for Datability Science, Osaka University. TEL: 06-6105-6074
Connected Papers is a visual tool to help researchers and applied scientists find academic papers relevant to their field of work.