Multiple-choice questions (MCQs) are widely used in the evaluation of large language models (LLMs) due to their simplicity and efficiency.
March 26, 2024
We identify a significant issue: LLMs exhibit order sensitivity in bilingual MCQs, favoring answers at specific positions, most notably the first position.
2024/05/20 · Multiple-choice questions (MCQs) are widely used in the evaluation of large language models (LLMs) due to their simplicity and efficiency.
2024/05/24 · Our experiments showed that the order of candidate answers in MCQs significantly impacts LLM outputs. GPT-3.5-turbo and GPT-4 exhibited ...
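One way to make the order-sensitivity claim concrete is to present the same question under every ordering of its options and check whether the selected answer follows the content or the position. The sketch below is an illustration under stated assumptions, not the procedure used in any of the works quoted here: the `Model` interface, `order_sensitivity`, and the two toy models are hypothetical names, and a real experiment would replace the toy models with calls to an actual LLM.

```python
from collections import Counter
from itertools import permutations
from typing import Callable, Dict, Sequence

# Hypothetical interface (an assumption for this sketch): a "model" takes a
# question and an ordered sequence of options, and returns the index of the
# option it selects.
Model = Callable[[str, Sequence[str]], int]


def order_sensitivity(model: Model, question: str, options: Sequence[str]) -> Dict:
    """Query the model on every ordering of the options and summarise its choices."""
    picked_positions = Counter()  # how often each position index was chosen
    picked_answers = Counter()    # how often each option text was chosen
    n_orderings = 0

    for ordering in permutations(options):
        idx = model(question, ordering)
        picked_positions[idx] += 1
        picked_answers[ordering[idx]] += 1
        n_orderings += 1

    # Share of orderings on which the model chose its most frequent answer.
    # 1.0 means the choice never changed when the options were reordered.
    top_count = picked_answers.most_common(1)[0][1]
    return {
        "consistency": top_count / n_orderings,
        "position_share": {
            pos: count / n_orderings
            for pos, count in sorted(picked_positions.items())
        },
    }


# Toy stand-ins for an LLM, used only to show the two extremes.
def first_position_model(question: str, options: Sequence[str]) -> int:
    """Always picks the first option, regardless of content."""
    return 0


def content_model(question: str, options: Sequence[str]) -> int:
    """Always picks 'Paris', wherever it appears."""
    return list(options).index("Paris")


if __name__ == "__main__":
    question = "What is the capital of France?"
    options = ["Paris", "London", "Rome"]
    print(order_sensitivity(first_position_model, question, options))
    print(order_sensitivity(content_model, question, options))
```

A model that tracks content scores a consistency of 1.0 with its picks spread evenly across positions, while a purely position-biased model concentrates all of its picks on one position and its consistency falls to 1 divided by the number of options. Enumerating all orderings grows factorially, so for questions with many options a random sample of orderings would be a more practical choice.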
Multiple-choice questions (MCQs) are commonly used to evaluate the knowledge and abilities of large language models (LLMs) because of their simple format and ...
2024/05/23 · This paper explores the effectiveness of using multiple-choice questions to assess the capabilities of large language models (LLMs).
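Multiple-choice questions may not accurately measure the capabilities of LLMs. LLMs show order dependence on multiple-choice questions, and there is a large gap between their results on these and on long-form generation questions ...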
2024/03/26 · Multiple-choice questions (MCQs) are widely used in the evaluation of large language models (LLMs) due to their simplicity and efficiency.
2024/06/27 · The researchers demonstrate that MCQs can effectively assess the capabilities of LLMs, offering several advantages over more open-ended ...
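Multiple-choice questions (MCQs) are widely used to evaluate large language models (LLMs) due to their simplicity and efficiency. However, in knowledge-intensive scenarios that require long-form generation (LFG) answers, there are concerns about whether MCQs can truly measure the capabilities of LLMs.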