Article

A Cascade of Evaluation Biases in LLM-Based Knowledge Graph Verification

Anatoliy Kremenchutskiy, Bukhara State University
ABI

Abstract

Large Language Models (LLMs) are increasingly deployed as automated evaluators for knowledge graph (KG) verification, yet the biases they introduce into this process remain poorly characterized. We present a systematic investigation of four interconnected evaluation biases (verbosity bias, acquiescence bias, negation asymmetry, and position bias) that form a compounding cascade in LLM-based KG verification. Using four locally deployed 7–9B parameter models (Qwen 2.5:7b, Gemma2:9b, Llama 3.1:8b, and Mistral:7b) evaluated on 42–100 knowledge graph triples across multiple datasets, we demonstrate that: (1) verbose model responses inflate verification accuracy by up to 47 percentage points (logistic regression OR = 1.90 per 10 additional words, p < 0.001); (2) acquiescence toward known-false triples ranges from 8.9% to 33.3% across models, with sharp domain-dependent variation (0–70% within a single model); (3) negation comprehension drops by 9.9–31.8 percentage points on false versus true triples; and (4) multiple-choice position bias reaches statistical significance (χ²(3) = 14.33, p < 0.01), with primacy effects of up to 100% for position A. These biases interact sequentially and may compound: verbosity inflates string-match scores, which masks acquiescence, which in turn compounds with negation failures to produce systematically over-optimistic verification. We propose the cascade model as a diagnostic framework and discuss mitigation strategies for each bias layer.
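To make the two reported statistics easier to read, the following is a minimal sketch rather than the authors' analysis code. It assumes the position-bias test is a Pearson chi-square goodness-of-fit test against a uniform distribution over four answer positions (which would give the reported 3 degrees of freedom), and that the reported odds ratio scales multiplicatively with response length, as in a standard logistic regression. The function names and the answer counts are hypothetical and are not taken from the paper.

```python
def chi_square_gof(observed):
    """Pearson chi-square statistic against a uniform expected distribution."""
    total = sum(observed)
    expected = total / len(observed)
    return sum((o - expected) ** 2 / expected for o in observed)


def odds_multiplier(or_per_10_words, extra_words):
    """Scale an odds ratio quoted per 10 words to an arbitrary word count.

    Assumes a logistic model that is linear in word count, so odds multiply.
    """
    return or_per_10_words ** (extra_words / 10)


# Chi-square critical value for df = 3 at alpha = 0.01 (standard table value).
CRITICAL_DF3_ALPHA_01 = 11.345

# Hypothetical counts of how often a model chose each position (illustrative only).
hypothetical_counts = {"A": 40, "B": 22, "C": 20, "D": 18}

stat = chi_square_gof(list(hypothetical_counts.values()))
df = len(hypothetical_counts) - 1
significant = stat > CRITICAL_DF3_ALPHA_01

# The reported statistic from the abstract, checked against the same threshold.
reported_stat = 14.33
reported_significant = reported_stat > CRITICAL_DF3_ALPHA_01

# Odds multiplier implied by OR = 1.90 per 10 words for 20 additional words.
multiplier_20_words = odds_multiplier(1.90, 20)

print(f"chi2({df}) = {stat:.2f}, significant at 0.01: {significant}")
print(f"reported chi2(3) = {reported_stat}, significant at 0.01: {reported_significant}")
print(f"odds multiplier for 20 extra words: {multiplier_20_words:.2f}")
```

Under these assumptions, the reported χ²(3) = 14.33 exceeds the 0.01 critical value of 11.345, consistent with the stated p < 0.01, and an odds ratio of 1.90 per 10 words would imply the odds of a "correct" score roughly multiply by 3.61 for a response 20 words longer.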
