Article

Prompt Injection Detection and Mitigation with AI Multiagent NLP-based Agentic Frameworks

Diego Gosmar (Polytechnic University of Turin, Mu Nu Chapter of IEEE-HKN); Deborah A. Dahl (Voiceinteroperability.ai, Linux Foundation AI and Data); Dario Gosmar (Polytechnic University of Turin, Mu Nu Chapter of IEEE-HKN)
2025
ABI

Abstract

Prompt injection is a significant challenge for generative AI systems because it can induce unintended outputs. We introduce an experimental multiagent NLP-based framework designed to address prompt injection vulnerabilities through layered detection and metadata mechanisms. The framework orchestrates specialized AI agents to generate responses, detect vulnerabilities, and mitigate injection effects. We evaluated it empirically on 500 engineered injection prompts spanning ten prompt injection categories (50 prompts per category), generated and then shuffled. The experimental results show a significant reduction in the injection score and an increased detection of prompt injection markers, indicating potential applications for mitigation. We propose novel metrics, including Injection Success Rate (ISR), Policy Override Frequency (POF), Prompt Sanitization Rate (PSR), and Compliance Consistency Score (CCS), from which a composite Total Injection Vulnerability Score (TIVS) is derived. The system uses the vendor-independent Open Floor Protocol (OFP) framework for agentic AI communication via structured JSON messages, encapsulating APIs using natural language. The work also compares against and extends a previously established multiagent experiment on hallucination mitigation to address the specific challenges of prompt injection.
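
As a rough illustration of how four per-prompt metrics might be folded into one composite score, here is a minimal Python sketch. The abstract does not give the metric definitions or the TIVS formula, so everything here is an assumption: the `PromptResult` fields, the reading of each metric as a simple rate or mean, the equal weights, and the sign convention (ISR and POF raise the score, PSR and CCS lower it). It is not the authors' formula.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class PromptResult:
    """Hypothetical per-prompt outcome record (not from the paper)."""
    injection_succeeded: bool   # assumed input to ISR
    policy_overridden: bool     # assumed input to POF
    sanitized: bool             # assumed input to PSR
    compliance: float           # assumed input to CCS, in [0, 1]


def compute_metrics(results: List[PromptResult]) -> Dict[str, float]:
    """Compute ISR, POF, PSR as rates and CCS as a mean (assumed definitions)."""
    n = len(results)
    if n == 0:
        raise ValueError("at least one result is required")
    return {
        "ISR": sum(r.injection_succeeded for r in results) / n,
        "POF": sum(r.policy_overridden for r in results) / n,
        "PSR": sum(r.sanitized for r in results) / n,
        "CCS": sum(r.compliance for r in results) / n,
    }


def composite_score(metrics: Dict[str, float],
                    weights: Optional[Dict[str, float]] = None) -> float:
    """Illustrative weighted composite; lower means less vulnerable.

    Assumption: harmful rates (ISR, POF) add to the score and protective
    ones (PSR, CCS) subtract, normalized by the total weight.
    """
    w = weights or {"ISR": 0.25, "POF": 0.25, "PSR": 0.25, "CCS": 0.25}
    total = sum(w.values())
    raw = (w["ISR"] * metrics["ISR"] + w["POF"] * metrics["POF"]
           - w["PSR"] * metrics["PSR"] - w["CCS"] * metrics["CCS"])
    return raw / total


if __name__ == "__main__":
    sample = [
        PromptResult(True, False, True, 0.5),
        PromptResult(False, False, True, 1.0),
    ]
    m = compute_metrics(sample)
    print(m, composite_score(m))
```

Under these assumptions the score falls in [-1, 1], and a more negative value indicates stronger mitigation. Any real use would substitute the definitions and weights given in the full paper.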
