Article

Multi-Class Detection of Humanized AI Text Using Machine Learning and Transformer Models

Batyr Sharimbayev, SDU University, Department of Information Systems, Kaskelen, Kazakhstan
Aiken Kazin, SDU University, Department of Mathematics, Kaskelen, Kazakhstan
Shirali Kadyrov, New Uzbekistan University, Department of General Education, Tashkent, Uzbekistan

2025


Abstract

Advanced large language models (LLMs) now generate human-like text, which makes both AI-generated content and humanized AI content harder to detect. This study evaluates Logistic Regression, Bidirectional LSTM, and DeBERTa for multi-class detection of human-written, AI-generated, and humanized AI text. We introduce a novel dataset of 30,000 texts, including 10,000 humanized samples created via a LangChain-based pipeline with GPT-4o and verified with ZeroGPT to have reduced AI detectability. Experimental results show that DeBERTa achieves 96.93% accuracy in distinguishing the three text classes, outperforming Logistic Regression (93.43%) and Bidirectional LSTM (93.77%). Our approach leverages stylometric features and deep contextual embeddings to address real-world challenges such as stylistic overlap and adversarial paraphrasing. Key contributions include the dataset, a comparative model evaluation, and insights into detecting humanized AI text, with implications for content moderation, academic integrity, and misinformation prevention.
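The abstract mentions stylometric features but does not say which ones were used. As a rough illustration only, the sketch below computes a few common stylometric measures for a single text using the Python standard library. The function name `stylometric_features` and the four features chosen (average sentence length, type-token ratio, average word length, comma rate) are assumptions made for this example, not the authors' feature set.

```python
import re
from statistics import mean


def stylometric_features(text: str) -> dict[str, float]:
    """Compute a few simple stylometric features for one text.

    The feature set is illustrative only; the abstract does not list
    which stylometric features the study actually used.
    """
    # Split into sentences after '.', '!' or '?' followed by whitespace.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    # Lowercased word tokens made of letters and apostrophes.
    words = re.findall(r"[a-z']+", text.lower())
    if not words:
        # Empty or non-alphabetic input: return zeros instead of dividing by zero.
        return {
            "avg_sentence_len": 0.0,
            "type_token_ratio": 0.0,
            "avg_word_len": 0.0,
            "comma_rate": 0.0,
        }
    return {
        # Mean number of words per sentence.
        "avg_sentence_len": len(words) / max(len(sentences), 1),
        # Vocabulary diversity: distinct words divided by total words.
        "type_token_ratio": len(set(words)) / len(words),
        # Mean number of characters per word.
        "avg_word_len": mean(len(w) for w in words),
        # Commas per word, a crude punctuation-usage signal.
        "comma_rate": text.count(",") / len(words),
    }


if __name__ == "__main__":
    print(stylometric_features("The cat sat. The cat ran, fast!"))
```

Feature vectors of this kind could be fed to a classical classifier such as Logistic Regression for the three-class task (human-written, AI-generated, humanized AI), whereas a transformer like DeBERTa would typically operate on the raw text.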

Topics

Hate Speech and Cyberbullying Detection; Authorship Attribution and Profiling; Topic Modeling

Identifiers

DOI: 10.1109/iccike67021.2025.11318272

Citations and references

Cited by 09 references
