IEEE Access, vol.13, pp.183156-183170, 2025 (SCI-Expanded, Scopus)
Open-access chatbots face escalating cybersecurity risks from adversarial exploitation. This paper presents the case of ChatTEDU, a dual-LLM architecture designed to protect open-access AI systems from sophisticated adversarial attacks while maintaining user-experience quality. During a two-month deployment at TED University, we analyzed 4,501 unique real-world interactions, including 180 malicious attempts targeting the system through prompt injection, jailbreaking, and content-manipulation attacks. Our dual-layer security approach separates content moderation from response generation by using two specialized language models: the first model (LLM-1) acts as a security filter, analyzing incoming queries for threats, whereas the second model (LLM-2) generates educational responses only after validation. This architecture blocked 100% of the identified attacks with a false-positive rate of only 0.28%, demonstrating robust protection without compromising legitimate educational interactions. The system sustained over 100 concurrent users during peak registration periods without security breaches or performance degradation. Analysis of the attacks revealed that 77.8% of threats relied on technical exploitation rather than content-based manipulation, with multilingual attacks comprising 15% of attempts. The dual-LLM approach introduced only 18% latency overhead while providing comprehensive protection against prompt injection, jailbreaking, spam insertion, and denial-of-service attacks. This study provides practical guidance for implementing robust security measures in public-facing AI deployments worldwide.
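The gating pattern described above, where LLM-2 generates a response only after LLM-1 validates the query, can be sketched as follows. This is a minimal illustrative sketch, not the ChatTEDU implementation: both model stages are stand-in stubs (a keyword-based filter in place of the real moderation model, and a canned reply in place of the real answer model), and all function names and markers are hypothetical.

```python
from dataclasses import dataclass

# Hypothetical markers for illustration only; a real deployment would use a
# moderation-tuned language model rather than keyword matching.
SUSPICIOUS_MARKERS = ("ignore previous instructions", "system prompt override", "jailbreak")


@dataclass
class Verdict:
    """Output of the first-stage security filter."""
    allowed: bool
    reason: str


def llm1_moderate(query: str) -> Verdict:
    """Stage 1 (LLM-1): classify the incoming query as safe or malicious.

    Stubbed here with a marker scan so the control flow is runnable.
    """
    lowered = query.lower()
    for marker in SUSPICIOUS_MARKERS:
        if marker in lowered:
            return Verdict(False, f"blocked: matched '{marker}'")
    return Verdict(True, "clean")


def llm2_respond(query: str) -> str:
    """Stage 2 (LLM-2): generate the educational answer (stubbed)."""
    return f"[answer to: {query}]"


def handle(query: str) -> str:
    """Dual-LLM pipeline: LLM-2 runs only after LLM-1 validates the query."""
    verdict = llm1_moderate(query)
    if not verdict.allowed:
        return "Request refused by the security filter."
    return llm2_respond(query)
```

The key design choice this sketch captures is the separation of concerns: the generation model never sees a query that the moderation model has rejected, so a successful jailbreak must defeat both specialized stages rather than one general-purpose model.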