Real-Time, Model-Agnostic Generated Answer Detection in Text-Based Chat Interviews
, Chief Data Scientist, Sapia.ai
, Senior Data Scientist, Sapia.ai
, Senior Machine Learning Engineer, Sapia.ai
Large language models (LLMs) such as GPT-4 can now generate realistic text in real time that's difficult to distinguish from human-written content. When job candidates use LLMs to generate their answers, the reliability, validity, and fairness of text-based chat interview assessments can be compromised. In this work, we demonstrate a real-time, model-agnostic detector of LLM-generated answers built with NVIDIA NeMo, TensorRT, and Triton. To provide real-time warnings to chat interview users, our detection service responds with sub-second latency and scales sublinearly with text length. Our model-agnostic detector is also more scalable than model-specific methods, which are limited to detecting answers from particular LLMs and are hard to keep current given how frequently new LLMs are released.
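As a rough illustration of how a chat interview front end might query such a Triton-hosted detector in real time, the sketch below uses the standard Triton HTTP client. The model name (`answer_detector`) and the tensor names (`TEXT`, `PROB_GENERATED`) are hypothetical placeholders, not the deployed configuration described in this work.

```python
# Minimal sketch of a real-time detection request against a Triton Inference Server.
# Assumes a text-in / probability-out model; names below are illustrative only.
import numpy as np
import tritonclient.http as httpclient

client = httpclient.InferenceServerClient(url="localhost:8000")

def score_answer(answer: str) -> float:
    """Return the estimated probability that `answer` was LLM-generated."""
    # Pack the raw answer text as a single BYTES tensor.
    text = np.array([answer.encode("utf-8")], dtype=object)
    inp = httpclient.InferInput("TEXT", text.shape, "BYTES")  # hypothetical input name
    inp.set_data_from_numpy(text)

    out = httpclient.InferRequestedOutput("PROB_GENERATED")   # hypothetical output name
    result = client.infer(model_name="answer_detector", inputs=[inp], outputs=[out])
    return float(result.as_numpy("PROB_GENERATED")[0])

# Example: warn the interviewer/candidate if the score crosses a threshold.
if score_answer("I am passionate about data-driven decision making...") > 0.9:
    print("Warning: this answer may be LLM-generated.")
```

In a production setting the detection call would sit behind the chat service so that each submitted answer is scored before the next question is shown, keeping the end-to-end warning latency within the sub-second budget noted above.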