Obinwanne, Uchechukwu Emmanuel
Optimized Large Language Model for Hate Speech Detection
Recent developments in Artificial Intelligence (AI), particularly Large Language Models (LLMs), have provided powerful tools for Natural Language Processing (NLP) tasks like sentiment analysis. However, fine-tuning and deploying these models remain challenging, particularly because of their heavy computational requirements and high training costs. To address these challenges, this work applies optimization techniques such as Quantized Low-Rank Adaptation (QLoRA) for parameter-efficient fine-tuning, followed by Generalized Post-Training Quantization (GPTQ), to the Llama 3.1 LLM. To evaluate these optimizations, we apply the model to a practical task: hate speech detection, using a curated dataset comprising X (formerly Twitter) posts. Overall, the optimized model achieved a 67% reduction in size along with significant improvements in classification accuracy and inference speed compared to the base model.
Author Keywords: Generalized Post-Training Quantization, Large Language Models, Low-Rank Adaptation, Parameter-Efficient Fine-Tuning, Quantized Low-Rank Adaptation
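To illustrate the parameter-efficiency idea behind the (Q)LoRA technique named in the abstract: instead of updating a full pretrained weight matrix W, a trainable low-rank update BA of rank r is learned on top of the frozen W. The following NumPy sketch is illustrative only; the layer dimensions and rank are hypothetical and do not correspond to the actual Llama 3.1 configuration or the paper's implementation.

```python
import numpy as np

d, k, r = 4096, 4096, 8  # hypothetical layer dimensions and LoRA rank

W = np.random.randn(d, k)          # frozen pretrained weight (not updated)
A = np.random.randn(r, k) * 0.01   # trainable low-rank factor
B = np.zeros((d, r))               # trainable factor, zero-initialized so the
                                   # update BA starts as a no-op

def forward(x):
    # LoRA forward pass: frozen base projection plus low-rank correction
    return W @ x + B @ (A @ x)

# Only A and B are trained, so the trainable parameter count drops from
# d*k to r*(d + k):
full_params = d * k
lora_params = r * (d + k)
print(f"trainable fraction: {lora_params / full_params:.4%}")
# → trainable fraction: 0.3906%
```

In QLoRA, the frozen W would additionally be stored in a quantized (e.g. 4-bit) format, which is where the memory savings during fine-tuning come from; GPTQ then quantizes the merged model after training.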