Recent developments in Artificial Intelligence (AI), particularly Large Language Models (LLMs), have provided powerful tools for Natural Language Processing (NLP) tasks such as sentiment analysis. However, their fine-tuning and deployment present challenges, notably limited computational efficiency and high training costs. To address these challenges, this work applies optimization techniques such as Quantized Low-Rank Adaptation (QLoRA) for parameter-efficient fine-tuning, followed by Generalized Post-Training Quantization (GPTQ), to the Llama 3.1 LLM. To evaluate these optimizations, we apply the model to a practical task: hate speech detection, using a curated dataset comprising X (formerly Twitter) posts. Overall, the optimized model achieved a 67% reduction in size along with significant improvements in classification accuracy and inference speed compared to the base model.
Author Keywords: Generalized Post-Training Quantization, Large Language Models, Low-Rank Adaptation, Parameter-Efficient Fine-Tuning, Quantized Low-Rank Adaptation
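The two-stage pipeline summarized above can be sketched as a pair of configurations using the Hugging Face `transformers` and `peft` libraries. This is a minimal illustrative sketch, not the paper's actual settings: the adapter rank, target modules, and quantization hyperparameters below are assumptions for demonstration only.

```python
# Hypothetical configuration sketch: QLoRA fine-tuning setup followed by
# a GPTQ post-training quantization setup. Hyperparameter values are
# illustrative assumptions, not the authors' exact settings.
import torch
from transformers import BitsAndBytesConfig, GPTQConfig
from peft import LoraConfig

# Stage 1 (QLoRA): load the base model in 4-bit NF4 precision and
# attach trainable low-rank adapters to the attention projections.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",            # NormalFloat4 quantization
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_use_double_quant=True,
)
lora_config = LoraConfig(
    r=16,                                 # adapter rank (assumed)
    lora_alpha=32,                        # scaling factor (assumed)
    target_modules=["q_proj", "v_proj"],  # assumed target layers
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)

# Stage 2 (GPTQ): quantize the fine-tuned, adapter-merged model
# post-training to 4-bit weights using a calibration dataset.
gptq_config = GPTQConfig(bits=4, dataset="c4")
```

In a full pipeline, `bnb_config` and `lora_config` would be passed when loading and wrapping the base model for fine-tuning, and `gptq_config` would be applied afterwards when reloading the merged model for quantization.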