Do LLMs Outperform Fine-tuned Transformers in Emotion Classification?: A Case Study of Llama and RoBERTa on an Emotion Benchmark

Timothy Meinert; Anna Koufakou

doi:10.32473/flairs.39.1.141637

Journal article

Open access

Peer reviewed

The International FLAIRS Conference Proceedings, Vol.39(1)

05-06-2026

DOI: https://doi.org/10.32473/flairs.39.1.141637

Abstract

Generative large language models (LLMs) are often assumed to outperform earlier transformer-based encoders across NLP tasks, yet this assumption has not been adequately tested for emotion classification. Using a recently introduced multi-dataset emotion benchmark, we compare a Llama-based generative model with previously reported results from a fine-tuned RoBERTa classifier. The zero-shot LLM consistently underperforms the fine-tuned classifier, while few-shot prompting substantially improves LLM performance on several datasets. These findings challenge the assumption that LLMs universally surpass older transformers and highlight the continued relevance of fine-tuned models for emotion classification. At the same time, they show that few-shot prompting can unlock competitive LLM performance without task-specific training, though not on every dataset.

Open Access CC BY-NC V4.0

Published (Version of record)

Details

Title: Do LLMs Outperform Fine-tuned Transformers in Emotion Classification?
Creators: Timothy Meinert (Florida Gulf Coast University); Anna Koufakou (Florida Gulf Coast University)
Publication Details: The International FLAIRS Conference Proceedings, Vol.39(1)
Identifiers: 99385963913106570
Academic Unit: Department of Computing and Software Engineering
Language: English
Resource Type: Journal article
