Abstract
Student opinions for a course are important to educators and administrators,
regardless of the type of the course or the institution. Reading and manually
analyzing open-ended feedback becomes infeasible for massive volumes of
comments at institution level or online forums. In this paper, we collected and
pre-processed a large number of course reviews publicly available online. We
applied machine learning techniques with the goal to gain insight into student
sentiments and topics. Specifically, we utilized current Natural Language
Processing (NLP) techniques, such as word embeddings and deep neural networks,
and state-of-the-art BERT (Bidirectional Encoder Representations from
Transformers), RoBERTa (Robustly optimized BERT approach) and XLNet
(Generalized Auto-regression Pre-training). We performed extensive
experimentation to compare these techniques versus traditional approaches. This
comparative study demonstrates how to apply modern machine learning approaches
for sentiment polarity extraction and topic-based classification utilizing
course feedback. For sentiment polarity, the top model was RoBERTa with 95.5\%
accuracy and 84.7\% F1-macro, while for topic classification, an SVM (Support
Vector Machine) was the top classifier with 79.8\% accuracy and 80.6\%
F1-macro. We also provided an in-depth exploration of the effect of certain
hyperparameters on the model performance and discussed our observations. These
findings can be used by institutions and course providers as a guide for
analyzing their own course feedback using NLP models towards self-evaluation
and improvement.