DOI: 10.70858/tijmet.1951725 ISSN: 2667-4033

A Quantitative Evaluation of Computational Resource Constraints and Carbon Intensity in Transformer-Based Media Bias Detection

Sylwia Snarska, Krzysztof Siwek
Automated media bias detection is becoming more important as disinformation spreads faster than human fact-checkers can respond. Yet most NLP research treats model accuracy as the only measure of success and ignores training cost and energy use. This study asks a different question: is the accuracy gain from larger, more complex transformer models worth the environmental and computational price? We trained and evaluated three transformer architectures (RoBERTa-base, RoBERTa-large, and DeBERTa-v3-large) on a three-class bias classification task using the News Bias Data corpus. Each model was tested across multiple hyperparameter configurations, including Focal Loss, class-balanced weights, and learning rate adjustments. A TF-IDF baseline provided a classical reference point. CO₂ emissions were estimated with the Green Algorithms calculator from the hardware specifications of the training environment. DeBERTa-v3-large with Focal Loss, class-balanced weights, and a reduced learning rate achieved the best classification result, with a Macro F₁ of 0.9756. The cost was steep: training took around 140 hours and produced more than four times the emissions of the Pareto-efficient configuration, RoBERTa-large with Focal Loss, which reached a Macro F₁ of 0.9487 for 9.95 kg CO₂e. The TF-IDF baseline used 99.9% less energy but fell 20–28 percentage points short on accuracy. The relationship between model quality and computational cost is strongly non-linear: gains beyond a certain threshold require disproportionate resources. For most real-world applications, a well-tuned RoBERTa-large is a more responsible choice than the top-performing model.
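For readers unfamiliar with the loss configuration named in the abstract, the following is a minimal sketch of a class-weighted focal loss for a single example, written in plain Python. It is an illustration under stated assumptions, not the authors' implementation: the function names, the focusing parameter `gamma=2.0`, and the inverse-frequency weighting scheme are all assumptions, and the abstract does not say which class-balancing formula was used.

```python
import math


def inverse_frequency_weights(class_counts):
    """Per-class weights inversely proportional to class frequency.

    Assumption: this simple scheme stands in for the "class-balanced
    weights" mentioned in the abstract. A perfectly balanced dataset
    yields a weight of 1.0 for every class.
    """
    total = sum(class_counts)
    num_classes = len(class_counts)
    return [total / (num_classes * n) for n in class_counts]


def focal_loss(probs, target, class_weights, gamma=2.0):
    """Focal loss for one example: -w_t * (1 - p_t)**gamma * log(p_t).

    probs: predicted class probabilities (summing to 1).
    target: index of the true class.
    gamma: focusing parameter (assumed value); gamma=0 reduces the
    expression to weighted cross-entropy.
    """
    p_t = probs[target]
    return -class_weights[target] * (1.0 - p_t) ** gamma * math.log(p_t)


# Hypothetical class counts for a three-class bias task.
weights = inverse_frequency_weights([50, 30, 20])
easy = focal_loss([0.05, 0.90, 0.05], target=1, class_weights=weights)
hard = focal_loss([0.30, 0.40, 0.30], target=1, class_weights=weights)
print(easy < hard)
```

The factor `(1 - p_t)**gamma` shrinks the contribution of examples the model already classifies confidently, so training concentrates on harder examples, while the class weights raise the contribution of under-represented classes.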
