DOI: 10.25259/jaes_6_2025 ISSN: 1658-9386

The Role of Machine Learning in Accounting: A Study on Automated Financial Statement Analysis and Prediction

Maysoon Khoja

Objectives

The integration of machine learning into accounting has advanced predictive analytics, yet how best to combine structured financial data (“hard data”) with textual sentiment (“soft data”) remains unclear. This study addresses that gap by examining whether appending low-dimensional text features to strong predictive models enhances accuracy.

Material and Methods

We train an XGBoost model on financial ratios from 10-K reports and compare its performance against a naive hybrid model that adds Loughran-McDonald (2011) dictionary-based sentiment features from the Management's Discussion and Analysis (MD&A) section.
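
As a rough illustration of what "naive hybrid" means here, the sketch below scores a text with a dictionary and appends the resulting low-dimensional features to a vector of ratios. It is a minimal sketch under stated assumptions, not the study's pipeline: the word lists are tiny stand-ins rather than the Loughran-McDonald dictionary, the function names `sentiment_features` and `hybrid_row` are hypothetical, and the ratio values are invented for the example.

```python
import re

# Tiny illustrative word lists; NOT the actual Loughran-McDonald dictionary.
NEGATIVE = {"loss", "decline", "impairment", "litigation"}
POSITIVE = {"growth", "improved", "strong", "gain"}

def sentiment_features(text):
    """Return [negative share, positive share] of the tokens in the text."""
    tokens = re.findall(r"[a-z]+", text.lower())
    n = len(tokens) or 1  # guard against empty text
    neg = sum(t in NEGATIVE for t in tokens) / n
    pos = sum(t in POSITIVE for t in tokens) / n
    return [neg, pos]

def hybrid_row(ratios, mdna_text):
    """Naive hybrid: append the text features to the structured ratios."""
    return list(ratios) + sentiment_features(mdna_text)

# Hypothetical inputs: three made-up ratios and a one-sentence MD&A excerpt.
ratios = [1.8, 0.42, 0.07]
mdna = "Revenue growth was strong despite a litigation loss."
row = hybrid_row(ratios, mdna)
```

The structured-only model would be trained on `ratios` alone and the hybrid model on `row`; the comparison between the two is what the study reports.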

Results

Empirical results reveal a “Signal Dilution Effect”: the hybrid model statistically underperforms the pure structured-data model. SHAP value analysis confirms that while structured ratios dominate predictive contributions, naive text concatenation introduces noise that degrades the performance of advanced non-linear classifiers.

Conclusion

This research contributes to accounting analytics by cautioning against indiscriminate feature stacking and advocating for separated processing strategies in financial prediction systems.
