Comparative Analysis of PCA-Transformed Soil Data and ML Models for Maize Yield Prediction in Nigeria

Authors

  • Ezra Daniel Dzarma, Department of Operations Research, Modibbo Adama University, Yola, Nigeria

DOI:

https://doi.org/10.26765/DRJEIT47187932

Keywords:

Soil Fertility, Agronomic Modeling, Principal Component Analysis (PCA), Random Forest (RF), Artificial Neural Network (ANN), Precision Agriculture, Gradient Boosting Machine (GBM)

Abstract

This study predicts maize yield using soil data from long-term trials by the International Institute of Tropical Agriculture (IITA) in Ibadan, Nigeria. Multi-year measurements from experimental and farmer-managed fields covered pH, organic matter, nitrogen, phosphorus, exchangeable cations, texture, and micronutrients. To manage multicollinearity, the variables were standardized and analysed using principal component analysis (PCA). Six principal components (PCs) explained over 80% of the variance, capturing fertility and texture gradients. These PCs were used as predictors in three machine-learning models: Random Forest (RF), Gradient Boosting Machine (GBM), and Artificial Neural Network (ANN). RF achieved the highest accuracy (R² ≈ 0.89; RMSE ≈ 0.59), outperforming GBM (R² ≈ 0.50) and ANN (R² ≈ 0.36). PCA loadings and RF feature importance identified soil organic matter, nitrogen, cation exchange capacity, and texture as major yield drivers. The results indicate that PCA improves data efficiency and interpretability, while RF provides robust, reliable predictions of maize yield, supporting precision agriculture in tropical systems.
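
The preprocessing step described in the abstract (standardize the soil variables, run PCA, and keep enough components to pass an 80% variance threshold) can be sketched as follows. This is a minimal illustration and not the author's code: the function name `pca_for_variance`, the use of NumPy's SVD, and the synthetic data standing in for the IITA soil measurements are all assumptions made for the example.

```python
import numpy as np

def pca_for_variance(X, threshold=0.80):
    """Hypothetical helper: z-score the columns of X, run PCA via SVD, and
    keep the fewest components whose cumulative explained variance reaches
    `threshold`. Assumes no column of X is constant."""
    X = np.asarray(X, dtype=float)
    # Standardize so variables on different scales contribute equally.
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    # Rows of Vt are the component loadings; s holds the singular values.
    _, s, Vt = np.linalg.svd(Z, full_matrices=False)
    explained = s**2 / np.sum(s**2)
    cumulative = np.cumsum(explained)
    # First index at which the cumulative share reaches the threshold.
    k = min(int(np.searchsorted(cumulative, threshold)) + 1, len(s))
    scores = Z @ Vt[:k].T  # component scores, usable as model predictors
    return scores, Vt[:k], cumulative[:k]

# Synthetic stand-in data (NOT the study's data): 200 samples of 10
# correlated variables generated from 3 latent factors plus noise.
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 3))
mixing = rng.normal(size=(3, 10))
X = latent @ mixing + 0.3 * rng.normal(size=(200, 10))

scores, loadings, cumulative = pca_for_variance(X, threshold=0.80)
print("components kept:", scores.shape[1])
print("cumulative explained variance:", np.round(cumulative, 3))
```

The retained scores are mutually uncorrelated, which is the property that addresses multicollinearity; in a workflow like the one described, they would then be passed to the RF, GBM, and ANN regressors in place of the raw soil variables.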

Published

2025-12-03

How to Cite

Dzarma, E. D. (2025). Comparative Analysis of PCA-Transformed Soil Data and ML Models for Maize Yield Prediction in Nigeria. Direct Research Journal of Engineering and Information Technology, 13(3), 79-86. https://doi.org/10.26765/DRJEIT47187932