Publications

NIBIO's employees contribute to several hundred scientific articles and research reports every year. You can browse or search our collection, which contains references and links to these publications as well as other research and dissemination activities. The collection is continuously updated with new and historical material.

2022

Abstract

Semi- and nonparametric models are popular in the area-based approach (ABA) using airborne laser scanning. It is unclear, however, how many predictors and training plots are needed to provide accurate predictions without overfitting. This work aims to explore these limits for various approaches: ordinary least squares regression (OLS), generalized additive models (GAM), least absolute shrinkage and selection operator (LASSO), random forest (RF), support vector machine (SVM), and Gaussian process regression (GPR). We modeled timber volume (m³·ha⁻¹) for four boreal sites using ABA with 2–39 predictors and 20–500 training plots. OLS, GAM, LASSO, and SVM overfitted as the number of predictors approached the number of training plots. They required ≥15 plots per predictor to provide accurate predictions (RMSE ≤30%). GAM required ≥250 plots regardless of the number of predictors. The number of predictors only mildly affected RF and GPR, but they required ≥200 and ≥250 training plots, respectively. RF did not overfit in any circumstances, whereas GPR overfitted even with 500 training plots. Overall, using up to 39 predictors did not generally result in overfitting, and for most model types, it resulted in better accuracy for sufficiently large datasets (≥250 plots).
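
The thresholds reported in the abstract can be read as a rough planning rule. The sketch below is only an illustration of that reading, not the authors' code: the function and dictionary names are hypothetical, GAM is assumed to follow its flat ≥250-plot requirement rather than the per-predictor rule, and the RMSE percentage is assumed to be RMSE relative to the mean observed value, which the abstract does not define.

```python
import math

# Thresholds taken from the abstract; the grouping and names are illustrative.
PLOTS_PER_PREDICTOR = {"OLS": 15, "LASSO": 15, "SVM": 15}
# Assumption: GAM is governed by its flat requirement of 250 plots.
MIN_PLOTS_FLAT = {"GAM": 250, "RF": 200, "GPR": 250}


def min_training_plots(model: str, n_predictors: int) -> int:
    """Return the smallest training-set size suggested by the abstract's thresholds."""
    if n_predictors < 1:
        raise ValueError("n_predictors must be at least 1")
    key = model.upper()
    if key in PLOTS_PER_PREDICTOR:
        # Per-predictor rule: at least 15 plots for each predictor.
        return PLOTS_PER_PREDICTOR[key] * n_predictors
    if key in MIN_PLOTS_FLAT:
        # Flat rule: a fixed minimum regardless of the number of predictors.
        return MIN_PLOTS_FLAT[key]
    raise KeyError(f"no threshold recorded for model type {model!r}")


def relative_rmse(observed, predicted) -> float:
    """RMSE as a percentage of the mean observed value (assumed definition)."""
    if not observed or len(observed) != len(predicted):
        raise ValueError("observed and predicted must be non-empty and equal in length")
    mse = sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed)
    mean_observed = sum(observed) / len(observed)
    return 100.0 * math.sqrt(mse) / mean_observed


if __name__ == "__main__":
    for name in ("OLS", "GAM", "LASSO", "RF", "SVM", "GPR"):
        print(name, min_training_plots(name, 10))
```

Under these assumptions, an OLS model with 10 predictors would call for at least 150 training plots, while RF would call for at least 200 whatever the predictor count. Note that the abstract also reports GPR overfitting even with 500 plots, so meeting a threshold here speaks to accuracy (RMSE ≤30%), not to the absence of overfitting.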