Abstract
To improve the accuracy of reference evapotranspiration (ETo) prediction models based on public weather forecasts, the categorical weather type and wind level data were converted to numerical form using four encoding methods: Ordinal (ORD), One-Hot (O-H), Target (TAR), and CatBoost (CAT) encoding. The Light Gradient Boosting Machine (LGB) algorithm was combined with these methods to build ETo prediction models based on the categorical features of weather forecasts. The results show that introducing encoded weather type and wind level data can effectively improve the accuracy of the LGB3 model (R² changed by -0.97% to 9.36% relative to LGB1), with the improvement ranked O-H > CAT > TAR > ORD. The LGB4 model, which adds only encoded weather type data, achieves accuracy similar to that of LGB3, whereas adding only wind level data makes no significant contribution to the accuracy of the LGB5 model and may even introduce noise that reduces accuracy. Therefore, pre-processing weather type and wind level data with O-H encoding expands the input dimension and improves model accuracy, and it is recommended for precise ETo prediction in regions that lack meteorological stations or have incomplete data types.
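The four encodings named in the abstract can be illustrated with a minimal, dependency-free sketch. It is not the authors' implementation: the function names, the toy weather labels and ETo values, the alphabetical ordering used for ordinal codes, and the smoothing weight `a` are all assumptions made for illustration. The CatBoost-style encoder follows the common "ordered target statistic" idea (each row is encoded from earlier rows only, with the global target mean as prior) and uses the given row order instead of random permutations.

```python
from collections import defaultdict


def ordinal_encode(values):
    """ORD: map each category to an integer (here: alphabetical order, an assumption)."""
    mapping = {c: i for i, c in enumerate(sorted(set(values)))}
    return [mapping[v] for v in values]


def one_hot_encode(values):
    """O-H: one binary indicator column per distinct category."""
    cats = sorted(set(values))
    return [[1 if v == c else 0 for c in cats] for v in values]


def target_encode(values, targets):
    """TAR: replace each category with the mean target observed for it."""
    sums, counts = defaultdict(float), defaultdict(int)
    for v, t in zip(values, targets):
        sums[v] += t
        counts[v] += 1
    return [sums[v] / counts[v] for v in values]


def catboost_encode(values, targets, a=1.0):
    """CAT: ordered target statistic using only rows seen so far.

    The prior is the global target mean and `a` is its weight (both assumptions).
    """
    prior = sum(targets) / len(targets)
    sums, counts = defaultdict(float), defaultdict(int)
    encoded = []
    for v, t in zip(values, targets):
        encoded.append((sums[v] + prior * a) / (counts[v] + a))
        sums[v] += t  # update statistics only after encoding the current row
        counts[v] += 1
    return encoded


# Toy data, invented for illustration only (not taken from the paper).
weather = ["sunny", "cloudy", "rain", "sunny"]
eto = [5.0, 3.0, 1.0, 4.0]

ord_codes = ordinal_encode(weather)
oh_codes = one_hot_encode(weather)
tar_codes = target_encode(weather, eto)
cat_codes = catboost_encode(weather, eto)
```

One-hot encoding is the only one of the four that widens the feature matrix (one column per category), which matches the abstract's remark that it expands the input dimension; the other three produce a single numeric column each. The encoded columns would then be appended to the numeric forecast features before training the gradient boosting model.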
Translated title of the contribution | A Novel Reference Evapotranspiration Forecasting Model Based on Categorical Feature Encoding Methods |
---|---|
Original language | Chinese (Traditional) |
Pages (from-to) | 1402-1419 |
Number of pages | 18 |
Journal | Yingyong Jichu yu Gongcheng Kexue Xuebao/Journal of Basic Science and Engineering |
Volume | 30 |
Issue number | 6 |
Publication status | Published - Dec 2022 |
Externally published | Yes |
Keywords
- Categorical feature encoding
- Data pre-processing
- Machine learning
- Reference evapotranspiration
- Weather forecast