TY - JOUR
T1 - SSCL-GBM: A Semi-Supervised Stock Prediction Approach With Custom Loss Function
AU - Wang, Huijia
AU - Stefanidis, Angelos
AU - Jiang, Zhengyong
AU - Su, Jionglong
N1 - Publisher Copyright: Copyright © 2025 Huijia Wang et al. Journal of Mathematics published by John Wiley & Sons Ltd.
PY - 2025
Y1 - 2025
AB - Accurately identifying effective market features and making reasonable model predictions are crucial for investment decision-making in financial market prediction. This study proposes a LightGBM model, semi-supervised classification and learning with gradient boosting machine (SSCL-GBM), that combines semi-supervised learning and a custom loss function for classification tasks in stock price prediction. In typical return prediction tasks, labels are derived from future price movements, which can create a temporal mismatch between historical features and future-dependent labels. We adopt a pseudo-labeling mechanism to approximate labels in the final prediction window to address this issue and avoid relying on unavailable future outcomes during training. This enables the model to utilize the most recent data without violating temporal causality. Additionally, we design a custom loss function that balances return and risk, significantly reducing false positives. Experimental results show that our SSCL-GBM model performs significantly better on key indicators, such as cumulative returns and the Sharpe ratio, validating the effectiveness of this method in financial market prediction.
KW - custom loss function
KW - DEAP factor mining
KW - dynamic feature updates
KW - LightGBM model
KW - semi-supervised learning
UR - https://www.scopus.com/pages/publications/105014007006
U2 - 10.1155/jom/5991303
DO - 10.1155/jom/5991303
M3 - Article
AN - SCOPUS:105014007006
SN - 2314-4629
VL - 2025
JO - Journal of Mathematics
JF - Journal of Mathematics
IS - 1
M1 - 5991303
ER -