TY - JOUR
T1 - Interpretable data-driven modeling of total phosphorus dynamics from 2005 to 2024 in a large shallow lake
AU - Lin, Weipeng
AU - Zhou, Yongqiang
AU - Ren, Ze
AU - Zou, Wei
AU - Guo, Hongwei
AU - Li, Na
AU - Zhang, Yunlin
AU - Elser, James
AU - Woolway, R. Iestyn
AU - Shi, Kun
AU - Zhu, Guangwei
AU - Qin, Boqiang
AU - Xue, Yufei
N1 - Publisher Copyright: © 2025 Elsevier Ltd
PY - 2026/3/1
Y1 - 2026/3/1
N2 - Phosphorus (P) is a critical biogenic element driving aquatic productivity and eutrophication in freshwater systems. However, monitoring total phosphorus (TP) in shallow, dynamic lakes remains challenging due to its pronounced spatiotemporal variability and complex interactions with optically active constituents. While remote sensing provides a cost-effective supplement to in situ monitoring of TP, accurately estimating TP, a non-optically active component, from remote sensing requires models that balance high precision, reliability, and interpretability. Therefore, this study develops an interpretable machine learning framework using the Light Gradient Boosting Machine Regressor (LGBMR), trained on multi-source data including MODIS satellite reflectance, in situ measurements, and meteorological variables, to estimate TP dynamics. The LGBMR outperformed 13 other algorithms on independent validation datasets (N = 609, R² = 0.70, MAPE = 27.9 %), demonstrating superior predictive performance. Shapley Additive Explanations (SHAP) analysis revealed mechanistic controls of input variables on TP dynamics, enabling the model to effectively capture both seasonal-spatial TP variability and climate-induced extremes. Long-term analysis of Lake Taihu revealed a significant declining trend in TP concentration over the past two decades (R² = 0.26, P < 0.05, rate: -0.009 mg/L/decade), with an accelerated decline from 2017 to 2024 (R² = 0.77, P < 0.05, rate: -0.059 mg/L/decade). SHAP analysis revealed decreases of 12.4 % and 18.9 % in the pixel counts dominated by total suspended matter (TSM) and algal-associated P, respectively. The decline is attributed to reduced external loading resulting from improved watershed management and to reduced internal phosphorus release resulting from lower algal biomass and less sediment resuspension linked to weakened wind-driven mixing. These findings underscore the effectiveness of integrated modeling approaches for tracking phosphorus dynamics in shallow eutrophic lakes, providing actionable insights for eutrophication management. The proposed framework advances interpretable machine learning in environmental monitoring by elucidating mechanistic linkages between hydrological, meteorological, and biogeochemical drivers.
AB - Phosphorus (P) is a critical biogenic element driving aquatic productivity and eutrophication in freshwater systems. However, monitoring total phosphorus (TP) in shallow, dynamic lakes remains challenging due to its pronounced spatiotemporal variability and complex interactions with optically active constituents. While remote sensing provides a cost-effective supplement to in situ monitoring of TP, accurately estimating TP, a non-optically active component, from remote sensing requires models that balance high precision, reliability, and interpretability. Therefore, this study develops an interpretable machine learning framework using the Light Gradient Boosting Machine Regressor (LGBMR), trained on multi-source data including MODIS satellite reflectance, in situ measurements, and meteorological variables, to estimate TP dynamics. The LGBMR outperformed 13 other algorithms on independent validation datasets (N = 609, R² = 0.70, MAPE = 27.9 %), demonstrating superior predictive performance. Shapley Additive Explanations (SHAP) analysis revealed mechanistic controls of input variables on TP dynamics, enabling the model to effectively capture both seasonal-spatial TP variability and climate-induced extremes. Long-term analysis of Lake Taihu revealed a significant declining trend in TP concentration over the past two decades (R² = 0.26, P < 0.05, rate: -0.009 mg/L/decade), with an accelerated decline from 2017 to 2024 (R² = 0.77, P < 0.05, rate: -0.059 mg/L/decade). SHAP analysis revealed decreases of 12.4 % and 18.9 % in the pixel counts dominated by total suspended matter (TSM) and algal-associated P, respectively. The decline is attributed to reduced external loading resulting from improved watershed management and to reduced internal phosphorus release resulting from lower algal biomass and less sediment resuspension linked to weakened wind-driven mixing. These findings underscore the effectiveness of integrated modeling approaches for tracking phosphorus dynamics in shallow eutrophic lakes, providing actionable insights for eutrophication management. The proposed framework advances interpretable machine learning in environmental monitoring by elucidating mechanistic linkages between hydrological, meteorological, and biogeochemical drivers.
KW - Eutrophication
KW - Interpretable machine learning
KW - Remote sensing
KW - Shallow eutrophic lake
KW - Total phosphorus
UR - https://www.scopus.com/pages/publications/105025037035
U2 - 10.1016/j.watres.2025.125169
DO - 10.1016/j.watres.2025.125169
M3 - Article
C2 - 41412029
AN - SCOPUS:105025037035
SN - 0043-1354
VL - 291
JO - Water Research
JF - Water Research
M1 - 125169
ER -