Field evaluation of semi-automated moisture estimation from geophysics using machine learning
Geophysical methods can provide 3D, spatially extensive estimates of soil moisture. However, point-to-point comparisons of geophysical properties with measured moisture data are frequently unsatisfactory, so geophysics is often used for qualitative purposes only. This is because (1) models are required to relate estimated geophysical properties to soil moisture, (2) processing and inversion introduce uncertainties, smoothing and/or artifacts, and (3) results from multiple geophysical methods are not easily combined within a single moisture estimation framework.
To investigate these shortcomings, we performed an irrigation experiment in which we monitored soil moisture through time and collected several surface geophysical datasets indirectly sensitive to soil moisture before and after irrigation: ground-penetrating radar (GPR), electrical resistivity tomography (ERT), and frequency-domain electromagnetics (FDEM). Data were exported in both raw and processed form, and were also converted to soil moisture estimates using common calibrations. Post-irrigation datasets were snapped to a common 3D grid to facilitate analysis and testing by multivariate regression and machine learning (support vector regression and random regression forests).
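As an illustration of what snapping scattered measurements to a common 3D grid can look like, the following is a minimal sketch that averages samples falling in the same grid cell. It is not the procedure used in this study: the function name `snap_to_grid`, the cell-averaging rule, the grid geometry, and the sample values are all hypothetical.

```python
import numpy as np

def snap_to_grid(points, values, origin, spacing, shape):
    """Average scattered (x, y, z) samples into the cells of a regular 3D grid.

    Cells that receive no sample are left as NaN. Hypothetical helper,
    not the method used in the study.
    """
    points = np.asarray(points, dtype=float)
    values = np.asarray(values, dtype=float)
    origin = np.asarray(origin, dtype=float)
    spacing = np.asarray(spacing, dtype=float)
    n_cells = int(np.prod(shape))

    # Integer cell index of each sample along x, y and z.
    idx = np.floor((points - origin) / spacing).astype(int)

    # Discard samples that fall outside the grid.
    inside = np.all((idx >= 0) & (idx < np.asarray(shape)), axis=1)
    idx, values = idx[inside], values[inside]

    # Accumulate the sum and count of samples per cell, then divide.
    flat = np.ravel_multi_index(tuple(idx.T), shape)
    sums = np.bincount(flat, weights=values, minlength=n_cells)
    counts = np.bincount(flat, minlength=n_cells)

    grid = np.full(n_cells, np.nan)
    filled = counts > 0
    grid[filled] = sums[filled] / counts[filled]
    return grid.reshape(shape)

# Made-up sample positions (m) and values, for illustration only.
points = [[0.1, 0.1, 0.1], [0.2, 0.3, 0.4], [1.5, 0.5, 0.5], [9.0, 9.0, 9.0]]
values = [0.2, 0.4, 0.3, 0.9]
grid = snap_to_grid(points, values, origin=(0, 0, 0), spacing=(1, 1, 1), shape=(2, 2, 2))
```

Once each dataset (GPR, ERT, FDEM, and moisture measurements) has been gridded the same way, cells that are populated in every grid can be stacked into one feature table for regression. Other snapping rules, such as nearest-neighbor assignment, would be equally plausible choices.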
Results were explored in terms of the advantages and limitations of each geophysical method and their respective sensitivity to, and ability to resolve, subtle soil moisture changes. For this site, raw ERT pseudosection data were the most informative for accurately predicting soil moisture using a random regression forest model (100 cross-validation folds with 60/40 training/test splits produced R² values ranging from 0.49 to 0.96). This cross-validated model was further supported by a separate evaluation using a test set from a physically separate portion of the study area. Machine learning was conducive to a semi-automated model-selection process that could be used for other sites and datasets to locally improve accuracy.
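The repeated 60/40 cross-validation of a random regression forest described above could be set up roughly as follows. This is a sketch under stated assumptions rather than the study's implementation: it uses scikit-learn, which the abstract does not name, and the feature matrix `X`, target `y`, and forest settings are synthetic placeholders standing in for gridded raw ERT features and measured soil moisture.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import ShuffleSplit, cross_val_score

# Synthetic stand-in data (NOT the study's data): 150 grid cells,
# 8 hypothetical geophysical features, and a moisture-like target.
rng = np.random.default_rng(0)
X = rng.normal(size=(150, 8))
y = 0.25 + 0.05 * X[:, 0] - 0.03 * X[:, 1] + rng.normal(scale=0.01, size=150)

# 100 random splits, each using 60% of samples for training and 40% for testing.
splitter = ShuffleSplit(n_splits=100, train_size=0.6, test_size=0.4, random_state=0)

# Forest size is an arbitrary choice for this sketch.
model = RandomForestRegressor(n_estimators=20, random_state=0)

# One R^2 score per split; the spread indicates how stable the model is.
scores = cross_val_score(model, X, y, cv=splitter, scoring="r2")
print(f"R^2 range over {len(scores)} splits: {scores.min():.2f} to {scores.max():.2f}")
```

Swapping `RandomForestRegressor` for `sklearn.svm.SVR` or `sklearn.linear_model.LinearRegression` in the same loop would give a like-for-like comparison of candidate models, which is one way a semi-automated model-selection step might be organized. Random splits of spatially correlated cells can overstate accuracy, which is why an evaluation on a physically separate area, as reported above, is a useful complement.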