Damping value cross validation

4. Damping value cross validation#

The choice of the damping value used in the inversion directly affects the results and therefore needs to be carefully chosen. We provide some tools to perform a cross-validation with a range of damping values in order to chose the optimal value.

This cross validation routine is adapted from Uieda & Barbosa 2017 (DOI: 10.1093/gji/ggw390). See their paper for a detailed discussion.

The main idea is to re-sample the gravity data at a finer resolution and seperate the data into a testing and training set. The inversion is performed with a range of damping values, each time using just the training set. Each resulting inverted topo from each of the damping values is then forward modelled onto the locations of the testing data, which were not included in the inversion. The optimal damping value is the one which results in the smallest difference between the forward modeled data and the testing data.

4.1. Import packages#

[1]:

%load_ext autoreload
%autoreload 2

import logging

import xarray as xr
from polartoolkit import utils as polar_utils

from invert4geom import (
    cross_validation,
    inversion,
    optimization,
    plotting,
    regional,
    synthetic,
    utils,
)

# set up logging to see what's going on
logging.basicConfig(level=logging.INFO)

4.2. Create synthetic data#

The below function load_synthetic_model will create some synthetic topography data to act as the real topogrpahy, and it will forward model its gravity effect. It will forward model it at 2x the grid spacing to create testing and trainng datasets to use for the damping parameter cross validation.

[2]:

# set grid parameters
spacing = 1000
region = [0, 40000, 0, 30000]

# the density contrast is between rock (~2670 kg/m3) and air (~1 kg/m3)
true_density_contrast = 2670 - 1

# define the amount of gravity noise
gravity_noise = 0.2

(
    true_topography,
    _,
    _,
    grav_df,
) = synthetic.load_synthetic_model(
    spacing=spacing,
    region=region,
    density_contrast=true_density_contrast,
    gravity_noise=gravity_noise,
    resample_for_cv=True,
    plot_topography=True,
)

true_zref = true_topography.values.mean()
print(f"mean of true topography: {true_zref} m ")

../_images/tutorial_04_damping_cross_validation_4_0.png

INFO:invert4geom:Standard deviation used for noise: [0.2]

../_images/tutorial_04_damping_cross_validation_4_2.png

mean of true topography: 492.2704164812973 m

4.3. Gravity misfit#

As in the last notebook, for simplicity here we assume that we know the true density contrast and the reference level of the true topography, and use these values to create our starting model. Note that in a real world scenario, these would be unknowns which would need to be carefully chosen, as explained in the following notebooks.

[3]:

# create flat topography grid with a constant height equal the mean of the starting
# topography (zref)
starting_topography = xr.full_like(true_topography, true_zref)

# prisms above zref have positive density contrast and prisms below zref have negative
# density contrast
density_grid = xr.where(
    starting_topography >= true_zref, true_density_contrast, -true_density_contrast
)

# create layer of prisms
starting_prisms = utils.grids_to_prisms(
    starting_topography,
    true_zref,
    density=density_grid,
)

# since our starting model is flat, the starting gravity is 0
grav_df["starting_gravity"] = 0

# in many cases, we want to remove a regional signal from the misfit to isolate the
# residual signal. In this simple case, we assume there is no regional misfit and set
# it to 0
grav_df = regional.regional_separation(
    method="constant",
    constant=0,
    grav_df=grav_df,
)

grav_df.describe()

[3]:

	northing	easting	upward	gravity_anomaly	starting_gravity	misfit	reg	res
count	4941.000000	4941.000000	4941.0	4941.000000	4941.0	4941.000000	4941.0	4941.000000
mean	15000.000000	20000.000000	1000.0	-0.223283	0.0	-0.223283	0.0	-0.223283
std	8804.299419	11691.635129	0.0	7.135352	0.0	7.135352	0.0	7.135352
min	0.000000	0.000000	1000.0	-17.476926	0.0	-17.476926	0.0	-17.476926
25%	7500.000000	10000.000000	1000.0	-5.647968	0.0	-5.647968	0.0	-5.647968
50%	15000.000000	20000.000000	1000.0	-1.169577	0.0	-1.169577	0.0	-1.169577
75%	22500.000000	30000.000000	1000.0	3.598600	0.0	3.598600	0.0	3.598600
max	30000.000000	40000.000000	1000.0	19.872526	0.0	19.872526	0.0	19.872526

4.4. Get Cross Validation Score#

We will perform an inversion with a single damping value and calculate a score for it. The grid shows how to score is calculated, as the RMS difference between the predicted and observed gravity data.

[4]:

# set kwargs to pass to the inversion
kwargs = {
    "prism_layer": starting_prisms,
    # set stopping criteria
    "max_iterations": 200,
    "l2_norm_tolerance": gravity_noise
    ** 2,  # gravity error is .2 mGal or L2-norm of ~0.45
    "delta_l2_norm_tolerance": 1.01,
}

# run inversion, calculate the score, and plot the predicted and observed gravity for
# the testing dataset
score, results = cross_validation.grav_cv_score(
    training_data=grav_df[grav_df.test == False],
    testing_data=grav_df[grav_df.test == True],
    solver_damping=0.1,
    progressbar=False,
    plot=True,
    **kwargs,
)
score

../_images/tutorial_04_damping_cross_validation_8_0.png

[4]:

np.float64(0.243357318010514)

4.5. Cross Validation#

Now we can repeat this with a range of damping parameters to find the optimal (lowest) score.

This is a classic optimization problem, where we vary the damping parameter to find which value produces the best score. For this we use the Python package Optuna.

For each trial a .pickle file containing the inversion results will be created. Once all trials are complete, all but the best trial’s pickle files will be deleted. Additionally, another .pickle file will be created containing the Optuna study object with all the trial information and results.

By default, optimize_inversion_damping will use a sampler to choose damping values between the provided damping_limits and repeat this n_trials times. This may skip large portions of the parameter space. If instead you would like to systematically test damping values at a constant interval, you can set grid_search to True, and this will test all damping values as defined by:

np.logspace(<<exponent of lower damping limit>>, <<exponent of upper damping limit>>, n_trials)

[5]:

study, inversion_results = optimization.optimize_inversion_damping(
    training_df=grav_df[grav_df.test == False],
    testing_df=grav_df[grav_df.test == True],
    damping_limits=(0.001, 1),
    n_trials=8,
    grid_search=True,
    # plot_grids=True,
    plot_cv=False,
    fname="../tmp/damping_CV",
    **kwargs,
)

INFO:invert4geom:Trial with best score:
INFO:invert4geom:       trial number: 6
INFO:invert4geom:       parameter: {'damping': 0.13894954943731375}
INFO:invert4geom:       scores: [0.23970922177542184]

[6]:

fig = plotting.plot_cv_scores(
    study.trials_dataframe().value.values,
    study.trials_dataframe().params_damping.values,
    param_name="Damping",
    logx=True,
)

../_images/tutorial_04_damping_cross_validation_11_0.png

[7]:

study.trials_dataframe().drop(columns=["user_attrs_results"])

[7]:

	number	value	datetime_start	datetime_complete	duration	params_damping	user_attrs_fname	system_attrs_grid_id	system_attrs_search_space	state
0	0	0.496708	2025-08-05 13:16:17.607742	2025-08-05 13:17:09.895367	0 days 00:00:52.287625	0.372759	../tmp/damping_CV_trial_0	0	{'damping': [0.001, 0.0026826957952797246, 0.0...	COMPLETE
1	1	0.438346	2025-08-05 13:17:09.904159	2025-08-05 13:17:25.060107	0 days 00:00:15.155948	0.007197	../tmp/damping_CV_trial_1	1	{'damping': [0.001, 0.0026826957952797246, 0.0...	COMPLETE
2	2	1.176755	2025-08-05 13:17:25.063087	2025-08-05 13:17:46.657224	0 days 00:00:21.594137	0.002683	../tmp/damping_CV_trial_2	2	{'damping': [0.001, 0.0026826957952797246, 0.0...	COMPLETE
3	3	7.036115	2025-08-05 13:17:46.662341	2025-08-05 13:17:48.143949	0 days 00:00:01.481608	1.000000	../tmp/damping_CV_trial_3	3	{'damping': [0.001, 0.0026826957952797246, 0.0...	COMPLETE
4	4	0.252007	2025-08-05 13:17:48.153049	2025-08-05 13:18:21.346799	0 days 00:00:33.193750	0.019307	../tmp/damping_CV_trial_4	4	{'damping': [0.001, 0.0026826957952797246, 0.0...	COMPLETE
5	5	1.170169	2025-08-05 13:18:21.350464	2025-08-05 13:18:33.318458	0 days 00:00:11.967994	0.001000	../tmp/damping_CV_trial_5	5	{'damping': [0.001, 0.0026826957952797246, 0.0...	COMPLETE
6	6	0.239709	2025-08-05 13:18:33.322927	2025-08-05 13:18:53.995337	0 days 00:00:20.672410	0.138950	../tmp/damping_CV_trial_6	6	{'damping': [0.001, 0.0026826957952797246, 0.0...	COMPLETE
7	7	0.251958	2025-08-05 13:18:54.000509	2025-08-05 13:20:03.795344	0 days 00:01:09.794835	0.051795	../tmp/damping_CV_trial_7	7	{'damping': [0.001, 0.0026826957952797246, 0.0...	COMPLETE

[8]:

dampings = [
    (t.params["damping"], t.user_attrs["results"][0]) for t in study.get_trials()
]
dampings = sorted(dampings, key=lambda x: x[0])

for damp, df in dampings:
    final_topography = df.set_index(["northing", "easting"]).to_xarray().topo

    _ = polar_utils.grd_compare(
        true_topography,
        final_topography,
        grid1_name="True topography",
        grid2_name="Inverted topography",
        robust=True,
        hist=True,
        inset=False,
        title=f"Damping: {damp}",
        grounding_line=False,
        reverse_cpt=True,
        cmap="rain",
    )

../_images/tutorial_04_damping_cross_validation_13_0.png

../_images/tutorial_04_damping_cross_validation_13_1.png

../_images/tutorial_04_damping_cross_validation_13_2.png

../_images/tutorial_04_damping_cross_validation_13_3.png

../_images/tutorial_04_damping_cross_validation_13_4.png

../_images/tutorial_04_damping_cross_validation_13_5.png

../_images/tutorial_04_damping_cross_validation_13_6.png

../_images/tutorial_04_damping_cross_validation_13_7.png

[9]:

# # import pathlib
# # import pickle

# # to re-load the study from the saved pickle file
# with pathlib.Path("../tmp/damping_CV.pickle").open("rb") as f:
#     study = pickle.load(f)

# # to re-load the inversion results from the saved pickle file
# with pathlib.Path(f"{study.best_trial.user_attrs["fname"]}.pickle").open("rb") as f:
#     inversion_results = pickle.load(f)

4.6. Plot inversion results using optimal value#

[10]:

# collect the results
topo_results, grav_results, parameters, elapsed_time = inversion_results

plotting.plot_convergence(
    grav_results,
    params=parameters,
)

plotting.plot_inversion_results(
    grav_results,
    topo_results,
    parameters,
    region,
    iters_to_plot=2,
    plot_iter_results=True,
    plot_topo_results=True,
    plot_grav_results=True,
)

final_topography = topo_results.set_index(["northing", "easting"]).to_xarray().topo

_ = polar_utils.grd_compare(
    true_topography,
    final_topography,
    grid1_name="True topography",
    grid2_name="Inverted topography",
    robust=True,
    hist=True,
    inset=False,
    title="difference",
    grounding_line=False,
    reverse_cpt=True,
    cmap="rain",
    # diff_lims=(-20, 20),
)

../_images/tutorial_04_damping_cross_validation_16_0.png

../_images/tutorial_04_damping_cross_validation_16_1.png

../_images/tutorial_04_damping_cross_validation_16_2.png

../_images/tutorial_04_damping_cross_validation_16_3.png

../_images/tutorial_04_damping_cross_validation_16_4.png

4.7. Run inversion with poor choice of damping value#

[11]:

# set Python's logging level
logging.getLogger().setLevel(logging.WARNING)

# add new solver_damping value to kwargs
kwargs["solver_damping"] = 0.005

results = inversion.run_inversion(
    grav_df=grav_df[grav_df.test == False],  # , we can exclude the testing data now
    **kwargs,
)

# collect the results
topo_results, grav_results, parameters, elapsed_time = results

[12]:

final_topography = topo_results.set_index(["northing", "easting"]).to_xarray().topo

_ = polar_utils.grd_compare(
    true_topography,
    final_topography,
    grid1_name="True topography",
    grid2_name="Inverted topography",
    robust=True,
    hist=True,
    inset=False,
    title="difference",
    grounding_line=False,
    reverse_cpt=True,
    cmap="rain",
)

../_images/tutorial_04_damping_cross_validation_19_0.png

Comparing this inversion with a damping value of 0.005 to the inversion with the optimal damping value of ~0.1 shows that a low damping value allows too much noise through to the final inverted topography.

In the past example (simple inversion), we arbitrarily chose a damping value. Here, we show how by finding the optimal damping value using a cross-validation, we have improved the inversion’s performance.

The next notebook, density cross validation, will show how to perform a similar cross validation for choosing a density contrast value.