9. Combining it all - Alternative density/zref optimization technique#

This notebook is the same as the previous, but for the zref and density optimization, we use a different technique; internal cross validation. Instead of switching to a regional estimation technique which doesn’t use the constraint points, we instead keep using the constraint point minimization regional estimation technique, but split constraints into testing and training sets, performing a K-Folds cross validation to determine the optimal zref and density contrast values.

9.1. Import packages#

[1]:
# set EPSG for plotting functions
import os
import pathlib
import pickle

import numpy as np
import pandas as pd
import polartoolkit as ptk
import verde as vd
import xarray as xr

import invert4geom

os.environ["POLARTOOLKIT_EPSG"] = "3031"
/home/mdtanker/miniforge3/envs/invert4geom/lib/python3.12/site-packages/UQpy/__init__.py:6: UserWarning:

pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81.

9.2. Get Input datasets#

[2]:
true_topography, _, _, _ = invert4geom.load_synthetic_model(
    spacing=1000,
    region=(0, 40000, 0, 30000),
)

# create lower synthetic topography data
lower_topography = invert4geom.synthetic_topography_regional(
    spacing=1000,
    region=(0, 40000, 0, 30000),
    scale=1,
    yoffset=-1000,
)

true_zref = true_topography.to_numpy().mean()
true_density_contrast = 2670 - 1  # sea level
print(f"mean of true topography: {true_zref} m ")
mean of true topography: 492.2704164812973 m
[3]:
topography_model = invert4geom.create_model(
    zref=true_zref,
    density_contrast=true_density_contrast,
    topography=true_topography.to_dataset(name="upward"),
)
basement_model = invert4geom.create_model(
    zref=lower_topography.to_numpy().mean(),
    density_contrast=3300 - 2800,
    topography=lower_topography.to_dataset(name="upward"),
)

invert4geom.plot_prism_layers(
    [topography_model, basement_model],
    color_by="thickness",
    zscale=20,
)
../_images/tutorial_09_combining_it_all_alternative_CV_technique_5_0.png
[4]:
# make pandas dataframe of locations to calculate gravity
# this represents the station locations of a gravity survey

# create lists of coordinates
coords = vd.grid_coordinates(
    region=topography_model.region,
    spacing=topography_model.spacing,
    pixel_register=False,
    extra_coords=1001,  # survey elevation
)

# grid the coordinates
observations = vd.make_xarray_grid(
    (coords[0], coords[1]),
    data=coords[2],
    data_names="upward",
    dims=("northing", "easting"),
)
[5]:
data = invert4geom.create_data(observations)

print(f"Gravity region (W,E,S,N): {data.region}")
print(f"Gravity spacing: {data.spacing} m")

data
Gravity region (W,E,S,N): (0.0, 40000.0, 0.0, 30000.0)
Gravity spacing: 1000.0 m
[5]:
<xarray.Dataset> Size: 11kB
Dimensions:   (northing: 31, easting: 41)
Coordinates:
  * northing  (northing) float64 248B 0.0 1e+03 2e+03 ... 2.8e+04 2.9e+04 3e+04
  * easting   (easting) float64 328B 0.0 1e+03 2e+03 ... 3.8e+04 3.9e+04 4e+04
Data variables:
    upward    (northing, easting) float64 10kB 1.001e+03 1.001e+03 ... 1.001e+03
Attributes:
    region:        (0.0, 40000.0, 0.0, 30000.0)
    spacing:       1000.0
    buffer_width:  3000.0
    inner_region:  (3000.0, 37000.0, 3000.0, 27000.0)
    dataset_type:  data
    model_type:    prisms
    coord_names:   ('easting', 'northing')
[6]:
data.inv.forward_gravity(topography_model, name="upper_surface_grav")
data.inv.forward_gravity(basement_model, name="lower_surface_grav")
data.inv.df.describe()
[6]:
northing easting upward upper_surface_grav lower_surface_grav
count 1271.000000 1271.00000 1271.0 1271.000000 1271.000000
mean 15000.000000 20000.00000 1001.0 -0.129619 0.004659
std 8947.792584 11836.81698 0.0 7.147140 0.830306
min 0.000000 0.00000 1001.0 -17.180401 -1.201774
25% 7000.000000 10000.00000 1001.0 -5.537833 -0.699901
50% 15000.000000 20000.00000 1001.0 -1.136301 -0.177953
75% 23000.000000 30000.00000 1001.0 3.743994 0.629891
max 30000.000000 40000.00000 1001.0 19.581120 1.860259
[7]:
# contaminate gravity with 0.1 mGal of random noise
data["gravity_anomaly_no_noise"] = data.upper_surface_grav + data.lower_surface_grav
data["gravity_anomaly"], stddev = invert4geom.contaminate(
    data.gravity_anomaly_no_noise,
    stddev=0.1,
    percent=False,
    seed=0,
)
data["uncert"] = xr.full_like(data.gravity_anomaly, stddev)
data
[7]:
<xarray.Dataset> Size: 62kB
Dimensions:                   (northing: 31, easting: 41)
Coordinates:
  * northing                  (northing) float64 248B 0.0 1e+03 ... 3e+04
  * easting                   (easting) float64 328B 0.0 1e+03 ... 3.9e+04 4e+04
Data variables:
    upward                    (northing, easting) float64 10kB 1.001e+03 ... ...
    upper_surface_grav        (northing, easting) float64 10kB 9.526 ... 2.855
    lower_surface_grav        (northing, easting) float64 10kB -0.04793 ... 0...
    gravity_anomaly_no_noise  (northing, easting) float64 10kB 9.478 ... 3.368
    gravity_anomaly           (northing, easting) float64 10kB 9.495 ... 3.425
    uncert                    (northing, easting) float64 10kB 0.1 0.1 ... 0.1
Attributes:
    region:        (0.0, 40000.0, 0.0, 30000.0)
    spacing:       1000.0
    buffer_width:  3000.0
    inner_region:  (3000.0, 37000.0, 3000.0, 27000.0)
    dataset_type:  data
    model_type:    prisms
    coord_names:   ('easting', 'northing')
[8]:
fig = ptk.plot_grid(
    data.lower_surface_grav,
    fig_height=10,
    title="lower prisms gravity",
    cmap="viridis",
    hist=True,
    cbar_label="mGal",
    frame=["nSWe", "xaf10000", "yaf10000"],
)

fig = ptk.plot_grid(
    data.upper_surface_grav,
    fig=fig,
    origin_shift="x",
    fig_height=10,
    title="upper prisms gravity",
    cmap="viridis",
    hist=True,
    cbar_label="mGal",
    frame=["nSwE", "xaf10000", "yaf10000"],
)

fig = ptk.plot_grid(
    data.gravity_anomaly,
    fig=fig,
    origin_shift="x",
    fig_height=10,
    title="Observed gravity",
    cmap="viridis",
    hist=True,
    cbar_label="mGal",
    frame=["nSwE", "xaf10000", "yaf10000"],
)

fig.show()
../_images/tutorial_09_combining_it_all_alternative_CV_technique_10_0.png

9.3. Create “a-priori” topography measurements#

[9]:
# create 15 random point within the outcropping basement region
num_constraints = 15
coords = vd.scatter_points(
    region=vd.pad_region(data.region, -data.spacing),
    size=num_constraints,
    random_state=7,
)
constraint_points = pd.DataFrame(data={"easting": coords[0], "northing": coords[1]})

# sample true topography at these points
constraint_points = invert4geom.sample_grids(
    constraint_points,
    true_topography,
    "true_upward",
)

constraint_points["upward"] = constraint_points.true_upward

# re-sample depths with uncertainty to emulate measurement errors
# set each points uncertainty equal to 2% of depth
uncert = np.abs(0.02 * constraint_points.upward)
constraint_points.loc[constraint_points.index, "uncert"] = uncert

constraint_points = invert4geom.randomly_sample_data(
    seed=0,
    data_df=constraint_points,
    data_col="upward",
    uncert_col="uncert",
)

# create weights column
constraint_points["weight"] = 1 / (constraint_points.uncert**2)

constraint_points.head()
[9]:
easting northing true_upward upward uncert weight
0 3899.714996 26468.618776 655.102400 656.749723 13.102048 0.005825
1 30636.914105 6974.789900 407.011208 405.935844 8.140224 0.015091
2 17659.550795 13659.470931 470.155613 476.177579 9.403112 0.011310
3 28491.676758 27073.768551 515.239653 516.320627 10.304793 0.009417
4 38163.601456 1697.178371 419.197817 414.706788 8.383956 0.014227

9.4. Create starting model#

[10]:
# grid the sampled values using verde
starting_topography_kwargs = dict(
    method="splines",
    region=topography_model.region,
    spacing=topography_model.spacing,
    constraints_df=constraint_points,
    dampings=np.logspace(-40, 0, 100),
    weights_col="weight",
)

starting_topography = invert4geom.create_topography(**starting_topography_kwargs)

_ = ptk.grid_compare(
    true_topography,
    starting_topography.upward,
    grid1_name="True topography",
    grid2_name="Starting topography",
    robust=True,
    hist=True,
    inset=False,
    title="difference",
    reverse_cpt=True,
    cmap="rain",
    points=constraint_points,
    points_style="x.3c",
)
../_images/tutorial_09_combining_it_all_alternative_CV_technique_14_0.png
[11]:
model = invert4geom.create_model(
    zref=constraint_points.upward.mean(),
    density_contrast=2500,
    topography=starting_topography,
)
model.inv.plot_model(
    color_by="density",
    zscale=20,
)
../_images/tutorial_09_combining_it_all_alternative_CV_technique_15_0.png

9.5. Calculate starting gravity misfit#

Now we need to calculate the forward gravity of the starting topography. We then can subtract it from our observed gravity to get a starting gravity misfit.

We don’t know the optimal values for the density contrast or the reference level, so we will make a guess at appropriate values for the starting model.

[12]:
data.inv.forward_gravity(model)

# calculate the true residual misfit
# true misfit is difference between noise-free gravity and starting gravity
# true regional misfit is the moho gravity
# so true residual is misfit - true_regional
data["true_res"] = (
    data.gravity_anomaly_no_noise - data.forward_gravity - data.lower_surface_grav
)
data.inv.df.describe()
[12]:
northing easting upward upper_surface_grav lower_surface_grav gravity_anomaly_no_noise gravity_anomaly uncert forward_gravity true_res
count 1271.000000 1271.00000 1271.0 1271.000000 1271.000000 1271.000000 1271.000000 1.271000e+03 1271.000000 1271.000000
mean 15000.000000 20000.00000 1001.0 -0.129619 0.004659 -0.124959 -0.124959 1.000000e-01 1.079930 -1.209548
std 8947.792584 11836.81698 0.0 7.147140 0.830306 7.314752 7.313577 2.776650e-17 6.599125 4.456801
min 0.000000 0.00000 1001.0 -17.180401 -1.201774 -16.908338 -17.061563 1.000000e-01 -7.883841 -19.046853
25% 7000.000000 10000.00000 1001.0 -5.537833 -0.699901 -5.826630 -5.827131 1.000000e-01 -3.944309 -2.622186
50% 15000.000000 20000.00000 1001.0 -1.136301 -0.177953 -0.557795 -0.572175 1.000000e-01 -0.765707 -1.088926
75% 23000.000000 30000.00000 1001.0 3.743994 0.629891 4.171648 4.139840 1.000000e-01 4.823461 0.332004
max 30000.000000 40000.00000 1001.0 19.581120 1.860259 19.774399 19.944612 1.000000e-01 22.159800 12.693226
[13]:
_ = ptk.grid_compare(
    data.gravity_anomaly,
    data.forward_gravity,
    grid1_name="Observed gravity",
    grid2_name="Starting gravity",
    robust=True,
    hist=True,
    inset=False,
    title="Gravity misfit",
    rmse_in_title=False,
    points=constraint_points,
    points_style="x.3c",
)
../_images/tutorial_09_combining_it_all_alternative_CV_technique_18_0.png
[14]:
fig = ptk.plot_grid(
    data.lower_surface_grav,
    fig_height=10,
    title="True regional",
    hist=True,
    cbar_label="mGal",
    frame=["nSWe", "xaf10000", "yaf10000"],
)

fig = ptk.plot_grid(
    data.true_res,
    fig=fig,
    origin_shift="x",
    fig_height=10,
    title="True residual",
    hist=True,
    cbar_label="mgal",
    frame=["nSwE", "xaf10000", "yaf10000"],
)

fig.show()
../_images/tutorial_09_combining_it_all_alternative_CV_technique_19_0.png

9.6. Regional separation#

[15]:
def regional_comparison(data):
    # compare with true regional
    _ = ptk.grid_compare(
        data.lower_surface_grav,
        data.reg,
        robust=True,
        grid1_name="True regional field",
        grid2_name="Estimated regional field",
        hist=True,
        inset=False,
        title="difference",
        points=constraint_points,
        points_style="x.3c",
    )
    # compare with true residual
    _ = ptk.grid_compare(
        data.true_res,
        data.res,
        robust=True,
        grid1_name="True residual field",
        grid2_name="Estimated residual field",
        hist=True,
        inset=False,
        title="difference",
        points=constraint_points,
        points_style="x.3c",
    )
[16]:
# estimate regional with the mean misfit at constraints
regional_grav_kwargs = dict(
    method="constraints",
    grid_method="eq_sources",
    constraints_weights_column="weight",
    constraints_df=constraint_points,
    cv=True,
    cv_kwargs=dict(
        n_trials=50,
        damping_limits=(1e-10, 1),
        # depth_limits=(1e3,100e3),
        progressbar=False,
        fname="../tmp/09_combining_it_all_alternative_regional_sep",
        # plot=True,
    ),
    depth="default",
    block_size=None,
)

data.inv.regional_separation(
    **regional_grav_kwargs,
)

regional_comparison(data)
data.inv.df.describe()
../_images/tutorial_09_combining_it_all_alternative_CV_technique_22_0.png
../_images/tutorial_09_combining_it_all_alternative_CV_technique_22_1.png
[16]:
northing easting upward upper_surface_grav lower_surface_grav gravity_anomaly_no_noise gravity_anomaly uncert forward_gravity true_res misfit reg res starting_forward_gravity starting_misfit starting_reg starting_res
count 1271.000000 1271.00000 1271.0 1271.000000 1271.000000 1271.000000 1271.000000 1.271000e+03 1271.000000 1271.000000 1271.000000 1271.000000 1271.000000 1271.000000 1271.000000 1271.000000 1271.000000
mean 15000.000000 20000.00000 1001.0 -0.129619 0.004659 -0.124959 -0.124959 1.000000e-01 1.079930 -1.209548 -1.204889 -1.050441 -0.154448 1.079930 -1.204889 -1.050441 -0.154448
std 8947.792584 11836.81698 0.0 7.147140 0.830306 7.314752 7.313577 2.776650e-17 6.599125 4.456801 4.644997 0.899712 4.423033 6.599125 4.644997 0.899712 4.423033
min 0.000000 0.00000 1001.0 -17.180401 -1.201774 -16.908338 -17.061563 1.000000e-01 -7.883841 -19.046853 -19.654062 -2.393489 -18.180347 -7.883841 -19.654062 -2.393489 -18.180347
25% 7000.000000 10000.00000 1001.0 -5.537833 -0.699901 -5.826630 -5.827131 1.000000e-01 -3.944309 -2.622186 -2.931387 -1.856200 -1.829092 -3.944309 -2.931387 -1.856200 -1.829092
50% 15000.000000 20000.00000 1001.0 -1.136301 -0.177953 -0.557795 -0.572175 1.000000e-01 -0.765707 -1.088926 -1.578812 -1.142989 -0.050248 -0.765707 -1.578812 -1.142989 -0.050248
75% 23000.000000 30000.00000 1001.0 3.743994 0.629891 4.171648 4.139840 1.000000e-01 4.823461 0.332004 0.296782 -0.286862 1.744463 4.823461 0.296782 -0.286862 1.744463
max 30000.000000 40000.00000 1001.0 19.581120 1.860259 19.774399 19.944612 1.000000e-01 22.159800 12.693226 13.114356 0.689128 14.012389 22.159800 13.114356 0.689128 14.012389
[17]:
data.inv.plot_anomalies()
../_images/tutorial_09_combining_it_all_alternative_CV_technique_23_0.png

9.7. Single inversion#

Perform a single inversion to experiment with values of stopping criteria.

[18]:
# setup the inversion
inv = invert4geom.Inversion(
    data,
    model,
    solver_damping=0.005,
    # set stopping criteria
    l2_norm_tolerance=0.3,  # grav error of .1mGal -> sqrt(.1) = 0.3
    delta_l2_norm_tolerance=1.008,
)
[19]:
# run the inversion
inv.invert(
    plot_dynamic_convergence=True,
    results_fname="../tmp/09_combining_it_all_alternative",
)
../_images/tutorial_09_combining_it_all_alternative_CV_technique_26_0.png
[20]:
inv.stats_df
[20]:
iteration rmse l2_norm delta_l2_norm iter_time_sec
0 0.0 4.636082 2.153156 inf NaN
1 1.0 1.360583 1.166440 1.845920 0.709483
2 2.0 0.478946 0.692059 1.685464 0.151519
3 3.0 0.246154 0.496139 1.394890 0.152965
4 4.0 0.213645 0.462217 1.073389 0.172500
5 5.0 0.276410 0.525747 0.879163 0.167975
6 6.0 0.432148 0.657380 0.799761 0.161606
[21]:
inv.plot_inversion_results(
    iters_to_plot=2,
)
../_images/tutorial_09_combining_it_all_alternative_CV_technique_28_0.png
../_images/tutorial_09_combining_it_all_alternative_CV_technique_28_1.png
../_images/tutorial_09_combining_it_all_alternative_CV_technique_28_2.png
[22]:
_ = ptk.grid_compare(
    true_topography,
    inv.model.topography,
    region=data.inner_region,
    grid1_name="True topography",
    grid2_name="Inverted topography",
    robust=True,
    hist=True,
    inset=False,
    verbose="q",
    title="difference",
    reverse_cpt=True,
    cmap="rain",
    frame=True,
    points=constraint_points,
    points_style="x.3c",
)
../_images/tutorial_09_combining_it_all_alternative_CV_technique_29_0.png

9.8. Damping parameter cross validation#

9.8.1. Resample to create testing / training data sets#

[23]:
data.inv.df
[23]:
northing easting upward upper_surface_grav lower_surface_grav gravity_anomaly_no_noise gravity_anomaly uncert forward_gravity true_res misfit reg res starting_forward_gravity starting_misfit starting_reg starting_res
0 0.0 0.0 1001.0 9.525994 -0.047934 9.478060 9.494673 0.1 10.740517 -1.214524 -1.245844 -2.206690 0.960846 10.740517 -1.245844 -2.206690 0.960846
1 0.0 1000.0 1001.0 10.416800 -0.060607 10.356193 10.347023 0.1 11.470352 -1.053551 -1.123328 -2.237354 1.114026 11.470352 -1.123328 -2.237354 1.114026
2 0.0 2000.0 1001.0 9.945126 -0.079985 9.865141 9.933223 0.1 10.699837 -0.754711 -0.766613 -2.265615 1.499002 10.699837 -0.766613 -2.265615 1.499002
3 0.0 3000.0 1001.0 9.265122 -0.102636 9.162486 9.177017 0.1 9.740384 -0.475262 -0.563367 -2.291301 1.727934 9.740384 -0.563367 -2.291301 1.727934
4 0.0 4000.0 1001.0 8.528509 -0.126515 8.401994 8.352468 0.1 8.762056 -0.233547 -0.409588 -2.314268 1.904680 8.762056 -0.409588 -2.314268 1.904680
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
1266 30000.0 36000.0 1001.0 3.330540 0.765378 4.095918 3.988219 0.1 5.931613 -2.601073 -1.943393 0.556764 -2.500158 5.931613 -1.943393 0.556764 -2.500158
1267 30000.0 37000.0 1001.0 3.328091 0.745878 4.073969 4.144597 0.1 6.177712 -2.849622 -2.033116 0.528620 -2.561735 6.177712 -2.033116 0.528620 -2.561735
1268 30000.0 38000.0 1001.0 3.333133 0.706587 4.039719 4.073005 0.1 6.390391 -3.057258 -2.317386 0.499154 -2.816540 6.390391 -2.317386 0.499154 -2.816540
1269 30000.0 39000.0 1001.0 3.298196 0.634280 3.932475 3.841945 0.1 6.478970 -3.180774 -2.637025 0.468717 -3.105742 6.478970 -2.637025 0.468717 -3.105742
1270 30000.0 40000.0 1001.0 2.855276 0.512861 3.368138 3.425127 0.1 5.730724 -2.875448 -2.305597 0.437636 -2.743233 5.730724 -2.305597 0.437636 -2.743233

1271 rows × 17 columns

[24]:
inv.data.inv.df.columns
[24]:
Index(['northing', 'easting', 'upward', 'upper_surface_grav',
       'lower_surface_grav', 'gravity_anomaly_no_noise', 'gravity_anomaly',
       'uncert', 'forward_gravity', 'true_res', 'misfit', 'reg', 'res',
       'starting_forward_gravity', 'starting_misfit', 'starting_reg',
       'starting_res', 'iter_1_initial_residual', 'iter_2_initial_residual',
       'iter_3_initial_residual', 'iter_4_initial_residual',
       'iter_5_initial_residual', 'iter_6_initial_residual'],
      dtype='str')
[25]:
inv.reinitialize_inversion()

# resample data at 1/2 spacing to include test points for cross-validation
inv.data = invert4geom.add_test_points(inv.data)
[26]:
damping_cv_obj = inv.optimize_inversion_damping(
    damping_limits=(0.001, 1),
    n_trials=8,
    fname="../tmp/09_combining_it_all_alternative_damping_CV",
)
inv.solver_damping
[26]:
0.022462769883683984
../_images/tutorial_09_combining_it_all_alternative_CV_technique_35_3.png
[27]:
damping_cv_obj.study.trials_dataframe()
[27]:
number value datetime_start datetime_complete duration params_damping user_attrs_fname system_attrs_fixed_params state
0 0 0.483118 2026-02-17 02:21:04.924908 2026-02-17 02:21:05.732415 0 days 00:00:00.807507 0.001000 ../tmp/09_combining_it_all_alternative_damping... {'damping': 0.001} COMPLETE
1 1 4.592549 2026-02-17 02:21:05.733508 2026-02-17 02:21:06.273835 0 days 00:00:00.540327 1.000000 ../tmp/09_combining_it_all_alternative_damping... {'damping': 1} COMPLETE
2 2 0.606439 2026-02-17 02:21:06.274852 2026-02-17 02:21:07.400753 0 days 00:00:01.125901 0.001983 ../tmp/09_combining_it_all_alternative_damping... NaN COMPLETE
3 3 0.073751 2026-02-17 02:21:07.401706 2026-02-17 02:21:08.987635 0 days 00:00:01.585929 0.062714 ../tmp/09_combining_it_all_alternative_damping... NaN COMPLETE
4 4 0.033137 2026-02-17 02:21:08.995668 2026-02-17 02:21:09.739758 0 days 00:00:00.744090 0.022463 ../tmp/09_combining_it_all_alternative_damping... NaN COMPLETE
5 5 0.051635 2026-02-17 02:21:09.740771 2026-02-17 02:21:10.641748 0 days 00:00:00.900977 0.037994 ../tmp/09_combining_it_all_alternative_damping... NaN COMPLETE
6 6 0.080030 2026-02-17 02:21:10.642737 2026-02-17 02:21:11.593243 0 days 00:00:00.950506 0.010451 ../tmp/09_combining_it_all_alternative_damping... NaN COMPLETE
7 7 0.065969 2026-02-17 02:21:11.594263 2026-02-17 02:21:12.338491 0 days 00:00:00.744228 0.016081 ../tmp/09_combining_it_all_alternative_damping... NaN COMPLETE
[28]:
inv.plot_convergence()

inv.plot_inversion_results(iters_to_plot=2)

_ = ptk.grid_compare(
    true_topography,
    inv.model.topography,
    grid1_name="True topography",
    grid2_name="Inverted topography",
    robust=True,
    hist=True,
    inset=False,
    title="difference",
    reverse_cpt=True,
    cmap="rain",
    points=constraint_points,
    points_style="x.3c",
)
../_images/tutorial_09_combining_it_all_alternative_CV_technique_37_0.png
../_images/tutorial_09_combining_it_all_alternative_CV_technique_37_1.png
../_images/tutorial_09_combining_it_all_alternative_CV_technique_37_2.png
../_images/tutorial_09_combining_it_all_alternative_CV_technique_37_3.png
../_images/tutorial_09_combining_it_all_alternative_CV_technique_37_4.png
[29]:
# sample the inverted topography at the constraint points
constraint_points = invert4geom.sample_grids(
    constraint_points,
    inv.model.topography,
    "inverted_topography",
)

rmse = invert4geom.rmse(
    constraint_points.upward - constraint_points.inverted_topography
)
print(f"RMSE: {rmse:.2f} m")
RMSE: 10.02 m

9.9. Z-ref and density contrast optimizations#

Since this cross validation uses the inversion error at constraint points, we can’t use the constraint point minimization technique to estimate the regional field since that inherently sets the inversion error at constraints to zero, invalidating the cross validation scores.

There are two options for how to get around this issue:

  1. use a different regional estimation technique while finding the optimal density contrast value, then use the found optimal values with the constraint point minimization regional estimation technique afterwards.

  2. separate the constraints into testing and training sets, so that only the training set is used during the regional separation, and only the testing set is used for scoring the density contrast cross validation.

Since we have constraint points, we will just use their mean value as the zref and omit the zref CV.

Well perform just the second method below:

This option is less efficient, but in some cases may be more accurate.

Perform a cross validation for the optimal zref and density contrast values same as function optimize_inversion_zref_density_contrast, but pass a dataframe of constraint points which contains folds of testing and training data (generated with cross_validation.split_test_train). For each set of zref/density values, recreate the starting topography with the supplied starting_topography_kwargs, recreate the starting prism model, and perform a regional separation and inversion for each of the K folds in the constraints dataframe.

After all K folds are inverted, the mean of the K folds scores will be the score for that set of parameters. Repeat this for all parameters. Within each fold, the training constraints are used for the regional separation and the testing constraints are used for scoring. This is only useful if the regional separation technique you supply via regional_grav_kwargs uses constraints points for the estimations, such as constraint point minimization. If using 20 sets of density and zref values, and use 5 folds, this will run 100 inversions. It is more efficient, but less accurate, to simple use a different regional estimation technique, which doesn’t require constraint points, to find the optimal zref and density values. Then use these again in another inversion with the desired regional separation technique.

[30]:
inv.reinitialize_inversion()
[31]:
# run a 5-fold cross validation for 15 parameter sets of density and zref
density_zref_optimization_obj = inv.optimize_inversion_zref_density_contrast_kfolds(
    constraints_df=constraint_points,
    density_contrast_limits=(2300, 3000),
    zref_limits=(0, 1e3),
    n_trials=15,
    regional_grav_kwargs=regional_grav_kwargs,
    starting_topography_kwargs=starting_topography_kwargs,
    plot_scores=False,
    fname="../tmp/09_combining_it_all_alternative_density_and_zref_optimization",
    split_kwargs=dict(
        n_splits=5,
        method="KFold",
    ),
    fold_progressbar=False,
)
Best damping value (1e-10) is at the limit of provided values (1.0, 1e-10) and thus is likely not a global minimum, expand the range of values tested to ensure the best parameter value is found.
Best damping value (1e-40) is at the limit of provided values (1e-40, 1.0) and thus is likely not a global minimum, expand the range of values test to ensure the best parameter value value is found.
Best damping value (1.0) is at the limit of provided values (1.0, 1e-10) and thus is likely not a global minimum, expand the range of values tested to ensure the best parameter value is found.
Best damping value (1e-40) is at the limit of provided values (1e-40, 1.0) and thus is likely not a global minimum, expand the range of values test to ensure the best parameter value value is found.
'forward_gravity' already a variable of `grav_ds`, but is being overwritten since calculate_starting_gravity is True
'reg' already a column of `grav_df`, but is being overwritten since calculate_regional_misfit is True
[32]:
# to re-load the inv object from the saved pickle file
with pathlib.Path(
    "../tmp/09_combining_it_all_alternative_density_and_zref_optimization.pickle"
).open("rb") as f:
    inv = pickle.load(f)

# to re-load the study from the saved pickle file
with pathlib.Path(
    "../tmp/09_combining_it_all_alternative_density_and_zref_optimization_study.pickle"
).open("rb") as f:
    study = pickle.load(f)

inv.solver_damping, inv.model.density_contrast, inv.model.zref
[32]:
(0.022462769883683984, 2805, 599.1217798843752)
[33]:
fig = invert4geom.plot_2_parameter_scores_uneven(
    density_zref_optimization_obj.study,
    param_names=(
        "params_zref",
        "params_density_contrast",
    ),
    plot_param_names=(
        "Reference level (m)",
        "Density contrast (kg/m$^3$)",
    ),
)
../_images/tutorial_09_combining_it_all_alternative_CV_technique_44_0.png
[34]:
inv.plot_convergence()

inv.plot_inversion_results(iters_to_plot=2)

_ = ptk.grid_compare(
    true_topography,
    inv.model.topography,
    grid1_name="True topography",
    grid2_name="Inverted topography",
    robust=True,
    hist=True,
    inset=False,
    title="difference",
    reverse_cpt=True,
    cmap="rain",
    points=constraint_points,
    points_style="x.3c",
)
../_images/tutorial_09_combining_it_all_alternative_CV_technique_45_0.png
../_images/tutorial_09_combining_it_all_alternative_CV_technique_45_1.png
../_images/tutorial_09_combining_it_all_alternative_CV_technique_45_2.png
../_images/tutorial_09_combining_it_all_alternative_CV_technique_45_3.png
../_images/tutorial_09_combining_it_all_alternative_CV_technique_45_4.png
[ ]: