9. Combining it all - Alternative density/zref optimization technique#
This notebook is the same as the previous, but for the zref and density optimization, we use a different technique; internal cross validation. Instead of switching to a regional estimation technique which doesn’t use the constraint points, we instead keep using the constraint point minimization regional estimation technique, but split constraints into testing and training sets, performing a K-Folds cross validation to determine the optimal zref and density contrast values.
9.1. Import packages#
[1]:
# set EPSG for plotting functions
import os
import pathlib
import pickle
import numpy as np
import pandas as pd
import polartoolkit as ptk
import verde as vd
import xarray as xr
import invert4geom
os.environ["POLARTOOLKIT_EPSG"] = "3031"
/home/mdtanker/miniforge3/envs/invert4geom/lib/python3.12/site-packages/UQpy/__init__.py:6: UserWarning:
pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81.
9.2. Get Input datasets#
[2]:
true_topography, _, _, _ = invert4geom.load_synthetic_model(
spacing=1000,
region=(0, 40000, 0, 30000),
)
# create lower synthetic topography data
lower_topography = invert4geom.synthetic_topography_regional(
spacing=1000,
region=(0, 40000, 0, 30000),
scale=1,
yoffset=-1000,
)
true_zref = true_topography.to_numpy().mean()
true_density_contrast = 2670 - 1 # sea level
print(f"mean of true topography: {true_zref} m ")
mean of true topography: 492.2704164812973 m
[3]:
topography_model = invert4geom.create_model(
zref=true_zref,
density_contrast=true_density_contrast,
topography=true_topography.to_dataset(name="upward"),
)
basement_model = invert4geom.create_model(
zref=lower_topography.to_numpy().mean(),
density_contrast=3300 - 2800,
topography=lower_topography.to_dataset(name="upward"),
)
invert4geom.plot_prism_layers(
[topography_model, basement_model],
color_by="thickness",
zscale=20,
)
[4]:
# make pandas dataframe of locations to calculate gravity
# this represents the station locations of a gravity survey
# create lists of coordinates
coords = vd.grid_coordinates(
region=topography_model.region,
spacing=topography_model.spacing,
pixel_register=False,
extra_coords=1001, # survey elevation
)
# grid the coordinates
observations = vd.make_xarray_grid(
(coords[0], coords[1]),
data=coords[2],
data_names="upward",
dims=("northing", "easting"),
)
[5]:
data = invert4geom.create_data(observations)
print(f"Gravity region (W,E,S,N): {data.region}")
print(f"Gravity spacing: {data.spacing} m")
data
Gravity region (W,E,S,N): (0.0, 40000.0, 0.0, 30000.0)
Gravity spacing: 1000.0 m
[5]:
<xarray.Dataset> Size: 11kB
Dimensions: (northing: 31, easting: 41)
Coordinates:
* northing (northing) float64 248B 0.0 1e+03 2e+03 ... 2.8e+04 2.9e+04 3e+04
* easting (easting) float64 328B 0.0 1e+03 2e+03 ... 3.8e+04 3.9e+04 4e+04
Data variables:
upward (northing, easting) float64 10kB 1.001e+03 1.001e+03 ... 1.001e+03
Attributes:
region: (0.0, 40000.0, 0.0, 30000.0)
spacing: 1000.0
buffer_width: 3000.0
inner_region: (3000.0, 37000.0, 3000.0, 27000.0)
dataset_type: data
model_type: prisms
coord_names: ('easting', 'northing')[6]:
data.inv.forward_gravity(topography_model, name="upper_surface_grav")
data.inv.forward_gravity(basement_model, name="lower_surface_grav")
data.inv.df.describe()
[6]:
| northing | easting | upward | upper_surface_grav | lower_surface_grav | |
|---|---|---|---|---|---|
| count | 1271.000000 | 1271.00000 | 1271.0 | 1271.000000 | 1271.000000 |
| mean | 15000.000000 | 20000.00000 | 1001.0 | -0.129619 | 0.004659 |
| std | 8947.792584 | 11836.81698 | 0.0 | 7.147140 | 0.830306 |
| min | 0.000000 | 0.00000 | 1001.0 | -17.180401 | -1.201774 |
| 25% | 7000.000000 | 10000.00000 | 1001.0 | -5.537833 | -0.699901 |
| 50% | 15000.000000 | 20000.00000 | 1001.0 | -1.136301 | -0.177953 |
| 75% | 23000.000000 | 30000.00000 | 1001.0 | 3.743994 | 0.629891 |
| max | 30000.000000 | 40000.00000 | 1001.0 | 19.581120 | 1.860259 |
[7]:
# contaminate gravity with 0.1 mGal of random noise
data["gravity_anomaly_no_noise"] = data.upper_surface_grav + data.lower_surface_grav
data["gravity_anomaly"], stddev = invert4geom.contaminate(
data.gravity_anomaly_no_noise,
stddev=0.1,
percent=False,
seed=0,
)
data["uncert"] = xr.full_like(data.gravity_anomaly, stddev)
data
[7]:
<xarray.Dataset> Size: 62kB
Dimensions: (northing: 31, easting: 41)
Coordinates:
* northing (northing) float64 248B 0.0 1e+03 ... 3e+04
* easting (easting) float64 328B 0.0 1e+03 ... 3.9e+04 4e+04
Data variables:
upward (northing, easting) float64 10kB 1.001e+03 ... ...
upper_surface_grav (northing, easting) float64 10kB 9.526 ... 2.855
lower_surface_grav (northing, easting) float64 10kB -0.04793 ... 0...
gravity_anomaly_no_noise (northing, easting) float64 10kB 9.478 ... 3.368
gravity_anomaly (northing, easting) float64 10kB 9.495 ... 3.425
uncert (northing, easting) float64 10kB 0.1 0.1 ... 0.1
Attributes:
region: (0.0, 40000.0, 0.0, 30000.0)
spacing: 1000.0
buffer_width: 3000.0
inner_region: (3000.0, 37000.0, 3000.0, 27000.0)
dataset_type: data
model_type: prisms
coord_names: ('easting', 'northing')[8]:
fig = ptk.plot_grid(
data.lower_surface_grav,
fig_height=10,
title="lower prisms gravity",
cmap="viridis",
hist=True,
cbar_label="mGal",
frame=["nSWe", "xaf10000", "yaf10000"],
)
fig = ptk.plot_grid(
data.upper_surface_grav,
fig=fig,
origin_shift="x",
fig_height=10,
title="upper prisms gravity",
cmap="viridis",
hist=True,
cbar_label="mGal",
frame=["nSwE", "xaf10000", "yaf10000"],
)
fig = ptk.plot_grid(
data.gravity_anomaly,
fig=fig,
origin_shift="x",
fig_height=10,
title="Observed gravity",
cmap="viridis",
hist=True,
cbar_label="mGal",
frame=["nSwE", "xaf10000", "yaf10000"],
)
fig.show()
9.3. Create “a-priori” topography measurements#
[9]:
# create 15 random point within the outcropping basement region
num_constraints = 15
coords = vd.scatter_points(
region=vd.pad_region(data.region, -data.spacing),
size=num_constraints,
random_state=7,
)
constraint_points = pd.DataFrame(data={"easting": coords[0], "northing": coords[1]})
# sample true topography at these points
constraint_points = invert4geom.sample_grids(
constraint_points,
true_topography,
"true_upward",
)
constraint_points["upward"] = constraint_points.true_upward
# re-sample depths with uncertainty to emulate measurement errors
# set each points uncertainty equal to 2% of depth
uncert = np.abs(0.02 * constraint_points.upward)
constraint_points.loc[constraint_points.index, "uncert"] = uncert
constraint_points = invert4geom.randomly_sample_data(
seed=0,
data_df=constraint_points,
data_col="upward",
uncert_col="uncert",
)
# create weights column
constraint_points["weight"] = 1 / (constraint_points.uncert**2)
constraint_points.head()
[9]:
| easting | northing | true_upward | upward | uncert | weight | |
|---|---|---|---|---|---|---|
| 0 | 3899.714996 | 26468.618776 | 655.102400 | 656.749723 | 13.102048 | 0.005825 |
| 1 | 30636.914105 | 6974.789900 | 407.011208 | 405.935844 | 8.140224 | 0.015091 |
| 2 | 17659.550795 | 13659.470931 | 470.155613 | 476.177579 | 9.403112 | 0.011310 |
| 3 | 28491.676758 | 27073.768551 | 515.239653 | 516.320627 | 10.304793 | 0.009417 |
| 4 | 38163.601456 | 1697.178371 | 419.197817 | 414.706788 | 8.383956 | 0.014227 |
9.4. Create starting model#
[10]:
# grid the sampled values using verde
starting_topography_kwargs = dict(
method="splines",
region=topography_model.region,
spacing=topography_model.spacing,
constraints_df=constraint_points,
dampings=np.logspace(-40, 0, 100),
weights_col="weight",
)
starting_topography = invert4geom.create_topography(**starting_topography_kwargs)
_ = ptk.grid_compare(
true_topography,
starting_topography.upward,
grid1_name="True topography",
grid2_name="Starting topography",
robust=True,
hist=True,
inset=False,
title="difference",
reverse_cpt=True,
cmap="rain",
points=constraint_points,
points_style="x.3c",
)
[11]:
model = invert4geom.create_model(
zref=constraint_points.upward.mean(),
density_contrast=2500,
topography=starting_topography,
)
model.inv.plot_model(
color_by="density",
zscale=20,
)
9.5. Calculate starting gravity misfit#
Now we need to calculate the forward gravity of the starting topography. We then can subtract it from our observed gravity to get a starting gravity misfit.
We don’t know the optimal values for the density contrast or the reference level, so we will make a guess at appropriate values for the starting model.
[12]:
data.inv.forward_gravity(model)
# calculate the true residual misfit
# true misfit is difference between noise-free gravity and starting gravity
# true regional misfit is the moho gravity
# so true residual is misfit - true_regional
data["true_res"] = (
data.gravity_anomaly_no_noise - data.forward_gravity - data.lower_surface_grav
)
data.inv.df.describe()
[12]:
| northing | easting | upward | upper_surface_grav | lower_surface_grav | gravity_anomaly_no_noise | gravity_anomaly | uncert | forward_gravity | true_res | |
|---|---|---|---|---|---|---|---|---|---|---|
| count | 1271.000000 | 1271.00000 | 1271.0 | 1271.000000 | 1271.000000 | 1271.000000 | 1271.000000 | 1.271000e+03 | 1271.000000 | 1271.000000 |
| mean | 15000.000000 | 20000.00000 | 1001.0 | -0.129619 | 0.004659 | -0.124959 | -0.124959 | 1.000000e-01 | 1.079930 | -1.209548 |
| std | 8947.792584 | 11836.81698 | 0.0 | 7.147140 | 0.830306 | 7.314752 | 7.313577 | 2.776650e-17 | 6.599125 | 4.456801 |
| min | 0.000000 | 0.00000 | 1001.0 | -17.180401 | -1.201774 | -16.908338 | -17.061563 | 1.000000e-01 | -7.883841 | -19.046853 |
| 25% | 7000.000000 | 10000.00000 | 1001.0 | -5.537833 | -0.699901 | -5.826630 | -5.827131 | 1.000000e-01 | -3.944309 | -2.622186 |
| 50% | 15000.000000 | 20000.00000 | 1001.0 | -1.136301 | -0.177953 | -0.557795 | -0.572175 | 1.000000e-01 | -0.765707 | -1.088926 |
| 75% | 23000.000000 | 30000.00000 | 1001.0 | 3.743994 | 0.629891 | 4.171648 | 4.139840 | 1.000000e-01 | 4.823461 | 0.332004 |
| max | 30000.000000 | 40000.00000 | 1001.0 | 19.581120 | 1.860259 | 19.774399 | 19.944612 | 1.000000e-01 | 22.159800 | 12.693226 |
[13]:
_ = ptk.grid_compare(
data.gravity_anomaly,
data.forward_gravity,
grid1_name="Observed gravity",
grid2_name="Starting gravity",
robust=True,
hist=True,
inset=False,
title="Gravity misfit",
rmse_in_title=False,
points=constraint_points,
points_style="x.3c",
)
[14]:
fig = ptk.plot_grid(
data.lower_surface_grav,
fig_height=10,
title="True regional",
hist=True,
cbar_label="mGal",
frame=["nSWe", "xaf10000", "yaf10000"],
)
fig = ptk.plot_grid(
data.true_res,
fig=fig,
origin_shift="x",
fig_height=10,
title="True residual",
hist=True,
cbar_label="mgal",
frame=["nSwE", "xaf10000", "yaf10000"],
)
fig.show()
9.6. Regional separation#
[15]:
def regional_comparison(data):
# compare with true regional
_ = ptk.grid_compare(
data.lower_surface_grav,
data.reg,
robust=True,
grid1_name="True regional field",
grid2_name="Estimated regional field",
hist=True,
inset=False,
title="difference",
points=constraint_points,
points_style="x.3c",
)
# compare with true residual
_ = ptk.grid_compare(
data.true_res,
data.res,
robust=True,
grid1_name="True residual field",
grid2_name="Estimated residual field",
hist=True,
inset=False,
title="difference",
points=constraint_points,
points_style="x.3c",
)
[16]:
# estimate regional with the mean misfit at constraints
regional_grav_kwargs = dict(
method="constraints",
grid_method="eq_sources",
constraints_weights_column="weight",
constraints_df=constraint_points,
cv=True,
cv_kwargs=dict(
n_trials=50,
damping_limits=(1e-10, 1),
# depth_limits=(1e3,100e3),
progressbar=False,
fname="../tmp/09_combining_it_all_alternative_regional_sep",
# plot=True,
),
depth="default",
block_size=None,
)
data.inv.regional_separation(
**regional_grav_kwargs,
)
regional_comparison(data)
data.inv.df.describe()
[16]:
| northing | easting | upward | upper_surface_grav | lower_surface_grav | gravity_anomaly_no_noise | gravity_anomaly | uncert | forward_gravity | true_res | misfit | reg | res | starting_forward_gravity | starting_misfit | starting_reg | starting_res | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| count | 1271.000000 | 1271.00000 | 1271.0 | 1271.000000 | 1271.000000 | 1271.000000 | 1271.000000 | 1.271000e+03 | 1271.000000 | 1271.000000 | 1271.000000 | 1271.000000 | 1271.000000 | 1271.000000 | 1271.000000 | 1271.000000 | 1271.000000 |
| mean | 15000.000000 | 20000.00000 | 1001.0 | -0.129619 | 0.004659 | -0.124959 | -0.124959 | 1.000000e-01 | 1.079930 | -1.209548 | -1.204889 | -1.050441 | -0.154448 | 1.079930 | -1.204889 | -1.050441 | -0.154448 |
| std | 8947.792584 | 11836.81698 | 0.0 | 7.147140 | 0.830306 | 7.314752 | 7.313577 | 2.776650e-17 | 6.599125 | 4.456801 | 4.644997 | 0.899712 | 4.423033 | 6.599125 | 4.644997 | 0.899712 | 4.423033 |
| min | 0.000000 | 0.00000 | 1001.0 | -17.180401 | -1.201774 | -16.908338 | -17.061563 | 1.000000e-01 | -7.883841 | -19.046853 | -19.654062 | -2.393489 | -18.180347 | -7.883841 | -19.654062 | -2.393489 | -18.180347 |
| 25% | 7000.000000 | 10000.00000 | 1001.0 | -5.537833 | -0.699901 | -5.826630 | -5.827131 | 1.000000e-01 | -3.944309 | -2.622186 | -2.931387 | -1.856200 | -1.829092 | -3.944309 | -2.931387 | -1.856200 | -1.829092 |
| 50% | 15000.000000 | 20000.00000 | 1001.0 | -1.136301 | -0.177953 | -0.557795 | -0.572175 | 1.000000e-01 | -0.765707 | -1.088926 | -1.578812 | -1.142989 | -0.050248 | -0.765707 | -1.578812 | -1.142989 | -0.050248 |
| 75% | 23000.000000 | 30000.00000 | 1001.0 | 3.743994 | 0.629891 | 4.171648 | 4.139840 | 1.000000e-01 | 4.823461 | 0.332004 | 0.296782 | -0.286862 | 1.744463 | 4.823461 | 0.296782 | -0.286862 | 1.744463 |
| max | 30000.000000 | 40000.00000 | 1001.0 | 19.581120 | 1.860259 | 19.774399 | 19.944612 | 1.000000e-01 | 22.159800 | 12.693226 | 13.114356 | 0.689128 | 14.012389 | 22.159800 | 13.114356 | 0.689128 | 14.012389 |
[17]:
data.inv.plot_anomalies()
9.7. Single inversion#
Perform a single inversion to experiment with values of stopping criteria.
[18]:
# setup the inversion
inv = invert4geom.Inversion(
data,
model,
solver_damping=0.005,
# set stopping criteria
l2_norm_tolerance=0.3, # grav error of .1mGal -> sqrt(.1) = 0.3
delta_l2_norm_tolerance=1.008,
)
[19]:
# run the inversion
inv.invert(
plot_dynamic_convergence=True,
results_fname="../tmp/09_combining_it_all_alternative",
)
[20]:
inv.stats_df
[20]:
| iteration | rmse | l2_norm | delta_l2_norm | iter_time_sec | |
|---|---|---|---|---|---|
| 0 | 0.0 | 4.636082 | 2.153156 | inf | NaN |
| 1 | 1.0 | 1.360583 | 1.166440 | 1.845920 | 0.709483 |
| 2 | 2.0 | 0.478946 | 0.692059 | 1.685464 | 0.151519 |
| 3 | 3.0 | 0.246154 | 0.496139 | 1.394890 | 0.152965 |
| 4 | 4.0 | 0.213645 | 0.462217 | 1.073389 | 0.172500 |
| 5 | 5.0 | 0.276410 | 0.525747 | 0.879163 | 0.167975 |
| 6 | 6.0 | 0.432148 | 0.657380 | 0.799761 | 0.161606 |
[21]:
inv.plot_inversion_results(
iters_to_plot=2,
)
[22]:
_ = ptk.grid_compare(
true_topography,
inv.model.topography,
region=data.inner_region,
grid1_name="True topography",
grid2_name="Inverted topography",
robust=True,
hist=True,
inset=False,
verbose="q",
title="difference",
reverse_cpt=True,
cmap="rain",
frame=True,
points=constraint_points,
points_style="x.3c",
)
9.8. Damping parameter cross validation#
9.8.1. Resample to create testing / training data sets#
[23]:
data.inv.df
[23]:
| northing | easting | upward | upper_surface_grav | lower_surface_grav | gravity_anomaly_no_noise | gravity_anomaly | uncert | forward_gravity | true_res | misfit | reg | res | starting_forward_gravity | starting_misfit | starting_reg | starting_res | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0.0 | 0.0 | 1001.0 | 9.525994 | -0.047934 | 9.478060 | 9.494673 | 0.1 | 10.740517 | -1.214524 | -1.245844 | -2.206690 | 0.960846 | 10.740517 | -1.245844 | -2.206690 | 0.960846 |
| 1 | 0.0 | 1000.0 | 1001.0 | 10.416800 | -0.060607 | 10.356193 | 10.347023 | 0.1 | 11.470352 | -1.053551 | -1.123328 | -2.237354 | 1.114026 | 11.470352 | -1.123328 | -2.237354 | 1.114026 |
| 2 | 0.0 | 2000.0 | 1001.0 | 9.945126 | -0.079985 | 9.865141 | 9.933223 | 0.1 | 10.699837 | -0.754711 | -0.766613 | -2.265615 | 1.499002 | 10.699837 | -0.766613 | -2.265615 | 1.499002 |
| 3 | 0.0 | 3000.0 | 1001.0 | 9.265122 | -0.102636 | 9.162486 | 9.177017 | 0.1 | 9.740384 | -0.475262 | -0.563367 | -2.291301 | 1.727934 | 9.740384 | -0.563367 | -2.291301 | 1.727934 |
| 4 | 0.0 | 4000.0 | 1001.0 | 8.528509 | -0.126515 | 8.401994 | 8.352468 | 0.1 | 8.762056 | -0.233547 | -0.409588 | -2.314268 | 1.904680 | 8.762056 | -0.409588 | -2.314268 | 1.904680 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 1266 | 30000.0 | 36000.0 | 1001.0 | 3.330540 | 0.765378 | 4.095918 | 3.988219 | 0.1 | 5.931613 | -2.601073 | -1.943393 | 0.556764 | -2.500158 | 5.931613 | -1.943393 | 0.556764 | -2.500158 |
| 1267 | 30000.0 | 37000.0 | 1001.0 | 3.328091 | 0.745878 | 4.073969 | 4.144597 | 0.1 | 6.177712 | -2.849622 | -2.033116 | 0.528620 | -2.561735 | 6.177712 | -2.033116 | 0.528620 | -2.561735 |
| 1268 | 30000.0 | 38000.0 | 1001.0 | 3.333133 | 0.706587 | 4.039719 | 4.073005 | 0.1 | 6.390391 | -3.057258 | -2.317386 | 0.499154 | -2.816540 | 6.390391 | -2.317386 | 0.499154 | -2.816540 |
| 1269 | 30000.0 | 39000.0 | 1001.0 | 3.298196 | 0.634280 | 3.932475 | 3.841945 | 0.1 | 6.478970 | -3.180774 | -2.637025 | 0.468717 | -3.105742 | 6.478970 | -2.637025 | 0.468717 | -3.105742 |
| 1270 | 30000.0 | 40000.0 | 1001.0 | 2.855276 | 0.512861 | 3.368138 | 3.425127 | 0.1 | 5.730724 | -2.875448 | -2.305597 | 0.437636 | -2.743233 | 5.730724 | -2.305597 | 0.437636 | -2.743233 |
1271 rows × 17 columns
[24]:
inv.data.inv.df.columns
[24]:
Index(['northing', 'easting', 'upward', 'upper_surface_grav',
'lower_surface_grav', 'gravity_anomaly_no_noise', 'gravity_anomaly',
'uncert', 'forward_gravity', 'true_res', 'misfit', 'reg', 'res',
'starting_forward_gravity', 'starting_misfit', 'starting_reg',
'starting_res', 'iter_1_initial_residual', 'iter_2_initial_residual',
'iter_3_initial_residual', 'iter_4_initial_residual',
'iter_5_initial_residual', 'iter_6_initial_residual'],
dtype='str')
[25]:
inv.reinitialize_inversion()
# resample data at 1/2 spacing to include test points for cross-validation
inv.data = invert4geom.add_test_points(inv.data)
[26]:
damping_cv_obj = inv.optimize_inversion_damping(
damping_limits=(0.001, 1),
n_trials=8,
fname="../tmp/09_combining_it_all_alternative_damping_CV",
)
inv.solver_damping
[26]:
0.022462769883683984
[27]:
damping_cv_obj.study.trials_dataframe()
[27]:
| number | value | datetime_start | datetime_complete | duration | params_damping | user_attrs_fname | system_attrs_fixed_params | state | |
|---|---|---|---|---|---|---|---|---|---|
| 0 | 0 | 0.483118 | 2026-02-17 02:21:04.924908 | 2026-02-17 02:21:05.732415 | 0 days 00:00:00.807507 | 0.001000 | ../tmp/09_combining_it_all_alternative_damping... | {'damping': 0.001} | COMPLETE |
| 1 | 1 | 4.592549 | 2026-02-17 02:21:05.733508 | 2026-02-17 02:21:06.273835 | 0 days 00:00:00.540327 | 1.000000 | ../tmp/09_combining_it_all_alternative_damping... | {'damping': 1} | COMPLETE |
| 2 | 2 | 0.606439 | 2026-02-17 02:21:06.274852 | 2026-02-17 02:21:07.400753 | 0 days 00:00:01.125901 | 0.001983 | ../tmp/09_combining_it_all_alternative_damping... | NaN | COMPLETE |
| 3 | 3 | 0.073751 | 2026-02-17 02:21:07.401706 | 2026-02-17 02:21:08.987635 | 0 days 00:00:01.585929 | 0.062714 | ../tmp/09_combining_it_all_alternative_damping... | NaN | COMPLETE |
| 4 | 4 | 0.033137 | 2026-02-17 02:21:08.995668 | 2026-02-17 02:21:09.739758 | 0 days 00:00:00.744090 | 0.022463 | ../tmp/09_combining_it_all_alternative_damping... | NaN | COMPLETE |
| 5 | 5 | 0.051635 | 2026-02-17 02:21:09.740771 | 2026-02-17 02:21:10.641748 | 0 days 00:00:00.900977 | 0.037994 | ../tmp/09_combining_it_all_alternative_damping... | NaN | COMPLETE |
| 6 | 6 | 0.080030 | 2026-02-17 02:21:10.642737 | 2026-02-17 02:21:11.593243 | 0 days 00:00:00.950506 | 0.010451 | ../tmp/09_combining_it_all_alternative_damping... | NaN | COMPLETE |
| 7 | 7 | 0.065969 | 2026-02-17 02:21:11.594263 | 2026-02-17 02:21:12.338491 | 0 days 00:00:00.744228 | 0.016081 | ../tmp/09_combining_it_all_alternative_damping... | NaN | COMPLETE |
[28]:
inv.plot_convergence()
inv.plot_inversion_results(iters_to_plot=2)
_ = ptk.grid_compare(
true_topography,
inv.model.topography,
grid1_name="True topography",
grid2_name="Inverted topography",
robust=True,
hist=True,
inset=False,
title="difference",
reverse_cpt=True,
cmap="rain",
points=constraint_points,
points_style="x.3c",
)
[29]:
# sample the inverted topography at the constraint points
constraint_points = invert4geom.sample_grids(
constraint_points,
inv.model.topography,
"inverted_topography",
)
rmse = invert4geom.rmse(
constraint_points.upward - constraint_points.inverted_topography
)
print(f"RMSE: {rmse:.2f} m")
RMSE: 10.02 m
9.9. Z-ref and density contrast optimizations#
Since this cross validation uses the inversion error at constraint points, we can’t use the constraint point minimization technique to estimate the regional field since that inherently sets the inversion error at constraints to zero, invalidating the cross validation scores.
There are two options for how to get around this issue:
use a different regional estimation technique while finding the optimal density contrast value, then use the found optimal values with the constraint point minimization regional estimation technique afterwards.
separate the constraints into testing and training sets, so that only the training set is used during the regional separation, and only the testing set is used for scoring the density contrast cross validation.
Since we have constraint points, we will just use their mean value as the zref and omit the zref CV.
Well perform just the second method below:
This option is less efficient, but in some cases may be more accurate.
Perform a cross validation for the optimal zref and density contrast values same as function optimize_inversion_zref_density_contrast, but pass a dataframe of constraint points which contains folds of testing and training data (generated with cross_validation.split_test_train). For each set of zref/density values, recreate the starting topography with the supplied starting_topography_kwargs, recreate the starting prism model, and perform a regional separation and inversion for each of the K folds in the constraints dataframe.
After all K folds are inverted, the mean of the K folds scores will be the score for that set of parameters. Repeat this for all parameters. Within each fold, the training constraints are used for the regional separation and the testing constraints are used for scoring. This is only useful if the regional separation technique you supply via regional_grav_kwargs uses constraints points for the estimations, such as constraint point minimization. If using 20 sets of density and zref values, and use 5 folds, this will run 100 inversions. It is more efficient, but less accurate, to simple use a different regional estimation technique, which doesn’t require constraint points, to find the optimal zref and density values. Then use these again in another inversion with the desired regional separation technique.
[30]:
inv.reinitialize_inversion()
[31]:
# run a 5-fold cross validation for 15 parameter sets of density and zref
density_zref_optimization_obj = inv.optimize_inversion_zref_density_contrast_kfolds(
constraints_df=constraint_points,
density_contrast_limits=(2300, 3000),
zref_limits=(0, 1e3),
n_trials=15,
regional_grav_kwargs=regional_grav_kwargs,
starting_topography_kwargs=starting_topography_kwargs,
plot_scores=False,
fname="../tmp/09_combining_it_all_alternative_density_and_zref_optimization",
split_kwargs=dict(
n_splits=5,
method="KFold",
),
fold_progressbar=False,
)
Best damping value (1e-10) is at the limit of provided values (1.0, 1e-10) and thus is likely not a global minimum, expand the range of values tested to ensure the best parameter value is found.
Best damping value (1e-40) is at the limit of provided values (1e-40, 1.0) and thus is likely not a global minimum, expand the range of values test to ensure the best parameter value value is found.
Best damping value (1.0) is at the limit of provided values (1.0, 1e-10) and thus is likely not a global minimum, expand the range of values tested to ensure the best parameter value is found.
Best damping value (1e-40) is at the limit of provided values (1e-40, 1.0) and thus is likely not a global minimum, expand the range of values test to ensure the best parameter value value is found.
'forward_gravity' already a variable of `grav_ds`, but is being overwritten since calculate_starting_gravity is True
'reg' already a column of `grav_df`, but is being overwritten since calculate_regional_misfit is True
[32]:
# to re-load the inv object from the saved pickle file
with pathlib.Path(
"../tmp/09_combining_it_all_alternative_density_and_zref_optimization.pickle"
).open("rb") as f:
inv = pickle.load(f)
# to re-load the study from the saved pickle file
with pathlib.Path(
"../tmp/09_combining_it_all_alternative_density_and_zref_optimization_study.pickle"
).open("rb") as f:
study = pickle.load(f)
inv.solver_damping, inv.model.density_contrast, inv.model.zref
[32]:
(0.022462769883683984, 2805, 599.1217798843752)
[33]:
fig = invert4geom.plot_2_parameter_scores_uneven(
density_zref_optimization_obj.study,
param_names=(
"params_zref",
"params_density_contrast",
),
plot_param_names=(
"Reference level (m)",
"Density contrast (kg/m$^3$)",
),
)
[34]:
inv.plot_convergence()
inv.plot_inversion_results(iters_to_plot=2)
_ = ptk.grid_compare(
true_topography,
inv.model.topography,
grid1_name="True topography",
grid2_name="Inverted topography",
robust=True,
hist=True,
inset=False,
title="difference",
reverse_cpt=True,
cmap="rain",
points=constraint_points,
points_style="x.3c",
)
[ ]: