Density contrast cross validation#

The choice of the density value used in the inversion directly affects the results and therefore needs to be carefully chosen. Choosing too high of a density contrast will result in a low amplitude topography, and too low of a value will result in a topography with high amplitude features. Similar to choosing an optimal damping value in damping_cross_validation.ipynb, we provides some tools to best choose the density contrast value.

However, this cross validation is slightly different than the damping cross validation in that we need at least 1 location where we know the true topography. These locations could be from seismic surveys, drill sites etc. These points of known topography are referred to as constraint points.

Import packages#

[1]:
from __future__ import annotations

%load_ext autoreload
%autoreload 2


import logging

import numpy as np
import pandas as pd
import verde as vd
import xarray as xr
from polartoolkit import maps
from polartoolkit import utils as polar_utils
from tqdm.autonotebook import tqdm

from invert4geom import cross_validation, inversion, plotting, synthetic, utils

Create observed gravity data#

True topography#

[2]:
# set grid parameters
spacing = 1000
region = [0, 40000, 0, 30000]

# create synthetic topography data
true_topography = synthetic.synthetic_topography_simple(
    spacing,
    region,
)

Constraint points#

Sample the starting topography at 10 random locations to create a set of constraints points, simulating locations where the topography is known.

[3]:
# create 10 random point withing the region
num_constraints = 10
coords = vd.scatter_points(region=region, size=num_constraints, random_state=7)
constraint_points = pd.DataFrame(data={"easting": coords[0], "northing": coords[1]})

# sample true topography at these points
constraint_points = utils.sample_grids(
    constraint_points, true_topography, "upward", coord_names=("easting", "northing")
)

# plot the true topography
fig = maps.plot_grd(
    true_topography,
    fig_height=10,
    title="True topography",
    cmap="rain",
    reverse_cpt=True,
    grd2_cpt=True,
    cbar_label="elevation (m)",
    frame=["nSWe", "xaf10000", "yaf10000"],
    points=constraint_points.rename(columns={"easting": "x", "northing": "y"}),
)
fig.show()
../_images/user_guide_density_cross_validation_7_0.png

True topography prism layer#

[4]:
# the density contrast is between rock (~2670 kg/m3) and air (~1 kg/m3)
density_contrast = 2670 - 1

# prisms are created between the mean topography value and the height of the topography
zref = true_topography.values.mean()
print(f"mean of true topography: {zref} m ")

# prisms above zref have positive density contrast and prisms below zref have negative
# density contrast
density = xr.where(true_topography >= zref, density_contrast, -density_contrast)

# create layer of prisms
prisms = utils.grids_to_prisms(
    true_topography,
    zref,
    density=density,
)
mean of true topography: 492.2704164812973 m

Forward gravity of prism layer#

[5]:
# make pandas dataframe of locations to calculate gravity
# this represents the station locations of a gravity survey
# create lists of coordinates
coords = vd.grid_coordinates(
    region=region,
    spacing=spacing,
    pixel_register=False,
    extra_coords=1000,  # survey elevation
)

# grid the coordinates
observations = vd.make_xarray_grid(
    (coords[0], coords[1]),
    data=coords[2],
    data_names="upward",
    dims=("northing", "easting"),
).upward

grav_df = vd.grid_to_table(observations)

grav_df["grav"] = prisms.prism_layer.gravity(
    coordinates=(
        grav_df.easting,
        grav_df.northing,
        grav_df.upward,
    ),
    field="g_z",
    progressbar=True,
)
grav_df
[5]:
northing easting upward grav
0 0.0 0.0 1000.0 9.534643
1 0.0 1000.0 1000.0 10.422834
2 0.0 2000.0 1000.0 9.949973
3 0.0 3000.0 1000.0 9.269279
4 0.0 4000.0 1000.0 8.532160
... ... ... ... ...
1266 30000.0 36000.0 1000.0 3.332716
1267 30000.0 37000.0 1000.0 3.330307
1268 30000.0 38000.0 1000.0 3.335438
1269 30000.0 39000.0 1000.0 3.300721
1270 30000.0 40000.0 1000.0 2.858299

1271 rows Γ— 4 columns

[6]:
# contaminate gravity with 0.5 mGal of random noise
grav_df["observed_grav"], stddev = synthetic.contaminate(
    grav_df.grav,
    stddev=0.5,
    percent=False,
    seed=0,
)

grav_df.set_index(["northing", "easting"]).to_xarray().observed_grav.plot()
[6]:
<matplotlib.collections.QuadMesh at 0x7f37aaf1bd10>
../_images/user_guide_density_cross_validation_12_1.png

For simplicity here we assume that we know reference level zref of the true topography, and use this when creating our starting model.

What we don’t know in this scenario is what value to use for the density contrast. We will use a cross-validation of a range of density values to find the one which is optimal.

Each density value will give a cross validation score, and the lowest score will show which density value is optimal.

The cross validation score is calculated as the root mean square error between the points of known topography (constraints) and the inverted topography at those points.

Get Cross Validation Score#

First we need to use the density value to create the flat starting model. We will then use this in an inversion to calculate a score.

[7]:
# set a density contrast
density_contrast = 2300  # compared to true value of 2669

# assume we know the optimal zref value
zref = true_topography.values.mean()

# create flat topography grid with a constant height
starting_topography = xr.full_like(true_topography, zref)

# prisms above zref have positive density contrast and prisms below zref have negative
# density contrast
density = xr.where(starting_topography >= zref, density_contrast, -density_contrast)

# create layer of prisms
starting_prisms = utils.grids_to_prisms(
    starting_topography,
    zref,
    density=density,
)

# gravity of starting model is 0 since its flat, so observed_grav = misfit
grav_df["misfit"] = grav_df["observed_grav"]
grav_df["reg"] = 0
grav_df["res"] = grav_df.misfit

# set kwargs to pass to the inversion
kwargs = {
    "input_grav_column": "observed_grav",
    "prism_layer": starting_prisms,
    "deriv_type": "annulus",
    "solver_damping": 0.1,
    "zref": zref,
    "density_contrast": density_contrast,
    # set stopping criteria
    "max_iterations": 30,
    "l2_norm_tolerance": 0.5,
    "delta_l2_norm_tolerance": 1.005,
}

# run inversion, calculate the score
score = cross_validation.constraints_cv_score(
    grav=grav_df,
    constraints=constraint_points,
    **kwargs,
)
score
[7]:
15.307584112536286

Cross Validation#

Lets see if we can improve the score with other values for the density contrast.

Now we can repeat this with a range of density contrast values to find the optimal (lowest) score. But remember we need to recreate the starting model with each density value.

[11]:
# set Python's logging level
logger = logging.getLogger()
logger.setLevel(logging.WARNING)

# set which density contrasts to include
density_contrasts = np.linspace(2400, 3000, 8)

# run inversions and collect scores
scores = []
for density_contrast in tqdm(density_contrasts, desc="Density values"):
    # re-calculate density grid with new density contrast
    density = xr.where(starting_topography >= zref, density_contrast, -density_contrast)

    # create layer of prisms
    starting_prisms = utils.grids_to_prisms(
        starting_topography,
        zref,
        density=density,
    )

    # calculate forward gravity of starting prism layer
    # this isn't exactly necessary since the prism's thickness is 0 so gravity will be 0
    grav_df["starting_grav"] = starting_prisms.prism_layer.gravity(
        coordinates=(
            grav_df.easting,
            grav_df.northing,
            grav_df.upward,
        ),
        field="g_z",
        progressbar=False,
    )

    # calculate misfit as observed - starting
    grav_df["misfit"] = grav_df.observed_grav - grav_df.starting_grav

    # set the regional misfit
    grav_df["reg"] = 0

    # remove the regional from the misfit to get the residual
    grav_df["res"] = grav_df.misfit - grav_df.reg

    # update density contrast value in kwargs
    kwargs["density_contrast"] = density_contrast

    # update starting model in kwargs
    kwargs["prism_layer"] = starting_prisms

    # run cross validation
    score = cross_validation.constraints_cv_score(
        grav=grav_df,
        constraints=constraint_points,
        **kwargs,
    )
    scores.append(score)

# print damping and score pairs
for density_contrast, score in zip(density_contrasts, scores):
    print(f"Density contrast: {density_contrast} -> Score: {score}")

best_idx = np.argmin(scores)
best_score = scores[best_idx]
best_density_contrast = density_contrasts[best_idx]
print(f"Best score of {best_score} with density contrast={best_density_contrast}")
Density contrast: 2400.0 -> Score: 12.146992321104095
Density contrast: 2485.714285714286 -> Score: 9.68314924776792
Density contrast: 2571.4285714285716 -> Score: 7.729267747109761
Density contrast: 2657.142857142857 -> Score: 6.239270135848094
Density contrast: 2742.8571428571427 -> Score: 5.404648028861726
Density contrast: 2828.5714285714284 -> Score: 5.369994982535395
Density contrast: 2914.285714285714 -> Score: 6.029795434562008
Density contrast: 3000.0 -> Score: 7.111292687879829
Best score of 5.369994982535395 with density contrast=2828.5714285714284
[12]:
plotting.plot_cv_scores(
    scores,
    density_contrasts,
    param_name="Density contrast",
    # logx=True,
    # logy=True,
)
../_images/user_guide_density_cross_validation_17_0.png

Run inversion with optimal value#

[13]:
# set Python's logging level
logger = logging.getLogger()
logger.setLevel(logging.INFO)

# re-calculate density grid with the best density contrast
density_contrast = best_density_contrast
density = xr.where(starting_topography >= zref, density_contrast, -density_contrast)

# create layer of prisms
starting_prisms = utils.grids_to_prisms(
    starting_topography,
    zref,
    density=density,
)

# calculate forward gravity of starting prism layer
grav_df["starting_grav"] = starting_prisms.prism_layer.gravity(
    coordinates=(
        grav_df.easting,
        grav_df.northing,
        grav_df.upward,
    ),
    field="g_z",
    progressbar=False,
)

# calculate misfit as observed - starting
grav_df["misfit"] = grav_df.observed_grav - grav_df.starting_grav

# set the regional misfit
grav_df["reg"] = 0

# remove the regional from the misfit to get the residual
grav_df["res"] = grav_df.misfit - grav_df.reg

# update starting model in kwargs
kwargs["prism_layer"] = starting_prisms

# make new kwargs without zref or density
new_kwargs = {
    key: value
    for key, value in kwargs.items()
    if key
    not in [
        "zref",
        "density_contrast",
    ]
}
results = inversion.run_inversion(
    input_grav=grav_df,
    plot_convergence=True,
    zref=kwargs.get("zref"),
    density_contrast=density_contrast,
    **new_kwargs,
)

# collect the results
topo_results, grav_results, parameters, elapsed_time = results
INFO:root:starting inversion
INFO:root:extracted prism spacing is 1000.0
INFO:root:
 ####################################
 iteration 1
INFO:root:Layer correction median: -5.9354 m, RMSE:34.004 m
INFO:root:updated misfit RMSE: 3.8649
INFO:root:updated L2-norm: 1.9659, tolerance: 0.5
INFO:root:updated delta L2-norm : 1.3605, tolerance: 1.005
INFO:root:
 ####################################
 iteration 2
INFO:root:Layer correction median: 11.9234 m, RMSE:17.9204 m
INFO:root:updated misfit RMSE: 2.2018
INFO:root:updated L2-norm: 1.4838, tolerance: 0.5
INFO:root:updated delta L2-norm : 1.3249, tolerance: 1.005
INFO:root:
 ####################################
 iteration 3
INFO:root:Layer correction median: 6.2933 m, RMSE:9.8122 m
INFO:root:updated misfit RMSE: 1.3524
INFO:root:updated L2-norm: 1.1629, tolerance: 0.5
INFO:root:updated delta L2-norm : 1.276, tolerance: 1.005
INFO:root:
 ####################################
 iteration 4
INFO:root:Layer correction median: 3.4011 m, RMSE:5.6637 m
INFO:root:updated misfit RMSE: 0.9097
INFO:root:updated L2-norm: 0.9538, tolerance: 0.5
INFO:root:updated delta L2-norm : 1.2192, tolerance: 1.005
INFO:root:
 ####################################
 iteration 5
INFO:root:Layer correction median: 1.7899 m, RMSE:3.4841 m
INFO:root:updated misfit RMSE: 0.6706
INFO:root:updated L2-norm: 0.8189, tolerance: 0.5
INFO:root:updated delta L2-norm : 1.1647, tolerance: 1.005
INFO:root:
 ####################################
 iteration 6
INFO:root:Layer correction median: 0.9634 m, RMSE:2.2937 m
INFO:root:updated misfit RMSE: 0.5343
INFO:root:updated L2-norm: 0.7309, tolerance: 0.5
INFO:root:updated delta L2-norm : 1.1204, tolerance: 1.005
INFO:root:
 ####################################
 iteration 7
INFO:root:Layer correction median: 0.5349 m, RMSE:1.6097 m
INFO:root:updated misfit RMSE: 0.4512
INFO:root:updated L2-norm: 0.6717, tolerance: 0.5
INFO:root:updated delta L2-norm : 1.0882, tolerance: 1.005
INFO:root:
 ####################################
 iteration 8
INFO:root:Layer correction median: 0.3057 m, RMSE:1.1937 m
INFO:root:updated misfit RMSE: 0.397
INFO:root:updated L2-norm: 0.6301, tolerance: 0.5
INFO:root:updated delta L2-norm : 1.066, tolerance: 1.005
INFO:root:
 ####################################
 iteration 9
INFO:root:Layer correction median: 0.1963 m, RMSE:0.9262 m
INFO:root:updated misfit RMSE: 0.3595
INFO:root:updated L2-norm: 0.5996, tolerance: 0.5
INFO:root:updated delta L2-norm : 1.0509, tolerance: 1.005
INFO:root:
 ####################################
 iteration 10
INFO:root:Layer correction median: 0.1286 m, RMSE:0.7457 m
INFO:root:updated misfit RMSE: 0.3321
INFO:root:updated L2-norm: 0.5763, tolerance: 0.5
INFO:root:updated delta L2-norm : 1.0405, tolerance: 1.005
INFO:root:
 ####################################
 iteration 11
INFO:root:Layer correction median: 0.0824 m, RMSE:0.619 m
INFO:root:updated misfit RMSE: 0.3111
INFO:root:updated L2-norm: 0.5578, tolerance: 0.5
INFO:root:updated delta L2-norm : 1.0332, tolerance: 1.005
INFO:root:
 ####################################
 iteration 12
INFO:root:Layer correction median: 0.0658 m, RMSE:0.5272 m
INFO:root:updated misfit RMSE: 0.2944
INFO:root:updated L2-norm: 0.5426, tolerance: 0.5
INFO:root:updated delta L2-norm : 1.028, tolerance: 1.005
INFO:root:
 ####################################
 iteration 13
INFO:root:Layer correction median: 0.0592 m, RMSE:0.4588 m
INFO:root:updated misfit RMSE: 0.2806
INFO:root:updated L2-norm: 0.5298, tolerance: 0.5
INFO:root:updated delta L2-norm : 1.0242, tolerance: 1.005
INFO:root:
 ####################################
 iteration 14
INFO:root:Layer correction median: 0.0541 m, RMSE:0.4065 m
INFO:root:updated misfit RMSE: 0.269
INFO:root:updated L2-norm: 0.5187, tolerance: 0.5
INFO:root:updated delta L2-norm : 1.0214, tolerance: 1.005
INFO:root:
 ####################################
 iteration 15
INFO:root:Layer correction median: 0.0373 m, RMSE:0.3656 m
INFO:root:updated misfit RMSE: 0.2589
INFO:root:updated L2-norm: 0.5089, tolerance: 0.5
INFO:root:updated delta L2-norm : 1.0193, tolerance: 1.005
INFO:root:
 ####################################
 iteration 16
INFO:root:Layer correction median: 0.028 m, RMSE:0.333 m
INFO:root:updated misfit RMSE: 0.2501
INFO:root:updated L2-norm: 0.5001, tolerance: 0.5
INFO:root:updated delta L2-norm : 1.0176, tolerance: 1.005
INFO:root:
 ####################################
 iteration 17
INFO:root:Layer correction median: 0.0313 m, RMSE:0.3065 m
INFO:root:updated misfit RMSE: 0.2421
INFO:root:updated L2-norm: 0.492, tolerance: 0.5
INFO:root:updated delta L2-norm : 1.0163, tolerance: 1.005
INFO:root:
Inversion terminated after 17 iterations because L2-norm (0.49203814399916346) was less then set tolerance: 0.5
Change parameter 'l2_norm_tolerance' if desired.
../_images/user_guide_density_cross_validation_19_3.png
[14]:
plotting.plot_inversion_results(
    grav_results,
    topo_results,
    parameters,
    region,
    iters_to_plot=4,
    plot_iter_results=True,
    plot_topo_results=True,
    plot_grav_results=True,
)
../_images/user_guide_density_cross_validation_20_0.png
../_images/user_guide_density_cross_validation_20_1.png
../_images/user_guide_density_cross_validation_20_2.png
[33]:
final_topography = topo_results.set_index(["northing", "easting"]).to_xarray().topo

_ = polar_utils.grd_compare(
    true_topography,
    final_topography,
    # plot_type="xarray",
    plot=True,
    grid1_name="True topography",
    grid2_name="Inverted topography",
    robust=True,
    hist=True,
    inset=False,
    verbose="q",
    title="difference",
    grounding_line=False,
    reverse_cpt=True,
    cmap="rain",
    # diff_lims=(-20, 20),
)
../_images/user_guide_density_cross_validation_21_0.png

Run inversion with poor choice of density contrast#

[23]:
# set Python's logging level
logger = logging.getLogger()
logger.setLevel(logging.INFO)

# re-calculate density grid with the best density contrast
density_contrast = 4000
density = xr.where(starting_topography >= zref, density_contrast, -density_contrast)

# create layer of prisms
starting_prisms = utils.grids_to_prisms(
    starting_topography,
    zref,
    density=density,
)

# calculate forward gravity of starting prism layer
grav_df["starting_grav"] = starting_prisms.prism_layer.gravity(
    coordinates=(
        grav_df.easting,
        grav_df.northing,
        grav_df.upward,
    ),
    field="g_z",
    progressbar=False,
)

# calculate misfit as observed - starting
grav_df["misfit"] = grav_df.observed_grav - grav_df.starting_grav

# set the regional misfit
grav_df["reg"] = 0

# remove the regional from the misfit to get the residual
grav_df["res"] = grav_df.misfit - grav_df.reg

# update starting model in kwargs
kwargs["prism_layer"] = starting_prisms

# make new kwargs without zref or density
new_kwargs = {
    key: value
    for key, value in kwargs.items()
    if key
    not in [
        "zref",
        "density_contrast",
    ]
}
results = inversion.run_inversion(
    input_grav=grav_df,
    plot_convergence=True,
    zref=kwargs.get("zref"),
    density_contrast=density_contrast,
    **new_kwargs,
)

# collect the results
topo_results, grav_results, parameters, elapsed_time = results
INFO:root:starting inversion
INFO:root:extracted prism spacing is 1000.0
INFO:root:
 ####################################
 iteration 1
INFO:root:Layer correction median: -5.9495 m, RMSE:35.2102 m
INFO:root:updated misfit RMSE: 2.3872
INFO:root:updated L2-norm: 1.5451, tolerance: 0.5
INFO:root:updated delta L2-norm : 1.7311, tolerance: 1.005
INFO:root:
 ####################################
 iteration 2
INFO:root:Layer correction median: 7.3292 m, RMSE:11.4953 m
INFO:root:updated misfit RMSE: 0.9924
INFO:root:updated L2-norm: 0.9962, tolerance: 0.5
INFO:root:updated delta L2-norm : 1.551, tolerance: 1.005
INFO:root:
 ####################################
 iteration 3
INFO:root:Layer correction median: 2.2817 m, RMSE:4.3921 m
INFO:root:updated misfit RMSE: 0.5644
INFO:root:updated L2-norm: 0.7512, tolerance: 0.5
INFO:root:updated delta L2-norm : 1.326, tolerance: 1.005
INFO:root:
 ####################################
 iteration 4
INFO:root:Layer correction median: 0.7651 m, RMSE:2.0991 m
INFO:root:updated misfit RMSE: 0.4096
INFO:root:updated L2-norm: 0.64, tolerance: 0.5
INFO:root:updated delta L2-norm : 1.1739, tolerance: 1.005
INFO:root:
 ####################################
 iteration 5
INFO:root:Layer correction median: 0.2847 m, RMSE:1.2277 m
INFO:root:updated misfit RMSE: 0.3386
INFO:root:updated L2-norm: 0.5819, tolerance: 0.5
INFO:root:updated delta L2-norm : 1.0999, tolerance: 1.005
INFO:root:
 ####################################
 iteration 6
INFO:root:Layer correction median: 0.1194 m, RMSE:0.83 m
INFO:root:updated misfit RMSE: 0.2983
INFO:root:updated L2-norm: 0.5462, tolerance: 0.5
INFO:root:updated delta L2-norm : 1.0653, tolerance: 1.005
INFO:root:
 ####################################
 iteration 7
INFO:root:Layer correction median: 0.0847 m, RMSE:0.6204 m
INFO:root:updated misfit RMSE: 0.2716
INFO:root:updated L2-norm: 0.5212, tolerance: 0.5
INFO:root:updated delta L2-norm : 1.048, tolerance: 1.005
INFO:root:
 ####################################
 iteration 8
INFO:root:Layer correction median: 0.058 m, RMSE:0.4976 m
INFO:root:updated misfit RMSE: 0.2518
INFO:root:updated L2-norm: 0.5018, tolerance: 0.5
INFO:root:updated delta L2-norm : 1.0385, tolerance: 1.005
INFO:root:
 ####################################
 iteration 9
INFO:root:Layer correction median: 0.0425 m, RMSE:0.4188 m
INFO:root:updated misfit RMSE: 0.2361
INFO:root:updated L2-norm: 0.4859, tolerance: 0.5
INFO:root:updated delta L2-norm : 1.0328, tolerance: 1.005
INFO:root:
Inversion terminated after 9 iterations because L2-norm (0.485883674633997) was less then set tolerance: 0.5
Change parameter 'l2_norm_tolerance' if desired.
../_images/user_guide_density_cross_validation_23_3.png
[24]:
plotting.plot_inversion_results(
    grav_results,
    topo_results,
    parameters,
    region,
    iters_to_plot=4,
    plot_iter_results=True,
    plot_topo_results=True,
    plot_grav_results=True,
)
../_images/user_guide_density_cross_validation_24_0.png
../_images/user_guide_density_cross_validation_24_1.png
../_images/user_guide_density_cross_validation_24_2.png
[28]:
final_topography = topo_results.set_index(["northing", "easting"]).to_xarray().topo

_ = polar_utils.grd_compare(
    true_topography,
    final_topography,
    # plot_type="xarray",
    plot=True,
    grid1_name="True topography",
    grid2_name="Inverted topography",
    robust=True,
    hist=True,
    inset=False,
    verbose="q",
    title="difference",
    grounding_line=False,
    reverse_cpt=True,
    cmap="rain",
    # diff_lims=(-20, 20),
)
../_images/user_guide_density_cross_validation_25_0.png

This inversion with a density contrast (4000) instead of the optimal density contrast of ~2830, shows that too high of a density contrast will result in a subdued topography, even though the misfit is minimized equally well. This depicts the non-uniqueness of this style of gravity inversion.