invert4geom.full_workflow_uncertainty_loop

full_workflow_uncertainty_loop(inversion_object, runs, fname=None, sample_gravity=False, gravity_filter_width=None, constraints_df=None, sample_constraints=False, starting_topography_parameter_dict=None, regional_misfit_parameter_dict=None, parameter_dict=None, create_starting_topography=False, calculate_starting_gravity=False, calculate_regional_misfit=False, regional_grav_kwargs=None, starting_topography_kwargs=None)

Run a series of inversions (N=runs) and save the results of each inversion to pickle files whose names start with fname. If the files already exist, the saved results are loaded and returned instead of re-running the inversion. Choose which variables to include in the sampling and whether or not to run a damping value cross-validation for each inversion.

Feed the returned values into the function merged_stats to compute cell-wise statistics on the resulting ensemble of starting topography models, inverted topography models, and gravity anomalies.
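To illustrate what "cell-wise" statistics on an ensemble mean, here is a minimal sketch using plain NumPy arrays. It is not the implementation of merged_stats; the function name cellwise_stats and the small example grids are hypothetical and only show the idea of reducing a stack of same-shaped grids along the ensemble axis.

```python
import numpy as np


def cellwise_stats(ensemble):
    """Return the per-cell mean and standard deviation of a list of 2D grids.

    Hypothetical helper for illustration only; it assumes every grid in the
    ensemble has the same shape.
    """
    # Stack the grids so that axis 0 indexes the ensemble member (run).
    stacked = np.stack(ensemble, axis=0)
    # Reduce along the ensemble axis, leaving one value per grid cell.
    return stacked.mean(axis=0), stacked.std(axis=0)


# Two made-up "inverted topography" grids standing in for two runs.
grids = [
    np.array([[1.0, 2.0], [3.0, 4.0]]),
    np.array([[3.0, 2.0], [5.0, 8.0]]),
]
mean_grid, std_grid = cellwise_stats(grids)
```

Cells where the runs agree (the top-right cell here) get a standard deviation of zero, while cells where the runs differ get a larger spread.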

Sampling of data (gravity and constraints) uses the “uncert” column of each dataframe: every value is randomly drawn from a normal distribution with the data value as the mean and the uncertainty value as the standard deviation. The randomness is controlled by a seed equal to the run number, so it changes with every run, and the same run always produces the same sampling. This allows the number of runs to be increased and this function to be run again with the same filename to continue the stochastic ensemble. This only works with data sampling, not parameter sampling.
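The seeded data sampling described above can be sketched as follows. This is an illustration under stated assumptions, not the library's code: the helper sample_column and the column name “gravity_anomaly” are hypothetical, and only the “uncert” column name comes from the description.

```python
import numpy as np
import pandas as pd


def sample_column(df, column, run, uncert_column="uncert"):
    """Return a copy of df with `column` resampled from a normal distribution.

    Hypothetical helper: the mean is the data value, the standard deviation is
    the value in `uncert_column`, and the random seed is the run number.
    """
    # Seeding with the run number makes each run reproducible on its own.
    rng = np.random.default_rng(seed=run)
    sampled = df.copy()
    sampled[column] = rng.normal(
        loc=df[column].to_numpy(),
        scale=df[uncert_column].to_numpy(),
    )
    return sampled


# Made-up data; "gravity_anomaly" is an assumed column name.
data = pd.DataFrame(
    {"gravity_anomaly": [10.0, 20.0, 30.0], "uncert": [1.0, 2.0, 0.0]}
)
run_3_first = sample_column(data, "gravity_anomaly", run=3)
run_3_again = sample_column(data, "gravity_anomaly", run=3)
run_4 = sample_column(data, "gravity_anomaly", run=4)
```

Because run 3 gives the same values every time it is sampled, runs that already exist can be kept and only the new run numbers need to be computed when the ensemble is extended.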

Sampling of parameter values is determined by three supplied dictionaries. parameter_dict can contain the parameters density_contrast, zref, and solver_damping. The other two dictionaries, starting_topography_parameter_dict and regional_misfit_parameter_dict, can contain any parameters that are used in create_topography and DatasetAccessorInvert4Geom.regional_separation, respectively. Any parameters in these three dictionaries are sampled with a Latin Hypercube sampling technique, and the sampled values are passed to inversion.run_inversion.

These dictionaries should be formatted as follows: {“parameter_name”: {“distribution”: “normal”, “loc”: 0, “scale”: 1, “log”: True}}. For a “distribution” of “normal”, “loc” is the center of the distribution and “scale” is the standard deviation. For a “distribution” of “uniform”, “loc” is the lower bound and “scale” is the range of the distribution. If “log” is True, “loc” and “scale” refer to the base 10 exponent of the values. For example, a uniform distribution with loc=-4, scale=6 and log=True will sample values between 1e-4 and 1e2.

The Latin Hypercube sampling takes the parameter distributions and the number of runs and creates evenly spaced samples within the distribution bounds. Therefore, unlike the sampling of data, the same run number will only reproduce the same sampling results if the total number of runs is the same. This means that if you are using parameter sampling, you should not reuse the filename to add more iterations to the stochastic ensemble by increasing the number of runs.
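The dictionary format and the dependence on the total number of runs can be illustrated with a small, simplified stand-in for Latin Hypercube sampling. This sketch is not the library's sampler: the functions latin_hypercube_unit and sample_parameters are hypothetical, the strata midpoints are used instead of random positions within each stratum, and the density_contrast values are made up. Only the dictionary keys and the uniform example (loc=-4, scale=6, log=True) come from the description above.

```python
from statistics import NormalDist

import numpy as np


def latin_hypercube_unit(n_runs, n_params, seed=0):
    """Centered Latin Hypercube on [0, 1): one sample per stratum per parameter."""
    rng = np.random.default_rng(seed)
    # Midpoints of n_runs equal-width strata, e.g. 0.125, 0.375, ... for 4 runs.
    midpoints = (np.arange(n_runs) + 0.5) / n_runs
    unit = np.empty((n_runs, n_params))
    for j in range(n_params):
        # Shuffle independently per parameter so strata are paired at random.
        unit[:, j] = rng.permutation(midpoints)
    return unit


def sample_parameters(parameter_dict, n_runs, seed=0):
    """Map unit-hypercube samples onto the distributions in parameter_dict."""
    unit = latin_hypercube_unit(n_runs, len(parameter_dict), seed)
    sampled = {}
    for column, (name, spec) in enumerate(parameter_dict.items()):
        u = unit[:, column]
        if spec["distribution"] == "uniform":
            # "loc" is the lower bound and "scale" is the range.
            values = spec["loc"] + spec["scale"] * u
        elif spec["distribution"] == "normal":
            # "loc" is the center and "scale" is the standard deviation.
            dist = NormalDist(mu=spec["loc"], sigma=spec["scale"])
            values = np.array([dist.inv_cdf(p) for p in u])
        else:
            raise ValueError(f"unsupported distribution: {spec['distribution']}")
        if spec.get("log", False):
            # "loc" and "scale" described base 10 exponents.
            values = 10.0 ** values
        sampled[name] = values
    return sampled


example_dict = {
    "solver_damping": {"distribution": "uniform", "loc": -4, "scale": 6, "log": True},
    "density_contrast": {"distribution": "normal", "loc": 300, "scale": 50},
}
samples_4 = sample_parameters(example_dict, n_runs=4)
samples_8 = sample_parameters(example_dict, n_runs=8)
```

In this sketch the four-run design is not a subset of the eight-run design, because the strata themselves change with the number of runs. That is the reason an existing ensemble cannot simply be extended when parameter sampling is used.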

Parameters:
  • inversion_object (Inversion) – an Inversion object created through

  • runs (int) – number of inversion workflows to run

  • fname (str | None) – file name to use as the root for saving each inversion's results, by default None, in which case it is set to “tmp_{random.randint(0,999)}_stochastic_ensemble”.

  • sample_gravity (bool) – choose to randomly sample the gravity data from a normal distribution with a mean of each data value and a standard deviation given by the column “uncert”, by default False

  • gravity_filter_width (float | None) – the width in meters of a low-pass filter to apply to the gravity data after sampling, by default None

  • constraints_df (DataFrame | None) – dataframe of constraints with columns “easting”, “northing”, and “upward”, by default None

  • sample_constraints (bool) – choose to randomly sample the constraint elevations from a normal distribution with a mean of each data value and a standard deviation given by the column “uncert”, by default False

  • starting_topography_parameter_dict (dict[str, Any] | None) – parameters with their uncertainty distributions used for creating the starting topography model, by default None

  • regional_misfit_parameter_dict (dict[str, Any] | None) – parameters with their uncertainty distributions used for estimating the regional component of the gravity misfit, by default None

  • parameter_dict (dict[str, Any] | None) – parameters with their uncertainty distributions used in the inversion workflow, by default None

  • create_starting_topography (bool) – choose to recreate the starting topography model, by default False

  • calculate_starting_gravity (bool) – choose to recalculate the starting gravity, by default False

  • calculate_regional_misfit (bool) – choose to recalculate the regional gravity, by default False

  • regional_grav_kwargs (dict[str, Any] | None) – kwargs passed to DatasetAccessorInvert4Geom.regional_separation, by default None

  • starting_topography_kwargs (dict[str, Any] | None) – kwargs passed to create_topography, by default None

Return type:

tuple[dict[str, Any], list[DataFrame], list[DataFrame], dict[str, Any]]

Returns:

  • params (list[dict[str, typing.Any]]) – list of inversion parameters dictionaries with added key for the run number

  • grav_datasets (list[xr.Dataset]) – list of gravity datasets from each inversion run

  • prism_dfs (list[pandas.DataFrame]) – list of prism dataframes from each inversion run

  • sampled_params (dict[str, typing.Any]) – dictionary of sampled parameter values from the Latin Hypercube sampling