invert4geom.split_test_train

split_test_train(data_df, method, spacing=None, shape=None, n_splits=5, random_state=10, coord_names=('easting', 'northing'), plot=False)

Split data into training and testing sets using either the KFold (optionally blocked) or LeaveOneOut method.

Parameters:
  • data_df (DataFrame) – dataframe with coordinate columns set by coord_names

  • method (str) – splitting method to use, either “LeaveOneOut” or “KFold”

  • spacing (float | tuple[float, float] | None) – grid spacing to use for Block K-Folds, by default None

  • shape (tuple[float, float] | None) – number of blocks to use for Block K-Folds, by default None

  • n_splits (int) – number of folds to make for the K-Folds method, by default 5

  • random_state (int) – random state used for both methods, by default 10

  • coord_names (tuple[str, str]) – names of the coordinate columns in the dataframe, by default (“easting”, “northing”)

  • plot (bool) – plot the separated training and testing dataset, by default False
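
The spacing parameter sets the block size for blocked K-Folds. As a rough illustration of what grouping points into blocks of a given spacing can look like, here is a minimal standard-library sketch. It is not the library's implementation: the helper name block_ids, the anchoring of the grid at the minimum coordinates, and the reading of a tuple spacing as (easting spacing, northing spacing) are all assumptions made for this example.

```python
import math


def block_ids(points, spacing):
    """Hypothetical helper: map each (easting, northing) point to a block index pair.

    Assumes the block grid starts at the minimum easting and northing, and that
    a tuple spacing means (easting spacing, northing spacing).
    """
    if isinstance(spacing, (int, float)):
        dx = dy = spacing
    else:
        dx, dy = spacing
    x_min = min(x for x, _ in points)
    y_min = min(y for _, y in points)
    # Points that share a block index pair fall in the same spatial block.
    return [
        (math.floor((x - x_min) / dx), math.floor((y - y_min) / dy))
        for x, y in points
    ]


points = [(0.0, 0.0), (400.0, 100.0), (1200.0, 300.0), (1900.0, 1500.0)]
blocks = block_ids(points, spacing=1000.0)
blocks_rect = block_ids(points, spacing=(1000.0, 2000.0))
```

In a blocked split, whole blocks rather than individual points would then be assigned to the test set of each fold, so that nearby points do not end up on both sides of a split.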

Returns:

a dataframe with one new column per fold, named fold_0, fold_1, etc., where each row's value is either “train” or “test”

Return type:

DataFrame
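
To make the described return layout concrete, the following standard-library sketch builds the same kind of fold_0, fold_1, ... columns for a small set of points. It only mimics the column layout stated above; the helper add_fold_columns, the use of plain dictionaries in place of a DataFrame, and the shuffling scheme are assumptions for illustration, not the function's actual code.

```python
import random


def add_fold_columns(rows, n_splits=5, random_state=10):
    """Hypothetical helper: label every row 'train' or 'test' for each fold."""
    indices = list(range(len(rows)))
    random.Random(random_state).shuffle(indices)
    # Deal the shuffled indices into n_splits test sets of nearly equal size.
    test_sets = [set(indices[i::n_splits]) for i in range(n_splits)]
    labelled = []
    for i, row in enumerate(rows):
        new_row = dict(row)
        for fold, test_idx in enumerate(test_sets):
            new_row[f"fold_{fold}"] = "test" if i in test_idx else "train"
        labelled.append(new_row)
    return labelled


# 20 points on a small grid, using the default coordinate column names.
points = [
    {"easting": float(x), "northing": float(y)}
    for x in range(5)
    for y in range(4)
]
labelled = add_fold_columns(points, n_splits=5)
```

With a layout like this, the training rows for a given fold can be selected by filtering on that fold's column, for example keeping the rows where fold_0 equals "train".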