invert4geom.split_test_train
- split_test_train(data_df, method, spacing=None, shape=None, n_splits=5, random_state=10, coord_names=('easting', 'northing'), plot=False)
Split data into training and testing sets using either the KFold (optionally blocked) or LeaveOneOut method.
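As a rough illustration of the fold layout this function describes, the sketch below labels each row of a dataframe as "train" or "test" in one column per fold. It is not invert4geom's implementation: `label_kfolds` is a hypothetical helper written with plain NumPy and pandas, it covers only an unblocked K-Fold split (no `spacing`, `shape`, or LeaveOneOut handling), and the synthetic points are made up for the example.

```python
import numpy as np
import pandas as pd


def label_kfolds(data_df, n_splits=5, random_state=10):
    """Hypothetical helper: add fold_0 ... fold_{n_splits-1} columns.

    Each new column holds "test" for the rows held out in that fold
    and "train" for all other rows.
    """
    df = data_df.copy()
    rng = np.random.default_rng(random_state)
    # Shuffle the row positions, then cut them into n_splits near-equal groups.
    shuffled = rng.permutation(len(df))
    for i, test_idx in enumerate(np.array_split(shuffled, n_splits)):
        labels = np.full(len(df), "train", dtype=object)
        labels[test_idx] = "test"
        df[f"fold_{i}"] = labels
    return df


# Synthetic points; the column names match the default coord_names.
points = pd.DataFrame(
    {
        "easting": np.arange(10, dtype=float),
        "northing": np.arange(10, dtype=float) * 2.0,
    }
)
folds = label_kfolds(points, n_splits=5)
```

In this sketch every row is a test point in exactly one fold and a training point in the others, which is the property a K-Fold split relies on. A blocked variant would assign whole spatial blocks, rather than individual rows, to each fold.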
- Parameters:
data_df (pandas.DataFrame) – dataframe with coordinate columns set by coord_names
method (str) – choose between “LeaveOneOut” or “KFold” methods.
spacing (float | tuple[float, float] | None, optional) – grid spacing to use for Block K-Folds, by default None
shape (tuple[float, float] | None, optional) – number of blocks to use for Block K-Folds, by default None
n_splits (int, optional) – number of folds to make for the K-Folds method, by default 5
random_state (int, optional) – random state used for both methods, by default 10
coord_names (tuple[str, str], optional) – names of the coordinate columns in the dataframe, by default (“easting”, “northing”)
plot (bool, optional) – plot the separated training and testing dataset, by default False
- Returns:
a dataset with a new column for each fold, named fold_0, fold_1, etc., whose values are “train” or “test”
- Return type: