invert4geom.random_split_test_train


random_split_test_train(data_df, test_size=0.3, random_state=10, coord_names=('easting', 'northing'), plot=False)

Randomly split data into training and testing sets, with the fraction of points placed in the test set given by test_size.

Parameters:
  • data_df (DataFrame) – data to be split; must contain the columns named by the parameter coord_names

  • test_size (float) – fraction of points (as a decimal) to put in the testing set, by default 0.3

  • random_state (int) – seed that controls the random splitting, by default 10

  • coord_names (tuple[str, str]) – names of the coordinate columns in the dataframe, by default ("easting", "northing")

  • plot (bool) – whether to plot the results, by default False

Returns:

Dataframe with a new boolean column "test" indicating whether each row is in the training or the testing set.

Return type:

DataFrame
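
Example:

The following is a minimal sketch of the idea behind a random test/train split. It is not the invert4geom implementation, which may work differently internally. The helper sketch_random_split, the synthetic grid, and the gravity_anomaly column are all hypothetical and use only NumPy and pandas. It assumes that a True value in the "test" column marks a testing point. With invert4geom installed, the equivalent call following the signature above would be invert4geom.random_split_test_train(data, test_size=0.3, random_state=10).

```python
import numpy as np
import pandas as pd


def sketch_random_split(data_df, test_size=0.3, random_state=10):
    """Hypothetical stand-in: flag a random fraction of rows as test points."""
    df = data_df.copy()
    rng = np.random.default_rng(random_state)
    n_test = int(round(test_size * len(df)))
    # Pick n_test distinct row positions without replacement.
    test_positions = rng.choice(len(df), size=n_test, replace=False)
    # Assumption: True marks a testing row, False a training row.
    mask = np.zeros(len(df), dtype=bool)
    mask[test_positions] = True
    df["test"] = mask
    return df


# Synthetic points on a 10 x 10 grid (made-up values for illustration).
easting, northing = np.meshgrid(np.arange(10.0), np.arange(10.0))
data = pd.DataFrame(
    {
        "easting": easting.ravel(),
        "northing": northing.ravel(),
        "gravity_anomaly": np.zeros(100),
    }
)

split = sketch_random_split(data, test_size=0.3, random_state=10)

# Separate the two subsets using the boolean "test" column.
train_df = split[~split["test"]]
test_df = split[split["test"]]
print(len(train_df), len(test_df))  # prints: 70 30
```

Because the split is controlled by random_state, calling the sketch again with the same seed flags the same rows, which is what makes a fixed seed useful for reproducible cross-validation.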