write_after_distillation_data_split

autoplex.fitting.common.utils.write_after_distillation_data_split(distillation, f_max, split_ratio, vasp_ref_name='vasp_ref.extxyz', train_name='train.extxyz', test_name='test.extxyz', force_label='REF_forces')

Write train.extxyz and test.extxyz after data distillation and split.

Reject structures with large force components and split dataset into training and test datasets.

Parameters:
  • distillation (bool) – Whether to apply data distillation.

  • f_max (float) – Maximum allowed force component in the data set.

  • split_ratio (float) – Ratio used to divide the data into the training set and the test set. A value of 0.1 means that the ratio of the training set to the test set is 9:1.

  • vasp_ref_name (str) – Name of the VASP reference data file.

  • train_name (str) – Name of the training data file.

  • test_name (str) – Name of the test data file.

  • force_label (str) – Label of the force entries.

Return type:

None
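
Example:

The following is a minimal, self-contained sketch of the filter-then-split idea described above. It is not the autoplex implementation. The helper names (max_force_component, distill_and_split), the dictionary layout of a structure, the use of <= for the force threshold, and the seeded random shuffle are all assumptions made for illustration only.

```python
import random


def max_force_component(forces):
    """Return the largest absolute Cartesian force component of one structure."""
    return max(abs(component) for atom in forces for component in atom)


def distill_and_split(structures, distillation, f_max, split_ratio, seed=0):
    """Hypothetical helper: reject high-force structures, then split the rest.

    Returns a (train, test) tuple of lists.
    """
    if distillation:
        # Assumption: a structure is kept if no force component exceeds f_max.
        structures = [
            s for s in structures if max_force_component(s["forces"]) <= f_max
        ]

    # Assumption: the split is random; a fixed seed keeps it reproducible.
    shuffled = list(structures)
    random.Random(seed).shuffle(shuffled)

    # split_ratio is the share of structures that goes to the test set.
    n_test = round(len(shuffled) * split_ratio)
    return shuffled[n_test:], shuffled[:n_test]


# Twelve toy one-atom structures whose largest force component is 0.1 * id.
structures = [
    {"id": i, "forces": [[0.1 * i, 0.0, -0.05 * i]]} for i in range(12)
]

train, test = distill_and_split(
    structures, distillation=True, f_max=0.95, split_ratio=0.1
)
print(len(train), len(test))
```

In this toy data set the structures with id 10 and 11 exceed f_max and are dropped, and the remaining ten are divided 9:1, which matches the split_ratio of 0.1 described in the parameter list.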