sampling
- autoplex.data.common.jobs.sampling(selection_method='random', num_of_selection=5, bcur_params=None, dir=None, structure=None, traj_info=None, isol_es=None, random_seed=None)
Job to sample training configurations from MD or RSS trajectories.
- Parameters:
selection_method (Literal['cur', 'bcur', 'random', 'uniform']) –
- Method for selecting samples. Options include:
'cur': Pure CUR selection.
'bcur': Boltzmann flat histogram in enthalpy, then CUR.
'random': Random selection.
'uniform': Uniform selection.
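To illustrate the difference between the two simplest options, here is a minimal sketch in plain Python. It is not the autoplex implementation: the helper `select_indices` is hypothetical, and reading 'uniform' as "evenly spaced frames from first to last" is an assumption.

```python
import random


def select_indices(n_frames, num_of_selection=5, method="random", random_seed=None):
    """Pick frame indices from a trajectory of n_frames frames (hypothetical helper)."""
    if num_of_selection >= n_frames:
        # Nothing to choose: keep every frame.
        return list(range(n_frames))
    if method == "random":
        # A seeded generator makes the selection reproducible.
        rng = random.Random(random_seed)
        return sorted(rng.sample(range(n_frames), num_of_selection))
    if method == "uniform":
        # Assumption: 'uniform' means evenly spaced indices, first to last frame.
        if num_of_selection == 1:
            return [0]
        step = (n_frames - 1) / (num_of_selection - 1)
        return [round(i * step) for i in range(num_of_selection)]
    raise ValueError(f"unsupported method in this sketch: {method!r}")


uniform_pick = select_indices(101, 5, method="uniform")
random_pick = select_indices(101, 5, method="random", random_seed=42)
```

With 101 frames, the 'uniform' branch returns indices 0, 25, 50, 75 and 100, while the 'random' branch returns five distinct indices that repeat only when the same seed is reused.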
num_of_selection (int, optional) – Number of selections to be made. Default is 5.
bcur_params (dict, optional) –
Parameters for Boltzmann CUR selection. The default dictionary includes:
- 'soap_paras': SOAP descriptor parameters:
  - 'l_max': int, Maximum degree of spherical harmonics (default 8).
  - 'n_max': int, Maximum number of radial basis functions (default 8).
  - 'atom_sigma': float, Width of Gaussian smearing (default 0.75).
  - 'cutoff': float, Radial cutoff distance (default 5.5).
  - 'cutoff_transition_width': float, Width of the transition region (default 1.0).
  - 'zeta': float, Exponent for dot-product SOAP kernel (default 4.0).
  - 'average': bool, Whether to average the SOAP vectors (default True).
  - 'species': bool, Whether to consider species information (default True).
- 'kT': float, Temperature in eV for Boltzmann weighting (default 0.3).
- 'frac_of_bcur': float, Fraction of Boltzmann CUR selections (default 0.1).
- 'bolt_max_num': int, Maximum number of Boltzmann selections (default 3000).
- 'kernel_exp': float, Exponent for the kernel (default 4.0).
- 'energy_label': str, Label for the energy data (default 'energy').
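Written out as a Python dictionary, the defaults listed above might look like the sketch below. The nesting of the SOAP keys under 'soap_paras' is inferred from the listing, and the names `DEFAULT_BCUR_PARAMS` and `custom_bcur_params` are illustrative only. Whether a partial dictionary is merged with the defaults is not stated here, so the sketch builds a complete dictionary before changing values.

```python
import copy

# Defaults as documented for bcur_params (nesting under 'soap_paras' is assumed).
DEFAULT_BCUR_PARAMS = {
    "soap_paras": {
        "l_max": 8,
        "n_max": 8,
        "atom_sigma": 0.75,
        "cutoff": 5.5,
        "cutoff_transition_width": 1.0,
        "zeta": 4.0,
        "average": True,
        "species": True,
    },
    "kT": 0.3,  # eV
    "frac_of_bcur": 0.1,
    "bolt_max_num": 3000,
    "kernel_exp": 4.0,
    "energy_label": "energy",
}

# Deep-copy before editing so the nested SOAP dictionary is not shared.
custom_bcur_params = copy.deepcopy(DEFAULT_BCUR_PARAMS)
custom_bcur_params["kT"] = 0.1
custom_bcur_params["soap_paras"]["cutoff"] = 6.0
```

The deep copy matters because a shallow copy would leave both dictionaries pointing at the same 'soap_paras' object, so editing the cutoff in one would silently change the other.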
dir (str, optional) – Directory containing trajectory files for MD/RSS simulations. Default is None.
structure (list[Structure], optional) – List of structures for sampling. Default is None.
traj_info (list[dict[str, Union[str, float]]], optional) – List of dictionaries containing trajectory information. Each dictionary should have keys 'traj_path' and 'pressure'. Default is None.
isol_es (dict, optional) – Dictionary of isolated-atom energies for each species. Required for the Boltzmann CUR ('bcur') selection method. Default is None.
random_seed (int, optional) – Seed for the random number generator. Default is None.
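As an illustration of the traj_info parameter, the sketch below builds a list with the two documented keys and checks its shape before use. The file paths and pressure values are placeholders, and `check_traj_info` is a hypothetical helper, not part of autoplex.

```python
# Placeholder entries: paths and pressure values are made up for illustration.
traj_info = [
    {"traj_path": "rss_run/traj_0.extxyz", "pressure": 0.0},
    {"traj_path": "rss_run/traj_1.extxyz", "pressure": 1.5},
]


def check_traj_info(entries):
    """Verify that each entry has a str 'traj_path' and a numeric 'pressure'."""
    for i, entry in enumerate(entries):
        missing = {"traj_path", "pressure"} - entry.keys()
        if missing:
            raise KeyError(f"entry {i} is missing keys: {sorted(missing)}")
        if not isinstance(entry["traj_path"], str):
            raise TypeError(f"entry {i}: 'traj_path' must be a str")
        if not isinstance(entry["pressure"], (int, float)):
            raise TypeError(f"entry {i}: 'pressure' must be a number")
    return True


traj_info_ok = check_traj_info(traj_info)
```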
- Returns:
The selected atoms. These are copies of the atoms in the input list.
- Return type:
list of ase.Atoms
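A call using the signature above could be sketched as follows. The keyword names come from the signature. The wrapper `make_sampling_job`, the seed value, and the expectation that the call returns a job object to be run by a workflow manager (suggested by the word "Job" in the description) are assumptions.

```python
# Keyword arguments taken from the documented signature; values are examples.
sampling_kwargs = {
    "selection_method": "random",
    "num_of_selection": 5,
    "random_seed": 42,
}


def make_sampling_job(structures):
    """Build a sampling job from a list of structures (hypothetical wrapper)."""
    # Imported lazily so the sketch can be read without autoplex installed.
    from autoplex.data.common.jobs import sampling

    # Assumption: the call returns a job object rather than the sampled atoms.
    return sampling(structure=structures, **sampling_kwargs)
```

If the 'bcur' method is chosen instead, the parameter list indicates that isol_es and, optionally, bcur_params would also need to be supplied.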