sampling


autoplex.data.common.jobs.sampling(selection_method='random', num_of_selection=5, bcur_params=None, dir=None, structure=None, traj_info=None, isol_es=None, random_seed=None)

Job to sample training configurations from MD or RSS trajectories.

Parameters:
  • selection_method (Literal['cur', 'bcur', 'random', 'uniform']) –

    Method for selecting samples. Default is 'random'. Options include:
    • 'cur': Pure CUR selection.

    • 'bcur': Boltzmann flat histogram in enthalpy, then CUR.

    • 'random': Random selection.

    • 'uniform': Uniform selection.

  • num_of_selection (int, optional) – Number of selections to be made. Default is 5.

  • bcur_params (dict, optional) –

    Parameters for Boltzmann CUR selection. The default dictionary includes:

    • 'soap_paras': SOAP descriptor parameters:

        • 'l_max': int, Maximum degree of spherical harmonics (default 8).

        • 'n_max': int, Maximum number of radial basis functions (default 8).

        • 'atom_sigma': float, Width of Gaussian smearing (default 0.75).

        • 'cutoff': float, Radial cutoff distance (default 5.5).

        • 'cutoff_transition_width': float, Width of the transition region (default 1.0).

        • 'zeta': float, Exponent for dot-product SOAP kernel (default 4.0).

        • 'average': bool, Whether to average the SOAP vectors (default True).

        • 'species': bool, Whether to consider species information (default True).

    • 'kT': float, Temperature in eV for Boltzmann weighting (default 0.3).

    • 'frac_of_bcur': float, Fraction of Boltzmann CUR selections (default 0.1).

    • 'bolt_max_num': int, Maximum number of Boltzmann selections (default 3000).

    • 'kernel_exp': float, Exponent for the kernel (default 4.0).

    • 'energy_label': str, Label for the energy data (default 'energy').

  • dir (str, optional) – Directory containing trajectory files for MD/RSS simulations. Default is None.

  • structure (list[Structure], optional) – List of structures for sampling. Default is None.

  • traj_info (list[dict[str, Union[str, float]]], optional) – List of dictionaries containing trajectory information. Each dictionary should have the keys 'traj_path' and 'pressure'. Default is None.

  • isol_es (dict, optional) – Dictionary of isolated-atom energies for each species. Required for the Boltzmann-histogram CUR ('bcur') selection method. Default is None.

  • random_seed (int, optional) – Seed for the random number generator, for reproducible selection. Default is None.
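
As an illustration of the parameters above, the following sketch assembles keyword arguments for a 'bcur' call from the documented default values. Placing the SOAP keys under 'soap_paras' is an assumption based on the parameter description. The trajectory paths, pressures, and isolated-atom energy are placeholders rather than real data, and keying isol_es by atomic number is likewise an assumption. The call itself is left as a comment because it needs autoplex and a workflow runner.

```python
# Documented default values for the Boltzmann CUR parameters.
# Assumption: the SOAP descriptor keys are nested under "soap_paras".
bcur_params = {
    "soap_paras": {
        "l_max": 8,
        "n_max": 8,
        "atom_sigma": 0.75,
        "cutoff": 5.5,
        "cutoff_transition_width": 1.0,
        "zeta": 4.0,
        "average": True,
        "species": True,
    },
    "kT": 0.3,             # eV
    "frac_of_bcur": 0.1,
    "bolt_max_num": 3000,
    "kernel_exp": 4.0,
    "energy_label": "energy",
}

# Placeholder trajectory entries; each needs 'traj_path' and 'pressure'.
traj_info = [
    {"traj_path": "rss_run_0/traj.extxyz", "pressure": 0.0},
    {"traj_path": "rss_run_1/traj.extxyz", "pressure": 1.0},
]

# Placeholder isolated-atom energy (hypothetical value; key format assumed).
isol_es = {14: -0.8}

sampling_kwargs = {
    "selection_method": "bcur",
    "num_of_selection": 5,
    "bcur_params": bcur_params,
    "traj_info": traj_info,
    "isol_es": isol_es,
    "random_seed": 42,
}

# With autoplex installed, the job could then be created along these lines:
# from autoplex.data.common.jobs import sampling
# job = sampling(**sampling_kwargs)
```

Keeping the arguments in one dictionary makes it easy to switch selection_method while reusing the same trajectory inputs.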

Returns:

The selected atoms. These are copies of the atoms in the input list.

Return type:

list of ase.Atoms
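
The 'random' and 'uniform' options and the copy semantics of the return value can be illustrated with a small stand-alone sketch. It is not autoplex's implementation: `select_indices` and `sample_frames` are hypothetical helpers, plain dictionaries stand in for ase.Atoms objects, and reading 'uniform' as evenly spaced frames along the trajectory is an assumption.

```python
import copy
import random


def select_indices(n_frames, num_of_selection, method="random", random_seed=None):
    """Pick frame indices (simplified stand-in, not autoplex's code)."""
    k = min(num_of_selection, n_frames)
    if k <= 0:
        return []
    if method == "random":
        # A seeded generator makes the draw reproducible.
        rng = random.Random(random_seed)
        return sorted(rng.sample(range(n_frames), k))
    if method == "uniform":
        if k == 1:
            return [0]
        # Assumption: "uniform" means evenly spaced frames, first to last.
        return [(i * (n_frames - 1)) // (k - 1) for i in range(k)]
    raise ValueError(f"unsupported method: {method!r}")


def sample_frames(frames, num_of_selection=5, method="random", random_seed=None):
    """Return copies of the selected frames, leaving the input untouched."""
    indices = select_indices(len(frames), num_of_selection, method, random_seed)
    return [copy.deepcopy(frames[i]) for i in indices]


# Toy trajectory: dictionaries stand in for ase.Atoms objects.
frames = [{"frame": i, "energy": -1.0 * i} for i in range(11)]

uniform_pick = sample_frames(frames, num_of_selection=5, method="uniform")
random_pick = sample_frames(frames, num_of_selection=5, random_seed=1)

# Modifying a selected copy does not change the original trajectory.
uniform_pick[0]["energy"] = 0.5
```

Because the selections are copies, downstream steps can relabel or modify them without altering the source trajectory.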