do_rss_iterations
- autoplex.auto.rss.jobs.do_rss_iterations(input, tag, generated_struct_numbers, num_of_initial_selected_structs=None, buildcell_options=None, fragment_file=None, fragment_numbers=None, num_processes_buildcell=1, initial_selection_enabled=False, rss_selection_method=None, num_of_rss_selected_structs=100, bcur_params=None, random_seed=None, include_isolated_atom=False, isolatedatom_box=None, e0_spin=False, include_dimer=False, dimer_box=None, dimer_range=None, dimer_num=21, custom_incar=None, custom_potcar=None, config_types=None, vasp_ref_file='vasp_ref.extxyz', rss_group='rss', test_ratio=0.1, regularization=False, retain_existing_sigma=False, scheme=None, element_order=None, reg_minmax=None, distillation=True, force_max=200, force_label='REF_forces', mlip_type='GAP', ref_energy_name='REF_energy', ref_force_name='REF_forces', ref_virial_name='REF_virial', auto_delta=False, num_processes_fit=1, device_for_fitting='cpu', scalar_pressure_method='exp', scalar_exp_pressure=100, scalar_pressure_exponential_width=0.2, scalar_pressure_low=0, scalar_pressure_high=50, max_steps=200, force_tol=0.05, stress_tol=0.05, hookean_repul=False, hookean_paras=None, keep_symmetry=False, write_traj=True, num_processes_rss=1, device_for_rss='cpu', stop_criterion=0.01, max_iteration_number=5, num_groups=1, initial_kt=0.3, current_iter_index=1, **fit_kwargs)
Perform iterative RSS to improve the accuracy of an MLIP.
Each iteration involves generating new structures, sampling, running VASP calculations, collecting data, preprocessing data, and fitting a new MLIP.
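The interplay between stop_criterion, max_iteration_number and current_iter_index can be pictured with a minimal sketch. The helper run_rss_loop and the callback run_one_iteration are hypothetical names, not part of autoplex, and the exact comparison used to decide convergence (here: test error strictly below stop_criterion) is an assumption.

```python
from typing import Callable


def run_rss_loop(
    run_one_iteration: Callable[[int], float],
    stop_criterion: float = 0.01,
    max_iteration_number: int = 5,
    current_iter_index: int = 1,
) -> dict:
    """Repeat iterations until the test error converges or the budget is spent.

    run_one_iteration is a hypothetical stand-in for one full cycle
    (generate, sample, VASP, collect, preprocess, fit) that returns the
    test error of the newly fitted MLIP.
    """
    test_error = float("inf")
    last_iter = None
    for iteration in range(current_iter_index, max_iteration_number + 1):
        test_error = run_one_iteration(iteration)
        last_iter = iteration
        # Assumed convergence rule: stop once the error drops below the criterion.
        if test_error < stop_criterion:
            break
    return {"test_error": test_error, "current_iter": last_iter}


# Mock test errors per iteration, purely illustrative numbers.
mock_errors = {1: 0.04, 2: 0.02, 3: 0.008, 4: 0.004, 5: 0.002}
result = run_rss_loop(lambda i: mock_errors[i])
```

With the mock errors above, the loop stops at the third iteration because 0.008 is the first value below the default stop_criterion of 0.01. If the error never converges, the loop ends after max_iteration_number iterations.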
- Parameters:
input (dict) –
A dictionary used to pass the input data required during the RSS iterations. Each key in this dictionary must be one of the following:
- test_error: float
The test error of the fitted MLIP.
- pre_database_dir: str
The directory of the preprocessed database.
- mlip_path: list[str]
List of paths to the fitted MLIP.
- isolated_atom_energies: dict
The isolated-atom energy values.
- current_iter: int
The current iteration index.
- kt: float
The temperature kT (in eV) used for Boltzmann sampling.
tag (str) – Tag of systems. It can also be used for setting up elements and stoichiometry. For example, ‘SiO2’ will generate structures with a 1:2 ratio of Si to O.
generated_struct_numbers (list[int]) – Expected number of generated randomized unit cells.
num_of_initial_selected_structs (list[int] | None) – Number of structures to be sampled. Default is None.
buildcell_options (list[dict] | None) – Customized parameters for buildcell. Default is None.
fragment_file (Atoms | list[Atoms] | None) – Fragment(s) for random structures, e.g. molecules, to be placed individually intact. If several fragments are stored in the same Atoms object, atoms.arrays should have a ‘fragment_id’ key with a unique identifier for each fragment. atoms.cell must be defined (e.g. Atoms.cell = np.eye(3)*20). Default is None.
fragment_numbers (list[str] | None) – Numbers of each fragment to be included in the random structures. Defaults to 1 for all specified.
num_processes_buildcell (int) – Number of processes to use for parallel computation during buildcell generation. Default is 1.
initial_selection_enabled (bool) – If true, sample structures using CUR. Default is False.
rss_selection_method (str) – Method for selecting samples from the generated structures. Default is None.
num_of_rss_selected_structs (int) – Number of structures to be selected. Default is 100.
bcur_params (dict | None) – Parameters for Boltzmann CUR selection. Default is None.
random_seed (int | None) – A seed to ensure reproducibility of CUR selection. Default is None.
include_isolated_atom (bool) – If true, perform single-point calculations for isolated atoms. Default is False.
isolatedatom_box (list[float] | None) – List of the lattice constants for an isolated atom configuration. Default is None.
e0_spin (bool) – If true, include spin polarization in isolated atom and dimer calculations. Default is False.
include_dimer (bool) – If true, perform single-point calculations for dimers only once. Default is False.
dimer_box (list[float] | None) – The lattice constants of a dimer box. Default is None.
dimer_range (list[float] | None) – Range of distances for dimer calculations. Default is None.
dimer_num (int) – Number of different distances to consider for dimer calculations. Default is 21.
custom_incar (dict | None) – Dictionary of custom VASP input parameters. If provided, will update the default parameters. Default is None.
custom_potcar (dict | None) – Dictionary of POTCAR settings to update. Keys are element symbols, values are the desired POTCAR labels. Default is None.
config_types (list[str] | None) – Configuration types for the VASP calculations. Default is None.
vasp_ref_file (str) – Reference file for VASP data. Default is ‘vasp_ref.extxyz’.
rss_group (str) – Group name for GAP RSS. Default is ‘rss’.
test_ratio (float) – The proportion of the test set after splitting the data. Default is 0.1.
regularization (bool) – If true, apply regularization. This only works for GAP. Default is False.
retain_existing_sigma (bool) – Whether to keep the current sigma values for specific configuration types. If set to True, existing sigma values for specific configurations will remain unchanged. Default is False.
scheme (str | None) – Scheme to use for regularization. Default is None.
element_order (list | None) – List of atomic numbers in order of choice (e.g. [42, 16] for MoS2). This value is useful when constructing high-dimensional convex hulls based on the “volume-stoichiometry” scheme. In particular, if the dataset contains compounds with different numbers of constituent elements (e.g., both binary and ternary structures), this value must be set explicitly to ensure the convex hull is constructed consistently. Default is None.
reg_minmax (list[tuple] | None) – A list of tuples representing the minimum and maximum values for regularization. Default is None.
distillation (bool) – If true, apply data distillation. Default is True.
force_max (float) – Maximum force value; structures with forces above it are excluded. Default is 200.
force_label (str) – The label of force values to use for distillation. Default is ‘REF_forces’.
mlip_type (Literal["GAP", "J-ACE", "NEP", "NEQUIP", "M3GNET", "MACE"]) – Choose one specific MLIP type to be fitted. Default is ‘GAP’.
ref_energy_name (str) – Reference energy name. Default is ‘REF_energy’.
ref_force_name (str) – Reference force name. Default is ‘REF_forces’.
ref_virial_name (str) – Reference virial name. Default is ‘REF_virial’.
auto_delta (bool) – If true, apply automatic determination of delta for GAP terms. Default is False.
num_processes_fit (int) – Number of processes used for fitting. Default is 1.
device_for_fitting (str) – Device to be used for model fitting, either “cpu” or “cuda”. Default is “cpu”.
scalar_pressure_method (str) – Method for adding external pressures. Default is ‘exp’.
scalar_exp_pressure (float) – Scalar exponential pressure. Default is 100.
scalar_pressure_exponential_width (float) – Width for scalar pressure exponential. Default is 0.2.
scalar_pressure_low (float) – Low limit for scalar pressure. Default is 0.
scalar_pressure_high (float) – High limit for scalar pressure. Default is 50.
max_steps (int) – Maximum number of steps for relaxation. Default is 200.
force_tol (float) – Force residual tolerance for relaxation. Default is 0.05.
stress_tol (float) – Stress residual tolerance for relaxation. Default is 0.05.
hookean_repul (bool) – If true, apply Hookean repulsion. Default is False.
hookean_paras (dict[tuple[int, int], tuple[float, float]] | None) – Parameters for Hookean repulsion as a dictionary of tuples. Default is None.
keep_symmetry (bool) – If true, preserve symmetry during relaxation. Default is False.
write_traj (bool) – If true, write trajectory of RSS. Default is True.
num_processes_rss (int) – Number of processes used for running RSS. Default is 1.
device_for_rss (str) – Device to be used for running RSS, either “cuda” or “cpu”. Default is “cpu”.
stop_criterion (float) – Convergence criterion for stopping RSS iterations. Default is 0.01.
max_iteration_number (int) – Maximum number of RSS iterations to perform. Default is 5.
num_groups (int) – Number of structure groups, used for assigning tasks across multiple nodes. Default is 1.
initial_kt (float) – Initial temperature (in eV) for Boltzmann sampling. Default is 0.3.
current_iter_index (int) – Index for the current RSS iteration. Default is 1.
fit_kwargs – Additional keyword arguments for the MLIP fitting process.
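As an illustration of how a tag such as ‘SiO2’ encodes elements and stoichiometry, the sketch below parses a simple formula string into element counts. The helper stoichiometry_from_tag is hypothetical and is not how autoplex itself interprets the tag; it only handles plain formulas without brackets.

```python
import re


def stoichiometry_from_tag(tag: str) -> dict[str, int]:
    """Count atoms per element in a simple formula such as 'SiO2'.

    Hypothetical helper for illustration only; parentheses and charges
    are not supported.
    """
    counts: dict[str, int] = {}
    # Each match is (element symbol, optional count), e.g. ('Si', '') or ('O', '2').
    for symbol, number in re.findall(r"([A-Z][a-z]?)(\d*)", tag):
        counts[symbol] = counts.get(symbol, 0) + (int(number) if number else 1)
    if not counts:
        raise ValueError(f"No element symbols found in tag {tag!r}")
    return counts


sio2 = stoichiometry_from_tag("SiO2")
mos2 = stoichiometry_from_tag("MoS2")
```

Here sio2 holds one Si for every two O, and mos2 holds one Mo for every two S, matching the element_order example [42, 16] given for MoS2.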
- Returns:
A dictionary with the following information:
‘test_error’: float, The test error of the fitted MLIP.
‘pre_database_dir’: str, The directory of the preprocessed database.
‘mlip_path’: list[str], List of paths to the fitted MLIP.
‘isolated_atom_energies’: dict, The isolated-atom energy values.
‘current_iter’: int, The current iteration index.
‘kt’: float, The temperature (in eV) for Boltzmann sampling.
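The returned dictionary carries the same keys that the input parameter accepts, so the output of one call can serve as the input of the next. A minimal sketch of a key check is given below; check_rss_input is a hypothetical helper, not an autoplex function, and all values in previous_output are placeholders.

```python
VALID_INPUT_KEYS = {
    "test_error",
    "pre_database_dir",
    "mlip_path",
    "isolated_atom_energies",
    "current_iter",
    "kt",
}


def check_rss_input(data: dict) -> dict:
    """Reject dictionaries that contain keys outside the documented set.

    Hypothetical helper for illustration; autoplex may validate differently.
    """
    unknown = set(data) - VALID_INPUT_KEYS
    if unknown:
        raise KeyError(f"Unexpected keys in RSS input: {sorted(unknown)}")
    return data


# Placeholder values standing in for the result of a previous iteration.
previous_output = {
    "test_error": 0.035,
    "pre_database_dir": "path/to/preprocessed_database",
    "mlip_path": ["path/to/fitted_mlip"],
    "isolated_atom_energies": {},  # placeholder; fill with the real energies
    "current_iter": 1,
    "kt": 0.3,
}
checked = check_rss_input(previous_output)
```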
- Return type:
dict
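The role of the temperature kT (initial_kt, and the ‘kt’ entry of the input and output dictionaries) in Boltzmann sampling can be sketched with plain Boltzmann weights. This is the textbook weighting exp(-E/kT), shown under the assumption that energies are given in eV; the actual Boltzmann CUR selection in autoplex may combine these weights with other criteria. The function name boltzmann_weights is hypothetical.

```python
import math


def boltzmann_weights(energies: list[float], kt: float = 0.3) -> list[float]:
    """Return normalized Boltzmann weights exp(-(E - E_min) / kT).

    Energies and kt are assumed to share the same unit (eV here).
    Subtracting the minimum energy avoids overflow and leaves the
    normalized weights unchanged.
    """
    e_min = min(energies)
    raw = [math.exp(-(e - e_min) / kt) for e in energies]
    total = sum(raw)
    return [w / total for w in raw]


# Three illustrative energies spaced by exactly one kT (0.3 eV).
weights = boltzmann_weights([0.0, 0.3, 0.6], kt=0.3)
```

Each step of one kT in energy lowers the weight by a factor of e, so a larger kt flattens the distribution and lets higher-energy structures be sampled more often, while a smaller kt concentrates the selection on low-energy structures.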