MLIPFitMaker

class autoplex.fitting.common.flows.MLIPFitMaker(name='MLpotentialFit', mlip_type='GAP', hyperpara_opt=False, ref_energy_name='REF_energy', ref_force_name='REF_forces', ref_virial_name='REF_virial', glue_file_path='glue.xml', split_ratio=0.4, force_max=40.0, force_min=0.01, distillation=True, separated=False, pre_xyz_files=None, pre_database_dir=None, regularization=False, atomwise_regularization_parameter=0.1, atom_wise_regularization=True, auto_delta=False, glue_xml=False, num_processes_fit=None, apply_data_preprocessing=True, run_fits_on_different_cluster=False)

Bases: Maker

Maker to fit ML potentials based on DFT labelled reference data.

This Maker will filter the provided dataset in a data preprocessing step and then proceed with the MLIP fit (default is GAP).

Parameters:

name (str) – Name of the flows produced by this maker.
mlip_type (Literal["GAP", "J-ACE", "NEP", "NEQUIP", "M3GNET", "MACE"]) – Choose one specific MLIP type to be fitted.
hyperpara_opt (bool) – Perform hyperparameter optimization using XPOT (https://pubs.aip.org/aip/jcp/article/159/2/024803/2901815).
ref_energy_name (str) – Reference energy name.
ref_force_name (str) – Reference force name.
ref_virial_name (str) – Reference virial name.
glue_file_path (str) – Path to the glue.xml file.
split_ratio (float) – Fraction of the dataset held out as the test set; the remainder is used for training. A value of 0.1 means 90% training data and 10% test data.
force_max (float) – Maximum allowed force in the dataset.
force_min (float) – Minimum force cutoff value for atom-wise regularization.
regularization (bool) – Whether to use sigma regularization.
distillation (bool) – Whether to use data distillation.
separated (bool) – Repeat the fit for each data_type available in the (combined) database.
pre_xyz_files (list[str] or None) – Names of the pre-database train xyz file and test xyz file.
pre_database_dir (str or None) – The pre-database directory.
atomwise_regularization_parameter (float) – Regularization value for the atom-wise force components.
atom_wise_regularization (bool) – Whether to include atom-wise regularization.
auto_delta (bool) – Automatically determine delta for 2b, 3b and soap terms.
glue_xml (bool) – Use the glue.xml core potential instead of fitting 2b terms.
num_processes_fit (int) – Number of processes for fitting.
apply_data_preprocessing (bool) – Determine whether to preprocess the data.
run_fits_on_different_cluster (bool) – If true, run fits on different clusters.
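
The meaning of split_ratio and force_max can be illustrated with a small, library-independent sketch. It is not autoplex's preprocessing code: the helper names split_dataset and within_force_max are hypothetical, and whether autoplex compares force_max against individual force components or against force magnitudes is not stated above, so the component-wise check below is an assumption.

```python
import random


def split_dataset(items, split_ratio, seed=0):
    """Shuffle items and return (train, test).

    split_ratio is the fraction assigned to the test set,
    so 0.1 gives roughly 90% training and 10% test data.
    """
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * split_ratio)
    return shuffled[n_test:], shuffled[:n_test]


def within_force_max(forces, force_max):
    """Return True if no force component exceeds force_max in magnitude.

    Assumption: the check is component-wise; forces is a list of
    per-atom [fx, fy, fz] lists.
    """
    return all(abs(c) <= force_max for atom in forces for c in atom)


# Ten hypothetical structure indices, split with split_ratio=0.1.
train, test = split_dataset(range(10), split_ratio=0.1)
```

With the class default of split_ratio=0.4, the same helper would place four of the ten items in the test set.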

make(database_dir=None, fit_input=None, hyperparameters=MLIP_HYPERS, species_list=None, isolated_atom_energies=None, device='cpu', **fit_kwargs)

Make a flow for fitting MLIP models.

Parameters:

database_dir (Path | str) – Path to the directory containing the database.
fit_input (dict) – Output from the CompletePhononDFTMLDataGenerationFlow process.
hyperparameters (MLIP_HYPERS) – Hyperparameters for the MLIP.
species_list (list) – List of element names (strings) present in the training dataset.
isolated_atom_energies (dict) – Dictionary of isolated-atom energies.
device (str) – Device to be used for model fitting, either “cpu” or “cuda”.
fit_kwargs (dict) – Additional keyword arguments for MLIP fitting.
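
A minimal usage sketch based only on the signatures documented above might look as follows. The database path, species, and energy values are placeholders, the helper build_fit_flow is hypothetical, and the use of atomic numbers as keys of isolated_atom_energies is an assumption that should be checked against the autoplex documentation. Whether fit_input must also be supplied for a given workflow is not covered here.

```python
from pathlib import Path

# Constructor arguments taken from the documented MLIPFitMaker signature.
MAKER_KWARGS = {
    "mlip_type": "GAP",
    "split_ratio": 0.1,   # 90% training data, 10% test data
    "force_max": 40.0,
}

# Arguments for make(); all values are placeholders for illustration.
MAKE_KWARGS = {
    "database_dir": Path("path/to/database"),
    "species_list": ["Li", "Cl"],
    # Assumption: keys are atomic numbers; energies here are dummy values.
    "isolated_atom_energies": {3: 0.0, 17: 0.0},
    "device": "cpu",      # documented options are "cpu" or "cuda"
}


def build_fit_flow():
    """Create the fitting flow; requires autoplex to be installed."""
    # Imported lazily so the configuration above can be inspected
    # without autoplex available.
    from autoplex.fitting.common.flows import MLIPFitMaker

    maker = MLIPFitMaker(**MAKER_KWARGS)
    return maker.make(**MAKE_KWARGS)
```

Calling build_fit_flow() only constructs the flow; how it is executed depends on the workflow manager in use.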
