MLIPFitMaker
- class autoplex.fitting.common.flows.MLIPFitMaker(name='MLpotentialFit', mlip_type='GAP', hyperpara_opt=False, ref_energy_name='REF_energy', ref_force_name='REF_forces', ref_virial_name='REF_virial', glue_file_path='glue.xml', use_defaults=True)
Bases: Maker
Maker to fit ML potentials based on DFT-labelled reference data.
This Maker first filters the provided dataset in a data-preprocessing step and then runs the MLIP fit (GAP by default).
- Parameters:
name (str) – Name of the flows produced by this maker.
mlip_type (str) – MLIP type to fit; one of ‘GAP’, ‘J-ACE’, ‘NEQUIP’, ‘M3GNET’, or ‘MACE’.
hyperpara_opt (bool) – Whether to perform hyperparameter optimization using XPOT (https://pubs.aip.org/aip/jcp/article/159/2/024803/2901815).
ref_energy_name (str) – Name of the reference energy label.
ref_force_name (str) – Name of the reference force label.
ref_virial_name (str) – Name of the reference virial label.
glue_file_path (str) – Path to the glue.xml file.
use_defaults (bool) – If True, use the default fit parameters.
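The constructor arguments above can be collected as in the following minimal sketch. It only assumes the signature and defaults documented on this page; the helper names `maker_kwargs` and `build_maker` and the label `dft_energy` are hypothetical and not part of autoplex. The import is deferred so that the keyword handling can be checked without autoplex installed.

```python
# MLIP types listed in the mlip_type parameter description above.
DOCUMENTED_MLIP_TYPES = ("GAP", "J-ACE", "NEQUIP", "M3GNET", "MACE")


def maker_kwargs(mlip_type="GAP", **overrides):
    """Collect constructor keyword arguments, starting from the documented defaults.

    Hypothetical helper for illustration; not part of autoplex.
    """
    if mlip_type not in DOCUMENTED_MLIP_TYPES:
        raise ValueError(
            f"mlip_type must be one of {DOCUMENTED_MLIP_TYPES}, got {mlip_type!r}"
        )
    kwargs = {
        "mlip_type": mlip_type,
        "hyperpara_opt": False,
        "ref_energy_name": "REF_energy",
        "ref_force_name": "REF_forces",
        "ref_virial_name": "REF_virial",
        "glue_file_path": "glue.xml",
        "use_defaults": True,
    }
    kwargs.update(overrides)
    return kwargs


def build_maker(**kwargs):
    """Instantiate the maker; requires autoplex, so the import is deferred."""
    from autoplex.fitting.common.flows import MLIPFitMaker

    return MLIPFitMaker(**kwargs)


# "dft_energy" is a made-up label name used only to show an override.
mace_kwargs = maker_kwargs("MACE", ref_energy_name="dft_energy")
print(mace_kwargs["mlip_type"], mace_kwargs["ref_energy_name"])
```

With autoplex installed, `build_maker(**mace_kwargs)` would then return the maker instance; validating `mlip_type` up front simply surfaces a typo before any flow is built.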
- make(fit_input=None, species_list=None, isolated_atom_energies=None, split_ratio=0.4, force_max=40.0, regularization=False, distillation=True, separated=False, pre_xyz_files=None, pre_database_dir=None, atomwise_regularization_parameter=0.1, force_min=0.01, atom_wise_regularization=True, auto_delta=False, glue_xml=False, num_processes_fit=None, apply_data_preprocessing=True, database_dir=None, device='cpu', **fit_kwargs)
Make a flow for fitting MLIP models.
- Parameters:
fit_input (dict) – Output from the CompletePhononDFTMLDataGenerationFlow process.
species_list (list) – List of element names (strings) present in the training dataset.
isolated_atom_energies (dict) – Dictionary of isolated-atom energies.
split_ratio (float) – Fraction of the dataset assigned to the test set; the remainder is used for training. For example, 0.1 gives 90% training data and 10% test data.
force_max (float) – Maximum allowed force in the dataset.
regularization (bool) – Whether to use sigma regularization.
distillation (bool) – Whether to use data distillation.
separated (bool) – Repeat the fit for each data_type available in the (combined) database.
pre_xyz_files (list[str] or None) – Names of the pre-database train xyz file and test xyz file.
pre_database_dir (str or None) – The pre-database directory.
atomwise_regularization_parameter (float) – Regularization value for the atom-wise force components.
force_min (float) – Minimal force cutoff value for atom-wise regularization.
atom_wise_regularization (bool) – Whether to include atom-wise regularization.
auto_delta (bool) – Automatically determine delta for the two-body (2b), three-body (3b), and SOAP terms.
glue_xml (bool) – Use the glue.xml core potential instead of fitting 2b terms.
num_processes_fit (int) – Number of processes for fitting.
apply_data_preprocessing (bool) – Whether to preprocess the data.
database_dir (Path | str) – Path to the directory containing the database.
device (str) – Device to be used for model fitting, either “cpu” or “cuda”.
fit_kwargs (dict) – Additional keyword arguments for MLIP fitting.
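To make the meaning of `split_ratio` and `force_max` concrete, here is a small pure-Python sketch. It is not autoplex's implementation: it assumes that a structure is discarded when the norm of any atomic force exceeds `force_max`, and that the split is a seeded random shuffle with `round(n * split_ratio)` structures going to the test set. The function names and the toy data are hypothetical.

```python
import random


def filter_by_force_max(structures, force_max=40.0):
    """Keep structures whose largest atomic force norm is <= force_max.

    Assumed criterion for illustration; the actual filter may differ.
    """
    kept = []
    for s in structures:
        largest = max(
            (fx * fx + fy * fy + fz * fz) ** 0.5 for fx, fy, fz in s["forces"]
        )
        if largest <= force_max:
            kept.append(s)
    return kept


def split_dataset(structures, split_ratio=0.4, seed=0):
    """Return (train, test), with split_ratio as the test fraction."""
    items = list(structures)
    random.Random(seed).shuffle(items)  # seeded for reproducibility
    n_test = round(len(items) * split_ratio)
    return items[n_test:], items[:n_test]


# Ten toy structures with small forces, plus one with a force above force_max.
data = [{"id": i, "forces": [(0.1 * i, 0.0, 0.0)]} for i in range(10)]
data.append({"id": 10, "forces": [(55.0, 0.0, 0.0)]})

filtered = filter_by_force_max(data, force_max=40.0)
train, test = split_dataset(filtered, split_ratio=0.1)
print(len(filtered), len(train), len(test))
```

With the default `split_ratio=0.4`, the same ten filtered structures would be divided into six for training and four for testing.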