MLIPFitMaker

class autoplex.fitting.common.flows.MLIPFitMaker(name='MLpotentialFit', mlip_type='GAP', hyperpara_opt=False, ref_energy_name='REF_energy', ref_force_name='REF_forces', ref_virial_name='REF_virial', glue_file_path='glue.xml', use_defaults=True)

Bases: Maker

Maker to fit ML potentials based on DFT-labelled reference data.

This Maker first filters the provided dataset in a data preprocessing step and then runs the MLIP fit (GAP by default).

Parameters:
  • name (str) – Name of the flows produced by this maker.

  • mlip_type (str) – Choose one specific MLIP type to be fitted: ‘GAP’ | ‘J-ACE’ | ‘NEQUIP’ | ‘M3GNET’ | ‘MACE’

  • hyperpara_opt (bool) – Whether to perform hyperparameter optimization using XPOT (https://pubs.aip.org/aip/jcp/article/159/2/024803/2901815).

  • ref_energy_name (str) – Reference energy name.

  • ref_force_name (str) – Reference force name.

  • ref_virial_name (str) – Reference virial name.

  • glue_file_path (str) – Path to the glue.xml file.

  • use_defaults (bool) – If True, use the default fit parameters.
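
The constructor arguments above can be collected and checked before the maker is created. The following is a minimal sketch, not part of autoplex: `ALLOWED_MLIP_TYPES` and `check_mlip_type` are hypothetical names introduced here for illustration, and the maker is only instantiated if autoplex can be imported in the current environment.

```python
# Documented choices for mlip_type, copied from the parameter list above.
ALLOWED_MLIP_TYPES = ("GAP", "J-ACE", "NEQUIP", "M3GNET", "MACE")


def check_mlip_type(mlip_type):
    """Hypothetical helper: fail early on an unsupported mlip_type."""
    if mlip_type not in ALLOWED_MLIP_TYPES:
        raise ValueError(
            f"mlip_type must be one of {ALLOWED_MLIP_TYPES}, got {mlip_type!r}"
        )
    return mlip_type


# Keyword arguments mirroring the documented signature and its defaults.
maker_kwargs = {
    "name": "MLpotentialFit",
    "mlip_type": check_mlip_type("GAP"),
    "hyperpara_opt": False,
    "ref_energy_name": "REF_energy",
    "ref_force_name": "REF_forces",
    "ref_virial_name": "REF_virial",
    "glue_file_path": "glue.xml",
    "use_defaults": True,
}

try:
    from autoplex.fitting.common.flows import MLIPFitMaker
except ImportError:
    # autoplex is not installed here; the kwargs above are still valid to inspect.
    MLIPFitMaker = None

maker = MLIPFitMaker(**maker_kwargs) if MLIPFitMaker is not None else None
```

Validating `mlip_type` up front is a design choice of this sketch; it surfaces a typo before any flow is built.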

make(fit_input=None, species_list=None, isolated_atom_energies=None, split_ratio=0.4, force_max=40.0, regularization=False, distillation=True, separated=False, pre_xyz_files=None, pre_database_dir=None, atomwise_regularization_parameter=0.1, force_min=0.01, atom_wise_regularization=True, auto_delta=False, glue_xml=False, num_processes_fit=None, apply_data_preprocessing=True, database_dir=None, device='cpu', **fit_kwargs)

Make a flow for fitting MLIP models.

Parameters:
  • fit_input (dict) – Output from the CompletePhononDFTMLDataGenerationFlow process.

  • species_list (list) – List of element names (strings) present in the training dataset.

  • isolated_atom_energies (dict) – Dictionary of isolated-atom energies.

  • split_ratio (float) – Fraction of the dataset assigned to the test set; the remainder is used for training. For example, a value of 0.1 means 90% training data and 10% test data.

  • force_max (float) – Maximum allowed force in the dataset.

  • regularization (bool) – Whether to use sigma regularization.

  • distillation (bool) – Whether to use data distillation.

  • separated (bool) – Repeat the fit for each data_type available in the (combined) database.

  • pre_xyz_files (list[str] or None) – Names of the pre-database train xyz file and test xyz file.

  • pre_database_dir (str or None) – The pre-database directory.

  • atomwise_regularization_parameter (float) – Regularization value for the atom-wise force components.

  • force_min (float) – Minimal force cutoff value for atom-wise regularization.

  • atom_wise_regularization (bool) – Whether to include atom-wise regularization.

  • auto_delta (bool) – Automatically determine delta for 2b, 3b and soap terms.

  • glue_xml (bool) – Use the glue.xml core potential instead of fitting 2b terms.

  • num_processes_fit (int) – Number of processes for fitting.

  • apply_data_preprocessing (bool) – Whether to preprocess the data before fitting.

  • database_dir (Path | str) – Path to the directory containing the database.

  • device (str) – Device to be used for model fitting, either “cpu” or “cuda”.

  • fit_kwargs (dict) – Additional keyword arguments for MLIP fitting.
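
To make the meaning of split_ratio and force_max concrete, here is a small pure-Python sketch. It is not autoplex's implementation: `split_dataset` and `within_force_max` are hypothetical helpers, the shuffle-then-slice strategy is an assumption, and so is the choice to compare the per-atom force norm (rather than individual components) against force_max.

```python
import math
import random


def split_dataset(items, split_ratio, seed=0):
    """Shuffle items and return (train, test); split_ratio is the test fraction."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * split_ratio)
    return shuffled[n_test:], shuffled[:n_test]


def within_force_max(forces, force_max):
    """Return True if every per-atom force norm is at most force_max.

    Assumption: the cutoff applies to the vector norm of each atom's force.
    """
    return all(
        math.sqrt(fx * fx + fy * fy + fz * fz) <= force_max
        for fx, fy, fz in forces
    )


# split_ratio=0.1 on ten structures: nine for training, one for testing.
train, test = split_dataset(range(10), split_ratio=0.1)

# Two toy structures, each a list of per-atom force vectors.
structures = {
    "relaxed": [(0.0, 0.0, 0.5), (0.1, -0.2, 0.0)],
    "distorted": [(30.0, 30.0, 0.0), (0.0, 0.0, 1.0)],
}
# Keep only structures whose forces all pass the documented default of 40.0.
kept = [name for name, f in structures.items() if within_force_max(f, 40.0)]
```

With the default split_ratio of 0.4, the same helper would place 40% of the structures in the test set.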