MLIPFitMaker

class autoplex.fitting.common.flows.MLIPFitMaker(name='MLpotentialFit', mlip_type='GAP', hyper_param_optimization=False, ref_energy_name='REF_energy', ref_force_name='REF_forces', ref_virial_name='REF_virial')

Bases: Maker

Maker to fit ML potentials based on DFT-labeled reference data.

This Maker will filter the provided dataset in a data preprocessing step and then proceed with the MLIP fit (default is GAP).

Parameters:
  • name (str) – Name of the flows produced by this maker.

  • mlip_type (str) – MLIP type to fit. One of: ‘GAP’ | ‘J-ACE’ | ‘P-ACE’ | ‘NEQUIP’ | ‘M3GNET’ | ‘MACE’.

  • hyper_param_optimization (bool) – Whether to perform hyperparameter optimization using XPOT (https://pubs.aip.org/aip/jcp/article/159/2/024803/2901815).

  • ref_energy_name (str, optional) – Reference energy name.

  • ref_force_name (str, optional) – Reference force name.

  • ref_virial_name (str, optional) – Reference virial name.
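As a minimal sketch of how the constructor arguments above fit together, the snippet below assembles them into a keyword dictionary. The helper `maker_kwargs` and the constant `SUPPORTED_MLIP_TYPES` are hypothetical names introduced here for illustration; they are not part of autoplex, and the up-front check on `mlip_type` is an assumption about what a caller might want, not a description of the library's own validation.

```python
# Options listed for `mlip_type` in the parameter list above.
SUPPORTED_MLIP_TYPES = ("GAP", "J-ACE", "P-ACE", "NEQUIP", "M3GNET", "MACE")


def maker_kwargs(mlip_type="GAP", hyper_param_optimization=False, **overrides):
    """Hypothetical helper: check `mlip_type` and return constructor kwargs."""
    if mlip_type not in SUPPORTED_MLIP_TYPES:
        raise ValueError(
            f"Unknown mlip_type {mlip_type!r}; expected one of {SUPPORTED_MLIP_TYPES}"
        )
    # Defaults copied from the documented class signature.
    kwargs = {
        "name": "MLpotentialFit",
        "mlip_type": mlip_type,
        "hyper_param_optimization": hyper_param_optimization,
        "ref_energy_name": "REF_energy",
        "ref_force_name": "REF_forces",
        "ref_virial_name": "REF_virial",
    }
    kwargs.update(overrides)
    return kwargs


kwargs = maker_kwargs(mlip_type="MACE", name="MACE_fit")

# With autoplex installed, the dictionary would be passed on as:
#   from autoplex.fitting.common.flows import MLIPFitMaker
#   maker = MLIPFitMaker(**kwargs)
```

Keeping the arguments in a plain dictionary makes it easy to log or serialize the chosen settings before the maker is created.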

make(fit_input=None, species_list=None, isolated_atoms_energies=None, split_ratio=0.4, f_max=40.0, regularization=False, distillation=True, separated=False, pre_xyz_files=None, pre_database_dir=None, atomwise_regularization_parameter=0.1, f_min=0.01, atom_wise_regularization=True, auto_delta=False, glue_xml=False, num_processes_fit=None, preprocessing_data=True, database_dir=None, device='cuda', **fit_kwargs)

Make a flow to create ML potential fits.

Parameters:
  • species_list (list) – List of element names (str).

  • isolated_atoms_energies (dict) – Dict of isolated atoms energies.

  • fit_input (dict) – CompletePhononDFTMLDataGenerationFlow output.

  • split_ratio (float) – Fraction of the data assigned to the test set when dividing the data into training and test sets. A value of 0.1 means that the ratio of the training set to the test set is 9:1.

  • f_max (float) – Maximum allowed force in the data set.

  • regularization (bool) – Whether to use sigma regularization.

  • distillation (bool) – Whether to use data distillation.

  • separated (bool) – Repeat the fit for each data_type available in the (combined) database.

  • pre_xyz_files (list[str] or None) – Names of the pre-database train and test xyz files.

  • pre_database_dir (str | None) – The pre-database directory.

  • atomwise_regularization_parameter (float) – Regularization value for the atom-wise force components.

  • f_min (float) – Minimum force cutoff value for atom-wise regularization.

  • atom_wise_regularization (bool) – Whether to include atom-wise regularization.

  • auto_delta (bool) – Automatically determine delta for the 2b, 3b and SOAP terms.

  • glue_xml (bool) – Use the glue.xml core potential instead of fitting 2b terms.

  • num_processes_fit (int) – Number of processes for fitting.

  • preprocessing_data (bool) – Whether to preprocess the data. If False, the path to the training database must be supplied.

  • database_dir (Path) – The database directory.

  • device (str) – Device to use: ‘cuda’ or ‘cpu’.

  • fit_kwargs (dict) – Dict of MLIP fit keyword arguments.
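To make the `f_max` and `split_ratio` descriptions above concrete, here is a small standalone sketch of a force filter followed by a train/test split. It is not autoplex's implementation: the function names, the dictionary layout of a "structure", the use of the per-atom force norm for the `f_max` comparison, and the seeded shuffle are all assumptions made for illustration.

```python
import math
import random


def filter_by_max_force(structures, f_max=40.0):
    """Keep structures whose largest per-atom force norm is at most f_max."""
    kept = []
    for structure in structures:
        largest = max(
            math.sqrt(fx * fx + fy * fy + fz * fz)
            for fx, fy, fz in structure["forces"]
        )
        if largest <= f_max:
            kept.append(structure)
    return kept


def split_train_test(structures, split_ratio=0.4, seed=0):
    """Shuffle, then send a fraction `split_ratio` of the data to the test set."""
    shuffled = list(structures)
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * split_ratio)
    return shuffled[n_test:], shuffled[:n_test]


# Toy data: ten single-atom "structures" plus one with an outlier force.
data = [{"id": i, "forces": [(0.1 * i, 0.0, 0.0)]} for i in range(10)]
data.append({"id": 10, "forces": [(50.0, 0.0, 0.0)]})

filtered = filter_by_max_force(data, f_max=40.0)
train, test = split_train_test(filtered, split_ratio=0.1)
print(len(filtered), len(train), len(test))  # prints: 10 9 1
```

With ten structures left after filtering, `split_ratio=0.1` yields the 9:1 training-to-test ratio mentioned in the parameter description.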