CompleteDFTvsMLBenchmarkWorkflow

class autoplex.auto.phonons.flows.CompleteDFTvsMLBenchmarkWorkflow(name='add_data', add_dft_phonon_struct=True, add_dft_random_struct=True, add_rss_struct=False, displacement_maker=None, phonon_bulk_relax_maker=None, phonon_static_energy_maker=None, rattled_bulk_relax_maker=None, isolated_atom_maker=None, n_structures=10, displacements=<factory>, symprec=0.0001, uc=False, volume_custom_scale_factors=None, volume_scale_factor_range=None, rattle_std=0.01, distort_type=0, min_distance=1.5, angle_percentage_scale=10, angle_max_attempts=1000, rattle_type=0, rattle_seed=42, rattle_mc_n_iter=10, w_angle=None, ml_models=<factory>, hyper_para_loop=False, atomwise_regularization_list=None, soap_delta_list=None, n_sparse_list=None, supercell_settings=<factory>, benchmark_kwargs=<factory>, path_to_default_hyperparameters=PosixPath('/home/runner/micromamba/envs/autoplex_docs/lib/python3.10/site-packages/autoplex/fitting/common/mlip-phonon-defaults.json'), summary_filename_prefix='results_')

Bases: Maker

Maker to construct a DFT (VASP) based dataset composed of the following two configuration types:

  1. single atom displaced supercells (based on the atomate2 PhononMaker subroutines)

  2. supercells with randomly displaced atoms (based on the ase rattled function).

Machine-learned interatomic potentials (MLIPs) are then fitted to this dataset. The resulting potentials are benchmarked against DFT (VASP) using the provided benchmark structures, by comparing the DFT and MLIP-based phonon calculations. The benchmark metrics are reported as a phonon band structure comparison plot, q-point-wise phonon RMSE plots, and a summary text file.
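
To make the q-point-wise RMSE metric concrete, the following is a minimal sketch of how such a value could be computed from two sets of phonon frequencies. It is not autoplex's implementation: the function names and the nested-list layout (one list of band frequencies per q-point) are assumptions made for illustration, and the frequencies are made-up numbers.

```python
import math


def rmse_per_qpoint(dft_freqs, ml_freqs):
    """Return one RMSE value per q-point, taken over all phonon bands.

    Both inputs are nested lists: one inner list of band frequencies
    per q-point (hypothetical layout, chosen for this sketch).
    """
    if len(dft_freqs) != len(ml_freqs):
        raise ValueError("Both inputs need the same number of q-points.")
    result = []
    for dft_bands, ml_bands in zip(dft_freqs, ml_freqs):
        if len(dft_bands) != len(ml_bands):
            raise ValueError("Both inputs need the same number of bands.")
        squared = [(d - m) ** 2 for d, m in zip(dft_bands, ml_bands)]
        result.append(math.sqrt(sum(squared) / len(squared)))
    return result


def overall_rmse(dft_freqs, ml_freqs):
    """Return a single RMSE over all q-points and bands."""
    squared = [
        (d - m) ** 2
        for dft_bands, ml_bands in zip(dft_freqs, ml_freqs)
        for d, m in zip(dft_bands, ml_bands)
    ]
    return math.sqrt(sum(squared) / len(squared))


# Made-up frequencies in THz: 2 q-points x 3 bands.
dft = [[0.0, 0.0, 0.0], [1.0, 2.0, 3.0]]
ml = [[0.0, 0.0, 0.0], [1.0, 2.0, 6.0]]

per_q = rmse_per_qpoint(dft, ml)
total = overall_rmse(dft, ml)
print(per_q, total)
```

A per-q-point value shows where in the Brillouin zone a potential deviates from the DFT reference, while the single overall value is convenient for a summary file.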

Parameters:
  • name (str) – Name of the flow produced by this maker.

  • add_dft_phonon_struct (bool) – If True, will add displaced supercells via phonopy for DFT calculation.

  • add_dft_random_struct (bool) – If True, will add randomly distorted structures for DFT calculation.

  • add_rss_struct (bool) – If True, will add RSS generated structures for DFT calculation.

  • displacement_maker (BaseVaspMaker) – Maker used for a static calculation for a supercell.

  • phonon_bulk_relax_maker (BaseVaspMaker) – Maker used for the bulk relax unit cell calculation.

  • rattled_bulk_relax_maker (BaseVaspMaker) – Maker used for the bulk relax unit cell calculation.

  • phonon_static_energy_maker (BaseVaspMaker) – Maker used for the static energy unit cell calculation.

  • isolated_atom_maker (IsoAtomStaticMaker) – VASP maker for the isolated atom calculation.

  • n_structures (int) – Total number of distorted structures to be generated. Must be provided if distorting volume without specifying a range, or if distorting angles. Default=10.

  • displacements (list[float]) – Displacement distances for phonons.

  • symprec (float) – Symmetry precision to use in the reduction of symmetry to find the primitive/conventional cell (use_primitive_standard_structure, use_conventional_standard_structure) and to handle all symmetry-related tasks in phonopy.

  • uc (bool) – If True, will generate randomly distorted structures (unit cells) and add static computation jobs to the flow.

  • distort_type (int) – 0: volume distortion, 1: angle distortion, 2: volume and angle distortion. Default=0.

  • volume_scale_factor_range (list[float]) – [min, max] of volume scale factors, e.g. [0.90, 1.10] will distort the volume by ±10%.

  • volume_custom_scale_factors (list[float]) – Specify explicit scale factors (if range is not specified). If None, will default to [0.90, 0.95, 0.98, 0.99, 1.01, 1.02, 1.05, 1.10].

  • min_distance (float) – Minimum separation allowed between any two atoms. Default=1.5 Å.

  • angle_percentage_scale (float) – Angle scaling factor in percent. The default of 10 randomly distorts angles by ±10% of their original value.

  • angle_max_attempts (int) – Maximum number of attempts to distort a structure before aborting. Default=1000.

  • w_angle (list[float]) – List of angle indices to be changed i.e. 0=alpha, 1=beta, 2=gamma. Default= [0, 1, 2].

  • rattle_type (int) – 0: standard rattling, 1: Monte Carlo rattling. Default=0.

  • rattle_std (float) – Rattle amplitude (standard deviation of the normal distribution). Default=0.01. Note that for MC rattling, the generated displacements will be roughly rattle_mc_n_iter**0.5 * rattle_std for small values of rattle_mc_n_iter.

  • rattle_seed (int) – Seed for setting up the NumPy random state from which random numbers are generated. Default=42.

  • rattle_mc_n_iter (int) – Number of Monte Carlo iterations. A larger number of iterations will generate larger displacements. Default=10.

  • ml_models (list[str]) – List of the ML models to be used. Default is GAP.

  • hyper_para_loop (bool) – If True, loops through several hyperparameter sets.

  • atomwise_regularization_list (list) – List of atom-wise regularization parameters that are checked.

  • soap_delta_list (list) – List of SOAP delta values that are checked.

  • n_sparse_list (list) – List of GAP n_sparse values that are checked.

  • supercell_settings (dict) – Settings for supercell generation.

  • benchmark_kwargs (dict) – Keyword arguments for the benchmark flows.

  • summary_filename_prefix (str) – Prefix of the result summary file.

  • path_to_default_hyperparameters (Path | str) – Path to the JSON file containing the default MLIP hyperparameters.
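
The interplay of volume_scale_factor_range, volume_custom_scale_factors and n_structures described in the parameter list can be sketched as follows. This is an illustration under stated assumptions, not the library's code: the helper name resolve_scale_factors is hypothetical, and the even spacing of factors across the range is assumed. The documented parts are the precedence (explicit factors are used only if no range is given) and the default list.

```python
# Default list as documented for volume_custom_scale_factors=None.
DEFAULT_SCALE_FACTORS = [0.90, 0.95, 0.98, 0.99, 1.01, 1.02, 1.05, 1.10]


def resolve_scale_factors(
    volume_scale_factor_range=None,
    volume_custom_scale_factors=None,
    n_structures=10,
):
    """Pick the volume scale factors to apply (hypothetical helper)."""
    if volume_scale_factor_range is not None:
        lower, upper = volume_scale_factor_range
        if n_structures < 2:
            return [lower]
        # Assumption: factors are spaced evenly between min and max.
        step = (upper - lower) / (n_structures - 1)
        return [lower + i * step for i in range(n_structures)]
    if volume_custom_scale_factors is not None:
        return list(volume_custom_scale_factors)
    return list(DEFAULT_SCALE_FACTORS)


from_range = resolve_scale_factors(
    volume_scale_factor_range=[0.90, 1.10], n_structures=5
)
from_custom = resolve_scale_factors(volume_custom_scale_factors=[0.97, 1.03])
from_default = resolve_scale_factors()
print(from_range, from_custom, from_default)
```

With a range of [0.90, 1.10] and n_structures=5, this sketch spans volume changes from -10% to +10% in five steps.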

make(structure_list, mp_ids, split_ratio=0.4, f_max=40.0, pre_xyz_files=None, pre_database_dir=None, preprocessing_data=True, atomwise_regularization_parameter=0.1, f_min=0.01, atom_wise_regularization=True, auto_delta=False, dft_references=None, benchmark_structures=None, benchmark_mp_ids=None, **fit_kwargs)

Make flow for constructing the dataset, fitting the potentials and performing the benchmarks.

Parameters:
  • structure_list (list[Structure]) – List of pymatgen structures.

  • mp_ids – Materials Project IDs.

  • split_ratio (float) – Parameter to divide the training set and the test set. A value of 0.1 means that the ratio of the training set to the test set is 9:1.

  • f_max (float) – Maximally allowed force in the data set.

  • pre_xyz_files (list[str] or None) – names of the pre-database train xyz file and test xyz file.

  • pre_database_dir (str or None) – the pre-database directory.

  • preprocessing_data (bool) – Whether to preprocess the data.

  • atomwise_regularization_parameter (float) – regularization value for the atom-wise force components.

  • f_min (float) – minimal force cutoff value for atom-wise regularization.

  • atom_wise_regularization (bool) – If True, includes atom-wise regularization.

  • auto_delta (bool) – If True, automatically determines delta for the 2b, 3b and SOAP terms.

  • dft_references (list[PhononBSDOSDoc] | None) – a list of DFT reference files containing the PhononBSDOSDoc object.

  • benchmark_structures (list[Structure] | None) – pymatgen structures for benchmarking.

  • benchmark_mp_ids (list[str] | None) – Materials Project IDs of the benchmarking structures.

  • fit_kwargs (dict) – dict including MLIP fit keyword args.
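
The effect of split_ratio and f_max on a dataset can be illustrated with a small standard-library sketch. The helper names, the dictionary layout of a configuration and the seeded shuffle are all hypothetical, and discarding a whole configuration when any force component exceeds f_max is an assumption. Only the 9:1 ratio for split_ratio=0.1 comes from the parameter description.

```python
import random


def filter_by_max_force(configs, f_max):
    """Keep configurations whose largest force component is <= f_max.

    Assumption: a configuration is dropped as a whole if any force
    component exceeds f_max in magnitude.
    """
    return [
        config
        for config in configs
        if max(abs(f) for atom in config["forces"] for f in atom) <= f_max
    ]


def train_test_split(configs, split_ratio, seed=0):
    """Split configurations so that a fraction split_ratio is test data."""
    shuffled = list(configs)
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * split_ratio)
    return shuffled[n_test:], shuffled[:n_test]


# Made-up dataset: 20 small-force configurations and one outlier.
configs = [{"id": i, "forces": [[0.1 * i, 0.0, 0.0]]} for i in range(20)]
configs.append({"id": 20, "forces": [[50.0, 0.0, 0.0]]})

kept = filter_by_max_force(configs, f_max=40.0)
train, test = train_test_split(kept, split_ratio=0.1)
print(len(kept), len(train), len(test))
```

Here the outlier with a 50.0 force component is removed first, and the remaining 20 configurations are split into 18 training and 2 test entries, which is the 9:1 ratio.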

static add_dft_phonons(structure, displacements, symprec, phonon_bulk_relax_maker, phonon_static_energy_maker, phonon_displacement_maker, supercell_settings)

Add DFT phonon runs for reference structures.

Parameters:
  • structure (Structure) – pymatgen Structure object

  • displacements (list[float]) – displacement distance for phonons

  • symprec (float) – Symmetry precision to use in the reduction of symmetry to find the primitive/conventional cell (use_primitive_standard_structure, use_conventional_standard_structure) and to handle all symmetry-related tasks in phonopy

  • phonon_displacement_maker (BaseVaspMaker) – Maker used to compute the forces for a supercell.

  • phonon_bulk_relax_maker (BaseVaspMaker) – Maker used for the bulk relax unit cell calculation.

  • phonon_static_energy_maker (BaseVaspMaker) – Maker used for the static energy unit cell calculation.

  • supercell_settings (dict) – supercell settings

static add_dft_random(structure, mp_id, rattled_bulk_relax_maker, displacement_maker, uc=False, volume_custom_scale_factors=None, volume_scale_factor_range=None, rattle_std=0.01, distort_type=0, n_structures=10, min_distance=1.5, angle_percentage_scale=10, angle_max_attempts=1000, rattle_type=0, rattle_seed=42, rattle_mc_n_iter=10, w_angle=None, supercell_settings=None)

Add DFT static runs for randomly displaced structures.

Parameters:
  • structure (Structure) – pymatgen Structure object

  • mp_id (str) – materials project id

  • displacement_maker (BaseVaspMaker) – Maker used for a static calculation for a supercell.

  • rattled_bulk_relax_maker (BaseVaspMaker) – Maker used for the bulk relax unit cell calculation.

  • uc (bool) – If True, will generate randomly distorted structures (unit cells) and add static computation jobs to the flow.

  • distort_type (int) – 0: volume distortion, 1: angle distortion, 2: volume and angle distortion. Default=0.

  • n_structures (int) – Total number of distorted structures to be generated. Must be provided if distorting volume without specifying a range, or if distorting angles. Default=10.

  • volume_scale_factor_range (list[float]) – [min, max] of volume scale factors, e.g. [0.90, 1.10] will distort the volume by ±10%.

  • volume_custom_scale_factors (list[float]) – Specify explicit scale factors (if range is not specified). If None, will default to [0.90, 0.95, 0.98, 0.99, 1.01, 1.02, 1.05, 1.10].

  • min_distance (float) – Minimum separation allowed between any two atoms. Default=1.5 Å.

  • angle_percentage_scale (float) – Angle scaling factor in percent. The default of 10 randomly distorts angles by ±10% of their original value.

  • angle_max_attempts (int) – Maximum number of attempts to distort a structure before aborting. Default=1000.

  • w_angle (list[float]) – List of angle indices to be changed i.e. 0=alpha, 1=beta, 2=gamma. Default= [0, 1, 2].

  • rattle_type (int) – 0: standard rattling, 1: Monte Carlo rattling. Default=0.

  • rattle_std (float) – Rattle amplitude (standard deviation of the normal distribution). Default=0.01. Note that for MC rattling, the generated displacements will be roughly rattle_mc_n_iter**0.5 * rattle_std for small values of rattle_mc_n_iter.

  • rattle_seed (int) – Seed for setting up the NumPy random state from which random numbers are generated. Default=42.

  • rattle_mc_n_iter (int) – Number of Monte Carlo iterations. A larger number of iterations will generate larger displacements. Default=10.

  • supercell_settings (dict) – settings for supercells
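
The rattle_std note above says that Monte Carlo rattling yields displacements of roughly rattle_mc_n_iter**0.5 * rattle_std. The sketch below checks that scaling with a deliberately simplified model: each iteration adds an independent Gaussian step and every step is accepted. This is an assumption made for illustration and not the rattling routine used by the workflow. A real MC rattle may reject moves (for example, ones violating min_distance), which is one reason the relation is only approximate. All function names are hypothetical.

```python
import math
import random


def random_walk_displacement(n_iter, rattle_std, rng):
    """Sum n_iter Gaussian steps along one Cartesian direction."""
    return sum(rng.gauss(0.0, rattle_std) for _ in range(n_iter))


def estimate_displacement_std(n_iter, rattle_std, n_samples=20000, seed=42):
    """Estimate the spread of the final displacement by sampling."""
    rng = random.Random(seed)
    samples = [
        random_walk_displacement(n_iter, rattle_std, rng)
        for _ in range(n_samples)
    ]
    mean = sum(samples) / n_samples
    variance = sum((s - mean) ** 2 for s in samples) / n_samples
    return math.sqrt(variance)


estimated = estimate_displacement_std(n_iter=10, rattle_std=0.01)
expected = math.sqrt(10) * 0.01
print(f"estimated: {estimated:.4f}, sqrt(n_iter) * std: {expected:.4f}")
```

In this simplified model the two values agree closely. It also shows why raising rattle_mc_n_iter increases the displacements even when rattle_std is kept fixed.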