autoplex.data.common.utils

Utility functions for training data jobs.

Functions

boltzhist_cur_dual_iter

Execute sampling with two iterations.

boltzhist_cur_one_shot

Sample atoms from a list according to Boltzmann energy weighting and CUR diversity.

check_distances

Take in a pymatgen Structure object and check minimum distances between atoms using the minimum image convention.

convexhull_cur

Sample atoms from a list according to Boltzmann energy weighting relative to convex hull and CUR diversity.

create_soap_descriptor

Generate a SOAP descriptor string based on the given parameters.

cur_select

Perform CUR selection on a set of atoms to get representative SOAP descriptors.

data_distillation

For data distillation.

energy_plot

Plot the distribution of energy per atom on the output vs the input.

extract_base_name

Extract the base of a file name to make it easier to manipulate other file names.

filter_outlier_energy

Filter data outliers per energy criteria and write them into files.

filter_outlier_forces

Filter data outliers per force criteria and write them into files.

flatten

Flatten an iterable fully, but excluding Atoms objects.

flatten_list

Flatten a nested list into a single list if necessary.

force_plot

Plot the distribution of force components per atom on the output vs the input.

handle_rss_trajectory

Handle trajectory and associated information.

mc_rattle

Take in a pymatgen Structure object and generate rattled structures.

parallel_calc_descriptor_vec

Calculate the SOAP descriptor vector for a given atom and hypers in parallel.

plot_energy_forces

Plot energy and forces of the data.

random_vary_angle

Take in a pymatgen Structure object and generate angle-distorted structures.

rms_dict

Compute the RMSE and standard deviation of predictions relative to reference data.

scale_cell

Take in a pymatgen Structure object and generate stretched or compressed structures.

std_rattle

Take in a pymatgen Structure object and generate rattled structures.

stratified_dataset_split

Split the dataset.

to_ase_trajectory

Convert to an ASE Trajectory.
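
The summary for flatten above says it flattens an iterable fully while leaving Atoms objects intact. The sketch below illustrates that idea only; it is not the autoplex implementation. The names `FakeAtoms` and `flatten_keeping` are hypothetical, and `FakeAtoms` is a stand-in for an iterable structure object such as an ASE Atoms object, so the example runs without extra dependencies.

```python
from collections.abc import Iterable


class FakeAtoms:
    """Hypothetical stand-in for an iterable structure object (e.g. ase.Atoms)."""

    def __init__(self, symbols):
        self.symbols = list(symbols)

    def __iter__(self):
        # Iterating yields the individual atoms, which is exactly why a
        # naive flatten would break the structure apart.
        return iter(self.symbols)


def flatten_keeping(items, leaf_types=(FakeAtoms,)):
    """Recursively flatten nested iterables, treating leaf_types as single items.

    Assumption: strings and bytes are also kept whole, since they are iterable
    but are rarely meant to be split into characters.
    """
    for item in items:
        if isinstance(item, leaf_types) or isinstance(item, (str, bytes)):
            yield item
        elif isinstance(item, Iterable):
            yield from flatten_keeping(item, leaf_types)
        else:
            yield item


water = FakeAtoms(["H", "H", "O"])
silicon = FakeAtoms(["Si"])
nested = [water, [silicon, [1, (2, 3)]], "ab"]
flat = list(flatten_keeping(nested))
# flat holds: water, silicon, 1, 2, 3, "ab"
```

Checking the leaf types before the generic Iterable test is the design choice that matters here: the structure objects are themselves iterable, so the order of the checks decides whether they survive as whole objects.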

Classes

ElementCollection

A class to handle different species operations for a collection of atoms.

GPa

Convert a string or number to a floating point number, if possible.
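
The rms_dict entry under Functions describes computing an RMSE and a standard deviation of predictions against reference data. As a rough illustration of those two quantities, here is a minimal sketch in plain Python. The function name `rmse_and_std`, its argument names, and the returned keys are hypothetical, and the exact definition of the standard deviation used by rms_dict is not stated on this page; this sketch assumes the population standard deviation of the signed errors.

```python
import math


def rmse_and_std(predicted, reference):
    """Return the RMSE of the errors and their population standard deviation.

    Hypothetical helper for illustration; not the autoplex rms_dict function.
    """
    if len(predicted) != len(reference):
        raise ValueError("predicted and reference must have the same length")
    if not predicted:
        raise ValueError("at least one value is required")

    errors = [p - r for p, r in zip(predicted, reference)]
    n = len(errors)

    # Root mean square error: square, average, take the root.
    rmse = math.sqrt(sum(e * e for e in errors) / n)

    # Spread of the signed errors around their mean (assumed definition).
    mean_error = sum(errors) / n
    std = math.sqrt(sum((e - mean_error) ** 2 for e in errors) / n)

    return {"rmse": rmse, "std": std}


result = rmse_and_std([1.0, 2.0, 3.0], [1.0, 2.0, 5.0])
```

With a single outlier of -2 among three otherwise exact predictions, the RMSE is sqrt(4/3) and the standard deviation is sqrt(8/9), which shows how the two measures differ when the errors have a nonzero mean.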