Transition tube sampling

Contributed by Krystof Brezina on Oct 18, 2024.

How to use TTS

Transition tube sampling (TTS) is a method of sampling thermal reactive candidate geometries for, primarily, the training of reactive neural network potentials (NNPs). [1] [2] In general, this is quite a difficult task, because in order to overcome a thermally insurmountable potential barrier, one has to perform a kind of enhanced sampling simulation to drive the system over it: this comes with an inpractically high computational cost when done at an ab initio level. TTS significantly reduces this computational cost as it only requires a discretized estimate of the reaction path (e.g., a minimum energy path, MEP) and a small numbers of Hessian matrices evaluated at chosen control points (geometries \(\mathbf{R}_c, \ c = 1, \dots, N_c\)) along the path. Typically, selecting control points at the reagent and the product minima and the transition state is enough, but one can go as dense as they want.

Setting up

Having calculated the Hesians for the \(N_c\) control points (and arranged them in an Iterable, e.g., a list hessians), one can easily set up the normal mode objects using the NormalModes class of the TTS module:

import transition_tube_sampling as tts
normalmodes = [tts.NormalModes(h, masses) for h in hessians]

where masses is an Iterable of atomic masses used for mass-weighing purposes. The module has some clever ways of dealing with Hessians from CP2K and ASE, for any other Hessian one must convert it externally to mass-weighted atomic units and enter it directly following the documentation below.

Once the local modes normalmodes are set up, one can move to the TTS itself through the NormalModeSampling class, which works in the following way. Given the geometries of the MEP listed in an iterable (e.g., positions), the code runs a cubic spline through it to have an analytic description of the MEP and its tangent vectors (derivatives). The other output to set up the NormalModeSampling class if the control hessians from above, just note that the code requires a dictionary, where the keys are integer indices of the control points within the discretized MEP (something like idx_c = [0, 5, 10] for a control point at the 0th, 5th and 10th position of the MEP). You can get this easily through, for instance, a dictionary comprehension:

assert len(normalmodes) == len(idx_c)
dnormalmodes = {idx: nm for idx, nm in zip(idx_c, normalmodes)}

and now, the NormalModeSampling object is set up just using:

sampling = tts.NormalModeSampling(positions, dnormalmodes)

and is ready to sample new geometries.

Sampling new geometries

This is achieved through calling the main interface of the class:

new_geometries = sampling.get_samples(...)

with appropriate parameters. The way this works is that for each of the \(c\) control points, this erects a distribution of reference geometries \({\mathbf{R}_0}\) peaking at the position of the control point and decaying away. This encourages using local modes “locally”; summing up over these distributions gives an exactly uniform distribution along the path. Now, each of the reference geometries is displaced perpendicularly from the path using its assigned control normal modes

\[\mathbf{R} = \mathbf{R}_0 + \mathcal{M}^{-\frac{1}{2}}\sum_{i=1}^{N_\mathrm{vib}} \Omega_{i,c} \mathbb{\Omega}_{i,c},\]

where \(\mathcal{M}\) is the diagonal mass matrix and the values of normal coordinates \(\Omega_{i,c}\) are sampled from the harmonic Boltzmann distribution

\[p(\Omega_{1,c}, \dots, \Omega_{N_\mathrm{vib},c}) = \prod_{i=1}^{N_\mathrm{vib}} \mathrm{exp} \left( -\frac{1}{2}\beta\omega_{i,c}^2 \Omega_{i,c}^2 \right)\]

at an inverse temperature \(\beta = (k_\mathbf{B} T)^{-1}\). Note that imaginary-frequency modes are ommited as they correspond to the reactive direction which is already accounted for. Choosing sampling_mode=’quantum’ invokes the calculation of the effective quantum inverse temperature

\[\beta^* = \frac{2}{\hbar\omega} \tanh \left( \frac{\beta\hbar\omega}{2} \right)\]

for each mode separately, yielding the quantum harmonic distribution instead.

Important notes

TTS reduces to a normal mode sampling around a single minimum if len(dnormalmodes) == 1. This can be used to get thermal structures of non-reactive molecules and crystals as long as they are reasonably described by the harmonic approximation.In that case, the obtained set of structures is canonical on the harmonic potential energy surface.

In the case of TTS beyond a single minimum, the obtained ensemble of structures does not correspond to the canonical ensemble. No expectation values should be calculated over it, its purpose is only auxiliary to provide training structures for NNP generation.

Code documentation

class tts.transition_tube_sampling.NormalModeSampling(positions: ndarray, normal_modes: dict)[source]

Sample geometries based on normal modes.

Generate geometry samples around a single minimum or around a minimum energy path. The weight of the sampling is uniform along the MEP between the endpoints. Perpendicular to the MEP and beyong the endpoints, the weighting is thermal, based on either classical or quantum thermal probability densities.

Written by Krystof Brezina in the group of Ondrej Marsalek at the Charles University in Prague, 2023. If you use this code, please cite Brezina K. et al. JCTC, 19(19), 6589-6604, 2023.

get_samples(temperature: float, sampling_mode: str, mep_density: float, match_density_at_minimum: bool = False, n_points_min: int | None = None) → dict[source]

Perform thermal sampling around an MEP or a minimum.

This is the main interface of this class.

Arguments:

temperature: temperature in Kelvin for the thermal sampling

sampling_mode: ‘classical’ or ‘quantum’; select whether effective quantum temperatures should be calculated for each mode

mep_density: linear density of points per unit length (a0) along the MEP

match_density_at_minimum: if True, the density of points at the minimum is matched with the MEP density without the need to specify n_points_min

n_points_min: number of points to sample at the minimum

Returns:

A dictionary of:

xis: values of xi on the MEP from which the samples originate

distortions: generated thermal distortions, array of shape (number of distortions, number of atoms, 3)

labels: integer labels which denote the origin of each distortion, array of shape (number of distortions)

class tts.transition_tube_sampling.NormalModes(hessian: ndarray, masses: ndarray | None = None)[source]

A representation of normal modes.

Provided a Hessian matrix, calculate normal mode vectors and corresponding frequencies, optionally filtering out translational and rotational degrees of freedom or unstable modes.

Normal modes and frequencies are made available in mass-weighted atomic units.

property frequencies: Normal mode frequencies in atomic units.

classmethod from_ase(hessian_ase: ndarray, masses: ndarray)[source]: Take an ASE Hessian, convert units, mass-weigh and return a NormalModes object.

classmethod from_cp2k(fn_in: str, n_atoms: int, masses: ndarray | None = None)[source]: Read a CP2K Hessian file and return a NormalModes object.

property modes: Normal mode vectors in atomic units.

update_normal_modes(symmetrize: bool = False, order_SP: int = 0, discard_imag_freq: bool = True, eps_freq: float = 0.0005, n_TR: int = 6) → None[source]

Diagonalize the Hessian to get normal modes and their frequencies.

Optionally also filter out translational and rotational degrees of freedom.

Arguments:

symmetrize: activate symmetrization of the Hessian

order_SP: expected saddle point order, number of finite imaginary frequencies at the configuration

discard_imag_freq: discard unstable modes when order_SP > 0

eps_freq: numerical threshold to check frequencies in a.u., 5e-4 ~ 100 cm^-1.

n_TR: how many modes to filter to remove rotations and translations

property wavenumbers: Normal mode wavenumbers in spectroscopic units (cm^-1).