Denoising diffusion probabilistic models
DDPMs23 are a class of generative models inspired by non-equilibrium thermodynamics. In brief, they define a Markov chain of random diffusion steps by slowly adding noise to sample data and then learning the reverse of this process (typically via a neural network) to reconstruct data samples from noise.
In this work, we closely follow the framework developed by Hoogeboom et al.24. In our setting, data samples are atomic point clouds \(\mathbf{z}_{\mathrm{data}}=[\mathbf{x},\mathbf{h}]\) with 3D geometric coordinates \(\mathbf{x}\in\mathbb{R}^{N\times 3}\) and categorical features \(\mathbf{h}\in\mathbb{R}^{N\times d}\), where N is the number of atoms. A fixed noise process
$$q\left(\mathbf{z}_t\,|\,\mathbf{z}_{\mathrm{data}}\right)=\mathcal{N}\left(\mathbf{z}_t\,|\,\alpha_t\mathbf{z}_{\mathrm{data}},\sigma_t^2 I\right)$$
(1)
adds noise to the data \(\mathbf{z}_{\mathrm{data}}\) and produces a latent noised representation \(\mathbf{z}_t\) for t = 0, …, T. \(\sigma_t^2\) is the variance of the Gaussian noise distribution. \(\alpha_t\) controls the signal-to-noise ratio \(\mathrm{SNR}(t)=\alpha_t^2/\sigma_t^2\) and follows either a learned or pre-defined schedule from \(\alpha_0\approx 1\) to \(\alpha_T\approx 0\) (ref. 37). We choose a variance-preserving noising process32 with \(\alpha_t=\sqrt{1-\sigma_t^2}\). I is an identity matrix.
As the noising process is Markovian, we can write the denoising transition from time step t to s < t in closed form as
$$q(\mathbf{z}_s\,|\,\mathbf{z}_{\mathrm{data}},\mathbf{z}_t)=\mathcal{N}\left(\mathbf{z}_s\,\Bigg|\,\frac{\alpha_{t|s}\sigma_s^2}{\sigma_t^2}\mathbf{z}_t+\frac{\alpha_s\sigma_{t|s}^2}{\sigma_t^2}\mathbf{z}_{\mathrm{data}},\frac{\sigma_{t|s}^2\sigma_s^2}{\sigma_t^2}I\right)$$
(2)
with \(\alpha_{t|s}=\frac{\alpha_t}{\alpha_s}\) and \(\sigma_{t|s}^2=\sigma_t^2-\alpha_{t|s}^2\sigma_s^2\) following the notation of Hoogeboom et al.24. This true denoising process depends on the data sample \(\mathbf{z}_{\mathrm{data}}\), which is not available when using the model for generating new samples. Instead, a neural network \(\phi_\theta\), where θ indicates trainable parameters, is used to approximate the sample \(\hat{\mathbf{z}}_{\mathrm{data}}\). More specifically, we can reparameterize equation (1) as \(\mathbf{z}_t=\alpha_t\mathbf{z}_{\mathrm{data}}+\sigma_t\boldsymbol{\epsilon}\) with \(\boldsymbol{\epsilon}\sim\mathcal{N}(\mathbf{0},I)\) and directly predict the Gaussian noise \(\hat{\boldsymbol{\epsilon}}_\theta=\phi_\theta(\mathbf{z}_t,t)\). Thus, \(\hat{\mathbf{z}}_{\mathrm{data}}\) is simply given as \(\hat{\mathbf{z}}_{\mathrm{data}}=\frac{1}{\alpha_t}\mathbf{z}_t-\frac{\sigma_t}{\alpha_t}\hat{\boldsymbol{\epsilon}}_\theta\).
The neural network is trained to maximize the likelihood of the observed data by optimizing a variational lower bound on the data likelihood, which is equivalent to the simplified training objective \(\mathcal{L}_{\mathrm{train}}=\frac{1}{2}\|\boldsymbol{\epsilon}-\phi_\theta(\mathbf{z}_t,t)\|^2\) up to a scale factor23,37. See Supplementary Section 1 for details.
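To make this concrete, the following is a minimal PyTorch sketch of one training step under this objective; phi_theta, alpha and sigma are placeholder names for the noise predictor and precomputed schedule tensors, not the authors' implementation.

```python
import torch

def ddpm_training_loss(phi_theta, z_data, alpha, sigma, T):
    """Simplified DDPM objective L_train = 1/2 ||eps - phi_theta(z_t, t)||^2 (sketch)."""
    batch_size = z_data.shape[0]
    # Sample a random time step per example and Gaussian noise.
    t = torch.randint(0, T + 1, (batch_size,), device=z_data.device)
    eps = torch.randn_like(z_data)
    # Forward noising: z_t = alpha_t * z_data + sigma_t * eps (equation (1)).
    a_t = alpha[t].view(-1, *([1] * (z_data.dim() - 1)))
    s_t = sigma[t].view(-1, *([1] * (z_data.dim() - 1)))
    z_t = a_t * z_data + s_t * eps
    # Predict the noise and compute the simplified loss
    # (averaging over batch and dimensions; constant factors do not affect optimization).
    eps_hat = phi_theta(z_t, t)
    return 0.5 * ((eps - eps_hat) ** 2).mean()
```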
Equivariance
Structural biology remains a rather data-sparse domain. It is therefore common practice to encode known geometric constraints, typically equivariance to rotations and translations, directly into the neural network architecture, thereby facilitating the learning task because possible neural operations are limited to a meaningful subset. In the 3D molecule-generation setting, we explicitly exclude reflection-equivariant operations because they would make the model blind to some aspects of stereochemistry. It is known that different stereoisomers can have fundamentally different therapeutic effects (for example, ref. 38; Fig. 1e) and might even lead to unforeseen off-target activity and hence toxicity. We therefore developed a reflection-sensitive system that is SE(3)-equivariant rather than E(3)-equivariant although the latter is more commonly adopted in related studies18,24,39.
Technically, we ensure SE(3)-equivariance in the following sense: evaluating the likelihood of a molecule \(\mathbf{x}^{(\mathrm{L})}\in\mathbb{R}^{3\times N_{\mathrm{L}}}\) given the 3D representation of a protein pocket \(\mathbf{x}^{(\mathrm{P})}\in\mathbb{R}^{3\times N_{\mathrm{P}}}\) should not depend on global SE(3)-transformations of the system, meaning p(Rx(L) + t∣Rx(P) + t) = p(x(L)∣x(P)) for orthogonal \(R\in\mathbb{R}^{3\times 3}\) with \(R^{\mathrm{T}}R=I\), \(\det(R)=1\) and \(\mathbf{t}\in\mathbb{R}^3\) added column-wise. At the same time, it should be possible to generate samples x(L) ~ p(x(L)∣x(P)) from this conditional probability distribution so that equivalently transformed ligands Rx(L) + t are sampled with the same probability if the input pocket is rotated and translated and we sample from p(Rx(L) + t∣Rx(P) + t). This definition explicitly excludes reflections that are connected with chirality and can alter the biomolecule's properties. Node-type features, which transform invariantly, are ignored in this discussion for simpler notation.
In our set-up, equivariance to the orthogonal group O(3) (comprising rotations and reflections) is achieved because we model both prior and transition probabilities with isotropic Gaussians where the mean vector transforms equivariantly with respect to rotations of the context (see Hoogeboom et al.24 and Supplementary Section 3). Ensuring translation equivariance, however, is harder because the transition probabilities \(p(\mathbf{z}_{t-1}\,|\,\mathbf{z}_t)\) are not inherently translation-equivariant. To circumvent this issue, we follow previous studies24,40 by limiting the whole sampling process to a linear subspace where the center of mass (COM) of the system is zero. In practice, this is achieved by subtracting the COM of the system before performing likelihood computations or denoising steps. As equivariance of the transition probabilities depends on the parameterization of the noise predictor \(\hat{\boldsymbol{\epsilon}}_\theta\), we can make the model sensitive to reflections with a simple additive cross-product term in the neural network's coordinate update as discussed in the next section and Supplementary Section 4.
SE(3)-equivariant GNNs
A function \(f:\mathcal{X}\to\mathcal{Y}\) is said to be equivariant with respect to the group G if f(g.x) = g.f(x), where g. denotes the action of the group element g ∈ G on \(\mathcal{X}\) and \(\mathcal{Y}\) (ref. 41). GNNs are learnable functions that process graph-structured data in a permutation-equivariant way, making them particularly useful for molecular systems where nodes do not have an intrinsic order. Permutation equivariance means that GNN(ΠX) = ΠGNN(X), where Π is an n × n permutation matrix acting on the node feature matrix.
As the nodes of the molecular graph represent the 3D coordinates of atoms, we are interested in additional equivariance with respect to the Euclidean group E(3) or rigid transformations. An E(3)-equivariant GNN (EGNN) satisfies EGNN(ΠXA + b) = Π EGNN(X)A + b for an orthogonal 3 × 3 matrix A with A⊤A = I and some translation vector b added row-wise.
In our case, as the nodes have both geometric atomic coordinates x as well as atomic type features h, we can use a simple implementation of EGNN proposed by Satorras et al.39, in which the updates for features h and coordinates x of node i at layer l are computed as follows:
$$\mathbf{m}_{ij}=\phi_e\left(\mathbf{h}_i^l,\mathbf{h}_j^l,d_{ij}^2,a_{ij}\right),\quad \tilde{e}_{ij}=\phi_{\mathrm{att}}\left(\mathbf{m}_{ij}\right)$$
(3)
$$\mathbf{h}_i^{l+1}=\phi_h\left(\mathbf{h}_i^l,\sum_{j\ne i}\tilde{e}_{ij}\mathbf{m}_{ij}\right)$$
(4)
$$\mathbf{x}_i^{l+1}=\mathbf{x}_i^l+\sum_{j\ne i}\frac{\mathbf{x}_i^l-\mathbf{x}_j^l}{d_{ij}+1}\phi_x\left(\mathbf{h}_i^l,\mathbf{h}_j^l,d_{ij}^2,a_{ij}\right)$$
(5)
where ϕe, ϕatt, ϕh and ϕx are learnable multilayer perceptrons (MLPs) and dij and aij are the relative distances and edge features between nodes i and j, respectively. mij and \(\tilde{e}_{ij}\) are messages and attention coefficients, respectively. Following Igashov et al.36, we do not update the coordinates of nodes that belong to the pocket to ensure the 3D protein context remains fixed throughout the EGNN layers.
We can break the symmetry to reflections and thereby make the GNN layer SE(3)-equivariant by adding a cross-product-dependent term to the coordinate update, which changes sign under reflection:
$$\mathbf{x}_i^{l+1}=\mathbf{x}_i^l+\sum_{j\ne i}\frac{\mathbf{x}_i^l-\mathbf{x}_j^l}{d_{ij}+1}\phi_x^d\left(\mathbf{h}_i^l,\mathbf{h}_j^l,d_{ij}^2,a_{ij}\right)$$
(6)
$$+\frac{\left(\mathbf{x}_i^l-\bar{\mathbf{x}}^l\right)\times\left(\mathbf{x}_j^l-\bar{\mathbf{x}}^l\right)}{\left\|\left(\mathbf{x}_i^l-\bar{\mathbf{x}}^l\right)\times\left(\mathbf{x}_j^l-\bar{\mathbf{x}}^l\right)\right\|+1}\phi_x^\times\left(\mathbf{h}_i^l,\mathbf{h}_j^l,d_{ij}^2,a_{ij}\right).$$
(7)
Here, \(\bar{\mathbf{x}}^l\) denotes the COM of all nodes at layer l. \(\phi_x^\times\) is an additional MLP. The desired SE(3)-equivariance of this modification is discussed in Supplementary Section 4.
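For illustration, a simplified PyTorch sketch of such a layer on a fully connected graph is shown below. It combines the message and feature updates of equations (3) and (4) with the reflection-sensitive coordinate update of equations (6) and (7); the MLP architectures and hidden sizes are placeholders, edge features aij are omitted, and (unlike the actual model) pocket coordinates are not kept fixed here.

```python
import torch
import torch.nn as nn

class SE3EGNNLayer(nn.Module):
    """Sketch of an EGNN layer with a reflection-sensitive cross-product term."""

    def __init__(self, h_dim, hidden_dim=128):
        super().__init__()
        edge_in = 2 * h_dim + 1  # h_i, h_j and d_ij^2 (edge features a_ij omitted)
        self.phi_e = nn.Sequential(nn.Linear(edge_in, hidden_dim), nn.SiLU(),
                                   nn.Linear(hidden_dim, hidden_dim))
        self.phi_att = nn.Sequential(nn.Linear(hidden_dim, 1), nn.Sigmoid())
        self.phi_h = nn.Sequential(nn.Linear(h_dim + hidden_dim, h_dim))
        self.phi_x_d = nn.Linear(edge_in, 1)       # weight of the distance-based term
        self.phi_x_cross = nn.Linear(edge_in, 1)   # weight of the cross-product term

    def forward(self, x, h):
        n = x.shape[0]
        diff = x[:, None, :] - x[None, :, :]                  # (n, n, 3)
        d2 = (diff ** 2).sum(-1, keepdim=True)                # squared distances (n, n, 1)
        pair = torch.cat([h[:, None].expand(-1, n, -1),
                          h[None, :].expand(n, -1, -1), d2], dim=-1)
        m = self.phi_e(pair)                                  # messages, equation (3)
        e = self.phi_att(m)                                   # attention coefficients
        mask = 1.0 - torch.eye(n, device=x.device)[..., None] # exclude j == i
        h_new = self.phi_h(torch.cat([h, (e * m * mask).sum(1)], dim=-1))  # equation (4)

        # Distance-based, reflection-symmetric coordinate update, equation (6).
        radial = diff / (d2.sqrt() + 1.0) * self.phi_x_d(pair)
        # Cross-product term changes sign under reflection, equation (7).
        x_bar = x.mean(0, keepdim=True)
        cross = torch.linalg.cross((x - x_bar)[:, None, :], (x - x_bar)[None, :, :], dim=-1)
        cross = cross / (cross.norm(dim=-1, keepdim=True) + 1.0) * self.phi_x_cross(pair)
        x_new = x + ((radial + cross) * mask).sum(1)
        return x_new, h_new
```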
Inpainting
For molecular inpainting as shown in Fig. 1c, a subset of all atoms is fixed and serves as the molecular context we want to condition on. All other atoms are generated by the DDPM. To this end, we sample a diffused representation \(\mathbf{z}_t^{\mathrm{input}}\) of the fixed atoms \(\mathbf{z}_{\mathrm{data}}\) at every time step t in addition to the predicted latent representation \(\mathbf{z}_t^{\mathrm{gen}}\). A set of mask indices \(\mathcal{M}\) uniquely identifies nodes corresponding to fixed atoms in \(\mathbf{z}_t^{\mathrm{gen}}\). Note that \(\mathbf{z}_t^{\mathrm{input}}\) contains exactly \(|\mathcal{M}|\) atoms, whereas \(\mathbf{z}_t^{\mathrm{gen}}\) contains more. For every denoising step, we then replace the generated atoms corresponding to fixed nodes (\(\mathbf{z}_{t-1,i\in\mathcal{M}}^{\mathrm{gen}}\)) with their forward-noised counterparts:
$$\mathbf{z}_{t-1}^{\mathrm{input}} \sim q\left(\mathbf{z}_{t-1}\,|\,\mathbf{z}_{\mathrm{data}}\right)$$
(8)
$$\mathbf{z}_{t-1}^{\mathrm{gen}} \sim p_\theta\left(\mathbf{z}_{t-1}\,|\,\mathbf{z}_t\right)$$
(9)
$$\mathbf{z}_{t-1}=\left[\mathbf{z}_{t-1}^{\mathrm{input}},\mathbf{z}_{t-1,i\notin\mathcal{M}}^{\mathrm{gen}}\right].$$
(10)
In this manner, we traverse the Markov chain in reverse order from t = T to t = 0 to generate conditional samples. Because the noise schedule decreases the noising process’s variance to almost zero at t = 0 (‘Denoising diffusion probabilistic models’ section), the final sample is guaranteed to contain an unperturbed representation of the fixed atoms. This approach was applied to pocket-conditioned ligand inpainting by fixing all pocket nodes when sampling from the joint distribution model (DiffSBDD-joint). It was also used in the substructure design experiments.
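A schematic sketch of this replacement procedure is given below; denoise_step (the learned reverse transition, including the pocket conditioning) and forward_noise (equation (1)) are assumed helper functions, and the COM correction described next is omitted for brevity.

```python
import torch

def inpaint(z_data_fixed, mask, z_T, forward_noise, denoise_step, T):
    """Replacement-based inpainting (equations (8)-(10), sketch only).

    Fixed atoms are re-noised from the data at every step, generated atoms
    come from the learned reverse process; `mask` is a boolean index over nodes.
    """
    z_t = z_T
    for t in range(T, 0, -1):
        z_gen = denoise_step(z_t, t)                  # z_{t-1}^gen ~ p_theta(z_{t-1} | z_t)
        z_input = forward_noise(z_data_fixed, t - 1)  # z_{t-1}^input ~ q(z_{t-1} | z_data)
        z_t = z_gen.clone()
        z_t[mask] = z_input                           # combine fixed and generated parts
    return z_t
```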
Equivariance
As the equivariant diffusion process is defined for a COM-free system, we must ensure that this requirement remains satisfied after the substitution step in equation (10). To prevent a COM shift, we therefore translate the fixed-atom representation so that its COM coincides with that of the predicted representation:
$$\tilde{\mathbf{x}}_{t-1}^{\mathrm{input}}=\mathbf{x}_{t-1}^{\mathrm{input}}+\frac{1}{n}\sum_{i\in\mathcal{M}}\mathbf{x}_{t-1,i}^{\mathrm{gen}}-\frac{1}{n}\sum_{i\in\mathcal{M}}\mathbf{x}_{t-1,i}^{\mathrm{input}}$$
(11)
before creating the new combined representation
$$\mathbf{z}_{t-1}=\left[\tilde{\mathbf{z}}_{t-1}^{\mathrm{input}},\mathbf{z}_{t-1,i\notin\mathcal{M}}^{\mathrm{gen}}\right]$$
(12)
with \(\tilde{\mathbf{z}}_{t-1}^{\mathrm{input}}=\left[\tilde{\mathbf{x}}_{t-1}^{\mathrm{input}},\mathbf{h}_{t-1}^{\mathrm{input}}\right]\) and \(n=|\mathcal{M}|\).
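In code, the COM correction of equation (11) is a single translation applied to the fixed-atom coordinates before the replacement step; the sketch below assumes x_input has shape |M| × 3 and x_gen holds the coordinates of the generated representation.

```python
import torch

def align_com(x_input, x_gen, mask):
    """Translate fixed-atom coordinates so that their COM matches the COM of the
    corresponding generated atoms (equation (11)); `mask` indexes the fixed nodes."""
    shift = x_gen[mask].mean(dim=0) - x_input.mean(dim=0)
    return x_input + shift
```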
Resampling
Trippe et al.42 showed that this simple replacement method inevitably introduces approximation error that can lead to inconsistent inpainted regions. In our experiments, we observe that the inpainting solution sometimes generates disconnected molecules that are not properly positioned in the target pocket (see Supplementary Fig. 1a for an example). Trippe et al.42 proposed to address this limitation with a particle filtering scheme that upweights more consistent samples in each denoising step. We, however, choose to adopt the conceptually simpler idea of resampling33, where each latent representation is repeatedly diffused back and forth before advancing to the next time step as demonstrated in the algorithm in Supplementary Section 6.4. This enables the model to harmonize its prediction for the generated part and the noisy sample from the fixed part, which does not include any information about the generated part. We choose r = 10 resamplings per denoising step for our experiments with DiffSBDD-joint based on empirical results discussed in Supplementary Section 5.4.
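A sketch of one resampled denoising step is shown below; denoise_step again stands for the combined reverse/replacement step, and forward_noise_step for a hypothetical one-step forward diffusion q(z_t | z_{t−1}).

```python
def denoise_with_resampling(z_t, t, denoise_step, forward_noise_step, r=10):
    """One resampled denoising step t -> t-1: diffuse back and forth r times so that
    the generated and fixed parts of the latent representation can harmonize."""
    for i in range(r):
        z_prev = denoise_step(z_t, t)            # reverse step including the replacement
        if i < r - 1:
            z_t = forward_noise_step(z_prev, t)  # one-step forward diffusion and repeat
    return z_prev
```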
Implementation details
Molecule size
As part of a sample’s overall likelihood, we compute the empirical joint distribution of ligand and pocket nodes p(NL, NP) observed in the training set and smooth it with a Gaussian filter (σ = 1). In the conditional generation scenario, we derive the distribution p(NL∣NP) and use it for likelihood computations.
For sampling, we can either fix molecule sizes manually or sample the number of ligand nodes from the same distribution given the number of nodes in the target pocket:
$$N_{\mathrm{L}} \sim p(N_{\mathrm{L}}\,|\,N_{\mathrm{P}}).$$
(13)
For the experiments discussed in ‘DiffSBDD captures the underlying data distribution’ section, we increase the mean size of sampled molecules by five (CrossDocked) and ten (Binding MOAD) atoms, respectively, to approximately match the sizes of molecules found in the test set. This modification makes the reported Vina scores more comparable as the in silico docking score is highly correlated with the molecular size, which is demonstrated in Supplementary Fig. 4. Average molecule sizes after applying the correction are shown in Supplementary Table 7 together with corresponding values for generated molecules from other methods.
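A sketch of how the smoothed size distribution can be built and sampled is given below; the array sizes, the optional mean shift and all names are illustrative rather than taken from the released code.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def build_size_distribution(sizes, max_nl=200, max_np=600, sigma=1.0):
    """Smoothed empirical joint histogram p(N_L, N_P) over training-set size pairs."""
    hist = np.zeros((max_nl + 1, max_np + 1))
    for n_lig, n_poc in sizes:
        hist[n_lig, n_poc] += 1
    return gaussian_filter(hist, sigma=sigma)

def sample_ligand_size(joint_hist, n_pocket, size_shift=0, rng=None):
    """Sample N_L ~ p(N_L | N_P = n_pocket) (equation (13)), optionally shifting the mean."""
    rng = rng or np.random.default_rng()
    cond = joint_hist[:, n_pocket]
    cond = cond / cond.sum()
    return int(rng.choice(len(cond), p=cond)) + size_shift
```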
Featurization
All molecules are expressed as graphs in which every atom is represented by a node. To process ligand and pocket nodes with a single GNN, atom types and residue types are first embedded in a joint node embedding space by separate learnable MLPs (Fig. 1f). We also experimented with coarse-grained Cα descriptions of the pockets to reduce processing time but found this representation to be inferior in most cases (Supplementary Section 5.9). The full atom model uses the same one-hot encoding of atom types for ligand and protein nodes. For the Cα-only model, the node features of the protein are set as one-hot encodings of the amino acid type instead.
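The joint embedding can be sketched as follows (hidden dimensions and module names are assumptions); for the full-atom model the pocket one-hots encode atom types, whereas the Cα-only model uses amino acid types.

```python
import torch
import torch.nn as nn

class JointNodeEmbedding(nn.Module):
    """Map ligand-node and pocket-node one-hot features into one shared embedding space."""

    def __init__(self, n_ligand_types, n_pocket_types, joint_dim=128):
        super().__init__()
        self.ligand_mlp = nn.Sequential(nn.Linear(n_ligand_types, joint_dim), nn.SiLU(),
                                        nn.Linear(joint_dim, joint_dim))
        self.pocket_mlp = nn.Sequential(nn.Linear(n_pocket_types, joint_dim), nn.SiLU(),
                                        nn.Linear(joint_dim, joint_dim))

    def forward(self, ligand_onehot, pocket_onehot):
        # Ligand and pocket nodes form a single graph after embedding.
        return torch.cat([self.ligand_mlp(ligand_onehot),
                          self.pocket_mlp(pocket_onehot)], dim=0)
```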
Noise schedule
We use the pre-defined polynomial noise schedule introduced in ref. 24:
$$\tilde{\alpha}_t=1-\left(\frac{t}{T}\right)^2,\quad t=0,\ldots,T.$$
(14)
Following refs. 24,43, values of \(\tilde{\alpha}_{t|t-1}^2=\left(\frac{\tilde{\alpha}_t}{\tilde{\alpha}_{t-1}}\right)^2\) are clipped between 0.001 and 1 for numerical stability near t = T, and \(\tilde{\alpha}_t\) is recomputed as
$$\tilde{\alpha}_t=\prod_{\tau=0}^{t}\tilde{\alpha}_{\tau|\tau-1}.$$
(15)
A tiny offset \(\epsilon=10^{-5}\) is used to avoid numerical problems at t = 0, defining the final noise schedule:
$$\alpha_t^2=(1-2\epsilon)\cdot\tilde{\alpha}_t^2+\epsilon.$$
(16)
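Putting equations (14)–(16) together, the schedule can be precomputed as in the following sketch (a re-implementation in the spirit of refs. 24,43, not the authors' exact code).

```python
import numpy as np

def polynomial_noise_schedule(T, epsilon=1e-5, power=2):
    """Clipped polynomial alpha schedule (equations (14)-(16)) for a
    variance-preserving process; returns alpha_t and sigma_t for t = 0, ..., T."""
    t = np.arange(T + 1)
    alpha_tilde2 = (1.0 - (t / T) ** power) ** 2          # equation (14), squared
    # Clip the per-step ratios for numerical stability near t = T ...
    ratios2 = np.clip(alpha_tilde2[1:] / alpha_tilde2[:-1], 0.001, 1.0)
    # ... and recompute the cumulative schedule (equation (15)).
    alpha_tilde2 = np.concatenate([[1.0], np.cumprod(ratios2)])
    # Final schedule with a small offset at t = 0 (equation (16)).
    alpha2 = (1 - 2 * epsilon) * alpha_tilde2 + epsilon
    sigma2 = 1.0 - alpha2                                  # variance preservation
    return np.sqrt(alpha2), np.sqrt(sigma2)
```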
Feature scaling
We scale the node-type features h by a factor of 0.25 relative to the coordinates x, which was empirically found to improve model performance in previous work24. To train joint probability models in the all-atom scenario, it was necessary to scale down the coordinates (and corresponding distance cut-offs) by a factor of 0.2 instead to avoid introducing too many edges in the graph near the end of the diffusion process at t = T.
Postprocessing
For postprocessing of generated molecules, we use a similar procedure as in ref. 44. Given a list of atom types and coordinates, bonds are first added using OpenBabel45. We then use RDKit to sanitize molecules and filter for the largest molecular fragment.
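A sketch of this postprocessing pipeline using the pybel interface to OpenBabel and RDKit is shown below; reading from an xyz file and the temporary SDF path are assumptions.

```python
from openbabel import pybel
from rdkit import Chem

def build_molecule(xyz_path, sdf_path="tmp.sdf"):
    """Infer bonds with OpenBabel, then sanitize and keep the largest fragment with RDKit."""
    # OpenBabel perceives bonds when reading raw coordinates from an xyz file.
    obmol = next(pybel.readfile("xyz", xyz_path))
    obmol.write("sdf", sdf_path, overwrite=True)

    mol = Chem.SDMolSupplier(sdf_path, sanitize=False)[0]
    Chem.SanitizeMol(mol)
    # Keep only the largest connected fragment.
    frags = Chem.GetMolFrags(mol, asMols=True, sanitizeFrags=False)
    return max(frags, key=lambda m: m.GetNumAtoms())
```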
Quantitative evaluation of inpainting for the whole Binding MOAD test set
For all inpainting experiments across the whole test set, we perform automatic masking of atoms that are to be fixed. For scaffold elaboration, we extract the Bemis–Murcko scaffold46 using RDKit and compute a binary mask to fix the scaffold, while functional groups are redesigned. For scaffold hopping, we simply take the inverse of the mask used for scaffold elaboration. For linker design, we fragment each molecule in the test set in multiple ways as in Igashov et al.36. To benchmark against DiffLinker, we use the model weights and protocol as described in Igashov et al.36 except we give the ground-truth linker size as input, rather than predict it using the auxiliary model, for fairness. In small-scale experiments where finer control is desirable (for example, as in the fragment merging example described below), the binary mask is defined manually.
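For scaffold elaboration, the fixed-atom mask can, for example, be derived from RDKit's Bemis–Murcko scaffold via a substructure match, as in the following sketch (atom ordering is assumed to match the point cloud).

```python
import numpy as np
from rdkit.Chem.Scaffolds import MurckoScaffold

def scaffold_mask(mol):
    """Boolean mask over ligand atoms: True for Bemis-Murcko scaffold atoms (kept fixed)."""
    scaffold = MurckoScaffold.GetScaffoldForMol(mol)
    match = mol.GetSubstructMatch(scaffold)
    mask = np.zeros(mol.GetNumAtoms(), dtype=bool)
    mask[list(match)] = True
    return mask

# Scaffold hopping simply inverts the mask: fixed_atoms = ~scaffold_mask(mol)
```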
Depending on the use case, we find it desirable to perform molecular inpainting within two regimes: (1) designing a completely new inpainted region de novo (DiffSBDD-de novo) to explore the entire chemical fitness landscape; or (2) redesigning an existing region via partial noising then denoising (Supplementary Section 5.7), thus locally exploring desired properties by exploitation (DiffSBDD-diversify). The first case is more amenable to situations in which we have no prior information other than the fixed substructure (for example, fragment linking after a fragment screen), meaning that unconstrained exploration of the chemical fitness landscape is the preferred approach for the majority of SBDD. The second case is more relevant in scenarios where we have prior information about the desired chemical and topological composition of the designed region that we can use to bias generation (with the choice of t being a hyperparameter). This is particularly relevant in the case of scaffold hopping, where we try to keep the properties of a molecule relatively unchanged while designing a new topology47.
Molecular-inpainting case studies
All molecular-inpainting experiments shown in Extended Data Fig. 1a–e use a version of DiffSBDD-cond trained on Binding MOAD.
Scaffold hopping is performed for a mitotic kinesin Eg5 inhibitor (PDB code 2gm1)48 where we fix the functional groups mediating the binding to the pocket while designing a new scaffold structure.
The opposite case of scaffold elaboration is applied to a rationally designed inhibitor targeting the actin-associated protein ENAH EVH1 (PDB code 6rcj)49 where we fix the scaffold and design new functional groups.
Fragment merging is the task of combining fragments with an overlapping binding site50. For the example in this study, we replicate the results of Gahbauer et al.51, who performed fragment merging of two fragments (PDB codes 5rsw and 5rue) identified by experimental screening52 for the SARS-CoV-2 non-structural protein 3 (Nsp3) using the chemoinformatics-based method Fragmenstein53. To perform the fragment merge, instead of masking out and reinserting atoms, we choose to fix all atoms during generation except the atom on each fragment that is closest to the other fragment. We perform t = 200 steps of the DiffSBDD-diversify procedure to allow the model to arrange the atom positions as well as change the atom types. All PDB files were already structurally aligned.
Fragment growing is performed around the central motif of another inhibitor for the ENAH EVH1 target (PDB entry 5ndu)49.
The fragment linking example is based on the same target (PDB entry 5ndu). Here we are designing not only a small linker made of a few atoms but rather an entirely new fragment with two connecting linkers to join two outer fragments of the reference ligand.
Iterative molecule optimization
To perform property optimization as shown in Fig. 1d, we first noise a molecule from an experimental protein–ligand complex for t steps, where t ≪ T, using the forward diffusion process. From this partially noised sample, we can then denoise the appropriate number of steps with the reverse process until t = 0. The stochasticity in this quick noise/denoise process allows us to sample new and diverse candidates of various properties while staying in the same region of chemical space, assuming t is small (Supplementary Fig. 3). Note that this approach, which is inspired by Luo et al.54, does not allow for direct optimization of specific properties. Instead, it can be regarded as an exploration of the local chemical space while maintaining high shape and chemical complementarity via the conditional denoising model.
We extend this idea by combining the partial noising/denoising procedure with a simple evolutionary algorithm that optimizes for specific molecular properties (Fig. 1d). At every stage of the optimization process, we generate 100 new molecules (from the previous generation or, in the first iteration, from the original molecule). Molecules are modified via partial noising/denoising with a randomly chosen t between 10 and 150. The new molecules are then passed to an oracle/score function (for instance, a docking program or synthetic accessibility predictor) to be ranked. The top-k molecules are then selected to seed the new population. In our study, we use k = 10.
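A condensed sketch of this evolutionary loop is given below; diversify (the partial noising/denoising sampler) and oracle (for example, a docking score where higher is better) are assumed to be provided.

```python
import random

def evolve(seed_molecules, diversify, oracle, n_generations=5,
           population_size=100, top_k=10, t_min=10, t_max=150):
    """Evolutionary optimization via partial noising/denoising (sketch)."""
    population = list(seed_molecules)
    for _ in range(n_generations):
        candidates = []
        for _ in range(population_size):
            parent = random.choice(population)
            t = random.randint(t_min, t_max)   # random noising depth per candidate
            candidates.append(diversify(parent, t))
        # Rank by the oracle and keep the top-k molecules to seed the next generation.
        candidates.sort(key=oracle, reverse=True)
        population = candidates[:top_k]
    return population
```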
For the selective kinase design experiment, we additionally pruned any candidates that regress with regard to the on- and off-target docking scores of the original molecule before selecting the top molecules (that is those above or left of the red star in Extended Data Fig. 2f) to bias the molecules to have high affinity to the on-target kinase as well as specificity. The starting molecule has ChEMBL identifier CHEMBL388978.
Experimental set-up
Datasets
We use the CrossDocked dataset55 with 100,000 high-quality protein–ligand pairs for training and 100 proteins for testing, following the sequence-based data split of previous studies15,44.
We also evaluate our method on a curated dataset of experimentally determined protein–ligand complexes from Binding MOAD28. We keep pockets with moderately 'drug-like' ligands (QED score > 0.3) that pass the database's validity criteria. We further discard small molecules that contain atom types ∉ {C, N, O, S, B, Br, Cl, P, I, F} as well as binding pockets with non-standard amino acids. We define binding pockets as the set of residues that have any atom within 8 Å of any ligand atom. Ligand redundancy is reduced by randomly sampling at most 50 molecules with the same chemical component identifier (three-letter code). After removing corrupted entries that could not be processed, 40,344 training pairs and 130 testing pairs remain. A validation set of 246 pairs is used to monitor estimated log-likelihoods during training. The split ensures that different sets do not contain proteins from the same Enzyme Commission number main class.
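The 8 Å pocket definition can be computed directly from atomic coordinates, for example as in the following sketch with BioPython-style residue objects (attribute names are assumptions).

```python
import numpy as np

def get_pocket_residues(protein_residues, ligand_coords, cutoff=8.0):
    """Residues with any atom within `cutoff` angstroms of any ligand atom."""
    pocket = []
    for residue in protein_residues:            # e.g. BioPython Residue objects
        res_coords = np.array([atom.get_coord() for atom in residue])
        dists = np.linalg.norm(res_coords[:, None, :] - ligand_coords[None, :, :], axis=-1)
        if dists.min() < cutoff:
            pocket.append(residue)
    return pocket
```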
As various proteins could not be successfully processed by one or several baseline methods, our analysis of the distribution learning capabilities is performed for only pockets for which samples from all methods are available. These are 78 and 119 targets from CrossDocked and Binding MOAD, respectively.
Baselines
We select four recently published autoregressive deep learning methods for SBDD. Pocket2Mol15, ResGen25 and PocketFlow26 are sequential schemes relying on graph representations of the protein pocket and previously placed atoms to predict probabilities based on which new atoms are added. DeepICL27 pursues a similar sequential approach but strives to improve generalizability in the face of limited data by incorporating prior knowledge in the form of protein–ligand interaction patterns. They are currently the state of the art among this class of models. For Pocket2Mol, we re-evaluate already generated ligands on the CrossDocked dataset kindly provided by the authors. All other results were produced using the official implementations available online with default sampling parameters. Note that, unlike DiffSBDD, we therefore sample for the Binding MOAD test set with Pocket2Mol and ResGen models that have been trained on CrossDocked. As these two sets overlap (30 test set proteins from Binding MOAD are found in the CrossDocked training set), there is potential data leakage. In practice, however, we do not observe substantially different results when these targets are excluded from the analysis. We also attempted to train Pocket2Mol on Binding MOAD, but did not manage to robustly train the model on this dataset due to instability during training. PocketFlow was pretrained on about 8 million molecules from the ZINC database56 and finetuned on a different subset of the CrossDocked dataset. DeepICL was trained on a much smaller dataset with about 11,000 structures from the PDBbind database57.
For the fragment linking task, we compare against DiffLinker36. DiffLinker is an equivariant diffusion model similar to ours, but takes the pocket and fixed fragments as inputs and then designs only a linker.
Evaluation metrics
We assess the quality of generated molecules with widely used metrics14,15. (1) Vina score is an empirical estimate of the binding free energy of protein–small-molecule complexes. While it is not an ideal predictor of binding affinity, we chose the Vina score as a fast proxy that shows a certain level of correlation with experimentally determined values (see Extended Data Fig. 3 in ref. 36). (2) Convolutional neural network affinity is another predicted affinity score reported by the GNINA docking software58. (3) QED is a quantitative estimation of drug-likeness combining several desirable molecular properties59. (4) SA estimates synthetic accessibility, that is, the difficulty of synthesis60. (5) logP is the predicted octanol–water partition coefficient, a measure of hydrophobicity61. (6) Lipinski measures how many rules in the Lipinski rule of five62 are satisfied (in addition to the original four rules we require ten or fewer rotatable bonds). (7) Diversity is computed as the average pairwise dissimilarity (1 − Tanimoto similarity) between molecular fingerprints of all generated molecules for each pocket. (8) Inference time is the average sampling time per target. Chemical properties are calculated with RDKit63. Docking scores are obtained after local minimization with an empirical force field using the GNINA implementation58 or, if specified, after redocking with QuickVina264.
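Several of the chemistry metrics reduce to a few lines of RDKit; the sketch below computes QED, logP and the pairwise-dissimilarity diversity (the SA score relies on RDKit's contributed sascorer module and is omitted here).

```python
from itertools import combinations
from rdkit import DataStructs
from rdkit.Chem import AllChem, Crippen, QED

def diversity(mols, radius=2, n_bits=2048):
    """Average pairwise (1 - Tanimoto similarity) over Morgan fingerprints."""
    fps = [AllChem.GetMorganFingerprintAsBitVect(m, radius, nBits=n_bits) for m in mols]
    dists = [1.0 - DataStructs.TanimotoSimilarity(a, b) for a, b in combinations(fps, 2)]
    return sum(dists) / len(dists)

def basic_properties(mol):
    """Drug-likeness and hydrophobicity estimates for a single RDKit molecule."""
    return {"QED": QED.qed(mol), "logP": Crippen.MolLogP(mol)}
```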
Statistics and reproducibility
No statistical method was used to predetermine sample size. While we aimed to sample 100 ligands per pocket for the results in the 'DiffSBDD captures the underlying data distribution' section, the exact number of available molecules varies slightly for technical reasons and owing to the characteristics of the different methods (Supplementary Table 9). Some metrics could be calculated only for molecules that pass RDKit's sanitization step. Molecules not passing this filter were therefore excluded from the affected analyses. Furthermore, we exclude DeepICL from the comparison on Binding MOAD as we did not manage to sample any molecules for more than half of the test set proteins. Nevertheless, we report distribution learning results of all methods on this substantially reduced set of targets in Supplementary Section 5.2.
Software
All code was written in Python (v3.10.4). For dataset preparation, we used numpy (v1.22.4), BioPython (v1.81) and RDKit (v2023.9.4). The neural network models were implemented and trained with PyTorch (v1.12.1), PyTorch Lightning (v1.7.4), PyTorch Geometric (v2.2) and Weights & Biases (v0.13.1). OpenBabel (v3.1.1) and RDKit (v2023.9.4) were used to post-process molecules. Docking/scoring was performed using the Gnina (v1.1) and QuickVina (v2.1) software. The data were analyzed and visualized using Pandas (v1.4.2), SciPy (v1.7.3), Matplotlib (v3.4.3) and Seaborn (v0.12.0).
The code required to run the baseline models is available in public repositories. Pocket2Mol can be found at ResGen at PocketFlow (latest) at and DeepICL (v1.1.0) at Finally, DiffLinker (v1.0) is available at The Pocket2Mol and ResGen repositories do not provide version releases.
Reporting summary
Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.