SiteAF3: Accurate Site-specific Folding via Conditional Diffusion Based on Alphafold3

Accurate structure prediction of biomolecular complexes is crucial for understanding biological processes and enabling drug discovery. While AlphaFold3 represents a significant advance, improving its accuracy at specific binding sites remains a challenge. We present SiteAF3, a novel method for accurate site-specific folding via conditional diffusion, built upon the AlphaFold3 framework. SiteAF3 refines the diffusion process by fixing the receptor structure and optionally incorporating binding pocket and hotspot residue information. Comprehensive evaluations on protein-small molecule, protein-peptide, and protein-nucleic acid datasets demonstrate that SiteAF3 consistently outperforms AlphaFold3, achieving higher accuracy in complex structure prediction, especially for orphan proteins and allosteric ligands, at reduced computational cost. SiteAF3 is offered as a user-friendly plug-in compatible with AlphaFold3, providing a valuable tool for more accurate modeling of biomolecular interactions.

1

The overall architecture of SiteAF3 (Fig. 1a) mirrors that of AlphaFold3 (AF3) and inherits AF3's inference workflow, which comprises four primary components: input preparation, representation learning, structure prediction, and confidence assessment. SiteAF3 diverges from AF3 in two key respects: first, the structure prediction module employs a novel conditional diffusion model; second, representation learning incorporates additional binding pocket and hotspot residue information via the MSA module.

In the diffusion process, the atomic coordinates of the ligand are initialized with Gaussian noise centered on the pocket center with a specified radius, while the relative atomic coordinates of the receptor are held fixed. A second difference lies in the second sequence local attention block, where SiteAF3 introduces a mask so that only the ligand coordinates are updated.
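The two ideas in this paragraph, pocket-centered noise initialization and a ligand-only update mask, can be illustrated with a minimal NumPy sketch. This is not the SiteAF3 implementation: the function names `init_complex_coords` and `masked_update` are hypothetical, and the sketch assumes that the "specified radius" acts as the standard deviation of an isotropic Gaussian, which the text does not state.

```python
import numpy as np


def init_complex_coords(receptor_xyz, n_ligand_atoms, pocket_center, radius, rng):
    """Build starting coordinates for a receptor-ligand complex.

    Receptor coordinates are kept exactly as given. Ligand atoms are drawn
    from an isotropic Gaussian centred on the pocket centre.
    ASSUMPTION: `radius` is used as the Gaussian standard deviation.
    """
    receptor_xyz = np.asarray(receptor_xyz, dtype=float)
    pocket_center = np.asarray(pocket_center, dtype=float)
    ligand_xyz = pocket_center + radius * rng.standard_normal((n_ligand_atoms, 3))
    coords = np.concatenate([receptor_xyz, ligand_xyz], axis=0)
    # Boolean mask: True for ligand atoms (movable), False for receptor atoms (fixed).
    ligand_mask = np.concatenate([
        np.zeros(len(receptor_xyz), dtype=bool),
        np.ones(n_ligand_atoms, dtype=bool),
    ])
    return coords, ligand_mask


def masked_update(coords, proposed, ligand_mask):
    """Accept a proposed coordinate update for ligand atoms only."""
    return np.where(ligand_mask[:, None], proposed, coords)


# Toy usage with made-up coordinates (not data from the paper).
rng = np.random.default_rng(0)
receptor = rng.uniform(-10.0, 10.0, size=(50, 3))
center = receptor[:8].mean(axis=0)  # hypothetical pocket: first 8 receptor atoms
coords, mask = init_complex_coords(receptor, 12, center, radius=5.0, rng=rng)
proposed = coords + rng.standard_normal(coords.shape)  # stand-in for one denoising step
coords_next = masked_update(coords, proposed, mask)
```

In this sketch the receptor rows of `coords_next` are identical to the input receptor, while the ligand rows take the proposed values. A real denoiser would replace the random `proposed` step.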

In our initial explorations with the base SiteAF3 model, we observed that although the ligand atomic coordinates were initialized with noise around the pocket, the ligand was occasionally predicted to bind at a site distant from the intended pocket. To address this issue, we first used the genetic search tools in AF3 to obtain templates for the MSA module and evaluated how much binding pocket information could be learned from them. We then experimented with directly embedding information about designated hotspot and pocket residues via the MSA module. These strategies substantially improved prediction accuracy, effectively guiding the ligand towards the desired binding site. However, we observed that the MSA may introduce a bias towards the wrong binding site. We therefore used pocket information to mask the full-length MSA templates after alignment, minimizing this misleading effect as far as possible. With the pocket-masked AF3 MSA, SiteAF3 showed a significant improvement on the protein-ligand dataset with well-defined pockets.

2

MSA may introduce incorrect binding-site bias for unseen proteins and ligands.

An intriguing result is presented in Fig. 2: when utilizing AF3_MSA, SiteAF3 consistently achieves higher cumulative prediction accuracy than the standard AF3 model across the 0-10 Å range, in line with our expectations. Without AF3_MSA, however, SiteAF3 initially underperforms AF3. Interestingly, beyond a threshold of 5 Å, the proportion of correctly predicted targets for the po+hot and baseline configurations gradually exceeds that of AF3. This observation suggests that the MSA templates leveraged by AF3 might occasionally introduce misleading information about the binding site for unseen protein-ligand pairs. The trend was not observed on the PoseBustersV2 dataset, all of whose data were included in AF3's training set. This discrepancy suggests potential overfitting to the genetic search-derived data during the representation learning phase for seen datasets. To our delight, the pocket-masked AF3 MSA mode consistently performed best across the 0-10 Å range. In the 0-5 Å range its trend matched that of the model using full-length MSA templates, but with a greater slope. In the 5-10 Å range it remained above the models that did not use genetic search, and the slope at the end of the curve tended to match that of the others. This result suggests that pocket guidance can effectively eliminate the hidden bias in full-length MSA templates.
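The cumulative curves discussed above can be read as, for each distance threshold, the fraction of targets whose prediction error falls at or below that threshold. The following is a minimal sketch of that computation and of locating where one curve overtakes another. The function name `cumulative_success` is hypothetical, the error values are made up for illustration and are not results from the paper, and the error metric is assumed to be a per-target distance in Å (for example a ligand RMSD; the text does not name it).

```python
import numpy as np


def cumulative_success(errors_angstrom, thresholds):
    """Fraction of targets with error <= each threshold (one value per threshold)."""
    errors = np.asarray(errors_angstrom, dtype=float)
    thresholds = np.asarray(thresholds, dtype=float)
    # Broadcast to a (n_thresholds, n_targets) boolean table, then average per row.
    return (errors[None, :] <= thresholds[:, None]).mean(axis=1)


# Thresholds covering the 0-10 Å range in 0.5 Å steps.
thresholds = np.linspace(0.0, 10.0, 21)

# Made-up per-target errors for two hypothetical methods (NOT data from the paper).
method_a = [0.8, 1.5, 2.2, 4.0, 12.0]   # accurate at tight thresholds, one large miss
method_b = [1.2, 1.9, 4.8, 5.5, 6.0]    # slower start, but no large misses

curve_a = cumulative_success(method_a, thresholds)
curve_b = cumulative_success(method_b, thresholds)

# Thresholds at which method B's cumulative fraction exceeds method A's.
crossover = thresholds[curve_b > curve_a]
```

With these toy inputs, method A leads at tight thresholds and method B overtakes it only at larger ones, which is the qualitative pattern of a crossover in cumulative accuracy.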

3

Prediction accuracy across biomolecular complexes.

SiteAF3 substantially improves protein-small molecule interaction modelling and continues to outperform AlphaFold3 on the other datasets.

Tang, H.; Wang, J. Accurate Site-specific Folding via Conditional Diffusion Based on Alphafold3. bioRxiv, 2025. doi:10.1101/2025.07.06.663385.