How to tell if our ligand-protein docking is good from AutoDock Vina's result

Bioinformatics Asked by scamander on June 20, 2021

I have performed a ligand-protein docking using AutoDock Vina.
The result of the docking looks like this:

WARNING: The search space volume > 27000 Angstrom^3 (See FAQ)
Detected 8 CPUs
Setting up the scoring function ... done.
Analyzing the binding site ... done.
Using random seed: -1553787135
Performing search ... done.
Refining results ... done.

mode |   affinity | dist from best mode
     | (kcal/mol) | rmsd l.b.| rmsd u.b.
-----+------------+----------+----------
   1         -5.9      0.000      0.000    Pose 1
   2         -5.7     22.945     25.492    Pose 2
   3         -5.5      1.426      2.046    Pose 3
   4         -5.5     23.669     25.616
   5         -5.4     25.783     29.152    .....
   6         -5.3     21.146     23.357
   7         -5.2     20.323     22.545
   8         -5.2     23.864     26.064
   9         -5.1     23.422     26.585    Pose 9

As far as I understand from these statistics, Mode 1 (Pose 1) is the best.
However, when I actually visualize the poses in PyMOL, Pose 1 has no hydrogen bonds at all,
whereas Pose 2 does.

My question is: how can we judge which of those two poses is the best one to use?

Note that in the figure below, Pose 2 has a dashed line (hydrogen bond).

It is best to contextualise the numbers. About -1 kcal/mol is the potential energy gained from a hydrogen bond (technically described in the attractive 1/r^6 part of the Lennard-Jones term). That is of the same order as the average collision energy of a water molecule at 37 °C, which is RT ($$RT = N_A \cdot k_B \cdot T$$, roughly 0.6 kcal/mol; wiki) under a Maxwell-Boltzmann distribution. A salt bridge is worth about -2 kcal/mol (Coulombic force term). So your scores are not very low, which is why you end up counting two hydrogen bonds. You can also see a lovely sulfur-pi interaction, which is worth -1 to -2 kcal/mol, so those interactions alone are probably making a -4 kcal/mol contribution, and I am guessing that some other terms, such as repulsion forces, may be horrendous.

A nice metric is a conversion to ligand efficiency (binding energy divided by the number of heavy atoms), which weeds out affinities driven by the size of the molecule; in this case most of the molecule is doing nothing. Also, most programs have an error of 1 kcal/mol or more.
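As a rough illustration of the two numbers mentioned above, here is a minimal Python sketch that computes RT at 37 °C and converts a Vina score to ligand efficiency. The function names are made up for this example, and the heavy-atom count of 25 is purely hypothetical because the question does not say how large the ligand is. Substitute the real count for your molecule.

```python
# Gas constant in kcal/(mol*K); this is N_A * k_B expressed in kcal.
R_KCAL_PER_MOL_K = 1.987204e-3


def thermal_energy_kcal(temp_celsius: float) -> float:
    """Return RT in kcal/mol at the given temperature in degrees Celsius."""
    return R_KCAL_PER_MOL_K * (temp_celsius + 273.15)


def ligand_efficiency(affinity_kcal: float, heavy_atoms: int) -> float:
    """Return -affinity / heavy-atom count (kcal/mol per heavy atom)."""
    if heavy_atoms <= 0:
        raise ValueError("heavy_atoms must be a positive integer")
    return -affinity_kcal / heavy_atoms


# RT at body temperature, for comparison with a ~1 kcal/mol hydrogen bond.
print(f"RT at 37 C: {thermal_energy_kcal(37.0):.3f} kcal/mol")

# ASSUMPTION: 25 heavy atoms is a placeholder, not a value from the question.
HYPOTHETICAL_HEAVY_ATOMS = 25
le = ligand_efficiency(-5.9, HYPOTHETICAL_HEAVY_ATOMS)
print(f"Ligand efficiency of mode 1: {le:.3f} kcal/mol per heavy atom")
```

An often-quoted rule of thumb is that a ligand efficiency of around 0.3 kcal/mol per heavy atom or better is desirable, but treat that as a guideline rather than a hard cutoff.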

So, with the data at hand, one cannot say which pose has the best ∆∆G, because the differences are within the noise, but one can say that the ∆∆G is not low enough. Sorry.
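To make the noise argument concrete, the following sketch lists which modes score within a given tolerance of the best mode. The affinities are copied from the Vina table in the question. The helper name `modes_within_noise` and the default tolerance of 1 kcal/mol (taken from the accuracy estimate above) are assumptions for illustration, not part of Vina.

```python
# (mode, affinity in kcal/mol) copied from the Vina output in the question.
VINA_MODES = [
    (1, -5.9), (2, -5.7), (3, -5.5), (4, -5.5), (5, -5.4),
    (6, -5.3), (7, -5.2), (8, -5.2), (9, -5.1),
]


def modes_within_noise(modes, tolerance=1.0):
    """Return the modes whose score is within `tolerance` kcal/mol of the best.

    Lower (more negative) affinity is better, so the best score is the minimum.
    """
    best = min(score for _, score in modes)
    return [mode for mode, score in modes if score - best <= tolerance]


# With a 1 kcal/mol tolerance, every reported mode is indistinguishable
# from mode 1 on score alone.
print(modes_within_noise(VINA_MODES))  # prints [1, 2, 3, 4, 5, 6, 7, 8, 9]
```

All nine modes fall within 0.8 kcal/mol of the top score, so the ranking between Pose 1 and Pose 2 cannot be settled by the affinity column on its own.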

Answered by Matteo Ferla on June 20, 2021
