Statistics
Benchmarking IUPred2
The performance of the newest version of IUPred has been tested using customized datasets.
Intrinsically disordered protein regions (IDRs) were taken from the
DisProt database as the positive
testing dataset.
Only IDRs with a length of at least 9 residues were included.
The negative dataset comprises protein regions that are known to represent
independently folding,
stable monomeric units encompassing a single domain according to
CATH definitions. These structures were collected from the PDB and were filtered to
include only one protein region from each UniRef90 sequence cluster.
Structures with flexible residues, as evidenced by highly dissimilar NMR models or missing
X-ray coordinates, were removed.
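The filtering of the negative dataset described above can be sketched as follows. This is only an illustrative sketch, not the pipeline used by the authors: the `Region` record, its field names, and the choice to keep the first region encountered in each cluster are assumptions made for the example. The order in which the two filters were applied is also not stated above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    # All field names are hypothetical and exist only for this sketch.
    region_id: str              # e.g. a PDB chain plus residue range
    uniref90_cluster: str       # label of the UniRef90 sequence cluster
    has_flexible_residues: bool # divergent NMR models or missing X-ray coordinates


def select_negative_regions(regions):
    """Drop flexible structures, then keep one region per UniRef90 cluster.

    Assumption: the first non-flexible region seen in a cluster is kept.
    """
    chosen = {}
    for region in regions:
        if region.has_flexible_residues:
            continue
        # setdefault stores the region only if the cluster is not yet present.
        chosen.setdefault(region.uniref90_cluster, region)
    return list(chosen.values())


example = [
    Region("a", "cluster1", False),
    Region("b", "cluster1", False),  # same cluster as "a", dropped
    Region("c", "cluster2", True),   # flexible, dropped
    Region("d", "cluster3", False),
]
selected_ids = [r.region_id for r in select_negative_regions(example)]
print(selected_ids)
```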
The positive and negative datasets used for benchmarking can be downloaded from
here and here, respectively.
Database statistics
| | Number of residues | Number of proteins |
| Positive (DisProt) | 84,479 | 1,195 |
| Negative (monomeric structures) | 178,957 | 1,095 |
Using the above benchmark sets, IUPred2 can be characterized with the following binary
classifier measures:
| Sensitivity (True Positive Rate) | 61.85% |
| Specificity (True Negative Rate) | 94.03% |
| Precision* | 91.20% |
*As the value of precision depends heavily on the relative sizes of the
positive and negative datasets,
the dataset sizes were scaled to be equal to obtain an unbiased measure.
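To illustrate the scaling mentioned in the footnote: when the positive and negative sets are weighted equally, precision depends only on sensitivity and specificity. The sketch below assumes this is the scaling meant (the function name is introduced here for illustration). With the rounded rates from the table it gives about 91.20%, consistent with the reported value.

```python
def balanced_precision(sensitivity, specificity):
    """Precision for equally sized positive and negative sets.

    With equal set sizes, true positives are proportional to the sensitivity
    and false positives to the false positive rate (1 - specificity).
    """
    false_positive_rate = 1.0 - specificity
    return sensitivity / (sensitivity + false_positive_rate)


# Rounded IUPred2 rates taken from the table above.
precision = balanced_precision(0.6185, 0.9403)
print(f"{precision:.2%}")
```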
Benchmarking ANCHOR2
The performance of ANCHOR2 was tested using the recently published DIBS database
as the positive testing dataset. DIBS represents the largest currently available set
of experimentally verified IDRs capable of forming
ordered structures upon binding to protein domains. Only entries not used in the training of
ANCHOR2 were used in testing.
For negative testing, the same monomeric single-domain protein dataset was used as
for testing IUPred2, but allowing for structures with
up to 20% of flexible residues. Only entries not used in the training of ANCHOR2 were used
in testing.
Furthermore, ANCHOR2 was also evaluated on a set of flexible linkers that are disordered but
are known to lack a primary binding function.
To get a fuller picture of the performance of ANCHOR2 on sequence sets with different
compositions, two auxiliary datasets were also considered.
The first is composed of disordered regions from DisProt; the second is a collection of random
(decoy) segments from the human proteome, excluding
transmembrane regions, structured Pfam domains and extracellular proteins. Both datasets are
expected to contain disordered binding regions,
albeit to a significantly lower extent than DIBS.
The positive, negative, and auxiliary datasets used for benchmarking can be downloaded from
here.
Database statistics
| | Number of residues | Number of proteins |
| Positive (DIBS) | 2,135 | 140 |
| Negative (monomeric structures) | 583,033 | 3,320 |
| Negative (flexible linkers) | 5,425 | 389 |
| Auxiliary (DisProt) | 79,049 | 1,042 |
| Auxiliary (decoy) | 76,860 | 5,040 |
Using the above benchmark sets, ANCHOR2 can be characterized with the following binary
classifier measures.
As ANCHOR often specifically identifies only strongly binding sub-regions inside larger
binding regions,
segment-based sensitivity was also calculated. In this case, a binding region was considered
found if it
incorporates at least one ANCHOR-identified region, regardless of any difference in
length:
| | Residue-based metrics | Segment-based metrics |
| Sensitivity (True Positive Rate) | 62.67% | 69.29% |
| Specificity (True Negative Rate on ordered monomers) | 98.26% | - |
| Specificity (True Negative Rate on flexible linkers) | 94.58% | - |
| Fraction of predicted binding residues in auxiliary DisProt dataset | 50.00% | - |
| Fraction of predicted binding residues in auxiliary decoy dataset | 10.93% | - |
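The segment-based sensitivity described above can be sketched as follows. This is a minimal illustration, not the evaluation code behind the reported numbers. The `(protein_id, start, end)` tuple layout with inclusive coordinates is an assumption, and "incorporates" is read here as "overlaps by at least one residue". A stricter reading would require the predicted region to lie fully inside the annotated one.

```python
def segment_sensitivity(true_segments, predicted_segments):
    """Fraction of annotated binding segments hit by at least one prediction.

    Segments are (protein_id, start, end) tuples with inclusive coordinates
    (a hypothetical layout chosen for this sketch).
    """
    if not true_segments:
        raise ValueError("at least one annotated segment is required")
    found = 0
    for protein, t_start, t_end in true_segments:
        hit = any(
            p_protein == protein and p_start <= t_end and t_start <= p_end
            for p_protein, p_start, p_end in predicted_segments
        )
        if hit:
            found += 1
    return found / len(true_segments)


# Toy data: only the first annotated segment overlaps a prediction.
annotated = [("P1", 10, 30), ("P1", 50, 60), ("P2", 5, 15)]
predicted = [("P1", 25, 28), ("P2", 40, 45)]
score = segment_sensitivity(annotated, predicted)
print(f"{score:.2%}")
```

Because a short prediction inside a long binding region counts as a full hit here, the segment-based value can exceed the residue-based one, as it does in the table above.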
Benchmarking other context-dependent features
As there are currently no comprehensive datasets collecting a large number of experimentally
verified
examples for the other types of context-dependent IDRs targeted by IUPred2A, rigorous
testing of these
features is not yet possible. Accordingly, these features are marked as ‘Experimental’.
However, a
number of selected examples are available in the How to use section.