
Prediction of Intrinsically Unstructured Proteins

How to use


Basics

The main aim of IUPred2 is to identify Intrinsically Disordered Protein Regions (IDPRs, i.e. regions that lack a stable monomeric structure under native conditions) using a biophysics-based model. The user can input any protein sequence, and IUPred2 returns a score between 0 and 1 for each residue, corresponding to the probability of the given residue being part of a disordered region.

The disordered nature of a protein segment can be context dependent: certain protein regions can switch between an ordered and a disordered state depending on various environmental factors. Currently, the IUPred2A server can detect such context-dependent disorder when the environmental factor is either a change in the redox state or the presence of an ordered binding partner.

The following sections outline the use of IUPred2A in various scenarios. For a list of select example runs highlighting various IUPred2A features, see the Examples section.

Protein sequence input

There are three basic ways to input protein sequences into IUPred2A:

I - If the protein is deposited in the UniProt database (either in SwissProt or TrEMBL), you can specify the accession code or the ID of the protein in the "Enter SWISS-PROT/TrEMBL identifier or accession number" field. The IUPred2A server is always linked to the newest version of UniProt. The header of the UniProt entry will be displayed as the title on the results page.
II - Type or cut and paste your sequence in the "paste the amino acid sequence" field. The amino acid sequence must be either in the standard FASTA format or a plain sequence. Spaces and other non-standard characters within the pasted sequence are permitted; they will be removed, and the remaining sequence will be treated as a single continuous chain.
III - For analysis of a large number of sequences/full proteomes, users can upload their sequences in a single file adhering to the standard multiple FASTA file format criteria. In this case the output will be provided in text format via email.
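The clean-up described for pasted sequences in option II can be approximated locally, for example to check what chain will actually be analysed before submitting. The following is a minimal sketch and not the server's own code: the function name clean_sequence is hypothetical, and it assumes that "standard" characters means the 20 one-letter amino acid codes and that FASTA header lines (starting with ">") are discarded.

```python
def clean_sequence(raw: str) -> str:
    """Return a single continuous upper-case chain from pasted input.

    Assumption (not confirmed by the server documentation): only the 20
    standard one-letter amino acid codes are kept; everything else,
    including spaces, digits and line breaks, is dropped.
    """
    standard = set("ACDEFGHIKLMNPQRSTVWY")

    # Drop FASTA header lines, i.e. lines that start with '>'.
    body_lines = [
        line for line in raw.splitlines()
        if not line.lstrip().startswith(">")
    ]

    # Join the remaining lines and keep only standard residue letters.
    body = "".join(body_lines).upper()
    return "".join(ch for ch in body if ch in standard)
```

For instance, a pasted FASTA record whose sequence lines contain spaces, digits or lower-case letters would be reduced to one uninterrupted upper-case chain.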

Prediction type

There are three different disorder prediction types offered, each using different parameters optimized for slightly different applications. These are: long disorder, short disorder, and structured domains.

Long disorder (default option):
The main purpose of IUPred2A is to predict global structural disorder that encompasses at least 30 consecutive residues of the protein. The long option is optimized for this task.

Short disorder:
In this setting, IUPred2A uses a parameter set best suited for predicting short disordered regions, such as missing residues in the X-ray structure of an otherwise globular protein. For this application a smaller sequential neighbourhood of residues is considered for the calculation of the IUPred score. As chain termini of globular proteins are often disordered in X-ray structures, this is taken into account by an end-adjustment parameter which favors disorder prediction at the ends.

Structured domains:
The reliable identification of ordered protein regions is a crucial step in target selection for structural studies and structural genomics projects. Finding putative structured domains suitable for structure determination is another potential application of IUPred2A. In this case the algorithm aims to find continuous regions that are confidently predicted to be ordered. Neighbouring regions close to each other are merged, while regions shorter than the minimal domain size of 30 residues are ignored. When this prediction type is selected, the region(s) predicted to correspond to structured/globular domains are returned.
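The merging and filtering steps described above can be illustrated on a list of per-residue scores. This is only a sketch of the general idea and not the actual IUPred2A implementation: the function name ordered_domains, the score cutoff of 0.5 and the maximal merge gap of 5 residues are assumptions made for illustration; only the minimal domain size of 30 residues comes from the text.

```python
from typing import List, Sequence, Tuple


def ordered_domains(
    scores: Sequence[float],
    cutoff: float = 0.5,   # assumed threshold: below this counts as ordered
    max_gap: int = 5,      # assumed: merge regions separated by <= 5 residues
    min_size: int = 30,    # minimal domain size stated in the text
) -> List[Tuple[int, int]]:
    """Return putative structured regions as 1-based inclusive (start, end)."""
    # Step 1: collect continuous runs of residues scoring below the cutoff.
    regions: List[Tuple[int, int]] = []
    start = None
    for i, score in enumerate(scores):
        if score < cutoff:
            if start is None:
                start = i
        elif start is not None:
            regions.append((start, i - 1))
            start = None
    if start is not None:
        regions.append((start, len(scores) - 1))

    # Step 2: merge neighbouring regions that lie close to each other.
    merged: List[Tuple[int, int]] = []
    for begin, end in regions:
        if merged and begin - merged[-1][1] - 1 <= max_gap:
            merged[-1] = (merged[-1][0], end)
        else:
            merged.append((begin, end))

    # Step 3: ignore regions shorter than the minimal domain size and
    # convert the 0-based indices to 1-based residue positions.
    return [(b + 1, e + 1) for b, e in merged if e - b + 1 >= min_size]
```

With these assumed parameters, two ordered stretches of 20 residues separated by a 3-residue disordered gap are merged into a single 43-residue domain, whereas an isolated 10-residue ordered stretch is discarded.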

Context-dependent predictions

IDPRs often harbor binding regions that are able to specifically interact with a globular domain. During this interaction, in the majority of known cases, the binding disordered region adopts an ordered structure in its bound form. This is probably the most commonly occurring form of context-dependent protein disorder, where the transition between the unstructured and the structured states is initiated by the presence of an appropriate protein partner. Such disordered binding regions are identified using the ANCHOR2 prediction algorithm. Like IUPred2, ANCHOR2 assigns to each residue a score between 0 and 1, representing the probability of the given residue being part of a disordered binding region. When ANCHOR2 is selected as a prediction option, the ANCHOR2 score is provided along with the IUPred2 score.
Another known context-dependent behaviour of IDPRs is the change between a folded and an unfolded state as a result of a change in the redox state. Such protein regions can be ordered or disordered depending on whether they are located inside or outside the cell. When this option is selected, IUPred2A marks such redox-sensitive protein regions and also shows their maximal and minimal predicted disorder tendencies.

Output

Basic features:
The primary output of IUPred2A is a graph showing the disorder tendency of each residue in the given protein, where higher values correspond to a higher probability of disorder. The graph is scalable and can be directly downloaded for presentation/publication purposes. The list of position-specific disorder scores is also downloadable in simple text or JSON format.


Extended features:
If the prediction was run by specifying a UniProt ID/accession, the output of IUPred2A also shows additional protein annotations, including Pfam regions; post-translational modifications (PTMs), including phosphorylations (upper line), methylations and acetylations (lower line) taken from PhosphoSitePlus; corresponding structures from the PDB; and regions that were experimentally verified to be disordered, taken from DisProt, DIBS, and MFIB.

If context-dependent predictions were selected, the output graph and the downloadable results incorporate additional data as well.
In the case of disordered binding region prediction via ANCHOR2, the graph shows the probability of each residue being part of a binding region in blue. The IUPred2 and ANCHOR2 scores can each be shown or hidden by clicking on the legend.
If redox state-dependent predictions were enabled, the ranges of possible disorder tendencies for redox-sensitive regions of the query protein are marked in purple.

REST API

IUPred2A is also accessible via a REST API to enable automated/large-scale use. Requests should follow the syntax:

http://iupred2a.elte.hu/iupred2a/::accession::

or

http://iupred2a.elte.hu/iupred2a/::iupred_type::/::accession::

If ::iupred_type:: is not given, the default "long" will be used. If the requested URL ends with ".json", the output will be in JSON format; in any other case it will be simple text.

Examples:
http://iupred2a.elte.hu/iupred2a/q32p44
http://iupred2a.elte.hu/iupred2a/q32p44.json
http://iupred2a.elte.hu/iupred2a/short/q32p44
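
For automated use, the URL syntax above can be wrapped in a small helper. The following is a minimal sketch using only the Python standard library; the function names build_url and fetch are hypothetical, and it assumes that the ".json" suffix may be combined with an explicit ::iupred_type:: and that the response body is UTF-8 text. The layout of the returned text or JSON is not described here, so the sketch returns the raw response without parsing it.

```python
from typing import Optional
from urllib.request import urlopen

# Base address taken from the request syntax shown above.
BASE_URL = "http://iupred2a.elte.hu/iupred2a"


def build_url(accession: str,
              iupred_type: Optional[str] = None,
              as_json: bool = False) -> str:
    """Assemble a request URL following the documented syntax.

    iupred_type is left out of the URL when None, in which case the
    server falls back to its default ("long").
    """
    parts = [BASE_URL]
    if iupred_type is not None:
        parts.append(iupred_type)
    # A trailing ".json" requests JSON output instead of simple text.
    parts.append(accession + (".json" if as_json else ""))
    return "/".join(parts)


def fetch(accession: str,
          iupred_type: Optional[str] = None,
          as_json: bool = False,
          timeout: float = 30.0) -> str:
    """Download a prediction and return the raw response body as text.

    Assumption: the body can be decoded as UTF-8.
    """
    url = build_url(accession, iupred_type, as_json)
    with urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8")
```

Calling build_url("q32p44", "short") reproduces the third example URL above, and fetch("q32p44", as_json=True) would request the JSON output for the same entry.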



References:

IUPred2A: context-dependent prediction of protein disorder as a function of redox state and protein binding
Bálint Mészáros, Gábor Erdős, Zsuzsanna Dosztányi
Nucleic Acids Research (2018), in press

Prediction of protein disorder based on IUPred
Zsuzsanna Dosztányi
Tools for Protein Science (2017) 27, 331-340.

Dosztányi Z, Csizmók V, Tompa P, Simon I.
The pairwise energy content estimated from amino acid composition discriminates between folded and intrinsically unstructured proteins.
J Mol Biol. 2005;347:827-39.

Dosztányi Z, Csizmók V, Tompa P, Simon I.
IUPred: web server for the prediction of intrinsically unstructured regions of proteins based on estimated energy content.
Bioinformatics. 2005;21:3433-4.

Mészáros B, Simon I, Dosztányi Z.
Prediction of protein binding regions in disordered proteins.
PLoS Comput Biol. 2009;5:e1000376.

Dosztányi Z, Mészáros B, Simon I.
ANCHOR: web server for predicting protein binding regions in disordered proteins.
Bioinformatics. 2009;25:2745-6.
 
Zsuzsanna Dosztanyi | Balint Meszaros | Gabor Erdos | MTA-ELTE Momentum Bioinformatics Research Group