The AKID server allows web access to the AKID method for the prediction of phosphorylation sites for kinases in target proteins.
Parca L, Ariano B, Cabibbo A, Paoletti M, Tamburrini A, Palmeri A, Ausiello G, Helmer-Citterich M. Kinome-wide identification of phosphorylation networks in Eukaryotic proteomes. Bioinformatics. 2018 Jul 17. doi: 10.1093/bioinformatics/bty545.
How to use
When dealing with user-submitted sequences, the web server currently handles up to 500 kinase-substrate couples per job. This could be 1 kinase and 500 substrate proteins, 500 kinases and 1 substrate protein, or any combination in between, up to 500 couples. The server can also run predictions for up to 10 substrate proteins against a full organism kinome. Larger jobs can be performed with the AKID Linux software.
Selecting the kinase(s)
The user should input one or more kinases in FASTA format or provide a set of UniProt ids corresponding to the desired kinases. Alternatively, a predefined set of kinases from an organism can be chosen by using the "select kinome" option.
The user may paste the sequences or a set of UniProt ids in a text-area ("Paste sequences in FASTA format or a list of UNIPROT accessions" option), or upload a file with the sequences in FASTA format ("Upload file with sequences in FASTA format" option).
UniProt ids in the text-area can be separated by commas or semicolons, or listed one per line.
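To check a pasted list before submitting it, the separator rule above can be reproduced locally. The following is a minimal Python sketch, not the server's own parser: the function name split_accessions is hypothetical, and the accessions other than O60674 (the JAK2 entry used later on this page) are arbitrary examples.

```python
import re

def split_accessions(text: str) -> list[str]:
    """Split pasted text on commas, semicolons, or line breaks.

    Assumption: surrounding spaces are not significant, so they are stripped.
    """
    tokens = re.split(r"[,;\r\n]+", text)
    return [token.strip() for token in tokens if token.strip()]

# Mixed separators in one pasted block
ids = split_accessions("O60674, P00533;P12931\nP06493")
print(ids)
```

Counting the resulting ids is a quick way to estimate how many kinase-substrate couples a job will contain.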
In the case of user-submitted kinases, the system will attempt to identify, in each submitted kinase, all the "Specificity Determining Regions" (SDR) present in the enzyme. In most cases a kinase will have a single SDR, while in some instances more than one SDR could be identified in the sequence. Each SDR will be handled by the system as an individual enzyme with its own specificity.
In the case of kinomes, SDRs are pre-calculated by the system, so the SDR computing step is skipped and the analysis is faster than for the same number of user-submitted kinases.
Selecting substrates (target proteins)
Similarly to the kinases, the substrates can be copy-pasted in a text-area ("Paste sequences in FASTA format or a list of UNIPROT accessions" option) as FASTA sequences or UniProt ids or uploaded ("Upload file with sequences in FASTA format" option) as a text file containing FASTA sequences.
The user may decide to analyse all the potential phosphorylation targets (all serine, threonine and tyrosine residues, "Analyze all residues" option) in a substrate or to select specific residues for the analysis ("Select residues" option). If the "Select residues" option is chosen, a dedicated interface for residue selection will be shown before proceeding to the analysis.
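To get a feel for how many residues the "Analyze all residues" option covers in a given substrate, the candidate sites can be listed locally. This is a minimal Python sketch under the assumption that positions are counted from 1; the function name candidate_sites and the short test sequence are made up for illustration and are not part of AKID.

```python
def candidate_sites(sequence: str) -> list[tuple[str, int]]:
    """Return (residue, 1-based position) for every S, T, or Y in the sequence."""
    return [
        (aa, position)
        for position, aa in enumerate(sequence.upper(), start=1)
        if aa in "STY"
    ]

# Toy sequence, not a real protein
print(candidate_sites("MSTAYK"))  # [('S', 2), ('T', 3), ('Y', 5)]
```

Long substrates can contain hundreds of such residues, which is one reason to restrict the analysis with the "Select residues" option.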
The user may provide an e-mail address to which a notification will be sent on job completion, with a URL to the analysis results. Some jobs may take a significant amount of time (more than 10 minutes) to complete. For example, the analysis of a set of 500 kinase-substrate couples may well take 10-15 minutes (this may vary depending on the length of the individual proteins). In such a case the user may want to provide an e-mail address so as to be notified when the job is over, rather than waiting for results in a browser window. If an e-mail address is provided, the browser window showing the analysis in progress may be safely closed. To make sure that the e-mail does not end up in the spam folder, users may want to add email@example.com, the sender address, to their e-mail contacts list. Should the e-mail not arrive, the spam folder should be checked.
The AKID server output includes the following views for accessing or browsing results:
- A text file with full results. The results text file is "residue focused". For each selected residue, the results text file will have a line with the selected residue, the kinase and the score. Therefore, if the analysis includes several kinases, a kinase with more than one SDR, or a kinome, several lines related to a particular residue will be present.
- Pretty view. The pretty view displays each substrate sequence in FASTA format. When the mouse hovers over a potential phosphorylation target, the 5 best kinase phosphorylation scores are shown as a tooltip. In the pretty view residues are colored according to their phosphorylation scores across kinases, so that the best phosphorylation target residues for the provided kinase set can be spotted at a glance. That is, each residue is colored according to the best phosphorylation score it obtained with any of the kinases in the input set, with black marking the strongest targets and yellow the weakest.
- Residue view. The residue view provides a table in which the target residues are represented as columns and each kinase as a row. The first column holds the kinase names and is fixed. The user can scroll horizontally over the other columns to find the residue of interest across the substrate set, and click on the small arrows in the column header to sort the results by score. For each individual residue, kinases can therefore be sorted in score order. This makes it easy to find the kinase, among the input kinase set, most likely to be responsible for the phosphorylation of a specific residue.
- Kinase view. The kinase view provides a table in which kinases are represented as columns and each target residue as a row. This view makes it possible to find the best target residues in the target set for each submitted kinase. The first column holds the residue names and is fixed. As in the residue view, the user can scroll horizontally to find the kinase of interest and then click on the arrows in the column header to sort by score, so as to identify the best substrate residues across the target set for a given kinase.
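The residue and kinase views are two ways of sorting the same residue-kinase-score records found in the text results file, and the same rankings can be rebuilt offline. The Python sketch below works under loud assumptions: the exact layout of the results file is not described here, so the records are given directly as (residue, kinase, score) tuples; all labels and score values are invented for illustration; and a higher score is assumed to mean a stronger prediction.

```python
from collections import defaultdict

# Hypothetical records: labels and scores are invented, not AKID output.
records = [
    ("sub1_S10", "kinase_A", 0.91),
    ("sub1_S10", "kinase_B", 0.40),
    ("sub1_Y25", "kinase_A", 0.15),
    ("sub1_Y25", "kinase_B", 0.77),
]

def by_residue(rows):
    """Group scores as {residue: {kinase: score}}."""
    table = defaultdict(dict)
    for residue, kinase, score in rows:
        table[residue][kinase] = score
    return dict(table)

def rank_kinases(rows, residue):
    """Residue-view style ranking: kinases for one residue, best score first."""
    scores = by_residue(rows).get(residue, {})
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)

def rank_residues(rows, kinase):
    """Kinase-view style ranking: residues for one kinase, best score first."""
    hits = [(residue, score) for residue, k, score in rows if k == kinase]
    return sorted(hits, key=lambda item: item[1], reverse=True)

print(rank_kinases(records, "sub1_S10"))
print(rank_residues(records, "kinase_B"))
```

If lower scores turn out to indicate stronger predictions, dropping reverse=True flips both rankings.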
Kinase names may be shortened by the system to keep the output formatting tidy. A "domain" suffix is added to the shortened name to identify a specific SDR.
As an example, consider the JAK2 Human tyrosine kinase, with FASTA header:
>sp|O60674|JAK2_HUMAN Tyrosine-protein kinase JAK2 OS=Homo sapiens GN=JAK2 PE=1 SV=2
This protein contains two SDRs.
In the web server output, the first SDR will be referred to as:
while the second SDR will be referred to as:
Input data size and server output
Depending on the number of submitted sequences, the server provides different outputs:
If the data include 1 kinome and 1 substrate, or a maximum of 50 kinase-substrate couples (for example 1 kinase and 50 substrates, 50 kinases and 1 substrate, or any combination in between), the server provides a full output. This includes the text results file, the pretty view, and the residue and kinase tables.
If the data include 1 kinome and at most 10 substrates, or a maximum of 500 kinase-substrate couples, only the text results file is provided as output.
If the number of submitted sequences exceeds these figures, the analysis cannot be handled by the web server. The user will be warned and the download of the AKID Linux software package will be suggested.
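The thresholds above can be checked before submitting a job. The following Python sketch is only an illustration of the stated limits, not server code. It assumes that the number of couples equals the number of kinases multiplied by the number of substrates; whether a kinase with several SDRs counts more than once is not stated here and is ignored. The function name output_tier and its return labels are hypothetical.

```python
def output_tier(n_substrates: int, n_kinases: int = 0, kinome: bool = False) -> str:
    """Classify a job as 'full', 'text-only', or 'too-large' from the stated limits."""
    if n_substrates < 1 or (not kinome and n_kinases < 1):
        raise ValueError("a job needs at least one substrate and one kinase or a kinome")
    if kinome:
        # Kinome jobs are limited by the number of substrates only.
        if n_substrates == 1:
            return "full"
        return "text-only" if n_substrates <= 10 else "too-large"
    couples = n_kinases * n_substrates  # assumption: every kinase is paired with every substrate
    if couples <= 50:
        return "full"
    return "text-only" if couples <= 500 else "too-large"

print(output_tier(50, n_kinases=1))   # full
print(output_tier(10, n_kinases=10))  # text-only
print(output_tier(11, kinome=True))   # too-large
```

A job classified as too-large here would need the AKID Linux software mentioned above.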