Prediction of allergenic proteins and mapping of IgE epitopes in allergens
Lab/Group: Raghava Group (IMTECH, Chandigarh, India)
Introduction
The use of genetically modified proteins in foods, therapeutics and biopharmaceuticals is increasing rapidly, so it is important to predict whether a modified protein is allergenic. In 2003, the Codex Alimentarius Commission (Codex) convened a panel of international food safety regulators to review the FAO/WHO 2001 recommendations and recognized the uncertainties associated with the bioinformatics part of the guidelines. The panel recommended various tests for examining the allergenic behavior of proteins, including the source of the gene, sequence similarity to known allergens, stability of the protein and IgE binding. With these points in mind, a method (AlgPred) was developed for predicting allergenic proteins using several approaches.
Materials
Reagents
Both formatted and unformatted sequences are accepted as input. For formatted sequences the server uses the ReadSeq software, which can read most commonly used standard sequence formats, including FASTA, PIR, EMBL and GenBank. The user has to specify whether the sequence is in a standard format or is unformatted raw/plain text (single-letter amino acid codes only).
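Before submitting, it can help to check locally which of the two input types a sequence belongs to. The following is a minimal sketch of such a check, not part of AlgPred or ReadSeq; the function names and the rule "a first line starting with '>' means FASTA" are assumptions made for illustration, and other formats (PIR, EMBL, GenBank) are not handled.

```python
# Hypothetical pre-submission check; it does not reproduce ReadSeq's parsing.
VALID_RESIDUES = set("ACDEFGHIKLMNPQRSTVWY")  # 20 standard single-letter codes


def guess_input_format(text):
    """Return 'fasta' if the first non-empty line starts with '>', else 'plain'."""
    for line in text.splitlines():
        if line.strip():
            return "fasta" if line.lstrip().startswith(">") else "plain"
    raise ValueError("empty input")


def extract_sequence(text):
    """Drop FASTA header lines and all whitespace; return the uppercase sequence."""
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    residues = "".join(ln for ln in lines if not ln.startswith(">"))
    return "".join(residues.split()).upper()


def invalid_residues(sequence):
    """Return the sorted characters that are not standard amino acid codes."""
    return sorted(set(sequence) - VALID_RESIDUES)


if __name__ == "__main__":
    example = ">query1 hypothetical example\nMKTAYIAKQR\nQISFVKSHFS\n"
    print(guess_input_format(example))                   # fasta
    print(invalid_residues(extract_sequence(example)))   # []
```

An empty list from invalid_residues suggests the sequence can be pasted as plain text; any reported characters (for example digits or 'X') should be reviewed before submission.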
Equipment
Users can access this web server from any computer (Windows, Linux or Mac) with a web browser and an Internet connection.
Procedure
To run a prediction, follow these stepwise instructions.
Step 1: Type the following URL in your web browser: http://www.imtech.res.in/raghava/algpred
Step 2: Fill in the sequence submission form. A brief description of each field follows:
Protein sequence name: This is an optional field.
Paste protein sequence in plain or standard format: Paste the query protein sequence in one of the standard formats (FASTA, EMBL, PIR etc.) or as an amino acid sequence in single-letter code only.
Or Upload sequence file: The user can also upload the query sequence directly from a file.
NOTE: The server accepts input from only one of these two options (pasted sequence or uploaded file), not both.
Input sequence format: The user has to select the appropriate format according to the input sequence.
Step 3: Select one or more of the following approaches in the submission form:
i) Mapping of IgE epitopes and PID: The server searches for known IgE epitopes in the query protein sequence and assigns the protein as an allergen if any segment has high similarity to a known epitope. Any matching epitope(s) are mapped onto the query sequence. The specificity of this approach is very high, but its sensitivity is low because not all IgE epitopes of all allergens are known.
ii) MEME/MAST motif: The query protein sequence is searched against MEME matrices created from allergen sequences. The specificity of this approach is high, but its sensitivity is low.
iii) SVM module based on amino acid composition: The SVM module is trained on the amino acid composition of allergen and non-allergen protein sequences. The threshold value used is -0.4. At this threshold the sensitivity and specificity of the method are 88.87% and 81.86%, respectively, using fivefold cross-validation.
iv) SVM module based on dipeptide composition: The SVM module is trained on the dipeptide composition of allergen and non-allergen protein sequences. The threshold value used is -0.2. At this threshold the sensitivity and specificity of the method are 82.78% and 85.00%, respectively, using fivefold cross-validation.
v) BLAST search on allergen representative peptides (ARPs): The query protein sequence is searched against a database of 2890 allergen representative peptides (ARPs) obtained from Bjorklund et al. 2005. If there is a hit, the protein is assigned as an allergen and the matching ARP is shown in the result field. The accuracy of this method is very high, with high sensitivity as well as specificity.
vi) Hybrid approach: The query protein sequence is assigned as an allergen if any one of the methods (SVM composition based + mapping of IgE epitopes + ARPs BLAST + MEME/MAST) predicts it as an allergen.
Step 4: Finally, click the "Submit" button.
A completed submission form is shown in Figure 1.
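The composition-based features and the hybrid rule described in Step 3 can be illustrated with a short sketch. This is an illustration under stated assumptions, not the server's implementation: compositions are expressed here as percentages, a score above the threshold is assumed to mean "allergen", and all function names are hypothetical. The trained SVM models themselves are not reproduced, so the scores passed to svm_call would have to come from elsewhere.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # 20 standard residues
DIPEPTIDES = [a + b for a, b in product(AMINO_ACIDS, repeat=2)]  # 400 ordered pairs


def amino_acid_composition(seq):
    """Percentage of each standard residue in the sequence (20 values)."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return [100.0 * seq.count(aa) / len(seq) for aa in AMINO_ACIDS]


def dipeptide_composition(seq):
    """Percentage of each ordered residue pair among all overlapping pairs (400 values)."""
    seq = seq.upper()
    total = len(seq) - 1
    if total < 1:
        raise ValueError("sequence must contain at least two residues")
    counts = dict.fromkeys(DIPEPTIDES, 0)
    for i in range(total):
        pair = seq[i:i + 2]
        if pair in counts:  # pairs with non-standard residues are ignored
            counts[pair] += 1
    return [100.0 * counts[dp] / total for dp in DIPEPTIDES]


def svm_call(score, threshold):
    """Assumption: a score above the threshold is called an allergen."""
    return score > threshold


def hybrid_call(calls):
    """Hybrid rule: allergen if any individual method predicts allergen."""
    return any(calls.values())


if __name__ == "__main__":
    calls = {
        "svm_composition": svm_call(-0.55, -0.4),  # hypothetical score
        "ige_epitope": False,
        "arp_blast": True,
        "meme_mast": False,
    }
    print(hybrid_call(calls))  # True
```

Because the hybrid rule is a logical OR, a single positive method is enough to label the protein an allergen, which favors sensitivity over specificity.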
Troubleshooting
This server allows users to predict allergens and map IgE epitopes. The server may take longer to respond if all available approaches are selected in a single submission.
Critical Steps
Anticipated Results
The server presents the results of the selected approaches in a single HTML output page. It provides comprehensive information about the prediction, including the score, threshold, distance from threshold, Positive Predictive Value (PPV) and Negative Predictive Value (NPV). If the PPV is >80%, there is a high chance that the protein is a potential allergen. For the BLAST search, if the query sequence matches any ARP in the database, the matched ARP is also shown. AlgPred also allows mapping of IgE epitopes on allergenic proteins. Example output of AlgPred is shown in Figure 2, and output of the hybrid model is shown in Figure 3.
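For readers unfamiliar with the reported quantities, the sketch below uses the standard definitions PPV = TP/(TP+FP) and NPV = TN/(TN+FN), and takes the distance from threshold as score minus threshold. How AlgPred derives these values internally is not described here, so the function names, the sign convention and the example counts are assumptions for illustration only.

```python
def predictive_values(tp, fp, tn, fn):
    """Return (PPV, NPV) as percentages from confusion-matrix counts."""
    if tp + fp == 0 or tn + fn == 0:
        raise ValueError("PPV/NPV undefined without positive and negative predictions")
    ppv = 100.0 * tp / (tp + fp)  # fraction of predicted allergens that are allergens
    npv = 100.0 * tn / (tn + fn)  # fraction of predicted non-allergens that are non-allergens
    return ppv, npv


def distance_from_threshold(score, threshold):
    """Assumed convention: positive values lie on the allergen side of the threshold."""
    return score - threshold


if __name__ == "__main__":
    # Hypothetical counts, not taken from the AlgPred evaluation.
    ppv, npv = predictive_values(tp=80, fp=20, tn=90, fn=10)
    print(ppv, npv)  # 80.0 90.0
    print(ppv > 80.0)  # False: exactly 80% does not exceed the >80% cutoff
```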
References
Saha, S. and Raghava, G.P.S. (2006) AlgPred: prediction of allergenic proteins and mapping of IgE epitopes. Nucleic Acids Res. 34(Web Server issue):W202-9.
Acknowledgements
This work was supported by the Council of Scientific and Industrial Research and the Department of Biotechnology, Government of India.
Keywords
Allergenic Proteins, IgE Epitopes, Prediction, Allergen representative peptides, Support Vector Machine
Figure 1: Submission form of AlgPred where the sequence is in FASTA format; the IgE mapping option is used for prediction.
Figure 2: Example output of AlgPred for the "IgE Mapping" option.
Figure 3: Example output of AlgPred for the hybrid option, where all methods are used.

