OpenMS
Computes a consensus from results of multiple peptide identification engines.
potential predecessor tools | → ConsensusID → | potential successor tools |
---|---|---|
IDPosteriorErrorProbability | | PeptideIndexer |
IDFilter | | |
IDMapper | | |
Reference:
Nahnsen et al.: Probabilistic consensus scoring improves tandem mass spectrometry peptide identification (J. Proteome Res., 2011, PMID: 21644507).
Algorithms:
ConsensusID offers several algorithms that can aggregate results from multiple peptide identification engines ("search engines") into consensus identifications - typically one per MS2 spectrum. This works especially well for search engines that provide more than one peptide hit per spectrum, i.e. that report not just the best hit, but also a list of runner-up candidates with corresponding scores.
The available algorithms are (see also OpenMS::ConsensusIDAlgorithm and its subclasses):
PEPMatrix:
Scoring based on posterior error probabilities (PEPs) and peptide sequence similarities. This algorithm uses a substitution matrix to score the similarity of sequences not listed by all search engines. It requires PEPs as the scores for all peptide hits.
PEPIons:
Scoring based on posterior error probabilities (PEPs) and fragment ion similarities ("shared peak count"). This algorithm, too, requires PEPs as scores.
best:
For each peptide ID, this uses the best score of any search engine as the consensus score. All peptide IDs must have the same score type.
worst:
For each peptide ID, this uses the worst score of any search engine as the consensus score. All peptide IDs must have the same score type.
average:
For each peptide ID, this uses the average score of all search engines as the consensus score. Again, all peptide IDs must have the same score type.
ranks:
Calculates a consensus score based on the ranks of peptide IDs in the results of different search engines. The final score is in the range (0, 1], with 1 being the best score. The input peptide IDs do not need to have the same score type.
PEPs for search results can be calculated using the IDPosteriorErrorProbability tool, which supports a variety of search engines.
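The three simplest algorithms (best, worst, average) can be illustrated with a minimal Python sketch. This is not the OpenMS implementation: the function name `consensus_scores`, the dictionary-based input, and the example scores are all hypothetical, and averaging only over the runs that actually reported a peptide is an assumption of this sketch.

```python
from collections import defaultdict
from statistics import mean


def consensus_scores(runs, method="best", higher_is_better=False):
    """Combine per-run scores into one consensus score per peptide.

    runs: list of dicts mapping peptide sequence -> score, one dict per
    search run. All scores are assumed to share the same score type.
    """
    # Collect, for every peptide, the scores from the runs that reported it.
    collected = defaultdict(list)
    for run in runs:
        for peptide, score in run.items():
            collected[peptide].append(score)

    # What counts as "best" depends on the score orientation.
    best_of = max if higher_is_better else min
    worst_of = min if higher_is_better else max
    combine = {"best": best_of, "worst": worst_of, "average": mean}[method]
    return {peptide: combine(scores) for peptide, scores in collected.items()}


# Hypothetical PEP-like scores (lower is better) from three search runs.
runs = [
    {"PEPTIDEK": 0.01, "PEPTLDEK": 0.20},
    {"PEPTIDEK": 0.05},
    {"PEPTIDEK": 0.03, "SAMPLER": 0.40},
]

for method in ("best", "worst", "average"):
    print(method, consensus_scores(runs, method))
```

Because a single function combines scores across runs, the sketch also shows why these algorithms need one common score type: taking the minimum or mean of, say, a PEP and an E-value would be meaningless.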
PEPMatrix algorithm: The similarity scoring method used there can only take unmodified peptide sequences into account, so PTMs are ignored during that step. However, the PTMs are not removed from the peptides, and there will be separate results for differently modified peptides.
File types:
Different input file types are supported:
rt_delta and mz_delta). One consensus identification will be generated for each group. With the per_spectrum flag you can also input multiple idXML files. A consensus will be made per combination of originating mzML file and spectrum_ref.
Filtering:
Generally, search results can be filtered according to various criteria using IDFilter before (or after) applying this tool. ConsensusID itself offers only a limited number of filtering options that are especially useful in its context (see the filter parameter section):
considered_hits:
Limits the number of alternative peptide hits considered per spectrum/feature for each identification run. This helps to reduce runtime, especially for the PEPMatrix and PEPIons algorithms, which involve costly "all vs. all" comparisons of peptide hits.
min_support:
This allows filtering of peptide hits based on agreement between search engines. Every peptide sequence in the analysis has been identified by at least one search run. This parameter defines which fraction (between 0 and 1) of the remaining search runs must "support" a peptide identification that should be kept. The meaning of "support" differs slightly between algorithms: For best, worst, average and ranks, each search run supports peptides that it has also identified among its top considered_hits candidates. So min_support simply gives the fraction of additional search engines that must have identified a peptide. (For example, if there are three search runs, and only peptides identified by at least two of them should be kept, set min_support to 0.5.) For the similarity-based algorithms PEPMatrix and PEPIons, the "support" for a peptide is the average similarity of the most-similar peptide from each (other) search run. (In the context of the JPR publication, this is the average of the similarity scores used in the consensus score calculation for a peptide.)
count_empty:
Typically not all search engines will provide results for all searched MS2 spectra. This parameter determines whether search runs that provided no results should be counted in the "support" calculation; by default, they are ignored.
The command line parameters of this tool are:
INI file documentation of this tool:
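As an illustration of the min_support rule described under Filtering for the best, worst, average and ranks algorithms, the following Python sketch computes the support of each peptide as the fraction of the other search runs that list it among their top considered_hits candidates. It is a simplified model, not the tool's code: the function name `supported_peptides`, the list-based input, and the example peptides are hypothetical. The sketch also assumes that every run produced results (so count_empty plays no role) and that considered_hits is a positive integer.

```python
def supported_peptides(runs, considered_hits, min_support):
    """Keep peptides supported by at least `min_support` of the other runs.

    runs: list of ranked peptide lists (best hit first), one per search run.
    Returns a dict mapping each kept peptide to its support fraction.
    """
    # Only the top `considered_hits` candidates of each run are considered.
    top_hits = [set(run[:considered_hits]) for run in runs]
    all_peptides = set().union(*top_hits)
    other_runs = len(runs) - 1

    kept = {}
    for peptide in all_peptides:
        n_found = sum(peptide in hits for hits in top_hits)
        # One run accounts for the identification itself; the remaining
        # runs that also list the peptide are the ones that "support" it.
        support = (n_found - 1) / other_runs if other_runs > 0 else 0.0
        if support >= min_support:
            kept[peptide] = support
    return kept


# Three hypothetical search runs; keep peptides found by at least two of them.
runs = [
    ["PEPTIDEK", "PEPTLDEK"],
    ["PEPTIDEK", "SAMPLER"],
    ["SAMPLER", "PEPTIDEK"],
]
print(supported_peptides(runs, considered_hits=2, min_support=0.5))
print(supported_peptides(runs, considered_hits=1, min_support=0.5))
```

With three runs, a peptide found by two of them has one supporter out of two other runs, which gives the support of 0.5 used in the example from the parameter description. Lowering considered_hits to 1 shows how the two filters interact: a peptide that is only a runner-up in some runs loses the support of those runs.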