OpenMS
Epifany

EPIFANY (Efficient protein inference for any peptide-protein network) is a Bayesian protein inference engine. It uses PSM (posterior) probabilities from Percolator, OpenMS' IDPosteriorErrorProbability or similar tools to calculate posterior probabilities for proteins and protein groups.

Experimental classes
This tool is a work in progress; its usage and input requirements might change.
pot. predecessor tools         → Epifany →   pot. successor tools
PercolatorAdapter                            IDFilter
IDPosteriorErrorProbability

Epifany is a protein inference engine based on a Bayesian network. It currently uses the same model as Fido, with the main parameters alpha (pep_emission), beta (pep_spurious_emission) and gamma (prot_prior). If these parameters are not specified, they are trained via a grid search: the tool runs with several possible parameter combinations and evaluates each one by its classification performance and calibration. Grid search is slower but recommended, unless you see very extreme output probabilities (e.g. many close to 1.0) or you already know good parameters (e.g. from an earlier run).

When given more than one idXML file, the tool merges them (union of proteins and concatenation of PSMs). It assumes one search engine run per input file but might work with more. Proteins need to be indexed by OpenMS's PeptideIndexer; this is usually done before Percolator/IDPEP anyway, since target/decoy associations are already needed there. Make sure that the input PSM probabilities are not already too extreme (garbage in, garbage out).

After merging, the input probabilities are preprocessed with a low posterior probability cutoff to discard very unreliable matches. The probabilities are then aggregated by taking the maximum per peptide, and the graph is built and split into connected components.

When compiled with the OpenMP flag (enabled by default in the release binaries), the tool is multi-threaded; this can be activated at runtime with the threads parameter. Note that peak memory requirements may rise significantly when multiple components of the graph are processed at the same time.
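The preprocessing steps described above (probability cutoff, maximum per peptide, splitting the peptide-protein graph into connected components) can be illustrated with a small Python sketch. This is not the OpenMS implementation: the function names `preprocess` and `connected_components`, the tuple layout of the input, the toy accessions and the cutoff value of 0.001 are all assumptions chosen for illustration only.

```python
from collections import defaultdict


def preprocess(psms, cutoff=0.001):
    """Filter PSMs by a low probability cutoff and keep the maximum per peptide.

    psms: iterable of (peptide, posterior_probability, protein_accessions).
    The cutoff default is a hypothetical value, not taken from the tool.
    """
    best = {}          # peptide -> highest PSM probability seen
    proteins_of = {}   # peptide -> set of protein accessions it maps to
    for peptide, prob, proteins in psms:
        if prob < cutoff:
            continue   # neglect very unreliable matches
        if peptide not in best or prob > best[peptide]:
            best[peptide] = prob
        proteins_of.setdefault(peptide, set()).update(proteins)
    return best, proteins_of


def connected_components(proteins_of):
    """Split the bipartite peptide-protein graph into connected components."""
    adj = defaultdict(set)
    for pep, prots in proteins_of.items():
        for prot in prots:
            # Tag nodes so a peptide and a protein can never collide by name.
            adj[("pep", pep)].add(("prot", prot))
            adj[("prot", prot)].add(("pep", pep))

    seen = set()
    components = []
    for start in list(adj):
        if start in seen:
            continue
        # Iterative depth-first search from an unvisited node.
        stack = [start]
        seen.add(start)
        comp = set()
        while stack:
            node = stack.pop()
            comp.add(node)
            for neighbor in adj[node]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    stack.append(neighbor)
        components.append(comp)
    return components


# Toy input: two PSMs for PEPA, one shared peptide, one PSM below the cutoff.
psms = [
    ("PEPA", 0.9, ["P1"]),
    ("PEPA", 0.6, ["P1"]),
    ("PEPB", 0.8, ["P1", "P2"]),
    ("PEPC", 0.7, ["P3"]),
    ("PEPD", 0.0005, ["P4"]),
]

if __name__ == "__main__":
    best, proteins_of = preprocess(psms)
    for comp in connected_components(proteins_of):
        print(sorted(comp))
```

Because each connected component shares no peptides or proteins with any other, components can be processed independently. This is what makes processing them in parallel possible, and it is also why handling several components at once can raise peak memory.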

The command line parameters of this tool are:

INI file documentation of this tool: