What Constitutes a Good “Hit” from a SEQUEST Search

Yet Another SEQUEST tip!

This is a question I get a lot. There are a number of parameters that you can use to evaluate your SEQUEST search results and decide whether you have a solid “hit”. I like to use all of the parameters collectively as an indicator. Once you get a feel for this, it’s simple. If anyone has other thoughts, please feel free to post them. I’d like to see how other people evaluate their SEQUEST results.

DelCn is the Delta Correlation value. On the summary page, DelCn tells you how different the first hit is from the second hit in your search results. As a general rule of thumb, a DelCn of 0.1 or greater is good. If you are searching a large database (nr.fasta or owl.fasta), you may find that your DelCn values are smaller than those obtained by searching a small database. This is because the chance of finding a similar sequence is higher in the larger databases.
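To make “how different” concrete, here is a minimal Python sketch. It assumes the commonly described definition of DelCn as the gap between the top two XCorr values divided by the top XCorr. The function name and the XCorr numbers are made up for illustration, and this is not SEQUEST's own code.

```python
def delta_cn(xcorrs):
    """Normalized gap between the best and second-best XCorr (assumed definition)."""
    ranked = sorted(xcorrs, reverse=True)
    if len(ranked) < 2 or ranked[0] <= 0:
        return None  # nothing meaningful to compare against
    return (ranked[0] - ranked[1]) / ranked[0]


# Hypothetical XCorr values for the top three candidates of one spectrum.
candidates = [3.2, 2.4, 2.3]
value = delta_cn(candidates)
print(f"DelCn = {value:.2f}, passes 0.1 rule of thumb: {value >= 0.1}")
```

With these numbers the gap is 0.8 out of 3.2, so the top hit stands clearly apart from the runner-up.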

Sp is the preliminary score. SEQUEST performs two types of scoring. After finding all peptides that match within your mass tolerance, SEQUEST does a preliminary scoring of these candidates to weed the list down to 500 for the final correlation analysis. The scoring is based on the number of predicted ions for the candidate peptide that match the experimental MS/MS spectrum. The higher the value of Sp, the better. I like to see a value of at least 200. One thing to note is that larger peptides will have bigger Sp values. For good hits, a 20 residue peptide will usually have an Sp value of over 1000, while a 6 residue peptide will likely be below 500.

RSp, or Rank/Sp, tells you the ranking of the particular match during the preliminary scoring (see Sp above). You can think of the preliminary scoring as “the compulsories”. In the Olympics, you would not expect to see a skater fall down 5 times during the compulsories and end up winning the gold medal. Likewise, you should be wary of any search result that scores a #1 hit after scoring #450 in the compulsories. Ideally, you’d like to see your match ranked #1 in the preliminary scoring. That said, it is not uncommon to see a peptide move up 10 or so notches from the preliminary ranking.

Ions. This is probably my favorite indicator. The Ions value tells you how many of the experimental ions matched the theoretical ions for the peptide listed. For example, 8/10 says that the MS/MS spectrum contained 8 of the 10 predicted ions for the peptide. It is rare to see 100% coverage of the predicted ions, but 70% or 80% coverage is good. You may drop off ions at either end of your spectrum due to scan range limitations or the low mass cutoff feature of the ion trap. If you click on the ions link, you get to see your MS/MS spectrum labeled with the theoretical ions from the matched peptide. Once you do this exercise enough, your eye will learn to quickly “QC” the good vs. the bogus hits.
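If you export your results and want to turn the Ions column into a coverage fraction, a small helper like the one below would do it. This is only a sketch: it assumes the value arrives as a "matched/predicted" string such as 8/10, and the function name is my own.

```python
def ion_coverage(ions):
    """Convert a 'matched/predicted' string such as '8/10' into a fraction."""
    matched, predicted = (int(part) for part in ions.split("/"))
    if predicted == 0:
        return 0.0  # avoid dividing by zero on an empty prediction
    return matched / predicted


print(ion_coverage("8/10"))  # 0.8
```

By the rule of thumb above, anything at or above roughly 0.7 is worth a closer look at the labeled spectrum.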

XCorr. The XCorr value is the cross-correlation value from the search. The #1 hit will always have the highest XCorr value, because XCorr is used to produce the final ranking of the candidate peptides in the search. Usually, you’ll see the top 10 or 12 ranked in your search results. XCorr values above 2.0 are usually indicative of a good correlation. However, as with Sp, XCorr values are usually higher for well-matched, large peptides and lower for smaller peptides. It’s not uncommon for a 20 residue peptide to have an XCorr of 5, while a 6 residue peptide might be around 1.5.
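Since I like to use all of the parameters together, here is a rough Python sketch of a checklist that applies the rules of thumb from this post to one hit. The Hit record, its field names, and the example peptides and values are all hypothetical. The RSp cutoff of 10 is my reading of “10 or so notches”. The sketch reports which checks failed rather than giving a verdict, because short peptides can legitimately fall below the XCorr and Sp guidelines.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    """One row from a results summary (hypothetical field names)."""
    peptide: str
    xcorr: float
    delcn: float
    sp: float
    rsp: int
    ions: str  # "matched/predicted", e.g. "8/10"


def failed_checks(hit):
    """Return the names of the rule-of-thumb checks that this hit fails."""
    matched, predicted = (int(n) for n in hit.ions.split("/"))
    checks = {
        "XCorr >= 2.0": hit.xcorr >= 2.0,
        "DelCn >= 0.1": hit.delcn >= 0.1,
        "Sp >= 200": hit.sp >= 200,
        "RSp <= 10": hit.rsp <= 10,
        # Integer comparison, equivalent to matched / predicted >= 0.7.
        "Ions >= 70%": predicted > 0 and matched * 10 >= predicted * 7,
    }
    return [name for name, ok in checks.items() if not ok]


# Made-up values for illustration only.
long_hit = Hit("LVNELTEFAKTCVADESHAGR", 4.8, 0.31, 1250.0, 1, "30/40")
short_hit = Hit("AGFLKR", 1.5, 0.12, 310.0, 3, "7/10")

for hit in (long_hit, short_hit):
    print(hit.peptide, "failed:", failed_checks(hit) or "nothing")
```

The short peptide fails only the XCorr check, which is exactly the kind of hit where the peptide length and the spectrum itself should decide.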

Other common sense indicators, or what to do if you’re still not sure. If you have some marginal hits that you’re interested in, but you are still unsure of their validity, you might want to combine the above parameters with some common sense indicators. For example, when you search your chromatogram, do you observe at least 2 peptides matching the same protein? I’d hate to convict anyone with one piece of sketchy evidence. Nothing beats looking closely at the MS/MS spectrum. Also, if you repeat the search in an unrestricted mode (i.e., do not tell SEQUEST what enzyme you used), do you still get the same peptide identified? Is there another charge state precursor ion that you did or could collect MS/MS on to corroborate your findings? If you did collect data on more than one charge state precursor, and the results for the searches were different, check to make sure that SEQUEST properly assigned the charge states. Otherwise, the MW(s) could be wrong, which might be throwing off your search.
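The “at least 2 peptides from the same protein” check is easy to automate once you have a list of (protein, peptide) pairs. The sketch below assumes you have already pulled those pairs out of your results. The function name, protein names, and peptide sequences are invented for illustration.

```python
from collections import defaultdict


def proteins_with_support(hits, min_peptides=2):
    """Keep proteins matched by at least `min_peptides` distinct peptides."""
    peptides_by_protein = defaultdict(set)
    for protein, peptide in hits:
        peptides_by_protein[protein].add(peptide)  # a set ignores repeat scans
    return {
        protein: sorted(peptides)
        for protein, peptides in peptides_by_protein.items()
        if len(peptides) >= min_peptides
    }


# Hypothetical (protein, peptide) pairs from one LC-MS/MS run.
hits = [
    ("PROT_A", "LVNELTEFAK"),
    ("PROT_A", "AGFLKR"),
    ("PROT_A", "LVNELTEFAK"),  # same peptide seen in a second scan
    ("PROT_B", "DTHKSEIAHR"),
]
print(proteins_with_support(hits))
```

Here only PROT_A survives, since PROT_B rests on a single peptide. Repeat scans of the same peptide are counted once so they do not pass for independent evidence.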

Explorations:

* If looking at SEQUEST output makes you cross-eyed, check out the interactive SEQUEST summary page for improved vision!