Data-independent acquisition methods such as SWATH for mass spectrometry-based proteomics are usually performed with peptide MS/MS assay libraries, which enable identification and quantitation of peptide peak areas. Reference assay libraries can be generated locally through information-dependent acquisition, or obtained from community data repositories for commonly studied organisms. However, no studies have systematically evaluated how locally generated or repository-based assay libraries affect SWATH performance for proteomic studies. To undertake this analysis, we developed a software workflow, SwathXtend, which generates extended peptide assay libraries by integration with a local seed library and delivers statistical analysis of SWATH-quantitative comparisons. We designed test samples using peptides from a yeast extract spiked into peptides from human K562 cell lysates at three different ratios to simulate protein abundance change comparisons. SWATH-MS performance was assessed using local and external assay libraries of varying complexities and proteome compositions. These experiments demonstrated that local seed libraries integrated with external assay libraries achieve better performance than local assay libraries alone, in terms of the number of identified peptides and proteins and the specificity to detect differentially abundant proteins. Our findings show that the performance of extended assay libraries is influenced by the MS/MS feature similarity of the seed and external libraries, while statistical analysis using multiple testing corrections increases the statistical rigor needed when searching against large extended assay libraries.
Abbreviations used: DIA, Data-Independent Acquisition; CiRT, Common internal Retention Time standard peptides; FC, Fold Change; FDR, False Discovery Rate; FP, False Positive; FN, False Negative; IDA, Information Dependent Acquisition; iRT, Normalized Retention time; MLR, Most Likely Ratio; m/z, mass/charge ratio; NA, Not Available; qFDR, Quantification False Discovery Rate; RT, Retention Time; SWATH, Sequential Window Acquisition of all Theoretical fragment-ion spectra; TN, True Negative; TP, True Positive; XIC, Extracted Ion Chromatogram.
Data Independent Acquisition (DIA) mass spectrometry workflows are gaining increasing use for proteomic analysis of model systems (1Weisbrod CR Eng JK Hoopmann MR Baker T Bruce JE Accurate peptide fragment mass analysis: multiplexed peptide identification and quantification.Journal of proteome research. 2012; 11: 1621-1632Crossref PubMed Scopus (72) Google Scholar, 2Gillet Ludovic C. et al.“Targeted data extraction of the MS/MS spectra generated by data-independent acquisition: a new concept for consistent and accurate proteome analysis.”.Molecular 11.6: O111-016717Google Scholar, 3Venable J.D. Dong M.Q. Wohlschlegel J. Dillin A. Yates J.R. Automated approach for quantitative analysis of complex peptide mixtures from mass PubMed Scopus Google Scholar, of proteins by a of Proteomics. PubMed Scopus Google Scholar, A. A. acquisition from how to into the proteomics PubMed Scopus Google Scholar, Yates J.R. the data-independent PubMed Scopus Google Scholar, J. mass using Proteomics. PubMed Scopus Google Scholar, of mass spectrometry The integrated and quantitative analysis SWATH Ludovic C. et al.“Targeted data extraction of the MS/MS spectra generated by data-independent acquisition: a new concept for consistent and accurate proteome analysis.”.Molecular 11.6: O111-016717Google was to and proteomic and of the by Proteomics. PubMed Scopus Google Scholar, T et of to human proteins by PubMed Scopus Google Scholar, J. A. analysis of by SWATH mass spectrometry and protein as for Proteomics. 
PubMed Scopus Google Scholar, J. of in human by PubMed Scopus Google Scholar, C. the of quantitative proteome with data-independent acquisition and to Proteomics. PubMed Scopus Google Scholar, A. of proteins in a human systems 11: PubMed Scopus Google methods spectrometry and protein by the of peptide which to peptide and quantitation this is for as in data samples are These proteome can be for quantitative data by of peptides and quantitative than concept in analysis is use of a assay library to enable peptide identification from generated multiplexed MS/MS spectra T et of to human proteins by PubMed Scopus Google Scholar, C. the of quantitative proteome with data-independent acquisition and to Proteomics. PubMed Scopus Google Scholar, A. J.D. and accurate acquisition from PubMed Scopus Google The and of this library with we is to and this in Acquisition internal Retention Time standard peptides False Discovery False False Dependent Acquisition Retention Likely False Discovery Retention Time Window Acquisition of all Theoretical fragment-ion spectra Ion Chromatogram. The assay library all the of the peptide to be from the SWATH assay library is and of this approach for data-independent acquisition PubMed Scopus Google Reference assay libraries be and be of to enable peptide identification from approach to a assay library usually using samples to library is that library is and for such as which have a large of protein to have the to detect abundant proteins in the A. of proteins in a human systems 11: PubMed Scopus Google Scholar, J. human proteome with in Proteomics. PubMed Scopus Google To this be that a peptide be assay library for to be and using the SWATH with libraries. approach to locally generated assay libraries is to use data or from external data commonly libraries are in data repositories and of the by Proteomics. PubMed Scopus Google Scholar, T et of to human proteins by PubMed Scopus Google Scholar, J. J.R. J.D. 
of PubMed Scopus Google Scholar, Scholar, C. proteome and and of PubMed Scopus Google studies have demonstrated that assay libraries can be used for SWATH data extraction J. of the library for data through PubMed Scopus Google and a software and have been for assay libraries for data-independent acquisition PubMed Scopus Google Scholar, J. analysis of data-independent acquisition PubMed Scopus Google Scholar, assay libraries for analysis of SWATH PubMed Scopus Google there have been no studies performed to systematically evaluate the of local and extended assay libraries SWATH proteomics To undertake this of assay library performance we developed a and software workflow, which we in with SWATH A. software with the for and identification with based and accurate mass or J. analysis of data-independent acquisition PubMed Scopus Google this software all of in a SWATH workflow, from extended assay library to statistical analysis and libraries are from a locally generated seed which is with libraries, assay libraries or proteome libraries. of the libraries have spiked peptides C. J. 
a for of 2012; PubMed Scopus Google methods for peptide by using based or of library by and of peptide and library with commonly used software and and protein by protein in The statistical analysis of the software the quantitative analysis of protein with the peak through to the identification of differentially proteins using SWATH peak extraction from this we used to and the performance of assay libraries using a of designed mixtures for quantitative These number of proteins and peptides the to detect locally generated and extended libraries, the and for statistical peptides of protein from yeast and human from and been with with and with by the The yeast and human samples in at and in To of yeast protein into human protein three of samples that and yeast in of human protein the three human samples ratios of and ratios SWATH mass spectrometry and statistical using three for of data and SWATH-MS data a mass with using peptide samples a peptide peptide for and at for with and the peptide was with of and and from the using the from and increasing to at a of data for the human for yeast and for human a was at with with the in the for was used for spectra for in the mass with for for and for and SWATH data three for of the human SWATH by with from to which based in the data of human are in spectra for in the mass with for in To test SWATH extended libraries data from different we data for human and yeast three different mass The was a mass which a different from The was a with a and and of The data acquisition used for as used for data acquisition was the and for at using and the was to a increasing to at for and data a from the a with and for with and for for for and for The was with The of and of was a peptide and with at and from to in MS/MS data in for a was was was and and extended assay libraries used in this the libraries, generated libraries and from The extended libraries generated by or of libraries to libraries and with no of to and fragment and MS/MS data 
to by using the and yeast protein from and the a new with was used for The as The from the to with SWATH and as libraries in We a for assay libraries a seed usually a local library which was generated with SWATH data using the and the and or libraries as to extended assay libraries to a which peptides and by is based the number of the data and the fragment is based the fragment These are by the protein identification The for are for peptide and for the spectra that be in the library the of SWATH data extraction by the library as The libraries a to that the fragment the library and the seed methods can be used to the similarity of fragment spectra A. A. peptide as a for mass and a feature for Proteomics. PubMed Scopus Google SwathXtend, we as a of the fragment libraries and analysis of mass similarity peak for mass and methods in libraries that the of is than be to the seed We for based and The approach the seed and the libraries have for all the of a number of peptides the seed and the seed and libraries. The approach peptide and in seed and libraries. this approach can be when the library or We use for peptide in to PubMed Scopus Google to the peptide The libraries are with the seed library to integrated assay peptides of or spectra seed and libraries, seed library spectra are in the peptide in seed and libraries, the spectra from the seed library be in the extended library for this The and in the can be used to integrated assay library by using a seed and or libraries, is in or from J. 
software for and PubMed Google with SWATH quantitation was used to extract SWATH peak with of the libraries in a of to the library and which peptides or be used for These the number of peptides protein to be from the the number of or fragment a peptide in which used to the peptides with a a False Discovery in which is used to the SWATH extraction and the peptide peak with a Ion and mass for and to the We based the of number of peptide and protein by using the assay and peptides The that performed in the to SWATH data for all libraries. SWATH peak the peak peptide peak and protein peak in for statistical The peak by for by the of all for was and the of The of was by the of that the to the for all The data was by peptide and which we to as peptide peak and protein peak of data and was to have the to the the used is in in by with data-independent mass spectrometry PubMed Scopus Google for differentially the approach of with the protein quantitation and the with the peptide quantitation for the protein was assessed by a test or of the protein peak areas. used The change was as the of of the which to the of and the peptide approach the for peptide as the of in the different and was assessed by a test of all peptide to a The of using the peptide approach is that peptides of can by the the is that at different different peptides are for the of the test in this peptide proteins be as differentially The change was as the of the a of and for library and with change from to and from to the of the analysis, a change was as the of the to the in the of analysis, the is used to the multiple testing methods the and the and the a and approach to multiple generated for comparisons. 
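The differential abundance analysis described above combines a per-protein significance test, a fold change threshold, and a multiple testing correction. The following is a minimal Python sketch of that last combination, not the SwathXtend code itself. It assumes the Benjamini-Hochberg procedure as the correction, and the names `benjamini_hochberg`, `call_differential`, `alpha`, and `min_fold_change`, as well as the default thresholds, are hypothetical choices made for illustration only.

```python
from typing import List, Sequence


def benjamini_hochberg(p_values: Sequence[float]) -> List[float]:
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def call_differential(
    p_values: Sequence[float],
    fold_changes: Sequence[float],
    alpha: float = 0.05,           # assumed significance level
    min_fold_change: float = 1.5,  # assumed fold change cutoff
) -> List[bool]:
    """Flag proteins that pass both the adjusted p-value and fold change cutoffs.

    fold_changes are plain ratios (for example 2.0 for a doubling), so a
    decrease passes when the ratio is at most 1 / min_fold_change.
    """
    adjusted = benjamini_hochberg(p_values)
    return [
        adj <= alpha and (fc >= min_fold_change or fc <= 1.0 / min_fold_change)
        for adj, fc in zip(adjusted, fold_changes)
    ]
```

With a large extended library, many more proteins are tested at once, so the adjusted p-values grow relative to the raw ones. That is the reason a correction of this kind matters more for extended libraries than for a small local library.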
The the proteins in of the increasing and the in this The identified the proteins using the increasing The of this was to use peptide MS/MS assay libraries of different and complexities and how this SWATH-MS quantitative performance for analysis of complex We the use of locally generated peptide MS/MS assay libraries and to external libraries in data To the quantitative performance of SWATH-MS to detect differentially abundant proteins we designed that peptides of yeast into K562 human cell in ratios to differentially abundant proteins in the this we the number of peptides and proteins with different peptide MS/MS assay the quantitation which we as the specificity for of differentially SWATH the quantitation obtained with the for different library To undertake we developed based software software into SWATH analysis workflows as in The of for libraries are in peptide MS/MS assay libraries as in These libraries can be as locally generated or libraries to from data using the and mass spectrometry as we used for SWATH The library can be as or to the number of yeast spectra The external libraries are into three and and To the local we analysis of the human and analysis of the yeast a human proteins and yeast proteins. such a library with of proteins to change in abundance samples be obtained as spectra of the differentially yeast proteins from the human proteins. is library to be used as a for as a large number of yeast and human spectra the abundance of yeast proteins in the is a local library the in proteomics in which differentially proteins are in data the that yeast spiked into the K562 human cell this to the yeast proteins in a of human proteins. the number of yeast proteins in the library the to detect the differentially abundant proteins. We used as a seed library to libraries to evaluate how SWATH quantitative and are external libraries. to and external was generated from of the human and of the yeast used a different human and yeast proteins. 
a the data used to the library data or with the to use to SWATH the peptide in is different from obtained with the SWATH is a external was from of of and the human cell using a with human and yeast than the and proteome large external libraries. is a human proteome assay library of proteins T et of to human proteins by PubMed Scopus Google while is a yeast proteome assay library which of the yeast proteome and of the by Proteomics. PubMed Scopus Google is a external library generated using with from acquisition of yeast and human cell the local as a we the external libraries and extended libraries and a of fragment the libraries have a of to peptide the library We of external library locally generated seed library was by and using as the seed We the and methods and the are in Our for this the the we used the based for this We performed testing of for SWATH data extraction by using the The the number of peptides number of proteins and the number of yeast proteins We that the for SWATH extraction using this library the number of peptide protein to be from the library as the number of fragment peptide as peptide identification as SWATH for peak by using a to the analysis of Ludovic C. 
et al.“Targeted data extraction of the MS/MS spectra generated by data-independent acquisition: a new concept for consistent and accurate proteome analysis.”.Molecular 11.6: O111-016717Google Scholar, data and statistical for PubMed Scopus Google as for the peptide as and mass a for the as and peptides this SWATH data was using local libraries, and and extended libraries, and all local or libraries, and the SWATH data extraction was and of the library However, when the external library was used for SWATH data this peptide as peptides in SWATH this was by SWATH data for using a peptides that peptide peak than and the and as The was to all peptides in the library and used for extraction of all SWATH data the number of peptides and proteins obtained from for all test with the local all extended libraries number of proteins human proteins there was a when extended libraries from to and yeast proteins which are at a of in the from proteins using the library generated by with the to proteins using the extended library that yeast protein with the locally generated the extended libraries a and extraction the local was generated by a of human and yeast samples a of human yeast proteins that from the SWATH data which for the library the number of human peptides the extraction of yeast proteins at proteins extended libraries and yeast peptide We this to the abundance of yeast proteins in the test samples and of proteins be in the of with the number of human peptides that are in The protein identification and peak can be in of peptide and protein using local and extended assay libraries. 
the local as the the local as the seed and are extended assay libraries by the local library and is a proteome external in library proteins in proteins in proteins proteins of human protein of yeast protein in a new the of this we can evaluate with the performance of different statistical to detect differentially abundant proteins from SWATH data We the number of yeast proteins identified as differentially human proteins identified as differentially and the quantitative as is from identification commonly used in data We the change the three yeast spiked by analysis of for protein using the for all assay libraries for all libraries to the data and the data and used to The differentially proteins identified by and the for the locally generated and extended libraries of for library are in proteins with local and extended assay libraries using using the multiple testing of and change the three and the of and The are in for library are performance for the proteins in of and performance for the proteins and and by increasing by increasing in a new the of analysis, we assessed the of using the multiple testing corrections methods commonly used in proteomics A. A. testing in a for Proteomics. PubMed Scopus Google and and the was assessed and to and a as a for all have for all libraries. the multiple testing is at the in of proteins. 
with a change a number of for this data with The show for the local multiple testing for extended libraries, is the extended libraries have a of proteins with no change in and the corrections as to the of The of and a of than for all libraries at the of However, for this the of with for the local library and extended and and a of identified a of the the performance of the proteins in increasing and the proteins a of such be to experiments the are in the to show the performance is using with the library and this is to that of by the external library and These are consistent with the based change and this that at a the external library of the number of in the local library this can be to by a locally generated seed library The assay library generated with by performed when used to against SWATH data of the in We the of differentially abundant proteins for assay was based the of proteins proteins as differentially based and at the protein using local and extended assay libraries and by in there is a of proteins using the local library and extended libraries, that use of extended libraries are to detect of the differentially proteins that are by using the local is that as the library increases there is of as as in with using the local The and of data in show the protein for the proteins using extended libraries. the abundance with the change for comparisons. the that of the proteins by the extended libraries are differentially abundant proteins. 
the we the of in protein for all three spiked samples in analysis by using we in three to to and to for libraries and we protein and peptide a change is generated for peptide identified for protein and the from multiple peptides to the protein are The peptide approach is a of peptides better for from We a of and change to be and the of multiple testing corrections for all libraries, we and as quantitation that be in proteomic data using the we of are in and the are in be that is in proteomic experiments as and are using the we can to proteomic for the three and three for and change and and change and and change the protein and peptide are by The are in with are of is in in for of that is for the local library for all and for of the extended libraries the peptide analysis is the multiple testing corrections at the of no for the with to and to that increasing the change to of than for all extended libraries the peptide analysis is for and We that for the local and extended and using a commonly used in proteomics test and for all at the protein the of spiked yeast We that the peptide analysis generates a in all at the of of the extended libraries, and can be by using the peptide and by the change to Our for the of SWATH quantitation human proteins to be at in of the the yeast proteins to be at ratios of and for of the comparisons. The and standard for the human proteins to the and are in The quantitation and for the yeast is in for of the libraries in this the protein analysis, and for all libraries using yeast proteins. 
the the of the abundance ratios the using protein peak for all yeast proteins identified as differentially the standard of the ratios is in to to to in a new The in the for different of yeast and human proteins for three by using the data with The in the ratios for to to and to to and to of the yeast protein ratios are all to than the However, for to of the ratios are to the can be in the of of the samples in the the SWATH ratios be to of in the peak when samples in the have a the ratios are to the for the data with the extended libraries can be in SWATH-MS is demonstrated to and proteomic the SWATH a assay library is to the MS/MS spectra to enable peptide such a peptide assay library in a local can to peptide assay libraries to new assay libraries for a SWATH been for data-independent acquisition PubMed Scopus Google Scholar, J. analysis of data-independent acquisition PubMed Scopus Google However, no studies have been to systematically evaluate the performance of SWATH dependent assay library by et J. A. protein by data-independent acquisition a mass PubMed Scopus Google by library performance based analysis of human samples the number and of identified peptides and proteins. libraries libraries generated from and and a human external library from T et of to human proteins by PubMed Scopus Google et using of proteins of the quantitation and of differentially proteins. and that of et demonstrated the of in generated libraries for peptide identification Our using a local library as a seed for with the external library to the in protein However, this was the in et the library based human cell and was of the of the of this We a for the analysis of human using the et human library These the for the library and studies for data-independent acquisition PubMed Scopus Google Scholar, J. analysis of data-independent acquisition PubMed Scopus Google have performed assay libraries in by assay libraries, all based use of peptides C. J. 
a for of 2012; PubMed Scopus Google for The peptides to be into the protein samples at the the peptide multiple and SWATH as data the assay libraries in this of have and of the of the of using is the of of peptides is spiked to The to be for the peptide to be as as to to the The of a of internal for data-independent acquisition mass Proteomics. PubMed Scopus Google methods the of spiked peptides a of internal peptides to be for methods of a of which can be and for large data which was demonstrated in in the of using the external or peptides for based methods with model and for is as a for assay this we that the of the extended assay library is influenced by the of the seed library and libraries. seed library is usually a peptide assay library generated locally using the and as the SWATH-MS The seed library can be generated by which is as a and The data are used to SWATH-MS to a seed library is and the and to the SWATH The seed library can be used to extract SWATH data for or abundant proteins to abundant proteins be to library through library The libraries be with to the seed demonstrated in be when the at a is used for the seed library and MS/MS are The to the similarity of the spectra in the seed and assay and is by mass and acquisition We used to the of assay and this and have the with different mass using the acquisition for and data using a which and with the However, is to that the systems are in in MS/MS and the libraries. 
We the of and is than that of and and are and be that of the assay libraries for SWATH have been generated and data that differentially abundant peptides be identified in to SWATH acquisition to assay libraries, is to enable and are based assay libraries using for human and The extended assay library generated from and are and and the are and data using with which different MS/MS to the seed is no that the and is with the external libraries, with a of in the quantitative performance of was to extended libraries in this a large library such and can in for differentially abundant protein with libraries and We this to the of and which is and than and of yeast proteins are and human proteins are a large number of human proteins to the extended library is to to is that extended assay libraries be to the studied and and be is to in extended libraries by multiple testing corrections or for as and used to differentially abundant proteins data analysis that the test and change be used in to the than and change than a to differentially proteins in the of this the at the of the SWATH to use which at the of or peptide or or than which enable at the of We the use of was to in when the library is such as in using libraries However, the with the is for the statistical analysis is in is to use or to of the of SWATH is that the data can be using peak extraction to is with using to protein the of data by analysis using the SWATH data with to this that by using extended assay libraries, the number of peptides and proteins as as the number of identified differentially abundant proteins can be while The of proteins as differentially by using the extended assay library and the local library are and the new of differentially proteins by the extended libraries show the The of the SWATH data using the extended libraries is to that obtained when using the local is that of is to that the use of libraries for the of MS/MS spectra from experiments for data-independent acquisition PubMed Scopus 
Google Scholar, J. J.D. peptide identification for data-independent PubMed Scopus Google to be the to use libraries for SWATH The mass spectrometry proteomics data and SWATH have been into the A. proteomics data and PubMed Scopus Google with the can and the SWATH using with SWATH The is from J. software for and PubMed Google at We for the We Ludovic for with with with
Wu et al. studied this question.