MSstatsQCgui: A shiny app for longitudinal quality monitoring for proteomic experiments

Eralp DOGU eralp.dogu@gmail.com

Sara TAHERI srtaheri66@gmail.com

Olga VITEK o.vitek@neu.edu

2021-10-26

Introduction

MSstatsQCgui translates modern methods of longitudinal statistical process control, such as simultaneous and time-weighted control charts and change point analysis, to the context of LC-MS experiments. Details can be found on the MSstatsQC website and in the project GitHub repository. The tools are available for stand-alone use or for integration with automated pipelines.

This vignette summarizes the functionality of the MSstatsQCgui package.

The GUI was created using Shiny, a web application framework for R, and uses several packages, such as shinyjs, that provide advanced features for Shiny apps. A running version of the GUI can be found at MSstatsQCgui.

Installation

To install the package from the Bioconductor repository please use the following code.
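The installation code did not survive in this copy of the vignette. The sketch below follows the standard Bioconductor installation pattern and assumes only that the package is published under the name used in this vignette.

```r
# Install BiocManager first if it is not already available.
if (!requireNamespace("BiocManager", quietly = TRUE)) {
    install.packages("BiocManager")
}

# Install the released version of the package from Bioconductor.
BiocManager::install("MSstatsQCgui")
```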

To install the development version of the package via GitHub:
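The GitHub installation code is also missing from this copy. A minimal sketch follows. The repository path `eralpdogu/MSstatsQCgui` is an assumption and is not stated in this text, so please confirm it against the project GitHub repository before use.

```r
# devtools provides install_github(); install it if needed.
if (!requireNamespace("devtools", quietly = TRUE)) {
    install.packages("devtools")
}

# ASSUMPTION: the repository path below is a guess, not taken from this vignette.
devtools::install_github("eralpdogu/MSstatsQCgui")
```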

Quick start

The following commands should be used to start the graphical user interface.
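The launch commands are not preserved in this copy. The sketch below assumes that the exported launcher is named `RunMSstatsQC()`. If that name is not found, `help(package = "MSstatsQCgui")` lists the functions the installed package actually exports.

```r
# Load the package.
library(MSstatsQCgui)

# ASSUMPTION: RunMSstatsQC() is the exported function that starts the Shiny app.
RunMSstatsQC()
```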

Input

To analyze quality control data in MSstatsQCgui, the input must be a .csv file in “long” format with the columns described below. This is a common data format that can be generated by spectral processing tools such as Skyline and Panorama AutoQC.

The recommended format includes Acquired Time, Peptide name (the Precursor column), Annotations, and data for any QC metrics such as Retention Time, Total Peak Area and Mass Accuracy. Each input file must include Acquired Time, Peptide name and Annotations. After the Annotations column, the user can add any metric of interest under a proper column name. MSstatsQCgui can analyze up to 20 metrics simultaneously.

  1. AcquiredTime: This column gives the acquisition time of the QC/SST sample in the format MM/DD/YYYY HH:MM:SS AM/PM. European date formats are also accepted.

  2. Precursor: This column gives the precursor ID. Statistical analysis is done separately for each unique label in this column.

  3. Annotations: Annotations are free-text information given by the analyst about each run. They can explain any special cause or record any observation related to a particular run. Annotations are displayed interactively in the plots provided by MSstatsQC.

  4. RetentionTime, TotalPeakArea, FWHM, MassAccuracy, PeakAssymetry, and other metrics: Each of these columns describes a feature of a peak for a specific peptide.

The example dataset was generated during CPTAC Study 9.1 at Site 54. Although the example focuses on targeted proteomics, the statistical methods apply more generally. Each row corresponds to a single time point.
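To make the layout concrete, the sketch below builds a small long-format table with the required columns plus two metric columns and writes it to a .csv file. All values are invented for illustration and are not taken from the CPTAC Study 9.1 data. The peptide labels are hypothetical.

```r
# Toy long-format QC table; every value here is invented for illustration.
qc <- data.frame(
  AcquiredTime  = c("03/21/2021 09:15:00 AM", "03/21/2021 09:15:00 AM",
                    "03/22/2021 10:02:00 AM", "03/22/2021 10:02:00 AM"),
  Precursor     = c("PEPTIDEA", "PEPTIDEB", "PEPTIDEA", "PEPTIDEB"),
  Annotations   = c("", "", "column replaced", "column replaced"),
  RetentionTime = c(21.4, 33.8, 21.9, 34.1),
  TotalPeakArea = c(1.52e7, 8.40e6, 1.47e7, 8.10e6),
  stringsAsFactors = FALSE
)

# Write the table as a .csv file (a temporary path is used here).
csv_path <- tempfile(fileext = ".csv")
write.csv(qc, csv_path, row.names = FALSE)

# Read it back to confirm that the header survives the round trip.
qc_back <- read.csv(csv_path, stringsAsFactors = FALSE)
```

Metric columns come after Annotations, and each precursor contributes one row per acquired time point.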

Data import

The Data import tab is used to import data. The user can also run the app with sample data and clear the outputs with the Clear button.

See Input Example

Data process

The Data import tab automatically checks the data and validates it for further use.
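The exact checks performed by the app are not described here. As a rough illustration of what such validation could involve, the sketch below defines a hypothetical helper, `check_qc_input()`, which is not part of MSstatsQCgui. It verifies the required columns, requires numeric metric columns, and parses the timestamps.

```r
# Hypothetical helper, not part of MSstatsQCgui: basic checks on a long-format QC table.
check_qc_input <- function(qc) {
  required <- c("AcquiredTime", "Precursor", "Annotations")
  missing_cols <- setdiff(required, names(qc))
  if (length(missing_cols) > 0) {
    stop("Missing required column(s): ", paste(missing_cols, collapse = ", "))
  }

  # Every remaining column is treated as a QC metric and must be numeric.
  metric_cols <- setdiff(names(qc), required)
  if (length(metric_cols) == 0) stop("No metric columns found.")
  non_numeric <- metric_cols[!vapply(qc[metric_cols], is.numeric, logical(1))]
  if (length(non_numeric) > 0) {
    stop("Non-numeric metric column(s): ", paste(non_numeric, collapse = ", "))
  }

  # Parse MM/DD/YYYY HH:MM:SS AM/PM timestamps (%p matching depends on the locale).
  parsed <- as.POSIXct(qc$AcquiredTime, format = "%m/%d/%Y %I:%M:%S %p", tz = "UTC")
  if (anyNA(parsed)) stop("Some AcquiredTime values could not be parsed.")
  qc$AcquiredTime <- parsed

  # Return rows ordered by precursor and then by time.
  qc[order(qc$Precursor, qc$AcquiredTime), , drop = FALSE]
}
```

Ordering by precursor and time matters for the later steps, because guide set bounds and control charts refer to time points by index.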

Options

The Options tab is used to set the metrics and peptides of interest. The guide set and the known mean and standard deviation are also set within the “Options” tab. The user should select a proper and representative guide set using this tab. The lower bound of the guide set is the index of the first time point to be included in the guide set. For example, if you choose “1” as the lower bound, the first time point will be the first element of the guide set. Similarly, the upper bound of the guide set is the index of the last observation to be included. It is possible to use different guide sets for different metrics and peptides.
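The role of the guide set bounds can be sketched in a few lines of base R. The helpers below are hypothetical and the retention times are invented. The sketch assumes that the guide set is used to estimate a mean and standard deviation, which are then used to standardize the full series.

```r
# Hypothetical helper: mean and SD of one metric over guide-set indices lower..upper.
guide_set_stats <- function(x, lower, upper) {
  stopifnot(lower >= 1, upper <= length(x), lower < upper)
  guide <- x[lower:upper]
  list(mean = mean(guide), sd = sd(guide))
}

# Standardize a whole series against its guide set.
standardize_by_guide <- function(x, lower, upper) {
  g <- guide_set_stats(x, lower, upper)
  (x - g$mean) / g$sd
}

rt <- c(21.4, 21.6, 21.5, 21.3, 21.7, 22.4, 22.6)  # invented retention times
g  <- guide_set_stats(rt, lower = 1, upper = 5)     # time points 1 to 5 form the guide set
z  <- standardize_by_guide(rt, lower = 1, upper = 5)
```

With a lower bound of 1 and an upper bound of 5, the last two time points are judged against the behaviour of the first five.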

See MSstatsQC Option Tab

Control charts and change point analysis

The Control charts tab is used to construct X and mR charts as well as CUSUMm and CUSUMv control charts.
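For readers unfamiliar with X and mR charts, the sketch below computes textbook individuals and moving-range limits from a guide set. It is an illustration under stated assumptions, not the package's implementation: the helper name is hypothetical, the data are invented, and the constants 2.66 and 3.267 are the conventional values for moving ranges of two consecutive points.

```r
# Textbook individuals (X) and moving-range (mR) limits estimated from a guide set.
xmr_limits <- function(x, lower, upper) {
  guide  <- x[lower:upper]
  mr_bar <- mean(abs(diff(guide)))  # average moving range within the guide set
  center <- mean(guide)
  list(
    x_center  = center,
    x_lcl     = center - 2.66 * mr_bar,
    x_ucl     = center + 2.66 * mr_bar,
    mr_center = mr_bar,
    mr_lcl    = 0,
    mr_ucl    = 3.267 * mr_bar
  )
}

rt  <- c(21.4, 21.6, 21.5, 21.3, 21.7, 22.4, 22.6)  # invented retention times
lim <- xmr_limits(rt, lower = 1, upper = 5)

# Moving ranges for the full series; the first point has no predecessor.
mr <- c(NA, abs(diff(rt)))

# Indices of time points outside the control limits.
out_of_control_x  <- which(rt < lim$x_lcl | rt > lim$x_ucl)
out_of_control_mr <- which(mr > lim$mr_ucl)
```

The X chart targets changes in the mean of a metric, while the mR chart targets changes in its variability.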

See XmR Chart Tab

See CUSUMm and CUSUMv Charts Tab

See Change Point Estimation Tab
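The idea behind CUSUM charts and a simple change point estimate can be sketched as follows. This is a generic tabular CUSUM, not necessarily the computation used by the app. The reference value `k = 0.5` and threshold `h = 5` are conventional defaults assumed here, the change point rule (the last time the signalling statistic was zero) is one common textbook estimator, and the standardized values are invented.

```r
# Tabular CUSUM on standardized values z; k and h are assumed conventional defaults.
cusum <- function(z, k = 0.5, h = 5) {
  n <- length(z)
  pos <- numeric(n)
  neg <- numeric(n)
  for (i in seq_len(n)) {
    prev_pos <- if (i == 1) 0 else pos[i - 1]
    prev_neg <- if (i == 1) 0 else neg[i - 1]
    pos[i] <- max(0, z[i] - k + prev_pos)   # accumulates upward drift
    neg[i] <- max(0, -z[i] - k + prev_neg)  # accumulates downward drift
  }
  signal <- which(pos > h | neg > h)
  first_signal <- if (length(signal) > 0) signal[1] else NA_integer_

  # Change point estimate: last index at which the signalling statistic was zero.
  change_point <- NA_integer_
  if (!is.na(first_signal)) {
    stat  <- if (pos[first_signal] > h) pos else neg
    zeros <- which(stat[seq_len(first_signal)] == 0)
    change_point <- if (length(zeros) > 0) max(zeros) else 0L
  }
  list(pos = pos, neg = neg, first_signal = first_signal, change_point = change_point)
}

z <- c(0.2, -0.4, 0.1, -0.3, 0.5, 2.1, 2.4, 1.9, 2.2)  # invented standardized values
res_mean <- cusum(z)  # CUSUM for the mean

# CUSUM for variability: Hawkins' scale transformation, then the same recursion.
v <- (sqrt(abs(z)) - 0.822) / 0.349
res_var <- cusum(v)
```

In this toy series the upward statistic first exceeds the threshold at time point 9, and the estimated last in-control time point is 5.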

Summary functions: river and radar plots

Summary plots are available in the Metric summary tab under Detailed performance: plot summaries.

See Plot Summaries Output

See Input Example for Decision Map

See Decision Map Output

Output

Plots created by the core plot functions are generated by plotly, an R package for interactive plot generation. Each output generated by plotly can be saved using the plotly toolset.

Project website

Please see MSstats.org/MSstatsQC and the GitHub repository for further details about this tool.

Questions and issues

Please use the Google group to file bug reports or feature requests.

Citation

Please cite MSstatsQCgui:

Session information

sessionInfo()
## R version 4.1.1 (2021-08-10)
## Platform: x86_64-pc-linux-gnu (64-bit)
## Running under: Ubuntu 20.04.3 LTS
## 
## Matrix products: default
## BLAS:   /home/biocbuild/bbs-3.14-bioc/R/lib/libRblas.so
## LAPACK: /home/biocbuild/bbs-3.14-bioc/R/lib/libRlapack.so
## 
## locale:
##  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
##  [3] LC_TIME=en_GB              LC_COLLATE=C              
##  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
##  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
##  [9] LC_ADDRESS=C               LC_TELEPHONE=C            
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       
## 
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base     
## 
## loaded via a namespace (and not attached):
##  [1] digest_0.6.28   R6_2.5.1        jsonlite_1.7.2  magrittr_2.0.1 
##  [5] evaluate_0.14   rlang_0.4.12    stringi_1.7.5   jquerylib_0.1.4
##  [9] bslib_0.3.1     rmarkdown_2.11  tools_4.1.1     stringr_1.4.0  
## [13] xfun_0.27       yaml_2.2.1      fastmap_1.1.0   compiler_4.1.1 
## [17] htmltools_0.5.2 knitr_1.36      sass_0.4.0