Methods in molecular biology
-
High-throughput techniques are indispensable to basic and translational research. Among them, recent advances in proteomics have allowed biomedical researchers to characterize the proteomes of multiple organisms. ⋯ This chapter provides an overview of the computational strategies, methods, and techniques reported in this book for bioinformatics analysis of protein data. It also outlines the bioinformatics tools, databases, and proteomic techniques described in each chapter.
-
Many publicly available data repositories and resources have been developed to support protein-related information management, data-driven hypothesis generation, and biological knowledge discovery. To help researchers quickly find the appropriate protein-related informatics resources, we present a comprehensive review (with categorization and description) of major protein bioinformatics databases in this chapter. We also discuss the challenges and opportunities for developing next-generation protein bioinformatics databases and resources to support data integration and data analytics in the Big Data era.
-
Advancements in MS-based phosphoproteomics techniques have helped uncover hundreds of thousands of protein phosphorylation sites in humans and various model organisms. The majority of these sites are uncharacterized. ⋯ Analyzing the phosphorylation and sequence conservation of uncharacterized sites across species can help reveal a subset of the functionally important phosphorylation events. Here, we outline the workflow and provide an overview of publicly available computational resources for conservation analysis of novel phosphorylation sites.
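
As a rough illustration of what a sequence conservation check for a single site can look like, the following minimal Python sketch takes a pre-computed multiple sequence alignment of orthologs and reports, for one reference position, which residue each species carries and what fraction of the other species keep an identical or a phosphorylatable (S/T/Y) residue. The function names, the species labels, and the toy alignment are hypothetical and are not taken from the chapter; a real workflow would obtain orthologs and alignments from the resources the chapter reviews.

```python
# Minimal sketch, assuming an alignment is already available as
# {species_name: gapped_sequence}, with all sequences of equal length.
PHOSPHO_RESIDUES = {"S", "T", "Y"}


def alignment_column(aligned_ref, site):
    """Map a 1-based ungapped reference position to a 0-based alignment column."""
    count = 0
    for col, ch in enumerate(aligned_ref):
        if ch != "-":
            count += 1
            if count == site:
                return col
    raise ValueError("site lies beyond the end of the reference sequence")


def site_conservation(alignment, ref_name, site):
    """Return per-species residues and two conservation fractions for one site.

    The fractions are computed over the non-reference species only:
    identical residue, and any phosphorylatable residue (S, T, or Y).
    """
    ref = alignment[ref_name]
    col = alignment_column(ref, site)
    residues = {name: seq[col] for name, seq in alignment.items()}
    others = [res for name, res in residues.items() if name != ref_name]
    if not others:
        raise ValueError("alignment must contain at least one non-reference sequence")
    identical = sum(res == ref[col] for res in others) / len(others)
    phosphorylatable = sum(res in PHOSPHO_RESIDUES for res in others) / len(others)
    return residues, identical, phosphorylatable


# Toy alignment (invented sequences, for illustration only).
toy_alignment = {
    "human": "MK-SPEL",
    "mouse": "MKASPEL",
    "fly":   "MR-TPDL",
    "yeast": "M--APEL",
}

residues, identical, phosphorylatable = site_conservation(toy_alignment, "human", 3)
print(residues, identical, phosphorylatable)
```

Reporting identity and phosphorylatability separately is a design choice of this sketch: an S-to-T substitution keeps the site phosphorylatable even though the residue is not strictly conserved.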
-
Despite recent advances in mass spectrometric sequencing speed and sensitivity, the in-depth analysis of proteomes still relies widely on off-line peptide separation and fractionation to deal with the enormous molecular complexity of shotgun-digested proteomes. While a multitude of methods have been established for off-line peptide separation using HPLC columns, their use can be limited, particularly when sample quantities are scarce. In this protocol, we describe an approach that performs high-pH reversed-phase separation of peptides into a few fractions in StageTip micro-columns. ⋯ Here, we provide a step-by-step protocol for TMT6plex labeling of peptides, the construction of StageTips, sample fractionation and pooling schemes adjusted to different types of analytes, mass spectrometric sample measurement, and downstream data processing using MaxQuant. To illustrate the results that can be expected with this protocol, we provide data from an unlabeled and a TMT6plex-labeled phosphopeptide sample, leading to the identification of >17,000 phosphopeptides in 8 h (Q Exactive HF) and >23,000 TMT6plex-labeled phosphopeptides (Q Exactive Plus) in 12 h of measurement time. Importantly, this protocol is equally applicable to the fractionation of full proteome digests.
-
Post-translational modifications (PTMs) are covalent modifications that proteins may undergo following, or sometimes during, translation. Together with gene diversity, PTMs contribute to the overall variety of possible protein functions in a given organism. Single-nucleotide polymorphisms (SNPs) are the most common form of variation found in the human genome, and have been found to be associated with diseases such as Alzheimer's disease (AD) and Parkinson's disease (PD), among many others. ⋯ However, these data are unsystematically distributed across a number of diverse databases. Thus, there is a need for efforts toward data standardization and validation of bioinformatics algorithms that can fully leverage SNP and PTM information for biomedical research. In this book chapter, we present some of the commonly used databases for both single-nucleotide variations (SNVs) and PTMs and describe a broad approach that can be applied to many scenarios for studying the impact of nonsynonymous SNVs (nsSNVs) on PTM sites of human proteins.
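
To make the idea of relating nsSNVs to PTM sites concrete, here is a minimal Python sketch of one possible overlap check. It assumes that variants and PTM sites have already been mapped to positions on the same protein sequence, and it labels a variant as "direct" when it falls on the modified residue or "proximal" when it falls within a flanking window. The record types, the window size of 3 residues, and the protein identifiers are hypothetical choices for illustration; they are not the chapter's method or any database's schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PTMSite:
    protein: str   # protein identifier (hypothetical in the example below)
    position: int  # 1-based residue position of the modified amino acid
    residue: str   # one-letter code of the modified amino acid
    kind: str      # e.g. "phosphorylation"


@dataclass(frozen=True)
class Variant:
    protein: str   # protein identifier the variant was mapped to
    position: int  # 1-based residue position affected by the nsSNV
    ref: str       # reference amino acid
    alt: str       # substituted amino acid


def classify_impact(variant, ptm_sites, flank=3):
    """List PTM sites that a variant hits directly or within `flank` residues.

    `flank` is an assumed window size; a suitable value depends on the PTM
    type and on the sequence motif recognized by the modifying enzyme.
    """
    hits = []
    for site in ptm_sites:
        if site.protein != variant.protein:
            continue
        distance = abs(variant.position - site.position)
        if distance == 0:
            hits.append((site, "direct"))
        elif distance <= flank:
            hits.append((site, "proximal"))
    return hits


# Invented records, for illustration only.
sites = [
    PTMSite("PROT_A", 15, "S", "phosphorylation"),
    PTMSite("PROT_A", 40, "K", "acetylation"),
    PTMSite("PROT_B", 15, "T", "phosphorylation"),
]
variant = Variant("PROT_A", 17, "R", "Q")

for site, label in classify_impact(variant, sites):
    print(site.kind, site.position, label)
```

A positional overlap of this kind only flags candidates; whether a substitution actually abolishes or creates a modification site would need further evidence.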