Lead: Indhumathy Subramaniyan
My research focuses on purifying and identifying stable protein complexes using 3D-HPLC on in vivo cross-linked whole-cell lysate. In vivo cross-linking stabilizes transient complexes and weak interactions so that protein-protein interactions can be analyzed. Traditional methods such as immunoprecipitation (IP) and Western blotting are useful for studying only a single target protein at a time. In contrast, our online 3DLC LC-MS/MS approach can identify and quantify protein-protein interactions on a large scale.
HEK293 cells are grown in SILAC medium supplemented with either heavy- or light-labeled amino acids. Cross-linking is performed by exposing both the heavy-labeled controls and the light-labeled, SDA cross-linked cells to UV light, followed by BN lysis. The heavy and light lysates are mixed 1:1 and fractionated using the 3DLC system, which has three orthogonal separation columns: size-exclusion chromatography, mixed-bed ion-exchange chromatography, and hydrophobic interaction chromatography. The fractionated samples are digested and analyzed by LC-MS. Using correlation analysis, known and novel protein complexes can be identified based on the co-migration of interacting partners through multiple orthogonal separations.
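The correlation step can be illustrated with a minimal sketch. It assumes each protein has an abundance profile across the same ordered set of fractions and that Pearson correlation with a fixed cutoff is used to call co-migration; the protein names, profile values, and the 0.9 threshold are hypothetical and are not taken from our pipeline.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length abundance profiles."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("profiles must have the same length (>= 2)")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a flat profile carries no co-migration information
    return cov / sqrt(var_x * var_y)


def co_migrating_pairs(profiles, threshold=0.9):
    """Return (protein_a, protein_b, r) for every pair with r >= threshold."""
    names = sorted(profiles)
    pairs = []
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            r = pearson(profiles[a], profiles[b])
            if r >= threshold:
                pairs.append((a, b, r))
    return pairs


# Hypothetical abundances across six fractions (illustrative values only).
profiles = {
    "PROT_A": [0, 1, 8, 3, 0, 0],
    "PROT_B": [0, 2, 16, 6, 0, 0],
    "PROT_C": [5, 0, 0, 0, 4, 9],
}
pairs = co_migrating_pairs(profiles)
```

Here PROT_A and PROT_B peak in the same fractions and are reported as a candidate pair, while PROT_C elutes elsewhere and is not. In practice, profiles from all three separation dimensions would be compared, and a pair would need to co-migrate consistently across them.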
Our system will be used to study and characterize protein-protein interactions temporally and spatially, relative to treatment conditions or disease status.
Lead: Sivaramakrishna Yadavalli
My primary research goal is to improve a method for quantitative, proteome-wide profiling of ubiquitin modifications and their topology. I will use gel-based proteomics combined with shotgun and targeted (SRM) mass spectrometry to identify and quantify polyubiquitin chain linkages under different physiological conditions. This work will allow comprehensive study of ubiquitome pathophysiology in humans, and the subsequent identification of novel E3 ligases important in cancer systems.
Lead: Andy Lemoff, in collaboration with the Mangelsdorf/Kliewer Lab
In humans, FGF19 (the human analog of FGF15) is well characterized by ELISA. However, in mice the FGF15 ELISA is unreliable, and alternative technologies are required to detect and quantify FGF15 levels. Additionally, there is some contention in the literature about whether FGF15 behaves in mice as FGF19 does in humans. In collaboration with the Kliewer/Mangelsdorf lab, we developed a SISCAPA (Stable Isotope Standard Capture with Anti-Peptide Antibodies) assay coupled with SRM mass spectrometry for detection of low levels of FGF15 in mice. This assay detected as little as ~1 fmol of the FGF15 peptide on column, and its specificity was confirmed by tests on plasma from wild-type and FGF15-knockout mice. This technology is not limited to FGF15, and we can potentially develop assays for detecting low levels of most proteins of interest.
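Quantification with a stable isotope standard rests on a simple ratio: the endogenous (light) peptide amount is estimated from its SRM peak area relative to that of a known amount of spiked heavy-labeled standard. The sketch below shows that arithmetic only; the function name, peak areas, and spike amount are hypothetical, and it assumes the light and heavy peptides respond equally in the mass spectrometer.

```python
def endogenous_amount_fmol(light_area, heavy_area, spiked_heavy_fmol):
    """Estimate endogenous peptide amount from a light/heavy peak-area ratio.

    Assumes equal response for the light peptide and the heavy-labeled
    standard, so amount scales linearly with peak area.
    """
    if heavy_area <= 0:
        raise ValueError("heavy standard peak area must be positive")
    return light_area / heavy_area * spiked_heavy_fmol


# Hypothetical SRM peak areas with 10 fmol of heavy standard spiked in.
amount = endogenous_amount_fmol(
    light_area=2.5e4, heavy_area=1.0e5, spiked_heavy_fmol=10.0
)
```

With these illustrative numbers the light peak is one quarter of the heavy peak, so the estimate is one quarter of the spiked amount.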
Protein biomarkers have huge potential to allow the identification of cancer subtypes, permitting the development of novel therapies and individualized prevention and treatment. However, the analysis of proteomes to identify robust biomarkers is complex and challenging. Comprehensive identification and quantification of proteins in complex mixtures, such as tissue samples, has been made possible by the development of modern separation techniques coupled to advanced mass spectrometers. Despite improvements in the field, the identification of highly specific, low-abundance protein biomarkers, which are of particular interest to scientists and clinicians, remains a major obstacle.
In breast cancer, the diversity among tumors makes characterization of the proteome even more challenging. Detailed knowledge of the proteins that become activated or repressed within a tumor, of their post-translational modifications, and of their expression levels in cancer versus normal tissue should provide insight into disease mechanisms. This would greatly help to improve patient stratification and individualized disease prevention or treatment.
My focus is to develop protein extraction methods for tissue samples, and LC-MS methods to detect and understand in detail the perturbations caused by tyrosine kinases and other factors in tumor versus normal breast tissue.
Lead: Xiaofeng Guo (now at U. Penn), David Trudgian
Recent developments in mass-spectrometry and sample preparation have enabled various research labs to identify in excess of 10,000 proteins from human cell lysate. Most transcripts found in RNA-Seq data can now be identified as proteins, using mass-spectrometry proteomics, ushering in an era of 'complete proteome coverage'. However, comparatively little attention has been given to increasing the sequence coverage of proteins identified. The proteome will only truly be covered when every amino acid of every protein can be observed. Improved sequence coverage is needed to identify otherwise hidden PTMs, isoform splice sites etc. and to improve our ability to generate targeted proteomics assays for these and other regions of interest on a protein.
We used an extensive panel of 48 single, double, and triple enzyme digests to generate a comprehensive map of the HeLa proteome. Over 430,000 unique peptide sequences were identified, more than are present in the current Human PeptideAtlas. We observed >8,500 proteins with 42% mean sequence coverage. Our Confetti coverage map will be a guide for researchers who need to consider non-tryptic digests to target proteins of interest.
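To make the coverage metric concrete, the sketch below computes sequence coverage as the fraction of residues in a protein that fall inside at least one identified peptide, and shows how peptides from a second digest can raise it. The protein sequence, the peptides, and the function name are hypothetical and are not drawn from the Confetti data.

```python
def sequence_coverage(protein, peptides):
    """Fraction of residues in `protein` covered by at least one peptide."""
    if not protein:
        return 0.0
    covered = [False] * len(protein)
    for pep in peptides:
        if not pep:
            continue
        # Mark every occurrence of the peptide, including repeats.
        start = protein.find(pep)
        while start != -1:
            for i in range(start, start + len(pep)):
                covered[i] = True
            start = protein.find(pep, start + 1)
    return sum(covered) / len(protein)


# Hypothetical 24-residue protein and peptides from two different digests.
protein = "MKWVTFISLLLLFSSAYSRGVFRR"
digest_one = ["WVTFISLLLLFSSAYSR"]
digest_two = ["SRGVF"]

single = sequence_coverage(protein, digest_one)
combined = sequence_coverage(protein, digest_one + digest_two)
```

In this toy case the first digest alone covers 17 of 24 residues, and adding the overlapping peptide from the second digest extends that to 20 of 24. Overlapping residues are counted once, which is why coverage from multiple digests is a union rather than a sum.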
Lead: David Trudgian
With the continual improvements in mass-spectrometry technology, as well as general growth in the field of MS proteomics, ever larger amounts of MS proteomics data are being acquired. Simple tasks such as database search for peptide and protein ID become difficult on projects involving 100s of GBs of raw data (such as our Confetti project above). When PTM identification is required, or non-specific enzymes have been used, computational requirements may exceed those available in the lab.
We continuously develop the Central Proteomics Facilities Pipeline (CPFP), and produce other software, to cope with 'big data' in our own laboratory. CPFP was improved in 2012 so that it can use remote HPC clusters, and the Amazon Web Services Cloud, to augment local computing resources on large projects. CPFP is an open-source project, developed in collaboration with the University of Oxford, and is freely available to the community.
As well as ensuring we can efficiently analyze our own data, we want to exploit growing local and public repositories of MS files. Most proteomics data has a short useful life: it is acquired for a specific project and never examined again. Across the many deep-proteome, co-IP, and targeted experiments performed in even just a single institution, a vast amount of data from a wide range of cells, tissues, fluids and treatments has been accumulated. We are interested in developing tools to explore this mountain of data to identify often-neglected PTMs. We believe there is a wealth of PTM information out there, lying undiscovered, because datasets were never searched beyond simple protein ID, or were searched only for the most common PTMs such as phosphorylation. Cloud and HPC computing will allow entirely in-silico proteomics discoveries from terabytes of existing MS data.
Trudgian DC, Mirzaei H, Journal of proteome research, 2012 Oct;
Software: Cloud CPFP at SourceForge.net
Lead: David Trudgian, in collaboration with Kessler Group, University of Oxford
Proteomics groups and facilities often invest a great deal of time and effort in the optimization of mass-spectrometry acquisition methods. However, the standard reverse-phase linear gradient of approximately 0-30% acetonitrile (ACN) used to separate peptide mixtures for LC-MS is rarely adjusted. This standard gradient is 'good enough' for most gel bands and complex mixtures, but becomes quite inefficient when analyzing samples that have been pre-fractionated at the peptide level. SCX, HILIC, OFFGEL, AEX, and high-pH RP pre-fractionation are all only partially orthogonal to the on-line separation. Fractions can be heavily biased toward hydrophobic or hydrophilic peptides, meaning much of the standard gradient is wasted.
GOAT, our Gradient Optimization and Analysis Tool, provides a simple way to optimize an LC-MS gradient to evenly distribute peptides throughout a run. GOAT does not need peptide ID information, and can accept raw data from multiple vendors as input.
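The general idea of spreading peptides evenly through a run can be sketched as follows. This is not GOAT's actual algorithm, which is not described here; it is a simplified illustration that assumes peptide signal from a standard linear run has already been binned by %ACN, and then allots run time to each %ACN interval in proportion to the signal observed there. The function name, bin edges, and signal values are hypothetical.

```python
def optimized_gradient(acn_edges, signal, total_minutes):
    """Return (time_min, percent_acn) breakpoints for a piecewise gradient.

    Each %ACN interval receives run time proportional to the peptide
    signal observed in that interval during a reference linear run.
    """
    if len(acn_edges) != len(signal) + 1:
        raise ValueError("need one more edge than signal bins")
    total_signal = sum(signal)
    if total_signal <= 0:
        raise ValueError("no peptide signal to distribute")
    points = [(0.0, acn_edges[0])]
    elapsed = 0.0
    for upper_edge, s in zip(acn_edges[1:], signal):
        elapsed += total_minutes * s / total_signal
        points.append((elapsed, upper_edge))
    return points


# Hypothetical fraction whose peptides elute mostly between 10 and 20% ACN.
gradient = optimized_gradient(
    acn_edges=[0, 10, 20, 30], signal=[10, 70, 20], total_minutes=60
)
```

For this illustrative fraction, the crowded 10-20% ACN region is stretched over 42 of the 60 minutes, while the sparse regions on either side are traversed quickly. An interval with no signal would receive zero time in this sketch, which a real method would need to handle, for example by enforcing a minimum slope.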
Software: Learn more about GOAT and download it at the GOAT website.