Researchers across pharma share the vision of linking the genome to the phenotype. The proteome may offer that link, serving as a highly sensitive indicator of a wide range of real-world influences.
Mass spectrometry is the ideal tool to capture the complexity of proteome profiles – now at a depth that reveals even subtle signatures of health or disease, whether in plasma or in tissue.
We can correlate these profiles with patient biomedical data across distinct cohorts. And with scale-up techniques such as high-throughput automation, Biognosys is building the capacity to accumulate sufficient data for such large-scale comparisons.
But now we must cope with a new challenge: precision medicine makes the cohorts under investigation ever narrower and smaller.
As a consequence, the statistical confidence in the results suffers. In the search for biomarkers, pharma R&D may discover only afterward that many of the candidates were not worth pursuing.
In his talk at BioData in Basel on November 2nd, Sebastian Schegk, our Head of Data Business, shows how to identify these false-positive biomarkers and which strategies can contain the problem.
In a lung cancer case study, we worked with biomarkers identified through several comparisons across tissue samples from NSCLC patient cohorts. Because these comparisons were made between typically small project cohorts, a substantial portion of the identified biomarkers turned out to be false positives.
In a comparison between stage I and stage III cohorts of typical size – 50 cases – we found 40% of the initially identified biomarkers to be false positives.
And in correlation testing along a continuum of progression-free survival time, up to 90% of the initially identified biomarkers were false positives.
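Why small cohorts produce so many false-positive candidates can be illustrated with a toy simulation. The numbers below (protein count, effect size, fraction of truly differential proteins) are illustrative assumptions, not Biognosys data, and the nominal p-value screen here is a stand-in for a real biomarker discovery analysis:

```python
import math
import random

random.seed(0)

def welch_t(a, b):
    # Welch t statistic for two independent samples
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    va = sum((x - ma) ** 2 for x in a) / (len(a) - 1)
    vb = sum((x - mb) ** 2 for x in b) / (len(b) - 1)
    return (ma - mb) / math.sqrt(va / len(a) + vb / len(b))

def p_two_sided(t):
    # two-sided p-value via a normal approximation (adequate for illustration)
    return 2 * (1 - 0.5 * (1 + math.erf(abs(t) / math.sqrt(2))))

def simulate(n_per_group, n_proteins=1000, frac_true=0.2, effect=0.5):
    # frac_true of proteins carry a real group difference of size `effect`;
    # the rest are pure noise. Count how many pass p < 0.05.
    true_idx = set(range(int(n_proteins * frac_true)))
    hits, false_hits = 0, 0
    for i in range(n_proteins):
        shift = effect if i in true_idx else 0.0
        a = [random.gauss(shift, 1) for _ in range(n_per_group)]
        b = [random.gauss(0, 1) for _ in range(n_per_group)]
        if p_two_sided(welch_t(a, b)) < 0.05:
            hits += 1
            if i not in true_idx:
                false_hits += 1
    return hits, false_hits

# two cohorts of 25, i.e. 50 cases in total, as in the stage I vs III example
hits, fp = simulate(n_per_group=25)
print(f"candidate biomarkers: {hits}, of which false positives: {fp}")
```

With only 25 cases per arm, statistical power is modest, so the truly differential proteins are found unreliably while chance hits among the many null proteins accumulate – and the false-positive share of the candidate list grows accordingly.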
In these two cases, we used samples and data from our partner Indivumed, who have established a consistently high quality for their human tissue samples. This, and the fact that the test cohort and the reference data cohort came from the same source, made this a near-ideal test case.
But what if data availability is less ideal – if the test cohort and the reference data do not come from the same source, with the same high quality and the same coherence in biomedical variates?
We can expand the correlation testing to include a set of covariates that bridge the biomedical factors between the test cohort and the reference data cohort. This lets us work with a reference data set that is similar, though not identical.
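One common form of such covariate adjustment is residualization: regress both the protein level and the outcome on the covariates, then correlate the residuals (a partial correlation). The sketch below is a minimal illustration with hypothetical data, where a single assumed covariate, age, drives both a protein's level and progression-free survival; it is not the specific method used by Biognosys:

```python
import math
import random

def residualize(y, x):
    # residuals of y after simple linear regression on covariate x
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return [yi - (my + slope * (xi - mx)) for xi, yi in zip(x, y)]

def pearson(a, b):
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    num = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    da = math.sqrt(sum((x - ma) ** 2 for x in a))
    db = math.sqrt(sum((y - mb) ** 2 for y in b))
    return num / (da * db)

random.seed(1)
# hypothetical cohort: age confounds both protein level and PFS time
age = [random.uniform(40, 80) for _ in range(60)]
protein = [0.1 * a + random.gauss(0, 1) for a in age]
pfs = [-0.2 * a + random.gauss(0, 2) for a in age]

raw = pearson(protein, pfs)
adjusted = pearson(residualize(protein, age), residualize(pfs, age))
print(f"raw r = {raw:.2f}, age-adjusted r = {adjusted:.2f}")
```

Because the protein–PFS association in this toy data runs entirely through age, the adjusted correlation collapses toward zero – the kind of spurious biomarker signal that covariate bridging between non-identical cohorts is meant to remove.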
Such approaches are initial examples of how we add context to the proteome with patient biomedical and EHR data – in this case, to boost the results from measurements of test samples in the hunt for protein biomarkers. More approaches are available along the R&D process.
We are also developing in-silico proteome studies, in which customers delegate their questions to us without needing further samples of their own – ideal for probing the settings of an upcoming clinical study in silico.
Step by step, we are approaching our objective: to bridge proteomics into big data in healthcare and to make it a major pillar of a federated database alongside EHR and genomics.
Get in touch with our Head of Data Business, Sebastian Schegk, to schedule a meeting.