Data product to support research on human genome
As human genomes are increasingly incorporated in patient records and research databases, information technology will be challenged to sort through the massive amounts of data to enable research and improve patient care.
A Canadian company aims to meet that challenge with a new product intended to help large provider research organizations manage the massive databases that precision medicine applications require.
PHEMI, based in Vancouver, is releasing PHEMI Central Precision Medicine Edition to address the data challenges posed by genomic research. The company estimates that 100 million to 2 billion human genomes will be sequenced in the next decade, with a single human genome estimated to take up at least 3 gigabytes of storage.
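To put those estimates in perspective, a rough back-of-the-envelope calculation can be sketched in Python. It assumes decimal units (1 petabyte = 1,000,000 gigabytes) and treats the 3-gigabyte figure as a floor for raw sequence data only. The function and constant names are illustrative and are not part of any PHEMI product.

```python
# Back-of-the-envelope storage estimate from the figures quoted above.
# Assumptions (not from the vendor): decimal units, raw genome data only,
# no compression, no replication, no annotations or phenotypic records.

GB_PER_GENOME = 3          # lower bound per genome, as quoted in the article
GB_PER_PETABYTE = 1_000_000  # decimal convention: 1 PB = 10**6 GB


def total_petabytes(genomes: int, gb_per_genome: int = GB_PER_GENOME) -> float:
    """Return the minimum storage, in petabytes, for the given genome count."""
    return genomes * gb_per_genome / GB_PER_PETABYTE


low_estimate = total_petabytes(100_000_000)     # 100 million genomes
high_estimate = total_petabytes(2_000_000_000)  # 2 billion genomes

print(f"Low estimate:  {low_estimate:,.0f} PB")
print(f"High estimate: {high_estimate:,.0f} PB")
```

Under these assumptions the range works out to at least 300 petabytes at the low end and 6,000 petabytes (6 exabytes) at the high end, before accounting for copies, indexes or associated clinical data.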
The company’s product works “out of the box,” runs on the Oracle Big Data Appliance and is built on Hadoop big data technology, enabling it to take in millions of variants in less than a minute and provide sub-second query response times. The vendor says these capabilities can help reduce the time researchers and analysts spend loading, searching and analyzing vast amounts of genomic and phenotypic data, as well as drive better understanding of disease and treatment options.
“Our customers in precision medicine have told us they need two key things: a unified view across genotypic and phenotypic information at the whole genome level for hundreds of thousands of samples, and the ability to look up and query that data at speed,” says Paul Terry, PHEMI’s CEO. The product provides “interactive performance across the genotype and phenotype at scale, independent of the amount and types of data in the system.”
Standard interfaces within the PHEMI offering provide integration with existing analytics tools, while PHEMI’s built-in data management and privacy functionality enforces rightful access to data. PHEMI Central Precision Medicine Edition provides the means to combine genotypic, phenotypic and clinical data from individual silos, annotate it using multiple reference datasets, and index and query it at speed and scale to drive discovery and better understanding of disease and treatment options. It is available as a managed cloud service or on premises.