Heterogeneity within and between data sets makes scientific analysis significantly more challenging. Data sets must therefore be cleaned, standardized and harmonized before they are used for analysis.
Data sets that are inconsistent in their use of terminology, ontologies and formats cannot provide the quality results researchers require.
Over almost a decade, MediSapiens has gained significant experience in curating data and mapping it to relevant ontologies. Working with small and large data sets from numerous sources, we have experienced the good, the bad and the ugly of the curation process.
That experience ranges from the large amount of manual work spent cleaning up data sets, through selecting and applying the appropriate ontologies to different columns in a data set, to the risks of human error and inconsistency when working in teams.
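To make the column-mapping step concrete, here is a minimal sketch in Python. It is an illustration only and does not describe how MediSapiens tooling works: the synonym table, the `EX:` term IDs and the function names are all hypothetical, and a real project would draw its terms from an actual ontology. The sketch normalizes free-text values in one column, maps them to a controlled vocabulary, and collects whatever could not be mapped so a curator can review it.

```python
import re

# Hypothetical synonym table: normalized free-text label -> made-up term ID.
# Real curation would use identifiers from an actual ontology instead.
SYNONYMS = {
    "breast cancer": "EX:0001",
    "breast carcinoma": "EX:0001",
    "lung cancer": "EX:0002",
}


def normalize(value: str) -> str:
    """Trim, lower-case, and collapse whitespace, underscores and hyphens."""
    return re.sub(r"[\s_\-]+", " ", value.strip().lower())


def map_column(values):
    """Map each value to a term ID (None if unknown) and collect the unknowns."""
    mapped, unmapped = [], set()
    for value in values:
        term = SYNONYMS.get(normalize(value))
        mapped.append(term)
        if term is None:
            unmapped.add(value)  # left for manual review by a curator
    return mapped, unmapped


# Example column with inconsistent spelling, casing and separators.
diagnoses = ["Breast Cancer", "breast_carcinoma", " LUNG-cancer ", "unknown tumour"]
mapped, unmapped = map_column(diagnoses)
print(mapped)
print(unmapped)
```

Keeping the unmapped values in a separate set, rather than guessing a match, reflects the point about human error: anything the table does not cover is flagged for review instead of being silently mislabeled.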
When we provide our clients with analytical solutions, such as our integrative analysis module Biond™, we cannot stress enough that great results require excellent data.
In our white paper “Data Curation: the essential step for integrated data-driven research” we take a closer look not only at the necessity of data curation and ontology mapping, but also at the steps involved, the challenges, and the future directions these processes are expected to take.
Read more from the white paper about data curation
Writer: Hans Garritzen, Key Account Manager