Biobanking and biomedical research are at the beginning of the big data era. Biobanks collect, analyze, store, and share samples and their associated data. The amount of data is growing enormously, and this growth brings a set of challenges. Because biobanking is multidisciplinary, it touches many research areas, including medicine, biology, systems biology, information technology (IT), artificial intelligence (AI), machine learning, modelling, and statistics. Both the qualitative and quantitative aspects of the data associated with samples are growing fast, data structures are becoming more complicated, and managing data throughout their entire life cycle requires specific approaches that were not needed in the recent past. Biobanks are at the center of the new paradigm of big-data-driven biomedical research, which promises a revolution in healthcare, with important advances ranging from the management of chronic diseases to the delivery of personalized medicine.
The Latvian Biomedical Research and Study Centre (LBMC) and the Genome Database of Latvian Population (LGDB) collect a significant amount of data from national, regional, and international research projects and deal with many types of samples and data. The “Guidelines for the maintenance of efficient biobank, health register and research associated data management” aim to improve awareness of the data ecosystem among all stakeholders, both within LBMC and outside it. This includes an in-depth inspection of the current situation with respect to all data-related artefacts, e.g. projects conducted and cohorts. It also includes an attempt to make the data comply with the FAIR principles, i.e. findable, accessible, interoperable, and reusable. We expect to identify what data each project generates, covering both ongoing and previous projects (the status of each project and its data will be recorded, i.e. whether the data are or could potentially be used), whether and how the data will or could be made accessible for verification and re-use, and how they will be curated and preserved.
This document describes the data management life cycle for all data collected, processed, and/or generated in LBMC, both within projects and outside them. Formally, it contains information on the nature of the data, on the handling and processing of research data, and on the standards and methodology applied. Its essential content is whether and how data will be shared or made open access, and how the data will be curated and preserved. It presents the current state of play in LBMC together with best practices, and from these derives the agenda for the short- and long-term improvements needed to meet the goals. The document is therefore beneficial both (1) to LBMC, by inspecting internal processes, data flows, and their management and by defining the scope for improvements, and (2) to other biobanks, since alongside the LBMC-specific data management issues, best practices are summarized and defined as guidelines.