Ever wondered how the FAIR and ALCOA+ principles have helped scientists during the pandemic? In this blog, we delve into what these principles are and how they’ve acted as a guide to ensure the integrity and accuracy of research data.
The COVID-19 pandemic highlighted the importance of science to the world, testing both the speed of our vaccine response and our research and development (R&D) data infrastructures. The pandemic created a need for a wide range of data to be collected, collated, and analyzed both quickly and efficiently. The speed with which data on the virus was captured and turned into research that helped combat its spread was in part facilitated by the FAIR and ALCOA+ guidelines, which helped ensure that the research was compliant, accurate, and reproducible.
Bringing pandemic-related data into line with the FAIR and ALCOA+ principles has also meant that the data is accessible to both machines and humans. This has not only potentially improved the speed of the response to the pandemic, but has also helped us prepare for long-term changes in research and development as machine and Internet of Things (IoT) integration increases.
The FAIR principles were first drafted in 2014 and formally published in 2016, in response to a rapidly changing technological environment that had shifted both the rate and the volume at which research data is produced. They set a standard for scientific research that increases the reusability of data by making it more readily accessible to both humans and machines. FAIR stands for Findable, Accessible, Interoperable, and Reusable:
> Findable: Both people and computers should be able to find your data, or the metadata related to your research, with ease. This is one of the most important components of the process, as it ensures that datasets can be discovered automatically.
> Accessible: Data must be stored in a manner where it is openly available. This does not mean that data must be ‘Open’ in all cases; rather, it means the conditions under which the data is accessible are set out. This includes an explanation of where the data, associated metadata, documentation, and code are deposited, and how they can be accessed, including any authentication or authorization required.
> Interoperable: Data must be structured in a way that allows it to be combined with other data sets. Data must be described in a standard way, using accepted metadata standards, and needs to interoperate with applications or workflows for storage, processing, and analysis. This makes the data ‘machine-actionable’, so that the values of attributes can be compared across a range of data sets to check they are being measured and represented in the same way. Ultimately, interoperability is an essential feature that upholds the value and usability of data.
> Reusable: The researcher should make sure that the data is reusable by detailing the quality assurance procedures and documenting the data license. Data and metadata should therefore be described thoroughly so that the work can be replicated or the data used in different settings.
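
As a rough illustration, the four components can be treated as a checklist over a dataset's metadata. The Python sketch below is not part of the FAIR principles or any standard: the `DatasetMetadata` class, its field names, and the mapping of fields to components are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass
class DatasetMetadata:
    """Hypothetical metadata record; field names are illustrative only."""
    identifier: str = ""         # Findable: persistent identifier, e.g. a DOI
    title: str = ""              # Findable: human-readable description
    access_url: str = ""         # Accessible: where the data is deposited
    access_conditions: str = ""  # Accessible: open, or how to get authorization
    metadata_standard: str = ""  # Interoperable: accepted schema being followed
    file_format: str = ""        # Interoperable: documented file format
    license: str = ""            # Reusable: terms under which data may be reused
    qa_procedure: str = ""       # Reusable: quality assurance documentation


# Which fields support which FAIR component (an assumption for this sketch).
FAIR_FIELDS = {
    "Findable": ["identifier", "title"],
    "Accessible": ["access_url", "access_conditions"],
    "Interoperable": ["metadata_standard", "file_format"],
    "Reusable": ["license", "qa_procedure"],
}


def missing_fair_fields(record: DatasetMetadata) -> dict[str, list[str]]:
    """Return, per FAIR component, the metadata fields that are still empty."""
    gaps = {}
    for component, names in FAIR_FIELDS.items():
        empty = [name for name in names if not getattr(record, name).strip()]
        if empty:
            gaps[component] = empty
    return gaps


# Example: a record that is described and deposited, but not yet licensed.
record = DatasetMetadata(
    identifier="doi:10.0000/example",  # placeholder, not a real DOI
    title="Assay results",
    access_url="https://repository.example/datasets/1",
    access_conditions="open",
    metadata_standard="DataCite",
    file_format="CSV",
)
print(missing_fair_fields(record))
# {'Reusable': ['license', 'qa_procedure']}
```

A check like this only confirms that the descriptive fields have been filled in. Whether the identifier resolves, or whether the license is appropriate, still needs human review.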
Medicine regulatory systems across the world depend on the knowledge of the organizations that develop, manufacture and package, test, distribute and monitor pharmaceutical products.
There is therefore a certain degree of trust between the regulatory bodies and the pharmaceutical companies that the information submitted and used in decision making is both complete and reliable. Ultimately, the data that forms the basis of decisions should be attributable, legible, contemporaneous, original, and accurate. Together, these five attributes form the acronym “ALCOA”, the name commonly given to the main principles of data integrity.
> Attributable: Information must be captured in a way that uniquely identifies the originator of the data (e.g. a person or a computer).
> Legible: Information must be recorded in a way that can be easily deciphered and understood, giving a complete and clear picture of the sequence of steps or events in the record.
> Contemporaneous: Data must be recorded at the time it was generated or observed.
> Original: Data must include the first or source capture of the data or information, and all subsequent data required to fully reconstruct the activity.
> Accurate: Data must be correct, truthful, complete, valid, and reliable.
These principles were later updated, however, to include four additions (complete, consistent, enduring, and available), changing the term to ALCOA+.
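
To make these attributes more concrete, the Python sketch below shows one way an electronic record could be shaped so that each entry carries its author and capture time and cannot be silently overwritten. It is a minimal illustration under stated assumptions, not how any particular ELN or regulated system works: `LabEntry`, `record_entry`, and `verify_log` are hypothetical names, and the hash chain is just one possible technique for preserving the original record.

```python
import hashlib
import json
from dataclasses import asdict, dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class LabEntry:
    """One immutable observation (hypothetical record shape)."""
    author: str         # Attributable: person or instrument that captured it
    recorded_at: str    # Contemporaneous: UTC timestamp taken at capture time
    measurement: str    # Legible: a named, human-readable quantity
    value: float
    unit: str
    previous_hash: str  # Original: link to the prior entry, so history is kept
    entry_hash: str     # Digest of all the fields above


def _digest(payload: dict) -> str:
    """Hash the entry fields in a stable key order."""
    text = json.dumps(payload, sort_keys=True)
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def record_entry(log: list, author: str, measurement: str,
                 value: float, unit: str) -> LabEntry:
    """Append a new entry; corrections are new entries, never overwrites."""
    if not author.strip():
        raise ValueError("entry must be attributable to a person or instrument")
    payload = {
        "author": author,
        "recorded_at": datetime.now(timezone.utc).isoformat(),
        "measurement": measurement,
        "value": value,
        "unit": unit,
        "previous_hash": log[-1].entry_hash if log else "",
    }
    entry = LabEntry(**payload, entry_hash=_digest(payload))
    log.append(entry)
    return entry


def verify_log(log: list) -> bool:
    """Recompute every hash to detect entries altered after capture."""
    previous = ""
    for entry in log:
        payload = asdict(entry)
        stored_hash = payload.pop("entry_hash")
        if entry.previous_hash != previous or _digest(payload) != stored_hash:
            return False
        previous = stored_hash
    return True


log = []
record_entry(log, "a.researcher", "pH", 7.2, "pH")
record_entry(log, "sensor-01", "temperature", 21.5, "degC")
print(verify_log(log))
```

Detecting tampering in this way supports the accurate and original attributes but does not by itself guarantee that a value was measured correctly. That still depends on calibrated instruments and sound procedures.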
Whilst the FAIR principles focus predominantly on the infrastructure for data, placing a large emphasis on metadata, the ALCOA+ principles focus on data integrity, which makes them especially important for bench scientists. Adhering to the ALCOA+ principles increases the trustworthiness of the data and, in turn, makes research integrity easier to uphold. Crucially, however, managing these attributes within electronic systems requires the FAIR principles to be considered and implemented, so that data and metadata are stored appropriately and data is readily accessible, in line with Horizon 2020’s open data aims.
The pandemic has created an unprecedented need for researchers to act quickly in order to tackle the SARS-CoV-2 virus. Everything from documenting and processing samples, establishing the origin of the virus, and understanding individual genetic susceptibility to developing vaccines and identifying mutations requires the practices of open research and responsible data sharing to be upheld. ALCOA+ has helped set the highest of standards for data integrity and, as part of official guidance produced by the WHO, it has made it increasingly important that data scientists and AI researchers working on COVID-19 adhere to data integrity practices. It is equally important that research follows the FAIR criteria for responsible data management, not only to ensure that scientists adopt best practices for storing and sharing their research data, but also to enable and stimulate further collaboration within scientific communities globally.
Electronic Lab Notebooks (ELNs) like Labfolder are designed to support both the ALCOA+ and FAIR principles. With features that facilitate data integrity and management, the digitization of scientific processes through software has helped researchers follow both sets of guidelines. This is because ELNs contribute to both the reproducibility and the reusability of data, while ensuring that research can be easily accessed and retrieved. Digital tools also make it easier to ensure data integrity through the oversight of metadata in an ELN: the entire experiment can be recorded, from the planning stages to the results. Increasingly, researchers are looking for digital solutions, not only to lay the foundations for long-term plans to connect the laboratory, but also to improve complex processes and make work simpler and more efficient for those in research and development.