Copyright © 2014-2017 Allotrope Foundation, All Rights Reserved. Confidential Draft.
The Allotrope Data Format (ADF) is a family of specifications designed to standardize the acquisition, exchange, storage and access of analytical data captured in laboratory workflows. This document is an overview of ADF. It provides an entry point to its specifications and documentation.
THESE MATERIALS ARE PROVIDED "AS IS" AND ALLOTROPE EXPRESSLY DISCLAIMS ALL WARRANTIES, EXPRESS, IMPLIED OR STATUTORY, INCLUDING, WITHOUT LIMITATION, THE WARRANTIES OF NON-INFRINGEMENT, TITLE, MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE.
This section describes the status of this document at the time of its publication. Other documents may supersede this document. A list of current Allotrope publications and the latest revision of this technical report can be found in the Allotrope technical reports index at http://purl.allotrope.org/TR/.
This document was published by the Allotrope Foundation as a First Public Working Draft. This document is intended to become an Allotrope Recommendation. If you wish to make comments regarding this document, please send them to firstname.lastname@example.org. All comments are welcome.
Publication as a First Public Working Draft does not imply endorsement by the Allotrope Membership. This is a draft document and may be updated, replaced or obsoleted by other documents at any time. It is inappropriate to cite this document as other than work in progress.
The Allotrope Data Format (ADF) defines an interface for storing scientific observations from analytical chemistry. It is intended for long-term stability of archived analytical data and fast real-time access to it. This document provides an overview of the main components and links to corresponding specifications.
The document is structured as follows: The remaining part of the introduction describes the document conventions, in particular the naming conventions and the namespaces and corresponding prefixes that are used throughout the different ADF specifications. Then, the general requirements for the ADF API stack and its components are described. Detailed descriptions are available in the corresponding specifications.
The IRI of an entity has two parts: the namespace and the local identifier.
Within one RDF document the namespace might be associated with a shorter prefix.
For instance, the namespace IRI http://www.w3.org/2002/07/owl# is commonly associated with the prefix owl, and one can write owl:Class instead of the full IRI http://www.w3.org/2002/07/owl#Class.
Within the biomedical domain the local identifier is often an alphanumeric ID which is not human readable.
The Allotrope Foundation Ontologies [AFO] follow this approach, e.g., a process is represented as af-p:AFP_0001617.
To enhance readability within this document, the preferred label from the ontology or taxonomy is used for the corresponding entity.
I.e., instead of af-p:AFP_0001617 the corresponding entity is named af-p:process.
If the namespace is clear from the context, the prefix MAY be omitted and the entity is named simply process.
If the label contains spaces, the entity MAY be surrounded by guillemets to avoid ambiguities, e.g.
Within the ADF specifications, the following namespace prefix bindings are used:
Allotrope SHOULD use the common prefixes registered at prefix.cc.
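The naming conventions above can be illustrated with a minimal Python sketch. It is not part of any ADF specification: the helper names `split_iri`, `expand`, `compact` and `display_name` are hypothetical, and the prefix table holds only the well-known `owl` and `rdf` bindings rather than the ADF bindings. Splitting at the last `#` or `/` is a simplifying assumption about where the namespace ends.

```python
# Illustrative prefix table; only well-known W3C bindings are listed here.
PREFIXES = {
    "owl": "http://www.w3.org/2002/07/owl#",
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
}


def split_iri(iri):
    """Split an IRI into (namespace, local identifier) at the last '#' or '/'."""
    cut = max(iri.rfind("#"), iri.rfind("/"))
    if cut < 0:
        raise ValueError(f"no namespace separator in {iri!r}")
    return iri[:cut + 1], iri[cut + 1:]


def expand(name, prefixes=PREFIXES):
    """Expand a prefixed name such as 'owl:Class' to its full IRI."""
    prefix, sep, local = name.partition(":")
    if not sep or prefix not in prefixes:
        raise KeyError(f"unknown prefix in {name!r}")
    return prefixes[prefix] + local


def compact(iri, prefixes=PREFIXES):
    """Shorten a full IRI to a prefixed name if a known namespace matches."""
    for prefix, namespace in prefixes.items():
        if iri.startswith(namespace):
            return f"{prefix}:{iri[len(namespace):]}"
    return iri  # no binding known: keep the full IRI


def display_name(label):
    """Surround a label with guillemets if it contains spaces."""
    return f"«{label}»" if " " in label else label
```

For example, `expand("owl:Class")` yields the full IRI of the OWL class entity, and `compact` reverses that step for any IRI whose namespace is in the table.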
Within this document the definitions of MUST, SHOULD and MAY are used as defined in [rfc2119].
Within this document, decimal numbers will use a dot "." as the decimal mark.
This section describes the key requirements for the Allotrope Data Format (ADF).
The key requirements for ADF are:
This section describes the high-level structure of the Allotrope Data Format (ADF). The following figure illustrates the high-level structure of ADF:
An ADF file has six components in which all data is stored. The data description component stores metadata in the form of RDF [rdf11-concepts] statements. The named graphs component stores user-defined graphs, also in the form of RDF [rdf11-concepts] statements. The data cube component stores n-dimensional source or processed data that are the output of experimental processes. The data package component is a general-purpose file storage (e.g. for reports or images). The audit store contains the audit trail. Finally, there is a component for checksums.
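As a reading aid, the six components can be written down as a small Python enumeration. This is only a sketch: the type `ADFComponent`, its member names and the helper `missing_components` are hypothetical and do not reflect identifiers used by the ADF APIs or the group names inside an ADF file.

```python
from enum import Enum


class ADFComponent(Enum):
    """The six components of an ADF file, with the role described above."""
    DATA_DESCRIPTION = "metadata as RDF statements"
    NAMED_GRAPHS = "user-defined graphs as RDF statements"
    DATA_CUBE = "n-dimensional source or processed data"
    DATA_PACKAGE = "general-purpose file storage"
    AUDIT = "audit trail"
    CHECKSUMS = "checksums"


def missing_components(present):
    """Return the components that do not occur in `present`."""
    found = set(present)
    return [component for component in ADFComponent if component not in found]
```

A consumer could use such a check to report which parts of the structure it did not find before reading any data.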
The following figure illustrates the high-level structure of the Allotrope Data Format (ADF) API stack.
Corresponding to the high-level file structure, there are three APIs for data package, data cube and data description (with a more technical RDF store API). The analytical data API provides specific methods for specific analytical techniques. Further, there is an API for the HDF5 file format and one for the audit trail. The taxonomies and ontologies provide the vocabulary and data structures for RDF representations.
The Data Package API specification [ADF-DP] defines how to store multiple (proprietary, binary, text) data files in a single package. The purpose of packaging is to ensure consistency and integrity of data files and metadata during storage and transfer. Files stored in the data package can represent source measurements or results of an experiment or process described in the data description. The vocabulary used within the ADF Data Package operations is described in the ADF Data Package Ontology [ADF-DPO].
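The integrity goal of packaging can be illustrated with a checksum manifest. The following Python sketch is an assumption-laden illustration, not the Data Package API: the functions `build_manifest` and `verify` are hypothetical, files are modelled as an in-memory mapping from path to bytes, and SHA-256 is chosen only for the example since this overview does not name a checksum algorithm.

```python
import hashlib


def build_manifest(files):
    """Map each file path to the SHA-256 hex digest of its content."""
    return {path: hashlib.sha256(content).hexdigest()
            for path, content in files.items()}


def verify(files, manifest):
    """Return the sorted paths that are missing, unexpected, or altered."""
    current = build_manifest(files)
    paths = set(current) | set(manifest)
    return sorted(p for p in paths if current.get(p) != manifest.get(p))
```

An empty result from `verify` indicates that every packaged file still matches the digest recorded when the manifest was built.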
The Data Cube API specification [ADF-DC] defines how to store one- or multi-dimensional data. This can be source or processed data, which may be sparse. Data cubes represent measurements or results of an experiment or process described in the data description. The vocabulary used within the ADF Data Cube operations is described in the ADF Data Cube Ontology [ADF-DCO].
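To make the notion of sparse multi-dimensional data concrete, here is a minimal Python sketch of a cube that stores only the cells that were actually written. The class `SparseCube` and its methods are hypothetical and unrelated to the Data Cube API; a dictionary keyed by index tuples is just one simple way to represent sparsity.

```python
class SparseCube:
    """An n-dimensional cube that keeps only explicitly written cells."""

    def __init__(self, shape, fill=None):
        self.shape = tuple(shape)   # extent of each dimension
        self.fill = fill            # value reported for unwritten cells
        self._cells = {}            # index tuple -> value

    def _check(self, index):
        """Normalize an index to a tuple and validate it against the shape."""
        if isinstance(index, int):
            index = (index,)
        index = tuple(index)
        if len(index) != len(self.shape):
            raise IndexError(f"expected {len(self.shape)} indices, got {len(index)}")
        for position, extent in zip(index, self.shape):
            if not 0 <= position < extent:
                raise IndexError(f"index {index} outside shape {self.shape}")
        return index

    def __setitem__(self, index, value):
        self._cells[self._check(index)] = value

    def __getitem__(self, index):
        return self._cells.get(self._check(index), self.fill)

    def stored(self):
        """Number of cells that physically hold a value."""
        return len(self._cells)
```

With `cube = SparseCube((3, 4), fill=0.0)`, writing `cube[0, 2] = 1.5` stores a single cell, while every other position reads back as the fill value.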
The Data Description API specification [ADF-DD] defines how to store experiment or process descriptions and contextual metadata as well as metadata of Data Package and Data Cubes.
The Quadruple Store API specification [ADF-QS] provides an efficient persistence layer for the RDF Data Model based on HDF5 [HDF5]. It allows RDF statements of the RDF Data Model to be stored, queried and retrieved.
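The idea of a quadruple store, that is, RDF statements extended by the graph they belong to, can be sketched in a few lines of Python. This in-memory toy is not the Quadruple Store API and has nothing to do with its HDF5 persistence; the class `QuadStore`, its methods and the `"default"` graph name are hypothetical, and terms are plain strings for brevity.

```python
class QuadStore:
    """A toy in-memory store of (subject, predicate, object, graph) quads."""

    def __init__(self):
        self._quads = set()

    def add(self, subject, predicate, obj, graph="default"):
        """Store one statement in the given graph."""
        self._quads.add((subject, predicate, obj, graph))

    def match(self, subject=None, predicate=None, obj=None, graph=None):
        """Return all quads matching the pattern; None acts as a wildcard."""
        pattern = (subject, predicate, obj, graph)
        return sorted(
            quad for quad in self._quads
            if all(want is None or want == have
                   for want, have in zip(pattern, quad))
        )
```

Querying with only `graph` set returns every statement of one named graph, which mirrors how user-defined graphs can be kept apart from the data description.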
The following figure shows the different ontologies that are used to describe data stored in ADF:
The ADF Data Package Ontology [ADF-DPO] provides the vocabulary for the operations of the ADF Data Package API [ADF-DP] on files and folders.
The ADF Data Cube Ontology [ADF-DCO] provides the vocabulary for the descriptions of multi-dimensional data and the operations on them. It is used by the ADF Data Cube API [ADF-DC].
The ADF Data Cube to HDF5 Mapping Ontology [ADF-DCO-HDF] provides the vocabulary for describing the mapping from the functional representation of data cubes to their physical representation in HDF5 data sets.
The ADF Audit Ontology [ADF-AUDIT] provides the vocabulary for the description of audit trail entries and electronic signatures.
ADF uses HDF5 [HDF5] as the underlying file format. The following figure illustrates the basic layout of an ADF file in HDFView with the six components for data description, named graphs, data cube, data package, checksums and audit trail:
| 0.3.0 | 2015-04-30 | Initial Working Draft version |