Table 1 Comparison between contemporary designs and this work towards the realisation of distributed self-driving laboratories (SDLs)
From: A dynamic knowledge graph approach to distributed self-driving laboratories
| Reference | Resource abstraction | Workflow orchestration | Data serialisation and storage | Experimental provenance |
|---|---|---|---|---|
| SiLA28 | Hardware functions are abstracted as SiLA Features following a micro-service architecture, with their behaviour described as a state machine. | A sequence of function calls using the gRPC and HTTP/2 protocols. | AnIML28 is employed as a file-less medium for bidirectional analytical data transmission between laboratory information management systems and chromatography data systems. | Device metadata collected during measurements are stored in XML files. |
| ChemOS76 | Both software and hardware are abstracted as SiLA servers. Time-consuming computational jobs are managed by AiiDA77 on SLURM41. | The central coordinator executes a sequence of Python function calls and is also responsible for creating job files in the format required by each hardware/software resource. | Data are streamed between the central coordinator and each device in diverse file formats, e.g., pickle objects, JSON, and CSV. Data are stored in an internal database whose schema consists of device-agnostic and device-specific tables. | Job execution logs are stored with timestamps in the database table corresponding to each device. |
| ESCALATE24 | The Django framework is used to abstract resources as REST API endpoints. | A sequence of steps is encoded as an “ExperimentTemplate” and is accessible via REST API endpoints (see the client sketch after the table). | Data are stored in a PostgreSQL database and served via Django REST API78 endpoints that serialise the data into JSON format for web transfer and inspection. | Metadata associated with the execution of workflow steps are stored with experiment instances in the relational database. |
| HELAO25 | Devices and their functions (“actions”) are represented as hierarchical and asynchronous FastAPI79 web servers (see the server sketch after the table). | A central coordinator executes a sequence of API calls (wrapped as Python functions). | Data are recorded as “groups” and “datasets” in the HDF5 file format and deposited into institutional repositories. | Metadata of “actions” are recorded in the same HDF5 file as the experimental measurements. |
| XDL | Hardware is categorised and abstracted based on the unit operations in chemical reactions that it can execute. | The sequence of synthesis steps is expressed in XML format (see the XML sketch after the table), which is later compiled into machine-actionable instructions in a Python script. | The analysis reports are kept in their native file formats and bundled with the synthesis description in a PostgreSQL database. | Hardware instructions, the actions actually performed, and other metadata are stored with experiment instances in the relational database. |
| This work | Hardware is virtualised as a digital twin in a knowledge graph, where its control interface, akin to software resources, is wrapped using the derivation agent template. | DMTA cycles are expressed as directed acyclic graphs of “derivations” in the knowledge graph, each referring to a step managed by the respective software agent. | Data are expressed in ontological format (triples) wherever possible (see the triples sketch after the table). Files (e.g., CSV and XLS) are stored on a file server; their ontological translation and pointers to the file server location are stored in the triple store. | The input/output annotations of each step in the workflow are recorded by the derived information framework as triples. The detailed operation timing is not recorded owing to API limitations in obtaining information at this level of granularity. |
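To make the REST-based designs concrete, the client sketch below shows how one might retrieve an “ExperimentTemplate” from an ESCALATE-style Django REST endpoint and deserialise the JSON payload. The base URL, endpoint path, and response fields are illustrative assumptions, not ESCALATE’s actual API schema.

```python
import requests

# Hypothetical base URL of an ESCALATE-style Django REST deployment.
BASE_URL = "http://localhost:8000/api"


def fetch_experiment_template(template_id: str) -> dict:
    """Retrieve one experiment template as JSON.

    The endpoint path and response fields are illustrative; the real
    ESCALATE resource names may differ.
    """
    response = requests.get(
        f"{BASE_URL}/experiment-template/{template_id}/", timeout=10
    )
    response.raise_for_status()  # Fail loudly on HTTP errors.
    return response.json()  # Django REST Framework serialises to JSON.


if __name__ == "__main__":
    template = fetch_experiment_template("example-template-id")
    # Each template bundles an ordered sequence of workflow steps.
    for step in template.get("steps", []):
        print(step)
```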
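HELAO-style hierarchical, asynchronous device servers can be approximated with FastAPI, as in the server sketch below. The pump device, the /actions/dispense endpoint, and the payload model are hypothetical stand-ins for HELAO’s actual action definitions.

```python
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI(title="Hypothetical pump action server")


class DispenseRequest(BaseModel):
    # Illustrative action parameters; HELAO's real actions differ.
    volume_ml: float
    flow_rate_ml_min: float


@app.post("/actions/dispense")
async def dispense(request: DispenseRequest) -> dict:
    """Asynchronously execute a 'dispense' action on a virtual pump.

    A real server would await the hardware driver here; this sketch
    simply echoes the parameters together with a status flag.
    """
    # e.g. await pump_driver.dispense(...) in a real deployment.
    return {"action": "dispense", "status": "done", **request.dict()}

# Run with: uvicorn server:app --port 8001
# A central coordinator then orchestrates a workflow as a sequence of
# HTTP calls to such endpoints, one per action.
```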
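The XDL row describes synthesis steps declared in XML and later compiled into machine-actionable instructions. The XML sketch below mimics that compile step using only Python’s standard library; the element and attribute names are assumptions in the spirit of XDL, not the actual schema.

```python
import xml.etree.ElementTree as ET

# Illustrative synthesis description; element/attribute names are
# assumptions in the spirit of XDL, not the actual XDL schema.
SYNTHESIS_XML = """
<Synthesis>
    <Add vessel="reactor" reagent="NaOH" volume="10 mL"/>
    <Stir vessel="reactor" time="5 min"/>
    <HeatChill vessel="reactor" temp="60 C" time="30 min"/>
</Synthesis>
"""


def compile_steps(xml_text: str) -> list[str]:
    """Compile the declarative XML steps into flat instruction strings,
    standing in for the machine-actionable Python calls."""
    root = ET.fromstring(xml_text)
    instructions = []
    for step in root:
        args = ", ".join(f"{k}={v!r}" for k, v in step.attrib.items())
        instructions.append(f"{step.tag.lower()}({args})")
    return instructions


for line in compile_steps(SYNTHESIS_XML):
    print(line)  # e.g. add(vessel='reactor', reagent='NaOH', volume='10 mL')
```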
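Finally, the derivation-based provenance of this work can be illustrated with rdflib in the triples sketch below: each workflow step becomes a “derivation” node whose input, output, and responsible agent are annotated as triples, and chaining derivations yields the directed acyclic graph of a DMTA cycle. The namespace and property names are illustrative placeholders, not the paper’s actual ontology IRIs.

```python
from rdflib import Graph, Namespace, RDF

# Illustrative namespace; the paper's actual ontology IRIs differ.
EX = Namespace("https://example.org/kg/")

g = Graph()
g.bind("ex", EX)

# One 'derivation' node representing a single step of a DMTA cycle,
# annotated with its input, the agent executing it, and its output.
derivation = EX["derivation/hplc-analysis-1"]
g.add((derivation, RDF.type, EX.Derivation))
g.add((derivation, EX.isDerivedFrom, EX["data/reaction-outcome-1"]))  # input
g.add((derivation, EX.isDerivedUsing, EX["agent/HPLCAgent"]))  # agent
g.add((EX["data/yield-1"], EX.belongsTo, derivation))  # output

# Chaining derivations (the output of one step feeding the next) yields
# the directed acyclic graph that encodes a full DMTA cycle.
print(g.serialize(format="turtle"))
```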