We would like to understand the community’s needs and preferences regarding metadata for protocols and methods. Please share your thoughts and opinions.
It is important to note that differences in methods can be a significant factor contributing to discrepancies in measurements. Therefore, a well-designed Open Data Model (ODM) should facilitate a clear understanding of these methods.
Currently, the ODM approach can be seen as a middle ground between protocols.io and PAML. While protocols.io is increasingly popular in our field, it lacks structured metadata. PAML, on the other hand, offers robust support for structured data, but its complexity can be challenging for users.
There are a few other approaches to machine-actionable protocols that are noteworthy. Please add to the list.
LabOP. I believe PAML changed its name to LabOP? See http://biorxiv.org/lookup/doi/10.1101/2022.07.05.498808 or https://dl.acm.org/doi/10.1145/3604568. However, the LabOP paper was only published in June 2023, whereas the similar PAML paper was published in January 2022. The code is on GitHub at Bioprotocols/labop: Laboratory Open Protocol (LabOP) Language.
Autoprotocol: PAML/LabOP appears to build on Autoprotocol, which has the same objectives as what the environmental epidemiology and surveillance community needs. Autoprotocol is a high-level, open-source, JSON-based language developed by Transcriptic (now Strateos). The focus seems to be on cytometry, but there is a general metadata approach. See the Autoprotocol specification.
SBOL (Synthetic Biology Open Language): While SBOL is primarily used to describe genetic parts, it has extensions like PAML that describe protocols in a standard, machine-readable format; PAML is part of SBOL. https://sbolstandard.org
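To make "machine-actionable" concrete, here is a rough sketch of a protocol written as JSON-style data, loosely in the spirit of a JSON-based protocol language. Every field name and value below (`op`, `speed_g`, `pore_size_um`, the kit name) is a made-up assumption for illustration and is not taken from the Autoprotocol, LabOP, or SBOL specifications.

```python
import json

# Hypothetical protocol: all field names and values are illustrative only.
protocol = {
    "name": "example-concentration",
    "steps": [
        {"op": "centrifuge", "speed_g": 4000, "duration_min": 30},
        {"op": "filter", "pore_size_um": 0.45},
        {"op": "extract", "kit": "example-kit"},
    ],
}


def steps_using(protocol, op):
    """Return every step whose operation matches `op`."""
    return [step for step in protocol["steps"] if step["op"] == op]


# Round-trip through JSON text to show the protocol survives as plain data.
serialized = json.dumps(protocol, indent=2)
restored = json.loads(serialized)

print(steps_using(restored, "centrifuge"))
```

Because each step is data rather than a sentence, software can search, validate, or compare steps without having to parse prose.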
There are also a few electronic lab notebooks, most of them open source, but without much metadata or metadata structure:
- SciNote: SciNote is an open-source electronic lab notebook (ELN) that can be used to manage, store, share, and execute protocols. It includes features for data and metadata management, and its digital format may be more easily translated into a machine-readable format.
- eLabFTW: eLabFTW is another open-source electronic lab notebook (ELN) that includes support for protocols. It’s designed with a strong focus on data integrity, traceability, and security.
- Benchling: Benchling provides a suite of applications for life sciences, including an ELN that supports detailed, step-by-step protocols. Protocols in Benchling can include detailed metadata and be linked to other entities in the Benchling system, like samples or results.
- Jupyter Notebook: While not a laboratory-specific tool, Jupyter notebooks are widely used in data-intensive fields to create and share documents that contain both code (e.g., for data analysis or visualization) and narrative text (e.g., to provide context, interpretation, or reporting). In a lab setting, a Jupyter notebook could be used to document a protocol in a way that includes executable code alongside the text description of the protocol.
There are also languages and software, many of them open source, that integrate lab equipment. These systems have metadata, too.
- Antha: Antha, developed by Synthace, is an open-source language and software platform designed for specifying and executing complex biological workflows. Antha enables the integration of lab equipment from multiple vendors, allowing it to perform automated experiments as per the defined protocols.
- Aquarium: This system, developed by the Klavins Lab at the University of Washington, includes both a laboratory operating system and a language (Krill) that allows you to specify complex protocols that integrate manual and automated steps.
- SiLA (Standardization in Lab Automation): SiLA develops open communication and data standards for the rapid integration of lab equipment and data management, enabling networked labs and end-to-end automation of workflows.
- Opentrons Protocol Designer: Opentrons, a company that makes affordable lab robots, offers a visual protocol designer that lets you design and run experiments without needing to write code.
- LabVIEW: LabVIEW is system design software that provides tools for scientists to visualize, create, and code engineering systems and ensure the interoperability of devices.
Similarly, there is a growing number of open-source laboratory information management systems (LIMS) comparable to the tools above.
I’ve finally had a chance to review some of these options - wow, intense.
I think protocols.io is great, but it runs up against the problem highlighted in the specifications for a lot of these other options: it stores information in natural language, as free text. This makes it really hard to store or edit the information in the ways we might like to.
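As a rough illustration of the free-text problem, the sketch below compares the same kind of step written as prose and as structured fields. The sentences, field names, and values are all hypothetical and are not drawn from protocols.io or any real protocol.

```python
# The same step described twice in free text: a string comparison can only
# say the two descriptions differ, not whether the method differs.
free_text_a = "Spin the sample at 4000 g for 30 minutes."
free_text_b = "Centrifuge 30 min, 4,000 x g."

# Two structured versions of a step (hypothetical field names and values).
structured_a = {"op": "centrifuge", "speed_g": 4000, "duration_min": 30}
structured_b = {"op": "centrifuge", "speed_g": 4000, "duration_min": 20}


def diff_fields(a, b):
    """Map each field whose value differs to its (a, b) pair of values."""
    keys = sorted(set(a) | set(b))
    return {k: (a.get(k), b.get(k)) for k in keys if a.get(k) != b.get(k)}


print(free_text_a == free_text_b)  # False, although both describe the same step
print(diff_fields(structured_a, structured_b))  # {'duration_min': (30, 20)}
```

With structured fields, the one parameter that actually differs between two labs' methods can be pinpointed automatically.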
On the flip side, PAML/LabOP is super complicated. It actually makes me feel better about the complexity of our protocols recording system, because it truly is in a league of its own. It seems really brilliant, but it requires knowledge of several ontologies and markup languages to work. Another weakness I identify (besides the complexity and the difficulty of getting started) is that the protocols have to be stored as open-but-proprietary LabOP files. This can be a barrier, since it’s hard to link or store this data in the same place as one stores measurement data. Or so it seems to me, at least.
LabOP builds on Autoprotocol and the Synthetic Biology Open Language (SBOL), which also seem very smart and robust but share the same weaknesses: they are complex, they require some knowledge of coding, and they need to be stored as data files separate from the rest of your data and in a different format. This cements for me that one of the major assets of our approach to protocols is that the protocol information is stored in the same place and format as the measures, samples, and metadata information.
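A minimal sketch of why keeping protocols alongside measures is convenient: when both are plain rows that share an identifier, attaching the method to each measurement is a simple lookup. The table contents and column names here (`protocolID`, `concentrationMethod`, `measureID`, `value`) are assumptions for illustration, not the actual ODM schema.

```python
# Hypothetical rows; column names are illustrative, not the real ODM ones.
protocols = [
    {"protocolID": "p1", "concentrationMethod": "centrifugation"},
    {"protocolID": "p2", "concentrationMethod": "filtration"},
]
measures = [
    {"measureID": "m1", "protocolID": "p1", "value": 120.0},
    {"measureID": "m2", "protocolID": "p2", "value": 95.0},
    {"measureID": "m3", "protocolID": "p1", "value": 130.0},
]


def measures_with_methods(measures, protocols):
    """Attach each measure's protocol fields by matching on protocolID."""
    by_id = {p["protocolID"]: p for p in protocols}
    return [{**m, **by_id[m["protocolID"]]} for m in measures]


joined = measures_with_methods(measures, protocols)
for row in joined:
    print(row["measureID"], row["concentrationMethod"], row["value"])
```

If the protocol lived in a separate file format, this join would first need a conversion step before the method could be compared against the measurements.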
I think a lot of the principles, points, and ideas behind these approaches are great, and they align really well with what we’re already doing and have already done. A lot of the problems they identify and seek to overcome are ones we’ve also had to come up against. After reviewing these, I actually feel better about our work, and I think we do have a strong place and niche within the landscape of protocol data storage.