@martinwellman has been working hard on harnessing some of the existing infrastructure for linkml and their “linkml-map” library to do mapping from ODM to and from other data models. It has been quite successful, and has required some creative coding.
Right now there is an issue in mapping between version 1 and version 2 where certain IDs are not found in the same tables, and so mapping them to the final ID can be troublesome.
For example:
- V2 Instruments table. PKs/IDs are:
  - instrumentID
  - datasetID
  - contactID
  - organizationID
- V1 Instruments table. PKs/IDs are:
  - instrumentID
In version 1, samples and measures have an instrumentID as a header. In version 2 they don’t: the instrument can only be reached via protocolSteps. So to link the version 1 instrument ID to samples and measures, we’d need to follow instruments → protocolSteps → protocolRelationships → protocols → samples or measures.
Except there are no rules at present for how to populate protocolRelationships when mapping from ODM v1 to v2.
This is because assayMethods (the v1 counterpart of protocols) didn’t use this same structure, and protocolRelationships are quite specific. I can think of two ways to potentially approach this though:
- We set a rough standard of what assay methods in version 1 would follow one another and then enforce that sort of relationship upon data that is mapped into version 2. This would create a rough road map of things and conform to a V2 structure, but this enforced mapping might not be true in all cases, and so could potentially introduce erroneous data.
- We create a new relationship ID called “not-reported”. This allows us to link things together without necessarily creating a more detailed relationship structure where there isn’t one. It also means a blank protocol step could be linked as “not-reported” and then be linked to protocols, and then to measures and samples.
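To make the second option a bit more concrete, here is a rough Python sketch of how it might look. It uses plain dicts rather than linkml-map, and every column name and ID format in it (`stepID`, `protocol-<instrumentID>`, the shape of the protocolRelationships rows) is a simplified assumption for illustration, not the actual ODM v2 schema. It also assumes every v1 row carries an instrumentID.

```python
NOT_REPORTED = "not-reported"  # the proposed placeholder relationshipID


def map_v1_rows(v1_rows):
    """Build v2-style placeholder tables from v1 sample/measure rows.

    All table and column shapes here are hypothetical simplifications.
    """
    protocols, steps, relationships, v2_rows = {}, {}, [], []
    for row in v1_rows:
        instrument_id = row["instrumentID"]
        # One placeholder protocol and one blank step per v1 instrument.
        protocol_id = f"protocol-{instrument_id}"
        step_id = f"step-{instrument_id}"
        if protocol_id not in protocols:
            protocols[protocol_id] = {"protocolID": protocol_id}
            steps[step_id] = {"stepID": step_id, "instrumentID": instrument_id}
            relationships.append({
                "protocolID": protocol_id,
                "stepID": step_id,
                "relationshipID": NOT_REPORTED,
            })
        # The v2 row drops instrumentID and points at the protocol instead.
        v2_row = {k: v for k, v in row.items() if k != "instrumentID"}
        v2_row["protocolID"] = protocol_id
        v2_rows.append(v2_row)
    return {
        "protocols": protocols,
        "protocolSteps": steps,
        "protocolRelationships": relationships,
        "rows": v2_rows,
    }


def instruments_for(row, tables):
    """Walk row -> protocol -> protocolRelationships -> protocolSteps -> instrument."""
    step_ids = [
        rel["stepID"]
        for rel in tables["protocolRelationships"]
        if rel["protocolID"] == row["protocolID"]
    ]
    return sorted({tables["protocolSteps"][s]["instrumentID"] for s in step_ids})


# Made-up v1 measures, purely for illustration.
v1_measures = [
    {"measureID": "m1", "instrumentID": "inst-A"},
    {"measureID": "m2", "instrumentID": "inst-A"},
    {"measureID": "m3", "instrumentID": "inst-B"},
]
tables = map_v1_rows(v1_measures)
for measure in tables["rows"]:
    print(measure["measureID"], instruments_for(measure, tables))
```

The point of the sketch is that the instrument stays reachable from each measure through the chain, while the “not-reported” value makes it explicit that no real step ordering was recorded in the v1 data.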
I’m not sure if this is fully clear to folks, but I would be keen to hear people’s thoughts. @dmanuel @jeandavidt @Sorin