Hierarchy for enumerations

In ODM v2/v3 there is currently no hierarchy for the values that a categorical/enum slot can take on. Below is an example of a hierarchy for the enum values in the “collection device” slot in PHA4GE, which is categorical:

Grab sampler
      Core sampling device
      Vacuum sludge sampling device
      Cone-shaped sampling device
      Horizontal grab sampling device
      Vertical grab sampling device
Composite sampler
      Passive (trap) sampler
            Moore swab
      Automatic composite sampler
            Automatic flow-proportional sampler
            Automatic sequential (time-proportional) sampler
Bag filtration device

For data entry, some people enter the full hierarchy for certain slots (we discussed this with the PHA4GE team, who said that this is sometimes done). For example, instead of entering just ‘Moore swab’, someone might enter three values to specify the full hierarchy: [‘Composite sampler’, ‘Passive (trap) sampler’, ‘Moore swab’]. Since ODM only allows single-valued slots, not multivalued ones, we would typically want to convert this to a single value by removing the values higher up in the hierarchy, leaving only ‘Moore swab’.

For the PHES-ODM-Mapper I’ve written code to only keep the deepest enum values in the hierarchy. In other words, if any value in the list of values for the slot is a parent of one of the other values, then it gets removed (in the above case we just keep ‘Moore swab’ from the list of 3 values).
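To make the rule concrete, here is a minimal sketch of the idea. It is not the actual PHES-ODM-Mapper code; the `PARENT` map and the names `ancestors` and `keep_deepest` are made up for illustration. It also assumes that values should be dropped when they are any ancestor of another value in the list, not only a direct parent.

```python
# Hypothetical child -> parent map for part of the collection device hierarchy.
PARENT = {
    "Passive (trap) sampler": "Composite sampler",
    "Moore swab": "Passive (trap) sampler",
    "Automatic composite sampler": "Composite sampler",
    "Core sampling device": "Grab sampler",
}


def ancestors(value, parent=PARENT):
    """Return the set of all ancestors of `value` (parent, grandparent, ...)."""
    found = set()
    current = parent.get(value)
    # The `not in found` check guards against accidental cycles in the map.
    while current is not None and current not in found:
        found.add(current)
        current = parent.get(current)
    return found


def keep_deepest(values, parent=PARENT):
    """Drop every value that is an ancestor of another value in the list."""
    redundant = set()
    for value in values:
        redundant |= ancestors(value, parent)
    # Preserve the original order of the remaining values.
    return [value for value in values if value not in redundant]


print(keep_deepest(["Composite sampler", "Passive (trap) sampler", "Moore swab"]))
```

Values from unrelated branches are all kept, so a list can still end up with more than one value; how to handle that case for a single-valued slot is a separate decision.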

There might be other reasons to include a hierarchy. For example, during data entry a hierarchy might make it easier for the user to select a value from a pick-list.

Having a hierarchy is supported in LinkML schemas. Each enum value can specify an is_a attribute that names another value, which marks it as a child of that value. For example, Moore swab would have an is_a value of Passive (trap) sampler, and Passive (trap) sampler would have an is_a value of Composite sampler.
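As a rough illustration of how the is_a links could be used, the sketch below walks them upward to rebuild the full path for a value (for example, to display in a pick-list). The dictionary is only meant to mirror the general shape of an enum's permissible values with an is_a key; it is a hand-written subset, not taken from an actual schema, and `full_path` is a hypothetical helper rather than a LinkML function.

```python
# Hand-written subset shaped like an enum's permissible values,
# where each value may name its parent through "is_a".
PERMISSIBLE_VALUES = {
    "Composite sampler": {},
    "Passive (trap) sampler": {"is_a": "Composite sampler"},
    "Moore swab": {"is_a": "Passive (trap) sampler"},
}


def full_path(value, permissible_values=PERMISSIBLE_VALUES):
    """Follow is_a links upward and return the path from the root to `value`."""
    path = [value]
    seen = {value}
    parent = permissible_values[value].get("is_a")
    # Stop at the root (no is_a) or if a cycle is detected.
    while parent is not None and parent not in seen:
        path.append(parent)
        seen.add(parent)
        parent = permissible_values[parent].get("is_a")
    return list(reversed(path))


print(full_path("Moore swab"))
```

Because each value stores only its direct parent, the hierarchy can be deepened or rearranged by editing individual values without touching the rest of the enum.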

Adding a hierarchy to ODM should not break anything. I think it would be useful to discuss if we should include this type of hierarchy in ODM.

Martin