Can we constrain required (HSDS) attributes in JSON Schema?

@mrshll I’m thinking this is an easy answer, but wanted to get the answer from the lips of the master in case there are peculiarities with our implementation of HSDS in JSON Schema.

Connect 211 is currently storing priority information about a variety of entities in HSDS. For example, the priority in which phone numbers for a given service should be called. We extended several of our tables with a priority column, but are unwinding that in favor of using the attribute object more as it gives us greater flexibility for other types of attributes as well.

My question is this: in a case where we make a profile for internal use, can I require that certain attributes must exist, and/or that if they exist they must contain enumerated values? Can this be done:

  1. In JSON Schema writ large, and
  2. In the validator as it currently works

Thanks!

Sorry for long reply time on this; every time I’ve sat down to respond something has interrupted me!

@mrshll I’m thinking this is an easy answer, but wanted to get the answer from the lips of the master in case there are peculiarities with our implementation of HSDS in JSON Schema.

You humble me!

Headline answer: I think this is possible but it starts drifting away from how attributes are designed to work and I would caution against using attribute in such a way.

My canonical data modelling advice will always be if you’re trying to enforce that a particular key-value pairing exists, that pairing is a candidate for modelling explicitly… which is sort of what you were doing before.

However, you asked to do it in a certain way, and I think it’s possible.

Attribute is a funny one in HSDS, I think. In fact I’d be really keen on seeing how you’re using it, because I think its schema doesn’t make it obvious as to what it’s supposed to be supporting.

Assuming you’re using them to describe key-value pairs for priority, and not delving deep into the use of linking taxonomies to attributes, I think you might be using them a bit like this:

(a phone)


{
  "id": "6f6401da-c9f3-11f0-8d75-2f7fceba557b",
  "number": "+44 1234 234567",
  "attributes": [
    {
      "id": "aaacaa58-c9f3-11f0-a18f-abba9a026175",
      "link_id": "6f6401da-c9f3-11f0-8d75-2f7fceba557b",
      "link_entity": "phone",
      "link_type": "priority",
      "value": "1"
    }
  ]
}

Your specific questions were:

can I require that certain attributes must exist, and/or that if they exist they must contain enumerated values? Can this be done [with JSON Schema and/or the validator]?

The current ORUK validator probably wouldn’t enforce this due to the hard-coded profiles; however Jeff’s iterations will be using the schemas (and Profile schemas) directly. So this is mostly a schema thing unless you want to add a data quality step to any checking you do.

My interpretation of the challenge is that you’d like to enforce for certain objects:

  • that e.g. phone.attributes contains an item where attribute.link_type is set to “priority”
  • wherever attribute.link_type is set to “priority”, the attribute.value property is set to a value from an enum
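If you did want a separate data quality step, the two rules above could be sketched in plain Python along these lines. This is only an illustration, not part of the ORUK validator: the function name priority_problems and the ALLOWED_PRIORITIES set are hypothetical, the allowed values are placeholders for whatever enum your profile settles on, and the field names link_type and value are taken from the phone example above.

```python
# Hypothetical data-quality check; not part of the ORUK validator.
# ALLOWED_PRIORITIES is a placeholder for the real enum in your profile.
ALLOWED_PRIORITIES = {"0", "1", "2"}


def priority_problems(phone: dict) -> list:
    """Return a list of human-readable problems with a phone's priority attributes."""
    problems = []
    attributes = phone.get("attributes", [])
    priority_attrs = [a for a in attributes if a.get("link_type") == "priority"]

    # Rule 1: at least one attribute is tagged as a priority.
    if not priority_attrs:
        problems.append("no attribute with link_type 'priority'")

    # Rule 2: every priority attribute carries an allowed value.
    for attr in priority_attrs:
        if attr.get("value") not in ALLOWED_PRIORITIES:
            problems.append(f"priority value {attr.get('value')!r} is not allowed")

    return problems
```

An empty list means the phone satisfies both rules; anything else is a list of messages you could log or report.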

This should be possible with the contains keyword in JSON Schema (source), along with maxContains and minContains (source).

contains is a little counter-intuitive but seems to be what you want here. It enforces that a certain number of items in an array (default: at least one) match a given sub-schema, which is defined in the contains block. All other array items aren’t subject to that sub-schema, but every item is still subject to the validation requirements given in the items property of the array definition.

So the job becomes to define a sub-schema which will encapsulate your requirements. Written as you’d write it in a Profile, it’d look something like this:

(profile/phone.json)

{
  "properties": {
    "attributes": {
      "contains": {
        "type": "object",
        "properties": {
          "link_type": {
            "const": "priority"
          },
          "value": {
            "enum": [
              "0",
              "1",
              "2",
              "…"
            ]
          }
        },
        "required": [
          "link_type",
          "value"
        ]
      },
      "minContains": 1,
      "maxContains": 1
    }
  }
}

Because the Profile system does JSON Merge Patching, I think you can omit things that you’re not overriding, but we need to be aware that every item in the attributes array also needs to be an instance of attribute.json.

What we’re also saying here is that exactly one of those instances of attribute.json must also match this other schema we’re defining here. We define that a link_type property must have a value of priority and that the value of value must be drawn from the enum defined (I think attribute.value is a String, so you’d get a conflict if you tried to override it as an integer). We also enforce that link_type and value are both present.

minContains overrides the default “at least one” behaviour of the contains block. Here I’ve set it explicitly to 1 to reduce ambiguity, but technically it can be omitted. Setting maxContains to 1 as well enforces that at most one item matches the new schema. That means you can’t have two instances where attribute.link_type is set to priority and attribute.value is drawn from the enum.
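To make the counting behaviour concrete, here is a rough plain-Python imitation of what contains, minContains and maxContains do with the sub-schema above. It is a sketch for reasoning about the rules, not a JSON Schema validator: matches_priority and contains_ok are hypothetical names, and ENUM is a placeholder for the real list of values.

```python
from typing import Optional

# Placeholder enum; the real list of allowed values is up to the profile.
ENUM = ["0", "1", "2"]


def matches_priority(item) -> bool:
    """Mimic the contains sub-schema: an object with both keys present,
    link_type equal to "priority" and value drawn from ENUM."""
    return (
        isinstance(item, dict)
        and "link_type" in item
        and "value" in item
        and item["link_type"] == "priority"
        and item["value"] in ENUM
    )


def contains_ok(items: list, min_contains: int = 1,
                max_contains: Optional[int] = None) -> bool:
    """Count the matching items and compare the count against the bounds."""
    count = sum(1 for item in items if matches_priority(item))
    if count < min_contains:
        return False
    return max_contains is None or count <= max_contains


one = {"link_type": "priority", "value": "1"}
two = {"link_type": "priority", "value": "2"}
other = {"link_type": "language", "value": "en"}
odd = {"link_type": "priority", "value": "banana"}

print(contains_ok([one, other], 1, 1))  # exactly one match
print(contains_ok([other], 1, 1))       # no match at all
print(contains_ok([one, two], 1, 1))    # two matches
print(contains_ok([one, odd], 1, 1))    # "odd" is not counted as a match
```

The last case is worth a look: an attribute with link_type priority but a value outside the enum simply doesn’t match the sub-schema, so it isn’t counted and the array still passes alongside one valid priority attribute. If the stricter reading matters (every priority attribute must use the enum), that would need an extra rule or a separate data quality check.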