How far do we expect HSDS resources to be standalone and re-usable?

At Open Data Services we work on a number of standards, such as 360Giving and Open Contracting.

Most standards that we work on try to hold to the principle that the standard can operate separately from the organisation or community that created it. This has partly been a defensive measure - most of those organisations started life with limited financial means - and partly a way to avoid lock-in, in the spirit of true openness.

One effect is that we’ve tried to make sure that anything we create can be used entirely independently. Our go-to schema language is JSON Schema, which can be validated using almost any programming language you’re likely to see. When we need to introduce new features that JSON Schema doesn’t support, we patch the meta-schema.

As a result, you can take the schema and your data, run them through your preferred JSON Schema validation tool, and get basically the same results as the “official” validator. You’ve also got documentation of all the non-standard features, so you can extend your validator to support them.
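To make that concrete, here is a minimal sketch of what "bring your own validator" might look like in Python, using the third-party `jsonschema` package. The schema is a made-up, heavily simplified stand-in (not the real HSDS, 360Giving or Open Contracting schema), and `find_errors` is a hypothetical helper name.

```python
from jsonschema import Draft4Validator

# Hypothetical, heavily simplified schema for illustration only.
# A real standard would publish its own schema file to load instead.
SCHEMA = {
    "type": "object",
    "required": ["id", "name"],
    "properties": {
        "id": {"type": "string"},
        "name": {"type": "string"},
    },
}


def find_errors(record):
    """Return a sorted list of 'path: message' strings, empty if valid."""
    validator = Draft4Validator(SCHEMA)
    return sorted(
        "/".join(str(part) for part in error.absolute_path) + ": " + error.message
        for error in validator.iter_errors(record)
    )


if __name__ == "__main__":
    # One wrong type ("id") and one missing required field ("name").
    for line in find_errors({"id": 1}):
        print(line)
```

A CI job could run something like this against a feed and fail the build if the list is non-empty. The same schema file would work just as well with a validator written in any other language.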

Although the OR-UK validator is well-documented (and maybe open-source?), it doesn’t give me anything that I can just plug into - for example - my CI system to make sure that my pull request doesn’t break the feed’s compliance with OR-UK. Even if I run it myself, I still have to maintain it - and it might not be in a language that I understand.

This approach does introduce new constraints, however: we have to be much clearer about what is considered “correct”, and resolve any ambiguity early on. This is often welcome, but it does make changes harder. It also means that we’re limited by our chosen schema language: some of our standards still use JSON Schema draft-04, which doesn’t have a date format, so there’s a horrible regex for one field.
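As a rough illustration of the regex workaround, here is a sketch in Python. The pattern is one I've made up for this post, not the one used in any of our standards, and the function names are hypothetical. It shows why a pattern is a poor substitute for a real date check: it can enforce the shape of a date, but not whether the date exists.

```python
import re
from datetime import date

# Illustrative pattern only. JSON Schema patterns are not anchored
# implicitly, so the ^ and $ have to be written out.
DATE_PATTERN = r"^[0-9]{4}-(0[1-9]|1[0-2])-(0[1-9]|[12][0-9]|3[01])$"

# How a date field might be declared in a draft-04 schema (hypothetical field).
DATE_FIELD_SCHEMA = {"type": "string", "pattern": DATE_PATTERN}


def matches_pattern(value):
    """Check only the shape of the string, as a schema 'pattern' would."""
    return re.search(DATE_PATTERN, value) is not None


def is_real_date(value):
    """Check the shape and then that the date exists in the calendar."""
    if not matches_pattern(value):
        return False
    try:
        date.fromisoformat(value)  # requires Python 3.7+
    except ValueError:
        return False
    return True
```

With this sketch, `2024-02-30` passes the pattern but fails the calendar check, which is the sort of gap that has to be documented separately when the schema language can't express it.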

So my question for the OR community is this: how much does this matter? Do we want to have this sort of portability? Is it worth the cost?


Good questions and I hope others will respond.

The ORUK validator is open source with code here. The checks it performs are documented here.

My views are:

  • The core HSDS and any application profile derived from it should be in the tabular data format, so it can be validated with any generic validation tool

  • Open Referral should provide an online validator that checks an open API endpoint for validity against a given HSDS version and application profile (these might be identified in the API feed itself) as defined by the tabular data format. I think this is essential rather than relying on people to run their own validators. Such a validator may add extra checks such as whether particular query parameters are supported and how rich the data is.