Hi folks,
As indicated in the Technical Working Group meeting, I’ve recently completed a review of the HSDS documentation, which has resulted in reorganising and expanding the existing Human Services Data Spec and HSDS Implementation Guidance sections. The first draft is currently staged on a branch of the docs available here. There is an accompanying Pull Request here.
We are seeking feedback from the community to ensure that this represents what the community wants from the HSDS Documentation. We won’t merge anything in until we’ve resolved any blocking issues you raise, or have agreed that something is “safe enough” for now and can be revisited in a subsequent update.
The review was originally instigated to respond to several disparate issues raised by the community, such as explaining 1:1 and 1:many relationships between tables, table formatting in the schedules section, clarifying the foreign key fields, and consolidating the advice on Profiles and “Extending HSDS” which emerged from the recent Profiles Documentation.
Tackling these in isolation revealed underlying issues with the structures of the docs for 3.0. Most notably, there was a lot of guidance which was now invalid because it referenced tables which no longer exist or are handled differently. A lot of the language across the docs also referred to “tables”, whereas JSON is now the canonical format of HSDS, with Tabular Data Packages being a supported serialization.
In addition to this, there was a lot about HSDS which didn’t appear to be written down. This made it difficult to frame certain things: how can we explain what “Extending HSDS” means without defining what it means to be conformant with HSDS in the first place? In order to address this, I formalised what I believed to be our shared implicit understanding of the way HSDS works. This is obviously open to correction! One of the benefits of writing these down is we get to question our assumptions and refine them.
As part of this, I also took the opportunity to refine some of our examples to ensure that they are presented in the canonical JSON form of HSDS. For most of these, a Tabular Data Package example is provided as well so that we continue to support people using this serialization.
The result at this stage is the start of what I hope becomes a more comprehensive set of technical documentation for HSDS and its community. As noted, there is plenty of room for further adjustment based on your suggestions, feedback, and concerns if you have them.
And just to be explicit: very little content was dropped entirely in this review. Essentially, the only thing which was omitted was the Tables to Fields Transformation section, because it didn’t seem to fit the model of HSDS 3.0 at all and thus shouldn’t be encouraged. There were a few repeated instances of “Sharing with the community”, which have been consolidated. If the community feel strongly to the contrary, I hope we can engage productively on understanding the mechanisms behind this feature and refactoring the guidance to fit alongside HSDS 3.0.
A summary of changes encapsulated in the PR:
- There is now a clearly defined normative reference section which consolidates the reference material for HSDS.
- Language has been brought in line with the HSDS schemas. Rather than “table”, we say “object” now when discussing the canonical HSDS models. In some cases, we still refer to table when making a comparison with or describing the tabular representations of HSDS.
- Content such as the “Logical Model” has been formalised as part of the overview and model now contained in the reference section.
- Assuming that HSDS and the community are still happy with the language of “core tables”, there are now sub-sections in the Schema Reference page for “Core Objects” and “Other Objects” to make this consistent with language used in the ERD and models.
- Identifiers guidance has been formalised as part of the reference section, and greatly expanded to provide more insight as to how identifiers should be used in HSDS.
- Expanded documentation regarding the `Page` schema provided by the API documentation.
- Where possible, there’s linking out to appropriate external documentation, most notably various IETF RFCs. This is to reduce ambiguity.
- Where appropriate, introduced some language from RFC 2119 in some normative documentation. Because of community feedback from previous engagement, this has been limited in scope to where I felt it was strictly necessary to provide an unambiguous reference.
- Defined what it means to be conformant to HSDS by providing a conformance page, outlining the high-level conformance rules for HSDS.
- Formalised a Profile specification, meaning that it can be de-coupled from the current implementation via HSDS Schema Tools + the example profile repository should the community wish to replace these or provide alternative implementations. This also enables us to define conformance to HSDS in terms of a Profile.
- Removed the UK Compliance page and placed a link to the OR UK Profile docs in the Known Profiles section of “Using Profiles”. (Note: I am keen to hear where people actually want to have links to Profiles so they’re not hidden away!)
- Removed the references to UK Compliance in the API Reference page.
- Removed the admonitions in the API Reference page and replaced them with explicit REQUIRED and OPTIONAL language derived from RFC 2119.
- Expanded the JSON section of the Publication Formats page to state that JSON should be de-referenced where possible.
- Refactored the Publication Formats page as a reference material using the language of serialization. This enables us to explicitly state the canonical form of HSDS 3.0 is JSON but that the Tabular Data Package format is a supported serialization.
- Refactored the Publication Guidance page by splitting it up. There’s now an explicit “Mapping Data to HSDS” page containing updated guidance on mapping data sources to HSDS, as well as updated guidance on “Extending HSDS” in friendly terms which builds on the rules set out in the Conformance section.
- Removed the “Profiles, Variations, and Interoperability” page and refactored the content. Removed the Tables to fields guidance, formalised the definition of a Profile in the Profiles Reference, and moved the guidance to a new “Using Profiles” guidance page.
- Refactored the guidance on “Schedules”, “Classifications and Taxonomies”, and “Names and Descriptions” to update it in line with HSDS 3.0 and merged the guidance into a new top-level “Field and Object Guidance” page.
- Worked examples on all of these were brought in line with HSDS 3.0, and are now provided as both JSON examples as well as Tabular Data Package examples.
- As part of the above, fixed the errors with the tables being way too long.
- Under the hood, made it easier to manage worked examples by storing them in a `figures` directory inside the docs directory. This means that there are no more nasty inline markdown tables you have to edit in Vim; you can define the examples in individual JSON and CSV files and then import them into the docs.
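
For anyone wondering how the JSON and Tabular Data Package versions of a worked example relate to each other, here is a minimal Python sketch. It is illustrative only: the sample data, the field names, and the `flatten_service` and `to_csv` helpers are assumptions made up for this post, not taken from the schemas, the docs branch, or HSDS Schema Tools. The idea is simply that a nested object in the JSON form becomes a separate table in the tabular form, linked back by a foreign key field.

```python
import csv
import io
import json

# Hypothetical worked example in nested JSON form.
# The field names here are illustrative assumptions, not a normative example.
SERVICE_JSON = """
{
  "id": "svc-1",
  "name": "Community Food Bank",
  "organization": {"id": "org-1", "name": "Example Org"}
}
"""


def flatten_service(service):
    """Split a nested service object into rows for two tables.

    The nested organization becomes its own row, and the service row
    keeps a foreign key (organization_id) pointing at it.
    """
    organization = service["organization"]
    service_row = {
        "id": service["id"],
        "name": service["name"],
        "organization_id": organization["id"],
    }
    organization_row = {
        "id": organization["id"],
        "name": organization["name"],
    }
    return {"service": [service_row], "organization": [organization_row]}


def to_csv(rows):
    """Serialise a list of row dicts to CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=list(rows[0].keys()), lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


tables = flatten_service(json.loads(SERVICE_JSON))
service_csv = to_csv(tables["service"])
organization_csv = to_csv(tables["organization"])

print(service_csv)
print(organization_csv)
```

In a layout like the one described above, the JSON text and each CSV output would live as separate files alongside each other, so the two forms of an example can be checked against one another rather than maintained by hand.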