First draft of HSDS 3.0 proposal

Hi folks – @davidraznick has shared a first draft of the HSDS 3.0 proposal here in this document that outlines major changes, with schemas here: GitHub - openreferral/specification at 3.0-dev

We’re eager to hear your initial feedback. The workgroup will meet on Monday to discuss any remaining significant issues before this is formally staged and presented for a formal Request for Comment period. In the meantime, informally, we request your comments!

~greg

Hi all,

  1. I agree with Devin’s comment about the API supporting only one direction of request.
    It is easy to imagine other requests: an organization with a list of its services, or the services available within a radius of an address (which requires searching address, then locations, then services_at_locations), etc.

  2. The complete search results schema, with 53 related datasets (page 2 of the doc), looks like a nightmare for database querying and performance optimization.
    Previously we discussed limiting the completeness of results with an input parameter that specifies the desired output structure. Is that still under consideration?

  3. It seems to me that metadata is returned by default in the results of any query. Historical data will be larger in volume than current data but will be needed only rarely. Does it make sense to establish separate nodes/schemas for it?
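
To make points 2 and 3 concrete, here is a minimal sketch of how a request parameter could limit how much of the nested structure is returned. The `include` and `with_metadata` parameters, the function name, and the sample record are all hypothetical and are not part of the HSDS proposal.

```python
from typing import Any

# Hypothetical fully nested service record; field names are illustrative only.
SERVICE = {
    "id": "svc-1",
    "name": "Food Pantry",
    "organization": {"id": "org-1", "name": "Example Org"},
    "service_at_locations": [{"id": "sal-1", "location": {"id": "loc-1"}}],
    "metadata": [{"id": "m-1", "field_name": "name", "previous_value": "Pantry"}],
}


def shape_response(
    record: dict[str, Any], include: set[str], with_metadata: bool = False
) -> dict[str, Any]:
    """Keep scalar fields, plus only the nested relations the caller asked for."""
    out: dict[str, Any] = {}
    for key, value in record.items():
        if key == "metadata":
            # History data is only returned when explicitly requested.
            if with_metadata:
                out[key] = value
        elif not isinstance(value, (dict, list)) or key in include:
            out[key] = value
    return out


# A caller that only wants the owning organization alongside the service.
slim = shape_response(SERVICE, include={"organization"})
```

With this approach the default response stays small, and the 53-table worst case only applies to callers who ask for everything.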

Thank you

Thanks for the feedback.

The main point of the proposal was to convey the compromises involved with using JSON as the primary representation.
The main concern (and also benefit) is that the standard now has to define one or more directions/perspectives on the data that publishers who choose the JSON output should conform to (datapackages will still be supported).

The direction in the proposal was a first attempt at a single representation. It is illustrative of the compromises involved, as any single representation will have some.

The standard could allow publishers to publish ANY JSON representation as long as it conforms to the structure of the relational schema. However, that is not great for interoperability. Nonetheless, the standard will not stop any publisher from publishing extra perspectives.

The bigger question is:

To be conformant to HSDS, what representation are publishers required to publish?

The 3 main representations that cover most use cases seem to be:

  • Service
  • Service at Location
  • Organization with service list

The standard needs to define whether a publisher needs all, or just one of these, to be conformant. My personal opinion is that “Service” should be required and the other two optional; however, “Service at Location” would likely be the most useful to most users (my opinion again).
The organization perspective seems more useful to publishers/aggregators but not for people looking for services.

I do agree though that at least those three perspectives should be standardized, i.e. have defined schemas in the standard. There does need to be some discussion of exactly what these representations actually are.
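
To illustrate what the three perspectives could look like, here is a rough sketch that builds each one from the same flat relational rows. The tables are heavily simplified and the nesting choices and function names are assumptions for illustration, not the schemas in the proposal.

```python
# Simplified relational rows (a tiny subset of HSDS tables and fields).
organizations = [{"id": "org-1", "name": "Example Org"}]
services = [
    {"id": "svc-1", "organization_id": "org-1", "name": "Food Pantry"},
    {"id": "svc-2", "organization_id": "org-1", "name": "Legal Aid"},
]
locations = [{"id": "loc-1", "name": "Main Office"}]
service_at_locations = [
    {"id": "sal-1", "service_id": "svc-1", "location_id": "loc-1"}
]


def _by_id(rows):
    """Index a list of rows by their id."""
    return {row["id"]: row for row in rows}


def service_view(service_id):
    """'Service' perspective: the service with its organization and locations nested."""
    service = dict(_by_id(services)[service_id])
    org_id = service.pop("organization_id")
    service["organization"] = _by_id(organizations)[org_id]
    service["service_at_locations"] = [
        {"id": sal["id"], "location": _by_id(locations)[sal["location_id"]]}
        for sal in service_at_locations
        if sal["service_id"] == service_id
    ]
    return service


def service_at_location_view(sal_id):
    """'Service at Location' perspective: one service paired with one location."""
    sal = _by_id(service_at_locations)[sal_id]
    return {
        "id": sal["id"],
        "service": _by_id(services)[sal["service_id"]],
        "location": _by_id(locations)[sal["location_id"]],
    }


def organization_view(org_id):
    """'Organization with service list' perspective."""
    org = dict(_by_id(organizations)[org_id])
    org["services"] = [
        {k: v for k, v in s.items() if k != "organization_id"}
        for s in services
        if s["organization_id"] == org_id
    ]
    return org
```

The same rows appear in all three outputs, only rooted at a different table, which is why each perspective needs its own schema.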

The three perspectives above should cover those cases. However, do we mandate that every publisher has to produce all 3?

Most publishers will only have a small subset of those tables/relations so 53 is the worst case. I imagine most will have only around 10-20. Ideally implementations would have the JSON representations cached to help with performance.

Metadata should definitely be optional on any endpoint.

@devin
I have updated the tool to produce examples for top-level organization and service at location; they can be found here. This is the one for organization.

We had an HSDS workgroup call on this topic.

There seemed to be general agreement on:

  • Structuring the schema development into a folder of schemas, as long as there is a way to compile the datapackage.json from them (which there is)
  • We need to define the API specification with JSON Schema, and we can use these schemas to help do that.
  • Adding examples to the schema.
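
As a rough idea of what compiling a datapackage.json from per-table schemas involves, here is a minimal sketch. It is not the project's actual compile tool: the function name, the in-memory `SCHEMAS` input (standing in for a folder of schema files), and the package name are assumptions.

```python
import json


# Stand-in for a folder of JSON Schema files, keyed by table name.
SCHEMAS = {
    "service": {
        "type": "object",
        "required": ["id", "name"],
        "properties": {
            "id": {"type": "string"},
            "name": {"type": "string", "description": "The official service name."},
            # Nested relation: belongs to its own table, not a column here.
            "phones": {"type": "array"},
        },
    }
}


def compile_datapackage(schemas: dict) -> dict:
    """Turn JSON Schema definitions into a tabular Data Package descriptor."""
    resources = []
    for name, schema in sorted(schemas.items()):
        required = set(schema.get("required", []))
        fields = []
        for prop, spec in schema.get("properties", {}).items():
            if spec.get("type") in ("array", "object"):
                continue  # nested data is represented by separate tables
            field = {"name": prop, "type": spec.get("type", "string")}
            if "description" in spec:
                field["description"] = spec["description"]
            if prop in required:
                field["constraints"] = {"required": True}
            fields.append(field)
        resources.append(
            {
                "name": name,
                "path": f"{name}.csv",
                "schema": {"fields": fields, "primaryKey": "id"},
            }
        )
    return {"name": "hsds-example", "resources": resources}


package = compile_datapackage(SCHEMAS)
datapackage_json = json.dumps(package, indent=2)
```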

I think there is not complete agreement on:

Having JSON versions of the standard that contain nested data and therefore have an opinionated way to structure HSDS, i.e. agreeing on a primary representation

So for this we have to assess the pros and cons of doing this.

Pros:

  • Have an easier way to validate HSDS.
    • Means that we can have good external validation tooling that works not just for particular APIs but for the standard itself.
  • Have better validation rules, especially ones that span relations, so the standard could state, for example, that an “organization has at least 1 service”.
  • Less proliferation of different JSON structures used, to aid interoperability.
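
As an example of a rule that spans relations, "an organization has at least 1 service" becomes a simple array constraint once services are nested under the organization. The sketch below uses the JSON Schema keywords `required` and `minItems`, but the tiny checker is hand-written for illustration and only understands those two keywords; a real validator library would replace it.

```python
# Schema fragment for the organization-rooted representation (illustrative).
ORGANIZATION_SCHEMA = {
    "type": "object",
    "required": ["id", "name", "services"],
    "properties": {
        "services": {"type": "array", "minItems": 1},
    },
}


def validate(doc: dict, schema: dict) -> list[str]:
    """Check only `required` and `minItems`; returns a list of error messages."""
    errors = []
    for key in schema.get("required", []):
        if key not in doc:
            errors.append(f"missing required property: {key}")
    for key, spec in schema.get("properties", {}).items():
        value = doc.get(key)
        if isinstance(value, list) and len(value) < spec.get("minItems", 0):
            errors.append(f"{key}: expected at least {spec['minItems']} item(s)")
    return errors


empty_org = {"id": "org-1", "name": "Example Org", "services": []}
good_org = {"id": "org-2", "name": "Other Org", "services": [{"id": "svc-1"}]}
```

With flat tables the same rule needs a join across two files, which is much harder to express in a schema.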

Cons:

  • Less flexibility for publishers to use the HSDS “data model” in any way they choose.
  • Data will have to be denormalized in some way, i.e. taxonomy terms repeated each time. This could cause issues for consumers if a publisher uses different data each time.
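
To show the denormalization risk, here is a small sketch of a check a consumer might run: it flags taxonomy terms that share an id but differ in content between services. The `taxonomy_terms` key and the function name are assumptions for illustration.

```python
def find_conflicting_terms(services: list[dict]) -> list[str]:
    """Return ids of taxonomy terms whose repeated copies do not match."""
    seen: dict[str, dict] = {}
    conflicts: set[str] = set()
    for service in services:
        for term in service.get("taxonomy_terms", []):
            # Remember the first copy of each term, compare later copies to it.
            first_copy = seen.setdefault(term["id"], term)
            if first_copy != term:
                conflicts.add(term["id"])
    return sorted(conflicts)


published = [
    {"id": "svc-1", "taxonomy_terms": [{"id": "t-1", "name": "Food"}]},
    {"id": "svc-2", "taxonomy_terms": [{"id": "t-1", "name": "Food Assistance"}]},
    {"id": "svc-3", "taxonomy_terms": [{"id": "t-2", "name": "Housing"}]},
]
conflicting = find_conflicting_terms(published)
```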

The sub-question, if we do agree to the above, is:

What representation(s) should be standardized/mandated?

I think it was agreed that the following should be standardized:

  • Service
  • Service at Location
  • Organization with full service list (for cases where no single org has too many services)

The question remains: what is acceptable for a publication to produce in order to be conformant?

The simple answer could be: any of them. However, ideally we could mandate one for better interoperability. Obviously, we would also say that a datapackage would be accepted.

The spam filter removed my last post, can anyone restore it?

Weird. I fixed it.

I’m still trying to figure out how to make sure I get notifications from Discourse, for some reason I can’t seem to consistently get emails when new posts are made! And this is the kind of thing I would have wanted to be alerted about… would welcome any pointers.

Thanks David.

I’m not sure I understand all the details, but here’s what’s important to me:

  • to have auto-generated documentation in the format it is now with tables (entities) and their fields (properties) and associated descriptions and validation rules documented
  • to be able to see the data structure in a diagram similar to current entity relation diagrams
  • to have automated ways of generating web response formats to, at least, these web methods (names may differ):
    • /services
    • /services/{id}
    • /serviceAtLocations
    • /serviceAtLocations/{id}
    • /organizations
    • /organizations/{id}
    • /taxonomies
    • /taxonomies/{id}
    • /taxonomyTerms
    • /taxonomyTerms/{id}
  • that the above automated ways work for any application profile (e.g. for Core Tables) - initially just for filtered sets of tables/fields from the full HSDS but later with extra features like code lists for some fields and extra validations

I see a validator as checking a full paginated list of services and then sequentially details of every service. The same approach could be taken to read a full directory.
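
A minimal sketch of that crawl, assuming a paginated list response with hypothetical `contents` and `total_pages` keys and a `page` query parameter; the real response envelope would come from the API specification. The `fetch` callable is injected so the same loop works against an in-memory fake or a real HTTP client.

```python
from typing import Callable


def crawl_services(fetch: Callable[[str], dict]) -> list[dict]:
    """Walk every page of /services, then fetch the detail of each service."""
    details = []
    page = 1
    while True:
        listing = fetch(f"/services?page={page}")
        for summary in listing["contents"]:
            details.append(fetch(f"/services/{summary['id']}"))
        if page >= listing["total_pages"]:
            break
        page += 1
    return details


# In-memory stand-in for a publisher's API, for demonstration only.
FAKE_API = {
    "/services?page=1": {"total_pages": 2, "contents": [{"id": "svc-1"}]},
    "/services?page=2": {"total_pages": 2, "contents": [{"id": "svc-2"}]},
    "/services/svc-1": {"id": "svc-1", "name": "Food Pantry"},
    "/services/svc-2": {"id": "svc-2", "name": "Legal Aid"},
}


def fake_fetch(path: str) -> dict:
    return FAKE_API[path]


all_services = crawl_services(fake_fetch)
```

A validator would check each returned detail against the service schema; a directory reader would store it instead.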

The proposal documentation has been updated:

The main change is that now both a “Service Orientated” and “Organization Orientated” representation will be allowed. Either one or both should be provided.

Also there is now a list of other optional representations that the standard will provide schema and examples for.

  • Service at Location Full
  • Service at Location Simple (only one-to-one)
  • Service Simple
  • Organization Simple
  • Taxonomy
  • Taxonomy Term
  • Location

This should cover Open Referral UK’s needs. The intention is that all these, as well as examples and an ERD diagram, can be produced automatically for particular profiles of the standard (once we define how we do that).