HSDS 3.0 - Key Decisions

Hi all,

Reflecting on the discussions that took place in last week’s workgroup, there are some key decisions we think need to be made (ideally sooner rather than later) in order to progress with the HSDS 3.0 release.

Decision 1 - What is the minimally viable set of endpoints required by our API specifications?

At the last meeting there seemed to be a level of agreement that either of the following would be accepted as minimally viable:

  1. A /service endpoint that contains a paginated list of full services, as defined by the service JSON schema.

  2. A /service endpoint that contains a paginated list of limited service fields (example) AND a /service/{id} endpoint that contains a single full service, as defined by the service JSON schema (this is what OpenReferral UK currently does).
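To make the difference between the two options concrete, here is a minimal Python sketch. The sample records, the pagination envelope keys (`page`, `per_page`, `total_items`, `contents`), and the choice of `id` and `name` as the "limited" fields are all assumptions for illustration; none of them come from the HSDS schema or from any proposal in this thread.

```python
# Illustrative only: records and field names are hypothetical, not the HSDS schema.
SERVICES = [
    {"id": "s1", "name": "Food bank", "description": "Weekly food parcels",
     "phones": [{"id": "p1", "number": "555-0100"}]},
    {"id": "s2", "name": "Debt advice", "description": "Free drop-in sessions",
     "phones": []},
    {"id": "s3", "name": "Youth club", "description": "Ages 11 to 16",
     "phones": [{"id": "p2", "number": "555-0199"}]},
]

SUMMARY_FIELDS = ("id", "name")  # assumed "limited" field set


def paginate(items, page, per_page):
    """Wrap one page of items in a simple (assumed) pagination envelope."""
    start = (page - 1) * per_page
    return {
        "page": page,
        "per_page": per_page,
        "total_items": len(items),
        "contents": items[start:start + per_page],
    }


def option_1_list(page=1, per_page=2):
    """Option 1: /service returns pages of full service records."""
    return paginate(SERVICES, page, per_page)


def option_2_list(page=1, per_page=2):
    """Option 2: /service returns pages of limited fields only."""
    summaries = [{key: s[key] for key in SUMMARY_FIELDS} for s in SERVICES]
    return paginate(summaries, page, per_page)


def option_2_detail(service_id):
    """Option 2: /service/{id} returns one full record, or None if unknown."""
    return next((s for s in SERVICES if s["id"] == service_id), None)
```

Under option 2 a consumer needs one extra request per service to reach the nested data, which is the trade-off between the two options.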

This differs from Mike’s proposal, which allows option 2 only and also requires the inclusion of:

  • taxonomy_term - with the optional parameter of “taxonomy_id”
  • taxonomy_term/{id}
  • taxonomies
  • taxonomies/{id} - renamed from vocabularies in version 1
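As a rough illustration of how the optional “taxonomy_id” parameter on taxonomy_term might behave, here is a small Python sketch. The sample terms and their field names are assumptions, and the functions are hypothetical stand-ins for the endpoints rather than part of any proposal.

```python
# Hypothetical sample data; field names are assumptions, not the HSDS schema.
TAXONOMY_TERMS = [
    {"id": "t1", "taxonomy_id": "tax-a", "name": "Food"},
    {"id": "t2", "taxonomy_id": "tax-a", "name": "Housing"},
    {"id": "t3", "taxonomy_id": "tax-b", "name": "Advice"},
]


def list_taxonomy_terms(taxonomy_id=None):
    """taxonomy_term: all terms, or only one taxonomy's terms if taxonomy_id is given."""
    if taxonomy_id is None:
        return list(TAXONOMY_TERMS)
    return [t for t in TAXONOMY_TERMS if t["taxonomy_id"] == taxonomy_id]


def get_taxonomy_term(term_id):
    """taxonomy_term/{id}: a single term, or None if the ID is unknown."""
    return next((t for t in TAXONOMY_TERMS if t["id"] == term_id), None)
```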

A decision needs to be made as to whether we accept Mike’s proposal in full or whether an alternative needs to be explored.

Decision 2 - What format(s) – i.e. API vs datapackage – should be required for representation? And, should there be tiers of compliance?

We’ve discussed this question already at length, and have come close to reaching provisional consensus that .json/API endpoints would be a requirement, with datapackage as a ‘compatible’ option.

However, at the last meeting (prompted by concerns around data quality), the group discussed taking a stronger stance on the use of API over Datapackage. Inevitably, this resulted in concerns regarding the impact this would have on smaller publishers.

Before committing to the above proposal, we want to get clearer on our stance on the minimally required API endpoints.

So, the key decisions to be made are:

a/ Will we accept the use of Data Package as HSDS compliant?

b/ Do we want to introduce ‘levels’ of compliance? If so, what terminology should we use, e.g. Compatible/Compliant, Full/Partial/Non-compliance, etc.?

c/ Do we need to consider the use of tooling to support those using data packages? Please note: Cost and complexity of the tooling required differ depending on decision 1.

Decision 3 - Should /service and/or /organization be required as a top-level entity?

This has been discussed at meetings and there has also been some dialogue within the comments of the HSDS 3.0 Schema document.

There is a level of consensus around having service as the top-level entity, especially among application developers who organize their tools around the delivery of services. There is, however, some concern about prioritizing /service over /organization since the latter is legally determinable and the former is editorially subjective.

There now needs to be a decision as to whether those using organization as the top-level object (i.e. an ‘/organization’ endpoint) instead of service will be deemed compliant.

Related threads for reference:

We would really appreciate your views on these areas. Please share them in the comments section or at the next workgroup meetings for those who attend.

Regards,

Dan

Thanks so much for the summary, Dan.

One more link to remind folks: the remaining field-level issues in the schema are being discussed here. Seems like we’re close to consensus.

We’ll try to reach agreement on this on Monday’s workgroup call. Input is encouraged before then. (Although Monday’s decision will be provisional in advance of a formal proposal which will still be subject to comment and revision – so it won’t technically be the last chance for these questions to be considered. But the sooner we identify concerns and/or alternatives, the better!)

1. Let’s do both? I’ve heard a couple great ideas from Mike (I think it was Mike?) that play well together:

a) by default a core API like /service should return all data that is either part of that table schema or has a 1:1 relationship with that table.
b) Include an optional parameter that returns a fully denormalized version of that table, including 1:many relationships. I’ll add that, if we want these endpoints to be optionally useful for replication between systems, then we need to include the ID field for all records, including 1:1 and 1:many relationships.
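A minimal Python sketch of points a) and b) might look like the following. It assumes a record in which dict values stand in for 1:1 relationships and list values for 1:many relationships; the field names, the `denormalized` argument, and the `missing_ids` helper are all hypothetical and only illustrate the idea.

```python
import copy

# Hypothetical service record. Dict values stand in for 1:1 relationships and
# list values for 1:many relationships; the field names are illustrative only.
SERVICE = {
    "id": "s1",
    "name": "Food bank",
    "organization": {"id": "o1", "name": "Helping Hands"},   # 1:1
    "phones": [{"id": "p1", "number": "555-0100"}],          # 1:many
    "schedules": [{"id": "sch1", "opens_at": "09:00"}],      # 1:many
}


def render_service(service, denormalized=False):
    """Return own fields plus 1:1 data; add 1:many lists only when asked."""
    if denormalized:
        return copy.deepcopy(service)
    return {
        key: copy.deepcopy(value)
        for key, value in service.items()
        if not isinstance(value, list)
    }


def missing_ids(record, path="service"):
    """List the paths of the record and nested records that lack an 'id' field."""
    problems = [] if "id" in record else [path]
    for key, value in record.items():
        if isinstance(value, dict):
            problems += missing_ids(value, f"{path}.{key}")
        elif isinstance(value, list):
            for index, child in enumerate(value):
                if isinstance(child, dict):
                    problems += missing_ids(child, f"{path}.{key}[{index}]")
    return problems
```

The `missing_ids` check reflects the replication point: a nested record without its own ID cannot be matched up between systems.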

I also agree that the taxonomy endpoints should be included as well. Because taxonomy_term is the only table outside the core tables that currently allows a many:many relationship by way of a join table, it may be much more efficient in some cases for devs to access this critical data at its own endpoint. I could perhaps make a case for not requiring these APIs if option B above is included in the spec, as devs can potentially get their taxonomies that way. However, if we are not allowing 1:many relationships in any APIs, then we quite obviously need these endpoints as it will be the only way to fetch taxonomies.

2) I do think it’s non-negotiable to keep datapackage.json as some sort of compliance for edge cases where data providers are small, poorly funded, temporary, or otherwise not able to invest in full-on API infrastructure. However, I’d like the industry to see this allowance as a rare exception. Most enterprises should adopt the API standard fully.

I like the language of “compatible” versus “compliant”; I just think of it as HSDS compliant vs HSDA compliant.

3) I’m really in favor of going all-out with the API specification. Require the /organization endpoint as well. If we are allowing datapackage.json as some level of compliance, then let’s require everyone who is going to roll an API to make a really useful one. Certainly, for the use case of moving bulk data via API, /organization will be far more efficient than /service.

Brilliant! Thanks so much for this Skyler. See you on Monday.

Here is my/ODS opinion on the above.

Decision 1

Happy with Mike’s full proposal.

We acknowledge that this comes from a developer/aggregator perspective and may involve cost, but we think it could be improved by:

  • Adding a paginated services endpoint with the full nested data in.
  • A JSON Lines version of the paginated endpoints. This gets round some consistency issues with pagination.
  • Service list views that only have changes since a certain date.
  • Service list view that only has the service ID and latest changed date.
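The last three suggestions could be sketched in Python roughly as follows. The `last_modified` field name, the date-only granularity, and treating "since" as inclusive of the given date are assumptions for illustration only.

```python
import json
from datetime import date

# Hypothetical records; 'last_modified' is an assumed field name.
SERVICES = [
    {"id": "s1", "name": "Food bank", "last_modified": "2023-01-10"},
    {"id": "s2", "name": "Debt advice", "last_modified": "2023-03-05"},
    {"id": "s3", "name": "Youth club", "last_modified": "2023-02-20"},
]


def to_json_lines(services):
    """Serialize one service per line (JSON Lines) so consumers can stream them."""
    return "\n".join(json.dumps(s, sort_keys=True) for s in services)


def changed_since(services, since):
    """Keep only services modified on or after the ISO date string 'since'."""
    cutoff = date.fromisoformat(since)
    return [s for s in services if date.fromisoformat(s["last_modified"]) >= cutoff]


def ids_and_dates(services):
    """Reduce each service to its ID and last-changed date."""
    return [{"id": s["id"], "last_modified": s["last_modified"]} for s in services]
```

An aggregator could poll the ID-and-date view cheaply and then fetch only the services whose dates have moved on since its last sync.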

This would future proof the standard by covering a lot of the issues regarding reconciliation discussed at the last meeting.

Decision 2

a/ Yes for now, but deprecate its use. No in the long run (i.e. accept it only until sufficient tooling is available).

b/ If we do decide to accept multiple options we need to ensure we clearly communicate the features/capabilities and also the limitations of each option.

Perhaps we can present the various options using a feature/capability tick list with fully compliant API as the gold standard i.e. all of the ticks.

c/ Whether data package is accepted as part of the standard or not, we don’t think it’s possible to achieve adequate interoperability and meaningful uptake without the provision of tooling.

Decision 3

As services are the most useful for the end-user we think they should be mandatory for HSDS compliance.

To support reconciliation, we think organization lists should also be strongly recommended, but not mandatory, to reduce complexity.

I support Mike’s proposal as well.

But I think we should add a method/endpoint for pulling the entire dataset’s datapackage via API.

Then, Open Referral, as a project, should offer a free/open source tool that can publish a datapackage’s contents in the standard Open Referral API.

As for the language around “Open Referral compliant/compatible/etc”, I don’t think we’re prepared to make a firm decision about that, and that’s okay. We can release HSDS 3.0 and the API spec, and vendors/solution_providers/projects can say “we follow the HSDS 3.0 standard.”

When we have a 3.0 HSDS tool ecosystem where data can be validated, registered, queried, etc (like Open Referral UK offered for the previous version of the standard), then I think the minimum/fully compatible/compliant language becomes more important. But the language conversation can and should go beyond tooling to organizational design and branding/marketing as well.

IMO some questions to be discussed include:

  • Does Open Referral, as a project/entity, certify software/systems/approaches? Does it offer a directory of endorsed service projects/solutions? Does it charge for that service?
  • Does Open Referral care if data is public? If data is formatted using HSDS but private, is that okay? What do we call that?
  • What tools are accessible to compatible/compliant and public datasets? Who builds, pays for, markets, and maintains them?

These are just some of the questions we need to tackle when it comes to thinking through the language of compatibility/compliance/etc, but I think this conversation should happen in other work streams. Let’s get HSDS 3.0 and the API specs published before we all go down that rabbit hole.

On the heels of our recent HSDS 3.0 Workgroup meeting, I’d like to crystallize a suggestion for defining levels of API compliance:

  1. The “Standard” or minimum level of compliance is a /services endpoint that includes denormalized 1:1 and 1:many relationships. Fundamentally this level of compliance is strictly concerned with delivering services. The dominant use case is to make data available for application consumption.
  2. “Full” compliance expands to include all core tables, /organization, /location, and /service, with fully denormalized 1:1 and 1:many relationships, and arrays of IDs to all many:many relationships. This will expand our original use case to encompass full data replication.
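To pin the two levels down, here is a rough Python sketch of how a checker might classify a publisher. It only looks at which core tables are exposed and ignores the denormalization requirements entirely; the level names and the mapping are assumptions drawn from the two points above, and table names are used instead of URL paths because singular versus plural endpoint naming is not settled here.

```python
# Assumed mapping of the two proposed levels to the core tables they expose.
# Endpoint naming (singular vs plural) is unsettled in the thread, so this
# sketch works with table names rather than URL paths.
LEVELS = {
    "standard": {"service"},
    "full": {"service", "organization", "location"},
}


def compliance_level(tables_exposed):
    """Return the highest level whose required tables are all exposed, else None."""
    exposed = set(tables_exposed)
    if LEVELS["full"] <= exposed:
        return "full"
    if LEVELS["standard"] <= exposed:
        return "standard"
    return None
```

For example, `compliance_level(["service", "organization"])` would come out as "standard" under this sketch, because /location is missing.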

Further clarification.

  1. In all cases, IDs should be included wherever data has a unique ID associated with it, for example in the cases of phone, schedule, and cost_option.
  2. Taxonomies and attributes will need to be included as denormalized data on all core tables, if not wherever they are linked to records (bearing in mind they can be linked to almost any table in 3.0).
  3. The only level that needs to be defined before we drop 3.0 is the Minimum or “Standard” level. Full compliance can be hashed out in detail and dropped later.

Open questions.

  1. Whether or not /taxonomy_term should be included in some or all levels of compliance is an open question. It’s very useful when taxonomies are present (as @MikeThacker pointed out), but not all organizations even have taxonomies (as @bloom pointed out).

Just trying to wrap my head around concrete concepts here. Is this more or less the general understanding?

I agree with all of that @skyleryoung. Just a few clarifications:

  • Regarding @Dan-ODS’s Decision 1 points 1 and 2, I favour a paginated basic /services and a full /services/{id}, following the same pattern for other methods.

  • We said a /service_at_location method should be in the extended (you say “full”) set of methods

  • I’d expect /taxonomy and /taxonomy_term to be in the extended set so individual programmes can require them if needed but they are not expected by everyone

Thanks for filling in the gaps @MikeThacker. I agree on all points.

@skyleryoung @MikeThacker

I agree with pretty much everything too.

The only thing that should be allowed (and in my opinion recommended) is that the /services endpoint also be able to show the full service, not just the abbreviated one. It can be up to the endpoint if it wants to show more than just the one-to-one relationships, and this should not affect validation.

I also would like in the extended option to have:

/services?full=True

To say that you want the full services, paginated.

Also as an extended option:

/services?minimal=True

To just give a list of the ids and last modified dates of the service as a whole.
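As a rough sketch of how a server might interpret these two options, here is some Python using only the standard library. The accepted truthy spellings, the "default" view name, and rejecting a request that sets both flags are my assumptions rather than anything agreed in this thread.

```python
from urllib.parse import parse_qs, urlsplit

TRUTHY = {"true", "1", "yes"}  # assumed accepted spellings, case-insensitive


def view_for(url):
    """Pick a response view from the query string of a /services request."""
    query = parse_qs(urlsplit(url).query)

    def flag(name):
        # A flag counts as set only if its last value is a truthy spelling.
        values = query.get(name, [])
        return bool(values) and values[-1].lower() in TRUTHY

    if flag("full") and flag("minimal"):
        raise ValueError("'full' and 'minimal' cannot both be set")
    if flag("full"):
        return "full"
    if flag("minimal"):
        return "minimal"
    return "default"
```

With no parameters the function falls back to "default", which would correspond to the basic paginated list.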

However I agree the minimum should be as Mike outlines.

That sounds perfect @davidraznick
