I was recently struck by a comment made during a client call: “A standard isn’t a standard if you can drive a truck through it”.
This rings true for me. One of the strengths of HSDS is its flexibility: I have been able to map more than eight separate data schemas into HSDS with a high degree of fidelity (which I would define as having all the data present, consistent, and usable in the new format).
However, as both a user of the data and a provider of data to others, and in the context of publishing data mostly for public display, I always find I have to add further information before publishing resources.
Here are some additional elements that we have to define ahead of publishing:
- Contact information like phone, email, schedule, and URL may be related to multiple tables or “levels” at the same time. For example, if I’m looking at a service at a specific location, and both the service and the location have a phone number, I need to know which phone number to reference first.
- Similarly, a service may have more than one phone number. Within the set of phone numbers related to a single entity, I need to know the order in which they should be presented and recommended to users.
- Source data often lends itself to a particular way of displaying the names of resources. For example, when looking at a service at a location, the name is often `service.name at location.name`. If it’s a virtual service, then perhaps `service.name by organization.name` is better. For a service-level display, that might be just the service name, or again a concatenation of the service and organization names. In any case, we have found it necessary to define this on a per-data-source basis (a sketch follows just below this list).
I would say the first two items are the most important, but the third is worth mentioning.
Solutions (?)
Inside Connect 211 we solve these issues by aggregating information at the `service_at_location` level in ranked lists; these are the source of truth for ordering data. For example, we rank and store phone numbers from across all related tables in a single array. We also format names and descriptions at this level in cases where multiple data elements are concatenated.
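For illustration, a minimal sketch of that kind of aggregation might look like the following. The `priority` and `source_level` fields, and the order of precedence, are assumptions made for the example rather than anything defined in HSDS.

```python
def ranked_phones(service, location, organization):
    """Collect phone numbers related to a service_at_location from the
    service, location, and organization levels into one ranked list.

    The precedence (location, then service, then organization) is a
    per-source editorial choice, shown here only as an example.
    """
    levels = [
        ("location", location.get("phones", [])),
        ("service", service.get("phones", [])),
        ("organization", organization.get("phones", [])),
    ]
    ranked = []
    priority = 1
    for level_name, phones in levels:
        for phone in phones:
            ranked.append({
                "number": phone["number"],
                "extension": phone.get("extension"),
                "source_level": level_name,  # which table it came from
                "priority": priority,        # 1 = present/dial first
            })
            priority += 1
    return ranked
```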
This was necessary for our own internal consumption of the data. When we shared data with a national enterprise using strict HSDA, they came back with lots of questions about how to parse and prioritize results. Once we started exporting our internal rankings, that ended up answering all of their questions, which validates for me that this additional information is broadly useful.
The Conversation
I’m interested in having a conversation about what a strict, or more opinionated, version of HSDS/A might look like. I assume the best mechanism for this would be an application profile, although I’m open to suggestions. This approach has proven so helpful in our experience that I think it might warrant top-level recognition in the documentation once it’s fleshed out, and perhaps its own validator.
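As a rough illustration of what a more opinionated profile check could do, a validator might simply refuse data that leaves ordering ambiguous. The `priority` field below is a hypothetical addition, not part of current HSDS.

```python
def check_phone_priorities(phones):
    """Return a list of problems a stricter profile could flag for one
    entity's phone list; an empty list means the ordering is unambiguous."""
    problems = []
    priorities = [p.get("priority") for p in phones]
    if any(p is None for p in priorities):
        problems.append("every phone must declare a priority")
    if len(set(priorities)) != len(priorities):
        problems.append("phone priorities must be unique within an entity")
    return problems
```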
What do you think? Have you had to implement similar mechanisms for making data less ambiguous or more usable? Which data fields have proven most tricky for you? Is this a worthwhile Application Profile to invest in?