3.0 for review and comment

Hi all,

We’ve reached a point where we’d like the broader community to review the updated standard before we progress with the full release.

A full list of changes can be found within the changelog.

We’d appreciate it if the technical community could take a look and give feedback on:

Unless there are any major objections, we are hoping to pause for now on any further changes to the schema itself. However, as there has been less consultation around the API specification, could we please ask that this is given priority when reviewing.

Please share your views in response to this post and/or at the workgroup meeting on 27th March.

Thanks,

Dan

Hi Dan. Thanks so much for pushing this forward. Looks like you all have done great work.

A few questions:

  • Can you explain what the “full release” will entail? Does that mean there is more documentation forthcoming or simply that it will be announced, etc?
  • Is there a single JSON Schema file available for the standard?
  • Is there a Tabular Data Package?
  • Will there be a Swagger interface for the API?

Thanks!

I think what is meant is that it will be officially released and will be what users should implement from now on.

The way we have structured it, there cannot be a single jsonschema, as it is possible to query the data from many perspectives (service, org and service at location). The directory of schemas, which has one file per table, is the only place schema edits happen. From there, all the other schemas are generated, including the datapackage.
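
As a rough illustration of the "one file per table, everything else generated" idea, here is a minimal sketch that turns per-table JSON Schemas into a Tabular Data Package descriptor. It is not the actual HSDS build tooling, which is not shown in this thread; the function names, the `hsds-sketch` package name, and the sample tables are assumptions made for the example.

```python
import json

def table_schema_from_json_schema(json_schema):
    """Flatten one per-table JSON Schema into a Table Schema description."""
    fields = []
    for name, spec in json_schema.get("properties", {}).items():
        json_type = spec.get("type", "string")
        # Nested objects and arrays live in their own tables in a tabular
        # layout, so this sketch simply leaves them out of the parent table.
        if json_type in ("object", "array"):
            continue
        fields.append({"name": name, "type": json_type})
    return {"fields": fields, "primaryKey": "id"}

def build_datapackage(per_table_schemas):
    """Generate one datapackage resource per table (hypothetical helper)."""
    resources = [
        {
            "name": table,
            "path": f"{table}.csv",
            "schema": table_schema_from_json_schema(schema),
        }
        for table, schema in sorted(per_table_schemas.items())
    ]
    return {"name": "hsds-sketch", "resources": resources}

# Stand-ins for the per-table schema files; the real files have many more fields.
schemas = {
    "service": {
        "properties": {
            "id": {"type": "string"},
            "name": {"type": "string"},
            "phones": {"type": "array"},
        }
    },
    "organization": {
        "properties": {"id": {"type": "string"}, "name": {"type": "string"}}
    },
}

datapackage = build_datapackage(schemas)
print(json.dumps(datapackage, indent=2))
```

Because every derived format is produced from the same per-table input, an edit to one table schema flows through to the datapackage without anyone editing it by hand.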

However, if you want a single jsonschema file that is not too nested, this service one is possibly the most useful, and it may make sense to link to it from the docs.

Yes, it is generated from the above: datapackage. However, we should probably link it in the docs more clearly somewhere.

Yes, at some point, hopefully soon. We went with the latest version of the OpenAPI standard (3.1), as it has better jsonschema support. However, the Swagger UI does not support it yet. Support is on their roadmap to complete very soon, and they have a release candidate for it.

If they do not deliver it in a reasonable time frame, we were going to write a quick translated version of the schema to work on an older interface, but we did not want to waste the time if support will be out imminently.

Thanks for the reply.

I do think it’s worth linking to the single service jsonschema and tabular data package files from the docs.

In the past we discussed a single file using organization as the root/highest level because people often compare system data (and will eventually dedupe) organization by organization. I don’t think it’s critical for this to exist but it’d certainly be helpful and worth linking to from docs if you have it handy.

As for the Swagger interface, your approach makes a lot of sense.

Thanks so much for all your work on this and looking forward to the official launch!

@devin

Based on your feedback, we added this page:

http://docs.openreferral.org/en/3.0-dev/hsds/publication_formats_reference.html

The schema reference page now also has a tab where you can see the schemas for each entity, e.g.

http://docs.openreferral.org/en/3.0-dev/hsds/schema_reference.html#service

Making a downgraded schema (to OpenAPI 3.0) was in the end not too difficult, so there is now a swagger page.
This swagger page also has schemas and examples for each endpoint.
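
For readers curious what a downgrade from OpenAPI 3.1 to 3.0 can involve, here is a minimal sketch of three common schema rewrites: nullable type lists, `const`, and `examples`. It is an illustration under stated assumptions, not the script used for the HSDS swagger page; the `downgrade_schema` function and the sample schema are hypothetical.

```python
def downgrade_schema(schema):
    """Rewrite a 3.1-style schema object into a 3.0-compatible one."""
    if not isinstance(schema, dict):
        return schema
    out = {}
    for key, value in schema.items():
        if key == "properties":
            # Keys here are field names, so only the values are schemas.
            out[key] = {name: downgrade_schema(sub) for name, sub in value.items()}
        elif key in ("items", "additionalProperties", "not"):
            out[key] = downgrade_schema(value)
        elif key in ("anyOf", "oneOf", "allOf"):
            out[key] = [downgrade_schema(sub) for sub in value]
        else:
            out[key] = value

    # 3.1 allows type: ["string", "null"]; 3.0 wants one type plus nullable.
    declared = out.get("type")
    if isinstance(declared, list):
        non_null = [t for t in declared if t != "null"]
        if len(non_null) == 1:
            out["type"] = non_null[0]
            if "null" in declared:
                out["nullable"] = True
        # Unions of several non-null types are left untouched in this sketch.

    # 3.0 has no const keyword; a single-value enum expresses the same thing.
    if "const" in out:
        out["enum"] = [out.pop("const")]

    # 3.0 schema objects take a single example rather than an examples array.
    if isinstance(out.get("examples"), list) and out["examples"]:
        out["example"] = out.pop("examples")[0]

    return out

schema_31 = {
    "type": "object",
    "properties": {
        "email": {"type": ["string", "null"], "examples": ["info@example.org"]},
        "status": {"const": "active"},
    },
}
schema_30 = downgrade_schema(schema_31)
print(schema_30)
```

A real conversion would also need to change the top-level `openapi` version string and handle any other 3.1-only keywords the specification uses.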

These are fantastic changes. Really suits my needs. Thanks so much.

One more thing: with the ERD diagrams, can you:

  • include a note about how they were generated and, if it was with code, link to the code used
  • include a link to download the ERD images

And another request: on the Swagger can you add some basic introductory information and a link back to the docs.openreferral.org site so if people find themselves there as their first stop into HSDS land they know where to get more info?

Hi all, great work with HSDS 3.0; I am excited to implement it! I couldn’t find a link to the workgroup meeting today so I have typed up my feedback instead.

Hunger Free America has a very similar use case to Ian from DigitalGaps (Our Model for Place-based DoS - #5 by Ian-DigitalGaps): we have too much data for our small team to stay on top of it all. To manage all of the data we partner with organizations like food banks that maintain smaller local lists of services. Hunger Free America aggregates the data, cleans it, and assures it before publishing. These different data sources have varying levels of data accuracy, and we would like to capture whether the data came from the service itself or from a partner organization. Additionally, our assurers are a mixture of volunteers and Hunger Free America staff, so we don’t necessarily want to publish the assurer’s email address. My suggestion would be to create separate tables for assurer and data source so that we can better capture the nuance of who did what and when.

Some fields that I would like to see captured that currently are not:

Data_Source:

  • Email
  • Type (the service itself / 3rd party data source)

Assurer:

  • Name
  • Type (Volunteer/Staff)
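
To make the suggestion concrete, here is a minimal sketch of the two proposed records as Python dataclasses. Neither table exists in HSDS 3.0; the class names, the `id` fields, the allowed `type` values, and the `publishable` helper are all assumptions for illustration.

```python
from dataclasses import dataclass, asdict
from typing import Optional

@dataclass
class DataSource:
    id: str
    type: str                      # assumed values: "service" or "third_party"
    email: Optional[str] = None

@dataclass
class Assurer:
    id: str
    name: str
    type: str                      # assumed values: "volunteer" or "staff"
    email: Optional[str] = None    # held internally, not meant for publication

def publishable(assurer: Assurer) -> dict:
    """Return the assurer record without the private email field."""
    record = asdict(assurer)
    record.pop("email")
    return record

source = DataSource(id="src-1", type="third_party", email="lists@foodbank.example")
assurer = Assurer(
    id="asr-1", name="A. Volunteer", type="volunteer", email="private@example.org"
)
print(publishable(assurer))
```

Keeping the assurer in its own record means the published feed can say who checked the data, and in what capacity, without exposing contact details.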

Best,
Atticus Rains
Hunger Free America
arains@hungerfreeamerica.org

Hi Atticus,

Thanks for this. I’ve invited you to the workgroup meeting in case you still want to come along.

Dan

Hi folks – bumping this thread to share that we have formally introduced version 3.0 for a final Request for Comment period! See here: https://openreferral.org/introducing-version-3-0-of-the-human-service-data-specifications/

We’ll close this final RFC on April 17th, and – depending on how much input we get, and the nature of that input – we’ll plan to have at least one public call in the back end of the month to review any last issues.

Let me know if you have any questions or suggestions.

Many thanks to all of the workgroup members and others who have helped us bring this package forth!

Onward,
greg

So far I’ve received only positive feedback about the proposed version 3.0 – which is encouraging!

This is just a reminder that we’re going to close the Request for Comment period in one week. If we still only have positive feedback at that point, it will be automatically approved!

Let me know if you have any questions or concerns in the meantime. (If you need more time to thoroughly review, I’m sure we can accommodate somehow.)

Looking forward,
greg

@devin The ERD is generated here, as part of the docs build.

This is not particularly useful, as currently it cannot be run outside the context of the Sphinx documentation build. We are aiming to make it usable outside the docs, and once we do we will provide the links.

I have now added download links to:

http://docs.openreferral.org/en/3.0-dev/hsds/logical_model.html#entity-relationship-diagram-full-version

Added a quick link back to the docs in the swagger page. Will work on good introductory information copy.

Awesome David. Really appreciate it.

Another request: can we have an official, single-sheet CSV that has all fields and tables, similar to this one that you put together in the past: Simple HSDS - Google Sheets?

I’m not sure we need all the columns that you have there but the more the merrier.

Thanks - really excited for 3.0 and appreciative of your work on it!

Hi folks – overall, responses to the proposed version 3.0 have been very positive. We got some very thoughtful feedback from 360Giving that I’m going to share here, and @marion.galley can join the discussion here in the forum to elaborate.

See below, and I’ll also try to relay the subsections to other relevant threads here in the forum.

360Giving is pleased to see the updates made for version 3.0 of the Human Service Data Specifications. In particular, we welcome the addition of structures to support the inclusion of externally referenced organisation identifiers and geocodes.

Organisation identifiers

The ability to include organisation identifiers is a huge improvement for the HSDS and we at 360Giving are pleased to see it. This will make it possible to analyse referrals data together with other types of data such as grants data and government contracts data, increasing the ability of the sector to understand its impact in terms of the grants they give and receive, the contracts they win and the services they provide.

The inclusion of a separate entity for organisation identifiers seems to introduce excessive complexity. It now requires an additional identifier for the identifier, which would not be needed if the identifier was simply housed in the organisation entity directly without reference to its own entity. This model significantly increases the complexity for data sharing and data analysis, especially when linking parent organisations, and it reduces the usefulness and interoperability of this data.

We are also not clear on the purposes of some of the different fields. If the Identifier Scheme field allows users to declare a scheme for the third party identifier taken from org-id.guide, then what is the difference between that and the Identifier type, and where are the types drawn from?

Also, if the Organization Identifier field is intended to include third party identifiers, then what is the purpose of the Third Party Identifier field? Is the Organization Identifier expected to include the Third Party Identifier plus a prefix indicating the Identifier Scheme? If so, it is not necessary to create a separate field for this. In the 360Giving Data Standard, we have a single Organization Identifier field that includes both the prefix and the externally referenced identifier together, as this information allows data users to identify the scheme and the identifier.
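
The single-field approach described above can be sketched as follows. This is an illustration only: the helper names are hypothetical, the identifier numbers are made up, and the scheme list is a tiny sample of org-id.guide prefix codes. Because prefixes such as `GB-COH` themselves contain a hyphen, splitting a combined identifier needs a list of known prefixes rather than a plain split on `-`.

```python
# A small sample of org-id.guide prefix codes; a real list would be much longer.
KNOWN_SCHEMES = ("GB-COH", "GB-CHC")

def join_identifier(scheme, identifier):
    """Combine a scheme prefix and a third party identifier into one string."""
    return f"{scheme}-{identifier}"

def split_identifier(org_id, known_schemes=KNOWN_SCHEMES):
    """Recover (scheme, identifier) from a combined organization identifier."""
    # Try longer prefixes first so one prefix cannot shadow a longer one.
    for scheme in sorted(known_schemes, key=len, reverse=True):
        prefix = scheme + "-"
        if org_id.startswith(prefix):
            return scheme, org_id[len(prefix):]
    raise ValueError(f"no known scheme prefix in {org_id!r}")

combined = join_identifier("GB-COH", "01234567")
print(combined)
print(split_identifier(combined))
```

The trade-off is that a combined field is compact and self-describing, while separate fields avoid the need for a prefix lookup when reading the data.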

Location updates

The ability to include third party identifiers for locations significantly increases the user-friendliness of sharing geographic data, as latitude and longitude are not always commonly used to describe service delivery locations.

However, we would recommend against highlighting what3words as an example of an external identifier scheme, as it is fraught with issues, and this may increase its perceived legitimacy.

It is not clear whether there are any limitations on the types of third party schemes that can be used for sharing location data. 360Giving has had some experience with this, and found that as a result of our geocode types list not being validated, it has suffered from poor data quality and a proliferation of codes which is now unwieldy, while the official codelist has become out of date. We would recommend defining a closed list of schemes and maintaining that list to reflect changing needs, rather than being permissive of any scheme being used, which makes the data harder to map and compare.
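
A closed list of this kind is straightforward to enforce at validation time. The sketch below flags any record whose scheme is not on the list; the scheme codes and the `scheme` and `identifier` field names are placeholders, not values taken from HSDS or the 360Giving codelist.

```python
# Placeholder codes standing in for a maintained, closed codelist.
ALLOWED_SCHEMES = {"example_admin_area", "example_postal_district"}

def find_unlisted_schemes(records, allowed=ALLOWED_SCHEMES):
    """Return (index, scheme) pairs for records whose scheme is not allowed."""
    return [
        (index, record.get("scheme"))
        for index, record in enumerate(records)
        if record.get("scheme") not in allowed
    ]

records = [
    {"scheme": "example_admin_area", "identifier": "A001"},
    {"scheme": "Example Admin Area", "identifier": "A002"},  # free-text variant
    {"scheme": "my_custom_grid", "identifier": "xyz"},       # unlisted scheme
]
problems = find_unlisted_schemes(records)
print(problems)
```

The second record shows how an unvalidated field drifts: the same scheme spelled differently becomes a new code unless the check rejects it.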

We would strongly recommend supporting the use of externally referenced geocodes for all entities that relate to location, for example service area. This would make the data more comparable to other datasets, including being able to compare the provision of services to the area of impact of grants, or to government administration and reference data. Consider introducing a concept of location scope or similar for indicating a reach of a wider area than a single place.

We welcome the addition of the ability to distinguish between virtual and physical locations, as this reflects the growing trend towards delivery of services online.

Funding

The Funding entity is defined as describing the source of the funding, but it only supports the inclusion of data about the organisation in receipt of the funding, not the funder. This doesn’t really describe the source of funding, so if that is the intent, should the entity support sharing data about the funder(s)? More structured data about funding sources would support interoperability with other data standards and data uses.

Many thanks to Marion and 360Giving for their input! Eager to hear thoughts about whether these issues merit changes to the 3.0 proposal, or whether these concerns can be addressed by other means.

Thanks so much for sharing 360Giving’s response @bloom

If you aren’t aware of 360Giving, you can read more about us here. We decided to respond as 360Giving is a long-time supporter of the OpenReferral initiative. We hope that by working to make our open data standards more interoperable, we can jointly support increased data use and analysis, providing greater understanding of the overall nonprofit landscape.

Also, we have learnt from some of the challenges in operating our own data standard – sharing this learning is part of our commitment to openness in our values.

Just thought I’d add a couple of links for awareness:

  • Here is a link to our geocode types codelist, mentioned in the response above as an example of an area where we’ve struggled to maintain consistency in the data
  • Here is a link to our location scope codelist, mentioned in the response above as an example of a new development we’ve introduced to support describing the geographical scope of a grant, particularly when location data isn’t shared, or when the location refers to a single place but the scope is much wider (e.g. national).

Hi - just adding more feedback here: I think we’re missing the Accessibility section in the Schema Reference (Schema Reference — Open Referral Data Specifications 3.0.1 documentation).

My colleagues at ODS and I put our heads together and wanted to get a good response out to 360Giving’s great feedback. It follows below :slight_smile: Sorry in advance for the mini-essay!

Firstly, thanks to both Marion and 360Giving for taking the time to feed back on HSDS 3.0. This is especially true given that the feedback was both positive and constructive, and means a lot coming from 360Giving.

Organization Identifiers

You are correct and we’re very grateful that you raised this ambiguity as an issue. It’s not currently clear in our documentation what some of the fields within organization_identifier are for and we’ll be taking action to clarify this. To respond to you directly:

  • organization_id is the join for the tabular data to the organization. This is a uuid, and is different to the id of the organization_identifier object (discussed below). We see how this naming convention produces confusion given that it is within the organization_identifier object. We will seek to clarify this through documentation.
  • identifier_scheme is the org-id scheme taken from org-id.guide
  • identifier_type is a human-readable version of identifier_scheme and acts as a description. It also covers cases where there’s a need to describe a scheme which is not present in org-id.guide, since publishers often need to publish faster than org-id.guide adds requested schemes.
  • identifier is the actual identifier string e.g. a GB-COH number.

The field title “Third Party Identifier” (for the identifier field) is named to show that it is not the publisher’s identifier used for this organization, but one drawn from an external list.
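
Put together, a single organization_identifier record might look like the sketch below. The field names follow the list above; the values are made up, the uuids are generated on the spot, and the `identifier_type` wording is an assumption.

```python
import uuid

organization_identifier = {
    # uuid of this organization_identifier record itself
    "id": str(uuid.uuid4()),
    # uuid of the organization this identifier belongs to (the join field)
    "organization_id": str(uuid.uuid4()),
    # scheme code taken from org-id.guide
    "identifier_scheme": "GB-COH",
    # human-readable description of the scheme
    "identifier_type": "Company number",
    # the third party identifier string itself (made-up value)
    "identifier": "01234567",
}

for field, value in organization_identifier.items():
    print(f"{field}: {value}")
```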

In terms of the “identifier for the identifier”, this is a result of a few decisions which were made consciously to support publishers and data users.

With 3.0, HSDS is transitioning from a data package model — where different entities were represented in tables — towards a JSON schema model more suited for APIs. We are supporting publishers who continue to use the data package model through tooling to convert between them.

This requires that each entry in the organization_identifier table has a uuid identifier for the “row”. This supports the conversion between formats, allowing publishers to continue publishing via data packages. Alongside this, HSDS does not have a strong concept of a single “top-level object”, unlike other JSON standards which can confidently model based on e.g. “a procurement” (OCDS) or “a grant” (360Giving). id fields therefore become important features of each object definition to allow for multiple representations based on the needs of the data user.

Another benefit of id fields for the organization_identifier object is that publishers can attach multiple organization identifiers to an organization. Different publishers may identify the same organization using different schemes, so having multiple organization identifiers therefore helps with interoperability between HSDS data sets as well as with other standards. Other standards also take this approach e.g. in OCDS, the organization object has an identifier field but also has an array of additionalIdentifiers for the same reason.

For data users analyzing via spreadsheets and publishers publishing via data packages, this id field also makes it possible to represent the one-to-many relationship between organizations and identifiers.
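
The role of the row-level id in converting between formats can be sketched as a round trip between tabular rows and nested JSON. This is a simplified illustration, not the HSDS conversion tooling; the `nest` and `flatten` helpers, the short ids (real data would use uuids), and the nested key name `organization_identifiers` are assumptions.

```python
from collections import defaultdict

organizations = [{"id": "org-1", "name": "Example Food Bank"}]
organization_identifiers = [
    {"id": "ident-1", "organization_id": "org-1",
     "identifier_scheme": "GB-COH", "identifier": "01234567"},
    {"id": "ident-2", "organization_id": "org-1",
     "identifier_scheme": "GB-CHC", "identifier": "7654321"},
]

def nest(orgs, identifier_rows):
    """Attach each identifier row to its organization (tabular to nested)."""
    by_org = defaultdict(list)
    for row in identifier_rows:
        child = {k: v for k, v in row.items() if k != "organization_id"}
        by_org[row["organization_id"]].append(child)
    return [
        {**org, "organization_identifiers": by_org.get(org["id"], [])}
        for org in orgs
    ]

def flatten(nested_orgs):
    """Recover the identifier table from nested organizations."""
    return [
        {**child, "organization_id": org["id"]}
        for org in nested_orgs
        for child in org["organization_identifiers"]
    ]

nested = nest(organizations, organization_identifiers)
print(nested)
print(flatten(nested) == organization_identifiers)
```

Each identifier keeps its own id in both representations, which is what lets one organization carry several identifiers and still be written back to a table row by row.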

Ultimately, as HSDS continues to focus on JSON publication for future versions of the standard, we expect discussions around top-level objects to play out constructively which may mean that this is not such an issue; however we will still need to cater for spreadsheet analysis as well as being able to apply multiple identifiers to an organization.

Location updates

Thanks for raising the concern about what3words. This echoes other discussions, and we will be removing the mention of what3words from the documentation due to community feedback.

In terms of your suggestion on a closed list of third party schemes for location identifiers, we agree that it is much better to standardize this data by restricting options. This also has the benefit of interoperability between datasets which include location information from the same schemes.

We could investigate including this as an optional feature in a future MINOR update, to support backwards compatibility with 3.0. From there, the next MAJOR version of HSDS could afford to be more decisive in restricting or standardizing these identifier schemes.

It is worth noting at this point that HSDS “profiles” may apply additional restrictions which reflect the needs of particular publishers and users. This may provide a way to address this in the short-term and feed into the discussions on how to approach this in future versions of HSDS.

In the long term, we must balance the benefits of standardizing the third party identifier schemes against the work required to decide on an initial list of these schemes, and then to manage this list. As noted, what3words can be considered problematic and we’d want to ensure that any scheme we “endorse” by means of inclusion is appropriate. Analysis of the HSDS corpus may provide some clues as to which schemes are used the most in practice. This would give us a good starting point for engaging the community around this topic for future upgrades.

If we did restrict the use of third party identifier schemes we’d be keen to ensure that this didn’t pose a barrier for some publishers. Some areas (geographic or professional) may not have access to collect or produce data in certain schemes. We’d also want to avoid accidentally creating a list of schemes that could be framed as being too US/Europe centric.

In practice, this may not turn out to be a problem but we’d welcome collaboration to learn from your experiences and the experiences of others to avoid this.

Funding

This can be considered for a future version update, and perhaps added as an optional feature for a MINOR update subject to the priorities of the HSDS community.

One barrier to publishing this data is that it can be difficult for some HSDS publishers to provide this themselves or gain access to it. This is because some may see it as not directly relevant to service discovery. Nevertheless, it would be straightforward to include an optional feature to describe the funding organization as well as the recipient organization in a future update of HSDS.

Longer-term, there are exciting opportunities here for interoperability with various other open data standards including 360Giving. If a service is known to be funded via grants then it makes sense to create links between data sets by referencing e.g. 360Giving grant identifiers. The same can be said for services funded through procurements and linking with OCDS datasets. This supports a fuller picture of service funding and drawing lines to evidence impact of spending on services. This is a larger piece of thinking which will require time to get correct, but we would be excited to approach this for future versions of the standard.

Thanks, @mrshll.

It seems like all of these issues can be addressed either with minor clarifications to the documentation, or by building upon 3.0 in future versions.

Eager to hear any critical responses from @marion.galley or others. Further feedback can help us shape the docs in the near-term and shape our agenda for 3.1 and beyond.

In the meantime, it seems like we’re ready to consider version 3.0 approved and ready for use, pending a handful of final adjustments that are planned for the documentation which you can see in our project board here.

Thanks so much to everyone who contributed to this process! We’ll be reconvening our workgroup shortly to address questions about tooling and our process moving forward. Let me know if you have any questions or suggestions in the meantime.

:white_check_mark: :pray: :tada: :partying_face: :confetti_ball: :raised_hands: :stars:

Hi @devin,

JTLYK this is now done - See SwaggerUI
