Is this user story clear and precise?
a. If not, can you offer a clear, concise alternative suggestion?
(Once we’ve landed on the story formulation:) How important is it? Let’s start with a 5-point scale:
a. Need to have this
b. Want to have this
c. Nice to have this
d. Don’t care about this
e. This should not be built under any circumstances
@kathrynods: Add definitions for publisher/developer in the proposal
@kathrynods: Write a ticket for publisher metadata at a record level
Approval and implementation process
No objections to the use of Google Forms to vote
No objections to ODS deciding on an approach to implementation (batching vs standalone upgrades)
Action
@kathrynods: Move from the ‘revision and consensus’ to ‘implementation’ phase once all votes are cast.
Handling Confidential Addresses in HSDS 3.0
The committee discussed various aspects of privacy/confidentiality that were not included in the 3.1 proposal.
It was agreed that, due to the complexities and nuances of the subject, a dedicated conversation with a broader range of stakeholders is needed to take this forward.
Action
@TechnicalCommittee: is anyone willing to take the lead on this? (If not, GB will take this on after the federation conversation.)
TBD (see above): Set up a separate/dedicated conversation regarding privacy
Versioning
Update on changes and deprecation of V2.0 documentation
Backwards compatibility considerations were discussed but agreed not to be an issue
Update on progress and testing of the UK validator, website and dashboard. Due to launch on 24th March.
Discussed ways to test other profiles, to determine the validator’s utility beyond a UK context once it’s live.
Over the next 6 months there will be work to support, maintain and iterate, align with the international group, and support UK engagement.
Action
@MikeThacker: To keep the group updated on further developments
Demo of Taxonomy/Terminology Alignment Tool
Mike has created a video showing how Infoloom Networker relates to Open Referral and how it ‘Provides the basis for something which somebody might want to develop into a full-blown tool, for both maintaining local taxonomies in a local context and maybe also national taxonomies which can be used for data that’s aggregated from multiple sources’
Unfortunately, we are not in a position to fund the development of this tool
The new UK site and validator is live and supports the original UK version 1 (JSON) and 3.0. It is stricter than before, so things are failing that weren’t previously.
Access to beta development of the tool for testing. Skyler has an endpoint that can be used.
TPXimpact contract has been extended for 6 months.
They are hoping for closer alignment and knowledge sharing with the global community.
Feedback on the ‘crude export to spreadsheet’ tool is being used to revamp it and an MVP for a potential new tool is being mapped out.
Discussed the distinction between schema and data compliance.
Data quality use cases raised. Determined that these have been covered within the existing use cases.
Actions
@mrshll (others are welcome) to meet with Greg and Sasha to input on use cases
@skyleryoung to input on the offline validation user stories
Confidentiality
Discussed specific use cases (e.g. DV shelters) and generalized use cases (e.g. privacy and permissions)
Considered a framing question for a dedicated session and how this relates to consent and liability considerations.
Action
@bloom to put out a forum post with an invite to a dedicated session
Metadata API
Discussed David’s work tracking extensive metadata and potential scalability and efficiency concerns.
Identified a need to feed this, along with Skyler’s technical insights, into ongoing strategic discussions about effective federation, and to move Katherine’s metadata proposal forward.
Feedback from David is that the current metadata table is quite robust/flexible.
Discussed the potential for guidelines on how to use the metadata table rather than change the specification (a rough sketch appears at the end of this list).
Skyler suggested an extension for multi-tenant architecture
Airtable documentation will not be added to the main documentation as implementation details are very specific and cannot be standardized.
Potential for tutorials or how-to documentation for alternative methods (e.g., SQL dump, PHP/Python library) to help get people started, but not for providing implementation-specific details.
Matt shared knowledge of how other standards approach this: generally a strict line around reference documentation, sometimes a tools library that includes publishing and development tools, and then a blog or website providing guidance on how to bootstrap a system that speaks their standard.
Agreed there’s a demand for solutions to help small organizations that can’t afford large build-outs.
A resource information exchange layer was proposed for small organizations to trade data using plug-and-play solutions.
Raised that the concept of a messaging platform needs further discussion.
Acknowledged the need for deploying modular datasets with exchange infrastructure, which is a bigger issue than just documentation.
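To make the “guidelines rather than specification changes” point concrete, the existing metadata table can already record who changed what and when. A rough sketch follows; the values are invented, and the authoritative field list is in the HSDS reference documentation.

```python
# Rough sketch of a row in HSDS's existing metadata table recording the provenance of one change.
# Values are invented; consult the HSDS reference documentation for the authoritative fields.
metadata_row = {
    "resource_id": "ac148810-d857-441c-9679-408f346de14b",  # the record this entry describes
    "resource_type": "service",
    "last_action_date": "2025-06-01",
    "last_action_type": "update",
    "field_name": "name",
    "previous_value": "Food Bank",
    "replacement_value": "Example Food Bank",
    "updated_by": "ABC",  # e.g. a steward organization's code
}
```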
Action:
@Dan-ODS: Table this for the next meeting, with a view to deciding whether we agree on any concrete actions
Rendering documentation in Markdown – for LLMs – is this useful?
Docs are already in Markdown but presented in HTML. No action required.
Sorry for the delay posting this summary, but please see below for the key discussion points and actions from the June meeting.
Also, the meeting recording and transcript can be found here
Key Discussion Points & Actions
Metadata & Data Attribution
KL described a federated system with four source applications and varying data-editing capabilities. KL’s overview of the problem space aligns with the concept of “Data Lineage”: recording and understanding how various fields are the result of operations on other datasets.
Record-level metadata is critical: it includes the source database, steward organization (via unique three-letter codes), public URLs, and change request links (a rough sketch follows this list).
Importance of flexibility in metadata systems due to vendor constraints and lack of standard APIs
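As a purely illustrative sketch of the record-level metadata described above, it might look something like the following; all field names and values are invented for this example and are not defined by HSDS.

```python
# Hypothetical record-level metadata for a single service record in a federated system.
# All field names and values are invented for illustration; they are not defined by HSDS.
record_metadata = {
    "source_database": "statewide-resource-db",          # which source application the record came from
    "steward_code": "ABC",                                # unique three-letter code for the steward organization
    "public_url": "https://example.org/services/123",     # public URL where the record can be viewed
    "change_request_url": "https://example.org/services/123/suggest-edit",  # where to submit corrections
}
```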
Registry of Data Sources
A historical system of unique organization codes allows tracing data origins and enforcing accountability.
There’s strong community interest in a formal registry of data sources, potentially aligning with federation goals.
Record Ownership & Permissions
Systems need to accommodate many-to-many relationships between data contributors and records.
Legal agreements often define these relationships and their respective responsibilities.
Traceability at both feed and record level is important.
HSDS Schema Challenges
The HSDS schema lacks a primary “top-level object” for its records (e.g. every record is a service) and is instead normalized, making it hard to anchor metadata in the model; does it go everywhere or only on a few of the object types?
Discussion leaned toward adopting per-record metadata, inspired by Katherine’s model.
Possible evaluation of whether a “parties array” to define organisational roles and relationships would be suitable for HSDS (see the sketch below).
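For reference, a “parties array” of the kind used by some other open data standards might look roughly like the sketch below if attached to an HSDS record; this is illustrative only and not part of the specification.

```python
# Illustrative "parties array" attached to a record -- not part of HSDS.
# Each party names an organisation and the role(s) it plays in relation to this record.
parties = [
    {"name": "Example 211", "roles": ["publisher"]},               # publishes the feed/API
    {"name": "Community Help Centre", "roles": ["steward"]},       # maintains the record's content
    {"name": "Example Food Bank", "roles": ["service_provider"]},  # delivers the service described
]
```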
Data Guides
Considered complementary to metadata, but they don’t address the original problems that KL highlighted.
Provide human-readable guidance for data imports and implementation.
KL clarified that data guides are not a harmful endeavor and wouldn’t work in contradiction to what she’s discussing; rather, they’re helpful for standardization and interpretation.
UK Validator
The UK validator supports HSDS profiles and can flag warnings/errors.
Differences in how errors are prioritized may stem from compliance pressures.
Open source and adaptable: instances can be customized to treat all issues as errors, if desired.
Use-Case Driven Validation
Proposal: validation criteria could vary by use case (e.g., replication vs. search); a rough sketch follows this list.
A research project is underway to develop a use case description specification, which could help shape this approach.
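A minimal sketch of what use-case-driven validation could look like, assuming invented rule sets (these are not proposed HSDS rules):

```python
# Hypothetical use-case-driven validation: the same record can pass for one use case
# and fail for another. Rule sets below are invented for illustration only.
USE_CASE_RULES = {
    "replication": {"required": ["id", "last_modified"]},   # syncing records faithfully
    "search": {"required": ["id", "name", "description"]},  # human-facing search results
}

def validate_for_use_case(record: dict, use_case: str) -> list[str]:
    """Return a list of problems for the given use case (an empty list means valid)."""
    rules = USE_CASE_RULES[use_case]
    return [f"missing '{field}'" for field in rules["required"] if not record.get(field)]

record = {"id": "svc-1", "name": "Example Food Bank"}
print(validate_for_use_case(record, "search"))       # -> ["missing 'description'"]
print(validate_for_use_case(record, "replication"))  # -> ["missing 'last_modified'"]
```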
Core Concepts
As highlighted by Matt Marshall:
Who owns the data feed?
What does the data look like (human-readable)?
How to process the data for standardization (machine-readable)?
Actions & Follow-ups
SY to share API key and ensure proper Swagger setup.
MM to research record-level metadata strategies and validation mechanisms.
GB & MT to follow up on validator testing with Michigan 211.
Group to discuss Diataxis documentation framework and revisit federation/how-to/confidentiality next time.
Poll on validation use case value to be an agenda item for the next meeting.
We need to reschedule the meeting on August 13th, because at least one committee member can’t attend and I think we’ll need as many committee members as possible in order to review and decide on two nominations for new members.
Jen Abels from Inform USA is a strong candidate to join.
Kate is unavailable on Wednesdays over the summer.
Devin’s attendance is limited due to new fatherhood.
The group needs a process for reaching out to new prospective members, nominating them, and getting approval from Kate, Skyler, and Mike.
Provenance and Open Lineage
Addressing many-to-many data exchange requires a more complex approach to provenance.
Dataset/API-level provenance can be handled via “data guides” (one-to-many publishing).
Record-level provenance will require specification extensions to accommodate many-to-many sharing.
OpenLineage is a potential standard for data lineage that warrants further investigation.
We must evaluate OpenLineage based on its usability/complexity, adoption by others (for interoperability benefits), and its scope (e.g., does it include a registry of API publishers and data sources?).
There’s a need for clarity on the scope of a “registry of data sources”: Is it for HSDS API publishers, individual data sources, or both?
Validation
The UK validator was initially designed for non-normative feedback based on political context and to work only against open API feeds.
However, the validator tool is more modular than initially thought, with a separate backend that provides a validation service.
The UK validator requires modifications to:
accept an API key for validating non-open APIs
load in schemas for other Profiles
Validating HSDS data is distinct from validating an API.
While the API is the main method for publication and distribution, there’s a need to validate data and assess its quality before it’s published on an open API.
Some implementations will not publish their data as open data at all, so validation tooling needs to account for this.
Validating an API is more complex than validating that data is structured correctly, due to things such as parameters.
In other standards, there are three mechanisms for data input to their validator: copy/pasting JSON, providing a URL to data (can be an API response or a download file link), or uploading a file.
API validation involves checking that required endpoints exist and each returns conformant data, as well as determining that the API behaves correctly with respect to parameters (a rough sketch follows this list).
The most broadly used mechanism for API keys is a bearer token as a header.
There’s a strong desire to avoid specifying API key formats within the core HSDS specification, instead treating it as non-normative guidance in “how-to” documentation for the validator.
There needs to be a clear distinction between a “validator” (checking compliance against the specification) and a “data quality tool” or “ongoing online monitor”: the latter tools can build off of the former, but we should avoid overcomplicating our notion of what “a validator” does.
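To illustrate the distinction between validating data and validating an API, here is a minimal sketch in Python. The base URL, API key, schema, “contents” wrapper, and “page” parameter are all assumptions made for the sake of the example; the real UK validator is a separate, more capable tool.

```python
import requests
from jsonschema import validate
from jsonschema.exceptions import ValidationError

BASE_URL = "https://api.example.org/hsds"  # hypothetical HSDS API base URL
API_KEY = "example-api-key"                # supplied out of band, never part of the spec
SCHEMA = {                                 # stand-in for the real HSDS service schema
    "type": "object",
    "required": ["id", "name"],
    "properties": {"id": {"type": "string"}, "name": {"type": "string"}},
}

# API keys passed as a bearer token in a header, as discussed above.
headers = {"Authorization": f"Bearer {API_KEY}"}

# Data validation: check each returned record against the schema.
resp = requests.get(f"{BASE_URL}/services", headers=headers, timeout=30)
resp.raise_for_status()  # part of API validation: the endpoint must exist and respond
for record in resp.json().get("contents", []):  # assumes a paginated "contents" wrapper
    try:
        validate(instance=record, schema=SCHEMA)
    except ValidationError as err:
        print(f"Record {record.get('id')}: {err.message}")

# API validation goes further than data validation, e.g. checking parameter behaviour.
page_2 = requests.get(f"{BASE_URL}/services", params={"page": 2}, headers=headers, timeout=30)
assert page_2.status_code == 200, "API should accept a 'page' parameter"
```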
Other Items
Metadata API/Data Guides and Federation remain important topics.
Confidentiality is a consideration for some implementers who fear competitors stealing open data.
How-to documentation and tutorials (Diataxis) are recognized as important but are perhaps not a priority in the coming months
Actions
Dan
Split out and archive 2024 and 2023 notes
Consider adding schedule data to a future agenda, potentially with a new proposal or clarification on the balance between simple and complex scheduling.
Understand OpenLineage’s scope and how it claims to address data lineage.
Assess its alignment with Open Referral’s needs.
Evaluate the “balance of complexity” – is the benefit of adoption worth any added complexity?
Research who else uses OpenLineage and if there are additional interoperability benefits.
Clarify the discrete sets of information that make up dataset-level and record-level provenance.
Discuss the scope of a “registry of sources” to determine if it should include all data sources or primarily API publishers.
Matt to continue exploring the modifications needed for the UK validator to accept API keys and load full international schemas.
Matt and Mike to discuss the “app store links” question, including referencing relevant documentation for handling multiple links and considering if the UK profile needs to be rebased from HSDS 3.0 to 3.1. Mike will be sent an email with URLs to the appropriate documentation.
Ensure how-to documentation for the validator clarifies API key usage as non-normative guidance, separate from the core specification.
Thanks for these notes, @Dan-ODS. A few minor corrections:
First off, I believe the Confidentiality issue is more about enabling certain kinds of data to be excluded from publication (like records about services whose locations are sensitive, and staff contact information).
Also, re Diataxis (wtf is that name), I personally think that developing tutorials and how-tos is a priority right after the validator! We should consider what kinds of topics might be the highest priority.
And, I don’t know if I said the Committee “needs a process” for nominating new people – I think there is a sufficient process in place, and it’s underway – although outreach to new prospective members should be encouraged.