Challenges with the `schedule` table for organizations in the USA

skyleryoung · September 14, 2023, 12:35am

I’ve been working with the National 211 Data Platform (NDP) in conjunction with data managers from several 211s towards creating, among other things, a 211 Profile for HSDS. We’ve hit a snag trying to use the schedule table with existing real-world data.

Many software systems where resource data originates provide a single plain text field for schedule data. It’s gross, but that’s the reality for now. Here’s a sample of real schedules provided by the NDP for reference:

&lt;u&gt;Women:&lt;/u&gt;Monday - Friday, 9:00am - 11:00am&lt;u&gt;Men:&lt;/u&gt;Monday - Friday, 2:00pm -
4:00pm&lt;u&gt;Meals:&lt;/u&gt;
Monday - Friday, 11:30am - 1:00pm&quot;
Monday - Friday, 8 a.m. - 5 p.m.
Mon-Thu 7am-5pm, every other Fri 7am-5pm
Mon-Fri - 6:30am to 6:00pm
Mon V - V; Tue V - V; Wed V - V; Thu V - V; Fri V - V;
Mon 8:30am - 5pm; Tue 8:30am - 5pm; Wed 8:30am - 5pm; Thu 8:30am - 5pm; Fri 8:30am - 5pm;
Mon 9am - 5pm; Tue 9am - 5pm; Wed 9am - 5pm; Thu 9am - 5pm; Fri 9am - 5pm;
Mon - Fri 8:00 am - 5:00 pm
Mon - Fri 8:00 am - 4:30 pm

211 data managers are drafting how they would like to be able to represent data, both in an interface and under the hood. I’ve been pushing for using the HSDS schedule schema under the hood, but there are two problems:

The source data will be retained in the description field for a schedule record, but where a single schedule requires more than one record (for example, Mon-Thu 7am-5pm, every other Fri 7am-5pm), that description will have to be duplicated across each record, which breaks the rules of normalization.
Further, where an entity has more than one schedule, each with different times on different days, it can be difficult or impossible to determine which schedule rows should be grouped together. We might consider matching by valid_from and valid_to values, but not all records will have that info, nor can we guarantee they will line up cleanly even when they are present.

The solution proposed by our work group is to move the current schedule fields into a table called hours, and then simplify schedule to (more or less) description, valid_from, and valid_to. There would be a 1:many relationship from schedule to hours, thereby allowing hours to a) be grouped and b) share a global description.

It’s an intuitive abstraction that looks similar to the pattern used for location and address.
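As a rough illustration of the shape being proposed, here is a minimal Python sketch. The class names, the `hours` list, and the sample IDs are assumptions made for illustration; the work group's actual draft may differ. `interval` is used here in the RRULE sense (every Nth week).

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Hours:
    """One recurrence row; holds the fields that sit on `schedule` today."""
    id: str
    freq: str                        # e.g. "WEEKLY"
    byday: str                       # e.g. "MO,TU,WE,TH"
    opens_at: str
    closes_at: str
    interval: Optional[int] = None   # e.g. 2 for "every other week"


@dataclass
class Schedule:
    """Simplified parent: source text plus validity, shared by all its hours."""
    id: str
    description: str
    valid_from: Optional[str] = None
    valid_to: Optional[str] = None
    hours: list[Hours] = field(default_factory=list)


# "Mon-Thu 7am-5pm, every other Fri 7am-5pm" needs two recurrence rows,
# but the source text is stored only once, on the parent record.
schedule = Schedule(
    id="sched-1",  # hypothetical ID
    description="Mon-Thu 7am-5pm, every other Fri 7am-5pm",
    hours=[
        Hours("hours-1", "WEEKLY", "MO,TU,WE,TH", "07:00", "17:00"),
        Hours("hours-2", "WEEKLY", "FR", "07:00", "17:00", interval=2),
    ],
)
```

The parent record answers both problems at once: the rows that belong together are the ones in `schedule.hours`, and the description is never repeated.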

@davidraznick suggested we could use attributes to group schedule records and add the description, but also acknowledged it’s a little hacky and wouldn’t negate the description duplication problem. Of more importance to me is that it’s not obvious; developers would have to know to look for attributes, and know what to do with them, whereas the table structure suggested above is self documenting.

I’m very interested in getting feedback from others about how you have solved these challenges with schedule data, and whether you think there is merit to abstracting hours a level below schedule.

@bloom @devin @PollyM @MikeThacker

devin · September 14, 2023, 1:10pm

We’ve experienced a similar need for different reasons.

Our users have used the schedule table effectively but they want a way to put a message at the top of the schedule.

They want to document their recurring hours (ex 9-5 M,W and 10-6 Tu,Th,F) but then want to be able to add a message (ex “Closed on Christmas and New Years”). They could use the schedule table to document this but it’s a lot more work than simply documenting recurring hours.

We’ve used an attribute-style approach to achieve this.

As for a technical solution, I think we should consider a third solution in which we use the existing schedule table to hold description and validity period data only and maybe an additional field to document that this field is not iCal valid data but for display purposes only. I’m not convinced that’s the best approach but I think it’s worth considering.

The other two technical solutions proposed thus far seem workable to me.

bloom · September 14, 2023, 1:20pm

Thanks to @skyleryoung and @devin for getting this conversation started.

This is one of our ongoing challenges, so for reference, I just want to point back to the thread where we’ve had the most discussion in the past (and which led to our adoption of the RRULE framework in v2), along with other open issues related to schedule in the GitHub repo.

skyleryoung · September 14, 2023, 4:02pm

Thanks for the links @bloom . I missed in the documentation that schedule is using RRULE. I feel stupid for missing that, but perhaps calling that out a little more boldly and linking to a guide where additional parsers are recommended would help reduce the barrier to entry that RRULE entails.

I’ll note that the recommendations made by the 211 workgroup don’t replace RRULE, they just add a layer above it for more intuitive grouping and labeling. After reading your GitHub thread, I would recommend adding another field to the higher-level schedule table: type with enumerated values of event and hours. Again, that follows the pattern of the location and address relationship.

@devin Have you run into the use case of needing to have more than one schedule active for a service record?

mrshll · October 25, 2023, 3:31pm

Wow this sounds like a gnarly problem. From a standard development perspective, those changes to the schedule object would result in a MAJOR version upgrade (this is not to say that we shouldn’t do this, or that’s not desirable, though!)

I fundamentally agree with you that this needs reworking, and that abstracting the actual schedule details down a level is a good idea. However I think I disagree with the exact approach outlined here. At least for the moment.

My take on this, coming from a data modelling and JSON perspective, is that the issue stems from the way that other objects attach multiple schedules directly as an array without accommodating this eventuality. service.schedules[] looks good on the surface, but results in the exact problem you describe.

Looking at the schedule object schema, and at the JSON examples in the upcoming guidance, I think the schedule object itself is pretty much internally coherent. I wouldn’t want to remove valid_from and valid_to from this model.

What I think we’re lacking is a sort of “wrapper” which sits between e.g. service and the array of schedule objects, allowing the data to provide a top-level description of the overall schedule and then have the individual schedule details underneath inside the array. This “wrapper” object could be normalized to keep tabular representations happy, with its own id, although each object would only be expected to have a 1:1 relationship with the “wrapper”.

This basically ends up at the same structures you suggest, but approaches it from the other angle of sliding something in-between objects and an array of schedule.

There would have to be a discussion as to what to call this structure, and what to call the field which is the array of schedule. In the example below I’ve suggested schedule_details for now, but this should be picked apart.

If we adapt the example from this page, we’d end up with something like this (note that the schedule objects don’t actually line up with the description! This is just for outlining the shape of the model):

{
  "services": [
    {
      "id": "ac148810-d857-441c-9679-408f346de14b",
      "name": "Community Counselling",
      "description": "Counselling Services provided by trained professionals. Suitable for people with mental health conditions such as anxiety, depression, or eating disorders as well as people experiencing difficult life events and circumstances. ",
      "status": "active",
      "schedule_details": {
        "id": "f6d405b1-f91c-4bb2-ae66-f3ca73963e78",
        "description": "&lt;u&gt;Women:&lt;/u&gt;Monday - Friday, 9:00am - 11:00am&lt;u&gt;Men:&lt;/u&gt;Monday - Friday, 2:00pm - 4:00pm&lt;u&gt;Meals:&lt;/u&gt; Monday - Friday, 11:30am - 1:00pm&quot; Monday - Friday, 8 a.m. - 5 p.m. Mon-Thu 7am-5pm, every other Fri 7am-5pm Mon-Fri - 6:30am to 6:00pm Mon V - V; Tue V - V; Wed V - V; Thu V - V; Fri V - V; Mon 8:30am - 5pm; Tue 8:30am - 5pm; Wed 8:30am - 5pm; Thu 8:30am - 5pm; Fri 8:30am - 5pm; Mon 9am - 5pm; Tue 9am - 5pm; Wed 9am - 5pm; Thu 9am - 5pm; Fri 9am - 5pm; Mon - Fri 8:00 am - 5:00 pm Mon - Fri 8:00 am - 4:30 pm",
        "schedules": [
          {
            "id": "ceee812e-6ed2-4fe1-87d1-2c88c2d6b8b3",
            "description": "Monday to Thursday, 9am to 12pm",
            "dtstart": "2020-04-01",
            "valid_from": "2020-04-01",
            "valid_to": "2020-12-20",
            "freq": "WEEKLY",
            "byday": "MO,TU,WE,TH",
            "opens_at": "09:00:00Z",
            "closes_at": "12:00:00Z"
          },
          {
            "id": "887404d7-6479-48da-bb86-b1592da85aef",
            "description": "Monday to Thursday, 3pm to 5pm",
            "dtstart": "2020-04-01",
            "valid_from": "2020-04-01",
            "valid_to": "2020-12-20",
            "freq": "WEEKLY",
            "byday": "MO,TU,WE,TH",
            "opens_at": "15:00:00Z",
            "closes_at": "17:00:00Z"
          },
          {
            "id": "23f3f9c6-431d-4e85-b59b-309d6d70274e",
            "description": "First Saturday of the month from July-Nov, 9am-5pm",
            "dtstart": "2020-07-04",
            "until": "2020-11-07",
            "freq": "MONTHLY",
            "byday": "1SA",
            "opens_at": "09:00:00Z",
            "closes_at": "17:00:00Z"
          }
        ]
      }
    }
  ]
}

When this is serialized as tabular data packages, you’d basically end up with Service having a 1:1 relationship with Schedule_Details, and then Schedule_Details has a 1:many relationship with Schedule.
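A minimal Python sketch of that flattening, assuming the nested shape from the example. The foreign-key column names `service_id` and `schedule_details_id`, the sample IDs, and the `flatten` helper are hypothetical and only there to show the relationships.

```python
import json

# Trimmed-down version of the nested example (IDs shortened for readability).
doc = json.loads("""
{
  "services": [
    {
      "id": "svc-1",
      "schedule_details": {
        "id": "sd-1",
        "description": "Monday to Thursday, 9am to 12pm and 3pm to 5pm",
        "schedules": [
          {"id": "s-1", "freq": "WEEKLY", "byday": "MO,TU,WE,TH",
           "opens_at": "09:00:00Z", "closes_at": "12:00:00Z"},
          {"id": "s-2", "freq": "WEEKLY", "byday": "MO,TU,WE,TH",
           "opens_at": "15:00:00Z", "closes_at": "17:00:00Z"}
        ]
      }
    }
  ]
}
""")


def flatten(document):
    """Split the nested document into two row lists linked by foreign keys."""
    details_rows, schedule_rows = [], []
    for service in document["services"]:
        details = service.get("schedule_details")
        if details is None:
            continue
        # One wrapper row per service (the 1:1 side).
        details_rows.append({
            "id": details["id"],
            "service_id": service["id"],
            "description": details.get("description"),
        })
        # Many schedule rows per wrapper (the 1:many side).
        for sched in details.get("schedules", []):
            schedule_rows.append({**sched, "schedule_details_id": details["id"]})
    return details_rows, schedule_rows


details_rows, schedule_rows = flatten(doc)
```

Each row in `schedule_rows` points back at its wrapper, so grouping survives the round trip to tables.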

Of course, what we’re forgetting is the current schedule object doesn’t require any fields other than schedule.id. It’d be valid at a schema level to provide a single schedule with a populated schedule.id and schedule.description, and simply omit the other fields.

This technically breaks the model wherein a schedule is supposed to represent a single schedule, but it’s valid from a schema perspective. To me this highlights the need to re-work it and I’m glad the community is trying to find ways to do it properly, but technically you’d not get invalid data by doing this!
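To make that concrete, here is a small Python sketch. It takes the claim above (only `schedule.id` is required) as given and does not consult the real schema; the `missing_required` helper and the sample record are hypothetical.

```python
# Assumption taken from the post above: only `id` is required on a schedule.
REQUIRED_SCHEDULE_FIELDS = {"id"}


def missing_required(record):
    """Return the required field names that are absent from the record."""
    return sorted(REQUIRED_SCHEDULE_FIELDS - set(record))


# A description-only schedule: no freq, byday, opens_at or closes_at.
description_only = {
    "id": "ceee812e-6ed2-4fe1-87d1-2c88c2d6b8b3",
    "description": "Mon-Thu 7am-5pm, every other Fri 7am-5pm",
}

problems = missing_required(description_only)
```

Under that assumption `problems` is empty, so the record would pass a required-fields check even though it carries no machine-readable hours.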

mrshll · November 9, 2023, 2:41pm

@skyleryoung I was just reading the minutes of the Standing Technical Meeting and think we should jump on a call at some point in the near future to ensure we’re on the same page as each other, since I don’t speak the language of tables very well and I’m not sure how the API would break since if we updated the JSON model, the API should be returning this.

skyleryoung · November 9, 2023, 5:52pm

@mrshll I completely agree that inserting a normalized table between schedule and service is a Major version upgrade. I believe I was informed in another post that I can just add that in a Profile, which made me very queasy from an engineering perspective, but I’ll address that in a different post.

Your suggested schema is very close to what I imagine (and what was drafted by Andrew Benson at the National 211 Data Platform) with two exceptions:

schedule_details (naming things is hard) needs to be an array. One of the driving use cases is a service that has two seasonal schedules with different hours. service needs a one-to-many with schedule_details, and obviously schedule_details requires a one-to-many with schedule where the RRULEs are.
The valid fields should be duplicated at the schedule_details level. This actually isn’t a deal breaker for me from a standards perspective, but per the use case above and all the data I’ve seen so far, valid time ranges are always applied to an entire group of RRULEs, so the convenience of using those at the higher level would be nice. However, I’m not advocating for removing those fields from the schedule level; the RRULE specification should remain completely intact for that table. It adds an additional level of granularity that’s potentially useful anyway.
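A minimal Python sketch of how the two modifications could be consumed together: schedule_details as an array, with valid_from and valid_to on each group. The group IDs, dates, and the `active_groups` helper are made up for illustration, and the nested schedules are left empty for brevity.

```python
from datetime import date

# Hypothetical service with two seasonal groups.
schedule_details = [
    {"id": "summer", "description": "Summer hours",
     "valid_from": "2024-05-01", "valid_to": "2024-09-30", "schedules": []},
    {"id": "winter", "description": "Winter hours",
     "valid_from": "2024-10-01", "valid_to": "2025-04-30", "schedules": []},
]


def active_groups(groups, on):
    """Return the groups whose validity window contains the date `on`.

    A missing valid_from or valid_to is treated as an open-ended bound.
    """
    result = []
    for group in groups:
        start = group.get("valid_from")
        end = group.get("valid_to")
        if start and date.fromisoformat(start) > on:
            continue
        if end and date.fromisoformat(end) < on:
            continue
        result.append(group)
    return result


july = active_groups(schedule_details, date(2024, 7, 4))
```

With the dates on the group, a consumer can pick the right season before it ever has to interpret an RRULE.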

What are your thoughts on these two modifications?

I’ll note on a related topic, that our community could really use some documented guidelines for how the schedule table is expected to be used for “business hours” style schedule data. RRULEs are great for event style data, but introduce some ambiguity for business hours data.

Here’s an example of ambiguity: when I have multiple days in my schedule with identical hours, do I collapse those to one record (which is allowed by the specification), or do I always create one schedule record for every day of the week? How I parse these two methods (and there are others!) will be quite different, which means that end-users cannot create a consistent method for parsing schedule data in HSDA at this time, even though it’s a highly specific and well defined data schema.
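A small Python sketch of the two encodings and one possible way to reconcile them. The `hours_by_day` helper is hypothetical, and it only handles plain weekly records: ordinal byday values such as 1SA, interval, and validity dates are deliberately ignored.

```python
from collections import defaultdict


def hours_by_day(records):
    """Expand simple weekly records into a {day: [(opens_at, closes_at)]} map."""
    by_day = defaultdict(set)
    for rec in records:
        for day in rec["byday"].split(","):
            by_day[day.strip()].add((rec["opens_at"], rec["closes_at"]))
    return {day: sorted(spans) for day, spans in by_day.items()}


# Encoding 1: identical hours collapsed into a single record.
collapsed = [
    {"freq": "WEEKLY", "byday": "MO,TU,WE", "opens_at": "09:00", "closes_at": "17:00"},
]

# Encoding 2: one record for every day of the week.
per_day = [
    {"freq": "WEEKLY", "byday": day, "opens_at": "09:00", "closes_at": "17:00"}
    for day in ("MO", "TU", "WE")
]

same_hours = hours_by_day(collapsed) == hours_by_day(per_day)
```

Both encodings expand to the same per-day map, but every consumer has to know to do this expansion, which is the kind of thing documented guidelines could settle.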

I would love to chat about this. I’ll hit you up on Slack to schedule a call. Thanks!

@bloom @devin @MikeThacker

Dan-ODS · January 16, 2024, 3:31am

If there is anyone in the community interested in discussing this matter, there is a meeting taking place between @mrshll and @skyleryoung on Thursday, 18th January 4:30 – 4:55pm (GMT)

@davidraznick - I think you’d have a lot to contribute if you were able to spare the time?

For anyone keen to join, please @skyleryoung in response to this message or email sky@sitesavvy.com and Skyler will get you added to the invite.

Many thanks,

Dan

mrshll · January 19, 2024, 9:09am

Just noting that Skyler and I had a really productive chat yesterday which has resulted in the following three issues opened on the specification repo:

#492 – schedule groups. This is likely a priority and can be handled many ways some of which would be appropriate for 3.1 and some would be more radical and should be considered for a 4.0 update.
#493 – how should we model Events in the standard?
#494 – redesigning schedule based on the outcome of the previous two discussions.

I’ll be speaking with Dan soon to get the first of these added to the project board since it’s a present need in the community and we can work to start addressing it.
