HSDS Transformer Use Cases

Hello Everyone!

Stevens Blueprint is excited to share another update on our progress with the HSDS Transformer. We’ve been focused on testing and documentation, and we’ve made progress on some newer features that we would love your feedback on.

Additionally, we’d like to say a huge thank you to Skyler, Sasha, and Greg for their continued support and feedback! It’s been incredibly helpful as we work through some of the project’s major pain points.

We’re now looking for additional testers. If you’re a potential user of the transformer and would be willing to try it out, we would greatly appreciate your help. We’ve attached an overview video and a link to the repository. If you’re able to test the tool, please let us know and share any feedback you may have. All of these resources and more can be found in the project document.

Recent technical updates include:

  • Logging for each transformation run on a data set

  • Optional generation of version 5 UUIDs (UUIDv5) during transformation
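
For anyone unfamiliar with version 5 UUIDs: they are derived deterministically from a namespace and a name, so the same source record yields the same ID on every run. Below is a minimal sketch of the idea using Python's standard `uuid` module. The namespace, the function name `make_record_id`, and the `table:source_id` naming scheme are assumptions made for illustration, not necessarily what the transformer does.

```python
import uuid

# Hypothetical namespace for illustration only; a real deployment would pick its own.
EXAMPLE_NAMESPACE = uuid.uuid5(uuid.NAMESPACE_DNS, "example.org")


def make_record_id(table: str, source_id: str) -> str:
    """Return a deterministic version 5 UUID for a source record.

    The same (table, source_id) pair always produces the same UUID,
    so re-running a transformation does not change record IDs.
    """
    return str(uuid.uuid5(EXAMPLE_NAMESPACE, f"{table}:{source_id}"))


if __name__ == "__main__":
    first = make_record_id("organization", "42")
    second = make_record_id("organization", "42")
    print(first == second)  # the two IDs are identical across calls
```

Including the table name in the hashed string keeps records from different tables that share a source ID from colliding.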

Currently, we are working on implementing a couple of newer features, including:

  • Custom-defined Python features, so that developers can implement particular data operations on datasets

  • Mapping multiple source fields to one HSDS field

  • Addressing the service_at_location edge case
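
To make the "multiple source fields to one HSDS field" item concrete, here is a rough sketch of one way such a mapping could behave: several source columns are joined into a single target value, skipping blanks. The function `combine_fields`, the column names, and the separator are all hypothetical and are not taken from the transformer's codebase.

```python
from typing import Mapping, Sequence


def combine_fields(
    row: Mapping[str, str],
    source_fields: Sequence[str],
    separator: str = ", ",
) -> str:
    """Join several source columns into one target value.

    Missing or blank source values are skipped so the result
    has no dangling separators.
    """
    parts = [(row.get(name) or "").strip() for name in source_fields]
    return separator.join(part for part in parts if part)


if __name__ == "__main__":
    # Illustrative source row with made-up column names.
    source_row = {"street": "1 Main St", "unit": "", "city": "Hoboken"}
    # The combined value could then be written to a single HSDS field.
    print(combine_fields(source_row, ["street", "unit", "city"]))
```

Whether blanks should be skipped, and which separator to use, would likely need to be configurable per mapping; feedback on that is welcome.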

And most excitingly:

  • A GUI for non-technical users to access and use the tool

  • A reverse transformation tool, which takes in HSDS data (or data in other schemas) and transforms it into a given output schema

  • Generating mapping templates from schemas to ease transitions between versions of HSDS

With these new updates, we wanted to ask a few questions of the people who would use this tool the most:

  • Currently, we plan for the GUI to remove the need to work directly with Git, Python, etc. However, the mapping process would remain the same. Is that aligned with your use case?

  • What is the use case for a reverse transformation tool (HSDS -> some other schema)? Who would be the ideal user of this tool?

  • Is there a use case for having the transformer as part of a pipeline that accepts source data via an API rather than CSV files?

  • Currently, we go from flat CSVs to nested JSON. Is there a use for transforming from JSON to JSON?

  • What aspects of a transformation would you like to see recorded in a log after the transformation process?

  • What would you like to see in this tool that is not currently implemented or considered?

Thank you, and we hope to hear from you all!