I’m a senior technologist who’s taking an AI Engineering Bootcamp (https://aimakerspace.io/) with a friend. This is my LinkedIn: https://www.linkedin.com/in/nilay320/
As part of our capstone project, we want to create an AI-based system that helps social workers and counselors find resources for the people in need they work with: recovering addicts, people struggling with mental health issues, people coming out of jail, really anyone (starting in the USA or a local region of the USA; it’s flexible). The idea is based on the pain points of a friend who worked in this industry.
Can you help me identify whether there are sufficient sources of online data, real-time or static, available via web scraping or APIs, to make this viable? I’d also welcome ideas for datasets we could use for the fine-tuning part of our project.
The eventual goal is to build on this to help with a real need, but want enough to get started with for our project. Interested in hearing your thoughts on if you think this is doable and a good idea, too! Thank you!
Hi there - thanks for reaching out, and for your interest.
I get this question a lot, and I actually strongly encourage people interested in building AI tools to step back from the “direct to consumer” idea and rethink.
There are many, many tools that try to connect people to resources, including various efforts to “do it with AI” – and none of them have earned much trust, in large part because this is, first and foremost, an information supply-side problem.
Your question does acknowledge that, so I commend you for asking about the supply side. You might find a reliable partner somewhere who does curate a trustworthy supply of information, though it will likely be highly localized; I am not aware of any consistently structured, reliable, actively maintained sources of directory information at a national scale. At least none that can merely be “fine-tuned” into reliability for something as fundamentally non-deterministic as this generation of LLMs.
We have lots of documentation of all this; you don’t have to take it as just “this guy’s opinion” – happy to point you to materials that can elaborate, if you look at our website and have any questions.
Meanwhile, there are some interesting projects happening on the supply side that are using AI tools to help the data curators produce more reliable data supply. Perhaps someone like @CheetoBandito would be interested in connecting with you to discuss the current state of responsible efforts to use these unreliable tools in potentially useful ways.
Thank you! Can I set up some time to talk to you?
Sure. Calendly
Hi, Nilay. Adding to @bloom’s take (and perhaps opening myself up for criticism), I agree that AI can’t yet be tasked with delivering ‘trustworthy’ information, unless (and perhaps even if) connected to an actively maintained local database. At the same time, I believe there is room for chat and relatable AI conversation to improve a user’s ability to find and receive support.
At Streetlives, our YourPeer platform runs off a small but accurate directory (around 2,700 services) built on human data collection and validation (for transparency and a better user experience, we show users when service info was last validated). Our lived-expert team gathers and manages all the content, autonomously curating and publishing the service information. We trust them to interrogate providers from the perspective of the people looking for assistance and opportunities.
During extensive participatory research we heard that one of the most desired services was advisor-assisted navigation and referrals (a Case Manager or Peer Advisor you can contact through the site). The city provided funding to hire a cohort of Peer Advisors, but then cut it. Continually shrinking funding for vital human roles had us thinking about whether AI could fill gaps, ideally by making existing roles more productive and helping secure baseline funding. We want to try AI chat to support self-navigation on YourPeer, plus AI Advisors that can escalate to humans, and we are scoping both (I can feel Greg shaking his head in concern and despair).
We think hard about ethics and the complexity of building tools to connect people with services in the current climate. We have challenging privacy waters to navigate in handling conversations that include personal information.
So far, we have avoided any need for people to share PII with us. However, once chat is in the loop, there is no doubt that will change, so we’re looking at ways to prevent that from becoming a possible source of harm for them.
Additionally, on the funding and system side, we are asked to identify how many successful service referrals we create (i.e., did the person connect with the service they sought on the platform, and did that service help them). Since we can’t definitively do that without tracking and maintaining records on individuals, we look to other ways to share the value of our work, which we infer from analytics and user testimonials.
We want to help more people find good services, so we’ll continue to explore ways to enable through-platform referrals and feedback while maintaining user anonymity.
For a community to trust our directory we want to consistently give them a relatable, reliable, safe experience, which means incredible accuracy and protecting their personal information. If AI can help accomplish this, it will be a great benefit.
Lastly, thanks, Greg, for lifting up @cheetobandito. I’ll reach out to them, too.
I’ll just add that I think a lot depends on what we’re talking about when we’re talking about “AI.”
LLMs are demonstrably unreliable, especially for purposes like these: marginalized user needs, context-sensitive information, low tolerance for the risk of false outputs. And their performance is not, in fact, improving over time. Could that change eventually? Maybe. (But many of the most confident voices claiming it will have already been proven wrong!) So it seems that any responsible approach would simply prioritize protecting users from, let’s say, flamboyant and erratic techno-adventurism.
Could LLMs still be useful for lower-stakes, less-sexy tasks like aiding resource data managers in resource data management? Plausible!
Could chatbots still be deployed in not-totally-irresponsible ways, for users in specific constrained contexts, working from controlled decision trees rather than massive models, designed for careful triage to human support? Also plausible!
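To make the “controlled decision tree” idea concrete, here is a minimal sketch (all node names, prompts, and replies are hypothetical, not any real deployment): every response is pre-written and reviewed, the bot never generates text, and anything unrecognized routes straight to a human.

```python
# Hypothetical sketch: a chatbot driven by a fixed decision tree instead of
# free-form LLM generation, with explicit escalation to a human advisor.

from dataclasses import dataclass, field


@dataclass
class Node:
    prompt: str                                   # pre-written text shown to the user
    options: dict = field(default_factory=dict)   # user answer -> next node key
    escalate: bool = False                        # hand off to a human at this node


# Every reply below is authored and reviewed by staff; nothing is generated.
TREE = {
    "start": Node(
        "What do you need help with? (housing / food / other)",
        {"housing": "housing", "food": "food", "other": "human"},
    ),
    "housing": Node(
        "Do you need shelter tonight? (yes / no)",
        {"yes": "human", "no": "housing_info"},
    ),
    "housing_info": Node("Housing services validated this month: ..."),
    "food": Node("Food pantries validated this month: ..."),
    "human": Node("Connecting you to a Peer Advisor now.", escalate=True),
}


def step(node_key: str, user_reply: str) -> str:
    """Advance one step; any unrecognized input triages to a human."""
    node = TREE[node_key]
    return node.options.get(user_reply.strip().lower(), "human")
```

The key design choice is the fallback in `step`: when input doesn’t match a known option, the bot doesn’t guess; it escalates, which matches the “careful triage to human support” idea above.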
I just think it’s reasonable to expect people to grapple with the empirical reality of these technologies, which is messy and littered with failures, embarrassments, and worse (I still have not found a single precedent for large-scale success, and I have been asking sincerely), and furthermore to insist that people stay humble in front of the prospect of experimenting with people’s lives.
Adam, I’m very sympathetic to the pressure to do more with less. But that doesn’t mean doing something is necessarily better than doing nothing; doing something can easily turn out to be worse.
I’m a strong believer in “Don’t just do something, sit there,” generally. Not exposing people to danger is paramount.
For now, our intention is to design and pilot something non-public with trusted, informed partners, and to evaluate carefully with all stakeholders, most importantly, of course, our users. It helps that our lead engineers have deep security and privacy backgrounds. We know the best thing is to not gather info in the first place.
It’s a good reminder to look again at decision trees and to think about discrete, targeted uses for LLMs. We already use an LLM for non-critical sentiment analysis on users’ service-quality feedback, without exposing user data. I think we can find areas where LLMs can help users.
Hi @CheetoBandito, I met with @bloom over a call last week for an insightful discussion on what we are attempting and as a basis to ideate. He mentioned that you’re working in the space too and that your insight would also be valuable. Do you have some time to chat with me and my partner over a GMeet, perhaps this week? (I’m in NYC, btw.) My email is nilay320@gmail.com. Thank you!