This standard sounds amazing as a concept, but I haven’t had much luck finding information or guidelines addressing the fact that not all websites are built and hosted equally, even when the data structures match. Implementations could range from good to bad.
Most importantly, what would the advice be on rate limiting for requests, whether from legitimate users or from bad actors who could easily take down a website by overloading it with traffic?
I think you’re right that it’s safe to assume the quality of implementations will be mixed. I also assume that those implementations will have to consider load challenges. In the near term, I don’t know how serious the threat of bad actors might be, but it’s still something that I imagine implementers will need to consider eventually.
I don’t think we’re positioned in this community to conduct all that much risk modeling and mitigation on behalf of implementations, as they’ll likely be better positioned to answer questions pertaining to this stuff than we’d be on their behalf. But there could be something for us to consider somehow.
Do you have specific concerns in mind that you think can be addressed at the level of the API specifications?
To ensure service levels and prevent abuse, the API rate limit (the maximum number of API requests) is set to whatever a user of the service would require. The problem is that it’s not an exact science.
I usually extrapolate it from the number of endpoints and the volume of data, and then apply the limit per IP or per session.
Sometimes we make exceptions for specific IP addresses or domains, which can have unlimited access.
This serves a dual purpose (by restriction I mean the limit on API requests):
Public (untrusted) traffic gets access, but in a limited way
Dedicated, internal, or partner infrastructure (trusted) gets unrestricted or less restricted access
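To make the two tiers above a bit more concrete, here is a minimal Python sketch. It is only an illustration under assumptions: the class name `TieredRateLimiter`, the limit of 100 requests per 60 seconds, and the example network ranges are all made up, and the fixed-window counter is just one simple way to count requests, not necessarily what any real HSDS implementation uses.

```python
import ipaddress
import time


class TieredRateLimiter:
    """Hypothetical in-memory limiter: fixed-window counts per client IP,
    with an allowlist of trusted networks that bypass the limit."""

    def __init__(self, public_limit, window_seconds,
                 trusted_networks=(), clock=time.monotonic):
        self.public_limit = public_limit        # max requests per window for untrusted IPs
        self.window_seconds = window_seconds    # length of each counting window
        self.trusted = [ipaddress.ip_network(n) for n in trusted_networks]
        self.clock = clock                      # injectable so the limiter can be tested
        self._windows = {}                      # ip -> (window_start, request_count)

    def is_trusted(self, ip):
        """True if the IP falls inside any allowlisted network."""
        addr = ipaddress.ip_address(ip)
        return any(addr in net for net in self.trusted)

    def allow(self, ip):
        """Return True if this request should be served, False if it is over the limit."""
        if self.is_trusted(ip):
            return True  # trusted tier: unrestricted in this sketch
        now = self.clock()
        start, count = self._windows.get(ip, (now, 0))
        if now - start >= self.window_seconds:
            start, count = now, 0  # window expired, start counting again
        if count >= self.public_limit:
            return False  # public tier: limit reached for this window
        self._windows[ip] = (start, count + 1)
        return True


# Assumed example values: 100 requests per minute for public traffic,
# with one internal range and one partner address allowlisted.
limiter = TieredRateLimiter(
    public_limit=100,
    window_seconds=60,
    trusted_networks=["10.0.0.0/8", "198.51.100.7/32"],
)
```

A request handler would call `limiter.allow(client_ip)` and, when it returns `False`, respond with HTTP 429 (Too Many Requests). The sketch keeps its counters in one process's memory, so a deployment with several servers would need shared state or would apply the limit at a reverse proxy or API gateway instead.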
Not sure if that helps in any way; I’m just trying to give more of an example of how this plays out.
BAD ACTORS
Attackers love it when things are made easy for them. They always look for the biggest impact. They would not bother compiling a list of all services and directories themselves. They either choose a big target or use a predefined list that has been made available to them!
Around a year ago, hackers were attacking UK infrastructure. At one point the entire telecom industry came under attack almost overnight in a concerted effort; coincidentally, all the top providers were conveniently listed on a government website. (I was part of the industry.)
I am not saying it would have taken much more effort to find them had they not been listed, but making infrastructure easily accessible and public has its own risks.
Thanks for your questions and welcome to the forum! See below for comments in relation to your original post from @robredpath
The point of the standard is to establish a baseline; if you use the standard then you at least provide certain information in the standard format. The minimum is the basics required to be useful; we’re working on a profile mechanism to help people describe more detailed information on top of the baseline.
That’s outside the scope of the standard; HSDS describes the format and contents of the information transmitted, not the systems that transmit it. But this forum is a good place to ask people like @MikeThacker, @skyleryoung, @dovanele and @hntpx what their tips are and what open-source software is available.
API security is a whole field in itself, but I would warn that rate limiting can make it really hard to effect bulk transfer of information over APIs, so it’s important to be responsive to load and look for malicious activity while maintaining a good service for legitimate users.
Regarding this:
Unless @robredpath or @davidraznick are able to respond sooner - I will run it past our guys on Tuesday and get back to you (once I’m back from moving house).