Tuesday, September 12, 2023

AskAppSec - Input Validation

Input validation is a topic that's been following me around for years. I've come across countless resources speaking about the importance of input validation, or input filtering as it's called at times. What stuck with me is the recommendation to validate any input coming from any source, no matter if we're speaking about third parties, public interfaces we offer ourselves, or internal services behind a firewall or accessible only from inside a private cluster. No matter if the input comes via clients, APIs, messages, data sources, or anything else. Based on my experience of working in testing and quality focused roles for over 14 years, I couldn't agree more. All that makes a lot of sense to me. Not only from a security point of view, but also from a holistic quality perspective, as input validation can help prevent errors, improve usability, increase observability, and more.

Here are a few interesting resources speaking about input validation from a security standpoint.

Well. Seems convincing people to validate input is also a common challenge in AppSec; it definitely is in my bubble advocating for better quality outcomes. Most frequently I've seen these discussions when working with backend for frontend (BFF) services. This architectural pattern is often applied when you develop mobile applications, yet it is not limited to them. It is usually found alongside a bunch of downstream services that are each tasked with different duties. The BFF acts as the main entry point or proxy for a single client (e.g., the mobile app), hence only this API is public to the outside world. All requests are routed through the BFF to the respective downstream backend services (which are usually protected further). Doing so, the BFF orchestrates incoming requests to different services; it can take care of authentication and authorization, as well as filter and aggregate data in order to respond with the needed information.

If you'd like to learn more about BFFs and see visual or code examples, I found the following resources useful.

What about input validation for BFFs now? What I've heard frequently from colleagues can be summarized in the following statements.

  • "The BFF is just an API gateway."
  • "The BFF should not contain any logic, just pass through anything it receives to the backend services and vice versa."
  • "Only the downstream backend service behind the BFF should validate input as it's their responsibility, otherwise we replicate the same logic everywhere."
  • "Well, it's okay for the BFF to do sanitization, yet not validation."

Here's my viewpoint, and I'm very curious to hear further opinions.

Any modular component of our system needs to sanitize and validate input coming from outside in order to avoid falling into an unknown state. An unknown state both causes poor user experience and presents an attractive opportunity for malicious actors looking for insights they can use for exploits. This includes frontend clients, where validation serves usability even though malicious actors can easily circumvent it. As long as there is an interface accepting data from another source, there should be validation. Under that premise, I do think that BFFs should also validate incoming data, especially from the public-facing side, yet also from internal backend services or other data sources. The BFF is one of the first layers of defense we have, hence if input is validated there, we leave the door less wide open. We cannot rely on underlying backend services having foreseen everything that could enter the system from the outside and having sufficient mitigations in place; especially if they are developed by other teams in other contexts who might not even realize the impact of their local decisions. Also, lots of people tend to underestimate threats coming from within the company, like malicious insiders or mere human mistakes. I do understand that it's not always pragmatic or feasible to validate input on all boundaries. Yet if we have to decide, I choose validation at trust boundaries, like between the BFF as public interface and the outside world, as a minimum.
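To make the idea of validating at the trust boundary a bit more tangible, here is a minimal Python sketch of an allow-list check a BFF could run on a public request before forwarding it. Everything in it is an assumption for illustration: the "update profile" payload, the field names `username` and `age`, the limits, and the function `validate_profile_update` are hypothetical and not taken from any real service.

```python
import re
from dataclasses import dataclass

# Hypothetical rules for an illustrative "update profile" request; the field
# names and limits are assumptions, not taken from any real service.
USERNAME_PATTERN = re.compile(r"[A-Za-z0-9_]{3,32}")
ALLOWED_FIELDS = {"username", "age"}


@dataclass
class ValidationResult:
    ok: bool
    errors: list[str]


def validate_profile_update(payload: object) -> ValidationResult:
    """Allow-list validation of an incoming payload at the BFF boundary."""
    if not isinstance(payload, dict):
        return ValidationResult(False, ["payload must be a JSON object"])

    errors: list[str] = []

    # Reject fields the BFF does not know about instead of passing them on.
    unknown = sorted(str(key) for key in set(payload) - ALLOWED_FIELDS)
    if unknown:
        errors.append(f"unknown fields: {unknown}")

    username = payload.get("username")
    if not isinstance(username, str) or not USERNAME_PATTERN.fullmatch(username):
        errors.append("username must be 3-32 letters, digits, or underscores")

    age = payload.get("age")
    # bool is a subclass of int in Python, so exclude it explicitly.
    if not isinstance(age, int) or isinstance(age, bool) or not 0 <= age <= 150:
        errors.append("age must be an integer between 0 and 150")

    return ValidationResult(not errors, errors)
```

A check like this only describes what the BFF itself is willing to accept; it does not replace validation in the downstream services, which may have stricter or different rules.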

That being said: context is crucial, as always. Maybe the policy to only validate on the most downstream service works well in your situation. Maybe this is considered way too dangerous as you might be aware that specific services are not in good shape. Or maybe your product domain's nature means you're dealing with lots of confidential, sensitive data and you are more invested in keeping people out right at the door (aka the BFF API) without letting them any farther in. As usual, it depends on the risk appetite of the company, combined with your own ethics of what potential impact and harm you deem acceptable or not.

One thing I learned over and over again in my career is that arguments might convince rationally, yet they often don't reach people in a way that changes their behavior. People usually need to experience it, and usually a few times (and I'm not excluding myself from this equation). The trouble with security and similar quality aspects: I want to prevent the experience of harmful impact as much as possible. Take the question of whether it makes sense to validate input for BFFs as well: I could of course invest in exploiting a lack of validation, or in showcasing a close-to-real situation, yet it's effort that still does not easily change the narrative and then the behavior. If you have any further idea or tactic for these kinds of situations, your input is appreciated.

All in all, I do have a strong opinion on this topic, yet I hold it loosely enough to allow myself to be convinced by better ones. I did find posts that support my standpoint, like "Web App Security: Understanding The Meaning Of The BFF Pattern" by Syed Wahaj - yet that might be pure confirmation bias. So, I'd sincerely love to hear your thoughts and experience about this and learn more: should BFFs validate input?



UPDATE: I've shared this question with the wider community and received some validating feedback. My appreciation to everyone who offered their thoughts!

What I found especially insightful was the following input from a We Hack Purple Community member, hence sharing it further with their permission so it can help more folks besides me. I think they nailed it, so I mostly maintained the original take with slight format editing from my side.

I think the answer to "Should BFFs validate input?" really depends on what it does with the data. The BFF will need to validate some input, but not necessarily all of it.

Generally, anything that looks at the data, parses it or interprets the data in any way will have to validate input.

A BFF will likely look at the HTTP request headers, so it has to validate those. It cannot assume that the request headers will be sensible, or not malicious. It may also have to decide how to deal with duplicate request headers, etc.

But maybe the BFF does not look at the request bodies, and just passes them through to the backend.

It probably would not make sense to duplicate lots of application logic in the BFF to perform application specific input validation on data the BFF itself does not process. Unless, maybe there are very common things that may make sense for a BFF to filter out before bothering the backend with it. But that would then be a bit more like a WAF that may do some general input validation, like looking for common SQL injection patterns. And this still does not absolve the backend from its input validation responsibilities.

The backend services will always have to validate the input they are handling, but even backend services may pass some data through to other downstream backend services. The important thing is that everything that looks at the data and processes it performs validation, whether that is a web server, an API gateway, a web API, or a BFF.

My deepest thanks go out to the person who took the time and energy to elaborate on this. They made the distinction I was looking for (without knowing I was): what exactly makes sense to validate where and why, given the specific context at hand. I think this is what I struggled with myself and hence struggled to convey more clearly to colleagues. Taking this explicit distinction, I feel enabled to map it to our context to make better informed decisions, and I also feel equipped to bring more clarity to the next conversation on input validation!
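Mapping that distinction to code could look roughly like the following Python sketch: the BFF validates the headers it actually interprets (including duplicates) and passes the body through untouched. The header names, limits, and the functions `check_headers` and `handle` are assumptions made up for illustration; a real BFF would work with whatever its framework exposes.

```python
import re

# Assumed limits and header names, for illustration only.
REQUEST_ID_PATTERN = re.compile(r"[A-Za-z0-9-]{1,64}")
SINGLE_VALUED = {"authorization", "content-length", "x-request-id"}


def check_headers(raw_headers: list[tuple[str, str]]) -> list[str]:
    """Return the problems found in the headers this BFF interprets."""
    problems: list[str] = []
    seen: dict[str, list[str]] = {}
    for name, value in raw_headers:
        # Header names are case-insensitive, so normalize before comparing.
        seen.setdefault(name.lower(), []).append(value)

    # Decide explicitly what to do with duplicates: here they are rejected.
    for name in sorted(SINGLE_VALUED):
        if len(seen.get(name, [])) > 1:
            problems.append(f"duplicate header: {name}")

    for value in seen.get("content-length", []):
        if not (value.isascii() and value.isdigit()):
            problems.append("malformed content-length")

    for value in seen.get("x-request-id", []):
        if not REQUEST_ID_PATTERN.fullmatch(value):
            problems.append("malformed x-request-id")

    return problems


def handle(raw_headers: list[tuple[str, str]], body: bytes, forward):
    """Validate what the BFF reads; pass the body through untouched."""
    problems = check_headers(raw_headers)
    if problems:
        return 400, problems
    # The body is opaque to this BFF, so the backend stays responsible for it.
    return 200, forward(body)
```

In this sketch `forward` stands in for the call to the downstream service. If the BFF ever starts parsing the body, for example to aggregate data, that parsing would need its own validation too.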

As a bonus, here's one more thoughtful response allowing us to weigh further aspects against each other.
