
Validation / sanitization and OOP


I'm not sure the title is informative enough, though. I am wondering what the best practice and best design pattern are when creating an object-oriented library: should the "client" be responsible for sanitizing the data sent to that "black box" library, or should the library provide a set of tools to protect against malicious input?

Here is an example. Let's assume we are building an open-source library that integrates with a fictitious service called fooCompany, which provides a REST API. Our library needs to make requests to that API and pass it some data; for this example, let's take the authentication token.

The simplest code would probably look something like this:

class fooCompany {
  private $apiToken;

  public function __construct($apiToken) {
    // The token is stored as-is; nothing is validated here.
    $this->apiToken = $apiToken;
  }

  public function send() {
    $ch = curl_init('https://fooCompany.xx/api/send');
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_HTTPHEADER, array(
        'Content-Type: application/json',
        // The raw token is concatenated straight into the header line.
        'Authorization: Bearer ' . $this->apiToken
    ));

    // Returns the response body as a string, or false on failure.
    $data = curl_exec($ch);
    curl_close($ch);

    return $data;
  }
}

We can see that if the client application using our library does not sanitize the apiToken properly, the application becomes vulnerable to a header injection attack: a token containing CR/LF characters could add extra header lines to the request.

Thanks.


Solution

  • There is no simple answer to this kind of problem, I guess. What you describe is essentially a feature that you may or may not add to your "product". As I understand it, that is not OOP-related at all. It is rather a strategic decision that you have to make, and it depends on the context:

    As a library author you should always ask yourself where, when, why and by whom your lib will be used.

    In general, people have certain expectations when thinking about libraries, and certainly they will be laaaaazy.

    A neat, reliable, documented, error-preventing interface to the underlying core will almost always be highly appreciated. Client code will automatically gain quality and can be maintained more easily. Everyone likes that. You, as the library author, will always have better knowledge of "the fictitious service called fooCompany" than most of the lib's users. You can solve stuff in minutes. Others might need days.

    Obviously that requires you to do a lot of work ...

    (Also) obviously, security-related matters are a no-brainer: if you already anticipate that something can go wrong, you should of course put in some effort here.
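
    For the header-injection case from the question, one way the library itself could take on that effort is to reject suspicious tokens at construction time. This is only a minimal sketch: the class name `FooCompanyClient`, the helper `authorizationHeader()`, and the rule "no whitespace or control characters" are assumptions made for illustration, not something the fictitious service prescribes.

```php
<?php
declare(strict_types=1);

// Hypothetical variant of the question's class; all names are illustrative.
final class FooCompanyClient
{
    private string $apiToken;

    public function __construct(string $apiToken)
    {
        // Assumed rule: the token must be non-empty and must not contain
        // whitespace or control characters (including CR and LF), so it
        // cannot end the Authorization header and start a new one.
        if ($apiToken === '' || preg_match('/[\x00-\x20\x7F]/', $apiToken) === 1) {
            throw new InvalidArgumentException('API token contains invalid characters.');
        }
        $this->apiToken = $apiToken;
    }

    // Builds the header line that would be passed to CURLOPT_HTTPHEADER.
    public function authorizationHeader(): string
    {
        return 'Authorization: Bearer ' . $this->apiToken;
    }
}
```

    With a check like this, `new FooCompanyClient("abc\r\nX-Injected: 1")` throws an exception instead of quietly producing two header lines, and the client code gets a clear error at the point where the bad value entered the library.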

    The robustness principle says: Be conservative in what you do, be liberal in what you accept from others.


    You asked about design patterns: Clearly identify the parts of your lib that client code should explicitly use/call. That's the public API. Everything else is considered internal or core. Ideally, internal stuff should not be exposed to clients at all (if possible).

    By splitting things up, you can focus on different goals. The internal part is highly technical and is expected to change frequently. It does the heavy lifting, if you will. The public API has other priorities. It should be convenient, easy to understand, and designed with (backward) compatibility in mind. That way you are able to make certain internal changes that won't require client code to be touched at all.
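
    The split described above might look roughly like this. Everything here is hypothetical: the `Transport` interface, the `RecordingTransport` stand-in, and the `FooCompanyApi` class are illustrative names, and the real HTTP call is left out so the sketch runs without a network.

```php
<?php
declare(strict_types=1);

/**
 * @internal Not part of the public API; may change in any release.
 */
interface Transport
{
    /** @param string[] $headers */
    public function post(string $path, array $headers): string;
}

/**
 * @internal Stand-in transport that records calls instead of using cURL.
 */
final class RecordingTransport implements Transport
{
    /** @var array<int, array{path: string, headers: string[]}> */
    public array $calls = [];

    public function post(string $path, array $headers): string
    {
        $this->calls[] = ['path' => $path, 'headers' => $headers];
        return '{"status":"ok"}';
    }
}

/**
 * Public API: the only class client code is meant to call.
 */
final class FooCompanyApi
{
    private string $apiToken;
    private Transport $transport;

    public function __construct(string $apiToken, Transport $transport)
    {
        $this->apiToken = $apiToken;
        $this->transport = $transport;
    }

    public function send(): string
    {
        // How the request is actually sent is an internal detail.
        return $this->transport->post('/api/send', [
            'Content-Type: application/json',
            'Authorization: Bearer ' . $this->apiToken,
        ]);
    }
}
```

    Client code only ever calls `send()`, so the transport could later be swapped (cURL, streams, a different HTTP library) without clients noticing. A real library would probably supply a default transport itself, so that users never have to construct one.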

    Also, if you ever release and maintain something, follow the rules of semantic versioning. People will like that.
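
    As a rough illustration of what semantic versioning asks of a maintainer, the helper below picks the next version number from the kind of change being released. The function name `nextVersion()` and the three change labels are made up for this sketch, and pre-release and build-metadata suffixes are deliberately ignored.

```php
<?php
declare(strict_types=1);

// Hypothetical helper: returns the next MAJOR.MINOR.PATCH version.
// 'breaking' bumps MAJOR, 'feature' bumps MINOR, 'fix' bumps PATCH.
function nextVersion(string $current, string $change): string
{
    if (preg_match('/^(\d+)\.(\d+)\.(\d+)$/', $current, $m) !== 1) {
        throw new InvalidArgumentException("Not a MAJOR.MINOR.PATCH version: $current");
    }
    [$major, $minor, $patch] = [(int) $m[1], (int) $m[2], (int) $m[3]];

    switch ($change) {
        case 'breaking':
            // Incompatible change to the public API.
            return ($major + 1) . '.0.0';
        case 'feature':
            // Backward-compatible addition.
            return $major . '.' . ($minor + 1) . '.0';
        case 'fix':
            // Backward-compatible bug fix.
            return $major . '.' . $minor . '.' . ($patch + 1);
        default:
            throw new InvalidArgumentException("Unknown change type: $change");
    }
}
```

    For example, a purely internal refactoring released from 1.4.2 would count as a 'fix' or 'feature' at most, while removing a public method would be 'breaking' and move the library to 2.0.0.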