Tags: security, design-patterns, server, system, system-design

Does storing host and port data in an app make the server vulnerable?


I recently started development and was wondering about this. Every frontend application needs a host address and service port to connect to the backend server and the databases. For example, when you use fetch in React you cannot avoid addressing the server, so the source code carries traces of many important internal services that anyone can extract from it.

Doesn't that leave the system (backend server) exposed and vulnerable? And what are best practices to prevent this from happening?


Solution

  • Shift of focus: The frontend doesn't matter

    I think there is one essential misconception here: You do not expose the endpoint when you distribute source code that contains its address. You expose your endpoint the moment you make it accessible.

    Even if you never publish any app or any other kind of frontend that uses your endpoint: If your endpoint is reachable from any network where you could potentially have attackers, you should act as if these attackers existed.

    It does not matter whether you published your React app containing the hostname and port or not: the endpoint should have been secured the moment it was made accessible.

    Example: SSH - remote access to servers

    As a small example, take remote access to servers through SSH. If you've ever started an SSH server on a publicly reachable address, you probably know that almost immediately you become the target of hundreds to thousands of login attempts per minute. This is just people "scanning" random IP addresses on the default SSH port and trying simple username/password combinations, hoping to find an insecure server they can take control of.

    In this case, we haven't announced or otherwise made public our address or port, and we still experience these attacks.
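
    To make the point concrete, here is a minimal sketch of the only thing a scanner needs: an address and a port to try. The helper name `is_port_open` is made up for this illustration, and the demo only connects to a throwaway listener on localhost. No source code of any frontend is involved.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to (host, port) can be established."""
    try:
        # create_connection performs a normal TCP handshake, nothing more.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused, timed out, unreachable: the port is not open to us.
        return False


if __name__ == "__main__":
    # Throwaway listener on localhost, so the demo touches nothing external.
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
    listener.listen(1)
    port = listener.getsockname()[1]

    print("while listening:", is_port_open("127.0.0.1", port))
    listener.close()
    print("after closing:  ", is_port_open("127.0.0.1", port))
```

    Whether the address was ever published makes no difference to this check. Only whether the service accepts connections does.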

    Security by obscurity and Kerckhoffs's principle

    Hoping that no one will find out where your database can be reached, just because you never said so publicly, is security through obscurity. You want to avoid this. Instead, follow Kerckhoffs's principle: the security of your system should not depend on hiding where it can be accessed, or on keeping similar attributes secret.

    What to do

    Usually, under these circumstances, you still want to protect a subset of these three attributes:

    • Confidentiality: You might want to keep certain data secret
    • Integrity: You do not want anybody to be able to manipulate your database
    • Availability: You want your backend to be available to real users, even when attackers are present

    Now we can start thinking about actual techniques to achieve this:

    • For confidentiality, we usually deploy encryption and authentication/authorization. Note that encryption on its own does not protect confidentiality: If the attacker starts an encrypted conversation with us, and we tell them our data, the encryption doesn't help.
    • For integrity, we usually want to limit write access to our resources. Here, we usually require authentication and authorization to change data in our database. For example, I might change my user profile here on Stack Overflow, but you can't. Note that this explicitly means one thing: You cannot, under any circumstance, have database credentials in your frontend, be it an app or a website or whatever. You shouldn't make direct database access available over the internet anyway. Instead, users should communicate with a backend endpoint that verifies the requested change, checks whether the user is authorized to make it, and only then pushes the change to the database.
    • For availability, the critical point is that our resources are usually limited, and attackers can easily exhaust them with flooding attacks (example: SYN flooding). You usually want to set up some kind of rate limiting here. For the SSH example, there are programs like fail2ban. For your custom API endpoint, you probably want a reverse proxy configured to rate-limit per IP address, or maybe even to issue API tokens that users have to present and that let you identify and rate-limit users across IP addresses.
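
    The integrity bullet above can be sketched as a backend handler that authenticates, authorizes, validates, and only then writes. This is a simplified illustration, not a production design: `SESSIONS`, `PROFILES`, `ALLOWED_FIELDS`, `AuthError` and `update_profile` are hypothetical names, and the dictionaries stand in for a real session store and database.

```python
class AuthError(Exception):
    """Raised when a request is unauthenticated or not authorized."""


# Hypothetical in-memory stand-ins for a session store and a database.
# Only the backend can reach these; the frontend never sees DB credentials.
SESSIONS = {"token-alice": "alice", "token-bob": "bob"}
PROFILES = {"alice": {"bio": "hi"}, "bob": {"bio": "hello"}}
ALLOWED_FIELDS = {"bio"}


def update_profile(session_token: str, target_user: str, changes: dict) -> dict:
    # 1. Authentication: who is making the request?
    user = SESSIONS.get(session_token)
    if user is None:
        raise AuthError("unknown or missing session token")

    # 2. Authorization: in this sketch, users may only edit their own profile.
    if user != target_user:
        raise AuthError(f"{user} may not modify the profile of {target_user}")

    # 3. Validation: reject fields the client is not allowed to set.
    unknown = set(changes) - ALLOWED_FIELDS
    if unknown:
        raise ValueError(f"fields not allowed: {sorted(unknown)}")

    # 4. Only now does the change reach the "database".
    PROFILES[target_user].update(changes)
    return dict(PROFILES[target_user])


if __name__ == "__main__":
    print(update_profile("token-alice", "alice", {"bio": "updated"}))
    try:
        update_profile("token-bob", "alice", {"bio": "defaced"})
    except AuthError as exc:
        print("rejected:", exc)
```

    The frontend can know the address of this endpoint without harm, because every write is checked on the server side.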

    Bots and detecting them

    To what Indrit wrote, I'd like to add: You cannot distinguish proper users from bots. You can try, but you're on the disadvantaged side of this battle, because bots can, from your point of view, look exactly like human users. So I'd advise you not to even try, and instead to use mechanisms that do not need to detect bots at all, for example API tokens for rate limiting (as Indrit already suggested) or proof of work.
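
    As one possible illustration of rate limiting keyed by API token rather than by bot detection, here is a small token-bucket sketch. The token bucket is my choice of algorithm here, not something the answers above prescribe, and `TokenBucketLimiter` with its parameters is a hypothetical name. In practice this usually lives in a reverse proxy or API gateway rather than in application code.

```python
import time
from typing import Callable, Dict, Tuple


class TokenBucketLimiter:
    """Per-API-token rate limiter using the token-bucket algorithm."""

    def __init__(self, rate: float, capacity: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.rate = rate          # request credits refilled per second
        self.capacity = capacity  # maximum burst size
        self.clock = clock        # injectable so the limiter can be tested
        # api_token -> (remaining credits, time of last update)
        self._buckets: Dict[str, Tuple[float, float]] = {}

    def allow(self, api_token: str) -> bool:
        now = self.clock()
        credits, last = self._buckets.get(api_token, (self.capacity, now))
        # Refill according to elapsed time, never beyond the capacity.
        credits = min(self.capacity, credits + (now - last) * self.rate)
        if credits >= 1.0:
            self._buckets[api_token] = (credits - 1.0, now)
            return True
        self._buckets[api_token] = (credits, now)
        return False


if __name__ == "__main__":
    limiter = TokenBucketLimiter(rate=1.0, capacity=3.0)
    for i in range(5):
        print(i, limiter.allow("client-a"))
```

    Humans and bots are treated identically here: each API token gets the same budget, so the server never has to guess who is behind a request.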