Search code examples
formsvalidationarchitecturesoftware-designmern

Where should data be transformed for the database?


Where should data be transformed for the database? I believe this is called data normalization/sanitization.

In my database I have user created shops- like an Etsy. Let's say a user inputs a price for an item in their shop as "1,000.00". But my database stores the prices as an integer/pennies- "100000". Where should "1,000.00" be converted to "100000"?

These are the two ways I thought of.

  1. In the frontend: The input data is converted from "1,000.00" to "100000" in the frontend before the HTTP request. In this case, the backend would validate that the price it is an integer.

  2. In the backend: "1,000.00" is sent to the backend as is, then the backend validates that it is a price format, then the backend converts the price to an integer "100000" before being stored in the database.

It seems either would work but is one way better than the other or is there a standard way? I would think the second way is best to reduce code duplication since there is likely to be multiple frontends - mobile, web, etc- and one backend. But the first way also seems cleaner- just send what the api needs.

I'm working with a MERN application if that makes any difference, but I would think this would be language agnostic.


Solution

  • Short version

    Both ways would work and I think there is no standard way. I prefer formatting in the frontend.


    Long version

    What I do in such cases is to look at my expected business requirements and make a list of pros and cons. Here are just a few thoughts about this.

    1. In case, you decide for doing every transformation of the price (formatting, normalization, sanitization) in the frontend, your backend will stay smaller and you will have less endpoints or endpoints with less options. Depending on the frontend, you can choose the perfect fit for you end user. The amount of code which is delivered will stay smaller, because the application can be cached and makes all the formatting stuff.
    2. If you implement everything in the backend, you have full control about which format is delivered to your users. Especially when dealing with a lot of different languages, it could be helpful to get the correct display value directly from the server.

    Furthermore, it can be helpful to take a look at some different APIs of well-known providers and how these handle prices.

    The Paypal API uses an amount object to transfer prices as decimals together with a currency code.

    "amount": {
        "value": "9.87",
        "currency": "USD"
    }
    

    It's up to you how to handle it in the frontend. Here is a link with an example request from the docs:

    Stripe uses a slightly different model.

    {
         unit_amount: 1600,
         currency: 'usd',
    }
    

    It has integer values in the base unit of the currency as the amount and a currency code to describe prices. Here are two examples to make it more clear:

    In both cases, the normalization and sanitization has to be done before making requests. The response will also need formatting before showing it to the user. Of course, most of these requests are done by backend code. But if you look at the prebuilt checkout pages from Stripe or Paypal, these are also using normalized and sanitized values for their frontend integrations: https://developer.paypal.com/docs/business/checkout/configure-payments/single-page-app


    Conclusion/My opinion

    I would always prefer keeping the backend as simple as possible for security reasons. Less code (aka endpoints) means a smaller attack surface. More configuration possibilities means a lot more effort to make the application secure. Furthermore, you could write another (micro)service which overtakes some transformation tasks, if you have a business requirement to deliver everything perfectly formatted from the backend. Example use cases may be if you have a preference for backend code over frontend code (think about your/your team's skills), if you want to deploy a lot of different frontends and want to make sure that they all use a single source of truth for their display values or maybe if you need to fulfill regulatory requirements to know exactly what is delivered to your user.

    Thank you for your question and I hope I have given you some guidance for your own decision. In the end, you will always wish you had chosen a different approach anyway... ;)