http rest webserver restful-architecture

REST design: what verb and resource name to use for a filtering service


I am developing a cleanup/filtering service that has a method that receives a list of objects serialized in XML and applies some filtering rules to return a subset of those objects.

  1. In a REST-ful service, what verb shall I use for such a method? I thought that GET is a natural choice, but I have to put the serialized XML in the body of the request which works but feels incorrect. The other verbs don't seem to fit semantically.

  2. What is a good way to define that service interface? Naming the resource /Cleanup or /Filter seems weird, mainly because in the examples I see online it is always a noun rather than a verb being used for the resource name.

  3. Am I right to feel that REST services are better suited for CRUD operations and you start bending the rules in situations like this service? If yes, am I then making a wrong architectural choice?

  4. I've pushed to develop this service in REST-ful style (as opposed to SOAP) for simplicity, but such awkward cases happen a lot and make me feel like I am missing something. Am I either choosing REST where it shouldn't be used, or maybe overthinking some stuff that doesn't really matter? In that case, what really matters?


Solution

  • REST is about using HTTP the way it was designed. To be RESTful consider (title was REST design :):

    • URLs should be permalinks to a resource (caching benefits, storing/sharing endpoints etc...)
    • Because they are permalinks to a resource, having verbs in the URL is a hint that you're on the wrong path (filter is a verb).
    • A collection of resources can be an endpoint /foos.
    • If you want to filter the collection of resources, consider querystring params like ?filter= or something like ?ids=1,2,3,4,5.
    • A GET should not change resources. Note that 'cleanup' implies something getting deleted, so be cautious of changes to resources when you do a GET. Imagine a caching server receiving your cleanup request as a GET and returning OK because it's cached: the cleanup never reaches your service. Caches don't store responses to POST, DELETE etc. by default (that's the way HTTP was designed).
    • Don't rule out multiple calls - for example, you may do a GET to filter and get a set of resources to clean up, followed by one or many DELETE calls to do the cleanup.
    • Sometimes there's a temporal resource like a transaction or a 'job' that could do work like a cleanup. Don't rule out a POST to the resource with the body containing the items to clean up, which returns a job id. You can then query the job id for the cleanup progress or status.
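
The 'job' idea in the last bullet could look roughly like the sketch below. It is only an in-memory model of the two calls, not a real web server: the `/cleanup-jobs` path, the `JobStore` and `CleanupJob` names, the 202 status and the "drop repeated items" rule are all assumptions made for illustration.

```python
import itertools
from dataclasses import dataclass, field


@dataclass
class CleanupJob:
    """Hypothetical temporal resource representing one cleanup request."""
    job_id: int
    items: list
    status: str = "pending"
    removed: list = field(default_factory=list)


class JobStore:
    """In-memory stand-in for the server side of /cleanup-jobs (assumed name)."""

    def __init__(self):
        self._ids = itertools.count(1)
        self._jobs = {}

    def post_job(self, items):
        """POST /cleanup-jobs: create a job, return 202 plus its location."""
        job = CleanupJob(next(self._ids), list(items))
        self._jobs[job.job_id] = job
        headers = {"Location": f"/cleanup-jobs/{job.job_id}"}
        return 202, headers, {"id": job.job_id, "status": job.status}

    def get_job(self, job_id):
        """GET /cleanup-jobs/{id}: report progress without changing anything."""
        job = self._jobs.get(job_id)
        if job is None:
            return 404, {}, {"error": "no such job"}
        body = {"id": job.job_id, "status": job.status, "removed": job.removed}
        return 200, {}, body

    def run_pending(self):
        """Stand-in for a background worker; here it drops repeated items."""
        for job in self._jobs.values():
            if job.status != "pending":
                continue
            seen = set()
            for item in job.items:
                if item in seen:
                    job.removed.append(item)
                else:
                    seen.add(item)
            job.status = "done"
```

The client would POST once, keep the returned location, and poll it with GET until the status changes.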

    It's hard to give exact guidance because the question isn't clear, but hopefully the RESTful principles and thoughts above set you on the right track. If you clarify the exact calls, I'll try to recommend APIs.

    So, let's say you wanted to clean up duplicate foos.

    [GET] /foos/duplicates (or /foos?filter=duplicates)

    returns a body with the identifiers of the foos that are duplicates. Let's say that returns 1,2,5 (could be names).

    Then you could issue:

    [DELETE] /foos with the body being an array containing 1,2,5 (or names if unique). The DELETE call is idempotent, so even if the GET response was cached, according to REST principles it's fine.
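
As a rough sketch of that two-call flow, here is an in-memory model of the /foos collection. The `FooCollection` class, the "same name means duplicate, keep the lowest id" rule and the status codes are assumptions for illustration, not something the question specifies.

```python
class FooCollection:
    """In-memory stand-in for the /foos collection (hypothetical)."""

    def __init__(self, foos):
        self.foos = dict(foos)  # id -> name

    def get(self, filter_name=None):
        """GET /foos or GET /foos?filter=duplicates. Never modifies state."""
        if filter_name is None:
            return 200, sorted(self.foos)
        if filter_name == "duplicates":
            seen = set()
            dupes = []
            for foo_id in sorted(self.foos):
                name = self.foos[foo_id]
                if name in seen:
                    dupes.append(foo_id)  # a later foo with an already-seen name
                else:
                    seen.add(name)
            return 200, dupes
        return 400, []  # unknown filter value

    def delete(self, ids):
        """DELETE /foos with a list of ids in the body.

        Ids that are already gone are ignored, so repeating the call
        leaves the collection in the same state.
        """
        for foo_id in ids:
            self.foos.pop(foo_id, None)
        return 204, None


collection = FooCollection({0: "a", 1: "a", 2: "a", 3: "b", 4: "c", 5: "c"})
status, duplicate_ids = collection.get("duplicates")  # read-only step
collection.delete(duplicate_ids)                       # state-changing step
collection.delete(duplicate_ids)                       # repeating it is harmless
```

Keeping the read in GET and the change in DELETE is what lets a cache sit in front of the first call without ever swallowing the second.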

    It's also possible and valid to not go the REST route, for example POX or JSON-RPC over HTTP, but just realize at that point that it's not REST. And that's fine, but you're not getting the benefits of REST described in Fielding's dissertation.

    http://www.ics.uci.edu/~fielding/pubs/dissertation/top.htm

    Also, read this:

    http://blog.steveklabnik.com/posts/2011-07-03-nobody-understands-rest-or-http

    EDIT:

    After reading the comment where you clarified you're sending the server a set of objects (not persisted server side) and it returns the subset with the dupes filtered out (like a server side helper function), some options are:

    1. Do this client/browser side if possible - why take the network roundtrip to filter dupes out of a collection?
    2. If for some reason only the server has the specific knowledge/data to determine that two items are functionally equivalent (even though the data is not exactly the same), then consider POSTing the data set to the server, with the response body containing the unique/filtered set. Even though the server isn't persisting the set, it would fall into a 'temporal' object or set and the server is modifying it. It's not conceptually a GET of server resources, and caching offers no benefits in that scenario.
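
A minimal sketch of option 2, written as a plain function that takes the POST body and returns status, headers and response body. The XML shape (`<items>` containing `<item><name>…</name></item>`), the `post_filter` name and the case-insensitive equivalence rule are all made up for illustration; the real rule would be whatever knowledge only the server has.

```python
import xml.etree.ElementTree as ET


def equivalence_key(item):
    """Hypothetical server-side rule: names match ignoring case and whitespace."""
    return (item.findtext("name") or "").strip().lower()


def post_filter(request_body: bytes):
    """Handle POST of an XML item list; respond with the de-duplicated list."""
    try:
        root = ET.fromstring(request_body)
    except ET.ParseError:
        return 400, {"Content-Type": "text/plain"}, b"malformed XML"

    result = ET.Element(root.tag)
    seen = set()
    for item in root.findall("item"):
        key = equivalence_key(item)
        if key not in seen:  # keep only the first of each equivalent group
            seen.add(key)
            result.append(item)

    return 200, {"Content-Type": "application/xml"}, ET.tostring(result)
```

Nothing is persisted here: the posted set exists only for the duration of the request, which is why POST with the filtered set in the response body fits better than GET.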