google-app-engine google-cloud-datastore app-engine-ndb

In Google App Engine, how to check input validity of Key created by urlsafe?


Suppose I create a key from a urlsafe string supplied by the user:

key = ndb.Key(urlsafe=some_user_input)

How can I check if the some_user_input is valid?

My experiments so far show that the statement above raises a ProtocolBufferDecodeError (Unable to merge from string.) if some_user_input is invalid, but I could not find anything about this in the API documentation. Could someone kindly confirm this, and point me to a better way to validate user input than catching the exception?

Thanks a lot!


Solution

  • If you try to construct a Key with an invalid urlsafe parameter

    key = ndb.Key(urlsafe='bogus123')
    

    you will get an error like this:

    Traceback (most recent call last):
      File "/opt/google/google_appengine/google/appengine/runtime/wsgi.py", line 240, in Handle
        handler = _config_handle.add_wsgi_middleware(self._LoadHandler())
      File "/opt/google/google_appengine/google/appengine/runtime/wsgi.py", line 299, in _LoadHandler
        handler, path, err = LoadObject(self._handler)
      File "/opt/google/google_appengine/google/appengine/runtime/wsgi.py", line 85, in LoadObject
        obj = __import__(path[0])
      File "/home/tim/git/project/main.py", line 10, in <module>
        from src.tim import handlers as handlers_
      File "/home/tim/git/project/src/tim/handlers.py", line 42, in <module>
        class ResetHandler(BaseHandler):
      File "/home/tim/git/project/src/tim/handlers.py", line 47, in ResetHandler
        key = ndb.Key(urlsafe='bogus123')
      File "/opt/google/google_appengine/google/appengine/ext/ndb/key.py", line 212, in __new__
        self.__reference = _ConstructReference(cls, **kwargs)
      File "/opt/google/google_appengine/google/appengine/ext/ndb/utils.py", line 142, in positional_wrapper
        return wrapped(*args, **kwds)
      File "/opt/google/google_appengine/google/appengine/ext/ndb/key.py", line 642, in _ConstructReference
        reference = _ReferenceFromSerialized(serialized)
      File "/opt/google/google_appengine/google/appengine/ext/ndb/key.py", line 773, in _ReferenceFromSerialized
        return entity_pb.Reference(serialized)
      File "/opt/google/google_appengine/google/appengine/datastore/entity_pb.py", line 1710, in __init__
        if contents is not None: self.MergeFromString(contents)
      File "/opt/google/google_appengine/google/net/proto/ProtocolBuffer.py", line 152, in MergeFromString
        self.MergePartialFromString(s)
      File "/opt/google/google_appengine/google/net/proto/ProtocolBuffer.py", line 168, in MergePartialFromString
        self.TryMerge(d)
      File "/opt/google/google_appengine/google/appengine/datastore/entity_pb.py", line 1839, in TryMerge
        d.skipData(tt)
      File "/opt/google/google_appengine/google/net/proto/ProtocolBuffer.py", line 677, in skipData
        raise ProtocolBufferDecodeError, "corrupted"
    ProtocolBufferDecodeError: corrupted
    

    The interesting part here is

    File "/opt/google/google_appengine/google/appengine/ext/ndb/key.py", line 773, in _ReferenceFromSerialized
      return entity_pb.Reference(serialized)
    

    which is the last code executed in the key.py module:

    def _ReferenceFromSerialized(serialized):
      """Construct a Reference from a serialized Reference."""
      if not isinstance(serialized, basestring):
        raise TypeError('serialized must be a string; received %r' % serialized)
      elif isinstance(serialized, unicode):
        serialized = serialized.encode('utf8')
      return entity_pb.Reference(serialized)
    

    Here, serialized is the decoded urlsafe string; you can read more about it in the ndb source code (key.py).
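To make "the decoded urlsafe string" concrete, here is a minimal sketch of that decoding step. It assumes the urlsafe form is websafe base64 with the trailing '=' padding stripped, which is my reading of key.py rather than something the traceback shows. The helper name decode_urlsafe is made up for this example, and the code is written for Python 3 so it runs without the App Engine SDK.

```python
import base64


def decode_urlsafe(urlsafe):
    """Decode a websafe-base64 string whose trailing '=' padding was stripped.

    Hypothetical helper for illustration; not part of the ndb API.
    """
    # Base64 input must be a multiple of 4 characters long, so restore padding.
    padding = '=' * (-len(urlsafe) % 4)
    return base64.urlsafe_b64decode(urlsafe + padding)


# These are the raw bytes that would be handed on to entity_pb.Reference.
serialized = decode_urlsafe('bogus123')
print(serialized.hex())
```

Note that 'bogus123' decodes without complaint at this stage: it is perfectly good base64. The failure only shows up later, when the resulting bytes are parsed as a serialized Reference.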

    Another interesting frame is the second-to-last one:

    File "/opt/google/google_appengine/google/appengine/datastore/entity_pb.py", line 1839, in TryMerge
    

    in the entity_pb.py module, which looks like this:

      def TryMerge(self, d):
        while d.avail() > 0:
          tt = d.getVarInt32()
          if tt == 106:
            self.set_app(d.getPrefixedString())
            continue
          if tt == 114:
            length = d.getVarInt32()
            tmp = ProtocolBuffer.Decoder(d.buffer(), d.pos(), d.pos() + length)
            d.skip(length)
            self.mutable_path().TryMerge(tmp)
            continue
          if tt == 162:
            self.set_name_space(d.getPrefixedString())
            continue
    
    
          if (tt == 0): raise ProtocolBuffer.ProtocolBufferDecodeError
          d.skipData(tt)
    

    which is where the actual attempt to 'merge the input into a Key' is made.
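The magic numbers 106, 114 and 162 in TryMerge look like standard protocol buffer tags, where the field number is shifted left by three bits and combined with a wire type. The sketch below splits them apart. It is an illustration of the general wire format, not code from the SDK; split_tag and WIRE_TYPES are names invented here. The value 110 is included because, by my own calculation, it is the first byte of the base64-decoded 'bogus123'.

```python
# Wire types defined by the protocol buffer encoding (values 6 and 7 are unused).
WIRE_TYPES = {
    0: 'varint',
    1: '64-bit',
    2: 'length-delimited',
    3: 'start group',
    4: 'end group',
    5: '32-bit',
}


def split_tag(tt):
    """Split a protobuf tag into (field_number, wire_type)."""
    return tt >> 3, tt & 0x7


for tt in (106, 114, 162, 110):
    field_number, wire_type = split_tag(tt)
    print(tt, field_number, WIRE_TYPES.get(wire_type, 'undefined'))
```

Under this reading, 106, 114 and 162 are length-delimited fields 13, 14 and 20 (app, path and name_space in the code above), while 110 carries wire type 6, which the encoding does not define. That would fall through to d.skipData(tt) and is presumably why the traceback ends in "corrupted".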


    You can see in the source code that not a whole lot can go wrong while constructing a Key from a urlsafe parameter. First it checks whether the input is a string; if it is not, a TypeError is raised. If it is a string but not 'valid', a ProtocolBufferDecodeError is indeed raised.


    My experiments so far show that the statement above raises a ProtocolBufferDecodeError (Unable to merge from string.) if some_user_input is invalid, but I could not find anything about this in the API documentation. Could someone kindly confirm this

    Sort of confirmed: we now know that a TypeError can also be raised.

    and point me to a better way to validate user input than catching the exception?

    Catching the exception is an excellent way to check validity! Why do the checks yourself if they are already done by App Engine? A code snippet could look like this (not working code, just an example):

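    from google.appengine.ext import ndb
    from google.net.proto.ProtocolBuffer import ProtocolBufferDecodeError

    def get(self):
      # First, fetch user_input from somewhere.
      # Assumption: a webapp2-style handler and a request parameter named 'key'.
      user_input = self.request.get('key')

      try:
        key = ndb.Key(urlsafe=user_input)
      except TypeError:
        # Raised when the urlsafe argument is not a string.
        return 'Sorry, only string is allowed as urlsafe input'
      except ProtocolBufferDecodeError:
        # Raised when the decoded bytes cannot be parsed as a key.
        return 'Sorry, the urlsafe string seems to be invalid'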
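If the same check is needed in several handlers, the try/except can be pulled into a small helper. The following is only a sketch of that idea: key_from_urlsafe, fake_key and FakeDecodeError are hypothetical names, and the Key constructor and exception classes are passed in as arguments so the example runs without the App Engine SDK. On App Engine you would presumably pass ndb.Key and (TypeError, ProtocolBufferDecodeError) instead of the stand-ins.

```python
def key_from_urlsafe(urlsafe, key_factory, decode_errors):
    """Return a key built from `urlsafe`, or None if the input is rejected.

    key_factory   -- callable accepting urlsafe=...; ndb.Key on App Engine
    decode_errors -- tuple of exception classes that signal bad input
    """
    try:
        return key_factory(urlsafe=urlsafe)
    except decode_errors:
        return None


# Stand-ins that mimic the two failure modes described above (hypothetical).
class FakeDecodeError(Exception):
    pass


def fake_key(urlsafe=None):
    if not isinstance(urlsafe, str):
        raise TypeError('urlsafe must be a string; received %r' % (urlsafe,))
    if urlsafe != 'good':
        raise FakeDecodeError('corrupted')
    return ('Key', urlsafe)


ERRORS = (TypeError, FakeDecodeError)
print(key_from_urlsafe('good', fake_key, ERRORS))      # a key-like object
print(key_from_urlsafe('bogus123', fake_key, ERRORS))  # rejected: bad contents
print(key_from_urlsafe(42, fake_key, ERRORS))          # rejected: not a string
```

A key that decodes successfully only shows that the string is well formed. Whether the entity exists is a separate question (key.get() returns None for a missing entity), and since the string comes from the user it is probably worth checking key.kind() against the kind you expect as well.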