
Given a Python string describing object.attribute, how do I separate the attribute's namespace from the attribute?



Desired Examples:

ns_attr_split("obj.attr") => ("obj", "attr")
ns_attr_split("obj.arr[0]") => ("obj", "arr[0]")
ns_attr_split("obj.dict['key']") => ("obj", "dict['key']")
ns_attr_split("mod.obj.attr") => ("mod.obj", "attr")
ns_attr_split("obj.dict['key.word']") => ("obj", "dict['key.word']")

Note: I understand that writing my own string parser is one option, but I am looking for a more elegant solution. Parsing this by hand is not as simple as an rsplit on '.', because of the last example above, where a key may itself contain the namespace delimiter.
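To make the problem concrete, here is a minimal sketch of the naive approach. The name `naive_split` is hypothetical and used only for illustration; it splits on the last '.' without looking at context.

```python
def naive_split(s):
    # Split on the last '.' in the string, ignoring quotes and brackets
    # (hypothetical helper, for illustration only)
    ns, _, attr = s.rpartition(".")
    return (ns, attr)

print(naive_split("mod.obj.attr"))          # ('mod.obj', 'attr')
print(naive_split("obj.dict['key.word']"))  # ("obj.dict['key", "word']")
```

The first call gives the desired result, but the second cuts the string in the middle of the quoted key.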


Solution

  • I recently discovered the tokenize module in the standard library, which tokenizes Python source code. Using it, I came up with this snippet:

    import io
    import tokenize
    
    def ns_attr_split(s):
      arr = []
      last_delim = -1
    
      # Tokenize the expression, tracking the index of the last
      # namespace delimiter in last_delim. A '.' inside a string
      # literal stays within its STRING token, so it is not counted.
      tokens = tokenize.generate_tokens(io.StringIO(s).readline)
      for cnt, tok in enumerate(tokens):
        arr.append(tok.string)
        if tok.type == tokenize.OP and tok.string == '.':
          last_delim = cnt
    
      # Join the namespace parts into a string
      # (empty when no delimiter was found)
      ns = "".join(arr[:max(last_delim, 0)])
    
      # Join the attr parts into a string
      attr = "".join(arr[last_delim + 1:])
    
      return (ns, attr)
    

    This should work with intermediate indexes and keys as well, e.g. "mod.ns[3].obj.dict['key']" splits into ("mod.ns[3].obj", "dict['key']").
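    One case the question does not list is a '.' inside brackets but outside a string, such as "obj.arr[a.b]". The sketch below is a possible variant, not part of the original answer: it assumes Python 3 and single-line input, ignores any '.' nested inside (), [] or {}, and slices the original string at the delimiter's column instead of rejoining tokens. The name `ns_attr_split_toplevel` is hypothetical.

```python
import io
import tokenize

def ns_attr_split_toplevel(s):
    """Split at the last '.' outside any brackets (hypothetical helper)."""
    depth = 0       # current nesting level of (), [] and {}
    last_col = -1   # column of the last top-level '.', -1 if none
    for tok in tokenize.generate_tokens(io.StringIO(s).readline):
        if tok.type != tokenize.OP:
            continue  # NAME, NUMBER and STRING tokens are never delimiters
        if tok.string in ("(", "[", "{"):
            depth += 1
        elif tok.string in (")", "]", "}"):
            depth -= 1
        elif tok.string == "." and depth == 0:
            # tok.start is (row, col); col indexes into s for one-line input
            last_col = tok.start[1]
    if last_col < 0:
        return ("", s)
    return (s[:last_col], s[last_col + 1:])

print(ns_attr_split_toplevel("obj.dict['key.word']"))  # ('obj', "dict['key.word']")
print(ns_attr_split_toplevel("obj.arr[a.b]"))          # ('obj', 'arr[a.b]')
```

    Slicing by column also keeps any whitespace in the input exactly as written, which joining token strings would drop.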