python dictionary dataframe pandas data-analysis

Pandas undocumented DataFrame.keys() method


I'm new to pandas, and while playing with its DataFrame, I found a keys() method that works much like dict.keys(). But I cannot find it in the docs. What am I missing?


Solution

  • You can see where this is defined in the source:

    def keys(self):
        return self.columns
    

    And, if you look at the git blame, you can see it was added as a fix for #1240: "Request: keys() method on dataFrame". The rationale seems to be:

    While learning Pandas this kind of method is useful to move from the well-understood dict structure to the more powerful DataFrame. As a pandas novice this kind of mental mapping would be much appreciated.

    However, it's worth noting that DataFrame supports only about half of the mapping interface. For example, there's iteritems and keys, but no iterkeys. And there are also cases where they added similar but not-quite-the-same names, like iterkv, which is equivalent to iteritems but exists specifically because the latter "gets incorrectly converted to .items() by 2to3".

    You can go through the source and see where each of these was added and why, but there doesn't seem to be much rhyme or reason beyond "DataFrame is kind of like a dict, and kind of not."

    Given that they chose not to document most of these methods, or to document that DataFrame is kind of like a dict, I wouldn't rely on any of this. Just use columns instead of keys(), etc.
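To make the comparison concrete, here is a minimal sketch of the behaviour described above. The sample frame and variable names are made up for illustration, and the exact set of dict-like methods may differ between pandas versions, so treat this as a quick check rather than a statement of the API.

```python
import pandas as pd

# Hypothetical sample data, for illustration only.
df = pd.DataFrame({"a": [1, 2], "b": [3, 4]})

# keys() hands back the column labels, the same ones df.columns gives.
same_labels = df.keys().equals(df.columns)

# Other dict-like behaviour: membership tests and iteration
# both operate on column labels, not on rows.
has_a = "a" in df
labels = list(df)

# The attribute spelling recommended in the answer.
column_names = df.columns.tolist()

print(same_labels, has_a, labels, column_names)
```

Since keys() only returns the column labels, switching to df.columns should not change behaviour, and it keeps the code on the documented attribute.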