
Typing the name of an object in the Python interpreter: which method is called?


Which method is called when I type the name of an object? I always thought it was either __repr__ or __str__, but that doesn't hold for the PageObject of PyPDF2. As you can see, the output of __repr__ or __str__ differs from what we get when we type the name of the variable in the interactive console.

>>> reader = PdfFileReader(f)
>>> page = reader.pages[0]
>>> page

'/Encoding': {'/Differences': [32,
      '/space',
      40,
      '/parenleft',
      '/parenright',
      46,
      '/period',
      '/slash',
      '/zero',
      '/one',
      '/two',
      '/three',
      '/four',
      '/five',
      '/six',
      56,
      '/eight',
      '/nine',
      69,
      ...

>>> page.__str__()

"{'/Annots': [], '/Contents': IndirectObject(12, 0), '/Group': {'/CS': '/DeviceRGB', '/S': '/Transparency', '/Type': '/Group'}, '/MediaBox': RectangleObject([0, 0, 460.8, 345.6]), '/Parent': IndirectObject(2, 0), '/Resources': IndirectObject(8, 0), '/Type': '/Page', '/ArtBox': RectangleObject([0, 0, 460.8, 345.6]), '/BleedBox': RectangleObject([0, 0, 460.8, 345.6]), '/CropBox': RectangleObject([0, 0, 460.8, 345.6]), '/TrimBox': RectangleObject([0, 0, 460.8, 345.6])}"

>>> page.__repr__()

<same-as-above>

P.S. There is probably already an answer to this question somewhere, and I just haven't phrased my search correctly.


UPDATE: I observe this behavior in IPython (version 5.5.0). With the built-in REPL, the output I get when typing the variable name matches the repr output.


Solution

  • Evaluating a bare expression (such as a variable name x) in the standard Python REPL is equivalent to print(repr(x)), except that nothing is printed when the value is None. You can convince yourself of that (and that it is not simply print(x)) by implementing __repr__ and __str__ yourself:

    >>> class Test:
    ...   def __repr__(self):
    ...     return 'using repr\nmagic, isn\'t it?'
    ...   def __str__(self):
    ...     return 'using str'
    ... 
    >>> Test()
    using repr
    magic, isn't it?
    >>> repr(Test())
    "using repr\nmagic, isn't it?"
    >>> print(Test())
    using str
    >>> print(repr(Test()))
    using repr
    magic, isn't it?
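
The same check can be scripted outside an interactive session. The standard REPL hands each expression result to sys.displayhook, and the original hook stays available as sys.__displayhook__. The following is only a minimal sketch: the echo helper is a hypothetical name introduced here, and it captures what the default hook writes to stdout.

```python
import contextlib
import io
import sys


class Test:
    def __repr__(self):
        return "using repr\nmagic, isn't it?"

    def __str__(self):
        return "using str"


def echo(value):
    """Return the text the default display hook writes for `value`."""
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        # sys.__displayhook__ is the hook the plain REPL starts with.
        # It writes repr(value) plus a newline, writes nothing for None,
        # and as a side effect stores the value in builtins._
        sys.__displayhook__(value)
    return buffer.getvalue()


shown = echo(Test())
print(shown == repr(Test()) + "\n")  # the hook used __repr__
print("using str" in shown)          # __str__ was not involved
print(repr(echo(None)))              # None is not echoed at all
```

IPython replaces sys.displayhook with its own hook, which is why the same expression can look different there.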
    

    But you are using IPython, which features rich output: some objects get special treatment when they are displayed. Dicts are such objects, and your page is a special kind of dict:

    Help on PageObject in module PyPDF2.pdf object:
    
    class PageObject(PyPDF2.generic.DictionaryObject)
     |  PageObject(pdf=None, indirectRef=None)
     |  
     |  This class represents a single page within a PDF file.  Typically this
     |  object will be created by accessing the
     |  :meth:`getPage()<PyPDF2.PdfFileReader.getPage>` method of the
     |  :class:`PdfFileReader<PyPDF2.PdfFileReader>` class, but it is
     |  also possible to create an empty page with the
     |  :meth:`createBlankPage()<PageObject.createBlankPage>` static method.
     |  
     |  :param pdf: PDF file the page belongs to.
     |  :param indirectRef: Stores the original indirect reference to
     |      this object in its source PDF
     |  
     |  Method resolution order:
     |      PageObject
     |      PyPDF2.generic.DictionaryObject
     |      builtins.dict
     |      PyPDF2.generic.PdfObject
     |      builtins.object
    […snip…]
    

    So you get the special dict display, which is akin to using pprint.pprint:

    >>> import pprint
    >>> from PyPDF2 import PdfFileReader
    >>> PDF = PdfFileReader('…')
    >>> page = PDF.pages[0]
    >>> pprint.pprint(page)
    {'/Contents': IndirectObject(2, 0),
     '/Group': {'/CS': '/DeviceRGB',
                '/I': <PyPDF2.generic.BooleanObject object at 0x7faa67639310>,
                '/S': '/Transparency'},
     '/MediaBox': [0, 0, 842, 595],
     '/Parent': IndirectObject(6, 0),
     '/Resources': IndirectObject(23, 0),
     '/Rotate': 0,
     '/Type': '/Page'}
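
If you don't have a PDF at hand, the difference can be reproduced with any dict subclass that does not override __repr__. This is only a sketch: FakePage and its contents are made up for illustration, and pprint stands in for IPython's own pretty printer, which is a separate implementation with similar-looking results for dicts.

```python
import pprint


class FakePage(dict):
    """Hypothetical stand-in for a dict subclass such as PageObject."""


page = FakePage({
    "/Type": "/Page",
    "/MediaBox": [0, 0, 842, 595],
    "/Rotate": 0,
    "/Group": {"/CS": "/DeviceRGB", "/S": "/Transparency"},
})

# What the plain REPL echoes: the inherited dict.__repr__, on one line.
flat = repr(page)

# What a pretty printer produces: the same data wrapped across lines,
# because the one-line form does not fit in the requested width.
pretty = pprint.pformat(page, width=40)

print(flat)
print(pretty)
```

Both strings describe the same mapping; only the layout differs, which matches the observation that __repr__ and the IPython display disagree in form but not in content.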