When I write a C function that does something with an iterable, I first create an iterator and then loop over it:
iterator = PyObject_GetIter(sequence);
if (iterator == NULL) {
    return NULL;
}
while ((item = PyIter_Next(iterator))) {
    ...
}
This works fine, but I've also seen some functions using tp_iternext:
iterator = PyObject_GetIter(sequence); // ....
iternext = *Py_TYPE(iterator)->tp_iternext;
while ((item = iternext(iterator))) {
    ...
}
The second approach seems faster, though I have only one data point: my Windows computer with the MSVC compiler.
Is it just a coincidence that the iternext approach is faster, or is there a significant difference between the two?
Links to the Python documentation of both: PyIter_Next, tp_iternext. I have read them, but it's not clear to me when and why one should be preferred.
The source code for PyIter_Next shows that it simply retrieves the tp_iternext slot, calls it, and clears a StopIteration exception that may or may not have occurred.
If you use tp_iternext explicitly, you have to check for this StopIteration yourself once the iterator is exhausted, and clear it so that it does not leak out to your caller.
By the way, the documentation of tp_iternext also says:

iternextfunc PyTypeObject.tp_iternext
An optional pointer to a function that returns the next item in an iterator. When the iterator is exhausted, it must return NULL; a StopIteration exception may or may not be set. When another error occurs, it must return NULL too. Its presence signals that the instances of this type are iterators.
There is no such mention of a possibly pending StopIteration in PyIter_Next's documentation.
So PyIter_Next is the simple and safe way of iterating over an iterator. You can use tp_iternext, but then you have to be careful not to leave a StopIteration exception set when the loop ends.