Python's REPL reads input line by line. However, function definitions consist of multiple lines.
For example:
>>> def answer():
...     return 42
...
>>> answer()
42
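The fact that the first line alone is not yet a finished statement can also be observed from Python itself. This is a minimal sketch using the standard library's codeop module, which is a pure-Python helper and not the C code path of the built-in REPL; the is_complete name is made up for illustration.

```python
import codeop


def is_complete(source: str) -> bool:
    """Return True if `source` compiles as one finished interactive statement."""
    # compile_command returns None when more input is needed,
    # a code object when the statement is complete,
    # and raises SyntaxError when the input is invalid.
    return codeop.compile_command(source, "<stdin>", "single") is not None


partial = "def answer():"
full = "def answer():\n    return 42\n"

print(is_complete(partial))  # False: the body is still missing
print(is_complete(full))     # True: the trailing newline closes the block

# Running the finished statement defines the function, as in the session above.
namespace = {}
exec(codeop.compile_command(full, "<stdin>", "single"), namespace)
print(namespace["answer"]())  # 42
```

Note that codeop decides by recompiling the accumulated text after every line, which may differ from how the built-in REPL asks for more input.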
How does CPython's parser request additional input after the partial "def answer():" line?
TL;DR: Digging into the CPython source code, I figured out that a lazy lexer outputs the ">>> " and "... " prompts.
Start with the pymain_repl function:
static void
pymain_repl(PyConfig *config, int *exitcode)
{
    /* ... */
    PyCompilerFlags cf = _PyCompilerFlags_INIT;
    int res = PyRun_AnyFileFlags(stdin, "<stdin>", &cf); // <-
    *exitcode = (res != 0);
}
This sets the name of the compiled file to "<stdin>".
If the file name is "<stdin>", then _PyRun_InteractiveLoopObject will be called.
It's the REPL loop itself. This is also where the ">>> " and "... " prompts are stored in global state (sys.ps1 and sys.ps2), unless they are already set:
int
_PyRun_InteractiveLoopObject(FILE *fp, PyObject *filename, PyCompilerFlags *flags)
{
    /* ... */
    PyObject *v = _PySys_GetAttr(tstate, &_Py_ID(ps1));
    if (v == NULL) {
        _PySys_SetAttr(&_Py_ID(ps1), v = PyUnicode_FromString(">>> ")); // <-
        Py_XDECREF(v);
    }
    v = _PySys_GetAttr(tstate, &_Py_ID(ps2));
    if (v == NULL) {
        _PySys_SetAttr(&_Py_ID(ps2), v = PyUnicode_FromString("... ")); // <-
        Py_XDECREF(v);
    }
    /* ... */
    do {
        ret = PyRun_InteractiveOneObjectEx(fp, filename, flags); // <-
        /* ... */
    } while (ret != E_EOF);
    return err;
}
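The prompt setup in that function boils down to "set a default only if nothing is there yet". A rough Python equivalent is sketched below, using a stand-in object instead of the real sys module so that nothing global is touched; ensure_prompts is a hypothetical helper, not a CPython function.

```python
from types import SimpleNamespace


def ensure_prompts(state):
    """Give `state` default ps1/ps2 attributes if it has none; return both."""
    if getattr(state, "ps1", None) is None:
        state.ps1 = ">>> "   # primary prompt
    if getattr(state, "ps2", None) is None:
        state.ps2 = "... "   # continuation prompt
    return state.ps1, state.ps2


fake_sys = SimpleNamespace()           # stand-in with no prompts set
print(ensure_prompts(fake_sys))        # ('>>> ', '... ')

custom = SimpleNamespace(ps1="py> ")   # a user-defined ps1 is left alone
print(ensure_prompts(custom))          # ('py> ', '... ')
```

This is why assigning sys.ps1 or sys.ps2 in a running session changes the prompts: the loop only fills in values that are missing.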
PyRun_InteractiveOneObjectEx reads, parses, compiles and runs a single statement:
static int
PyRun_InteractiveOneObjectEx(FILE *fp, PyObject *filename,
                             PyCompilerFlags *flags)
{
    /* ... */
    v = _PySys_GetAttr(tstate, &_Py_ID(ps1)); // <-
    /* ... (ps1 is set to v) */
    w = _PySys_GetAttr(tstate, &_Py_ID(ps2)); // <-
    /* ... (ps2 is set to w) */
    mod = _PyParser_ASTFromFile(fp, filename, enc, Py_single_input,
                                ps1, ps2, flags, &errcode, arena);
    /* ... */
}
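Since ps1 and ps2 are handed to the parser, the prompt has to be shown from inside the tokenizer, at the moment it needs another line. The sketch below imitates that idea in Python with tokenize.generate_tokens, which only obtains text by calling a readline callback. It is an illustration, not CPython's implementation: prompting_readline is a made-up helper, the typed list stands in for keyboard input, and the switch from ps1 to ps2 after the first line is my reading of the behavior for a single statement.

```python
import tokenize


def prompting_readline(lines, ps1=">>> ", ps2="... "):
    """Build a readline callable that records the prompt for each requested line.

    `lines` stands in for what a user would type; a real REPL would
    print the prompt and read from the terminal instead.
    """
    shown = []
    pending = iter(lines)
    prompt = ps1

    def readline():
        nonlocal prompt
        shown.append(prompt)      # where a real REPL would print the prompt
        prompt = ps2              # later lines of this statement are continuations
        return next(pending, "")  # an empty string signals end of input

    return readline, shown


typed = ["def answer():\n", "    return 42\n", "\n"]
readline, shown = prompting_readline(typed)

# The tokenizer never sees the whole source up front; it only gets more
# text by calling readline() when it runs out of characters.
tokens = list(tokenize.generate_tokens(readline))

for prompt, line in zip(shown, typed):
    print(prompt + line, end="")
```

The printed transcript has the same shape as the session at the top: the first line is requested with the primary prompt and every following line with the continuation prompt.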
The parser pulls tokens lazily, so the prompt is only printed once the tokenizer reaches the tok_underflow_interactive function, which requests the next line of input, showing the prompt, through a PyOS_Readline(stdin, stdout, tok->prompt) call.

P.S.: The 'Your Guide to the CPython Source Code' article was really helpful. But beware: the linked source code comes from an older branch.