PyFlink - Kafka - Missing module


I am trying to get started with PyFlink and Kafka, but I get the error below.

Thanks for your support!

Installation

python -m pip install apache-flink
pip install pyFlink 

Code

from pyFlink.datastream import StreamExecutionEnvironment

Error

ModuleNotFoundError: No module named 'pyFlink'

Solution

  • To install PyFlink, you only need to run:

    python -m pip install apache-flink

    and make sure your Python version is supported by the Flink release you install (>= 3.5 when this was written; newer releases require a more recent Python).

    Python imports are case-sensitive; the error is raised because the package is named "pyflink", not "pyFlink". Use this import instead:

    from pyflink.datastream import StreamExecutionEnvironment

    If you're going to use Kafka, remember to also add the required connector JAR dependencies, for example:

    # t_env is assumed to be an existing TableEnvironment
    config = t_env.get_config().get_configuration()
    config.set_string("pipeline.jars",
                      "file:///path/to/jar/jarfile.jar")

    You can read more about handling connectors and other dependencies in the PyFlink documentation.
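
    As a quick sanity check for the two points above, here is a minimal sketch using only the standard library. The helper names module_available and build_pipeline_jars are hypothetical, not part of PyFlink. The first confirms which spelling of the module name Python can actually find. The second joins several absolute jar paths into the semicolon-separated list of file:// URLs that "pipeline.jars" accepts. It assumes POSIX-style absolute paths without characters that need URL escaping.

```python
import importlib.util


def module_available(name: str) -> bool:
    """Return True if a top-level module with exactly this name can be found."""
    return importlib.util.find_spec(name) is not None


def build_pipeline_jars(jar_paths):
    """Join absolute POSIX jar paths into a ';'-separated list of file:// URLs."""
    urls = []
    for path in jar_paths:
        if not path.startswith("/"):
            raise ValueError(f"expected an absolute path, got: {path!r}")
        urls.append("file://" + path)
    return ";".join(urls)


if __name__ == "__main__":
    # The result depends on what is installed in the current environment.
    for name in ("pyflink", "pyFlink"):
        print(name, "->", module_available(name))
    # Placeholder path taken from the answer above.
    print(build_pipeline_jars(["/path/to/jar/jarfile.jar"]))
```

    The string returned by build_pipeline_jars can be passed as the value in the set_string("pipeline.jars", ...) call shown earlier.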