For weeks I have been trying to install pdftotext for Python, but my earlier attempts failed because of poppler.
So recently I upgraded from Windows 10 to Windows 11 to enable sudo and use apt commands:
sudo apt-get update
sudo apt install python3-pip
sudo apt-get install python-poppler
sudo apt install build-essential libpoppler-cpp-dev pkg-config python3-dev
Issue:
Now when I go to cmd and run
pip install pdftotext
Error:
Collecting pdftotext
Using cached pdftotext-2.2.2.tar.gz (113 kB)
Preparing metadata (setup.py) ... done
Building wheels for collected packages: pdftotext
Building wheel for pdftotext (setup.py) ... error
error: subprocess-exited-with-error
× python setup.py bdist_wheel did not run successfully.
│ exit code: 1
╰─> [11 lines of output]
running bdist_wheel
running build
running build_ext
building 'pdftotext' extension
creating build
creating build\temp.win-amd64-cpython-39
creating build\temp.win-amd64-cpython-39\Release
"C:\Program Files (x86)\Microsoft Visual Studio\2022\BuildTools\VC\Tools\MSVC\14.41.34120\bin\HostX86\x64\cl.exe" /c /nologo /O2 /W3 /GL /DNDEBUG /MD -DPOPPLER_CPP_AT_LEAST_0_30_0=0 -DPOPPLER_CPP_AT_LEAST_0_58_0=0 -DPOPPLER_CPP_AT_LEAST_0_88_0=0 -IC:\Users\vinee\anaconda3\include -IC:\Users\vinee\anaconda3\Include "-IC:\Program Files (x86)\Microsoft Visual Studio\2022\BuildTools\VC\Tools\MSVC\14.41.34120\include" "-IC:\Program Files (x86)\Microsoft Visual Studio\2022\BuildTools\VC\Auxiliary\VS\include" /EHsc /Tppdftotext.cpp /Fobuild\temp.win-amd64-cpython-39\Release\pdftotext.obj -Wall
pdftotext.cpp
C:\Users\vinee\anaconda3\include\pyconfig.h(59): fatal error C1083: Cannot open include file: 'io.h': No such file or directory
error: command 'C:\\Program Files (x86)\\Microsoft Visual Studio\\2022\\BuildTools\\VC\\Tools\\MSVC\\14.41.34120\\bin\\HostX86\\x64\\cl.exe' failed with exit code 2
[end of output]
note: This error originates from a subprocess, and is likely not a problem with pip.
ERROR: Failed building wheel for pdftotext
Running setup.py clean for pdftotext
Failed to build pdftotext
ERROR: Could not build wheels for pdftotext, which is required to install pyproject.toml-based projects
For this issue I referred to this SO post, which mentions installing CMake. I already have it, but I ran the install again:
C:\Windows\System32>pip install Cmake
WARNING: Ignoring invalid distribution -cipy (c:\users\vinee\anaconda3\lib\site-packages)
Requirement already satisfied: Cmake in c:\users\vinee\anaconda3\lib\site-packages (3.30.4)
WARNING: Ignoring invalid distribution -cipy (c:\users\vinee\anaconda3\lib\site-packages)
But I am still stuck on the build wheel error. What should I do next? I really need help with this.
Update: I came across this SO post about the missing io.h file or directory, and I tried the command below:
set LIB=C:\Program Files (x86)\Windows Kits\10\Redist\ucrt\DLLs\x64
But I am still getting the same error.
The issue is that your C++ compiler cannot find the header files (here io.h, which ships with the Windows SDK). According to this issue, you need to make sure the following are installed: Visual C++ Build Tools core features, the MSVC C++ toolset, the Visual C++ Redist, and the Windows 10 (or in your case Windows 11) SDK. I found another response where installing the Windows SDK solved the issue.
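A quick way to check whether the SDK headers are actually present is to look for io.h under the SDK include folder. The sketch below assumes the default install location C:\Program Files (x86)\Windows Kits\10\Include; the function name find_ucrt_headers is made up for illustration.

```python
from pathlib import Path

# Assumed default SDK include root; adjust if the SDK was installed elsewhere.
DEFAULT_SDK_INCLUDE = Path(r"C:\Program Files (x86)\Windows Kits\10\Include")


def find_ucrt_headers(include_root=DEFAULT_SDK_INCLUDE, header="io.h"):
    """Return every <include_root>/<sdk version>/ucrt folder that contains `header`."""
    root = Path(include_root)
    if not root.is_dir():
        return []
    return sorted(
        version_dir / "ucrt"
        for version_dir in root.iterdir()
        if (version_dir / "ucrt" / header).is_file()
    )


if __name__ == "__main__":
    hits = find_ucrt_headers()
    if hits:
        for ucrt_dir in hits:
            print("found io.h in", ucrt_dir)
    else:
        print("io.h not found; the Windows SDK is probably not installed")
```

If nothing is found, installing the Windows SDK through the Visual Studio Installer is the likely fix.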
Another option is to use the set INCLUDE and set LIB commands to tell the compiler where the header and library files are located. Keep in mind this option only works if the header files are already installed somewhere on your machine (see the first link for more info on this).
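As a rough sketch of what those two variables could contain, the helper below builds INCLUDE and LIB values from the usual Windows SDK layout (Include\<version>\ucrt, shared, um and Lib\<version>\ucrt, um). The helper name sdk_env, the default root, and the version folder "10.0.22621.0" are assumptions for illustration; use whatever folder exists on your machine. Note that the io.h error is about a header, so INCLUDE is the variable that matters here, and the Redist\ucrt\DLLs folder contains no headers.

```python
import os
from pathlib import PureWindowsPath

# Assumed default SDK root; adjust if the SDK was installed elsewhere.
DEFAULT_KITS_ROOT = r"C:\Program Files (x86)\Windows Kits\10"


def sdk_env(kits_root, sdk_version, arch="x64", base_env=None):
    """Return a copy of base_env with the SDK dirs prepended to INCLUDE and LIB."""
    env = dict(os.environ if base_env is None else base_env)
    kits = PureWindowsPath(kits_root)
    include_dirs = [kits / "Include" / sdk_version / sub for sub in ("ucrt", "shared", "um")]
    lib_dirs = [kits / "Lib" / sdk_version / sub / arch for sub in ("ucrt", "um")]
    for name, dirs in (("INCLUDE", include_dirs), ("LIB", lib_dirs)):
        parts = [str(d) for d in dirs]
        if env.get(name):
            parts.append(env[name])  # keep whatever was already set
        env[name] = ";".join(parts)
    return env


if __name__ == "__main__":
    # "10.0.22621.0" is only an example version folder name.
    env = sdk_env(DEFAULT_KITS_ROOT, "10.0.22621.0", base_env={})
    print("set INCLUDE=" + env["INCLUDE"] + ";%INCLUDE%")
    print("set LIB=" + env["LIB"] + ";%LIB%")
```

The printed lines can be pasted into the same cmd window before running pip install pdftotext. Even with these set, the build will probably still need the poppler-cpp headers to be visible to the Windows compiler; if the apt commands were run inside WSL, those packages are not available to a pip run from cmd.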