python, rtf, text-extraction

Read RTF file using python


  • Reading an RTF file using striprtf.
  • rtf_to_text is not able to read the URL. What changes do I need to make in the code?

Input: Get latest news update at [email protected]

Output: Get latest news update at

Desired output: Get latest news update at [email protected]

Python code:

import os
from striprtf.striprtf import rtf_to_text
import pandas as pd
from os import path

path_of_the_directory= r'C:\Users\Documents\filename.rtf'
print("Files and directories in a specified path:")
for filename in os.listdir(path_of_the_directory):
    f = os.path.join(path_of_the_directory,filename)
    
    if os.path.isfile(f):
      print(f)
      open_rtf_file=open(f,'r')
      file_content_read=open_rtf_file.read()
      text_content=rtf_to_text(file_content_read)
      print(text_content)

Solution

  • It looks like you are treating a file as a directory. Your path_of_the_directory variable is actually the path to an RTF file, and os.listdir fails when it is given the path of a file. Without knowing the specific error you get at runtime, that looks like the problem to me. An easy fix is to check that the path is a directory before calling os.listdir, as in the example below.

    path_of_the_directory= r'C:\Users\Documents\filename.rtf' #<--- this is a file
    print("Files and directories in a specified path:")
    if os.path.isdir(path_of_the_directory):                 # check if path is a directory
        for filename in os.listdir(path_of_the_directory):
            f = os.path.join(path_of_the_directory,filename)

            if os.path.isfile(f):
                print(f)
                with open(f,'r') as open_rtf_file:           # closes the file when done
                    file_content_read=open_rtf_file.read()
                text_content=rtf_to_text(file_content_read)
                print(text_content)
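
    Since the path in the question points to a single file, a directory check on its own means that file is never read. Here is a minimal sketch that accepts either a file or a directory. The names iter_rtf_paths, read_rtf_texts and the convert parameter are made up for illustration, and filtering on the .rtf extension is an assumption about what the folder contains.

```python
import os


def iter_rtf_paths(path):
    """Yield RTF file paths for `path`, which may be a file or a directory."""
    if os.path.isfile(path):
        # A single file was given: use it directly.
        yield path
    elif os.path.isdir(path):
        # A directory was given: pick the regular files ending in .rtf.
        for name in sorted(os.listdir(path)):
            full_path = os.path.join(path, name)
            if os.path.isfile(full_path) and name.lower().endswith(".rtf"):
                yield full_path
    else:
        raise FileNotFoundError(path)


def read_rtf_texts(path, convert):
    """Return {file path: converted text}.

    `convert` is any function that maps raw RTF source to plain text,
    for example striprtf's rtf_to_text (passed in by the caller).
    """
    texts = {}
    for rtf_path in iter_rtf_paths(path):
        with open(rtf_path, "r") as handle:
            texts[rtf_path] = convert(handle.read())
    return texts
```

    With striprtf installed, this could be called as read_rtf_texts(r'C:\Users\Documents\filename.rtf', rtf_to_text) after from striprtf.striprtf import rtf_to_text. Passing the converter in as an argument keeps the path handling separate from the RTF parsing.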