We have many RTF files that we need to upload to Oracle EBS under their respective categories. To do so, we need to read some information stored in the Document Properties of each RTF file. The fields are Title, Subject, Author, Company and Category.
When we open an RTF file in Notepad we can see this information, but we are not sure how to extract it with Linux commands. Using grep wasn't very successful.
Here is the part of the RTF file that holds this information:
\mwrapIndent1440\mintLim0\mnaryLim1}{\info{\title ^XXSLS_GBL_ORDACK^}{\subject XXSLS}{\author ^es_ES,es_FR,ES_IT,ES_de^}{\doccomm $Header: XXSLS_GBL_ORDACK_ES_ES.rtf $}
{\operator }{\creatim\yr2012\mo11\dy11\hr14\min3}{\revtim\yr2013\mo3\dy2\hr10\min43}{\version24}{\edmins361}{\nofpages4}{\nofwords725}{\nofchars14202}{\*\manager }{\*\company }{\*\category ^BD^}{\nofcharsws14898}
{\vern32773}}{\*\userprops {\propname _DocHome}\proptype3{\staticval -974575144}}{\*\xmlnstbl {\xmlns1 http://schemas.microsoft.com/office/word/2003/wordml}}\paperw11850\paperh18144\margl851\margr851\margt851\margb0\gutter0\ltrsect
Can someone please suggest how we can extract this info as follows:
Title=^XXSLS_GBL_ORDACK^
Subject=XXSLS
Author=^es_ES,es_FR,ES_IT,ES_de^
Category=^BD^
grep can do it with the -E flag (extended regular expressions) and the -o flag (print only the matching part of each line).
# Anchor each pattern on the leading backslash so that only the RTF control
# word matches, not the same word appearing in the document text.
title=$(grep -oE '\\title [^}]+' file.rtf | sed 's/^\\title //')
echo "title=$title"
subject=$(grep -oE '\\subject [^}]+' file.rtf | sed 's/^\\subject //')
echo "subject=$subject"
author=$(grep -oE '\\author [^}]+' file.rtf | sed 's/^\\author //')
echo "author=$author"
category=$(grep -oE '\\category [^}]+' file.rtf | sed 's/^\\category //')
echo "category=$category"
I get
title=^XXSLS_GBL_ORDACK^
subject=XXSLS
author=^es_ES,es_FR,ES_IT,ES_de^
category=^BD^
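Since there are many files and five fields, the same grep/sed idea can be wrapped in a small function. This is only a sketch: the function name rtf_props is made up for illustration, it assumes each property sits on one line in the form {\key value} or {\*\key value}, and it does not decode RTF escapes such as \'e9 inside a value. An empty property (like Company in the sample) is printed with an empty value.

```shell
# Print selected RTF document properties as Label=value lines.
# rtf_props is a hypothetical helper name. Usage: rtf_props file.rtf
rtf_props() {
    file=$1
    for pair in Title:title Subject:subject Author:author Company:company Category:category; do
        label=${pair%%:*}   # text before the colon, e.g. Title
        key=${pair##*:}     # text after the colon, e.g. title
        # Match "\key value" up to the closing brace and keep the first hit only,
        # then strip the control word so that just the value remains.
        value=$(grep -oE "\\\\$key [^}]*" "$file" | head -n 1 | sed "s/^\\\\$key //")
        printf '%s=%s\n' "$label" "$value"
    done
}
```

To process a whole directory, something like for f in *.rtf; do echo "== $f"; rtf_props "$f"; done should work, with the output then fed to whatever performs the EBS upload.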