First, I tried to find "ifrs_Revenue" and change the word located next to it, but it failed.
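f = open(input_path + '/CIS' + "/16_1분기보고서_03_포괄손익계산서_연결_2212.txt")
line = f.readline()  # read the first line so the loop condition is defined
while line:
    if "ifrs_Revenue" in line:
        s_line = line.split("\t")
        idx = s_line.index("ifrs_Revenue")
        value = s_line[idx+1]
        value = value.replace(value, '매출액')  # only changes the local variable, not the file
        break
    line = f.readline()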
Then, I found another way to replace specific words in the same file at once.
def inplace_change(filename, old_string, new_string):
    with open(filename) as f:
        s = f.read()
        if old_string not in s:
            print('"{old_string}" not found in {filename}.'.format(**locals()))
            return
    with open(filename, 'w') as f:
        print('Changing "{old_string}" to "{new_string}" in {filename}'.format(**locals()))
        s = s.replace(old_string, new_string)
        f.write(s)
b_list = os.listdir(input_path + '/CIS')
for blist in b_list:
    for old, new in zip([' 지배기업의 소유주에게 귀속되는 당기순이익(손실)', '수익(매출액)', '영업수익', '영업이익(손실)', '관리비및판매비', '영업관리비용(수익)', ' 지배기업의 소유주지분'], ['당기순이익(지배)', '매출액', '매출액', '영업이익', '판매비와관리비', '판매비와관리비', '당기순이익(지배)']):
        inplace_change(input_path + '/CIS' + '/' + blist, old_string=old, new_string=new)
    break
What I want is to uniformly replace the word that comes right after a specific word, but I could not find a way to do it no matter how much I searched, so I am asking here. English is not my first language and I am using a translator, so thank you for your understanding.
I am attaching a picture to help you understand. Non-English words are Korean: Picture
I made a simple example file that mimics the data you are using. All fields are separated by tab characters ("\t"):
col1 col2 col3
randomwords ifrs_Revenue replaceme
morerandomwords ifrs_CostOfSales this_should_stay_the_same
asdfasdfasdf ifrs_Revenue alsoreplaceme
jajajajajaja ifrsGrossProfit this_should_not_be_replaced
I then use the pandas module to find all rows where "col2" == "ifrs_Revenue". In your case, replace "col2" with the name of the column you are searching, and replace "col3" with the name of the column whose values you want to change. The code is as follows:
import pandas as pd

df = pd.read_csv("example.txt", sep="\t")  # read in data
# NOTE: make sure to replace "example.txt" with your own filename
print(df.head())
mask = df.col2 == "ifrs_Revenue"  # boolean mask that is True for every row with "ifrs_Revenue"
df.loc[mask, "col3"] = "REPLACED_VALUE"  # "REPLACED_VALUE" is the value you want to use as the replacement
# also replace "col3" with the column you are replacing
print("=" * 50)
print(df.head())
df.to_csv("results.tsv", sep="\t", index=False)  # saves the results without the row index; change "results.tsv" to whatever filename you want
These are the results:
col1 col2 col3
0 randomwords ifrs_Revenue REPLACED_VALUE
1 morerandomwords ifrs_CostOfSales this_should_stay_the_same
2 asdfasdfasdf ifrs_Revenue REPLACED_VALUE
3 jajajajajaja ifrsGrossProfit this_should_not_be_replaced
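If you would rather not depend on pandas, or your files have no header row, the same idea can be sketched in plain Python: split each line on tabs and, whenever a field equals the marker, overwrite the field right after it. This is only a minimal sketch. The helper names `replace_after_marker`, `rewrite_file` and `rewrite_folder` are my own, and I am assuming the files are UTF-8 encoded and that the marker fills a whole tab-separated field (pass a different `encoding`, for example "cp949", if your files use another one).

```python
import os


def replace_after_marker(line, marker, new_value):
    """Return `line` (without its newline) with the field after `marker` replaced."""
    fields = line.rstrip("\n").split("\t")
    # Stop one short of the end: a marker in the last field has nothing after it.
    for i, field in enumerate(fields[:-1]):
        if field == marker:
            fields[i + 1] = new_value
    return "\t".join(fields)


def rewrite_file(path, marker, new_value, encoding="utf-8"):
    """Rewrite one tab-separated file in place (encoding is an assumption)."""
    with open(path, encoding=encoding) as f:
        lines = f.read().splitlines()
    new_lines = [replace_after_marker(l, marker, new_value) for l in lines]
    with open(path, "w", encoding=encoding) as f:
        f.write("\n".join(new_lines) + "\n")


def rewrite_folder(folder, marker, new_value, encoding="utf-8"):
    """Apply rewrite_file to every file directly inside `folder`."""
    for name in os.listdir(folder):
        path = os.path.join(folder, name)
        if os.path.isfile(path):
            rewrite_file(path, marker, new_value, encoding=encoding)
```

For your case the call would look something like `rewrite_folder(input_path + '/CIS', "ifrs_Revenue", "매출액")`. Since the files are overwritten in place, try it on a copy of the folder first.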
Let me know if you need any clarification!