Below is the original NFO file, in the format that Emby uses:
<?xml version="1.0" encoding="utf-8" standalone="yes"?>
<movie>
<plot />
<outline />
<lockdata>false</lockdata>
<dateadded>2023-02-22 21:52:29</dateadded>
<title>old title</title>
<sorttitle>old title</sorttitle>
<runtime>119</runtime>
<fileinfo>
<streamdetails>
<video>
<codec>h264</codec>
<micodec>h264</micodec>
<bitrate>5744052</bitrate>
<width>1920</width>
<height>1080</height>
<aspect>16:9</aspect>
<aspectratio>16:9</aspectratio>
<framerate>29.96973</framerate>
<language>und</language>
<scantype>progressive</scantype>
<default>True</default>
<forced>False</forced>
<duration>119</duration>
<durationinseconds>7168</durationinseconds>
</video>
<audio>
<codec>aac</codec>
<micodec>aac</micodec>
<bitrate>256000</bitrate>
<language>und</language>
<scantype>progressive</scantype>
<channels>2</channels>
<samplingrate>48000</samplingrate>
<default>True</default>
<forced>False</forced>
</audio>
</streamdetails>
</fileinfo>
</movie>
I am trying to update the title with the Python script below:
import xml.etree.ElementTree as ET
title = "千と千尋の神隠し"
# Load the NFO file
filename = "movie.nfo"
tree = ET.parse(filename)
root = tree.getroot()
# Find the <title> tag and replace its text value with the new title
title_elem = root.find("title")
title_elem.text = title
# Write the updated XML structure to the NFO file
tree.write(filename, encoding="utf-8", xml_declaration=True)
But after I run the script, the title turns into garbled characters:
<title>σìâπü¿σìâσ░ïπü«τÑ₧ΘÜáπüù</title>
I know it must be an encoding issue, but I do not know how to solve it.
The NFO file should be updated to:
<title>千と千尋の神隠し</title>
You are facing a case of mojibake:
print("千と千尋の神隠し".encode('utf-8').decode('cp437'))
σìâπü¿σìâσ░ïπü«τÑ₧ΘÜáπüù
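The garbling is reversible, which suggests that no data was lost: re-encoding the garbled string as cp437 and decoding it as UTF-8 should give back the original title. A minimal sketch (the variable names are illustrative, not from the question):

```python
title = "千と千尋の神隠し"

# Simulate a viewer that interprets the UTF-8 bytes as code page 437.
garbled = title.encode("utf-8").decode("cp437")

# Reverse the misinterpretation: back to the raw bytes, then decode as UTF-8.
recovered = garbled.encode("cp437").decode("utf-8")

print(garbled)
print(recovered == title)
```

If the round trip succeeds, the bytes on disk are fine and only the way they are displayed is wrong.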
The problem is the .nfo file extension:
The NFO file extension is used for a Warez Information File developed by THG. NFO file is basically pirated information pertaining to a software or program that is released and distributed by any organized group without the knowledge or permission of the creator or owner of such programs…
Wikipedia (.nfo) says that NFO files often contain elaborate ANSI art (it is similar to ASCII art, but constructed from a larger set of 256 letters, numbers, and symbols, all codes found in IBM code page 437, often referred to as extended ASCII).
Oddly enough, *.nfo files are always recognized as OEM-US encoding, even in Notepad++ (see this issue at GitHub).
Result: your file is valid UTF-8; it is only displayed as if it were OEM-US (code page 437).
Proof #1:
import xml.etree.ElementTree as ET
# Load the NFO file
filename = "movie.nfo"
tree = ET.parse(filename)
root = tree.getroot()
# Find the <title> tag
title_elem = root.find("title")
print( title_elem.text)
千と千尋の神隠し
Proof #2:
filename = "movie.nfo"
with open(filename, mode='r', encoding='utf-8') as fnfo:
    lines = fnfo.readlines()
print([line for line in lines if '<title>' in line])
[' <title>千と千尋の神隠し</title>\n']
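A third check, at the byte level, can be sketched as follows. It assumes a tiny stand-in file written to a temporary directory rather than the real movie.nfo, and the names `stored_ok` and `looks_garbled` are hypothetical:

```python
import os
import tempfile
import xml.etree.ElementTree as ET

title = "千と千尋の神隠し"

# Build a small stand-in for movie.nfo in a temporary directory.
root = ET.Element("movie")
ET.SubElement(root, "title").text = title
path = os.path.join(tempfile.mkdtemp(), "movie.nfo")
ET.ElementTree(root).write(path, encoding="utf-8", xml_declaration=True)

# Read the file back as raw bytes, without any decoding.
with open(path, "rb") as f:
    raw = f.read()

# The title is stored as valid UTF-8 bytes.
stored_ok = title.encode("utf-8") in raw

# Decoding the same bytes as cp437 no longer contains the readable title,
# which is what an editor that assumes OEM-US would show.
looks_garbled = title not in raw.decode("cp437")

print(stored_ok, looks_garbled)
```

To view the file correctly, switch the editor's encoding for that file to UTF-8 instead of changing the script.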