Tags: utf-8, coldfusion, character-encoding, ascii, iconv

How Can I Convert From US-ASCII to UTF-8 with iconv?


I'm attempting to batch-convert multiple files from US-ASCII to UTF-8. I've narrowed the problem down to the iconv command, which I appear to be using incorrectly, despite my best scouring of Stack Overflow. Any idea what's going wrong here?

Checking the encoding:

file -i accounting.cfm
accounting.cfm: text/html; charset=us-ascii

Attempt using iconv to convert:

iconv -f us-ascii -t utf-8 accounting.cfm > accounting.cfm.recode

Check the encoding on the resultant file:

file -i accounting.cfm.recode
accounting.cfm.recode: text/html; charset=us-ascii

It seems the resulting file is still reported as US-ASCII. When I save a single file through Sublime as UTF-8 (Save With Encoding), it shows the charset as utf-8. I understand that US-ASCII is a subset of UTF-8, but when I load the US-ASCII file in the browser, I get garbage characters (the dreaded question marks in diamonds). This is for a legacy ColdFusion site. When I load the file saved through Sublime's Save With Encoding, my foreign characters appear properly. Any ideas what I'm doing wrong? Thanks.
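For context, the unchanged charset is expected behavior rather than a failed conversion: every US-ASCII byte is also the valid UTF-8 encoding of the same character, so converting a pure-ASCII file produces byte-identical output, and file has nothing new to detect. The sketch below illustrates this on a throwaway file; the name sample.cfm and the temporary directory are assumptions made for the demo, not files from the question.

```shell
# Create a throwaway pure-ASCII file (sample.cfm is a hypothetical name).
tmpdir=$(mktemp -d)
printf 'Hello, world\n' > "$tmpdir/sample.cfm"

# Convert it the same way as in the question.
iconv -f us-ascii -t utf-8 "$tmpdir/sample.cfm" > "$tmpdir/sample.cfm.recode"

# cmp -s is silent and exits 0 only when both files are byte-for-byte identical.
if cmp -s "$tmpdir/sample.cfm" "$tmpdir/sample.cfm.recode"; then
    result="identical"
else
    result="different"
fi
echo "$result"
```

If the two files compare as identical, the command itself worked; the difference seen with the Sublime-saved file has to come from something other than the character bytes.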


Solution

  • So I figured it out. ColdFusion does need the BOM to work correctly, unless you want to put a <cfprocessingdirective pageencoding="utf-8"> tag at the top of each and every CFM file that may contain non-ASCII characters. References:

    https://forums.adobe.com/thread/930550
    https://www.adobe.com/support/coldfusion/internationalization/internationalization_cfmx/internationalization_cfmx3.html

    I'm a Sublime user, so I simply went to File -> Save With Encoding, UTF-8 with BOM, and it works without the tag. I then became quite happy that I spend most of my days in Python 3!
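For the batch case from the question, the same result can probably be had without opening each file in an editor. What follows is a minimal sketch, not a tested ColdFusion recipe: it assumes the files are already valid UTF-8 (or pure ASCII), that head -c is available (it is on GNU and BSD systems), and it uses a hypothetical helper named add_bom plus a temporary demo directory. It prepends the three-byte UTF-8 BOM (EF BB BF) and skips files that already start with one.

```shell
# add_bom is a hypothetical helper: prepend a UTF-8 BOM unless one is present.
add_bom() {
    bom_target=$1
    # Read the first three bytes of the file as a hex string.
    bom_head=$(head -c 3 "$bom_target" | od -An -tx1 | tr -d ' \n')
    if [ "$bom_head" = "efbbbf" ]; then
        return 0                      # already has a BOM, leave it alone
    fi
    bom_tmp=$(mktemp) || return 1
    printf '\357\273\277' > "$bom_tmp"    # EF BB BF written as octal escapes
    cat "$bom_target" >> "$bom_tmp"
    cat "$bom_tmp" > "$bom_target"        # rewrite in place, keeping permissions
    rm -f "$bom_tmp"
}

# Demo on a throwaway directory; point the loop at real files with care.
demo=$(mktemp -d)
printf '<cfoutput>caf\303\251</cfoutput>\n' > "$demo/page.cfm"   # UTF-8, no BOM

for f in "$demo"/*.cfm; do
    add_bom "$f"
done

# Show the first three bytes of the processed file in hex.
head -c 3 "$demo/page.cfm" | od -An -tx1
```

Running the loop a second time should leave the files untouched, since the helper checks for an existing BOM first. Files stored in some other legacy encoding would still need a real iconv conversion from that encoding before a BOM is added.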