utf-8 asp-classic utf8-decode

ASP: I can't decode some characters from UTF-8 to ISO-8859-1


I use this function to decode UTF-8:

function DecodeUTF8(s)
  dim i
  dim c
  dim n
  i = 1
  do while i <= len(s)
    c = asc(mid(s,i,1))
    if c and &H80 then
      n = 1
      do while i + n < len(s)
        if (asc(mid(s,i+n,1)) and &HC0) <> &H80 then
          exit do
        end if
        n = n + 1
      loop
      if n = 2 and ((c and &HE0) = &HC0) then
        c = asc(mid(s,i+1,1)) + &H40 * (c and &H01)
      else
        c = 191 
      end if
      s = left(s,i-1) + chr(c) + mid(s,i+n)
    end if
    i = i + 1
  loop

  DecodeUTF8 = s
end function

But it fails to decode these characters:

€‚ƒ„…†‡ˆ‰Š‹ŒŽ‘’“”•–—˜™š›œžŸ

In those cases the function takes the else branch and sets c = 191, so the output is '¿'.

I found some information related to this problem: http://www.i18nqa.com/debug/utf8-debug.html

Do you know of a function that decodes them correctly?


Solution

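Public Function DecodeUTF8(s)
  Dim stmANSI
  Set stmANSI = Server.CreateObject("ADODB.Stream")
  s = s & ""   ' coerce Null/Empty to a string
  On Error Resume Next

  With stmANSI
    .Open
    .Position = 0
    ' Write the garbled text back out as Windows-1252 bytes...
    .CharSet = "Windows-1252"
    .WriteText s
    ' ...then rewind and read those same bytes as UTF-8.
    .Position = 0
    .CharSet = "UTF-8"
  End With

  DecodeUTF8 = stmANSI.ReadText
  stmANSI.Close

  If Err.Number <> 0 Then
    ' lib.logger appears to be the answerer's own logging helper, not a
    ' built-in; replace it with whatever logging you use.
    lib.logger.error "str.DecodeUTF8( " & s & " ): " & Err.Description
    DecodeUTF8 = s   ' fall back to the unmodified input
  End If
  On Error GoTo 0
End Function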
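
A small standalone script can show why the characters in the question fail and why the stream round-trip recovers them. Those characters (€, ‚, ƒ and the rest) are not in ISO-8859-1 at all. Windows-1252 places them at bytes 0x80 to 0x9F, and their Unicode code points lie above U+00FF, so their UTF-8 encodings are sequences the question's function does not decode correctly (for example, € is U+20AC, encoded as the bytes E2 82 AC). The sketch below is only an illustration: it assumes Windows Script Host (run with cscript) and plain CreateObject instead of Server.CreateObject, and the helper name Reinterpret and its parameters are made up for this example.

```vbscript
Option Explicit

' Hypothetical helper (not from the original answer): encode s using
' wrongCharset, then decode the resulting bytes as realCharset.
Function Reinterpret(s, wrongCharset, realCharset)
  Dim stm
  Set stm = CreateObject("ADODB.Stream")
  stm.Open                      ' a new stream defaults to text mode
  stm.Charset = wrongCharset
  stm.WriteText s               ' characters -> bytes in wrongCharset
  stm.Position = 0              ' Charset can only change at position 0
  stm.Charset = realCharset
  Reinterpret = stm.ReadText    ' bytes -> characters in realCharset
  stm.Close
  Set stm = Nothing
End Function

Dim garbled
' The UTF-8 bytes E2 82 AC (the euro sign) misread as Windows-1252 appear
' as three characters: U+00E2, U+201A, U+00AC.
garbled = ChrW(&HE2) & ChrW(&H201A) & ChrW(&HAC)

If Reinterpret(garbled, "Windows-1252", "UTF-8") = ChrW(&H20AC) Then
  WScript.Echo "euro sign recovered"
Else
  WScript.Echo "unexpected result"
End If
```

Under ASP, the DecodeUTF8 function above performs the same round-trip through Server.CreateObject, with the two charsets fixed to Windows-1252 and UTF-8.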