Search code examples
htmlapachehaskellxhtmlcgi

Haskell XHTML works with manually typed data, but not with data read from file


I'm working on a board game where each team must submit an order each turn. To prevent abuse, I'm trying to make a login page, where you can select a team, give the team's password, and continue to the next page.

I'm using Haskell, using the resources provided here, specifically the "Getting user input" section.

Relevant documentation: Network.CGI, Text.XHtml

Imports, relevant data/types, and page function:

import Network.CGI
import Text.XHtml

data Team = Team
      {teamID :: Int,
     teamName :: String} deriving Eq
type Lang = Int
type Teams = [Team]

page :: String -> Html -> Html
page t b = header << thetitle << t +++ body << b

I have the following loginPage function:

loginPage :: Lang -> Teams -> Html
loginPage lang teams = page (["Lépés Bejelentkezés", "Turn Login"] !! lang) $
  form ! [method "post"] << 
    -- [paragraph << (["Csapat: ", "Team: "] !! lang +++ (select ! [name "teamID"] << teamOpts)), -- Generated version, does not work
    {- -}
    [paragraph << (["Csapat: ", "Team: "] !! lang +++
      (select ! [name "teamID"] <<
        [option ! [value "0"] << "Anglia", option ! [value "1"] << "Franciaország"])), --} -- Manually typed version, works perfectly
     paragraph << (["Jelszó: ", "Password: "] !! lang +++ password "password"),
     submit "" (["Tovább", "Next"] !! lang) ]
  where
    teamOpts = map (\t -> option ! [value . show $ teamID t] << teamName t) $ teams

The commented line uses teamOpts to generate a list of options, and put them in a select tag, with name "teamID". In the currently uncommented lines, I wrote (part of) the list I expect when calling teamOpts.

In ghci, both methods produce the exact same HTML. Yet, when calling this program on my webserver (Apache on Rocky Linux), I get the following outputs:

  • Generated:

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml"
><head
  ><title
    >L&#233;p&#233;s Bejelentkez&#233;s</t

  • Manually written:

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml"
><head
  ><title
    >L&#233;p&#233;s Bejelentkez&#233;s</title
    ></head
  ><body
  ><form method="post"
    ><p
      >Csapat: <select name="teamID"
    ><option value="0"
      >Anglia</option
      ><option value="1"
      >Franciaorsz&#225;g</option
      ></select
    ></p
      ><p
      >Jelsz&#243;: <input type="password" name="password" id="password"
     /></p
      ><input type="submit" value="Tov&#225;bb"
       /></form
    ></body
  ></html
>

As you can see, the generated version simply terminates before even finishing the title tag.

When running the script on the command line (on the webserver), I get the expected outcome (same as Manually written), with the CGI Header: Content-type: text/html; charset=ISO-8859-1 . I've also set it to Content-type: text/html; charset=UTF-8, but the same problem persists.

Other things I tried:

  • Using (teams seq ) before creating the list, and other methods of forcing evaluation (Usually the program terminates after returning <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/

  • Trying to debug using a simpler generator: (Works perfectli in cli, but not when loading it, returning up to </t)

showTeams :: Teams -> Html
showTeams teams = page "Teams" $
  paragraph << (concat . map (\t -> teamName t ++ ", ") $ teams)
  • Trying to use strict versions of IO functions (as recommended here). Worked in cli, but gave hGetContents: invalid argument (invalid byte sequence) errors over the web.

  • Making sure everything in the directory is owned by apache:apache

The necessary code (newUnitsPage, cgiMain, and main):

newUnitsPage :: Lang -> Teams -> Units -> Maybe String -> Maybe String -> Html
newUnitsPage lang teams units tid passwd = page (["Új egységek", "New units"] !! lang) body
  where
    body = paragraph << "PLACEHOLDER" +++ paragraph << fromJust tid +++ paragraph <<  fromJust passwd

cgiMain = do
  -- General setup
  liftIO $ hSetEncoding stdin utf8 -- This doesn't change anything either
  -- (What I use) {- -
  paths' <- liftIO $ listDirectory "./"
  let mapPaths = sort $ filter (=~ "\\.hmap$") paths'
  hmap <- liftIO $ getNewestMap mapPaths
  let teams = fetchTeams hmap --}
  {- For your convenience:
     Write the following to a file named "test.hmap":
     Team {teamID = 0, teamName = "Anglia"}
     Team {teamID = 1, teamName = "Franciaország"}
  -}
  test <- readFile "test.hmap" -- Pretty sure this is where it all goes wrong, but strict reading (Sysem.IO.Strict) does not fix it
  let teams = map (\line -> read line :: Team) . lines $ test

  -- Defaults to 0 (Hungarian)
  mlang <- getInput "lang"
  let lang = maybe 0 (\l -> if l `elem` ["1", "en"] then 1 else 0) mlang

  -- All Inputs
  -- Authentication
  tid <- getInput "teamID"
  password <- getInput "password"

  newUnitOrders <- getInput "newUnitOrders" -- This is for the next page, not yet implemented, since login doesn't work yet.

  -- Number coding for which form to show - method to show certain form based on what inputs exist
  let code = fromJust $ foldM (\lastCode (mInput, code) -> if isNothing mInput then Just lastCode else Just code)
        0 -- If username / password is not supplied, be on login page
        [(tid,1),(password,1), -- If newUnitOrders are not supplied, be on newUnit page
         (newUnitOrders,2)] -- Etc.

  -- The html output
  let pages =
        [loginPage lang teams,
        -- [showTeams teams,
         newUnitsPage lang teams units tid password]

  setHeader "Content-type" "text/html; charset=UTF-8" -- Optional
  output . renderHtml $ pages !! code

main = runCGI $ handleErrors cgiMain

I've checked over and over in the documentation, and I've found no indication of what is going wrong.

Thanks for any help!


Solution

  • I was able to duplicate with your test.hmap version. Check the server error logs, and you should see a note about invalid byte sequences in your hmap files, along the lines of:

    AH01215: xhtml.cgi: test.hmap: hGetContents: invalid argument (invalid
    byte sequence): /usr/lib/cgi-bin/xhtml.cgi
    

    The problem appears to be that Apache runs CGI scripts with LANG=C, and the Haskell script will die at some random point in time when it reads unicode data from one of your hmap files. Making the evaluation stricter may cause the script to fail earlier, but it isn't going to fix the problem.

    The easiest fix is probably to add:

    liftIO $ setLocaleEncoding utf8   -- from GHC.IO.Encoding
    

    to the top of your cgiMain function. (Changing the encoding of stdin wasn't necessary in my testing.)

    Here's my full version of your script used for testing. Without the liftIO $ setLocaleEncoding utf8 line, it truncates output exactly the same way you observed; with that line, it works fine:

    import Control.Monad
    import Data.Maybe
    import Data.List
    import System.Directory
    import System.IO
    import Network.CGI
    import Text.XHtml
    import GHC.IO.Encoding
    
    data Team = Team
          {teamID :: Int,
         teamName :: String} deriving (Read, Eq)
    type Lang = Int
    type Teams = [Team]
    
    page :: String -> Html -> Html
    page t b = header << thetitle << t +++ body << b
    
    loginPage :: Lang -> Teams -> Html
    loginPage lang teams = page (["Lépés Bejelentkezés", "Turn Login"] !! lang) $
      form ! [method "post"] <<
        [paragraph << (["Csapat: ", "Team: "] !! lang +++ (select ! [name "teamID"] << teamOpts)), -- Generated version, does not work
         paragraph << (["Jelszó: ", "Password: "] !! lang +++ password "password"),
         submit "" (["Tovább", "Next"] !! lang) ]
      where
        teamOpts = map (\t -> option ! [value . show $ teamID t] << teamName t) $ teams
    
    cgiMain :: CGI CGIResult
    cgiMain = do
      liftIO $ setLocaleEncoding utf8
    
      paths' <- liftIO $ listDirectory "./"
      test <- liftIO $ readFile "test.hmap" -- Pretty sure this is where it all goes wrong, but strict reading (Sysem.IO.Strict) does not fix it
      let teams = map (\line -> read line :: Team) . lines $ test
    
      -- Defaults to 0 (Hungarian)
      mlang <- getInput "lang"
      let lang = maybe 0 (\l -> if l `elem` ["1", "en"] then 1 else 0) mlang
    
      -- All Inputs
      -- Authentication
      tid <- getInput "teamID"
      password <- getInput "password"
    
      newUnitOrders <- getInput "newUnitOrders" -- This is for the next page, not yet implemented, since login doesn't work yet.
    
      -- Number coding for which form to show - method to show certain form based on what inputs exist
      let code = fromJust $ foldM (\lastCode (mInput, code) -> if isNothing mInput then Just lastCode else Just code)
            0 -- If username / password is not supplied, be on login page
            [(tid,1),(password,1), -- If newUnitOrders are not supplied, be on newUnit page
             (newUnitOrders,2)] -- Etc.
    
      -- The html output
      let pages =
            [loginPage lang teams]
            -- [showTeams teams,
            -- newUnitsPage lang teams units tid password]
    
      setHeader "Content-type" "text/html; charset=UTF-8" -- Optional
      output . renderHtml $ pages !! code
    
    main :: IO ()
    main = runCGI $ cgiMain