Extending Determiners in GF


In an earlier question, I asked about extending the which_RP record in order to change the relative pronoun "that" to "which".

Recently I started working on Italian, and whenever I use the functions someSg_Det and somePl_Det, I get the following results, respectively:

qualche albero
qualche alberi

From my knowledge of Italian, qualche albero is correct for singular nouns, but for plural nouns we must use alcune or alcuni, depending on the gender of the noun. So I tried to extend the function somePl_Det into somePlM_Det and somePlF_Det as below:

   somePlM_Det : Str -> Det =
        \str ->
             somePl_Det ** {s = table {_ => "alcuni"}};
   somePlF_Det : Str -> Det =
        \str ->
             somePl_Det ** {s = table {_ => "alcune"}};

But the GF compiler keeps reporting this error:

Happend in operation somePlM_Det
   type of "alcuni"
   Expected: DiffIta.Case => Str;
   inferred: Str;
Happend in operation somePlF_Det
   type of "alcune"
   Expected: DiffIta.Case => Str;
   inferred: Str;

The extension seems to be missing the parameter DiffIta.Case. After searching, I found that param Case is defined in the file DiffRomance.gf. Even though I found DiffRomance.Case, I still can't extend this function.

Could you explain how DiffRomance.Case is used, and what the correct way to extend this function is? Thank you.


Solution

  • Just to make it clear for anyone reading this answer and not the previous:

    someWord ** {s = "some string"} is a hack, and there's no guarantee it will keep working when the RGL is updated.

    I did suggest that hack, because it can be useful: sometimes the RGL doesn't output exactly what you want it to do, but it's not really an error that should be fixed upstream. Then you can override some RGL definition locally in your application grammar, but you should be aware of the potential instability.

    Now on to the answer to this question.

    Lincat of Det in Italian RGL

    You're trying to do this:

    somePlM_Det : Str -> Det =
         \str ->
              somePl_Det ** {s = table {_ => "alcuni"}};
    

    This inserts table {_ => "alcuni"} into the s field of a Det in Italian. Let's see what the actual lincat is:

        Det     = {
          s : Gender => Case => Str ;
          n : Number ;
          s2 : Str ;            -- -ci
          sp : Gender => Case => Str ;   -- substantival: mien, mienne
          isNeg : Bool -- negative element, e.g. aucun
          } ;
    

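    Notice the s field, which is of type Gender => Case => Str. That's two nested tables, whereas you supplied only one. The compiler matched your single table against the outer Gender level, so it expected each branch to be a Case => Str table and found the string "alcuni" instead, which is exactly what the error message reports.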
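
    As a minimal sketch of the difference, assume a stand-alone resource module with simplified parameters. The module name Nested, the oper names and the two-valued Case are made up for illustration and are not the RGL definitions. It shows how a one-level table differs from a two-level one:

    -- Nested.gf (hypothetical file, not part of the RGL)
    resource Nested = {
      param
        Gender = Masc | Fem ;
        Case = Nom | Acc ;   -- simplified: the real Italian Case also has CPrep values
      oper
        -- one table level: indexed by Case only
        flat : Case => Str = table {_ => "alcuni"} ;

        -- two table levels: Gender outside, Case inside
        nested : Gender => Case => Str =
          table {
            Masc => table {_ => "alcuni"} ;
            Fem  => table {_ => "alcune"}
          } ;

        -- the same value, written with the \\ shorthand for nested tables
        short : Gender => Case => Str =
          \\g,_ => case g of {Masc => "alcuni" ; Fem => "alcune"} ;

        -- uncommenting this gives a type error analogous to the one in the question:
        -- wrong : Gender => Case => Str = table {_ => "alcuni"} ;
    }
    

    After importing the file with i -retain Nested.gf, the command cc nested ! Fem ! Acc should evaluate to alcune.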

    The following would compile, but it doesn't make sense:

       somePlF_Det : Str -> Det =
            \str ->
                 somePl_Det ** {s = table {_ => table {_ => "alcune"}}};
    

    What's wrong with that? Two things, explained below.

    Det has variable Gender – not inherent

    In the RGL, for languages that have gender or some other noun class system, the design is usually as follows.

    Nouns (and by inheritance CNs and NPs) have an inherent gender. This means that their lincats look like this:

    lincat 
      N = {s : Whatever => Str ; g : Gender} ;
    

    In contrast, all the modifiers and quantifiers of nouns (etc.) have a variable gender. So their lincats look like this:

    lincat 
      Det, Quant, A, AP, RS ... = {s : Gender => Whatever => Str} ;
    

    Gender is on the left-hand side of the inflection table, which means that the output of the modifier/quantifier depends on the gender of its eventual head. The noun (or CN, or NP) will be the head, and we will use the head's inherent gender to choose the right gender from the modifier/quantifier. The GF code looks like this:

    -- In the abstract syntax
    fun
      Modify : Modifier -> Head -> Head ;
    
    -- In the concrete syntax (different file)
    lin
      Modify mod head = {
        s = \\x => mod.s ! head.g ! x ++ head.s ! x ;
        g = head.g   -- the result is still a Head, so it keeps the head's gender
        } ;
    

    We use head.g to choose the string from mod.s. The variable x can represent any other inflection feature, such as case or number; it doesn't matter for this example.

    So it doesn't make sense to define somePlM_Det and somePlF_Det. We want to apply just one somePl_Det to any CN, and the CN decides whether the Det will output alcune or alcuni.
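
    To make this concrete, here is a minimal self-contained sketch that doesn't depend on the RGL. The names Toy, ToyIta, some_Mod, cats_Head and houses_Head are made up for illustration, and the other inflection features (the x above) are left out. Each module goes into its own file, Toy.gf and ToyIta.gf:

    -- Toy.gf
    abstract Toy = {
      flags startcat = Head ;
      cat
        Head ; Modifier ;
      fun
        Modify : Modifier -> Head -> Head ;
        some_Mod : Modifier ;
        cats_Head, houses_Head : Head ;
    }

    -- ToyIta.gf
    concrete ToyIta of Toy = {
      param
        Gender = Masc | Fem ;
      lincat
        Head = {s : Str ; g : Gender} ;      -- inherent gender
        Modifier = {s : Gender => Str} ;     -- variable gender
      lin
        Modify mod head = {
          s = mod.s ! head.g ++ head.s ;     -- the head's gender picks the modifier's form
          g = head.g
          } ;
        some_Mod = {s = table {Masc => "alcuni" ; Fem => "alcune"}} ;
        cats_Head = {s = "gatti" ; g = Masc} ;
        houses_Head = {s = "case" ; g = Fem} ;
    }
    

    In the GF shell, a single some_Mod then agrees with either head:

    > i ToyIta.gf
    > l Modify some_Mod houses_Head
    alcune case
    > l Modify some_Mod cats_Head
    alcuni gatti
    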

    What to do with Case

    The other mistake is to replace every branch of the inflection table with a single string "alcune" or "alcuni". These cases (and genders) actually do something. Let's look at the table of another Det that doesn't need fixing.

    I'm in the directory gf-rgl/src/italian and I open the GF shell. (The commands prefixed by > are run inside the GF shell. I'm using the flags -retain -no-pmcfg because the Italian RGL is so slow to compile, and this makes it faster.)

    $ gf 
    > i -retain -no-pmcfg LangIta.gf
    > cc -unqual -table many_Det
    s . Masc => Nom => molti
    s . Masc => Acc => molti
    s . Masc => CPrep P_di => di molti
    s . Masc => CPrep P_a => a molti
    s . Masc => CPrep P_da => da molti
    s . Masc => CPrep P_in => in molti
    s . Masc => CPrep P_su => su molti
    s . Masc => CPrep P_con => con molti
    s . Fem => Nom => molte
    s . Fem => Acc => molte
    s . Fem => CPrep P_di => di molte
    s . Fem => CPrep P_a => a molte
    s . Fem => CPrep P_da => da molte
    s . Fem => CPrep P_in => in molte
    s . Fem => CPrep P_su => su molte
    s . Fem => CPrep P_con => con molte
    

    All these prepositions are part of the inflection table! If you wonder why, that's because those prepositions merge with articles. First, examples where they don't merge:

    > cc -one PrepNP in_Prep (DetCN many_Det (UseN house_N))
    in molte case
    
    > cc -one PrepNP with_Prep (DetCN many_Det (UseN cat_N))
    con molti gatti
    

    And with these, they do merge.

    > cc -one PrepNP in_Prep (DetCN (DetQuant DefArt NumPl) (UseN house_N))
    nelle case   -- not "in le case"
    
    > cc -one PrepNP with_Prep (DetCN (DetQuant DefArt NumPl) (UseN cat_N))
    coi gatti    -- not "con i gatti"
    

    So that's why we need case in the inflection table. If I replaced somePl_Det with what you wrote, I'd get this:

    > cc -one PrepNP with_Prep (DetCN somePl_Det (UseN cat_N))
    alcuni gatti   -- "with" is missing!
    
    > cc -one PrepNP in_Prep (DetCN somePl_Det (UseN house_N))
    alcuni case    -- "in" is missing, and no gender agreement!
    

    How to actually fix somePl_Det

    This seems like something to be fixed upstream, and not hacked individually on everyone's local grammars.

    Replace line 86 in StructuralIta with this:

      somePl_Det = {
        -- preposition given by the case (if any), then the form agreeing with gender g
        s,sp = \\g,c => prepCase c ++ genForms "alcuni" "alcune" ! g ;
        n = Pl ;
        s2 = [] ;
        isNeg = False
        } ;
    

    Recompile your RGL, and from now on, you should get this output:

    > cc -one PrepNP in_Prep (DetCN somePl_Det (UseN house_N))
    in alcune case
    
    > cc -one PrepNP with_Prep (DetCN somePl_Det (UseN cat_N))
    con alcuni gatti
    

    If you fork the gf-rgl repo and fix this in your own branch, you can open a pull request from that branch and get the fix merged upstream.