Search code examples
recursiontail-recursionfactor-lang

Complex-recursive stack effects?


USING: accessors html.parser.analyzer io kernel math namespaces
  present regexp sequences ;
IN: all-roads-to-wiki

SYMBOL: G

: match-good-pages ( a -- ?/f )
  R/ \/wiki\/[^:]*$/ first-match ;

: filter-urls ( tags -- urls )
  find-hrefs [ present ]     map
  [ match-good-pages ]       filter
  [ match-good-pages seq>> ] map ;

: findpath ( url -- url )
  G get =
  [
     ! false
  ]
  [ scrape-html nip
    [
      dup "title" find-by-name drop 1 + swap nth
      text>> R/ - Wikipedia,/ re-split first print
    ]
    [
      "bodyContent" find-by-id-between filter-urls [ findpath ] map
    ] bi
  ] if ; inline recursive

: allroads-entry ( -- a )
  readln "http://en.wikipedia.org/wiki/" prepend G set-global
  "enwp.org/Special:Random" findpath ; inline

The above code will recurse over every link on Wikipedia until it finds the one it's looking for.

That's okay, because (hopefully) findpath will eventually "return" (i.e. not call itself again) and leave a huge nested data structure on the stack. But when I try to compile this, I get an unbalanced-recursion error:

The recursive word “findpath” leaves with the stack having the wrong height

unbalanced-recursion: Thrown when stack effect inference determines that an inline recursive word has an incorrect stack effect declaration.

No matter what I do, Factor (understandably) complains about the stack effect not matching up. What do I have to do to get this to recurse properly?


Solution

  • Look closely at the find-path word. I'll add comments so that you can see what is on the stack:

    : findpath ( url -- url )
        ! 1 item: { url }
        G 
        ! 2 items: { url G }
        get 
        ! 2 items: { url value-of-G }
        =
        ! 1: item { t/f }
        [
           ! 0 items!!!!
           ! false
        ]
        [ scrape-html nip
            [
                dup "title" find-by-name drop 1 + swap nth
                text>> R/ - Wikipedia,/ re-split first print
            ]
            [
                "bodyContent" find-by-id-between filter-urls 
                [ findpath ] map
            ] bi
        ] if ; inline recursive
    

    The if combinator consumes the last item on the stack so this code can't possibly work. Here is working code for the findpath word:

    : page-title ( seq -- title )
        dup "title" find-by-name drop 1 + swap nth
        text>> R/ - Wikipedia,/ re-split first ;
    
    : page-links ( seq -- links )
        "bodyContent" find-by-id-between filter-urls ;
    
    : scrape-en-wiki-url ( wiki-url -- seq )
        "https://en.wikipedia.org" prepend
        dup print flush scrape-html nip ;
    
    : found-url? ( wiki-url -- ? )
        G get [ = ] [ drop t ] if* ;
    
    : findpath ( wiki-url -- seq/f )
        dup found-url?
        [ drop f G set f ] [
            scrape-en-wiki-url
            [ page-title print flush ] [
                page-links [ findpath ] map
            ] bi
        ] if ; inline recursive
    

    Also take a look at the Wikipedia vocab which is meant for tasks like these.