Search code examples
javascriptrecursionabstract-syntax-tree

Trouble with recursively joining an array formed from an abstract syntax tree-like structure


I have a rough AST-like structure; an array containing either a string, or an object, where the object has an id and arguments, where the arguments are an array of either a string or an object, as above.

The goal is to output a string from the joined strings and objects' strings, recursively built(due to the nature of the structure).

A further complication is that the objects are really function calls, with differing argument consumption. However, the output is always a string.

I've tried varied methods of translating this AST-like structure into a string, without success; below is my current attempt, but it doesn't work as aside from the obvious output clashes, I'm having trouble figuring out the exact logic I need to achieve the goal.

const argLists = [
  [
    "one",
    { id: "$id1", args: [["two", "three"], "4", "3", "$1-"] },
    "four",
    "five",
    { id: "$id1", args: [["six", "seven"], "3", "4", "$1-"] }
  ],
  ["(", "$+", "$1", "$+", "text", "$+", "$2", "$+", ")", "$3-"],
  [
    {
      id: "$id1",
      args: [
        [
          "one",
          "$+",
          { id: "$+", args: ["two", { id: "$chr", args: ["44"] }, "three"] },
          "$+",
          "four"
        ],
        "4",
        "5",
        "$1-"
      ]
    }
  ]
]

function joinArgs(args = [], output) {
      for (let i = 0; i < args.length; i++) {
        let arg = args[i];
        if (typeof arg === "string") {
          i--;
          let tempOutput = "",
            prev = args[i] || "",
            join = "",
            next = args[i + 2] || "";
          if (typeof prev === "string" && typeof next === "string") {
            join += arg === "$+" ? "" : `${arg}`;
            tempOutput = `${prev}${join}${next}`;
          } else if (typeof prev === "string" && typeof next === "object") {
            tempOutput = `${prev}${join}${joinArgs(
              next.args,
              output
            )}`;
          } else if (typeof prev === "object" && typeof next === "string") {
            tempOutput = `${joinArgs(
              prev.args,
              output
            )}${join}${next}`;
          }
          i += 3;
          output += tempOutput;
        } else if (typeof arg === "object") {
          if (Array.isArray(arg)) {
            output += joinArgs(arg, output);
          } else {
            if (arg.id === "$+") {
              output += joinArgs(arg.args, output);
            } else if (arg.id === "$chr") {
              output += String.fromCharCode(arg.args[0]);
            } else if (arg.id === "$id1") {
              const id1Args = [];
              for (
                let id1ArgIdx = 0;
                id1ArgIdx < arg.args.length;
                id1ArgIdx++
              ) {
                let id1Arg = arg.args[id1ArgIdx];

                if (Array.isArray(id1Arg)) {
                  id1Arg = joinArgs(id1Arg, output).trim();
                }
                id1Args.push(id1Arg);
              }
              output +=
                " "
                // This function is irrelevant to the problem; but it does return a string
                //id1(
                //  ...id1Args.slice(0, id1Args.length - 1),
                //  id1Args[id1Args.length - 1]
                //);
            }
          }
        }
      }
      return output;
}

argLists.forEach(arg => {
  console.log(joinArgs(arg, ""));
});

The data for the test cases:

const argLists = [
  [
    "one",
    { id: "$id1", args: [["two", "three"], "4", "3", "$1-"] },
    "four",
    "five",
    { id: "$id1", args: [["six", "seven"], "3", "4", "$1-"] }
  ],
  ["(", "$+", "$1", "$+", "text", "$+", "$2", "$+", ")", "$3-"],
  [
    {
      id: "$id1",
      args: [
        [
          "one",
          "$+",
          { id: "$+", args: ["two", { id: "$chr", args: ["44"] }, "three"] },
          "$+",
          "four"
        ],
        "4",
        "5",
        "$1-"
      ]
    }
  ]
];

The expected output:

const output = [
    "one (two three 4 3 $1-) four five (six seven 3 4 $1-)",
    "($1text$2) $3-",
    "(onetwo,threefour 4 5 $1-)"
];

The output should be concatenated according to these rules:

  1. If the argument is a string and is a $+, concatenate the previous and next values without a space between them(prevnext).
  2. If the argument is a string and is not a $+, concatenate the previous, current, and next value with a space between them(prev current next).
  3. If the argument is an object, get its arguments from the args property and recurse as necessary, however:
    1. If the object id is $+, join all arguments without spaces, recursing as necessary per argument.
    2. If the object id is $chr, return String.fromCharCode() with the only argument supplied to the object given to that static method.
    3. If the object id is $id1, treat the first argument as an array according to the rules above(if it is an array), and join all remaining arguments with a space and wrap everything in parenthesis.
    4. It is technically possible that an object's id may not be any of the prior three possibilities, but that eventuality isn't an issue, since the function the object calls is always going to return a string.

Solution

  • This version separates the processing of a generic value into handling for arrays, plain objects, and other values (which are just returned.) The main joinArg function calls joinObject and joinArray, each of which can recursively call back to joinArg.

    I think the breakdown in joinObject makes it clear enough how to add other cases if the need arises. It also has a place for default handling, which you alluded to but didn't specify.

    const joinObject = ({id, args}) => (({
      '$+': (ns) => ns .map (joinArgs) .join (''),
      '$chr': ([n]) => String .fromCharCode (n),
      '$id1': ([ns, ...more]) => `(${joinArgs (ns)} ${more .map (joinArgs) .join (' ')})`,
    }) [id] || ((args) => args .map (joinArgs) .join (' '))) (args)
    //          `----------------------------------------' 
    //                                `--- default.  Not sure what belongs here
    
    const joinArray = (args) => args .reduce(
      ({str, skip}, arg) => ({
        str: str + (skip || arg == '$+' ? '' : ' ') + (arg == '$+' ? '' : joinArgs(arg)), 
        skip: arg == '$+'
      }),
      {str: '', skip: true}
    ).str
    
    const joinArgs = (args) =>
      Array .isArray (args) 
        ? joinArray (args) 
      : Object (args) === args 
        ? joinObject (args) 
      : args
    
    
    const argLists = [["one", {id: "$id1", args: [["two", "three"], "4", "3", "$1-"]}, "four", "five", {id: "$id1", args: [["six", "seven"], "3", "4", "$1-"]}], ["(", "$+", "$1", "$+", "text", "$+", "$2", "$+", ")", "$3-"], [{id: "$id1", args: [["one", "$+", {id: "$+", args: ["two", {id: "$chr", args: ["44"]}, "three"]}, "$+", "four"], "4", "5", "$1-"]}]]
    
    argLists.forEach(args => console.log(joinArgs(args)))

    This handles the samples you supplied. I'm curious to see if it captures your full needs or if it still would need to be tweaked for other cases.

    The big trick here is the mutual recursion. It can well simplify the code you write in circumstances like this. Another useful technique is the object holding the handlers to run for the different Object id values. I find this very useful for separating out the main parts of the logic from what glues them together.