
(Esoteric) programming language design: escaping a separator character at multiple nesting levels


I'm creating an (esoteric) programming language. It's going well so far. The basics of the language are:

  • Most of the core operations are one character. For example, % is the loop operation.
  • A program is written on a single line, with operations separated by the ! character.
  • It works with a stack, on which you can push and pop data.

Some things useful to know:

  • %<code> is the loop operation and requires the number of iterations to be pushed onto the stack beforehand.

  • "<value> pushes the value after it onto the stack

  • ! separates the operations

  • #<name> calls a function

So when I want to loop something, it works like this:

"10!%<code>

A breakdown of the code I just showed you:

  • "10 pushes 10 (the number of times to loop) onto the stack
  • ! separates the operations
  • % loops
  • <code> is the code to run

The operations inside <code> also need to be separated by some character. I used $ for that, and it works. With actual code, it looks like this: "10!%"hello$#print

Another breakdown:

  • "10 pushes 10, the amount to loop on the stack
  • ! separates
  • % loops the code next to it
  • "hello push the text to print to the stack
  • $ is an escaped !, and this is where the problem starts. The separator is already escaped once because it is nested in a loop, so what if I want to put another loop inside this loop?
  • #print calls the print function

As I have shown, I use $ as the separator in a nested piece of code, because if I used !, the lexer would treat what follows as a new top-level operation (not a nested one) and the program would not do what you expect. With this approach to separating nested code, I can only nest one level deep before the code starts breaking.

So let's say that we were to put another loop in a loop. With the current approach, it will look like this: "10!%"test$#print$"10$%"test2$#print

Yes I know this is unreadable, so I will break it down:

  • "10 push the amount to loop (10) to stack
  • % loop
  • "test$#print this will print the message test. Do keep in mind this already is in a loop, and therefore it is nested.
  • $"10$% this will loop again (10 times).
  • "test2$#print is meant to be the body of the second loop, so it should print test2 10 times.

But it doesn't print test2 more than once, because the second loop uses the same character ($) as the first nesting level to escape the ! separator. The $ therefore ends the code block of the inner loop instead of separating operations inside it. This produces unexpected results: the second loop does not work, and test2 is printed just once.
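The ambiguity can be reproduced with a minimal Python sketch. It assumes (the question does not confirm this) that the lexer splits top-level operations on ! and the body of a loop on $; split_ops is a hypothetical helper, not part of the actual implementation.

```python
def split_ops(source, separator):
    """Split a program into operations on a single separator character."""
    return [op for op in source.split(separator) if op]


program = '"10!%"test$#print$"10$%"test2$#print'

# Top level: operations are separated by '!'.
top_level = split_ops(program, "!")

# The second top-level operation is the outer loop; drop its leading '%'.
outer_body = top_level[1][1:]

# One nesting level down: operations are separated by '$' (assumption).
nested = split_ops(outer_body, "$")

# The inner loop is the fourth nested operation; drop its leading '%'.
inner_body = nested[3][1:]

print(nested)
print(inner_body)
```

Under these assumptions the body of the inner loop contains only the push of test2. The final #print lands one level up, in the outer loop, so it runs once per outer iteration rather than 10 times. A single escape character cannot encode how deep an operation is nested.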

What is a good way to escape the ! separator more than once, so the different nested levels can be distinguished by the lexer?

Thank you for your time.


Solution

  • If you want to be able to nest things arbitrarily deep, your syntax needs to give you some way of knowing where each thing ends. The usual way is to have some sort of matching delimiter. For example, you might use a closing " to know where a value opened by " ends and > to know where a block opened by < ends. Then you don't even need ! and could have something like:

    "10"%<"hello"#print>
    

    to print hello 10 times.

    You could then extend % to allow a name for the loop counter and use @ to get a named value

    "10"%i<"hello "#print@i#println>
    

    would then print hello followed by the loop counter on each line.
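One way to see why matching delimiters allow arbitrary nesting is a small recursive parser: each < starts a recursive call that returns when it meets its own >. The following is a rough Python sketch of that idea, not a definitive implementation. The function names (parse, run, execute), the tuple-based operation format, the 0-based loop counter, and the restriction to the print and println functions are all assumptions made for illustration.

```python
def parse(src, pos=0, closer=None):
    """Parse operations until `closer` (or end of input); return (ops, next_pos)."""
    ops = []
    while pos < len(src):
        ch = src[pos]
        if ch == closer:
            return ops, pos + 1              # this block is finished
        if ch == '"':                        # "value" pushes a literal
            end = src.index('"', pos + 1)
            ops.append(("push", src[pos + 1:end]))
            pos = end + 1
        elif ch == "%":                      # %name<body>, name is optional
            start = src.index("<", pos)
            name = src[pos + 1:start]
            if name and not name.isalnum():
                raise SyntaxError(f"bad loop name {name!r}")
            body, pos = parse(src, start + 1, ">")   # recurse for the body
            ops.append(("loop", name or None, body))
        elif ch in "#@":                     # #name calls, @name loads a value
            end = pos + 1
            while end < len(src) and src[end].isalnum():
                end += 1
            ops.append(("call" if ch == "#" else "load", src[pos + 1:end]))
            pos = end
        else:
            raise SyntaxError(f"unexpected {ch!r} at position {pos}")
    if closer is not None:
        raise SyntaxError(f"missing {closer!r}")
    return ops, pos


def run(ops, stack, env, out):
    """Execute parsed operations, appending printed text to `out`."""
    for op in ops:
        kind = op[0]
        if kind == "push":
            stack.append(op[1])
        elif kind == "load":
            stack.append(env[op[1]])
        elif kind == "call":
            if op[1] not in ("print", "println"):
                raise NameError(f"unknown function {op[1]!r}")
            text = str(stack.pop())
            out.append(text + "\n" if op[1] == "println" else text)
        elif kind == "loop":
            _, name, body = op
            for i in range(int(stack.pop())):   # counter starts at 0 (assumption)
                if name:
                    env[name] = i
                run(body, stack, env, out)


def execute(src):
    """Parse and run a program; return everything it printed as one string."""
    ops, _ = parse(src)
    out = []
    run(ops, [], {}, out)
    return "".join(out)


print(execute('"2"%i<"3"%j<@i#print@j#println>>'))
```

Because the nesting depth lives in the parser's call stack rather than in the choice of separator character, a loop inside a loop needs no extra escaping. The sketch does not support a " inside a string literal; that would need its own escape rule.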