Search code examples
programming-languageslanguage-designrule

Is it possible, in any language, to implement rules that will affect every instance of an object?


For example, could I implement a rule that would change every string that followed the pattern '1..4' into the array [1,2,3,4]? In JavaScript:

//here you create a rule that changes every string that matches /$([0-9]+)_([0-9]+)*/
//ever created into range($1,$2) (imagine a b are the results of the regexp)
var a = '1..4';
console.log(a);
>> output: [1,2,3,4];

Of course, I'm pretty confident that would be impossible in most languages. My question is: is there any language in which that would be possible? Or have anyone ever proposed something like that? Does this thing have a 'name' for which I can google to read more about?


Solution

  • Modifying the language from whithin itself falls under the umbrell of reflection and metaprogramming. It is referred as behavioral reflection. It differs from structural reflection that opperates at the level of the application (e.g. classes, methods) and not the language level. Support for behavioral reflection varies greatly across languages.

    We can broadly categorize language changes in two categories:

    1. changes that modify the semantics (i.e. the rules) of the language itself (e.g. redefine the method lookup algorithm),
    2. changes that modify the syntax (e.g. your syntax '1..4' to create arrays).

    For case 1, certain languages expose the structure of the application (structural reflection) and the inner working of their implementation (behavioral reflection) to the application itself via special object, called meta-objects. Meta-objects are reifications of otherwise implicit aspects, that become then explicitely manipulable: the application can modify the meta-objects to redefine part of its structure, or part of the language. When it comes to langauge changes, the focus is usually on modifiying message sending / method invocation since it is the core mechanism of object-oriented language. But the same idea could be applied to expose other aspects of the language, e.g. field accesses, synchronization primitives, foreach enumeration, etc. depending on the language.

    For case 2, the program must be representated in a suitable data structure to be modified. For languages of the lisp family, the program manipulates lists, and the program can be itself represented as lists. This is called homoiconicity and is handy for metaprogramming, hence the flexibility of lisp-like languages. For other languages, their representation is usually an AST. Transforming the representation of the program, or rewriting it, is possible with macro, preprocessors, or hooks during compilation or class loading.

    The line between 1 and 2 is however blurry. Syntactical changes can appear to modify the semantics of the language. For instance, I can rewrite all fields accesses with proper getter and setter and perform additional logic there, say to implement transactional memory. Did I perform a semantical change of what a field access is, or merely a syntax change? Also, there are other constructs the fall bewten the lines. For instance, proxies and #doesNotUnderstand trap are popular techniques to simulate the reification of message sends to some extent.

    Lisp and Smalltalk have been very influencial in the field of metaprogramming, and I think the two following projects/platform are interesting to look at for a representative of each of these:

    • Racket, a lisp-like language focused on growing languages from within the langauge
    • Helvetia, a Smalltalk extension to embed new languages into the host language by leveraging the AST of the host environment.

    I hope you enjoyed this even if I did not really address your question ;)

    Your desired change require modifying the way literals are created. This is AFAIK not usually exposed to the application. The closed work that I can think of is Virtual Values for Language Extension, that tackled Javascript.