c compiler-construction

C: value vs type vs expression


Let's say I want to build a C compiler in some OO language, so I create classes like Scope, Object, LValue, Type, Pointer, Expression, etc.

What I need to understand is the exact relationship between types, values, expressions and objects. Some C references say something like:

Expression evaluates to lvalue.

In other places you can read something like:

lvalue is an expression...

So, how exactly do the C concepts relate to each other?

For a binary expression, say the addition a + b, are the operands a and b expressions? Or objects? Or just types that may or may not be an object?

Consider this simple code:

int a = 1;
a++;
  • a++; is expression-statement, right?
  • a++ is expression, with postfix increment operator, right?
  • a, so what is a? Is it expression, object, or lvalue? Or is a an expression that evaluates to lvalue and this lvalue has an associated object and type?

EDIT: What I am doing is writing a library (called jsc) that will let you "write C in JavaScript".

Consider the following C function:

int main(void) {
    int a = 100;
    int b = 200;
    int c = 300;
    return a + (b + c);
}

Using jsc builder interface in JavaScript you would write that function like so:

var _ = require('jsc').Builder.context();

var main = _.func(_ => {
    var a = _.int(100);
    var b = _.int(200);
    var c = _.int(300);
    _.return(_['+'](a, _['+'](b, c)));
});

console.log(main.compile());

Internally jsc would generate something like this:

var fs = new FunctionScope();

var a = new Object(Type.int(), 100);
var b = new Object(Type.int(), 200);
var c = new Object(Type.int(), 300);
fs.body.push(new Declaration(a));
fs.body.push(new Declaration(b));
fs.body.push(new Declaration(c));

var pa = new PrimaryExpression(a);
var pb = new PrimaryExpression(b);
var pc = new PrimaryExpression(c);
var expr1 = new AdditionExpression(pb, pc);
var expr2 = new AdditionExpression(pa, expr1);
fs.body.push(new ReturnStatement(expr2));

var codegen = new Codegen();
var bin = codegen.compileFunction(fs);
console.log(bin);

Solution

  • What I need to understand is what is the exact relationship between types, values, expressions and objects

    Expressions have a type and evaluate to values. Objects contain data that, when interpreted as a specific type, may represent values ("may" because an object can be uninitialized or otherwise contain invalid data). Lvalues (which are a kind of expression) refer to objects. To quote the relevant definitions from the standard:

    3.15 object

    region of data storage in the execution environment, the contents of which can represent values

    3.19 value

    precise meaning of the contents of an object when interpreted as having a specific type

    6.3.2.1 Lvalues, arrays, and function designators

    An lvalue is an expression (with an object type other than void) that potentially designates an object
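
For a compiler written in an OO language, these definitions could translate into classes roughly as follows. This is only a minimal sketch: the class names StorageObject, Expression, IdentifierExpression and AdditionExpression are made up for illustration (they are not jsc's API), types are plain strings, and only int is handled.

```javascript
// Hypothetical class names, chosen for illustration only.

// An object in the C sense: a region of storage whose contents
// can represent a value when read through a type.
class StorageObject {
  constructor(type) {
    this.type = type;          // declared type, e.g. 'int'
    this.contents = undefined; // undefined models "uninitialized"
  }
}

// Every expression has a type; only some expressions are lvalues.
class Expression {
  constructor(type, isLvalue) {
    this.type = type;
    this.isLvalue = isLvalue;
  }
}

// An identifier naming a variable: an lvalue that designates an object.
class IdentifierExpression extends Expression {
  constructor(name, object) {
    super(object.type, true);
    this.name = name;
    this.object = object;
  }
  // Reading the lvalue yields the value stored in the designated object.
  evaluate() {
    return this.object.contents;
  }
}

// a + b: the operands are expressions; the result is a value, not an lvalue.
class AdditionExpression extends Expression {
  constructor(left, right) {
    super('int', false); // simplification: int operands only
    this.left = left;
    this.right = right;
  }
  evaluate() {
    return this.left.evaluate() + this.right.evaluate();
  }
}

const objA = new StorageObject('int');
objA.contents = 1;
const objB = new StorageObject('int');
objB.contents = 2;

const a = new IdentifierExpression('a', objA);
const b = new IdentifierExpression('b', objB);
const sum = new AdditionExpression(a, b);

console.log(a.isLvalue, sum.isLvalue, sum.type, sum.evaluate());
```

The point of the split is that the operands of + are always expressions. Whether an operand also designates an object is a property of that expression (isLvalue here), not a separate kind of operand.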


  • a++; is expression-statement, right?

    Yes, an expression statement is an optional expression followed by a semicolon (grammar rule: "expression-statement: expression_opt ;" from section 6.8.3 of the standard). a++ is an expression, so a++; is an expression statement.

  • a++ is expression, with postfix increment operator, right?

    Right. The relevant clauses of the grammar are "postfix-expression: postfix-expression ++", "postfix-expression: primary-expression" (section 6.5.2) and "primary-expression: identifier" (section 6.5.1). So we get the following derivation (skipping part of the expression hierarchy for brevity) for a++;:

         expression-statement
       expression           ';'
       postfix-expression   ';'
    postfix-expression '++' ';'
    primary-expression '++' ';'
    identifier         '++' ';'
    'a'                '++' ';'
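
The same derivation can be followed by a small recursive-descent parser. The sketch below is an assumption-laden toy, not a real C front end: the function names tokenize and parseExpressionStatement and the node shapes are invented here, and only identifiers, postfix ++ and ';' are supported. The left-recursive postfix-expression rule is handled with a loop.

```javascript
// Hypothetical mini-parser covering only: identifier, postfix '++', and ';'.
function tokenize(source) {
  // Matches identifiers, the '++' operator, and ';'. Whitespace is skipped.
  return source.match(/[A-Za-z_]\w*|\+\+|;/g) || [];
}

function parseExpressionStatement(tokens) {
  let pos = 0;

  // primary-expression: identifier
  function parsePrimary() {
    const tok = tokens[pos];
    if (tok === undefined || !/^[A-Za-z_]\w*$/.test(tok)) {
      throw new SyntaxError('expected identifier');
    }
    pos++;
    return { kind: 'identifier', name: tok };
  }

  // postfix-expression: primary-expression
  //                   | postfix-expression '++'
  function parsePostfix() {
    let node = parsePrimary();
    while (tokens[pos] === '++') {
      pos++;
      node = { kind: 'postfix-increment', operand: node };
    }
    return node;
  }

  // expression-statement: expression_opt ';'
  const expression = tokens[pos] === ';' ? null : parsePostfix();
  if (tokens[pos] !== ';') {
    throw new SyntaxError("expected ';'");
  }
  pos++;
  return { kind: 'expression-statement', expression };
}

const tree = parseExpressionStatement(tokenize('a++;'));
console.log(JSON.stringify(tree, null, 2));
```

Note that this stage is purely syntactic: it knows nothing about objects, types or lvalues. Those properties are attached to the nodes later, during semantic analysis.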
    

  • a, so what is a? Is it expression, object, or lvalue? Or is a an expression that evaluates to lvalue and this lvalue has an associated object and type?

    Syntactically, a is an identifier, which is a primary expression, which is an expression. Semantically, it's the name of a variable, and a variable is an object. As cited above, "an lvalue is an expression (with an object type other than void) that potentially designates an object" (section 6.3.2.1), so the expression a is also an lvalue: it has type int and designates the object created by the declaration int a = 1;.
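
To make the distinction concrete for a++, here is a rough sketch of how a semantic pass might treat it. The helper names makePostfixIncrement and evaluatePostfixIncrement and the plain-object node shapes are hypothetical, and the check is simplified: it tests only "is an lvalue", whereas the standard requires a modifiable lvalue (const qualification and the like are ignored here).

```javascript
// Hypothetical helpers; the names are illustrative only.

// The object: storage created by the declaration `int a = 1;`.
const objectA = { type: 'int', contents: 1 };

// The expression `a`: an lvalue of type int that designates objectA.
const exprA = {
  kind: 'identifier', name: 'a',
  type: 'int', isLvalue: true, object: objectA
};

// The expression `a + a`: it has type int but designates no object.
const exprSum = { kind: 'addition', type: 'int', isLvalue: false };

// Build the node for `operand++`, requiring the operand to be an lvalue.
function makePostfixIncrement(operand) {
  if (!operand.isLvalue) {
    throw new TypeError('operand of ++ must be a modifiable lvalue');
  }
  // The result has the operand's type but is not itself an lvalue.
  return { kind: 'postfix-increment', type: operand.type, isLvalue: false, operand };
}

// Evaluate `operand++`: yield the old value and update the stored object.
function evaluatePostfixIncrement(node) {
  const object = node.operand.object;
  const oldValue = object.contents;
  object.contents = oldValue + 1;
  return oldValue;
}

const inc = makePostfixIncrement(exprA);
const result = evaluatePostfixIncrement(inc);
console.log(result, objectA.contents);
```

So a is all three things at different levels: an expression in the syntax tree, an lvalue as a property of that expression, and (through the lvalue) a reference to an object. By the same check, (a + a)++ is rejected, because a + a is an expression with a type and a value but no object behind it.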