
Java 8: expressing conditionals as an array of method references


I am representing a large set of objects (specifically MIPS32 instructions). My minimal working example describes an instruction in the R-format.

MIPS32 background (R-type instruction)

An R-type instruction is uniquely determined by the combination of its opcode and its funct (function) field. The opcode is the leftmost 6 bits of the instruction when represented as a 32-bit number, and the rightmost 6 bits compose the funct field.

A common decomposition of an R-type instruction is into bit-fields of lengths (6, 5, 5, 5, 5, 6). The bit-fields then represent the following units:

| 6 bits  | 5 bits | 5 bits | 5 bits | 5 bits | 6 bits |
|:-------:|:------:|:------:|:------:|:------:|:------:|
| op      | rs     | rt     | rd     | shamt  | funct  |

Hence, the unique identifier (key) for an R-type instruction is the tuple (opcode, funct), which has a one-to-one relation with the R-type instructions.

Example: mul

Consider the 32-bit number 0x71014802. It is decomposed into fields of varying lengths depending on the format of the instruction.

For all numbers in the MIPS32 instruction set the leftmost six bits always represent the opcode for the instruction. The opcode alone is not always sufficient to identify the particular instruction, but it is always sufficient to identify the format of the instruction.

The leftmost six bits of 0x71014802 are 0x1c. It is known that this number corresponds to an instruction in the R-format. The format specifies the fields into which the remaining bits decompose.

As alluded to previously, not every instruction can be identified by its opcode alone. This is the case for all R-type instructions.

Decomposing 0x71014802 into the fields shown in the above table yields rs=8, rt=1, rd=9, shamt=0, and funct=2. The decomposed representation of this instruction in hexadecimal form is thus [0x1c 8 1 9 0 2]. The corresponding decimal representation is [28 8 1 9 0 2].
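The decomposition above can be reproduced with shifts and masks. The following is a minimal sketch; the class name `RFields` and its method names are hypothetical and are not taken from the source code discussed in this question.

```java
import java.util.Arrays;

/** Hypothetical helper: splits a 32-bit MIPS32 word into the six R-format fields. */
final class RFields {
    private RFields() {}

    static int opcode(int word) { return (word >>> 26) & 0x3f; } // bits 31..26
    static int rs(int word)     { return (word >>> 21) & 0x1f; } // bits 25..21
    static int rt(int word)     { return (word >>> 16) & 0x1f; } // bits 20..16
    static int rd(int word)     { return (word >>> 11) & 0x1f; } // bits 15..11
    static int shamt(int word)  { return (word >>> 6) & 0x1f; }  // bits 10..6
    static int funct(int word)  { return word & 0x3f; }          // bits 5..0

    /** Returns the fields in table order: op, rs, rt, rd, shamt, funct. */
    static int[] decompose(int word) {
        return new int[] {
            opcode(word), rs(word), rt(word), rd(word), shamt(word), funct(word)
        };
    }

    public static void main(String[] args) {
        // prints [28, 8, 1, 9, 0, 2]
        System.out.println(Arrays.toString(decompose(0x71014802)));
    }
}
```

The unsigned shift `>>>` is used so that words with the top bit set do not drag sign bits into the opcode.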

To identify the particular instruction represented by 0x71014802, the funct field must be consulted. Pairing the opcode, 0x1c, with the value in the funct field uniquely identifies the instruction as a mul instruction.
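The identification step can be sketched as a table lookup keyed on the (opcode, funct) pair. This is only an illustration: the class `RTypeLookup`, the packing of the pair into a single `int` key, and the table containing nothing but mul are assumptions made for the example.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

/** Hypothetical lookup table keyed on the (opcode, funct) pair. */
final class RTypeLookup {
    private static final Map<Integer, String> BY_KEY = new HashMap<>();

    static {
        // Only mul is registered here; a real table would list every R-type instruction.
        BY_KEY.put(key(0x1c, 0x02), "mul");
    }

    private RTypeLookup() {}

    /** Packs two 6-bit values into one int so the pair can serve as a map key. */
    static int key(int opcode, int funct) {
        return (opcode << 6) | funct;
    }

    /** Returns the instruction name, or an empty Optional if the pair is not in the table. */
    static Optional<String> identify(int word) {
        int opcode = (word >>> 26) & 0x3f;
        int funct = word & 0x3f;
        return Optional.ofNullable(BY_KEY.get(key(opcode, funct)));
    }
}
```

With this table, `RTypeLookup.identify(0x71014802)` yields `Optional.of("mul")`, regardless of what the rs, rt, rd and shamt fields contain.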


In my source code I represent the mul instruction in the following manner:

/**
 * Multiply (without overflow). Put the low-order 32 bits of
 * the product of rs and rt into register rd.
 */
// TODO: Validate that shamt is 0
MUL(0x1c, 2, R::rd, R::rs, R::rt),

The method references R::rd, R::rs, R::rt are used to create a human-legible representation of the instruction by fetching the appropriate fields from the decomposed representation and by looking up register names in a table (this is of no importance to us, but it explains why it is there).

The TODO comment signifies that these objects should also satisfy 0 or more conditions to be deemed valid. As you can see we have stored a lot of information about MUL in one place, its opcode and its funct field which uniquely identifies it as well as how to produce a human-legible representation.

What remains is to include the validation step as well.

In Python I would use a dictionary.

{'shamt': 0, 
 other conditions
}

that I would later parse.

In Java I would need either a statically initialized HashMap to represent this, or conceivably a two-dimensional array (Object[][]), followed by some internal parsing and evaluation of it. In my opinion, the verbosity of Java would make the intent harder to comprehend.

What I would like to express is to somehow state that when a particular function is called with a certain argument I want it to return true.

I expect all of these conditions to evaluate to true so I would be fine with evaluating them later.

So I am thinking some form of partial application, say that I have a function

shamt(int expectedValue) {
    // Check that the value of shamt matched the expected value
}

then something akin to

new Supplier<Boolean>[] {
    0x00 -> RTypeInstruction::shamt
}

which obviously does not work, might be in the right direction. It is important to somehow have a named reference here, as specifying an ordering relation between integers is not sufficient because that does not give any indication as to which bitfield has to satisfy a particular condition.
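One way to get the partial application described here is a factory method that fixes the expected value and returns a predicate to be evaluated later. The sketch below assumes the predicate receives the raw 32-bit word rather than the enum's internal state; the class `Conditions` and its methods are hypothetical names.

```java
import java.util.function.IntPredicate;

/** Hypothetical factory methods: each fixes an expected value now and tests a word later. */
final class Conditions {
    private Conditions() {}

    /** Condition on the shamt field (bits 10..6). */
    static IntPredicate shamt(int expected) {
        return word -> ((word >>> 6) & 0x1f) == expected;
    }

    /** Condition on the rs field (bits 25..21). */
    static IntPredicate rs(int expected) {
        return word -> ((word >>> 21) & 0x1f) == expected;
    }

    /** Evaluates the stored conditions against a word; true only if all of them hold. */
    static boolean allHold(int word, IntPredicate... conditions) {
        for (IntPredicate condition : conditions) {
            if (!condition.test(word)) {
                return false;
            }
        }
        return true;
    }
}
```

A call such as `Conditions.allHold(0x71014802, Conditions.shamt(0))` keeps the field name visible at the place where the condition is declared, which is the named reference asked for above.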

I do not want an array that specifies a condition for every bit-field, since the conditions rarely affect more than one or two bit-fields, although it does happen.

It could be argued that the identification step (matching opcode and funct) is also a set of conditions, but treating it as such makes it difficult to distinguish between a failure to identify an instruction and an instruction that is merely semi-valid. That is, we would like to be able to identify an instruction as mul even if its shamt is non-zero, and then inform the user that the input is somewhat malformed (semi-valid).
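Keeping identification and validation as two separate steps might look like the following sketch. The three-valued `Status` enum and the class `MulCheck` are assumptions introduced for illustration; only the opcode, funct and shamt values come from the mul example.

```java
import java.util.function.IntPredicate;

/** Hypothetical two-step check: identify by (opcode, funct) first, validate afterwards. */
final class MulCheck {
    enum Status { UNKNOWN, SEMI_VALID, VALID }

    // Validation condition for mul: the shamt field (bits 10..6) must be zero.
    static final IntPredicate SHAMT_IS_ZERO = word -> ((word >>> 6) & 0x1f) == 0;

    private MulCheck() {}

    static Status classify(int word) {
        int opcode = (word >>> 26) & 0x3f;
        int funct = word & 0x3f;

        // Step 1: identification. A mismatch means "not mul", not "malformed mul".
        if (opcode != 0x1c || funct != 0x02) {
            return Status.UNKNOWN;
        }
        // Step 2: validation. The instruction is known to be mul; now judge its quality.
        return SHAMT_IS_ZERO.test(word) ? Status.VALID : Status.SEMI_VALID;
    }
}
```

Here `0x71014802` classifies as `VALID`, while the same word with bit 6 set (`0x71014842`, shamt = 1) is still recognized as mul but reported as `SEMI_VALID`.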

The values that the shamt method can operate on are stored internally in the enum. The reason for not specifying a plethora of methods such as

boolean shamtIsZero() { ... }

is that there are so many different conditions. See the mtc1 example below. Is Java too ill-suited for this? Should I use a HashMap instead and evaluate that, or is there some neat FunctionalInterface around that will help me do this?


Example: mtc1

There is another instruction, mtc1, identified by the opcode 0x11 and a funct field set to 0x00.

/**
 * Move to coprocessor 1, move CPU register rt to register
 * fs in the FPU. fs occupies the rd field. Note the pattern that
 * the MTC1 and MFC1 operations share the same opcode and funct
 * field. The rs field distinguishes them.
 */
// TODO: Validate that rs = 4 and funct and shamt = 0
MTC1(0x11, 0x00, R::rt, R::fs)

As you can see, here we have to satisfy several conditions, and one of the bit-fields has to be something other than 0. This is why we do not want an individual method for each condition: they would become too numerous.
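One way to avoid a method per condition is to name the bit-field with an enum constant and pass the expected value as an argument, so a single method covers every field and value combination. This is a sketch under that assumption; `Field`, `is` and `Mtc1Check` are hypothetical names, and the word `0x44880000` was constructed by hand for the example (opcode 0x11, rs = 4, rt = 8, all other fields 0).

```java
import java.util.Arrays;
import java.util.function.IntPredicate;

/** Hypothetical enum naming each bit-field by its position and width mask. */
enum Field {
    RS(21, 0x1f), RT(16, 0x1f), RD(11, 0x1f), SHAMT(6, 0x1f), FUNCT(0, 0x3f);

    private final int shift;
    private final int mask;

    Field(int shift, int mask) {
        this.shift = shift;
        this.mask = mask;
    }

    /** Extracts this field from a 32-bit instruction word. */
    int of(int word) {
        return (word >>> shift) & mask;
    }

    /** Builds a condition: this field must equal the expected value. */
    IntPredicate is(int expected) {
        return word -> of(word) == expected;
    }
}

/** Conditions from the mtc1 TODO (funct is already covered by identification). */
final class Mtc1Check {
    static final IntPredicate[] CONDITIONS = { Field.RS.is(4), Field.SHAMT.is(0) };

    private Mtc1Check() {}

    static boolean valid(int word) {
        return Arrays.stream(CONDITIONS).allMatch(condition -> condition.test(word));
    }
}
```

`Mtc1Check.valid(0x44880000)` holds, while the same word with rs = 0 (`0x44080000`) fails the first condition.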


Solution

  • You can pass multiple conditions in the constructor - you can even make them Predicates:

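    import java.util.Arrays;
    import java.util.function.Predicate;

    enum MIPS32Instructions {

        // Each constant lists the conditions that a value must satisfy.
        MUL("MUL", (v) -> v > 0),
        DIV("DIV", (v) -> v > 1, (v) -> v != 9);

        final String id;
        final Predicate<Integer>[] conditions;

        // @SafeVarargs silences the generic-array warning; the array is only read.
        @SafeVarargs
        MIPS32Instructions(String id, Predicate<Integer>... conditions) {
            this.id = id;
            this.conditions = conditions;
        }

        // Returns true only if every condition accepts v.
        public boolean checkConditions(int v) {
            return Arrays.stream(conditions)
                    .allMatch((c) -> c.test(v));
        }

    }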