
High-speed case construct in assembler + loading DPTR fast - 8051


I'm currently implementing a serial routine for an 8051 IC (specifically the AT89C4051), and I don't have much stack space or memory left. To achieve a decent baud rate on the serial port (38K or better), I need a high-speed case construct, because my serial port interrupt routine builds a packet and checks it for validity.

Assume we are in the serial interrupt and R0 holds the address of the memory location where the next received byte is to be stored. Let's assume the start address is 40h.

So here we go with a bunch of compares:

Branching via many compares

serial:
    mov A,SBUF
    mov @R0,A
    mov A,R0
    anl A,#07h ;our packet is 8 bytes, so get the current byte index from what we stored so far
    cjne A,#0h,nCheckMe ;still executed when A=7: wastes 2 machine cycles
        ;we're working on the first byte
        ajmp theend
    nCheckMe:
    cjne A,#1h,nCheckThem ;still executed when A=7: wastes 2 machine cycles
        ;we're working on the second byte
        ajmp theend
    nCheckThem:
    ...
    cjne A,#7h,nCheckEnd
        ;we're working on the last byte
        ajmp theend
    nCheckEnd:
    theend:
    inc R0
    reti

The above code might be practical at first, but as the index of the current byte in the packet increases, the routine runs 2 machine cycles slower for each additional byte because of the extra "cjne" instruction. For example, on the 7th byte, "cjne" executes seven times, because the routine has to scan through every earlier case before it reaches the matching one.

Branching via jump

Next I thought of using a single computed jump, but I can't figure out how to load DPTR at high speed, because the interrupt can be called while some other code is using the value in DPTR.

I thought of this code:

serial:
    mov A,SBUF
    mov @R0,A
    mov A,R0
    anl A,#07h ;our packet is 8 bytes so get current packet # based on what we stored so far
    swap A ;multiply A times 16 and
    rr A ;divide A by 2 so we get address times 8 since each block uses 8 bytes of code space.

    mov R3,DPH ;save DPTR high byte without breaking stack
    mov R6,DPL ;save DPTR low byte
    mov dptr,#table
    jmp @A+DPTR
    theend:
    mov DPL,R6 ;restore DPTR low byte
    mov DPH,R3 ;restore DPTR high byte
    inc R0     ;move on to next position
reti
table:
;insert 8 bytes worth of code for 1st case
;insert 8 bytes worth of code for 2nd case
;insert 8 bytes worth of code for 3rd case
...
;insert unlimited bytes worth of code for last case

In my code, R3 and R6 were free, so I used them to store the old DPTR value. But those mov instructions, together with loading the new DPTR value, take 2 cycles each, for 10 cycles total (including restoring the old value).

Is there a faster way to process a case construct in 8051 assembly code so that my serial routine processes faster?


Solution

  • Don't run logic in the ISR if possible. If you insist, you might be able to assign DPTR to the ISR and only use it in very short pieces of normal code with interrupts disabled. Alternatively, a PUSH+RET trick could work.
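
    As a rough sketch of the PUSH+RET idea applied to the question's jump table: push the target address and let RET load it into PC, so DPTR is never touched. This assumes the table starts on a 256-byte page boundary (so the low byte of each case address is simply index*8) and that two bytes of stack can be spared for a moment. high8() is assumed to be the assembler's high-byte operator, by analogy with low8() in the code further down; other assemblers spell it differently.

    serial:
        mov A,SBUF
        mov @R0,A
        mov A,R0
        anl A,#07h           ; byte index 0..7
        swap A
        rr A                 ; A = index*8 = low byte of the case address
        push ACC             ; RET pops the low byte last, so push it first
        mov A,#high8(table)  ; high8() is an assumed operator name
        push ACC
        ret                  ; PC = table + index*8; the interrupt's own
                             ; return address stays on the stack for RETI

    ; assumption: table is placed on a 256-byte boundary
    table:
    ; 8 bytes of code for the 1st case, ending in "inc R0" + "reti"
    ; 8 bytes of code for the 2nd case, ending in "inc R0" + "reti"
    ...

    The dispatch itself (push, mov, push, ret) costs 7 cycles, against 12 for saving DPTR, loading it, jmp @A+DPTR and restoring it. Because nothing has to be restored, each case can end with its own reti instead of jumping to a common exit.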

    Here is a chained approach, where each processed character just sets the address for the next step. If you can ensure the steps are within the same 256-byte block, you only ever need to update the low byte. The total overhead is 8 cycles (two PUSH, one RET and one MOV, at 2 cycles each) instead of the 10 cycles spent saving, loading and restoring DPTR, and you also save the 4 cycles for the arithmetic, so it's a win of 6 cycles.

    .EQU PTR, 0x20  ; two bytes of SRAM
    
    ; Initialize PTR to address of step1 
    
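    serial:
        push PTR      ; low byte of the next step's address
        push PTR+1    ; high byte (RET pops this one first)
        ret           ; jump to the step; the interrupt return address stays on the stack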
    
    step1:
        ; do stuff
        mov PTR, #low8(step2)
        reti
    
    last_step:
        ; do stuff
        mov PTR, #low8(step1)
        reti
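
    Filling in the pieces the outline leaves open, a minimal sketch might look like the following. It assumes R0 is still the receive pointer from the question, that all step labels share one 256-byte page, and that high8() exists alongside low8() in the assembler being used. The "checks" are placeholders, and steps 2 to 7 would follow the same pattern as step1.

    init:
        mov PTR, #low8(step1)     ; low byte of the first step
        mov PTR+1, #high8(step1)  ; high byte, written once (assumed operator name)
        mov R0, #40h              ; start of the packet buffer

    step1:
        mov A, SBUF               ; note: ACC is clobbered, as in the question's code
        clr RI                    ; RI is not cleared by hardware
        mov @R0, A
        inc R0
        ; checks for the first byte go here
        mov PTR, #low8(step2)     ; next interrupt continues at step2
        reti

    last_step:
        mov A, SBUF
        clr RI
        mov @R0, A
        mov R0, #40h              ; wrap back to the start of the buffer
        ; checks for the last byte go here
        mov PTR, #low8(step1)     ; next packet starts over
        reti

    Since each step knows which byte it handles, the "anl", "swap" and "rr" arithmetic from the question disappears entirely; the only per-byte bookkeeping is the single mov to PTR.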