I'm implementing a serial routine for an 8051-family IC (specifically the AT89C4051), and I have very little stack space or memory left. To sustain a decent baud rate on the serial port (38K or better), I need a high-speed case construct, because my serial port interrupt routine builds a packet and checks it for validity.
Assume we are in the serial interrupt and R0 holds the address in memory where the received byte is to be stored. Assume the start address is 40h.
So here we go with a bunch of compares:
Branching via many compares
serial:
    mov A,SBUF
    mov @R0,A
    mov A,R0
    anl A,#07h            ;our packet is 8 bytes, so the low 3 bits give the index of the current byte
    cjne A,#0h,nCheckMe   ;still executed when A=7: 2 wasted machine cycles
    ;We're working on first byte
    ajmp theend
nCheckMe:
    cjne A,#1h,nCheckThem ;still executed when A=7: 2 more wasted machine cycles
    ;We're working on second byte
    ajmp theend
nCheckThem:
...
    cjne A,#7h,nCheckEnd
    ;We're working on last byte
    ajmp theend
nCheckEnd:
theend:
    inc R0
    reti
The above code may look practical at first, but the routine gets slower as the index of the current byte within the packet increases: every case that has to be skipped costs one extra "cjne" at 2 machine cycles. For the last byte of the packet, all eight "cjne" instructions execute, so 16 machine cycles go to compares alone.
Branching via jump
Next I thought of using a single indexed jump, but I can't figure out how to load DPTR quickly, because the interrupt can fire while other code is using DPTR, so its value has to be saved and restored.
I thought of this code:
serial:
    mov A,SBUF
    mov @R0,A
    mov A,R0
    anl A,#07h       ;our packet is 8 bytes, so the low 3 bits give the index of the current byte
    swap A           ;multiply A by 16, then
    rr A             ;divide by 2: A = index * 8, since each case block uses 8 bytes of code space
    mov R3,DPH       ;save DPTR high byte without using the stack
    mov R6,DPL       ;save DPTR low byte
    mov DPTR,#table
    jmp @A+DPTR
theend:
    mov DPL,R6       ;restore DPTR low byte
    mov DPH,R3       ;restore DPTR high byte
    inc R0           ;move on to next position
    reti
table:
    ;insert 8 bytes worth of code for 1st case
    ;insert 8 bytes worth of code for 2nd case
    ;insert 8 bytes worth of code for 3rd case
...
    ;insert unlimited bytes worth of code for last case
R3 and R6 were free in my code, so I used them to hold the old DPTR value. However, the four mov instructions that save and restore DPTR, plus the one that loads the new DPTR value, take 2 machine cycles each, for 10 cycles in total.
Is there a faster way to process a case construct in 8051 assembly code so that my serial routine processes faster?
Don't run logic in the ISR if possible. If you insist, you might be able to reserve DPTR for the ISR and use it only in very short pieces of normal code with interrupts disabled. Alternatively, a PUSH+RET trick could work.
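The first suggestion could look roughly like the sketch below. It keeps the question's simplifications (ACC/PSW are not saved, RI/TI are not handled) and assumes each 8-byte case block ends with "ajmp theend". The names init, lookup and consts are hypothetical and not from the question.

```asm
; Sketch only: DPTR is reserved for the serial ISR and always holds #table.

init:
    mov DPTR,#table   ; load once, before the serial interrupt is enabled
    mov R0,#40h       ; receive buffer start, as in the question
    setb ES           ; enable the serial interrupt
    setb EA           ; global interrupt enable
    ret

serial:
    mov A,SBUF
    mov @R0,A
    mov A,R0
    anl A,#07h        ; index of the current byte within the packet
    swap A
    rr A              ; A = index * 8
    jmp @A+DPTR       ; DPTR already points at table: no save/load/restore
theend:
    inc R0
    reti

; Main-line code that briefly needs DPTR for its own table read (hypothetical).
lookup:
    clr ES            ; keep the serial ISR out while DPTR is borrowed
    mov DPTR,#consts
    movc A,@A+DPTR    ; the actual work, kept as short as possible
    mov DPTR,#table   ; hand DPTR back before re-enabling the interrupt
    setb ES
    ret

table:
    ; 8-byte case blocks as in the question, each assumed to end with ajmp theend
consts:
    ; hypothetical constant data read by lookup
```

This removes the 10 cycles of DPTR handling from the ISR. A byte that arrives while ES is cleared only sets RI, and the interrupt is taken once ES is set again, so the borrowed window just has to stay well below one character time.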
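Here is a chained approach, where each processed character just sets the address for the next step. If you can ensure that all steps lie within the same 256-byte block, you only ever need to update the low byte. The total overhead is 8 cycles (two push instructions, the ret, and the mov that stores the next address, at 2 cycles each), but you also save the 4 cycles for the index arithmetic, so compared with the 10 cycles of DPTR handling in the question it's a win of 6 cycles.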
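.EQU PTR, 0x20 ; two bytes of SRAM: low byte at PTR, high byte at PTR+1
; Initialize PTR to address of step1
serial:
    push PTR                ; low byte of the next step's address
    push PTR+1              ; high byte (ret pops the high byte first)
    ret                     ; jump to the stored address
step1:
    ; do stuff
    mov PTR, #low8(step2)   ; the next character is handled by step2
    reti
last_step:
    ; do stuff
    mov PTR, #low8(step1)   ; wrap around for the next packet
    reti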
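A slightly fuller sketch of the same idea, with the initialization and the "do stuff" parts filled in for a two-step packet. This is an illustration under assumptions, not a drop-in routine. low8() follows the notation above, and high8() is assumed to be the matching high-byte operator (many assemblers spell these LOW()/HIGH()). The init label, the R0 rewind and the placement of the validity checks are hypothetical. Like the question's fragment, it does not save ACC/PSW or handle RI/TI.

```asm
.EQU PTR, 0x20                 ; two bytes of SRAM: low byte at PTR, high byte at PTR+1

init:
    mov PTR, #low8(step1)      ; first character goes to step1
    mov PTR+1, #high8(step1)   ; set once: all steps are assumed to share one 256-byte block
    mov R0, #40h               ; receive buffer start, as in the question
    ret

serial:
    push PTR                   ; low byte of the current step's address
    push PTR+1                 ; high byte (ret pops the high byte first)
    ret                        ; jump to the current step

step1:
    mov A, SBUF
    mov @R0, A                 ; store the first byte
    inc R0
    ; validity check for the first byte would go here
    mov PTR, #low8(last_step)  ; the next character is handled by last_step
    reti

last_step:
    mov A, SBUF
    mov @R0, A                 ; store the last byte
    ; validity check for the whole packet would go here
    mov R0, #40h               ; assumed: rewind to the buffer start for the next packet
    mov PTR, #low8(step1)      ; wrap around
    reti
```

Intermediate steps would follow the same pattern as step1. Note that the two pushes briefly need 2 bytes of stack on top of the 2-byte return address the interrupt itself pushed. The ret releases them again before the step runs.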