I am building a Commodore PET on an FPGA. I've implemented my own 6502 core in Kansas Lava (code is available at https://github.com/gergoerdi/mos6502-kansas-lava), and by putting enough IO around it (https://github.com/gergoerdi/eightbit-kansas-lava) I was able to boot the original Commodore PET ROM on it, get a blinking cursor and start typing.
However, after typing in the classic BASIC program
10 PRINT "HELLO WORLD"
20 GOTO 10
it crashes after a while (after several seconds) with
?ILLEGAL QUANTITY ERROR IN 10
Because my code has fairly reasonable per-opcode test coverage, and it passes AllSuiteA, I thought I would look into tests for more complicated behaviour, which is how I arrived at Klaus Dormann's interrupt testsuite. Running it in the Kansas Lava simulator has pointed out a ton of bugs in my original interrupt implementation:
I
flag was not set when entering the interrupt handlerB
flag was all over the placeI
was unset when they arrived (the correct behaviour seems to be to queue interrupts when I
is set and when it gets unset, they should still be handled)After fixing these, I can now successfully run the Klaus Dormann test, so I was hoping by loading my machine back onto the real FPGA, with some luck the BASIC crash could be going away.
However, the new version, with all these interrupt bugs fixed, and passing the interrupt test in a simulator, now fails to respond to keyboard input or even just blink the cursor on the real FPGA. Note that both keyboard input and cursor blinking is done in response to an external IRQ (connected from the screen VBlank signal), so this means the fixed version somehow broke all interrupt handling...
I am looking for any kind of vague suggestions what could be going wrong or how I could begin to debug this.
The full code is available at https://github.com/gergoerdi/mos6502-kansas-lava/tree/interrupt-rewrite, the offending commit (the one that fixes the test and breaks the PET) is 7a09b794af. I realize this is the exact opposite of a minimal viable reproduction, but the change itself is tiny and because I have no idea where it goes wrong, and because reproducing the problem requires a machine featureful enough to boot the stock Commodore PET ROM, I don't know how I could shrink it...
Added:
I managed to reproduce the same issue on the same hardware with a very simple (dare I say minimal) ROM instead of the stock PET ROM:
.org $C000
reset:
;; Initialize PIA
LDY #$07
STY $E813
LDA #30
STA $42
STA $8000
CLI
JMP *
irq:
CMP $E812 ; ACK irq
DEC $42
BNE endirq
LDX $8000
INX
STX $8000
LDA #30
STA $42
endirq: RTI
.res $FFFC-*
.org $FFFC
resetv: .addr reset
irqv: .addr irq
Interrupts aren't queued; the interrupt line is sampled on the penultimate cycle of each instruction and if it is active then, and I unset, then a jump to interrupt occurs next instead of a fetch/decode. Could the confusion be that IRQ is level triggered, not edge triggered, and is usually held high for a period, not a single cycle? So clearing I will cause an interrupt to occur immediately if it was already ongoing. It looks like on the PET interrupt is held active until the CPU acknowledges it?
Also notice the semantics: SEI
and CLI
adjust the flag in the final cycle. A decision on whether to jump to interrupt was made the cycle before. So a SEI
as the final thing when an interrupt comes in, you'll enter the interrupt routine with I set. If an interrupt is active when you hit a CLI
then the processor will perform the operation after the CLI
before branching.
I'm on a phone so it's difficult to assess more thoroughly than to offer those platitudes; I'll try to review properly later. Is any of that helpful?