DMA implemented, and many bugs fixed along the way
I started working on DMA support and got it most of the way there, but then some of my tests started acting weird. After a lot of digging with GTKWave, it looked like 16-bit loads and stores were both misbehaving, and it turns out neither of them has ever worked properly! My unit tests were using values that hid this, so they passed even though there were major bugs lurking in the functionality.
ld16 was acting little-endian even though the rest of the CPU is big-endian. That was a quick fix, thankfully! My unit tests were using “symmetrical” values like 0x1111 and 0xC4C4, where the high and low bytes are identical. I don’t know why I ever thought that was a good idea, because the obvious problem with it was happening: the value was getting byte-swapped, but swapping two identical bytes gives back the same value, so nothing was reported as a failure. I’ve fixed the bug and changed the test values to ones with distinct high and low bytes, so that similar problems won’t pop up in the future without getting caught.
st16 was storing the high byte into both RAM locations, rather than the high byte first and the low byte second. Again, this wasn’t caught because my test values hid the problem. That took a little more shuffling around, but a couple of microcode changes later it’s behaving as it should. This bit of microcode is kind of a mess now and needs some optimizing, but that’s a problem for Future Me.
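The same kind of toy model shows why the st16 bug slipped through. This is only a sketch under the assumption of a simple byte-addressed RAM; `st16_buggy` imitates the described symptom, not the actual microcode.

```python
def st16_correct(mem, addr, value):
    """Big-endian store: high byte first, low byte second."""
    mem[addr] = (value >> 8) & 0xFF
    mem[addr + 1] = value & 0xFF


def st16_buggy(mem, addr, value):
    """Model of the bug: the high byte lands in both locations."""
    hi = (value >> 8) & 0xFF
    mem[addr] = hi
    mem[addr + 1] = hi


def stores_match(value):
    """True if the buggy store is indistinguishable from the correct one."""
    good, bad = [0, 0], [0, 0]
    st16_correct(good, 0, value)
    st16_buggy(bad, 0, value)
    return good == bad


for value in (0x1111, 0xC4C4, 0x12A5):
    print(f"{value:#06x}: bug hidden={stores_match(value)}")
```

With 0x1111 or 0xC4C4 the two stores leave identical memory contents, so a test built on those values cannot fail.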
Once those were sorted out the DMA test was still failing, but it turns out that was because the load instructions don’t set the status bits, so testing for zero (Z status) after a load was always failing. I added a load_status bit to the end of the ld16 instruction and then this test passed as well!
The DMA test harness now starts up the CPU, lets it get settled, then asserts DMA_REQ. Once the CPU responds with DMA_ACK, the harness pokes at a couple of locations in memory to change them, then releases DMA_REQ again. In the meantime the test code running on the CPU itself has set things up, stored a couple of values into memory, then gone into a loop waiting for those values to change. Once the harness has changed those values, the CPU code exits the loop and reports success.
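The handshake sequence can be sketched as a small cycle-stepped Python model. Only the DMA_REQ and DMA_ACK signal names come from the description above; the `ToyCpu` class, the addresses, the values, and the cycle counts are all hypothetical stand-ins for the real testbench.

```python
class ToyCpu:
    """Hypothetical stand-in for the CPU and the test program running on it."""

    def __init__(self, mem, watch):
        self.mem = mem
        self.watch = dict(watch)  # {address: initial value}
        self.dma_ack = False
        self.done = False
        # "Set things up": store the initial values into memory.
        for addr, value in self.watch.items():
            mem[addr] = value

    def step(self, dma_req):
        # Assumed behaviour: grant the bus whenever it is requested and
        # stall the program while the grant is active.
        self.dma_ack = dma_req
        if self.dma_ack or self.done:
            return
        # Polling loop: exit once every watched location has changed.
        if all(self.mem[a] != v for a, v in self.watch.items()):
            self.done = True


def run_dma_test(settle_cycles=5, max_cycles=100):
    """Return the cycle on which the CPU reports success, or None on timeout."""
    mem = [0] * 16
    cpu = ToyCpu(mem, {4: 0x12, 5: 0xA5})
    dma_req = False
    for cycle in range(max_cycles):
        if cycle == settle_cycles:
            dma_req = True            # harness asserts DMA_REQ
        cpu.step(dma_req)
        if dma_req and cpu.dma_ack:   # CPU answered with DMA_ACK
            mem[4] = 0x00             # poke the watched locations
            mem[5] = 0xFF
            dma_req = False           # release DMA_REQ
        if cpu.done:
            return cycle
    return None


result = run_dma_test()
print("success" if result is not None else "timeout")
```

If the harness never gets a chance to request the bus (for example, with `max_cycles` smaller than `settle_cycles`), the model times out instead of reporting success, which mirrors how the real test would hang in its polling loop.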
I’ve since gone back and added Z status setting to the ld8h and ld8l versions as well, but it turns out the high-byte operations have never set the Z status properly. I never implemented any tests for that, so I never caught it. Oops! So my next task will be implementing proper high-byte Z status setting, along with proper unit tests so it’ll be caught if it regresses later. If I’d realized back at the beginning how much extra logic the x86-style “16-bit registers can also be two 8-bit registers” thing would add, I’m not sure I would have implemented it, but at this point it’s a pretty core part of the architecture so I’m sticking with it.
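As a rough idea of what such a test might need to check, here is a Python sketch. It assumes Z should reflect only the byte that was just loaded, which is a guess at the intended semantics rather than something the CPU's spec confirms, and `ld8h_wrong_z` is an invented example of one way the flag could go wrong.

```python
def ld8h(reg16, byte):
    """Load into the high half; Z reflects the loaded byte (assumed)."""
    byte &= 0xFF
    return (byte << 8) | (reg16 & 0x00FF), byte == 0


def ld8l(reg16, byte):
    """Load into the low half; Z reflects the loaded byte (assumed)."""
    byte &= 0xFF
    return (reg16 & 0xFF00) | byte, byte == 0


def ld8h_wrong_z(reg16, byte):
    """Hypothetical bug: Z is derived from the low half instead."""
    new = ((byte & 0xFF) << 8) | (reg16 & 0x00FF)
    return new, (new & 0x00FF) == 0


# A zero low half hides the hypothetical bug; a non-zero one exposes it.
for start in (0x1200, 0x12A5):
    good = ld8h(start, 0x00)
    bad = ld8h_wrong_z(start, 0x00)
    print(f"start={start:#06x}: bug hidden={good == bad}")
```

The takeaway is the same as with the 16-bit tests: the untouched half of the register should hold a non-zero value that differs from the loaded byte, so a flag computed from the wrong half shows up as a failure.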