My next step with the game was building a workflow for taking the spritesheet and turning it into assembler. On older PCs, writing image bytes directly with unrolled assembler is way faster than copying pixels one at a time in nested (x, y) loops, especially when transparency comes into play: a transparent pixel simply never gets an instruction, so there's no per-pixel check at draw time. And this speedup is just from the arena tiles! Once the characters and shots are using assembler too, it'll be even faster.
I wrote Ruby code that reads the spritesheet image and a YAML config file, then generates sprite routines the Open Watcom assembler can build, plus a C header so I can call the sprites easily. Along the way I learned a lot about assembler and how to use it with C, including C calling conventions and how to get them working well.
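To make the idea concrete, here's a minimal sketch of what a generator like this could look like. It is not the actual code from this project, and everything specific in it is an assumption for illustration: the YAML keys, the `draw_<name>` naming, a 320-byte-wide linear screen buffer, a 32-bit flat memory model, and Open Watcom's default register calling convention (first argument in EAX, trailing underscore on the symbol name). A small in-memory array of palette indices stands in for the decoded spritesheet so the example doesn't depend on an image library.

```ruby
require "yaml"

# Hypothetical config layout; the real YAML file is not shown in this post.
CONFIG = YAML.safe_load(<<~CFG)
  screen_width: 320
  transparent: 0
  sprites:
    - name: wall
      x: 0
      y: 0
      w: 4
      h: 2
CFG

# Stand-in for the decoded spritesheet: rows of palette indices.
SHEET = [
  [0, 7, 7, 0],
  [7, 8, 8, 7],
]

# Emit one unrolled routine per sprite: one byte store per visible pixel.
# Transparent pixels produce no instruction at all.
# Assumes the destination pointer arrives in EAX (an assumption, see above).
def sprite_asm(sheet, sprite, stride, transparent)
  name = "draw_#{sprite["name"]}_"
  lines = ["public #{name}", "#{name} proc"]
  sprite["h"].times do |row|
    sprite["w"].times do |col|
      color = sheet[sprite["y"] + row][sprite["x"] + col]
      next if color == transparent
      offset = row * stride + col
      lines << "    mov byte ptr [eax+#{offset}], #{color}"
    end
  end
  lines << "    ret" << "#{name} endp"
  lines.join("\n")
end

# Matching C prototype for the generated header.
def sprite_prototype(sprite)
  "void draw_#{sprite["name"]}(unsigned char *dest);"
end

CONFIG["sprites"].each do |sprite|
  puts sprite_asm(SHEET, sprite, CONFIG["screen_width"], CONFIG["transparent"])
  puts sprite_prototype(sprite)
end
```

A real generator would also need the assembler file's surrounding directives (processor, memory model, code segment) and would probably merge neighbouring pixels into wider stores. The sketch only shows the core trick: the loop and the transparency test run once, in Ruby, at build time instead of on the PC at draw time.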
Next up is collision detection between objects, which will take some tweaks to the Ruby code to do nicely. After that I can get on with the actual game part of all this.