Lots more toolchain work

April 4, 2019 | July 13, 2019

I haven’t made any posts here in ages, but that doesn’t mean that work hasn’t been happening behind the scenes!

Last fall I looked at what my options were for assemblers, and there were a few. vasm seems to be a common choice and looked easily re-targetable, so that was my original plan. Once I thought about it more and talked to others, though, using LLVM seemed a better choice. If I port an assembler, I’ll then need another re-targeting effort later on for a C compiler (there are lots of options there as well), whereas if I re-target LLVM, I get an assembler, C compiler, and a lot of other tools with one effort. I might have underestimated just how big that one effort would be, though.

There isn’t really a “here’s how you build an LLVM backend for a new target from scratch” tutorial or guide anywhere, it turns out. There are one or two older ones, but the LLVM code base has changed enough since they were written that they’re more or less useless with modern versions unless you already understand how to build an LLVM backend, in which case you wouldn’t really need the guide. I considered working with an older version of LLVM, but putting all that effort in knowing I’d then have to work on getting it up to date seemed silly. In the end (after a few false starts) I’ve settled on working through the RISC-V LLVM backend patch set, which goes through building a backend step-by-step, with chunks that should compile and provide some new functionality each time one is completed. The patch set is somewhat out of date, but it’s close enough that I’ve been able to follow along and just google the odd thing here and there when it won’t compile.

A fully working toolchain is still a ways off, but it’s getting there little by little, and the path I’m on now seems like the best way to go in the end. It’s exciting to see it coming together, and I guess this means the project now has four main learning objectives: learning how ECL works and how to design with it, learning how CPUs work inside, learning how LLVM works and how to build a new backend, and eventually learning the MINIX internals and how to port that as well.

The toolchain repository is on GitHub under sarahemm/llvm-eclair if anyone’s interested in following along as I get it working.