Sunday, February 22, 2026

Foray into LLVM - Intercepting RISC-V Assembly Code Generation

This weekend I started my adventure of hooking my assembler/linker up to LLVM. The first step is to find the right place to intercept assembly code generation. I highly recommend that anyone interested in LLVM start with the official forum, especially "Beginner Resources + Documentation - Beginners - LLVM Discussion Forums". Personally, I did not want to bother anyone on the forum, but that document is a good start.

I also used ChatGPT and Google Gemini; here is how the two compared. Gemini sounded confident, but its solution did not fit, and I struggled with endless compile errors, missing include files, and link errors. In the end, I gave up on Gemini. ChatGPT's solution was more accurate, both at the architecture level and in the code. GPT tried to lead me further, but it did not succeed. 😅 I could go my own way thanks to the beginner resources on the LLVM web site and some coding experience.

What I did was add a "TALIU" label before every instruction. The generated assembly looks like this:

TALIU:  li      a0, 42
TALIU:  ret

GPT's solution is to replace the statement return new RISCVInstPrinter(MAI, MII, MRI); in createRISCVMCInstPrinter so that it constructs a custom printer instead.

createRISCVMCInstPrinter is located in llvm-project\llvm\lib\Target\RISCV\MCTargetDesc\RISCVMCTargetDesc.cpp. I replaced the existing RISCVInstPrinter with my own InstPrinter. The .h definition is:

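// MyRISCVInstPrinter.h
// Assumes this header sits next to RISCVInstPrinter.h (in MCTargetDesc).
#pragma once
#include "RISCVInstPrinter.h"

namespace llvm {
  // RISC-V instruction printer that prefixes each instruction with a label.
  class MyRISCVInstPrinter : public RISCVInstPrinter {
  public:
    // Reuse the base-class constructors (MAI, MII, MRI).
    using RISCVInstPrinter::RISCVInstPrinter;

    void printInst(const MCInst *MI, uint64_t Address, StringRef Annot,
                   const MCSubtargetInfo &STI, raw_ostream &O) override;
  };
} // namespace llvm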

And the .cpp file looks like this:

#include "MyRISCVInstPrinter.h"

namespace llvm {

    void MyRISCVInstPrinter::printInst(
        const MCInst *MI,
        uint64_t Address,
        StringRef Annot,
        const MCSubtargetInfo &STI,
        raw_ostream &OS) {

        // Emit the marker first, then let the stock printer write the instruction.
        OS << "TALIU: ";
        RISCVInstPrinter::printInst(MI, Address, Annot, STI, OS);
    }

} // namespace llvm
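To see why swapping a single `new` expression is enough, here is a small stand-alone model of the pattern that compiles without LLVM. Everything in it is a simplified stand-in that I made up for illustration: BasePrinter plays the role of RISCVInstPrinter, PrefixPrinter plays MyRISCVInstPrinter, createPrinter plays createRISCVMCInstPrinter, and render is just a helper. The real LLVM signatures are different, so treat this as a sketch of the idea and not as LLVM code.

```cpp
#include <memory>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Stand-in for RISCVInstPrinter (hypothetical, heavily simplified).
struct BasePrinter {
  virtual ~BasePrinter() = default;
  // Stand-in for printInst: writes one instruction's text to the stream.
  virtual void printInst(const std::string &Inst, std::ostream &OS) {
    OS << Inst;
  }
};

// Stand-in for MyRISCVInstPrinter: emit the label, then delegate to the base.
struct PrefixPrinter : BasePrinter {
  void printInst(const std::string &Inst, std::ostream &OS) override {
    OS << "TALIU: ";
    BasePrinter::printInst(Inst, OS);
  }
};

// Stand-in for createRISCVMCInstPrinter: callers only ever see the base
// type, so changing the object constructed here is the only change needed.
std::unique_ptr<BasePrinter> createPrinter() {
  return std::make_unique<PrefixPrinter>();
}

// Helper (hypothetical): print each instruction on its own line and
// return the resulting text.
std::string render(BasePrinter &P, const std::vector<std::string> &Insts) {
  std::ostringstream OS;
  for (const std::string &Inst : Insts) {
    P.printInst(Inst, OS); // virtual call: dispatches to the derived class
    OS << '\n';
  }
  return OS.str();
}
```

Calling render(*createPrinter(), {"li a0, 42", "ret"}) goes through the base-class interface, yet every line comes back with the "TALIU: " prefix. That is the same reason the rest of the backend does not need to know about MyRISCVInstPrinter.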

With both files in place, configure the build from the build folder with:

cmake -G Ninja -DLLVM_ENABLE_PROJECTS="" -DLLVM_TARGETS_TO_BUILD="RISCV" -DCMAKE_BUILD_TYPE=Debug -DCMAKE_EXPORT_COMPILE_COMMANDS=ON ../llvm

and then build with:

ninja LLVMRISCVCodeGen LLVMAsmPrinter

Gemini told me to run "cmake --build . --target llvm-mc", which did not work. Given my limited experience with LLVM, I don't fully understand why.

 
