Monday, April 20, 2026

Building My First RISC-V VSCode Extension (and Everything That Went Wrong)


I set out to build a simple VSCode extension for RISC-V assembly. The idea sounded straightforward: define a new language, hook up a Rust-based parser through LSP, and get syntax highlighting plus diagnostics. I’ve built compilers before—how hard could this be? It turns out, the hard part wasn’t the compiler at all. It was everything around it.

The Setup

On paper, the architecture was clean. A TypeScript extension runs inside VSCode, launches a Rust-based language server, and that server parses .rv.s files, reports errors, and sends back semantic tokens for highlighting. In development mode, everything worked beautifully. I could open a file and immediately see highlighting, introduce an error and watch diagnostics appear exactly where I expected. It felt solid, and at that point I thought I was basically done. I wasn’t.

The VSIX Reality Check

The moment I packaged everything into a .vsix and installed it, things broke—quietly. There was no highlighting, no diagnostics, and no logs. The extension just sat there stuck on “Activating…”. That’s when it really clicked: dev mode hides problems, but packaging reveals them.

The First Wall: Missing Dependencies

The first real clue came from an error saying it couldn’t find vscode-languageclient/node. That was confusing because it was clearly listed in my package.json. The issue, as it turned out, was entirely self-inflicted. VSCode extensions don’t bundle dependencies automatically, and I had explicitly excluded node_modules in .vscodeignore. In other words, I built an extension and then removed part of it before shipping. Once I included node_modules again, that problem disappeared. Simple in hindsight, but not obvious when you’re in the middle of it.

The Silent Failure: LSP Not Starting

After fixing dependencies, the extension still didn’t work. This time there wasn’t even an error—just silence. That silence made it harder to debug than an actual crash. The issue turned out to be the path to the Rust binary. In development, I was using a relative path like server/target/release/rust_keyword_lsp_server.exe, which worked because everything ran inside my workspace. But once installed, VSCode runs the extension from a completely different location, so that path no longer pointed to anything valid. The fix was to resolve paths using context.extensionPath. Once I did that, the server finally started.

The Subtle Killer: stdout

Then came one of the most subtle bugs in the entire process. My parser used println! to print errors, which is perfectly normal in Rust. But in an LSP setup, stdout is not for logging—it’s the protocol itself. Every time I printed something, I was corrupting the JSON stream between the server and VSCode. The client would silently disconnect, making it look like the server never started. There was no obvious error pointing to this. The fix was simply to use eprintln! instead, sending logs to stderr. One small change, but it made a massive difference.
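To make the rule concrete, here is a minimal sketch (not my actual server code; the function names are mine) of the discipline an LSP server over stdio has to follow: stdout carries only Content-Length-framed JSON-RPC, and anything human-readable goes to stderr via eprintln!.

```rust
use std::io::Write;

// Logging goes to stderr: the LSP client never reads it, so it cannot
// corrupt the protocol stream.
fn log(msg: &str) {
    eprintln!("[server] {msg}");
}

// Only Content-Length-framed JSON-RPC is allowed on stdout.
fn send_frame(payload: &str) -> String {
    let frame = format!("Content-Length: {}\r\n\r\n{}", payload.len(), payload);
    std::io::stdout().write_all(frame.as_bytes()).unwrap();
    frame
}

fn main() {
    log("parsed file, 2 diagnostics found"); // safe: stderr
    // A println! here instead would splice raw text into the JSON-RPC
    // stream and silently kill the connection.
    send_frame(r#"{"jsonrpc":"2.0","method":"initialized","params":{}}"#);
}
```

The whole bug class disappears once every debug print in the server goes through a helper like log above.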

The Debugging Breakthrough: Developer Tools

At one point, I realized I was essentially debugging blind. There were no logs, no clear signals, just a stuck extension. That changed when I discovered the developer tools in VSCode through “Help → Toggle Developer Tools”. This was a turning point. Suddenly I could see activation errors, inspect console logs, catch missing modules immediately, and verify runtime paths. Before this, I was guessing. After this, I was actually debugging. If you’re building a VSCode extension and not using this, you’re making things much harder than they need to be.

Packaging Is a Minefield

Even after getting everything working, packaging still had its own set of traps. I had to carefully tune .vscodeignore to exclude Rust source code and large build folders, include only the release binary, and still keep all required Node dependencies. A single mistake here could break the extension again in ways that looked completely unrelated. It became clear that packaging isn’t just cleanup—it’s part of the system itself.
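For reference, here is a sketch of the kind of .vscodeignore this converged on. The exact entries (folder names, the binary name) come from my project layout, so treat them as placeholders rather than a recipe:

```
# Rust sources and intermediate build artifacts stay out of the VSIX
server/src/**
server/target/debug/**
# ...but keep the release binary the extension actually launches
!server/target/release/rust_keyword_lsp_server.exe
# TypeScript sources (only the compiled output ships)
src/**
tsconfig.json
# Note: node_modules is deliberately NOT listed here.
# Excluding it was what broke vscode-languageclient/node in the first place.
```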

What I Learned

The biggest surprise in all of this was where the complexity actually lives. It’s not in parsing, not in Rust, and not even in the LSP protocol. It’s in the boundaries: the differences between development and packaged environments, stdout versus stderr, relative paths versus resolved ones, and what exists locally versus what actually gets shipped. Each of these seems small on its own, but together they create a system where things can fail silently in ways that are hard to reason about.

Where It Ended Up

After working through all of that, I ended up with a VSIX that installs cleanly, a Rust LSP that starts reliably, working diagnostics, and semantic highlighting for RISC-V assembly. More importantly, I now understand the real challenges of building tooling inside VSCode, and they’re not where I initially expected them to be.

Closing Thought

If you’re building an LSP-based extension, expect the bugs to come from the edges, not the core. Your parser will probably work, and your design will probably make sense. But the thing that breaks everything might be something as small as a single println!, and you won’t see it coming.


Eventually, I got instruction and register highlighting working.




Wednesday, March 25, 2026

Rust FlameGraph

cargo install flamegraph
cargo flamegraph --bin <my binary>

Please note that <my binary> does not need the .exe suffix on Windows.

The call stack is upside down: the topmost element is the function at the lowest level. Now let's see if I can spend more time debugging rather than waiting for the code to finish.



Saturday, March 21, 2026

Left & Right Associativity in Parsing

I made the same mistake again when using Pratt parsing: I almost mixed up left associativity and right associativity. If an operator is left-associative, the generated AST is a tree that leans to the left. For structs, unions, and other composite types, left associativity exposes the member as the right node. This makes code generation much easier, because I know the type for the dot operator.

For example, with StructA.structB.a, the left-leaning tree exposes "a" as the right node, so I can easily determine the type of the whole expression StructA.structB.a. I did not set assignment up as an operator; assignment is right-associative and would produce a right-leaning tree.
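As a minimal sketch of the left-associative case (hypothetical names, not my parser's actual code), the loop below folds each new member into a fresh parent node, so StructA.structB.a comes out as a left-leaning tree with "a" as the right child:

```rust
// Left child: everything parsed so far. Right child: the newest member.
#[derive(Debug)]
enum Expr {
    Ident(String),
    Dot(Box<Expr>, Box<Expr>),
}

fn parse_dots(tokens: &[&str]) -> Expr {
    let mut lhs = Expr::Ident(tokens[0].to_string());
    for tok in &tokens[1..] {
        // Left-associative: the accumulated expression becomes the left child,
        // so the tree leans left and the final member stays on the right.
        lhs = Expr::Dot(Box::new(lhs), Box::new(Expr::Ident(tok.to_string())));
    }
    lhs
}

fn main() {
    // StructA.structB.a  ->  Dot(Dot(Ident("StructA"), Ident("structB")), Ident("a"))
    let ast = parse_dots(&["StructA", "structB", "a"]);
    println!("{:?}", ast);
}
```

A right-associative fold of the same tokens would instead nest on the right and bury the final member deep in the tree, which is exactly what makes type lookup for the dot operator awkward.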

Hopefully this self-note will help me remember the rule so I don't miss this case in the future.

Saturday, March 7, 2026

RiscV GCC Dynamic GOT layout

I've been struggling with the Linux ELF dynamic link format for days. All the GPTs provided wrong answers, and they were wrong in the same way.

The correct GOT layout in a Linux ELF for RiscV is as follows when there are two external functions, funct0 and funct1.

  1. ffffffff ffffffff is reserved data
  2. 00000000 00000000 is reserved data
  3. GOT[funct0]
  4. GOT[funct1]
  5. .dynamic virtual address
#1 and #2 are the link map and resolver addresses. These fields are set by the dynamic linker (also called the dynamic loader).

Unlike what the GPTs claimed (that the .dynamic virtual address goes in the first slot), the .dynamic virtual address is actually the last element.
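As a sketch of how this layout can be emitted (a hypothetical function with my own names, assuming RV64 with 8-byte little-endian slots; my actual assembler code differs):

```rust
// Emit the five GOT slots in the order listed above, for two external
// functions on RV64 (8-byte little-endian slots).
fn build_got(got_funct0: u64, got_funct1: u64, dynamic_vaddr: u64) -> Vec<u8> {
    let slots = [
        0xffff_ffff_ffff_ffff_u64, // #1 reserved; filled in by the dynamic linker
        0x0000_0000_0000_0000,     // #2 reserved; filled in by the dynamic linker
        got_funct0,                // GOT[funct0]
        got_funct1,                // GOT[funct1]
        dynamic_vaddr,             // .dynamic virtual address, as the LAST element
    ];
    slots.iter().flat_map(|s| s.to_le_bytes()).collect()
}

fn main() {
    let got = build_got(0x1_0000, 0x1_0008, 0x2_0000);
    println!("got section: {} bytes", got.len()); // 5 slots * 8 bytes
}
```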

Sunday, February 22, 2026

Foray into LLVM - Intercept RiscV Assembly Code Generation

This weekend, I started my adventure of hooking my assembler / linker into LLVM. The first step is to find the right place to intercept the asm code generation. I highly suggest that anyone interested in LLVM start with the official website forum, especially "Beginner Resources + Documentation" under Beginners on the LLVM Discussion Forums. Personally, I did not want to bother anyone on the forum, but that document is a good start.

I did use ChatGPT and Google Gemini; the following is a comparison of the two. Gemini seemed confident; however, its solution did not fit, and I struggled with endless compile errors, missing include files, and link errors. In the end, I gave up on Gemini. ChatGPT's solution was more accurate at the architecture level, and its code was more accurate too. Although GPT tried to lead me astray at times, it did not succeed. 😅 That's thanks to those beginner resources on the LLVM website and some coding experience.

What I did was add a "TALIU" label before every instruction. The final generated assembly looks like the following:

TALIU:  li      a0, 42
TALIU:  ret

GPT's solution is to replace the return new RISCVInstPrinter(MAI, MII, MRI); statement in createRISCVMCInstPrinter.

The createRISCVMCInstPrinter function is located in llvm-project\llvm\lib\Target\RISCV\MCTargetDesc\RISCVMCTargetDesc.cpp. I use my own InstPrinter to replace the existing RISCVInstPrinter. The .h definition is:

#pragma once
#include "RISCVInstPrinter.h"

namespace llvm {
  class MyRISCVInstPrinter : public RISCVInstPrinter {
  public:
    // Reuse all of RISCVInstPrinter's constructors.
    using RISCVInstPrinter::RISCVInstPrinter;

    void printInst(const MCInst *MI, uint64_t Address, StringRef Annot,
                   const MCSubtargetInfo &STI, raw_ostream &O) override;
  };
} // namespace llvm

And the .cpp file is like the following.

#include "MyRISCVInstPrinter.h"

namespace llvm {

    void MyRISCVInstPrinter::printInst(
        const MCInst *MI,
        uint64_t Address,
        StringRef Annot,
        const MCSubtargetInfo &STI,
        raw_ostream &OS) {

        OS << "TALIU: ";
        RISCVInstPrinter::printInst(MI, Address, Annot, STI, OS);
    }

}

After this, make sure the build command (under build folder) is "ninja LLVMRISCVCodeGen LLVMAsmPrinter" with configuration command "cmake -G Ninja -DLLVM_ENABLE_PROJECTS="" -DLLVM_TARGETS_TO_BUILD="RISCV" -DCMAKE_BUILD_TYPE=Debug -DCMAKE_EXPORT_COMPILE_COMMANDS=ON ../llvm"

Gemini told me to run "cmake --build . --target llvm-mc", which did not work. Given my limited experience with LLVM, I don't fully understand why.

 

Saturday, February 14, 2026

Dynamic ELF Structure - Endless Segment Fault

I created a RiscV assembler and linker. One of the challenges I faced was understanding the ELF dynamic structure and making it work on a new chip (RiscV). The first challenge was segments and sections, and how the segment headers and section headers work together. I want to write a few posts about ELF and dynamic ELF from scratch. The previous post (GOT/PLT) is part of my assembler / linker work.

Before you start this work, make sure you have WSL ready with RiscV support. The RiscV support is not mandatory, but you will find it very helpful if your WSL can execute RiscV binaries.

Ok, let me start with the ELF structure and a diagram. The ELF standard is big, and I struggled a lot to fit it into my head. Frankly, I still have plenty left to learn about ELF.

My assembler/linker generates an ELF binary, lets libc modify the PLT/GOT data, and eventually invokes the "printf" function. Even though the target instructions are RiscV, the ELF format is the same for RiscV as for other CPUs/chips.

The following diagram has two columns. The first column is the binary data in the file. The layout of this binary data can be in a different order, but the concept is the same. The second column is the segments, which are the runtime memory structures. The links between the two columns show how data from the file is mapped into runtime memory.

Another way to read these two columns: the left column contains the ELF header, program headers, sections, and section headers. The right column shows how each segment is created in memory and how section data is loaded into those segments.

There are tons of attributes on these sections and segments; let's leave those for later posts. The important things are: what the data is, and how that data is mapped into runtime memory.

Using the readelf command on Linux shows the program headers. Those program headers describe which data (from which sections) each region of runtime memory contains. I use the following diagram to show their origin. Please note that these memory ranges can overlap. For example, PHDR loads all the program header data, and the PHDR segment sits inside the 1st LOAD segment. That first LOAD segment contains PHDR and also loads the ELF header. If you build a dynamic ELF, this first segment is mandatory; otherwise, a segmentation fault will be the result.

The green blocks in the diagram are the same case: they are program headers that overlap with other segments.

Also, the note section is loaded into memory as well. I do not think it's mandatory, but the note section was the last thing I added before I concluded the dynamic ELF generation work. The INTERP segment is another case where it overlaps with another segment.





Wednesday, February 4, 2026

RiscV Assembly Got/PLT index Calculation

I searched many online documents about the RiscV PLT table and how its relocation works. Eventually I got my own version working. These are my notes, in case I forget one day.

Based on the PLT/GOT RISC V code, the lazy binding code looks like the following two tables. The first one is the PLT[0] code stub, where the _dl_runtime_resolve code is invoked. The t3 register stores the _dl_runtime_resolve address, and jr t3 calls that function. That function requires two parameters: t0 holds the link map address, and t1 holds the function's (e.g. printf's) offset in .got.plt. In the code, t2 is used as a temporary register.

The summary is below:

- t3 is the _dl_runtime_resolve function address stored in the GOT
- t0 is the link map address from the GOT
- t1 is the function's (e.g. printf's) offset in .got.plt
- t2 is a temporary register

The hdr_size is the PLT[0] size, which is two 16-byte rows (32 bytes). It holds 8 instructions. In the following code, PTRSIZE = 4 on 32-bit and PTRSIZE = 8 on 64-bit.

Table 1 PLT[0] Code Stub

1:  auipc  t2, %pcrel_hi(.got.plt)
    sub    t1, t1, t3               # shifted .got.plt offset + hdr_size + 12
    l[w|d] t3, %pcrel_lo(1b)(t2)    # _dl_runtime_resolve
    addi   t1, t1, -(hdr_size + 12) # shifted .got.plt offset; hdr_size is PLT[0] size (32 bytes)
    addi   t0, t2, %pcrel_lo(1b)    # &.got.plt
    srli   t1, t1, log2(16/PTRSIZE) # .got.plt offset
    l[w|d] t0, PTRSIZE(t0)          # link map
    jr     t3

 

The above code is invoked by the following code, which is a function (e.g. printf) stub in the .plt section.

Table 2 PLT[N] Code Stub

1:  auipc   t3, %pcrel_hi(function@.got.plt)
    l[w|d]  t3, %pcrel_lo(1b)(t3)
    jalr    t1, t3
    nop

 

How to get the offset in the t1 register

The t1 register is used to compute the .got.plt offset from within the PLT code stub.

Control passes from the Table 2 PLT[N] Code Stub to the Table 1 PLT[0] Code Stub. The first time the Table 2 PLT[N] Code Stub is called, function@.got.plt points to PLT[0]. PLT[0]'s job is to update function@.got.plt to the actual function address.

The Table 2 PLT[N] Code Stub sets t1 and t3 to the following values:

- t1 = &nop = &PLT[N] + 12 (12 comes from 3 instructions of 4 bytes each)
- t3 = &PLT[0]

In the Table 1 PLT[0] Code Stub, t1 is computed as:

t1 = [t1 - t3 - (hdr_size + 12)] >> log2(16/PTRSIZE)
   = [&PLT[N] + 12 - &PLT[0] - hdr_size - 12] >> log2(16/PTRSIZE)
   = [&PLT[N] - &PLT[0] - hdr_size] >> log2(16/PTRSIZE)   (hdr_size is the PLT[0] size, 32 bytes; refer to the diagram below)
   = [(N-1) * 16] >> log2(16/PTRSIZE)
   = (N-1) * 16 * PTRSIZE / 16
   = (N-1) * PTRSIZE

 

 

N starts from 1 because PLT[0] is reserved for the dynamic resolver. So the 1st function is at offset 0 and the 2nd function is at PTRSIZE. The PTRSIZE is the address size:

- On 32-bit addresses, PTRSIZE is 4 (4 bytes = 32 bits).
- On 64-bit addresses, PTRSIZE is 8 (8 bytes = 64 bits).

Please note that the 1st and 2nd elements of .got.plt are reserved values. The 1st element is the PLT[0] address and the 2nd element is the link map address.
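To convince myself the algebra above holds, here is a small sketch (names and constants are mine, assuming RV64: 8-byte pointers, 32-byte PLT[0] header, 16-byte PLT entries) that mirrors what the PLT[0] stub computes from the registers set by PLT[N]:

```rust
const HDR_SIZE: u64 = 32;   // PLT[0] stub size
const ENTRY_SIZE: u64 = 16; // size of each PLT[N] stub
const PTRSIZE: u64 = 8;     // 8 on RV64, 4 on RV32

// PLT[N] sets t1 = &PLT[N] + 12 and t3 = &PLT[0]; PLT[0] then computes
// the .got.plt offset as [t1 - t3 - (hdr_size + 12)] >> log2(16/PTRSIZE).
fn got_plt_offset(plt0_addr: u64, n: u64) -> u64 {
    let t3 = plt0_addr;
    let t1 = plt0_addr + HDR_SIZE + (n - 1) * ENTRY_SIZE + 12; // &PLT[N] + 12
    let shift = (ENTRY_SIZE / PTRSIZE).trailing_zeros(); // log2(16/PTRSIZE)
    (t1 - t3 - (HDR_SIZE + 12)) >> shift
}

fn main() {
    // funct0 is PLT[1], funct1 is PLT[2]: offsets 0 and PTRSIZE,
    // matching the (N-1)*PTRSIZE result derived above.
    assert_eq!(got_plt_offset(0x1000, 1), 0);
    assert_eq!(got_plt_offset(0x1000, 2), PTRSIZE);
    println!("offsets check out");
}
```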

Link map into t0

The link map is a reserved entry in the .got.plt section. The link_map is a data structure used internally by the Linux dynamic linker (ld.so) to keep track of all shared libraries (shared objects, .so files) loaded into a running process.

Executable (.got.plt)
├── [0] => address of resolver entry (_dl_runtime_resolve)
├── [1] => pointer to link_map (used by resolver)
├── [2] => address of function (e.g., printf)

 

Please note this is not the end. If you want to call printf, you will also need to implement the GNU version table and the hash algorithm. I tried both the old hash algorithm and the new hash algorithm; both work. My environment is QEMU and StarFive2 Ubuntu Linux.