Saturday, February 14, 2026

Dynamic ELF Structure - Endless Segment Fault

I wrote a RISC-V assembler and linker. One of the challenges was understanding the ELF dynamic structure well enough to make it work on a new chip (RISC-V). The first hurdle is segments versus sections, and how the segment (program) headers and section headers work together. I plan to write a few posts covering ELF and dynamic ELF from scratch. The previous post (GOT/PLT) is also part of my assembler/linker work.

Before you start, make sure you have WSL set up, ideally with RISC-V support. RISC-V support is not mandatory, but you will find it very helpful if your WSL can execute RISC-V binaries.

OK, let me start with the ELF structure and a diagram. The ELF standard is big, and I struggled a lot to get it into my head. Frankly, I still have a lot to learn about ELF.

My assembler/linker generates an ELF binary, lets the libc dynamic loader patch the PLT/GOT data, and eventually invokes the "printf" function. Even though the target instruction set is RISC-V, the ELF format is the same for RISC-V and other CPUs/chips.

The following diagram has two columns. The 1st column is the binary data in the file. The pieces of the file can be laid out in a different order, but the concept is the same. The 2nd column shows the segments, which form the runtime memory structure. The links between the two columns show how data from the file is mapped into runtime memory.

Another way to read the two columns: the left column contains the ELF header, program headers, sections, and section headers. The right column shows how segments are created in memory and how section data is loaded into those segments.

There are tons of attributes on these sections and segments. Let's leave them for later posts. The important questions are: what the data is, and how that data is mapped to runtime memory.

Running the Linux readelf command on the binary lists the program headers. Each program header describes one range of runtime memory and which data (from which sections) it contains. The following diagram shows where each one comes from. Please note that these memory ranges can overlap. For example, the PHDR segment covers the program header table and is located inside the 1st LOAD segment. That first LOAD segment contains the PHDR data and also loads the ELF header. For a dynamic ELF, this first LOAD segment is mandatory; otherwise the result is a segmentation fault.

The green blocks are the same case. They are program headers (segments) that overlap with other segments.

The note section is loaded into memory as well. I do not think it is mandatory, but the note section was the last thing I added before I wrapped up the dynamic ELF generation work. The INTERP segment is another case where a segment overlaps with another segment.
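
The overlap rule above (PHDR and INTERP sitting inside a LOAD segment) can be checked mechanically. The following is a minimal sketch, not the code from my linker: the Segment struct is a hypothetical, trimmed-down view of a program header, and the function names are made up for illustration. Only the numeric type values (LOAD = 1, INTERP = 3, NOTE = 4, PHDR = 6) come from the ELF specification.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Program header types, using the values from the ELF specification. */
enum { SEG_LOAD = 1, SEG_INTERP = 3, SEG_NOTE = 4, SEG_PHDR = 6 };

/* Hypothetical, trimmed-down view of one program header. */
typedef struct {
    uint32_t type;   /* p_type */
    uint64_t vaddr;  /* p_vaddr: start address in runtime memory */
    uint64_t memsz;  /* p_memsz: size in runtime memory */
} Segment;

/* True if `inner` lies completely inside `outer`. */
bool segment_contains(const Segment *outer, const Segment *inner) {
    return inner->vaddr >= outer->vaddr &&
           inner->vaddr + inner->memsz <= outer->vaddr + outer->memsz;
}

/* True if every segment of the given type is covered by some LOAD segment. */
bool covered_by_load(const Segment *segs, size_t count, uint32_t type) {
    assert(segs != NULL || count == 0);
    for (size_t i = 0; i < count; i++) {
        if (segs[i].type != type) {
            continue;
        }
        bool covered = false;
        for (size_t j = 0; j < count; j++) {
            if (segs[j].type == SEG_LOAD && segment_contains(&segs[j], &segs[i])) {
                covered = true;
                break;
            }
        }
        if (!covered) {
            return false;
        }
    }
    return true;
}
```

Calling covered_by_load(segs, count, SEG_PHDR) on the program header list before writing the file is a cheap way to catch the missing first LOAD segment early, instead of finding out through a segmentation fault at load time.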





Wednesday, February 4, 2026

RiscV Assembly Got/PLT index Calculation

I searched many online documents about the RISC-V PLT and how the relocation works. Eventually I got my own version working. These are my notes in case I forget one day.

In the RISC-V PLT/GOT code, lazy binding is implemented by the two stubs in the tables below. The first one is the PLT[0] code stub, which invokes _dl_runtime_resolve. t3 holds the _dl_runtime_resolve address, and jr t3 jumps to that function. The function requires two parameters: t0 holds the link map address, and t1 holds the offset of the function (e.g. printf) in .got.plt. In the code, t2 is used as a temporary register.

The summary is below

·       t3 is the _dl_runtime_resolve function address stored in GOT

·       t0 is the link map address in GOT

·       t1 is the function (e.g. printf) offset in .got.plt

·       t2 is temporary register

hdr_size is the size of PLT[0], which takes two 16-byte slots (32 bytes) and holds 8 instructions. In the following code, PTRSIZE = 4 on 32-bit targets and PTRSIZE = 8 on 64-bit targets.

Table 1 PLT[0] Code Stub

1:  auipc  t2, %pcrel_hi(.got.plt)
    sub    t1, t1, t3               # shifted .got.plt offset + hdr size + 12
    l[w|d] t3, %pcrel_lo(1b)(t2)    # _dl_runtime_resolve
    addi   t1, t1, -(hdr size + 12) # shifted .got.plt offset, hdr_size is PLT0 size (32 bytes)
    addi   t0, t2, %pcrel_lo(1b)    # &.got.plt
    srli   t1, t1, log2(16/PTRSIZE) # .got.plt offset
    l[w|d] t0, PTRSIZE(t0)          # link map
    jr     t3

 

The code above is invoked by the following code, which is the stub for one function (e.g. printf) in the .plt section.

Table 2 PLT[N] Code Stub

1:  auipc   t3, %pcrel_hi(function@.got.plt)
    l[w|d]  t3, %pcrel_lo(1b)(t3)
    jalr    t1, t3
    nop

 

How to get offset in t1 register

t1 is used to compute the .got.plt offset of the function whose PLT code stub was called.

Control flows from the Table 2 PLT[N] Code Stub to the Table 1 PLT[0] Code Stub. The first time the PLT[N] stub is called, function@.got.plt still points to PLT[0]. PLT[0]'s job is to hand control to _dl_runtime_resolve, which updates function@.got.plt with the actual function address.

The Table 2 PLT[N] Code Stub sets t1 and t3 to the following values:

·       t1 = &nop = &PLT[N] + 12

o   12 comes from 3 instructions of 4 bytes each

·       t3 = &PLT[0]

In the Table 1 PLT[0] Code Stub, t1 is then computed as:

t1 = [t1 - t3 - (hdr_size + 12)] >> log2(16/PTRSIZE)
   = [&PLT[N] + 12 - &PLT[0] - hdr_size - 12] >> log2(16/PTRSIZE)
   = [&PLT[N] - &PLT[0] - hdr_size] >> log2(16/PTRSIZE)   (where hdr_size is the PLT[0] size, 32 bytes; refer to the diagram below)
   = [(N-1)*16] >> log2(16/PTRSIZE)
   = (N-1)*16 / 2^log2(16/PTRSIZE) = (N-1)*16 / (16/PTRSIZE)
   = (N-1)*PTRSIZE

 

 

N starts from 1 because PLT[0] is reserved for the dynamic resolver. So the 1st function is at offset 0 and the 2nd function is at offset PTRSIZE. PTRSIZE is the size of an address:

·       For 32-bit addresses, PTRSIZE is 4 (4 bytes = 32 bits).

·       For 64-bit addresses, PTRSIZE is 8 (8 bytes = 64 bits).

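Please note that the 1st and 2nd elements of .got.plt are reserved. The 1st element holds the _dl_runtime_resolve address (this is what PLT[0] loads into t3) and the 2nd element holds the link map address. Every function entry after them initially points to PLT[0].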
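
To double-check the derivation, the t1 arithmetic of PLT[0] can be replayed in plain C. This is only a sketch under the assumptions stated in this post (32-byte PLT[0], 16-byte PLT[N] stubs, N counted from 1); the function names and the example base address are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

enum { PLT_HEADER_SIZE = 32, PLT_ENTRY_SIZE = 16 };

/* Address of PLT[N]; N starts at 1 because PLT[0] is the 32-byte header. */
uint64_t plt_entry_addr(uint64_t plt0, unsigned n) {
    assert(n >= 1);
    return plt0 + PLT_HEADER_SIZE + (uint64_t)(n - 1) * PLT_ENTRY_SIZE;
}

/* Replays the t1 computation of PLT[0] for a call that came through PLT[N]. */
uint64_t plt0_compute_t1(uint64_t plt0, unsigned n, unsigned ptrsize) {
    assert(ptrsize == 4 || ptrsize == 8);
    uint64_t t1 = plt_entry_addr(plt0, n) + 12; /* jalr t1, t3 leaves &nop in t1 */
    uint64_t t3 = plt0;                         /* unresolved slot points at PLT[0] */
    unsigned shift = (ptrsize == 8) ? 1 : 2;    /* log2(16 / PTRSIZE) */

    t1 = t1 - t3;                      /* sub  t1, t1, t3 */
    t1 = t1 - (PLT_HEADER_SIZE + 12);  /* addi t1, t1, -(hdr size + 12) */
    t1 = t1 >> shift;                  /* srli t1, t1, log2(16/PTRSIZE) */
    return t1;                         /* (N-1) * PTRSIZE */
}
```

For example, plt0_compute_t1(0x10400, 2, 8) returns 8, which is (2-1)*PTRSIZE on a 64-bit target. The PLT[0] base address cancels out, so any value works.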

Link map into t0

The link map is a reserved entry in the .got.plt section. link_map is a data structure used internally by the Linux dynamic linker (ld.so) to keep track of all shared libraries (shared objects, .so files) loaded into a running process.

Executable (.got.plt)
├── [0] => address of resolver entry (_dl_runtime_resolve)
├── [1] => pointer to link_map (used by resolver)
├── [2] => address of function (e.g., printf)
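
Given this layout, the address of the .got.plt slot for the N-th function follows directly from the two reserved entries. The sketch below is an illustration only; gotplt_slot_addr and the example base address are hypothetical names and values, not part of any real API.

```c
#include <assert.h>
#include <stdint.h>

/* Entries [0] (resolver) and [1] (link_map) are reserved. */
enum { GOTPLT_RESERVED_ENTRIES = 2 };

/* Address of the .got.plt slot used by PLT[N], with N starting at 1. */
uint64_t gotplt_slot_addr(uint64_t gotplt_base, unsigned n, unsigned ptrsize) {
    assert(n >= 1);
    assert(ptrsize == 4 || ptrsize == 8);
    return gotplt_base + (uint64_t)(GOTPLT_RESERVED_ENTRIES + (n - 1)) * ptrsize;
}
```

With a 64-bit target and .got.plt at 0x12000, the first function (N = 1) lands at 0x12010, which is the base plus the two reserved 8-byte entries. Put differently, the slot address is the base plus 2*PTRSIZE plus the t1 offset computed in PLT[0].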

 

Please note this is not the end. To actually call printf, you also need to implement the GNU version table and the hash algorithm. I tried both the old hash algorithm and the new hash algorithm, and both work. My environment is QEMU and StarFive2 Ubuntu Linux.
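
For reference, here is a minimal sketch of the two symbol hash functions, assuming "old" means the classic SysV ELF hash (the .hash section) and "new" means the GNU hash (the .gnu.hash section). The function names are hypothetical. This only covers hashing a symbol name; building the bucket and chain tables or the Bloom filter is not shown.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Classic SysV ELF hash, used for lookups in the .hash section. */
uint32_t elf_sysv_hash(const char *name) {
    assert(name != NULL);
    uint32_t h = 0;
    while (*name != '\0') {
        h = (h << 4) + (unsigned char)*name++;
        uint32_t g = h & 0xf0000000u;  /* top 4 bits */
        if (g != 0) {
            h ^= g >> 24;              /* fold them back into the low bits */
        }
        h &= ~g;                       /* and clear them */
    }
    return h;
}

/* GNU hash, used for lookups in the .gnu.hash section: h = h * 33 + c. */
uint32_t elf_gnu_hash(const char *name) {
    assert(name != NULL);
    uint32_t h = 5381;
    for (; *name != '\0'; name++) {
        h = h * 33 + (unsigned char)*name;
    }
    return h;
}
```

For the symbol name "printf", elf_sysv_hash returns 0x077905a6 and elf_gnu_hash returns 0x156b2bb8.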

Tuesday, January 13, 2026

Rust Macro

Rust macros are an interesting topic: they provide a way to generate code on the fly. One of the problems for Rust programmers is that the code is not easy to understand, especially in compiler and low-level programming.

The code is located at ttliu2000/rust_macro (a Rust macro project to make Rust programming easier), and it makes my assembler/linker project code simpler. My background is C#, so I keep everything private and use accessor methods to reach the data, which means heavy boilerplate code. Using macros makes the code more readable.

Thanks to ChatGPT's help, I started the project quickly and got the feature in. So if you are starting to write macros, ask GPT first.

Thursday, January 1, 2026

Tao's Fragments: RiscV Linux lib ELF Generation

Today is Jan 1st, 2026, and I am coming back with something meaningful. Years ago, I decided to start a new journey. Since then I have worked on Rust, the Linux OS, the ELF file format, the RISC-V chip/ABI, and an assembler and linker, and I have started a C language compiler. Let's see how compiler tech and the latest AI can make programmers' lives easier.

I feel I can start blogging again, share some fragments of my work, and get ready for open source.

I have finished the ELF executable with dynamic link support, which means I can use my assembler + linker to generate an ELF binary that invokes printf in libc. I have to say RISC-V's PLT/GOT RELA is really a headache, which can be the topic of another fragment of mine. 😉

Today I finished the .so file generation, and it can generate a library that can be linked with gcc. Now I know why F# chose to support libraries after executables were done, and why the F# library has its own printf. But anyway.

The .so ELF generation is simple. My test .s file is listed below. Please note that I added my own pseudo instructions to the RISC-V assembly language.

.data
.extern printf

tt_fmt: .string "%d\n"

const_float_or_string_value_104:
.string "Array index access test, expected value 42 and actual value: %d\n"

.text 
    .globl  add2               
    .type   add2, @function
add2:
    add     a0, a0, a1
    ret
    .size   add2, .-add2

    exit 42

The .extern directive triggers generation of the dynamic symbol structures. The key here is the lines related to the add2 function: .globl, the add2 label, and .size.

  • .globl add2 adds "add2" to the dynamic symbol (.dynsym) table.
  • The add2 label marks the starting point of the add2 function, and it is used to compute the size of the function.
  • .size sets the size of add2 to the current location (the small dot) minus add2.
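
As a rough sketch of what these directives end up producing, the following builds a 64-bit dynamic symbol entry for a function. The struct mirrors the standard Elf64_Sym field layout and the bind/type values are STB_GLOBAL = 1 and STT_FUNC = 2 from the ELF specification, but DynSym64, make_func_symbol, and the example numbers are hypothetical and not taken from my linker.

```c
#include <assert.h>
#include <stdint.h>

enum { BIND_GLOBAL = 1, TYPE_FUNC = 2 };  /* STB_GLOBAL, STT_FUNC */

/* Same field order and sizes as Elf64_Sym (24 bytes). */
typedef struct {
    uint32_t st_name;   /* offset of the name in .dynstr */
    uint8_t  st_info;   /* binding in the high 4 bits, type in the low 4 bits */
    uint8_t  st_other;  /* visibility, 0 = default */
    uint16_t st_shndx;  /* index of the section that defines the symbol */
    uint64_t st_value;  /* address of the symbol */
    uint64_t st_size;   /* size in bytes */
} DynSym64;

/* label_addr is the address of the "add2:" label, dot_addr is "." at the .size line. */
DynSym64 make_func_symbol(uint32_t name_offset, uint16_t text_shndx,
                          uint64_t label_addr, uint64_t dot_addr) {
    assert(dot_addr >= label_addr);
    DynSym64 sym = {0};
    sym.st_name  = name_offset;
    sym.st_info  = (uint8_t)((BIND_GLOBAL << 4) | TYPE_FUNC);  /* from .globl and .type */
    sym.st_shndx = text_shndx;
    sym.st_value = label_addr;                                 /* from the label */
    sym.st_size  = dot_addr - label_addr;                      /* from .size add2, .-add2 */
    return sym;
}
```

For add2, which is two instructions (add and ret), the size comes out as 8 bytes, assuming both are encoded as uncompressed 4-byte instructions.
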
After this information is added to the dynamic symbol table, readelf -a shows the following:



The riscv64 gcc can link this shared library with C code. The C test program is listed below:

#include <stdio.h>

int add2(int a, int b);

int main() {
    printf("%d\n", add2(20, 22));
    return 0;
}


The execution result shows 42, which is 20 + 22 from add2(20, 22).



Thanks to ChatGPT, which helped me generate the Linux commands. Being a long-time Windows guy, I still struggle to use the correct Linux commands, but let's see if this changes as I add more features.