Lesson 14: What's in that module file, anyway?


The sections of an executable file

In preparation for upcoming lessons on kernel and module debugging, we're going to tear apart a loadable module and examine its internal structure. You'll see how executable files (including loadable modules and even the kernel image itself) consist of various sections with different attributes, and how those attributes dictate what the loader does with each section when the module is loaded. And so, without further ado ...

Our sample loadable module

Here's our sample module source file crash_elf.c (which is also attached to the bottom of this page):

/* Module source file 'crash_elf.c'. */

#include <linux/module.h>
#include <linux/init.h>
#include <linux/kernel.h>

static int whatever;
static int answer = 42;

static char __initdata elf_howdymsg[] = "Good day, eh?";
static char __exitdata elf_exitmsg[] = "Taking off, eh?";

void
useless(void)
{
    printk(KERN_INFO "I am totally useless.\n");
}

static int __init elf_hi(void)
{
    printk(KERN_INFO "module crash_elf being loaded.\n");
    printk(KERN_INFO "%s\n", elf_howdymsg);
    printk(KERN_INFO "The answer is %d.\n", answer);
    return 0;
}

static void __exit elf_bye(void)
{
    printk(KERN_INFO "module crash_elf being unloaded.\n");
    printk(KERN_INFO "%s\n", elf_exitmsg);
    printk(KERN_INFO "The answer is now %d.\n", answer);
}

module_init(elf_hi);
module_exit(elf_bye);

MODULE_AUTHOR("Robert P. J. Day");
MODULE_LICENSE("GPL");

For now, simply add a Makefile for this module, build it, load it, verify the log messages (in /var/log/messages on many systems, or in the output of dmesg), then unload it. Once you've done that, we can talk about what makes it so interesting.

What is this "ELF" thing of which you speak?

As the first step in deconstructing our crash_elf.ko loadable module file (and, conveniently, any other executable file you might generate), we can use the file command to poke at it thusly:

$ file crash_elf.ko
crash_elf.ko: ELF 64-bit LSB relocatable, x86-64, version 1 (SYSV), not stripped
$

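What this tells us is that this is an ELF-format file, that it was (in my case, anyway) compiled for a 64-bit x86 architecture, that it is "relocatable" (an object file whose addresses have not yet been fixed, which is what the module loader expects to be handed), and that it is not "stripped," meaning that it still has the symbol table information that will be useful for later debugging.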

More specifically, the "ELF" part means that the file itself comes in sections, the most common ones being the ones you're probably familiar with:

  • the "text" section: the executable code itself,
  • the "data" section: initialized data, and
  • the "BSS" section: "Block Started by Symbol" or, as most people know it, uninitialized data which, unlike the first two sections, takes up no space in the executable file and is allocated only at run time.

There will almost certainly be other sections, but it's likely that you've at least heard of the above. But before we dig any further, let's do a quick recap of some of that "__init" stuff you saw in an earlier lesson.

All that __init stuff and how it's processed

Recall from way back in Lesson 8 that you learned how tagging your entry and exit routines with, respectively, __init and __exit allowed the module loader to be more space-efficient by discarding routines when they were either no longer needed, or never needed in the first place. If you look at your new source file here, you can see the same principle being applied to simple data objects.

Consider the following snippet from this sample module:

static int whatever;
static int answer = 42;

static char __initdata elf_howdymsg[] = "Good day, eh?";
static char __exitdata elf_exitmsg[] = "Taking off, eh?";

Even without much of an explanation, it should be clear that the __initdata and __exitdata tags identify data objects that can either be discarded after module initialization, or need not be loaded at all if there's no chance of the module being unloaded.

The rationale for __initdata is that your module might require a fair bit of initialization that takes advantage of a large, static table of some kind but, once the initialization is done, you have no need for that table anymore, so there's no point letting it hang around in kernel space, wasting RAM. So tag it as __initdata, and the memory it occupies is freed once module initialization is complete.

Exercise for the student: How could you tell, after your module is loaded and initialized, which data objects and routines are still taking up space in RAM and which were thrown away from the sample module file being used here? Since this might not be obvious, I'll help you out here.

Recall that the entire symbol table for the running kernel can be examined with:

$ cat /proc/kallsyms

Using what you know of the grep command, search that kernel symbol table file and verify what is still in kernel space after the module is loaded, and what was clearly discarded.

BONUS exercise for the student: If you have experience programming in user space, write and compile the standard C language "Hello, world" program, and run the file command on the resulting executable. How does its output differ from what you saw for the module? Why?

Tearing apart your loadable module file

At this point, there are a couple of commands you can use to dig into the internals of your loadable module file. Let's look first at objdump, and how to list the various sections in the file (the output below is truncated so as not to be overwhelming):

$ objdump --section-headers crash_elf.ko

crash_elf.ko:     file format elf64-x86-64

Sections:
Idx Name          Size      VMA               LMA
  0 .note.gnu.build-id 00000024  0000000000000000
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
  1 .text         0000001c  0000000000000000
                  CONTENTS, ALLOC, LOAD, RELOC, READONLY, CODE
  2 .exit.text    00000041  0000000000000000
                  CONTENTS, ALLOC, LOAD, RELOC, READONLY, CODE
  3 .init.text    0000003e  0000000000000000
                  CONTENTS, ALLOC, LOAD, RELOC, READONLY, CODE
  4 .rodata.str1.8 0000004b  0000000000000000
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
  5 .rodata.str1.1 00000051  0000000000000000
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
  6 .modinfo      000000b7  0000000000000000
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
  7 __mcount_loc  00000008  0000000000000000
                  CONTENTS, ALLOC, LOAD, RELOC, READONLY, DATA
  8 __versions    000000c0  0000000000000000
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
  9 .data         00000000  0000000000000000
                  CONTENTS, ALLOC, LOAD, DATA
 10 .exit.data    00000010  0000000000000000
                  CONTENTS, ALLOC, LOAD, DATA
 11 .init.data    0000000e  0000000000000000
                  CONTENTS, ALLOC, LOAD, DATA
 12 .gnu.linkonce.this_module 00000250  0000000000000000
                  CONTENTS, ALLOC, LOAD, RELOC, DATA, LINK_ONCE_DISCARD
 13 .bss          00000000  0000000000000000
                  ALLOC
 14 .note.GNU-stack 00000000  0000000000000000
                  CONTENTS, READONLY, CODE
 15 .comment      00000048  0000000000000000
                  CONTENTS, READONLY

Even without any further explanation, quite a bit of the above should be self-explanatory: depending on how you tag your routines and data objects, they will be placed in different ELF "sections" in the final loadable module file so that the kernel module loader can treat that content appropriately, for example by discarding the entire .init.data and .init.text sections once the module has finished initializing.

You can examine the entire symbol table of your module file with:

$ objdump -t crash_elf.ko

or pick on individual sections with:

$ objdump -t -j .init.data crash_elf.ko
$ objdump -t -j .init.text crash_elf.ko
$ objdump -t -j .exit.data crash_elf.ko
$ objdump -t -j .exit.text crash_elf.ko 

So how do those "sections" work again?

If you want to know how all this tagging identifies into which sections different content will be placed, you can see that in the kernel header file include/linux/init.h, an excerpt of which can be seen here:

...
#define __init          __section(.init.text) __cold notrace
#define __initdata      __section(.init.data)
#define __initconst     __section(.init.rodata)
#define __exitdata      __section(.exit.data)
#define __exit_call     __used __section(.exitcall.exit)
...

As you can see, those tags you've been using until now are simply macros that expand to gcc section attributes, and if you peruse that header file, you'll notice that there are many more such tags for explicitly identifying different parts of your module depending on how you want those parts to be categorized and processed at load time.

So ... where to from here?

In the next lesson, we'll be using some of what you see here to debug both the running kernel and your loaded module on the fly and, before then, you have some homework.

Based on what you've learned in previous lessons, it's your job to configure and build a new kernel, and reboot to be running under that new kernel. (It won't be enough to just build and load modules under the current, default kernel.)

While it's almost a certainty that a default configuration will already have these settings, make sure that your new kernel is configured and built with both the CONFIG_PROC_KCORE and CONFIG_DEBUG_INFO options enabled.

Finally, once you're running under that new kernel, make sure that you can recompile and load a trivial module example, since you'll need to do that in the next lesson for the purposes of debugging.

Until next lesson ...

Attachment: crash_elf.c (861 bytes)

Comments

Run objdump on hello{,.o}

It's fun. Also on the stripped versions.

Makes me wonder what the file "crtstuff.c" is, in the output from "hello.o"

00000000 l df *ABS* 00000000 crtstuff.c

Unsurprisingly (but especially after first seeing the objdump output), this fails:

make hello.o
strip hello.o
make hello
