vegard / kernel-dev.md
The Linux kernel is written in C, so you should have at least a basic understanding of C before diving into kernel work. You don’t need expert-level C knowledge, since you can always pick things up along the way, but it certainly helps to know the language and to have written some userspace C programs already.
It will also help to be a Linux user. If you have never used Linux before, it’s probably a good idea to download a distro and get comfortable with it before you start doing kernel work.
Lastly, knowing git is not actually required, but can really help you (since you can dig through changelogs and search for information you’ll need). At a minimum you should probably be able to clone the git repository to a local directory.
I always recommend getting comfortable with building and booting the kernel yourself. This is really the very first thing you should learn, as it will allow you to start experimenting with the code. Put another way, you can’t meaningfully do anything with the kernel code until you know how to build and run it. (Well, you can still read the code, but by itself that doesn’t buy you very much.)
Grab a copy of the mainline repository:
git clone http://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
Make sure you have the gcc and binutils packages installed, as these are required to build the kernel. Depending on the kernel version, you may also need a few other tools, such as make, flex and bison.
Run make defconfig; this will give you a basic config that should work on your current architecture/platform (which almost certainly is x86).
Run make menuconfig; this will allow you to change the kernel configuration. The default configuration is mostly fine. The only thing I would recommend changing is to turn off CONFIG_MODULES, since this will make it easier to boot the kernel (no need to install the modules somewhere that the kernel can find them during boot). To do this, type /MODULES (and press enter) to search for the config option. Pressing 1 will take you to the first match. Then press n to disable this option. That’s it! Save and exit.
Run make -j8 (change 8 to the number of cores your computer has) to build the kernel. The important files at the end will be arch/x86/boot/bzImage and vmlinux. The build can take anywhere from 5 to 15 minutes or more with the default config, depending on your hardware.
Download a minimal disk image, for example:
kvm -cpu host -hda debian_wheezy_amd64_standard.qcow2 -kernel arch/x86/boot/bzImage -append "console=ttyS0 root=/dev/sda init=/bin/bash" -serial stdio -no-reboot -display none -m 1G
(On some distros you have to use qemu or qemu-system-x86_64 instead of kvm, but it’s really the same program.)
This should give you a root shell in the VM (type Ctrl-D/Ctrl-C to exit/kill the VM).
Kernelnewbies has a lot of information that is useful for beginners:
If you want more information about kernel internals (differences from userspace programming, more about internal APIs, etc.) you can try these links:
Some nice posts on getting a development environment set up:
I personally never used books much, but others have suggested these in the comments:
If you are unsure about what to work on, there are a few things you can look at. Personally I recommend playing around with the kernel before you go off trying to find your «big» project. By «playing around» I mean adding some printk()s here and there, maybe trying to trace the function calls that happen when you call a system call (e.g. start from fork(), follow it into _do_fork(), which newer kernels call kernel_clone(), etc.). It can be fun to just understand how something works. You don’t necessarily have to produce patches in your first few weeks/months of looking at the kernel (in fact, it would probably be a mistake to start producing patches right away).
When you are a bit more comfortable with the kernel sources and you want to try to take on a bigger project or just submit your first patches there is the kernelnewbies list of projects to look at:
It looks a little bit outdated, but it could still be useful to have a look just to get some ideas.
If you are interested in security, the Kernel Self-Protection Project also has a list of things that need to get done (although these projects may be a little bit more advanced):
Finally, if you just want to dive into some bugs (to try to understand or fix them), you can always have a look at the syzkaller list of open bugs:
If you are just starting out, you shouldn’t expect to be able to fix or even find the root cause of any of these bugs. But that doesn’t mean you shouldn’t try! You can learn a lot about the kernel from trying to understand how something works, even if it doesn’t end up in a concrete patch or contribution. I wrote a blog post a few years ago about trying to fix a random syzkaller report that popped up; it could be a good read if you’re into this sort of thing: http://www.vegardno.net/2016/08/sync-debug.html
Joining mailing lists (see below) or reading LWN can also be a good way to discover things that need to be done in the kernel.
In general I would discourage sending a lot of small patches (e.g. typo fixes — unless they’re part of the Documentation/ directory). Most maintainers are fine taking a few small patches like that (and it can be a good way to get used to the patch submission process), but if you keep sending only/mostly patches like that it will be more disruptive than helpful.
One thing that could be relatively easy (and which could be seen as potentially very useful) would be to add more kernel-doc comments to code that is not documented. See more info about kernel-doc comments here:
Whatever you choose to work on, it could be a good idea to send an email to the mailing list for the subsystem/driver you are thinking of working on before you spend a lot of time on it. For example, there’s probably not much point adding kernel-doc comments to some old derelict serial driver that hasn’t been used in 15 years. Note that not all maintainers appreciate receiving private emails (see https://people.kernel.org/tglx/notes-about-netiquette), so don’t do that unless they’ve explicitly said that they are fine receiving private/direct emails (for example, I am personally happy to receive private emails).
If you have specific areas of interest (e.g. BPF, filesystems, networking, etc.), I recommend subscribing to the mailing lists pertaining to that area. You can find a big list of many of the mailing lists (and how to subscribe to them) here:
There is also a list of archives so you can go look at past discussions as well:
Subscribing to the mailing lists will give you a feel for the kinds of things people are working on in that subsystem, it will show you who the other people working on it are, what you can expect in terms of responses, etc.
Maybe also check out the LKML FAQ:
Some people have expressed fear around submitting patches or writing on mailing lists. In general I would say that you don’t have that much to be scared of. Think twice before you hit «send» (i.e. don’t send lots of little typo patches, don’t send patches you haven’t tested, etc.). You can use git send-email --dry-run before you send off any patches to ensure that you’re sending what you think you’re sending. Also do a little bit of research into what is expected in terms of formatting and etiquette. There is some more info here:
Everybody makes mistakes from time to time, so don’t sweat it too much. If you are sending your first email to LKML, say that you are new and people will generally be understanding.
How was kernel written? [closed]
Running a program in kernel mode rules out the standard C library, because the only thing the program is linked against is the kernel itself. So I can only use functions defined in the kernel. But the kernel itself is a program written in C and compiled for a particular architecture. It shouldn’t use the C standard library, and it also shouldn’t use any drivers, since drivers are loadable modules. So my question is: what actual C functions are used when writing a kernel? How can you interact with hardware other than through the kernel? Please don’t tell me to look at the sources; that’s too advanced for me. Thanks.
3 Answers
The Unix kernel has traditionally included some assembly language code. I haven’t looked at its source code recently, but I suspect that that’s still true.
See “How does a driver actually communicate with a hardware device?” for an overview of that topic. The answers to that question discuss two kinds of computer architecture. On a system that uses port-mapped I/O (PMIO), the kernel must be written partly in assembly language, although you may be able to get by with a couple of very short routines. On a system that uses memory-mapped I/O (MMIO), even device drivers can be written entirely in C. All they need to do is declare a pointer (normally a volatile one, so the compiler doesn’t optimize the accesses away), set it equal to the virtual address of the device, and then use it to manipulate the device as if it were accessing memory.
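To make the MMIO idea concrete, here is a minimal sketch that runs in userspace. Everything in it is made up for illustration: the register layout, the names (fake_uart_regs, fake_uart_map, fake_uart_putc) and the "device" itself, which is just ordinary memory. A real Linux driver would get its pointer by mapping the device's physical address (for example with ioremap()) and would normally go through accessors like readl()/writel() rather than dereferencing directly.

```c
#include <stdint.h>

/* Hypothetical register layout for an imaginary UART-like device. */
struct fake_uart_regs {
    volatile uint32_t data;    /* writing a byte here "transmits" it */
    volatile uint32_t status;  /* bit 0 set: device is ready for a byte */
};

#define FAKE_UART_READY 0x1u

/*
 * Stand-in for the hardware: plain memory, so the sketch can run anywhere.
 * On real hardware this would be a fixed address decoded by the device.
 */
static struct fake_uart_regs fake_device = { 0, FAKE_UART_READY };

/* Pretend to "map" the device and hand back a pointer to its registers. */
struct fake_uart_regs *fake_uart_map(void)
{
    return &fake_device;
}

/*
 * Send one character by storing it into the data register.
 * Returns 0 on success, -1 if the device is not ready.
 * Note there is no library call here, only loads and stores via a pointer.
 */
int fake_uart_putc(struct fake_uart_regs *regs, unsigned char c)
{
    if (!(regs->status & FAKE_UART_READY))
        return -1;
    regs->data = c;
    return 0;
}
```

The point is that fake_uart_putc() contains nothing but a load, a test and a store. The volatile qualifier is what tells the compiler that each access has to really happen, in order, instead of being cached in a register or removed.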
Not all drivers are loadable modules. Being loadable is simply an option; some crucial drivers are not dynamically loaded but are built into the kernel itself.
The kernel reproduces a whole bunch of the functionality provided by libc statically, within itself.
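As a rough illustration of what "reproducing libc" means, here is a sketch of two string helpers written with nothing but the language itself. The names k_strlen and k_memcpy are made up for this example, and these are deliberately naive versions. The Linux kernel keeps its own (more optimized) implementations of such functions in its source tree, for instance under lib/string.c.

```c
#include <stddef.h>  /* size_t only; this header is available even without libc */

/* Count the bytes before the terminating NUL, like strlen(). */
size_t k_strlen(const char *s)
{
    const char *p = s;

    while (*p)
        p++;
    return (size_t)(p - s);
}

/* Copy n bytes from src to dst, like memcpy(); the regions must not overlap. */
void *k_memcpy(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;

    while (n--)
        *d++ = *s++;
    return dst;
}
```

Nothing here needs an operating system underneath it: it is just pointers, loops and arithmetic, which the compiler turns directly into machine instructions.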
Just like in user-mode C programming, a function can be defined in one compilation unit and simply referenced from another (usually via a declaration in a header file). The compiler leaves it as an undefined reference, and the linker resolves it against the compilation unit that actually provides the symbol.
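Here is a small sketch of that declare-here, define-there pattern. To keep it in one block, the three hypothetical files (counter.h, user.c, counter.c) are shown back to back and separated by comments. In a real project they would be separate files, and the linker would be what connects the call in user.c to the body in counter.c. All names are invented for the example.

```c
/* ---- counter.h (hypothetical): declaration only, no body ---- */
int counter_next(void);

/* ---- user.c (hypothetical): has seen only the declaration ---- */
int take_two(void)
{
    /*
     * If this were its own file, the compiler would emit these calls as
     * references to an undefined symbol "counter_next".
     */
    return counter_next() + counter_next();
}

/* ---- counter.c (hypothetical): the definition the linker resolves to ---- */
static int counter_value;  /* private to this unit thanks to "static" */

int counter_next(void)
{
    return ++counter_value;
}
```

Starting from a fresh counter, the first take_two() call adds 1 and 2. Kernel code works the same way, only the providing unit is part of the kernel image (or of a module) instead of libc.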
Loading kernel modules works on the same principle as dynamic loading and it’s described here: http://www.tldp.org/LDP/tlk/modules/modules.html
I mean, are there any functions in C that are implemented neither in the kernel nor in the standard library? What is the most «naked» (assembly-like) C that the compiler alone knows what to do with?
@user30167 You can use any standard types known to the compiler to implement functions that manipulate data (all the basic operators like +, -, *, /, =, ^, |, &, etc.). In order to do useful stuff with the data, you need the drivers implemented by the kernel, which are also accessible through libc (which makes syscalls into the kernel). There is also bare-metal programming, but that will typically require some assembly language or a specific compiler.