ftp.cn.debian.org is there!

Thanks to jameszhang for providing us with a powerful new machine, which made mirrors.ustc.edu.cn possible. I am now glad to announce that ftp.cn.debian.org is a CNAME for mirrors.ustc.edu.cn.

mirrors.ustc.edu.cn is now an official mirror for debian, ubuntu, archlinux, fedora, centos, mozilla, gnome, kde, cpan, ctan and deepin linux.

You can access our server via HTTP, FTP and rsync. Because the campus-shared bandwidth is limited, we have to cap bandwidth at 100 Mbps on the China Telecom interface during peak hours (8am~11pm, CST); CERNET/CERNET2 users are not affected. During off-peak hours, there are no limits on any interface. So if you want to rsync a large amount of data from us, please try to do so around midnight to avoid the congestion.

Feel free to contact us via mirrors AT ustc DOT edu DOT cn if you have any problems or suggestions.


USTC Cloud Live Debian

Leveraging the USTC PXE Service, we are developing a customized Cloud Live System for USTCers. Previously, we could boot live Linux systems over PXE on the USTC campus, but all modifications made after boot stayed in RAM and were lost on reboot. Thanks to the 300M of free FTP space that the USTC Network Center provides to every student and teacher, our customized system now mounts that space on the home directory, so modifications in $HOME persist.

After the system boots, you will be asked to enter your membership, username and password (the same as for your email.ustc.edu.cn account).

Once you have entered the correct username and password, the system is ready for login. You can log in with the same username as your email account, without a password. You will have sudo privileges.


Currently, it’s just a basic Debian system without any desktop environment installed. We will customize it further for the USTC environment, e.g. setting up Thunderbird to use email.ustc.edu.cn by default and installing software popular among USTCers.

We will also try to have software installed with apt-get saved in the home directory, so that it persists as well.

Enjoy it! Period.


Use debian-live to Create Customized PXE Live Debian

There are already many live Linux systems out there. However, most of them are general-purpose and may not suit you. The Debian project introduced the Debian Live Project, which makes it easier to customize a live Linux system. Here is the full manual. It is too long if your goal is simply to make a customized PXE Live Debian, so in this post I will give simplified instructions.

Create Live Debian

We use the live-build tools to build Live Debian. First, install the package:

sudo apt-get install live-build

This gives us many lb * commands; you can read about them with man lb_*. However, we only use three of them directly: lb config, lb build and lb clean. All the others are low-level implementations.

Let’s first create a folder for our live system and copy the sample auto scripts into it:

mkdir live && cd live
mkdir auto
cp /usr/share/live/build/examples/auto/* auto/

There are three files in auto/: config, build and clean. auto/config is executed when you run lb config, auto/build is executed by lb build, and auto/clean is executed by lb clean.

Let’s make a few modifications to auto/config:

$ cat auto/config
#!/bin/sh

lb config noauto \
    --architectures i386 \
    --binary-images net \
    --bootloader syslinux \
    --compression bzip2 \
    --distribution squeeze \
    --hostname debian-live \
    --language zh \
    --linux-flavours 686 \
    --parent-mirror-bootstrap http://debian.ustc.edu.cn/debian/ \
    --parent-mirror-chroot-security http://debian.ustc.edu.cn/debian-security/ \
    --parent-mirror-binary http://debian.ustc.edu.cn/debian/ \
    --parent-mirror-binary-security http://debian.ustc.edu.cn/debian-security/ \
    --mirror-bootstrap http://debian.ustc.edu.cn/debian/ \
    --mirror-chroot-security http://debian.ustc.edu.cn/debian-security/ \
    --mirror-binary http://debian.ustc.edu.cn/debian/ \
    --mirror-binary-security http://debian.ustc.edu.cn/debian-security/ \
    --packages "ibus-pinyin" \
    --archive-areas "main non-free contrib" \
        "${@}"

Most of the parameters are self-explanatory. For details, see man lb_config.

Then we run config and build process.

lb config
sudo lb build

lb build may take a long time (20 minutes or more, depending on your network bandwidth and your package settings). It first runs debootstrap to install a standard system in the chroot/ directory, then installs packages in the chroot environment, and then performs further customization.

After a long wait, we get binary/live/. The files in this directory are all we need to boot via PXE.

Customize Live Debian

So far, we have a PXE-bootable live system. If you can’t wait to try it, you can go to the next section, but you are sure to come back.

The system has no normal user, and you don’t know the root password, so you cannot log in to it. The first thing to do is therefore to create a user and set a password. To do so, we change root into the live system; everything in this section should be done inside the chroot environment.

sudo chroot chroot/ /bin/bash
useradd yourusername
passwd yourusername
adduser yourusername sudo  # so you can use sudo
passwd root  # in case you wanna use root user
exit  # exit chroot environment

You can install packages, write scripts and make any other customizations in the chroot environment.

sudo chroot chroot/ /bin/bash
aptitude update
aptitude install xxx  # install more packages
# do other customizations.
exit

Boot Live Debian via PXE

I am not going into detail about how to set up a PXE boot environment in this post; I will just give you the menu entry that boots this system.

First, you should export filesystem.squashfs via NFS.

cp -r binary/live /nfsroot/debian/
echo '/nfsroot/debian/ *(ro,async,no_root_squash,no_subtree_check)' >> /etc/exports
exportfs -a

Then the syslinux menu entry:

LABEL live-debian
MENU LABEL My Live Debian
KERNEL vmlinuz-xxx
INITRD initrd.img-xxx
APPEND boot=live netboot=nfs nfsroot=your-ip:/nfsroot/debian/

Done!


USTC Ethernet Boot Service Update (2011-04-01)

Introduction to iPXE

USTC Ethernet Boot Service was first started by Fengguang Wu early in 2000. It enables people on the USTC campus to boot their computers over the network in order to install Linux, try Linux out, and perform system maintenance tasks such as partitioning and backup.

In 2010, we upgraded the service to gPXE, an enhanced version of PXE. It supports protocols such as HTTP and FTP, which give a great performance improvement over TFTP.

Today, we are upgrading gPXE to iPXE. iPXE is the official replacement for gPXE. It is developed by the people who originally developed gPXE (which evolved from Etherboot). Unfortunately, the gpxe.org and etherboot.org domains are owned by an individual who wishes to exercise a high degree of control over the project and the codebase, so in April 2010 the decision was taken to create a new project named iPXE, using the existing code base as a starting point. Since the two projects diverged, development on gPXE has stopped, while iPXE is very actively updated.

Update Contents

The first notable update is the chain load sequence. In the previous configuration, our chain looked like this:

[PXE (BIOS)]/Grub --> gPXE --> menu.c32 (Syslinux <= 3.86) --> gPXE

Because gPXE/iPXE cannot natively load com32r modules from Syslinux newer than 3.86, and is not very compatible with versions older than 3.86 either, this caused many incompatibility problems: many PCs would reboot immediately after a menu item was selected. The chain load sequence has now changed to:

[PXE (BIOS)]/Grub --> iPXE --> Syslinux 4.0.3 --> menu.c32

Now, it works perfectly!

There are also some minor updates to the service: we added the Offline NT Password & Registry Editor and the Hardware Detection Tool from Syslinux.

Usage

For users within USTC campus, there are generally two ways to use the newly updated iPXE.

  • Chain load from PXE. If your PC supports PXE, boot from LAN to get into PXE, then type “iPXE<ret>” and you will be brought to iPXE.
  • Chain load from Grub. If your PC does not support PXE, or you are on a LAN without a properly configured DHCP server, you can download ustc.ipxe.lkrn and boot it with Grub.
    • Grub 1: kernel ustc.ipxe.lkrn
    • Grub 2: linux16 ustc.ipxe.lkrn

Troubleshooting

  • If you don’t know how to use it, please post your problem on the Linux board of USTCBBS.
  • If you encounter bugs, please contact lug AT ustc.edu.cn.

Source Code

The source code is located at http://git.onebitbug.me/?p=ustc-pxe.git;a=summary. You can get the code with:

git clone http://git.onebitbug.me/ustc-pxe.git

Then set up your web server and add a virtual path pointing to ustc-pxe/src. Then get the binary files:

cd src
./updatebin.sh

You can then boot your other PC with Grub loading bin/ipxe.lkrn. After iPXE starts, press CTRL-B to enter the command line, then type the following commands:

dhcp
chain http://your-site/path-to-src/boot.php

If you want a native PXE boot environment, you have to set up TFTP and DHCP. See the instructions here: http://ipxe.org/howto/chainloading

Enjoy it!



Return to MySQL

Though sqlite is quite small and portable, there are too many plugins that won’t work with it. In the end, I gave up on sqlite and returned to MySQL. I didn’t know what trouble I would run into while migrating to MySQL, so I just reinstalled my entire blog. I lost all the comments :-(

For those who have subscribed to my blog’s RSS feed, the last five posts will appear again. Sorry for the noise.


Introducing Linux Kernel Symbols

In kernel development, we sometimes need to examine kernel state or reuse kernel facilities, and to do that we need to access (read, write or execute) kernel symbols. In this article, we will see how the kernel maintains the symbol table and how we can use kernel symbols.

This article is more of a guide to reading kernel source code and to kernel development, so we will work a lot with source code.

What are kernel symbols

Let’s begin with some basic knowledge. In a programming language, a symbol is either a variable or a function. More generally, a symbol is a name representing a region of memory that stores data (a variable, for reading and writing) or instructions (a function, for executing). To make cooperation among the various kernel function units easier, there are thousands of global symbols in the Linux kernel. A global variable is defined outside of any function body. A global function is declared without inline and static. All global symbols are listed in /proc/kallsyms. It looks like this:

$ tail /proc/kallsyms
ffffffff81da9000 b .brk.dmi_alloc
ffffffff81db9000 B __brk_limit
ffffffffff600000 T vgettimeofday
ffffffffff600140 t vread_tsc
ffffffffff600170 t vread_hpet
ffffffffff600180 D __vsyscall_gtod_data
ffffffffff600400 T vtime
ffffffffff600800 T vgetcpu
ffffffffff600880 D __vgetcpu_mode
ffffffffff6008c0 D __jiffies

It’s in nm’s output format. The first column is the symbol’s address, the second column is the symbol type, and the third is the symbol name. You can find the details in nm’s manpage.
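To make the column layout concrete, here is a small user-space sketch in C that splits one line of this format into its fields. The names ksym_entry and parse_kallsyms_line are made up for this illustration and are not kernel APIs; the optional [module] suffix is the one printed for module symbols.

```c
#include <stdio.h>
#include <string.h>

/* One parsed line of /proc/kallsyms (hypothetical helper type). */
struct ksym_entry {
    unsigned long long addr;  /* symbol address */
    char type;                /* nm-style type character, e.g. 'T' or 'd' */
    char name[128];           /* symbol name */
    char module[64];          /* module name without brackets, "" if none */
};

/* Parse "addr type name [module]"; returns 0 on success, -1 on bad input. */
int parse_kallsyms_line(const char *line, struct ksym_entry *out)
{
    char mod[64] = "";
    /* The scanset %63[^]] reads everything up to the closing bracket. */
    int n = sscanf(line, "%llx %c %127s [%63[^]]]",
                   &out->addr, &out->type, out->name, mod);

    if (n < 3)
        return -1;            /* address, type and name are mandatory */
    strcpy(out->module, mod); /* stays empty for core kernel symbols */
    return 0;
}
```

Feeding it the line "ffffffffff600000 T vgettimeofday" fills in the address, the type 'T' and the name, and leaves the module field empty.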

People will usually tell you that this is the output of nm vmlinux. However, some entries in this symbol table come from loadable kernel modules, so how can they be listed here? Let’s see how this table is generated.

How is /proc/kallsyms generated

As we have seen in the last two articles, contents of procfs files are generated on reading, so don’t try to find this file anywhere on your disk. But we can directly go to the kernel source for the answer. First, let’s find the code that creates this file in kernel/kallsyms.c.

static const struct file_operations kallsyms_operations = {
        .open = kallsyms_open,
        .read = seq_read,
        .llseek = seq_lseek,
        .release = seq_release_private,
};

static int __init kallsyms_init(void)
{
        proc_create("kallsyms", 0444, NULL, &kallsyms_operations);
        return 0;
}
device_initcall(kallsyms_init);

On creating the file, the kernel associates the open() operation with kallsyms_open(), read()->seq_read(), llseek()->seq_lseek() and release()->seq_release_private(). Here we see that this file is a sequence file.

The details of sequence files are beyond the scope of this article. There is a comprehensive description in the kernel documentation; please go through Documentation/filesystems/seq_file.txt if you don’t know what a sequence file is. In short, because of the page-size limitation in proc_read_t, the kernel introduced sequence files as a way to provide large amounts of information to the user.

Ok, back to the source. In kallsyms_open(), it does nothing more than create and reset the iterator for seq_read operation, and of course set the seq_operations:

static const struct seq_operations kallsyms_op = {
        .start = s_start,
        .next = s_next,
        .stop = s_stop,
        .show = s_show
};

So, for our purposes, we care about s_start() and s_next(). They both invoke update_iter(), and the core of update_iter() is get_ksymbol_core(), followed by get_ksymbol_mod() for symbols that belong to modules. From there, we finally reach module_get_kallsym() in kernel/module.c:

int module_get_kallsym(unsigned int symnum, unsigned long *value, char *type,
                        char *name, char *module_name, int *exported)
{
        struct module *mod;

        preempt_disable();
        list_for_each_entry_rcu(mod, &modules, list) {
                if (symnum < mod->num_symtab) {
                        *value = mod->symtab[symnum].st_value;
                        *type = mod->symtab[symnum].st_info;
                        strlcpy(name, mod->strtab + mod->symtab[symnum].st_name,
                                KSYM_NAME_LEN);
                        strlcpy(module_name, mod->name, MODULE_NAME_LEN);
                        *exported = is_exported(name, *value, mod);
                        preempt_enable();
                        return 0;
                }
                symnum -= mod->num_symtab;
        }
        preempt_enable();
        return -ERANGE;
}
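The index arithmetic in this loop can be tried out in user space. The following is only a sketch: fake_module and locate_symbol are hypothetical names, and a plain array stands in for the kernel's RCU-protected module list. It shows how a single global symbol number is turned into a module plus an index inside that module's symbol table.

```c
#include <stddef.h>

/* Hypothetical stand-in for struct module; only the symbol count matters. */
struct fake_module {
    const char *name;
    unsigned int num_symtab;
};

/*
 * Map a global symbol number to (module index, index within that module),
 * mirroring the "symnum -= mod->num_symtab" walk. Returns 0 on success and
 * -1 if symnum is past the last symbol (the kernel returns -ERANGE there).
 */
int locate_symbol(const struct fake_module *mods, size_t nmods,
                  unsigned int symnum, size_t *mod_idx, unsigned int *local)
{
    size_t i;

    for (i = 0; i < nmods; i++) {
        if (symnum < mods[i].num_symtab) {
            *mod_idx = i;
            *local = symnum;
            return 0;
        }
        /* Not in this module: skip over all of its symbols. */
        symnum -= mods[i].num_symtab;
    }
    return -1;
}
```

With two modules holding 3 and 2 symbols, symbol number 3 lands on the first symbol of the second module, and symbol number 5 is out of range.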

module_get_kallsym() iterates over all modules and their symbols. Five properties are assigned values: value is the symbol’s address, type is the symbol’s type, name is the symbol’s name, module_name is the name of the module if the module is not compiled into the core kernel (otherwise it is empty), and exported indicates whether the symbol is exported. Have you ever wondered why there are so many “local” symbols (those whose type character is in lower case) in the symbol table? Let’s have a look at s_show():

        if (iter->module_name[0]) {
                char type;

                /*
                 * Label it "global" if it is exported,
                 * "local" if not exported.
                 */
                type = iter->exported ? toupper(iter->type) :
                                        tolower(iter->type);
                seq_printf(m, "%0*lx %c %s\t[%s]\n",
                           (int)(2 * sizeof(void *)),
                           iter->value, type, iter->name, iter->module_name);
        } else
                seq_printf(m, "%0*lx %c %s\n",
                           (int)(2 * sizeof(void *)),
                           iter->value, iter->type, iter->name);

Ok, clear about it? All these symbols are global from the C language’s point of view, but only exported symbols are labeled as “global”.
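The labeling rule itself is tiny. As a sketch, here it is as a stand-alone user-space C function; label_module_symbol is a made-up name, and the kernel simply does this inline in s_show() as shown above.

```c
#include <ctype.h>

/*
 * Return the type character the way s_show() prints it for a module symbol:
 * upper case if the symbol is exported, lower case otherwise.
 */
char label_module_symbol(char type, int exported)
{
    return exported ? (char)toupper((unsigned char)type)
                    : (char)tolower((unsigned char)type);
}
```

So a text symbol of a module shows up as T only when it is exported, and as t otherwise, regardless of how it is declared in C.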

After the iteration finished, we see the contents of /proc/kallsyms.

How to access symbols

Here, access can be read, write and execute. Let’s have a look at this simplest module:

#include <linux/module.h>
#include <linux/init.h>
#include <linux/kernel.h>
#include <linux/jiffies.h>

MODULE_AUTHOR("Stephen Zhang");
MODULE_LICENSE("GPL");
MODULE_DESCRIPTION("Use exported symbols");

static int __init lkm_init(void)
{
    printk(KERN_INFO "[%s] module loaded.\n", __this_module.name);
    printk(KERN_INFO "[%s] current jiffies: %lu.\n", __this_module.name, jiffies);
    return 0;
}

static void __exit lkm_exit(void)
{
    printk(KERN_INFO "[%s] module unloaded.\n", __this_module.name);
}

module_init(lkm_init);
module_exit(lkm_exit);

In this module, we used printk() and jiffies, which are both symbols from kernel space. Why are these symbols available in our code? Because they are “exported”.

You can think of kernel symbols as visible at three different levels in the kernel source code:

  • “static”, and therefore visible only within their own source file
  • “external”, and therefore potentially visible to any other code built into the kernel itself, and
  • “exported”, and therefore visible and available to any loadable module.

The kernel uses two macros to export symbols:

  • EXPORT_SYMBOL exports the symbol to any loadable module
  • EXPORT_SYMBOL_GPL exports the symbol only to GPL-licensed modules.

We find the two symbols exported in the kernel source code:
kernel/printk.c:EXPORT_SYMBOL(printk);
kernel/time.c:EXPORT_SYMBOL(jiffies);

Besides examining the kernel code, is there an easier way to tell whether a symbol is exported? Sure! Every exported symbol has a companion symbol prefixed with __ksymtab_, e.g.

ffffffff81a4ef00 r __ksymtab_printk
ffffffff81a4eff0 r __ksymtab_jiffies
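This check is easy to automate. The sketch below is user-space C that scans text in /proc/kallsyms format for a __ksymtab_<name> entry; is_exported_in_listing is a hypothetical helper name, and the sketch assumes the listing actually contains such entries, as in the example above.

```c
#include <stdio.h>
#include <string.h>

/*
 * Return 1 if `listing` (text in /proc/kallsyms format) has a line whose
 * symbol name is exactly "__ksymtab_<sym>", 0 otherwise.
 */
int is_exported_in_listing(const char *listing, const char *sym)
{
    char want[160];
    char name[128];
    const char *p = listing;

    snprintf(want, sizeof(want), "__ksymtab_%s", sym);
    while (*p) {
        /* Skip the address and type columns, read the name column. */
        if (sscanf(p, "%*s %*c %127s", name) == 1 && strcmp(name, want) == 0)
            return 1;
        p = strchr(p, '\n');   /* move on to the next line, if any */
        if (!p)
            break;
        p++;
    }
    return 0;
}
```

To use it on a live system, read /proc/kallsyms into a buffer first and pass that buffer as the listing.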

Let’s just have another look at the definition of EXPORT_SYMBOL:

/* For every exported symbol, place a struct in the __ksymtab section */
#define __EXPORT_SYMBOL(sym, sec)                               \
        extern typeof(sym) sym;                                 \
        __CRC_SYMBOL(sym, sec)                                  \
        static const char __kstrtab_##sym[]                     \
        __attribute__((section("__ksymtab_strings"), aligned(1))) \
        = MODULE_SYMBOL_PREFIX #sym;                            \
        static const struct kernel_symbol __ksymtab_##sym       \
        __used                                                  \
        __attribute__((section("__ksymtab" sec), unused))       \
        = { (unsigned long)&sym, __kstrtab_##sym }

#define EXPORT_SYMBOL(sym)                                      \
        __EXPORT_SYMBOL(sym, "")

The definition of static const struct kernel_symbol __ksymtab_##sym places a struct kernel_symbol into the __ksymtab section, i.e. into the symbol table.

There is one more thing worth noting: __this_module is not an exported symbol, nor is it defined anywhere in the kernel source. In the kernel, all we can find about __this_module is the following two lines:

extern struct module __this_module;
#define THIS_MODULE (&__this_module)

How?! If it’s not defined in the kernel, what is it linked against at insmod time? Don’t panic. Have you noticed the temporary file hello.mod.c that is generated while compiling the module? It contains the definition of __this_module:

struct module __this_module
__attribute__((section(".gnu.linkonce.this_module"))) = {
 .name = KBUILD_MODNAME,
 .init = init_module,
#ifdef CONFIG_MODULE_UNLOAD
 .exit = cleanup_module,
#endif
 .arch = MODULE_ARCH_INIT,
};

So far, we have seen that we can use any exported symbol directly in our module; all we have to do is include the corresponding header file or provide the right declaration. But what if we want to access other symbols in the kernel? This is usually not a good idea: symbols are left unexported precisely because nobody else is expected to touch them, which avoids potential disasters. Still, someday, out of curiosity or because we know exactly what we are doing, we may have to access non-exported symbols. Let’s go further.

How to access non-exported symbol

For each symbol in the kernel, we have an entry in /proc/kallsyms, and we have addresses for all of them. Since we are in the kernel, we can see any bit we want to see! Just read from that address. Let’s take resume_file as an example. Source code comes first:

#include <linux/module.h>
#include <linux/kallsyms.h>
#include <linux/string.h>

MODULE_LICENSE("GPL");
MODULE_DESCRIPTION("Access non-exported symbols");
MODULE_AUTHOR("Stephen Zhang");

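static int __init lkm_init(void)
{
    char *sym_name = "resume_file";
    unsigned long sym_addr = kallsyms_lookup_name(sym_name);
    char filename[256];

    /* kallsyms_lookup_name() returns 0 if the symbol is not found */
    if (!sym_addr)
        return -ENOENT;

    strncpy(filename, (char *)sym_addr, 255);
    filename[255] = '\0';  /* strncpy() does not always terminate */

    printk(KERN_INFO "[%s] %s (0x%lx): %s\n", __this_module.name, sym_name, sym_addr, filename);

    return 0;
}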

static void __exit lkm_exit(void)
{
}

module_init(lkm_init);
module_exit(lkm_exit);

Here, instead of parsing /proc/kallsyms to find a symbol’s address, we use kallsyms_lookup_name() to do it. Then, we just treat the address as char *, which is the type of resume_file, and read it using strncpy().

Let’s see what happens when we run:

$ sudo insmod lkm_hello.ko
$ dmesg | tail -n 1
[lkm_hello] resume_file (0xffffffff81c17140): /dev/sda6
$ grep resume_file /proc/kallsyms
ffffffff81c17140 d resume_file

Yeah! We did it! And we see that the symbol address returned by kallsyms_lookup_name() is exactly the same as the one in /proc/kallsyms.

Just as you can read, you can also write to a symbol’s address. But be careful: some addresses are in the rodata or text section, which cannot be written. If you try to write to a read-only address, you will probably get a kernel oops. However, this does not mean it is impossible: you can turn off the protection. Follow the instructions on this page. Note that change_page_attr(), used below, is an older interface, so check whether your kernel version still provides it. The basic idea is to change the page attribute:

int set_page_rw(long unsigned int _addr)
{
    struct page *pg;
    pgprot_t prot;
    pg = virt_to_page(_addr);
    prot.pgprot = VM_READ | VM_WRITE;
    return change_page_attr(pg, 1, prot);
}

int set_page_ro(long unsigned int _addr)
{
    struct page *pg;
    pgprot_t prot;
    pg = virt_to_page(_addr);
    prot.pgprot = VM_READ;
    return change_page_attr(pg, 1, prot);
}

Conclusion

Well, that’s a lot for one post. In this article, we first dug into the Linux kernel source code to find out how the kernel symbol table is generated. Then we learned how to use exported kernel symbols in our modules. Finally, we saw the tricky way to access any kernel symbol from within a module.


NIC Packet Broken Issue Solved

In the first post, I mentioned that errors occurred while using the USTC Debian Repo to upgrade this server. There were also problems when transferring data with scp and wget at high speed (>5MB/s).

Thanks to Shiwei Liu’s help, the problem is now solved. It was due to a bug in the rtl8169 NIC: when the firmware version is below 25, we should turn off the rx offload option. RX offload refers to checksumming of received packets. The rtl8169 computes this checksum in hardware to reduce CPU workload, so when rx offload is on, the operating system does not compute it again. But the checksum computation is buggy in firmware versions below 25, so we should turn it off with ethtool:

ethtool -K eth0 rx off

After executing the above command, everything works fine!


Using LKM and procfs — Part II

In the last post, we saw that it is very simple to implement a loadable kernel module in Linux. In this post, we will see how to use procfs.

Introduction to procfs

I believe most Linux users know /proc: we can obtain a lot of information about processes from it, and its basic usage is described in the proc manpage. There is also plenty of other material on it. But have you ever thought of using procfs to provide other information of your own? We will see how to do that shortly. Believe me, it is no more complicated than writing an LKM.

Create/Remove proc entry

Let’s first look at how to create and remove a proc entry. We use create_proc_entry() to create one. Its three arguments are the filename, the access mode and the parent directory; if the parent directory is /proc, we can pass NULL. The return value is a pointer to a struct proc_dir_entry (NULL on failure). With this pointer, we can set other properties of the file, such as what to do when a user reads it. Here are the prototype of create_proc_entry() and the definition of struct proc_dir_entry:

struct proc_dir_entry *create_proc_entry(const char *name, mode_t mode,
						struct proc_dir_entry *parent);
struct proc_dir_entry {
	unsigned int low_ino;
	unsigned short namelen;
	const char *name;
	mode_t mode;
	nlink_t nlink;
	uid_t uid;
	gid_t gid;
	loff_t size;
	const struct inode_operations *proc_iops;
	const struct file_operations *proc_fops;
	struct proc_dir_entry *next, *parent, *subdir;
	void *data;
	read_proc_t *read_proc;
	write_proc_t *write_proc;
	atomic_t count;		/* use count */
	int pde_users;	/* number of callers into module in progress */
	spinlock_t pde_unload_lock; /* proc_fops checks and pde_users bumps */
	struct completion *pde_unload_completion;
	struct list_head pde_openers;	/* who did ->open, but not ->release */
};

Soon we will see how to use the read_proc and write_proc fields to set the handlers for reading and writing this file.

We use remove_proc_entry() to delete a proc entry. Its two arguments are the filename and a pointer to the parent directory; if the parent directory is /proc, you can pass NULL. Here is the declaration:

void remove_proc_entry(const char *name, struct proc_dir_entry *parent);

Write callback function

When a user writes to a proc file, the corresponding write_proc() handler is called. This is the declaration:

typedef	int (write_proc_t)(struct file *file, const char __user *buffer,
			   unsigned long count, void *data);

Among the arguments, buffer is the data the user has written, and count is the size of that data. The buffer is a user-space address, so you cannot access it directly inside the kernel; you should use copy_from_user() to copy it to a kernel-space address. data is a pointer to your private data.

Read callback function

When a user tries to read from the proc entry, the corresponding read_proc() handler is invoked, and inside this function the kernel prepares the data the user will get. Here is the declaration:

typedef	int (read_proc_t)(char *page, char **start, off_t off,
			   int count, int *eof, void *data);

In the following example, we will see how to use these functions.

Other useful functions

The Linux kernel also provides some other useful functions for working with procfs.

struct proc_dir_entry *proc_symlink(const char *,
		struct proc_dir_entry *, const char *);
struct proc_dir_entry *proc_mkdir(const char *,struct proc_dir_entry *);
struct proc_dir_entry *proc_mkdir_mode(const char *name, mode_t mode,
			struct proc_dir_entry *parent);
static inline long copy_from_user(void *to,
		const void __user * from, unsigned long n);
static inline long copy_to_user(void __user *to,
		const void *from, unsigned long n);
void *vmalloc(unsigned long size);
void vfree(const void *addr);

A simple calculator via procfs

We will use procfs to implement a simple accumulator. After the module is loaded, it creates the file /proc/simacc. We can then use echo to write some integers into this file, and the next time we read the file we get the sum of those integers.

Here is the full code:

#include <linux/module.h>
#include <linux/kernel.h>
#include <linux/init.h>
#include <linux/proc_fs.h>
#include <linux/uaccess.h>

MODULE_LICENSE("GPL");
MODULE_AUTHOR("Stephen Zhang");
MODULE_DESCRIPTION("Simple accumulator via procfs");

#define MAX_INPUT_SIZE 1024
char input_buf[MAX_INPUT_SIZE];
char result[16];
static struct proc_dir_entry *simacc_file;

static int simacc_read(char *page, char **start, off_t off,
	int count, int *eof, void *data)
{
    int len;
    len = sprintf(page, "%s\n", result);
    return len;
}

static int simacc_write(struct file *file, const char __user *buffer,
    unsigned long count, void *data)
{
    int len;
    unsigned long num = 0, sum = 0;
    int i;
    char c;
    /* leave room for the terminating '\0' */
    if (count > MAX_INPUT_SIZE - 1)
	len = MAX_INPUT_SIZE - 1;
    else
	len = count;

    if(copy_from_user(input_buf, buffer, len))
	return -EFAULT;
    input_buf[len] = '\0';

    i = 0;
    do {
	c = input_buf[i++];
	if (c >= '0' && c <= '9') {
	    num = num * 10 + c - '0';
	} else {
	    sum += num;
	    num = 0;
	}
    } while (c != '\0' && i < MAX_INPUT_SIZE);
    sprintf(result, "%lu", sum);
    return len;
} 

static int __init init_simacc(void) {
    simacc_file = create_proc_entry("simacc", 0666, NULL);
    if (simacc_file == NULL) {
	return -ENOMEM;
    }
    input_buf[MAX_INPUT_SIZE - 1] = '\0';
    result[0] = '0'; result[1] = '\0';
    simacc_file->data = input_buf;
    simacc_file->read_proc = simacc_read;
    simacc_file->write_proc = simacc_write;
    printk(KERN_INFO "simacc: Module loaded.\n");

    return 0;
}

static void __exit cleanup_simacc(void)
{
    remove_proc_entry("simacc", NULL);
    printk(KERN_INFO "simacc: Module unloaded.\n");
}

module_init(init_simacc);
module_exit(cleanup_simacc);

In the initialization function init_simacc(), we use create_proc_entry() to create /proc/simacc, and then set the read and write handlers for this file. The file is created with access mode 0666, so anyone can read from and write to it. (The file’s owner is root, and its group is also root.) In the exit function cleanup_simacc(), we delete this file.

In the write callback function simacc_write(), we first use copy_from_user() to copy the data written by the user to a buffer in the kernel. As input_buf only has room for MAX_INPUT_SIZE bytes, we must first check the size of the input data and copy no more than the buffer can hold; otherwise we would overflow the buffer, which at worst may crash the whole system. Then comes the code for the calculation. As this is not the key point of this article, I do it in a very naive way: every character outside [0-9] is treated as a delimiter. After the calculation, the result is written to another buffer, result.
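The parsing loop does not depend on anything in the kernel, so it can be tried out in user space first. Here is a sketch of the same logic as a stand-alone C function; sum_digit_runs is a name made up for this example, and it works on an ordinary NUL-terminated string instead of input_buf.

```c
#include <stddef.h>

/*
 * Sum all runs of decimal digits in a NUL-terminated string. Every other
 * character, including the final NUL, acts as a delimiter.
 */
unsigned long sum_digit_runs(const char *s)
{
    unsigned long num = 0, sum = 0;
    size_t i = 0;
    char c;

    do {
        c = s[i++];
        if (c >= '0' && c <= '9') {
            num = num * 10 + (unsigned long)(c - '0');
        } else {
            sum += num;   /* a delimiter ends the current number */
            num = 0;
        }
    } while (c != '\0');
    return sum;
}
```

Note that echo appends a newline, so the kernel handler actually sees "1 2 3\n"; the newline is just one more delimiter.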

The read callback function simacc_read() is very simple. As page is a kernel-space address, we can write data into it directly, using either strcpy or sprintf.

Now let’s examine the result:

$ sudo insmod simacc.ko
$ ls -l /proc/simacc
-rw-rw-rw- 1 root root 0 Feb  3 10:28 /proc/simacc
$ cat /proc/simacc
0
$ echo 1 2 3 > /proc/simacc
$ cat /proc/simacc
6
$ echo 12d34 > /proc/simacc
$ cat /proc/simacc
46
$ sudo rmmod simacc

Conclusion

In these two posts, we have seen how to write our own loadable kernel module and have written a very simple example. Although the example is simple, it demonstrates the basic elements of kernel programming: we now know how the kernel generates and consumes data when we read from or write to a proc entry. In later posts, I will write more about how to use procfs to help debug the Linux kernel.

Reference

I referred to several materials while writing this post, but be aware that some of them may be outdated. Keep an eye on which kernel version they use. I was using Linux 2.6.37 when writing this post.


Using LKM and procfs — Part I

The proc filesystem was originally designed to provide information about the processes in a system. But given the usefulness of procfs, many parts of the kernel use it both to report information and to enable dynamic module configuration. It can also be used as a communication mechanism between kernel space and user space. LKM stands for loadable kernel module. In this post, we will practice writing our own module, and later use procfs to communicate with user space.

An example of LKM

The Linux Kernel Module Programming Guide is a very detailed book on LKM programming. For a quick start, I will just provide a very simple hello world module to demonstrate the usage of an LKM.

#include <linux/module.h>

MODULE_LICENSE("GPL");
MODULE_DESCRIPTION("Simple LKM sample");
MODULE_AUTHOR("Stephen Zhang");

int init_simple_lkm(void)
{
    printk(KERN_INFO "SLKM: module loaded. speaking from kernel.\n");
    return 0;
}

void cleanup_simple_lkm(void)
{
    printk(KERN_INFO "SLKM: module unloaded.\n");
}

module_init(init_simple_lkm);
module_exit(cleanup_simple_lkm);

In this simple code, the last two lines, module_init() and module_exit(), register the entry and exit functions of the module; they are invoked when the module is loaded and unloaded. The function types are declared in linux/init.h:

typedef int (*initcall_t)(void);
typedef void (*exitcall_t)(void);

Sometimes you will see two macros __init and __exit, __init. The __init macro causes the init function to be discarded and its memory freed once the init function finishes. And the same, the __exit macro causes the omission of the function.

As stdin and stdout are file descriptors associated with a specific process, a kernel module has neither, so you cannot use printf to put messages on a terminal. Instead, you should use printk, which writes to the kernel log buffer; on many systems syslog also saves these messages to /var/log/messages. You can use dmesg or cat /var/log/messages to examine them. KERN_INFO is the level of the message, defined in linux/printk.h:

#define	KERN_EMERG	"<0>"	/* system is unusable			*/
#define	KERN_ALERT	"<1>"	/* action must be taken immediately	*/
#define	KERN_CRIT	"<2>"	/* critical conditions			*/
#define	KERN_ERR	"<3>"	/* error conditions			*/
#define	KERN_WARNING	"<4>"	/* warning conditions			*/
#define	KERN_NOTICE	"<5>"	/* normal but significant condition	*/
#define	KERN_INFO	"<6>"	/* informational			*/
#define	KERN_DEBUG	"<7>"	/* debug-level messages			*/
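Since these macros are plain string literals, writing KERN_INFO "text" relies on C string literal concatenation and simply yields "<6>text". The following user-space sketch illustrates that with the 2.6.37 definitions listed above. The helpers slkm_log_level and slkm_log_text are hypothetical names made up for this example, not kernel functions.

```c
#include <assert.h>
#include <stddef.h>

#define KERN_INFO "<6>"   /* copied from the list above */

/*
 * Return the numeric level encoded at the start of msg ("<0>" .. "<7>"),
 * or -1 if msg does not start with such a prefix.
 */
static int slkm_log_level(const char *msg)
{
    assert(msg != NULL);
    if (msg[0] == '<' && msg[1] >= '0' && msg[1] <= '7' && msg[2] == '>')
        return msg[1] - '0';
    return -1;
}

/* Return the message text with the level prefix skipped, if there is one. */
static const char *slkm_log_text(const char *msg)
{
    return slkm_log_level(msg) >= 0 ? msg + 3 : msg;
}
```

For example, passing KERN_INFO "SLKM: module loaded.\n" to slkm_log_level gives level 6, and slkm_log_text returns the text starting at "SLKM".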

In kernel 2.6+, building a kernel module is quite simple. Just use the following Makefile:

obj-m += simple_lkm.o
all:
	make -C /lib/modules/${shell uname -r}/build/ M=${PWD} modules

Now just type make in the current directory, and everything will be done for you. You will get simple_lkm.ko; load it with insmod (this requires root privileges):

$ insmod simple_lkm.ko

Now we can see the module is loaded with lsmod and dmesg:

$ lsmod | grep simple_lkm
simple_lkm               938  0
$ dmesg | tail -n 1
[349718.844315] SLKM: module loaded. speaking from kernel.
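The lsmod output comes from /proc/modules, which is itself a procfs entry, so the same check can be done from a program. Below is a small user-space sketch under that assumption. The function slkm_match_module is a hypothetical helper written for this post, and it only looks at the first three fields of a line (name, size, reference count).

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Parse one line in the /proc/modules layout ("name size refcount ...").
 * Returns 1 when the line describes the module called `name`, 0 otherwise.
 * *size and *refcount are only meaningful when 1 is returned.
 */
static int slkm_match_module(const char *line, const char *name,
                             unsigned long *size, unsigned int *refcount)
{
    char found[64];

    assert(line != NULL && name != NULL && size != NULL && refcount != NULL);
    if (sscanf(line, "%63s %lu %u", found, size, refcount) != 3)
        return 0;
    return strcmp(found, name) == 0;
}
```

To use it, read /proc/modules line by line with fgets and pass each line together with "simple_lkm"; a match corresponds to the line lsmod printed above.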

We can use rmmod to unload the module:

$ rmmod simple_lkm
$ dmesg | tail -n 1
[349913.592481] SLKM: module unloaded.

So far, the first LKM is finished. In the next post, I will demonstrate how to use procfs with a simple procfs calculator.

Reference

In this post, I referred to the following materials. But be aware that some of them may be outdated; keep an eye on the version of Linux they are using.

The New Blog Comes

It has been a long time since I last wrote a blog post, ever since Live Space moved to WordPress. Meanwhile, all subdomains of WordPress are blocked in China, which made me even lazier about writing. I was considering buying my own domain name and hosting space, but as I was busy with the final exams of the term, it was put off again and again. Finally I got some breathing room and bought this domain on GoDaddy for $29 for two years. As for the hosting, I used my own Fuloong box and put it in the school library’s server room.

This blog runs on WordPress, with nginx and sqlite3. Personally I prefer this kind of database, which needs nearly zero configuration and is highly portable. After all, it’s just a personal blog with very little traffic, and MySQL is far more than it needs. Thanks to sqlite, when I have to move my blog, I just need to back up my nginx/php-fastcgi configuration and the entire wp folder.

The host operating system is Debian GNU/Linux testing. As there was no system on the box at first, I had to think about how to install one. I had installed a system on a Loongson box before over the network, booting from a USB disk, but I wanted to try something new. So I took out the hard disk, plugged it into my own machine, mounted it, and tried debootstrap. However, I forgot one point: at the second stage, debootstrap chroots into the target environment to proceed, and the target environment requires a MIPS machine, so my poor x86 host had nothing to offer… The only thing I could do was give up.

Of course, I could have used qemu to install the system from my x86 host:

qemu-system-mipsel -hda /dev/sdb

but I didn’t try it. I followed an old article instead. Since I already had access to the target hard disk, I didn’t need a flash disk; I just copied the bootstrap program onto the disk and booted from it.

I skimmed the Loongson repo on anheng.com and found a useful tool: pmon-loongson-config. After installing the latest kernel, just run update-pmon to update pmon’s boot.cfg automatically. But don’t forget to modify /etc/fstab and boot.cfg, changing all hda* to sda*, since the latest kernel no longer uses hd* names for hard disks.

I met another problem while using USTC Debian Repo:

gzip: stdin: invalid compressed data--crc error

At first I thought the error occurred while the repo was syncing from upstream, so I switched to the upstream mirror, but the errors persisted. After trying dozens of mirrors, I found that only the official repo worked. I don’t know whether something went wrong when the mirrors were distributed from the official repo, or whether something was wrong with my own machine.

So that’s all for this post. I just wanna replace the default Hello World post of wordpress, so I can test plugins and typesetting.


Staypressed theme by Themocracy, modified by Stephen