Efficient way to find task_struct by pid

There is a better way to get the task_struct instance from a module. Prefer the wrapper/helper functions where they exist, because they take care of details a driver author can easily miss, such as error handling and condition checks.

/* Returns a struct task_struct * (NULL if there is no such task); pid is a struct pid * */

taskp = get_pid_task(pid, PIDTYPE_PID);

To get the struct pid * from a numeric PID of type pid_t, use:

pid = find_get_pid(pid_no);

You don't need to call rcu_read_lock() and rcu_read_unlock() yourself around these functions: get_pid_task() takes the RCU read lock internally before calling pid_task(), and find_get_pid() does the same around find_vpid(). Both also take a reference on the object they return, so drop it with put_task_struct() or put_pid() when you are done. That is why I recommend using wrappers like these.

Here are get_pid_task() and find_get_pid() as they appear in the kernel source:

struct task_struct *get_pid_task(struct pid *pid, enum pid_type type)
{
    struct task_struct *result;
    rcu_read_lock();
    result = pid_task(pid, type);
    if (result)
        get_task_struct(result);
    rcu_read_unlock();
    return result;
}
EXPORT_SYMBOL_GPL(get_pid_task);

struct pid *find_get_pid(pid_t nr)
{
    struct pid *pid;

    rcu_read_lock();
    pid = get_pid(find_vpid(nr));
    rcu_read_unlock();

    return pid;
}
EXPORT_SYMBOL_GPL(find_get_pid);

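In a kernel module you can also chain the two wrappers. Be aware that this one-liner discards the struct pid pointer, so the reference taken by find_get_pid() is never dropped with put_pid():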

taskp = get_pid_task(find_get_pid(PID),PIDTYPE_PID);
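Since both wrappers hand back a counted reference, a leak-free version keeps the intermediate struct pid pointer so it can be released. The following is only a minimal sketch of that pattern: the helper name show_task_comm() is hypothetical, and the header locations assume a reasonably recent kernel.

```c
#include <linux/errno.h>
#include <linux/pid.h>
#include <linux/printk.h>
#include <linux/sched.h>
#include <linux/sched/task.h>	/* put_task_struct(); lived in <linux/sched.h> on older kernels */

/* Hypothetical helper: look up a task by numeric PID and log its command name. */
static int show_task_comm(pid_t nr)
{
	struct pid *pid;
	struct task_struct *task;

	pid = find_get_pid(nr);			/* takes a reference on the struct pid */
	if (!pid)
		return -ESRCH;

	task = get_pid_task(pid, PIDTYPE_PID);	/* takes a reference on the task */
	put_pid(pid);				/* the struct pid is no longer needed */
	if (!task)
		return -ESRCH;

	pr_info("pid %d is %s\n", nr, task->comm);

	put_task_struct(task);			/* drop the task reference */
	return 0;
}
```

Between get_pid_task() and put_task_struct() the task_struct cannot be freed, so it is safe to dereference there without holding the RCU read lock.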

PS: for more information on these APIs, see kernel/pid.c.


If you want to find the task_struct from a module, find_task_by_vpid(pid_t nr) etc. are not going to work since these functions are not exported.

In a module, you can use the following function instead:

pid_task(find_vpid(pid), PIDTYPE_PID);

No one has mentioned that pid_task() and the pointer it returns must only be used inside an RCU read-side critical section, because it reads an RCU-protected data structure. Otherwise a use-after-free bug is possible.
There are many uses of pid_task() in the Linux kernel sources (e.g. in posix_timer_event()).
For example:

rcu_read_lock();
/* search through the global namespace */
task = pid_task(find_pid_ns(pid_num, &init_pid_ns), PIDTYPE_PID);
if (task)
    printk(KERN_INFO "1. pid: %d, state: %#lx\n",
           pid_num, task->state); /* valid task dereference */
rcu_read_unlock(); /* after it returns - task pointer becomes invalid! */

if (task)
    printk(KERN_INFO "2. pid: %d, state: %#lx\n",
           pid_num, task->state); /* may be successful,
                                   * but is buggy (task dereference is INVALID!) */
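If the task pointer is needed after rcu_read_unlock(), one option is to take a reference on the task while still inside the critical section. The sketch below assumes that approach; the helper name grab_task() is hypothetical, and it uses find_vpid(), so the lookup is done in the namespace of the current task rather than the global one.

```c
#include <linux/pid.h>
#include <linux/rcupdate.h>
#include <linux/sched.h>
#include <linux/sched/task.h>	/* get_task_struct(), put_task_struct() */

/*
 * Hypothetical helper: returns the task with an extra reference held,
 * or NULL if no task has that PID. The caller must call
 * put_task_struct() on a non-NULL result when finished.
 */
static struct task_struct *grab_task(pid_t pid_num)
{
	struct task_struct *task;

	rcu_read_lock();
	task = pid_task(find_vpid(pid_num), PIDTYPE_PID);
	if (task)
		get_task_struct(task);	/* pin the task before leaving the RCU section */
	rcu_read_unlock();

	return task;
}
```

This is essentially what get_pid_task() does internally; the reference keeps the task_struct memory valid, although the process itself may still exit in the meantime.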

You can find out more about the RCU API on kernel.org.


P.S. You can also use the dedicated lookup functions find_task_by_pid_ns() and find_task_by_vpid(), again under rcu_read_lock(). Keep in mind that they are not exported to modules, so this only applies to code built into the kernel.

The first one searches a particular namespace:

task = find_task_by_pid_ns(pid_num, &init_pid_ns); /* e.g. init namespace */

The second one searches the namespace of the current task.


What's wrong with using one of the following?

extern struct task_struct *find_task_by_vpid(pid_t nr);
extern struct task_struct *find_task_by_pid_ns(pid_t nr,
            struct pid_namespace *ns);