Mutex locks and request_module() behavior

I have noticed the following code pattern in the Linux kernel, for example in net/sched/act_api.c and in many other places:

    rtnl_lock();
    rtnetlink_rcv_msg(skb, ...);

    replay:
        ret = process_msg(skb);
        ...
            /* try to obtain symbol which is in module. */
            /* if fail, try to load the module, otherwise use the symbol */
            a = get_symbol();
            if (a == NULL) {
                rtnl_unlock();
                request_module();
                rtnl_lock();
                /* now verify that we can obtain symbols from requested module and return EAGAIN. */
                a = get_symbol();
                module_put();
                return -EAGAIN;
            }
        ...
        if (ret == -EAGAIN)
            goto replay;
        ...
    rtnl_unlock();

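After request_module() succeeds, the symbol we are interested in is available in kernel memory and we can use it. What I do not understand is why the code returns -EAGAIN and looks the symbol up again. Why can't it simply continue right after request_module()?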

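If you look at the current implementation in the Linux kernel, there is a comment right after the second call to the equivalent of get_symbol() in your code above (it is tc_lookup_action_n()) that explains exactly why: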

    rtnl_unlock();
    request_module("act_%s", act_name);
    rtnl_lock();

    a_o = tc_lookup_action_n(act_name);

    /* We dropped the RTNL semaphore in order to
     * perform the module load. So, even if we
     * succeeded in loading the module we have to
     * tell the caller to replay the request. We
     * indicate this using -EAGAIN.
     */
    if (a_o != NULL) {
        err = -EAGAIN;
        goto err_mod;
    }

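Even though the module could be requested and loaded, the RTNL semaphore had to be dropped to do it, because loading a module is an operation that can sleep (and is not the "standard way" this function is executed). While the lock was released, other tasks could have taken it and changed state that the caller had already checked, so the function returns -EAGAIN to signal that the request has to be processed again from the start.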
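To make the hazard concrete, here is a minimal user-space sketch of the same pattern using a pthread mutex. It is not kernel code: `big_lock` stands in for RTNL, `load_handler()` stands in for request_module(), and every name in it (`registered`, `lookup_handler`, `init_action`) is hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stddef.h>
#include <string.h>

/* All names are hypothetical; this is user-space code, not kernel code. */
static pthread_mutex_t big_lock = PTHREAD_MUTEX_INITIALIZER; /* stands in for RTNL */
static const char *registered[8];  /* handler table, protected by big_lock */
static size_t n_registered;

/* Caller must hold big_lock. Returns the registered name or NULL. */
static const char *lookup_handler(const char *name)
{
    assert(name != NULL);
    for (size_t i = 0; i < n_registered; i++)
        if (strcmp(registered[i], name) == 0)
            return registered[i];
    return NULL;
}

/* Stands in for request_module(): it may block, so it runs unlocked.
 * In this sketch, registration takes the lock itself. */
static void load_handler(const char *name)
{
    pthread_mutex_lock(&big_lock);
    if (n_registered < 8 && lookup_handler(name) == NULL)
        registered[n_registered++] = name;
    pthread_mutex_unlock(&big_lock);
}

/* Caller must hold big_lock. Returns 0, -ENOENT, or -EAGAIN. */
static int init_action(const char *name)
{
    if (lookup_handler(name) != NULL)
        return 0;                     /* fast path: the lock was never dropped */

    pthread_mutex_unlock(&big_lock);  /* other threads may run from here ... */
    load_handler(name);
    pthread_mutex_lock(&big_lock);    /* ... until here */

    /* Anything the caller validated before the unlock may be stale now,
     * so even a successful load asks for the request to be restarted. */
    return lookup_handler(name) != NULL ? -EAGAIN : -ENOENT;
}
```

Called with `big_lock` held, the first `init_action("gact")` returns -EAGAIN even though the handler is now registered, and a second call returns 0 via the fast path without ever releasing the lock.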

EDIT for clarification:

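If we look at the call sequence when a new action is added (which may cause the needed module to be loaded), we have: tc_ctl_action() -> tcf_action_add() -> tcf_action_init() -> tcf_action_init_1(). If we now follow the -EAGAIN error back up to tc_ctl_action(), in the case RTM_NEWACTION: branch, we see that the call to tcf_action_add() is repeated whenever ret is -EAGAIN.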
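The caller's side of that replay can be sketched as follows. This is a simplified user-space illustration, not the kernel source: `add_action()` and `ctl_action()` are hypothetical stand-ins for tcf_action_add() and the RTM_NEWACTION branch of tc_ctl_action(), and the module load is reduced to flipping a flag.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-ins; these are not the kernel's functions. */
static bool module_loaded;  /* becomes true once the "module" is in */
static int parse_count;     /* how many times the request was processed */

/* Stands in for tcf_action_add(): processes the request from scratch. */
static int add_action(void)
{
    parse_count++;
    if (!module_loaded) {
        /* Pretend the lock was dropped here and the module got loaded. */
        module_loaded = true;
        return -EAGAIN;     /* ask the caller to start over */
    }
    return 0;
}

/* Stands in for the RTM_NEWACTION branch of tc_ctl_action(). */
static int ctl_action(void)
{
    int ret;

replay:
    ret = add_action();
    if (ret == -EAGAIN)
        goto replay;        /* redo the whole request, now with the module present */
    return ret;
}
```

The first `ctl_action()` call processes the request twice: once to discover the module is missing and trigger the load, and once more from the top. Later calls process it a single time.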