Kernel Korner - Why and How to Use Netlink Socket
January 5th, 2005 by Kevin Kaichuan He
Due to the complexity of developing and maintaining the kernel, only the most essential and performance-critical code is placed in the kernel. Other things, such as GUI, management and control code, typically are programmed as user-space applications. This practice of splitting the implementation of certain features between kernel and user space is quite common in Linux. Now the question is, how can kernel code and user-space code communicate with each other?
The answer is the various IPC methods that exist between kernel and user space, such as system call, ioctl, proc filesystem or netlink socket. This article discusses netlink socket and reveals its advantages as a network feature-friendly IPC.
Netlink socket is a special IPC used for transferring information between kernel and user-space processes. It provides a full-duplex communication link between the two by way of standard socket APIs for user-space processes and a special kernel API for kernel modules. Netlink socket uses the address family AF_NETLINK, as compared to AF_INET used by TCP/IP socket. Each netlink socket feature defines its own protocol type in the kernel header file include/linux/netlink.h.
The following is a subset of features and their protocol types currently supported by the netlink socket:
- NETLINK_ROUTE: communication channel between user-space routing dæmons, such as BGP, OSPF and RIP, and the kernel packet forwarding module. User-space routing dæmons update the kernel routing table through this netlink protocol type.
- NETLINK_FIREWALL: receives packets sent by the IPv4 firewall code.
- NETLINK_NFLOG: communication channel for the user-space iptables management tool and the kernel-space Netfilter module.
- NETLINK_ARPD: for managing the ARP table from user space.
Why do the above features use netlink instead of system calls, ioctls or proc filesystems for communication between user and kernel worlds? It is a nontrivial task to add system calls, ioctls or proc files for new features; we risk polluting the kernel and damaging the stability of the system. Netlink socket is simple, though: only a constant, the protocol type, needs to be added to netlink.h. Then, the kernel module and application can talk using socket-style APIs immediately.
Netlink is asynchronous because, as with any other socket API, it provides a socket queue to smooth the burst of messages. The system call for sending a netlink message queues the message to the receiver's netlink queue and then invokes the receiver's reception handler. The receiver, within the reception handler's context, can decide whether to process the message immediately or leave the message in the queue and process it later in a different context. Unlike netlink, system calls require synchronous processing. Therefore, if we use a system call to pass a message from user space to the kernel, the kernel scheduling granularity may be affected if the time to process that message is long.
The code implementing a system call in the kernel is linked statically to the kernel in compilation time; thus, it is not appropriate to include system call code in a loadable module, which is the case for most device drivers. With netlink socket, no compilation time dependency exists between the netlink core of Linux kernel and the netlink application living in loadable kernel modules.
Netlink socket supports multicast, which is another benefit over system calls, ioctls and proc. One process can multicast a message to a netlink group address, and any number of other processes can listen to that group address. This provides a near-perfect mechanism for event distribution from kernel to user space.
System call and ioctl are simplex IPCs in the sense that a session for these IPCs can be initiated only by user-space applications. But, what if a kernel module has an urgent message for a user-space application? There is no way of doing that directly using these IPCs. Normally, applications periodically need to poll the kernel to get the state changes, although intensive polling is expensive. Netlink solves this problem gracefully by allowing the kernel to initiate sessions too. We call it the duplex characteristic of the netlink socket.
Finally, netlink socket provides a BSD socket-style API that is well understood by the software development community. Therefore, training costs are less as compared to using the rather cryptic system call APIs and ioctls.
In BSD TCP/IP stack implementation, there is a special socket called the routing socket. It has an address family of AF_ROUTE, a protocol family of PF_ROUTE and a socket type of SOCK_RAW. The routing socket in BSD is used by processes to add or delete routes in the kernel routing table.
In Linux, the equivalent function of the routing socket is provided by the netlink socket protocol type NETLINK_ROUTE. Netlink socket provides a functionality superset of BSD's routing socket.
The standard socket APIs—socket(), sendmsg(), recvmsg() and close()—can be used by user-space applications to access netlink socket. Consult the man pages for detailed definitions of these APIs. Here, we discuss how to choose parameters for these APIs only in the context of netlink socket. The APIs should be familiar to anyone who has written an ordinary network application using TCP/IP sockets.
To create a socket with socket(), enter:
int socket(int domain, int type, int protocol)
The socket domain (address family) is AF_NETLINK, and the type of socket is either SOCK_RAW or SOCK_DGRAM, because netlink is a datagram-oriented service.
The protocol (protocol type) selects for which netlink feature the socket is used. The following are some predefined netlink protocol types: NETLINK_ROUTE, NETLINK_FIREWALL, NETLINK_ARPD, NETLINK_ROUTE6 and NETLINK_IP6_FW. You also can add your own netlink protocol type easily.
Up to 32 multicast groups can be defined for each netlink protocol type. Each multicast group is represented by a bit mask, 1<<i, where 0<=i<=31. This is extremely useful when a group of processes and the kernel process coordinate to implement the same feature: sending multicast netlink messages can reduce the number of system calls used and relieve applications of the burden of maintaining the multicast group membership.
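As a small illustration of the bit-mask convention, the sketch below builds a group mask from group indices. The helper names nl_group_mask() and nl_groups_join() are hypothetical and not part of any netlink header; they only show the arithmetic.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bit mask for multicast group i, where 0 <= i <= 31. */
uint32_t nl_group_mask(unsigned int i)
{
    assert(i <= 31);
    return (uint32_t)1 << i;
}

/* OR together the masks of n group indices, for example to fill nl_groups. */
uint32_t nl_groups_join(const unsigned int *indices, size_t n)
{
    uint32_t mask = 0;
    size_t k;

    for (k = 0; k < n; k++)
        mask |= nl_group_mask(indices[k]);
    return mask;
}
```

For instance, joining groups 0 and 3 gives the mask 0x9.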
As for a TCP/IP socket, the netlink bind() API associates a local (source) socket address with the opened socket. The netlink address structure is as follows:
struct sockaddr_nl
{
sa_family_t nl_family; /* AF_NETLINK */
unsigned short nl_pad; /* zero */
__u32 nl_pid; /* process pid */
__u32 nl_groups; /* mcast groups mask */
} nladdr;
When used with bind(), the nl_pid field of the sockaddr_nl can be filled with the calling process' own pid. The nl_pid serves here as the local address of this netlink socket. The application is responsible for picking a unique 32-bit integer to fill in nl_pid:
NL_PID Formula 1: nl_pid = getpid();
Formula 1 uses the process ID of the application as nl_pid, which is a natural choice if, for the given netlink protocol type, only one netlink socket is needed for the process.
In scenarios where different threads of the same process want to have different netlink sockets opened under the same netlink protocol, Formula 2 can be used to generate the nl_pid:
NL_PID Formula 2: pthread_self() << 16 | getpid();
In this way, different pthreads of the same process each can have their own netlink socket for the same netlink protocol type. In fact, even within a single pthread it's possible to create multiple netlink sockets for the same protocol type. Developers need to be more creative, however, in generating a unique nl_pid, and we don't consider this to be a normal-use case.
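One caveat with Formula 2 is that pthread_self() returns an opaque pthread_t, which is not guaranteed to be an integer type, so shifting it directly may not be portable. The sketch below is one possible variant, under the assumption that the application gives each thread a small integer slot number of its own; make_nl_pid() and thread_slot are hypothetical names. The result is unique only while both the slot and the process ID fit in 16 bits.

```c
#include <assert.h>
#include <stdint.h>
#include <sys/types.h>

/*
 * Combine a per-thread slot number (assigned by the application,
 * an assumption of this sketch) with the process ID.
 */
uint32_t make_nl_pid(uint32_t thread_slot, pid_t pid)
{
    assert(thread_slot <= 0xFFFF);
    return (thread_slot << 16) | (uint32_t)pid;
}
```

With slot 0 the result equals the plain process ID, so a single-threaded program gets the same value as Formula 1.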
If the application wants to receive netlink messages of the protocol type that are destined for certain multicast groups, the bitmasks of all the interested multicast groups should be ORed together to form the nl_groups field of sockaddr_nl. Otherwise, nl_groups should be zeroed out so the application receives only the unicast netlink message of the protocol type destined for the application. After filling in the nladdr, do the bind as follows:
bind(fd, (struct sockaddr*)&nladdr, sizeof(nladdr));
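Putting socket() and bind() together, a minimal sketch with error checking might look like the following. The function names nl_fill_local() and nl_open_bound() are hypothetical helpers introduced for illustration, and the sketch assumes NL_PID Formula 1 (the process ID) is an acceptable local address.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>
#include <linux/netlink.h>

/* Fill a local netlink address: our own nl_pid plus the groups to join. */
void nl_fill_local(struct sockaddr_nl *addr, uint32_t pid, uint32_t groups)
{
    assert(addr != NULL);
    memset(addr, 0, sizeof(*addr));   /* also zeroes nl_pad */
    addr->nl_family = AF_NETLINK;
    addr->nl_pid = pid;
    addr->nl_groups = groups;
}

/* Open a netlink socket and bind it; returns the fd, or -1 on error. */
int nl_open_bound(int protocol, uint32_t groups)
{
    struct sockaddr_nl local;
    int fd = socket(AF_NETLINK, SOCK_RAW, protocol);

    if (fd < 0)
        return -1;
    nl_fill_local(&local, (uint32_t)getpid(), groups);
    if (bind(fd, (struct sockaddr *)&local, sizeof(local)) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}
```

Passing 0 for groups gives a socket that receives only unicast messages; passing an ORed group mask subscribes it to those groups as well.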
In order to send a netlink message to the kernel or other user-space processes, another struct sockaddr_nl nladdr needs to be supplied as the destination address, the same as sending a UDP packet with sendmsg(). If the message is destined for the kernel, both nl_pid and nl_groups should be supplied with 0.
If the message is a unicast message destined for another process, the nl_pid is the other process' pid and nl_groups is 0, assuming NL_PID Formula 1 is used in the system.
If the message is a multicast message destined for one or multiple multicast groups, the bitmasks of all the destination multicast groups should be ORed together to form the nl_groups field. We then can supply the netlink address to the struct msghdr msg for the sendmsg() API, as follows:
struct msghdr msg;
memset(&msg, 0, sizeof(msg)); /* clear fields we do not use, such as msg_control */
msg.msg_name = (void *)&(nladdr);
msg.msg_namelen = sizeof(nladdr);
The netlink socket requires its own message header as well. This is for providing a common ground for netlink messages of all protocol types.
Because the Linux kernel netlink core assumes the existence of the following header in each netlink message, an application must supply this header in each netlink message it sends:
struct nlmsghdr
{
__u32 nlmsg_len; /* Length of message */
__u16 nlmsg_type; /* Message type*/
__u16 nlmsg_flags; /* Additional flags */
__u32 nlmsg_seq; /* Sequence number */
__u32 nlmsg_pid; /* Sending process PID */
};
nlmsg_len has to be completed with the total length of the netlink message, including the header, and is required by netlink core. nlmsg_type can be used by applications and is an opaque value to netlink core. nlmsg_flags is used to give additional control to a message; it is read and updated by netlink core. nlmsg_seq and nlmsg_pid are used by applications to track the message, and they are opaque to netlink core as well.
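To make the header fields concrete, here is a minimal sketch that allocates a buffer and fills in a netlink message carrying a text payload. The function nl_build_text() is a hypothetical helper, not part of netlink.h; the macros NLMSG_SPACE(), NLMSG_LENGTH() and NLMSG_DATA() are the ones defined in netlink.h. The sketch sets nlmsg_len to the header plus the actual payload, which is one reasonable choice rather than the only one.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <sys/socket.h>
#include <linux/netlink.h>

/*
 * Build a netlink message whose payload is a NUL-terminated string.
 * Returns a malloc'ed buffer the caller must free(), or NULL on failure.
 */
struct nlmsghdr *nl_build_text(uint16_t type, uint32_t seq, uint32_t pid,
                               const char *text)
{
    size_t payload_len;
    struct nlmsghdr *nlh;

    assert(text != NULL);
    payload_len = strlen(text) + 1;            /* include the NUL */

    /* NLMSG_SPACE() = aligned header + aligned payload */
    nlh = calloc(1, NLMSG_SPACE(payload_len));
    if (nlh == NULL)
        return NULL;

    nlh->nlmsg_len = NLMSG_LENGTH(payload_len); /* header + payload */
    nlh->nlmsg_type = type;
    nlh->nlmsg_flags = 0;
    nlh->nlmsg_seq = seq;
    nlh->nlmsg_pid = pid;
    memcpy(NLMSG_DATA(nlh), text, payload_len);
    return nlh;
}
```

Because the netlink header is 16 bytes, a message carrying "Hello you!" (10 characters plus the terminating NUL) ends up with nlmsg_len equal to 27.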
A netlink message thus consists of nlmsghdr and the message payload. Once the message has been filled in, it sits in a buffer pointed to by the nlh pointer. We then attach that buffer to the struct msghdr msg:
struct iovec iov;
iov.iov_base = (void *)nlh;
iov.iov_len = nlh->nlmsg_len;
msg.msg_iov = &iov;
msg.msg_iovlen = 1;
After the above steps, a call to sendmsg() kicks out the netlink message:
sendmsg(fd, &msg, 0);
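The sending steps can be collected into one helper, sketched below. The names nl_prepare_send() and nl_send() are hypothetical. One detail worth making explicit is that the struct msghdr is zeroed first, so fields that netlink does not need here, such as msg_control, do not carry stack garbage into sendmsg().

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <linux/netlink.h>

/* Fill dest, iov and msg so that sendmsg() transmits the message at nlh. */
void nl_prepare_send(struct msghdr *msg, struct iovec *iov,
                     struct sockaddr_nl *dest, struct nlmsghdr *nlh,
                     uint32_t dest_pid, uint32_t dest_groups)
{
    assert(msg != NULL && iov != NULL && dest != NULL && nlh != NULL);

    /* pid 0 and groups 0 address the kernel */
    memset(dest, 0, sizeof(*dest));
    dest->nl_family = AF_NETLINK;
    dest->nl_pid = dest_pid;
    dest->nl_groups = dest_groups;

    iov->iov_base = nlh;
    iov->iov_len = nlh->nlmsg_len;

    memset(msg, 0, sizeof(*msg));  /* no control data, no flags */
    msg->msg_name = dest;
    msg->msg_namelen = sizeof(*dest);
    msg->msg_iov = iov;
    msg->msg_iovlen = 1;
}

/* Send one netlink message; returns what sendmsg() returns. */
ssize_t nl_send(int fd, struct nlmsghdr *nlh,
                uint32_t dest_pid, uint32_t dest_groups)
{
    struct msghdr msg;
    struct iovec iov;
    struct sockaddr_nl dest;

    nl_prepare_send(&msg, &iov, &dest, nlh, dest_pid, dest_groups);
    return sendmsg(fd, &msg, 0);
}
```

Calling nl_send(fd, nlh, 0, 0) sends to the kernel, a non-zero dest_pid sends a unicast message to another process, and a non-zero dest_groups sends a multicast message.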
A receiving application needs to allocate a buffer large enough to hold netlink message headers and message payloads. It then fills the struct msghdr msg as shown below and uses the standard recvmsg() to receive the netlink message, assuming the buffer is pointed to by nlh:
struct sockaddr_nl nladdr;
struct msghdr msg;
struct iovec iov;
memset(&msg, 0, sizeof(msg)); /* clear fields we do not use, such as msg_control */
iov.iov_base = (void *)nlh;
iov.iov_len = MAX_NL_MSG_LEN;
msg.msg_name = (void *)&(nladdr);
msg.msg_namelen = sizeof(nladdr);
msg.msg_iov = &iov;
msg.msg_iovlen = 1;
recvmsg(fd, &msg, 0);
After the message has been received correctly, nlh should point to the header of the just-received netlink message. nladdr should hold the address information of the received message, which consists of the sender's pid (0 if the message came from the kernel) and the multicast groups to which the message was sent. The macro NLMSG_DATA(nlh), defined in netlink.h, returns a pointer to the payload of the netlink message. A call to close(fd) closes the netlink socket identified by file descriptor fd.
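A receive buffer can hold more than one netlink message back to back, so it is safer to walk the buffer than to assume a single message. The sketch below uses the NLMSG_OK() and NLMSG_NEXT() macros from netlink.h for that; the function nl_walk() and its callback parameter are hypothetical names chosen for illustration. The len argument would be the byte count returned by recvmsg().

```c
#include <assert.h>
#include <stddef.h>
#include <sys/socket.h>
#include <linux/netlink.h>

/*
 * Visit every complete netlink message in buf (len bytes long).
 * Calls handle() on each message if handle is not NULL and
 * returns the number of messages visited.
 */
int nl_walk(void *buf, int len, void (*handle)(struct nlmsghdr *))
{
    struct nlmsghdr *nlh;
    int count = 0;

    assert(buf != NULL);
    /* NLMSG_OK() rejects truncated messages; NLMSG_NEXT() advances
     * to the next aligned header and decreases len accordingly. */
    for (nlh = buf; NLMSG_OK(nlh, len); nlh = NLMSG_NEXT(nlh, len)) {
        if (handle != NULL)
            handle(nlh);
        count++;
    }
    return count;
}
```

Inside the callback, NLMSG_DATA(nlh) gives the payload of the message being visited.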
The kernel-space netlink API is supported by the netlink core in the kernel, net/netlink/af_netlink.c. From the kernel side, the API is different from the user-space API. The API can be used by kernel modules to access the netlink socket and to communicate with user-space applications. Unless you leverage the existing netlink socket protocol types, you need to add your own protocol type by adding a constant to netlink.h. For example, we can add a netlink protocol type for testing purposes by inserting this line into netlink.h:
#define NETLINK_TEST 17
Afterward, you can reference the added protocol type anywhere in the Linux kernel.
In user space, we call socket() to create a netlink socket, but in kernel space, we call the following API:
struct sock *
netlink_kernel_create(int unit,
void (*input)(struct sock *sk, int len));
The parameter unit is, in fact, the netlink protocol type, such as NETLINK_TEST. The function pointer, input, is a callback function invoked when a message arrives at this netlink socket.
After the kernel has created a netlink socket for protocol NETLINK_TEST, whenever user space sends a netlink message of the NETLINK_TEST protocol type to the kernel, the callback function, input(), which is registered by netlink_kernel_create(), is invoked. The following is an example implementation of the callback function input:
void input (struct sock *sk, int len)
{
struct sk_buff *skb;
struct nlmsghdr *nlh = NULL;
u8 *payload = NULL;
while ((skb = skb_dequeue(&sk->receive_queue))
!= NULL) {
/* process netlink message pointed by skb->data */
nlh = (struct nlmsghdr *)skb->data;
payload = NLMSG_DATA(nlh);
/* process netlink message with header pointed by
* nlh and payload pointed by payload
*/
}
}
This input() function is called in the context of the sendmsg() system call invoked by the sending process. It is okay to process the netlink message inside input() if it's fast. When the processing of netlink message takes a long time, however, we want to keep it out of input() to avoid blocking other system calls from entering the kernel. Instead, we can use a dedicated kernel thread to perform the following steps indefinitely. Use skb = skb_recv_datagram(nl_sk) where nl_sk is the netlink socket returned by netlink_kernel_create(). Then, process the netlink message pointed to by skb->data.
This kernel thread sleeps when there is no netlink message in nl_sk. Thus, inside the callback function input(), we need to wake up only the sleeping kernel thread, like this:
void input (struct sock *sk, int len)
{
wake_up_interruptible(sk->sleep);
}
This is a more scalable communication model between user space and kernel. It also improves the granularity of context switches.
Just as in user space, the source netlink address and destination netlink address need to be set when sending a netlink message. Assuming the socket buffer holding the netlink message to be sent is struct sk_buff *skb, the local address can be set with:
NETLINK_CB(skb).groups = local_groups;
NETLINK_CB(skb).pid = 0; /* from kernel */
The destination address can be set like this:
NETLINK_CB(skb).dst_groups = dst_groups;
NETLINK_CB(skb).dst_pid = dst_pid;
Such information is not stored in skb->data. Rather, it is stored in the netlink control block of the socket buffer, skb.
To send a unicast message, use:
int
netlink_unicast(struct sock *ssk, struct sk_buff
*skb, u32 pid, int nonblock);
where ssk is the netlink socket returned by netlink_kernel_create(), skb->data points to the netlink message to be sent and pid is the receiving application's pid, assuming NL_PID Formula 1 is used. nonblock indicates whether the API should block when the receiving buffer is unavailable or immediately return a failure.
You also can send a multicast message. The following API delivers a netlink message to both the process specified by pid and the multicast groups specified by group:
void
netlink_broadcast(struct sock *ssk, struct sk_buff
*skb, u32 pid, u32 group, int allocation);
group is the ORed bitmasks of all the receiving multicast groups. allocation is the kernel memory allocation type. Typically, GFP_ATOMIC is used when calling from interrupt context and GFP_KERNEL otherwise. The allocation type is needed because the API may have to allocate one or more socket buffers to clone the multicast message.
Given the struct sock *nl_sk returned by netlink_kernel_create(), we can call the following kernel API to close the netlink socket in the kernel:
sock_release(nl_sk->socket);
So far, we have shown only the bare minimum code framework to illustrate the concept of netlink programming. We now will use our NETLINK_TEST netlink protocol type and assume it already has been added to the kernel header file. The kernel module code listed here contains only the netlink-relevant part, so it should be inserted into a complete kernel module skeleton, which you can find from many other reference sources.
In this example, a user-space process sends a netlink message to the kernel module, and the kernel module echoes the message back to the sending process. Here is the user-space code:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <sys/socket.h>
#include <linux/netlink.h>

#ifndef NETLINK_TEST
#define NETLINK_TEST 17 /* must match the value added to the kernel's netlink.h */
#endif
#define MAX_PAYLOAD 1024 /* maximum payload size */

struct sockaddr_nl src_addr, dest_addr;
struct nlmsghdr *nlh = NULL;
struct iovec iov;
struct msghdr msg; /* global, so all fields start out as zero */
int sock_fd;

int main(void) {
    sock_fd = socket(PF_NETLINK, SOCK_RAW, NETLINK_TEST);
    if (sock_fd < 0) {
        perror("socket");
        return 1;
    }

    memset(&src_addr, 0, sizeof(src_addr));
    src_addr.nl_family = AF_NETLINK;
    src_addr.nl_pid = getpid(); /* self pid */
    src_addr.nl_groups = 0; /* not in mcast groups */
    bind(sock_fd, (struct sockaddr *)&src_addr, sizeof(src_addr));

    memset(&dest_addr, 0, sizeof(dest_addr));
    dest_addr.nl_family = AF_NETLINK;
    dest_addr.nl_pid = 0; /* For Linux Kernel */
    dest_addr.nl_groups = 0; /* unicast */

    nlh = (struct nlmsghdr *)malloc(NLMSG_SPACE(MAX_PAYLOAD));
    memset(nlh, 0, NLMSG_SPACE(MAX_PAYLOAD));
    /* Fill the netlink message header */
    nlh->nlmsg_len = NLMSG_SPACE(MAX_PAYLOAD);
    nlh->nlmsg_pid = getpid(); /* self pid */
    nlh->nlmsg_flags = 0;
    /* Fill in the netlink message payload */
    strcpy((char *)NLMSG_DATA(nlh), "Hello you!");

    iov.iov_base = (void *)nlh;
    iov.iov_len = nlh->nlmsg_len;
    msg.msg_name = (void *)&dest_addr;
    msg.msg_namelen = sizeof(dest_addr);
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    sendmsg(sock_fd, &msg, 0);

    /* Read message from kernel */
    memset(nlh, 0, NLMSG_SPACE(MAX_PAYLOAD));
    recvmsg(sock_fd, &msg, 0);
    printf(" Received message payload: %s\n", (char *)NLMSG_DATA(nlh));

    /* Close Netlink Socket */
    close(sock_fd);
    free(nlh);
    return 0;
}
And, here is the kernel code:
struct sock *nl_sk = NULL;
void nl_data_ready (struct sock *sk, int len)
{
wake_up_interruptible(sk->sleep);
}
void netlink_test() {
struct sk_buff *skb = NULL;
struct nlmsghdr *nlh = NULL;
int err;
u32 pid;
nl_sk = netlink_kernel_create(NETLINK_TEST,
nl_data_ready);
/* wait for message coming down from user-space */
skb = skb_recv_datagram(nl_sk, 0, 0, &err);
nlh = (struct nlmsghdr *)skb->data;
printk("%s: received netlink message payload:%s\n",
__FUNCTION__, NLMSG_DATA(nlh));
pid = nlh->nlmsg_pid; /*pid of sending process */
NETLINK_CB(skb).groups = 0; /* not in mcast group */
NETLINK_CB(skb).pid = 0; /* from kernel */
NETLINK_CB(skb).dst_pid = pid;
NETLINK_CB(skb).dst_groups = 0; /* unicast */
netlink_unicast(nl_sk, skb, pid, MSG_DONTWAIT);
sock_release(nl_sk->socket);
}
After loading the kernel module that executes the kernel code above, when we run the user-space executable, we should see the following dumped from the user-space program:
Received message payload: Hello you!
And, the following message should appear in the output of dmesg:
netlink_test: received netlink message payload:Hello you!
In this example, two user-space applications are listening to the same netlink multicast group. The kernel module pops up a message through netlink socket to the multicast group, and all the applications receive it. Here is the user-space code:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <sys/socket.h>
#include <linux/netlink.h>

#ifndef NETLINK_TEST
#define NETLINK_TEST 17 /* must match the value added to the kernel's netlink.h */
#endif
#define MAX_PAYLOAD 1024 /* maximum payload size */

struct sockaddr_nl src_addr, dest_addr;
struct nlmsghdr *nlh = NULL;
struct iovec iov;
struct msghdr msg; /* global, so all fields start out as zero */
int sock_fd;

int main(void) {
    sock_fd = socket(PF_NETLINK, SOCK_RAW, NETLINK_TEST);
    if (sock_fd < 0) {
        perror("socket");
        return 1;
    }

    memset(&src_addr, 0, sizeof(src_addr));
    src_addr.nl_family = AF_NETLINK;
    src_addr.nl_pid = getpid(); /* self pid */
    /* interested in group 1<<0 */
    src_addr.nl_groups = 1;
    bind(sock_fd, (struct sockaddr *)&src_addr, sizeof(src_addr));

    /* dest_addr is filled in by recvmsg() */
    memset(&dest_addr, 0, sizeof(dest_addr));

    nlh = (struct nlmsghdr *)malloc(NLMSG_SPACE(MAX_PAYLOAD));
    memset(nlh, 0, NLMSG_SPACE(MAX_PAYLOAD));

    iov.iov_base = (void *)nlh;
    iov.iov_len = NLMSG_SPACE(MAX_PAYLOAD);
    msg.msg_name = (void *)&dest_addr;
    msg.msg_namelen = sizeof(dest_addr);
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;

    printf("Waiting for message from kernel\n");
    /* Read message from kernel */
    recvmsg(sock_fd, &msg, 0);
    printf(" Received message payload: %s\n", (char *)NLMSG_DATA(nlh));

    close(sock_fd);
    free(nlh);
    return 0;
}
And, here is the kernel code:
#define MAX_PAYLOAD 1024
struct sock *nl_sk = NULL;
void netlink_test() {
    struct sk_buff *skb = NULL;
    struct nlmsghdr *nlh;

    /* nl_data_ready() is the callback defined in the previous kernel listing */
    nl_sk = netlink_kernel_create(NETLINK_TEST, nl_data_ready);
    skb = alloc_skb(NLMSG_SPACE(MAX_PAYLOAD), GFP_KERNEL);
    /* skb_put() reserves the message area in the buffer and returns a
     * pointer to it; without it the buffer's data length stays zero */
    nlh = (struct nlmsghdr *)skb_put(skb, NLMSG_SPACE(MAX_PAYLOAD));
    nlh->nlmsg_len = NLMSG_SPACE(MAX_PAYLOAD);
    nlh->nlmsg_pid = 0; /* from kernel */
    nlh->nlmsg_flags = 0;
    strcpy(NLMSG_DATA(nlh), "Greeting from kernel!");
    /* sender is in group 1<<0 */
    NETLINK_CB(skb).groups = 1;
    NETLINK_CB(skb).pid = 0; /* from kernel */
    NETLINK_CB(skb).dst_pid = 0; /* multicast */
    /* to mcast group 1<<0 */
    NETLINK_CB(skb).dst_groups = 1;
    /* multicast the message to all listening processes */
    netlink_broadcast(nl_sk, skb, 0, 1, GFP_KERNEL);
    sock_release(nl_sk->socket);
}
Assuming the user-space code is compiled into the executable nl_recv, we can run two instances of nl_recv:
./nl_recv &
Waiting for message from kernel
./nl_recv &
Waiting for message from kernel
Then, after we load the kernel module that executes the kernel-space code, both instances of nl_recv should receive the following message:
Received message payload: Greeting from kernel!
Received message payload: Greeting from kernel!
Netlink socket is a flexible interface for communication between user-space applications and kernel modules. It provides an easy-to-use socket API to both applications and the kernel. It provides advanced communication features, such as full-duplex, buffered I/O, multicast and asynchronous communication, which are absent in other kernel/user-space IPCs.
Linux Journal: delivering readers the advice and inspiration they need to get the most out of their Linux systems since 1994.
Help regarding Netlink sockets for 2.6 kernels: Kernel module
On November 25th, 2009 P (not verified) says:
Hi,
I'm a netlink newbie developing a kernel module (as stated above) for 2.6.x kernels.
I'm simply trying to pass a (any) message between the user space and the kernel.
The changes to the netlink APIs from kernel to kernel are confusing.
Below is my kernel code:
when I run a make on this file, it shows the following errors:
The two errors are for
1. netlink_kernel_create function
2. struct netlink_skb_parms
Can anyone help me figure out how to solve these errors?
Thanks!
netlink_kernel_create function example for kernel 2.6.29
On November 12th, 2009 Anonymous (not verified) says:
Can you provide an example for netlink_kernel_create function for centos 5.1 2.6.29 kernel
Need working code for latest kernel
On April 13th, 2009 prashant bhole (not verified) says:
I tried to modify this for latest kernel... but when kernel send msg to user space, kernel hangs after few seconds... I am not able to figure out the problem
Reading additional bytes from the netlink subsystem
On November 19th, 2008 Ravi kumar (not verified) says:
Hi,
Please let me know the following problem is a real issue or not?
I have written a program using generic netlinks to communicate to/from kernel space. The user program sends a string and expects two strings from the kernel.
The program is working well and as expected kernel sends two hello strings to the user.
But the problem is kernel is sending one more message on the same socket which I don't expect.
I.e after the first two reads on the socket, the third should block for the data until kernel sends further messages. But instead of blocking, the third read in the user application reads a message which seems to be an error message from kernel.
Please check below for the message prints from kernel.
On first socket read using recv(......)
38 00 00 00 30 00 00 00 1F 13 24 49 00 00 00 00 8 . . . 0 . . . . . $ I . . . .
01 01 00 00 22 00 01 00 68 65 6C 6C 6F 20 77 6F . . . . " . . . h e l l o w o
72 6C 64 20 66 72 6F 6D 20 6B 65 72 6E 65 6C 20 r l d f r o m k e r n e l
73 70 61 63 65 00 00 00 s p a c e . . .
second read
40 00 00 00 30 00 00 00 1F 13 24 49 00 00 00 00 @ . . . 0 . . . . . $ I . . . .
01 01 00 00 29 00 01 00 53 65 63 6F 6E 64 20 68 . . . . ) . . . S e c o n d h
65 6C 6C 6F 20 77 6F 72 6C 64 20 66 72 6F 6D 20 e l l o w o r l d f r o m
6B 65 72 6E 65 6C 20 73 70 61 63 65 00 00 00 00 k e r n e l s p a c e . . . .
Third read which should block is receiving following
message from kernel which I think is a bug.
24 00 00 00 02 00 00 00 1E 13 24 49 68 39 00 00 $ . . . . . . . . . $ I h 9 . .
00 00 00 00 30 00 00 00 30 00 05 00 1E 13 24 49 . . . . 0 . . . 0 . . . . . $ I
68 39 00 00 h 9 . .
The third message is not sent by my kernel generic driver but is always received in the user application.
Please help resolve the above issue.
Thanks in advance.....
Usage of netlinks for kernel to user space communication.
On November 10th, 2008 ravikumar (not verified) says:
Hi,
I am new to this networking field and usage of netlinks.
My requirement is to pass some data asynchronously to the kernel module from user space and vice versa.
I managed to satisfy my first requirement using netlink. That is, I have created my own generic family in the kernel with some registered operations. Using the netlink library I managed to pass the data with the appropriate command to my corresponding kernel module.
But I doubt whether data from the kernel module can be passed to user space asynchronously using netlink.
Is there any way that I can register some callback functions in user space on the same netlink family I have created in the kernel for specific commands and pass the data to user space?
If yes, please let me know how I can achieve it with the generic netlink infrastructure.
If not, I would be grateful for some hints on the alternatives.
Thanks in advance,
Ravi kumar
Through Netlink sockets,Kernel echoes! -->Only echoes possible?
On October 6th, 2008 Ajith Pullanikkat (not verified) says:
HI all,
Through Netlink sockets,Kernel echoes! -->Only echoes possible?
Rather than expecting an echo message from kernel,can user by some means ask Kernel to send an expected reply.?
I will make things more clear.
Scenario:
I am expecting messages from the kernel on any USB plug-in, and I am able to receive them through netlink sockets. But if the USB device is already plugged in, the kernel does not send a message to user space. Can I request a USB plug-in/plug-out message from the kernel by sending my requirement through sendmsg()? :)
Responses appreciated.Thanks in advance.
Ajith
This code doesn't work
On January 19th, 2007 Stepchenko (not verified) says:
This code doesn't work correctly
#define MAX_PAYLOAD 1024
struct sock *nl_sk = NULL;
void netlink_test() {
sturct sk_buff *skb = NULL;
struct nlmsghdr *nlh;
int err;
nl_sk = netlink_kernel_create(NETLINK_TEST,
nl_data_ready);
skb=alloc_skb(NLMSG_SPACE(MAX_PAYLOAD),GFP_KERNEL);
nlh = (struct nlmsghdr *)skb->data;
nlh->nlmsg_len = NLMSG_SPACE(MAX_PAYLOAD);
nlh->nlmsg_pid = 0; /* from kernel */
nlh->nlmsg_flags = 0;
strcpy(NLMSG_DATA(nlh), "Greeting from kernel!");
NETLINK_CB(skb).groups = 1;
NETLINK_CB(skb).pid = 0; /* from kernel */
NETLINK_CB(skb).dst_pid = 0; /* multicast */
NETLINK_CB(skb).dst_groups = 1;
/*multicast the message to all listening processes*/
netlink_broadcast(nl_sk, skb, 0, 1, GFP_KERNEL);
sock_release(nl_sk->socket);
}
One important thing is missing here:
before strcpy we should call
skb_put(skb, NLMSG_SPACE(MAX_PAYLOAD))
or change
nlh = (struct nlmsghdr *)skb->data;
nlh->nlmsg_len = NLMSG_SPACE(MAX_PAYLOAD);
nlh->nlmsg_pid = 0; /* from kernel */
nlh->nlmsg_flags = 0;
to
NLMSG_PUT(...)
Best regards
nlh = (struct nlmsghdr *)
On May 4th, 2007 kamo (not verified) says:
nlh = (struct nlmsghdr *) skb_put(skb, NLMSG_SPACE(MAX_PAYLOAD));
best regards
Kernel Module
On December 6th, 2006 Amit Sahrawat (not verified) says:
#include <linux/config.h>
#include <linux/socket.h>
#include <linux/kernel.h>
#include <linux/module.h>
#include <linux/netlink.h>
#include <net/sock.h>
#define NETLINK_TEST 17
MODULE_LICENSE("GPL");
MODULE_AUTHOR("Test");
MODULE_DESCRIPTION("Testing Kernel/User socket");
static int debug = 0;
module_param(debug, int, 0);
MODULE_PARM_DESC(debug, "Debug information (default 0)");
static struct sock *nl_sk = NULL;
static void nl_data_ready (struct sock *sk, int len)
{
wake_up_interruptible(sk->sk_sleep);
}
static void netlink_test()
{
struct sk_buff *skb = NULL;
struct nlmsghdr *nlh = NULL;
int err;
u32 pid;
nl_sk = netlink_kernel_create(NETLINK_TEST, nl_data_ready);
skb = skb_recv_datagram(nl_sk, 0, 0, &err);
nlh = (struct nlmsghdr *)skb->data;
printk(KERN_INFO "%s: received netlink message payload: %s\n", __FUNCTION__, NLMSG_DATA(nlh));
pid = nlh->nlmsg_pid;
NETLINK_CB(skb).groups = 0;
NETLINK_CB(skb).pid = 0;
NETLINK_CB(skb).dst_pid = pid;
NETLINK_CB(skb).dst_groups = 0;
netlink_unicast(nl_sk, skb, pid, MSG_DONTWAIT);
sock_release(nl_sk->sk_socket);
}
static int __init my_module_init(void)
{
printk(KERN_INFO "Initializing Netlink Socket");
netlink_test();
return 0;
}
static void __exit my_module_exit(void)
{
printk(KERN_INFO "Goodbye");
}
module_init(my_module_init);
module_exit(my_module_exit);
Makefile contents:
obj-m := netkernel.o
all:
make -C /lib/modules/$(shell uname -r)/build M=$(PWD) modules
clean:
make -C /lib/modules/$(shell uname -r)/build M=$(PWD) clean
Install and test using:
insmod netkernel.ko
for messages check do:
tail /var/log/messages
Great article. Helped me
On September 11th, 2006 Anonymous (not verified) says:
Great article. Helped me a lot to get my code ported to 2.6. Thanks.
Should the Linux kernel be recompiled?
On July 19th, 2006 Eswari (not verified) says:
Hi,
Should I recompile the Linux kernel after writing the kernel module of netlink? If so, could you tell me how to do it? I could not understand how the kernel module of netlink will get activated. I want to send certain packets (coming from certain IP addresses) to my application residing in user space. To filter the messages I want to use iptables. How will the iptables-filtered messages go to the kernel module of netlink, so that from there they will be sent to my user-space application?
Could some one help me
Thanks
Eswari
Netlink is not the silver bullet
On June 19th, 2006 A concerned app programmer (not verified) says:
Shell/PERL/etc apps can use /proc on any distro without having to rebuild the app or worry about library incompatibilities. Conscientious developers are cautious when changing the /proc contents since hundreds of apps could be using the information... The netlink infrastructure may be more efficient, but how would the wealth of information provided by /proc be made available to system administrators as easily as /proc is via cat, less, or grep? How can I get information from netlink using those applications? The /proc support does have its advantages. Netlink is not the silver bullet.
However, this is a great netlink article!
need for compiled example
On August 27th, 2005 majid taghiloo (not verified) says:
Thanks for your article; it is very useful. I tried to create a communication socket between a kernel module and a user-land program using your proposed code, but it did not work correctly. My module and user-land code compile correctly, but there is no communication between them.
Please, it would be nice if you could add working or at least compilable examples.
Best regards,
M.taghiloo
Working userspace prog (below was kernel module, not userspace)
On July 7th, 2005 Anonymous (not verified) says:
/* Working version of the Netlink Socket code from Linux Journal's Kernel Korner */
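#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <sys/socket.h>
#include <linux/netlink.h>
/* NETLINK_NITRO is this poster's custom protocol type; it must be defined to match the kernel module (the article adds such a constant to netlink.h) */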
#define MAX_PAYLOAD 1024
struct sockaddr_nl src_addr, dst_addr;
struct nlmsghdr *nlh = NULL;
struct msghdr msg;
struct iovec iov;
int sock_fd;
int main()
{
sock_fd = socket(PF_NETLINK, SOCK_RAW, NETLINK_NITRO);
memset(&src_addr, 0, sizeof(src_addr));
src_addr.nl_family = AF_NETLINK;
src_addr.nl_pid = getpid();
src_addr.nl_groups = 0; // no multicast
bind(sock_fd, (struct sockaddr*)&src_addr, sizeof(src_addr));
memset(&dst_addr, 0, sizeof(dst_addr));
dst_addr.nl_family = AF_NETLINK;
dst_addr.nl_pid = 0; // 0 means kernel
dst_addr.nl_groups = 0; // no multicast
nlh = (struct nlmsghdr *)malloc(NLMSG_SPACE(MAX_PAYLOAD));
/* Fill the netlink message header */
nlh->nlmsg_len = NLMSG_SPACE(MAX_PAYLOAD);
nlh->nlmsg_pid = getpid();
nlh->nlmsg_flags = 0;
strcpy(NLMSG_DATA(nlh), "Yoo-hoo, Mr. Kernel!");
iov.iov_base = (void *)nlh;
iov.iov_len = nlh->nlmsg_len;
msg.msg_name = (void *)&dst_addr;
msg.msg_namelen = sizeof(dst_addr);
msg.msg_iov = &iov;
msg.msg_iovlen = 1;
sendmsg(sock_fd, &msg, 0);
/* Read message from kernel */
memset(nlh, 0, NLMSG_SPACE(MAX_PAYLOAD));
recvmsg(sock_fd, &msg, 0);
printf("Received message payload: %s\n", (char *)NLMSG_DATA(nlh));
close(sock_fd);
return (EXIT_SUCCESS);
}
ENOBUFS error (solved)
On December 9th, 2006 Sébastien Barré (not verified) says:
Hi,
First, I would like to thank the author a lot for this article, it was very useful indeed.
I tried the user-space code above and got an ENOBUFS error (No buffer space available). I finally discovered the reason: the 'struct msghdr msg' was not zeroed and only some fields were filled in (msg_name, msg_namelen, msg_iov, msg_iovlen), leaving, for example, the msg_controllen field undefined. The kernel checks this field and returns ENOBUFS if it is too large.
My problem was solved by adding the following line:
memset(&msg,0,sizeof(msg));
(of course, before filling the various necessary fields of the message).
I hope this will help some people in getting their code working.
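The fix described above can be sketched as a small helper that zeroes the whole struct msghdr before setting the four fields the example uses. This is only a minimal sketch, assuming a Linux system where linux/netlink.h is available; the helper name init_nl_msghdr is hypothetical and not part of any netlink API.

```c
#include <string.h>
#include <sys/socket.h>
#include <sys/uio.h>
#include <linux/netlink.h>

/* Hypothetical helper (not part of the netlink API): prepare a msghdr
 * for sendmsg()/recvmsg() on a netlink socket. */
void init_nl_msghdr(struct msghdr *msg, struct sockaddr_nl *dst,
                    struct iovec *iov)
{
    /* Zero every field first, so that msg_control, msg_controllen and
     * msg_flags do not hold leftover stack contents. */
    memset(msg, 0, sizeof(*msg));

    /* Then fill in only the fields the example needs. */
    msg->msg_name = dst;
    msg->msg_namelen = sizeof(*dst);
    msg->msg_iov = iov;
    msg->msg_iovlen = 1;
}
```

Note that a struct msghdr declared at file scope, as in the listing above, starts out zeroed already; the explicit memset matters mainly when the structure is a local variable or is reused.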
The userspace program that compiles
On July 7th, 2005 Daniel Purcell (not verified) says:
/* The Linux Journal Kernel Korner -- Working, compiling version of the kernel code */
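#include <linux/module.h>
#include <linux/moduleparam.h>
#include <linux/kernel.h>
#include <linux/init.h>
#include <linux/skbuff.h>
#include <linux/netlink.h>
#include <net/sock.h>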
MODULE_LICENSE("GPL");
MODULE_AUTHOR("Daniel Purcell");
MODULE_DESCRIPTION("Kernel Korner's working version of netlink sockets");
// Note: Debug is not implemented
static int debug = 0;
module_param(debug, int, 0);
MODULE_PARM_DESC(debug, "Debug information (default 0)");
static struct sock *nl_sk = NULL;
static void nl_data_ready (struct sock *sk, int len)
{
wake_up_interruptible(sk->sk_sleep);
}
static void netlink_test()
{
struct sk_buff *skb = NULL;
struct nlmsghdr *nlh = NULL;
int err;
u32 pid;
nl_sk = netlink_kernel_create(NETLINK_NITRO, nl_data_ready);
skb = skb_recv_datagram(nl_sk, 0, 0, &err);
nlh = (struct nlmsghdr *)skb->data;
printk(KERN_INFO "%s: received netlink message payload: %s\n", __FUNCTION__, NLMSG_DATA(nlh));
pid = nlh->nlmsg_pid;
NETLINK_CB(skb).groups = 0;
NETLINK_CB(skb).pid = 0;
NETLINK_CB(skb).dst_pid = pid;
NETLINK_CB(skb).dst_groups = 0;
netlink_unicast(nl_sk, skb, pid, MSG_DONTWAIT);
sock_release(nl_sk->sk_socket);
}
static int __init my_module_init(void)
{
printk(KERN_INFO "Initializing Netlink Socket");
netlink_test();
return 0;
}
static void __exit my_module_exit(void)
{
printk(KERN_INFO "Goodbye");
}
module_init(my_module_init);
module_exit(my_module_exit);
Problem communicating using NETLINK SOCKETS
On November 27th, 2006 Nagendra KS (not verified) says:
Hi,
I am using NETLINK sockets to communicate between user space and kernel space.
I have code in the kernel that is responsible for forwarding incoming IP packets to the IP stack. The module that I have written in the kernel blocks communication between the network driver and the IP stack. In this case the driver gives the incoming packet directly to our user-space program, which is waiting for such packets.
Once these packets arrive in user space over netlink sockets, I give them back to the kernel, where I have a netlink socket waiting for these packets.
I have a kernel thread running which waits for the packets from user space.
The piece of code that waits is given below:
skb = skb_recv_datagram(nl_sk_ip, 0, 0, &err);
This thread sleeps until it gets data from user space. Once it gets a packet from user space, its only job is to inject that packet into the IP stack for processing.
Now I ping from my machine to some other machine in the network. The ping packet goes out in the normal way. But when the response comes back, the network driver, instead of giving it to the IP stack, gives it to the user-space program, which is listening on a raw socket. This user-space program forms a netlink message and sends it to the kernel-space netlink code. This code calls the entry function of the IP stack with the received packet. The IP stack analyzes the packet and sends the response back out in the normal way.
The problem is that the whole setup works fine for around 40 ICMP packets; after that, the "sendmsg" call in user space returns with an EAGAIN (Resource temporarily unavailable) error.
Any idea why I am getting this error?
Your help in solving this would be appreciated.
Thanks,
Nagendra.
netlink socket using
On April 12th, 2005 Michael (not verified) says:
Hello,
The article is very clear and easy to understand. It describes the advantages of using netlink sockets. I suppose they might be very useful for communication between processes or threads in user-space applications. But on the kernel side, there are disadvantages such as:
1. The kernel must be recompiled, because netlink.h has to be updated.
2. Because it runs in the context of the sendmsg process, a plain ioctl is preferable simply because it is less complicated.
Any comments are very welcome,
Regards,
Michael
kernel to kernel communication
On February 25th, 2005 linuxram says:
I hear that netlink provides support for communication within two different subsystems of the kernel. Wish this article had covered that.
RP
examples
On January 24th, 2005 mike_k says:
It would be nice if you could add working or at least compilable examples.
thanks,
-M
Code sample in kernel itself
On August 29th, 2005 Samiullah Mohammed (not verified) says:
netlink is implemented as a device like /dev/netlink on 2.4.20-8.
The open, read and write functions called from userland on /dev/netlink actually map to socket calls.
The kernel-side code for netlink is under /usr/src/linux-2.4/net/netlink/netlink_dev.c
If you wish to customize it, you can change NETLINK_MAJOR to a number you like (check major.h) and compile the module separately with a makefile like:
export KERN_NAME = linux-2.4.20-8
CFLAGS = -I /usr/src/$(KERN_NAME)/include -D__KERNEL__
netlink_dev1.o: netlink_dev1.c
user space code for 2.4 kernel
On October 18th, 2006 Anonymous (not verified) says:
Hi,
I am a netlink newbie. I saw your comment about the kernel-space code on the 2.4 kernel. I was able to compile and load the kernel module as per your suggestion. Could you tell me how I can test it, as I need user-space code?
I tried the user-space code provided in the article. It gives me the following error:
mipsel-linux-gcc netlink.c
In file included from netlink.c:3:
/opt/toolchains/uclibc-crosstools_linux-2.4.25_gcc-3.3.5_uclibc-20050308-20050502/mipsel-linux-uclibc/sys-include/linux/netlink.h:22: error: parse error before "__u32"
/opt/toolchains/uclibc-crosstools_linux-2.4.25_gcc-3.3.5_uclibc-20050308-20050502/mipsel-linux-uclibc/sys-include/linux/netlink.h:28: error: parse error before "__u32"
/opt/toolchains/uclibc-crosstools_linux-2.4.25_gcc-3.3.5_uclibc-20050308-20050502/mipsel-linux-uclibc/sys-include/linux/netlink.h:30: error: parse error before "nlmsg_flags"
/opt/toolchains/uclibc-crosstools_linux-2.4.25_gcc-3.3.5_uclibc-20050308-20050502/mipsel-linux-uclibc/sys-include/linux/netlink.h:31: error: parse error before "nlmsg_seq"
/opt/toolchains/uclibc-crosstools_linux-2.4.25_gcc-3.3.5_uclibc-20050308-20050502/mipsel-linux-uclibc/sys-include/linux/netlink.h:32: error: parse error before "nlmsg_pid"
/opt/toolchains/uclibc-crosstools_linux-2.4.25_gcc-3.3.5_uclibc-20050308-20050502/mipsel-linux-uclibc/sys-include/linux/netlink.h:82: error: field `msg' has incomplete type
netlink.c: In function `main':
netlink.c:16: error: invalid application of `sizeof' to an incomplete type
netlink.c:17: error: invalid use of undefined type `struct sockaddr_nl'
netlink.c:18: error: invalid use of undefined type `struct sockaddr_nl'
netlink.c:19: error: invalid use of undefined type `struct sockaddr_nl'
netlink.c:20: error: invalid application of `sizeof' to an incomplete type
netlink.c:22: error: invalid application of `sizeof' to an incomplete type
netlink.c:23: error: invalid use of undefined type `struct sockaddr_nl'
netlink.c:24: error: invalid use of undefined type `struct sockaddr_nl'
netlink.c:25: error: invalid use of undefined type `struct sockaddr_nl'
netlink.c:27: error: invalid application of `sizeof' to an incomplete type
netlink.c:27: warning: assignment from incompatible pointer type
netlink.c:30: error: dereferencing pointer to incomplete type
netlink.c:30: error: invalid application of `sizeof' to an incomplete type
netlink.c:31: error: dereferencing pointer to incomplete type
netlink.c:32: error: dereferencing pointer to incomplete type
netlink.c:34: error: invalid application of `sizeof' to an incomplete type
netlink.c:37: error: dereferencing pointer to incomplete type
netlink.c:40: error: invalid application of `sizeof' to an incomplete type
netlink.c:49: error: invalid application of `sizeof' to an incomplete type
netlink.c:51: error: invalid application of `sizeof' to an incomplete type
netlink.c: At top level:
netlink.c:6: error: storage size of `src_addr' isn't known
netlink.c:6: error: storage size of `dst_addr' isn't known
My user-space application code is as follows:
======================================
#include <sys/socket.h>
#include <linux/netlink.h>
#define MAX_PAYLOAD 1024 /* maximum payload size*/
struct sockaddr_nl src_addr, dst_addr;
struct nlmsghdr *nlh = NULL;
struct msghdr msg;
struct iovec iov;
int sock_fd;
int main()
{
sock_fd = socket(PF_NETLINK, SOCK_RAW, NETLINK_FIREWALL);
memset(&src_addr, 0, sizeof(src_addr));
src_addr.nl_family = AF_NETLINK;
src_addr.nl_pid = getpid();
src_addr.nl_groups = 0; // no multicast
bind(sock_fd, (struct sockaddr*)&src_addr, sizeof(src_addr));
memset(&dst_addr, 0, sizeof(dst_addr));
dst_addr.nl_family = AF_NETLINK;
dst_addr.nl_pid = 0; // 0 means kernel
dst_addr.nl_groups = 0; // no multicast
nlh = (struct nlhmsghdr *)malloc(NLMSG_SPACE(MAX_PAYLOAD));
/* Fill the netlink message header */
nlh->nlmsg_len = NLMSG_SPACE(MAX_PAYLOAD);
nlh->nlmsg_pid = getpid();
nlh->nlmsg_flags = 0;
strcpy(NLMSG_DATA(nlh), "Yoo-hoo, Mr. Kernel!");
iov.iov_base = (void *)nlh;
iov.iov_len = nlh->nlmsg_len;
msg.msg_name = (void *)&dst_addr;
msg.msg_namelen = sizeof(dst_addr);
msg.msg_iov = &iov;
msg.msg_iovlen = 1;
printf("Waiting for message from kernel\n");
sendmsg(sock_fd, &msg, 0);
/* Read message from kernel */
memset(nlh, 0, NLMSG_SPACE(MAX_PAYLOAD));
recvmsg(sock_fd, &msg, 0);
printf("Received message payload: %s\n", NLMSG_DATA(nlh));
close(sock_fd);
return (0);
}
Thanks in advance
Ashwin.
Help Please
On July 20th, 2006 Eswari (not verified) says:
Hi,
I used the following command to compile the netlink_dev
cc -o netlink_dev.o netlink_dev.c -I /usr/src/linux-2.4.7-10/include -D__KERNEL__
I am getting the following error,
/usr/lib/gcc-lib/i386-redhat-linux/2.96/../../../crt1.o: In function `_start':
/usr/lib/gcc-lib/i386-redhat-linux/2.96/../../../crt1.o(.text+0x18): undefined reference to `main'
/tmp/ccYfFxIZ.o: In function `netlink_write':
/tmp/ccYfFxIZ.o(.text+0xc5): undefined reference to `sock_sendmsg'
/tmp/ccYfFxIZ.o: In function `netlink_read':
/tmp/ccYfFxIZ.o(.text+0x157): undefined reference to `sock_recvmsg'
/tmp/ccYfFxIZ.o: In function `netlink_open':
/tmp/ccYfFxIZ.o(.text+0x1e4): undefined reference to `sock_create'
/tmp/ccYfFxIZ.o(.text+0x244): undefined reference to `sock_release'
/tmp/ccYfFxIZ.o: In function `netlink_release':
/tmp/ccYfFxIZ.o(.text+0x2e7): undefined reference to `sock_release'
/tmp/ccYfFxIZ.o: In function `devfs_register_chrdev':
/tmp/ccYfFxIZ.o(.text+0x496): undefined reference to `register_chrdev'
/tmp/ccYfFxIZ.o: In function `init_netlink':
/tmp/ccYfFxIZ.o(.text.init+0x5a): undefined reference to `printk'
collect2: ld returned 1 exit status
Could you please help me, thanks
Eswari
compile error
On September 29th, 2005 liuhua (not verified) says:
I typed in all the source code from the article above on FC4 (2.6.11-1.1369_FC4-i686 kernel).
Kernel code error:
for "sk->sk_sleep" and "sock_release(nl_sk->sk_socket)":
dereferencing pointer to incomplete type
User code error:
on line "nlh->nlmsg_len=NLMSG_SPACE(MAX_PAYLOAD)":
syntax error before '=' token
What's the reason? Help me please.
Re: compile error
On May 25th, 2006 Chinmaya (not verified) says:
Include the following line at the top of the program: #include <net/sock.h>
Thanks
Chinmaya
User Space module
On December 6th, 2006 Amit Sahrawat (not verified) says:
#include <sys/stat.h>
#include <unistd.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <string.h>
#include <asm/types.h>
#include <linux/netlink.h>
#include <linux/socket.h>
#define NETLINK_TEST 17
#define MAX_PAYLOAD 1024
struct sockaddr_nl src_addr, dst_addr;
struct nlmsghdr *nlh = NULL;
struct msghdr msg;
struct iovec iov;
int sock_fd;
int main(int argc,char **argv)
{
sock_fd = socket(PF_NETLINK, SOCK_RAW, NETLINK_TEST);
memset(&dst_addr, 0, sizeof(dst_addr));
dst_addr.nl_family = AF_NETLINK;
if(argc>1) {
printf("%s :",argv[1]);
dst_addr.nl_pid = atoi(argv[1]); // 0 means kernel
} else {
dst_addr.nl_pid = 0; // no argument given: default to the kernel
}
dst_addr.nl_groups = 0; // no multicast
printf("SOCK FD :%d \n",sock_fd);
nlh = (struct nlmsghdr *)malloc(NLMSG_SPACE(MAX_PAYLOAD));
/* Fill the netlink message header */
nlh->nlmsg_len = NLMSG_SPACE(MAX_PAYLOAD);
nlh->nlmsg_pid = getpid();
nlh->nlmsg_flags = 0;
strcpy(NLMSG_DATA(nlh), "User Spaces: Message from User to Kernel!");
iov.iov_base = (void *)nlh;
iov.iov_len = nlh->nlmsg_len;
msg.msg_name = (void *)&dst_addr;
msg.msg_namelen = sizeof(dst_addr);
msg.msg_iov = &iov;
msg.msg_iovlen = 1;
sendmsg(sock_fd, &msg, 0);
memset(nlh, 0, NLMSG_SPACE(MAX_PAYLOAD));
recvmsg(sock_fd, &msg, 0);
printf("Received message payload: %s\n", NLMSG_DATA(nlh));
close(sock_fd);
return (1);
}
Save as netwriter.c,
Compile using
gcc netwriter.c -o netwriter
To run it, give the destination process ID as the argument; use '0' for the kernel.
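The message-building step in the listing above could also be sketched as a helper that sizes the buffer to the actual payload and checks the allocation, instead of always allocating MAX_PAYLOAD bytes and using strcpy. This is a sketch under assumptions, not the poster's code: the function name build_nl_text_msg is hypothetical, and setting nlmsg_len with NLMSG_LENGTH (header plus payload, without trailing padding) is one reasonable choice rather than the only one.

```c
#include <stdlib.h>
#include <string.h>
#include <sys/socket.h>
#include <linux/netlink.h>

/* Hypothetical helper: allocate a netlink message carrying `text` as a
 * NUL-terminated payload. Returns NULL if the allocation fails.
 * The caller releases the result with free(). */
struct nlmsghdr *build_nl_text_msg(const char *text, unsigned int sender_pid)
{
    size_t payload_len = strlen(text) + 1;   /* include the trailing NUL */

    /* NLMSG_SPACE() is the aligned size of header plus payload;
     * calloc() also zeroes the fields that are not set below. */
    struct nlmsghdr *nlh = calloc(1, NLMSG_SPACE(payload_len));
    if (nlh == NULL)
        return NULL;

    nlh->nlmsg_len = NLMSG_LENGTH(payload_len);  /* header + payload */
    nlh->nlmsg_pid = sender_pid;                 /* sender's port ID */
    nlh->nlmsg_flags = 0;
    memcpy(NLMSG_DATA(nlh), text, payload_len);
    return nlh;
}
```

A caller would point iov.iov_base at the returned header, set iov.iov_len to nlh->nlmsg_len, and pass the iovec to sendmsg() as in the listing.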