Linux epoll Tutorial
epoll
epoll is an I/O event notification mechanism widely being used in modern web servers like Nginx. epoll stands for event poll and it is a Linux specific construct.
Epoll refers to a kernel data structure. It maintains a list of file descriptors, which allows a process to get notifications whenever an I/O event is possible on any of these file descriptors. epoll_create()
, epoll_ctl()
and epoll_wait()
are the three main system calls that deal with the implementation of epoll. All these system calls are included in the <sys/epoll.h>
header file.
epoll_create()
epoll_create()
is the system call used to create an epoll instance. An epoll instance is identified by a file descriptor in linux. The epoll_create()
returns a file descriptor that identify the newly created kernel data structure.
#include <sys/epoll.h>
int epoll_create(int size);
Arguments
size - This is the only argument for epoll_create(
). It is of integer type. It refers to the number of file descriptors that the process wants to monitor. The kernel decides the size required for the epoll instance(data structure) according to the value in size.
Return value
epoll_create()
returns an integer which refers to the file descriptor of the created epoll instance.
epoll_create1()
This system call is also used to create an epoll instance similar to epoll_create()
, but the arguments differ.
int epoll_create1(int flags)
Here the flags can be either 0
or EPOLL_CLOEXEC
if the flag is set to zero, this function behaves same as epoll_create()
when the flag is set to EPOLL_CLOEXEC
, whenever a child process is forked from the current process it closes the epoll file descriptor in the child before starting its execution. So only the current process will have access to the epoll instance.
Since epoll instance is treated as a file descriptor, we use the close()
system call in <unistd.h>
header file to close an epoll instance.
epoll_ctl()
An epoll instance can maintain a list of file descriptors that has to be monitored. All file descriptors registered with an epoll instance is collectively referred to as epoll list or interest list. Whenever any of the file descriptors become ready for I/O operations, they are added to the ready list. Ready list is a subset of interest list. epoll_ctl()
system call is used to add, modify or delete file descriptors to be monitored in the interest list of epoll.
#include<sys/epoll.h>
int epoll_ctl(int epfd, int op, int fd, struct epoll_event *event);
Arguments
epfd - It is of integer type. It is the file descriptor of the epoll instance in the kernel.
fd - It is of integer type. It refers to the target file descriptor that has to be added or modified or deleted from the epoll instance referred to by the epfd.
op - It is of integer type. It refers to the operation to be performed on the target file descriptor(fd) in the epoll instance(epfd). Valid values that op can take include
EPOLL_CTL_ADD
,EPOLL_CTL_MOD
, andEPOLL_CTL_DEL
.EPOLL_CTL_ADD
- Add a file decriptor (fd) to the interest list of epoll instance.EPOLL_CTL_MOD
- Change the setting associated with fd which is already in the interest list of epfd to new settings specified in the event argument.EPOLL_CTL_DEL
- Removes the target file descriptor(fd) from the interest list of epfd. If the fd has been added to multiple epoll instances, then closing it from will remove it from all the epoll interest lists to which it was a part of.event - It is of epoll_event type.
epoll_event
structure is included in the<sys/epoll.h>
header. The structure ofepoll_event
is shown below.cstruct epoll_event { uint32_t events;/* Epoll events */ epoll_data_t data;/* User data variable */ } union epoll_data { void *ptr; int fd; uint32_t u32; uint64_t u64; }; typedef union epoll_data epoll_data_t;
The first field of
epoll_event
structure refers to the events that the fd is being monitored for. The major available event types include,EPOLLIN
- used for read operationsEPOLLOUT
- used for write opeartionsEPOLLLET
- requests edge triggered notifications for the fdThe second field of
epoll_event
structure specifies the data that kernel should save and return when the file descriptor becomes ready.
Return Value - when successful, epol_ctl returns zero. When an error occurs, it returns -1 and errno is set to indicate the error.
epoll_wait()
This system call is used to notify the process, when some event has occurred on the interest list of the epoll instance. It blocks the process until any of the descriptors being monitored becomes ready for I/O events.
#include <sys/epoll.h>
int epoll_wait(int epfd, struct epoll_event *events, int maxevents, int timeout);
Arguments
- epfd - It is of type integer. It is the file descriptor that refers to the epoll instance.
- events - It is an array of
epoll_event
structures. It is allocated by the calling process. The values in this array is filled whenepoll_wait
returns. It will have the list of file descriptors that are in ready state. - maxevents - It is of integer type. It refers to the length of events array.
- timeout - It is of integer type. It refers to the time in milliseconds for which
epoll_wait()
will block. If it is set to 0,epoll_wait()
will not block and if it is set to -1, it will block forever.
Return Value
The return value is of an integer type.
If one or more file descriptors in the interest list becomes ready, then returns a positive integer that denotes the number of file descriptors in the events array.
If timeout occurred before any file descriptor becomes ready, then returns 0.
If an error has occurred, then returns -1.