Hey,
I’ve wanted to cover some Linux networking basics for a while, and setting up a straightforward TCP server seemed like a good exercise. Writing a guide along the way forces me to go over details that I’d usually skip.
If you’re curious about (or just want a tutorial on) how to set up a TCP server using Linux and C, this is for you.
- The overview
- The socket
- Creating a Socket
- (extra) Writing to a socket in CLOSED state
- Binding the socket to an address
- Making the socket listen for connections
- (extra) Messing up with the backlog
- Accepting connections
- Closing thoughts
- Resources
The overview
At the very high level, there are two actors: a client and a server.
The server has a known IP address that represents a machine to connect to at some point (this address could be retrieved via DNS) and a known port (that represents the service that we want to talk to on that machine).
At first, no data can be exchanged - TCP bases its communication on connections, and without a handshake that establishes one, nothing can be sent.
So, connection establishment is what happens first.
Once the connection gets established, it’s then up to the application to decide whether there’s indeed a server component and a client component. They could be peers and synchronize data back and forth, for instance.
For an application protocol like HTTP/1.1 though, there’s a well-defined client-server model, where a client issues requests with specific methods, and a server processes these requests and gives back results.
Being at the application-level, HTTP simply assumes that there’s a way for the server to send data back to the client and vice-versa. If there’s a channel in which messages can be sent, all good for HTTP.
In case you’re not familiar with these terms or the way the networking layers fit together, make sure you get a copy of Computer Networking: A top-down approach.
It goes “layer-by-layer”, allowing the reader to build the understanding of networking concepts from a high-level to as low level as it gets.
The socket
The interface that allows the interaction between the application layer (e.g., HTTP/1.1) and the transport layer (e.g., TCP) is the socket.
An analogy that I remember from college is the one of a series of houses and their doors:
“Considering that houses are applications in machines, then sockets are their doors: to get a message from one to another, it needs to cross a door of the first house and a door of the second.”
In C, we can access such a socket interface via a file descriptor that is returned by the socket(2) syscall.
As this interface is required for any communication to happen (it’s the abstraction presented to us by our TCP/IP implementation under the hood), in my example I started by creating a struct that keeps track of it:
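#include <netinet/in.h>
#include <stdio.h>
#include <sys/socket.h>
#include <unistd.h>

// PORT matches the port inspected later in this article (8080).
// BACKLOG is an assumed value, pick whatever suits your tests.
#define PORT 8080
#define BACKLOG 128

/**
 * Encapsulates the properties of the server.
 */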
typedef struct server {
    // file descriptor of the socket in passive
    // mode to wait for connections.
    int listen_fd;
} server_t;
This way we can pass the server_t struct around to the functions that depend on that socket.
Our main routine can then be declared:
/**
* Creates a socket for the server and makes it passive such that
* we can wait for connections on it later.
*/
int server_listen(server_t* server);
/**
* Accepts new connections and then prints `Hello World` to
* them.
*/
int server_accept(server_t* server);
/**
* Main server routine.
*
* - instantiates a new server structure that holds the
* properties of our server;
* - creates a socket and makes it passive with
* `server_listen`;
* - accepts new connections on the server socket.
*
*/
int
main()
{
    int err = 0;
    server_t server = { 0 };

    err = server_listen(&server);
    if (err) {
        printf("Failed to listen on address 0.0.0.0:%d\n",
               PORT);
        return err;
    }

    for (;;) {
        err = server_accept(&server);
        if (err) {
            printf("Failed accepting connection\n");
            return err;
        }
    }

    return 0;
}
Let’s implement each of those functions.
Creating a Socket
To have the socket created, the first thing we do is call the socket(2) syscall, specifying the type of communication protocol to be used (TCP, in this case) and the domain in which we’re using it (IPv4).
Note: the domain is relevant because we could be using, e.g., Unix sockets to communicate, which are not internet / network specific.
Looking at socket(2) man page:
SOCKET(2) Linux Programmer's Manual SOCKET(2)
NAME
socket - create an endpoint for communication
SYNOPSIS
int socket(int domain, int type, int protocol);
DESCRIPTION
socket() creates an endpoint for communication and
returns a file descriptor that refers to that endpoint.
The file descriptor returned by a successful call will be
the lowest-numbered file descriptor not currently open
for the process.
We can see that if we succeed, we end up with a file descriptor that we can reference later. This file descriptor is what we store under server->listen_fd:
// The `socket(2)` syscall creates an endpoint for communication
// and returns a file descriptor that refers to that endpoint.
//
// It takes three arguments (the last being just to provide greater
// specificity):
//   - domain (communication domain)
//       AF_INET      IPv4 Internet protocols
//
//   - type (communication semantics)
//       SOCK_STREAM  Provides sequenced, reliable,
//                    two-way, connection-based byte
//                    streams.
err = (server->listen_fd = socket(AF_INET, SOCK_STREAM, 0));
if (err == -1) {
    perror("socket");
    printf("Failed to create socket endpoint\n");
    return err;
}
To verify that we really end up with a file descriptor right after the socket(2) call, check out lsof (which lists open files):
# Capture the PID of the server process that only
# calls `socket(2)` and then `pause(2)` to wait for
# a signal indefinitely.
SERVER_PROC=$(pgrep server.out)
# List the open files of this process but filter
# out those that do not contain `sock` in the
# line.
lsof | ag $SERVER_PROC | ag sock
COMMAND PID TYPE NODE NAME
server.out 8824 sock 34368 protocol: TCP
# Check out what file descriptors are assigned
# to our process by inspecting the `proc` virtual
# filesystem.
ls -lah /proc/$SERVER_PROC/fd
.
..
0 -> /dev/pts/0
1 -> /dev/pts/0
2 -> /dev/pts/0
3 -> socket:[34368]
While at university I read this great book by W. Richard Stevens. As it’s in its title - it’s all about the sockets networking API in Unix systems.
At this point we have a socket that is neither connected nor able to accept connections, but what if we try to write to this socket?
(extra) Writing to a socket in CLOSED state
Naturally, things break:
err = (server->listen_fd = socket(AF_INET, SOCK_STREAM, 0));
// ...
err = write(server->listen_fd, "hey", 3);
if (err == -1) {
    perror("write");
    printf("failed to write\n");
    return err;
}
Compile it and run:
$ ./server.out
$
Nothing shows! That’s because writing to a socket in the CLOSED state fails with EPIPE and raises SIGPIPE, which has the default action of killing the process before our error handling runs (although you can catch the signal and do whatever you want):
strace ./server.out
socket(AF_INET, SOCK_STREAM, IPPROTO_IP) = 3
write(3, "hey", 3) = -1 EPIPE (Broken pipe)
--- SIGPIPE {si_signo=SIGPIPE, si_code=SI_USER, si_pid=11008, si_uid=1001} ---
+++ killed by SIGPIPE +++
Looking at the man pages:
man 7 signal
SIGPIPE 13 Broken pipe: write to pipe with no
readers.
man 2 write
EPIPE fd is connected to a pipe or socket whose
reading end is closed. When this happens the
writing process will also receive a SIGPIPE signal.
(Thus, the write return value is seen only if
the program catches, blocks or ignores this signal.)
Very neat.
Binding the socket to an address
Although we have a socket, this socket isn’t bound to an address yet. This is where another syscall comes in: bind(2).
bind(2) takes a socket and a well-defined structure that lets you tell it about the address you want to bind the socket to.
For an IPv4 application like the one we’re creating here, we make use of sockaddr_in, which allows you to specify a 32-bit address (the IPv4 address) and a 16-bit number (the port).
struct sockaddr_in server_addr = { 0 };
// `sockaddr_in` provides ways of representing a full address
// composed of an IP address and a port.
//
// SIN_FAMILY address family AF_INET refers to the address
// family related to internet
// addresses
//
// S_ADDR address (ip) in network byte order (big endian)
// SIN_PORT port in network byte order (big endian)
server_addr.sin_family = AF_INET;
// INADDR_ANY is a special constant that signals "ANY IFACE",
// i.e., 0.0.0.0.
server_addr.sin_addr.s_addr = htonl(INADDR_ANY);
server_addr.sin_port = htons(PORT);
With the sockaddr_in structure set, it’s time to call bind:
// bind() assigns the address specified to the socket referred
// to by the file descriptor (`listen_fd`).
//
// Here we cast `sockaddr_in` to `sockaddr` and specify the
// length such that `bind` can pick the values from the
// right offsets when interpreting the structure pointed to.
err = bind(server->listen_fd,
           (struct sockaddr*)&server_addr,
           sizeof(server_addr));
if (err == -1) {
    perror("bind");
    printf("Failed to bind socket to address\n");
    return err;
}
Once bind is called, our socket is attached to a specific address, but we still can’t see it in ss or any other program like that, and that’s because our socket is still in the CLOSED state (no connection state at all).
Making the socket listen for connections
Since the socket is created for active connections by default (acting as a client), we must make it passive using listen(2) if we want to accept connections:
LISTEN(2) Linux Programmer's Manual LISTEN(2)
NAME
listen - listen for connections on a socket
SYNOPSIS
int listen(int sockfd, int backlog);
DESCRIPTION
listen() marks the socket referred to by sockfd as
a passive socket, that is, as a socket that will be
used to accept incoming connection requests using
accept(2).
With listen(2) we then mark the socket with SO_ACCEPTCON to listen for connections and specify a backlog size - the maximum number of pending connections that can be enqueued for that socket.
// listen() marks the socket referred to by sockfd as a passive socket,
// that is, as a socket that will be used to accept incoming connection
// requests using accept(2).
err = listen(server->listen_fd, BACKLOG);
if (err == -1) {
    perror("listen");
    printf("Failed to put socket in passive mode\n");
    return err;
}
Calling this syscall also has the effect of moving our socket to a different state.
Instead of CLOSED, the socket is now put into the LISTEN state, which RFC 793 (TCP) describes as "waiting for a connection request from any remote TCP and port":
# Check that `lsof` identifies our socket listening on
# port 8080
lsof -i:8080
COMMAND TYPE DEVICE NAME
server.out IPv4 38593 TCP *:http-alt (LISTEN)
# Check that `ss` properly recognizes our socket as `listen`
ss \
--listening \
--numeric \
--tcp | ag 8080
LISTEN 0 128 *:8080 *:*
We can also look at /proc/net/tcp to see exactly what the commands above are seeing (i.e., going under the hood):
# Get the inode of our socket.
#
# Given that `socket(2)` creates a file descriptor
# that is the next fd in the sequence (0, 1, and 2
# are already taken), we know that `3` references
# our socket.
SERVER_SOCKET=$(stat \
  --dereference \
  --printf %i \
  /proc/$SERVER_PROC/fd/3)
# Check the state of our socket inode in the
# /proc/net/tcp table:
cat /proc/net/tcp | ag $SERVER_SOCKET
sl local_address rem_address st
1: 00000000:1F90 00000000:0000 0A ...
st defines the state of the socket, so we can find out what 0A (10 in decimal) means by looking at /usr/src/linux-headers-<LINUX_VER>/include/net/tcp_states.h:
enum {
TCP_ESTABLISHED = 1,
TCP_SYN_SENT, // 02
TCP_SYN_RECV, // 03
TCP_FIN_WAIT1, // ...
TCP_FIN_WAIT2,
TCP_TIME_WAIT,
TCP_CLOSE,
TCP_CLOSE_WAIT, // ...
TCP_LAST_ACK, // 09
TCP_LISTEN, // 0A <<<<<<<<<
TCP_CLOSING,
TCP_NEW_SYN_RECV,
TCP_MAX_STATES
};
As we wanted, it’s in the TCP_LISTEN state.
To know more about /proc and other Linux interfaces, make sure you check the book The Linux Programming Interface.
(extra) Messing up with the backlog
At this point we’re able to listen for incoming connections, but we don’t accept them. What would happen if people started trying to connect to our server?
One easy way of simulating backlog exhaustion is to bring our server up and then start creating connections to it.
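With a call to pause(2) right after listen(2), the server never accepts anything, so connections get enqueued without ever being taken off the queue. If we create a bunch of telnet clients that connect to our server, we should see them stall after roughly N connections, where N is the size of the backlog (in the run below, with a backlog of 4, five connections got through before the rest were left waiting).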
# In one terminal, start the server.
# ps.: I modified the code to have the
# backlog set to 4.
./server.out
# In another terminal, start a handful of
# connections using telnet.
for i in $(seq 1 10); \
do sleep 1; \
telnet localhost 8080 & \
done
[1] 11696 # <<<<< worked
Trying 127.0.0.1...
Connected to localhost.
Escape character is '^]'.
telnet localhost 8080
[2] 11698 # <<<<< worked
Trying 127.0.0.1...
Connected to localhost.
Escape character is '^]'.
telnet localhost 8080
[3] 11700 # <<<<< worked
Trying 127.0.0.1...
Connected to localhost.
Escape character is '^]'.
telnet localhost 8080
[4] 11702 # <<<<< worked
Trying 127.0.0.1...
Connected to localhost.
Escape character is '^]'.
telnet localhost 8080
[5] 11704 # <<<<< worked
Trying 127.0.0.1...
Connected to localhost.
Escape character is '^]'.
# From now on, the backlog got to its maximum,
# so it won't accept more connections.
telnet localhost 8080
[6] 11707
Trying 127.0.0.1...
[7] 11709
Trying 127.0.0.1...
[8] 11711
Trying 127.0.0.1...
[9] 11713
Trying 127.0.0.1...
[10] 11715
Trying 127.0.0.1...
What that means is that we can directly affect the rate at which connections get processed by our server without even accepting them in the first place.
Accepting connections
Once a connection lands on a server that has a passive socket for that destination address, it’s up to the application to decide whether to do something with those connections or not.
As the connections are being established, they float around two queues that are limited by the backlog value:
- an incomplete connection queue: keeps track of connections that just arrived and didn’t finish the three-way handshake;
- a completed connection queue: keeps track of connections that successfully finished the three-way handshake and can be utilized by the server.
What accept(2) ends up doing is looking at the completed connection queue, which is first in, first out (FIFO), and popping from it, giving the application a file descriptor that represents that connection (so that it can send data, for instance). If the queue is empty, it just blocks.
ACCEPT(2) Linux Programmer's Manual ACCEPT(2)
NAME
accept, accept4 - accept a connection on a socket
SYNOPSIS
int accept(int sockfd,
struct sockaddr *addr,
socklen_t *addrlen);
DESCRIPTION
The accept() system call is used with connection-based
socket types (SOCK_STREAM, SOCK_SEQPACKET).
It extracts the first connection request on the queue of pending
connections for the listening socket, sockfd, creates a new
connected socket, and returns a new file descriptor referring to
that socket.
The newly created socket is not in the listening state.
The original socket sockfd is unaffected by this call.
The argument sockfd is a socket that has been created with socket(2),
bound to a local address with bind(2), and is listening for connections
after a listen(2).
With that said, the code:
int
server_accept(server_t* server)
{
    int err = 0;
    int conn_fd;
    socklen_t client_len;
    struct sockaddr_in client_addr;

    client_len = sizeof(client_addr);

    err =
      (conn_fd = accept(
         server->listen_fd, (struct sockaddr*)&client_addr, &client_len));
    if (err == -1) {
        perror("accept");
        printf("failed accepting connection\n");
        return err;
    }

    printf("Client connected!\n");

    err = close(conn_fd);
    if (err == -1) {
        perror("close");
        printf("failed to close connection\n");
        return err;
    }

    return err;
}
And that’s it!
Now our server can adequately receive connections and act upon them if we want.
Closing thoughts
I found that going through this rather simple example was great - I took the time to review some concepts and make the whole flow even more concrete in my mind.
If you’re looking for the implementation, make sure you check github.com/cirocosta/hstatic.
I plan to make this a simple HTTP server that serves static files, but that’ll come next.
Just in case I got something wrong or you just want to chat about something related, get in touch with me on Twitter! I’m @cirowrc there.
Have a good one!
finis
Resources
Throughout the article, some books were mentioned.
Here’s the list of them: