Hey,

This article contains the contents of a talk that I delivered at the office (Pivotal - Toronto!).

The event was an internal thing, a session of three lightning talks at the end of the day.

If you’re even more curious, go check out the series of articles I’m creating around /proc! You can find all of them in A month of Proc.

Also, Pivotal is hiring! Check out the jobs here: Job openings at Pivotal. In case you’d like to know more, let me know! I’m @cirowrc on Twitter.

Now, onto the transcript and slides!


Title of the presentation

In this very short talk, I want to cover a filesystem that I’m 100% sure everyone here has already used, even if just indirectly.

I think that because everyone, at some point, has needed to answer the following two questions:

  1. how much RAM is my process consuming? and
  2. how much RAM does my whole system have available?

The question of how much RAM a given thing consumes

Naturally, to answer those, you probably used either ps, or something like top.

Annotated screenshot of top showing the system memory

But, what does /proc have to do with this?

To answer that, we need to go back to operating system stuff, and remember that the OS takes on two roles:

  • first, managing resources, like scheduling tasks to run on a limited set of CPUs; and

The OS doing the work of scheduling tasks to run on the finite set of CPUs that it manages

  • second, providing abstractions for users to consume the resources that the OS manages.

Illustration of a user process interacting with a nice interface while the OS does the heavy work of interacting with the hardware

So, if we think about where software is running in a machine at any given point in time, we can see that it can be running in either one of two possible spaces:

  • userspace, where code is sandboxed inside this thing called a process, unable to touch the hardware directly; and

  • kernelspace, where it has complete access to pretty much all of the hardware, being able to essentially execute whatever instructions the machine is capable of executing.

The two possible spaces where software can run on a system - userspace and kernelspace

Given that a user program can’t just ask the hardware directly for information about it, it first needs to ask the OS, which can talk directly to the hardware and then pass that information back to the user program.

The user space code talking with the kernel in order to retrieve hardware information

But, how does a process talk with the Kernel in the first place?

How can it ask that question?

Illustration of someone questioning what is that interface between user code and kernel code

The answer is syscalls - a set of well defined interfaces that allow a user to request a specific service to be completed by the kernel.


A zoomed-in view of the interface between user code and the operating system, revealing some system calls in such boundary


So, let’s say we want to count how many bytes a file has:

A terminal showing example code where it counts the bytes that a file has by making use of two system calls - open and read

To do that, we use two system calls:

  • one for opening a file, and
  • another for reading it.
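The slide's code is not reproduced here, so the following is only a minimal sketch of the same idea. It assumes Python's os.open and os.read, which are thin wrappers around the corresponding calls; the name count_bytes is made up for this example.

```python
import os


def count_bytes(path: str, chunk_size: int = 4096) -> int:
    """Count the bytes in a file using the low-level open and read calls."""
    fd = os.open(path, os.O_RDONLY)  # ask the kernel to open the file
    try:
        total = 0
        while True:
            chunk = os.read(fd, chunk_size)  # ask the kernel for the next bytes
            if not chunk:  # an empty result means end of file
                return total
            total += len(chunk)
    finally:
        os.close(fd)  # give the file descriptor back to the kernel
```

Calling count_bytes on any readable path returns its size in bytes, and nothing in the code knows which filesystem the file lives on.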

For us, users, that’s awesome!

We don’t need to care about the semantics of the filesystem where the file lives - all we want to do is read a file, which might live on disk, tape, network or RAM!

It doesn’t matter for the consumer of such a contract - it’s up to the kernel to figure out which filesystem owns the file, and to delegate the call to that filesystem’s implementation.

Illustration of what calling the read system calls ends up triggering within the kernel, showing that in a call to a file under ext4, the kernel delegates the read operation to the ext4 implementation of read.

So, at this point, you might imagine that there’s a syscall like get_my_tcp_stats(), but that’s not the case.


Representation of the non-existence of a system call that shows TCP statistics


Whenever a system call gets added, it becomes part of the kernel API, which has to be supported indefinitely. That involves a lot of work to document it, test it, and essentially maintain it forever!

So, for that reason, there are relatively few of them, a few hundred in total (check the syscall.h header for the whole list).

Knowing that, this is where /proc fits in.

Given that Linux abstracts the idea of reading a file, why not change the details underneath and swap the concept of “reading from disk” for simply “writing back specific information”?


For instance, consider the path that a read to an EXT4 filesystem takes:

A terminal illustrating the result from tracing a file read that lives under ext4 down to the block IO methods that the disk driver handles.


Once the read syscall is handled, its arguments are passed down to an abstract interface, the virtual file system, which is then responsible for passing the actual operation down to whoever owns the file - in this case, EXT4.

So, the kernel already provides an interface that is common to all filesystems, and each filesystem implements adaptors to adhere to it. Given that, why not just implement that interface and let the user get the data from a pseudo filesystem?



That’s what procfs does! It implements the virtual filesystem interface, providing files that represent operations to retrieve kernel info.
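To make the idea concrete, here is a toy model of that dispatch. It is a sketch only and nothing like the kernel's real data structures: ToyVFS, toy_disk_read, toy_procfs_read and the open_sockets entry are all hypothetical names invented for this illustration.

```python
from typing import Callable, Dict


class ToyVFS:
    """Hypothetical stand-in for the virtual filesystem layer."""

    def __init__(self) -> None:
        self._mounts: Dict[str, Callable[[str], bytes]] = {}

    def mount(self, prefix: str, read_impl: Callable[[str], bytes]) -> None:
        """Register the read implementation that owns paths under prefix."""
        self._mounts[prefix] = read_impl

    def read(self, path: str) -> bytes:
        """Delegate the read to the most specific mount that owns the path."""
        owners = [p for p in self._mounts
                  if path == p or path.startswith(p.rstrip("/") + "/")]
        if not owners:
            raise FileNotFoundError(path)
        return self._mounts[max(owners, key=len)](path)


# A "disk" filesystem: the contents are stored somewhere and handed back.
toy_disk = {"/home/notes.txt": b"hello from disk\n"}


def toy_disk_read(path: str) -> bytes:
    return toy_disk[path]


# A "procfs-like" filesystem: nothing is stored, the contents are
# generated from in-memory state at the moment of the read.
toy_kernel_state = {"open_sockets": 3}  # made-up value for illustration


def toy_procfs_read(path: str) -> bytes:
    name = path[len("/proc/"):]
    return f"{toy_kernel_state[name]}\n".encode()


vfs = ToyVFS()
vfs.mount("/", toy_disk_read)
vfs.mount("/proc", toy_procfs_read)

print(vfs.read("/home/notes.txt"))
print(vfs.read("/proc/open_sockets"))
```

The caller uses the same read for both paths; only the implementation behind the mount point differs, which is the trick that the pseudo filesystem relies on.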


So, what kind of things are exposed? Well, a bunch!
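As one concrete case, the two RAM questions from the start of the talk can be answered by reading /proc/meminfo and /proc/self/status. The sketch below assumes a Linux machine with a kernel recent enough to report MemAvailable; parse_kb_fields is a name made up for this example.

```python
import os


def parse_kb_fields(text: str) -> dict:
    """Parse 'Key:   123 kB' lines into {key: value_in_kB}; skip other lines."""
    values = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        fields = rest.split()
        if sep and len(fields) == 2 and fields[1] == "kB" and fields[0].isdigit():
            values[key.strip()] = int(fields[0])
    return values


def read_text(path: str) -> str:
    with open(path) as f:
        return f.read()


# Only run the demo where procfs is actually mounted.
if os.path.exists("/proc/meminfo") and os.path.exists("/proc/self/status"):
    system = parse_kb_fields(read_text("/proc/meminfo"))
    process = parse_kb_fields(read_text("/proc/self/status"))
    print("system memory available (kB):", system.get("MemAvailable"))
    print("this process resident set (kB):", process.get("VmRSS"))
```

Both files are plain text, so the regular open and read calls are all that is needed to get the numbers.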

To get an idea of what some of those are, we can look at some of the files that are exposed for networking stuff.
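For instance, /proc/net/snmp carries the TCP counters that a get_my_tcp_stats() syscall would have returned. The sketch below assumes the file's usual layout, where a line of counter names is followed by a line of values with the same prefix; parse_proc_net_snmp is a hypothetical helper name.

```python
import os


def parse_proc_net_snmp(text: str) -> dict:
    """Turn pairs of 'Proto: names...' / 'Proto: values...' lines into dicts."""
    stats = {}
    lines = text.splitlines()
    for header, values in zip(lines[0::2], lines[1::2]):
        proto, _, names = header.partition(":")
        value_proto, _, numbers = values.partition(":")
        if proto != value_proto:
            continue  # not a matching header/value pair, skip it
        try:
            parsed = [int(n) for n in numbers.split()]
        except ValueError:
            continue  # unexpected non-numeric field, skip the pair
        stats[proto] = dict(zip(names.split(), parsed))
    return stats


# Only run the demo where the file actually exists.
if os.path.exists("/proc/net/snmp"):
    with open("/proc/net/snmp") as f:
        tcp = parse_proc_net_snmp(f.read()).get("Tcp", {})
    print("currently established TCP connections:", tcp.get("CurrEstab"))
```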


And not only that! procfs supports not only reads, but also writes!

If you’ve needed to use sysctl before to change things like the total number of open files that your system can handle, or a particular behavior of TCP, that’s touching /proc!
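A sysctl name such as fs.file-max corresponds to a file under /proc/sys, with the dots turned into slashes. The sketch below relies on that simple mapping, which does not cover names whose components themselves contain dots; sysctl_path, read_sysctl and write_sysctl are names made up for this example, and writing normally requires root.

```python
import os


def sysctl_path(name: str, root: str = "/proc/sys") -> str:
    """Map a dotted sysctl name to its file under /proc/sys."""
    return root + "/" + name.replace(".", "/")


def read_sysctl(name: str) -> str:
    """Read the current value of a sysctl by reading its /proc/sys file."""
    with open(sysctl_path(name)) as f:
        return f.read().strip()


def write_sysctl(name: str, value: str) -> None:
    """Change a sysctl by writing to its /proc/sys file (usually needs root)."""
    with open(sysctl_path(name), "w") as f:
        f.write(value)


# Only run the demo where the file actually exists.
if os.path.exists(sysctl_path("fs.file-max")):
    print("fs.file-max =", read_sysctl("fs.file-max"))
```

write_sysctl is deliberately not called here, since it would change a system-wide setting.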


Resources