Using FUSE without root on Linux

Created On: August 05 2022

I recently wrote a FUSE driver for Linux. My FUSE driver had to work in an environment where root access was not permitted and SUID binaries were also not permitted. Overcoming this constraint was tricky and this blog post outlines how I did it.

How does FUSE work?

FUSE on Linux requires the driver to perform a very specific sequence of steps in order to create and mount the filesystem. First, the driver needs to open /dev/fuse for reading and writing. Opening the device returns a file descriptor which will be used to communicate with the kernel.

Next, the driver needs to initiate a mount(2) system call for a fuse filesystem and provide the desired mountpoint. The driver also needs to provide an fd mount option which is set to the file descriptor from above.
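To make the two steps concrete, here is a minimal sketch of them in Python using ctypes. The function names and the filesystem name "examplefs" are made up for illustration, and I am assuming glibc's mount wrapper is reachable through ctypes.CDLL(None). Besides fd, the kernel also expects rootmode (in octal), user_id and group_id options. Run as a regular user outside a user namespace, the mount call is expected to fail with EPERM.

```python
import ctypes
import os

# Mount flag values from <sys/mount.h>.
MS_NOSUID = 0x2
MS_NODEV = 0x4


def fuse_mount_options(fd, uid, gid, rootmode=0o040000):
    """Build the option string passed as the data argument of mount(2).

    rootmode is the file type of the filesystem root, written in octal;
    0o040000 is S_IFDIR, i.e. the root is a directory.
    """
    return f"fd={fd},rootmode={rootmode:o},user_id={uid},group_id={gid}"


def mount_fuse(mountpoint, name="examplefs"):
    """Open /dev/fuse and mount a FUSE filesystem at mountpoint.

    Returns the file descriptor used to talk to the kernel.
    Needs CAP_SYS_ADMIN in the user namespace that owns the mount namespace.
    """
    fd = os.open("/dev/fuse", os.O_RDWR)
    libc = ctypes.CDLL(None, use_errno=True)
    libc.mount.argtypes = [
        ctypes.c_char_p,  # source (an arbitrary name for FUSE)
        ctypes.c_char_p,  # target mountpoint
        ctypes.c_char_p,  # filesystem type
        ctypes.c_ulong,   # mount flags
        ctypes.c_char_p,  # filesystem-specific options
    ]
    opts = fuse_mount_options(fd, os.getuid(), os.getgid())
    rc = libc.mount(name.encode(), os.fsencode(mountpoint), b"fuse",
                    MS_NOSUID | MS_NODEV, opts.encode())
    if rc != 0:
        err = ctypes.get_errno()
        os.close(fd)
        raise OSError(err, os.strerror(err))
    return fd


if __name__ == "__main__":
    print(fuse_mount_options(3, os.getuid(), os.getgid()))
```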

After the above, the driver can use the file descriptor to communicate with the kernel. The kernel implements an RPC protocol over the file descriptor which tells the driver which files are opened, closed, etc.
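Each request the driver reads from the file descriptor starts with a fixed 40 byte header (struct fuse_in_header in the kernel's fuse.h). The following is a rough sketch of decoding that header. The helper name and the small opcode table are my own, and only a handful of opcodes are listed.

```python
import struct

# struct fuse_in_header: len, opcode, unique, nodeid, uid, gid, pid,
# followed by 4 bytes that are padding (or extension length on newer kernels).
FUSE_IN_HEADER = struct.Struct("=IIQQIII4x")

# A few opcode numbers from the kernel's fuse.h; not a complete list.
OPCODE_NAMES = {
    1: "FUSE_LOOKUP",
    3: "FUSE_GETATTR",
    14: "FUSE_OPEN",
    15: "FUSE_READ",
    18: "FUSE_RELEASE",
    26: "FUSE_INIT",
}


def parse_request_header(buf):
    """Decode the header at the start of one request read from /dev/fuse."""
    length, opcode, unique, nodeid, uid, gid, pid = FUSE_IN_HEADER.unpack_from(buf)
    return {
        "len": length,        # total size of the request, header included
        "opcode": opcode,     # which operation the kernel is asking for
        "name": OPCODE_NAMES.get(opcode, f"UNKNOWN({opcode})"),
        "unique": unique,     # request id, echoed back in the reply
        "nodeid": nodeid,     # inode the operation applies to
        "uid": uid,
        "gid": gid,
        "pid": pid,
    }
```

A driver would call os.read on the file descriptor in a loop and pass each buffer to a function like this. The first request after mounting is FUSE_INIT, which negotiates the protocol version.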

The problem with this sequence is that issuing a mount(2) system call requires the caller to have CAP_SYS_ADMIN, which a regular user does not have. Without it, it’s not possible for a FUSE driver to actually make the filesystem available.

Typical Approach

The typical way to solve this is to implement a FUSE driver using libfuse. This library allows a user to implement each file system operation as a C callback. The user implements these functions and passes them to a libfuse main function which does the mounting described above. It overcomes the privilege issue above by being bundled with a SUID helper binary called fusermount.

libfuse expects the fusermount binary to exist and delegates the mounting operation to it. A driver using libfuse actually allocates a unix domain socket pair and passes one end to fusermount, which then opens /dev/fuse, performs the mount(2) call and sends the resulting file descriptor back to the driver over the socket. With the auto_unmount option, the fusermount process keeps running for as long as the filesystem is mounted so that it can unmount it when the driver exits.
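The socket pair matters because a unix domain socket can carry open file descriptors between processes (SCM_RIGHTS ancillary data), which is how a privileged helper can hand the driver a descriptor the driver could not have set up itself. Below is a small sketch of that mechanism only. It is not libfuse's code: a temporary file stands in for /dev/fuse, both ends live in one process, the function name is made up, and socket.send_fds and socket.recv_fds need Python 3.9 or newer.

```python
import os
import socket
import tempfile


def pass_fd_over_socketpair(payload):
    """Send an open file descriptor across a unix socket pair and read through it."""
    driver_end, helper_end = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
    with driver_end, helper_end, tempfile.TemporaryFile() as f:
        f.write(payload)
        f.flush()

        # "Helper" side: send the descriptor along with one byte of ordinary data.
        socket.send_fds(helper_end, [b"\0"], [f.fileno()])

        # "Driver" side: receive a new descriptor referring to the same open file.
        _data, fds, _flags, _addr = socket.recv_fds(driver_end, 1, 1)
        received = fds[0]
        try:
            # The file offset is shared with the sender, so rewind first.
            os.lseek(received, 0, os.SEEK_SET)
            return os.read(received, len(payload))
        finally:
            os.close(received)


if __name__ == "__main__":
    print(pass_fd_over_socketpair(b"hello from the helper"))
```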

This is easy to observe with any driver that uses libfuse, for example by mounting a filesystem with squashfuse.

$ squashfuse -o auto_unmount,allow_other,default_permissions ./alpine-minirootfs-3.15.4-x86_64.sqfs mount

Once this command has run, two processes are running:

root     2132739  0.0  0.0   2792  1032 ?        Ss   03:02   0:00 fusermount -o rw,nosuid,nodev,allow_other,default_permissions,auto_unmount,subtype=squashfuse -- /home/zmanji/tmp/mount
zmanji   2132741  0.0  0.0   4752   344 ?        Ss   03:02   0:00 squashfuse -o auto_unmount,allow_other,default_permissions ./alpine-minirootfs-3.15.4-x86_64.sqfs mount

and they have established a bi-directional unix socket pair between them.

$ sudo lsof +E -aUc fuse
COMMAND       PID   USER   FD   TYPE             DEVICE SIZE/OFF   NODE NAME
fusermoun 2132739   root    4u  unix 0xffff9d931e79a640      0t0 308828 type=STREAM ->INO=308829 2132741,squashfus,5u
squashfus 2132741 zmanji    5u  unix 0xffff9d931e79d940      0t0 308829 type=STREAM ->INO=308828 2132739,fusermoun,4u

We can see that this approach will fail to work if fusermount cannot execute.

$ sudo chmod -x /usr/bin/fusermount3
$ squashfuse ./alpine-minirootfs-3.15.4-x86_64.sqfs mount
fuse: failed to exec fusermount: Permission denied

Preventing fusermount from executing prevents libfuse from working at all

Alternative Approach

The alternative to the above is to take advantage of a feature introduced in Linux 4.18 by the commit “fuse: Allow fully unprivileged mounts”. This change allows root in a user namespace to issue mount(2) for a FUSE filesystem.

With this, any regular user can call mount(2) once they have created a new user namespace and mount namespace. This technique even works with libfuse, which tries to use mount(2) directly before falling back to fusermount. With the same restrictions as before, simply creating new user, pid and mount namespaces with unshare is enough.

$ sudo chmod -x /usr/bin/fusermount3
$ squashfuse ./alpine-minirootfs-3.15.4-x86_64.sqfs mount
fuse: failed to exec fusermount: Permission denied
$ unshare -pfr --user --mount --kill-child /bin/bash
# squashfuse ./alpine-minirootfs-3.15.4-x86_64.sqfs mount
# ls mount
bin  dev  etc  home  lib  media  mnt  opt  proc  root  run  sbin  srv  sys  tmp  usr  var
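A driver can also enter the namespaces itself instead of relying on the unshare command. This is a minimal sketch of roughly what unshare -r --mount does, assuming a single-threaded process and that glibc's unshare wrapper is reachable through ctypes.CDLL(None). The function names are mine, and the pid namespace from the command above is left out.

```python
import ctypes
import os

# Namespace flag values from <sched.h>.
CLONE_NEWNS = 0x00020000
CLONE_NEWUSER = 0x10000000


def id_map_line(inside_id, outside_id, count=1):
    """Format one line for /proc/<pid>/uid_map or gid_map."""
    return f"{inside_id} {outside_id} {count}\n"


def enter_user_and_mount_namespace():
    """Move the calling process into new user and mount namespaces as root."""
    # Read the ids first; after unshare they show up as the overflow id
    # until the maps have been written.
    uid, gid = os.getuid(), os.getgid()

    libc = ctypes.CDLL(None, use_errno=True)
    if libc.unshare(CLONE_NEWUSER | CLONE_NEWNS) != 0:
        err = ctypes.get_errno()
        raise OSError(err, os.strerror(err))

    # An unprivileged process must deny setgroups before writing gid_map.
    with open("/proc/self/setgroups", "w") as f:
        f.write("deny")
    # Map our real uid and gid to 0 inside the namespace.
    with open("/proc/self/uid_map", "w") as f:
        f.write(id_map_line(0, uid))
    with open("/proc/self/gid_map", "w") as f:
        f.write(id_map_line(0, gid))


if __name__ == "__main__":
    enter_user_and_mount_namespace()
    print("uid inside the namespace:", os.getuid())
```

After this call the process holds CAP_SYS_ADMIN inside its own user namespace, so a following mount(2) for a FUSE filesystem should be permitted on kernels 4.18 and newer.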

With the mount in a different namespace, however, it’s not possible to see the contents of the mount outside of the namespace. With the above mount, another shell session would only see an empty directory at the mount point.

This is because mount_namespaces(7) comes with two restrictions:

 [1] Each mount namespace has an owner user namespace.  As
     explained above, when a new mount namespace is created, its
     mount list is initialized as a copy of the mount list of
     another mount namespace.  If the new namespace and the
     namespace from which the mount list was copied are owned by
     different user namespaces, then the new mount namespace is
     considered less privileged.

 [2] When creating a less privileged mount namespace, shared
     mounts are reduced to slave mounts.  This ensures that
     mappings performed in less privileged mount namespaces will
     not propagate to more privileged mount namespaces.

The combination of the two points means it’s impossible to bind mount the mount point from the new mount namespace into the root namespace.
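The reduction from shared to slave mounts can be observed in /proc/<pid>/mountinfo, where shared mounts carry a shared:N tag and slave mounts carry a master:N tag in the optional fields. Here is a small sketch for reading those tags. The function names are mine, and it assumes no mount path is literally "-".

```python
def propagation_tags(mountinfo_line):
    """Return (mount point, optional fields) for one line of mountinfo.

    The optional fields sit between the sixth field and a lone "-" separator.
    An empty list means the mount is private.
    """
    fields = mountinfo_line.split()
    separator = fields.index("-")
    return fields[4], fields[6:separator]


def mount_propagation(pid="self"):
    """Map each mount point visible to pid to its propagation tags."""
    with open(f"/proc/{pid}/mountinfo") as f:
        return dict(propagation_tags(line) for line in f if line.strip())


if __name__ == "__main__":
    for mount_point, tags in mount_propagation().items():
        print(mount_point, tags or ["private"])
```

Comparing the output for a process inside the new namespace with one outside it should show shared tags on the outside turning into master tags on the inside.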

However, the owner of the mount namespace can peek into the contents of the mount from the root namespace through procfs. If the user can get the pid of the process that called mount(2), or of any other process in the same mount namespace, then running the following will show the contents of the mount.

$ ls /proc/2719682/root/home/zmanji/tmp/mount
bin  dev  etc  home  lib  media  mnt  opt  proc  root  run  sbin  srv  sys  tmp  usr  var

Viewing the mount in the mount namespace from outside the namespace
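The same trick is easy to wrap in a helper. This is a minimal sketch with made-up function names: it builds the procfs path for a given pid, and reads the /proc/<pid>/ns/mnt link, which can be compared between two processes to tell whether they share a mount namespace.

```python
import os


def path_in_mount_namespace(pid, path):
    """Translate a path into its procfs equivalent as seen by process pid."""
    absolute = os.path.abspath(path)
    return f"/proc/{pid}/root{absolute}"


def mount_namespace_id(pid="self"):
    """Return the mount namespace identifier of pid, e.g. 'mnt:[...]'."""
    return os.readlink(f"/proc/{pid}/ns/mnt")


def list_mount_from_outside(pid, path):
    """List a directory as the process pid sees it."""
    return sorted(os.listdir(path_in_mount_namespace(pid, path)))


if __name__ == "__main__":
    # The pid and path are the ones from the example above.
    print(path_in_mount_namespace(2719682, "/home/zmanji/tmp/mount"))
```

Reading through /proc/<pid>/root is subject to the usual procfs permission checks, so this is expected to work for the user who owns the process and not for arbitrary other users.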

Conclusion

It’s possible to mount a FUSE filesystem without root permissions or SUID binaries by performing the mount inside a user namespace. The catch is that the contents of the mount can only be viewed from outside the namespace through procfs.