DRM Security
==============

During the last few years, the number of users of the DRM API has increased
significantly. Aside from the X-Server, different parts of the Linux desktop
stack use the DRM API directly. This includes Plymouth, Weston, Mir, kmscon and
more. While the DRM and KMS APIs could mostly withstand the strain, the lack of
a sole user-space DRM user showed several shortcomings in the design. We can no
longer rely on X-Server or DDX fixes to work around kernel API deficiencies. We
have to carefully take all the different DRM applications into account while
changing or improving the DRM API.

By opening /dev/dri/ to more applications than the X-Server, we also open it
for spoofing attacks. In this talk I want to build on the results of last
year's DRM2 talk (XDC-2012) and address the GEM-Flink, DRM-mmap() and
DRM-Master related spoofing attacks. I developed several examples that reveal
how easy it is to misuse these and will discuss the fixes that were introduced
to DRM during the last year.


0) Prerequisites
================

  Name: David Herrmann
  Email: dh.herrmann@gmail.com
  Date: 2013/07/02

The reader is expected to be familiar with the DRM API and its major concepts,
including the following:
  DRM-Master, GEM + TTM, Flink, dma-buf, DRM mmap, DRI1 and DRI2
These concepts are used throughout the article and will not be explained in
detail.

Last year's talk on DRM2 is available at:
  http://www.youtube.com/watch?v=4fRXNHAjMIY


1) Current Situation
====================

Before we can discuss fixes for DRM deficiencies, we must outline the current
situation and the supported use-cases. The kernel API must be
backwards-compatible, so introducing new setups to fix old bugs is not
acceptable. Instead, we must understand the current situation in its entirety
and always preserve backwards compatibility. Not all bugs can be fixed
retroactively, but large user-space modifications should be avoided so existing
systems can benefit from these fixes.
1.1) Setup
----------

In a typical DRM setup we have many different DRM users. A central role is
taken by the graphics server, which can have multiple authenticated
render-clients. This setup may exist many times in parallel, on a single seat
or on independent seats. Apart from a server-client layout, we might also have
independent offscreen DRM users.

 - Graphics Servers: X-Server, Weston or other compositors provide a central
   place for clients to display their window contents and take care of any
   modesetting or compositing. Multiple servers can be run on different seats
   in parallel. On a single seat, only one server is active at a time, the
   others run in the background.
 - Render-Clients: Graphics servers can allow clients to use the GPU to render
   window contents. Clients have limited DRM access and cannot alter global
   GPU state. They can share state with the server, but must retain control
   over what is shared with whom.
 - Offscreen-Clients: Offscreen clients are like Render-Clients but are not
   associated with a graphics server. They require the GPU for offscreen use
   like GPGPU or offscreen rendering.

1.2) Security
-------------

With many different applications accessing the GPU in parallel, we must provide
definite DRM namespaces for each of them. While graphics servers are granted
global DRM access, all DRM users must retain control over private objects. A
graphics server should not be allowed to access a GPGPU client's buffers, and
different render clients should not be able to see what each other is doing.
But locking down object namespaces is not the ultimate solution, as buffer
sharing is one of the fundamental concepts of DRM.

The DRM-Master, GEM-Flink, DRM-mmap() and dma-buf APIs are currently used to
allow context separation and shared state. But they have several flaws that
pose a security risk to current Linux desktop systems. The known problems (in
no particular order) are:

 - gem-flink doesn't provide any private namespaces to applications and servers.
   Instead, only one global namespace is provided per DRM node. Malicious
   authenticated applications can attack other clients via brute-force
   "name-guessing" of gem buffers.
 - DRM mmap() does not provide any private namespaces to applications. Once a
   buffer has a fake-offset available for mmap()-use, the offset is global. A
   malicious application can guess the offset and alter the buffer arbitrarily.
 - drmModeGetFB() returns a gem-handle to the framebuffer's backing gem object.
   This can be used by malicious applications to get access to the currently
   active framebuffer and alter it arbitrarily.
 - DRM-Master is limited to CAP_SYS_ADMIN. This requires applications to run as
   root or use hackish workarounds. The complex design of compositors makes it
   unlikely that they are bug-free, so we should do our best to avoid running
   them with root privileges.
 - DRM-Master management is left to the active graphics server. This allows
   malicious applications to continuously ask for DRM-Master and intercept it
   during VT-switches. This doesn't even require root privileges!
 - DRM-Master context separation cannot be controlled entirely from user-space.


2) Attacks
==========

I looked for an attack scenario for each API deficiency and developed example
programs to exploit it. While I limited the examples to a specific
implementation (mostly Xorg), one must take into account that they are
applicable to others as well.

2.1) GEM-Flink
--------------

The GEM-Flink attack is very simple. We need a running X-Server and two clients
that render on the GPU. Clients must be authenticated on the DRM node via the
DRI API, which mostly means being in the "video" group.
Client A (the target) renders window contents via the GPU, creates a GEM-flink
name for the buffer and passes it to the X-Server. This allows the X-Server to
open the buffer and display it.
Client B (the attacker) can guess the Flink name (brute force) and use the
GEM_OPEN ioctl to open the same buffer, even though it wasn't supposed to get
access.
The buffer may thus leak private information or allow the attacker to alter the
visual appearance of the target.

The following pseudo-code shows how easy it is for Client B to get a GEM handle
to the buffer of Client A:

Client A (target)                     | Client B (attacker)
--------------------------------------+-----------------------------------------
int fd;                               | int fd, err;
uint32_t handle, name;                | uint32_t name, handle;
struct drm_gem_flink pl;              | struct drm_gem_open pl;
                                      |
fd = open("/dev/dri/card0", O_RDWR);  | fd = open("/dev/dri/card0", O_RDWR);
                                      |
/* driver-specific buffer creation    |
 * stores a GEM handle in @handle */  |
                                      |
pl.handle = handle;                   |
ioctl(fd, DRM_IOCTL_GEM_FLINK, &pl);  |
name = pl.name;                       |
                                      |
                                      | for (name = 0; name < INT_MAX; ++name) {
                                      |     pl.name = name;
                                      |     err = ioctl(fd, DRM_IOCTL_GEM_OPEN,
                                      |                 &pl);
                                      |     if (!err)
                                      |         break;
                                      | }
                                      | handle = pl.handle;

With the quite low number of global Flink names on freshly booted systems, the
brute-force attack has a very high success rate. The kernel uses the "IDR"
system for name allocation and thus the flink-names are highly predictable.
The attacker cannot tell which buffer they opened. However, they can easily
open all buffers until they find what they need.

2.1.1) GEM-Flink Alternatives
-----------------------------

While limiting the lifetime of flink-names or requiring DRM-Master for GEM_OPEN
would reduce the attack surface, both break DRM API semantics. No final fix for
the GEM-Flink attack is known, but with dma-buf we have a replacement which
allows fine-grained access management via file-descriptors. The flink API was
designed around global names and it is very unlikely that it will ever change.
Use dma-buf!

2.2) DRM-mmap()
---------------

The mmap() attack on DRM devices is based on fake DRM offsets. If a client
wants to map a GPU buffer for CPU access, it requests an mmap() offset on the
DRM node and passes this offset as argument to mmap() to map the buffer.

The same scenario as for GEM-Flink (2.1) applies here. An attacker could easily
guess the offset and map a buffer that they have no access to.
Client A (target)                     | Client B (attacker)
--------------------------------------+-----------------------------------------
int fd;                               | int fd;
uint64_t offset;                      | uint32_t off;
struct drm_mode_create_dumb cd;       | void *mem;
struct drm_mode_map_dumb pl;          |
                                      |
fd = open("/dev/dri/card0", O_RDWR);  | fd = open("/dev/dri/card0", O_RDWR);
                                      |
/* fill cd.width, .height, .bpp */    |
ioctl(fd, DRM_IOCTL_MODE_CREATE_DUMB, |
      &cd);                           |
pl.handle = cd.handle;                |
ioctl(fd, DRM_IOCTL_MODE_MAP_DUMB,    |
      &pl);                           |
offset = pl.offset;                   |
mmap(0, ..len.., ..prot.., ..flags.., |
     fd, offset);                     |
                                      |
                                      | for (off = 0; off < INT_MAX; ++off) {
                                      |     mem = mmap(0, ..len.., ..prot..,
                                      |                ..flags.., fd, off);
                                      |     if (mem != MAP_FAILED)
                                      |         break;
                                      | }

In this example, Client B will end up with some buffer (not guaranteed to be
the buffer of Client A) mapped at @mem. It can read from or write to it. An
attacker can easily map all available buffers, which guarantees that the buffer
of Client A is among them.

Internally, drm_mm is used for offset allocations. The algorithm is simple and
can be mirrored by the attacker. Similar to GEM-Flink (2.1), a brute-force
attack is likely to succeed.

2.2.1) DRM-mmap() Namespaces
----------------------------

While the global mmap() offset namespace is part of the DRM API, no application
ever made use of it. Hence, a simple fix is to bind mmap() access to the
GEM-name. A patch-series is pending on dri-devel which thus reduces the
DRM-mmap() attack to a GEM buffer attack (e.g., see GEM-Flink 2.1 and "Unified
VMA Offset Manager" on dri-devel).

  VMA Offset Manager:
    http://lists.freedesktop.org/archives/dri-devel/2013-July/042141.html
  mmap() Access Management:
    http://lists.freedesktop.org/archives/dri-devel/2013-July/041222.html

The idea is to restrict mmap() access to applications which own a handle to the
target buffer. An application that owns a handle is always allowed to create
mmap offsets itself, so it effectively owns the buffer and should be allowed
mmap() access. If a client does not own a handle to the buffer, it must not get
any access.
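The ownership rule can be modelled in a few lines of C. This is only a toy
user-space sketch under stated assumptions, not the code of the pending
patches: the names toy_buffer, toy_allow(), toy_revoke() and toy_may_mmap() are
hypothetical, and a small integer stands in for an open DRM file. It merely
illustrates that knowing the offset is no longer sufficient; the caller must
also be recorded as owning a handle to the buffer.

```c
/* Toy model of handle-based mmap() access checks.
 * All identifiers are hypothetical and are NOT kernel symbols. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_OWNERS 8

struct toy_buffer {
    unsigned long offset;        /* fake mmap() offset of the buffer */
    int owners[TOY_MAX_OWNERS];  /* ids of open files that hold a handle */
    size_t n_owners;
};

/* Record that open file @file_id obtained a handle to @buf. */
static bool toy_allow(struct toy_buffer *buf, int file_id)
{
    assert(buf != NULL);
    for (size_t i = 0; i < buf->n_owners; ++i)
        if (buf->owners[i] == file_id)
            return true;         /* already on the list */
    if (buf->n_owners == TOY_MAX_OWNERS)
        return false;            /* list is full */
    buf->owners[buf->n_owners++] = file_id;
    return true;
}

/* Drop @file_id from the list, e.g. when its last handle is closed. */
static void toy_revoke(struct toy_buffer *buf, int file_id)
{
    assert(buf != NULL);
    for (size_t i = 0; i < buf->n_owners; ++i) {
        if (buf->owners[i] == file_id) {
            /* overwrite the slot with the last entry and shrink */
            buf->owners[i] = buf->owners[--buf->n_owners];
            return;
        }
    }
}

/* mmap() is granted only if the offset matches AND the caller owns a handle. */
static bool toy_may_mmap(const struct toy_buffer *buf, int file_id,
                         unsigned long offset)
{
    assert(buf != NULL);
    if (offset != buf->offset)
        return false;
    for (size_t i = 0; i < buf->n_owners; ++i)
        if (buf->owners[i] == file_id)
            return true;
    return false;
}
```

For a buffer with offset 0x10000 and a single toy_allow(&buf, 1) call,
toy_may_mmap(&buf, 1, 0x10000) succeeds, while the same request from file 2 is
rejected even though file 2 guessed the offset correctly.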
The VMA offset manager and access-management patches are likely to be included
in linux-3.12 and thus fix this security problem.

2.3) drmModeGetFB()
-------------------

drmModeGetFB() is part of the DRM-KMS API and allows any DRM client to retrieve
a GEM-handle for the currently active framebuffer on any CRTC.

A simple attack requires a running X-Server with one or more allocated
framebuffers. Any client with access to /dev/dri/ can now open the DRM node,
retrieve a GEM handle for any CRTC via drmModeGetFB() and read/write it
arbitrarily.

Client A (attacker)
------------------------------------
int fd;
drmModeFB *fb;
uint32_t handle, id;

fd = open("/dev/dri/card0", O_RDWR);

for (id = 0; id < INT_MAX; ++id) {
    fb = drmModeGetFB(fd, id);
    if (fb)
        break;
}
handle = fb ? fb->handle : 0;

In this example, the attacker will own a handle for some existing framebuffer.
If this is done for all IDs, an attacker can get access to the currently
displayed framebuffer. Note that owning a handle implies owning the buffer, so
arbitrary mmap() access is possible.
Neither root rights nor CAP_SYS_ADMIN are needed. No DRM authentication is
needed. This can be used by any client who has access to /dev/dri/card0.

While modesetting commands are limited to DRM-Master, drmModeGetFB() is
supposed to be passive and thus globally accessible. Other ioctls like
CREATE_DUMB are open to abuse as well: they allow denial-of-service attacks if
clients consume all of GPU memory.

2.3.1) drmModeGetFB() Fix
-------------------------

Requiring DRM-Master for all these commands limits the attack surface to the
running graphics server (which would mostly mean the attack is useless).
However, this prevents background graphics servers from managing their buffers.
Especially during server shutdown, DRM-Master shouldn't be required to free
allocated buffers. So other fixes are preferred.

The most serious bug, the drmModeGetFB buffer leak, can, however, be fixed by
returning an invalid gem handle for clients without DRM-Master access.
Patches are pending on dri-devel, but require more thorough investigation.

The more appropriate fix is the introduction of DRM render nodes. This splits
DRM nodes between graphics servers and render clients and thus provides
fine-grained access management for KMS ioctls.

  Render Nodes proposal:
    http://lists.freedesktop.org/archives/dri-devel/2013-July/041222.html

An attacker would need access to /dev/dri/card0 (instead of /dev/dri/renderD0)
to perform this attack. However, this access will be restricted to privileged
compositors once render-nodes are established.

2.4) DRM-Master and CAP_SYS_ADMIN
---------------------------------

Graphics servers are required to have DRM-Master privileges to perform
modesetting or modify global GPU state. However, DRM-Master can only be
acquired with the CAP_SYS_ADMIN capability, which is roughly equivalent to root
rights.

No specific attack scenario is known, but any bug in a graphics server will
essentially be an attack surface to gain root rights on a desktop system. The
huge size of common graphics servers makes it likely that exploitable bugs
exist. We should thus reduce the required capabilities for graphics servers.

2.4.1) Outsourcing DRM-Master
-----------------------------

CAP_SYS_ADMIN is only required to acquire DRM-Master. Weston therefore runs a
small helper process which is connected to the compositor via a pipe. During
VT-switches, the helper acquires and drops DRM-Master accordingly, and the
compositor no longer needs CAP_SYS_ADMIN to manage DRM devices. The attack
surface is thus reduced to a small helper.

If we extend this idea, we could easily move it into a central daemon which
takes care of DRM-Master management for all graphics servers. There is ongoing
work to reuse systemd-logind to manage DRM-Master and input devices: it drops
DRM-Master and mutes the devices while a graphics server is inactive and
re-enables them during wakeup.
A description of this proposal can be found at:
  http://dvdhrm.wordpress.com/2013/07/08/thoughts-on-linux-system-compositors/
Development is still ongoing and a first prototype is expected for GUADEC 2013
in Brno.

2.5) DRM-Master Management
--------------------------

DRM-Master is a concept to separate multiple graphics servers from each other.
A different DRM-Master context is assigned to each graphics server and, across
all contexts, only a single DRM user can be "DRM-Master". So while multiple
contexts might exist, only a single context is considered active: the context
the current DRM-Master is assigned to.

A graphics server can call drmDropMaster() to drop DRM-Master and
drmSetMaster() to gain DRM-Master. drmSetMaster() fails if the current user is
not a DRM-Master. A fundamental flaw is that neither call takes a context as
argument. Moreover, user-space has no chance to find out which context a user
is assigned to.

During open() on a DRM node, DRM core will assign the new user to the currently
active context (i.e., the context of the current DRM-Master). If no context is
active, a new context is created. While this allows minor control over
context-creation and assignment, it does not allow assigning clients or servers
to a specific context. So if a new session X-Server is started while the
current X-Server is still active, both will get assigned to the same context.
On the other hand, if a client is started while the corresponding graphics
server is inactive, the client will get assigned to a different context than
the server (which breaks DRI among other things).

While the security implications might be subtle, this concept allows major
denial-of-service attacks if clients get assigned to wrong contexts. A main
problem is that a DRM user cannot detect this, so it has no way to verify that
it is assigned to the correct context.
It may allocate buffers on a context which is actually the context of an
attacker's graphics server, not the context of the target server.

While this provides a huge surface for attackers, one might argue that it
requires CAP_SYS_ADMIN so we can mostly ignore it. However, that is not true!
During open(), if no context is active, a new context is created and is
automatically assigned DRM-Master. This allows any user with access to
/dev/dri/ to become DRM-Master! Of course, one cannot drop and re-acquire it,
as drmSetMaster() is protected. Nevertheless, unprivileged applications can use
this for a denial-of-service attack by hijacking DRM-Master during VT-switches.
Moreover, it can be used to display arbitrary content on the screen and
simulate login-screens or more.

Client A (attacker)
------------------------------------
int fd;

fd = -1;
do {
    if (fd >= 0)
        close(fd);
    fd = open("/dev/dri/card0", O_RDWR);
} while (drmAuthMagic(fd, 0) == -EACCES);

In this example an attacker opens a DRI device and uses a dummy drmAuthMagic()
call to test whether it is DRM-Master. drmAuthMagic() returns -EACCES if the
caller is not DRM-Master; otherwise -EINVAL is returned, as 0 is an invalid
DRM-Magic number. If this attacker runs during a VT switch, chances are high
that it becomes DRM-Master without having the CAP_SYS_ADMIN capability.
Arbitrary modesetting commands can be issued afterwards.

2.5.1) Static DRM-Master Contexts
---------------------------------

The current API design should make it pretty clear that multiple DRM-Master
contexts cannot be used properly. In fact, there is no application known to me
which profits from or makes use of multiple contexts. Instead, DRM contexts
were reduced to a minimum and today manage no more than DRM-Master assignment.

So we can easily create a single static context during DRM device creation and
assign each user to it. This prevents any situation where clients are assigned
to wrong contexts. All users will now share the same context.
This indirectly fixes the DRM-Master hijacking problem: new users will never be
able to become DRM-Master by opening /dev/dri/card0, as the static context will
always be active.

With render-nodes we allow offscreen clients anyway. Hence, we don't have to
limit DRM authentication to the currently active master but can additionally
allow background clients to be authenticated and make use of the DRM device.

2.5.2) Centralized DRM-Master Management
----------------------------------------

By moving drmSetMaster() and drmDropMaster() calls to a central daemon (like
systemd-logind), we can provide a central place for DRM-Master management.
Hijacking will be limited to CAP_SYS_ADMIN and can be detected via error codes
on drmSetMaster().

At the same time, clients can use Render-Nodes instead of authenticating via
drmAuth(). The concept of DRM-Master is thus reduced to managing exclusive
hardware access.


3) Final Notes
==============

Most of the spoofing attacks are based on the fact that all DRM users share the
same DRM node. Linux provides many advanced access-management facilities that
we could make use of. However, they can require huge changes to user-space.

The big DRM legacy makes it almost impossible to guarantee that no old UMS
driver breaks, so we cannot drop backwards-compatibility. But at the same time,
this shouldn't hold us back. New concepts that fix all known issues are already
available and wait for wider adoption. By keeping /dev/dri/card as it is, we
can always guarantee backwards-compatibility but limit new users to a sane and
safe API.

During last year's talk, most of these issues were still unfixed. However, a
lot has happened and nearly all fixes and new facilities are either already
upstream or pending on dri-devel and waiting for adoption. The emergence of so
many different DRM applications should motivate us to finally move forward.