Blame - docs/security.texi - qemu

blob: 0d6b30edfc0b1582c9b55af3abb3a09dd250832b

Stefan Hajnoczi	e841257	2019-05-09 13:18:20 +0100	1	@node Security
				2	@chapter Security
				3
				4	@section Overview
				5
				6	This chapter explains the security requirements that QEMU is designed to meet
				7	and principles for securely deploying QEMU.
				8
				9	@section Security Requirements
				10
				11	QEMU supports many different use cases, some of which have stricter security
				12	requirements than others. The community has agreed on the overall security
				13	requirements that users may depend on. These requirements define what is
				14	considered supported from a security perspective.
				15
				16	@subsection Virtualization Use Case
				17
				18	The virtualization use case covers cloud and virtual private server (VPS)
				19	hosting, as well as traditional data center and desktop virtualization. These
				20	use cases rely on hardware virtualization extensions to execute guest code
				21	safely on the physical CPU at close-to-native speed.
				22
				23	The following entities are untrusted, meaning that they may be buggy or
				24	malicious:
				25
				26	@itemize
				27	@item Guest
				28	@item User-facing interfaces (e.g. VNC, SPICE, WebSocket)
				29	@item Network protocols (e.g. NBD, live migration)
				30	@item User-supplied files (e.g. disk images, kernels, device trees)
				31	@item Passthrough devices (e.g. PCI, USB)
				32	@end itemize
				33
				34	Bugs affecting these entities are evaluated on whether they can cause damage in
				35	real-world use cases and treated as security bugs if this is the case.
				36
				37	@subsection Non-virtualization Use Case
				38
				39	The non-virtualization use case covers emulation using the Tiny Code Generator
				40	(TCG). In principle the TCG and device emulation code used in conjunction with
				41	the non-virtualization use case should meet the same security requirements as
				42	the virtualization use case. However, for historical reasons much of the
				43	non-virtualization use case code was not written with these security
				44	requirements in mind.
				45
				46	Bugs affecting the non-virtualization use case are not considered security
				47	bugs at this time. Users with non-virtualization use cases must not rely on
				48	QEMU to provide guest isolation or any security guarantees.
				49
				50	@section Architecture
				51
				52	This section describes the design principles that ensure the security
				53	requirements are met.
				54
				55	@subsection Guest Isolation
				56
				57	Guest isolation is the confinement of guest code to the virtual machine. When
				58	guest code gains control of execution on the host this is called escaping the
				59	virtual machine. Isolation also includes resource limits such as throttling of
				60	CPU, memory, disk, or network. Guests must be unable to exceed their resource
				61	limits.
				62
				63	QEMU presents an attack surface to the guest in the form of emulated devices.
				64	The guest must not be able to gain control of QEMU. Bugs in emulated devices
				65	could allow malicious guests to gain code execution in QEMU. At this point the
				66	guest has escaped the virtual machine and is able to act in the context of the
				67	QEMU process on the host.
				68
				69	Guests often interact with other guests and share resources with them. A
				70	malicious guest must not gain control of other guests or access their data.
				71	Disk image files and network traffic must be protected from other guests unless
				72	explicitly shared between them by the user.
				73
				74	@subsection Principle of Least Privilege
				75
				76	The principle of least privilege states that each component only has access to
				77	the privileges necessary for its function. In the case of QEMU this means that
				78	each process only has access to resources belonging to the guest.
				79
				80	The QEMU process should not have access to any resources that are inaccessible
				81	to the guest. This way the guest does not gain anything by escaping into the
				82	QEMU process since it already has access to those same resources from within
				83	the guest.
				84
				85	Following the principle of least privilege immediately fulfills guest isolation
				86	requirements. For example, guest A only has access to its own disk image file
				87	@code{a.img} and not guest B's disk image file @code{b.img}.
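The disk image example above can be checked mechanically. The following is a minimal sketch, assuming a deployment in which each guest's QEMU process runs under its own dedicated UNIX user (an assumption for illustration, not something this chapter mandates). The helper name @code{is_private_to_owner} is hypothetical, and the check covers only traditional UNIX permissions, not SELinux or AppArmor policy.

```python
import os
import stat


def is_private_to_owner(path: str, owner_uid: int) -> bool:
    """Return True if `path` is a regular file owned by `owner_uid`
    with no permission bits set for group or others."""
    info = os.stat(path)
    # Any group/other bit would let a different guest's QEMU user reach the file.
    no_group_or_other = (info.st_mode & (stat.S_IRWXG | stat.S_IRWXO)) == 0
    return (
        stat.S_ISREG(info.st_mode)
        and info.st_uid == owner_uid
        and no_group_or_other
    )
```

A management tool could run such a check on @code{a.img} with guest A's user ID before launching guest A, and refuse to start if the image is readable by anyone else.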
				88
				89	In reality certain resources are inaccessible to the guest but must be
				90	available to QEMU to perform its function. For example, host system calls are
				91	necessary for QEMU but are not exposed to guests. A guest that escapes into
				92	the QEMU process can then begin invoking host system calls.
				93
				94	New features must be designed to follow the principle of least privilege.
				95	Should this not be possible for technical reasons, the security risk must be
				96	clearly documented so users are aware of the trade-off of enabling the feature.
				97
				98	@subsection Isolation mechanisms
				99
				100	Several isolation mechanisms are available to realize this architecture of
				101	guest isolation and the principle of least privilege. With the exception of
				102	Linux seccomp, these mechanisms are all deployed by management tools that
				103	launch QEMU, such as libvirt. They are also platform-specific so they are only
				104	described briefly for Linux here.
				105
				106	The fundamental isolation mechanism is that QEMU processes must run as
				107	unprivileged users. Sometimes it seems more convenient to launch QEMU as
				108	root to give it access to host devices (e.g. @code{/dev/net/tun}) but this poses a
				109	huge security risk. File descriptor passing can be used to give an otherwise
				110	unprivileged QEMU process access to host devices without running QEMU as root.
				111	It is also possible to launch QEMU as a non-root user and configure UNIX groups
				112	for access to @code{/dev/kvm}, @code{/dev/net/tun}, and other device nodes.
				113	Some Linux distributions already ship with UNIX groups for these devices by default.
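File descriptor passing, as mentioned above, can be illustrated without QEMU. The sketch below is a stand-alone demonstration, not QEMU code: it sends an open descriptor across a UNIX socket pair using Python's @code{socket.send_fds} and @code{socket.recv_fds} (Python 3.9 or later). A temporary file stands in for a host device such as @code{/dev/net/tun}; the function names and the "launcher"/"worker" roles are assumptions made for illustration.

```python
import os
import socket
import tempfile


def send_fd(sock: socket.socket, fd: int) -> None:
    """Send one open file descriptor over a connected UNIX socket."""
    # A payload byte is sent alongside; the descriptor itself travels
    # as SCM_RIGHTS ancillary data.
    socket.send_fds(sock, [b"x"], [fd])


def recv_fd(sock: socket.socket) -> int:
    """Receive one file descriptor sent with send_fd()."""
    _data, fds, _flags, _addr = socket.recv_fds(sock, 1, 1)
    if len(fds) != 1:
        raise RuntimeError("expected exactly one file descriptor")
    return fds[0]


def demo() -> bytes:
    """Pass a descriptor from a 'launcher' end to a 'worker' end and
    read the underlying file through the received descriptor."""
    launcher_end, worker_end = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
    with launcher_end, worker_end, tempfile.TemporaryFile() as resource:
        resource.write(b"host resource")
        resource.flush()
        send_fd(launcher_end, resource.fileno())
        received = recv_fd(worker_end)
        try:
            # The receiver never opened the file itself; it only holds
            # the descriptor it was handed. Read from offset 0.
            return os.pread(received, 64, 0)
        finally:
            os.close(received)
```

In a real deployment the privileged side would typically be the management tool, which opens the device and hands the descriptor to an unprivileged QEMU, for example through a @code{fd=} parameter on a tap network backend.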
				114
				115	@itemize
				116	@item SELinux and AppArmor make it possible to confine processes beyond the
				117	traditional UNIX process and file permissions model. They restrict the QEMU
				118	process from accessing processes and files on the host system that are not
				119	needed by QEMU.
				120
				121	@item Resource limits and cgroup controllers provide throughput and utilization
				122	limits on key resources such as CPU time, memory, and I/O bandwidth.
				123
				124	@item Linux namespaces can be used to make process, file system, and other system
				125	resources unavailable to QEMU. A namespaced QEMU process is restricted to only
				126	those resources that were granted to it.
				127
				128	@item Linux seccomp is available via the QEMU @option{--sandbox} option. It disables
				129	system calls that are not needed by QEMU, thereby reducing the host kernel
				130	attack surface.
				131	@end itemize
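As a rough illustration of the seccomp item above, a launcher might assemble the QEMU command line as follows. This is a sketch under assumptions: the function name, the default binary name, the raw image format, and the choice to set every sandbox sub-option to @code{deny} are illustrative, and some features (for example anything that needs QEMU to spawn helper processes) may not work with such a strict policy.

```python
from typing import List


def sandboxed_qemu_argv(
    disk_image: str, qemu_binary: str = "qemu-system-x86_64"
) -> List[str]:
    """Build an argv list that enables the seccomp sandbox.

    The list is meant for subprocess-style execution without a shell,
    so no shell quoting is applied.
    """
    # QEMU option parsing treats ',' as a separator; a literal comma in
    # a file name is written as ',,'.
    escaped_image = disk_image.replace(",", ",,")
    return [
        qemu_binary,
        "-accel", "kvm",
        "-sandbox",
        "on,obsolete=deny,elevateprivileges=deny,spawn=deny,resourcecontrol=deny",
        # Stating the format explicitly avoids probing an untrusted image.
        "-drive", f"file={escaped_image},format=raw,if=virtio",
    ]
```

The resulting list would still be executed as an unprivileged user and inside whatever SELinux, AppArmor, cgroup, or namespace confinement the management tool sets up; the sandbox option complements those mechanisms rather than replacing them.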
Daniel P. Berrangé	4f24430	2019-07-03 14:41:35 +0100	132
				133	@section Sensitive configurations
				134
				135	There are aspects of QEMU that can have security implications which users and
				136	management applications must be aware of.
				137
				138	@subsection Monitor console (QMP and HMP)
				139
				140	The monitor console (whether used with QMP or HMP) provides an interface
				141	to dynamically control many aspects of QEMU's runtime operation. Many of the
				142	commands exposed will instruct QEMU to access content on the host file system
				143	and/or trigger spawning of external processes.
				144
				145	For example, the @code{migrate} command allows for the spawning of arbitrary
				146	processes for the purpose of tunnelling the migration data stream. The
				147	@code{blockdev-add} command instructs QEMU to open arbitrary files, exposing
				148	their content to the guest as a virtual disk.
				149
				150	Unless QEMU is otherwise confined using technologies such as SELinux, AppArmor,
				151	or Linux namespaces, the monitor console should be considered to have privileges
				152	equivalent to those of the user account QEMU is running under.
				153
				154	It is further important to consider the security of the character device backend
				155	over which the monitor console is exposed. It needs to have protection against
				156	malicious third parties which might try to make unauthorized connections, or
				157	perform man-in-the-middle attacks. Many of the character device backends do not
				158	satisfy this requirement and so must not be used for the monitor console.
				159
				160	The general recommendation is that the monitor console should be exposed over
				161	a UNIX domain socket backend to the local host only. Use of the TCP-based
				162	character device backend is inappropriate unless configured to use both TLS
				163	encryption and an authorization control policy on client connections.
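To make the UNIX domain socket recommendation concrete, the following sketch shows how a trusted management client might connect to a QMP monitor and complete the initial capability negotiation. It assumes QEMU was started with a monitor on a UNIX socket (for example @code{-qmp unix:/path/to/qmp.sock,server=on,wait=off}, where the path is a placeholder) and that the socket's directory is accessible only to the management user. The helper names are hypothetical, and asynchronous events and error recovery are deliberately left out.

```python
import json
import socket


def read_json_line(sock: socket.socket) -> dict:
    """Read one newline-terminated JSON message from the monitor."""
    buf = bytearray()
    # Read byte by byte so nothing beyond this message is consumed.
    while not buf.endswith(b"\n"):
        chunk = sock.recv(1)
        if not chunk:
            raise ConnectionError("monitor closed the connection")
        buf += chunk
    return json.loads(buf)


def qmp_handshake(sock: socket.socket) -> dict:
    """Consume the QMP greeting and leave capabilities negotiation mode."""
    greeting = read_json_line(sock)
    if "QMP" not in greeting:
        raise RuntimeError("peer did not send a QMP greeting")
    sock.sendall(json.dumps({"execute": "qmp_capabilities"}).encode() + b"\n")
    reply = read_json_line(sock)
    if "return" not in reply:
        raise RuntimeError(f"capability negotiation failed: {reply}")
    return greeting


def connect_monitor(path: str) -> socket.socket:
    """Connect to a monitor socket on the local host and negotiate."""
    sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    sock.connect(path)
    qmp_handshake(sock)
    return sock
```

Because the socket is a file system object, access to it is governed by ordinary file and directory permissions on the local host, which is what makes this backend easier to protect than a plain TCP listener.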
				164
				165	In summary, the monitor console is considered a privileged control interface to
				166	QEMU and as such should only be made accessible to a trusted management
				167	application or user.