| Software Guard eXtensions (SGX) |
| =============================== |
| |
| Overview |
| -------- |
| |
| Intel Software Guard eXtensions (SGX) is a set of instructions and mechanisms |
| for memory accesses in order to provide security accesses for sensitive |
| applications and data. SGX allows an application to use its particular |
| address space as an *enclave*, which is a protected area provides confidentiality |
| and integrity even in the presence of privileged malware. Accesses to the |
| enclave memory area from any software not resident in the enclave are prevented, |
| including those from privileged software. |
| |
| Virtual SGX |
| ----------- |
| |
| SGX feature is exposed to guest via SGX CPUID. Looking at SGX CPUID, we can |
| report the same CPUID info to guest as on host for most of SGX CPUID. With |
| reporting the same CPUID guest is able to use full capacity of SGX, and KVM |
| doesn't need to emulate those info. |
| |
| The guest's EPC base and size are determined by QEMU, and KVM needs QEMU to |
| notify such info to it before it can initialize SGX for guest. |
| |
| Virtual EPC |
| ~~~~~~~~~~~ |
| |
| By default, QEMU does not assign EPC to a VM, i.e. fully enabling SGX in a VM |
| requires explicit allocation of EPC to the VM. Similar to other specialized |
| memory types, e.g. hugetlbfs, EPC is exposed as a memory backend. |
| |
| SGX EPC is enumerated through CPUID, i.e. EPC "devices" need to be realized |
| prior to realizing the vCPUs themselves, which occurs long before generic |
| devices are parsed and realized. This limitation means that EPC does not |
| require -maxmem as EPC is not treated as {cold,hot}plugged memory. |
| |
| QEMU does not artificially restrict the number of EPC sections exposed to a |
| guest, e.g. QEMU will happily allow you to create 64 1M EPC sections. Be aware |
| that some kernels may not recognize all EPC sections, e.g. the Linux SGX driver |
| is hardwired to support only 8 EPC sections. |
| |
| The following QEMU snippet creates two EPC sections, with 64M pre-allocated |
| to the VM and an additional 28M mapped but not allocated:: |
| |
| -object memory-backend-epc,id=mem1,size=64M,prealloc=on \ |
| -object memory-backend-epc,id=mem2,size=28M \ |
| -M sgx-epc.0.memdev=mem1,sgx-epc.1.memdev=mem2 |
| |
| Note: |
| |
| The size and location of the virtual EPC are far less restricted compared |
| to physical EPC. Because physical EPC is protected via range registers, |
| the size of the physical EPC must be a power of two (though software sees |
| a subset of the full EPC, e.g. 92M or 128M) and the EPC must be naturally |
| aligned. KVM SGX's virtual EPC is purely a software construct and only |
| requires the size and location to be page aligned. QEMU enforces the EPC |
| size is a multiple of 4k and will ensure the base of the EPC is 4k aligned. |
| To simplify the implementation, EPC is always located above 4g in the guest |
| physical address space. |
| |
| Migration |
| ~~~~~~~~~ |
| |
| QEMU/KVM doesn't prevent live migrating SGX VMs, although from hardware's |
| perspective, SGX doesn't support live migration, since both EPC and the SGX |
| key hierarchy are bound to the physical platform. However live migration |
| can be supported in the sense if guest software stack can support recreating |
| enclaves when it suffers sudden lose of EPC; and if guest enclaves can detect |
| SGX keys being changed, and handle gracefully. For instance, when ERESUME fails |
| with #PF.SGX, guest software can gracefully detect it and recreate enclaves; |
| and when enclave fails to unseal sensitive information from outside, it can |
| detect such error and sensitive information can be provisioned to it again. |
| |
| CPUID |
| ~~~~~ |
| |
| Due to its myriad dependencies, SGX is currently not listed as supported |
| in any of QEMU's built-in CPU configuration. To expose SGX (and SGX Launch |
| Control) to a guest, you must either use ``-cpu host`` to pass-through the |
| host CPU model, or explicitly enable SGX when using a built-in CPU model, |
| e.g. via ``-cpu <model>,+sgx`` or ``-cpu <model>,+sgx,+sgxlc``. |
| |
| All SGX sub-features enumerated through CPUID, e.g. SGX2, MISCSELECT, |
| ATTRIBUTES, etc... can be restricted via CPUID flags. Be aware that enforcing |
| restriction of MISCSELECT, ATTRIBUTES and XFRM requires intercepting ECREATE, |
| i.e. may marginally reduce SGX performance in the guest. All SGX sub-features |
| controlled via -cpu are prefixed with "sgx", e.g.:: |
| |
| $ qemu-system-x86_64 -cpu help | xargs printf "%s\n" | grep sgx |
| sgx |
| sgx-debug |
| sgx-encls-c |
| sgx-enclv |
| sgx-exinfo |
| sgx-kss |
| sgx-mode64 |
| sgx-provisionkey |
| sgx-tokenkey |
| sgx1 |
| sgx2 |
| sgxlc |
| |
| The following QEMU snippet passes through the host CPU but restricts access to |
| the provision and EINIT token keys:: |
| |
| -cpu host,-sgx-provisionkey,-sgx-tokenkey |
| |
| SGX sub-features cannot be emulated, i.e. sub-features that are not present |
| in hardware cannot be forced on via '-cpu'. |
| |
| Virtualize SGX Launch Control |
| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ |
| |
| QEMU SGX support for Launch Control (LC) is passive, in the sense that it |
| does not actively change the LC configuration. QEMU SGX provides the user |
| the ability to set/clear the CPUID flag (and by extension the associated |
| IA32_FEATURE_CONTROL MSR bit in fw_cfg) and saves/restores the LE Hash MSRs |
| when getting/putting guest state, but QEMU does not add new controls to |
| directly modify the LC configuration. Similar to hardware behavior, locking |
| the LC configuration to a non-Intel value is left to guest firmware. Unlike |
| host bios setting for SGX launch control(LC), there is no special bios setting |
| for SGX guest by our design. If host is in locked mode, we can still allow |
| creating VM with SGX. |
| |
| Feature Control |
| ~~~~~~~~~~~~~~~ |
| |
| QEMU SGX updates the ``etc/msr_feature_control`` fw_cfg entry to set the SGX |
| (bit 18) and SGX LC (bit 17) flags based on their respective CPUID support, |
| i.e. existing guest firmware will automatically set SGX and SGX LC accordingly, |
| assuming said firmware supports fw_cfg.msr_feature_control. |
| |
| Launching a guest |
| ----------------- |
| |
| To launch a SGX guest: |
| |
| .. parsed-literal:: |
| |
| |qemu_system_x86| \\ |
| -cpu host,+sgx-provisionkey \\ |
| -object memory-backend-epc,id=mem1,size=64M,prealloc=on \\ |
| -M sgx-epc.0.memdev=mem1,sgx-epc.0.node=0 |
| |
| Utilizing SGX in the guest requires a kernel/OS with SGX support. |
| The support can be determined in guest by:: |
| |
| $ grep sgx /proc/cpuinfo |
| |
| and SGX epc info by:: |
| |
| $ dmesg | grep sgx |
| [ 0.182807] sgx: EPC section 0x140000000-0x143ffffff |
| [ 0.183695] sgx: [Firmware Bug]: Unable to map EPC section to online node. Fallback to the NUMA node 0. |
| |
| To launch a SGX numa guest: |
| |
| .. parsed-literal:: |
| |
| |qemu_system_x86| \\ |
| -cpu host,+sgx-provisionkey \\ |
| -object memory-backend-ram,size=2G,host-nodes=0,policy=bind,id=node0 \\ |
| -object memory-backend-epc,id=mem0,size=64M,prealloc=on,host-nodes=0,policy=bind \\ |
| -numa node,nodeid=0,cpus=0-1,memdev=node0 \\ |
| -object memory-backend-ram,size=2G,host-nodes=1,policy=bind,id=node1 \\ |
| -object memory-backend-epc,id=mem1,size=28M,prealloc=on,host-nodes=1,policy=bind \\ |
| -numa node,nodeid=1,cpus=2-3,memdev=node1 \\ |
| -M sgx-epc.0.memdev=mem0,sgx-epc.0.node=0,sgx-epc.1.memdev=mem1,sgx-epc.1.node=1 |
| |
| and SGX epc numa info by:: |
| |
| $ dmesg | grep sgx |
| [ 0.369937] sgx: EPC section 0x180000000-0x183ffffff |
| [ 0.370259] sgx: EPC section 0x184000000-0x185bfffff |
| |
| $ dmesg | grep SRAT |
| [ 0.009981] ACPI: SRAT: Node 0 PXM 0 [mem 0x180000000-0x183ffffff] |
| [ 0.009982] ACPI: SRAT: Node 1 PXM 1 [mem 0x184000000-0x185bfffff] |
| |
| References |
| ---------- |
| |
| - `SGX Homepage <https://software.intel.com/sgx>`__ |
| |
| - `SGX SDK <https://github.com/intel/linux-sgx.git>`__ |
| |
| - SGX specification: Intel SDM Volume 3 |