| PCI SR/IOV EMULATION SUPPORT |
| ============================ |
| |
| Description |
| =========== |
| SR/IOV (Single Root I/O Virtualization) is an optional extended capability |
| of a PCI Express device. It allows a single physical function (PF) to appear as multiple |
| virtual functions (VFs) for the main purpose of eliminating software |
| overhead in I/O from virtual machines. |
| |
| Qemu now implements the basic common functionality to enable an emulated device |
| to support SR/IOV. No fully implemented device exists in Qemu yet, but a |
| proof-of-concept hack of the Intel igb can be found here: |
| |
| git://github.com/knuto/qemu.git sriov_patches_v5 |
| |
| Implementation |
| ============== |
| Implementing emulation of an SR/IOV capable device typically consists of |
| implementing support for two device classes: the "normal" physical device |
| (PF) and the virtual device (VF). From Qemu's perspective, the VFs are just |
| like other devices, except that some of their properties are derived from |
| the PF. |
| |
| A virtual function is different from a physical function in that the BAR |
| space for all VFs is defined by the BAR registers in the PF's SR/IOV |
| capability. All VFs have the same BARs and BAR sizes. |
| |
| The address of an access to one of these virtual BARs is then computed as |
| |
| <VF BAR start> + <VF number> * <BAR sz> + <offset> |
| |
| From our emulation perspective this means that there is a separate call for |
| setting up a BAR for a VF. |
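| |
| As a minimal sketch of the address computation above, the helper below |
| evaluates the formula directly. vf_bar_addr() is a hypothetical name used |
| for illustration only and is not part of Qemu; the sketch also assumes that |
| VF numbers are counted from zero and that the offset lies inside one BAR. |
| |
```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper (not a Qemu function): address of a byte inside the
 * BAR window of one VF, computed as
 *   <VF BAR start> + <VF number> * <BAR sz> + <offset>
 * Assumption of this sketch: vf_number is zero-based.
 */
static uint64_t vf_bar_addr(uint64_t vf_bar_start, unsigned vf_number,
                            uint64_t bar_size, uint64_t offset)
{
    /* An offset at or beyond bar_size would land in the next VF's window. */
    assert(offset < bar_size);
    return vf_bar_start + (uint64_t)vf_number * bar_size + offset;
}
```
| |
| With an example VF BAR start of 0xfe000000 and a BAR size of 0x4000, offset |
| 0x10 in the BAR of VF 3 would resolve to 0xfe00c010. |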
| |
| 1) To enable SR/IOV support in the PF, it must be a PCI Express device, so |
| you need to add a PCI Express capability to the normal PCI |
| capability list. You might also want to add an ARI (Alternative |
| Routing-ID Interpretation) capability to indicate that your device |
| supports functions beyond its "own" function space (0-7). |
| This is necessary to support more than 7 functions, or |
| if functions extend beyond function number 7 because they are placed at an |
| offset > 1 or have a stride > 1. |
| |
| ... |
| #include "hw/pci/pcie.h" |
| #include "hw/pci/pcie_sriov.h" |
| |
| pci_your_pf_dev_realize( ... ) |
| { |
|     ... |
|     int ret = pcie_endpoint_cap_init(d, 0x70); |
|     ... |
|     pcie_ari_init(d, 0x100, 1); |
|     ... |
| |
|     /* Add and initialize the SR/IOV capability */ |
|     pcie_sriov_pf_init(d, 0x200, "your_virtual_dev", |
|                        vf_devid, initial_vfs, total_vfs, |
|                        fun_offset, stride); |
| |
|     /* Set up individual VF BARs (parameters as for normal BARs) */ |
|     pcie_sriov_pf_init_vf_bar( ... ); |
|     ... |
| } |
| |
| For cleanup, you simply call: |
| |
| pcie_sriov_pf_exit(device); |
| |
| which will delete all the virtual functions and associated resources. |
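| |
| To illustrate when the ARI capability mentioned in this step becomes |
| necessary, the sketch below computes the function number each VF would get |
| from the fun_offset and stride values passed to pcie_sriov_pf_init(). Both |
| helpers are hypothetical and not part of Qemu. The sketch assumes that VFs |
| are indexed from zero, that VF i sits at <PF function> + fun_offset + |
| i * stride, and that all VFs stay on the PF's bus (function numbers 0-255). |
| |
```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical helper (not a Qemu function): function number of the VF
 * with zero-based index vf_index, assuming the layout
 *   <PF function> + fun_offset + vf_index * stride
 */
static unsigned vf_function_number(unsigned pf_fn, unsigned fun_offset,
                                   unsigned stride, unsigned vf_index)
{
    unsigned fn = pf_fn + fun_offset + vf_index * stride;

    /* This sketch only handles VFs that fit on the PF's own bus. */
    assert(fn <= 255);
    return fn;
}

/*
 * Hypothetical helper: true if the last of num_vfs VFs lands outside the
 * traditional function space (0-7), so that ARI would be needed.
 */
static bool needs_ari(unsigned pf_fn, unsigned fun_offset,
                      unsigned stride, unsigned num_vfs)
{
    if (num_vfs == 0) {
        return false;
    }
    return vf_function_number(pf_fn, fun_offset, stride, num_vfs - 1) > 7;
}
```
| |
| Under these assumptions, a PF at function 0 with fun_offset 1 and stride 1 |
| fits 7 VFs into functions 1-7; an 8th VF, or a stride of 2 with 4 VFs, |
| pushes the last VF past function 7. |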
| |
| 2) Similarly, in the implementation of the virtual function, you need to |
| make it a PCI Express device and add a similar set of capabilities, |
| except for the SR/IOV capability. Then you need to set up the VF BARs as |
| subregions of the PF's SR/IOV VF BARs by calling |
| pcie_sriov_vf_register_bar() instead of the normal pci_register_bar() call: |
| |
| pci_your_vf_dev_realize( ... ) |
| { |
|     ... |
|     int ret = pcie_endpoint_cap_init(d, 0x60); |
|     ... |
|     pcie_ari_init(d, 0x100, 1); |
|     ... |
|     memory_region_init(mr, ... ); |
|     pcie_sriov_vf_register_bar(d, bar_nr, mr); |
|     ... |
| } |
| |
| Testing on Linux guest |
| ====================== |
| The easiest approach is to use a device driver that supports sysfs-based |
| SR/IOV enabling. Support for this was added in kernel v3.8, so not all drivers |
| support it yet. |
| |
| To enable 4 VFs for a device at 01:00.0: |
| |
| modprobe yourdriver |
| echo 4 > /sys/bus/pci/devices/0000:01:00.0/sriov_numvfs |
| |
| You should now see 4 VFs with lspci. |
| To turn SR/IOV off again, write 0 to the same file. The standard requires you |
| to turn it off before you can enable another VF count, and the emulation |
| enforces this: |
| |
| echo 0 > /sys/bus/pci/devices/0000:01:00.0/sriov_numvfs |
| |
| Older drivers typically provide a max_vfs module parameter |
| to enable SR/IOV at load time: |
| |
| modprobe yourdriver max_vfs=4 |
| |
| To disable the VFs again, you simply have to unload the driver: |
| |
| rmmod yourdriver |