Use multiple thread (de)compression in live migration
=====================================================
Copyright (C) 2015 Intel Corporation
Author: Liang Li <liang.z.li@intel.com>

This work is licensed under the terms of the GNU GPLv2 or later. See
the COPYING file in the top-level directory.

Contents:
=========
* Introduction
* When to use
* Performance
* Usage
* TODO

Introduction
============
Instead of sending the guest memory directly, this solution compresses
each RAM page before sending it; the data is decompressed after it is
received. Using compression in live migration can reduce the amount of
data transferred by about 60%, which is very useful when bandwidth is
limited. The total migration time can also be reduced by about 70% in a
typical case, and the VM downtime by about 50%. The benefit depends on
how compressible the data in the VM is.

Compression consumes additional CPU cycles, and those extra cycles
increase the migration time. On the other hand, the amount of data
transferred decreases, which reduces the total migration time. If
compression is quick enough, the total migration time goes down, and
multiple thread compression can be used to accelerate the compression
process.
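
The idea of compressing pages on several threads can be sketched in
Python as follows. This is only an illustration and not QEMU's
implementation: the 4 KiB page size, the helper names compress_pages
and decompress_pages, and the fake page contents are assumptions made
for the example. CPython's zlib module releases the global interpreter
lock while it compresses, so a thread pool can work on several pages
in parallel.

```python
import zlib
from concurrent.futures import ThreadPoolExecutor

PAGE_SIZE = 4096  # assumed guest page size for this sketch


def compress_pages(pages, threads=8, level=1):
    """Compress each page with zlib on a pool of worker threads.

    Results come back in the same order as the input pages.
    """
    with ThreadPoolExecutor(max_workers=threads) as pool:
        return list(pool.map(lambda page: zlib.compress(page, level), pages))


def decompress_pages(blobs, threads=2):
    """Decompress the blobs on a (smaller) pool of worker threads."""
    with ThreadPoolExecutor(max_workers=threads) as pool:
        return list(pool.map(zlib.decompress, blobs))


if __name__ == "__main__":
    # Fake "guest RAM": 256 pages of highly compressible data.
    pages = [bytes([i % 256]) * PAGE_SIZE for i in range(256)]
    blobs = compress_pages(pages)
    sent = sum(len(blob) for blob in blobs)
    print(f"raw: {len(pages) * PAGE_SIZE} bytes, compressed: {sent} bytes")
    # The receiving side must get back exactly the original pages.
    assert decompress_pages(blobs) == pages
```

How much smaller the compressed total is depends entirely on the page
contents, which mirrors the remark about compressibility above.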

Zlib decompresses at least 4 times as fast as it compresses. If the
source and destination CPUs have equal speed, keeping the compression
thread count at 4 times the decompression thread count avoids wasting
resources.
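
As a rough sketch of this sizing rule, the Python snippet below derives
a decompression thread count from a compression thread count. The
helper names decompress_threads_for and measure_speed_ratio are
hypothetical, and the default ratio of 4 is taken from the paragraph
above rather than measured. The timing helper only gives a coarse
estimate of the ratio on the local machine and for the given data.

```python
import math
import time
import zlib


def decompress_threads_for(compress_threads, speed_ratio=4.0):
    """Thread count for the destination, given the source thread count.

    speed_ratio is how many times faster decompression is assumed to be
    than compression (4 according to the text above).
    """
    return max(1, math.ceil(compress_threads / speed_ratio))


def measure_speed_ratio(data, level=1, rounds=50):
    """Return (compression time) / (decompression time) for `data`."""
    blob = zlib.compress(data, level)

    start = time.perf_counter()
    for _ in range(rounds):
        zlib.compress(data, level)
    compress_time = time.perf_counter() - start

    start = time.perf_counter()
    for _ in range(rounds):
        zlib.decompress(blob)
    decompress_time = time.perf_counter() - start

    return compress_time / decompress_time


if __name__ == "__main__":
    for count in (8, 12):
        print(count, "compression threads ->",
              decompress_threads_for(count), "decompression threads")
```

With a ratio of 4, 8 compression threads map to 2 decompression
threads and 12 map to 3.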

The compression level controls the trade-off between compression speed
and compression ratio. A higher compression ratio takes more time.
Level 0 stands for no compression, level 1 for the best compression
speed, and level 9 for the best compression ratio. Users can select a
level number between 0 and 9.
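
The effect of the level can be tried out with Python's zlib module, as
in the minimal sketch below. The sample page content and the helper
name compressed_sizes are assumptions for illustration; real guest
pages will compress differently.

```python
import zlib

# Assumed sample: one 4 KiB "page" of repetitive text.
PAGE = (b"live migration " * 300)[:4096]


def compressed_sizes(page, levels=(0, 1, 9)):
    """Return {level: compressed size in bytes} for the given page."""
    sizes = {}
    for level in levels:
        blob = zlib.compress(page, level)
        # Every level is lossless; only speed and size differ.
        assert zlib.decompress(blob) == page
        sizes[level] = len(blob)
    return sizes


if __name__ == "__main__":
    for level, size in compressed_sizes(PAGE).items():
        print(f"level {level}: {len(PAGE)} -> {size} bytes")
```

At level 0 the output is slightly larger than the input, because the
data is stored uncompressed inside the zlib framing.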

When to use the multiple thread compression in live migration
=============================================================
Compression consumes extra CPU cycles, so avoid this feature on a
system whose CPU is already heavily loaded. When the network bandwidth
is very limited and CPU resources are adequate, multiple thread
compression is very helpful. If both the CPU and the network bandwidth
are adequate, multiple thread compression can still help to reduce the
migration time.

Performance
===========
Test environment:

CPU: Intel(R) Xeon(R) CPU E5-2680 0 @ 2.70GHz
Socket Count: 2
RAM: 128G
NIC: Intel I350 (10/100/1000Mbps)
Host OS: CentOS 7 64-bit
Guest OS: RHEL 6.5 64-bit
Parameter: qemu-system-x86_64 -enable-kvm -smp 4 -m 4096
 /share/ia32e_rhel6u5.qcow -monitor stdio

No additional application is running on the guest during the test.

Speed limit: 1000Gb/s
---------------------------------------------------------------
                    | original  | compress thread: 8
                    |   way     | decompress thread: 2
                    |           | compression level: 1
---------------------------------------------------------------
total time(msec):   |   3333    |  1833
---------------------------------------------------------------
downtime(msec):     |    100    |   27
---------------------------------------------------------------
transferred ram(kB):|  363536   | 107819
---------------------------------------------------------------
throughput(mbps):   |  893.73   | 482.22
---------------------------------------------------------------
total ram(kB):      |  4211524  | 4211524
---------------------------------------------------------------

An application running on the guest periodically writes random numbers
to RAM block areas.

Speed limit: 1000Gb/s
---------------------------------------------------------------
                    | original  | compress thread: 8
                    |   way     | decompress thread: 2
                    |           | compression level: 1
---------------------------------------------------------------
total time(msec):   |   37369   | 15989
---------------------------------------------------------------
downtime(msec):     |    337    |  173
---------------------------------------------------------------
transferred ram(kB):|  4274143  | 1699824
---------------------------------------------------------------
throughput(mbps):   |  936.99   | 870.95
---------------------------------------------------------------
total ram(kB):      |  4211524  | 4211524
---------------------------------------------------------------
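
To put the two tables in relative terms, the short Python sketch below
computes the percentage reduction for each measured quantity. The
numbers are copied from the tables; the helper name reduction and the
dictionary layout are only illustrative.

```python
def reduction(before, after):
    """Percentage by which `after` is smaller than `before`."""
    return 100.0 * (before - after) / before


# (original way, with compression) pairs copied from the tables.
IDLE_GUEST = {
    "total time(msec)": (3333, 1833),
    "downtime(msec)": (100, 27),
    "transferred ram(kB)": (363536, 107819),
}
BUSY_GUEST = {
    "total time(msec)": (37369, 15989),
    "downtime(msec)": (337, 173),
    "transferred ram(kB)": (4274143, 1699824),
}

if __name__ == "__main__":
    for title, results in (("idle guest", IDLE_GUEST),
                           ("busy guest", BUSY_GUEST)):
        print(title)
        for name, (before, after) in results.items():
            print(f"  {name:<20} reduced by {reduction(before, after):.1f}%")
```

For the idle guest this works out to roughly 45% less total time, 73%
less downtime and 70% less transferred RAM; for the guest that keeps
writing random numbers, roughly 57%, 49% and 60%.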

Usage
=====
1. Verify both the source and destination QEMU are able
   to support the multiple thread compression migration:
    {qemu} info migrate_capabilities
    {qemu} ... compress: off ...

2. Activate compression on the source:
    {qemu} migrate_set_capability compress on

3. Set the compression thread count on the source:
    {qemu} migrate_set_parameter compress_threads 12

4. Set the compression level on the source:
    {qemu} migrate_set_parameter compress_level 1

5. Set the decompression thread count on the destination:
    {qemu} migrate_set_parameter decompress_threads 3

6. Start outgoing migration:
    {qemu} migrate -d tcp:destination.host:4444
    {qemu} info migrate
    Capabilities: ... compress: on
    ...

The following are the default settings:
    compress: off
    compress_threads: 8
    decompress_threads: 2
    compress_level: 1 (which means best speed)

So, only the first two steps are required to use the multiple
thread compression in migration. You can do more if the default
settings are not appropriate.
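
When the steps above are scripted, the monitor commands can be
generated from a few settings, as in the minimal Python sketch below.
The helper names source_commands and destination_commands are
hypothetical, and the sketch only builds the command strings shown in
the steps; how they are delivered to each monitor is left open.

```python
def source_commands(uri, compress_threads=None, compress_level=None):
    """Monitor commands for the source side, in the order of the steps.

    Optional settings left as None are skipped, so the defaults apply.
    """
    commands = ["migrate_set_capability compress on"]
    if compress_threads is not None:
        commands.append(
            f"migrate_set_parameter compress_threads {compress_threads}")
    if compress_level is not None:
        if not 0 <= compress_level <= 9:
            raise ValueError("compress_level must be between 0 and 9")
        commands.append(
            f"migrate_set_parameter compress_level {compress_level}")
    commands.append(f"migrate -d {uri}")
    return commands


def destination_commands(decompress_threads=None):
    """Monitor commands for the destination side."""
    if decompress_threads is None:
        return []  # keep the default decompression thread count
    return [f"migrate_set_parameter decompress_threads {decompress_threads}"]


if __name__ == "__main__":
    for command in destination_commands(decompress_threads=3):
        print("destination:", command)
    for command in source_commands("tcp:destination.host:4444",
                                   compress_threads=12, compress_level=1):
        print("source:", command)
```

Calling source_commands with only the URI yields just the capability
switch and the migrate command, which matches the note that the
remaining settings are optional.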

TODO
====
Faster (de)compression methods such as LZ4 and QuickLZ can help
to reduce the CPU consumption of (de)compression. With these faster
methods, fewer (de)compression threads are needed during migration.