COLO-proxy
----------
Copyright (c) 2016 Intel Corporation
Copyright (c) 2016 HUAWEI TECHNOLOGIES CO., LTD.
Copyright (c) 2016 Fujitsu, Corp.

This work is licensed under the terms of the GNU GPL, version 2 or later.
See the COPYING file in the top-level directory.

This document gives an overview of COLO proxy's design.

== Background ==
COLO-proxy is a part of the COLO project. It compares
network packets to help COLO decide whether to do a
checkpoint. With COLO-proxy's help, COLO can skip
checkpoints while the primary and secondary outputs
match, which greatly improves its performance.

The filter-redirector, filter-mirror, colo-compare
and filter-rewriter compose the COLO-proxy.

== Architecture ==

COLO-proxy is based on qemu netfilter and is implemented as qemu netfilter
plugins (except colo-compare). It keeps the Secondary VM (SVM) connected
normally to the client and compares the packets sent by the Primary VM (PVM)
with those sent by the SVM. If the packets differ, it notifies COLO-frame to
do a checkpoint and sends all queued primary packets. Otherwise it just sends
the queued primary packet and drops the queued secondary packet.
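
The compare-and-checkpoint decision described above can be sketched as follows. This is a toy model rather than QEMU's colo-compare code: the class name, the callbacks `send` and `notify_checkpoint`, and the choice to discard queued secondary packets when a checkpoint is requested are all assumptions made for illustration.

```python
from collections import deque


class CompareSketch:
    """Toy model of the compare step (hypothetical, not QEMU code)."""

    def __init__(self, send, notify_checkpoint):
        self.primary = deque()      # queued packets from the PVM
        self.secondary = deque()    # queued packets from the SVM
        self.send = send            # hypothetical: releases a packet to the client
        self.notify_checkpoint = notify_checkpoint  # hypothetical: asks COLO-frame

    def on_primary(self, pkt: bytes) -> None:
        self.primary.append(pkt)
        self._compare()

    def on_secondary(self, pkt: bytes) -> None:
        self.secondary.append(pkt)
        self._compare()

    def _compare(self) -> None:
        # Compare only while both sides have a packet queued.
        while self.primary and self.secondary:
            ppkt = self.primary.popleft()
            spkt = self.secondary.popleft()
            if ppkt == spkt:
                # Same output: release the primary copy, drop the secondary one.
                self.send(ppkt)
                continue
            # Different output: request a checkpoint and flush the primary queue.
            self.notify_checkpoint()
            self.send(ppkt)
            while self.primary:
                self.send(self.primary.popleft())
            # Assumption: stale secondary packets are discarded at this point.
            self.secondary.clear()
```

In this sketch a primary packet is held until its secondary counterpart arrives, which is why the client only ever sees primary packets.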

Below is an ASCII figure of the COLO proxy:

 Primary qemu                                                           Secondary qemu
+--------------------------------------------------------------+       +----------------------------------------------------------------+
| +----------------------------------------------------------+ |       |  +-----------------------------------------------------------+ |
| |                                                          | |       |  |                                                           | |
| |                        guest                             | |       |  |                        guest                              | |
| |                                                          | |       |  |                                                           | |
| +-------^--------------------------+-----------------------+ |       |  +---------------------+--------+----------------------------+ |
|         |                          |                         |       |                        ^        |                              |
|         |                          |                         |       |                        |        |                              |
|         |  +------------------------------------------------------+  |                        |        |                              |
|netfilter|  |                       |                         |    |  |   netfilter            |        |                              |
| +----------+ +----------------------------+                  |    |  |  +-----------------------------------------------------------+ |
| |       |  |                       |      |        out       |    |  |  |                     |        |  filter execute order      | |
| |       |  |          +-----------------------------+        |    |  |  |                     |        | +------------------->      | |
| |       |  |          |            |      |         |        |    |  |  |                     |        |   TCP                      | |
| | +-----+--+-+  +-----v----+ +-----v----+ |pri +----+----+sec|    |  |  | +------------+  +---+----+---v+rewriter++  +------------+ | |
| | |          |  |          | |          | |in  |         |in |    |  |  | |            |  |        |              |  |            | | |
| | |  filter  |  |  filter  | |  filter  +------>  colo   <------+ +------->   filter   +--> adjust |   adjust     +-->   filter   | | |
| | |  mirror  |  |redirector| |redirector| |    | compare |   |  |    |  | | redirector |  | ack    |   seq        |  | redirector | | |
| | |          |  |          | |          | |    |         |   |  |    |  | |            |  |        |              |  |            | | |
| | +----^-----+  +----+-----+ +----------+ |    +---------+   |  |    |  | +------------+  +--------+--------------+  +---+--------+ | |
| |      |   tx        |   rx           rx  |                  |  |    |  |            tx                        all       |  rx      | |
| |      |             |                    |                  |  |    |  +-----------------------------------------------------------+ |
| |      |             +--------------+     |                  |  |    |                                                   |            |
| |      |   filter execute order     |     |                  |  |    |                                                   |            |
| |      |  +---------------->        |     |                  |  +--------------------------------------------------------+            |
| +-----------------------------------------+                  |       |                                                                |
|        |                            |                        |       |                                                                |
+--------------------------------------------------------------+       +----------------------------------------------------------------+
         |guest receive               | guest send
         |                            |
+--------+----------------------------v------------------------+
|                                                              |  NOTE: filter direction is rx/tx/all
|                         tap                                  |  rx:receive packets sent to the netdev
|                                                              |  tx:receive packets sent by the netdev
+--------------------------------------------------------------+

1. Guest receive packet route:

Primary:

    Tap --> Mirror Client Filter
        The mirror client sends the packet to the guest and, at the
        same time, copies it and forwards the copy to the secondary
        mirror server.

Secondary:

    Mirror Server Filter --> TCP Rewriter
        If the received packet is a TCP packet, the rewriter adjusts
        the ack and updates the TCP checksum, then sends the packet to
        the secondary guest. Otherwise the packet is sent to the guest
        directly.

2. Guest send packet route:

Primary:

    Guest --> Redirect Server Filter
        The redirect server filter receives the primary guest's packet
        but does nothing with it; it just passes it to the next filter.

    Redirect Server Filter --> COLO-Compare
        COLO-compare receives the primary guest's packet, then waits
        for the redirected secondary packet to compare it with.
        If the packets are the same, it sends the queued primary packet
        and clears the queued secondary packet. Otherwise it sends the
        primary packet and does a checkpoint.

    COLO-Compare --> Another Redirector Filter
        The redirector gets the packet from colo-compare through a
        chardev socket.

    Redirector Filter --> Tap
        Send the packet.

Secondary:

    Guest --> TCP Rewriter Filter
        If the packet is a TCP packet, the rewriter adjusts the seq
        and updates the TCP checksum, then sends it to the redirect
        client filter. Otherwise the packet is sent to the redirect
        client filter directly.

    Redirect Client Filter --> Redirect Server Filter
        Forward the packet to the primary.
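
Both rewriter steps above end with "update the TCP checksum". As a minimal sketch of what that recomputation involves, the code below implements the standard Internet checksum (RFC 1071) over an IPv4 pseudo-header plus the TCP segment. It is not taken from QEMU; the function names are hypothetical, and IPv4 without options is assumed.

```python
import socket
import struct


def internet_checksum(data: bytes) -> int:
    """Ones' complement sum of 16-bit big-endian words, complemented (RFC 1071)."""
    if len(data) % 2:
        data += b"\x00"                      # pad odd-length input with a zero byte
    total = sum(struct.unpack("!%dH" % (len(data) // 2), data))
    while total >> 16:                       # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def tcp_checksum(src_ip: str, dst_ip: str, segment: bytes) -> int:
    """Checksum for a TCP segment whose checksum field (bytes 16-17) is zeroed."""
    pseudo = struct.pack(
        "!4s4sBBH",
        socket.inet_aton(src_ip),
        socket.inet_aton(dst_ip),
        0,                                   # reserved byte
        socket.IPPROTO_TCP,
        len(segment),                        # TCP header + payload length
    )
    return internet_checksum(pseudo + segment)
```

After the seq or ack field is changed, the checksum field would be zeroed, recomputed with `tcp_checksum`, and written back. A receiver that runs the same sum over the finished segment gets zero.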

== Components introduction ==

Filter-mirror is a netfilter plugin.
It gives qemu the ability to mirror
packets to a chardev.

Filter-redirector is a netfilter plugin.
It gives qemu the ability to redirect network packets.
The redirector can redirect the filter's packets to outdev,
and redirect packets arriving on indev to the filter.

                     filter
                        +
        redirector      |
           +--------------+
           |            | |
           |            | |
           |            | |
  indev +---------+     +----------> outdev
           |      |       |
           |      |       |
           |      |       |
           +--------------+
                  |
                  v
                filter
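
The two directions in the figure can be modelled with a small sketch. It is illustrative only and does not mirror QEMU's filter-redirector implementation: the class, its method names, and the callables standing in for chardevs and for the next filter are hypothetical. Passing packets along the chain when no outdev is configured is an assumption based on the route description, where the indev-only redirector just passes guest packets to the next filter.

```python
class RedirectorSketch:
    """Toy redirector: filter -> outdev, indev -> filter (hypothetical model)."""

    def __init__(self, next_filter, outdev=None):
        self.next_filter = next_filter  # callable standing in for the filter chain
        self.outdev = outdev            # callable standing in for the outdev chardev

    def from_filter(self, pkt: bytes) -> None:
        """A packet reaches the redirector through the netfilter chain."""
        if self.outdev is not None:
            self.outdev(pkt)            # redirected away from the chain
        else:
            self.next_filter(pkt)       # assumption: nothing to do, pass it on

    def from_indev(self, pkt: bytes) -> None:
        """A packet arrives on the indev chardev and is injected into the chain."""
        self.next_filter(pkt)
```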

COLO-compare does the packet comparing job.
Packets coming from the primary chardev (primary_in) are sent to outdev.
Packets coming from the secondary chardev (secondary_in) are dropped after comparing.
COLO-compare needs two input chardevs and one output chardev:
primary_in=chardev1-id (source: primary send packet)
secondary_in=chardev2-id (source: secondary send packet)
outdev=chardev3-id

Filter-rewriter rewrites some of the secondary's packets so that the
secondary guest's TCP connection can be established successfully.
This module rewrites the ack of TCP packets going to the secondary
from the primary, and rewrites the seq of TCP packets going to the
primary from the secondary.
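
One way to picture the seq/ack rewriting is as a fixed offset between the two guests' sequence number spaces. The sketch below assumes that the offset is the difference between the initial sequence numbers chosen by the primary and secondary guests, and that arithmetic wraps at 2^32. It is not QEMU's filter-rewriter code, and all names in it are hypothetical.

```python
SEQ_MOD = 1 << 32  # TCP sequence numbers are 32-bit and wrap around


class RewriterSketch:
    """Toy seq/ack translator between primary and secondary numbering."""

    def __init__(self, primary_isn: int, secondary_isn: int):
        # Assumption: the offset is learned once, from the two guests' initial
        # sequence numbers for the same connection.
        self.offset = (secondary_isn - primary_isn) % SEQ_MOD

    def ack_to_secondary(self, ack: int) -> int:
        """Client acks use primary numbering; shift them for the secondary guest."""
        return (ack + self.offset) % SEQ_MOD

    def seq_from_secondary(self, seq: int) -> int:
        """Secondary seq numbers are shifted back so they compare with the primary's."""
        return (seq - self.offset) % SEQ_MOD
```

The two methods are inverses of each other, so a packet translated in one direction and back keeps its original number.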

== Usage ==

Here is an example using demonstration IP and port addresses to more
clearly describe the usage.

Primary(ip:3.3.3.3):
-netdev tap,id=hn0,vhost=off
-device e1000,id=e0,netdev=hn0,mac=52:a4:00:12:78:66
-chardev socket,id=mirror0,host=3.3.3.3,port=9003,server=on,wait=off
-chardev socket,id=compare1,host=3.3.3.3,port=9004,server=on,wait=off
-chardev socket,id=compare0,host=3.3.3.3,port=9001,server=on,wait=off
-chardev socket,id=compare0-0,host=3.3.3.3,port=9001
-chardev socket,id=compare_out,host=3.3.3.3,port=9005,server=on,wait=off
-chardev socket,id=compare_out0,host=3.3.3.3,port=9005
-object iothread,id=iothread1
-object filter-mirror,id=m0,netdev=hn0,queue=tx,outdev=mirror0
-object filter-redirector,netdev=hn0,id=redire0,queue=rx,indev=compare_out
-object filter-redirector,netdev=hn0,id=redire1,queue=rx,outdev=compare0
-object colo-compare,id=comp0,primary_in=compare0-0,secondary_in=compare1,outdev=compare_out0,iothread=iothread1

Secondary(ip:3.3.3.8):
-netdev tap,id=hn0,vhost=off
-device e1000,netdev=hn0,mac=52:a4:00:12:78:66
-chardev socket,id=red0,host=3.3.3.3,port=9003
-chardev socket,id=red1,host=3.3.3.3,port=9004
-object filter-redirector,id=f1,netdev=hn0,queue=tx,indev=red0
-object filter-redirector,id=f2,netdev=hn0,queue=rx,outdev=red1
-object filter-rewriter,id=f3,netdev=hn0,queue=all
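
The chardev options above form client/server pairs that share a host and port. A small script can make that wiring explicit. The sketch below only parses the option strings copied from this example (without the leading `-chardev`); the helper names are hypothetical, and it is not a general parser for QEMU's option syntax.

```python
PRIMARY_CHARDEVS = [
    "socket,id=mirror0,host=3.3.3.3,port=9003,server=on,wait=off",
    "socket,id=compare1,host=3.3.3.3,port=9004,server=on,wait=off",
    "socket,id=compare0,host=3.3.3.3,port=9001,server=on,wait=off",
    "socket,id=compare0-0,host=3.3.3.3,port=9001",
    "socket,id=compare_out,host=3.3.3.3,port=9005,server=on,wait=off",
    "socket,id=compare_out0,host=3.3.3.3,port=9005",
]
SECONDARY_CHARDEVS = [
    "socket,id=red0,host=3.3.3.3,port=9003",
    "socket,id=red1,host=3.3.3.3,port=9004",
]


def parse_chardev(opt: str) -> dict:
    """Split 'socket,key=value,...' into a dict (simplified, example-only parser)."""
    backend, *pairs = opt.split(",")
    fields = dict(pair.split("=", 1) for pair in pairs)
    fields["backend"] = backend
    return fields


def pair_clients_with_servers(chardevs: list) -> dict:
    """Map each client chardev id to the server chardev id on the same host:port."""
    parsed = [parse_chardev(c) for c in chardevs]
    servers = {(c["host"], c["port"]): c["id"]
               for c in parsed if c.get("server") == "on"}
    # A KeyError here means a client chardev has no matching server.
    return {c["id"]: servers[(c["host"], c["port"])]
            for c in parsed if c.get("server") != "on"}


wiring = pair_clients_with_servers(PRIMARY_CHARDEVS + SECONDARY_CHARDEVS)
```

For this example `wiring` pairs compare0-0 with compare0 and compare_out0 with compare_out inside the primary, and pairs the secondary's red0 and red1 with the primary's mirror0 and compare1.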

If you want to use virtio-net-pci or another driver with vnet_header:

Primary(ip:3.3.3.3):
-netdev tap,id=hn0,vhost=off,script=/etc/qemu-ifup,downscript=/etc/qemu-ifdown
-device e1000,id=e0,netdev=hn0,mac=52:a4:00:12:78:66
-chardev socket,id=mirror0,host=3.3.3.3,port=9003,server=on,wait=off
-chardev socket,id=compare1,host=3.3.3.3,port=9004,server=on,wait=off
-chardev socket,id=compare0,host=3.3.3.3,port=9001,server=on,wait=off
-chardev socket,id=compare0-0,host=3.3.3.3,port=9001
-chardev socket,id=compare_out,host=3.3.3.3,port=9005,server=on,wait=off
-chardev socket,id=compare_out0,host=3.3.3.3,port=9005
-object filter-mirror,id=m0,netdev=hn0,queue=tx,outdev=mirror0,vnet_hdr_support
-object filter-redirector,netdev=hn0,id=redire0,queue=rx,indev=compare_out,vnet_hdr_support
-object filter-redirector,netdev=hn0,id=redire1,queue=rx,outdev=compare0,vnet_hdr_support
-object colo-compare,id=comp0,primary_in=compare0-0,secondary_in=compare1,outdev=compare_out0,vnet_hdr_support

Secondary(ip:3.3.3.8):
-netdev tap,id=hn0,vhost=off
-device e1000,netdev=hn0,mac=52:a4:00:12:78:66
-chardev socket,id=red0,host=3.3.3.3,port=9003
-chardev socket,id=red1,host=3.3.3.3,port=9004
-object filter-redirector,id=f1,netdev=hn0,queue=tx,indev=red0,vnet_hdr_support
-object filter-redirector,id=f2,netdev=hn0,queue=rx,outdev=red1,vnet_hdr_support
-object filter-rewriter,id=f3,netdev=hn0,queue=all,vnet_hdr_support

Note:
  a. COLO-proxy must work with COLO-frame and Block-replication.
  b. The primary COLO must be started first, because COLO-proxy needs
     the chardev socket servers to be running before the secondary starts.
  c. Filter-rewriter only rewrites TCP packets.