| Block driver correctness testing with ``blkverify`` |
| =================================================== |
| |
| Introduction |
| ------------ |
| |
| This document describes how to use the ``blkverify`` protocol to test that a block |
| driver is operating correctly. |
| |
| It is difficult to test and debug block drivers against real guests. Often |
| processes inside the guest will crash because corrupt sectors were read as part |
| of the executable. Other times obscure errors are raised by a program inside |
| the guest. These issues are extremely hard to trace back to bugs in the block |
| driver. |
| |
| ``blkverify`` solves this problem by catching data corruption inside QEMU the first |
| time bad data is read and reporting the disk sector that is corrupted. |
| |
| How it works |
| ------------ |
| |
| The ``blkverify`` protocol has two child block devices, the "test" device and the |
| "raw" device. Read/write operations are mirrored to both devices so their |
| state should always be in sync. |
| |
| The "raw" device is a raw image (a flat file) whose starting contents are |
| identical to those of the "test" image. The idea is that the "raw" device will |
| handle read/write operations correctly and not corrupt data, so it can serve as |
| a reference for comparison against the "test" device. |
| |
| After a mirrored read operation completes, ``blkverify`` compares the data |
| returned by the two devices and raises an error if they differ. This makes it |
| possible to catch the first instance where corrupt data is read. |
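
The mirroring and comparison described above can be sketched as a toy model in
Python. This is only an illustration under stated assumptions, not QEMU's
implementation: the ``ToyBlkverify`` class, the ``VerifyMismatch`` exception,
and the use of in-memory ``bytearray`` objects as devices are all hypothetical.

```python
SECTOR_SIZE = 512  # bytes per sector, matching the 512-byte reads in this document


class VerifyMismatch(Exception):
    """Raised when the "test" and "raw" devices return different data."""


class ToyBlkverify:
    """Toy model of mirrored I/O with read verification (not QEMU code)."""

    def __init__(self, test_dev: bytearray, raw_dev: bytearray):
        if len(test_dev) != len(raw_dev):
            raise ValueError("devices must be the same size")
        self.test_dev = test_dev
        self.raw_dev = raw_dev

    def write(self, offset: int, data: bytes) -> None:
        # Mirror the write to both devices so their state stays in sync.
        end = offset + len(data)
        if offset < 0 or end > len(self.raw_dev):
            raise ValueError("write out of range")
        self.test_dev[offset:end] = data
        self.raw_dev[offset:end] = data

    def read(self, offset: int, length: int) -> bytes:
        # Mirror the read to both devices, then compare the results.
        test_data = bytes(self.test_dev[offset:offset + length])
        raw_data = bytes(self.raw_dev[offset:offset + length])
        for i, (t, r) in enumerate(zip(test_data, raw_data)):
            if t != r:
                # Report the first sector whose contents differ.
                sector = (offset + i) // SECTOR_SIZE
                raise VerifyMismatch(f"contents mismatch in sector {sector}")
        return raw_data
```

In this sketch a read fails as soon as the two copies diverge, so corruption
introduced by the "test" side is reported on the first read that touches it
rather than later, when a guest program misbehaves.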
| |
| Example |
| ------- |
| |
| Imagine raw.img has 0xcd repeated throughout its first sector:: |
| |
| $ ./qemu-io -c 'read -v 0 512' raw.img |
| 00000000: cd cd cd cd cd cd cd cd cd cd cd cd cd cd cd cd ................ |
| 00000010: cd cd cd cd cd cd cd cd cd cd cd cd cd cd cd cd ................ |
| [...] |
| 000001e0: cd cd cd cd cd cd cd cd cd cd cd cd cd cd cd cd ................ |
| 000001f0: cd cd cd cd cd cd cd cd cd cd cd cd cd cd cd cd ................ |
| read 512/512 bytes at offset 0 |
| 512.000000 bytes, 1 ops; 0.0000 sec (97.656 MiB/sec and 200000.0000 ops/sec) |
| |
| But test.img is corrupt: its first sector is zeroed when it should not be:: |
| |
| $ ./qemu-io -c 'read -v 0 512' test.img |
| 00000000: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ |
| 00000010: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ |
| [...] |
| 000001e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ |
| 000001f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ |
| read 512/512 bytes at offset 0 |
| 512.000000 bytes, 1 ops; 0.0000 sec (81.380 MiB/sec and 166666.6667 ops/sec) |
| |
| This error is caught by ``blkverify``:: |
| |
| $ ./qemu-io -c 'read 0 512' blkverify:raw.img:test.img |
| blkverify: read sector_num=0 nb_sectors=4 contents mismatch in sector 0 |
| |
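The same comparison can be reproduced outside QEMU when both images are flat
files. The following is a minimal sketch, assuming raw-format images on both
sides and 512-byte sectors; ``first_mismatched_sector`` is a hypothetical
helper, not part of QEMU, and it would not work directly on an image format
such as qcow2.

```python
from typing import Optional

SECTOR_SIZE = 512  # assumed sector size, matching the 512-byte reads above


def first_mismatched_sector(path_a: str, path_b: str) -> Optional[int]:
    """Return the first sector that differs between two flat files, or None."""
    sector = 0
    with open(path_a, "rb") as a, open(path_b, "rb") as b:
        while True:
            chunk_a = a.read(SECTOR_SIZE)
            chunk_b = b.read(SECTOR_SIZE)
            if not chunk_a and not chunk_b:
                # Both files ended without any difference.
                return None
            if chunk_a != chunk_b:
                # Also covers the case where one file is shorter.
                return sector
            sector += 1
```

For the two images shown above, a call such as
``first_mismatched_sector("raw.img", "test.img")`` would be expected to
return ``0``, since the first sector is the one that differs.
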
| A more realistic scenario is verifying the installation of a guest OS:: |
| |
| $ ./qemu-img create raw.img 16G |
| $ ./qemu-img create -f qcow2 test.qcow2 16G |
| $ ./qemu-system-x86_64 -cdrom debian.iso \ |
| -drive file=blkverify:raw.img:test.qcow2 |
| |
| If the installation is aborted when ``blkverify`` detects corruption, use ``qemu-io`` |
| to explore the contents of the disk image at the sector in question. |
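
To inspect the reported sector, its number has to be converted to the byte
offset that the ``read`` command of ``qemu-io`` takes. The following is a
minimal sketch, assuming the sector number reported by ``blkverify`` is in
512-byte units as in the example above; ``qemu_io_read_command`` is a
hypothetical helper that only builds the argument list and does not run
anything.

```python
from typing import List

SECTOR_SIZE = 512  # assumed sector size in bytes


def qemu_io_read_command(image: str, sector: int, nb_sectors: int = 1,
                         qemu_io: str = "./qemu-io") -> List[str]:
    """Build a qemu-io argument list that dumps the given sectors."""
    if sector < 0 or nb_sectors < 1:
        raise ValueError("sector must be >= 0 and nb_sectors >= 1")
    offset = sector * SECTOR_SIZE
    length = nb_sectors * SECTOR_SIZE
    # 'read -v OFFSET LENGTH' is the same command form used earlier
    # in this document.
    return [qemu_io, "-c", f"read -v {offset} {length}", image]
```

The resulting list can be passed to ``subprocess.run`` or joined into a shell
command line. Running it once against the "test" image and once against the
"raw" image shows how the two copies of the sector differ.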