|  | # Specification for the fuzz testing tool | 
|  | # | 
|  | # Copyright (C) 2014 Maria Kustova <maria.k@catit.be> | 
|  | # | 
|  | # This program is free software: you can redistribute it and/or modify | 
|  | # it under the terms of the GNU General Public License as published by | 
|  | # the Free Software Foundation, either version 2 of the License, or | 
|  | # (at your option) any later version. | 
|  | # | 
|  | # This program is distributed in the hope that it will be useful, | 
|  | # but WITHOUT ANY WARRANTY; without even the implied warranty of | 
|  | # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the | 
|  | # GNU General Public License for more details. | 
|  | # | 
|  | # You should have received a copy of the GNU General Public License | 
|  | # along with this program.  If not, see <http://www.gnu.org/licenses/>. | 
|  |  | 
|  |  | 
|  | Image fuzzer | 
|  | ============ | 
|  |  | 
|  | Description | 
|  | ----------- | 
|  |  | 
|  | The goal of the image fuzzer is to catch crashes of qemu-io/qemu-img | 
|  | by providing to them randomly corrupted images. | 
|  | Test images are generated from scratch and have valid inner structure with some | 
|  | elements, e.g. L1/L2 tables, having random invalid values. | 
|  |  | 
|  |  | 
|  | Test runner | 
|  | ----------- | 
|  |  | 
|  | The test runner generates test images, executes tests utilizing generated | 
|  | images, indicates their results and collects all test related artifacts (logs, | 
|  | core dumps, test images, backing files). | 
|  | The test means execution of all available commands under test with the same | 
|  | generated test image. | 
|  | By default, the test runner generates new tests and executes them until | 
|  | keyboard interruption. But if a test seed is specified via the '--seed' runner | 
|  | parameter, then only one test with this seed will be executed, after its finish | 
|  | the runner will exit. | 
|  |  | 
|  | The runner uses an external image fuzzer to generate test images. An image | 
|  | generator should be specified as a mandatory parameter of the test runner. | 
|  | Details about interactions between the runner and fuzzers see "Module | 
|  | interfaces". | 
|  |  | 
|  | The runner activates generation of core dumps during test executions, but it | 
|  | assumes that core dumps will be generated in the current working directory. | 
|  | For comprehensive test results, please, set up your test environment | 
|  | properly. | 
|  |  | 
|  | Paths to binaries under test (SUTs) ``qemu-img`` and ``qemu-io`` are retrieved | 
|  | from environment variables. If the environment check fails the runner will | 
|  | use SUTs installed in system paths. | 
|  | ``qemu-img`` is required for creation of backing files, so it's mandatory to set | 
|  | the related environment variable if it's not installed in the system path. | 
|  | For details about environment variables see qemu-iotests/check. | 
|  |  | 
|  | The runner accepts a JSON array of fields expected to be fuzzed via the | 
|  | '--config' argument, e.g. | 
|  |  | 
|  | '[["feature_name_table"], ["header", "l1_table_offset"]]' | 
|  |  | 
|  | Each sublist can have one or two strings defining image structure elements. | 
|  | In the latter case a parent element should be placed on the first position, | 
|  | and a field name on the second one. | 
|  |  | 
|  | The runner accepts a list of commands under test as a JSON array via | 
|  | the '--command' argument. Each command is a list containing a SUT and all its | 
|  | arguments, e.g. | 
|  |  | 
|  | runner.py -c '[["qemu-io", "$test_img", "-c", "write $off $len"]]' | 
|  | /tmp/test ../qcow2 | 
|  |  | 
|  | For variable arguments next aliases can be used: | 
|  | - $test_img for a fuzzed img | 
|  | - $off for an offset in the fuzzed image | 
|  | - $len for a data size | 
|  |  | 
|  | Values for last two aliases will be generated based on a size of a virtual | 
|  | disk of the generated image. | 
|  | In case when no commands are specified the runner will execute commands from | 
|  | the default list: | 
|  | - qemu-img check | 
|  | - qemu-img info | 
|  | - qemu-img convert | 
|  | - qemu-io -c read | 
|  | - qemu-io -c write | 
|  | - qemu-io -c aio_read | 
|  | - qemu-io -c aio_write | 
|  | - qemu-io -c flush | 
|  | - qemu-io -c discard | 
|  | - qemu-io -c truncate | 
|  |  | 
|  |  | 
|  | Qcow2 image generator | 
|  | --------------------- | 
|  |  | 
|  | The 'qcow2' generator is a Python package providing 'create_image' method as | 
|  | a single public API. See details in 'Test runner/image fuzzer' chapter of | 
|  | 'Module interfaces'. | 
|  |  | 
|  | Qcow2 contains two submodules: fuzz.py and layout.py. | 
|  |  | 
|  | 'fuzz.py' contains all fuzzing functions, one per image field. It's assumed | 
|  | that after code analysis every field will have own constraints for its value. | 
|  | For now only universal potentially dangerous values are used, e.g. type limits | 
|  | for integers or unsafe symbols as '%s' for strings. For bitmasks random amount | 
|  | of bits are set to ones. All fuzzed values are checked on non-equality to the | 
|  | current valid value of the field. In case of equality the value will be | 
|  | regenerated. | 
|  |  | 
|  | 'layout.py' creates a random valid image, fuzzes a random subset of the image | 
|  | fields by 'fuzz.py' module and writes a fuzzed image to the file specified. | 
|  | If a fuzzer configuration is specified, then it has the next interpretation: | 
|  |  | 
|  | 1. If a list contains a parent image element only, then some random portion | 
|  | of fields of this element will be fuzzed every test. | 
|  | The same behavior is applied for the entire image if no configuration is | 
|  | used. This case is useful for the test specialization. | 
|  |  | 
|  | 2. If a list contains a parent element and a field name, then a field | 
|  | will be always fuzzed for every test. This case is useful for regression | 
|  | testing. | 
|  |  | 
|  | The generator can create header fields, header extensions, L1/L2 tables and | 
|  | refcount table and blocks. | 
|  |  | 
|  | Module interfaces | 
|  | ----------------- | 
|  |  | 
|  | * Test runner/image fuzzer | 
|  |  | 
|  | The runner calls an image generator specifying the path to a test image file, | 
|  | path to a backing file and its format and a fuzzer configuration. | 
|  | An image generator is expected to provide a | 
|  |  | 
|  | 'create_image(test_img_path, backing_file_path=None, | 
|  | backing_file_format=None, fuzz_config=None)' | 
|  |  | 
|  | method that creates a test image, writes it to the specified file and returns | 
|  | the size of the virtual disk. | 
|  | The file should be created if it doesn't exist or overwritten otherwise. | 
|  | fuzz_config has a form of a list of lists. Every sublist can have one | 
|  | or two elements: first element is a name of a parent image element, second one | 
|  | if exists is a name of a field in this element. | 
|  | Example, | 
|  | [['header', 'l1_table_offset'], | 
|  | ['header', 'nb_snapshots'], | 
|  | ['feature_name_table']] | 
|  |  | 
|  | Random seed is set by the runner at every test execution for the regression | 
|  | purpose, so an image generator is not recommended to modify it internally. | 
|  |  | 
|  |  | 
|  | Overall fuzzer requirements | 
|  | =========================== | 
|  |  | 
|  | Input data: | 
|  | ---------- | 
|  |  | 
|  | - image template (generator) | 
|  | - work directory | 
|  | - action vector (optional) | 
|  | - seed (optional) | 
|  | - SUT and its arguments (optional) | 
|  |  | 
|  |  | 
|  | Fuzzer requirements: | 
|  | ------------------- | 
|  |  | 
|  | 1.  Should be able to inject random data | 
|  | 2.  Should be able to select a random value from the manually pregenerated | 
|  | vector (boundary values, e.g. max/min cluster size) | 
|  | 3.  Image template should describe a general structure invariant for all | 
|  | test images (image format description) | 
|  | 4.  Image template should be autonomous and other fuzzer parts should not | 
|  | rely on it | 
|  | 5.  Image template should contain reference rules (not only block+size | 
|  | description) | 
|  | 6.  Should generate the test image with the correct structure based on an image | 
|  | template | 
|  | 7.  Should accept a seed as an argument (for regression purpose) | 
|  | 8.  Should generate a seed if it is not specified as an input parameter. | 
|  | 9.  The same seed should generate the same image for the same action vector, | 
|  | specified or generated. | 
|  | 10. Should accept a vector of actions as an argument (for test reproducing and | 
|  | for test case specification, e.g. group of tests for header structure, | 
|  | group of test for snapshots, etc) | 
|  | 11. Action vector should be randomly generated from the pool of available | 
|  | actions, if it is not specified as an input parameter | 
|  | 12. Pool of actions should be defined automatically based on an image template | 
|  | 13. Should accept a SUT and its call parameters as an argument or select them | 
|  | randomly otherwise. As far as it's expected to be rarely changed, the list | 
|  | of all possible test commands can be available in the test runner | 
|  | internally. | 
|  | 14. Should support an external cancellation of a test run | 
|  | 15. Seed should be logged (for regression purpose) | 
|  | 16. All files related to a test result should be collected: a test image, | 
|  | SUT logs, fuzzer logs and crash dumps | 
|  | 17. Should be compatible with python version 2.4-2.7 | 
|  | 18. Usage of external libraries should be limited as much as possible. | 
|  |  | 
|  |  | 
|  | Image formats: | 
|  | ------------- | 
|  |  | 
|  | Main target image format is qcow2, but support of image templates should | 
|  | provide an ability to add any other image format. | 
|  |  | 
|  |  | 
|  | Effectiveness: | 
|  | ------------- | 
|  |  | 
|  | The fuzzer can be controlled via template, seed and action vector; | 
|  | it makes the fuzzer itself invariant to an image format and test logic. | 
|  | It should be able to perform rather complex and precise tests, that can be | 
|  | specified via an action vector. Otherwise, knowledge about an image structure | 
|  | allows the fuzzer to generate the pool of all available areas can be fuzzed | 
|  | and randomly select some of them and so compose its own action vector. | 
|  | Also complexity of a template defines complexity of the fuzzer, so its | 
|  | functionality can be varied from simple model-independent fuzzing to smart | 
|  | model-based one. | 
|  |  | 
|  |  | 
|  | Glossary: | 
|  | -------- | 
|  |  | 
|  | Action vector is a sequence of structure elements retrieved from an image | 
|  | format, each of them will be fuzzed for the test image. It's a subset of | 
|  | elements of the action pool. Example: header, refcount table, etc. | 
|  | Action pool is all available elements of an image structure that generated | 
|  | automatically from an image template. | 
|  | Image template is a formal description of an image structure and relations | 
|  | between image blocks. | 
|  | Test image is an output image of the fuzzer defined by the current seed and | 
|  | action vector. |