Maria Kustova | d6dc10a | 2014-08-11 14:33:58 +0400 | 1 | # Specification for the fuzz testing tool |
| 2 | # |
| 3 | # Copyright (C) 2014 Maria Kustova <maria.k@catit.be> |
| 4 | # |
| 5 | # This program is free software: you can redistribute it and/or modify |
| 6 | # it under the terms of the GNU General Public License as published by |
| 7 | # the Free Software Foundation, either version 2 of the License, or |
| 8 | # (at your option) any later version. |
| 9 | # |
| 10 | # This program is distributed in the hope that it will be useful, |
| 11 | # but WITHOUT ANY WARRANTY; without even the implied warranty of |
| 12 | # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the |
| 13 | # GNU General Public License for more details. |
| 14 | # |
| 15 | # You should have received a copy of the GNU General Public License |
| 16 | # along with this program. If not, see <http://www.gnu.org/licenses/>. |
| 17 | |
| 18 | |
| 19 | Image fuzzer |
| 20 | ============ |
| 21 | |
| 22 | Description |
| 23 | ----------- |
| 24 | |
| 25 | The goal of the image fuzzer is to catch crashes of qemu-io and qemu-img |
| 26 | by feeding them randomly corrupted images. |
| 27 | Test images are generated from scratch and have a valid inner structure in |
| 28 | which some elements, e.g. L1/L2 tables, hold random invalid values. |
| 29 | |
| 30 | |
| 31 | Test runner |
| 32 | ----------- |
| 33 | |
| 34 | The test runner generates test images, executes tests using the generated |
| 35 | images, reports their results and collects all test-related artifacts (logs, |
| 36 | core dumps, test images, backing files). |
| 37 | A test is the execution of all available commands under test with the same |
| 38 | generated test image. |
| 39 | By default, the test runner generates and executes new tests until it is |
| 40 | interrupted from the keyboard. If a test seed is specified via the '--seed' |
| 41 | runner parameter, only one test with this seed is executed, and the runner |
| 42 | exits when it finishes. |
| 43 | |
| 44 | The runner uses an external image fuzzer to generate test images. An image |
| 45 | generator should be specified as a mandatory parameter of the test runner. |
| 46 | For details about the interaction between the runner and fuzzers, see |
| 47 | "Module interfaces". |
| 48 | |
| 49 | The runner enables generation of core dumps during test execution, but it |
| 50 | assumes that core dumps are written to the current working directory. |
| 51 | For comprehensive test results, set up your test environment so that this |
| 52 | assumption holds. |
| 53 | |
| 54 | Paths to the binaries under test (SUTs), qemu-img and qemu-io, are retrieved |
| 55 | from environment variables. If the environment check fails, the runner |
| 56 | uses the SUTs installed in system paths. |
| 57 | qemu-img is required to create backing files, so the related environment |
| 58 | variable must be set if qemu-img is not installed in the system path. |
| 59 | For details about the environment variables, see qemu-iotests/check. |
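
The lookup with a system-path fallback described above can be sketched as
follows. This is an illustration only: the helper name 'sut_path' is
hypothetical, and the variable names QEMU_IMG and QEMU_IO are assumptions that
should be checked against qemu-iotests/check. The sketch targets a current
Python interpreter.

```python
import os


def sut_path(env_name, default):
    """Return the SUT path from the environment, or 'default' if unset.

    'default' is a bare program name, so it is resolved through the
    system path when the command is executed.
    """
    value = os.environ.get(env_name, "").strip()
    return value if value else default


# Assumed variable names; verify them against qemu-iotests/check.
qemu_img = sut_path("QEMU_IMG", "qemu-img")
qemu_io = sut_path("QEMU_IO", "qemu-io")
```

An empty or whitespace-only variable is treated like an unset one, so the
fallback also covers a variable that was exported without a value.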
| 60 | |
| 61 | The runner accepts a JSON array of fields expected to be fuzzed via the |
| 62 | '--config' argument, e.g. |
| 63 | |
| 64 | '[["feature_name_table"], ["header", "l1_table_offset"]]' |
| 65 | |
| 66 | Each sublist can have one or two strings defining image structure elements. |
| 67 | In the latter case, the parent element comes first and the field name |
| 68 | second. |
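
A minimal sketch of how such a configuration string could be parsed and
checked, assuming only the shape rules stated above. The function name
'parse_fuzz_config' is hypothetical and is not taken from the runner.

```python
import json


def parse_fuzz_config(text):
    """Parse a '--config' style JSON string into a list of lists.

    Each entry must hold one string (a parent element) or two strings
    (a parent element followed by a field name).
    """
    config = json.loads(text)
    if not isinstance(config, list):
        raise ValueError("configuration must be a JSON array")
    for entry in config:
        if not isinstance(entry, list) or len(entry) not in (1, 2):
            raise ValueError(
                "entry must be a list of one or two strings: %r" % (entry,))
        if not all(isinstance(item, str) for item in entry):
            raise ValueError("entry items must be strings: %r" % (entry,))
    return config


config = parse_fuzz_config(
    '[["feature_name_table"], ["header", "l1_table_offset"]]')
```

Rejecting malformed entries up front keeps a typo in the configuration from
silently turning into a test that fuzzes nothing.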
| 69 | |
| 70 | The runner accepts a list of commands under test as a JSON array via |
| 71 | the '--command' argument. Each command is a list containing a SUT and all its |
| 72 | arguments, e.g. |
| 73 | |
| 74 | runner.py -c '[["qemu-io", "$test_img", "-c", "write $off $len"]]' \ |
| 75 | /tmp/test ../qcow2 |
| 76 | |
| 77 | The following aliases can be used for variable arguments: |
| 78 | - $test_img for a fuzzed image |
| 79 | - $off for an offset in the fuzzed image |
| 80 | - $len for a data size |
| 81 | |
| 82 | Values for the last two aliases are generated based on the virtual disk |
| 83 | size of the generated image. |
| 84 | If no commands are specified, the runner executes commands from the |
| 85 | default list: |
| 86 | - qemu-img check |
| 87 | - qemu-img info |
| 88 | - qemu-img convert |
| 89 | - qemu-io -c read |
| 90 | - qemu-io -c write |
| 91 | - qemu-io -c aio_read |
| 92 | - qemu-io -c aio_write |
| 93 | - qemu-io -c flush |
| 94 | - qemu-io -c discard |
| 95 | - qemu-io -c truncate |
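
The alias expansion described above can be illustrated with a short sketch.
The function name 'expand_command' and the exact ranges chosen for $off and
$len are assumptions made for this example; the specification only states that
both values are derived from the virtual disk size.

```python
import random
from string import Template


def expand_command(command, test_img, disk_size, rng=random):
    """Return a copy of 'command' with $test_img, $off and $len replaced.

    'disk_size' is the virtual disk size in bytes and must be positive.
    """
    # Assumption: the offset lies inside the disk and the length never
    # reaches past its end.
    off = rng.randint(0, disk_size - 1)
    length = rng.randint(0, disk_size - off)
    values = {"test_img": test_img, "off": str(off), "len": str(length)}
    # safe_substitute leaves unknown '$name' placeholders untouched.
    return [Template(arg).safe_substitute(values) for arg in command]


example = expand_command(
    ["qemu-io", "$test_img", "-c", "write $off $len"],
    "/tmp/test/test.img", 1 << 20, random.Random(0))
```

Passing a dedicated 'random.Random' instance, as in the usage line, makes the
expansion reproducible for a given seed.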
| 96 | |
| 97 | |
| 98 | Qcow2 image generator |
| 99 | --------------------- |
| 100 | |
| 101 | The 'qcow2' generator is a Python package providing the 'create_image' method |
| 102 | as its single public API. See the 'Test runner/image fuzzer' section of |
| 103 | 'Module interfaces' for details. |
| 104 | |
| 105 | The package contains two submodules: fuzz.py and layout.py. |
| 106 | |
| 107 | 'fuzz.py' contains all fuzzing functions, one per image field. It's assumed |
| 108 | that after code analysis every field will have its own value constraints. |
| 109 | For now, only universal potentially dangerous values are used, e.g. type limits |
| 110 | for integers or unsafe symbols such as '%s' for strings. For bitmasks, a random |
| 111 | number of bits is set to one. Every fuzzed value is checked against the |
| 112 | current valid value of the field; if they are equal, the value is |
| 113 | regenerated. |
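
The "regenerate on equality" rule can be sketched as below. The helper names
and the candidate list are illustrative assumptions, not the contents of
'fuzz.py'.

```python
import random

# Illustrative boundary values for an unsigned 32-bit field.
UINT32_LIMITS = [0, 1, 0x7FFFFFFF, 0x80000000, 0xFFFFFFFF]


def fuzz_value(current, candidates, rng=random):
    """Pick a candidate value that differs from the current valid value."""
    if all(value == current for value in candidates):
        raise ValueError("no candidate differs from the current value")
    value = rng.choice(candidates)
    while value == current:
        # Equal to the valid value: regenerate.
        value = rng.choice(candidates)
    return value


def fuzz_bitmask(current, width, rng=random):
    """Set a random number of bits in a 'width'-bit mask (width >= 1)."""
    while True:
        nbits = rng.randint(0, width)
        mask = 0
        for bit in rng.sample(range(width), nbits):
            mask |= 1 << bit
        if mask != current:
            return mask
```

The guard in 'fuzz_value' prevents an endless loop when every candidate
happens to equal the valid value.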
| 114 | |
| 115 | 'layout.py' creates a random valid image, fuzzes a random subset of the image |
| 116 | fields using the 'fuzz.py' module and writes the fuzzed image to the specified |
| 117 | file. If a fuzzer configuration is specified, it is interpreted as follows: |
| 118 | |
| 119 | 1. If a list contains a parent image element only, then a random portion |
| 120 | of the fields of this element is fuzzed in every test. |
| 121 | The same behavior applies to the entire image if no configuration is |
| 122 | used. This case is useful for test specialization. |
| 123 | |
| 124 | 2. If a list contains a parent element and a field name, then that field |
| 125 | is always fuzzed in every test. This case is useful for regression |
| 126 | testing. |
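
The two cases above can be sketched as a field selection step. The
'IMAGE_FIELDS' table and the function name 'select_fields' are hypothetical
and exist only for this example; the real field lists live in the generator.

```python
import random

# Hypothetical, heavily reduced model of an image: parent -> field names.
IMAGE_FIELDS = {
    "header": ["magic", "version", "l1_table_offset", "nb_snapshots"],
    "feature_name_table": ["feature_type", "bit_number", "feature_name"],
}


def select_fields(fuzz_config=None, rng=random):
    """Return a sorted list of (parent, field) pairs to fuzz in one test."""
    if fuzz_config is None:
        # No configuration: a random portion of all fields of the image.
        pool = [(parent, field) for parent in sorted(IMAGE_FIELDS)
                for field in IMAGE_FIELDS[parent]]
        return sorted(rng.sample(pool, rng.randint(1, len(pool))))
    selected = set()
    for entry in fuzz_config:
        parent = entry[0]
        if len(entry) == 2:
            # Case 2: the named field is fuzzed in every test.
            selected.add((parent, entry[1]))
        else:
            # Case 1: a random portion of the parent's fields.
            fields = IMAGE_FIELDS[parent]
            for field in rng.sample(fields, rng.randint(1, len(fields))):
                selected.add((parent, field))
    return sorted(selected)
```

With the configuration [['header', 'l1_table_offset'], ['feature_name_table']],
every call returns the 'l1_table_offset' pair plus a varying, non-empty set of
feature name table fields.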
| 127 | |
Maria Kustova | 56271ef | 2014-08-19 16:25:11 +0400 | 128 | The generator can create header fields, header extensions, L1/L2 tables, and |
| 129 | the refcount table and refcount blocks. |
| 130 | |
| 131 | Module interfaces |
| 132 | ----------------- |
| 133 | |
| 134 | * Test runner/image fuzzer |
| 135 | |
| 136 | The runner calls an image generator, passing the path to a test image file, |
| 137 | the path to a backing file, the backing file format and a fuzzer configuration. |
| 138 | An image generator is expected to provide a |
| 139 | |
| 140 | 'create_image(test_img_path, backing_file_path=None, |
| 141 | backing_file_format=None, fuzz_config=None)' |
| 142 | |
| 143 | method that creates a test image, writes it to the specified file and returns |
| 144 | the size of the virtual disk. |
| 145 | The file should be created if it doesn't exist or overwritten otherwise. |
| 146 | fuzz_config is a list of lists. Every sublist has one or two elements: |
| 147 | the first is the name of a parent image element; the second, if present, |
| 148 | is the name of a field in this element. |
| 149 | Example: |
| 150 | [['header', 'l1_table_offset'], |
| 151 | ['header', 'nb_snapshots'], |
| 152 | ['feature_name_table']] |
| 153 | |
| 154 | The runner sets the random seed at every test execution for regression |
| 155 | purposes, so an image generator should not modify it internally. |
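
A minimal sketch of this interface, assuming a stub generator. The
'create_image' signature follows the specification; the body, the 'run_once'
helper and the file name 'test.img' are placeholders invented for the example,
and the file written here is not a real image. The sketch targets a current
Python interpreter.

```python
import os
import random
import tempfile


def create_image(test_img_path, backing_file_path=None,
                 backing_file_format=None, fuzz_config=None):
    """Stub generator: write a placeholder file, return a virtual disk size.

    A real generator would build and fuzz a complete image here.
    """
    # Draw from the global generator, but never reseed it: the runner owns
    # the seed.
    disk_size = random.randint(1, 64) * 512
    # Mode "wb" creates the file or overwrites an existing one.
    with open(test_img_path, "wb") as image:
        image.write(b"\0" * 512)
    return disk_size


def run_once(generator, work_dir, seed, fuzz_config=None):
    """Runner side: seed the global generator, then call the generator."""
    random.seed(seed)
    test_img = os.path.join(work_dir, "test.img")
    return generator(test_img, None, None, fuzz_config)


work_dir = tempfile.mkdtemp()
first = run_once(create_image, work_dir, seed=42)
second = run_once(create_image, work_dir, seed=42)
```

Because the generator leaves the seed alone, 'first' and 'second' are equal,
which is what makes a logged seed sufficient to reproduce a test.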
| 156 | |
| 157 | |
| 158 | Overall fuzzer requirements |
| 159 | =========================== |
| 160 | |
| 161 | Input data: |
| 162 | ---------- |
| 163 | |
| 164 | - image template (generator) |
| 165 | - work directory |
| 166 | - action vector (optional) |
| 167 | - seed (optional) |
| 168 | - SUT and its arguments (optional) |
| 169 | |
| 170 | |
| 171 | Fuzzer requirements: |
| 172 | ------------------- |
| 173 | |
| 174 | 1. Should be able to inject random data |
| 175 | 2. Should be able to select a random value from the manually pregenerated |
| 176 | vector (boundary values, e.g. max/min cluster size) |
| 177 | 3. Image template should describe a general structure invariant for all |
| 178 | test images (image format description) |
| 179 | 4. Image template should be autonomous and other fuzzer parts should not |
| 180 | rely on it |
| 181 | 5. Image template should contain reference rules (not only block+size |
| 182 | description) |
| 183 | 6. Should generate the test image with the correct structure based on an image |
| 184 | template |
| 185 | 7. Should accept a seed as an argument (for regression purpose) |
| 186 | 8. Should generate a seed if it is not specified as an input parameter. |
| 187 | 9. The same seed should generate the same image for the same action vector, |
| 188 | specified or generated. |
| 189 | 10. Should accept a vector of actions as an argument (for test reproduction |
| 190 | and for test case specification, e.g. a group of tests for the header |
| 191 | structure, a group of tests for snapshots, etc.) |
| 192 | 11. Action vector should be randomly generated from the pool of available |
| 193 | actions, if it is not specified as an input parameter |
| 194 | 12. Pool of actions should be defined automatically based on an image template |
| 195 | 13. Should accept a SUT and its call parameters as an argument or select them |
| 196 | randomly otherwise. As the list of all possible test commands is expected |
| 197 | to change rarely, it can be kept internally in the test runner. |
| 198 | |
| 199 | 14. Should support an external cancellation of a test run |
| 200 | 15. Seed should be logged (for regression purpose) |
| 201 | 16. All files related to a test result should be collected: a test image, |
| 202 | SUT logs, fuzzer logs and crash dumps |
| 203 | 17. Should be compatible with Python versions 2.4-2.7 |
| 204 | 18. Usage of external libraries should be limited as much as possible. |
| 205 | |
| 206 | |
| 207 | Image formats: |
| 208 | ------------- |
| 209 | |
| 210 | The main target image format is qcow2, but support for image templates should |
| 211 | make it possible to add any other image format. |
| 212 | |
| 213 | |
| 214 | Effectiveness: |
| 215 | ------------- |
| 216 | |
| 217 | The fuzzer is controlled via a template, a seed and an action vector, |
| 218 | which makes the fuzzer itself independent of the image format and test logic. |
| 219 | It should be able to perform rather complex and precise tests, which can be |
| 220 | specified via an action vector. Otherwise, knowledge of the image structure |
| 221 | allows the fuzzer to generate the pool of all areas that can be fuzzed, |
| 222 | randomly select some of them and so compose its own action vector. |
| 223 | The complexity of a template also defines the complexity of the fuzzer, so its |
| 224 | functionality can vary from simple model-independent fuzzing to smart |
| 225 | model-based fuzzing. |
| 226 | |
| 227 | |
| 228 | Glossary: |
| 229 | -------- |
| 230 | |
| 231 | Action vector is a sequence of structure elements retrieved from an image |
| 232 | format, each of which will be fuzzed for the test image. It's a subset of |
| 233 | the elements of the action pool. Example: header, refcount table, etc. |
| 234 | Action pool is the set of all available elements of an image structure; it is |
| 235 | generated automatically from an image template. |
| 236 | Image template is a formal description of an image structure and relations |
| 237 | between image blocks. |
| 238 | Test image is an output image of the fuzzer defined by the current seed and |
| 239 | action vector. |