Blame - docs/image-fuzzer.txt - qemu

# Specification for the fuzz testing tool
#
# Copyright (C) 2014 Maria Kustova <maria.k@catit.be>
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 2 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see <http://www.gnu.org/licenses/>.


Image fuzzer
============

Description
-----------

The goal of the image fuzzer is to catch crashes of qemu-io/qemu-img by
feeding them randomly corrupted images.
Test images are generated from scratch and have a valid internal structure,
with some elements, e.g. L1/L2 tables, holding random invalid values.


Test runner
-----------

The test runner generates test images, executes tests using the generated
images, reports their results and collects all test-related artifacts (logs,
core dumps, test images, backing files).
A test is the execution of all available commands under test against the same
generated test image.
By default, the test runner generates new tests and executes them until it is
interrupted from the keyboard. If a test seed is specified via the '--seed'
runner parameter, only one test with this seed is executed, and the runner
exits when that test finishes.

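The two modes described above can be sketched as follows. This is a minimal
Python 3 illustration, not the runner's actual code: 'run_tests',
'run_one_test' and 'max_tests' are hypothetical names, and 'max_tests' exists
only so that the sketch can terminate without a keyboard interrupt.

```python
import random
import sys


def run_tests(run_one_test, seed=None, max_tests=None):
    """Run one test if a seed is given, otherwise loop until interrupted.

    run_one_test is a hypothetical callable that takes the seed of one test.
    max_tests is not a runner option; it only bounds this sketch.
    Returns the list of seeds that were executed.
    """
    executed = []
    if seed is not None:
        # '--seed' mode: exactly one test, then exit.
        run_one_test(seed)
        executed.append(seed)
        return executed
    try:
        while max_tests is None or len(executed) < max_tests:
            # A fresh seed per test keeps each test reproducible later.
            test_seed = random.randrange(sys.maxsize)
            run_one_test(test_seed)
            executed.append(test_seed)
    except KeyboardInterrupt:
        pass  # Ctrl-C ends the otherwise endless loop.
    return executed
```
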
The runner uses an external image fuzzer to generate test images. An image
generator must be specified as a mandatory parameter of the test runner.
For details about the interaction between the runner and fuzzers, see "Module
interfaces".

The runner enables generation of core dumps during test execution, but it
assumes that core dumps are written to the current working directory.
For comprehensive test results, set up your test environment accordingly.

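One way to enable core dumps from Python on Unix-like systems is to raise the
soft core-size limit, as in the sketch below. The helper name
'enable_core_dumps' is hypothetical, and the runner's actual approach may
differ. Where the dump file ends up is decided by the operating system (on
Linux, by the kernel.core_pattern setting), which is why the environment may
need adjusting.

```python
import resource


def enable_core_dumps():
    """Raise the soft core-file size limit to the hard limit (Unix only).

    Returns the (soft, hard) pair in effect after the change.
    """
    _soft, hard = resource.getrlimit(resource.RLIMIT_CORE)
    # Raising the soft limit up to the hard limit needs no extra privileges.
    resource.setrlimit(resource.RLIMIT_CORE, (hard, hard))
    return resource.getrlimit(resource.RLIMIT_CORE)
```
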
Paths to the binaries under test (SUTs), qemu-img and qemu-io, are retrieved
from environment variables. If the environment check fails, the runner uses
the SUTs installed in system paths.
qemu-img is required for the creation of backing files, so the related
environment variable must be set if qemu-img is not installed in the system
path.
For details about the environment variables, see qemu-iotests/check.

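The lookup with a fallback can be sketched as below (Python 3). The variable
names QEMU_IMG and QEMU_IO and the helper 'sut_path' are assumptions made for
illustration; qemu-iotests/check remains the reference for the real names.

```python
import os


def sut_path(env_var, default):
    """Return the SUT path from the environment, or a system-path default."""
    value = os.environ.get(env_var, "").strip()
    # An unset or empty variable falls back to the binary found via PATH.
    return value if value else default


# Assumed variable names; see qemu-iotests/check for the authoritative ones.
qemu_img = sut_path("QEMU_IMG", "qemu-img")
qemu_io = sut_path("QEMU_IO", "qemu-io")
```
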
The runner accepts a JSON array of fields to be fuzzed via the '--config'
argument, e.g.

    '[["feature_name_table"], ["header", "l1_table_offset"]]'

Each sublist can have one or two strings defining image structure elements.
In the latter case the parent element goes in the first position and the
field name in the second.

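The shape of the '--config' value can be checked with a few lines of Python 3.
This is only a sketch of the rule stated above (one or two strings per
sublist); the function name 'parse_fuzz_config' is hypothetical and the
runner's own validation may differ.

```python
import json


def parse_fuzz_config(text):
    """Parse a '--config' JSON string and check its list-of-lists shape."""
    config = json.loads(text)
    if not isinstance(config, list):
        raise ValueError("fuzzer configuration must be a JSON array")
    for entry in config:
        well_formed = (
            isinstance(entry, list)
            and len(entry) in (1, 2)
            and all(isinstance(name, str) for name in entry)
        )
        if not well_formed:
            raise ValueError("bad configuration entry: %r" % (entry,))
    return config
```
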
The runner accepts a list of commands under test as a JSON array via the
'--command' argument. Each command is a list containing a SUT and all its
arguments, e.g.

    runner.py -c '[["qemu-io", "$test_img", "-c", "write $off $len"]]' \
        /tmp/test ../qcow2

The following aliases can be used for variable arguments:
- $test_img for a fuzzed image
- $off for an offset in the fuzzed image
- $len for a data size

Values for the last two aliases are generated based on the size of the
virtual disk of the generated image.

If no commands are specified, the runner executes the commands from the
default list:
- qemu-img check
- qemu-img info
- qemu-img convert
- qemu-io -c read
- qemu-io -c write
- qemu-io -c aio_read
- qemu-io -c aio_write
- qemu-io -c flush
- qemu-io -c discard
- qemu-io -c truncate

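Alias substitution can be illustrated with the Python 3 sketch below. The
function name 'expand_command' is hypothetical, and the way the offset and
length are drawn (kept inside the virtual disk) is an assumption; the text
above only says that both values are based on the virtual disk size.

```python
import random


def expand_command(command, test_img, disk_size, rng=random):
    """Replace $test_img, $off and $len in one command (a list of strings).

    disk_size is the virtual disk size in bytes and must be at least 1.
    """
    # Assumption: the accessed range always fits inside the virtual disk.
    off = rng.randint(0, disk_size - 1)
    length = rng.randint(1, disk_size - off)
    expanded = []
    for arg in command:
        arg = arg.replace("$test_img", test_img)
        arg = arg.replace("$off", str(off))
        arg = arg.replace("$len", str(length))
        expanded.append(arg)
    return expanded
```

Passing a seeded 'random.Random' instance as 'rng' makes the generated offset
and length repeatable.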

Qcow2 image generator
---------------------

The 'qcow2' generator is a Python package providing the 'create_image' method
as its single public API. For details, see the 'Test runner/image fuzzer'
chapter of 'Module interfaces'.

The qcow2 package contains two submodules: fuzz.py and layout.py.

'fuzz.py' contains all fuzzing functions, one per image field. The intent is
that, after code analysis, every field will have its own constraints on its
value. For now only universal, potentially dangerous values are used, e.g.
type limits for integers or unsafe symbols such as '%s' for strings. For
bitmasks, a random number of bits is set to one. Every fuzzed value is checked
against the current valid value of the field; if they are equal, the value is
regenerated.

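The "regenerate on equality" rule can be sketched in Python 3 as below. The
names 'UINT32_LIMITS', 'fuzz_value' and 'fuzz_bitmask' and the particular
boundary values are illustrative assumptions, not the contents of fuzz.py.

```python
import random

# Illustrative type limits for a 32-bit unsigned field (an assumption).
UINT32_LIMITS = [0, 1, 0x7FFFFFFF, 0x80000000, 0xFFFFFFFF]


def fuzz_value(current, candidates, rng=random):
    """Pick a candidate value that differs from the current valid value."""
    if all(candidate == current for candidate in candidates):
        raise ValueError("no candidate differs from the current value")
    value = rng.choice(candidates)
    while value == current:
        value = rng.choice(candidates)  # regenerate on equality
    return value


def fuzz_bitmask(current, width, rng=random):
    """Set a random number of randomly chosen bits in a width-bit mask."""
    value = current
    while value == current:
        bits = rng.sample(range(width), rng.randint(0, width))
        value = sum(1 << bit for bit in bits)
    return value
```
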
'layout.py' creates a random valid image, fuzzes a random subset of the image
fields using the 'fuzz.py' module and writes the fuzzed image to the specified
file. If a fuzzer configuration is specified, it is interpreted as follows:

1. If a list contains a parent image element only, then some random portion
   of the fields of this element is fuzzed in every test.
   The same behavior applies to the entire image if no configuration is
   used. This case is useful for test specialization.

2. If a list contains a parent element and a field name, then that field
   is always fuzzed in every test. This case is useful for regression
   testing.

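The two configuration cases can be expressed as a small selection routine.
The Python 3 sketch below assumes a hypothetical 'layout' dictionary mapping
each image element to its field names; 'select_fields' and 'random_portion'
are illustrative names and do not describe how layout.py is written.

```python
import random


def random_portion(pool, rng):
    """Return a random non-empty subset of pool (empty if pool is empty)."""
    if not pool:
        return []
    return rng.sample(pool, rng.randint(1, len(pool)))


def select_fields(layout, fuzz_config=None, rng=random):
    """Return the (element, field) pairs to fuzz in one test."""
    if fuzz_config is None:
        # No configuration: a random portion of the entire image.
        pool = [(elem, field) for elem in sorted(layout)
                for field in layout[elem]]
        return random_portion(pool, rng)
    selected = []
    for entry in fuzz_config:
        if len(entry) == 1:
            # Case 1: parent element only, random portion of its fields.
            pool = [(entry[0], field) for field in layout[entry[0]]]
            selected.extend(random_portion(pool, rng))
        else:
            # Case 2: parent element and field name, always fuzzed.
            selected.append((entry[0], entry[1]))
    return selected
```
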
The generator can create header fields, header extensions, L1/L2 tables and
the refcount table and blocks.

Module interfaces
-----------------

* Test runner/image fuzzer

The runner calls an image generator, specifying the path to a test image file,
the path to a backing file and its format, and a fuzzer configuration.
An image generator is expected to provide a

    'create_image(test_img_path, backing_file_path=None,
                  backing_file_format=None, fuzz_config=None)'

method that creates a test image, writes it to the specified file and returns
the size of the virtual disk.
The file should be created if it doesn't exist, or overwritten otherwise.
fuzz_config has the form of a list of lists. Every sublist can have one or two
elements: the first element is the name of a parent image element; the second
one, if present, is the name of a field in this element.
Example:

    [['header', 'l1_table_offset'],
     ['header', 'nb_snapshots'],
     ['feature_name_table']]

The runner sets the random seed at every test execution so that tests can be
reproduced for regression purposes, so an image generator should not modify
the seed internally.

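To make the contract concrete, here is a stub generator in Python 3 together
with a runner-side call. Only the 'create_image' signature comes from the
text above; the body writes placeholder bytes rather than a real qcow2 image,
and 'run_generator' is a hypothetical helper.

```python
import random


def create_image(test_img_path, backing_file_path=None,
                 backing_file_format=None, fuzz_config=None):
    """Stub generator: writes placeholder bytes, not a valid qcow2 image."""
    # Placeholder virtual disk sizes, chosen with the global RNG.
    virtual_size = random.choice([1, 2, 4]) * 1024 * 1024
    payload = bytes(random.getrandbits(8) for _ in range(64))
    # Mode "wb" creates the file if it is missing and overwrites it otherwise.
    with open(test_img_path, "wb") as image:
        image.write(payload)
    return virtual_size


def run_generator(seed, test_img_path):
    """Runner side: seed the global RNG, then call the generator."""
    random.seed(seed)
    return create_image(test_img_path)
```

Because the stub never reseeds the global generator, calling 'run_generator'
twice with the same seed writes identical bytes and returns the same size.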

Overall fuzzer requirements
===========================

Input data:
-----------

- image template (generator)
- work directory
- action vector (optional)
- seed (optional)
- SUT and its arguments (optional)


Fuzzer requirements:
--------------------

1.  Should be able to inject random data
2.  Should be able to select a random value from a manually pregenerated
    vector (boundary values, e.g. max/min cluster size)
3.  Image template should describe a general structure invariant for all
    test images (image format description)
4.  Image template should be autonomous, and other fuzzer parts should not
    rely on it
5.  Image template should contain reference rules (not only block+size
    description)
6.  Should generate the test image with the correct structure based on an
    image template
7.  Should accept a seed as an argument (for regression purposes)
8.  Should generate a seed if it is not specified as an input parameter
9.  The same seed should generate the same image for the same action vector,
    specified or generated
10. Should accept a vector of actions as an argument (for reproducing tests
    and for specifying test cases, e.g. a group of tests for the header
    structure, a group of tests for snapshots, etc.)
11. Action vector should be randomly generated from the pool of available
    actions if it is not specified as an input parameter
12. Pool of actions should be defined automatically based on an image template
13. Should accept a SUT and its call parameters as an argument, or select them
    randomly otherwise. Since the list of all possible test commands is
    expected to change rarely, it can be kept internally in the test runner.
14. Should support an external cancellation of a test run
15. Seed should be logged (for regression purposes)
16. All files related to a test result should be collected: a test image,
    SUT logs, fuzzer logs and crash dumps
17. Should be compatible with Python versions 2.4-2.7
18. Usage of external libraries should be limited as much as possible

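Requirements 7-11 and 15 can be illustrated with the sketch below. It is
written for Python 3 for readability, so it does not itself satisfy
requirement 17. 'resolve_seed' and 'make_action_vector' are hypothetical
names, and the logger name "fuzzer" is an assumption.

```python
import logging
import random
import sys


def resolve_seed(seed=None):
    """Return the given seed, or generate one; emit it as a log record."""
    if seed is None:
        seed = random.SystemRandom().randrange(sys.maxsize)
    # Whether the record is shown depends on the logging configuration.
    logging.getLogger("fuzzer").info("Seed: %d", seed)
    return seed


def make_action_vector(action_pool, seed, action_vector=None):
    """Use the given action vector, or derive one from the pool and seed."""
    if action_vector is not None:
        return list(action_vector)
    if not action_pool:
        return []
    # A private generator makes the result depend on the seed alone.
    rng = random.Random(seed)
    return rng.sample(action_pool, rng.randint(1, len(action_pool)))
```
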

Image formats:
--------------

The main target image format is qcow2, but support for image templates should
make it possible to add any other image format.


Effectiveness:
--------------

The fuzzer can be controlled via the template, seed and action vector, which
makes the fuzzer itself invariant to the image format and test logic.
It should be able to perform rather complex and precise tests that are
specified via an action vector. Otherwise, knowledge of the image structure
allows the fuzzer to generate the pool of all areas that can be fuzzed,
randomly select some of them, and so compose its own action vector.
The complexity of a template also defines the complexity of the fuzzer, so its
functionality can range from simple model-independent fuzzing to smart
model-based fuzzing.

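Deriving the pool of fuzzable areas from a template might look like the
Python 3 sketch below. The 'TEMPLATE' dictionary is a made-up, heavily
simplified stand-in for a real image template, and 'action_pool' is a
hypothetical name; entries use the same one- or two-element form as the
fuzzer configuration.

```python
# Hypothetical template: element name -> list of (field name, size in bytes).
TEMPLATE = {
    "header": [("magic", 4), ("version", 4), ("l1_table_offset", 8)],
    "feature_name_table": [],
}


def action_pool(template):
    """List every fuzzable area: whole elements and their single fields."""
    pool = []
    for element in sorted(template):
        pool.append((element,))  # the element as a whole
        for field, _size in template[element]:
            pool.append((element, field))  # one specific field
    return pool
```
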

Glossary:
---------

Action vector is a sequence of structure elements retrieved from an image
format, each of which will be fuzzed for the test image. It is a subset of the
elements of the action pool. Example: header, refcount table, etc.
Action pool is the set of all available elements of an image structure,
generated automatically from an image template.
Image template is a formal description of an image structure and of the
relations between image blocks.
Test image is an output image of the fuzzer, defined by the current seed and
action vector.