Stefan Hajnoczi | 81a97d9 | 2010-06-22 15:07:09 +0100 | [diff] [blame] | 1 | = Tracing = |
| 2 | |
| 3 | == Introduction == |
| 4 | |
| 5 | This document describes the tracing infrastructure in QEMU and how to use it |
| 6 | for debugging, profiling, and observing execution. |
| 7 | |
| 8 | == Quickstart == |
| 9 | |
| 10 | 1. Build with the 'simple' trace backend: |
| 11 | |
Lluís Vilanova | 5b80827 | 2014-05-27 15:02:14 +0200 | [diff] [blame] | 12 | ./configure --enable-trace-backends=simple |
Stefan Hajnoczi | 81a97d9 | 2010-06-22 15:07:09 +0100 | [diff] [blame] | 13 | make |
| 14 | |
Lluís | 03727e6 | 2011-08-31 20:31:45 +0200 | [diff] [blame] | 15 | 2. Create a file with the events you want to trace: |
Stefan Hajnoczi | 81a97d9 | 2010-06-22 15:07:09 +0100 | [diff] [blame] | 16 | |
Lluís | 03727e6 | 2011-08-31 20:31:45 +0200 | [diff] [blame] | 17 | echo bdrv_aio_readv > /tmp/events |
| 18 | echo bdrv_aio_writev >> /tmp/events |
Stefan Hajnoczi | 81a97d9 | 2010-06-22 15:07:09 +0100 | [diff] [blame] | 19 | |
Lluís | 03727e6 | 2011-08-31 20:31:45 +0200 | [diff] [blame] | 20 | 3. Run the virtual machine to produce a trace file: |
| 21 | |
| 22 | qemu -trace events=/tmp/events ... # your normal QEMU invocation |
| 23 | |
| 24 | 4. Pretty-print the binary trace file: |
Stefan Hajnoczi | 81a97d9 | 2010-06-22 15:07:09 +0100 | [diff] [blame] | 25 | |
Chen Fan | 60e17d2 | 2014-07-11 11:24:37 +0800 | [diff] [blame] | 26 | ./scripts/simpletrace.py trace-events trace-* # Override * with QEMU <pid> |
Stefan Hajnoczi | 81a97d9 | 2010-06-22 15:07:09 +0100 | [diff] [blame] | 27 | |
| 28 | == Trace events == |
| 29 | |
Lluís | 7b92e5b | 2011-04-06 20:33:56 +0200 | [diff] [blame] | 30 | There is a set of static trace events declared in the "trace-events" source |
Stefan Hajnoczi | 81a97d9 | 2010-06-22 15:07:09 +0100 | [diff] [blame] | 31 | file. Each trace event declaration names the event, its arguments, and the |
| 32 | format string which can be used for pretty-printing: |
| 33 | |
Lluís Vilanova | 4b710a3 | 2011-09-20 21:03:48 +0200 | [diff] [blame] | 34 | qemu_vmalloc(size_t size, void *ptr) "size %zu ptr %p" |
| 35 | qemu_vfree(void *ptr) "ptr %p" |
Stefan Hajnoczi | 81a97d9 | 2010-06-22 15:07:09 +0100 | [diff] [blame] | 36 | |
Lluís | 7b92e5b | 2011-04-06 20:33:56 +0200 | [diff] [blame] | 37 | The "trace-events" file is processed by the "tracetool" script during build to |
Stefan Hajnoczi | 81a97d9 | 2010-06-22 15:07:09 +0100 | [diff] [blame] | 38 | generate code for the trace events. Trace events are invoked directly from |
| 39 | source code like this: |
| 40 | |
| 41 | #include "trace.h" /* needed for trace event prototype */ |
Lluís | 4992604 | 2011-08-31 20:31:10 +0200 | [diff] [blame] | 42 | |
Lluís Vilanova | 4b710a3 | 2011-09-20 21:03:48 +0200 | [diff] [blame] | 43 | void *qemu_vmalloc(size_t size) |
Stefan Hajnoczi | 81a97d9 | 2010-06-22 15:07:09 +0100 | [diff] [blame] | 44 | { |
| 45 | void *ptr; |
Lluís Vilanova | 4b710a3 | 2011-09-20 21:03:48 +0200 | [diff] [blame] | 46 | size_t align = QEMU_VMALLOC_ALIGN; |
| 47 | |
| 48 | if (size < align) { |
| 49 | align = getpagesize(); |
Stefan Hajnoczi | 81a97d9 | 2010-06-22 15:07:09 +0100 | [diff] [blame] | 50 | } |
Lluís Vilanova | 4b710a3 | 2011-09-20 21:03:48 +0200 | [diff] [blame] | 51 | ptr = qemu_memalign(align, size); |
| 52 | trace_qemu_vmalloc(size, ptr); |
Stefan Hajnoczi | 81a97d9 | 2010-06-22 15:07:09 +0100 | [diff] [blame] | 53 | return ptr; |
| 54 | } |
| 55 | |
| 56 | === Declaring trace events === |
| 57 | |
Lluís | 7b92e5b | 2011-04-06 20:33:56 +0200 | [diff] [blame] | 58 | The "tracetool" script produces the trace.h header file which is included by |
Stefan Hajnoczi | 81a97d9 | 2010-06-22 15:07:09 +0100 | [diff] [blame] | 59 | every source file that uses trace events. Since many source files include |
Lluís | 7b92e5b | 2011-04-06 20:33:56 +0200 | [diff] [blame] | 60 | trace.h, it uses a minimum of types and other header files included to keep the |
| 61 | namespace clean and compile times and dependencies down. |
Stefan Hajnoczi | 81a97d9 | 2010-06-22 15:07:09 +0100 | [diff] [blame] | 62 | |
| 63 | Trace events should use types as follows: |
| 64 | |
| 65 | * Use stdint.h types for fixed-size types. Most offsets and guest memory |
| 66 | addresses are best represented with uint32_t or uint64_t. Use fixed-size |
| 67 | types over primitive types whose size may change depending on the host |
| 68 | (32-bit versus 64-bit) so trace events don't truncate values or break |
| 69 | the build. |
| 70 | |
| 71 | * Use void * for pointers to structs or for arrays. The trace.h header |
| 72 | cannot include all user-defined struct declarations and it is therefore |
| 73 | necessary to use void * for pointers to structs. |
| 74 | |
| 75 | * For everything else, use primitive scalar types (char, int, long) with the |
| 76 | appropriate signedness. |
| 77 | |
Stefan Hajnoczi | 9a85d39 | 2010-10-05 14:28:50 +0100 | [diff] [blame] | 78 | Format strings should reflect the types defined in the trace event. Take |
| 79 | special care to use PRId64 and PRIu64 for int64_t and uint64_t types, |
Stefan Hajnoczi | 913540a | 2011-09-13 13:34:35 +0100 | [diff] [blame] | 80 | respectively. This ensures portability between 32- and 64-bit platforms. |
Stefan Hajnoczi | 9a85d39 | 2010-10-05 14:28:50 +0100 | [diff] [blame] | 81 | |
Stefan Hajnoczi | 81a97d9 | 2010-06-22 15:07:09 +0100 | [diff] [blame] | 82 | === Hints for adding new trace events === |
| 83 | |
| 84 | 1. Trace state changes in the code. Interesting points in the code usually |
| 85 | involve a state change like starting, stopping, allocating, freeing. State |
| 86 | changes are good trace events because they can be used to understand the |
| 87 | execution of the system. |
| 88 | |
| 89 | 2. Trace guest operations. Guest I/O accesses like reading device registers |
| 90 | are good trace events because they can be used to understand guest |
| 91 | interactions. |
| 92 | |
| 93 | 3. Use correlator fields so the context of an individual line of trace output |
| 94 | can be understood. For example, trace the pointer returned by malloc and |
| 95 | used as an argument to free. This way mallocs and frees can be matched up. |
| 96 | Trace events with no context are not very useful. |
| 97 | |
| 98 | 4. Name trace events after their function. If there are multiple trace events |
| 99 | in one function, append a unique distinguisher at the end of the name. |
| 100 | |
Lluís | 31965ae | 2011-08-31 20:31:24 +0200 | [diff] [blame] | 101 | == Generic interface and monitor commands == |
| 102 | |
Lluís Vilanova | b1bae81 | 2013-03-05 14:47:38 +0100 | [diff] [blame] | 103 | You can programmatically query and control the state of trace events through a |
| 104 | backend-agnostic interface provided by the header "trace/control.h". |
Lluís | 31965ae | 2011-08-31 20:31:24 +0200 | [diff] [blame] | 105 | |
Lluís Vilanova | b1bae81 | 2013-03-05 14:47:38 +0100 | [diff] [blame] | 106 | Note that some of the backends do not provide an implementation for some parts |
| 107 | of this interface, in which case QEMU will just print a warning (please refer to |
| 108 | header "trace/control.h" to see which routines are backend-dependent). |
Lluís | 31965ae | 2011-08-31 20:31:24 +0200 | [diff] [blame] | 109 | |
Lluís Vilanova | b1bae81 | 2013-03-05 14:47:38 +0100 | [diff] [blame] | 110 | The state of events can also be queried and modified through monitor commands: |
Lluís | 31965ae | 2011-08-31 20:31:24 +0200 | [diff] [blame] | 111 | |
| 112 | * info trace-events |
| 113 | View available trace events and their state. State 1 means enabled, state 0 |
| 114 | means disabled. |
| 115 | |
| 116 | * trace-event NAME on|off |
Lluís Vilanova | b1bae81 | 2013-03-05 14:47:38 +0100 | [diff] [blame] | 117 | Enable/disable a given trace event or a group of events (using wildcards). |
Lluís | 31965ae | 2011-08-31 20:31:24 +0200 | [diff] [blame] | 118 | |
Lluís | 23d15e8 | 2011-08-31 20:31:31 +0200 | [diff] [blame] | 119 | The "-trace events=<file>" command line argument can be used to enable the |
| 120 | events listed in <file> from the very beginning of the program. This file must |
| 121 | contain one event name per line. |
| 122 | |
Stefan Hajnoczi | 8f5a0fb | 2012-11-05 08:48:29 +0100 | [diff] [blame] | 123 | If a line in the "-trace events=<file>" file begins with a '-', the trace event |
| 124 | will be disabled instead of enabled. This is useful when a wildcard was used |
| 125 | to enable an entire family of events but one noisy event needs to be disabled. |
| 126 | |
Lluís Vilanova | b1bae81 | 2013-03-05 14:47:38 +0100 | [diff] [blame] | 127 | Wildcard matching is supported in both the monitor command "trace-event" and the |
| 128 | events list file. That means you can enable/disable the events having a common |
| 129 | prefix in a batch. For example, virtio-blk trace events could be enabled using |
| 130 | the following monitor command: |
| 131 | |
| 132 | trace-event virtio_blk_* on |
| 133 | |
Stefan Hajnoczi | 81a97d9 | 2010-06-22 15:07:09 +0100 | [diff] [blame] | 134 | == Trace backends == |
| 135 | |
Lluís | 7b92e5b | 2011-04-06 20:33:56 +0200 | [diff] [blame] | 136 | The "tracetool" script automates tedious trace event code generation and also |
Stefan Hajnoczi | 81a97d9 | 2010-06-22 15:07:09 +0100 | [diff] [blame] | 137 | keeps the trace event declarations independent of the trace backend. The trace |
| 138 | events are not tightly coupled to a specific trace backend, such as LTTng or |
Lluís | 7b92e5b | 2011-04-06 20:33:56 +0200 | [diff] [blame] | 139 | SystemTap. Support for trace backends can be added by extending the "tracetool" |
Stefan Hajnoczi | 81a97d9 | 2010-06-22 15:07:09 +0100 | [diff] [blame] | 140 | script. |
| 141 | |
| 142 | The trace backend is chosen at configure time and only one trace backend can |
| 143 | be built into the binary: |
| 144 | |
Lluís Vilanova | 5b80827 | 2014-05-27 15:02:14 +0200 | [diff] [blame] | 145 | ./configure --trace-backends=simple |
Stefan Hajnoczi | 81a97d9 | 2010-06-22 15:07:09 +0100 | [diff] [blame] | 146 | |
| 147 | For a list of supported trace backends, try ./configure --help or see below. |
| 148 | |
| 149 | The following subsections describe the supported trace backends. |
| 150 | |
| 151 | === Nop === |
| 152 | |
| 153 | The "nop" backend generates empty trace event functions so that the compiler |
| 154 | can optimize out trace events completely. This is the default and imposes no |
| 155 | performance penalty. |
| 156 | |
Lluís | dd215f6 | 2011-08-31 20:31:38 +0200 | [diff] [blame] | 157 | Note that regardless of the selected trace backend, events with the "disable" |
| 158 | property will be generated with the "nop" backend. |
| 159 | |
Stefan Hajnoczi | b48c20f | 2011-02-23 14:00:21 +0000 | [diff] [blame] | 160 | === Stderr === |
| 161 | |
| 162 | The "stderr" backend sends trace events directly to standard error. This |
| 163 | effectively turns trace events into debug printfs. |
| 164 | |
| 165 | This is the simplest backend and can be used together with existing code that |
| 166 | uses DPRINTF(). |
| 167 | |
Stefan Hajnoczi | 81a97d9 | 2010-06-22 15:07:09 +0100 | [diff] [blame] | 168 | === Simpletrace === |
| 169 | |
| 170 | The "simple" backend supports common use cases and comes as part of the QEMU |
| 171 | source tree. It may not be as powerful as platform-specific or third-party |
| 172 | trace backends but it is portable. This is the recommended trace backend |
| 173 | unless you have specific needs for more advanced backends. |
| 174 | |
Stefan Hajnoczi | 8f64211 | 2011-09-05 16:31:45 +0100 | [diff] [blame] | 175 | The "simple" backend currently does not capture string arguments, it simply |
| 176 | records the char* pointer value instead of the string that is pointed to. |
| 177 | |
Eiichi Tsukata | e64dd5e | 2013-04-11 20:25:16 +0900 | [diff] [blame] | 178 | === Ftrace === |
| 179 | |
| 180 | The "ftrace" backend writes trace data to ftrace marker. This effectively |
| 181 | sends trace events to ftrace ring buffer, and you can compare qemu trace |
| 182 | data and kernel(especially kvm.ko when using KVM) trace data. |
| 183 | |
| 184 | if you use KVM, enable kvm events in ftrace: |
| 185 | |
| 186 | # echo 1 > /sys/kernel/debug/tracing/events/kvm/enable |
| 187 | |
| 188 | After running qemu by root user, you can get the trace: |
| 189 | |
| 190 | # cat /sys/kernel/debug/tracing/trace |
| 191 | |
| 192 | Restriction: "ftrace" backend is restricted to Linux only. |
| 193 | |
Stefan Hajnoczi | 81a97d9 | 2010-06-22 15:07:09 +0100 | [diff] [blame] | 194 | ==== Monitor commands ==== |
| 195 | |
Stefan Hajnoczi | 81a97d9 | 2010-06-22 15:07:09 +0100 | [diff] [blame] | 196 | * trace-file on|off|flush|set <path> |
| 197 | Enable/disable/flush the trace file or set the trace file name. |
| 198 | |
Stefan Hajnoczi | 81a97d9 | 2010-06-22 15:07:09 +0100 | [diff] [blame] | 199 | ==== Analyzing trace files ==== |
| 200 | |
| 201 | The "simple" backend produces binary trace files that can be formatted with the |
Lluís | 7b92e5b | 2011-04-06 20:33:56 +0200 | [diff] [blame] | 202 | simpletrace.py script. The script takes the "trace-events" file and the binary |
Stefan Hajnoczi | 81a97d9 | 2010-06-22 15:07:09 +0100 | [diff] [blame] | 203 | trace: |
| 204 | |
Markus Armbruster | 8f44015 | 2013-01-25 16:43:40 +0100 | [diff] [blame] | 205 | ./scripts/simpletrace.py trace-events trace-12345 |
Stefan Hajnoczi | 81a97d9 | 2010-06-22 15:07:09 +0100 | [diff] [blame] | 206 | |
Lluís | 7b92e5b | 2011-04-06 20:33:56 +0200 | [diff] [blame] | 207 | You must ensure that the same "trace-events" file was used to build QEMU, |
Stefan Hajnoczi | 81a97d9 | 2010-06-22 15:07:09 +0100 | [diff] [blame] | 208 | otherwise trace event declarations may have changed and output will not be |
| 209 | consistent. |
| 210 | |
| 211 | === LTTng Userspace Tracer === |
| 212 | |
| 213 | The "ust" backend uses the LTTng Userspace Tracer library. There are no |
| 214 | monitor commands built into QEMU, instead UST utilities should be used to list, |
| 215 | enable/disable, and dump traces. |
Stefan Hajnoczi | b48c20f | 2011-02-23 14:00:21 +0000 | [diff] [blame] | 216 | |
Mohamad Gebai | ef3ef4a | 2014-01-29 22:47:57 -0500 | [diff] [blame] | 217 | Package lttng-tools is required for userspace tracing. You must ensure that the |
| 218 | current user belongs to the "tracing" group, or manually launch the |
| 219 | lttng-sessiond daemon for the current user prior to running any instance of |
| 220 | QEMU. |
| 221 | |
| 222 | While running an instrumented QEMU, LTTng should be able to list all available |
| 223 | events: |
| 224 | |
| 225 | lttng list -u |
| 226 | |
| 227 | Create tracing session: |
| 228 | |
| 229 | lttng create mysession |
| 230 | |
| 231 | Enable events: |
| 232 | |
| 233 | lttng enable-event qemu:g_malloc -u |
| 234 | |
| 235 | Where the events can either be a comma-separated list of events, or "-a" to |
| 236 | enable all tracepoint events. Start and stop tracing as needed: |
| 237 | |
| 238 | lttng start |
| 239 | lttng stop |
| 240 | |
| 241 | View the trace: |
| 242 | |
| 243 | lttng view |
| 244 | |
| 245 | Destroy tracing session: |
| 246 | |
| 247 | lttng destroy |
| 248 | |
| 249 | Babeltrace can be used at any later time to view the trace: |
| 250 | |
| 251 | babeltrace $HOME/lttng-traces/mysession-<date>-<time> |
| 252 | |
Stefan Hajnoczi | b48c20f | 2011-02-23 14:00:21 +0000 | [diff] [blame] | 253 | === SystemTap === |
| 254 | |
| 255 | The "dtrace" backend uses DTrace sdt probes but has only been tested with |
| 256 | SystemTap. When SystemTap support is detected a .stp file with wrapper probes |
| 257 | is generated to make use in scripts more convenient. This step can also be |
| 258 | performed manually after a build in order to change the binary name in the .stp |
| 259 | probes: |
| 260 | |
| 261 | scripts/tracetool --dtrace --stap \ |
| 262 | --binary path/to/qemu-binary \ |
| 263 | --target-type system \ |
Paolo Bonzini | b9a7b74 | 2013-06-04 14:45:26 +0200 | [diff] [blame] | 264 | --target-name x86_64 \ |
Stefan Hajnoczi | b48c20f | 2011-02-23 14:00:21 +0000 | [diff] [blame] | 265 | <trace-events >qemu.stp |
Lluís Vilanova | b7d66a7 | 2011-12-06 17:38:15 +0100 | [diff] [blame] | 266 | |
| 267 | == Trace event properties == |
| 268 | |
| 269 | Each event in the "trace-events" file can be prefixed with a space-separated |
| 270 | list of zero or more of the following event properties. |
| 271 | |
| 272 | === "disable" === |
| 273 | |
| 274 | If a specific trace event is going to be invoked a huge number of times, this |
| 275 | might have a noticeable performance impact even when the event is |
| 276 | programmatically disabled. |
| 277 | |
| 278 | In this case you should declare such event with the "disable" property. This |
| 279 | will effectively disable the event at compile time (by using the "nop" backend), |
| 280 | thus having no performance impact at all on regular builds (i.e., unless you |
| 281 | edit the "trace-events" file). |
| 282 | |
| 283 | In addition, there might be cases where relatively complex computations must be |
| 284 | performed to generate values that are only used as arguments for a trace |
| 285 | function. In these cases you can use the macro 'TRACE_${EVENT_NAME}_ENABLED' to |
| 286 | guard such computations and avoid its compilation when the event is disabled: |
| 287 | |
| 288 | #include "trace.h" /* needed for trace event prototype */ |
| 289 | |
| 290 | void *qemu_vmalloc(size_t size) |
| 291 | { |
| 292 | void *ptr; |
| 293 | size_t align = QEMU_VMALLOC_ALIGN; |
| 294 | |
| 295 | if (size < align) { |
| 296 | align = getpagesize(); |
| 297 | } |
| 298 | ptr = qemu_memalign(align, size); |
| 299 | if (TRACE_QEMU_VMALLOC_ENABLED) { /* preprocessor macro */ |
| 300 | void *complex; |
| 301 | /* some complex computations to produce the 'complex' value */ |
| 302 | trace_qemu_vmalloc(size, ptr, complex); |
| 303 | } |
| 304 | return ptr; |
| 305 | } |
Lluís Vilanova | b1bae81 | 2013-03-05 14:47:38 +0100 | [diff] [blame] | 306 | |
| 307 | You can check both if the event has been disabled and is dynamically enabled at |
| 308 | the same time using the 'trace_event_get_state' routine (see header |
| 309 | "trace/control.h" for more information). |
Lluís Vilanova | 0bb403b | 2014-05-30 14:11:26 +0200 | [diff] [blame] | 310 | |
| 311 | === "tcg" === |
| 312 | |
| 313 | Guest code generated by TCG can be traced by defining an event with the "tcg" |
| 314 | event property. Internally, this property generates two events: |
| 315 | "<eventname>_trans" to trace the event at translation time, and |
| 316 | "<eventname>_exec" to trace the event at execution time. |
| 317 | |
| 318 | Instead of using these two events, you should instead use the function |
| 319 | "trace_<eventname>_tcg" during translation (TCG code generation). This function |
| 320 | will automatically call "trace_<eventname>_trans", and will generate the |
| 321 | necessary TCG code to call "trace_<eventname>_exec" during guest code execution. |
| 322 | |
| 323 | Events with the "tcg" property can be declared in the "trace-events" file with a |
| 324 | mix of native and TCG types, and "trace_<eventname>_tcg" will gracefully forward |
| 325 | them to the "<eventname>_trans" and "<eventname>_exec" events. Since TCG values |
| 326 | are not known at translation time, these are ignored by the "<eventname>_trans" |
| 327 | event. Because of this, the entry in the "trace-events" file needs two printing |
| 328 | formats (separated by a comma): |
| 329 | |
| 330 | tcg foo(uint8_t a1, TCGv_i32 a2) "a1=%d", "a1=%d a2=%d" |
| 331 | |
| 332 | For example: |
| 333 | |
| 334 | #include "trace-tcg.h" |
| 335 | |
| 336 | void some_disassembly_func (...) |
| 337 | { |
| 338 | uint8_t a1 = ...; |
| 339 | TCGv_i32 a2 = ...; |
| 340 | trace_foo_tcg(a1, a2); |
| 341 | } |
| 342 | |
| 343 | This will immediately call: |
| 344 | |
| 345 | void trace_foo_trans(uint8_t a1); |
| 346 | |
| 347 | and will generate the TCG code to call: |
| 348 | |
| 349 | void trace_foo(uint8_t a1, uint32_t a2); |