User Statically-Defined Tracing (USDT) probes¶
Sometimes it’s desired to troubleshoot one of OVS’s components in the field. One of the techniques used is to add dynamic tracepoints, for example using perf. However, the desired dynamic tracepoint and/or the desired variable, might not be available due to compiler optimizations.
In this case, a well-thought-off, static tracepoint could be permanently added, so it’s always available. For OVS we use the DTrace probe macro’s, which have little to no overhead when disabled. Various tools exist to enable them. See some examples below.
Compiling with USDT probes enabled¶
Since USDT probes are compiled out by default, a compile-time option is available to include them. To add the probes to the generated code, use the following configure option
$ ./configure --enable-usdt-probes
The following line should be seen in the configure output when the above option is used
checking whether USDT probes are enabled... yes
As USDT probes internally use the DTRACE_PROBExx
macros, which are part of
the SystemTap framework, you need to install the appropriate package for your
Linux distribution. For example, on Fedora, you need to install the
systemtap-sdt-devel
package.
Listing available probes¶
There are various ways to display USDT probes available in a userspace application. Here we show three examples. All assuming ovs-vswitchd is in the search path with USDT probes enabled:
You can use the perf tool as follows
$ perf buildid-cache --add $(which ovs-vswitchd)
$ perf list | grep sdt_
sdt_main:poll_block [SDT event]
sdt_main:run_start [SDT event]
You can use the bpftrace tool
# bpftrace -l "usdt:$(which ovs-vswitchd):*"
usdt:/usr/sbin/ovs-vswitchd:main:poll_block
usdt:/usr/sbin/ovs-vswitchd:main:run_start
Note
If you execute this on a running process,
bpftrace -lp $(pidof ovs-vswitchd) "usdt:*"
, it will list all USDT
events, i.e., also the ones available in the used shared libraries.
Finally, you can use the tplist tool which is part of the bcc framework
$ /usr/share/bcc/tools/tplist -vv -l $(which ovs-vswitchd)
b'main':b'poll_block' [sema 0x0]
location #1 b'/usr/sbin/ovs-vswitchd' 0x407fdc
b'main':b'run_start' [sema 0x0]
location #1 b'/usr/sbin/ovs-vswitchd' 0x407ff6
Using probes¶
We will use the OVS sandbox environment in combination with the probes shown
above to try out some of the available trace tools. To start up the virtual
environment use the make sandbox
command. In addition we have to create
a bridge to kick of the main bridge loop
$ ovs-vsctl add-br test_bridge
$ ovs-vsctl show
055acdca-2f0c-4f6e-b542-f4b6d2c44e08
Bridge test_bridge
Port test_bridge
Interface test_bridge
type: internal
perf¶
Perf is using Linux uprobe based event tracing to for capturing the events. To enable the main:* probes as displayed above and take an actual trace, you need to execute the following sequence of perf commands
# perf buildid-cache --add $(which ovs-vswitchd)
# perf list | grep sdt_
sdt_main:poll_block [SDT event]
sdt_main:run_start [SDT event]
# perf probe --add=sdt_main:poll_block --add=sdt_main:run_start
Added new events:
sdt_main:poll_block (on %poll_block in /usr/sbin/ovs-vswitchd)
sdt_main:run_start (on %run_start in /usr/sbin/ovs-vswitchd)
You can now use it in all perf tools, such as:
perf record -e sdt_main:run_start -aR sleep 1
# perf record -e sdt_main:run_start -e sdt_main:poll_block \
-p $(pidof ovs-vswitchd) sleep 30
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.039 MB perf.data (132 samples) ]
# perf script
ovs-vswitchd 8576 [011] 21031.340433: sdt_main:run_start: (407ff6)
ovs-vswitchd 8576 [011] 21031.340516: sdt_main:poll_block: (407fdc)
ovs-vswitchd 8576 [011] 21031.841726: sdt_main:run_start: (407ff6)
ovs-vswitchd 8576 [011] 21031.842088: sdt_main:poll_block: (407fdc)
...
Note that the above examples works with the sandbox environment, so make sure you execute the above command while in the sandbox shell!
There are a lot more options available with perf, for example, the
--call-graph dwarf
option, which would give you a call graph in the
perf script
output. See the perf documentation for more information.
One other interesting feature is that the perf data can be converted for use
by the trace visualizer Trace Compass. This can be done using the
--all --to-ctf
option to the perf data convert
tool.
bpftrace¶
bpftrace is a high-level tracing language based on eBPF, which can be used to script USDT probes. Here we will show a simple one-liner to display the USDT probes being hit. However, the script section below reference some more advanced bpftrace scripts.
This is a simple bpftrace one-liner to show all main:*
USDT probes
# bpftrace -p $(pidof ovs-vswitchd) -e \
'usdt::main:* { printf("%s %u [%u] %u %s\n",
comm, pid, cpu, elapsed, probe); }'
Attaching 2 probes...
ovs-vswitchd 8576 [11] 203833199 usdt:main:run_start
ovs-vswitchd 8576 [11] 204086854 usdt:main:poll_block
ovs-vswitchd 8576 [11] 221611985 usdt:main:run_start
ovs-vswitchd 8576 [11] 221892019 usdt:main:poll_block
bcc¶
The BPF Compiler Collection (BCC) is a set of tools and scripts that also use
eBPF for tracing. The example below uses the trace
tool to show the events
while they are being generated
# /usr/share/bcc/tools/trace -T -p $(pidof ovs-vswitchd) \
'u::main:run_start' 'u::main:poll_block'
TIME PID TID COMM FUNC
15:49:06 8576 8576 ovs-vswitchd main:run_start
15:49:06 8576 8576 ovs-vswitchd main:poll_block
15:49:06 8576 8576 ovs-vswitchd main:run_start
15:49:06 8576 8576 ovs-vswitchd main:poll_block
^C
Scripts¶
To not have to re-invent the wheel when trying to debug complex OVS issues, a
set of scripts are provided in the source repository. They are located in the
utilities/usdt-scripts/
directory, and each script contains detailed
information on how they should be used, and what information they provide.
Available probes¶
The next sections describes all the available probes, their use case, and if used in any script, which one. Any new probes being added to OVS should get their own section. See the below “Adding your own probes” section for the used naming convention.
Available probes in ovs_vswitchd
:
dpif_netlink_operate__:op_flow_del
dpif_netlink_operate__:op_flow_execute
dpif_netlink_operate__:op_flow_get
dpif_netlink_operate__:op_flow_put
dpif_recv:recv_upcall
main:poll_block
main:run_start
revalidate:flow_result
revalidate_ukey__:entry
revalidate_ukey__:exit
revalidator_sweep__:flow_result
udpif_revalidator:start_dump
udpif_revalidator:sweep_done
dpif_netlink_operate__:op_flow_del¶
Description:
This probe gets triggered when the Netlink datapath is about to execute the
DPIF_OP_FLOW_DEL operation as part of the dpif operate()
callback.
Arguments:
arg0:
(struct dpif_netlink *) dpif
arg1:
(struct dpif_flow_del *) del
arg2:
(struct dpif_netlink_flow *) flow
arg3:
(struct ofpbuf *) aux->request
Script references:
utilities/usdt-scripts/dpif_op_nl_monitor.py
dpif_netlink_operate__:op_flow_execute¶
Description:
This probe gets triggered when the Netlink datapath is about to execute the
DPIF_OP_FLOW_EXECUTE operation as part of the dpif operate()
callback.
Arguments:
arg0:
(struct dpif_netlink *) dpif
arg1:
(struct dpif_execute *) op->execute
arg2:
dp_packet_data(op->execute.packet)
arg3:
dp_packet_size(op->execute.packet)
arg4:
(struct ofpbuf *) aux->request
Script references:
utilities/usdt-scripts/dpif_op_nl_monitor.py
utilities/usdt-scripts/upcall_cost.py
dpif_netlink_operate__:op_flow_get¶
Description:
This probe gets triggered when the Netlink datapath is about to execute the
DPIF_OP_FLOW_GET operation as part of the dpif operate()
callback.
Arguments:
arg0:
(struct dpif_netlink *) dpif
arg1:
(struct dpif_flow_get *) get
arg2:
(struct dpif_netlink_flow *) flow
arg3:
(struct ofpbuf *) aux->request
Script references:
utilities/usdt-scripts/dpif_op_nl_monitor.py
dpif_netlink_operate__:op_flow_put¶
Description:
This probe gets triggered when the Netlink datapath is about to execute the
DPIF_OP_FLOW_PUT operation as part of the dpif operate()
callback.
Arguments:
arg0:
(struct dpif_netlink *) dpif
arg1:
(struct dpif_flow_put *) put
arg2:
(struct dpif_netlink_flow *) flow
arg3:
(struct ofpbuf *) aux->request
Script references:
utilities/usdt-scripts/dpif_op_nl_monitor.py
utilities/usdt-scripts/upcall_cost.py
probe dpif_recv:recv_upcall¶
Description:
This probe gets triggered when the datapath independent layer gets notified
that a packet needs to be processed by userspace. This allows the probe to
intercept all packets sent by the kernel to ovs-vswitchd
. The
upcall_monitor.py
script uses this probe to display and capture all packets
sent to ovs-vswitchd
.
Arguments:
arg0:
(struct dpif *)->full_name
arg1:
(struct dpif_upcall *)->type
arg2:
dp_packet_data((struct dpif_upcall *)->packet)
arg3:
dp_packet_size((struct dpif_upcall *)->packet)
arg4:
(struct dpif_upcall *)->key
arg5:
(struct dpif_upcall *)->key_len
Script references:
utilities/usdt-scripts/upcall_cost.py
utilities/usdt-scripts/upcall_monitor.py
probe main:run_start¶
Description:
The ovs-vswitchd’s main process contains a loop that runs every time some work
needs to be done. This probe gets triggered every time the loop starts from the
beginning. See also the main:poll_block
probe below.
Arguments:
None
Script references:
utilities/usdt-scripts/bridge_loop.bt
probe main:poll_block¶
Description:
The ovs-vswitchd’s main process contains a loop that runs every time some work
needs to be done. This probe gets triggered every time the loop is done, and
it’s about to wait for being re-started by a poll_block() call returning.
See also the main:run_start
probe above.
Arguments:
None
Script references:
utilities/usdt-scripts/bridge_loop.bt
revalidate_ukey__:entry¶
Description:
This probe gets triggered on entry of the revalidate_ukey__() function.
Arguments:
arg0:
(struct udpif *) udpif
arg1:
(struct udpif_key *) ukey
arg2:
(uint16_t) tcp_flags
arg3:
(struct ofpbuf *) odp_actions
arg4:
(struct recirc_refs *) recircs
arg5:
(struct xlate_cache *) xcache
Script references:
utilities/usdt-scripts/reval_monitor.py
revalidate_ukey__:exit¶
Description:
This probe gets triggered right before the revalidate_ukey__() function exits.
Arguments:
arg0:
(struct udpif *) udpif
arg1:
(struct udpif_key *) ukey
arg2:
(enum reval_result) result
Script references:
None
udpif_revalidator:start_dump¶
Description:
The ovs-vswitchd’s revalidator process contains a loop that runs every time revalidation work is needed. This probe gets triggered every time the dump phase has started.
Arguments:
arg0:
(struct udpif *) udpif
arg1:
(size_t) n_flows
Script references:
utilities/usdt-scripts/reval_monitor.py
udpif_revalidator:sweep_done¶
Description:
The ovs-vswitchd’s revalidator process contains a loop that runs every time revalidation work is needed. This probe gets triggered every time the sweep phase was completed.
Arguments:
arg0:
(struct udpif *) udpif
arg1:
(size_t) n_flows
arg2:
(unsigned) MIN(ofproto_max_idle, ofproto_max_revalidator)
Script references:
utilities/usdt-scripts/reval_monitor.py
probe revalidate:flow_result¶
Description: This probe is triggered when the revalidator has executed on a particular flow key to make a determination whether to evict a flow, and the cause for eviction. The revalidator runs periodically, and this probe will only be triggered when a flow is flagged for revalidation.
Arguments:
arg0:
(struct udpif *) udpif
arg1:
(struct udpif_key *) ukey
arg2:
(enum reval_result) result
arg3:
(enum flow_del_reason) del_reason
Script references:
utilities/usdt-scripts/flow_reval_monitor.py
probe revalidator_sweep__:flow_result¶
Description: This probe is placed in the path of the revalidator sweep, and is executed under the condition that a flow entry is in an unexpected state, or the flows were asked to be purged due to a user action.
Arguments:
arg0:
(struct udpif *) udpif
arg1:
(struct udpif_key *) ukey
arg2:
(enum reval_result) result
arg3:
(enum flow_del_reason) del_reason
Script references:
utilities/usdt-scripts/flow_reval_monitor.py
Adding your own probes¶
Adding your own probes is as simple as adding the OVS_USDT_PROBE()
macro
to the code. It’s similar to the DTRACE_PROBExx
macro’s with the difference
that it does automatically determine the number of optional arguments.
The macro requires at least two arguments. The first one being the provider,
and the second one being the name. To keep some consistency with the probe
naming, please use the following convention. The provider should be the
function name, and the name should be the name of the tracepoint. If you do
function entry and exit like probes, please use entry
and exit
.
If, for some reason, you do not like to use the function name as a provider,
please prefix it with __
, so we know it’s not a function name.
The remaining parameters, up to 10, can be variables, pointers, etc., that might be of interest to capture at this point in the code. Note that the provided variables can cause the compiler to be less effective in optimizing.