- Renamed struct_.py -> structs.py again.
- Removed lfs.csv, instead preferring script-specific csv files.
- Added *-diff make rules for quick comparison against a previous
result. Results are now implicitly written on each run.
For example, `make code` creates lfs.code.csv and prints the summary, which
can be followed by `make code-diff` to compare changes against the saved
lfs.code.csv without overwriting.
- Added nargs=? support for -s and -S. These now use a per-result _sort
attribute to decide the sort order if fields are unspecified.
- Fixed added/removed count in scripts when an entry has no field in
the expected results.
- Fixed a python-sort-type issue when by-field is missing in a result.
- Changed --(tool)-tool to --(tool)-path in scripts, as this seems to be
a more common name for this sort of flag.
- Changed BUILDDIR to not have an implicit slash, which makes Makefile
internals a bit more readable.
- Fixed some outdated names hidden in less-often used ifdefs.
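As a rough illustration of the nargs=? behavior for -s, here is a minimal
argparse sketch. The parse_sort helper and its default_sort parameter are
hypothetical stand-ins for the per-result _sort attribute, not the scripts'
actual code:

```python
import argparse

def parse_sort(argv, default_sort=('size',)):
    # default_sort stands in for a per-result _sort attribute (hypothetical)
    parser = argparse.ArgumentParser()
    # nargs='?' lets -s appear with or without a field name; const is the
    # value stored when the flag is given without a field
    parser.add_argument('-s', '--sort', nargs='?', const=True, default=None)
    args = parser.parse_args(argv)
    if args.sort is None:
        return None                 # no sorting requested
    if args.sort is True:
        return list(default_sort)   # flag given, fall back to the default
    return args.sort.split(',')     # explicit comma-separated fields
```

With this shape, a bare -s sorts by whatever the result type prefers, while
-s name still sorts by an explicit field.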
Based loosely on Linux's perf tool, perfbd.py uses trace output with
backtraces to aggregate and show the block device usage of all functions
in a program, propagating block device operation costs up the backtrace
for each operation.
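The propagation step can be sketched roughly as follows. The event layout
(a list of function names plus a cost) and the function names are
assumptions for illustration, and perfbd.py may account for recursion
differently:

```python
from collections import defaultdict

def propagate(events):
    # events: iterable of (backtrace, cost) pairs, where backtrace is a
    # list of function names; this layout is assumed for illustration
    totals = defaultdict(int)
    for backtrace, cost in events:
        # charge each function once per event, even if it appears
        # several times in the same backtrace
        for func in set(backtrace):
            totals[func] += cost
    return dict(totals)
```

Every function on the backtrace is charged the full cost of the operation,
so callers accumulate the cost of everything beneath them.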
This, combined with --trace-period and --trace-freq for
sampling/filtering trace events, allows the bench-runner to very
efficiently record the general cost of block device operations with very
little overhead.
Adopted this as the default side-effect of make bench, replacing
cycle-based performance measurements which are less important for
littlefs.
This adds -P/--propagate and -Z/--depth to perf.py for showing recursive
results, making it easy to narrow down on where spikes in performance
come from.
This ended up being a bit different from stack.py's recursive results,
as we end up with different (diminishing) numbers as we descend.
This provides 2 things:
1. perf integration with the bench/test runners - This is a bit tricky
with perf as it doesn't have its own way to combine perf measurements
across multiple processes. perf.py works around this by writing
everything to a zip file, using flock to synchronize. As a plus, free
compression!
2. Parsing and presentation of perf results in a format consistent with
the other CSV-based tools. This actually ran into a surprising number of
issues:
- We need to process raw events to get the information we want. This
ends up being a lot of data (~16MiB at 100Hz uncompressed), so we
parallelize the parsing of each decompressed perf file.
- perf reports raw addresses post-ASLR. It does provide sym+off which
is very useful, but to find the source of static functions we need to
reverse the ASLR by finding the delta that produces the best
symbol<->addr matches.
- This isn't related to perf, but decoding dwarf line-numbers is
really complicated. You basically need to write a tiny VM.
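The ASLR reversal mentioned above can be approximated with a simple vote:
each sample proposes a delta, and the most common one wins. This is only a
sketch under assumed input shapes (the sample tuples and symbol table are
hypothetical), not necessarily how perf.py picks the delta:

```python
from collections import Counter

def guess_aslr_delta(samples, symtab):
    # samples: (runtime_addr, sym, off) tuples as reported per event
    # symtab: {sym: static_addr} taken from the unrelocated binary
    votes = Counter()
    for addr, sym, off in samples:
        if sym in symtab:
            # delta that would map the static address onto this sample
            votes[addr - (symtab[sym] + off)] += 1
    if not votes:
        return 0
    # the delta that explains the most symbol<->addr pairs
    return votes.most_common(1)[0][0]
```

Voting tolerates a few mismatched samples, such as symbols that share a
name, without needing an exact match for every event.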
This also turns on perf measurement by default for the bench-runner, but at a
low frequency (100 Hz). This can be decreased or removed in the future
if it causes any slowdown.
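The zip-plus-flock approach from point 1 might look roughly like this. The
append_result helper is hypothetical, and fcntl is only available on
Unix-like systems:

```python
import fcntl
import zipfile

def append_result(zip_path, name, data):
    # open in append mode so the archive is created if it is missing
    with open(zip_path, 'ab') as lock:
        # block until we hold an exclusive lock on the archive
        fcntl.flock(lock, fcntl.LOCK_EX)
        try:
            with zipfile.ZipFile(zip_path, 'a',
                    compression=zipfile.ZIP_DEFLATED) as z:
                # each process adds its own measurements as one member
                z.writestr(name, data)
        finally:
            fcntl.flock(lock, fcntl.LOCK_UN)
```

Each runner process can then append its own perf data without corrupting
the shared archive, and the members are compressed as a side effect.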
The main change is requiring field names for -b/-f/-s/-S. This
is a bit more powerful and supports hidden extra fields, but
can require a bit more typing in some cases.
- Added the littlefs license note to the scripts.
- Adopted parse_intermixed_args everywhere for more consistent arg
handling.
- Removed argparse's implicit help text formatting as it does not
work with parse_intermixed_args and breaks sometimes.
- Used string concatenation for argparse everywhere. Using backslashed
line continuations only works with argparse because it strips
redundant whitespace.
- Consistent argparse formatting.
- Consistent openio mode handling.
- Consistent color argument handling.
- Adopted functools.lru_cache in tracebd.py.
- Moved unicode printing behind --subscripts in tracebd.py, making all
scripts ascii by default.
- Renamed pretty_asserts.py -> prettyasserts.py.
- Renamed struct.py -> struct_.py, as the original name conflicts with
Python's built-in struct module in horrible ways.
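For reference, a small sketch of the argparse conventions described above:
parse_intermixed_args plus help text built from adjacent string literals.
The option names here are made up for the example:

```python
import argparse

def parse(argv):
    parser = argparse.ArgumentParser(
        description="Hypothetical example script.")
    parser.add_argument('paths', nargs='*',
        help="Input files.")
    parser.add_argument('-o', '--output',
        # adjacent string literals are joined by the compiler, so no
        # backslash continuation (and no stray indentation) is needed
        help="Write results to this file. "
            "Defaults to stdout.")
    # allows options between positionals: a.csv -o out.csv b.csv
    return parser.parse_intermixed_args(argv)
```

parse_intermixed_args collects positionals that appear on both sides of an
option, which is not guaranteed with plain parse_args.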
With more scripts generating CSV files this moves most CSV manipulation
into summary.py, which can now handle more or less any arbitrary CSV
file with arbitrary names and fields.
This also includes a bunch of additional, probably unnecessary, tweaks:
- summary.py/coverage.py use a custom fractional type for encoding
fractions, this will also be used for test counts.
- Added a smaller diff output for size scripts with the --percent flag.
- Added line and hit info to coverage.py's CSV files.
- Added --tree flag to stack.py to show only the call tree without
other noise.
- Renamed structs.py to struct.py.
- Changed a few flags around for consistency between size/summary scripts.
- Added `make sizes` alias.
- Added `make lfs.code.csv` rules.
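A fractional type of the kind mentioned above could be sketched like this.
The Frac name and its methods are assumptions for illustration and may not
match the type used in summary.py/coverage.py:

```python
class Frac:
    # a/b pair, e.g. hit lines out of total lines (hypothetical type)
    def __init__(self, a, b):
        self.a = a
        self.b = b

    def __add__(self, other):
        # merging results sums numerators and denominators separately,
        # which is not the same as adding the two ratios
        return Frac(self.a + other.a, self.b + other.b)

    def ratio(self):
        # treat an empty fraction as fully covered
        return self.a / self.b if self.b else 1.0

    def __str__(self):
        return '%d/%d' % (self.a, self.b)
```

Keeping the numerator and denominator separate is what lets per-file
results be merged into a meaningful total.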
These scripts can't easily share the common logic, but separating
field details from the print/merge/csv logic should make the common
part of these scripts much easier to create/modify going forward.
This also tweaked the behavior of summary.py slightly.
A small mistake in test.py's control flow meant the failing test job
would successfully kill all other test jobs, but then humorously start
up a new process to continue testing.
Using errors=replace in python utf-8 decoding makes these scripts more
resilient to underlying errors, rather than just throwing an unhelpfully
generic decode error.
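A quick demonstration of the difference, using a made-up input line with
one invalid byte:

```python
raw = b'lfs_dir_fetch \xff 42\n'

# strict decoding (the default) raises on the invalid byte
try:
    raw.decode('utf-8')
    strict_ok = True
except UnicodeDecodeError:
    strict_ok = False

# errors='replace' substitutes U+FFFD and keeps the rest of the line
line = raw.decode('utf-8', errors='replace')
```

The same errors='replace' argument can be passed to open() when reading
text files.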
A full summary of static measurements (code size, stack usage, etc) can now
be found with:
make summary
This is done through the combination of a new ./scripts/summary.py
script and the ability of existing scripts to merge into existing csv
files, allowing multiple results to be merged either in a pipeline, or
in parallel with a single ./scripts/summary.py call.
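Merging results from several csv files can be sketched as below. The
name column and integer fields are assumptions about the csv layout, and
the real script handles more cases:

```python
import csv
import io

def merge(csv_texts):
    # merge rows from several csv documents, keyed by the 'name' column
    merged = {}
    for text in csv_texts:
        for row in csv.DictReader(io.StringIO(text)):
            name = row.pop('name')
            fields = merged.setdefault(name, {})
            for k, v in row.items():
                # skip empty cells, sum repeated fields
                if v != '':
                    fields[k] = fields.get(k, 0) + int(v)
    return merged
```

Because each script only contributes its own columns, the merged rows end
up with whichever measurements were available for that name.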
The ./scripts/summary.py script can also be used to quickly compare
different builds or configurations. This is a proper implementation
of a similar but hacky shell script that has already been very useful
for making optimization decisions:
$ ./scripts/summary.py new.csv -d old.csv --summary
name (2 added, 0 removed)    code           stack  structs
TOTAL                       28648 (-2.7%)    2448     1012
Also some other small tweaks to scripts:
- Removed state saving diff rules. This isn't the most useful way to
handle comparing changes.
- Added short flags for --summary (-Y) and --files (-F), since these
are quite often used.
- Added -L/--depth argument to show dependencies for scripts/stack.py.
This replaces calls.py.
- Additional internal restructuring to avoid repeated code
- Removed incorrect diff percentage when there is no actual size
- Consistent percentage rendering in test.py
This required a patch to the --diff flag for the scripts to ignore
a missing file. This enables the useful one liner for making comparisons
with potentially missing previous versions:
./scripts/code.py lfs.o -d lfs.o.code.csv -o lfs.o.code.csv
function (0 added, 0 removed)    old    new   diff
TOTAL                          25476  25476     +0
One downside: these previous files are easy to delete as a part of make
clean, which limits their usefulness for comparing configuration
changes...
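The missing-file handling for --diff amounts to treating an absent file as
empty previous results. A minimal sketch, assuming hypothetical name and
size columns:

```python
import csv

def load_previous(path):
    # a missing file is treated as "no previous results"
    try:
        with open(path, newline='') as f:
            return {r['name']: int(r['size'])
                for r in csv.DictReader(f)}
    except FileNotFoundError:
        return {}

def diff(old, new):
    # per-name size change, counting missing entries as 0
    names = sorted(old.keys() | new.keys())
    return {n: new.get(n, 0) - old.get(n, 0) for n in names}
```

On the first run everything shows up as added, and later runs compare
against whatever the previous run wrote.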
Note this detects loops (recursion), and renders this as infinity.
Currently littlefs does have a single recursive function and you can see
how this infects the full call graph. Eventually this should be removed.
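Loop detection of this kind can be sketched as a depth-first walk that
returns infinity when it revisits a function already on the current path.
The frames and calls tables are hypothetical inputs, not stack.py's actual
data structures:

```python
import math

def stack_limit(func, frames, calls, _active=None):
    # frames: {func: frame size}, calls: {func: [callees]}
    _active = _active or set()
    if func in _active:
        # already on the current call path, so this is recursion
        return math.inf
    _active = _active | {func}
    deepest = max(
        (stack_limit(c, frames, calls, _active)
            for c in calls.get(func, [])),
        default=0)
    return frames[func] + deepest
```

Since infinity dominates max and addition, any caller that can reach the
recursive function also reports infinity.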
This is to avoid unexpected script behavior even though data.py should
always return 0 bytes for littlefs. Maybe a check for this should be
added to CI?