On Tue, 11 Feb 2020, David Fifield wrote:
I did a build of tor-browser-build build and it took even longer than I remember it taking -- from Friday until Tuesday, roughly. This is on a VPS with six 2 GHz CPUs and 16 GB RAM. Some projects I know take a long time to build, like rust and firefox, but it seemed like most of the time was being spent in the "rbm" process, not in building the projects.
I don't know much about profiling Perl, but I tried a tool called NYTProf that produces HTML output: https://people.torproject.org/~dcf/graphs/nytprof-rbm-20200212-94fce31e/
Thanks, I think this profile can help find a few things to improve.
The above NYTProf output is not from a full build. It's from modifying projects/snowflake/config only and then building snowflake for only one target; still it took almost an hour. perl -d:NYTProf rbm/rbm build snowflake --target testbuild --target torbrowser-linux-x86_64 nytprofhtml
Was it a new build of snowflake where some of the dependencies had to be built, or a build where only snowflake had to be built?
In the first case, I think part of the reason why it takes time is that snowflake has many small dependencies, and each one is built in a separate container, so some time is spent extracting/removing chroots.
If only snowflake and no other dependencies was built, then I don't know why it took almost one hour. In my case, adding a blank line to projects/snowflake/build and rebuilding it takes around 8 minutes.
To my untrained eye it looks like there is some kind of exponential recursion happening around the RBM::config_p function. See here, where in building one project the function was called 89 million times: https://people.torproject.org/~dcf/graphs/nytprof-rbm-20200212-94fce31e/#sub... Clicking through to the per-line view, we see that 45 million of those calls were from RBM::get_target, and most of the rest were from RBM::config: https://people.torproject.org/~dcf/graphs/nytprof-rbm-20200212-94fce31e/root...
You know how, if you write a Fibonacci number function and don't memoize the recursive calls, it ends up re-computing the same subtrees over and over and doing exponential work? def fib(n): if n == 0 or n == 1: return 1 return fib(n - 1) + fib(n - 2) It feels like something like that is happening here. There's a pattern where RBM::process_template calls a chain of functions that reaches RBM::config_p, which eventually calls RBM::process_template again. It's not an infinite loop, but it seems to be doing the same work many times. I looked at it for a bit but my Perl isn't strong enough to understand specifically what's happening.
Notice how there are three loops of RBM::process_template before doing an RBM::config_p that calls back into further RBM::process_template.
Yes, there is a lot of recursive calls.
For example the filename in projects/tor-browser/config is defined like this:
filename: 'tor-browser-[% c("version") %]-[% c("var/osname") %]-[% c("var/build_id") %]'
computing this filename will require computing var/build_id, which is defined like this:
build_id: '[% sha256(c("var/build_id_txt", { buildconf => { num_procs => 4 } })).substr(0, 6) %]' build_id_txt: | [% c("version") %] [% IF c("git_hash") || c("hg_hash"); GET c("abbrev"); END; %] [% IF c("var/container/use_container") && ! c("var/container/global_disable") -%] [% c("var/container/suite") %] [% c("var/container/arch") %] [% END -%] input_files: [% c("input_files_id") %] build: [% c("build", { filename => 'f', output_dir => '/out' }) %]
so this require recursively computing filenames of all other projects, which requires computing sha256sum of all input files.
Maybe there is some caching we can add to avoid computing some things multiple times. However in some cases we cannot avoid computing almost the same thing twice because we cannot know if an option can change the result, for example with the following two lines:
filename: 'name-[% c("version", { option_1 => 1 }) %]' filename: 'name-[% c("version", { option_1 => 2 }) %]'
Some more evidence is that while I was waiting for rbm to do something, I was watching the modification times of the git_clones: ls -ltrhd git_clones/*/.git | tail -n 15 It was constantly touching all different projects, but re-touching the same ones over and over.
I opened a ticket for this: https://trac.torproject.org/projects/tor/ticket/33283
I think this patch should avoid doing multiple `git checkout` and `git log` on the same projects: https://gitweb.torproject.org/user/boklm/rbm.git/commit/?h=bug_33283&id=...
When running this:
rbm/rbm showconf tor-browser filename --target torbrowser-linux-x86_64 --target alpha
in my case the patch improves the time to run this from 3m25 to 2m49.
An other possible improvement is in the function input_file_need_dl where we can avoid computing multiple times the sha256sum of the same files.
Nicolas