Steve Klabnik recently wrote about whether out parameters are idiomatic in Rust. The post
ends by showing a snippet of code: a generic function, with a non-generic function inside of
it which contains the actual implementation. Steve says this pattern may warrant its own post,
so here is that post, where I’ll explain why this inner function is useful, discuss the
trade-offs of doing it, and describe why this pattern will hopefully not be necessary in
the future.
The snippet Steve showed is this one:
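What follows is a minimal sketch of that shape, reconstructed from the description below and modeled on `std::fs::read_to_string` in the standard library. The exact body in Steve's post may differ in its details.

```rust
use std::fs::File;
use std::io::{self, Read};
use std::path::Path;

// Generic outer function: accepts anything convertible to a &Path.
pub fn read_to_string<P: AsRef<Path>>(path: P) -> io::Result<String> {
    // Non-generic inner function: holds the actual implementation.
    fn inner(path: &Path) -> io::Result<String> {
        let mut file = File::open(path)?;
        let mut string = String::new();
        file.read_to_string(&mut string)?;
        Ok(string)
    }
    // The only generic work is the conversion itself.
    inner(path.as_ref())
}
```

Callers can pass a `&str`, a `String`, a `PathBuf`, or a `&Path`, and every call ends up in the same `inner`.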
Notice that the outer function is generic, taking anything which is convertible via
the AsRef trait into a &Path, while the inner function is not generic, only taking in
a &Path. The outer function does nothing but perform the as_ref() conversion and then
pass the result to the inner function.
So why have two functions at all?
Well, Rust generics are monomorphized at compile time. That means the compiler identifies
all the concrete types which any generic functions or types are called with throughout the
codebase, and generates copies of the generic code, specialized to the concrete types. So, for
example, if read_to_string() were called with a PathBuf, the compiler would generate a copy
of read_to_string() which takes a PathBuf specifically. This is how we can have code that is
both generic and fast to call (not involving any runtime checking to identify the implementation
to use).
We can imagine this monomorphization looks like this:
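As a rough illustration, assume a version of `read_to_string` written without the inner function and called once with a `PathBuf` and once with a `&str`. The function names below are hypothetical stand-ins; the compiler actually emits mangled symbols rather than source-level functions like these.

```rust
use std::fs::File;
use std::io::{self, Read};
use std::path::{Path, PathBuf};

// Hypothetical stand-in for the copy generated for P = PathBuf.
pub fn read_to_string_pathbuf(path: PathBuf) -> io::Result<String> {
    let mut file = File::open(AsRef::<Path>::as_ref(&path))?;
    let mut string = String::new();
    file.read_to_string(&mut string)?;
    Ok(string)
}

// Hypothetical stand-in for the copy generated for P = &str.
// Note that the whole body is repeated, not just the conversion.
pub fn read_to_string_str(path: &str) -> io::Result<String> {
    let mut file = File::open(AsRef::<Path>::as_ref(path))?;
    let mut string = String::new();
    file.read_to_string(&mut string)?;
    Ok(string)
}
```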
While the generated code doesn’t look too different from the original code, consider that some generic functions
may be called with several different concrete types throughout a codebase, with each concrete type causing the
generation of a complete copy of the original generic code. This creates a tension and a trade-off between
compilation speed and the size of the resulting binary on one side and generic, reusable code on the other side.
To explain: Rust compilation is complicated and slow compilation times can have any number of causes, but code
generation is consistently identified as one of the slowest compilation phases. Monomorphization is one form of code
generation in Rust, so using generic code more heavily, and thus causing the generation of more concrete copies of a
generic function or type, contributes to a crate compiling slowly. Those copies are also done in their entirety,
meaning you may end up with a lot of repeated code in a binary, one for each version of the original, bloating
the resulting executable file.
However, the ability to be generic and reusable is valuable, and taking in types like AsRef<Path> is often in
principle preferable to taking in a &Path (this is why we have the standard conversion traits like AsRef or Into
in the first place).
So, what do you do?
Well, you can do what Steve does in his example. Most of the function body isn’t generic; the generic parameter exists only to
make the input type more flexible, and the function immediately converts it into a single concrete type. So, you can create
a separate function which takes the converted-to type, and then have the generic function only perform the conversion
and then call the non-generic function. This way, the time spent generating code and the size of the generic code in the
resulting binary are both reduced.
You could do this pattern with a function outside of the original like so:
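Here is a sketch of that variant, where the helper name `read_to_string_inner` is a hypothetical choice made for illustration:

```rust
use std::fs::File;
use std::io::{self, Read};
use std::path::Path;

// Generic wrapper: only performs the conversion, then delegates.
pub fn read_to_string<P: AsRef<Path>>(path: P) -> io::Result<String> {
    read_to_string_inner(path.as_ref())
}

// Non-generic helper living next to the wrapper at module scope.
// It is visible to everything else in the module, not just its one caller.
fn read_to_string_inner(path: &Path) -> io::Result<String> {
    let mut file = File::open(path)?;
    let mut string = String::new();
    file.read_to_string(&mut string)?;
    Ok(string)
}
```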
However, this pollutes the namespace with a function which will only ever have one caller. The one question
to answer before moving this function inside of read_to_string is: if the inner function is inside the generic
function, doesn’t it still contribute to code bloat?
The answer is no, and to prove it, let’s turn to the Rust Playground and its ability to generate MIR (Rust’s
Mid-level Intermediate Representation, which is converted into LLVM IR and then into machine code). The MIR for
read_to_string in the above full code example looks like this:
This is a generated textual representation of MIR’s internal structure, so it may be a bit hard to read, but
this shows the function doing the conversion (inside the bb0 section) and calling the inner function (inside
the bb2 section). All the code which actually performs the file operations is contained inside the inner
function, which is split out and does not contribute to the size of the outer function in the generated code.
So, placing inner inside of the generic function limits its scope to only the callsite that needs it,
without harming our goal of reducing compile time and the size of the resulting binary.
The final question to ask in this situation is why the Rust compiler doesn’t just perform this optimization itself.
In an ideal world, the Rust compiler would see that most of the function isn’t generic, or that the parameters
are generic over one of the standard conversion traits, and would create an inner non-generic function and update
the original function to perform the conversion and call the inner function.
That’s not the reality right now. The compiler isn’t smart enough to do this, although it could in the future. If
it did, everyone could get the benefit of this pattern without having to remember to do it themselves, which would be lovely.
So, to summarize:
The non-generic inner function pattern exists to reduce compile time and code bloat due to monomorphization
of certain sorts of generic functions.
The inner function ought to be placed inside the generic function, which reduces its scope to only the relevant location
without causing any problems.
Eventually, it would be nice if the Rust compiler could do this all automatically.
Andrew Lilley Brinker
Andrew works on software supply chain security by day and writes educational Rust posts by night. He started Possible Rust in 2020 and can usually be found on Twitter.