Inbound & Outbound FFI

Foreign Function Interfaces (FFI) are a core mechanism for integrating new languages into existing codebases and for building on existing libraries. That said, the term “FFI” is often overloaded in ways that may be unclear or ambiguous, and the area can seem overwhelming to approach. In this post, I explain the two “directions” of FFI, describe some patterns for how FFI in each direction is handled in Rust, and break down some FFI design approaches.

When we talk about FFI, we talk about it in two possible directions: “outbound,” where we’re exposing functionality in our primary language for use in other languages, and “inbound,” where we’re wrapping functionality from other languages to be used in our primary language. Unfortunately, both are often just called “FFI,” which can be confusing.

Inbound FFI in Rust

For new languages, inbound FFI is more common. This is unsurprising, as a lot of code already exists in other languages that you may want to use. The focus in inbound FFI is wrapping the API provided by the other language (often that language is C; otherwise it usually exposes a C-compatible interface) into something your current language understands, called bindings.

Bindings

Bindings are what make sure the two languages both understand the types and functions in play.

In Rust, that means:

Data types that are marked #[repr(C)], meaning they have the same memory layout (field order, size, and alignment) as the equivalent C data type would have.
Functions inside an extern "C" block, with only a function signature provided, telling the compiler that this function exists and will be provided by some other library at link time.
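To make those two pieces concrete, here is a rough sketch of hand-written bindings to the C standard library's div function. A few things here are my assumptions rather than anything fixed: the Rust name DivT and the helper raw_binding_demo are invented for the example, the field order quot, rem matches common C libraries (glibc, musl, macOS) but is not guaranteed by the C standard, and the C library is assumed to be linked in, as it is by default for std programs on typical platforms. The block is written as unsafe extern "C", which Rust 1.82 and later accept in every edition and which the 2024 edition requires; older code writes plain extern "C".

```rust
use std::os::raw::c_int;

/// Rust-side mirror of C's `div_t` from <stdlib.h>.
/// `#[repr(C)]` makes the layout match what the C compiler would produce.
/// Assumption: the fields are declared in the order `quot`, `rem`.
#[repr(C)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct DivT {
    pub quot: c_int,
    pub rem: c_int,
}

// Signatures only: the bodies live in the C library and are resolved at link time.
unsafe extern "C" {
    /// C declaration: `div_t div(int numer, int denom);`
    pub fn div(numer: c_int, denom: c_int) -> DivT;
    /// C declaration: `int abs(int x);`
    pub fn abs(x: c_int) -> c_int;
}

/// Calls the raw bindings. Every call needs `unsafe`, because the compiler
/// cannot check what the C side does with its arguments.
pub fn raw_binding_demo() -> (DivT, c_int) {
    // Sound here: the divisor is non-zero and neither result overflows `int`.
    let d = unsafe { div(17, 5) };
    let a = unsafe { abs(-3) };
    (d, a)
}
```

Note that nothing stops a caller from passing a zero divisor to div, which is undefined behavior in C. Raw bindings only describe the interface; they don't make it safe.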

This work can be tedious! Thankfully there’s an official Rust tool called “bindgen” that can usually generate these bindings for you. Bindgen takes a C header file as input and outputs a Rust file defining a matching interface, enabling the two languages to talk to each other.

Idiomatic Wrappers

While generating bindings is often the first step when writing inbound FFI in Rust, the second step is crucial. Rust’s safety mechanisms are one of its key selling points, and FFI inherently gives up certain guarantees, as the Rust compiler can’t reason about code in another language. Additionally, APIs in other languages are often not built in a manner which is naturally amenable to the kind of architecture Rust’s rules encourage.

Aside 1: An example of problematic cross-language patterns

The container_of macro from the Linux kernel is a good example of something that is natural in C and would be rejected by Rust. The macro takes in a pointer to a field of a struct, the name of the type of the struct, and the name of the field in the struct, and returns a pointer to the overall container.

In Rust, if you have a reference to a field of a struct, the ability to derive from it a reference to the overall struct would break Rust’s reasoning about struct field borrows being distinct, and so this pattern is not allowed in safe Rust.

If you’re working with an inbound FFI API, and it expects use of a pattern like this, you may find it difficult to use from Rust, and the FFI interface may require more work to make it feel natural in a Rust codebase.
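For a sense of what this costs on the Rust side, here is a minimal sketch of the closest equivalent I know of, which has to drop down to raw pointers and unsafe. The types Link and Task and the function task_from_link are hypothetical, made up for this example. The important caveat is in the safety comment: the field pointer has to be derived from a pointer to the whole struct, not from a reference to the field.

```rust
use std::mem::offset_of;
use std::ptr::addr_of;

/// Hypothetical intrusive-list link, embedded inside a larger struct.
pub struct Link {
    pub next: *const Link,
}

/// Hypothetical container that embeds a `Link` as one of its fields.
pub struct Task {
    pub id: u32,
    pub link: Link,
}

/// Raw-pointer counterpart of C's `container_of(link, struct Task, link)`.
///
/// # Safety
/// `link` must point at the `link` field of a live `Task`, and must have been
/// derived from a pointer to that whole `Task` (not from a `&Link`).
pub unsafe fn task_from_link(link: *const Link) -> *const Task {
    // Step back by the field's byte offset to reach the start of the struct.
    unsafe { link.byte_sub(offset_of!(Task, link)).cast::<Task>() }
}

/// Round-trips from a `Task` to its `link` field and back again.
pub fn container_of_demo() -> u32 {
    let task = Task { id: 42, link: Link { next: std::ptr::null() } };
    let task_ptr: *const Task = &task;
    // Take the field's address without creating a `&Link` reference.
    let link_ptr = unsafe { addr_of!((*task_ptr).link) };
    let recovered = unsafe { task_from_link(link_ptr) };
    assert!(std::ptr::eq(recovered, task_ptr));
    unsafe { (*recovered).id }
}
```

None of this is checked by the compiler, which is exactly why an API built around the pattern takes extra work to wrap.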

The next step is therefore usually to write a wrapper around the bindings which provides a more natural and idiomatic API for use in Rust. Commonly this pattern manifests with two crates, one called *-sys (replacing * with the name of the original API you’re binding to), and the other without the -sys suffix (for example, openssl-sys and openssl).
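Here is a small sketch of that split, squeezed into one file. The sys module stands in for a *-sys crate and declares calloc and free from the C standard library; the CBuffer type is a hypothetical idiomatic wrapper I made up for illustration. It assumes the C library is linked in (the default for std programs on typical platforms) and that size_t matches usize, which holds on the platforms Rust commonly targets.

```rust
use std::os::raw::c_void;
use std::ptr::NonNull;

/// Stand-in for a `*-sys` crate: raw declarations only, no safety added.
mod sys {
    use std::os::raw::c_void;

    unsafe extern "C" {
        pub fn calloc(count: usize, size: usize) -> *mut c_void;
        pub fn free(ptr: *mut c_void);
    }
}

/// Hypothetical safe wrapper: owns a zero-initialized buffer allocated by C.
pub struct CBuffer {
    ptr: NonNull<u8>,
    len: usize,
}

impl CBuffer {
    /// Returns `None` for a zero or oversized length, or if allocation fails.
    pub fn zeroed(len: usize) -> Option<Self> {
        if len == 0 || len > isize::MAX as usize {
            return None;
        }
        // `calloc` returns zeroed memory, or null on failure.
        let raw = unsafe { sys::calloc(len, 1) } as *mut u8;
        NonNull::new(raw).map(|ptr| CBuffer { ptr, len })
    }

    /// Shared view; the borrow checker ties it to `&self`.
    pub fn as_slice(&self) -> &[u8] {
        unsafe { std::slice::from_raw_parts(self.ptr.as_ptr(), self.len) }
    }

    /// Exclusive view; requires `&mut self`, so it cannot alias `as_slice`.
    pub fn as_mut_slice(&mut self) -> &mut [u8] {
        unsafe { std::slice::from_raw_parts_mut(self.ptr.as_ptr(), self.len) }
    }
}

impl Drop for CBuffer {
    /// Single ownership means `free` runs exactly once.
    fn drop(&mut self) {
        unsafe { sys::free(self.ptr.as_ptr() as *mut c_void) }
    }
}
```

The unsafe code is confined to four short blocks, and callers only ever see ordinary ownership and borrowing.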

This second step, of writing the idiomatic bindings, can be hard! If you’ve used Rust much, you’ve probably noticed that Rust’s rules (all data has one owner, and references may be aliasing or mutable but never both at the same time) constrain which designs work comfortably. While you may be able to loosen these rules with interior mutability or reference counting, doing so may involve trade-offs in code clarity or performance.
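As a quick illustration of that loosening and its cost, here is a minimal sketch using Rc and RefCell from the standard library. The function shared_log_demo and its variable names are made up for the example.

```rust
use std::cell::RefCell;
use std::rc::Rc;

/// Two handles share and mutate one log, which plain references would not allow.
pub fn shared_log_demo() -> Vec<String> {
    // `Rc` gives shared ownership; `RefCell` moves borrow checking to run time.
    let log: Rc<RefCell<Vec<String>>> = Rc::new(RefCell::new(Vec::new()));
    let writer_a = Rc::clone(&log);
    let writer_b = Rc::clone(&log);

    writer_a.borrow_mut().push("from a".to_string());
    writer_b.borrow_mut().push("from b".to_string());

    {
        // The aliasing rule still holds, but a violation is now a run-time
        // failure instead of a compile error.
        let _reader = log.borrow();
        assert!(log.try_borrow_mut().is_err());
    }

    let snapshot = log.borrow().clone();
    snapshot
}
```

The trade-offs are visible even at this size: every access goes through a borrow call, there is a reference count to maintain, and a mistake surfaces as a failed borrow at run time.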

Info 1 “Choosing Your Guarantees”

“Choosing Your Guarantees” by Manish Goregaokar is a classic post in the Rust community explaining the different relaxations offered by Rust’s various wrapper types. It’s a little out of date now (written in 2015), but is still a very useful resource.

Outbound FFI in Rust

Outbound FFI in Rust is becoming more common as Rust gains wider use and its key strengths and trade-offs make it the right choice for certain programming contexts. With outbound FFI, you’re exposing Rust code for other languages to use.

In this context, you have two choices for how to do it: invasively or non-invasively.

Invasive Outbound FFI

In invasive outbound FFI, you construct the internal types and functions of your Rust crate to be amenable to direct exposure to other languages. Not all Rust types can be exposed over FFI, so choosing the invasive approach may mean choosing type representations which aren’t natural on the Rust side, but which make the work of exposing the crate to other languages easier.

The tool cbindgen is a popular choice for generating C bindings for Rust code (the outbound FFI counterpart of bindgen), and the “Supported Types” section of its documentation notes which kinds of types it can directly generate C equivalents for.

Invasive outbound FFI is an approach that tries to ensure internal Rust types fit the rules of what cbindgen can generate bindings for.
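A minimal sketch of what that can look like, with hypothetical names throughout (Reading, reading_present, reading_missing, reading_value_or). A natural Rust design might store the value as Option<f64>, but that type has no guaranteed C-compatible layout, so this version uses a flag plus a plain value instead. The #[unsafe(no_mangle)] spelling is accepted by Rust 1.82 and later and required in the 2024 edition; older code writes #[no_mangle].

```rust
/// Only C-compatible fields, laid out as C would lay them out, so C code can
/// read and construct this struct directly.
#[repr(C)]
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Reading {
    pub sensor_id: u32,
    /// Stands in for the `Some`/`None` distinction of an `Option<f64>`.
    pub has_value: bool,
    pub value: f64,
}

/// Builds a reading that carries a value.
#[unsafe(no_mangle)]
pub extern "C" fn reading_present(sensor_id: u32, value: f64) -> Reading {
    Reading { sensor_id, has_value: true, value }
}

/// Builds a reading with no value; `value` is a placeholder.
#[unsafe(no_mangle)]
pub extern "C" fn reading_missing(sensor_id: u32) -> Reading {
    Reading { sensor_id, has_value: false, value: 0.0 }
}

/// Returns the stored value, or `fallback` if the reading has none.
#[unsafe(no_mangle)]
pub extern "C" fn reading_value_or(reading: Reading, fallback: f64) -> f64 {
    if reading.has_value {
        reading.value
    } else {
        fallback
    }
}
```

The cost is on the Rust side: nothing forces callers to check has_value before reading value, which is the kind of guarantee Option would have provided.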

Non-Invasive Outbound FFI

Non-invasive outbound FFI is the approach that doesn’t impose this constraint. Instead of exposing Rust types across the FFI boundary, you generally expose opaque pointers, meaning pointers to types where the external language knows the name of the type, but nothing about its contents.

With cbindgen, you can get this type representation as output by having a type which is public, but not #[repr(C)], meaning it’s not guaranteed to have a layout C can understand.

Since the outside language knows nothing about your type, including its fields, a non-invasive FFI approach needs to expose functions which replace the getting, setting, iterating, and other operations the outside language might want to do.

To avoid dangling pointers, this approach also means you’ll need to put your opaque Rust types on the heap and then leak the pointer (transferring ownership to the outside language) via Box::into_raw. The outside language later hands the pointer back to a Rust function that rebuilds the box with Box::from_raw so the value can be dropped. This heap allocation may be undesirable in certain contexts, so as always, choosing between invasive and non-invasive designs involves trade-offs.
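Putting the pieces of the non-invasive approach together, here is a minimal sketch. The Counter type and the counter_* functions are hypothetical names for illustration. The #[unsafe(no_mangle)] spelling is accepted by Rust 1.82 and later and required in the 2024 edition; older code writes #[no_mangle].

```rust
/// Not `#[repr(C)]`: the outside language only ever sees `Counter*`.
pub struct Counter {
    total: u64,
    samples: Vec<u64>,
}

/// Allocates on the heap and hands ownership to the caller.
#[unsafe(no_mangle)]
pub extern "C" fn counter_new() -> *mut Counter {
    Box::into_raw(Box::new(Counter { total: 0, samples: Vec::new() }))
}

/// Setter-style function. Returns false if `counter` is null.
///
/// # Safety
/// `counter` must be null or a live pointer from `counter_new`.
#[unsafe(no_mangle)]
pub unsafe extern "C" fn counter_add(counter: *mut Counter, value: u64) -> bool {
    match unsafe { counter.as_mut() } {
        Some(c) => {
            // Saturate rather than panic: unwinding must not cross into C.
            c.total = c.total.saturating_add(value);
            c.samples.push(value);
            true
        }
        None => false,
    }
}

/// Getter-style function. Returns 0 if `counter` is null.
///
/// # Safety
/// `counter` must be null or a live pointer from `counter_new`.
#[unsafe(no_mangle)]
pub unsafe extern "C" fn counter_total(counter: *const Counter) -> u64 {
    unsafe { counter.as_ref() }.map_or(0, |c| c.total)
}

/// Number of samples recorded so far. Returns 0 if `counter` is null.
///
/// # Safety
/// `counter` must be null or a live pointer from `counter_new`.
#[unsafe(no_mangle)]
pub unsafe extern "C" fn counter_len(counter: *const Counter) -> usize {
    unsafe { counter.as_ref() }.map_or(0, |c| c.samples.len())
}

/// Takes ownership back and drops the value. Null is ignored.
///
/// # Safety
/// `counter` must be null or a pointer from `counter_new` that has not
/// already been freed; it must not be used afterwards.
#[unsafe(no_mangle)]
pub unsafe extern "C" fn counter_free(counter: *mut Counter) {
    if !counter.is_null() {
        drop(unsafe { Box::from_raw(counter) });
    }
}
```

Every operation the outside language needs has to be written out as its own function like this, which is the price of keeping the Rust type's internals free to change.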

Reality

In reality, any outbound FFI in Rust is likely to involve a mix of invasive and non-invasive approaches, as appropriate.

Regardless of whether inbound or outbound FFI is used, great care must be taken to ensure safety invariants are maintained. Anyone approaching FFI in Rust ought to read at least the first eight chapters of The Rustonomicon, which cover how to reason about unsafety.

Conclusion

FFI is an important area in any language’s design, and Rust is no exception. Hopefully this post has clarified some of what is meant when people talk about FFI, and has usefully outlined approaches one can take in structuring inbound and outbound FFI designs.

Possible Rust