Trying and failing to leak memory with Rust and Foundation
One of the oldest frameworks on macOS is Foundation. It’s almost impossible to find an Apple-provided API that is not built upon Foundation in some way. Foundation is only exposed via Objective-C, which makes calling it from Rust a complicated endeavour. Most of this complexity stems from managing memory: ensuring that memory does not leak and that previously freed memory is never used.
Objective-C has a complicated history of memory management. Initially Objective-C only had manual reference counting. Programmers were expected to use a combination of retain and release to adjust the reference count, and when the last release on an object ran, the Objective-C runtime would free the object. The runtime also contains an Autorelease Pool, to which objects can be added with autorelease. Eventually drain is called on the pool, which calls release on everything added to it. This allows some APIs to allocate an object and then immediately call autorelease on it. The caller can choose to call retain on it or, if the use is temporary, do nothing and expect the object to be freed when the pool is drained at some point in the future.
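The rules above can be modelled in a few lines of plain Rust. This is only a toy sketch and not how the Objective-C runtime is implemented: the Runtime type, its integer handles, and every method name are hypothetical and exist purely to show how retain, release, autorelease and drain interact.

```rust
/// Toy runtime: each slot holds an object's retain count, or `None` once freed.
#[derive(Default)]
struct Runtime {
    objects: Vec<Option<u32>>,
    pool: Vec<usize>,
}

impl Runtime {
    /// Allocates an object with a retain count of 1 and returns its handle.
    fn alloc(&mut self) -> usize {
        self.objects.push(Some(1));
        self.objects.len() - 1
    }

    /// Increments the retain count of a live object.
    fn retain(&mut self, object: usize) {
        let count = self.objects[object].expect("retain on a freed object");
        self.objects[object] = Some(count + 1);
    }

    /// Decrements the retain count; the last release frees the object.
    fn release(&mut self, object: usize) {
        let count = self.objects[object].expect("release on a freed object");
        self.objects[object] = if count == 1 { None } else { Some(count - 1) };
    }

    /// Defers one `release` until the pool is drained.
    fn autorelease(&mut self, object: usize) -> usize {
        self.pool.push(object);
        object
    }

    /// Calls `release` once for every object added to the pool.
    fn drain(&mut self) {
        for object in std::mem::take(&mut self.pool) {
            self.release(object);
        }
    }

    fn is_live(&self, object: usize) -> bool {
        self.objects[object].is_some()
    }
}

fn main() {
    let mut rt = Runtime::default();

    // An API allocates two objects and autoreleases them before returning.
    let temporary = rt.alloc();
    rt.autorelease(temporary);
    let kept = rt.alloc();
    rt.autorelease(kept);

    // The caller wants to keep one of them beyond the drain.
    rt.retain(kept);
    rt.drain();

    assert!(!rt.is_live(temporary)); // freed by the drain
    assert!(rt.is_live(kept)); // survived thanks to the extra retain

    rt.release(kept); // the caller's matching release frees it
    assert!(!rt.is_live(kept));
}
```

In this model, forgetting the final release leaves the kept object alive forever, and never calling drain leaves every autoreleased object alive. Those are the two kinds of leak the rest of this post is concerned with.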
Complicating matters is a brief use of garbage collection with Objective-C on macOS, first available in macOS 10.5 and deprecated in 10.8. The goal was to remove the manual use of retain, release, and autorelease in application code. However, it was deprecated and replaced with Automatic Reference Counting (ARC), where the compiler infers ownership of objects [1] and inserts retain and release calls as needed. ARC also introduced special syntax in Objective-C for managing Autorelease Pools, which ensures that the pool is drained at the end of a scope instead of requiring the programmer to call drain.
Because ARC is now the preferred way of interacting with Foundation APIs in Objective-C, calling those APIs from Rust is difficult: the Rust side has to carefully manage calls to retain and release to emulate ARC.
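One way to picture what emulating ARC means on the Rust side is an owning handle that retains on Clone and releases on Drop. The sketch below uses a stand-in object with an observable counter instead of a real Objective-C object. FakeObject, Strong and from_owned are hypothetical names, and a real binding would send retain and release messages where this code only adjusts a counter.

```rust
use std::cell::Cell;

/// Stand-in for an Objective-C object: only its retain count is modelled.
struct FakeObject {
    retain_count: Cell<u32>,
}

impl FakeObject {
    fn retain(&self) {
        self.retain_count.set(self.retain_count.get() + 1);
    }

    fn release(&self) {
        self.retain_count.set(self.retain_count.get() - 1);
    }
}

/// Owning ("strong") handle: `clone` retains, `drop` releases.
struct Strong<'a>(&'a FakeObject);

impl<'a> Strong<'a> {
    /// Takes over an existing +1 reference without retaining again.
    fn from_owned(object: &'a FakeObject) -> Self {
        Strong(object)
    }
}

impl Clone for Strong<'_> {
    fn clone(&self) -> Self {
        self.0.retain();
        Strong(self.0)
    }
}

impl Drop for Strong<'_> {
    fn drop(&mut self) {
        self.0.release();
    }
}

fn main() {
    // As if freshly allocated: the creator holds the only reference.
    let object = FakeObject { retain_count: Cell::new(1) };
    {
        let owner = Strong::from_owned(&object);
        let second = owner.clone();
        assert_eq!(object.retain_count.get(), 2);
        drop(second);
        assert_eq!(object.retain_count.get(), 1);
    } // `owner` goes out of scope here and releases the last reference
    assert_eq!(object.retain_count.get(), 0);
}
```

The appeal of this design is that the retain and release calls are balanced by the compiler rather than by hand. It does nothing for autoreleased objects, though, which still depend on a pool being drained.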
Furthermore, Apple’s documentation implies that code using Foundation outside of AppKit, which includes a Rust program, has to create its own autorelease pool and call drain on it, and that it might not be safe to send objects added to a pool across threads.
Threads
If you are making Cocoa calls outside of the Application Kit’s main thread—for example if you create a Foundation-only application or if you detach a thread—you need to create your own autorelease pool.
If your application or thread is long-lived and potentially generates a lot of autoreleased objects, you should periodically drain and create autorelease pools (like the Application Kit does on the main thread); otherwise, autoreleased objects accumulate and your memory footprint grows. If, however, your detached thread does not make Cocoa calls, you do not need to create an autorelease pool.
With all of this in mind, I would expect a very naive Rust program that converts a Rust String to an NSString to leak, and I would expect to observe the leak using the built-in leaks tool from Xcode. However, the naive program doesn’t leak as Apple’s documentation implies it should.
A program that should leak but doesn’t
Let’s create a Rust program called a-leaky-bucket which tries to leak memory.
As shown in my previous post on bindgen, it’s pretty easy to create some bindings to NSString. For Objective-C related code, it’s important to note that bindgen expects the application to depend on the objc crate, which provides macros that the generated code will use.
Generating bindings is pretty straightforward with a wrapper.h that references NSString.h.
Feeding the wrapper.h to bindgen with support for Objective-C gives us the bindings. Due to some bugs in the bindgen heuristics, it’s important to pass --no-derive-copy and --no-derive-debug; otherwise the generated bindings will not compile.
To use these bindings the objc crate needs to be added to the Cargo.toml.
Then main.rs can use these bindings to convert a Rust String to NSString.
Running the code results in “a leaky string” printed as expected.
However, the leaks tool from Xcode does not show any leaks in the code, despite the code not making a single release call after the NSString was allocated.
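For readers who want to try the experiment without generating bindings, the same conversion can be approximated with a handful of hand-written declarations against the Objective-C runtime. This is a swapped-in technique and not the bindgen-based main.rs described above: the foundation module and the round_trip function are hypothetical names, and the non-macOS fallback exists only so the sketch compiles everywhere. Like the original program, it never calls release on the NSString.

```rust
#[cfg(target_os = "macos")]
mod foundation {
    use std::ffi::{c_char, c_void, CStr, CString};
    use std::mem::transmute;

    type Id = *mut c_void; // Objective-C `id`
    type Sel = *mut c_void; // Objective-C `SEL`

    // Linking Foundation makes the NSString class available at runtime.
    #[link(name = "Foundation", kind = "framework")]
    extern "C" {}

    #[link(name = "objc")]
    extern "C" {
        fn objc_getClass(name: *const c_char) -> Id;
        fn sel_registerName(name: *const c_char) -> Sel;
        // Declared without a signature; it is cast to the real one per call.
        fn objc_msgSend();
    }

    /// Converts `text` to an NSString and reads it back via `UTF8String`.
    pub fn round_trip(text: &str) -> String {
        let c_text = CString::new(text).expect("text must not contain NUL bytes");
        unsafe {
            let class = objc_getClass(b"NSString\0".as_ptr().cast());
            let make = sel_registerName(b"stringWithUTF8String:\0".as_ptr().cast());
            let read = sel_registerName(b"UTF8String\0".as_ptr().cast());

            let send_make: unsafe extern "C" fn(Id, Sel, *const c_char) -> Id =
                transmute(objc_msgSend as unsafe extern "C" fn());
            let send_read: unsafe extern "C" fn(Id, Sel) -> *const c_char =
                transmute(objc_msgSend as unsafe extern "C" fn());

            // The returned NSString is autoreleased; no release is sent here.
            let ns_string = send_make(class, make, c_text.as_ptr());
            CStr::from_ptr(send_read(ns_string, read))
                .to_string_lossy()
                .into_owned()
        }
    }
}

#[cfg(not(target_os = "macos"))]
mod foundation {
    /// Fallback so the sketch still compiles where Foundation is unavailable.
    pub fn round_trip(text: &str) -> String {
        text.to_owned()
    }
}

fn main() {
    println!("{}", foundation::round_trip("a leaky string"));
}
```

The hand-written declarations keep the example free of external crates, at the cost of casting objc_msgSend to the right signature for every call, which is the kind of detail the generated bindings and the objc crate macros normally hide.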
Looking under the hood
Understanding this behaviour requires taking a look into Foundation. With rust-lldb it’s pretty straightforward to set some breakpoints and find out what eventually calls release on the created NSString. rust-lldb is distributed along with rustc and cargo, so it should already be available. We can set a breakpoint at main and, once it is hit, break on any Objective-C runtime function with the word release in its name.
With the breakpoints set, some curious pieces of code were hit. First, there is proof that the stringWithUTF8String method calls autorelease on the returned NSString.
The backtrace shows that the autorelease method was called from inside the stringWithUTF8String method. I would expect this to fail or leak memory since no Autorelease Pool has been created. However, continuing a few times results in a backtrace deep in the Objective-C runtime.
The AutoreleasePoolPage seems to be a C++ class which is the implementation of the autorelease functionality in Objective-C. Since the code does not allocate an Autorelease Pool, this code should not be running.
Fortunately Apple publishes the source of the Objective-C runtime. Searching for AutoreleasePoolPage in the source code reveals the implementation in NSObject.mm. Within this implementation there are two interesting snippets.
It seems that when autorelease is called for the first time and no “page” has been allocated before, a page will be allocated so the object can be added to the page. Since other parts of the code make references to thread local storage, and Apple’s documentation implies Autorelease Pools are thread specific, I assume the pool will be drained when the main thread dies, which explains why leaks does not report any leaks.
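The behaviour I am assuming here, a pool created on first use and drained when its thread exits, can be imitated with Rust’s own thread-local storage. This is a model of the idea only and not a translation of NSObject.mm: Pool, POOL, RELEASED and the autorelease function below are hypothetical, and the model uses a spawned thread because Rust does not guarantee that thread-local destructors run for the main thread.

```rust
use std::cell::RefCell;
use std::sync::atomic::{AtomicUsize, Ordering};
use std::thread;

/// Counts "objects" released so far, so the effect of a drain can be observed.
static RELEASED: AtomicUsize = AtomicUsize::new(0);

/// Toy pool: remembers object ids and "releases" them when it is dropped.
struct Pool {
    pending: Vec<u32>,
}

impl Drop for Pool {
    fn drop(&mut self) {
        RELEASED.fetch_add(self.pending.len(), Ordering::SeqCst);
    }
}

thread_local! {
    // No pool exists until the first autorelease on this thread.
    static POOL: RefCell<Option<Pool>> = RefCell::new(None);
}

/// Adds an object to the current thread's pool, creating the pool on demand.
fn autorelease(object: u32) {
    POOL.with(|slot| {
        slot.borrow_mut()
            .get_or_insert_with(|| Pool { pending: Vec::new() })
            .pending
            .push(object);
    });
}

fn released() -> usize {
    RELEASED.load(Ordering::SeqCst)
}

fn main() {
    let before = released();

    let worker = thread::spawn(|| {
        autorelease(1);
        autorelease(2);
        // Nothing has been released while the thread is still running.
        released()
    });
    let seen_inside = worker.join().unwrap();

    assert_eq!(seen_inside, before);
    // The thread-local pool was dropped, and so "drained", at thread exit.
    assert_eq!(released(), before + 2);
}
```

In the model, as in my reading of the runtime, nothing is lost for good, but nothing is freed before the thread ends either. For a long-lived thread that is indistinguishable from a leak.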
Triggering the leak as expected
In the snippet above, there is a hint indicating we can prevent allocating a pool on demand.
DebugMissingPools is an Objective-C runtime configuration flag that is controlled by the OBJC_DEBUG_MISSING_POOLS environment variable.
Running a-leaky-bucket with this variable set shows the warning message printed alongside our expected output.
Running our program under leaks with this environment variable set shows the leak as expected.
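When the check is driven from Rust, for example from a small test harness, the variable can be set on the child process only. The following is a minimal sketch with some assumptions: with_missing_pool_debugging is a hypothetical helper, the binary path depends on where cargo placed a-leaky-bucket, and the value YES is my assumption about what the runtime accepts as "enabled".

```rust
use std::process::Command;

/// Builds a command that runs `program` with missing-pool debugging enabled.
fn with_missing_pool_debugging(program: &str) -> Command {
    let mut command = Command::new(program);
    // Assumption: the runtime treats the value "YES" as "enabled".
    command.env("OBJC_DEBUG_MISSING_POOLS", "YES");
    command
}

fn main() {
    // Hypothetical path: wherever cargo placed the a-leaky-bucket binary.
    let mut command = with_missing_pool_debugging("target/debug/a-leaky-bucket");
    match command.status() {
        Ok(status) => println!("a-leaky-bucket exited with {status}"),
        Err(error) => eprintln!("could not start a-leaky-bucket: {error}"),
    }
}
```

Setting the variable on the Command rather than in the parent process keeps the harness itself running with the default runtime behaviour.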
Conclusion
Despite Apple’s documentation, it is not required to allocate an Autorelease Pool when calling Foundation APIs to prevent memory leaks. However, failing to do so makes the runtime create a thread-local Autorelease Pool on demand, which prevents release of the objects in the pool until the thread exits. When calling Foundation APIs from Rust, care has to be taken to ensure a pool is allocated and drained appropriately; otherwise memory will not be freed until the thread exits.
Also, setting the OBJC_DEBUG_MISSING_POOLS environment variable prevents the runtime from automatically creating pools, allowing missing Autorelease Pools to be detected via leaks.
[1] Apple uses the terms “strong” and “weak” instead of owned and borrowed.