Sergey Kornienko (@b1thvn_) of PixiePoint Security

The Basics

Disclosure or Patch Date: April 12, 2022

Product: Microsoft Windows

Advisory: https://msrc.microsoft.com/update-guide/vulnerability/CVE-2022-24521

Affected Versions: Before security updates of April 12, 2022, for Windows 7, 8.1, 10, 11 and Windows Server 2008, 2012, 2016, 2019, 2022

First Patched Version: Security updates of April 12, 2022, for CVE-2022-24521

Issue/Bug Report: N/A

Patch CL: N/A

Bug-Introducing CL: N/A

Reporter(s): National Security Agency, Adam Podlosky and Amir Bazine of CrowdStrike

The Code

Proof-of-concept: N/A

Exploit sample: N/A

Did you have access to the exploit sample when doing the analysis? No

The Vulnerability

Bug class: Logical error (lack of indirect-call validation)

Vulnerability details:

The CLFS format does not prevent the array of signatures from overlapping a container or client context.

When a log block is encoded, the bytes at the SIG_* position of each sector are copied into the array pointed to by SignaturesOffset. During decoding, these bytes are written back to their original locations. If we construct the base log record so that the container context and the signature array overlap, and then copy the context's bytes to SIG_0 ... SIG_X, the encode and decode operations will not corrupt the container context. Moreover, any data modified between encoding and decoding will be restored.
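
The encode/decode round trip described above can be illustrated with a small user-mode model. This is only a sketch under stated assumptions, not the clfs.sys implementation: the toy_* names, the two-sector block, the 0xEE stamp value and the field offset 0x300 are all hypothetical, and the model assumes 512-byte sectors with two signature bytes at the end of each one.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TOY_SECTOR_SIZE 512u /* assumption of this model */
#define TOY_SIG_SIZE    2u   /* assumption: two signature bytes per sector tail */

/* Toy encode: save each sector's tail bytes into the signature array,
 * then overwrite the tail with a stamp value. */
static void toy_encode(uint8_t *block, size_t sectors, size_t sig_offset, uint8_t stamp)
{
    for (size_t i = 0; i < sectors; i++) {
        uint8_t *tail = block + (i + 1) * TOY_SECTOR_SIZE - TOY_SIG_SIZE;
        memcpy(block + sig_offset + i * TOY_SIG_SIZE, tail, TOY_SIG_SIZE);
        memset(tail, stamp, TOY_SIG_SIZE);
    }
}

/* Toy decode: copy the saved bytes from the signature array back to each tail. */
static void toy_decode(uint8_t *block, size_t sectors, size_t sig_offset)
{
    for (size_t i = 0; i < sectors; i++) {
        uint8_t *tail = block + (i + 1) * TOY_SECTOR_SIZE - TOY_SIG_SIZE;
        memcpy(tail, block + sig_offset + i * TOY_SIG_SIZE, TOY_SIG_SIZE);
    }
}

/* Returns 1 if a zeroed 4-byte field that overlaps the signature array is
 * refilled by encode, is left intact by decode, and the tails are restored. */
static int toy_overlap_demo(void)
{
    uint8_t block[2 * TOY_SECTOR_SIZE] = {0};
    const size_t field_offset = 0x300; /* hypothetical context field location */
    const uint8_t planted[4] = {0x41, 0x42, 0x43, 0x44};

    /* Plant the desired field bytes at the sector tails (SIG_0, SIG_1). */
    memcpy(block + TOY_SECTOR_SIZE - TOY_SIG_SIZE, planted, 2);
    memcpy(block + 2 * TOY_SECTOR_SIZE - TOY_SIG_SIZE, planted + 2, 2);

    memset(block + field_offset, 0, 4);       /* the field is zeroed in memory */
    toy_encode(block, 2, field_offset, 0xEE); /* signature array overlaps the field */
    int refilled = memcmp(block + field_offset, planted, 4) == 0;

    toy_decode(block, 2, field_offset);
    int survived = memcmp(block + field_offset, planted, 4) == 0;
    int tails_ok = memcmp(block + TOY_SECTOR_SIZE - TOY_SIG_SIZE, planted, 2) == 0 &&
                   memcmp(block + 2 * TOY_SECTOR_SIZE - TOY_SIG_SIZE, planted + 2, 2) == 0;

    return refilled && survived && tails_ok;
}
```

In this model the field is rewritten during encoding, because the signature array occupies the same bytes, and decoding only touches the sector tails.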

Now assume that the container context is modified in memory (PCLFS_CONTAINER_CONTEXT->pContainer is zeroed). Searching for where this field is actually used led us to CClfsBaseFilePersisted::RemoveContainer, which can be called directly from LoadContainerQ:

__int64 __fastcall CClfsBaseFilePersisted::RemoveContainer(CClfsBaseFilePersisted *this, unsigned int a2)
{
...
		v11 = CClfsBaseFilePersisted::FlushImage((PERESOURCE *)this);
		v9 = v11;
		v16 = v11;
		if ( v11 >= 0 )
		{
			pContainer = *((_QWORD *)containerContext + 3);
			if ( pContainer )
			{
				*((_QWORD *)containerContext + 3) = 0i64;
				ExReleaseResourceForThreadLite(*((PERESOURCE *)this + 4), (ERESOURCE_THREAD)KeGetCurrentThread());
				v4 = 0;
				(*(void (__fastcall **)(__int64))(*(_QWORD *)pContainer + 0x18i64))(pContainer); // remove method
				(*(void (__fastcall **)(__int64))(*(_QWORD *)pContainer + 8i64))(pContainer); // release method
				v9 = v16;
				goto LABEL_20;
			}
			goto LABEL_19;
		}
...
}
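
The two indirect calls in the excerpt go through a method table read from the object that pContainer points to, at byte offsets 0x18 and 0x8. The following is a minimal sketch of that dispatch pattern in plain C, assuming 8-byte table entries so that the offsets map to slots 3 and 1. The ToyContainer type and the toy_* functions are hypothetical and only illustrate why the call targets are determined entirely by the pointer stored in the context.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct ToyContainer ToyContainer;
typedef void (*ToyMethod)(ToyContainer *);

struct ToyContainer {
    const ToyMethod *vtable; /* first field: pointer to the method table */
    int removed;
    int released;
};

static void toy_remove(ToyContainer *c)  { c->removed++; }
static void toy_release(ToyContainer *c) { c->released++; }
static void toy_unused(ToyContainer *c)  { (void)c; }

/* Slot 3 models byte offset 0x18 (remove), slot 1 models byte offset 0x8 (release). */
static const ToyMethod toy_vtable[4] = { toy_unused, toy_release, toy_unused, toy_remove };

/* Mirrors the shape of the decompiled sequence: read the pointer, clear the
 * slot, then call two methods through the object's own table.
 * Returns 1 if calls were made, 0 if the slot was already NULL. */
static int toy_remove_container(ToyContainer **slot)
{
    ToyContainer *p = *slot;
    if (!p)
        return 0;
    *slot = NULL;
    p->vtable[3](p); /* remove method */
    p->vtable[1](p); /* release method */
    return 1;
}
```

Nothing in this pattern checks where the object or its table came from, so the value stored in the slot fully decides which functions run.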

To ensure that a user cannot pass a fake pContainer pointer (FAKE_pContainer) to the kernel, this field is set to zero before any indirect call:

v44 = *((_DWORD *)containerContext + 5); // to trigger RemoveContainer one should set this field to -1
if ( v44 == -1 )
{
	*((_QWORD *)containerContext + 3) = 0i64; // pContainer is set to NULL
	v20 = CClfsBaseFilePersisted::RemoveContainer(this, v34);
	v72 = v20;
	if ( v20 < 0 )
		goto LABEL_134;
	v23 = v78;
	v34 = (unsigned int)(v34 + 1);
	v79 = v34;
}

This works as intended as long as the logical issue described above is absent. To understand it better, let's look inside the call chain CClfsBaseFilePersisted::FlushImage -> CClfsBaseFilePersisted::WriteMetadataBlock, which is reached from RemoveContainer. The information associated with the deleted container must also be removed from the linked structures, which is done by the following code:

...
// Obtain all container contexts represented in blf
// save pContainer class pointer for each valid container context
for ( i = 0; i < 0x400; ++i )
{
v20 = CClfsBaseFile::AcquireContainerContext(this, i, &v22);
v15 = (char *)this + 8 * i;
if ( v20 >= 0 )
{
	v16 = v22;
	*((_QWORD *)v15 + 56) = *((_QWORD *)v22 + 3); // for each valid container save pContainer
	*((_QWORD *)v16 + 3) = 0i64; // and set the initial pContainer to zero
	CClfsBaseFile::ReleaseContainerContext(this, &v22);
}
else
{
	*((_QWORD *)v15 + 56) = 0i64;
}
}
// Stage [1] encode block, prepare it for writing
ClfsEncodeBlock(
(struct _CLFS_LOG_BLOCK_HEADER *)v9,
*(unsigned __int16 *)(v9 + 4) << 9,
*(_BYTE *)(v9 + 2),
0x10u,
1u);
// write modified data
v10 = CClfsContainer::WriteSector(
		*((CClfsContainer **)this + 19),
		*((struct _KEVENT **)this + 20),
		0i64,
		*(void **)(*((_QWORD *)this + 6) + 24 * v8),
		*(unsigned __int16 *)(v9 + 4),
		&v23);
...
if ( v7 )
{
// Stage [2] Decode block again for further processing in clfs.sys
ClfsDecodeBlock((struct _CLFS_LOG_BLOCK_HEADER *)v9, *(unsigned __int16 *)(v9 + 4), *(_BYTE *)(v9 + 2), 0x10u, &v21);
// obtain new pContainer class pointer
v17 = (_QWORD *)((char *)this + 448);
do
{
	// Stage [3] for each valid container
	// update pContainer field
	if ( *v17 && (int)CClfsBaseFile::AcquireContainerContext(this, v6, &v22) >= 0 )
	{
	*((_QWORD *)v22 + 3) = *v17;
	CClfsBaseFile::ReleaseContainerContext(this, &v22);
	}
	++v6;
	++v17;
}
while ( v6 < 0x400 );
}
...

When the operation begins, pContainer is set to zero. During Stage [1] the block is encoded: the bytes at each sector's SIG_* position are copied into the signature array, which overlaps the container context, so the zeroed field is overwritten with data we supply from user mode. The only remaining requirement is to make CClfsBaseFile::AcquireContainerContext fail at Stage [3], which is rather easy to do. Once this is done, we can pass an arbitrary address to the indirect call chain inside CClfsBaseFilePersisted::RemoveContainer, which gives direct control of RIP.
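
The save, encode, decode and restore sequence can be condensed into a toy model to show when the restore step is skipped. This is a sketch under stated assumptions rather than the driver's logic: ToyContext, toy_write_metadata and the break_acquire switch are hypothetical, and the model simply assumes that encoding overwrites the field because the signature array overlaps it. How AcquireContainerContext is made to fail is not modelled.

```c
#include <assert.h>
#include <stdint.h>

typedef struct {
    uint64_t pContainer; /* models the field at containerContext + 0x18 */
    int      acquirable; /* models whether AcquireContainerContext succeeds */
} ToyContext;

/* Returns the value left in pContainer after the modelled flow. */
static uint64_t toy_write_metadata(ToyContext *ctx, uint64_t planted, int break_acquire)
{
    uint64_t saved = 0;

    /* Before Stage [1]: save pContainer for a valid context, then zero it. */
    if (ctx->acquirable) {
        saved = ctx->pContainer;
        ctx->pContainer = 0;
    }

    /* Stage [1]: encoding copies sector tail bytes into the signature array.
     * Assumption: the array overlaps this field, so the field is overwritten. */
    ctx->pContainer = planted;
    if (break_acquire)
        ctx->acquirable = 0; /* hypothetical switch: later acquire attempts fail */

    /* Stage [2]: decoding only rewrites the sector tails, not this field. */

    /* Stage [3]: the saved pointer is written back only if it is non-zero
     * and the context can still be acquired. */
    if (saved != 0 && ctx->acquirable)
        ctx->pContainer = saved;

    return ctx->pContainer;
}
```

In the model, the planted value survives whenever Stage [3] is skipped, either because the acquire fails or because the saved pointer was already zero.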

Patch analysis:

Patch diffing of CLFS.sys reveals eight changed functions and two new ones. Among the changes, a new logic block has been added to the LoadContainerQ function:

...
containerArray = (_DWORD *)((char *)BaseLogRecord + 0x328); // CLFS_BASE_RECORD_HEADER->rgContainers
...
v22 = CClfsBaseFile::ContainerCount(this);
...
while ( containerIndex < 0x400 )
{
	v17 = (CClfsContainer *)containerIndex;
	if ( containerArray[containerIndex] )
	++v24;
	v89 = ++containerIndex;
}
...
if ( v24 == v22 )
{
	if ( (unsigned int)Feature_Servicing_38197806__private_IsEnabled() )
	{
	v25 = (_OWORD *)((char *)v19 + 0x138);
	v26 = (unsigned int *)operator new(0x11F0ui64, PagedPool);
	rgObject = v26;
	if ( !v26 )
	{
		goto LABEL_135;
	}
	memmove(v26, containerArray, 0x1000ui64);
	v28 = rgObject + 0x400;
	v29 = 3i64;
	...
	v20 = CClfsBaseFile::ValidateRgOffsets(this, rgObject);
	v72 = v20;
	operator delete(rgObject);
}

In fact, this block is a wrapper for CClfsBaseFile::ValidateRgOffsets:

__int64 __fastcall CClfsBaseFile::ValidateRgOffsets(CClfsBaseFile *this, unsigned int *rgObject)
{
...
LogBlockPtr = *(_QWORD *)(*((_QWORD *)this + 6) + 48i64); // * _CLFS_LOG_BLOCK_HEADER
...
signatureOffset = LogBlockPtr + *(unsigned int *)(LogBlockPtr + 0x68); // PCLFS_LOG_BLOCK_HEADER->SignaturesOffset
...
qsort(rgObject, 0x47Cui64, 4ui64, CompareOffsets); // sort rgObject array
while ( 1 )
{
	currObjOffset = *rgObject2; // obtain offset from rgObject
	if ( *rgObject2 - 1 <= 0xFFFFFFFD )
	{
	pObjContext = CClfsBaseFile::OffsetToAddr(this, currObjOffset); // Obtain in-memory representation
																	// of the object's context structure
...
	unkn = currObjOffset - 0x30;
	v13 = rgIndex * 4 + v5 + 0x30;
	if ( v13 < v5 || v5 && v13 > unkn )
		break;
	v5 = unkn;
	if ( *pObjContext == 0xC1FDF008 ) // CLFS_NODE_TYPE_CONTAINER_CONTEXT
	{
		rgIndex = 0xC;
	}
	else
	{
		if ( *pObjContext != 0xC1FDF007 ) // CLFS_NODE_TYPE_CLIENT_CONTEXT
		return 0xC01A000D;
		rgIndex = 0x22;
	}
	criticalRange = &pObjContext[rgIndex]; // get the address of the end of the context (context + 0x30 or + 0x88)
	if ( criticalRange < pObjContext || (unsigned __int64)criticalRange > signatureOffset ) // compare with sig offset
		break;
		break;
	}
	++i;
	++rgObject2;
	if ( i >= 0x47C )
	return ret;
}
return 0xC01A000D;
}

As we can see, this function checks that the signature array does not overlap any of the context objects. It also validates several context fields, such as CLFS_NODE_ID.
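
The core idea of the check can be sketched in a few lines of portable C. This is a simplified model, not a reconstruction of CClfsBaseFile::ValidateRgOffsets: ToyRange and the toy_* functions are hypothetical, offsets are treated as plain positions inside one buffer, and the 0x30 adjustments and node-type checks from the decompilation are left out. The sizes 0x30 and 0x88 used in the usage note are example values.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

typedef struct {
    uint32_t offset; /* start of a context, 0 or 0xFFFFFFFF if unused */
    uint32_t size;   /* size of that context in bytes */
} ToyRange;

static int toy_cmp(const void *a, const void *b)
{
    uint32_t x = ((const ToyRange *)a)->offset;
    uint32_t y = ((const ToyRange *)b)->offset;
    return (x > y) - (x < y);
}

/* Same unsigned trick as `*rgObject2 - 1 <= 0xFFFFFFFD`: subtracting 1 wraps
 * 0 around to 0xFFFFFFFF, so both 0 and 0xFFFFFFFF fail the comparison. */
static int toy_offset_in_use(uint32_t off)
{
    return (uint32_t)(off - 1u) <= 0xFFFFFFFDu;
}

/* Returns 1 if no context overlaps its predecessor and none extends past
 * the start of the signature array, 0 otherwise. Sorts r in place. */
static int toy_validate(ToyRange *r, size_t n, uint32_t sig_offset)
{
    uint64_t prev_end = 0;

    qsort(r, n, sizeof r[0], toy_cmp);
    for (size_t i = 0; i < n; i++) {
        if (!toy_offset_in_use(r[i].offset))
            continue;
        uint64_t end = (uint64_t)r[i].offset + r[i].size;
        if (r[i].offset < prev_end)
            return 0; /* overlaps the previous context */
        if (end > sig_offset)
            return 0; /* reaches into the signature array */
        prev_end = end;
    }
    return 1;
}
```

For example, contexts at 0x100 (size 0x30) and 0x200 (size 0x88) pass with a signature array starting at 0x1000, while a context at 0xFF0 of size 0x30 is rejected because it would end at 0x1020.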

Thoughts on how this vuln might have been found (fuzzing, code auditing, variant analysis, etc.):

We think this vulnerability was most likely found through code auditing or reverse engineering, because (1) the base log record has to be crafted so that the container context remains uncorrupted by the encode/decode operations, and (2) the CClfsBaseFile::AcquireContainerContext function has to be made to fail deliberately. In fairness, because (2) is easy to achieve, the bug may in fact have been found through fuzzing or other means.

(Historical/present/future) context of bug:

https://msrc.microsoft.com/update-guide/vulnerability/CVE-2022-24521

The Exploit

(The terms exploit primitive, exploit strategy, exploit technique, and exploit flow are defined here.)

Exploit strategy (or strategies):

As we do not have a sample to analyse, we do not know how the ITW exploit works. However, we did manage to exploit this vulnerability with a similar procedure, overwriting the process token with pipe objects as outlined in the SSTIC2020: Scoop the Windows 10 pool! paper.

Exploit flow:

  1. Create pipe objects and add pipe attributes. Each attribute is a key-value pair stored in a linked list, and the PipeAttribute object is allocated in the Paged Pool.
  2. Use NtQuerySystemInformation to leak the kernel virtual addresses of the pipe objects in the big pool.
  3. Allocate a fake_pipe_attribute object. Its address will later be injected into the original doubly linked list.
  4. Obtain the base address of the selected gadget module using NtQuerySystemInformation.
  5. Trigger the CLFS bug to call a gadget in that module that performs an arbitrary data modification. This yields an arbitrary read primitive, which can be used to obtain the EPROCESS address.
  6. Trigger the CLFS bug again to overwrite the user-mode process token and elevate to SYSTEM privileges.

Known cases of the same exploit flow:

N/A

Part of an exploit chain?

N/A

The Next Steps

Variant analysis

Areas/approach for variant analysis (and why): N/A

Found variants: N/A

Structural improvements

What are structural improvements such as ways to kill the bug class, prevent the introduction of this vulnerability, mitigate the exploit flow, make this type of vulnerability harder to exploit, etc.?

Ideas to kill the bug class: N/A

Ideas to mitigate the exploit flow: N/A

Other potential improvements: N/A

0-day detection methods

What are potential detection methods for similar 0-days? Meaning are there any ideas of how this exploit or similar exploits could be detected as a 0-day?

Other References