Drivers sometimes need atomic bit test-and-set operations. They may call other functions that rely on bit testing and setting [1], but more often than not they call one of the InterlockedCompareExchangeXXX functions, or the shorter InterlockedXXX functions if comparing against the current value is not essential.
So it did not surprise me that I could not find anything in the WDK documentation when I went looking for an InterlockedXXX function that essentially emits (rex) lock bts on x86/x64.
But if one looks around, there are actually two macros that satisfy the need:
- InterlockedBitTestAndSet(64) in ntddk.h that uses InterlockedOr(64)
- InterlockedBitTestAndSet(64) in wdm.h that uses the compiler intrinsic _interlockedbittestandset(64)
Macro #1 translates to a lock or instruction, so it is not what we originally intended [2]. Macro #2 leads to an intrinsic and does translate to a (rex) lock bts instruction. So play C header games if you need to and use macro #2, or use the compiler intrinsic _interlockedbittestandset(64) directly. Case closed, right?
Well, not quite. A second look at what macro #1 expands to reveals more differences between the two macros. My ntddk.h [3] shows it like this:
What is going on here? The macro supports setting bits beyond the first LONG pointed to by Base by using the higher bits of the Bit parameter. The _interlockedbittestandset intrinsic does not have this feature: it blindly passes its arguments to a lock bts instruction, and the CPU will faithfully ignore any higher bits that are set.
So in case you have to set bits beyond the first LONG (or LONG64), you could use macro #1.
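The indexing described above can be sketched in portable C. This is only an illustration under stated assumptions: it uses C11 atomic_fetch_or in place of the WDK's InterlockedOr so that it builds outside a driver, the helper name bitmap_test_and_set is hypothetical, and it is not the text of the ntddk.h macro.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not a WDK function): atomically set bit `bit` in a
 * bitmap made of 32-bit words and report whether it was already set.
 * The high bits of `bit` select the word, the low five bits select the bit
 * inside that word, which is the behaviour the text attributes to macro #1.
 */
static bool bitmap_test_and_set(_Atomic uint32_t *base, size_t bit)
{
    size_t   word = bit >> 5;                     /* which 32-bit word      */
    uint32_t mask = UINT32_C(1) << (bit & 31u);   /* which bit in that word */

    /* Atomic OR returns the previous word, so the old bit can be tested. */
    uint32_t old = atomic_fetch_or(&base[word], mask);
    return (old & mask) != 0;
}
```

With a four-word bitmap, asking for bit 40 touches the second word (index 1, bit 8) and leaves the first word alone. The caller is responsible for making sure the bitmap is large enough for the requested bit.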
[1] Such as when using regular spin locks.
[2] Which makes this macro a bit confusing, because it claims to do a test and set but in reality does a set (and test?) if you think about it.
[3] Which is not the absolute latest and greatest, mind you.