On Detection: Tactical to Functional
Part 9: Perception vs. Conception
Jared Atkinson | Security Boulevard | Fri, 20 Oct 2023

The concepts discussed in this post are related to those discussed in the 9th session of the DCP Live podcast. If you find this information interesting, I highly recommend checking the session out!


At this point in the series, we understand that attack techniques are abstract concepts that must be instantiated in the form of a tool or software application. We also understand that many abstraction layers exist between the (sub-)technique, OS Credential Dumping: LSASS Memory, and the Mimikatz tool. Over the past few posts in this series, we’ve explored these layers, specifically the functional and operational layers. For instance, below we see the “operation chain” or “procedure” that corresponds with Mimikatz’s sekurlsa::logonPasswords command. Here we see that the operations that form the chain are Process Enumerate -> Process Access -> Process Read.

Mimikatz sekurlsa::logonPasswords Operation Chain

In this article, I hope to demonstrate how the operational layer IS the appropriate layer of analysis for those interested in creating behavioral detection analytics.

Note: In this context, a behavioral detection analytic is one that focuses on what the malware does instead of what the malware is. It decouples the action from the actor. This is not to say that detecting known bad malware based on what it is, is a bad idea; we simply are attempting to take the next logical step.

Snake Detection Theory

In her 2009 book, “The Fruit, the Tree, and the Serpent”, Lynne Isbell explores her idea that human visual perception developed as a result of the relationship between our ancestors and snakes. The general proposal is that until relatively recently, humans lived with snakes as their primary predator, which meant that “snake detection” served as a selective pressure on our sensory development. If a human was unable to identify a snake, they were likely to succumb to predation and therefore less likely to reproduce. This meant that over time, our visual perception became more attuned to detecting snakes, which in general is not an easy task. You see, snakes slither along the ground, often in tall brush, and are camouflaged to their environment. Isbell proposes that to adapt to this fact, human visual perception eventually became more sensitive to objects in motion than to static objects. The proposal is that selection worked to ensure that modern humans see that which they need to see to survive. There is simply too much signal out there in the world for us to attend to it all, so we must prioritize the signal that is most likely to be important. This is why we use the phrase “to pay attention.” The idea is that attention is a finite resource that we must spend judiciously; ultimately, the process of attending is done subconsciously.

While this presumably works well in the biological world, it is not obvious that this same principle applies in the cyber realm. Our perception in the cyber world is artificial or man-made, so to assume that it is properly adapted via selection may be a large leap. A question worth asking is whether you believe that feedback is produced in an efficient enough manner to hone our perceptual capability in cyberspace. If the endpoint detection and response (EDR) sensor serves as the functional equivalent to our eyes in this space, how do we know that it is prioritizing those events that are most relevant to our ability to detect?

In this article, I hope to link our conceptual model, the operation chain, to our ability to perceive EDR events. We will explore how the operation chain provides a formula for predicting the types of events one should expect to encounter as a result of executing a specific sample or operation chain. We will also compare and contrast two EDR solutions to understand the difference in their ability to perceive certain operations. Most importantly, we will demonstrate that we perceive operations.

The events that we collect are synonymous with operations. This means that until we are able to prove otherwise, the optimal level of analysis for malware samples, within the abstraction, is the operational level. I’ve been using the phrase “align your conception to your perception” to help our team understand how to think about attacker behaviors. In this phrase, “conception” signifies the model with which you think about the behavior and “perception” signifies the events that you use to understand the activity in your environment. If these two concepts are not aligned, then you are starting with a disadvantage.

Let’s dig into it!

We perceive operations

A good detection engineer should understand how they perceive their cyber environment. Human beings exist in a biological environment and have hardware (sensory organs like eyes, ears, skin, etc.) that allows them to receive signals from the environment in order to perceive it. Even though humans have five senses, meaning they have the ability to perceive five different types of signals, we know that there are signals in the world that we cannot naturally perceive. For example, we know that sharks have the ability to perceive electrical currents through a sense called electroreception. Unfortunately, our biological senses do not provide any natural perceptual value in the artificial cyber environment. As a result, we have built sensors that capture signals and convert them into a form that we can ingest and process. Over time, these sensors have evolved, so to speak, and become more complex. So what do we use as our eyes and ears in the cyber environment? The answer is simple: we use hardware and software sensors like EDR agents and network monitoring sensors to capture signals from the environment and present them to us in a manner that we can consume. Since EDR is a significant source of the telemetry used in detection engineering, I want to focus on discussing how and what exactly EDR sensors see.

To answer this question, I think it is valuable to dig into some events from multiple EDRs just to demonstrate why I am making the claim that what we see, in the form of events, are in fact operations being carried out. Below is a sequence of operations (i.e., operation chain) that describes classic remote thread injection. We see that the operation chain is composed of four operations, each of which will be discussed in more detail below:

The classic shellcode injection operation chain is composed of the Process Open, Memory Allocate, Process Write, and Thread Create operations.

Process Open

The first operation that we see is Process Open. Processes are securable objects on the Windows OS and their structure lives in the kernel. This means that user-mode applications do not have free rein to interact with them directly. Instead, the requesting application (the malware, in this case) must open a handle which can then be used to interact in many different ways with the process. The opening of the handle is the point at which the access checks occur based on the requesting process’s token and the target process’s discretionary access control list (DACL). If the access check fails, the requestor will receive an ACCESS_DENIED error in response.

Memory Allocate

Once the process handle has been obtained, it is necessary to allocate a buffer of memory in the remote process which will be used to hold the injected shellcode. This operation depends on the handle that the Process Open operation produced. Specifically, the handle must be opened with the PROCESS_VM_OPERATION access right. This step is necessary because the source (injecting) process must make the code available in a location that is accessible to the target (injected) process.

Process Write

After a buffer has been allocated in the target process, the malicious shellcode can be written to that buffer. Again, the process handle the Process Open operation produced is used; however, in this case, the PROCESS_VM_WRITE access right is required (along with PROCESS_VM_OPERATION, per the WriteProcessMemory documentation). The Process Write operation is the component of the operation chain that causes this to be considered “injection.” Once the code has been written to the target process’s memory space, the final step will be to find a way to coerce the target process to execute the code.

Thread Create

The final operation is focused on executing the injected code. The traditional way of executing the code is to create a new thread that is set to execute the code located in the buffer that was allocated during the Memory Allocate operation. Over the years, there have been many alternative execution mechanisms discovered, but for the sake of simplicity we will stick with the Thread Create operation for this article.
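Putting the four operations together, a minimal C++ sketch of the classic chain might look like the following. The target PID and shellcode are placeholders and error handling is omitted for brevity; this is an illustration of the operation chain, not production tooling:

```cpp
#include <windows.h>

// Sketch of the classic remote thread injection operation chain.
// pid and shellcode are placeholder inputs supplied by the caller.
void InjectSketch(DWORD pid, const unsigned char* shellcode, SIZE_T size)
{
    // Process Open: the access check against the target's DACL happens here
    HANDLE hProcess = OpenProcess(
        PROCESS_CREATE_THREAD | PROCESS_QUERY_INFORMATION |
        PROCESS_VM_OPERATION | PROCESS_VM_WRITE | PROCESS_VM_READ,
        FALSE, pid);

    // Memory Allocate: reserve an executable buffer in the remote process
    LPVOID remoteBuffer = VirtualAllocEx(hProcess, NULL, size,
        MEM_COMMIT | MEM_RESERVE, PAGE_EXECUTE_READWRITE);

    // Process Write: copy the shellcode into the remote buffer
    WriteProcessMemory(hProcess, remoteBuffer, shellcode, size, NULL);

    // Thread Create: coerce the target to execute the buffer
    HANDLE hThread = CreateRemoteThread(hProcess, NULL, 0,
        (LPTHREAD_START_ROUTINE)remoteBuffer, NULL, 0, NULL);

    CloseHandle(hThread);
    CloseHandle(hProcess);
}
```

Note that each of the four calls corresponds one-to-one with an operation in the chain, which is exactly what makes the operational layer a useful unit of analysis.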

Associating EDR Events with the Operation Chain

Once the operation chain has been established, Detection Engineers have the information they need to begin considering the practical components of the detection. Specifically, the operation chain provides the details they require to identify the events or logs that are relevant to the behavior in question. In this section, we will explore the relevant events that two common EDR platforms generate and demonstrate that the events that we collect are focused on the operational level. This means that if you can identify the operations that a malware sample implements, then you can likely correlate them to the telemetry that your EDR platform captures. This facilitates deliberate detection engineering.

Note: It is important to note that there are many unique operation chains that implement the Process Injection technique. In this article, we are focusing on one operation chain which is considered the prototype for this technique. I’ve found that, similar to the way we teach complicated topics such as mathematics, it is ideal to first teach detection engineers to build detections for single operation chains. Once they are comfortable engineering detections for single operation chains, then they can begin to consider the ramifications of multi-chain detection engineering. If, however, they lack the single-chain fundamentals, they will inevitably struggle to find valid solutions for multiple chains.

Microsoft Defender for Endpoint (MDE)

Now imagine that you work for an organization that uses Microsoft Defender for Endpoint (MDE) as your EDR. If I asked you which MDE events are available for this particular operation chain, you might have a number of different options that immediately come to mind. In fact, I chose MDE specifically because it produces an event for each operation in this chain which I think precisely demonstrates the point of this article. The general rule of thumb that I hope to demonstrate is that EDR products perceive operations. Let’s take a look at each operation and consider which MDE events report on them.

The Classic Shellcode Injection operation chain composed of the Process Open, Memory Allocate, Process Write, and Thread Create operations

The first step in correlating MDE events to operations is to become familiar with the types of events that are available to you. To begin, I recommend referencing Microsoft’s data table documentation. While there are many supplementary tables available in MDE, I find that the tables shown in the screenshot below are the primary tables that I use as the foundation of my detection engineering efforts.

After analyzing this list, you might be thinking, “I don’t see anything related to Process Open, Memory Allocate, Process Write, or Thread Create,” and you would be entirely correct. The trick here is that the DeviceEvents table hides a treasure trove of events, so the next step is to dig into it. When we browse to the documentation page for the DeviceEvents table, we find that the ActionType column is the key to discovering all of the different event types that exist within the table. Notice the “Tip” callout where it recommends the use of the built-in schema reference in MDE in order to discover the supported ActionType values. If your organization uses or has access to MDE, I highly recommend you spend some time doing just that.

However, if you do not have access to MDE, then we will do this exploration artificially. In his blog post where he compares the telemetry generation capabilities of Sysmon to those of MDE, Olaf Hartong provides this great spreadsheet that contains all of the action types that were available at the time of his post (October 2021). The list has likely grown over the subsequent years, but is sufficient for the purposes of this article. We will use this list to correlate ActionType values to the relevant operations from our chain.

Process Open -> OpenProcessApiCall

As described earlier in this post, the Process Open operation is the first operation in the chain. When we reference Olaf’s list of ActionType values available within the DeviceEvents table, we see one ActionType that stands out. The OpenProcessApiCall ActionType, named after the OpenProcess API function, reports Process Open events. It is important to understand that despite the ActionType’s name being related to a specific API function, the event itself is generated in the kernel and thus is resilient to functional changes. This means that regardless of whether the kernel32!OpenProcess, kernelbase!OpenProcess, ntdll!NtOpenProcess, or syscall!NtOpenProcess functions are called, the event will generate. For more information regarding how this event is generated in the kernel, I recommend checking out Jonathan Johnson’s excellent blog post on the topic.

Below is an example Kusto query to find these events:

DeviceEvents
| where ActionType == "OpenProcessApiCall"

Note: MDE in particular implements some amount of filtering on this event. While the actual Event Tracing for Windows (ETW) provider captures all Process Opens, MDE will only forward the events where lsass.exe is the target process. While this decision does not impact some techniques, such as OS Credential Dumping: LSASS Memory, it would severely impact its utility with regard to Process Injection as the target process is not guaranteed to be lsass.exe.

Memory Allocate -> NtAllocateVirtualMemoryRemoteApiCall

Following a similar process, we find that there are two relevant ActionTypes for the Memory Allocate operation: NtAllocateVirtualMemoryApiCall and NtAllocateVirtualMemoryRemoteApiCall. Again, despite the names being focused on specific function names, these events are generated in the kernel and thus capture all user-mode functions in the Memory Allocate function call stack. Jonathan Johnson again helps us out with his fantastic TelemetrySource project, where he explicitly maps how these events are generated via the Microsoft-Windows-Threat-Intelligence ETW provider.

Our next question is whether both events are relevant to our specific use case. What we find is that the first event, NtAllocateVirtualMemoryApiCall, is generated when memory is allocated locally (within the calling process); while the second event, NtAllocateVirtualMemoryRemoteApiCall, is generated when the memory allocation is done in a remote process. Since we are focused specifically on Process Injection, meaning the injection is occurring in a remote process, we can safely say that we are only interested in the NtAllocateVirtualMemoryRemoteApiCall version of the event.

DeviceEvents
| where ActionType == "NtAllocateVirtualMemoryRemoteApiCall"

Process Write -> WriteToLsassProcessMemory

The third operation is Process Write. At this point in the chain, a handle has been opened and a buffer has been allocated in the remote process. Now it is time to “inject,” or write, the malicious shellcode to that buffer. Again, this operation requires the handle to the target process that was obtained in step one, and this time the handle must have the PROCESS_VM_WRITE access right.

Upon consulting the list of ActionTypes in MDE’s DeviceEvents table, we find the WriteToLsassProcessMemory ActionType. This event is technically aligned to the Process Write operation, but unfortunately the scope is a bit narrower than we might hope for.

Note: This event used to be called WriteProcessMemoryApiCall; however, in February 2021, Microsoft renamed it based on consumer feedback so the event name more closely represents the filtering conditions that are present.

While MDE’s WriteToLsassProcessMemory ActionType is limited to Process Write operations where the target process is lsass.exe, the underlying mechanism used to produce these events provides greater coverage. According to Jonathan’s TelemetrySource project, these events are generated by the Microsoft-Windows-Threat-Intelligence ETW provider. Any EDR can subscribe to this provider, so long as it meets certain criteria Microsoft has put forth, and each vendor can make its own filtering decisions. In general, the ETW provider is situated to identify local Process Writes (Windows Event ID 12) and remote Process Writes (Windows Event ID 14). In the context of Process Injection, we are more interested in the latter.

DeviceEvents
| where ActionType == "WriteToLsassProcessMemory"

Thread Create -> CreateRemoteThreadApiCall

The final operation is the Thread Create operation. This step requires two essential inputs: first, the handle to the target process, which must include the PROCESS_CREATE_THREAD, PROCESS_QUERY_INFORMATION, PROCESS_VM_OPERATION, PROCESS_VM_WRITE, and PROCESS_VM_READ access rights; second, the address of the memory buffer containing the shellcode. Once this operation is performed, a new thread is created which will execute the shellcode and complete the “injection” routine.

Let’s check Olaf’s list of ActionTypes to see if we can find one related to thread creation. If you found the CreateRemoteThreadApiCall ActionType, you would be on the right track.

DeviceEvents
| where ActionType == "CreateRemoteThreadApiCall"
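Because MDE emits a discrete event for each operation in this chain, the individual queries above can be combined into a single hunting query that surfaces processes generating several of the chain’s events in a short window. The following Kusto sketch uses standard DeviceEvents columns; the three-event threshold and ten-minute window are arbitrary assumptions to tune for your environment:

```kusto
DeviceEvents
| where ActionType in ("OpenProcessApiCall", "NtAllocateVirtualMemoryRemoteApiCall",
    "WriteToLsassProcessMemory", "CreateRemoteThreadApiCall")
| summarize Operations = make_set(ActionType), Start = min(Timestamp), End = max(Timestamp)
    by DeviceId, InitiatingProcessId, InitiatingProcessFileName
| where array_length(Operations) >= 3 and End - Start < 10m
```

Remember that the WriteToLsassProcessMemory filtering described above means the full four-event chain will only appear when lsass.exe is the target.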

MDE Event/Operation Relationship

The image shown below maps MDE events to their associated operations:

Classic Shellcode Injection Operation Chain Overlaid With Relevant MDE Events

Sysmon

It is important to understand that each EDR is going to provide a slightly different perspective. To demonstrate this, we can turn to Sysmon: a free sensor that is available as part of the Sysinternals suite of tools. While Sysmon might not qualify fully as an EDR product, telemetry production is not the feature holding it back. At the time of this writing, Sysmon provides 28 different events that can be used for detecting malicious or unauthorized activity. We can use the list of events, linked above, to begin identifying events that correspond with the operations in our operation chain. In this case, we find that Sysmon, unlike MDE, does not supply a corresponding event for EACH operation in the chain. Instead, we find that Sysmon Event ID 8, CreateRemoteThread, corresponds with the Thread Create operation, and Sysmon Event ID 10, ProcessAccess, corresponds with the Process Open operation.

Sysmon Event/Operation Relationship

When it comes to Sysmon, we find that the claim that events align to operations remains true; however, we also find that not all sensors support telemetry generation for all operations. Unlike MDE, Sysmon currently only supports the Process Open and Thread Create operations. An interesting feature difference between Sysmon and MDE is that while Sysmon does not have events for the Memory Allocate and Process Write operations, it does allow for customized configuration regarding filtering for the events it does collect. This means that the user can select the conditions by which Sysmon will capture and log the relevant operations discussed above. As a result, there is a lower likelihood that a relevant event will be missed as a result of transparent filtering.

Classic Shellcode Injection Operation Chain Overlaid With Relevant Sysmon Events
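To illustrate that configurability, a minimal Sysmon configuration fragment along these lines would log Event ID 10 (ProcessAccess) only when the target is lsass.exe while logging every Event ID 8 (CreateRemoteThread). The schema version here is an assumption; match it to your installed Sysmon binary:

```xml
<Sysmon schemaversion="4.90">
  <EventFiltering>
    <!-- Event ID 10: ProcessAccess; include only handles opened to lsass.exe -->
    <ProcessAccess onmatch="include">
      <TargetImage condition="image">lsass.exe</TargetImage>
    </ProcessAccess>
    <!-- Event ID 8: CreateRemoteThread; exclude nothing, i.e., log everything -->
    <CreateRemoteThread onmatch="exclude" />
  </EventFiltering>
</Sysmon>
```

Because the filtering lives in a configuration file you control, you, rather than the vendor, decide which Process Open and Thread Create operations are worth capturing.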

Conclusion

When it comes to detection engineering, I follow two mantras. First, “we cannot detect that which we do not understand;” this means that our ability to detect an action is dependent upon our understanding of the action itself. If we have a relatively superficial understanding of Process Injection, then we are likely to be limited to detecting the WHAT instead of the HOW. So how do we develop this understanding, and at what point can we be confident that we understand the phenomenon ENOUGH to produce a sufficient solution?

The second mantra is, “we must align our conception with our perception.” The goal of this article is to demonstrate that, especially from an endpoint telemetry perspective, we see operations, so it only makes sense that we should strive to understand the behavior in the context of operations. This facilitates proper decision making when it comes to engineering detection rules.

If you find yourself building a detection analytic and you are relying on an event that is not in the corresponding operation chain, I encourage you to ask yourself whether you are building a detection for the “what” instead of the “how.” Put another way, are you detecting the tool and not the behavior? Maybe detecting the tool is your goal; that is fine, but it should be a deliberate choice.

References

On Detection: Tactical to Functional Series


On Detection: Tactical to Functional was originally published in Posts By SpecterOps Team Members on Medium, where people are continuing the conversation by highlighting and responding to this story.

The post On Detection: Tactical to Functional appeared first on Security Boulevard.

On Detection: From Tactical to Functional
Thu, 01 Jun 2023

In his 1931 paper “A Non-Aristotelian System and Its Necessity for Rigour in Mathematics and Physics,” mathematician Alfred Korzybski introduced an idea that many today find helpful when dealing with complex systems. The idea is commonly referred to as “The map is not the territory,” and Korzybski lays it out according to the following points:

A.) A map may have a structure similar or dissimilar to the structure of the territory.
B.) Two similar structures have similar ‘logical’ characteristics. Thus, if in a correct map, Dresden is given as between Paris and Warsaw, a similar relation is found in the actual territory.
C.) A map is not the actual territory.
D.) An ideal map would contain the map of the map, the map of the map of the map, etc., endlessly…We may call this characteristic self-reflexiveness.

When we consider attacker tradecraft, we tend to think about it abstractly. What does it mean to “dump credentials from LSASS” or “Create a Golden Ticket”? Our understanding of these abstract concepts depends on the MAP we use to undergird them. If, as Korzybski’s second point states, our map does not share a similar structure to reality, then any conclusions we draw from the map will, by definition, be incorrect. How, then, should we build accurate maps of the “tradecraft territory?”

I believe the answer to that question can be derived from our old friend Plato, who, in his dialogue, Sophist, explores the definition of the form through the analysis of particulars. In our case, the form represents the tradecraft (technique or behavior) in question, for example, Process Injection; the particular is the tool or sample that implements the technique. Plato proposed that we can understand the form by understanding what the particulars have in common, “sameness,” and what differentiates them, “difference.” We can do this by analyzing tools that ostensibly perform Process Injection and “mapping” them to allow for comparison. This post describes our current approach to mapping tool implementations and how this mapping facilitates the comparison of the particulars.

Introduction

Over the past few months, I’ve been working with some colleagues at SpecterOps to develop our Purple Team service offering. One of the tasks that we’ve been focusing on is creating a model that allows us to represent the possible variations of an attack technique. The plan is to use this model for use case selection during our Purple Team assessment. To do this, we’ve applied many of the concepts described earlier in this series, and along the way, we’ve learned a lot about how to teach others to implement this approach. One person who helped me understand how to break down the analytical problem into bite-sized portions is Nathan D. When Nathan first worked with me, he did not have much experience with the analytical approach described in this series. His fresh perspective helped to identify areas where I was making logical leaps or did not represent the output well. As a result, he helped me create a new model or graph layout that I think is helpful and a bit more tangible for those just getting started.

Below is an image of the Tool Graph (yes, that is the really creative name I came up with). The idea is to analyze a single tool’s, or “particular” sample’s, implementation of a behavior to understand how it works and use that as a starting point to enumerate or discover additional variations. Over time, as more tools are analyzed, these graphs, or at least some of their components, can be combined to create a robust model of the different variations of each attack technique. It is a way to formalize our understanding of these variations instead of simply maintaining a disintegrated list of variations in our heads.

Tool Graph

In this post, I want to walk you through the components of the Tool Graph, what the colors represent, and how you can make your own. So let’s dig in!

Our Tool of Interest

For this blog post, I will reference a sample shared by DGRonpa in their Process_Injection repository. This repository is an excellent resource for learning about many of the most common Process Injection variations. Today we will analyze their Shellcode_Injection.cpp sample, which implements the classic remote thread shellcode injection variation that most of us are familiar with.

Choosing a sample or tool to analyze is a significant source of consternation. Through this initiative, we have identified several important factors to consider, especially for analysts just getting started with this type of analysis. I plan to discuss these factors in a follow-up entry in the series.

Process_Injection/Shellcode_Injection.cpp at main · DGRonpa/Process_Injection

Function Chain

The first step in generating a Tool Graph is to map the sequence of the sample’s API function calls. This sequence is also known as a “function chain.” For this post, I selected a relatively simple sample so that identifying the Function Chain would be as easy as possible. If you want to see how this process works with a more complicated example, I recommend you check out Part 1 of this series, which looks at the function chain for the mimikatz sekurlsa::logonPasswords module.

For now, I’ve included a screenshot of the critical snippet of our sample below:

Source Code for Shellcode_Injection.cpp

Notice that the InjectShellcode function calls four Windows API functions, OpenProcess, VirtualAllocEx, WriteProcessMemory, and CreateRemoteThread. Now that we’ve observed the specific sequence of functions the sample calls, we can graph them as a Function Chain:

Function Chain

Note: All graphs are generated using APC Jones’ Arrow Tool at https://arrows.app

In the Tool Graph, the circles and arrows of the Function Chain are red.

Function Call Stack(s)

I remember hearing about this Function Chain or pattern when I started in the industry. I was told, “if you see OpenProcess, VirtualAllocEx, WriteProcessMemory, and CreateRemoteThread, it signifies Injection.” Over time, as I became more familiar with API programming, I realized that this perspective was somewhat low resolution. I found that while this specific pattern of functions may indicate injection, there are (many) alternative Function Chains that also indicate injection.

It reminds me of the saying that “all squares are rectangles, but not all rectangles are squares,” but this time, it is something like “all instances of the [function chain] are injection, but not all instances of injection are the [function chain].” One way we can expand our map of Function Chain variations is to explore the Function Call Stacks for each function in our particular Function Chain, namely kernel32!OpenProcess, kernel32!VirtualAllocEx, kernel32!WriteProcessMemory, and kernel32!CreateRemoteThread. If you are unfamiliar with Function Call Stacks, check out my Understanding the Function Call Stack post, which offers an in-depth exploration of this phenomenon.

Note: Before we continue, there is a significant change between how I graph the Function Call Stack in the original post and how I integrate it into the Tool Graph. To differentiate between the Function Chain (the literal sequence of functions that the tool calls) and the Function Call Stack (the functions that are called implicitly/behind the scenes as a subsequent action of a higher level explicit function call), we represent the “Function Chain” horizontally and the “Function Call Stack” vertically.

Below, you will see the first Function Call Stack for kernel32!OpenProcess, integrated into the Tool Graph:

Including the Function Call Stack for kernel32!OpenProcess

We can then integrate the Function Call Stack for each additional function in the Function Chain, as shown below:

Function Chain and associated Function Call Stacks

Note: Some functions are much more complicated than others because they may call multiple subsequent functions. In some cases, we include all subsequent functions in our analysis, but in other cases, we only include the most relevant sub-stack. To signify incomplete function analysis, we color the function’s circle orange, as shown with kernelbase!CreateRemoteThreadEx in this example.

Analysis of Variations

Once we have enumerated all the Function Call Stacks, we can analyze all the possible “functional variations” represented by our current graph. Generally, a developer can select any function in each stack. For example, you can choose any function from call stack 1 and combine it with any function in call stacks 2, 3, and 4 to create a functioning application. We see that Call Stack 1 has five (5) functions, Call Stack 2 has six (6) functions, Call Stack 3 has five (5) functions, and Call Stack 4 has six (6) functions. Therefore, we can calculate the total permutations by multiplying the number of functions in each call stack. In this case, we find 900 (5x6x5x6) possible “Function Chains” or “functional variations” represented by these call stacks.

Again, a “Function Chain” is defined as a unique sequence of functions that implement the behavior. For that reason, our original functional variation of kernel32!OpenProcess -> kernel32!VirtualAllocEx -> kernel32!WriteProcessMemory -> kernel32!CreateRemoteThread is one example, while a hypothetical chain of kernel32!OpenProcess -> kernelbase!VirtualAllocEx -> ntdll!NtWriteVirtualMemory -> ntdll!NtCreateThreadEx would be a second variation. From the attacker’s perspective, these two variations are interchangeable because they produce the same result. From the detection engineer’s perspective, these small functional changes might cause the sample to evade detection. Below I’ve included the four Function Call Stacks with no function highlighting (no red circles) to show examples of how this works.

Let’s look at some of the different Function Chains these Call Stacks represent.

Standard Win32 API Functions

The most likely Function Chain is the one that uses the documented Win32 API functions. These are the functions Microsoft intends developers to use when writing software, so they are often consistent across versions of the OS, well documented, and easy to find tutorials on. For that reason, malware commonly uses these functions as well, unless a specific evasion consideration drives the developer to the lesser-known functions lower in the stack.

Function Chain Variation — Win32 API Functions

Syscalls Only

A second alternative that has become more popular over the past few years is skipping past the higher-level API functions to make direct system calls (syscalls). Malware developers use syscalls because, in some cases, EDR sensors rely on logging techniques that can only see high-level user-mode function calls. By invoking syscalls directly, malware can evade this subset of sensors or events.

Function Chain Variation — Syscalls only

Arbitrary Example Variations

While the first two examples represent some of the Function Chains most likely to be selected, we can derive MANY other Function Chains from these call stacks without first observing them in a “real” sample. Below are two examples of arbitrary function chains that developers could use to create an alternative tool that would be functionally equivalent to the original but may be different enough to evade some detection rules.

Function Chain Variations — Arbitrary Selection of Functions

Operation Chain (Procedure)

We have identified 900 possible Function Chains that malware developers can use to implement this “Classic Remote Thread Shellcode Injection” variation. While changing the Function Chain may be enough to evade some telemetry generation or detection rules, most of the Function Chains described by the existing graph are so similar that their differences are irrelevant from a detection engineering perspective. Put another way, “these functional variations are fungible from the attacker’s perspective, and often fungible from the defender’s perspective.” This is because most modern EDR sensors generate telemetry from the kernel, and therefore, the capture mechanism is at the bottom of the Function Call Stack.

If a developer can use any function in the Call Stack to replace any other function in the Call Stack, then it makes sense that the detection engineer would want to view the collection of functions within a Call Stack as if they were the same. This is the basis of abstraction: the idea that although kernel32!OpenProcess and ntdll!NtOpenProcess are literally different functions, they are similar enough to treat as the same. This allows the detection engineer to ignore irrelevant differences between Function Chains and focus their energy on the differences that matter. Using this logic, we can derive an abstract category that represents, or can replace, any interchangeable function in a Call Stack. For example, instead of making fine-grained distinctions between the functions mentioned earlier, we can refer to them using the Process Open operation. For a deeper look into this “Operation” idea, see Part 2 of the On Detection series.

I have added an Operation Chain to the graphic below. We have converted the specific Function Chain of kernel32!OpenProcess -> kernel32!VirtualAllocEx -> kernel32!WriteProcessMemory -> kernel32!CreateRemoteThread to an Operation Chain of Process Open -> Memory Allocate -> Process Write -> Thread Create. This analysis has generalized the specific Function Chain used by the malware sample to an Operation Chain representing 900 Function Chain variations. Abstraction saves us from the necessity of making small/onerous distinctions.

Deriving the Operation Chain from the Function Chain and Call Stacks

Note: The circles and arrows that comprise the Operation Chain are green in the Tool Graph.
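As a sketch of this generalization step, the mapping below encodes a small, illustrative subset of the function-to-operation relationships named in this article; it is not a complete inventory of each Call Stack:

```python
# Illustrative subset of the function-to-operation mapping (not a
# complete inventory of each Function Call Stack).
FUNCTION_TO_OPERATION = {
    "kernel32!OpenProcess":        "Process Open",
    "ntdll!NtOpenProcess":         "Process Open",
    "kernel32!VirtualAllocEx":     "Memory Allocate",
    "kernelbase!VirtualAllocEx":   "Memory Allocate",
    "kernel32!WriteProcessMemory": "Process Write",
    "ntdll!NtWriteVirtualMemory":  "Process Write",
    "kernel32!CreateRemoteThread": "Thread Create",
    "ntdll!NtCreateThreadEx":      "Thread Create",
}

def to_operation_chain(function_chain):
    """Generalize a concrete Function Chain into its Operation Chain."""
    return [FUNCTION_TO_OPERATION[f] for f in function_chain]

# The two functional variations mentioned earlier in the article...
win32_chain = ["kernel32!OpenProcess", "kernel32!VirtualAllocEx",
               "kernel32!WriteProcessMemory", "kernel32!CreateRemoteThread"]
mixed_chain = ["kernel32!OpenProcess", "kernelbase!VirtualAllocEx",
               "ntdll!NtWriteVirtualMemory", "ntdll!NtCreateThreadEx"]

# ...generalize to the same Operation Chain.
assert to_operation_chain(win32_chain) == to_operation_chain(mixed_chain)
print(to_operation_chain(win32_chain))
```

Any of the 900 Function Chains, run through a complete version of this mapping, would produce the same four-operation result.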

One of the arguments that I will make in a future post is that, for the most part, we do not see function calls (despite what certain EDR naming conventions might lead us to believe). Instead, we perceive something that more closely approximates the aggregation of the Call Stack, something I have previously referred to as an Operation. This operational focus is becoming more common as EDR sensor technology becomes more sophisticated. In general, EDR sensors now generate much of their telemetry in the kernel below the syscall, the bottom level of the (user-mode) Function Call Stack. That means that no matter which function in the stack a tool calls, the operation should be perceived by the EDR sensor. As a result, we can use the Operation Chain as an abstract summary of all 900 Function Chains in the Tool Graph. In other words, the Operation Chain allows us to ignore unnecessary complexity and align our conception with our perception.

Conclusion

The Tool Graph explicitly maps a sample from its Function Chain to its Operation Chain via Function Call Stacks. This ultimately allows multiple samples to be compared at different levels of abstraction. First, samples are compared functionally via their Function Chains. If their Function Chains are the same, then it is safe to assume that their Operation Chains will also be the same. However, in cases where the Function Chains are not the same, the Tool Graph allows the samples to be compared operationally via their Operation Chains. This is the premise proposed in Part 7 of this series. The idea is that if two samples are “operationally equivalent,” then they will almost certainly be mutually detectable. However, if the two samples are dissimilar at the operational level, this analysis may have revealed an opportunity for evasion.
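That comparison order (functional first, then operational) can be sketched as a small helper; the function-to-operation mapping here is an illustrative fragment, not a complete inventory:

```python
# Illustrative fragment of a function-to-operation mapping.
FUNC_TO_OP = {
    "kernel32!OpenProcess":       "Process Open",
    "ntdll!NtOpenProcess":        "Process Open",
    "kernel32!ReadProcessMemory": "Process Read",
    "ntdll!NtReadVirtualMemory":  "Process Read",
}

def compare_samples(chain_a, chain_b, func_to_op=FUNC_TO_OP):
    """Compare two samples functionally first, then operationally.

    Identical Function Chains imply identical Operation Chains, so the
    operational comparison is only needed when the functional one fails.
    """
    if chain_a == chain_b:
        return "functional match"
    ops_a = [func_to_op[f] for f in chain_a]
    ops_b = [func_to_op[f] for f in chain_b]
    if ops_a == ops_b:
        return "operational match"
    return "no match"

a = ["kernel32!OpenProcess", "kernel32!ReadProcessMemory"]
b = ["ntdll!NtOpenProcess", "ntdll!NtReadVirtualMemory"]
print(compare_samples(a, a))  # functional match
print(compare_samples(a, b))  # operational match
```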

The cool aspect of the Operation Chain is that it provides a new level of abstraction. Operations do not literally exist. Instead, they are ways for us to categorize Function Chains so similar that tool developers can switch them out transparently.

Final Operation Chain

It is important to remember that this is but one Operation Chain that the attacker can use to implement this behavior (Process Injection). This analysis aims to provide a concrete foundation for the layers of abstraction we are building. While my initial plan for this series included predicting previously unknown variations, the first step should be to explicitly model KNOWN variations or tools. As a result, I recommend starting with a specific tool of interest. A tool provides a concrete foundation and an onramp to this type of analysis. Your tool of interest may be an open-source tool (like I showed here), a tool in a threat report, or a component or command used by a C2 platform. The cool thing is that by analyzing many different tools, we will start to understand which differences and variations are more profound. For instance, we might analyze two tools that claim to be significantly different, yet they simply implement two different Function Chains that can be summarized by the same Operation Chain (they are synonyms). Alternatively, we may see two tools that purport to be interchangeable from the operator’s perspective, but they implement two different Operation Chains. Over time, our model will grow, and we will have a more robust understanding of the changes attackers can make to fool our detection rules.

If you want to increase your analytical abilities, I recommend creating Tool Graphs for each sample in DGRonpa’s Process_Injection repository. This practice will not only reinforce the skills necessary for building Tool Graphs but also demonstrate the snowball effect of analysis. As you analyze multiple samples, especially samples that implement the same or similar techniques, you will continually encounter the same functions over and over. If you organize the results of your analysis well, you can save yourself a lot of time as you gain experience with new and different samples.

On Detection: Tactical to Functional Series


On Detection: From Tactical to Functional was originally published in Posts By SpecterOps Team Members on Medium, where people are continuing the conversation by highlighting and responding to this story.

The post On Detection: From Tactical to Functional appeared first on Security Boulevard.

On Detection: Tactical to Functional https://securityboulevard.com/2022/09/on-detection-tactical-to-functional-6/ Thu, 29 Sep 2022 14:04:00 +0000 https://medium.com/p/ceb3ad0e3809 Part 7: Synonyms

“Experience is forever in motion, ramifying and unpredictable. In order for us to know anything at all, that thing must have enduring properties. If all things flow, and one can never step into the same river twice — Heraclitus’s phrase is, I believe, a brilliant evocation of the core reality of the right hemisphere’s world — one will always be taken unawares by experience, since nothing being ever repeated, nothing can ever be known. We have to find a way of fixing it as it flies, stepping back from the immediacy of experience, stepping outside the flow. Hence the brain has to attend to the world in two completely different ways, and in so doing to bring two different worlds into being. In the one, we experience — the live, complex, embodied, world of individual, always unique beings, forever in flux, a net of interdependencies, forming and reforming wholes, a world with which we are deeply connected. In the other we ‘experience’ our experience in a special way: a ‘re-presented’ version of it, containing now static, separable, bounded, but essentially fragmented entities, grouped into classes, on which predictions can be based. This kind of attention isolates, fixes and makes each thing explicit by bringing it under the spotlight of attention. In doing so it renders things mechanical, lifeless. But it also enables us for the first time to know, and consequently to learn and to make things. This gives us power.”
- Iain McGilchrist¹

Introduction

Welcome back to the On Detection: Tactical to Functional series. In this article, I hope to continue my investigation into whether detection engineering efforts should be focused on the procedural level. Last time, we defined procedures as “a sequence of operations that, when combined, implement a technique or sub-technique.”² With this definition in mind, we’ve established a hierarchy that includes functions, operations, procedures, sub-techniques, techniques, and tactics. This hierarchy allows us to talk about attacks at numerous levels of resolution or abstraction, but what does that ability provide for us? I opened this article with a quote from Iain McGilchrist’s seminal work on neuroscience, “The Master and His Emissary.” In the book, McGilchrist explores the two hemispheres of the brain and how they “experience” the world differently. As he says in the quote, the right hemisphere experiences the world in motion, where everything is unique and constantly changing. One could argue that even inanimate objects like mountains are constantly changing, but at a rate that is difficult, if not impossible, to perceive.

On the other hand, the left hemisphere experiences objects categorically or as forms. It represents a combination of bricks, wood, furniture, etc., as a house instead of as component parts. McGilchrist posits that there is almost certainly a reason why our brains are split this way and that this phenomenon is not unique to humans but common to many other animal species. In the quote, he mentions that the power of the left brain is that by seeing literally unique instances of objects through a categorical or gestalt lens, we can see their similarities rather than their differences. This feature allows us to experience a world that we understand. Without it, everything would be new, unique, and anomalous. Understanding these categories enables us to predict how they might act in different scenarios. Prediction is the key benefit of the ontological structure we’ve been working to uncover.

This post will introduce a new fundamental layer to our ontology. A layer that represents the instance or literal manifestation of a given tool. Just like our brain, we must realize that every instance of every tool is unique in its own way, but we must simultaneously understand how we can see similarities between unique instances and the implications of said similarities. I will then discuss a cognitive tool that helps me to categorize instances and make decisions for different use cases like detection rule evaluation or purple teaming. I hope you enjoy the post, and as always, I would love to hear your feedback! What resonates with you, what do you agree with, what do you disagree with, etc.

The Literal Level

Thus far in the series, we’ve focused on categories that allow us to appreciate the different variations of a given attack technique or sub-technique, but I feel that we’ve neglected to discuss the instances themselves and create terminology to facilitate this discussion. I want to take a moment to introduce what I call the LITERAL level. As McGilchrist pointed out, from one perspective, everything we encounter in the world is constantly changing and, therefore, always new or different. Traditionally, we might think of two tools (files) as different if they have different cryptographic hash values. This is certainly one measure of difference, but that might not be the only way in which two tools can be differentiated. What if we considered a time element, like when the file was created? How about a metadata element, such as who owns the file or what the file’s name is? Alternatively, we might consider a location element, such as what system the file exists on. From this perspective, there is an infinite set of possible variations. If we could only view the world this way, we wouldn’t have a concept like ATT&CK; everything would be unique, and we wouldn’t have the ability to categorize what we encounter.

There are literally an almost infinite number of ways in which any attack technique can be manifested. Bytes can be added or subtracted, they can be substituted, and they can be changed to present themselves differently in the world. This is the problem we face as defenders. How do we allocate our finite attention to a relatively infinite problem? The key is abstraction, the power of the left hemisphere. The key is to not confuse the instances with the forms. These literal manifestations of an attack are simply reflections of the Platonic form in our world. However, abstraction allows us to shed the specificity of the literal and understand instances based on their similarities instead of their differences. This world of forms is what we’ve been exploring thus far in this series. With this understanding of the distinction between the instances (the literal) and the forms (the functional, operational, procedural, sub-technical, technical, and tactical), we can reduce the infinite variability of the problem we face to something that is hopefully more manageable. Let’s look at how we can classify the literal instantiation of the LSASS Memory sub-technique into the different levels of forms.

Synonyms

While the ideas posed by McGilchrist about two different ways to experience the world are interesting, there’s a question of practicality. How exactly do we measure the difference? Again, one way to view the world is that everything we encounter is different and constantly changing. While this perspective might technically be more closely related to how the world actually is, it makes detection engineering quite tricky. For that reason, it is helpful to consider the tools as static entities that can be grouped based on similarities, but how do we measure similarity?

One tool in our tool kit is cryptographic hashing algorithms like SHA256. These algorithms are excellent for determining whether two files or, more specifically, two sets of data are the same or different. You can submit data to the algorithm, and a resulting “hash” will be produced. The algorithm will generate the same value every time the same input data is provided. The cool thing is that even small changes to the input data produce significant changes to the output value. The SHA256 algorithm can determine if two tools are the same, at least in the sense of their files’ content. The problem is that a hash algorithm only determines if the files are identical or different. It is a binary result. We cannot determine how similar the files are from the files’ hash values.
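A quick demonstration of this binary property, using Python’s standard hashlib:

```python
import hashlib

def digest(data):
    """Return the SHA256 hex digest of a byte string."""
    return hashlib.sha256(data).hexdigest()

nearly_identical_a = b"A" * 100
nearly_identical_b = b"A" * 99 + b"B"  # differs by a single byte
unrelated = b"C" * 100                 # shares no bytes with the first pair

# Identical input always yields an identical digest...
assert digest(nearly_identical_a) == digest(b"A" * 100)

# ...but the digests of different inputs reveal nothing about HOW
# different those inputs are: a one-byte change and a total rewrite
# both produce completely unrelated-looking hash values.
assert digest(nearly_identical_a) != digest(nearly_identical_b)
assert digest(nearly_identical_a) != digest(unrelated)
```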

Maybe we should ask whether a similarity metric even exists in the first place. Imagine we have two sets of two files. The first set consists of two files; the only difference is one byte that was changed. The second set consists of two files that have no bytes in common. These sets will have different SHA256 hash values and are, therefore, literally different, but we can intuitively say that the files in the first set are more similar than those in the second set. This hypothetical situation is meant to present the limit case because it makes the question easier to answer, but what if there were two sets of two files with one hundred bytes changed? Would they ALWAYS have the same amount of similarity? My answer to this question is, of course, “it depends” (what can I say? I’m a consultant). I think it depends on what the changed bytes represent and how it was changed. Maybe the bytes represent a bunch of unused code and don’t actually change the tool’s functionality, but perhaps they are related to its core functionality and make it act entirely differently when executed. So if files can be more or less similar, is there a way to objectively determine similarity?

To answer this question, I’m introducing a concept called “synonyms.” In language, synonyms are two words that, while literally different, have the same meaning. For example, the words “close” and “shut.” The definition for close is “move or cause to move so as to cover an opening,” while the definition of shut is “move something into position so as to block an opening.” As you can see, one can say “please shut the door” or “please close the door” without losing meaning. I was inspired to think about synonyms in a new way while reading Michael V. Wedin’s “Aristotle’s Theory of Substance: The Categories and Metaphysics Zeta.”³ Wedin tells us that Aristotle opens The Categories by describing the three onymies: homonyms, synonyms, and paronyms. Aristotle tells us that two words are synonyms, “when not only they bear the same name, but the name means the same thing in each case — has the same definition corresponding. Thus a man and an ox are called ‘animals.’ The name is the same in both cases; so also the statement of essence. For if you are asked what is meant by their both of them being called ‘animals,’ you give that particular name in both cases the same definition.”⁴

Now I must admit that initially, this both inspired and confused me. I had previously come up with the idea that Out-Minidump⁵ and Sharpdump⁶ were more similar than Out-Minidump and Mimikatz sekurlsa::logonpasswords⁷, but I didn’t have a way to express this idea propositionally. Synonyms seemed to be the key to unlocking a potential solution, but the condition put forth by Aristotle is that both the NAME and the STATEMENT OF ESSENCE be the same. Out-Minidump and Sharpdump don’t seem to fulfill the first condition, the name being the same for both instances, so does this idea fit the example? Let’s look at Aristotle’s example to see if we can better understand the concept. He describes a situation where both “man” and “ox” can equally be described as “animals.” This refers to the nested reality where all things simultaneously exist in many categories. In this case, man and ox are one level of resolution, but “animal” is a higher, or superordinate, category that describes both man and ox equally. So it seems that if two ideas can be described similarly at a higher level of abstraction, we can refer to them as synonyms.

What if we apply this idea to our current understanding of the ontological hierarchy we’ve been building, where there is a literal, functional, procedural, sub-technical, technical, and tactical level of analysis for all instances? Could it be said, for example, that two tools, or instances, are literally different but functionally synonymous? Or maybe two tools are both literally and functionally distinct, but they are procedurally synonymous. I believe this idea can help us understand the degree to which two tools diverge from each other, which might have significant implications for our ability to build and evaluate detection rules. For instance, it seems reasonable to assume that a detection rule that detects two tools that are functionally different but are procedural synonyms is a more robust detection rule than one which can only detect two tools that are functional synonyms. Let’s look at a few examples to see exactly how this concept plays out within the OS Credential Dumping: LSASS Memory⁸ sub-technique and potentially beyond.

Functional Synonyms

The first level of categorization within our hierarchy is the functional level. Here we are concerned with the API functions a tool calls to accomplish its tactical objectives. While there are an infinite number of variations at the literal level, categorization allows us to conceptualize these literal variations as a finite set. In part 5, we calculated 39,333 unique functional variations for the LSASS Memory sub-technique.⁹ While 39,333 represents a significant reduction relative to the presumptively infinite number of variations at the literal level, it is still a daunting value. That being said, the functional level is also the first at which we encounter the synonym phenomenon. We find that two tools can be literally different but functionally synonymous. This means that when the code is analyzed to determine which API functions are called, both tools would be seen to call the same functions in the same order. These tools would therefore be “functional synonyms.”

Example

To help make this idea concrete, we can analyze two tools which are literally different, but functionally the same.

Out-Minidump is a PowerShell script written by Matt Graeber that leverages a technology called “reflection” to allow direct, in-memory, Win32 function calls from PowerShell. Reflection is a popular way for attackers to extend the default functionality of PowerShell. Out-Minidump takes a process identifier and opens a read handle to the target process using kernel32!OpenProcess and uses that handle to create a minidump via dbghelp!MiniDumpWriteDump (Notice that because the process identifier is provided as an argument to Out-Minidump, we don’t know what function was used to enumerate processes, but for simplicity’s sake, we will assume it was ntdll!NtQuerySystemInformation).

Sharpdump is a C# binary that is purpose-built for generating a minidump for the LSASS process. To accomplish this task, Sharpdump calls ntdll!NtQuerySystemInformation to enumerate running processes and find the process identifier for the LSASS process, kernel32!OpenProcess to open a read handle to LSASS, and dbghelp!MiniDumpWriteDump to create the dump file.

The key takeaway here is that although Out-Minidump and Sharpdump are literally different tools (they have different hash values; one is a PowerShell script and the other is a C# binary), they both rely on the exact same function calls to achieve their outcome. Therefore, we can say that these tools are “functional synonyms,” meaning that from the functional point of view, they are the same. Figure 1 shows an analysis of both tools at the functional level, and we can see that they do indeed call identical functions.

Figure 1. Functional Synonyms (Out-Minidump and Sharpdump)
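Encoding the two Function Chains as lists makes the functional-synonym check trivial (NtQuerySystemInformation is assumed for Out-Minidump’s enumeration step, per the note above):

```python
# Function Chains as described in the text.
OUT_MINIDUMP = ["ntdll!NtQuerySystemInformation",
                "kernel32!OpenProcess",
                "dbghelp!MiniDumpWriteDump"]
SHARPDUMP = ["ntdll!NtQuerySystemInformation",
             "kernel32!OpenProcess",
             "dbghelp!MiniDumpWriteDump"]

# Literally different tools (different hashes, different languages),
# but identical call sequences: functional synonyms.
assert OUT_MINIDUMP == SHARPDUMP
print("functional synonyms")
```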

Procedural Synonyms

In part 3 of the series, I introduced the function call graph, which, among other things, demonstrates that developers are spoiled for choice when it comes to selecting API functions to accomplish a task.¹⁰ It turns out that for any task, many functions can often be used to achieve the desired outcome. We saw this, for example, with kernel32!ReadProcessMemory and ntdll!NtReadVirtualMemory. Most tasks have ~6 function options, but we’ve found some tasks with up to 30 related functions. We found it useful to introduce an abstraction layer called operations to allow us to speak about a set of functions that perform a task instead of being stuck with individual functions. This is helpful because saying 30 different function names every time is a mouthful, and sometimes we are more concerned with the outcome than the specific function. We then discovered that operations rarely occur by themselves. Developers typically combine operations to accomplish a technique. These sequences of operations are called procedures, and this is the next level of analysis where we encounter synonyms.

Procedural synonyms are two tools that are literally and functionally different. They neither have the same cryptographic hash nor call the same functions. However, they do sequence the same operations in the same order as each other. Thinking about the detection problem in terms of procedures may be helpful because while there are 39,333 known functional variations, we have only discovered 4 procedural variations. Notice that the number of variations is exponentially reduced with each layer of abstraction! We can therefore say that procedural synonyms are more different from each other than functional synonyms. We can start to see why these distinctions are useful. If one wants to maximize the difference between two test cases, it would be advantageous to use procedural synonyms instead of functional synonyms. Let’s take a look at an example.

Example

In this example, we will compare Out-Minidump to a new tool called Dumpert.¹¹ Dumpert, a tool introduced earlier in the series, was written by the team at Outflank in the Netherlands. The key feature employed by Dumpert is that instead of calling the Win32 functions like Out-Minidump, it does (almost) everything via syscalls. This means that instead of calling ntdll!NtQuerySystemInformation, kernel32!OpenProcess, dbghelp!MiniDumpWriteDump like Out-Minidump, Dumpert calls syscall!NtQuerySystemInformation, syscall!NtOpenProcess, dbghelp!MiniDumpWriteDump (I told you ALMOST everything).

So unlike the comparison between Out-Minidump and Sharpdump, where the tools were literally different but functionally synonymous, Out-Minidump and Dumpert are both literally and functionally different but procedurally synonymous. That’s because ntdll!NtQuerySystemInformation and syscall!NtQuerySystemInformation both implement the Process Enumerate operation, kernel32!OpenProcess and syscall!NtOpenProcess both implement the Process Access operation, and dbghelp!MiniDumpWriteDump, which both tools use, executes the Process Read operation.

In Figure 2, I’ve added an additional layer to represent the operations corresponding to each function. Notice that the functions are different, but the operations are the same, and importantly, their sequence, from left to right, is the same.

Figure 2. Procedural Synonyms (Out-Minidump and Dumpert)
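This relationship can be sketched by projecting each tool’s Function Chain onto its operations; the mapping below covers only the functions named in this example:

```python
# Function-to-operation mapping for the functions named in this example.
FUNC_TO_OP = {
    "ntdll!NtQuerySystemInformation":   "Process Enumerate",
    "syscall!NtQuerySystemInformation": "Process Enumerate",
    "kernel32!OpenProcess":             "Process Access",
    "syscall!NtOpenProcess":            "Process Access",
    "dbghelp!MiniDumpWriteDump":        "Process Read",
}

OUT_MINIDUMP = ["ntdll!NtQuerySystemInformation",
                "kernel32!OpenProcess",
                "dbghelp!MiniDumpWriteDump"]
DUMPERT = ["syscall!NtQuerySystemInformation",
           "syscall!NtOpenProcess",
           "dbghelp!MiniDumpWriteDump"]

def ops(chain):
    """Project a Function Chain onto its Operation Chain."""
    return [FUNC_TO_OP[f] for f in chain]

# Functionally different...
assert OUT_MINIDUMP != DUMPERT
# ...yet procedurally synonymous: the Operation Chains are identical.
assert ops(OUT_MINIDUMP) == ops(DUMPERT) == [
    "Process Enumerate", "Process Access", "Process Read"]
```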

Sub-Technical Synonyms

This type of comparative analysis can continue at the next level. We previously defined procedures as “a sequence of operations that, when combined, implement a technique or sub-technique.” This means that for any sub-technique, there will be at least one but likely many procedures that can be used to implement it. This is the last step before transitioning from a micro, intra-technique level to a macro, inter-technique level of analysis. Tools that differ at the procedural level, meaning they choose and sequence operations differently, but still implement the same sub-technique, can be referred to as sub-technical synonyms. This seems to be the highest level at which an instance (tool) would be relevant to a detection rule because most detection rules don’t transcend the technique, but there are likely exceptions to that rule. Let’s look at an example of two tools that are literally, functionally, and procedurally different but still implement the same sub-technique and are thus sub-technical synonyms.

Example

In this example, we compare Out-Minidump to the approach that James Forshaw introduced in his “Bypassing SACL Auditing on LSASS” post.¹² This procedure can be implemented using James’ NtObjectManager PowerShell module,¹³ but there are a relatively infinite number of possible implementations. The exciting thing is that James went further than Outflank did with Dumpert. Instead of simply changing the function calls, James changed the operations he used to achieve the outcome. While Out-Minidump approaches the problem by sequencing Process Enumerate, Process Access, and Process Read together to implement LSASS Memory Dumping, James adds Handle Copy to the sequence, which in turn becomes Process Enumerate, Process Access, Handle Copy, and Process Read. In the process of adding an operation to the orthodox procedure, James created a new procedure. This small change evades Microsoft’s SACL on LSASS. Therefore, Out-Minidump and NtObjectManager differ at the procedural level but converge at the sub-technical level. This makes them sub-technical synonyms.

Figure 3 compares the tools at the functional, procedural, and sub-technical levels. Notice that despite taking different routes, both tools eventually implement the LSASS Memory sub-technique.

Figure 3. Sub-Technical Synonyms (Out-Minidump and Forshaw Duplicate Token)
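A toy encoding of this convergence; the procedure-to-sub-technique lookup table is a hypothetical construct for illustration, not a real inventory:

```python
# Operation Chains per the text above. Forshaw's approach inserts a
# Handle Copy operation, creating a new procedure.
OUT_MINIDUMP_OPS = ["Process Enumerate", "Process Access", "Process Read"]
FORSHAW_OPS = ["Process Enumerate", "Process Access",
               "Handle Copy", "Process Read"]

# Hypothetical lookup from known procedures to the sub-technique they
# implement (both procedures dump LSASS memory).
PROCEDURE_TO_SUBTECHNIQUE = {
    tuple(OUT_MINIDUMP_OPS): "OS Credential Dumping: LSASS Memory",
    tuple(FORSHAW_OPS):      "OS Credential Dumping: LSASS Memory",
}

# Procedurally different...
assert OUT_MINIDUMP_OPS != FORSHAW_OPS
# ...but converging on the same sub-technique: sub-technical synonyms.
assert (PROCEDURE_TO_SUBTECHNIQUE[tuple(OUT_MINIDUMP_OPS)]
        == PROCEDURE_TO_SUBTECHNIQUE[tuple(FORSHAW_OPS)])
```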

Technical and Tactical Synonyms

From a detection perspective, I believe that after the sub-technical level, we begin to transition from a micro to a macro focus, and as a result, the utility of comparisons at the higher levels of the ontological structure is diminished. However, I thought it might be interesting to conceptually explore how technical and tactical synonyms would manifest. Technical synonyms are tools that perform the same technique (OS Credential Dumping) but differ at all lower levels of the hierarchy. Similarly, tactical synonyms belong within the same tactic grouping (Credential Access) but differ at the technical level and below. This section provides two examples, one demonstrating technical synonyms and one showing tactical synonyms. I found that the graphics begin to lose their impact at this level, so the examples below only include descriptions, which I think adequately demonstrate the point.

Technical Example

A fourth example compares Out-Minidump to Mimikatz’s lsadump::dcsync command.¹⁴ While Out-Minidump implements the LSASS Memory sub-technique, Mimikatz’s lsadump::dcsync goes a different route, establishing its own sub-technique, called DCSync¹⁵. While the LSASS Memory sub-technique focuses on dumping credentials stored in the LSASS process’s memory, DCSync imitates a Domain Controller to fool a legitimate Domain Controller into sending credential information to the compromised system. These tools not only differ literally, functionally, and procedurally, but they differ sub-technically as well. However, while they differ at all those analysis levels, they implement the same technique, OS Credential Dumping. This means that we can refer to these tools as technical synonyms.

Tactical Example

Lastly, we can compare Out-Minidump to Invoke-Kerberoast.¹⁶ Interestingly, both tools are written in PowerShell, but that’s where their similarities end. As we’ve discussed previously, Out-Minidump is a tool specifically used to perform the LSASS Memory sub-technique of the OS Credential Dumping technique. Invoke-Kerberoast, on the other hand, does not implement either the LSASS Memory sub-technique or the OS Credential Dumping technique. Instead, it executes the Kerberoasting sub-technique¹⁷ of the Steal or Forge Kerberos Tickets technique. However, these tools converge at the tactical level as they both implement the Credential Access tactic. When an attacker is interested in obtaining a particular set of credentials, both Out-Minidump and Invoke-Kerberoast are valid choices depending on the details of the user account of interest and other tradecraft considerations. We can now say that Out-Minidump and Invoke-Kerberoast are tactically synonymous because they both implement the Credential Access tactic.

Conclusion

Let us consider why or how this concept of synonyms might be practical. Imagine that you created a detection rule to detect the OS Credential Dumping: LSASS Memory sub-technique. A common strategy for evaluating the efficacy of the resulting detection rule is to test it using many different tools. The idea is that if the detection rule can detect a diverse set of tools that implement the sub-technique, it must be robust. However, we have to be careful because the traditional view of “different” is generally limited to the LITERAL level of analysis. We might look at threat reports and find different tools made by different people that accomplish the same sub-technique and run them, but we never consider just HOW different these tools are from one another.

It is actually quite common in my experience for this process to result in a test where four tools are used to evaluate a detection rule, but the four tools are functional synonyms. This means the tools only differ at the literal level, the most superficial level of analysis. Remember that at the functional level of analysis, we've discovered 39,333 variations, and there are probably more that remain to be found. This means that our test using four different tools only evaluates 1 out of 39,333 possible functional variations. So maybe the first step should be to ensure that when we select four tools to test, the four tools should be procedural synonyms. This means that the tools would implement different functional variations. However, that is still only 4 of 39,333 functional variations. It stands to reason that of all the possible sets of four functional variations, some sets are better representatives of the range of possibilities than others. We can think of this as population sampling in a medical trial. So how might we choose the optimal set?

My proposal is that we look higher in the ontological hierarchy. While we’ve uncovered 39,333 functional variations thus far in our analysis of OS Credential Dumping: LSASS Memory, we’ve only uncovered 4 procedural variations. I’d propose that if we want to maximize the representation of the possible variations for this sub-technique, it would be ideal to select instances (tools) that are procedurally distinct, meaning they would be sub-technical synonyms. This would maximize the diversity of the tools while still maintaining a sub-technique focus. I think of this as basically triangulating the detection because we can’t achieve a perfect picture, but we can estimate how well our detection rule performs. Of course, there is probably still room for nuance during test selection. For instance, within a given procedure, there is likely an ideal functional variation to maximize the value of the test. I haven’t figured out how to determine this yet, but it seems intuitively likely. Let me know what you think.
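The selection strategy proposed above can be sketched in a few lines. This is a toy model, not a real tool inventory: the tool names and the "Kernel Object Callback" procedure label are hypothetical, and "Direct Memory Access" and "Handle Duplication" are procedure labels borrowed from this series.

```python
# Sketch: choosing a procedurally diverse validation set for a detection rule.
# Tool names (tool_a ... tool_e) and the mapping below are hypothetical.
CANDIDATE_TOOLS = {
    "tool_a": "Direct Memory Access",
    "tool_b": "Direct Memory Access",     # functional synonym of tool_a
    "tool_c": "Handle Duplication",
    "tool_d": "Kernel Object Callback",   # hypothetical procedure label
    "tool_e": "Direct Memory Access",
}

def procedurally_diverse_sample(tools):
    """Pick one representative tool per procedure, so the test set spans
    procedural variation rather than merely literal variation."""
    sample = {}
    for tool, procedure in sorted(tools.items()):
        # keep only the first tool seen for each procedure
        sample.setdefault(procedure, tool)
    return sample
```

Testing tool_a, tool_b, and tool_e together exercises a single procedure three times; sampling one tool per procedure covers three procedures with three tools.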

References

[1]: McGilchrist, Iain. (2009). The master and his emissary: The divided brain and the making of the western world. Yale University Press.

[2]: Atkinson, Jared C. (2022, September 8). Part 6: What is a Procedure?. Medium.

[3]: Wedin, Michael V. (2000). Aristotle’s theory of substance. Oxford University Press.

[4]: Aristotle. (1938). Categories (H.P Cooke, Hugh Tredennick, Trans.). Loeb Classical Library 325. Harvard University Press.

[5]: Graeber, Matthew. Out-Minidump. (2013). GitHub repository.

[6]: Schroeder, William. Sharpdump. (2018). GitHub repository.

[7]: Delpy, Benjamin. Mimikatz sekurlsa::logonPasswords. (2014). GitHub repository.

[8]: Williams, E., Millington, E. (2020, February 11). LSASS Memory. MITRE ATT&CK.

[9]: Atkinson, Jared C. (2022, August 18). Part 5: Expanding the operation graph. Medium.

[10]: Atkinson, Jared C. (2022, August 9). Part 3: Expanding the function call graph. Medium.

[11]: Outflank. Dumpert. (2019). GitHub repository.

[12]: Forshaw, James. (2017, October 8). Bypassing SACL Auditing on LSASS.

[13]: Forshaw, James. NtObjectManager. (2016). GitHub repository.

[14]: Delpy, Benjamin; Le Toux, Vincent. Mimikatz lsadump::dcsync. (2017). GitHub repository.

[15]: Le Toux, Vincent; ExtraHop. DCSync. (2020, February 11). MITRE ATT&CK.

[16]: Schroeder, William. Invoke-Kerberoast. (2016). GitHub repository.

[17]: Praetorian. Kerberoasting. (2020, February 11). MITRE ATT&CK.

On Detection: Tactical to Functional Series


On Detection: Tactical to Functional was originally published in Posts By SpecterOps Team Members on Medium, where people are continuing the conversation by highlighting and responding to this story.

The post On Detection: Tactical to Functional appeared first on Security Boulevard.

On Detection: Tactical to Function (Part 6: What is a Procedure?)
https://securityboulevard.com/2022/09/on-detection-tactical-to-function/
Thu, 08 Sep 2022

Physical reality has structures at all levels of metric size from atoms to galaxies. Within the intermediate band of terrestrial sizes, the environment of animals and men is itself structured at various levels of size. At the level of kilometers, the earth is shaped by mountains and hills. At the level of meters, it is formed by boulders and cliffs and canyons, and also by trees. It is still more finely structured at the level of millimeters by pebbles and crystals and particles of soil, and also by leaves and grass blades and plant cells. All these things are structural units of the terrestrial environment, what we loosely call the forms or shapes of our familiar world.

Now, with respect to these units, an essential point of theory must be emphasized. The smaller units are embedded in the larger units by what I call nesting. For example, canyons are nested within mountains; trees are nested within canyons; leaves are nested within trees; and cells are nested within leaves. These are forms within forms both up and down the scale of size. Units are nested within larger units. Things are components of other things. They would constitute a hierarchy except that this hierarchy is not categorical but full of transitions and overlaps. Hence, for the terrestrial environment, there is no special proper unit in terms of which it can be analyzed all at once and for all. There are no atomic units of the world considered as an environment. Instead, there are subordinate and superordinate units. The unit you choose for describing the environment depends on the level of the environment you choose to describe.

- James J. Gibson¹

Introduction

Welcome back to the On Detection: Tactical to Functional blog series. I want to take a slightly different approach in this post. Most of the previous articles in the series have been purely technical in focus. I hope that this post maintains that technical feel, but I feel it is time to do some semantic housekeeping, and the goal is to leverage the technical foundation we've established throughout the series to do it.

Over the past few months, I've observed an increased focus on "procedures," as in the "P" of the acronym TTP, as the optimal level of analysis in Detection Engineering. As a result, I've felt the need to consider whether I agree with the proposition that we should focus our detection efforts on procedures. However, I ran into an issue each time I tried to provide my perspective: I didn't know what a procedure was! When I look at the MITRE ATT&CK matrix, it is easy to understand that tactics are the column headers (things like Persistence, Credential Access, and Lateral Movement) and that techniques are the individual cells (things like Kerberoasting and OS Credential Dumping), but then we tend to hand wave away procedures as "everything below the technique." I've seen procedures defined as tools, specific command lines, steps in an attack, and other things besides. It appears that there are actually two questions hidden in this proposition. First, the ontological question: "what is a procedure?" Second, the practical question: "are procedures the optimal target of detection engineering efforts?" Initially, I intended for this post to answer both questions, but the initial draft was 5,000 words, and frankly, I'm not sure anyone would stick with me for that long in one sitting. Instead, I decided to answer each question in its own article.

In this article, I plan to use the technical work we've put in during the previous six posts to explain precisely how I define the term "procedure." We will then build on that topic in the next post to understand whether procedures are, in fact, the optimal target for detection engineers.

Why is the term “procedure” so ill-defined?

In 1931, the Polish-American scientist Alfred Korzybski published a paper that included the statement, "the map is not the territory."² This phrase is a metaphor that examines the relationship between the model we use to see the world (the map) and the world itself (the territory). This is a fascinating relationship because the world is far too complex for us to perceive completely, so our map must necessarily be lower resolution than the world. However, we would hope that our map roughly approximates reality such that it has utility for problem-solving. If it tells us that Fresno is between Los Angeles and Sacramento, we should expect that Fresno is actually between the other two cities. If it is not, then our decision-making is compromised. This means that our map must be simple enough to comprehend while remaining detailed enough to enable accurate predictions when we face uncertainty. I believe that one of the significant issues in the sub-discipline of Detection and Response is that our map is too low resolution to support sound and accurate predictions. Our perception of the world is shaped by the frame of reference through which we interact with it.

Imagine that we have been trained our entire career to view the cyber world through a three-layered lens (Tactics, Techniques, and Procedures). As a result, we apprehend the cyber world as something composed of three layers and tend to be limited to thinking and talking (but I repeat myself) about it within this three-layer construct. Through this blog series, I have attempted to shed this preconceived structure to explore the ontological structure of this world, or at least estimate it as well as I can. The result has been interesting for me, and I hope for you as well, because we have discovered the existence of at least six layers (functions, operations, procedures, sub-techniques, techniques, and tactics). Consider what must occur when one attempts to view six layers through a three-layered map. The resolution with which one views the world must necessarily be reduced. The nuance and complexity of those six layers are lost or compressed into the three-layer frame of reference.

In some cases, this loss of detail might be spread throughout all three layers of the model (TTP), but in this case, it seems that the first two layers, tactics and techniques, are well aligned to the first two layers in the six-layer model. This means that all of the loss is realized at the procedure level specifically. This is because the three-layer model's version of procedures is responsible for representing the six-layer model's concepts of sequences of operations, operations, and functions. While we all have our own unique maps or perspectives with which we view the problem, I hope that this journey to increase the resolution and detail of my map has the side effect of informing your map, so we are better off as an industry. It is time to unshackle "procedure" from the burden of representing multiple layers and allow it to take its rightful place within the emergent hierarchical structure we've uncovered.

What is a Procedure?

What does the Military Say?

To answer the question "what is a procedure?" I think it is essential to understand where the term originated. Where did the information security field pick up the Tactics, Techniques, and Procedures (TTP) concept? As is common in our specialty, it stems from the Military, so it may be worth consulting them (at least the US DoD) to better understand the intended purpose. My colleague, Robby Winchester (@robwinchester3 on Twitter), explored the idea of procedures in his What is a TTP blog post,³ referring to Department of Defense Joint Publication 1–02, Department of Defense Dictionary of Military and Associated Terms, for the official definitions of each term. JP 1–02 defines Procedures as the "specific detailed instructions and/or directions for accomplishing a task."⁴ This definition is helpful but doesn't clarify precisely where procedures fit into the overall structure of things we've begun to uncover.

Robby goes further in his post to provide a metaphor about car ownership that we can use to better understand the idea. He labels high-level considerations like "providing fuel," "cleaning the interior," and "preventative maintenance" as examples at the Tactic level. He then changes focus to the preventative maintenance Tactic and explores the Techniques that achieve this Tactical objective. Examples provided include "changing the oil," "rotating tires," and "replacing the brakes." The last level of analysis is Procedures, where one would expect "detailed instructions and/or directions" to implement a Technique like changing the oil.⁵ To correctly interpret this definition, we must first understand what is meant by "detailed instructions and/or directions."

My Thought Process

One option for interpretation is that it could mean something as specific as "run this tool with this command line." However, I believe there is a technical issue that prevents this interpretation. So far, all the different concepts we've discussed are categories. You can think of them as classes in object-oriented programming, or you can think of them as Platonic forms (I prefer this interpretation). These forms are not concrete or tangible. Instead, they are patterns that must be instantiated in order to exist in the world. Tools are the instantiation of the class or the form. For an attacker to execute a technique, they MUST instantiate the form through a tool. Tools are simply idiosyncratic instantiations of the technique, not the technique itself.

To me, procedures belong to the forms, not the instances. The procedures are the pattern of steps to execute, not the execution of the steps. Therefore, it seems that the “detailed instructions” aspect of the procedure definition might look something like the list below:

  1. Enumerate processes to obtain the process identifier for lsass.exe.
  2. Open a handle to lsass.exe with the PROCESS_VM_READ access right.
  3. Read the memory of lsass.exe to obtain the credentials stored within.

The vital point here is not the steps; it is what the steps represent. Notice that each step corresponds with an operation that we’ve discussed previously. Step 1 corresponds with the Process Enumerate operation, step 2 corresponds with the Process Access operation, and step 3 corresponds with the Process Read operation. The steps define a sequence of operations that implements a sub-technique, specifically, the sequence of operations that is instantiated by Mimikatz’s sekurlsa::logonPasswords command.⁶
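The step-to-operation correspondence above can be sketched as a simple mapping. This is a toy model for illustration: the function-to-operation table covers only the functions discussed in this series, and the Mimikatz and Dumpert call sequences are the ones given in these posts.

```python
# Sketch: abstracting a concrete function call sequence into its
# operation chain (i.e., the procedure). Mapping is intentionally partial.
FUNCTION_TO_OPERATION = {
    "ntdll!NtQuerySystemInformation": "Process Enumerate",
    "syscall!NtQuerySystemInformation": "Process Enumerate",
    "kernel32!OpenProcess": "Process Access",
    "syscall!NtOpenProcess": "Process Access",
    "kernel32!ReadProcessMemory": "Process Read",
    "dbghelp!MiniDumpWriteDump": "Process Read",
}

def operation_chain(function_calls):
    """Collapse a tool's function call sequence to its operation chain."""
    return [FUNCTION_TO_OPERATION[f] for f in function_calls]

# Call sequences as described in this series.
mimikatz = ["ntdll!NtQuerySystemInformation", "kernel32!OpenProcess",
            "kernel32!ReadProcessMemory"]
dumpert = ["syscall!NtQuerySystemInformation", "syscall!NtOpenProcess",
           "dbghelp!MiniDumpWriteDump"]
```

Both tools, though literally and functionally different, collapse to the same Process Enumerate -> Process Access -> Process Read chain, which is exactly why the operational level is useful for describing the procedure.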

As I mentioned in the introduction, I’ve, until now, found it difficult to pin down exactly what a procedure is or at least should be. I feel that the structure that we’ve uncovered throughout this series finally provides the foundation to allow me to state precisely what I believe a procedure is. My answer is that a procedure is not a tool or a command line argument.

A procedure is “a sequence of operations that, when combined, implement a technique or sub-technique.”

Comparing the Definition to the Operation Graph

We can now apply this definition to our analysis of the operation graph for LSASS Memory dumping that we discussed in the previous post,⁷ and that is shown below as Figure 1:

Figure 1. LSASS Memory Operation Graph

This graph shows four unique sequences of operations that can be used to accomplish the sub-technique. Based on our new definition for procedures, these four sequences of operations ARE procedures.

The Ontological Hierarchy

This series has explored the ontological hierarchy of things in the cyber world. We talked about the function call stack,⁸ identified the functions called by Mimikatz,⁹ abstracted functions into operations,¹⁰ combined them to make function call graphs,¹¹ observed how operations can be connected as procedures, and how numerous procedures exist that can fulfill the requirements of a given sub-technique.¹² While each of these concepts can hopefully stand alone, I have found it helpful to generate a graphic that visualizes the hierarchical relationship between these concepts and demonstrates that they fit together to form a coherent structure. The image is shared as Figure 2.

Figure 2. Conceptual Taxonomic Structure from the Tactical to the Functional

Note: For brevity’s sake, and because I have yet to do the work, this hierarchical image only expands one node at each level. Imagine that each node at each level has a relatively similar number of options that it abstracts. Hopefully, this allows you to imagine the sheer extent of the possibilities available to attackers.

The image shows the different levels of abstraction that are simultaneously occurring. So when someone chooses to run Mimikatz sekurlsa::logonPasswords to dump credentials, it is simultaneously true that they are executing kernel32!ReadProcessMemory¹³ (function), Process Read (operation), Direct Memory Access (procedure), LSASS Memory¹⁴ (sub-technique), OS Credential Dumping¹⁵ (technique), and Credential Access¹⁶ (tactic).
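The simultaneous descriptions listed above can be represented as a small lookup structure. This is only a sketch of the article's own example chain for sekurlsa::logonPasswords; a full model would hold many nodes per level.

```python
# Sketch: one concrete action described at every level of the hierarchy,
# using the chain given in the article for Mimikatz sekurlsa::logonPasswords.
HIERARCHY = [
    ("function", "kernel32!ReadProcessMemory"),
    ("operation", "Process Read"),
    ("procedure", "Direct Memory Access"),
    ("sub-technique", "LSASS Memory"),
    ("technique", "OS Credential Dumping"),
    ("tactic", "Credential Access"),
]

def describe(level):
    """Return the label for this action at the requested abstraction level."""
    return dict(HIERARCHY)[level]
```

The point of the structure is that all six statements are true at once; which one you use depends on the level at which you choose to describe the activity.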

The nested structure of things

The quote that opened this post was written by James J. Gibson in his book The Ecological Approach to Visual Perception. What I love about it, especially in this context, is that it helps us to see that this nested structure is not unique to this particular use case. The nested structure is present in everything we interact with daily. Just as Gibson observed, the cell is nested within the leaf, the leaf within the branch, the branch within the tree, the tree within the canyon, and the canyon within the mountain.¹⁷ We can observe that the function implements the operation, operations can be combined to implement the procedure, the procedure implements the sub-technique, the sub-technique implements the technique, the technique implements the tactic, and tactics are integrated to implement the attack chain.

Put more explicitly, functions represent the base level of the structure, at least thus far. Functions are interesting because they are about as tangible as things get within the context of software. They represent the doors through which an application can interact with the operating system and the associated hardware.

We then discovered that "functions" implement "operations." Operations are the first level of abstraction in our structure. They are beneficial because numerous functions can be used to achieve the same operational result. For instance, both kernel32!ReadProcessMemory and ntdll!NtReadVirtualMemory implement the Process Read operation. Operations are interesting because, in some sense, they form the basic building blocks of applications; programs combine individual operations to accomplish tasks.

Based on the conclusion of this article, when operations are combined to achieve a technical objective, they become procedures. For instance, we saw that Mimikatz combines the Process Enumerate, Process Access, and Process Read operations to dump credentials from LSASS. This leads us to understand that procedures implement a "technique" or a "sub-technique." The aforementioned procedure implements the LSASS Memory sub-technique, but it is essential to remember that there is often more than one procedure for any given technique or sub-technique. Sub-techniques, when applicable, are specific manifestations of techniques, like OS Credential Dumping.

Techniques ultimately are implementations of tactics. For instance, OS Credential Dumping is one technique that implements the Credential Access tactic. Lastly, attackers can combine and order tactics to achieve their overall objectives. Our prior work now provides a coherent structure allowing us to connect the abstract tactical layer to the concrete functional layer. Specifically, we have language that enables precise speech when discussing these ideas.

Conclusion

I hope my explorations of the hierarchy, as I see it, allow for expanding the language used to discuss these details and increasing the definitional resolution of specific terms currently in our lexicon (like procedures). Whether you agree with my new/refined use of “procedures” or not, I hope you can at least appreciate my attempt to define a term that, based on my observations, we universally agree is overly broad by giving it specific technical meaning. Either way, I will use “procedures” to mean “a sequence of operations that, when combined, implement a technique or sub-technique” in my future posts, presentations, and Twitter debates.

I’d like to leave you with a parting thought. I’m going to ask you a question about Figure 3 below, and for this experiment to work, you mustn’t read the next paragraph until AFTER you have answered the question (this should not be a hard requirement to meet). When you look at this image, what object do you see?

Figure 3.

There is a concept in social psychology called “experiential hierarchies.” This idea focuses on the categorical nesting of things. Different subjects can apprehend an individual object at different levels within the hierarchy. It is possible to interpret the object in the image at different levels of abstraction. For instance, it is simultaneously an object, a vehicle, a car, a race car, a Formula 1 race car, a Red Bull Racing Formula 1 race car, and Max Verstappen’s Red Bull Racing Formula 1 race car. Each level increases the resolution with which one considers the object.

Two questions that I want you to consider between now and my next post are:

  1. Why did you choose the level you did?
  2. Is there an optimal hierarchical level at which you should view things?

References

[1]: Gibson, James J. (1986). The Ecological Approach to Visual Perception. Psychology Press.

[2]: Korzybski, Alfred (1933). Science and Sanity: An Introduction to Non-Aristotelian Systems and General Semantics. International Non-Aristotelian Library Publishing Company.

[3]: Winchester, Robby. (2017, September 27). What's in a name? TTPs in Info Sec. Medium.

[4]: Joint Chiefs of Staff. (2010). Department of Defense dictionary of military and associated terms (JP 1–02).

[5]: Winchester, R. (2017, September 27). What’s in a name? TTPs in Info Sec. Medium.

[6]: Delpy, B. (2014). Mimikatz. GitHub repository.

[7]: Atkinson, Jared C. (2022, August 18). Part 5: Expanding the operation graph. Medium.

[8]: Atkinson, Jared C. (2022, June 27). Understanding the function call stack. Medium.

[9]: Atkinson, Jared C. (2022, July 19). Part 1: Discovering API function usage through source code review. Medium.

[10]: Atkinson, Jared C. (2022, August 4). Part 2: Operations. Medium.

[11]: Atkinson, Jared C. (2022, August 9). Part 3: Expanding the function call graph. Medium.

[12]: Atkinson, Jared C. (2022, August 18). Part 5: Expanding the operation graph. Medium.

[13]: Microsoft. (2022, May 13). ReadProcessMemory function. Microsoft Docs.

[14]: Williams, E., Millington, E. (2020, February 11). LSASS Memory. MITRE ATT&CK.

[15]: Williams, E., Le Toux, V. (2017, May 31). OS Credential Dumping. MITRE ATT&CK.

[16]: MITRE ATT&CK. (2018, October 17). Credential Access. MITRE ATT&CK.

[17]: Gibson, James J. (1986). The Ecological Approach to Visual Perception. Psychology Press.

On Detection: Tactical to Functional (Part 5: Expanding the Operation Graph)
https://securityboulevard.com/2022/08/on-detection-tactical-to-functional-5/
Thu, 18 Aug 2022

Welcome back to the On Detection: Tactical to Functional blog series. Previously we discussed operations and sequences of operations that I call operation paths. This article will explore the idea that while there must be at least one operation path for any given technique or sub-technique, there can be many. When there are many operation paths, these paths can be combined to form an operation graph representing the different sequences of operations that attackers can use to perform a technique or sub-technique. This post will explore this concept and look at real-world examples of attackers implementing alternative operation paths to evade certain detective or preventative controls. If we understand the operational options and the reasons why an attacker might prefer one path over another, we can make better predictions about the types of variations we can expect to see. I hope you enjoy this article, and as always, I would love feedback and/or discussion about the ideas I have presented in this series.

Introduction

The second article in this series introduced the concept of Operations. Operations act as containers for teleologically equivalent functions. In other words, if two functions produce the same output (say, generating a handle to a process), they accomplish the same operation (Process Access, in this case).

The article compared two tools, Mimikatz and Dumpert, to demonstrate how operations are useful. Mimikatz, specifically the sekurlsa::logonPasswords command, makes three function calls to accomplish the OS Credential Dumping technique. Those function calls are ntdll!NtQuerySystemInformation, kernel32!OpenProcess, and kernel32!ReadProcessMemory. This sequence of function calls is shown in Figure 1 below:

Figure 1. Mimikatz sekurlsa::logonPasswords Function Call Sequence

On the other hand, Dumpert accomplishes the same outcome, reading credentials from LSASS memory, while using three different function calls than Mimikatz. The function calls used by Dumpert are syscall!NtQuerySystemInformation, syscall!NtOpenProcess, and dbghelp!MiniDumpWriteDump. The sequence of function calls made by Dumpert is shown in Figure 2 below:

Figure 2. Dumpert Function Call Sequence

The article continued to demonstrate how to use abstract operations in place of specific functions to describe the functional combinations possible to accomplish OS Credential Dumping: LSASS Memory. The sequence of operations shown in Figure 3 applies to Mimikatz and Dumpert’s implementations.

Figure 3. Mimikatz sekurlsa::logonPasswords and Dumpert Operation Path

Each operation acts as an abstraction layer for a function call graph. Therefore, a function call graph exists for the Process Enumerate operation, a function call graph exists for the Process Access operation, a function call graph exists for the Process Read operation, and many more function call graphs exist for operations we have not yet considered.

In the original post that introduced operations, we used our current understanding of the relevant function call graphs (Process Enumerate, Process Access, and Process Read) to calculate the number of functional combinations possible for this sequence of operations. We calculated that this single sequence of operations contained 192 possible variations of functions (4 Process Enumerate functions, 6 Process Access functions, and 8 Process Read functions) that could be combined to accomplish this task.

However, in the third article, we discussed integrating new information into existing function call graphs to expand our understanding of the territory of each operation. This integration increased the number of functions represented in the Process Enumerate function call graph from 4 to 21 functions. Using the function call graphs shown below in Figure 4 (Process Enumerate), Figure 5 (Process Access), and Figure 6 (Process Read), we can recalculate the number of functional variations contained within this single operational sequence as 1,008 (21 Process Enumerate, 6 Process Access, and 8 Process Read).

Figure 4. Process Enumerate Function Call Graph
Figure 5. Process Access Function Call Graph
Figure 6. Process Read Function Call Graph

Through the power of abstraction, we can represent 1,008 unique variations (at the functional level) as one variation at a higher level of abstraction (the operational level).
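The arithmetic behind these counts can be reproduced in a short sketch. The per-operation function counts are the ones given in the article (the sizes of each operation's function call graph at each stage of the analysis).

```python
# Sketch: counting functional variations for the
# Process Enumerate -> Process Access -> Process Read operation path.
def functional_variations(*function_counts):
    """Multiply the number of interchangeable functions per operation."""
    total = 1
    for count in function_counts:
        total *= count
    return total

# 4 Process Enumerate, 6 Process Access, 8 Process Read functions
initial = functional_variations(4, 6, 8)

# After expanding the Process Enumerate graph from 4 to 21 functions
expanded = functional_variations(21, 6, 8)
```

The single operational sequence abstracts 192 functional variations under the early graphs and 1,008 once the Process Enumerate graph is expanded.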

As I work through different examples in this post, I will explain my interpretation of why the authors made certain “tradecraft” decisions — starting with Dumpert as an example.

It is worth reiterating that one of the primary reasons the team at Outflank decided to use different function calls, specifically replacing kernel32!OpenProcess with syscall!NtOpenProcess, was that they noticed that many EDR vendors focused on detecting the Process Access operation of this attack. However, they also saw that many sensors only monitored the more superficial layers (OpenProcess, the Win32 API function, rather than NtOpenProcess, the underlying syscall), even though these functions are functionally equivalent.

In the most recent post in this series, we introduced the concept of compound functions. Unlike simple functions, like kernel32!OpenProcess or kernel32!ReadProcessMemory, which execute one and only one operation, compound functions are individual functions that perform multiple operations. kernel32!Toolhelp32ReadProcessMemory, which we explored, performs both Process Access and Process Read. Since it executes two of the three operations necessary to perform OS Credential Dumping: LSASS Memory, we can imagine a NEW pseudo-operational path consisting of Process Enumerate -> Toolhelp32ReadProcessMemory, as shown below in Figure 7.

Figure 7. Pseudo-operational Path using the kernel32!Toolhelp32ReadProcessMemory Compound Function

Twenty-one additional variations are possible as a result of the Process Enumerate (21) -> [Process Access + Process Read] (1) pseudo-operational path. The addition of this pseudo-operational path brings the count to 1,029 possible functional variations.

It is essential to realize that because Toolhelp32ReadProcessMemory contains two operations, it is only a valid option for any operational sequence that includes this subsequence (Process Access -> Process Read). So although the function call graphs for both the Process Access AND Process Read operations include the Toolhelp32ReadProcessMemory function, it is not a valid choice for all operational paths that use one or the other. For instance, there may be future operational paths that we discover that have a subsequence of Process Access -> Process Write (something like Process Injection, maybe). Therefore, Toolhelp32ReadProcessMemory would not be a valid option for this operational path.
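This validity rule, and the updated count, can be expressed directly. The sketch below is a toy model: it treats a compound function as substitutable only where its operations appear as a contiguous subsequence of the path, and the "Process Write" path is the hypothetical one mentioned above, not an analyzed procedure.

```python
# Sketch: a compound function is only a valid substitution within an
# operation path that contains its operations as a contiguous subsequence.
def contains_subsequence(path, sub):
    return any(path[i:i + len(sub)] == sub
               for i in range(len(path) - len(sub) + 1))

# kernel32!Toolhelp32ReadProcessMemory covers these two operations:
TOOLHELP_OPS = ["Process Access", "Process Read"]

lsass_dump = ["Process Enumerate", "Process Access", "Process Read"]
hypothetical_injection = ["Process Enumerate", "Process Access", "Process Write"]

# Updated count: 21 * 6 * 8 simple-function variations, plus 21 variations
# from the Process Enumerate -> Toolhelp32ReadProcessMemory pseudo path.
total_variations = 21 * 6 * 8 + 21
```

The compound function is valid for the LSASS dump path but not for the hypothetical Process Access -> Process Write path, and the pseudo path raises the total to 1,029 functional variations.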

What if a technique or sub-technique (like OS Credential Dumping: LSASS Memory) contained not one but potentially many valid operational paths? We have seen with Dumpert that attackers can change their choices at the functional level of resolution to evade detection, but what if they could change their tradecraft at the operational level as well? This blog post explores some examples of operational variations. It explains how we can integrate that into our map and what that means for us as we strive to increase our understanding of this particular technique and tradecraft.

Alternative Operational Paths

In Part 2, we discussed how attackers, like the team from Outflank, could make functional changes to evade detection, but I had never considered that changes were possible at the operational layer. Considering that the concept of the operational layer did not exist at the time, who could blame me? I did, however, know of a blog post that presented a slightly modified way to dump credentials that I knew I had to integrate into the model. Still, I was having a tough time with it until it dawned on me that maybe this approach was presenting an alternate operational path. This next section will introduce that approach from James Forshaw and a couple of other alternatives I have discovered in the aftermath of figuring this out.

James Forshaw’s Handle Duplication Method

As mentioned earlier, it felt terrific to identify the 1,029 functional variations of OS Credential Dumping: LSASS Memory. Still, I knew that James Forshaw had written a blog post (shared below) about an approach that allowed him to bypass the SACL that Microsoft added to LSASS in Windows 10.

Bypassing SACL Auditing on LSASS

The SACL specifically targeted the Process Access operation. Still, James found that it only reported instances where the PROCESS_VM_READ access right was included in the request (he has much more detail in his post, so please check it out). As a result, James found that he could open a handle to LSASS (Process Access). However, he could open it with only the PROCESS_DUP_HANDLE access right, which would not trigger the SACL. James could then use a trick with NtDuplicateObject where an application can derive a full access handle to the process, which, in this case, is LSASS. Then he could read from this new “duplicate” handle without triggering the SACL. The series of functions James used in his POC is below in Figure 8:

Note: James’ example in the blog post did not specify which function he used to enumerate the process identifier of LSASS or to read from the resulting handle, so I took a few liberties in creating Figure 8. They should be representative of functions he could or likely would have used.

Figure 8. James Forshaw’s Handle Duplication Function Call Sequence

This sequence of function calls presented a dilemma. How could there be concordance between a sequence of four functions and three operations? At first, I chose to ignore it, but it kept bothering me. Eventually, I realized that this was an alternative operation path. James’ approach followed the Process Enumerate -> Process Access -> Handle Copy -> Process Read operation path. Notice that this is VERY similar to the Process Enumerate -> Process Access -> Process Read operation path, but with a Handle Copy operation inserted to bypass the SACL. This small change could be the difference between visibility and invisibility. This approach evades any detection strategy that relies on the LSASS SACL. Figure 9 shows the operation path that James’ method introduced.

Figure 9. James Forshaw’s Handle Duplication Method Operation Path
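To make the evasion concrete, the two chains can be modeled as plain data. The Python sketch below is purely illustrative: the operation names come from this article, while the `sacl_alert` detection logic is a hypothetical stand-in for the SACL behavior James described (auditing only handle requests that include PROCESS_VM_READ).

```python
# Illustrative model: operation paths as lists of (operation, attributes).
direct_path = [
    ("Process Enumerate", {}),
    ("Process Access", {"rights": {"PROCESS_VM_READ"}}),
    ("Process Read", {}),
]

# Forshaw's variation: request only PROCESS_DUP_HANDLE, then insert a
# Handle Copy operation to derive a full-access handle before reading.
handle_dup_path = [
    ("Process Enumerate", {}),
    ("Process Access", {"rights": {"PROCESS_DUP_HANDLE"}}),
    ("Handle Copy", {}),
    ("Process Read", {}),
]

def sacl_alert(path):
    """Hypothetical SACL: fires only on PROCESS_VM_READ handle requests."""
    return any(
        op == "Process Access" and "PROCESS_VM_READ" in attrs.get("rights", set())
        for op, attrs in path
    )

print(sacl_alert(direct_path))      # True: the direct read trips the audit
print(sacl_alert(handle_dup_path))  # False: the duplicated handle evades it
```

The single inserted Handle Copy operation is what moves the chain out of the detection's field of view, which is exactly why detections keyed to a single operation are brittle.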

Bill Demirkapi’s Process Forking

After realizing that multiple operation paths are possible, we should attempt to find as many alternative operation paths as possible. One interesting article by Bill Demirkapi describes a different evasion approach. In this case, Bill was worried about a subset of anti-virus products that would filter access rights from process handle requests, especially to sensitive processes like LSASS. He mentioned that 9 of the 13 possible access rights are often filtered and therefore unreliable when facing these products. This constraint set him off on a journey to discover whether any of the remaining four access rights could be helpful towards achieving some end, such as dumping credentials from the LSASS process. A link to his blog post describing his approach is below:

Abusing Windows' Implementation of Fork() for Stealthy Memory Operations

Bill found that most products would leave the PROCESS_CREATE_PROCESS access right unfiltered. This access right allows the caller to create a process using the resulting handle. This access right might sound familiar to any developer who has used parent process spoofing, as Bill details in his post. Bill ultimately found that this access right allows for the creation of a child process called a fork. Specifically, when a process is forked, the fork inherits handles to private memory regions of the forked (parent) process, in this case LSASS.

Forking the LSASS process allows Bill to, among other exciting techniques, dump credentials from LSASS. The cool thing is that even if the target has anti-virus or EDR products that filter handle requests to sensitive processes, as long as they allow the handle to have the PROCESS_CREATE_PROCESS access right, the attacker can create a fork of LSASS and ultimately read the credentials via the fork process without ever obtaining a “read handle” to LSASS.

Figure 10. ForkPlayground’s Function Call Sequence

This approach presents a third operation path. This path includes the following operation sequence of Process Enumerate -> Process Access -> Process Create -> Process Read, which is shown below in Figure 11:

Figure 11. ForkPlayground’s Operation Path

Matteo Malvica’s LSASS Snapshotting

The third and final alternate operation path to explore in this post was discovered and written about by Matteo Malvica and b4rtik. In his blog post, shared below, Matteo mentions that he and a colleague were having issues dumping credentials from LSASS without triggering a built-in Windows Defender Advanced Threat Protection alert. Even tools like Dumpert did not seem to bypass the alert, so they were stumped for a solution. Interestingly, it appeared that this alert was targeting the Process Read operation instead of the Process Access operation that many other products target (there is some good discussion, which we will save for a future article, about which operation we should use as a foundation for detective or preventative controls). So what did they do?

Evading WinDefender ATP credential-theft: a hit after a hit-and-miss start

They eventually discovered a function called PssCaptureSnapshot, which allowed them to create a snapshot of the process and then read from the snapshot instead of the process itself. This way, detective controls focused on the Process Read of LSASS miss the behavior because this approach performs the Process Read operation on the snapshot of LSASS instead of on the LSASS process itself. To do this, the sample code project ATPMiniDump calls ntdll!ZwQuerySystemInformation, ntdll!ZwOpenProcess, kernel32!PssCaptureSnapshot, and dbghelp!MiniDumpWriteDump. This sequence of function calls is shown in Figure 12 below:

Figure 12. ATPMiniDump’s Function Call Sequence

We can then abstract these functions into operations where the new operation path is Process Enumerate -> Process Access -> Snapshot Create -> Process Read. ATPMiniDump demonstrates the fourth and final operation path we will introduce in this blog post, shown in Figure 13.

Figure 13. ATPMiniDump’s Operation Path

The Operation Graph

We now have four known operation paths developers can use to perform the OS Credential Dumping: LSASS Memory sub-technique. It is crucial that we do not assume that these four paths represent ALL possible operation paths. Instead, they represent the known operation paths. That said, we can now combine the four individual operation paths to form an operation graph, shown in Figure 14:

Figure 14. OS Credential Dumping: LSASS Memory’s Operational Graph
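To illustrate how the individual paths merge into a graph, the sketch below (hypothetical Python, with operation names taken from this article) builds an adjacency list from the four known paths and then recovers every Process Enumerate to Process Read path from the combined graph:

```python
# The four known operation paths for OS Credential Dumping: LSASS Memory.
paths = [
    ["Process Enumerate", "Process Access", "Process Read"],                     # Direct Memory Access
    ["Process Enumerate", "Process Access", "Handle Copy", "Process Read"],      # Forshaw
    ["Process Enumerate", "Process Access", "Process Create", "Process Read"],   # Demirkapi
    ["Process Enumerate", "Process Access", "Snapshot Create", "Process Read"],  # Malvica/b4rtik
]

# Merge the paths into a single operation graph (adjacency list).
graph = {}
for path in paths:
    for src, dst in zip(path, path[1:]):
        graph.setdefault(src, set()).add(dst)

def all_paths(node, goal, seen=()):
    """Enumerate every acyclic path from node to goal through the graph."""
    if node == goal:
        yield (*seen, node)
        return
    for nxt in sorted(graph.get(node, ())):
        if nxt not in seen:
            yield from all_paths(nxt, goal, (*seen, node))

recovered = list(all_paths("Process Enumerate", "Process Read"))
print(len(recovered))  # 4
```

Because the four paths share the Process Enumerate -> Process Access prefix and the Process Read suffix, the graph compresses them into a single branching structure, which is what Figure 14 depicts.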

If one operation path, Process Enumerate -> Process Access -> Process Read, had 1,029 functional variations, imagine how many functional variations are represented by the four operation paths in this graph. Remember that each operation is an abstraction of a function call graph. To perform this calculation, we will first need to create the function call graphs for the new operations (Handle Copy, Process Create, and Snapshot Create), shown below in Figure 15, Figure 16, and Figure 17, respectively.

Figure 15. Handle Copy Function Call Graph (6 Functions)
Figure 16. Process Create Function Call Graph (28 Functions)
Figure 17. Partial Snapshot Create Function Call Graph (4 Functions)

Note: Notice the addition of a “partial” label to the Snapshot Create function call graph. That is because PssCaptureSnapshot is a compound function that will require further analysis; however, that analysis could distract from the point of this article (the existence of multiple operation paths, which can be combined to form an operation graph), so the article treats it as a simple function.

Using these new function call graphs, we can now calculate the number of functional variations for each operation path in our operation graph. To demonstrate this, we will use shorthand abbreviations for operation names. Below is a list of operations included in our graph and their associated abbreviation:

  • Handle Copy (Figure 15): HC has six (6) functional variations
  • Process Access (Figure 5): PA has six (6) functional variations
  • Process Create (Figure 16): PC has twenty-eight (28) functional variations
  • Process Enumerate (Figure 4): PE has twenty-one (21) functional variations
  • Process Read (Figure 6): PR has eight (8) functional variations
  • Snapshot Create (Figure 17): SsC has four (4) functional variations

We can then calculate the number of functional variations represented by each operational path.

  • Direct Memory Access: PE (21) x PA (6) x PR (8) = 1,008
  • Toolhelp32ReadProcessMemory: PE (21) x [PA -> PR] (1) = 21
  • Handle Duplication: PE (21) x PA (6) x HC (6) x PR (8) = 6,048
  • Process Forking: PE (21) x PA (6) x PC (28) x PR (8) = 28,224
  • Process Snapshotting: PE (21) x PA (6) x SsC (4) x PR (8) = 4,032

By adding the number of variations for each operation path, we find 39,333 total functional variations across the operational variations of the OS Credential Dumping: LSASS Memory sub-technique.

Note: This is, of course, based on our current knowledge shown in the operational graph and function call graphs shared in this article. We should assume that this is merely a subset of the total, but at least we have something tangible to start.
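The arithmetic above is easy to reproduce. The short Python snippet below simply re-derives the per-path and total counts from the entry-point numbers listed earlier (the path labels are shorthand for this article, not code from any of the referenced tools):

```python
from math import prod

# Function entry points per operation (Figures 4-6 and 15-17).
ops = {"PE": 21, "PA": 6, "PR": 8, "HC": 6, "PC": 28, "SsC": 4}

paths = {
    "Direct Memory Access":        ["PE", "PA", "PR"],
    "Toolhelp32ReadProcessMemory": ["PE"],  # the PA -> PR pair is one compound variant
    "Handle Duplication":          ["PE", "PA", "HC", "PR"],
    "Process Forking":             ["PE", "PA", "PC", "PR"],
    "Process Snapshotting":        ["PE", "PA", "SsC", "PR"],
}

# Multiply entry points along each path, then sum across paths.
totals = {name: prod(ops[o] for o in path) for name, path in paths.items()}
print(totals["Direct Memory Access"])  # 1008
print(sum(totals.values()))            # 39333
```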

Conclusion

This article expanded on the “operations” idea presented in Part 2. The starting hypothesis was that each technique or sub-technique has one and only one sequence of operations. In Part 2, we saw that the sequence used by Mimikatz sekurlsa::logonPasswords was PE -> PA -> PR. Still, in this post, we explored three alternative approaches/tools that did not fit into our preconceived notion of there being only one valid operational path. Specifically, we found that James Forshaw was able to insert an additional operation into the sequence to evade a specific detection approach. We repeated the process for two additional examples.

The key takeaway is that at the Operational level of analysis, a single technique or sub-technique can have many valid operation paths, which we can graph to form an operation graph. We then know that each operation is an abstraction of an underlying function call graph, which serves as our map of the different functional choices to accomplish the operation. We postulate that we can calculate the number of functional variations for a given operation path by multiplying the number of entry points in each function call graph. For example, Mimikatz and Dumpert follow the Direct Memory Access operation path, which uses the Process Enumerate, Process Access, and Process Read operations. If Process Enumerate has 21 function entry points, Process Access has 6, and Process Read has 8, then the total number of functional variations for this operation path is 21 x 6 x 8, or 1,008. We can sum the number of functional variations from each operation path to calculate the total number of functional variations for the sub-technique. The exciting thing is that based on this calculation, there are 39,333 total variations of the OS Credential Dumping: LSASS Memory sub-technique at the functional level. However, these variations can be represented abstractly as only four unique variations at the operational level.

We will close with this observation. When we talk about an instance of an attack, maybe we are analyzing a breach report; we can think about it at the tactic level, “we observed that the attacker used the Credential Access tactic.” We can think about it at the technique level, “we observed that the attacker used the OS Credential Dumping technique.” We can think about it at the sub-technique level, “we observed that the attacker used the LSASS Memory sub-technique.” We can think about it at the operational level, “we observed that the attacker used Direct Memory Access (PE -> PA -> PR).” Alternatively, we can think about it at the functional level, “we observed that the attacker called syscall!NtQuerySystemInformation, syscall!NtOpenProcess, and dbghelp!MiniDumpWriteDump”. The question is, “which level of analysis is the MOST appropriate level for the task at hand?” Let me know what you think in the comments or on Twitter.

On Detection: Tactical to Functional Series


On Detection: Tactical to Functional was originally published in Posts By SpecterOps Team Members on Medium, where people are continuing the conversation by highlighting and responding to this story.

The post On Detection: Tactical to Functional appeared first on Security Boulevard.

On Detection: Tactical to Functional https://securityboulevard.com/2022/08/on-detection-tactical-to-functional-4/ Tue, 16 Aug 2022 14:18:03 +0000 https://medium.com/p/5ff667af633b Part 4: Compound Functions

Introduction

Welcome back to the On Detection: Tactical to Functional series (links to all posts are at the bottom of the post). Thus far, we’ve explored the OS Credential Dumping: LSASS Memory sub-technique, specifically mimikatz, as an example to understand how this sub-technique works. The first post focused on identifying the API functions that the mimikatz’ sekurlsa::logonPasswords command uses to achieve its desired outcome. Functions are essential because they are the building blocks of functionality within the operating system. The second post introduced the concept of Operations, which act as abstract categories used to group similar functions based on their teleological outcome. Suppose I can replace one function with another, like how a developer can replace OpenProcess with NtOpenProcess. In that case, those functions perform the same Operation (Process Access in this case). The third post built on a concept I wrote about before starting this series in a post called Understanding the Function Call Stack. The idea was that documented API functions are, in fact, wrappers around the actual functionality. When we call a function like ReadProcessMemory, many functions are called before the operation is complete. This sequence of functions builds the “function call path.” In the third post, we explored how we can generate a new function call path and integrate it with other existing function call paths for the same operation, which results in the generation of a “function call graph.” The graph allows us to evaluate all of the different functional options, at least those known to us, that a developer can use to achieve an Operation.

This post is a slight detour, but it is essential nonetheless. This blog series is working to connect the Tactical to the Functional through a coherent taxonomy. Still, sometimes we make observations and build assumptions based on them only to find that our perspective was too low resolution to apply coherently to the range of implementations we should consider. This post discusses one such example.

As I alluded to earlier, the second post introduced the idea of functions. One of the fundamental axiomatic presuppositions I made was that every function represents one and only one Operation. As I continued my research of OS Credential Dumping: LSASS Memory and the building of the relevant function call graphs, I ran into some functions that contradicted this axiom. I want to use this post to show the example I ran into, explain how it works, provide language we can use to discuss it, and describe how I’ve adjusted the taxonomy to account for this phenomenon.

Revisiting ReadProcessMemory

The first article in this series analyzed the Mimikatz source code to find that it relies on a call to kernel32!ReadProcessMemory. It is then possible to use the methodology discussed in the Understanding the Function Call Stack post to generate the function call path for kernel32!ReadProcessMemory. If an application makes a call to kernel32!ReadProcessMemory, it will subsequently call api-ms-win-core-memory-l1-1-0!ReadProcessMemory, then kernelbase!ReadProcessMemory, then ntdll!NtReadVirtualMemory, and finally transition execution to the kernel via the associated NtReadVirtualMemory syscall. I thought this function call path was representative of normal functional behavior. It is quite straightforward, and execution passes through each function in the path without any detours, as shown in the graph below.

ReadProcessMemory Function Call Path
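Treating that layering as ordered data makes the wrapper relationship explicit. A minimal sketch follows (the layer names are as documented above; the list structure itself is just an illustration):

```python
# The observed wrapper chain for kernel32!ReadProcessMemory, top to bottom.
read_process_memory_path = [
    "kernel32!ReadProcessMemory",
    "api-ms-win-core-memory-l1-1-0!ReadProcessMemory",
    "kernelbase!ReadProcessMemory",
    "ntdll!NtReadVirtualMemory",
    "syscall!NtReadVirtualMemory",
]

# Every entry above the syscall is a wrapper: calling ANY of them still
# funnels execution through the same kernel transition.
entry_points = read_process_memory_path[:-1]
print(len(entry_points))  # 4 user-mode entry points
```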

Introducing Toolhelp32ReadProcessMemory

As the Understanding the Function Call Stack post mentioned, one of the first steps during analysis is to open the implementing DLL and search the exports table for a reference to the function of interest. This search ultimately leads to the function’s code implementation, which helps us to understand how the function works. Upon searching for ReadProcessMemory in kernel32.dll, I stumbled upon a second, similarly named function called Toolhelp32ReadProcessMemory, which piqued my interest.

kernel32.dll Exports Table

According to the Toolhelp32ReadProcessMemory function’s documentation, it acts similarly to ReadProcessMemory, but with one exception. ReadProcessMemory requires a handle to the process from which to read, while Toolhelp32ReadProcessMemory only requires the process identifier (th32ProcessID). It appears that Toolhelp32ReadProcessMemory is functionally equivalent to ReadProcessMemory, but may be easier to use or at least potentially allow for bypassing that pesky Process Access operation required for ReadProcessMemory. Skipping the Process Access operation would be useful for attackers because the vast majority of detection rules for this technique target this operation specifically.

If we open the function in IDA, we can see that Toolhelp32ReadProcessMemory actually calls ReadProcessMemory for us. It seems like it might just be another layer of wrapper code to add to our function call graph.

Toolhelp32ReadProcessMemory Disassembled Code Implementation

It is possible to find the exact version of ReadProcessMemory used by consulting kernel32.dll’s import table. It appears that Toolhelp32ReadProcessMemory calls api-ms-win-core-memory-l1-1-2!ReadProcessMemory.

kernel32.dll Imports Table

Recall that part 2 of this series introduced a new Operational abstraction layer that allows us to group functions teleologically (based on the functions’ ends, goals, purposes, or objectives). For example, ReadProcessMemory falls under the Process Read operation because it is responsible for allowing an application to read the volatile memory of a process. Meanwhile, in part 3, we demonstrated how we could combine multiple individual function call paths for a given operation to form a function call graph that is aligned to the operation and describes all known functional options to execute the particular operation.

My first thought was that we could add Toolhelp32ReadProcessMemory to the ReadProcessMemory function call path earlier to produce a function call graph for the Process Read operation.

Initial Integration of Toolhelp32ReadProcessMemory into the Process Read Function Call Graph

While this seemed like a simple enough solution, it bothered me because something seemed off. Toolhelp32ReadProcessMemory wasn’t as simple a function as ReadProcessMemory. While it is true that it calls the API Set version of ReadProcessMemory, that isn’t everything it does. Remember when we observed that Toolhelp32ReadProcessMemory only required a process identifier instead of a process handle, and we thought that maybe we could skip the Process Access operation altogether? If we look again closely, this time at the code produced by IDA’s decompiler, we see that Toolhelp32ReadProcessMemory doesn’t ONLY call ReadProcessMemory. It calls OpenProcess, ReadProcessMemory, and CloseHandle.

Toolhelp32ReadProcessMemory Decompiled Code Implementation

Toolhelp32ReadProcessMemory is a single function that performs multiple (3) operations: OpenProcess for the Process Access operation, ReadProcessMemory for the Process Read operation, and CloseHandle for the Handle Close operation. While kernel32!ReadProcessMemory follows a straightforward function call path, that isn’t ALWAYS the case. Some functions, like Toolhelp32ReadProcessMemory, actually act as miniature self-contained applications. I’ve started referring to these multi-operational functions, like Toolhelp32ReadProcessMemory, as “compound functions” while referring to single-operational functions, like ReadProcessMemory, as “simple functions”.

Toolhelp32ReadProcessMemory presents a conundrum for our graphing efforts. While it is true that it performs the Process Read operation and therefore should be included in the Process Read function call graph, it also belongs in the Process Access and Handle Close function call graphs.

The problem is that this function is no longer atomic, meaning it cannot be mixed and matched with other functional implementations of a given operation. Suppose an application chooses to use NtReadVirtualMemory for the Process Read operation. In that case, the application can generally select to pair NtReadVirtualMemory with any simple function in the Process Access graph. This pairing ability is not the case with Toolhelp32ReadProcessMemory. Applications that use this compound function are, in essence, locked into using OpenProcess and ReadProcessMemory.
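One way to capture this distinction is to model simple functions as mapping to exactly one operation and compound functions as mapping to a fixed bundle of operations. The Python below is a hypothetical model (the Windows function names are real, but the mapping tables and the `operations` helper are illustrative, not any real API):

```python
# Illustrative mapping: simple functions perform exactly one operation.
SIMPLE = {
    "NtOpenProcess": "Process Access",
    "NtReadVirtualMemory": "Process Read",
    "NtClose": "Handle Close",
}

# Compound functions bundle several operations with fixed internal choices
# (Toolhelp32ReadProcessMemory: OpenProcess / ReadProcessMemory / CloseHandle).
COMPOUND = {
    "Toolhelp32ReadProcessMemory": ["Process Access", "Process Read", "Handle Close"],
}

def operations(call_sequence):
    """Expand a sequence of function calls into the operations performed."""
    out = []
    for fn in call_sequence:
        if fn in COMPOUND:
            out.extend(COMPOUND[fn])
        else:
            out.append(SIMPLE[fn])
    return out

# Two very different call sequences reduce to the same operation sequence:
a = operations(["NtOpenProcess", "NtReadVirtualMemory", "NtClose"])
b = operations(["Toolhelp32ReadProcessMemory"])
print(a == b)  # True
```

That equivalence at the operational level, despite entirely different function calls, is exactly why the compound function must appear in every relevant operation's function call graph while remaining non-atomic.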

Visualizing Compound Functions

Understanding how compound functions work within function call graphs and operations, I’ve created two ways to visualize these functions. The first is to view the function atomically in what I call the “compound function graph,” and the second is to view it within the context of the relevant operations’ function call graphs.

Compound Function Graph

The compound function graph is an interesting way to understand how a compound function works. The compound function is on the left side of the graph, and its node is colored purple. We then see arrows originating with the compound function and pointing to yellow nodes, representing the compound function’s operations. We see Process Access, Process Read, and Handle Close in this case. Then we see that each operation node points to the entry point into the relevant operation’s function call graph and shows the subsequent function calls made. For instance, Toolhelp32ReadProcessMemory calls api-ms-win-core-processthreads-l1-1-2!OpenProcess to implement the Process Access operation. The compound function graph is useful for getting the full picture of how an individual compound function works.

(Compound Function) Toolhelp32ReadProcessMemory Compound Function Graph

Combined Graph

The second way to visualize compound functions is to integrate them into the relevant operations’ function call graph. For instance, the function call graph for the Process Read operation below includes the Toolhelp32ReadProcessMemory compound function. This time, however, the compound function’s node is purple to indicate that it is a compound function and therefore cannot be used atomically like the other functions with red-colored nodes.

(Integrated) Process Read Function Call Graph

I’ve also included the function call graph for the Process Access operation to demonstrate that we should add compound functions to all relevant operations’ function call graphs.

(Integrated) Process Access Function Call Graph

Remember that the Operational Graph we created for mimikatz sekurlsa::logonPasswords was Process Enumerate -> Process Access -> Process Read, but Toolhelp32ReadProcessMemory allows that chain to be collapsed into Process Enumerate -> Toolhelp32ReadProcessMemory.

Additional Example

I thought it’d be helpful to include a second example of a compound function. A technique that I’ve been interested in for a while is Access Token Manipulation. Robby Winchester and I first presented on Access Token Manipulation at Black Hat Europe in 2017 and subsequently released a white paper on the topic. The paper identified three categories of token theft that are now classified as sub-techniques in MITRE ATT&CK for the Access Token Manipulation technique (Token Impersonation/Theft, Create Process with Token, and Make and Impersonate Token). Access Token Manipulation is a technique that the industry seems to have a decent understanding of, and yet we keep refining that understanding over time. Some great examples are Justin Bui and Jonathan Johnson’s work (here and here).

SetThreadToken vs. ImpersonateLoggedOnUser

I recently looked back into this technique to build function call graphs, and I rediscovered an interesting divergent use case that seems germane to this article. Applications can choose between two functions to apply an impersonation token to the current thread. The first is SetThreadToken, and the second is ImpersonateLoggedOnUser. Justin Bui previously spent some time investigating the relevant API functions to perform SYSTEM token theft (think of meterpreter’s getsystem command), so I asked him about the difference. In our conversation, one of the significant differences between the two functions was that applications must first create a duplicate copy of the target token before calling SetThreadToken. At the same time, ImpersonateLoggedOnUser does not require this step. This difference seemed to make ImpersonateLoggedOnUser advantageous, but does that change when we look into their code implementation?

Below is the function call path followed by SetThreadToken. Like ReadProcessMemory or OpenProcess, SetThreadToken is a simple function that only performs a single operation, Thread Write (it writes the desired token to the thread using NtSetInformationThread).

(Simple Function) SetThreadToken Function Call Path

Upon investigating ImpersonateLoggedOnUser, we see a slightly different and more complicated picture. It turns out that ImpersonateLoggedOnUser is a compound function that again performs three operations, Token Read (getting information about the token itself), Token Copy (creating a duplicate copy of the target token), and Thread Write (applying the token to the target thread). We see that it isn’t entirely true that ImpersonateLoggedOnUser doesn’t require a duplicated token. Instead, it performs the duplication implicitly via NtDuplicateToken.

(Compound Function) ImpersonateLoggedOnUser Compound Function Graph

Above we saw the compound function graph for ImpersonateLoggedOnUser. Still, we can see how this compound function integrates into the function call graph for both the Token Copy and Thread Write operations. One crucial detail I want you to notice is that this time there are three purple nodes instead of the one we saw with Toolhelp32ReadProcessMemory. ImpersonateLoggedOnUser has a similar layering structure to many of the simple functions we’ve seen. It has a documented function, an API Set, and an undocumented function component. The key is that applications can call any of these three functions, but all three will result in the compound result. As a result, I’ve included all three nodes in our function call graphs. However, all three nodes are colored purple to indicate their compound nature.

(Integrated) Token Copy Function Call Graph
(Integrated) Thread Write Function Call Graph

Conclusion

My goal with this work and this blog series is to explore the emergent taxonomy that seems to exist from Tactics down to Functions. As I explore and build layers and categories, I occasionally stumble upon some examples that don’t quite fit into the schema I’ve created. This discordance is a fantastic problem because it lets me expand or refine the schema to represent reality better or demonstrates that my schema has a fundamental error. In this case, compound functions challenged one axiomatic presupposition of my schema: that all functions perform a single operation. This axiom is demonstrably false, and I had to update how I viewed the world (the cyber world) to deal with this fact. It seems that categorizing functions as simple functions, those that perform one and only one operation, and compound functions, which act as miniature self-contained applications and perform multiple operations, works perfectly fine and is coherent with the rest of the schema (for now). Hopefully, this also helps you to understand, whether you are on the red or blue side, that there’s more to things than meets the eye, and just because you don’t explicitly call a function doesn’t mean you aren’t calling it implicitly. Please let me know what you think on Twitter or in the comments, and stay tuned for the next edition of the On Detection: Tactical to Functional series.

On Detection: Tactical to Functional Series



On Detection: Tactical to Functional https://securityboulevard.com/2022/08/on-detection-tactical-to-functional-3/ Tue, 09 Aug 2022 16:18:26 +0000 https://medium.com/p/45e41fef7af4 Part 3: Expanding the Function Call Graph

Introduction

In the previous post in this series, I introduced the concept of operations and demonstrated how each operation has a function call graph that undergirds it. In that post, I purposely presented function call graphs that were incomplete relative to my knowledge, because I wanted to show only the extent that was obvious based on what we observed via mimikatz (in the first post in this series). Another benefit of limiting the function call graphs to what we have actively discovered as part of this series is that we can show that a partial picture, when formally documented, is still useful even when we know it is incomplete. Below is the function call graph for the Process Enumerate operation, which will serve as the basis for this article:

This graph is relatively sparse. During our analysis of mimikatz, we saw that it called NtQuerySystemInformation to enumerate a list of processes and ultimately find the process identifier (pid) for the LSASS process. We then analyzed the function call stack to identify the syscall and the alternative Native API function names. Generally speaking, we’ve observed that Native API functions are not the layer that most application developers are expected to interact with, so a reasonable question would be, “are there any higher level functions that might ultimately call NtQuerySystemInformation or similar functions?” One researcher did just this. @modexpblog from MDSec wrote an excellent blog post exploring 14 alternative options for identifying the process identifier of the LSASS process. This is EXACTLY what we are concerned about! The first option is to call NtQuerySystemInformation, which we have already covered, but the second method offers a different approach that is worth investigating.

The second method, described in the blog post, focuses on the Windows Terminal Service (WTS). It describes a Windows API function called WTSEnumerateProcesses which can be used to list processes and which then can result in finding the LSASS pid.

A cool feature of this blog post is that they include sample source code for each method, so we can also see how this function is used to get the LSASS pid.

Before we can start our analysis, we must determine which library implements WTSEnumerateProcessesA. To do this, we can browse the Microsoft documentation for the WTSEnumerateProcessesA function.

In the Requirements section, we find that the name of the implementing DLL is wtsapi32.dll.

wtsapi32.dll

Now that we know the wtsapi32.dll library implements the WTSEnumerateProcessesA function, we can open it in our disassembler. When investigating API Functions, my first step is checking the exports table. Especially in cases where the function in question ends in A or W, I like to search the export table more generically because there are probably alternative versions of the function. To do this, I searched for the word “process” and found that there are four WTSEnumerateProcesses* functions (WTSEnumerateProcessesA, WTSEnumerateProcessesExA, WTSEnumerateProcessesExW, and WTSEnumerateProcessesW).

Since this article is part of my On Detection series, we will create a graph to visually represent everything we learn from this analysis. At this point, we know there are four independent functions, but we don't know much else. The image below reflects this information:

WTSEnumerateProcessesA

While designing this article, I struggled with the sequencing of the analysis of the four functions. There were generally two options: analyze all four functions simultaneously, or evaluate each function in sequence. I decided it is easier to follow the execution of a single function to completion and then analyze the rest in turn. As we encounter new ideas, I will explain them; if we encounter them again later, I will refer you back to the section that covered them and simply provide the output of the analysis.

The first function that we encounter is WTSEnumerateProcessesA. We can double-click on the function name and view its implementation. This function is pretty simple in that it calls one of the other functions we are interested in, which is WTSEnumerateProcessesW.

The WTSEnumerateProcessesA function calling the WTSEnumerateProcessesW function

Our updated function call graph now shows WTSEnumerateProcessesA calls WTSEnumerateProcessesW.

WTSEnumerateProcessesW

Continuing our analysis, we can dive into the implementation of WTSEnumerateProcessesW. Analyzing this function reveals two possible calls: the first to an imported function called WinStationGetAllProcesses and the second to an imported function called WinStationEnumerateProcesses. A glance at the flow of function calls indicates that WTSEnumerateProcessesW only calls WinStationEnumerateProcesses if the call to WinStationGetAllProcesses fails in a specific way. This flow is shown by the call to GetLastError, specifically checking for the error code 0x6D1.

WTSEnumerateProcessesW calls WinStationGetAllProcesses or WinStationEnumerateProcesses

In the previous image, we see that the calls to WinStationGetAllProcesses and WinStationEnumerateProcesses are prefixed with __imp_, which indicates that these are imported functions; said differently, these functions are imported from an external library because they are not implemented in wtsapi32.dll itself. To determine which binary implements these functions, we can search for them in the Import table. The image below shows the Import table entries for these two functions and shows that they are both implemented in winsta.dll.

We can update our function call graph to indicate that WTSEnumerateProcessesW can call either WinStationEnumerateProcesses or WinStationGetAllProcesses.

winsta.dll

Now we can load winsta.dll into our disassembler, and we can again view the Export table to find the reference to our functions. Again, we can use a genericized search term to make sure that if alternatives to these functions exist, we’d see them. In this case, there don’t seem to be any valid alternatives.

winsta!WinStationGetAllProcesses

We can now jump into the code implementation of WinStationGetAllProcesses, and we immediately see a call to an internal function called GetSystemProcessInformation@CProcessUtils.

We can follow that call, and we see that GetSystemProcessInformation@CProcessUtils ultimately calls the NtQuerySystemInformation Native API function.

We’ve reached an inflection point in our analysis because we’ve discovered that the WTSEnumerateProcess* graph converges with our existing NtQuerySystemInformation graph. I’ve updated the graph to show these new functions and connect the starting and newly created graphs.

winsta!Legacy_WinStationGetAllProcesses

If we continue our analysis of WinStationGetAllProcesses, it appears that in certain situations, the code can call a function called Legacy_WinStationGetAllProcesses which we can also analyze.

In some cases, WinStationGetAllProcesses can call Legacy_WinStationGetAllProcesses

Our analysis of the Legacy_WinStationGetAllProcesses function reveals a call to the NdrClientCall3 function, which handles RPC Procedure calls. As a result, we must analyze the parameters being passed into NdrClientCall3 to understand precisely which RPC Procedure is being called.

The first step is to identify the RPC Interface, which is passed in as a field within the first parameter (labeled as pProxyInfo) shown below:

The values labeled as Data1, Data2, Data3, and Data4 in the picture above can be parsed, using PowerShell, into the string form of a Globally Unique Identifier (GUID), which represents the RPC Interface that NdrClientCall3 is calling.
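The article performs this parsing with PowerShell; as an equivalent sketch, the same reconstruction can be done with Python's uuid module. The field values below are transcribed from the interface GUID identified in this analysis:

```python
import uuid

# RPC interface GUID fields as they appear in the disassembled structure:
# a 32-bit Data1, two 16-bit values (Data2/Data3), and an 8-byte Data4
# array. These values correspond to the [MS-TSTS] Legacy interface.
data1 = 0x5CA4A760
data2 = 0xEBB1
data3 = 0x11CF
data4 = bytes([0x86, 0x11, 0x00, 0xA0, 0x24, 0x54, 0x20, 0xED])

guid = uuid.UUID(fields=(
    data1, data2, data3,
    data4[0], data4[1],                 # clock_seq hi/lo bytes
    int.from_bytes(data4[2:], "big"),   # remaining 6 bytes form the 48-bit node
))
print(guid)  # 5ca4a760-ebb1-11cf-8611-00a0245420ed
```

Note how the first three fields are stored as little-endian integers in the binary, while Data4 is a raw byte array; this is why a naive byte-for-byte read of the structure does not match the printed GUID string.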

We can search for the GUID string, 5ca4a760-ebb1-11cf-8611-00a0245420ed, using Google. This search will result in the discovery of the Microsoft documentation for the Terminal Services Terminal Server Runtime Interface Protocol, otherwise referred to as [MS-TSTS].

In Section 1.9 Standards Assignments, we find that the GUID is associated with the Legacy RPC Interface. This aligns with the name of the calling function, Legacy_WinStationGetAllProcesses.

The next bit of important information is the RPC Procedure Number (Opnum or ProcNum), which is indicated by the second parameter passed to NdrClientCall3, labeled as pProcNum. We can see that the value being passed is 70.

Referring to the RPC Interface documentation, we see that Opnum 70 refers to a procedure called RpcWinStationGetAllProcesses_NT6 which seems to align well with what we know about the calling function.

We can now update the function call graph to include the alternative call to Legacy_WinStationGetAllProcesses which then makes an RPC call to an RPC Procedure called RpcWinStationGetAllProcesses_NT6.

termsrv!RpcWinStationGetAllProcesses_NT6

As you might have guessed, the RPC Procedure call is not the end of the line of execution. In fact, up to this point, nothing has actually happened to change or enumerate the system. To continue following the execution path, we must investigate the code associated with the RpcWinStationGetAllProcesses_NT6 RPC Procedure, but because it isn't an imported function like those we saw previously, we have to use a different approach. RPC is a client/server interface where a client, in this case Legacy_WinStationGetAllProcesses, makes a call to the server, which executes the code associated with the procedure. This means that there is likely a binary on the system that implements the RPC Interface and functions as the server, so we need to find it.

To find the server, we can use a function from James Forshaw’s NtObjectManager called Get-RpcServer, which parses binary files passed to it to determine if that binary has an RPC server implemented in its code. A brute force strategy is to list all DLL files in system32 and pass them all via the PowerShell pipeline into Get-RpcServer. We can filter the results by looking for the Interface GUID we identified from the NdrClientCall3 parameters. This process works, and as a result, we see that termsrv.dll implements the Legacy Interface of the [MS-TSTS] protocol.
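A sketch of that brute-force pipeline, assuming the NtObjectManager module is installed (exact property names may vary slightly between module versions):

```powershell
# Parse every DLL in system32 with Get-RpcServer and keep the servers
# whose interface ID matches the GUID recovered from NdrClientCall3.
Import-Module NtObjectManager
$guid = [guid]"5ca4a760-ebb1-11cf-8611-00a0245420ed"
ls C:\Windows\System32\*.dll |
    Get-RpcServer |
    Where-Object { $_.InterfaceId -eq $guid }
```

On the system analyzed in this article, this filter surfaces termsrv.dll as the implementer of the Legacy [MS-TSTS] interface.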

Now that we know that termsrv.dll implements the RPC server, we can load it into our disassembler and find the relevant code. RpcWinStationGetAllProcesses_NT6 is an RPC Procedure, NOT an exported function, so instead of looking for it in the Exports table, we must find it in the general functions menu.

After navigating to the function’s implementation, the first relevant call is to an internal function called GetSessionProcessInformation@CProcessUtils.

The GetSessionProcessInformation@CProcessUtils function ends up calling the NtQuerySystemInformation Native API function. NtQuerySystemInformation already exists in our function call graph, so we’ve reached this code path’s end.

Below is the updated function call graph, which now enumerates the Legacy_WinStationGetAllProcesses function path and the resulting RpcWinStationGetAllProcesses_NT6 RPC Procedure call.

winsta!Legacy_WinStationGetAllProcesses

We can return to the Legacy_WinStationGetAllProcesses function in winsta.dll and continue following the code. We immediately see that a second RPC Procedure will be called if the call to RpcWinStationGetAllProcesses_NT6 fails with a 0x6D1 error code. We can see that the first argument to this second call points to the same structure (stru_18002E308) passed to the first NdrClientCall3. This means the same RPC Interface ([MS-TSTS] Legacy) is being called. However, we can see that the second parameter, which contains the RPC Procedure Number, is different. This second call refers to ProcNum 43.

By referring back to the RPC Protocol documentation, we find that Opnum 43 is called RpcWinStationGetAllProcesses.

termsrv!RpcWinStationGetAllProcesses

We already know that termsrv.dll implements the RPC Server, so we can search the termsrv.dll function table to find the function called RpcWinStationGetAllProcesses. When we find it, we can navigate to its code implementation. Once there, we find that RpcWinStationGetAllProcesses calls RpcWinStationGetAllProcesses_NT6, which we’ve already included in our graph.

We’ve updated our function call graph to include the RpcWinStationGetAllProcesses RPC Procedure.

winsta!Legacy_WinStationEnumerateProcesses

The WinStationGetAllProcesses call was analyzed in the previous few sections of this article. However, if the call to WinStationGetAllProcesses fails in a certain way (cmp eax, 6D1h), a second call to WinStationEnumerateProcesses will be made. Let's take a look at this function's implementation.

Jumping back into winsta.dll, we can open the WinStationEnumerateProcesses function. Ultimately WinStationEnumerateProcesses calls a function called Legacy_WinStationEnumerateProcesses. Let’s take a look at its implementation.

The Legacy_WinStationEnumerateProcesses function makes an RPC call. In this case, we can see that the first parameter (pProxyInfo) points to the same structure we saw previously, so we know this is part of the [MS-TSTS] LegacyApi. The only difference is that the nProcNum parameter is set to 36.

The documentation for [MS-TSTS] LegacyApi says that Opnum 36 is associated with the RpcWinStationEnumerateProcesses procedure, which seems to align well with the name of the calling function.

We can now update the graph to include the RPC call that is made by the WinStationEnumerateProcesses function, as shown below:

termsrv!RpcWinStationEnumerateProcesses

Remember that termsrv.dll is acting as the RPC Server for the LegacyApi Protocol, so there should be an internal function called RpcWinStationEnumerateProcesses to analyze. Upon first glance at the code, there aren't any substantive call instructions. We see a call to RpcCallTrace and a second call to DbgPrintMessage, but based on the names of those functions, they appear to be simple helpers. Further analysis identified a Load Effective Address (lea) instruction directly after the RpcCallTrace call, which loads a string into the RDX register. According to this instruction, the string says "!!!RpcWinStationEnumerateProcesses depr "... where "depr" stands for "deprecated." This means that while this functional path appears to exist, it isn't expected to be reached or to perform any process enumeration, at least on the version of the Operating System that we are using for this analysis.

While this function path is non-viable on this version of the OS, I still think it is essential to document it as part of the function call graph. However, because these functions don't seem to be a possible entry point into the function call graph for performing the Process Enumerate operation, I've changed the nodes on this path to black instead of red. This change indicates that these nodes are not valid entry points on modern operating systems.

OldRpcWinStationEnumerateProcesses

Even though we’ve seen three different RPC Procedures in our code that are relevant to this operation, we shouldn’t assume those are the only three RPC Procedures. One strategy I use is to figure out the naming convention used by the specific RPC Interface’s Procedures. For instance, in the case of the Legacy Interface, we see that all of the procedures follow the RpcWinStation* naming convention. We can use this convention to search for relevant functions we might have missed. In doing this, I discovered one additional function called OldRpcWinStationEnumerateProcesses. Let’s check it out.

We can verify that this is actually an RPC Procedure by looking for it in the docs; sure enough, we see that it is there.

We can then check the code implementation of OldRpcWinStationEnumerateProcesses, and we see that it simply calls RpcWinStationEnumerateProcesses, which we’ve already analyzed.

We can now add this RPC Procedure to our function call graph, but because it results in a deprecated call, we can give the node a black border since it isn’t a valid entry point into the graph on modern systems.

WTSEnumerateProcessesExA

At this point, we can go back and analyze WTSEnumerateProcessesExA. After navigating to the WTSEnumerateProcessesExA function’s code implementation, we see a call to WTSEnumerateProcessesExW, another API function on the list to explore.

The WTSEnumerateProcessesExA function calling the WTSEnumerateProcessesExW function

We’ve updated the function call graph to indicate that WTSEnumerateProcessesExA calls WTSEnumerateProcessesExW. Next, we will investigate the inner workings of WTSEnumerateProcessesExW.

WTSEnumerateProcessesExW

The last Windows API function to analyze is WTSEnumerateProcessesExW. Once we navigate to the function’s code implementation, we see that it calls WinStationGetAllProcesses, an undocumented function we previously investigated.

WTSEnumerateProcessesExW calls WinStationGetAllProcesses

We can now complete the function call graph relative to these four new WTSEnumerateProcesses functions.
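The graph we have been drawing can also be captured as data and queried programmatically. The sketch below (Python, with edges simplified to the calls identified in this article and names following the disassembler's notation) encodes the adjacency list and uses a depth-first search to confirm that each WTSEnumerateProcesses* entry point converges on NtQuerySystemInformation:

```python
# Call edges identified in this analysis (intermediate helpers simplified).
CALLS = {
    "wtsapi32!WTSEnumerateProcessesA":   ["wtsapi32!WTSEnumerateProcessesW"],
    "wtsapi32!WTSEnumerateProcessesExA": ["wtsapi32!WTSEnumerateProcessesExW"],
    "wtsapi32!WTSEnumerateProcessesExW": ["winsta!WinStationGetAllProcesses"],
    "wtsapi32!WTSEnumerateProcessesW": [
        "winsta!WinStationGetAllProcesses",
        "winsta!WinStationEnumerateProcesses",
    ],
    "winsta!WinStationGetAllProcesses": [
        "winsta!GetSystemProcessInformation@CProcessUtils",
        "winsta!Legacy_WinStationGetAllProcesses",
    ],
    "winsta!GetSystemProcessInformation@CProcessUtils": [
        "ntdll!NtQuerySystemInformation",
    ],
    "winsta!Legacy_WinStationGetAllProcesses": [
        "termsrv!RpcWinStationGetAllProcesses_NT6",
        "termsrv!RpcWinStationGetAllProcesses",
    ],
    "termsrv!RpcWinStationGetAllProcesses": [
        "termsrv!RpcWinStationGetAllProcesses_NT6",
    ],
    "termsrv!RpcWinStationGetAllProcesses_NT6": [
        "termsrv!GetSessionProcessInformation@CProcessUtils",
    ],
    "termsrv!GetSessionProcessInformation@CProcessUtils": [
        "ntdll!NtQuerySystemInformation",
    ],
    "winsta!WinStationEnumerateProcesses": [
        "winsta!Legacy_WinStationEnumerateProcesses",
    ],
    "winsta!Legacy_WinStationEnumerateProcesses": [
        "termsrv!RpcWinStationEnumerateProcesses",
    ],
    "termsrv!RpcWinStationEnumerateProcesses": [],  # deprecated; dead end
}


def reaches(src, dst):
    """Depth-first search: is dst reachable from src in the call graph?"""
    stack, seen = [src], set()
    while stack:
        node = stack.pop()
        if node == dst:
            return True
        if node not in seen:
            seen.add(node)
            stack.extend(CALLS.get(node, []))
    return False


for entry in [n for n in CALLS if n.startswith("wtsapi32!")]:
    print(entry, "->", reaches(entry, "ntdll!NtQuerySystemInformation"))
```

The deprecated RpcWinStationEnumerateProcesses path deliberately dead-ends, mirroring the black (non-viable) nodes in the graph images.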

API Set

Before we finish, we should talk about API Sets. You may have noticed at the very beginning, when we analyzed the Microsoft API documentation for WTSEnumerateProcessesA, that the Requirements section referenced an API Set called ext-ms-win-session-wtsapi32-l1-1-0. Using this information, we can derive four additional functions that can serve as entry points into our function call graph, because developers can refer to the API Set version of the WTSEnumerateProcesses* functions, such as ext-ms-win-session-wtsapi32-l1-1-0!WTSEnumerateProcessesA.

We can update the function graph to include the four API Set versions of the WTSEnumerateProcesses* functions as shown below:

Other Functions

While this article focuses on analyzing the WTSEnumerateProcesses* functions, attackers can use a few other functions to enumerate processes and, specifically, process identifiers. The graph shown below has added these additional functions for the sake of completeness. That being said, it is essential to emphasize that we should always work under the assumption that our function call graphs remain incomplete. They serve to document and represent our current understanding of the territory, but it would be foolish to assume that our knowledge is, in fact, complete.

Conclusion

We’re all probably familiar with the idea of known knowns, known unknowns, unknown knowns, and unknown unknowns. Donald Rumsfeld famously defined known knowns as things WE know that WE know. This is interesting because “WE” is a relative concept. The function call graph that I started this blog post with was relative to MY knowledge (WE being defined in this case as one person) of the different ways in which someone could enumerate processes. The cool thing is that the concept of “WE” can be expanded to include your friend group, your company, or even the entire industry. This blog post demonstrates how we can consume the knowledge of others to, at least conceptually, expand the scope of WE, allowing us to have a map that better represents the reality of the environment.

On Detection: Tactical to Functional Series


On Detection: Tactical to Functional was originally published in Posts By SpecterOps Team Members on Medium, where people are continuing the conversation by highlighting and responding to this story.

The post On Detection: Tactical to Functional appeared first on Security Boulevard.

On Detection: Tactical to Functional https://securityboulevard.com/2022/08/on-detection-tactical-to-functional-2/ Thu, 04 Aug 2022 14:06:59 +0000 https://medium.com/p/37ddcd75234b Part 2: Operations

Introduction

Welcome back to my On Detection: Tactical to Functional series. In the first post in this series, we explored the source code for Mimikatz’s sekurlsa::logonPasswords command. We discovered that Mimikatz relies on three Windows APIs to read credentials from the memory of the LSASS process. First, it calls NtQuerySystemInformation to enumerate processes to find the Process Identifier (PID) of LSASS. Next, it calls OpenProcess to open a read handle to LSASS. It finishes by calling ReadProcessMemory to read the contents of the LSASS process’s memory which contains credential information. As a result, we determined a functional path for the sekurlsa::logonPasswords command of ntdll!NtQuerySystemInformation -> kernel32!OpenProcess -> kernel32!ReadProcessMemory.

Function Calls made by Mimikatz’s sekurlsa::logonPasswords command

It is quite common for malware analysts to become familiar with the chain of function calls used by malware to achieve some behavior. For instance, when I first started in Infosec, I was told that if you ever see OpenProcess -> VirtualAllocEx -> WriteProcessMemory -> CreateRemoteThread, you are dealing with Process Injection. At the time, I had no idea what these (API functions) were or how I’d see them, but that sequence has stuck with me.

In this blog post, we will explore how we can take this simple sequence of functions and expand it. You see, NtQuerySystemInformation -> OpenProcess -> ReadProcessMemory is not the ONLY way to dump credentials from LSASS. In doing so, I will introduce a new layer of abstraction that focuses on what operation(s) a function is performing. The operation is based on the outcome of the function call, for instance, WriteProcessMemory might be called a Process Write operation. This abstraction layer provides a more accurate way to describe the threat and prevents us from being too myopically focused on certain API functions. It also serves as the beginning of a coherent taxonomy that will reach all the way up to existing Tactics and Techniques codified in MITRE ATT&CK. Let’s dig in!

Function Call Stacks

In my Understanding the Function Call Stack post, I demonstrated how documented Windows API functions often serve as wrappers around deeper functionality. In that post, we explored how CreateFileW ultimately calls a Native API function called ntdll!NtCreateFile, which in turn makes a syscall to a function in the kernel also called NtCreateFile. I've reshared the Function Call Graph from the previous post below for reference:

Initial CreateFileW Function Call Graph

The point of that previous post was not to describe how CreateFileW works but to describe how all Windows operating system functions work. Therefore, we can use this mapping methodology to explore the function call stack for NtQuerySystemInformation, OpenProcess, and ReadProcessMemory, respectively, which we will do below.

NtQuerySystemInformation

The first function called by Mimikatz sekurlsa::logonPasswords is NtQuerySystemInformation. This is a Native API function that resides in ntdll.dll, so it is already relatively low in the function call stack. Investigating its implementation, we find that it simply makes a syscall of the same name, NtQuerySystemInformation.

It's worth noting that when a syscall is made, it isn't actually made to a named function. Instead, syscalls are represented by a service index (a number) that is then resolved into a function via the System Service Descriptor Table (SSDT). Unfortunately, these values change depending on the version of the Operating System or even the patch state, so it is difficult to refer to them by their syscall number. For simplicity, I will refer to them by their function name. For more information on the details of syscalls, check out j00ru's windows-syscalls project.

Additionally, it is worth noting that the same code segment that is pointed to by NtQuerySystemInformation is also pointed to by ZwQuerySystemInformation and RtlGetNativeSystemInformation as shown below (notice the exported entry comments at the top of the image):

NtQuerySystemInformation Implementation

Therefore, based on our current understanding, it is possible for an attacker to achieve the same outcome using any of ntdll!NtQuerySystemInformation, ntdll!ZwQuerySystemInformation, ntdll!RtlGetNativeSystemInformation, or the NtQuerySystemInformation syscall directly. As a result, this graph can be produced to show the possible function call paths.

Initial Process Enumerate Function Graph

OpenProcess

The second function is OpenProcess, which is a properly documented Windows API function. Using the Function Call Stack methodology, we can map out the calls made by kernel32!OpenProcess. In doing so, we see that kernel32!OpenProcess follows a relatively similar path to kernel32!CreateFileW, at least in principle. kernel32!OpenProcess calls a version of itself in the api-ms-win-core-processthreads-l1-1-2 API Set, which redirects to kernelbase!OpenProcess. Next, kernelbase!OpenProcess calls ntdll!NtOpenProcess, which finishes by making a syscall to a function in the kernel called NtOpenProcess. Similar to what was observed with NtQuerySystemInformation, the code for ntdll!NtOpenProcess can also be reached through a call to ntdll!ZwOpenProcess.

Initial Process Access Function Graph

ReadProcessMemory

The last step for mimikatz is to read the contents of LSASS’s process memory. After all, this is where the credentials actually reside. It might even be said that ReadProcessMemory is the primary function call and that the preceding calls, NtQuerySystemInformation and OpenProcess, were simply used to obtain its prerequisites (a process handle to LSASS).

In this case, we see that kernel32!ReadProcessMemory points to an API Set called api-ms-win-core-memory-l1-1-0, which redirects to kernelbase!ReadProcessMemory. Then kernelbase!ReadProcessMemory calls ntdll!NtReadVirtualMemory, which makes a syscall to the NtReadVirtualMemory function in the kernel. As is becoming typical, we also see that the code for ntdll!NtReadVirtualMemory is pointed to by a different exported function called ntdll!ZwReadVirtualMemory.

Initial Process Read Function Graph

Dumpert by Outflank

You might be wondering why this type of analysis is important. John Lambert famously said, "Defenders think in lists. Attackers think in graphs. As long as this is true, attackers win." One relevant example that proves this point is the tool Dumpert by the team at Outflank. It is common for defenders, including security vendors, to view attack techniques through the functional lens. This means they might observe that Mimikatz follows the functional sequence of ntdll!NtQuerySystemInformation, kernel32!OpenProcess, and kernel32!ReadProcessMemory and build their visibility, detection rules, or preventative controls around this EXACT pattern. This is a problem because they aren't approaching the problem from the perspective of the graph of possibilities. The team from Outflank did the exact opposite. They understood that kernel32!OpenProcess existed within a function call stack, resulting in a syscall to NtOpenProcess. They also understood that many EDR tools had visibility of kernel32!OpenProcess but not of the associated syscall. As a result, they replicated Mimikatz functionality but replaced the kernel32!OpenProcess call with the NtOpenProcess syscall. Additionally, they realized that while kernel32!ReadProcessMemory might be the "orthodox" way to read the contents of a remote process's memory, there may be less common ways to achieve the same outcome.

With this in mind, they replaced the kernel32!ReadProcessMemory call with a call to dbghelp!MiniDumpWriteDump. These two changes resulted, at least in the short term, in an implementation that achieved the same exact outcome as Mimikatz sekurlsa::logonPasswords while decreasing the likelihood of detection. The resulting function call sequence was syscall!NtQuerySystemInformation -> syscall!NtOpenProcess -> dbghelp!MiniDumpWriteDump as shown below:

Function Calls made by Dumpert

At this point, our function call graphs recognize syscall!NtQuerySystemInformation and syscall!NtOpenProcess, but dbghelp!MiniDumpWriteDump isn’t integrated into our map quite yet. We can use the function call stack analysis approach to understand the calls that dbghelp!MiniDumpWriteDump makes. In doing so, we see that dbghelp!MiniDumpWriteDump calls dbgcore!MiniDumpWriteDump, which calls a couple internal functions before calling kernelbase!ReadProcessMemory. This means that the function call path of dbgcore!MiniDumpWriteDump converges with the function call path of kernel32!ReadProcessMemory, so instead of being two independent paths, they can be combined into one coherent graph as shown below:

Updated Process Read Function Graph

Dumpert shows us that there are numerous ways to execute a given technique and that the variables involved can lead to unforeseen outcomes such as missed detection opportunities or bypasses of preventative controls.

Operations

If we take John Lambert's advice and begin viewing the problem from a graph perspective, we can see that in almost all cases, numerous functions can be used to achieve any outcome. This realization allows us to create a model that enables these outcomes to be discussed even though they are abstract concepts. Functions are the tangible layer of the model in the sense that we can interact with functions directly. In some sense, they are concrete ideas that exist, are documented, and can be touched in the context of code. However, it can be useful to view the problem more abstractly in the sense that any combination of functions in the ntdll!NtQuerySystemInformation, kernel32!OpenProcess, and kernel32!ReadProcessMemory function call graphs can be combined to achieve the OS Credential Dumping: LSASS Memory technique. This means that the set of functions that make up each function call graph can be rolled up into one abstract category, which I call the operational layer. With this in mind, we can convert the two function paths for Mimikatz sekurlsa::logonPasswords and Dumpert into one operational path that covers both. The NtQuerySystemInformation functional graph is abstracted as Process Enumerate, the OpenProcess functional graph is abstracted as Process Access, and the ReadProcessMemory functional graph is abstracted as Process Read. Therefore, we can say that both tools follow the Process Enumerate -> Process Access -> Process Read Operational Path as shown below:

Initial Operation Graph for OS Credential Dumping: LSASS Memory

The key is that each operation has a corresponding set of functions that can be called to perform the operation. For instance, based on our current understanding, there are 4 Process Enumerate functions, 6 Process Access functions, and 8 Process Read functions. These numbers can be used to calculate the total number of possible functional permutations for the operational path. At the operational level, we know of one path, Process Enumerate -> Process Access -> Process Read, but by multiplying the number of functions within each operation in the sequence (4 x 6 x 8), we find that there are 192 total functional permutations. This is the power of abstraction. We can convert 192 individual sequences of function calls into 1 sequence of operations.

I’ve created an interactive notebook that should help to visualize how these function calls are related. The idea is that for each operation, there are several valid choices of functions to perform the operation.

https://medium.com/media/20c3d81a1bc128aefcda95ffc790ea88/href

Conclusion

We must understand that the sequence isn't ntdll!NtQuerySystemInformation -> kernel32!OpenProcess -> kernel32!ReadProcessMemory; it is Process Enumerate -> Process Access -> Process Read. Just as the old sequence of kernel32!OpenProcess -> kernel32!VirtualAllocEx -> kernel32!WriteProcessMemory -> kernel32!CreateRemoteThread was never THE sequence for process injection; it was only ONE possible sequence for process injection.

The power of abstraction is that it allows us to summarize the world, but it is essential to not lose sight of the detail. We must maintain the ability to zoom in and out of the layers, increasing and decreasing resolution depending on our task. So far, we’ve only explored two layers of the abstraction, the functional and operational layers, but in time we will explore more. Stay tuned. In the next post, we will explore whether the Process Enumerate -> Process Access -> Process Read operational path is the ONLY operational path possible to achieve the OS Credential Dumping: LSASS Memory technique. If not, we will work to discover what other operational paths exist and demonstrate how we might find them.


On Detection: Tactical to Functional https://securityboulevard.com/2022/07/on-detection-tactical-to-functional/ Tue, 19 Jul 2022 16:11:57 +0000 https://medium.com/p/d71da6505720 Part 1: Discovering API Function Usage through Source Code Review

Welcome to my new blog series, "On Detection: Tactical to Functional," where I intend to explore and expand my understanding of that which we attempt to detect. We've all operated within the Tactics, Techniques, and Procedures paradigm for so long that I feel our ability to discuss complex technical topics is often stunted by the limited vocabulary we have for expressing ideas. My recent observation is that a three-tiered taxonomy (such as TTP) is far too limiting to facilitate the conversations necessary to improve our thinking about detection. I believe there are more than three tiers, which means a three-tier taxonomy necessarily leads to grouping different things together at the top or, in our case, at the bottom of the taxonomy. For this reason, it seems to me that the term "Procedures" is used too broadly, describes too many things, and limits our ability to really get into the technical details. Tactics, Techniques, and Procedures are all abstract concepts that serve as categories to group concrete things together. I want to start at the concrete and work my way up, exploring all of the levels in between.

For this reason, we will start with an analysis of API Functions which, in some sense, are the base component of functionality within an operating system. This first post will dig into a well-known attack tool, Mimikatz, and determine which API functions it is using to accomplish a particular task. I hope you enjoy it!

Introduction

In my Understanding the Function Call Stack post, I introduced the nesting nature of Windows API functions. There is almost always a superficial/documented version of the API that then goes through a series of calls to deeper, more fundamental functions that are less likely to be documented but still able to be called directly by applications. I then explained that malware developers could use the knowledge of this nesting to call the less expected/documented version of a function to evade some sensors making their actions “invisible.” In that post, we explored CreateFileW specifically and dug into it, but what we didn’t explain was the process for how we might identify which function we should be interested in.

This post will introduce one process, source code review, for determining which function(s) are used by a given malware sample. For this demonstration and the next couple of posts in this series, we will use everyone’s favorite tool Mimikatz and explore exactly which API functions it relies on to perform its most popular command, sekurlsa::logonPasswords. Remember that a tool is often just functioning as an even more superficial wrapper on top of a series of API functions doing the hard work. This is a fundamental concept that is described in my Capability Abstraction post. Here we will work through analyzing Mimikatz’s source code to understand which function(s) it calls, and then we can leverage the process demonstrated in my previous post to see how those functions nest underneath.

As a result, this post provides a detailed walk-through of how I analyzed the Mimikatz source code to tie the logonPasswords command to its respective function call(s).

Analyzing Mimikatz’ sekurlsa::logonPasswords

Mimikatz Github project

The first obstacle is determining where to start the analysis within the code base, as the Mimikatz GitHub project contains numerous folders and components. First, since we are planning on digging into the main component of Mimikatz, we will skip the mimidrv, mimilib, and other folders and navigate to the mimikatz folder.

Within the mimikatz folder, we see some files related to the Visual Studio project and the main mimikatz.c file. Let’s think about how Mimikatz commands work. Every Mimikatz command follows the module::command syntax. This means that we are interested in digging into the modules folder.

Once in the modules folder, we can consider the exact command used. This is an excellent time to think about the command to tell Mimikatz to dump credentials from LSASS. If we are unfamiliar with the tool’s usage, then we might not know. In that case, I’d recommend revisiting the source that brought the tool to our attention in the first place to see EXACTLY how it was used there. Maybe this would be a threat report, or maybe it is a blog post, but either way, a good source document should include the specific commands that were used.

The most common or canonical Mimikatz command to dump credentials from LSASS process memory is sekurlsa::logonPasswords. That means the code we are interested in is in the sekurlsa folder. Let’s check it out.

In the sekurlsa folder, we find several files. The main file is kuhl_m_sekurlsa.c, so we can click on that and dig in.

Getting Started

Each module in Mimikatz has a central file (named after the module), and at the beginning of the main file, we find a function table. This table is used to correlate a command to the internal function that is executed when it is issued. Here we see that the logonPasswords command points to kuhl_m_sekurlsa_all.

Mimikatz sekurlsa function table

At this point, we begin to follow a series of calls to internal functions. The kuhl_m_sekurlsa_all function calls the kuhl_m_sekurlsa_getLogonData function with two parameters. The first parameter is called lsassPackages, and the second appears to measure the size of the lsassPackages array.

kuhl_m_sekurlsa_all calling kuhl_m_sekurlsa_getLogonData

Let’s take a second to see what the lsassPackages array contains. This constant is an array of PKUHL_M_SEKURLSA_PACKAGE instances, which represent the different Security Support Provider/Authentication Packages that ship by default with Windows. We can look at the KUHL_M_SEKURLSA_PACKAGE structure definition to better understand what is contained in each.

lsassPackages variable being instantiated

The KUHL_M_SEKURLSA_PACKAGE structure is defined as having 5 fields in total. The first is the Name, presumably of the Authentication Package itself. The second field is called CredsForLUIDFunc, which appears to be some sort of callback function, possibly intended to retrieve credentials from the package. The third is simply a boolean value called isValid. However, there isn’t enough information here to understand EXACTLY what this field’s intended use is. The fourth field is the module’s name (DLL) that implements the Authentication Package. The fifth and last field is called Module and is of type KUHL_M_SEKURLSA_LIB.

Now we can look at the definition of the kuhl_m_sekurlsa_kerberos_package instance where we see that the name is set to kerberos, a function called kuhl_m_sekurlsa_enum_logon_callback_kerberos is set as the CredsForLUIDFunc value, the isValid field is set to TRUE, the ModuleName is set to kerberos.dll, and lastly, it appears that the final field is being initialized as a NULL instance of KUHL_M_SEKURLSA_LIB.

KUHL_M_SEKURLSA_PACKAGE kuhl_m_sekurlsa_kerberos_package = {L"kerberos", kuhl_m_sekurlsa_enum_logon_callback_kerberos, TRUE, L"kerberos.dll", {{{NULL, NULL}, 0, 0, NULL}, FALSE, FALSE}};

With that understood, let’s check out the implementation of the kuhl_m_sekurlsa_getLogonData function. It turns out that there isn’t much to get excited about here. We see the lsassPackages array being passed in along with the number of packages in the form of the nbPackages parameter. These parameters are added to an OptionalData variable and then passed as the second parameter to a new function called kuhl_m_sekurlsa_enum. Additionally, we see that the first parameter is kuhl_m_sekurlsa_enum_callback_logondata which seems to be some sort of callback function that we should investigate.

kuhl_m_sekurlsa_getLogonData calling kuhl_m_sekurlsa_enum

The following function is where things REALLY start to get interesting and a little convoluted. The kuhl_m_sekurlsa_enum function has a few decisions that must be made and a few functions that must be called. We will explore what we see and explain how to determine which path, for example, the code will choose. The first interesting thing we notice is the call to kuhl_m_sekurlsa_acquireLSA. Let’s take a look at its implementation.

kuhl_m_sekurlsa_enum (where the magic happens)

The kuhl_m_sekurlsa_acquireLSA function has a couple of decisions to make. The first decision is to check whether a handle, presumably to the LSASS process, already exists. If it doesn’t, it follows up with a check to see if the pMinidumpName variable has been set. We can check the code to see if either of these conditions is met.

kuhl_m_sekurlsa_acquireLSA checking if the handle to Lsass already exists

The first if clause checks to see if cLsass.hLsassMem is set as NULL. If we find where the cLsass variable was first declared, we find this bit of code cLsass = {NULL, {0,0,0}};. It appears as if cLsass is a structure with two fields. The first is set explicitly to NULL, while the second, a different type of structure, is set to three zeros. We should check the type definition for KUHL_M_SEKURLSA_CONTEXT to see the significance of either or each.

cLsass variable being set to NULL

According to the KUHL_M_SEKURLSA_CONTEXT structure’s definition, the first field is hLsassMem, the name of the value checked in kuhl_m_sekurlsa_acquireLSA. This means we can revisit the instantiation of the cLsass variable and see that hLsassMem is NULL, so the conditional statement (!cLsass.hLsassMem) evaluates as true and we will execute the code contained within the if statement.

KUHL_M_SEKURLSA_CONTEXT

The next check is related to the pMinidumpName value. You may have noticed this value was set in an earlier screenshot when we saw the cLsass value being declared.

kuhl_m_sekurlsa_acquireLSA checking the pMinidumpName variable

I’ve reshared the image, and we can see that the pMinidumpName variable is set to NULL.

The pMinidumpName variable being set by default to NULL

This means that the conditional statement (pMinidumpName) evaluates to false, which results in execution being passed to the else block.

Because the pMinidumpName value is set to NULL the else block is selected

Process Enumeration

Let’s look at the first line in the else block, where we see that a variable called Type is being set to the value KULL_M_MEMORY_TYPE_PROCESS. Notice line 159, where the Type variable is declared as the type KULL_M_MEMORY_TYPE.

The Type variable is set to KULL_M_MEMORY_TYPE_PROCESS

Next, we see the kull_m_process_getProcessIdForName subroutine being called. Notice that it takes two parameters, the first is the string lsass.exe, and the second is a pointer to a pid variable. The variable definition (line 161) shows that pid is a DWORD. Based on the function’s name, we might expect it to retrieve the process information, specifically the Process Identifier, for a process named lsass.exe. Let’s check it out.

A function named kull_m_process_getProcessIdForName is called with the string lsass.exe and a pointer to a DWORD named pid being passed as parameters.

We start our analysis of the kull_m_process_getProcessIdForName function by noticing a few variables being initialized. Importantly, we observe the mySearch variable being instantiated as an instance of the KULL_M_PROCESS_PID_FOR_NAME structure. This structure is instantiated with three fields: the first is a pointer to the previously instantiated uName variable; the second is the function’s second parameter, which is called processId here; and the third is the boolean value FALSE. Remember that the calling function passed in a pointer to a variable called pid, which is now called processId in this function. Next, we see a call to an API function called RtlInitUnicodeString, which takes a pointer to the uName variable we just talked about and the function’s first parameter, now called name. We know that the calling function passed the string lsass.exe as the first parameter, so this call appears to be converting that value into the particular data structure type (UNICODE_STRING) expected later. Lastly, we see a pointer to the mySearch variable being passed as the second parameter to a new function called kull_m_process_getProcessInformation. Let’s check it out.

kull_m_process_getProcessIdForName calls kull_m_process_getProcessInformation

The kull_m_process_getProcessInformation function is fairly straightforward. We see a call to the kull_m_process_NtQuerySystemInformation function where the first argument is an enum value, SystemProcessInformation (the source of this enumeration can be found here), the second argument is a pointer to a buffer, and the third argument is 0.

kull_m_process_getProcessInformation calls kull_m_process_NtQuerySystemInformation

When we investigate the kull_m_process_NtQuerySystemInformation function, we see that it is built simply to make a call to the NtQuerySystemInformation API function. NtQuerySystemInformation is a function meant to facilitate the enumeration of many different bits of information, one of which is related to running processes (this is indicated to the function by the SystemProcessInformation enumeration value seen earlier). The problem is that because NtQuerySystemInformation can return results of unpredictable types and sizes, the authors created a mechanism to enumerate the expected output size (buffer). This feature works because the function writes the actual length of the information requested to the parameter, represented here by returnedLen, and if that value is greater than the buffer size (represented by informationLength here), the function fails. The program can then use the value held in returnedLen to allocate an appropriately sized buffer and pass it in a second time.

However, if we look at line 19, it appears this code doesn’t leverage that built-in process for determining the size required for the output data. Instead, the buffer is initialized to a length of 0x1000 bytes, and if the function fails, the sizeOfBuffer value is shifted to the left by one bit (sizeOfBuffer <<= 1, doubling it), the buffer is reallocated, and the function is called again. This process repeats until the call is successful.

kull_m_process_NtQuerySystemInformation calls the NtQuerySystemInformation API function

With this understanding, we can then check the API documentation for NtQuerySystemInformation to better understand how it works. Generally speaking, this code section is meant to enumerate all processes and iterate through them until it finds lsass.exe. Once it finds the process it is looking for, it records the process identifier and returns it to be used in the next step.

NtQuerySystemInformation API function documentation

Process Access

After a short detour, we are back at kuhl_m_sekurlsa_acquireLSA, where the following line of interest calls the OpenProcess Windows API Function. Notice that there are three parameters being passed. The first is the processRights variable which is set on line 163 to either be PROCESS_VM_READ | PROCESS_QUERY_INFORMATION or PROCESS_VM_READ | PROCESS_QUERY_LIMITED_INFORMATION depending on the system’s Major Version.

It’s worth noting that PROCESS_VM_READ | PROCESS_QUERY(_LIMITED)_INFORMATION is the minimum set of access rights required for the resulting handle to be used with the next function call (which we will encounter later in the post). Since the processRights value is a bit field, alternative tools can request additional access rights, even ones they don’t need, to evade static detections that look for this exact mask.

The second parameter is simply a boolean FALSE value, and the third is the pid variable derived from the call to kull_m_process_getProcessIdForName. In other words, it is the Process Identifier for the LSASS process.

kuhl_m_sekurlsa_acquireLSA calling the OpenProcess API function

If we weren’t familiar with what OpenProcess is used for and what parameters it takes, we could reference the documentation to better understand how to use it. This function is used to open a handle to a process. This is necessary because process objects reside in the kernel, which means user-mode code cannot access them directly. Instead, the operating system provides an interface, OpenProcess, to request access to a process. It then grants access according to the process’s Discretionary Access Control List (DACL). Check the documentation page for more information on how OpenProcess is used and what each parameter signifies.

OpenProcess API function documentation

Process Read

Now that we’ve finished the analysis of kuhl_m_sekurlsa_acquireLSA, we can return to kuhl_m_sekurlsa_enum and continue our investigation there. After opening the process handle to LSASS, we can continue skimming the code, and the following function call is kull_m_memory_copy. It appears it is taking three parameters, two pointers to KULL_M_MEMORY_ADDRESS structures (the data and securityStruct variables are instantiated on line 295) and one DWORD. Let’s take a look at the code for kull_m_memory_copy.

After the handle is acquired, kull_m_memory_copy is called

We see that the code implementation of kull_m_memory_copy is made up of a series of nested switch statements which means that it behaves differently depending on a few variables we must explore to understand the proper flow for the logonPasswords command.

The first switch statement investigates the Destination variable, which was passed in as the first parameter to kull_m_memory_copy.

This function first checks the type of the Destination memory address (1st parameter)

If we look at the call to kull_m_memory_copy (line 323), we can see that the first parameter is &data, which is a pointer to a variable called data. We can scroll to the top of the kuhl_m_sekurlsa_enum function to find where the data variable is declared (line 295) and see that it is of the type KULL_M_MEMORY_ADDRESS; its first field is set to a pointer to a ULONG variable (nbListes) holding the value 1, and its second field is set to a pointer to a constant value called KULL_M_MEMORY_GLOBAL_OWN_HANDLE.

Remember that the switch statement in kull_m_memory_copy looks into the Destination variable, or more precisely, the Destination->hMemory->type value. To determine exactly what that value is, we must check out the definition for the type assigned to the Destination variable, which is KULL_M_MEMORY_ADDRESS.

KULL_M_MEMORY_ADDRESS structure definition

From this definition, we can see that the first field represents an address of some sort. Remember that a pointer to a ULONG with the value of 1 was assigned to this field. The second field is a pointer to a different structure called PKULL_M_MEMORY_HANDLE. Notice the name of this second field is hMemory. This means we are on our way to understanding the value embedded in Destination->hMemory->type as we’ve identified that hMemory is set to a constant called KULL_M_MEMORY_GLOBAL_OWN_HANDLE. Now we need to figure out what that constant represents.

We can search the code base for the definition of KULL_M_MEMORY_GLOBAL_OWN_HANDLE and find that it is a global variable of the type KULL_M_MEMORY_HANDLE. Its first field is set to the KULL_M_MEMORY_TYPE_OWN value (this looks like what we are interested in), and the second field is set to NULL.

KULL_M_MEMORY_GLOBAL

Let’s check out the structure definition for the KULL_M_MEMORY_HANDLE type to see what we are dealing with. The first field is a KULL_M_MEMORY_TYPE type and is called type (this is the missing piece of the puzzle). We can refer back to the KULL_M_MEMORY_GLOBAL_OWN_HANDLE global variable definition and see that the type is set to KULL_M_MEMORY_TYPE_OWN. This gives us the information that we need to determine the choice that will be made in this first switch statement.

KULL_M_MEMORY_HANDLE

Upon returning to kull_m_memory_copy, we can see that because Destination->hMemory->type is set to KULL_M_MEMORY_TYPE_OWN that the switch statement will choose the first case.

The Destination memory address was of type KULL_M_MEMORY_TYPE_OWN

Don’t celebrate too much because as soon as we start following the code inside our selected case condition, we find ANOTHER switch statement. This time it is investigating the Source parameter, specifically the Source->hMemory->type field. This is the same as the aforementioned switch statement but focused on Source instead of Destination this time. The Source parameter is the second parameter passed into kull_m_memory_copy, so let’s go back to the calling function and check it out.

This time we are checking the type of the Source memory address (2nd parameter)

We can see (line 323) that the second parameter to the kull_m_memory_copy function call is a pointer to a variable called securityStruct, which is instantiated at the beginning of kuhl_m_sekurlsa_enum (line 295) as a null value of type KULL_M_MEMORY_ADDRESS. We also see (line 321) that securityStruct.hMemory is being set to the value of the cLsass.hLsassMem variable.

Now would be a good time to revisit the KULL_M_MEMORY_ADDRESS structure definition, where we can see that the hMemory field is a pointer to a KULL_M_MEMORY_HANDLE.

KULL_M_MEMORY_ADDRESS

We can then review the KULL_M_MEMORY_HANDLE structure definition to see that it has two fields. The first field is called type, the value we are interested in to determine which case we will follow in the switch statement. The second field can be any of four possible handle types (PKULL_M_MEMORY_HANDLE_PROCESS, PKULL_M_MEMORY_HANDLE_FILE, PKULL_M_MEMORY_HANDLE_PROCESS_DMP, or PKULL_M_MEMORY_HANDLE_KERNEL).

KULL_M_MEMORY_HANDLE

So now we need to figure out where the cLsass variable is being set to understand the contents of the securityStruct variable. Recall that we discovered earlier that the cLsass variable is designated as a global variable and instantiated with NULL values.

This means it must get set somewhere in the code we’ve already investigated. Remember in kuhl_m_sekurlsa_acquireLSA that it first checks to see if cLsass.hLsassMem contains a valid handle. If not, it checks to see if pMinidumpName is set. If not, it executes the section of code highlighted in the image below.

Here we see that a variable called Type is being set to KULL_M_MEMORY_TYPE_PROCESS. We can see that Type is declared at the beginning of the kuhl_m_sekurlsa_acquireLSA function as an instance of the KULL_M_MEMORY_TYPE type. We are interested in this value, but we must first understand how this value is assigned to the cLsass variable.

kuhl_m_sekurlsa_acquireLSA set the type value to KULL_M_MEMORY_TYPE_PROCESS

If we keep following the code after the OpenProcess call, we find another if statement where it validates that hData is valid. Assuming that the call to OpenProcess succeeded, then it would be. This means the subsequent line would be executed where the kull_m_memory_open function is called. Notice the three parameters that are being passed to it. Type is the variable set (line 178) to KULL_M_MEMORY_TYPE_PROCESS, hData is the output of OpenProcess, which is a process handle to LSASS, and cLsass.hLsassMem is the value that is being used to determine which case we choose in the switch statement we are exploring.

We can now investigate the kull_m_memory_open function to see what it does with its parameters. In the function, we see that the third parameter passed in as cLsass is now called *hMemory. We first see (line 17) that (*hMemory)->type, otherwise known as cLsass->type in the calling function, is set to the Type parameter.

kull_m_memory_open function implementation

Next, we encounter a switch statement based on the first parameter, Type. We know from the calling function that Type is set as KULL_M_MEMORY_TYPE_PROCESS, so we can see that the code would select the second case.

kull_m_memory_open’s switch statement based on the Type parameter

Before we start analyzing the code, it is essential to understand that the name of the parameter, when passed in from the calling function, does not necessarily correspond with the name of the parameter in the called function. For instance, in the calling function, kuhl_m_sekurlsa_acquireLSA, the second parameter is called hData, but in the called function, kull_m_memory_open, the second parameter is called hAny. Similarly, the kuhl_m_sekurlsa_acquireLSA function’s third parameter was a pointer to cLsass, but in kull_m_memory_open, it is referred to as *hMemory.

In the code, we see (line 26) that hAny is assigned to (*hMemory)->pHandleProcess->hProcess. This means that cLsass->type is KULL_M_MEMORY_TYPE_PROCESS and that cLsass->pHandleProcess->hProcess is the handle generated from the OpenProcess API call.

The KULL_M_MEMORY_TYPE_PROCESS case in kull_m_memory_open

Finally, we have determined that the value of Source->hMemory->type is KULL_M_MEMORY_TYPE_PROCESS which means we can see that the code will follow the second case as highlighted in the image below. From here, the code is relatively straightforward as we see a call to the Windows API function ReadProcessMemory, which is used to read the memory of the LSASS process from the source address (as you could probably figure out from the function’s name).

The Source type was KULL_M_MEMORY_TYPE_PROCESS

As with the previous two function calls, we can check the Microsoft documentation for ReadProcessMemory to understand how it is used and what it does when called. This represents the end of our analysis today, where we discovered that Mimikatz’s sekurlsa::logonPasswords command uses the ReadProcessMemory Windows API function to access the contents of LSASS’s process memory and access the target credentials.

ReadProcessMemory API function documentation

Conclusion

After analyzing Mimikatz’ sekurlsa::logonPasswords command we found that it is generally calling three Windows API functions: NtQuerySystemInformation to get the process identifier (PID) for the LSASS process, OpenProcess to open a read handle to LSASS, and ReadProcessMemory to read the contents of LSASS memory where presumably the credentials that are being dumped are stored. To show the relationship between these calls I created a graph below:

We can now follow the process described in my Understanding the Function Call Stack post to dig into each of these function calls. An interesting side effect is that we can see that these function calls are intertwined in the sense that the output of the first call is required as an input to the second call, and the result from the second call is required as an input to the third call. This means that the sequence of functions is potentially as interesting as the individual functions themselves.

When I first started in information security (before I actually understood how API functions work), it was common to hear people describe a pattern indicative of “process injection.” The pattern was VirtualAllocEx, WriteProcessMemory, and CreateRemoteThread (interestingly, they don’t mention OpenProcess even though its handle is needed as an input for all three), and we were often told that if you see this pattern, then you are witnessing Process Injection: Dynamic-link Library Injection. We’ve established at least one pattern that might be commonly seen in OS Credential Dumping: LSASS Memory type behavior (NtQuerySystemInformation, OpenProcess, ReadProcessMemory). Still, we should not assume that every instance of this pattern IS credential dumping or that this is the ONLY valid pattern of credential dumping. Can you think of any other function combinations that might be indicative of credential dumping? We will explore the answer to this question in the next post!


On Detection: Tactical to Functional was originally published in Posts By SpecterOps Team Members on Medium, where people are continuing the conversation by highlighting and responding to this story.

The post On Detection: Tactical to Functional appeared first on Security Boulevard.

Understanding the Function Call Stack
https://securityboulevard.com/2022/06/understanding-the-function-call-stack/
Mon, 27 Jun 2022

There’s more than meets the eye under the function call hood

This post is based on a September 2021 Twitter thread that I wrote to describe the same concept regarding function calls and their hidden hierarchy. That thread was inspired by a series of tweets by inversecos, who shared how malware authors will often use Native APIs instead of Win32 APIs as a mechanism to evade naive detections that assume every application will use the Win32 API functions. I wanted to explain WHY that approach worked, and now I think it is appropriate to solidify my thoughts in blog form.

Introduction

A common feature of operating systems is the need to provide a mechanism for developers to interact with the operating system. This feature typically manifests through something called an Application Programming Interface (API), which is composed of a series of “functions” that developers can use to perform common tasks. There are a few value propositions inherent in an API. First, API functions serve as an abstraction layer that simplifies complex tasks. It isn’t necessary for all developers to understand the complicated components of the file system (such as the Master File Table), how to determine where free space exists on the hard drive, and how to write data to the physical hard drive. This is all managed on behalf of the developer by the relevant functions. Second, the API acts to constrain the variability in how developers interact with the operating system. The OS is a relatively sensitive system, and the wrong changes to certain system settings can cause serious instability. For instance, many operations, like Windows service creation, require interacting with and making changes to the Windows Registry. It can be dangerous to make direct changes to the registry because the OS has very specific expectations for how data is structured there, and certain changes can cause a fatal system error resulting in a Blue Screen of Death (BSOD). To avoid the need for all developers to interact with and understand the structure of the Windows Registry, API functions provide an abstraction layer that manages that complicated problem for us.

How Functions Nest

In the context of Microsoft Windows, there are actually multiple layers of functions that tend to nest or call each other. The most superficial layer, the one intended for third-party developers to interact with, is called the Win32 API. The cool thing about this API is that it is well documented by Microsoft, meaning it is easy to look up instructions on what a function does and how it should be called. Another guarantee that Microsoft gives is that documented functions will continue to work as expected in perpetuity (it might not literally be perpetuity, but you can expect documented functions to work similarly for a long time). So generally speaking, the recommendation and desire from Microsoft is that third-party developers use these public or documented functions that are collectively referred to as the Win32 API. However, these documented functions are not the only functions that exist. In fact, many documented functions actually call other documented functions or even undocumented functions behind the scenes.

Case Study — CreateFileW

In order to better understand the relationship between function calls or the nesting of function calls we should look into a case study of one commonly encountered Win32 API function, specifically CreateFileW.

Finding the Implementation via a DLL

The first step to digging into CreateFileW is to determine what Dynamic Link Library (DLL) implements this function. DLLs are binary files (files that implement code) that create shared functionality. The idea is that it doesn’t make sense for every programmer to reinvent the wheel when the task they are performing is common and predictable. Instead, that functionality can and should be integrated into DLLs which can then be referenced by applications. The Win32 API is implemented in a set of default or built-in DLLs, and functions that are meant for application consumption are marked as “exported functions”. To determine which DLL implements or exports CreateFileW we can reference the function’s public documentation. In the requirements section, we see that CreateFileW is implemented by kernel32.dll.

Once we determine which DLL implements the function we can investigate how the function works under the hood. One way to do this is to load the DLL into a disassembler like IDA or Ghidra. Once kernel32.dll is loaded into IDA and public symbols are applied we can click on the exports tab which has a list of all of the functions that are exported by the DLL. This list includes the friendly name of the function, the relative address of the function’s implementation, and the function’s ordinal number. We can search for CreateFileW and once we find it we can double-click on it in the list which will take us to the function’s implementation.

CreateFileW’s Code Implementation

When we arrive at CreateFileW’s implementation it immediately appears fairly underwhelming. This is the beginning of the function nesting that we alluded to earlier. Strangely, it appears that CreateFileW is calling itself, but it’s not quite that simple. Notice the notation __imp_CreateFileW, the __imp_ part indicates that this version of CreateFileW does not exist in the current DLL (kernel32.dll). Instead, this version of the function is actually being “imported” from a foreign DLL. Just as we discussed that DLLs export functions that are meant for other applications to use, the way a binary can use an exported function is through “importing” it as we see here.

The Import Table

We can view kernel32.dll’s Import Table to find the version of CreateFileW that is being imported. Notice in this case that the library is not kernel32; instead, it is api-ms-win-core-file-l1-1-0, which is a different library. There is a convention for referring to functions that share a name but are implemented by different DLLs: DLLName!FunctionName. For example, the version of CreateFileW that is implemented by kernel32.dll is referred to as kernel32!CreateFileW, while the version that is implemented by api-ms-win-core-file-l1-1-0 is api-ms-win-core-file-l1-1-0!CreateFileW. Okay, so to continue the journey toward understanding the function call hierarchy, we must understand what api-ms-win-core-file-l1-1-0 represents.

Application Programming Interface (API) Sets

In the previous section we found that kernel32!CreateFileW was importing api-ms-win-core-file-l1-1-0!CreateFileW. The api-ms-win-core-file-l1-1-0 component refers to a technology present in newer versions of Windows called API Sets. API Sets are meant to be transparent to users and developers, but it is important for us to understand how to navigate them to continue digging into the CreateFileW implementation. Geoff Chappell does an excellent job documenting this technology on his blog. Notice that there are many file-related APIs that are all redirected to the same library name. This library serves as a redirector to a third implementation of CreateFileW, but it isn’t obvious where this version of the implementation resides. In order to figure out where CreateFileW’s implementation is located, we must “resolve” the API Set to determine where it directs execution. This can be done using James Forshaw’s wonderful PowerShell module NtObjectManager, which has a super helpful cmdlet called Get-NtApiSet. This cmdlet resolves API Sets to their redirection DLL.

Undocumented Function

By using Get-NtApiSet, we find that api-ms-win-core-file-l1-1-0 resolves to kernelbase.dll. This means that the next function in the chain, kernelbase!CreateFileW, is an undocumented function. An undocumented function is one that, while externally available to third-party applications, is not publicly documented anywhere. It can therefore be difficult to use: it might take slightly different parameters than the documented version, and Microsoft generally reserves the right to change how undocumented functions work. Legitimate developers should generally avoid undocumented functions, but malware developers might use them to present an unexpected look to defenders. We can now load kernelbase.dll into our disassembler, load symbols, and navigate the export table to find the CreateFileW implementation.

We can double-click on CreateFileInternal, the internal function called by kernelbase!CreateFileW, to view its content which reveals a call to an imported function called NtCreateFile.

Let’s check the import table to see which DLL implements this function. In the screenshot below, we see that NtCreateFile resides within ntdll.dll, which means we can refer to this specific function as ntdll!NtCreateFile. Functions implemented by ntdll belong to a special class of functions called Native functions. Generally speaking, Microsoft discourages the direct use of Native functions for reasons similar to those for undocumented functions: the Win32 API often helps the developer with complicated tasks that are not as easy to implement at the Native API layer.

We can load ntdll.dll into our disassembler, load symbols, and navigate to the implementation of NtCreateFile. Native APIs generally have a simple but very important role in the hierarchy of execution: their responsibility is to make the appropriate system call (syscall). A syscall is a special type of function call that transfers execution from user mode to kernel mode via the SYSCALL instruction on 64-bit systems. Notice that the basic block on the bottom right performs a “syscall” operation. Also, the highlighted value, 55h, represents the system service number of NtCreateFile’s counterpart in the kernel, which we can call ZwCreateFile. A complicating factor is that the number associated with a particular syscall may change from one version of the operating system to the next, so developers operating at this level must be extremely conscious of the OS version they are targeting. Making direct syscalls has recently become a relatively common visibility bypass, especially because it’s a great way to avoid any user-mode signal generation. To see how this is being leveraged, check out the windows-syscalls project by j00ru.
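For readers without a disassembler handy, the stub described above typically looks like the following reconstruction. This is a sketch, not a byte-for-byte dump: the 55h service number is the value highlighted in the screenshot and is specific to that OS build, and recent builds include additional instructions (such as an int 2Eh fallback check) that are omitted here.

```
; ntdll!NtCreateFile stub on 64-bit Windows (simplified reconstruction)
mov  r10, rcx      ; SYSCALL clobbers rcx, so the first argument is staged in r10
mov  eax, 55h      ; system service number for NtCreateFile (build-specific!)
syscall            ; transition to kernel mode; dispatches to ZwCreateFile
ret
```

Because the entire stub is just a few instructions that load a number and execute SYSCALL, attackers can reproduce it in their own code and skip ntdll entirely, which is exactly why direct syscalls defeat user-mode hooks.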

While we could continue our analysis into the kernel, I think we’ve explored enough to understand the layered nature of function calls. The important point is to realize that any “exported” function is an entry point for an application. While the vast majority of applications will follow the “prescribed” path of calling the relevant Win32 API function (kernel32!CreateFileW in this case), it is sometimes advantageous for attackers to perform actions in unorthodox ways. If defenders ASSUME that all file creation will be performed via the approved or prescribed path, and as a result only pay attention to the use of Win32 APIs, then attackers can take advantage of the naivety of that approach to avoid visibility.

I’ve generated a graph representation of the function calls that we observed during this blog post so we can see the relationship between them. I’ve also highlighted exported functions in red to indicate that those functions are valid entry points into the graph (even if most developers would only choose to use the documented Win32 API function).
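The graph can also be modeled in code. The toy sketch below uses the function names traced in this post (simplified to the single chain we followed) and enumerates every path from an exported entry point to the syscall, which makes the defensive point concrete: a sensor watching only kernel32!CreateFileW sees one of three possible entry points.

```python
# Simplified call graph from this post. Edges point at the next function
# in the chain; "syscall 0x55" stands in for the kernel transition.
CALL_GRAPH = {
    "kernel32!CreateFileW": ["kernelbase!CreateFileW"],
    "kernelbase!CreateFileW": ["kernelbase!CreateFileInternal"],
    "kernelbase!CreateFileInternal": ["ntdll!NtCreateFile"],
    "ntdll!NtCreateFile": ["syscall 0x55"],
    "syscall 0x55": [],
}

# Exported functions are valid entry points into the graph, even though
# most developers only use the documented Win32 API.
ENTRY_POINTS = [
    "kernel32!CreateFileW",
    "kernelbase!CreateFileW",
    "ntdll!NtCreateFile",
]

def paths_to_syscall(start: str, target: str = "syscall 0x55") -> list[list[str]]:
    """Enumerate every path from a starting function to the syscall."""
    if start == target:
        return [[target]]
    paths = []
    for nxt in CALL_GRAPH.get(start, []):
        for tail in paths_to_syscall(nxt, target):
            paths.append([start] + tail)
    return paths

for entry in ENTRY_POINTS:
    for path in paths_to_syscall(entry):
        print(" -> ".join(path))
```

An attacker entering at ntdll!NtCreateFile (or issuing the syscall directly) bypasses every observation point above it, which is precisely the evasion concern raised in this post.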

The major takeaway is that, in the context of cybersecurity and detection, we shouldn’t assume that attackers will choose the most traveled or prescribed path to perform a particular behavior. It is therefore worthwhile to understand the options and evaluate opportunities to observe behavior at each functional level. It is also worth noting that this graph represents only one path of functions that can be used to create a file; in my experience, there are often MANY possible paths. A future blog post will demonstrate how we can enumerate the numerous paths to a certain behavior to produce a complex graph, and how to use that graph to evaluate detection rule coverage and predict ways in which adversary procedures may evolve. The same principles apply: if an attacker can accomplish the behavior via an unexpected path, they will have the upper hand relative to stealth. Simultaneously, by mapping all possible paths, or at least all known paths, defenders can predict hypothetical implementations, tools, or paths even if they haven’t explicitly observed a tool that leverages that approach.


Understanding the Function Call Stack was originally published in Posts By SpecterOps Team Members on Medium, where people are continuing the conversation by highlighting and responding to this story.

The post Understanding the Function Call Stack appeared first on Security Boulevard.
