Ghidra Basics - Manual Shellcode Decryption

This post is a continuation of "Malware Unpacking With Hardware Breakpoints".

Here we will be utilising Ghidra to locate the shellcode, analyse the decryption logic and obtain the final decrypted content using Cyberchef.

Locating the Shellcode Decryption Function In Ghidra

At the point where the hardware breakpoint was first triggered, the primary executable was likely in the middle of the decryption function. We can use this information to locate the same decryption function within Ghidra.

From here, we can do some interesting things which are covered in the next 7 sections.

Locating the Shellcode Decryption Function In Ghidra
Identifying Decryption Routine Logic With ChatGPT
Identifying the Decryption Key Using Ghidra
Locating the Encrypted Shellcode Using Entropy
Performing Manual Decoding Using Cyberchef
Hunting For Additional Samples Using Decryption Bytes
Creating a Yara Rule Using Decryption Code

Locating the Shellcode Decryption Function In Ghidra

If we run the malware again, we can stop at the initial hardware breakpoint trigger and scroll up slightly in the disassembly window.

This will reveal the decryption logic used to obtain the shellcode.

In addition to the notes above, we can observe that the instruction pointer RIP is inside of a loop that contains an XOR instruction.

A looping XOR instruction can be a strong indicator of decryption/decoding logic.

If we copy the contents of the loop, we can use this to investigate the logic inside of Ghidra.

We can use Right-Click -> Binary -> Edit to copy out the bytecodes associated with the suspected decoding loop.

We can load the file within Ghidra and perform a memory search on these suspicious bytes. This can lead us to the decryption function.

Ghidra -> Search -> Memory - Make sure to use "hex" format and "All Blocks"

By clicking "Next" or Search All within the search menu, we are taken straight to the decryption function inside of Ghidra.

We can observe the call to VirtualAlloc, VirtualProtect and CreateThread inside of the Decompiler.

We can also view the decompiled decryption logic inside of the for loop. A primary giveaway here is the ^ xor operator.

If the above output is confusing, you can increase the readability by disabling type casts. Edit -> Tool Options -> Decompiler -> Disable Printing of Type Casts.

We can also see that the variable lpAddress (assigned to the result of VirtualAlloc) will receive the decoded content (it is assigned the result of the xor ^ operation) and is then modified by VirtualProtect and then executed via CreateThread

Now that we've identified the function and logic associated with decryption, we can go ahead and try to identify the type of encryption/obfuscation used.

Identifying Decryption Routine Logic With ChatGPT

Using ChatGPT, we can attempt to gather additional information about the decompiled code.

We can copy out the decompiled code, and ask ChatGPT something like In 3 sentences or less, can you summarise the purpose of this Ghidra Decompiled code

This identifies the general gist of the code, but doesn't provide a lot of information about the decryption routine.

To gather more information about the encoding itself, we can take out the contents of the for loop and summarise it with ChatGPT.

ChatGPT is able to recognize that a simple 4-byte key is used to decrypt some bytes and write them to the buffer we identified at lpAddress

Identifying the Decryption Key Using Ghidra

If we return to the decompiler output, we can observe the 4-byte key param_3 that was referenced by ChatGPT.

We can confirm that param_3 is part of the for loop used for decoding, and also that the value of param_3 is not visible within this function.

By right-clicking on the function name from the above screenshot, FUN_0040152e we can use "Show References To" to identify where the function is called.

We can use this to identify the value that is passed in param_3, which likely contains the 4-byte decryption key.

By Clicking on the value in the 3rd argument (param_3), we can jump to the location where the 4-byte key is stored.

This can be seen in the left window of the below screenshot. The 4 byte decryption key is 32 2f 0d 96

Locating the Encrypted Shellcode Using Entropy

The process of locating the encrypted shellcode is slightly more complex.

Cobalt Strike uses a system of named pipes to move around encrypted data. It is quite tedious to locate the shellcode from the point of the previous screenshot.

Instead, we will use entropy to locate the encrypted shellcode content.

We can begin this process by

Enabling the entropy view
Identifying a high-entropy section
Locating the beginning of the high entropy section using "recent labels"

We can begin by enabling the entropy view and clicking on the area with the highest entropy.

Typically high entropy areas are indicated by a red section within the entropy view. However, for some reason, Ghidra also highlights high entropy areas with a bright white colour.

(There is an entropy colour reference within Ghidra but it's blank when using dark mode)

We can move on by clicking anywhere within the white section.

We will now be somewhere within the encrypted section.

We want to go to the start of the encrypted region, which we can do by selecting the "L" (most recent label) button in Ghidra.

We should now be at the start of the encrypted content.

If we want to obtain the encrypted content for manual decoding, we can highlight it and select Copy Special -> Byte String

Performing Manual Decoding Using Cyberchef

From here we can paste the encrypted content into CyberChef and decrypt it using the 4-byte key identified from param_3 in the previous heading.

In the CyberChef output, we can observe the same strings previously identified within the decrypted shellcode.

Hunting For Additional Samples Using Decryption Bytes

Decryption and decoding routines are often unique enough to be used for malware hunting and Yara rules.

If we go back and take the bytes we obtained from x64dbg and search with Ghidra, we can go hunting for additional samples.

For example, we can search for additional samples using unpac.me. In this case, 58 results were obtained which all appeared to be Cobalt Strike samples.

The results from the search all returned 50+ results for Cobalt Strike on Virustotal.

Creating a Yara Rule Using Decryption Code

Working on the (generally safe) assumption that the decryption logic remains the same across similar samples.

We can use the identified decryption bytes to create a simple Yara rule. This should return the same results as the previous byte search with unpacme.

Locating the Shellcode Decryption Function In Ghidra

Locating the Shellcode Decryption Function In Ghidra

Identifying Decryption Routine Logic With ChatGPT

Identifying the Decryption Key Using Ghidra

Locating the Encrypted Shellcode Using Entropy

Performing Manual Decoding Using Cyberchef

Hunting For Additional Samples Using Decryption Bytes

Creating a Yara Rule Using Decryption Code

Sign up for Embee Research