Obfuscated VBA, PowerShell, C#

VBA macros remain one of the most prevalent attack vectors to this day. Today we analyze a multi-tiered, obfuscated VBA macro whose infection chain includes VBA, WMI objects, PowerShell, inlined C#, and an AMSI bypass.


Often we are asked, "Where do you get the malware you analyze?" In the About Us section of our website we state something along the lines of "online sandboxes, honeypots, and readers like you!" Well, today's sample comes to us from a reader who wanted to see how we approach analyzing this type of VBA-enabled Word document, and we are happy to oblige.

Initial Analysis

We start off by inspecting the .DOC in a hex editor. The file begins with the bytes "PK", which tells us that it is really a ZIP archive regardless of its extension. Modern Microsoft Word documents are ZIP files, and if you rename the document from .DOC to .ZIP you can unzip the contents. However, we're going to do something different to continue our analysis.
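The same header check can be scripted when there are many samples to triage. The following is a minimal sketch, not part of the original workflow: the function names are hypothetical, and the only facts it relies on are the ZIP local-file-header signature ("PK\x03\x04") and the OLE2 compound-file signature used by legacy Office documents.

```python
# Magic-byte check: tell a ZIP-based (modern) Office document from a legacy
# OLE2 one, regardless of what the file extension claims.
ZIP_MAGIC = b"PK\x03\x04"                          # ZIP local file header
OLE2_MAGIC = b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1"   # OLE2 compound file header


def identify_container(header: bytes) -> str:
    """Classify a document from the first bytes of the file."""
    if header.startswith(ZIP_MAGIC):
        return "zip"
    if header.startswith(OLE2_MAGIC):
        return "ole2"
    return "unknown"


def identify_file(path: str) -> str:
    """Read only the first 8 bytes of a file on disk and classify them."""
    with open(path, "rb") as handle:
        return identify_container(handle.read(8))
```

A result of "zip" for a file named .DOC is exactly the mismatch we observed in the hex editor.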

We give our document to the OfficeMalScanner tool and allow it to search the document for embedded items like Visual Basic for Applications (VBA) scripts. This document happens to contain one, and the tool extracts it as VBAPROJECT.BIN into our %temp% directory.

If we then scan the extracted VBAPROJECT.BIN with OfficeMalScanner it will output the actual VBA script from the binary file so we can inspect its contents.
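If OfficeMalScanner is not at hand, the first half of that step (pulling the VBA project out of the ZIP container) can be approximated with Python's standard library. This is a sketch under the assumption that the sample is a ZIP-based document; the function name is hypothetical. It only locates the raw vbaProject.bin member. It does not decompress the VBA source stored inside it, which still requires a dedicated tool.

```python
import io
import zipfile


def find_vba_projects(doc_bytes: bytes) -> dict:
    """Return {archive member name: raw bytes} for every vbaProject.bin found."""
    found = {}
    with zipfile.ZipFile(io.BytesIO(doc_bytes)) as archive:
        for name in archive.namelist():
            # Compare case-insensitively; tools differ in how they spell the name.
            if name.lower().endswith("vbaproject.bin"):
                found[name] = archive.read(name)
    return found
```

An empty result means the archive holds no VBA project under that name, which is itself a useful triage signal.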

Now that we have a human-readable VBA script, we can see that the code is poorly formatted (most likely on purpose), so we need to clean it up by removing redundant newlines ("\n"), tabs ("\t"), and spaces ("\s") until we get properly formatted VBA code. This is not necessary to run the code. However, if we are going to debug it in Microsoft Word's Developer Tools, we need the code split across multiple lines in order to place breakpoints.
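The cleanup is mechanical enough to script. The sketch below is one possible approach rather than the exact procedure we used: it drops blank lines and collapses runs of spaces and tabs. Note the assumption that whitespace inside string literals does not matter; if the macro depends on exact spacing inside a quoted string, this naive pass would change its behavior.

```python
import re


def tidy_vba(source: str) -> str:
    """Drop blank lines and collapse runs of spaces/tabs in VBA source."""
    normalized = source.replace("\r\n", "\n").replace("\r", "\n")
    lines = []
    for raw in normalized.split("\n"):
        # Collapse horizontal whitespace, then trim both ends of the line.
        line = re.sub(r"[ \t]+", " ", raw).strip()
        if line:
            lines.append(line)
    return "\n".join(lines)
```

Indentation is discarded here; the VBA editor can be used to re-indent once the statements sit on their own lines.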

And after cleaning up the code a bit we are left with something like this:

Opening Pandora's Box

Now that we see what VBA code we will be dealing with, let's open the document in Microsoft Word.

We need to open the document in Microsoft Word, rather than just run the VBA script by itself, because this particular script reads its data from hidden variables stored within the document. The two variables can be seen in the "ActiveDocument.Variables(...).Value" statements of the code.

You may be wondering how you can find these variables, since they are not readily apparent in the VBA code or in the document's content, properties, or settings. We start by opening the Developer Tools in Microsoft Word and pasting our formatted script into the VBA window (replacing the poorly formatted one). We then place a breakpoint in the "Sub Document_Open()" procedure, which is the one that runs when the Word document is opened. Then we run the script, which will stop at our breakpoint.

The two variables in question can be found by expanding the "[+] Me" entry in the Locals window at the bottom of the VBA editor. You then need to scroll down to and expand "Variables". Now inspect the "Item 1" and "Item 2" drop-downs and you will find the two variables seen in the script.
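As a static cross-check that does not require running the macro, the variables can often be read straight out of the ZIP container. The sketch below assumes the sample is a ZIP-based document and relies on the fact that WordprocessingML stores document variables as w:docVar elements (with w:name and w:val attributes) in word/settings.xml. The function name is hypothetical, and this is not the method used in the walkthrough above.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML main namespace, bound to the "w:" prefix in settings.xml.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def read_doc_variables(doc_bytes: bytes) -> dict:
    """Return {variable name: value} for each w:docVar in word/settings.xml."""
    with zipfile.ZipFile(io.BytesIO(doc_bytes)) as archive:
        if "word/settings.xml" not in archive.namelist():
            return {}
        root = ET.fromstring(archive.read("word/settings.xml"))
    variables = {}
    for node in root.iter(f"{{{W_NS}}}docVar"):
        name = node.get(f"{{{W_NS}}}name")
        if name is not None:
            variables[name] = node.get(f"{{{W_NS}}}val")
    return variables
```

The values returned this way are still encoded; they only become meaningful after the VBA decoding routine has been applied to them.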

These two variables will be read into the VBA script and decoded. This is where the functionality of this script is hidden. The first variable is decoded to a WMI Object "winmgmts:\\.\root\cimv2:Win32_Process" that is responsible for launching the content of the second variable. 

The second variable is decoded to...can you guess...yup...our old friend PowerShell.

If we continue to run our script, it will eventually launch a PowerShell process with the decoded script from variable two. We can see the contents of this script when PowerShell launches by inspecting the process in a tool called Process Hacker.

Continuing Deobfuscation

Now that we have the PowerShell extracted, we can once again clean up the code a bit and separate it into its proper lines by replacing ";" with ";\n" and fixing up line spacing.
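A blind search-and-replace on ";" also breaks any semicolon that sits inside a quoted string. The following sketch is a slightly more careful variant with a hypothetical function name; it tracks single and double quotes and only breaks lines outside of them. It is deliberately simple: it does not understand backtick escapes, doubled quotes, or here-strings, and it will also split the header of a "for (;;)" loop, so the result still needs a manual review.

```python
def split_statements(script: str) -> str:
    """Insert a newline after each ';' that is not inside a quoted string."""
    out = []
    quote = None  # the quote character we are currently inside, if any
    for ch in script:
        out.append(ch)
        if quote:
            if ch == quote:
                quote = None
        elif ch in ("'", '"'):
            quote = ch
        elif ch == ";":
            out.append("\n")
    # Trim each resulting line and drop the empty ones.
    lines = [line.strip() for line in "".join(out).split("\n")]
    return "\n".join(line for line in lines if line)
```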

We do this so we can debug the code in PowerShell's IDE/debugger that comes with Windows. You can find it by going to your Start menu and typing "ISE". It should display as "Windows PowerShell ISE". In order to run the code you will need to type the following command in the console window: "Set-ExecutionPolicy unrestricted".

The PowerShell script is fairly straightforward. From the cleaned-up code we can see one deobfuscation function at the top, which is responsible for decoding the blob of text stored in the middle variable. At the bottom of the script we also see an "Add-Type" line along with a call into the "c193b()" function. "Add-Type" is most often used to inline code written in a different programming language from PowerShell. In this case, we see that the script has decoded C# code.


So we've made it this far. We've run a VBA-enabled Word document, let it decode hidden variables into obfuscated PowerShell that it launched through a WMI object, and we've deobfuscated the PowerShell to find inlined C#. We're almost done. We can once again clean up the deobfuscated code a bit so we can load it in a proper IDE and debug it. In this case we have C#, so we can start a new C# project in Visual Studio and paste the code into the default "Hello World" program. We need to put the "using" statements at the top (these are essentially includes/imports like in other languages) and paste the "public class" code we deobfuscated below the Main function.

From our PowerShell script we saw that it was calling into the "yba2983" class at the function "c193b()". So we need to place this call in the program's "Main" so it will mimic the handoff from PowerShell. Be sure to place a breakpoint in the "c193b" function so you can step through it. We are now at the end, right? We can finally see what this downloader was trying to put on our system. Well, almost. There is one final thing to point out before we reach the end.

AMSI Bypass

The Antimalware Scan Interface (AMSI) is a system protection that came with later versions of Microsoft Windows, and it helps protect against these types of attacks. Microsoft describes AMSI as:

The Windows Antimalware Scan Interface (AMSI) is a versatile interface standard that allows your applications and services to integrate with any antimalware product that's present on a machine. AMSI provides enhanced malware protection for your end-users and their data, applications, and workloads.

In simple terms, "amsi.dll" is loaded into script hosts such as PowerShell, and script content is passed through the AmsiScanBuffer API before it executes. So all of the deobfuscated text we found would be inspected through the AmsiScanBuffer interface for suspicious or malicious content. If the installed antimalware product flags any, AMSI blocks the script from executing. Where there are protections, there are bypasses, and that's exactly what we find at the beginning of the C# code.

This particular bypass attempts to call "LoadLibrary" with "amsi.dll" as its target. If this succeeds, the code knows that AMSI exists on the system. Next the code calls "GetProcAddress" for AmsiScanBuffer, unprotects its memory region by calling "VirtualProtect", and then selectively patches "amsi.dll" in memory via a call to "RtlMoveMemory". By patching "amsi.dll", the authors remove the protective checks that would prevent the script from running.
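From a defender's point of view, the API sequence described above is itself a useful indicator. The sketch below is a simple triage helper, not a detection rule we ship: it scans already-deobfuscated text for the names mentioned in this section and flags text that both references AMSI and uses a memory-patching API. The marker list and function names are our own choices for illustration, and the check only works on decoded text, since obfuscated samples will not contain these strings in the clear.

```python
# Names taken from the bypass described above; matching is case-insensitive.
AMSI_PATCH_MARKERS = (
    "amsi.dll",
    "AmsiScanBuffer",
    "LoadLibrary",
    "GetProcAddress",
    "VirtualProtect",
    "RtlMoveMemory",
)


def find_markers(text: str) -> list:
    """Return the markers present in the text, in the order listed above."""
    lowered = text.lower()
    return [marker for marker in AMSI_PATCH_MARKERS if marker.lower() in lowered]


def looks_like_amsi_patch(text: str) -> bool:
    """Flag text that references AMSI and also uses a memory-patching API."""
    hits = set(find_markers(text))
    targets_amsi = bool(hits & {"amsi.dll", "AmsiScanBuffer"})
    patches_memory = bool(hits & {"VirtualProtect", "RtlMoveMemory"})
    return targets_amsi and patches_memory
```

A positive result is a prompt to look closer rather than a verdict; legitimate security tooling can reference the same names.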

The Finish Line

If we continue to debug the C# code we will eventually deobfuscate the final variable and reveal the URL of the intended download.

The code will then create a new WebClient, download the executable to a temporary location, and launch it as a new process. The second stage of this malware could be anything: a Remote Access Trojan (RAT), banking malware, ransomware, and so on. However, the purpose of this article is to demonstrate how to analyze the first stage, so we will not follow the second-stage download.
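Once the final string is decoded, pulling the URL out for reporting is easy to automate. The following sketch uses hypothetical function names and a deliberately simple regular expression to extract http/https URLs from deobfuscated text, then "defangs" them so they cannot be clicked by accident in a report. The pattern is an approximation and may clip or over-match unusual URLs.

```python
import re

# Simple approximation: scheme, then everything up to whitespace or a delimiter.
URL_PATTERN = re.compile(r"https?://[^\s'\"<>)]+", re.IGNORECASE)


def extract_urls(text: str) -> list:
    """Return the unique URLs found in the text, in order of first appearance."""
    seen = []
    for match in URL_PATTERN.findall(text):
        if match not in seen:
            seen.append(match)
    return seen


def defang(url: str) -> str:
    """Rewrite a URL so it is safe to paste into a report (hxxp, [.])."""
    safe = re.sub(r"^http", "hxxp", url, flags=re.IGNORECASE)
    return safe.replace(".", "[.]")
```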

But Why...?

Why did we go through all of this? You might point out that we could have simply run this sample in a controlled environment and monitored the network traffic to see the final executable being downloaded. You would be correct. That would have been a highly efficient way to get an Indicator of Compromise (IOC) within a few seconds.

As malware analysts, and more specifically reverse engineers, we must be curious. We must manually tear apart infection techniques to truly understand what is happening under the hood. Otherwise, if we remain ignorant of the invisible code and chalk it up to "magical hacker stuff", we will not grow or excel in our profession.

What if you attempted to run the document in Microsoft Word and it didn't do anything? What would you do? Would you know to check whether the document was a modern .docx or a simple .rtf? What if you didn't know how to debug the VBA and it contained an anti-debugging/analysis technique that prevented the script from running? What if the PowerShell unpacked an alternate script based on date and time? What if the C# detected an analysis environment and changed the final download string to something benign or non-existent? You wouldn't know, because all you would have is the final IOC or a document that refused to run.

We must understand what is happening. We must painstakingly sift through code which is intended to frustrate, confuse, and anger. If we don't, we may miss a key piece of code or be tricked into believing false information. We do these things to learn, to grow in our profession, and to keep abreast of the cyber threats of tomorrow. 

Above all else, the most valuable tool in our profession is curiosity. Stay curious.