Office Files Analysis

Analyzing malicious Office files is important because Office documents are common attack vectors due to their widespread use and support for macros and embedded objects. Understanding how these attacks work helps in developing effective defenses. Analysis can reveal specific techniques, tactics, and procedures (TTPs) used by threat actors, aiding in attribution and understanding the threat landscape.

Common Attacks Using Malicious Office Files

Malicious Office documents, such as Word or Excel files, are commonly used by attackers to deliver malware. These files may contain malicious macros, embedded objects, or exploit vulnerabilities to execute code, often as part of phishing campaigns. Analyzing them requires a structured approach to uncover the following techniques:

  • Macro-Based Attacks: Malicious macros run when enabled by the user, often downloading further payloads.

  • Embedded Objects: Objects like OLE or ActiveX controls may execute code without user awareness.

  • Exploiting Vulnerabilities: Crafted documents can exploit flaws in Office applications, such as buffer overflows.

  • Phishing & Social Engineering: Documents are used to deceive users into enabling macros or clicking malicious links.

Macros

Macros in Microsoft Office automate repetitive tasks using commands written in Visual Basic for Applications (VBA), a Microsoft-supported language across all Office products.

Office Open XML (OOXML) files such as .docx, .xlsx, and .pptx cannot store macros by default. Only specific file formats can contain VBA macros, such as:

  • Word: .docm, .dotm

  • Excel: .xlsm, .xltm

  • PowerPoint: .pptm, .potm

These file formats end with an 'm' to indicate the presence of macros, which may contain executable code. Users can rename the extension, but if macros are present, a security warning will state: "Macros have been disabled."
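Because the extension can be renamed, it can help to look inside the container instead of trusting the file name. The following is a minimal sketch, assuming the macro project is stored in a part conventionally named vbaProject.bin inside the OOXML (ZIP) container. The function name is hypothetical and this is not how oleid.py is implemented.

```python
import io
import zipfile


def ooxml_has_vba_project(data: bytes) -> bool:
    """Return True if an OOXML (ZIP) container holds a vbaProject.bin part.

    Assumption: the VBA project uses its conventional part name. A document
    that stores it under another name would not be detected by this sketch.
    """
    if not zipfile.is_zipfile(io.BytesIO(data)):
        # Not a ZIP container at all (e.g. a legacy .doc or an RTF file).
        return False
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        return any(
            name.lower().endswith("vbaproject.bin")
            for name in archive.namelist()
        )
```

To use it on a sample, read the file in binary mode and pass the bytes, for example ooxml_has_vba_project(open(path, "rb").read()).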

Like PDF files, which can embed JavaScript, Office documents have their own scripting language: VBA. VBA macros are powerful and can directly call Windows APIs, enabling actions like malware download and code execution. Attackers commonly use macros to:

  • Modify Files: Change or delete system files.

  • Execute Code: Run malicious scripts or binaries.

  • Deliver Payloads: Fetch and launch malware from remote sources.

Analyzing macros is essential, as they are a common attack vector.

The easiest way to detect macros inside an Office file is to run the oleid.py Python utility with the document as its argument:

python C:\Tools\MalDoc\Office\Tools\oletools\oleid.py C:\Tools\MalDoc\Office\Demo\Samples\QuasarRAT\QuasarRAT.docx

Office File Formats

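Office documents can be saved in various formats, with the most common being the legacy OLE compound file formats (.doc, .xls, .ppt), Office Open XML (.docx, .xlsx, .pptx), and Rich Text Format (.rtf).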

The oledir.py Python script displays the layout of an OLE file by listing its directory entries (storages and streams).

python C:\Tools\MalDoc\Office\Tools\oletools\oledir.py C:\Tools\MalDoc\Office\Demo\Samples\Havoc\3dfddb91261f5565596e3f014f9c495a.doc

Although RTF files cannot contain VBA macros, they can still be used in attacks through embedded objects (such as OLE1 objects), binary content, or exploits targeting vulnerabilities in RTF parsers.

python C:\Tools\MalDoc\Office\Tools\oletools\oleid.py C:\Tools\MalDoc\Office\Demo\Samples\xworm\sample.rtf

For detailed analysis, we will use the rtfdump.py Python utility, which can be downloaded from the official GitHub repository. This utility can be executed inside the target (VM) at the following path:

python C:\Tools\MalDoc\Office\Tools\DidierStevensSuite\rtfdump.py C:\Tools\Maldoc\Office\Demo\Samples\xworm\sample.rtf
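As a rough first check before a detailed rtfdump.py session, we can count the object-related control words in the raw RTF text. This is only a sketch: the function name is hypothetical, the list of control words is a small subset chosen for illustration, and obfuscated RTF files can evade a naive text search like this one.

```python
import re
from collections import Counter

# Object-related RTF control words (illustrative subset, not exhaustive).
# The negative lookahead stops "\object" from matching a longer control word.
_OBJECT_WORDS = re.compile(
    rb"\\(object|objemb|objocx|objdata|objclass|objupdate)(?![a-zA-Z])"
)


def count_rtf_object_words(data: bytes) -> Counter:
    """Count how often each object-related control word appears."""
    return Counter(
        match.group(1).decode("ascii")
        for match in _OBJECT_WORDS.finditer(data)
    )
```

A non-zero count for objdata means the file carries embedded object data that deserves a closer look with rtfdump.py.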

Questions

Q1) Run olevba.py with -a option on the file "C:\Tools\MalDoc\Office\Demo\Samples\QuasarRAT\QuasarRAT.docx". This will show a list of suspicious keywords. Figure out the keyword that downloads files from the Internet. Type the keyword as your answer. Answer Format is m********.*******

python C:\Tools\MalDoc\Office\Tools\oletools\olevba.py -a C:\Tools\MalDoc\Office\Demo\Samples\QuasarRAT\QuasarRAT.docx

Answer: microsoft.xmlhttp

Office Document - VBA Macro Analysis

Let's start with the MS Office document format by reviewing the different file types that we know.

File Type        Description
doc              Microsoft Word document before Word 2007
docm             Microsoft Word macro-enabled document
docx             Microsoft Word document (Open XML format, latest)
dot/dotx/dotm    Word template files

When the type of a file is unknown, we can extract some basic information about the sample using trid.exe, which identifies what kind of file we're dealing with.

trid C:\Tools\MalDoc\Office\Demo\Samples\Havoc\3dfddb91261f5565596e3f014f9c495a.doc
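The same idea can be sketched in a few lines by checking the leading bytes of the file. This is a simplified illustration and not how trid works internally: the function name is hypothetical and only three container signatures are checked.

```python
# Leading-byte signatures of the three common Office containers.
OLE_MAGIC = bytes.fromhex("D0CF11E0A1B11AE1")  # OLE compound file (legacy .doc/.xls/.ppt)
ZIP_MAGIC = b"PK\x03\x04"                      # ZIP archive (OOXML such as .docx/.docm)
RTF_MAGIC = b"{\\rt"                           # Rich Text Format header


def sniff_office_container(header: bytes) -> str:
    """Classify a file by its first bytes as 'ole', 'zip', 'rtf' or 'unknown'."""
    if header.startswith(OLE_MAGIC):
        return "ole"
    if header.startswith(ZIP_MAGIC):
        return "zip"
    if header.lstrip().startswith(RTF_MAGIC):
        return "rtf"
    return "unknown"
```

Reading the first 16 bytes of a sample in binary mode is enough input for this check. Note that "zip" only means the file is a ZIP archive; confirming it is an OOXML document requires looking at its contents.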

The output indicates that it is a DOC file and that it contains an OLE object. We can use olemeta.py, a script that parses OLE files such as MS Office documents (e.g., Word, Excel) and extracts all standard properties present in the OLE file.

python c:\tools\maldoc\office\tools\oletools\olemeta.py  C:\Tools\MalDoc\Office\Demo\Samples\Havoc\3dfddb91261f5565596e3f014f9c495a.doc

To get the timestamp information, we can use the oletimes.py Python script:

python C:\tools\maldoc\office\tools\oletools\oletimes.py  C:\Tools\MalDoc\Office\Demo\Samples\Havoc\3dfddb91261f5565596e3f014f9c495a.doc

Next, we can use oleid.py to get more information related to the sample.

python C:\tools\maldoc\office\tools\oletools\oleid.py C:\Tools\MalDoc\Office\Demo\Samples\Havoc\3dfddb91261f5565596e3f014f9c495a.doc

We can see there are VBA macros present. Let us check this using the olevba utility, which opens an MS Office file, detects whether it contains VBA macros, and extracts and analyzes the VBA source code. It can be run from the command line or imported as a module in your own Python applications.
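For the library use case, a minimal sketch could look like the following. It assumes oletools is installed and uses its VBA_Parser class; the two helper names are hypothetical, and grouping the findings by type is a choice made here for readability rather than something olevba does itself.

```python
from collections import defaultdict


def group_findings(results):
    """Group (type, keyword, description) tuples by their type."""
    grouped = defaultdict(list)
    for kw_type, keyword, _description in results:
        grouped[kw_type].append(keyword)
    return dict(grouped)


def analyze_document(path):
    """Return olevba findings for one file, grouped by type."""
    # Imported lazily so group_findings() stays usable without oletools.
    from oletools.olevba import VBA_Parser

    parser = VBA_Parser(path)
    try:
        if not parser.detect_vba_macros():
            return {}
        # analyze_macros() yields (type, keyword, description) tuples.
        return group_findings(parser.analyze_macros())
    finally:
        parser.close()
```

Calling analyze_document() on a sample path returns a dictionary such as one with an "AutoExec" key and a "Suspicious" key, each mapped to the keywords found.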

Questions

Q1) Use olemeta.py to analyse the document properties. Find out who is the author of this document, and type the name of author as your answer.

python c:\tools\maldoc\office\tools\oletools\olemeta.py  C:\Tools\MalDoc\Office\Demo\Samples\Havoc\3dfddb91261f5565596e3f014f9c495a.doc

Answer: Mohammed Alkuwari

Obfuscated VBA Macro Analysis

In this section, we'll analyze a more complicated sample: a heavily obfuscated malicious document that drops QuasarRAT malware on the system. The sample, renamed QuasarRAT.docx, is tagged under the malware family (signature) QuasarRAT, xRAT. The details related to this sample are as follows:

python C:\Tools\MalDoc\Office\Tools\oletools\oleid.py C:\Tools\MalDoc\Office\Demo\Samples\QuasarRAT\QuasarRAT.docx

Next, we can run the olevba Python utility to extract more details related to the macro in the document.

python C:\Tools\MalDoc\Office\Tools\oletools\olevba.py C:\Tools\MalDoc\Office\Demo\Samples\QuasarRAT\QuasarRAT.docx

In the analysis output, olevba reports an AutoExec entry, meaning the macro contains an auto-run procedure that triggers code execution when a user opens the document.

Despite the obfuscation, the script's use of VBA functions like CreateObject, Open, Write, and SaveToFile reveals its role as a dropper: it downloads a QuasarRAT payload from an external source, writes it to disk, and executes it. This is typical dropper behavior used to deploy additional malware.
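The kind of keyword triage described here can be illustrated with a small sketch over already-extracted VBA source. The keyword table and function name below are assumptions made for this example. They are far smaller than olevba's real rule set, and unlike olevba this sketch does not decode obfuscated strings.

```python
import re

# Illustrative indicator table (hypothetical, much smaller than olevba's).
INDICATORS = {
    "auto-run": ["AutoOpen", "Document_Open", "Auto_Open", "Workbook_Open"],
    "object creation": ["CreateObject"],
    "download": ["Microsoft.XMLHTTP", "URLDownloadToFile"],
    "file write": ["SaveToFile", "Open", "Write"],
    "execution": ["Shell"],
}


def scan_vba(source: str) -> dict:
    """Return the indicator keywords found in VBA source, by category."""
    hits = {}
    for category, keywords in INDICATORS.items():
        found = [
            keyword
            for keyword in keywords
            # Whole-word, case-insensitive match (VBA is case-insensitive).
            if re.search(r"\b" + re.escape(keyword) + r"\b", source, re.IGNORECASE)
        ]
        if found:
            hits[category] = found
    return hits
```

A macro that hits the auto-run, download, file write, and execution categories together matches the dropper pattern described above.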

Questions

Q1) When you extract VBA Macro code of this sample using olevba.py, there is a call to MsgBox. What is the content of this MsgBox function? Type it as your answer.

python C:\Tools\MalDoc\Office\Tools\oletools\olevba.py C:\Tools\MalDoc\Office\Demo\Samples\QuasarRAT\QuasarRAT.docx

Answer: Open this Transaction Recipt Again!

Analysis of External Relationships

Adversaries have exploited remote code execution vulnerabilities in Office documents, such as CVE-2021-40444, which leveraged a malicious ActiveX control in MSHTML to deliver Cobalt Strike Beacon loaders linked to ransomware campaigns. This section explores such malicious Office document tactics.

Microsoft states that files from external sources are usually tagged with a Mark of the Web (MoTW), which triggers Protected View and requires user action to enable active content. However, this document bypasses that protection and executes its payload automatically upon opening, without MoTW or user interaction.

This vulnerability is triggered simply by opening a document; no further user interaction, such as clicking 'Enable Content', is required. It can also impact other MSHTML-based applications like Skype, Outlook, and Visual Studio.
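On NTFS, the Mark of the Web is stored in an alternate data stream named Zone.Identifier, where ZoneId=3 denotes the Internet zone. The sketch below shows one way to check whether a sample carries it. The function names are hypothetical, and reading the stream through the "path:Zone.Identifier" syntax only works on Windows with NTFS.

```python
import configparser


def parse_zone_id(stream_text: str):
    """Return the ZoneId from Zone.Identifier text, or None if absent."""
    parser = configparser.ConfigParser(interpolation=None, delimiters=("=",))
    try:
        parser.read_string(stream_text)
        if not parser.has_option("ZoneTransfer", "ZoneId"):
            return None
        return parser.getint("ZoneTransfer", "ZoneId")
    except (configparser.Error, ValueError):
        return None  # malformed stream content


def read_motw(path: str):
    """Read the MoTW zone of a file; None means no MoTW could be read."""
    try:
        with open(path + ":Zone.Identifier", "r", errors="replace") as stream:
            return parse_zone_id(stream.read())
    except OSError:
        return None  # stream missing, or not a Windows/NTFS system
```

A result of None for a document that supposedly came from the Internet is worth noting, because Protected View relies on this marker.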

Let's begin our analysis by examining the App-description.docx document and scrutinizing the output from oleid.py.

python C:\Tools\MalDoc\Office\Tools\oletools\oleid.py C:\Tools\MalDoc\Office\Demo\Samples\Cobalt-Strike\App-description.docx

As the oleid.py output suggests, we can use oleobj.py to obtain the external relationship directly:

python C:\Tools\MalDoc\Office\Tools\oletools\oleobj.py C:\Tools\MalDoc\Office\Demo\Samples\Cobalt-Strike\App-description.docx

oleobj.py gives us the external relationship and the suspicious URL very quickly. However, we should also understand the underlying process: where the relationship is stored and how to extract it with other tools and scripts. We start by listing the contents of the document with zipdump.py:

python zipdump.py C:\Tools\MalDoc\Office\Demo\Samples\Cobalt-Strike\App-description.docx

zipdump.py can dump the content of every file in the archive using the --dumpall parameter. This is useful because we can then search through the combined output.

python zipdump.py C:\Tools\MalDoc\Office\Demo\Samples\Cobalt-Strike\App-description.docx --dumpall

The output contains a lot of information. To identify specific patterns, we'll use the re-search.py script, which applies regular expressions to search files. It supports both custom regular expressions and predefined patterns from a built-in library; with the --name option, the regex argument is treated as the name of a library pattern (url in this case).

python zipdump.py C:\Tools\MalDoc\Office\Demo\Samples\Cobalt-Strike\App-description.docx --dumpall | python re-search.py --name url

At the end of the output, there's a match for an external URL that is suspicious.
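The pipeline above can be approximated in a short sketch that searches every member of the archive for URLs and reports which member each one came from. The function name, the URL pattern, and the list of schema hosts to skip are assumptions chosen for this example; they are not the patterns that re-search.py ships with.

```python
import io
import re
import zipfile

# Simplified URL pattern (assumption, not re-search.py's library regex).
URL_PATTERN = re.compile(rb"https?://[^\s\"'<>]+", re.IGNORECASE)

# Hosts that appear in almost every OOXML file as XML namespace URIs.
SCHEMA_HOSTS = (
    b"schemas.openxmlformats.org",
    b"schemas.microsoft.com",
    b"www.w3.org",
    b"purl.org",
)


def find_urls(data: bytes, skip_schema_hosts: bool = True) -> list:
    """Return (member name, URL) pairs found inside a ZIP container."""
    hits = []
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        for name in archive.namelist():
            for match in URL_PATTERN.finditer(archive.read(name)):
                url = match.group(0)
                if skip_schema_hosts and any(host in url for host in SCHEMA_HOSTS):
                    continue
                hits.append((name, url.decode("ascii", "replace")))
    return hits
```

Filtering the namespace URIs is a design choice that keeps the output short; pass skip_schema_hosts=False to see every match, as the unfiltered command output does.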

We can also perform a YARA search across the whole document. YARA, which stands for "Yet Another Recursive Acronym," is an open-source pattern-matching Swiss army knife that identifies patterns within files, making it a powerful tool for malware detection.

Zipdump supports the functionality to perform searches using YARA rules with files, directories, and direct strings as well. We'll use the YARA string search option to search for this domain using --yara "#s#pawevi.com". This should tell us which file contains this suspicious string.

python zipdump.py --yara "#s#pawevi.com" C:\Tools\MalDoc\Office\Demo\Samples\Cobalt-Strike\App-description.docx

The output shows that this string is present in the relationships file at index 18. Let's select that file using --select 18 along with the --dumpall or -d option to dump its content.

python zipdump.py --select 18 --dumpall C:\Tools\MalDoc\Office\Demo\Samples\Cobalt-Strike\App-description.docx | python xmldump.py pretty
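The manual steps above can be condensed into a sketch that parses every .rels part in the container and keeps only relationships marked TargetMode="External". The function name is hypothetical, and this is an illustration of where the data lives rather than a reimplementation of oleobj.py.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# XML namespace used by OOXML package relationship parts.
RELS_NS = "{http://schemas.openxmlformats.org/package/2006/relationships}"


def external_relationships(data: bytes) -> list:
    """Return (rels part, relationship type, target) for external targets."""
    found = []
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        for name in archive.namelist():
            if not name.endswith(".rels"):
                continue
            try:
                root = ET.fromstring(archive.read(name))
            except ET.ParseError:
                continue  # skip malformed relationship parts
            for rel in root.iter(RELS_NS + "Relationship"):
                if rel.get("TargetMode") == "External":
                    found.append((name, rel.get("Type", ""), rel.get("Target", "")))
    return found
```

Each returned tuple names the relationships file, so the result can be cross-checked against the index that zipdump.py reports.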

Questions

Q1) Locate the sample "C:\Tools\MalDoc\Office\Demo\Samples\SnakeKeylogger\PO026037.docx" and investigate relationships with external links. Type the external link as your answer. Answer format is an HTTP URL.

python C:\Tools\MalDoc\Office\Tools\oletools\oleobj.py C:\Tools\MalDoc\Office\Demo\Samples\SnakeKeylogger\PO026037.docx

Answer: http://gurl.pro/u8-drp
