Network Investigations
Sources:
Network traffic
Network devices
Host devices
Challenges:
Accessibility (data export)
Fidelity (Missing data)
Visibility (Encryption)
Network traffic is a valuable source of insight, as most threat actors communicate with target systems over a network. Evidence for network investigations can come from raw network traffic (live or stored packet captures), network devices (like firewalls and proxies), and host device logs (e.g., Windows event logs).
Network investigations can be productive, but several challenges can arise. Accessibility can be an issue, as some devices either make data hard to obtain or export it in an unusable format. Fidelity is another concern, as not all sources capture every detail of packets or interactions. Additionally, encryption can hinder the ability to analyze and interpret data.
tcpdump
Tcpdump is a widely used network packet capture tool that has been maintained for decades. It's available on all major platforms and many embedded devices. Although tcpdump is a low-level tool, it can perform basic protocol analysis for IP, TCP, UDP, ICMP, and similar protocols.
Berkeley Packet Filters (BPF)
tcpdump's filtering power comes from its use of Berkeley Packet Filters (BPF) to specify how packets are captured or excluded.
Specialized language to filter packets:
BPF expressions are composed of primitives and operators
Primitives are composed of one or more qualifiers and an ID
Three kinds of qualifiers:
type: what kind of object the ID names (host, net, port, or portrange)
dir: the direction (e.g., src, dst)
proto: match a protocol (e.g., ip, tcp, udp, icmp, etc.)
Can combine multiple primitives:
Using and (and, &&), or (or, ||), and not (not, !)
BPF Examples
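For instance, the following invocations combine qualifiers into complete capture filters (the interface name eth0 and the addresses are hypothetical, chosen to match the lab network used later):

```shell
# Capture traffic to or from a single host
tcpdump -i eth0 host 172.16.42.107

# HTTP traffic originating from the internal network
tcpdump -i eth0 src net 172.16.42.0/24 and dst port 80

# All TCP traffic except SSH
tcpdump -i eth0 tcp and not port 22

# DNS or NTP over UDP (parentheses must be quoted to protect them from the shell)
tcpdump -i eth0 'udp and (port 53 or port 123)'
```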
Web Proxies
Many corporate environments use web proxies:
A local cache can reduce bandwidth usage
Can filter out sites inappropriate for business
With the widespread use of web apps, web traffic is becoming more valuable to investigations:
Build more thorough profile of user activity
Identify anomalous / suspicious requests
Potential to intercept SSL/TLS traffic
Squid is a popular open-source web proxy:
Others include Blue Coat, Forefront TMG, etc.
Web proxies are commonly used in corporate settings for various benefits, such as reducing bandwidth and congestion, filtering inappropriate sites, and providing valuable logs for incident analysis. Proxy logs help build user activity profiles and detect suspicious traffic, and some proxies can log encrypted sessions.
Popular proxy solutions include Microsoft Windows Web Application Proxy, Blue Coat, Forefront TMG, and Squid. We will focus on Squid, a widely used proxy known for caching, logging, and supporting multiple protocols.
Access Logs
Record individual requests:
User-definable format, but the default is quite verbose
May or may not include URL, depending on configuration
Access logs record requests through a Squid proxy in a user-defined format, with the default being very detailed. While URLs are typically displayed, HTTPS URLs might not always be visible depending on the proxy and client configuration. The example shown used settings that allowed interception of encrypted traffic.
The default format for Squid access logs is one entry per line, with each line divided into the following fields:
Timestamp: The time the request was logged, represented as the number of seconds since January 1, 1970, UTC, with millisecond resolution.
Duration: The number of milliseconds the proxy spent handling the request.
Client: The IP of the system making the request.
Result Codes: The Squid result code, followed by a slash (/), and the HTTP status code.
Size: The size, in bytes, of the data sent to the client.
HTTP Method: The HTTP request method in the client request.
URL: The URL the client requested (if available).
User: The identity of the requesting client, determined from HTTP authentication information, a configured external program, TLS authentication information, or IDENT lookups. If none provide an identity, a dash (-) is shown.
Hierarchy Code: A description of how the request was handled.
Content Type: The content type field from the HTTP reply.
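Assembled from the fields above, a single (hypothetical) entry looks like this:

```
1588888888.123    250 172.16.42.107 TCP_MISS/200 1024 GET http://example.com/ - HIER_DIRECT/93.184.216.34 text/html
```

Here 1588888888.123 is the timestamp, 250 the duration in milliseconds, 172.16.42.107 the client, TCP_MISS/200 the result codes, 1024 the size, GET the method, http://example.com/ the URL, - the (unknown) user, HIER_DIRECT/93.184.216.34 the hierarchy code, and text/html the content type.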
Lab 1.2: Network Investigation
The Scenario
The victim is Falsimentis, a small (fictitious) corporation based in Los Angeles, California, that produces artificial intelligence hardware and software. On Thursday, the CEO decided to take the employees out to lunch. The CEO recalls locking their screen and leaving for lunch around 11:50 AM. Returning from lunch around 1:05 PM, the CEO noticed their computer had rebooted. After logging on, the CEO saw the following:
Once you've familiarized yourself with the scenario, analyze the following files:
access.log
falsimentis.pcap
As you analyze these files, try to answer the following questions:
What systems are likely compromised in the organization?
When did the threat actors begin their attack?
What host(s) are the threat actors using for command and control (C2)?
The systems on the internal Falsimentis network are listed below.
172.16.42.2 (FM-SRV-DC01.falsimentis.com): Domain controller
172.16.42.3 (FM-SRV-FS01.falsimentis.com): Corporate file server
172.16.42.10 (FM-NET-FW01.falsimentis.com): Network firewall and Squid server
172.16.42.20 (FM-WEBDEV.falsimentis.com): Internal web development server
172.16.42.103 (FM-TETRIS.falsimentis.com): System administrator's workstation
172.16.42.105 (FM-ELECTRONICA.falsimentis.com): Web developer's workstation
172.16.42.107 (FM-CEO.falsimentis.com): CEO's workstation
172.16.42.108 (FM-ALGORITHM.falsimentis.com): V.P. of Operations' workstation
172.16.42.109 (FM-GOLF.falsimentis.com): An engineer's workstation
The publicly accessible Falsimentis systems are as follows:
52.219.120.171 (www.falsimentis.com): Public website
52.219.120.171 (email.falsimentis.com): Webmail client (on the same server as www)
10.5.96.4 (n/a): Private IP address of the server hosting the public website
144.202.115.64 (fm-ext.falsimentis.com): Firewall and VPN server
10.5.96.3 (n/a): Private IP address of the firewall and VPN server
Taking notes is vital to effective incident response, so let's record some key facts from the scenario:
The CEO locked their workstation and left for lunch at around 11:50 AM.
The CEO returned from lunch and logged on to their workstation at around 1:05 PM.
The ransom note popped up after the CEO logged on.
The ransom note is hosted at https://midnitemeerkats.com/note/
The note states the victim has 24 hours to pay, or their files will be deleted.
Compromised systems: 172.16.42.107 (FM-CEO)
The first three facts establish the start of a timeline. The pop-up suggests earlier threat actor activity, since whatever produced it must have been installed before the CEO logged on. The fourth fact, the ransom note's location, serves as a key pivot point when searching the evidence. The 24-hour deadline may influence business decisions. The known compromised system helps assess the incident's scope and impact.
First, we need to confirm that an incident occurred. In the Falsimentis scenario, this is simple: the CEO saw a ransom note on their screen. Even if the threat is fake, the appearance of the message right after logging on qualifies as an incident.
Correlating Network Traffic
Let's start by examining the Squid access.log file for network traffic that matches the CEO's account of the ransom note appearing around 1:05 PM. The logs may reveal clues, as threat actors often use HTTP and HTTPS for C2 communication.
Let's search through the access.log file for midnitemeerkats, the site where the ransom note was hosted.
The output displays Squid's log format with fields separated by spaces. Key fields to focus on are the request time (first field), client (third field), and URL (seventh field). Let's use the awk command to extract these fields.
The awk command breaks down as follows:
/midnitemeerkats/ -> only process lines that contain the string midnitemeerkats
print -> print the following fields
$1 -> first field (timestamp)
$3 -> third field (requesting client)
$7 -> seventh field (requested URL)
access.log -> The file to process
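As a sanity check, here is the awk invocation described above run against a single hypothetical log entry (the timestamp, client, and URL are made up for illustration):

```shell
# One hypothetical access.log entry in Squid's default format:
line='1588888888.123    250 172.16.42.107 TCP_MISS/200 1024 GET http://midnitemeerkats.com/note/ - HIER_DIRECT/203.0.113.10 text/html'

# Match lines containing midnitemeerkats, then print fields 1, 3, and 7.
result=$(echo "$line" | awk '/midnitemeerkats/ {print $1, $3, $7}')
echo "$result"
```

Only the timestamp, requesting client, and URL survive, which makes the output far easier to scan than the full log line.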
The timestamp field is in POSIX time format (also known as Epoch time): the number of seconds since January 1st, 1970, 00:00:00 UTC. To display POSIX time in a human-friendly format, we can use the strftime function with awk.
strftime("%T", $1) -> print the timestamp in HH:MM:SS format
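For example (strftime is a GNU awk extension; the timestamp below is arbitrary, and TZ is pinned to UTC so the output is deterministic):

```shell
# 1588888888 is 2020-05-07 22:01:28 UTC; strftime("%T", ...) keeps only the time.
ts=$(echo '1588888888.123 172.16.42.107 http://midnitemeerkats.com/note/' |
  TZ=UTC awk '{print strftime("%T", $1)}')
echo "$ts"
```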
Let's check for network beacons, indicated by repeated requests to the same URL at regular intervals. We'll use findbeacons.py to identify these beacons.
To tell findbeacons.py what time interval to look for, use the -i argument. To specify a minimum number of beacon requests (to reduce false positives), use the -c argument.
The findbeacons.py command breaks down as follows:
-i 5 : look for beacons that are at 5-second intervals
-c 10 : look for a minimum of 10 beacons
172.16.42.107 : look for beacon traffic from host 172.16.42.107
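findbeacons.py is a course-provided script, but the core idea is simple enough to sketch. The awk program below is a hypothetical stand-in, not the real script: for each client and URL pair, it counts consecutive requests that arrive the target interval apart.

```shell
# Three hypothetical log entries, 5 seconds apart, from the same client and URL.
cat > /tmp/beacon_demo.log <<'EOF'
1588888880.000 100 172.16.42.107 TCP_MISS/200 50 GET http://www1-google-analytics.com/collect - HIER_DIRECT/167.172.201.123 text/html
1588888885.000 100 172.16.42.107 TCP_MISS/200 50 GET http://www1-google-analytics.com/collect - HIER_DIRECT/167.172.201.123 text/html
1588888890.000 100 172.16.42.107 TCP_MISS/200 50 GET http://www1-google-analytics.com/collect - HIER_DIRECT/167.172.201.123 text/html
EOF

# For each client+URL pair, count gaps that round to the target interval.
beacons=$(awk -v interval=5 '{
  key = $3 " " $7
  if (key in last && int($1 - last[key] + 0.5) == interval) hits[key]++
  last[key] = $1
}
END { for (k in hits) print k, hits[k] }' /tmp/beacon_demo.log)
echo "$beacons"
```

A real detector would also tolerate jitter and sort its output, but the principle — keying on client plus URL and measuring inter-request gaps — is the same.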
The findbeacons.py output reveals a suspicious URL, http://www1-google-analytics.com/collect, which has thousands of packets at 5-second intervals. It’s suspicious due to its high request frequency and similarity to the legitimate www.google-analytics.com.
Finding More Compromised Hosts
To find additional hosts in the network that are compromised, let's pivot on the domain www1-google-analytics.com.
Three more systems (172.16.42.103, 172.16.42.105, and 172.16.42.109) are also communicating with www1-google-analytics.com, in addition to FM-CEO. These IPs should be added to the compromised systems list for investigation.
Finding Even More Compromised Hosts
We've been examining the access.log file for HTTP and HTTPS traffic from the proxy, but it doesn't show traffic on non-standard ports. Let's switch to analyzing the packet capture file for a broader view.
First, let's find the IP address of www1-google-analytics.com in the access.log file.
Here we can see the IP address of www1-google-analytics.com is 167.172.201.123. Now we can search through the packet capture file falsimentis.pcap for traffic destined to this IP.
Broken into pieces, this command line is as follows:
tcpdump -nr falsimentis.pcap dst host 167.172.201.123 -> print packets from the file falsimentis.pcap that are destined for host 167.172.201.123
cut -d ' ' -f 3 -> return the third space-delimited field from the tcpdump output (the IP address and port number).
cut -d '.' -f 1-4 -> cut the IP address and port number combination into pieces at the . character, and then select the first four fields (the IP address).
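To see how the two cut stages interact, here they are applied to a single fabricated line of tcpdump output:

```shell
# Hypothetical tcpdump output line: field 3 (space-delimited) is the source IP and port.
line='12:00:00.000000 IP 172.16.42.2.51515 > 167.172.201.123.80: Flags [S], seq 1, length 0'

# First cut isolates "172.16.42.2.51515"; second keeps only the first four dot-fields.
src=$(echo "$line" | cut -d ' ' -f 3 | cut -d '.' -f 1-4)
echo "$src"
```

Piping the full tcpdump output through `sort -u` afterward would reduce thousands of packets to a short list of unique source addresses.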
The output from tcpdump reveals three additional hosts sending traffic to www1-google-analytics.com: 172.16.42.2 (the domain controller), 172.16.42.3 (the file server), and 172.16.42.108 (the V.P. of Operations' workstation). As before, let's add these systems to the list of compromised systems.
Finding the First Packet
To estimate when the malicious activity began, let's check the first packet sent from each compromised host to www1-google-analytics.com (167.172.201.123).
There is a lot going on with this for loop; it breaks down as follows:
for octet in 2 3 103 105 107 108 109; -> Defines the loop variable, octet , and a list of numbers to iterate across. This list contains the last octet of each compromised system.
TZ=PST7PDT -> Set the timezone used to display timestamps. Unlike the awk command, tcpdump needs the daylight saving time information included in the timezone specification.
tcpdump -tttt -n -r falsimentis.pcap -c 1 -> Show timestamps in HH:MM:SS.(fractions of a second) format, don't resolve hosts or port numbers, read packets from the file falsimentis.pcap , and stop after finding the first packet that matches the filter.
"src host 172.16.42.$octet and dst host 167.172.201.123 and dst port 80" -> Search for traffic originating from one of the systems believed to be compromised, going to 167.172.201.123 port 80.
2>/dev/null -> Discard some of the irrelevant tcpdump output.
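With tcpdump itself removed (so the snippet runs without the pcap file), the loop mechanics look like this; only the BPF filter strings are generated:

```shell
# Build one BPF filter per suspected host, as the for loop above does.
filters=$(for octet in 2 3 103 105 107 108 109; do
  echo "src host 172.16.42.$octet and dst host 167.172.201.123 and dst port 80"
done)
echo "$filters"

first=$(echo "$filters" | head -n 1)
count=$(echo "$filters" | wc -l)
```

Each iteration substitutes the loop variable into the filter, so the real loop runs tcpdump seven times, once per suspected host.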
Daylight saving time can complicate time-related analysis, and tcpdump can misinterpret timezones around the transitions. If timestamps appear off by an hour, pin a fixed offset instead: set TZ=PST8 to force standard time or TZ=PDT7 to force daylight time.