Network Investigations
Sources:
Network traffic
Network devices
Host devices
Challenges:
Accessibility (data export)
Fidelity (Missing data)
Visibility (Encryption)
Network traffic is a valuable source of insight, as most threat actors communicate with target systems over a network. Evidence for network investigations can come from raw network traffic (live or stored packet captures), network devices (like firewalls and proxies), and host device logs (e.g., Windows event logs).
Network investigations can be productive, but several challenges can arise. Accessibility can be an issue, as some devices either make data hard to obtain or export it in an unusable format. Fidelity is another concern, as not all sources capture every detail of packets or interactions. Additionally, encryption can hinder the ability to analyze and interpret data.
tcpdump
Tcpdump is a widely used network packet capture tool that has been maintained for decades. It's available on all major platforms and many embedded devices. Although tcpdump is a low-level tool, it can perform basic protocol analysis for IP, TCP, UDP, ICMP, and similar protocols.
Berkeley Packet Filters (BPF)
tcpdump's filtering power comes from its use of Berkeley Packet Filters (BPF) to specify how packets are captured or excluded.
Specialized language to filter packets:
BPF expressions are composed of primitives and operators
Primitives are composed of one or more qualifiers and an ID
Three kinds of qualifiers:
type: what kind of object the ID names (host, net, port, or portrange)
dir: the direction (e.g., src, dst)
proto: match a protocol (e.g., ip, tcp, udp, icmp, etc.)
Can combine multiple primitives:
Using and (and, &&), or (or, ||), and not (not, !)
BPF Examples
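For instance, the following invocations combine qualifiers into complete capture filters (the interface name eth0 and the addresses are hypothetical, chosen to match the lab network used later):

```shell
# Capture traffic to or from a single host
tcpdump -i eth0 host 172.16.42.107

# HTTP traffic originating from the internal network
tcpdump -i eth0 src net 172.16.42.0/24 and dst port 80

# All TCP traffic except SSH
tcpdump -i eth0 tcp and not port 22

# DNS or NTP over UDP (parentheses must be quoted to protect them from the shell)
tcpdump -i eth0 'udp and (port 53 or port 123)'
```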
Web Proxies
Many corporate environments use web proxies:
A local cache can reduce bandwidth usage
Can filter out sites inappropriate for business
With the widespread use of web apps, web traffic is becoming more valuable to investigations:
Build more thorough profile of user activity
Identify anomalous / suspicious requests
Potential to intercept SSL/TLS traffic
Squid is a popular open-source web proxy:
Others include Blue Coat, Forefront TMG, etc.
Web proxies are commonly used in corporate settings for various benefits, such as reducing bandwidth and congestion, filtering inappropriate sites, and providing valuable logs for incident analysis. Proxy logs help build user activity profiles and detect suspicious traffic, and some proxies can log encrypted sessions.
Popular proxy solutions include Microsoft Windows Web Application Proxy, Blue Coat, Forefront TMG, and Squid. We will focus on Squid, a widely used proxy known for caching, logging, and supporting multiple protocols.
Access Logs
Record individual requests:
User-definable format, but the default is quite verbose
May or may not include URL, depending on configuration
Access logs record requests through a Squid proxy in a user-defined format, with the default being very detailed. While URLs are typically displayed, HTTPS URLs might not always be visible depending on the proxy and client configuration. The example shown used settings that allowed interception of encrypted traffic.
The default format for Squid access logs is one entry per line, with each line divided into the following fields:
Timestamp: The time the request was logged, represented as the number of seconds since January 1, 1970, UTC, with millisecond resolution.
Duration: The number of milliseconds the proxy spent handling the request.
Client: The IP of the system making the request.
Result Codes: The Squid result code, followed by a slash (/), and the HTTP status code.
Size: The size, in bytes, of the data sent to the client.
HTTP Method: The HTTP request method in the client request.
URL: The URL the client requested (if available).
User: The identity of the requesting client, determined from HTTP authentication information, a configured external program, TLS authentication information, or IDENT lookups. If none provide an identity, a dash (-) is shown.
Hierarchy Code: A description of how the request was handled.
Content Type: The content type field from the HTTP reply.
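Assembled from the fields above, a single (hypothetical) entry looks like this:

```
1588888888.123    250 172.16.42.107 TCP_MISS/200 1024 GET http://example.com/ - HIER_DIRECT/93.184.216.34 text/html
```

Here 1588888888.123 is the timestamp, 250 the duration in milliseconds, 172.16.42.107 the client, TCP_MISS/200 the result codes, 1024 the size, GET the method, http://example.com/ the URL, - the (unknown) user, HIER_DIRECT/93.184.216.34 the hierarchy code, and text/html the content type.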
Lab 1.2: Network Investigation
The Scenario
The victim is Falsimentis, a small (fictitious) corporation based in Los Angeles, California, that produces artificial intelligence hardware and software. On Thursday, the CEO decided to take the employees out to lunch. The CEO recalls locking their screen and leaving for lunch around 11:50 AM. Returning from lunch around 1:05 PM, the CEO noticed their computer had rebooted. After logging on, the CEO saw the following:
Once you've familiarized yourself with the scenario, analyze the following files:
access.log
falsimentis.pcap
As you analyze these files, try to answer the following questions:
What systems are likely compromised in the organization?
When did the threat actors begin their attack?
What host(s) are the threat actors using for command and control (C2)?
The systems on the internal Falsimentis network are listed below.
172.16.42.2 (FM-SRV-DC01.falsimentis.com): Domain controller
172.16.42.3 (FM-SRV-FS01.falsimentis.com): Corporate file server
172.16.42.10 (FM-NET-FW01.falsimentis.com): Network firewall and Squid server
172.16.42.20 (FM-WEBDEV.falsimentis.com): Internal web development server
172.16.42.103 (FM-TETRIS.falsimentis.com): System administrator's workstation
172.16.42.105 (FM-ELECTRONICA.falsimentis.com): Web developer's workstation
172.16.42.107 (FM-CEO.falsimentis.com): CEO's workstation
172.16.42.108 (FM-ALGORITHM.falsimentis.com): V.P. of Operations' workstation
172.16.42.109 (FM-GOLF.falsimentis.com): An engineer's workstation
The publicly accessible Falsimentis systems are as follows:
52.219.120.171 (www.falsimentis.com): Public website
52.219.120.171 (email.falsimentis.com): Webmail client (on the same server as www)
10.5.96.4 (n/a): Private IP address of the server hosting the public website
144.202.115.64 (fm-ext.falsimentis.com): Firewall and VPN server
10.5.96.3 (n/a): Private IP address of the firewall and VPN server
Taking notes is vital to effective incident response, so let's record some key facts from the scenario:
The CEO locked their workstation and left for lunch at around 11:50 AM.
The CEO returned from lunch and logged on to their workstation at around 1:05 PM.
The ransom note popped up after the CEO logged on.
The ransom note is hosted at https://midnitemeerkats.com/note/
The note states the victim has 24 hours to pay, or their files will be deleted.
Compromised systems: 172.16.42.107 (FM-CEO)
The first three facts establish the start of a timeline. The pop-up suggests earlier threat actor activity, since whatever produced it must have been installed before the CEO logged on. The fourth fact, the ransom note's location, serves as a key pivot point when searching the evidence. The 24-hour deadline may influence business decisions. The known compromised system helps assess the incident's scope and impact.
First, we need to confirm that an incident occurred. In the Falsimentis scenario, this is simple: the CEO saw a ransom note on their screen. Even if the threat is fake, the appearance of the message right after logging on qualifies as an incident.
Correlating Network Traffic
Let's start by examining the Squid access.log file for network traffic that matches the CEO's account of the ransom note appearing around 1:05 PM. The logs may reveal clues, as threat actors often use HTTP and HTTPS for C2 communication.
Let's search through the access.log file for midnitemeerkats, the site where the ransom note was hosted.
The output displays Squid's log format with fields separated by spaces. Key fields to focus on are the request time (first field), client (third field), and URL (seventh field). Let's use the awk command to extract these fields.
The awk command breaks down as follows:
/midnitemeerkats/ -> only process lines that contain the string midnitemeerkats
print -> print the following fields
$1 -> first field (timestamp)
$3 -> third field (requesting client)
$7 -> seventh field (requested URL)
access.log -> The file to process
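As a sanity check, here is the awk invocation described above run against a single hypothetical log entry (the timestamp, client, and URL are made up for illustration):

```shell
# One hypothetical access.log entry in Squid's default format:
line='1588888888.123    250 172.16.42.107 TCP_MISS/200 1024 GET http://midnitemeerkats.com/note/ - HIER_DIRECT/203.0.113.10 text/html'

# Match lines containing midnitemeerkats, then print fields 1, 3, and 7.
result=$(echo "$line" | awk '/midnitemeerkats/ {print $1, $3, $7}')
echo "$result"
```

Only the timestamp, requesting client, and URL survive, which makes the output far easier to scan than the full log line.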
The timestamp field is in POSIX time format (also known as Epoch time): the number of seconds since January 1st, 1970, 00:00:00 UTC. To display POSIX time in a human-friendly format, we can use the strftime function with awk.
strftime("%T", $1) -> print the timestamp in HH:MM:SS format
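For example (strftime is a GNU awk extension; the timestamp below is arbitrary, and TZ is pinned to UTC so the output is deterministic):

```shell
# 1588888888 is 2020-05-07 22:01:28 UTC; strftime("%T", ...) keeps only the time.
ts=$(echo '1588888888.123 172.16.42.107 http://midnitemeerkats.com/note/' |
  TZ=UTC awk '{print strftime("%T", $1)}')
echo "$ts"
```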
Let's check for network beacons, indicated by repeated requests to the same URL at regular intervals. We'll use findbeacons.py to identify these beacons.
To tell findbeacons.py what time interval to look for, use the -i argument. To specify a minimum number of beacon requests (to reduce false positives), use the -c argument.
The findbeacons.py command breaks down as follows:
-i 5 : look for beacons that are at 5-second intervals
-c 10 : look for a minimum of 10 beacons
172.16.42.107 : look for beacon traffic from host 172.16.42.107
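findbeacons.py is a course-provided script, but the core idea is simple enough to sketch. The awk program below is a hypothetical stand-in, not the real script: for each client and URL pair, it counts consecutive requests that arrive the target interval apart.

```shell
# Three hypothetical log entries, 5 seconds apart, from the same client and URL.
cat > /tmp/beacon_demo.log <<'EOF'
1588888880.000 100 172.16.42.107 TCP_MISS/200 50 GET http://www1-google-analytics.com/collect - HIER_DIRECT/167.172.201.123 text/html
1588888885.000 100 172.16.42.107 TCP_MISS/200 50 GET http://www1-google-analytics.com/collect - HIER_DIRECT/167.172.201.123 text/html
1588888890.000 100 172.16.42.107 TCP_MISS/200 50 GET http://www1-google-analytics.com/collect - HIER_DIRECT/167.172.201.123 text/html
EOF

# For each client+URL pair, count gaps that round to the target interval.
beacons=$(awk -v interval=5 '{
  key = $3 " " $7
  if (key in last && int($1 - last[key] + 0.5) == interval) hits[key]++
  last[key] = $1
}
END { for (k in hits) print k, hits[k] }' /tmp/beacon_demo.log)
echo "$beacons"
```

A real detector would also tolerate jitter and sort its output, but the principle — keying on client plus URL and measuring inter-request gaps — is the same.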
The findbeacons.py output reveals a suspicious URL, http://www1-google-analytics.com/collect, which has thousands of packets at 5-second intervals. It’s suspicious due to its high request frequency and similarity to the legitimate www.google-analytics.com.
Finding More Compromised Hosts
To find additional hosts in the network that are compromised, let's pivot on the domain www1-google-analytics.com.
Three more systems (172.16.42.103, 172.16.42.105, and 172.16.42.109) are also communicating with www1-google-analytics.com, in addition to FM-CEO. These IPs should be added to the compromised systems list for investigation.
Finding Even More Compromised Hosts
We've been examining the access.log file for HTTP and HTTPS traffic from the proxy, but it doesn't show traffic on non-standard ports. Let's switch to analyzing the packet capture file for a broader view.
First, let's find the IP address of www1-google-analytics.com in the access.log file.
Here we can see the IP address of www1-google-analytics.com is 167.172.201.123. Now we can search through the packet capture file falsimentis.pcap for traffic destined to this IP.
Broken into pieces, this command line is as follows:
tcpdump -nr falsimentis.pcap dst host 167.172.201.123 -> print packets from the file falsimentis.pcap that are destined for host 167.172.201.123
cut -d ' ' -f 3 -> return the third space-delimited field from the tcpdump output (the IP address and port number).
cut -d '.' -f 1-4 -> cut the IP address and port number combination into pieces at the . character, and then select the first four fields (the IP address).
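To see how the two cut stages interact, here they are applied to a single fabricated line of tcpdump output:

```shell
# Hypothetical tcpdump output line: field 3 (space-delimited) is the source IP and port.
line='12:00:00.000000 IP 172.16.42.2.51515 > 167.172.201.123.80: Flags [S], seq 1, length 0'

# First cut isolates "172.16.42.2.51515"; second keeps only the first four dot-fields.
src=$(echo "$line" | cut -d ' ' -f 3 | cut -d '.' -f 1-4)
echo "$src"
```

Piping the full tcpdump output through `sort -u` afterward would reduce thousands of packets to a short list of unique source addresses.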
The output from tcpdump reveals three additional hosts sending traffic to www1-google-analytics.com: 172.16.42.2 (the domain controller), 172.16.42.3 (the file server), and 172.16.42.108 (the V.P. of Operations' workstation). As before, let's add these systems to the list of compromised systems.
Finding the First Packet
To estimate when the malicious activity began, let's check the first packet sent from each compromised host to www1-google-analytics.com (167.172.201.123).
There is a lot going on with this for loop; it breaks down as follows:
for octet in 2 3 103 105 107 108 109; -> Defines the loop variable, octet , and a list of numbers to iterate across. This list contains the last octet of each compromised system.
TZ=PST7PDT -> Set the timezone used to display timestamps. Unlike the awk command, tcpdump needs the daylight saving time information included in the timezone specification.
tcpdump -tttt -n -r falsimentis.pcap -c 1 -> Show timestamps in HH:MM:SS.(fractions of a second) format, don't resolve hosts or port numbers, read packets from the file falsimentis.pcap , and stop after finding the first packet that matches the filter.
"src host 172.16.42.$octet and dst host 167.172.201.123 and dst port 80" -> Search for traffic originating from one of the systems believed to be compromised, going to 167.172.201.123 port 80.
2>/dev/null -> Discard some of the irrelevant tcpdump output.
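With tcpdump itself removed (so the snippet runs without the pcap file), the loop mechanics look like this; only the BPF filter strings are generated:

```shell
# Build one BPF filter per suspected host, as the for loop above does.
filters=$(for octet in 2 3 103 105 107 108 109; do
  echo "src host 172.16.42.$octet and dst host 167.172.201.123 and dst port 80"
done)
echo "$filters"

first=$(echo "$filters" | head -n 1)
count=$(echo "$filters" | wc -l)
```

Each iteration substitutes the loop variable into the filter, so the real loop runs tcpdump seven times, once per suspected host.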
Daylight saving time can complicate time-related analysis, and tcpdump can misinterpret timezones around the transitions. If timestamps appear off by an hour, pin a fixed offset instead: set TZ=PST8 to force standard time or TZ=PDT7 to force daylight time.