I wanted to see what a packet flow looks like to a website hosted on AWS so I uploaded a simple HelloWorld website to AWS S3, an object store that is great for hosting simple, static websites, and captured packets working from my home “office”.
In case you haven’t captured traffic before, it’s a pretty easy process that I outline below. Otherwise, you can skip to the analysis.
1) Paste the website location into the browser (I’m using Chrome), but don’t hit enter yet. Local network traffic is very noisy, so I’ll hit enter just after I start the Wireshark capture.
2) Load up Wireshark and select the capture interface. I’m using wireless, so the capture interface on my MacBook Air is en0. That’s it, nothing else.
3) Do these steps in order and as quickly as possible.
- start the Wireshark capture
- go back to Chrome and hit enter to submit the HTTP GET request
- stop the Wireshark capture
4) Looking at the Wireshark capture window, I captured 196 frames, and most of them are home network traffic unrelated to the AWS S3 website. So, we’ll have to dig into the capture to find the relevant IP Addresses. Save the pcap file first so that we have a backup in case we screw something up during the analysis.
Relevant IP Addresses
Since I captured 196 packets, it’s not that simple to separate the packets we want to analyze from the noise. So, we need some clues, and we need to use the Wireshark filter capabilities to find the relevant IP Addresses.
Run the command ifconfig to get the local IP address.
Let’s also get the IP Address of the AWS S3 hosted website.
So now we should expect to see captured packets that show communication between 192.168.1.206 and 184.108.40.206. However, 192.168.1.206 is a private RFC 1918 address, so it’s not routable on the public internet. This means that we go through a NAT device, and of course the NAT is occurring on my local WiFi Gateway. Let’s get the IP address of my local Gateway.
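The private-address reasoning can be checked with a short sketch. This is only an illustration using Python’s standard ipaddress module; the function name is_rfc1918 is made up for this example, and 8.8.8.8 is included purely as a well-known public address for contrast.

```python
import ipaddress

# The three private IPv4 ranges defined by RFC 1918.
RFC1918_NETWORKS = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("172.16.0.0/12"),
    ipaddress.ip_network("192.168.0.0/16"),
]


def is_rfc1918(address: str) -> bool:
    """Return True if the address falls inside an RFC 1918 private range."""
    ip = ipaddress.ip_address(address)
    return any(ip in network for network in RFC1918_NETWORKS)


print(is_rfc1918("192.168.1.206"))  # the MacBook Air's address -> True
print(is_rfc1918("192.168.1.1"))    # the local gateway -> True
print(is_rfc1918("8.8.8.8"))        # public address, for contrast -> False
```

Any address for which this returns True has to be translated by a NAT device before its traffic can reach a public host.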
At this point, we have 3 IP Addresses that will help us analyze the packet flow, and we expect to find these addresses in the Wireshark capture. The relevant IPs are:
- Local Address – 192.168.1.206
- Local Gateway Address – 192.168.1.1
- AWS S3 Website Address – 220.127.116.11
Note that the AWS S3 address, in this case, is not static and can change.
Issue: The AWS S3 Address 18.104.22.168 is not found in the capture file. This is the nature of the cloud, where IP Addresses are not static and are loosely coupled to services. IP Addresses are allocated on demand, and as demand changes so does the specific resource. So, even though we pinged the domain, the address that answered the ping is not necessarily the address of the host that served the HTTP request.
Resolution: Since we are accessing a web page using HTTP, we can filter Wireshark on http so let’s do that.
This doesn’t make sense; there are no HTTP packets at all.
Issue: There are no HTTP packets in the packet capture, even though the browser sent an HTTP GET request and we received the HTTP response.
Resolution: This can be easy to forget and miss. We issued the HTTP GET using HTTPS which means we are using TCP Port 443 and the HTTP message is encrypted in TLS/SSL. So, we can narrow down the AWS S3 IP to captured packets that use the TLS/SSL protocol.
In the packet capture window, I’m going to filter on TCP port 443 (tcp.port == 443), which is used by TLS and SSL, and we’re going to get a lot of data to look at.
Although not shown above, I found other interesting TLS/SSL traffic, including to Apple IP Addresses 22.214.171.124 and 126.96.36.199 that are hosted on Akamai and Fastly, probably related to iTunes and/or the App Store.
In the TCP Port 443 filter, we see IP Address 188.8.131.52. That looks similar to the IP Address when we pinged AWS S3. We can do an nslookup to learn more about the IP Address.
Yes, we see that it’s part of s3 at amazonaws.com. Looking good, and we’ll assume that 184.108.40.206 is the AWS S3 Website IP Address.
With the completion of this AWS S3 investigation, the updated relevant IPs are:
- Local Address – 192.168.1.206
- Local Gateway Address – 192.168.1.1
- Website Address – 220.127.116.11
Let’s quickly recap what we know and what we assume.
We know the following:
- Local IP address is 192.168.1.206
- Local Gateway Address is 192.168.1.1
- We are using HTTPS, TLS and TCP port 443
We assume the following:
- Website IP Address is 18.104.22.168
It’s always good to keep in mind our facts and assumptions when doing the analysis. Based on evidence uncovered during the analysis, assumptions may be confirmed as fact!
At a high-level, when we access a website, we have 3 distinct stages.
Stage 1 – DNS
We put in the website URL in the Chrome browser and the first thing that happens is the DNS lookup.
By filtering on udp.port == 53, we see 4 DNS packets. In order, there is an A record request, an AAAA record request, an A record response and an AAAA record response. A records map domain names to IPv4 addresses and AAAA records map domain names to IPv6 addresses. Also note the DNS transaction is between the MacBook Air and the local Gateway at 192.168.1.1. The local Gateway will then recursively resolve the DNS lookup (i.e. do all the work) and reply to the MacBook Air.
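To make the A versus AAAA distinction concrete, here is a minimal sketch that builds the two kinds of DNS query by hand and reads the record type back out. It is not taken from the capture: the helper names, the transaction IDs and the hostname example-bucket.s3.amazonaws.com are all made up for illustration. The record type codes (1 for A, 28 for AAAA) and the 12-byte header layout come from the DNS specifications.

```python
import struct

QTYPE_A = 1      # IPv4 address record
QTYPE_AAAA = 28  # IPv6 address record
QCLASS_IN = 1    # Internet class


def build_dns_query(hostname: str, qtype: int, transaction_id: int) -> bytes:
    """Build a minimal DNS query message containing one question."""
    # Header: ID, flags (0x0100 = recursion desired), 1 question, 0 other records.
    header = struct.pack("!HHHHHH", transaction_id, 0x0100, 1, 0, 0, 0)
    # QNAME: each label is prefixed by its length; a zero byte ends the name.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in hostname.rstrip(".").split(".")
    ) + b"\x00"
    return header + qname + struct.pack("!HH", qtype, QCLASS_IN)


def question_type(message: bytes) -> int:
    """Return the QTYPE of the first question (queries only, no compression)."""
    offset = 12  # the DNS header is always 12 bytes
    while message[offset] != 0:
        offset += 1 + message[offset]  # skip the length byte and the label
    offset += 1  # skip the terminating zero byte
    (qtype,) = struct.unpack_from("!H", message, offset)
    return qtype


# Hypothetical hostname and transaction IDs, for illustration only.
host = "example-bucket.s3.amazonaws.com"
a_query = build_dns_query(host, QTYPE_A, 0x1A2B)
aaaa_query = build_dns_query(host, QTYPE_AAAA, 0x1A2C)
print(question_type(a_query), question_type(aaaa_query))
```

The recursion desired flag in the header is what asks the gateway to do the full lookup on the client’s behalf.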
Highlighted in red is the type A record with IP Address 22.214.171.124. This confirms our assumption that the website IP Address is 126.96.36.199. Good news! Clearly, in the future just filter on DNS to find out the website IP Address.
Stage 2 – TCP
TCP is the second stage of accessing a website, and this sets up the communication so we can transfer the HTTP requests. Remember, TCP requires a 3-way handshake to set up a TCP session.
Issue: How do we find the start of the TCP handshake?
Resolution: The key is to understand the TCP specific fields that identify the TCP 3-way handshake. I’ve created a table that can be used as a guide.
[By default, Wireshark converts all sequence and acknowledgement numbers into relative numbers. This means that all SEQ and ACK numbers always start at 0 for the first packet seen in each conversation.]
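As a rough illustration of what a relative number means, the sketch below subtracts the first sequence number seen in a conversation from each raw value. TCP sequence numbers are 32 bits wide, so the subtraction wraps around at 2^32. The function name and the raw values are made up for this example; they are not from the capture.

```python
SEQ_SPACE = 2 ** 32  # TCP sequence numbers are 32-bit and wrap around


def relative_seq(raw_seq: int, initial_seq: int) -> int:
    """Convert a raw sequence number to one relative to the initial value."""
    return (raw_seq - initial_seq) % SEQ_SPACE


# Hypothetical initial sequence number chosen by the client.
isn = 3_000_000_000
print(relative_seq(isn, isn))      # the SYN itself -> 0
print(relative_seq(isn + 1, isn))  # first packet after the SYN -> 1

# Wraparound case: the raw value has rolled over past 2**32.
print(relative_seq(5, 4_294_967_290))  # -> 11
```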
To find the start of the 3-way handshake in Wireshark, we need to find the packet with source = 192.168.1.206, destination = 188.8.131.52, Syn=1 (or Set), Ack=0 (Not Set) and the Sequence number equal to 0.
TCP Handshake Packet 1 – Syn
We can filter in wireshark for Syn=1 packets very easily. Notice also that the Sequence # is 0 and the Acknowledgement bit is not set. Using the filter, tcp.flags.syn==1 && tcp.flags.ack == 0, below is a snapshot.
Also notice there are two packets for the start of the 3-way handshake. Only 1 is needed, so this may be a case where the Chrome browser sets up multiple streams for parallel processing or other reasons known only to Google. We will ignore the second TCP 3-way handshake.
TCP Handshake Packet 2 – Syn and Ack
The AWS S3 website responds with both the Syn and Ack bits set and the Sequence=0 and the Acknowledgement=1. With the Wireshark filter tcp.flags.syn==1 && tcp.flags.ack == 1, we see the following picture.
The Syn and Ack flags are set, the sequence=0 and the ack=1. So this looks like a good second packet in the TCP 3-way handshake.
TCP Handshake Packet 3 – Ack
The 3rd packet in the handshake finishes the stream setup and has the Syn bit not set, the Ack bit set, the packet Seq=1 and packet Ack=1. This snapshot shows the completion of the TCP 3-way handshake; the stream is now set up. Please note that we are looking at packet 111, highlighted in the soft blue row.
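The three handshake packets can be summarized in a small sketch that classifies a packet from its flag bits and relative Seq/Ack numbers. This is only an illustration of the values described above. The function name is made up, and because it matches on values alone, a later plain Ack with Seq=1 and Ack=1 would also be labeled as step 3. The flag bit values (Syn = 0x02, Ack = 0x10) are the standard TCP ones.

```python
from typing import Optional

SYN = 0x02  # TCP Syn flag bit
ACK = 0x10  # TCP Ack flag bit


def handshake_step(flags: int, rel_seq: int, rel_ack: int) -> Optional[str]:
    """Label a packet as step 1, 2 or 3 of the handshake, or None."""
    syn = bool(flags & SYN)
    ack = bool(flags & ACK)
    if syn and not ack and rel_seq == 0:
        return "1: Syn (client to server)"
    if syn and ack and rel_seq == 0 and rel_ack == 1:
        return "2: Syn and Ack (server to client)"
    if not syn and ack and rel_seq == 1 and rel_ack == 1:
        return "3: Ack (client to server)"
    return None


print(handshake_step(SYN, 0, 0))        # first packet
print(handshake_step(SYN | ACK, 0, 1))  # second packet
print(handshake_step(ACK, 1, 1))        # third packet
```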
Stage 3 – The HTTP Request
We’ve completed the DNS lookup and set up the TCP session. Now the browser will issue the HTTP GET request to ask for the web page data. So we should be able to use an HTTP Wireshark filter. Unfortunately, no HTTP packets were captured by Wireshark, and that’s because we used HTTPS and TLS encryption. The HTTP messages are encrypted and encapsulated in TLS records. Since we can’t find HTTP packets, we’ll filter Wireshark on TCP port 443 and then sort based on TLS.
The filter (ip.addr eq 192.168.1.206 and ip.addr eq 184.108.40.206) and (tcp.port eq 57914 and tcp.port eq 443) provides the source/destination IP pair along with the TCP port pair. You can sort in the protocol field to get all the TLS packets to analyze.
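If you build this kind of filter often, a tiny helper can assemble it from the two IP addresses and the two ports. This is just a convenience sketch; the function name is made up, and the values passed in are the ones from the filter above.

```python
def conversation_filter(client_ip: str, client_port: int,
                        server_ip: str, server_port: int) -> str:
    """Build a Wireshark display filter for one TCP conversation."""
    return (
        f"(ip.addr eq {client_ip} and ip.addr eq {server_ip}) and "
        f"(tcp.port eq {client_port} and tcp.port eq {server_port})"
    )


# Same address pair and port pair as the filter in the text.
print(conversation_filter("192.168.1.206", 57914, "184.108.40.206", 443))
```

Paste the printed string into the Wireshark display filter bar.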
In this snapshot, packets from 114 to 156 indicate the TLS connection setup. Packet 156 with info “Encrypted Handshake Message” is the last packet in the TLS connection setup and subsequent packets are related to the HTTP GET request and response.
The TLS packets number 166 and later are specific to HTTPS but we can’t see the contents because of the TLS encryption. If we look at packet 166, we can see that it has a TCP payload of 575 bytes, and is http-over-tls.
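Even without decrypting anything, the first 5 bytes of each TLS record are sent in the clear and say what kind of record it is. The sketch below parses that header. The function name and the sample bytes are made up for illustration and were not copied from packet 166. The sample assumes a single application data record of 570 bytes, which together with the 5-byte header would add up to a 575-byte TCP payload.

```python
import struct

# TLS record content types defined by the TLS specifications.
CONTENT_TYPES = {
    20: "change_cipher_spec",
    21: "alert",
    22: "handshake",
    23: "application_data",
}


def parse_tls_record_header(data: bytes):
    """Return (content type name, (major, minor) version, payload length)."""
    if len(data) < 5:
        raise ValueError("a TLS record header is 5 bytes long")
    content_type, major, minor, length = struct.unpack("!BBBH", data[:5])
    return CONTENT_TYPES.get(content_type, "unknown"), (major, minor), length


# Hypothetical header: type 23, version bytes 3.3, length 0x023a (570 bytes).
sample = bytes([0x17, 0x03, 0x03, 0x02, 0x3A])
print(parse_tls_record_header(sample))
```

A record of type application_data is where the encrypted HTTP request and response travel; records of type handshake belong to the TLS connection setup.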
There are ways to decode TLS in Wireshark and a quick Google search will provide the specifics.
That’s it. This webpage decode walked through a Wireshark analysis connecting from a home office to an AWS hosted static web page, which is a pretty typical, real-world, internet transaction.
- The IP Address you get by pinging an AWS hosted website may be different from the one the browser gets from its own DNS lookup.
- Know the TCP 3-Way Handshake Table
- Google Chrome may set up parallel TCP sessions for performance, SSL security or other reasons that we don’t know.
A future version of this will include an HTTP session instead of HTTPS so we can dig a little bit into the HTTP requests.