Learning and mastering the fundamentals are key to becoming a successful network engineer. In this blog post, I will be showing how ARP works. ARP is one of the fundamental protocols when dealing with today’s networks. While ARP itself works automatically and without any configuration, knowing how it works will help you really understand how traffic flows through the network. Before we go into the technical details, we first need to know why ARP is even required.
Why Do We Need ARP?
Address Resolution Protocol (ARP) is described in RFC 826. ARP is the protocol that allows higher-level network addresses (most commonly IPv4 addresses) to be translated into the corresponding MAC addresses used by Ethernet communication. As you move down the networking stack, each higher-level PDU is encapsulated in a lower-level PDU. Below is a visual representation of how network layer PDUs (packets) are encapsulated by data link layer PDUs (frames):
In this example, the network layer is using IP addresses as the source and destination fields for the packet. When the packet moves down the protocol stack into the data link layer, it is encapsulated in a frame that carries source and destination MAC addresses, because that is how the data link layer addresses other hosts. The sending machine already knows its own MAC address; ARP allows it to find out which destination MAC address to use at the data link layer for a given IP address at the network layer. Let's jump into the technical details and see how it figures that out.
The format of the ARP frame is pretty straightforward. I will go through each field and give the values that would be used for IPv4 to Ethernet MAC address resolution.
- 2 byte field for the hardware address type. Ethernet is the most common, which has a value of 1.
- 2 byte field for the protocol address type. We will be working with IPv4, so the type would be 0x0800. This value was specifically chosen to match the type field of an Ethernet frame carrying an IPv4 datagram.
- 1 byte field for the hardware address size in bytes. Ethernet MAC addresses are 48 bits long, so this value would be 6 (48 bits / 8 bits per byte).
- 1 byte field for the protocol address size in bytes. IPv4 addresses are 32 bits long, so the value would be 4 (32 bits / 8 bits per byte).
- 2 byte Op field. This field specifies the ARP operation. For an ARP request (What is the hardware address for this particular protocol address?) the value is 1, and for an ARP reply (This is the hardware address for the requested protocol address.) the value is 2.
- 6 byte field for the sender’s hardware address. The length of this field will match the size contained in the hardware size field.
- 4 byte field for the sender’s protocol address. The length of this field will match the size contained in the protocol size field.
- 6 byte field for the target’s hardware address. In an ARP request, because the target’s hardware address is unknown, this is set to all 0s.
- 4 byte field for the target’s protocol address. This is the IPv4 address we are trying to resolve a MAC address for.
The first 8 bytes (the two type fields, the two size fields, and the Op field) always have the same layout, but the address fields that follow vary in size depending on the type of hardware and protocol addresses being used. For IPv4 over Ethernet, the complete ARP message is 28 bytes long.
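To make the field layout above concrete, here is a minimal Python sketch that packs and unpacks an IPv4-over-Ethernet ARP request with the standard `struct` module. The function names are my own, and the sender MAC address is a placeholder for illustration rather than a value taken from the lab.

```python
import socket
import struct

# 8 fixed bytes (types, sizes, op) followed by the four address fields.
# "!" selects network byte order with no padding between fields.
ARP_IPV4_ETHERNET = "!HHBBH6s4s6s4s"


def mac_to_bytes(mac: str) -> bytes:
    """Convert 'aa:bb:cc:dd:ee:ff' into 6 raw bytes."""
    return bytes.fromhex(mac.replace(":", ""))


def bytes_to_mac(raw: bytes) -> str:
    """Convert 6 raw bytes back into 'aa:bb:cc:dd:ee:ff'."""
    return ":".join(f"{b:02x}" for b in raw)


def build_arp_request(sender_mac: str, sender_ip: str, target_ip: str) -> bytes:
    return struct.pack(
        ARP_IPV4_ETHERNET,
        1,                            # hardware type: Ethernet
        0x0800,                       # protocol type: IPv4
        6,                            # hardware address size in bytes
        4,                            # protocol address size in bytes
        1,                            # op: 1 = request, 2 = reply
        mac_to_bytes(sender_mac),     # sender hardware address
        socket.inet_aton(sender_ip),  # sender protocol address
        bytes(6),                     # target hardware address unknown: all 0s
        socket.inet_aton(target_ip),  # target protocol address
    )


def parse_arp(message: bytes) -> dict:
    htype, ptype, hlen, plen, op, sha, spa, tha, tpa = struct.unpack(
        ARP_IPV4_ETHERNET, message
    )
    return {
        "op": op,
        "sender_mac": bytes_to_mac(sha),
        "sender_ip": socket.inet_ntoa(spa),
        "target_mac": bytes_to_mac(tha),
        "target_ip": socket.inet_ntoa(tpa),
    }


# The sender MAC below is a placeholder, not a value from the captures.
request = build_arp_request("00:50:79:66:68:00", "192.168.1.2", "192.168.1.3")
fields = parse_arp(request)
```

The format string only covers the IPv4/Ethernet case; a general parser would read the two size fields first and then slice the address fields accordingly.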
In our example of resolving an IPv4 address to a MAC address, the ARP frame is contained inside an Ethernet frame. The source MAC address of the Ethernet frame is the sender's MAC address, and the destination MAC address is set to the special ff:ff:ff:ff:ff:ff address, which is the broadcast address. Inside the ARP payload, I have seen the target hardware address field set to either 00:00:00:00:00:00 or ff:ff:ff:ff:ff:ff; either works because delivery depends on the Ethernet frame itself being addressed to ff:ff:ff:ff:ff:ff. The frame will be sent to all hosts on the same broadcast domain. This also means that ARP will only work when trying to resolve hosts on the same broadcast domain.
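As a rough illustration of that encapsulation, the sketch below prepends an Ethernet header to an ARP payload. The EtherType for ARP is 0x0806; the helper name, the source MAC, and the all-zero stand-in payload are assumptions made only for this example.

```python
import struct

BROADCAST_MAC = bytes.fromhex("ffffffffffff")
ETHERTYPE_ARP = 0x0806  # EtherType that identifies an ARP payload


def ethernet_frame(src_mac: bytes, payload: bytes,
                   dst_mac: bytes = BROADCAST_MAC) -> bytes:
    """Prepend a 14-byte Ethernet header: destination, source, EtherType."""
    header = struct.pack("!6s6sH", dst_mac, src_mac, ETHERTYPE_ARP)
    return header + payload


src = bytes.fromhex("005079666800")  # placeholder sender MAC
arp_payload = bytes(28)              # stand-in for a real 28-byte ARP message
frame = ethernet_frame(src, arp_payload)
```

An ARP request uses the default broadcast destination, while a reply would pass the requester's MAC as `dst_mac`.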
ARP In Action
I will be showing the two basic ARP operations today: ARP request and ARP reply. An ARP request is sent when a host does not know the destination MAC address and asks if anyone knows it, and an ARP reply is sent when the host that owns the target IP address answers with its MAC address. When a host successfully uses ARP to find the MAC address for a specific IPv4 address, it will store the binding in what is known as the ARP table or ARP cache. This allows the host to quickly look up the IP to MAC binding in the future without having to broadcast another ARP request. A commonly cited timeout for ARP table/cache entries is 20 minutes, but the default varies between operating systems and devices, and most allow the value to be changed. I will be demonstrating ARP in two examples. The first will be hosts on the same subnet connected by a switch, and the second will be two hosts on different subnets connected by a router.
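The caching behavior described above can be sketched as a small table with per-entry expiry. This is only an illustration: the class name, the injectable clock, and the 20 minute default are assumptions for the example, and real operating systems manage their ARP caches with more elaborate state.

```python
import time


class ArpCache:
    """Toy IP-to-MAC table whose entries expire after a fixed timeout."""

    def __init__(self, timeout_seconds: float = 20 * 60, clock=time.monotonic):
        self._timeout = timeout_seconds
        self._clock = clock    # injectable so expiry can be tested
        self._entries = {}     # ip -> (mac, expires_at)

    def learn(self, ip: str, mac: str) -> None:
        """Store or refresh a binding learned from an ARP message."""
        self._entries[ip] = (mac, self._clock() + self._timeout)

    def lookup(self, ip: str):
        """Return the cached MAC, or None if unknown or expired."""
        entry = self._entries.get(ip)
        if entry is None:
            return None
        mac, expires_at = entry
        if self._clock() >= expires_at:
            del self._entries[ip]  # stale: caller must send a new ARP request
            return None
        return mac


cache = ArpCache()
cache.learn("192.168.1.3", "00:50:79:66:68:01")
cached_mac = cache.lookup("192.168.1.3")
```

A `None` result from `lookup` is the point at which a host would broadcast a fresh ARP request.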
ARP Request & Reply: Same Subnet
The topology for this example will consist of three hosts connected to the same switch. I am using three hosts so that I can demonstrate how the ARP request is sent to all hosts, but the ARP reply is only sent to the original requestor. I will be configuring all hosts for the 192.168.1.0/24 subnet using VPCS hosts in GNS3, as depicted below:
Make note of the MAC address for each host. To demonstrate ARP, I will be pinging from one host to another. You can use the “show arp” command on the VPCS hosts to ensure that there are no MAC to IP bindings already in the ARP table:
Before I try to ping, I started a packet capture on each host connection by right-clicking on the connection and selecting “start capture.”
I will now ping 192.168.1.3 from 192.168.1.2. Let's first look at the Wireshark capture from 192.168.1.2’s connection to the switch.
The first frame captured is the ARP request. If we look at the bottom portion of the Wireshark screen, we can inspect the fields of the ARP frame. The opcode is “1” because this is an ARP request. The Sender MAC and IP match what we previously saw on the “show ip” from the first host. The Target IP is 192.168.1.3 because that is the IP we are trying to ping. In this capture the Target MAC is set to the broadcast ff:ff:ff:ff:ff:ff address rather than all 0s. The next frame is the ARP reply, which states that 192.168.1.3 can be found at the MAC address 00:50:79:66:68:01. As we saw earlier, that is the MAC address of the second host, which was configured for 192.168.1.3.
Now let’s look at the screen capture from host 2 and the details of the ARP reply:
Host 2 received the ARP request and then sent an ARP reply. The opcode on the reply is 2 to signify that it is a reply. In this frame, the Sender MAC and IP are set to host 2’s, and the Target MAC and IP are those of host 1, which originally sent the request.
One important thing to note here is that Host 2 was able to reply directly to the initial ARP request without having to do an ARP request of its own. Host 2 was able to use the information from the original ARP request and use it in the ARP reply. We can use “show arp” on the VPCS hosts to see the ARP table:
The last thing we will check is the packet capture on host 3. Because it is on the same subnet as host 1, it receives the broadcast ARP request. The ARP reply, however, is addressed only to host 1, so in this capture we will only see the ARP request and not the reply.
Host 3 receives the ARP request, but ignores it after seeing that the Target IP is not its own. That is the basics of how ARP works on a subnet when two hosts are trying to communicate, and how the rest of the subnet sees the ARP process. Next we will see how ARP works when trying to contact a host on another subnet.
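The behavior of host 2 and host 3 in this exchange can be sketched as a single decision: answer if the Target IP is your own, otherwise stay silent. The class and function names below are hypothetical, and apart from host 2's MAC address (00:50:79:66:68:01), the MAC addresses and host 3's IP address are placeholders rather than values from the captures.

```python
from dataclasses import dataclass
from typing import Optional

OP_REQUEST = 1
OP_REPLY = 2


@dataclass
class ArpMessage:
    op: int
    sender_mac: str
    sender_ip: str
    target_mac: str
    target_ip: str


def handle_request(my_mac: str, my_ip: str, msg: ArpMessage) -> Optional[ArpMessage]:
    """Return an ARP reply if the request is for our IP, otherwise None."""
    if msg.op != OP_REQUEST or msg.target_ip != my_ip:
        return None  # not for us: ignore, as host 3 does
    # The requester's addresses come straight from the request itself,
    # so no second ARP lookup is needed to build the reply.
    return ArpMessage(
        op=OP_REPLY,
        sender_mac=my_mac,
        sender_ip=my_ip,
        target_mac=msg.sender_mac,
        target_ip=msg.sender_ip,
    )


# Host 1's MAC is a placeholder for illustration.
request = ArpMessage(OP_REQUEST, "00:50:79:66:68:00", "192.168.1.2",
                     "ff:ff:ff:ff:ff:ff", "192.168.1.3")

host2_reply = handle_request("00:50:79:66:68:01", "192.168.1.3", request)
# Host 3's MAC and IP are placeholders for illustration.
host3_reply = handle_request("00:50:79:66:68:02", "192.168.1.4", request)
```

Here `host2_reply` carries host 2's addresses as the sender and host 1's as the target, while `host3_reply` is `None`.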
ARP Request & Reply: Different Subnets
ARP is also used when communicating with remote hosts. In this case, rather than resolving the MAC address of the far end device, the sending host will resolve the MAC address of its gateway. In our first example, host 1 knew that the IP that it was trying to communicate with (192.168.1.3) was on the same subnet, so it sent out an ARP request directly for host 2. When a host calculates that the IP that it is trying to communicate with is on a different network, the host will use ARP to resolve the MAC address of its default gateway. Below is the topology that I will use to demonstrate this example:
We should see two separate ARP request/reply combos. The first is 192.168.1.2 requesting the MAC of 192.168.1.1 (its gateway), and the second is the router (192.168.2.1) requesting the MAC of 192.168.2.2. Here is the configuration for the two hosts:
R1 is configured with the IP addresses 192.168.1.1 and 192.168.2.1. Again I will start a capture on each connection (here I will use the connection between the switch and router), and this time I will ping 192.168.2.2 from 192.168.1.2. Below are the two captures:
Host 1 (192.168.1.2) does not know the MAC address of its gateway (192.168.1.1) and sends an ARP request. The router replies with an ARP reply telling what the MAC address for 192.168.1.1 is.
R1 (192.168.2.1) does not know the MAC address of host 2 (192.168.2.2) and sends an ARP request. The host replies with an ARP reply telling what the MAC address for 192.168.2.2 is. If we look at the ARP table on the router, we now see entries for both 192.168.1.2 and 192.168.2.2.
This scenario shows the importance of the ARP table. If an ARP table were not maintained, a host communicating with another network would have to resolve the MAC address of its gateway for every frame sent! Populating a table allows for much quicker processing of the IPv4 to Ethernet MAC address translation.
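The choice host 1 makes at the start of this scenario, whether to ARP for the destination itself or for its gateway, can be sketched with Python's standard `ipaddress` module. The function name is my own, and the /24 prefix is assumed to match the subnets used in the lab.

```python
import ipaddress


def arp_target(local_interface: str, gateway: str, destination: str) -> str:
    """Return the IP address the host should send its ARP request for."""
    iface = ipaddress.ip_interface(local_interface)
    dest = ipaddress.ip_address(destination)
    if dest in iface.network:
        return str(dest)  # same subnet: resolve the destination directly
    return gateway        # different subnet: resolve the default gateway


same_subnet = arp_target("192.168.1.2/24", "192.168.1.1", "192.168.1.3")
other_subnet = arp_target("192.168.1.2/24", "192.168.1.1", "192.168.2.2")
```

With these inputs, `same_subnet` is the destination itself (192.168.1.3) and `other_subnet` is the gateway (192.168.1.1), matching the two examples in this post.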
ARP is the protocol used when trying to translate network addresses to data link layer addresses. While ARP can support a number of network and data link address types, the most popular are IPv4 and Ethernet MAC addresses. In this blog post I gave a basic overview of ARP, why its creation was needed, what the frame format is, and how ARP works translating IPv4 addresses to Ethernet MAC addresses. ARP may “just work,” but every network engineer should know the details of how it works in order to have a firm understanding of how Ethernet and IPv4 interact on a network.