Cisco IP Device Tracking

IPDT and DHCP collision detection

14 July 2017   11 min read

Ever wondered how ACS gets an end user’s IP, or why the output for an interface’s authentication sessions includes the IP of the attached host? This all stems from IP Device Tracking (IPDT). I only recently came across it when troubleshooting an issue where Windows machines were not getting a DHCP address due to collision detection involving the 0.0.0.0 address. Although there are lots of posts from people with a similar issue and its workarounds, I couldn’t find much information on exactly why it happens. This post is designed to explain the reasoning behind it.



IPDT Operation

IPDT uses ARP inspection to maintain a database of MAC/IP pairs per VLAN for every switchport. This information is used by features that depend on it, such as 802.1x, MAB (ACS & ISE), Netflow, Trustsec and web-auth. For example with MAB, the port is first authenticated, then in a subsequent RADIUS update packet the device tracking info (the IP of the device) is passed on to ACS.

As it uses ARP, even if you don’t use this feature for anything else it is a great way to check ARP entries without having to log into the device that is the gateway for the subnet.

On earlier code IPDT is apparently disabled by default (not always true), but from 15.2(1)E upwards it can’t be enabled or disabled manually. It is enabled depending on the features that use it, so if a feature that relies on it is enabled, IPDT is also enabled. You can see which features use IPDT on an interface, so to disable it for that interface you need to disable those features on that interface.

Switch# show ip device tracking interface gig 1/0/9
--------------------------------------------
Interface GigabitEthernet1/0/9 is: STAND ALONE
IP Device Tracking = Disabled
IP Device Tracking Probe Count = 3
IP Device Tracking Probe Interval = 180000
IPv6 Device Tracking Client Registered Handle: 75
IP Device Tracking Enabled Features:
HOST_TRACK_CLIENT_ATTACHMENT
HOST_TRACK_CLIENT_SM

This seems a bit of a shortcoming by Cisco, as they recommend not enabling IPDT on trunk links, but if you use one of these features on a trunk link you can’t disable it. Cisco’s recommendations regarding trunks are based on it potentially affecting switch performance:

  • The IPDT database for a trunk port will be very large as you are tracking the whole ARP table.
  • Probes are sent every 30 seconds so a large table will increase network traffic. Activities of hosts connected indirectly over a trunk may not come to this switch at all but will still be probed.
  • Probes sent by remote switches connected to the local switch via trunk ports may be received, which could increase the chance of the duplicate IP address issue introduced by IPDT probes.
  • The IPDT table is synced between active and standby stack members, so trunk port flaps will cause entry insertion/deletion, resulting in a flood of notifications.

Facts about IPDT

If an interface goes down, the state of its entries in the table is changed to INACTIVE. This also applies if a PC attached to a Cisco phone is shut down: the entry immediately becomes INACTIVE.

  • All entries (ACTIVE and INACTIVE) remain in the IP device tracking table until the maximum table limit is reached (switch dependent), after which the oldest entry is replaced.
  • The default probe interval (aging timer) is 30 seconds. The idle timeout is the probe interval multiplied by the probe count (3), so 90 seconds (idle timer).
  • The switch will send a probe every 30 seconds (aging time) to every port on which it has an ACTIVE entry.
  • If the switch receives an ARP packet (request, reply or GARP) on a port, it resets the aging time back to 30 seconds for that port.
  • If the aging time expires, the switch sends an ARP probe to verify that the host is present. If the host is present, it sends a response to the switch and the switch updates its entry.
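To make the timer behaviour above a little more concrete, here is a minimal Python sketch of it. It is only a model of the rules as listed, not of Cisco's implementation: the function name `probe_times`, the `horizon` parameter and the assumption that the host answers every probe (which restarts the timer) are all mine.

```python
def probe_times(arp_times, interval=30, horizon=120):
    """Return the times (seconds) at which the switch would send a probe.

    arp_times: times at which the switch hears an ARP from the host.
    Assumptions (not from Cisco docs): the first ARP creates the ACTIVE
    entry, every ARP from the host restarts the aging timer, and the host
    answers every probe, which also restarts the timer.
    """
    events = sorted(arp_times)
    if not events:
        return []
    probes = []
    deadline = events[0] + interval      # aging timer started by the first ARP
    pending = events[1:]
    while deadline <= horizon:
        if pending and pending[0] < deadline:
            # An ARP arrived before the timer expired: reset the timer.
            deadline = pending.pop(0) + interval
        else:
            # Timer expired: the switch probes, the host replies.
            probes.append(deadline)
            deadline += interval
    return probes


# A silent host is probed every 30 seconds.
print(probe_times([0], interval=30, horizon=100))
# A chatty host keeps pushing the probe back.
print(probe_times([0, 20, 45], interval=30, horizon=100))
```

The second call shows why a host that keeps ARPing is rarely probed: each ARP moves the next probe 30 seconds further out.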

IPDT probes are in the form of:

Sender IP: 0.0.0.0                   Sender MAC: switch_int_MAC
Target IP: attached_host_IP          Target MAC: attached_host_MAC
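As an illustration of what such a probe looks like on the wire, the sketch below packs the 28-byte ARP payload with Python's `struct` module. The helper name `build_ipdt_style_probe` and the switch MAC are made up for the example; the host MAC and IP are the ones from the debug output in this post. It only builds the bytes and does not send anything.

```python
import socket
import struct


def mac_to_bytes(mac: str) -> bytes:
    """Accept aa:bb:cc:dd:ee:ff or Cisco-style aabb.ccdd.eeff notation."""
    return bytes.fromhex(mac.replace(":", "").replace(".", "").replace("-", ""))


def build_ipdt_style_probe(switch_mac: str, host_mac: str, host_ip: str) -> bytes:
    """Build an ARP request payload shaped like the probe described above."""
    return struct.pack(
        "!HHBBH6s4s6s4s",
        1,                              # hardware type: Ethernet
        0x0800,                         # protocol type: IPv4
        6,                              # hardware address length
        4,                              # protocol address length
        1,                              # opcode: request
        mac_to_bytes(switch_mac),       # sender MAC: switch interface MAC
        socket.inet_aton("0.0.0.0"),    # sender IP: 0.0.0.0
        mac_to_bytes(host_mac),         # target MAC: attached host
        socket.inet_aton(host_ip),      # target IP: attached host
    )


# The switch MAC here is a placeholder, not a value from the post.
probe = build_ipdt_style_probe("00:11:22:33:44:55", "e8e0.b708.6c5e", "10.17.10.81")
print(len(probe), probe.hex())
```

The all-zero sender IP in bytes 14 to 17 is the detail that matters for the rest of this post.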

You will see the debug log message below every 30 seconds, or whenever an ARP packet is received from the host.

Jan  9 2017 12:21:19.063 GMT: sw_host_track-notify:host_track_activate_entry Notify other features: activate -(Gi1/0/10 169.254.11.153 (e8e0.b708.6c5e) VLAN:12 ID:133 ARP)

Example of an IP moving between ports:

Jan  9 2017 12:19:50.241 GMT: %LINK-3-UPDOWN: Interface GigabitEthernet1/0/10, changed state to up
Jan  9 2017 12:19:51.143 GMT: sw_host_track-obj_destroy:Host removed from AVL 1 HASH 1 INTF 1 (Gi1/0/2 10.17.10.81 (e8e0.b708.6c5e) VLAN:12 ID:133 ARP)
Jan  9 2017 12:19:51.143 GMT: sw_host_track-notify:host_track_delete_entry - Notify other features: del - (Gi1/0/2 10.17.10.81 (e8e0.b708.6c5e) VLAN:12 ID:133 ARP)
Jan  9 2017 12:19:51.143 GMT: sw_host_track-obj_destroy:Host entry freed(Gi1/0/2 10.17.52.81 (e8e0.b708.6c5e) VLAN:12 ID:133 ARP)
Jan  9 2017 12:19:51.143 GMT: sw_host_track-obj_create:e8e0.b708.6c5e(169.254.11.153) New mac info AVL node created
Jan  9 2017 12:19:51.143 GMT: sw_host_track-obj_create:e8e0.b708.6c5e(169.254.11.153) Cache entry created
Jan  9 2017 12:19:51.143 GMT: sw_host_track-notify:host_track_add_entry - Notify other features: wired add - (Gi1/0/10 169.254.11.153 (e8e0.b708.6c5e)

Example of the same IP on same port:

Jan  9 2017 12:45:31.540 GMT: %LINK-3-UPDOWN: Interface GigabitEthernet1/0/11, changed state to up
Jan  9 2017 12:45:32.540 GMT: %LINEPROTO-5-UPDOWN: Line protocol on Interface GigabitEthernet1/0/11, changed state to up
Jan  9 2017 12:45:32.695 GMT: sw_host_track-notify:host_track_activate_entry Notify other features: activate -(Gi1/0/11 169.254.11.153 (e8e0.b708.6c5e)

Cisco’s recommendations to tackle the duplicate IP 0.0.0.0 issues are:

  • Use a non-zero source IP in the ARP probes. The downside is that this makes the probes non-RFC compliant.
  • Delay the first IPDT probe (which is triggered by a link-up) by 10 seconds.

To understand why this will resolve the duplicate IP address issue we need to first understand how collision detection works.

Collision Detection (RFC5227)

Collision detection is used by the DHCP client to ensure that the IP it has been assigned isn’t already in use. It does this by sending an ARP request (an ARP probe) sourced from 0.0.0.0, with the target being the DHCP-assigned IP address it wants to use.

Sender IP: 0.0.0.0                Sender MAC: PC_MAC
Target IP: PC_DHCP_IP             Target MAC: 00:00:00:00:00:00

From the tests I performed, it seems that Windows does comply with the RFC in terms of how it performs collision detection. A summary of the collision detection process as per the RFC:

  1. The host waits for a random time interval selected uniformly in the range zero to PROBE_WAIT (1) seconds.
  2. It sends PROBE_NUM (3) probe packets in total.
  3. These probe packets are spaced randomly and uniformly, PROBE_MIN (1) to PROBE_MAX (2) seconds apart (to stop initial probes being sent simultaneously if lots of hosts are powered on at the same time).
  4. If, from the beginning of the probing process until ANNOUNCE_WAIT (2) seconds after the last probe packet is sent, one of two conditions (see below) is met, the host MUST treat this address as being in use by some other host.
  5. If the address is classed as free, the host MUST announce that it is commencing to use it by broadcasting ANNOUNCE_NUM (2) ARP announcements (GARP), spaced ANNOUNCE_INTERVAL (2) seconds apart. The host may begin legitimately using the IP immediately after sending the first of the two ARP announcements.

Therefore the client can wait up to 9 seconds when monitoring for a collision:

  • Minimum time = 6 seconds: 1 (PROBE_WAIT) + [3 (PROBE_NUM) x 1 (PROBE_MIN)] + 2 (ANNOUNCE_WAIT)
  • Maximum time = 9 seconds: 1 (PROBE_WAIT) + [3 (PROBE_NUM) x 2 (PROBE_MAX)] + 2 (ANNOUNCE_WAIT)
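The arithmetic above is easy to reproduce. The sketch below simply mirrors the sums used in this post with the RFC 5227 default constants; the function name `detection_window` is my own.

```python
# RFC 5227 default constants (seconds, except PROBE_NUM)
PROBE_WAIT = 1
PROBE_NUM = 3
PROBE_MIN = 1
PROBE_MAX = 2
ANNOUNCE_WAIT = 2


def detection_window():
    """Return (minimum, maximum) monitoring time in seconds.

    Follows the sums used in this post: the initial wait, one probe
    spacing per probe, then the wait after the last probe.
    """
    minimum = PROBE_WAIT + PROBE_NUM * PROBE_MIN + ANNOUNCE_WAIT
    maximum = PROBE_WAIT + PROBE_NUM * PROBE_MAX + ANNOUNCE_WAIT
    return minimum, maximum


print(detection_window())  # (6, 9)
```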

The 2 possible conditions (reasons) for it detecting a collision are:

  • (a) Another host already has, and is using, the IP address. If during the detection period the host receives:

    • Any ARP packet (Request *or* Reply) on the interface where the probe is being performed
    • Where the packet’s ‘sender IP address’ is the address being probed for
    • Then the host MUST treat this address as being in use by some other host
  • (b) Another host is probing for the same IP at the same time (or an IPDT probe is received). If during the detection period the host receives:

    • Any ARP Probe where the packet’s ‘target IP address’ is the address being probed for
    • And the packet’s ‘sender hardware address’ is not the MAC of any of the host’s interfaces
    • Then the host SHOULD similarly treat this as an address conflict and signal an error to the configuring agent as above
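The two conditions can be written down as a small check. This is a rough Python sketch of the rules as summarised above, not Windows' actual implementation; the `Arp` class, the `conflict_reason` function and the example addresses and MACs are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Set


@dataclass
class Arp:
    sender_ip: str
    sender_mac: str
    target_ip: str


def conflict_reason(pkt: Arp, probed_ip: str, own_macs: Set[str]) -> Optional[str]:
    """Return 'a', 'b' or None for an ARP seen during the detection period."""
    # (a) any ARP whose sender IP is the address being probed for
    if pkt.sender_ip == probed_ip:
        return "a"
    # (b) an ARP probe (sender IP 0.0.0.0) for the same target address,
    #     sent from a MAC that is not one of the host's own interfaces
    if (pkt.sender_ip == "0.0.0.0"
            and pkt.target_ip == probed_ip
            and pkt.sender_mac not in own_macs):
        return "b"
    return None


pc_macs = {"e8e0.b708.6c5e"}
ipdt_probe = Arp("0.0.0.0", "0011.2233.4455", "10.17.10.81")   # from the switch
own_probe = Arp("0.0.0.0", "e8e0.b708.6c5e", "10.17.10.81")    # the PC's own probe
nonzero_src = Arp("10.17.10.1", "0011.2233.4455", "10.17.10.81")

print(conflict_reason(ipdt_probe, "10.17.10.81", pc_macs))   # b
print(conflict_reason(own_probe, "10.17.10.81", pc_macs))    # None
print(conflict_reason(nonzero_src, "10.17.10.81", pc_macs))  # None
```

Note that a probe with a 0.0.0.0 sender from the switch's MAC matches condition (b), while the same packet with a non-zero sender IP matches neither condition.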

How does this relate to IPDT

The duplicate IP issue seen when using IPDT comes down to the IPDT probe matching condition (b).

If the Cisco IPDT probe happens during this 9 second collision detection window, then because both probes have a sender IP of 0.0.0.0 and the same target IP, the Windows client marks the IP as used and waits 30 seconds before trying again for another IP. The DHCP server will mark that IP as a bad address, meaning it can no longer be used. Therefore, if you go back to Cisco’s workarounds:

Use a non-zero source IP in the ARP probes (makes them non-RFC compliant): This solves the problem because the source IP addresses are now different, meaning the client won’t see the probe as another host doing collision detection for the same DHCP-assigned IP (target address).

Delay the first IPDT probe (which is triggered by a link-up) by 10 seconds: As the maximum detection time is 9 seconds, this 10 second delay gives the client time to go through its collision detection process and avoid the conflict.

When a Windows client boots up it does collision detection for its APIPA address straight away, which resets the aging time and gives you 30 seconds (unless another ARP is received in between) before IPDT attempts to probe again. Adding the delay to this gives the PC 40 seconds to boot up and get an IP address.

If PCs do a PXE boot at startup, the 10 second delay may not fix the issue, as it depends on how long your system takes to start up. PXE will not do collision detection, but it will cause the IPDT entry to go ACTIVE, meaning a probe will be sent 30 seconds afterwards, as it is unlikely that any other ARPs will be heard by the switch on that port (to reset the IPDT timer) whilst the PC is booting. Therefore, if it takes 35 seconds for your PC to boot up, the 10 second delay will not fix the problem.

Because of PXE boot and PC boot times, ip device tracking probe delay 20 did not fix the issue for us. Instead we used the following, which resolved the problem. Setting the count to 2 ensures that it still has the default behavior of 90 seconds before entries become INACTIVE.

ip device tracking probe interval 45
ip device tracking probe count 2
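As a quick sanity check on those two values, the idle timeout is just the interval multiplied by the count, as described earlier in the post. The helper below is only that multiplication; the name `idle_timeout` is mine.

```python
def idle_timeout(probe_interval: int, probe_count: int) -> int:
    """Seconds before an unresponsive entry goes INACTIVE (interval x count)."""
    return probe_interval * probe_count


print(idle_timeout(30, 3))  # defaults: 90
print(idle_timeout(45, 2))  # the settings above: 90
```

So the idle timeout stays at 90 seconds, but a host that has gone quiet now gets 45 seconds rather than 30 before the next probe arrives.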

Some useful commands for seeing how it works, but be warned that they will generate a lot of logs if you have many IPDT entries.

debug ip device tracking error
debug ip device tracking notify
debug ip device tracking obj-create
debug ip device tracking obj-destroy

For reference, below is my Windows test machine’s boot process and timings.

1. 0 second - PXE asks for IP using DHCP REQUEST
2. PXE is assigned an IP by DHCP
3. Requests TFTP data
4. 26 seconds - PC sends ARP request (broadcast)
   Sender IP: 0.0.0.0                               MAC: PC_interface_MAC
   Target IP: 169.254.x.x                           MAC: 00:00:00:00:00:00
5. 40 seconds - Sends DHCP DISCOVER/REQUEST
6. DHCP ACK
7. PC sends 3 requests, 1 every second (Windows collision detection)
   Sender IP: 0.0.0.0                              MAC: PC_interface_MAC
   Target IP: PC_DHCP_IP                           MAC: 00:00:00:00:00:00
8. 44 seconds - PC sends a GARP
   Sender IP: PC_DHCP_IP                           MAC: PC_interface_MAC
   Target IP: PC_DHCP_IP                           MAC: 00:00:00:00:00:00
9. Sends several ARPs for the default gateway. As the switch receives these, it won’t send any probes until no ARP has been received for more than 30 seconds
10. 68 seconds - Client sends a DHCP Inform to say “I am using this IP”, which will get ACK from DHCP servers