I ran into a very odd situation today. Now, I know that there are those out there in cyberland who will have seen this before, but I have not, and on the odd chance that it might help you, I post this.
We are running WNLB (Windows Network Load Balancing) with the member servers hosted on VMware. According to VMware, that means we should be using multicast mode, so we did. We swiftly noticed that other servers on the local VLAN could not find the WNLB address. In fact, the switch itself could not ping the WNLB address, yet devices on OTHER VLANs could ping it. WTF?! We double-checked the setup and redid the static ARP entry on the switch with this:
ARP 10.1.8.75 03bf.c0a8.0164 ARPA
Here is what it looked like AFTER we did the proper static ARP configuration on the switch…
Notice that the tracert shows even the simplest request being pushed to the local gateway.
We are using the following switch hardware:
Cisco IOS Software, Catalyst 4500 L3 Switch Software (cat4500-ENTSERVICESK9-M), Version 12.2(54)SG, RELEASE SOFTWARE (fc3)
System image file is "bootflash:cat4500-entservicesk9-mz.122-54.SG.bin"
cisco WS-C4948-10GE (MPC8540) processor (revision 5) with 262144K bytes of memory.
Processor board ID FOX092101VW
After going back and forth, including rebuilding the WNLB configuration, we realized we were dealing with a multicast-capable switch. Having nothing to lose, we ran the following on the switch:
no ip multicast-routing
Voilà! Now we can resolve the WNLB address, ping it, tracert to it, and actually access services on the member servers. Oddly, a colleague had a similar issue at the same time. Their situation was resolved by applying the arp IP MAC arpa command not only on the switch the WNLB servers connected to, but also on all distribution switches and the core of the stack.
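If you have to type that static ARP entry on several switches, it may help to generate the line rather than hand-convert the MAC. Below is a rough Python sketch, assuming the cluster uses the default (non-IGMP) multicast MAC that NLB derives from the cluster IP: 03-BF followed by the four octets of the IP in hex. The function names are my own, not part of any tool. By that rule 10.1.8.75 works out to 03bf.0a01.084b, while the MAC shown in my command above corresponds to 192.168.1.100, so always confirm the real cluster MAC in Network Load Balancing Manager before pasting anything into a switch.

```python
import ipaddress


def nlb_multicast_mac(cluster_ip):
    """Return the default NLB multicast-mode MAC in Cisco dotted notation.

    Assumption: default multicast mode (not IGMP multicast, not unicast),
    where the MAC is 03-BF followed by the cluster IP's four octets.
    """
    octets = ipaddress.IPv4Address(cluster_ip).packed  # 4 raw bytes
    hex_digits = (bytes([0x03, 0xBF]) + octets).hex()  # 12 hex characters
    # Cisco writes MACs as three groups of four hex digits.
    return ".".join(hex_digits[i:i + 4] for i in range(0, 12, 4))


def static_arp_command(cluster_ip, mac=None):
    """Build the global-config 'arp IP MAC ARPA' line for a switch.

    Pass mac explicitly if the cluster MAC differs from the derived default.
    """
    if mac is None:
        mac = nlb_multicast_mac(cluster_ip)
    return "arp {} {} ARPA".format(cluster_ip, mac)


if __name__ == "__main__":
    print(static_arp_command("10.1.8.75"))
```

The same line can then be applied to each switch that needs it, which in my colleague's case meant the access switch, the distribution switches, and the core.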