As described in my blog post here, I experienced an issue with certain Intel ethernet controllers. Here's how to see if your controllers are affected.
For this simplified test you'll need two machines (one to replay the packet and one to receive it), and both need to be on the same ethernet segment. No routers or VLAN-aware switches should be in the mix (but dumb switches/hubs should be fine). Replay the capture from the transmitting machine; the general form of the command is followed by an example:
sudo tcpreplay -v -i [transmitting interface] [pcap name]
sudo tcpreplay -v -i eth1 pod-icmp-ping.pcap
If your controllers are affected, the ethernet interface on the receiving machine will lose link. In many circumstances the only way to get the controller working again is to physically power off the machine and power it back on.
NOTE: These packets will be sent to the ethernet broadcast address (to simplify testing). If you are affected by this issue, it will take down all of the affected ethernet interfaces on the connected network. If that is of concern, you should use tcpreplay-edit to set a specific destination ethernet address:
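To confirm the link loss on the receiving machine, you can check the carrier state before and after the replay. This is a minimal sketch, assuming a Linux receiver that exposes /sys/class/net/[interface]/carrier. The link_state function name and the SYSFS_NET override are hypothetical and exist only for this example.

```shell
#!/bin/sh
# Print "up", "down", or "unknown" for the given interface.
# SYSFS_NET is a hypothetical override so the function can be pointed
# at a test directory; it defaults to the real sysfs location.
link_state() {
    iface=$1
    carrier_file="${SYSFS_NET:-/sys/class/net}/$iface/carrier"
    # The read fails if the interface does not exist or is
    # administratively down, so report "unknown" in that case.
    if ! state=$(cat "$carrier_file" 2>/dev/null); then
        echo unknown
    elif [ "$state" = 1 ]; then
        echo up
    else
        echo down
    fi
}
```

Running "link_state eth1" on the receiver before and after the replay should show whether the link dropped (eth1 is only an example interface name).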
sudo tcpreplay-edit --enet-dmac=00:11:22:33:44:55 -v -i eth1 pod-icmp-ping.pcap
Where "00:11:22:33:44:55" is the MAC address of the machine you'd like to test.
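Since a mistyped or broadcast destination defeats the purpose of targeting a single machine, it may be worth checking the address before passing it to tcpreplay-edit. The following is a rough sketch, not part of tcpreplay; is_unicast_mac is a hypothetical helper name. It relies on the fact that the lowest bit of the first octet is set for broadcast and multicast addresses.

```shell
#!/bin/sh
# Return 0 if the argument is a well-formed unicast MAC address,
# 1 otherwise (malformed, multicast, or broadcast).
is_unicast_mac() {
    mac=$1
    # Require six colon-separated hex octets.
    printf '%s\n' "$mac" |
        grep -Eq '^([0-9A-Fa-f]{2}:){5}[0-9A-Fa-f]{2}$' || return 1
    # The low bit of the first octet marks group addresses.
    first=$((0x${mac%%:*}))
    [ $((first & 1)) -eq 0 ]
}

# Example use (interface and pcap names taken from the command above):
#   is_unicast_mac 00:11:22:33:44:55 &&
#       sudo tcpreplay-edit --enet-dmac=00:11:22:33:44:55 -v -i eth1 pod-icmp-ping.pcap
```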
Finding other examples (findpod):
I've had various people report similar (if not identical) behavior with various other ethernet controllers and traffic types. If you're experiencing sporadic failures of your ethernet controller and you think they may be related to network traffic you're receiving, I've created a tool called "findpod" that can help you narrow your search. The script is named "findpod.sh" and there is a download link below. If you're using a Debian-based system you can install it like so:
sudo bash ./findpod.sh install
As news of this issue spreads further, it's becoming clear that some controllers are affected and some aren't. That's more or less what I expected. Here's what I know about fixing this.
It has been my understanding that Intel provides at least two EEPROM versions for this chip: one with BMC enabled and one without. My controllers do not have BMC enabled, so my fix only applies to non-BMC-enabled controllers. This is unfortunate because the BMC-enabled controllers seem to be much more widely used. Even so, other than the very basics (MAC address and checksum), I don't know the meaning of these values. That's another reason not to reprogram the EEPROM on your NIC based on what some guy on the internet told you.
With that being said, here is a diff between an affected EEPROM and a good EEPROM:
-0x0010: ff ff ff ff 6b 02 00 00 86 80 d3 10 ff ff 5a c0
Where the "-" lines were the bad EEPROM and the "+" lines were the good EEPROM.
Under Linux you can view these values with ethtool:
sudo ethtool -e [interface]
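To compare your own dump against the affected row shown in the diff above, something like the following could work. It is only a sketch: matches_affected_row is a hypothetical helper, it assumes ethtool prints each row as an offset followed by space-separated hex bytes, and a match on this single row is a hint rather than proof that a controller is affected.

```shell
#!/bin/sh
# Row 0x0010 of the affected EEPROM, copied from the "-" line of the diff.
BAD_0X0010="ff ff ff ff 6b 02 00 00 86 80 d3 10 ff ff 5a c0"

# Read `ethtool -e` output on stdin. Return 0 if the 0x0010 row
# matches the affected row, 1 if it differs or is missing.
matches_affected_row() {
    awk -v bad="$BAD_0X0010" '
        $1 ~ /^0x0010:?$/ {
            # Rebuild the row with single spaces between bytes.
            row = $2
            for (i = 3; i <= NF; i++) row = row " " $i
            found = (row == bad)
        }
        END { exit(found ? 0 : 1) }
    '
}

# Example use (eth1 is only an example interface name):
#   sudo ethtool -e eth1 | matches_affected_row &&
#       echo "row 0x0010 matches the affected EEPROM"
```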