Ethernet networks have taken over as the prominent bus architecture for industrial automation networks. And for good reason.
Ethernet simplifies and homogenizes connectivity between industrial devices. We don’t have to worry about male or female DB9 and DB25 serial connectors, DTE versus DCE devices, or straight-through versus null-modem cables. In general, the world is a much happier place.
However, even with the added simplicity Ethernet offers, troubleshooting Ethernet networks can be a bit more involved than troubleshooting serial networks.
Let’s assume you’re trying to establish communication with an industrial Ethernet device that for whatever reason has dropped off the network. Here are a few tips I always use when starting to troubleshoot an industrial Ethernet network.
The easiest solution is often the right solution
I’m a Windows user by default. But as much as I love Windows (particularly Windows 10), it's not entirely without fault. Occasionally, there are times when the simplest solution to a networking problem in Windows—and yes, it pains me to say it—is to reboot your PC.
There have been times where I’ve spent close to an hour checking everything I can think of, like running ping, nslookup, traceroutes, etc., just to reboot the PC and have that completely fix whatever the phantom networking problem was.
So, as a first resort when you have trouble getting your Windows PC connected and communicating on your Industrial Ethernet network, if possible, reboot your PC and proceed from there.
DOS is your friend
While it’s no longer technically called the DOS prompt (DOS is short for Disk Operating System), the Windows command line is still incredibly helpful in troubleshooting industrial Ethernet networks.
Open a command line by going to Start, then Run, and typing cmd. If you’re on Windows 10, you can just click Start and type cmd, and you’ll get the same result: a black window with a flashing cursor ready to do your industrial network bidding.
The first thing to check is your PC’s general network health. The easiest way to do that is to run the command ipconfig. Depending on what’s going on with your PC, you may get one of several possible outputs to the command line.
If everything is working as it should you’ll see your Ethernet adapter listed, followed by your PC’s current IP configuration. This is ideally what we want to see.
Another possibility is that your PC will report "Media disconnected."
This essentially means the connection between your PC and the Ethernet switch has malfunctioned. Here are some things to check.
Note: A lot of these tips can be applied to the end device you’re trying to connect to, like a PAC, a PLC, or that rack of intelligent Ethernet I/O. What!? You’re not using intelligent I/O!? That's so '90s. Check out The Case for Intelligent Remote I/O white paper.
Bad Ethernet cable
Swap your current Ethernet cable out for a known good cable and run ipconfig again. If you’re still seeing "Media disconnected," there’s a problem with either the Ethernet switch interface or your PC’s network interface card (NIC).
Also, look at the RJ-45 connector on your cable. If the locking tab is broken, save yourself some heartache and throw out the cable, or crimp a new RJ-45 on it. Trust me, you’ll save yourself frustration down the road. (And when it comes to RJ-45 crimping tools, don’t cheap out on yourself. The right tool will save you a ton of time down the road. Outfit your shop with the right tools using this blog post.)
Disabled switch port
Sometimes switch ports are configured to be disabled or turned off by the IT network administrators.
This is actually a good security practice. You don’t want someone to be able to just walk into your plant or factory, plug in their PC, and start snooping around on your network.
I remember a few years ago, a multimillion-dollar company I worked with went through a security audit. During the audit, a penetration tester slipped in through a door that was usually locked, just as someone was exiting.
In the security world we call that tailgating. You can read more about how to defend your company from tailgating in this www.securitymagazine.com article.
The penetration tester meandered over to an open cubicle, sat down, jacked into the network, and spent the next two days running exploits against the company's servers, where they thought their source code and other intellectual property were safe. It was over a day and a half before someone finally asked the penetration tester who she was and why she was there.
So check with the network administrator and make sure the Ethernet port on the switch is actually enabled. The link light on the switch or your network card can also help tell you this.
Wrong duplex and speed setting
Remember that Ethernet was not always 10/100/1000 Mbps and full duplex.
Lots of devices support different Ethernet speeds. It’s up to the two Ethernet devices on each end of the Ethernet cable to negotiate the speed and duplex settings of the connection. But this negotiation fails more often than people realize.
You can always try hardcoding the Ethernet switch’s interface to whatever the attached device supports. If you don’t know for sure what settings are supported by your device or PC, start small with 10 Mbps half duplex and work your way up from there.
If you want to learn more about what happens at the link layer on Ethernet networks, check out this blog post on Link Layer Troubleshooting.
Once you’ve confirmed the physical Ethernet connection between your PC and switch is good and working, it’s time to move up the network stack, so to speak.
Verify your IP settings
In your ipconfig command output, check whether the IP address starts with 169.254. An address in the 169.254.X.X range is one that Windows assigns to itself, and it indicates two things:
- Your PC is configured to automatically receive an IP address from the industrial network’s DHCP server. (A DHCP server is a server on the network that hands out IP addresses to industrial network devices and keeps track of them.)
- For whatever reason, the PC asked the DHCP server for an address but didn’t receive a response.
Uh-oh.
If you can’t get the DHCP server to respond right away, one workaround is to give your PC a static IP address. But be careful with a static IP. If you don’t know what IP addresses are in use on the network and configure your PC with an address already in use by another device, you can wreak all kinds of havoc on the network. Duplicate IP addresses on a network can be a real pain to track down.
Another possibility is that the DHCP server is only configured to hand out IP addresses to devices it knows, and for some reason it doesn't know your PC. This is another good security practice to make sure rogue nodes don’t show up on your industrial Ethernet network.
As a side note, the four numbers in an IP address are called octets, which sounds like some sort of dreadful night at the opera but is more closely related to the IP address being a 32-bit address broken into four 8-bit sections. So now you know about octets. And knowing is half the battle.
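To make the octet idea and the 169.254.X.X check a little more concrete, here is a minimal Python sketch using the standard library's ipaddress module. The function name describe_ipv4 and the sample addresses are my own, made up for illustration; treat it as a rough helper, not a finished tool.

```python
import ipaddress


def describe_ipv4(text: str) -> dict:
    """Break an IPv4 address into its four octets and flag 169.254.x.x addresses."""
    addr = ipaddress.IPv4Address(text)
    return {
        # packed is the address as 4 raw bytes, one per octet (8 bits each)
        "octets": list(addr.packed),
        # the same address viewed as a single 32-bit number
        "as_32_bit_int": int(addr),
        # True for anything in 169.254.0.0/16, the self-assigned range
        "link_local": addr.is_link_local,
    }


if __name__ == "__main__":
    # Hypothetical addresses, used only as examples
    for sample in ("169.254.17.42", "192.168.1.10"):
        print(sample, describe_ipv4(sample))
```

If link_local comes back True for the address ipconfig reports, you are most likely looking at the "no answer from the DHCP server" situation described above.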
Application connectivity troubleshooting
So at this point you should have a known good working PC connected to a known good working Ethernet switch port. Now let’s assume you’re trying to communicate with another device on the network but running into problems. Maybe you’ve got a new PAC or PLC that dropped off the network overnight. How do you track down the problem?
Again, DOS is your friend. Open a command prompt and type ping X.X.X.X, where X.X.X.X is the IP address of the device you’re trying to connect to.
The ping command in networking is one of those tools you’ll use more than just about any other. Ping is basically the same concept used by submarine sonar. We send a message out to a device and hope to get a response.
Ping relies on the incredibly important Internet Control Message Protocol (ICMP), which is used to facilitate general network communication. If you were to look at a network trace of a ping, what you’d actually see is an ICMP echo request packet. And if the device being pinged responded, you’d see an ICMP echo reply packet coming back.
The response you get to the command can help clue you in to what’s going wrong. If you get a "request timed out" message, that typically means the device didn’t answer, most often because it simply wasn’t found on the network. Go back up and repeat the Ethernet troubleshooting tips above.
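If you’re curious what that echo request looks like on the wire, here is a small Python sketch that builds one by hand: an 8-byte ICMP header (type 8, code 0, checksum, identifier, sequence number) followed by a payload. The function names, the identifier value, and the payload are my own choices for illustration. The sketch only builds the bytes; actually sending them needs a raw socket, which usually requires administrator rights, so that part is left out.

```python
import struct

ICMP_ECHO_REQUEST = 8  # ICMP type for an echo request
ICMP_ECHO_REPLY = 0    # ICMP type for an echo reply


def internet_checksum(data: bytes) -> int:
    """16-bit one's-complement checksum used by ICMP."""
    if len(data) % 2:
        data += b"\x00"  # pad to a whole number of 16-bit words
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:  # fold any carry bits back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def build_echo_request(identifier: int, sequence: int, payload: bytes = b"ping") -> bytes:
    """Return the raw bytes of an ICMP echo request (header plus payload)."""
    # First pass: checksum field set to zero, as required when computing it
    header = struct.pack("!BBHHH", ICMP_ECHO_REQUEST, 0, 0, identifier, sequence)
    checksum = internet_checksum(header + payload)
    # Second pass: same header with the real checksum filled in
    header = struct.pack("!BBHHH", ICMP_ECHO_REQUEST, 0, checksum, identifier, sequence)
    return header + payload


if __name__ == "__main__":
    packet = build_echo_request(identifier=0x1234, sequence=1)
    print(packet.hex())
    # A valid packet checksums to zero when the checksum field is included
    print("checksum ok:", internet_checksum(packet) == 0)
```

An echo reply has the same layout, with the type field set to 0 instead of 8.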
Now if you do get a reply from the IP address, but for whatever reason you can't get your software application to connect to your PLC or PAC, that could happen for several reasons.
Duplicate IP address on the local area network (LAN)
- The first thing to do is clear your PC’s ARP cache. ARP stands for Address Resolution Protocol. ARP is basically how network devices associate an Ethernet MAC address to an IP address.
Remember that when two Industrial Ethernet network devices try to communicate with each other, they have to use both the Ethernet protocol and the TCP/IP protocol. Which means that at some point each device needs to associate the other device's MAC address to its IP address. This association is stored in the device's ARP table, also known as the ARP cache. And the current ARP table or cache is where a device initiating a connection to another device will first look to begin a new connection. If the information in the ARP table or cache is incorrect, the connection may fail.
- You can clear your PC’s ARP cache with the command arp -d, run from a command prompt opened as administrator. That shouldn’t hurt anything on your PC. Your PC will simply start rebuilding its ARP cache as needed for new connections.
- Now run your ping command with your device's IP address again, and wait for the reply. Once you get a reply, check your PC’s ARP cache with the command arp -a. Look through the list of IP addresses and find the device you’re trying to communicate with. Check the corresponding MAC address and then confirm that it matches the MAC address printed on the device you’re trying to communicate with.
If they don't match, you’ve very likely got a duplicate IP address that has knocked your PLC or PAC off the network. Tracking down the rogue device usually requires help from the network administrator; hopefully they have the toolset required to find out which switch port the rogue device is connected to.
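If you have a lot of devices to check, the MAC comparison above can be scripted. Here is a rough Python sketch that takes the text printed by arp -a, finds the entry for one IP address, and compares its MAC address to the one on the device label. It assumes the usual Windows layout of IP address, physical address, and type on each line. The helper names, the sample text, and the addresses are all made up for illustration.

```python
import re
from typing import Optional

# Matches table rows such as "  192.168.1.20   02-00-5e-10-00-01   dynamic"
ARP_ROW = re.compile(
    r"^\s*(\d{1,3}(?:\.\d{1,3}){3})\s+"
    r"([0-9A-Fa-f]{2}(?:[-:][0-9A-Fa-f]{2}){5})\s+\w+",
    re.MULTILINE,
)


def normalize_mac(mac: str) -> str:
    """Lowercase a MAC address and strip separators so formats compare equal."""
    return re.sub(r"[^0-9a-f]", "", mac.lower())


def parse_arp_table(arp_output: str) -> dict:
    """Map each IP address found in the arp -a text to its normalized MAC."""
    return {ip: normalize_mac(mac) for ip, mac in ARP_ROW.findall(arp_output)}


def mac_matches(arp_output: str, device_ip: str, label_mac: str) -> Optional[bool]:
    """True or False if the IP is in the table, None if there is no entry yet."""
    table = parse_arp_table(arp_output)
    if device_ip not in table:
        return None
    return table[device_ip] == normalize_mac(label_mac)


if __name__ == "__main__":
    # Illustrative text only, not captured from a real network
    sample = """
Interface: 192.168.1.10 --- 0xb
  Internet Address      Physical Address      Type
  192.168.1.20          02-00-5e-10-00-01     dynamic
"""
    print(mac_matches(sample, "192.168.1.20", "02:00:5E:10:00:01"))
```

A result of False is the case to worry about: the IP address answered, but from a different MAC address than the one printed on your device.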
Routers and firewalls
Routers and firewalls are designed to block traffic or send traffic to designated routes. But that means it's also possible for those routers and firewalls to block traffic that actually needs to go through, due to misconfiguration. Here are some things to check.
- If you’re trying to communicate across a router or through a firewall to your device, and you can ping it but can’t get your software to connect, make sure the port numbers your application uses are actually open on the router or firewall. A good security practice is to shut down all ports on routers and firewalls unless explicitly required. We call this allow or enable by exception.
- Another thing to check is that the device you’re trying to communicate with across the router has the proper gateway and subnet mask configured. Remember that the devices on both LANs need the right subnet mask and gateway address configured to reach each other across a router. The packets each device sends need a route to get there and a route to get back.
- Finally, check your PC’s firewall or the firewall of the device you’re trying to get to. Modern industrial Ethernet devices should have some form of firewall or IP filtering built into them for security purposes. Make sure the device is configured to allow connections from your PC’s IP address.
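Two of the checks above are easy to sketch in Python: whether a device sits outside your subnet (so traffic has to go through the gateway), and whether a TCP port actually accepts a connection when ping already works. This is a minimal sketch using only the standard library. The function names, addresses, and port number are placeholders I made up, so substitute whatever your application actually uses.

```python
import ipaddress
import socket


def needs_gateway(local_ip: str, subnet_mask: str, remote_ip: str) -> bool:
    """True when remote_ip is outside the local subnet, so packets must use the gateway."""
    # strict=False lets us pass a host address plus mask instead of a network address
    network = ipaddress.ip_network(f"{local_ip}/{subnet_mask}", strict=False)
    return ipaddress.ip_address(remote_ip) not in network


def tcp_port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Try a TCP connection and report whether it was accepted within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts
        return False


if __name__ == "__main__":
    # Placeholder values for illustration only
    pc_ip, pc_mask = "192.168.1.10", "255.255.255.0"
    device_ip, app_port = "192.168.2.50", 502

    print("goes through gateway:", needs_gateway(pc_ip, pc_mask, device_ip))
    print("port reachable:", tcp_port_open(device_ip, app_port))
```

If the device answers ping but tcp_port_open returns False, a router, firewall, or IP filter blocking that port is a reasonable next suspect, although the device simply not listening on that port would look the same.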
The SNAP PAC programmable automation controllers from Opto 22 support IP filtering as well as a host of other secure features, like a RESTful API and web server to keep your data and IoT applications safe.
You can learn more about SNAP PAC programmable automation controllers on the SNAP PAC System product page.
Sometimes troubleshooting Industrial Ethernet networks can get more complicated than the basic steps above. When that happens I always turn to the Wireshark network analyzer. It’s free and is pretty much the industry standard for network analysis at a packet and protocol level.
Sign up for the Opto 22 blog to receive notification when we publish our upcoming Wireshark for Industrial Networks blog post.