Network performance issues are an IT problem that will have your help desk line ringing off the hook. The network is the foundation for your applications and data, so issues at this layer will lead to a bad experience for everything that runs on top of it.
There are many reasons for performance issues, but in this post, we'll focus specifically on packet loss. The following four are among the most common causes of dropped packets:
1. Link Congestion
Your data must travel through multiple devices and links during its trip across your network. If one of these links is at full capacity when your data arrives, then it must wait its turn before being sent across the wire (this is known as queuing).
If a network device is falling very far behind, it won’t have room for the new data to wait (queue), so it does the only thing it can, which is to discard the information.
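To make the queue-then-discard behavior concrete, here is a minimal Python sketch of a fixed-size buffer that drops new arrivals once it is full (often called tail drop). The function name, the tick-based timing, and the numbers in the usage example are assumptions made for illustration; real devices have vendor-specific buffer sizes and queuing schemes.

```python
from collections import deque


def simulate_tail_drop(arrivals, service_rate, queue_limit):
    """Simulate a FIFO buffer that discards packets when it is full.

    arrivals:     packets arriving in each time tick (list of ints)
    service_rate: packets the link can transmit per tick
    queue_limit:  maximum number of packets that can wait in the buffer
    Returns a (sent, dropped) tuple of totals.
    """
    queue = deque()
    sent = 0
    dropped = 0
    for tick, count in enumerate(arrivals):
        # New packets join the queue, or are discarded if there is no room.
        for _ in range(count):
            if len(queue) < queue_limit:
                queue.append(tick)
            else:
                dropped += 1
        # The link transmits as many queued packets as it can this tick.
        for _ in range(min(service_rate, len(queue))):
            queue.popleft()
            sent += 1
    return sent, dropped


# Hypothetical load: 5 packets arrive per tick, but the link sends only 3.
print(simulate_tail_drop([5, 5, 5], service_rate=3, queue_limit=4))
```

With arrivals below the service rate the buffer never fills and nothing is dropped, which matches an uncongested link.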
Hearing that data is “discarded” may sound harsh, but most applications handle this gracefully, and the user probably won't ever notice it. The user’s application realizes that a packet was lost, slows down its transfer speed, and retransmits the data. If this was a file download, an email, or another non-real-time application, the effect will be minimal as long as the packet loss doesn’t persist.
Other applications do not handle this well at all, and the effect is very noticeable to users. If you are on a phone call and the network loses some data, there is no time to resend the packets since it is a real-time conversation. The user will typically hear breaks in the audio when packet loss is light, and may lose the call entirely if the packet loss is severe. Another critical application that has a low threshold for packet loss is video conferencing. If data is lost between the two endpoints, the video will show artifacts and the audio will be distorted.
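The "slow down after a loss" behavior can be sketched as follows. This is an assumption-laden illustration modeled loosely on the additive-increase/multiplicative-decrease pattern used by TCP congestion control; the function name and the default values are hypothetical, and real stacks are considerably more involved.

```python
def aimd_window(loss_events, start=1.0, increase=1.0, decrease=0.5):
    """Track a sender's window across round trips.

    loss_events: iterable of booleans, True if a loss was detected that round
    start:       initial window size (in packets)
    increase:    amount added after a round with no loss
    decrease:    factor applied to the window after a loss
    Returns the window size after each round.
    """
    window = start
    history = []
    for lost in loss_events:
        if lost:
            # Back off sharply, but never below one packet.
            window = max(1.0, window * decrease)
        else:
            # Probe for more bandwidth gradually.
            window += increase
        history.append(window)
    return history


# Three clean rounds, one loss, then recovery.
print(aimd_window([False, False, False, True, False]))
```

A single loss costs a little speed and the transfer recovers, which is why occasional drops go unnoticed on downloads and email.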
There are two main ways to help reduce the effect of packet loss due to network congestion:
- Increase the bandwidth of the congested link(s).
- Implement Quality of Service (QoS) to give priority to real-time traffic. This will not relieve the congestion on the link, but it gives applications like voice or video priority in the queue, which lowers the likelihood that their packets are dropped.
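As a rough illustration of the QoS idea, the sketch below implements a strict-priority scheduler in Python: packets in a higher-priority class are always sent before lower-priority ones. The class name, priority numbers, and packet labels are hypothetical; production QoS relies on traffic markings and vendor-specific queuing features that this sketch does not attempt to model.

```python
import heapq
from itertools import count


class PriorityScheduler:
    """Strict-priority scheduler: a lower priority number is sent first."""

    def __init__(self):
        self._heap = []
        # Sequence counter keeps first-in, first-out order within a class.
        self._seq = count()

    def enqueue(self, priority, packet):
        heapq.heappush(self._heap, (priority, next(self._seq), packet))

    def dequeue(self):
        """Return the next packet to transmit, or None if nothing is waiting."""
        if not self._heap:
            return None
        return heapq.heappop(self._heap)[2]


scheduler = PriorityScheduler()
# Hypothetical classes: 0 = voice, 2 = email.
scheduler.enqueue(2, "email-1")
scheduler.enqueue(0, "voice-1")
scheduler.enqueue(2, "email-2")
scheduler.enqueue(0, "voice-2")
order = [scheduler.dequeue() for _ in range(4)]
print(order)
```

The voice packets leave first even though email arrived earlier, so they spend less time in the queue and are less exposed to a full buffer.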
2. Device (Router/Switch/Firewall/etc.) Performance
If your bandwidth is adequate, you can still face an issue if your router/switch/firewall is not able to keep up with the traffic.
Let’s take a scenario where you recently upgraded a link from 1 Gbps to 10 Gbps because traffic reports showed that you were at full capacity during peak hours of the day. After the upgrade, your charts show the bandwidth going up to 1.5 Gbps, but you are still experiencing network performance issues. The issue could be that the device is not able to keep up with the traffic, and you have hit the maximum throughput your hardware can provide.
The traffic is reaching the device, but the device’s CPU or memory is maxed out and not able to handle extra traffic.
This results in packet loss for all traffic that is beyond the capacity of the box.
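To put a rough number on this, the sketch below estimates the share of traffic lost when the offered load exceeds what a device can forward. It assumes a simple model in which everything above the device's capacity is dropped; the function name and the 2.0 Gbps offered load are hypothetical, and only the 1.5 Gbps ceiling comes from the scenario above.

```python
def expected_loss_fraction(offered_gbps, capacity_gbps):
    """Estimate the fraction of traffic dropped by an overloaded device.

    offered_gbps:  traffic arriving at the device
    capacity_gbps: maximum throughput the device can forward
    Assumes all traffic above capacity is discarded.
    """
    if offered_gbps <= 0:
        return 0.0
    excess = max(0.0, offered_gbps - capacity_gbps)
    return excess / offered_gbps


# Hypothetical peak: 2.0 Gbps offered to a device that tops out at 1.5 Gbps.
print(expected_loss_fraction(2.0, 1.5))
```

Under these assumptions a quarter of the peak traffic is lost, even though the 10 Gbps link itself is far from full.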
You must replace the hardware with a new appliance that can handle your maximum throughput, or potentially cluster additional hardware to increase your throughput.
3. Software Issues (Bugs) on a Network Device
While we can hope that the software written for our network devices is perfect, I can assure you that it is not. These network devices are extremely complex, and it is only a matter of time before you stumble upon a bug.
These bugs can cause new features to not work at all when you deploy them, or can go undetected for a while before you notice performance issues.
Once the performance issue is detected and troubleshooting has started, these types of issues are usually found using system logs and packet captures.
You must upgrade the software on the affected device(s).
4. Faulty Hardware or Cabling
Your traffic report shows that your links are not over-utilized, and the hardware utilization is within specification. The next common issue that can lead to drops would be a physical component that is malfunctioning.
If hardware is not working properly, it will usually lead to error messages being seen on the console of the device or within system logs.
If there is a link issue, it can usually be seen as errors on an interface. This can happen on both copper and fiber optic cabling.
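Interface error counters are usually cumulative, so they are easier to judge as a rate between two readings. The sketch below assumes you have copied packet and error totals from a device's interface statistics into two dictionaries; the function name, the dictionary keys, and the sample values are hypothetical, and it does not handle counters that reset or wrap.

```python
def error_rate(before, after):
    """Return errors per packet between two counter snapshots.

    Each snapshot is a dict with cumulative 'packets' and 'errors' totals.
    Returns 0.0 if no packets were counted between the snapshots.
    """
    packets = after["packets"] - before["packets"]
    errors = after["errors"] - before["errors"]
    if packets <= 0:
        return 0.0
    return errors / packets


# Hypothetical readings taken a few minutes apart.
first = {"packets": 1000, "errors": 2}
second = {"packets": 11000, "errors": 52}
print(f"{error_rate(first, second):.2%} of packets had errors")
```

A rate that keeps climbing between readings points to an ongoing physical problem rather than a one-time event.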
The faulty hardware must be replaced, or the faulty link must be repaired.
These are the most common reasons for packet loss on a network, but there are many other reasons that can contribute to packet drops. The best way to determine the root cause is through a network assessment and detailed troubleshooting.
A partner who is specialized in uncovering various network pitfalls can help you plan a remediation strategy so that you don't have to live with a degraded network.