The Planet Blog

 
Posts Tagged ‘traceroute’

Anthony Ledesma

I’ve spent many years in various tech support roles throughout my career, regularly reviewing issues customers face. At The Planet, our goal is to provide excellent service for our customers, which usually comes through our support tickets. Most of our customers use the ticket system, so I thought I’d offer some thoughts on how to ensure we help you resolve outstanding issues quickly and thoroughly.

Here is an example of a support ticket that provides just about every detail that’s necessary to remediate the issue:

-----SNIP-----

Department: Support

Subject: New York City AT&T DSL unable to connect to my server in DLLSTX6

Server: servername-1.2.3.4

Body: My customer, Jon Smith, is unable to ssh into 1.2.3.4 on tcp/22 from 255.255.255.255. We have ensured that Jon is not being blocked via iptables and he can connect to Apache on tcp/80.

[root@server ~]# iptables -nL

Chain INPUT (policy ACCEPT)

target prot opt source destination

Chain FORWARD (policy ACCEPT)

target prot opt source destination

Chain OUTPUT (policy ACCEPT)

target prot opt source destination

[root@server ~]#

[root@server ~]# traceroute 255.255.255.255

myserversgateway (1.2.3.1) 0.004 ms 0.020 ms 0.011 ms

1 dsr01.dllstx2.theplanet.com (12.96.160.9) 0.901 ms 0.944 ms 1.174 ms

2 …

3 …

…

10 …

11 255.255.255.255 (255.255.255.255) 42.211 ms 50.421 ms 45.114 ms

[root@server ~]# ping 255.255.255.255

PING 255.255.255.255 (255.255.255.255) 56(84) bytes of data.

64 bytes from 255.255.255.255 (255.255.255.255): icmp_seq=0 ttl=43 time=49.0 ms

64 bytes from 255.255.255.255 (255.255.255.255): icmp_seq=1 ttl=43 time=45.4 ms

64 bytes from 255.255.255.255 (255.255.255.255): icmp_seq=2 ttl=43 time=45.5 ms

64 bytes from 255.255.255.255 (255.255.255.255): icmp_seq=3 ttl=43 time=54.0 ms

64 bytes from 255.255.255.255 (255.255.255.255): icmp_seq=4 ttl=43 time=46.1 ms

--- 255.255.255.255 ping statistics ---

5 packets transmitted, 5 received, 0% packet loss, time 4003ms

rtt min/avg/max/mdev = 45.453/48.060/54.077/3.277 ms, pipe 2

[root@server ~]#

[customer@remoteserver ~]$ telnet 1.2.3.4 80

Trying 1.2.3.4...

Connected to myserver.tld (1.2.3.4).

Escape character is '^]'.

HEAD / HTTP/1.0

HTTP/1.1 404 Not Found

Date: Fri, 06 Jul 2007 17:34:37 GMT

Server: Apache/2.0.52 (Red Hat)

Connection: close

Content-Type: text/html; charset=iso-8859-1

Connection closed by foreign host.

[customer@remoteserver ~]$ ssh 1.2.3.4

ssh_exchange_identification: Connection closed by remote host

[customer@remoteserver ~]$ telnet 1.2.3.4 22

Trying 1.2.3.4...

Connected to myserver.tld (1.2.3.4).

Escape character is '^]'.

Connection closed by foreign host.

[customer@remoteserver ~]$

You can log in to my server with the credentials listed in Orbit. You can reach me any time at (555) 555-5555.

Thank you,

Bob Customer

bob@remote.tld

---END SNIP---

Without ever logging in to the server, our trained technicians can immediately identify the problem. In this case, the remote user is being denied by TCP Wrappers (man 5 hosts_access), either through a script (such as BFD) or by other means. We can see that iptables is not blocking any connections, and we can reach the server without issue on another TCP port that’s not typically controlled by the wrapper. On tcp/22, by contrast, the connection is accepted and then closed immediately, which is how a TCP Wrappers denial typically looks from the outside.

The customer also made sure the ticket subject was descriptive, so our technicians could evaluate the ticket in the support system queue. Each technician typically has a different area of expertise, so we work to steer the ticket to the right person for the job. Here is a list of subjects that provide helpful information for the support techs responding to the ticket:

  1. cPanel: WHM is not responding. Server is accessible via SSH
  2. Apache: My .cgi is giving a 501 error
  3. cPanel: Apache is segfaulting after I upgraded PHP
  4. Exim: User so-and-so cannot send outgoing form mail after tweaking security in cPanel
  5. Server kernel panics daily. Need to have the RAM tested/replaced
  6. MySQL: forums refusing more than 100 concurrent users
  7. Reboot Request**
  8. Network: San Jose, CA – 350ms RTT via Sprintlink

** #7 should not be a Support Ticket but a Reboot Ticket.

At The Planet, tech support is always eager to help resolve issues quickly and efficiently, so when customers add a bit of detail in the subject line, we’re able to ensure the ticket is routed to the most appropriate individual.
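As for the diagnosis in the sample ticket, the telling detail is what each port does right after the connection is made: tcp/80 stays open and waits for a request, while tcp/22 connects and then closes without sending an SSH banner. The Python sketch below shows one way to reproduce that check from the remote side. The function names, the 5-second timeout, and the wording of the results are illustrative assumptions, not a tool The Planet uses, and the result is only a hint about the cause.

```python
import socket


def classify_probe(connected, first_bytes):
    """Map a raw probe result onto a rough diagnosis.

    connected   -- True if the TCP handshake completed
    first_bytes -- bytes received; b"" if the peer closed the connection
                   without sending anything; None if nothing arrived
                   before the timeout
    """
    if not connected:
        return "no connection: closed port, firewall, or routing problem"
    if first_bytes is None:
        return "open and silent: service is waiting for the client to speak"
    if first_bytes == b"":
        return "accepted then closed: consistent with a TCP Wrappers denial"
    return "banner received: service is answering normally"


def probe(host, port, timeout=5.0):
    """Connect to host:port and report what the service does first."""
    try:
        sock = socket.create_connection((host, port), timeout=timeout)
    except OSError:
        return classify_probe(False, None)
    with sock:
        try:
            data = sock.recv(256)
        except socket.timeout:
            data = None   # connected, but the server sent nothing
        except OSError:
            data = b""    # a reset right after connecting is also a close
    return classify_probe(True, data)
```

In the ticket’s scenario, `probe("1.2.3.4", 22)` run from the customer’s machine would be expected to report "accepted then closed", while `probe("1.2.3.4", 80)` would report "open and silent", because an HTTP server waits for the client’s request.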

These are just a few tips to help bring a quick resolution to a number of support issues. Thanks for stopping by.

- Anthony

Chris Turbeville

A question that I often face: what IS a traceroute? Most individuals know that it represents the “hops” or routers along the path from a system’s IP to the destination IP entered. What’s most often misunderstood is the response time numbers printed for each hop. Some assume that it is the time any packet takes to make that leg of the journey. But that just isn’t the case.

Those times actually measure the time elapsed from when the probe packet was sent to when a response was received. In today’s Internet, most routers have very strict limits on how many of these responses they can generate in any particular time period, so a lack of response isn’t indicative of the latency of that leg. The times may also be misleading because the router may use a very different part of its “brain” to generate these control responses than it would to simply forward a normal packet. This means that a router under modest load may respond with wildly different times, as it’s busy doing other “housework.”

So how do we know where the lag is coming from with a traceroute? Only end-to-end pings can really show latency or packet loss. But certain patterns in a traceroute can help pinpoint a possible source.

One pattern to look for is a cliff-like increase in latency that persists from one hop forward. In other words, the return times jump at one hop and stay at the higher level for every hop after it. Notice I didn’t say stars (the asterisks traceroute prints when a probe gets no response). In today’s Internet landscape, stars don’t indicate a definite problem the way they once did. Certain providers rate-limit their routers so heavily that they constantly throw stars.
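To make the cliff pattern concrete, here is a minimal Python sketch that looks for a jump that holds for every later hop and ignores one-off spikes and stars. The function name `find_cliff` and the 50 ms threshold are illustrative assumptions, and a hit is only a hint about where to look, not proof of a fault at that hop.

```python
def find_cliff(hop_rtts_ms, jump_ms=50.0):
    """Return the 1-based hop where a sustained latency jump begins.

    hop_rtts_ms -- one representative RTT per hop, None for a star
    jump_ms     -- assumed threshold for what counts as a jump
    Returns None when no sustained jump is found.
    """
    previous = None  # RTT of the last hop that answered
    for index, rtt in enumerate(hop_rtts_ms):
        if rtt is None:
            continue  # stars carry no latency information
        if previous is not None:
            # RTTs of this hop and every later hop that answered
            rest = [r for r in hop_rtts_ms[index:] if r is not None]
            # A cliff means all of them sit above the previous hop's RTT
            if all(r - previous >= jump_ms for r in rest):
                return index + 1
        previous = rtt
    return None
```

With per-hop times of 1, 2, 10, 12, 95, 98, 101 and 97 ms, the function points at hop 5. A single 380 ms spike followed by hops back around 20 ms is not reported, because the later hops do not stay high.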

If every hop after a certain line throws stars, then that link may be losing packets. An end-to-end ping showing this loss is just about the only way to verify that for sure. As if all these rate limits weren’t enough to render our poor traceroutes meaningless, there’s another issue making it even more difficult.

The Internet often involves paths that get somewhere a different way than they get back. In other words, the Internet is asymmetric. This asymmetry means that the packet you sent to the hop in the traceroute got to the router one way (through one set of providers, links, etc.) and the router’s response got back to you another way. A lack of response (the star) or added latency could therefore indicate an issue with either path. It also means that the cliff I spoke of earlier could mean that, past that hop, the return path has an issue rather than the route you actually see. Traceroute can only show you the outbound path, because only the outbound path is visible to the tracing packets; this is a weakness in the technique it uses. That makes finding the offending hop difficult if it’s located in the return path.
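Pulling the points about stars together: a star in the middle of a trace, with later hops still answering, usually just reflects rate limiting, while an unbroken run of stars to the end of the trace is the pattern that may indicate loss (on either the outbound or the return path). The Python sketch below counts the two kinds separately. The function name `summarize_stars` and the boolean-per-hop input are illustrative assumptions.

```python
def summarize_stars(hop_responded):
    """Count scattered and trailing stars in a trace.

    hop_responded -- list of booleans, one per hop, True if the hop answered
    Returns (scattered, trailing):
      scattered -- silent hops that are followed by a hop that answered
      trailing  -- silent hops after the last hop that answered
    """
    last_answer = -1
    for index, answered in enumerate(hop_responded):
        if answered:
            last_answer = index
    scattered = hop_responded[:last_answer + 1].count(False)
    trailing = len(hop_responded) - (last_answer + 1)
    return scattered, trailing
```

A result such as `(2, 0)` suggests routers that simply decline to answer, whereas `(0, 3)` is the case worth confirming with an end-to-end ping.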

So how do we find the issue? Anyone who’s opened a network-related ticket with The Planet knows we like to have traceroutes (to indicate the path), followed by pings of 100 packets or so, from the IP at The Planet seeing the problem to the IP on the Internet. If possible, and we know this isn’t very easy, we’d like the same from the Internet side of the issue back to The Planet IP.
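When comparing several ping runs, it helps to reduce each one to its loss and round-trip figures. The Python sketch below pulls them out of the summary that Linux ping prints, in the same format as the sample ticket earlier on this page. The function name `summarize_ping` and the dictionary keys are illustrative assumptions, and other platforms print a different summary that these patterns will not match.

```python
import re

# Patterns for the two summary lines printed by Linux (iputils) ping
COUNTS = re.compile(r"(\d+) packets transmitted, (\d+) received")
RTT = re.compile(
    r"rtt min/avg/max/mdev = ([\d.]+)/([\d.]+)/([\d.]+)/([\d.]+) ms"
)


def summarize_ping(output):
    """Extract packet loss and RTT figures from ping's closing summary."""
    counts = COUNTS.search(output)
    if counts is None:
        raise ValueError("no ping summary found in the output")
    sent, received = int(counts.group(1)), int(counts.group(2))
    loss = 100.0 * (sent - received) / sent if sent else 0.0
    summary = {"sent": sent, "received": received, "loss_percent": loss}
    rtt = RTT.search(output)
    if rtt:  # the rtt line is absent when no replies came back
        keys = ("min_ms", "avg_ms", "max_ms", "mdev_ms")
        summary.update(zip(keys, map(float, rtt.groups())))
    return summary
```

Running it on the output of a 100-packet ping in each direction gives two small summaries that can be pasted into a ticket alongside the full output.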

If we have this sort of information, we can usually determine where the problem exists. Of course, like taking your car to the shop, many times these traces and pings don’t show the problem because it is intermittent. They are still useful and at least give us a baseline to work from when we’re looking into the issue. With intermittent issues, it also helps to report the times of day the issue happens and whether it is limited to certain IPs or servers.

So the next time someone tells you that a 380ms spike in hop 5 means that the router is overloaded, or that a star in line 10 shows that we’re losing packets, you might let them know that it’s never that simple in today’s Internet.

- Turbo
