BGP workaround for backbone issue

I ran into an interesting issue where I had to use BGP traffic engineering to work around an unusual quirk with a major backbone provider. The site I was working with was dual-homed to a major national provider and a smaller local provider. To balance traffic better between the circuits, we tried BGP AS-path prepending to make the major connection less preferred. When we added more than two prepends toward the major provider, traffic on that link all but stopped, and we saw only about 4 kbps on it.
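As a rough illustration of what prepending does to the advertisement, here is a minimal Python sketch. The `prepend` helper and the AS number 64500 (taken from the documentation range) are made up for this example and are not the real values from this site.

```python
# Minimal sketch of AS-path prepending. The ASN is a hypothetical
# documentation value, not the site's real AS number.
SITE_AS = 64500


def prepend(as_path, local_as, count):
    """Return the AS path with local_as added `count` extra times at the front."""
    return [local_as] * count + list(as_path)


# Without prepending, the site originates the route with a one-entry path.
plain = [SITE_AS]

# With three prepends, the same route is advertised with a four-entry path,
# which makes it look "farther away" to any router comparing path lengths.
prepended = prepend(plain, SITE_AS, 3)

print(len(plain), len(prepended))
```

Every prepend makes the path one entry longer, so a neighbor that compares only AS-path length will sooner or later pick some other route to the same prefix.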

Checking several BGP routers carrying full tables, along with public route servers, I no longer saw the route via the major provider being advertised. We contacted the provider, assuming an AS-path filter on their side was dropping the route. The provider checked their router and confirmed that they were receiving my prepended routes. Things got interesting when a traceroute from their NOC exited their network through a transit provider rather than staying on their own network. Working with their engineers, we found that they saw the path through their transit provider to the local provider as 2 AS hops and the direct connection to me as 3 AS hops, so they were handing all traffic off to that transit provider.

This does not normally happen, because standard practice for providers is to prefer customer-learned routes first, then peer routes, and paid transit connections last. Backbones would rather have you upgrade your connection, and thus earn more revenue, than have to upgrade their peering links or pay for more transit. This provider was not preferring the direct connection: both the transit link and the customer link had the default local preference of 100, so the decision fell through to AS-path length.

After learning this, I was able to send them a community string of XXXX:120 to raise the local preference of my connection to 120 and thus favor my connection over the transit-learned route. The engineers then opened an internal ticket to find out whether the local-preference configuration was intentional. Either way, my workaround will keep working even if they later lower the preference of their transit links.

In summary, when troubleshooting BGP, remember that routes are selected by evaluating the following, in order:

  1. The longest (most specific) prefix
  2. The local weight (highest wins)
  3. The BGP local-preference (highest wins)
  4. The BGP AS-path length (shortest wins)
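
The order above can be sketched as a simple comparison in Python. This is a simplified model of only the four steps listed, not a full BGP implementation. The `Route` class, the `best_route` function, the /24 prefix length, and the weight of 0 are assumptions made for illustration. The local-preference values and hop counts are the ones from the story.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    name: str
    prefix_len: int   # longer (more specific) wins
    weight: int       # higher wins
    local_pref: int   # higher wins
    as_path_len: int  # shorter wins


def best_route(routes):
    """Pick a route using only the four criteria from the list, in order."""
    return max(
        routes,
        # Negate the AS-path length so that a shorter path compares as larger.
        key=lambda r: (r.prefix_len, r.weight, r.local_pref, -r.as_path_len),
    )


# The provider's view before the community was sent: equal local preference,
# so the shorter AS path through transit wins.
transit = Route("via transit", 24, 0, 100, 2)
direct_before = Route("direct customer link", 24, 0, 100, 3)
print(best_route([transit, direct_before]).name)

# After the community raises local preference to 120, the direct link wins
# even though its AS path is still longer.
direct_after = Route("direct customer link", 24, 0, 120, 3)
print(best_route([transit, direct_after]).name)
```

Because local preference is compared before AS-path length, raising it on the customer link settles the decision before the prepends are ever looked at.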