Channel: SRX Services Gateway topics

VPN Tunnel failover on redundant circuits.


Hello everyone,

 

I have an interesting problem. We recently had an issue with our primary ISP connection, and I had to manually switch over to our backup ISP while they fixed the primary connection (physical fiber damage). This caused some serious downtime, and people had to be sent home while I worked on it. I had the routes for my VPN tunnels set up with a next hop and a qualified next hop with different preferences, and my thinking was that when the next hop was unavailable, traffic would switch to the qualified next hop route, which runs over the backup link. This didn't happen; I think I misunderstood how qualified next hop works, and it's not for redundancy. That brought me to RPM and IP monitoring. I can set up the probe and the IP monitoring piece, but what I am struggling with is how this will work with VPN tunnels.

 

My network layout:

 

SRX 340 Cluster with redundant ISPs 

SRX 550 Cluster with Redundant ISPs

SRX 240 Cluster with a single ISP

 

Each cluster resides in a different physical location in the US. Each node in a cluster has both ISP connections, except in the 240 cluster, where each node has just the one ISP. Each cluster has a VPN tunnel going to each other SRX cluster on each ISP connection. So there is a total of six VPN tunnels on the 340 and 550 SRX clusters and four VPN tunnels on the 240 SRX cluster.

The VPN tunnels are set up (or so I thought) as primary/backup, with a next hop going over one tunnel and a qualified next hop over the other tunnel. The route preference is set to 2 for the next hop and 5 for the qualified next hop. When the primary link failed on the SRX340 cluster, all traffic stopped. The failure occurred further upstream, so all links stayed up but the gateway was not available. I had to manually switch all static routes to the other VPN tunnels in order to restore traffic. This is all on static routes. There is no dynamic routing of any kind, and no BGP.

So my question is this. If I implement IP monitoring and have it ping 8.8.8.8, for example, on the primary link, and have it set to immediately switch all routes to the other VPN tunnel, how do I make the other firewalls use the backup tunnels too? What I see with IP monitoring and RPM probes is that I can switch the routes over if the link fails, but I don't see how I can get the other SRXes to start sending traffic on the backup tunnel. The only thing that would work would be the default route. None of the other routes will work, since the other SRXes are not aware of the link failing and will still route traffic based on their own routing tables.

Should I use equal-cost multipathing for the VPN tunnels? Wouldn't this just alternate traffic between the primary VPN tunnel and the secondary VPN tunnel, so that some of the traffic would get there and the rest wouldn't? Do I set up a complex network of IP monitoring rules on each SRX to monitor all ISPs? (This would be crazy.)
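To make the failure mode concrete, here is a small Python model of what I believe happened. It is only a sketch of my understanding, not how Junos is actually implemented. The names `StaticRoute`, `select_route`, `interface_up`, and `path_alive` are made up for illustration, and the key assumption is that a static route stays eligible as long as its local next-hop interface is up, regardless of what happens upstream.

```python
from dataclasses import dataclass


@dataclass
class StaticRoute:
    name: str
    preference: int     # lower value wins, like route preference
    interface_up: bool  # local state of the tunnel interface
    path_alive: bool    # whether traffic actually reaches the far end


def select_route(routes, use_probe=False):
    """Pick the eligible route with the lowest preference.

    Assumption: without a probe, "eligible" only means the local
    interface is up. With a probe, the end-to-end path must also
    be alive. Returns None when no route is eligible.
    """
    eligible = [
        r for r in routes
        if r.interface_up and (r.path_alive or not use_probe)
    ]
    return min(eligible, key=lambda r: r.preference) if eligible else None


# Upstream failure: both interfaces stay up, but the primary path is dead.
routes = [
    StaticRoute("primary-tunnel", preference=2, interface_up=True, path_alive=False),
    StaticRoute("backup-tunnel", preference=5, interface_up=True, path_alive=True),
]

print(select_route(routes).name)                  # primary-tunnel
print(select_route(routes, use_probe=True).name)  # backup-tunnel
```

In this model the preference 5 route is never chosen unless something marks the preference 2 route as unusable, which matches what I saw. It also shows the part I am stuck on: the probe only changes the decision on the box that runs it, so each remote SRX would need its own way of learning that the path is dead.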

 

The preferred way for me to set this up would be to ditch the primary/backup labels and use all links to route traffic, but I just don't see a way to mitigate failed links, and especially VPN tunnels going over those failed links. Any help would be appreciated.

 

