Channel: SRX Services Gateway topics

Problems and more problems in an SRX340 cluster... the never-ending story


Hi guys, 

This story continues from my earlier thread: https://forums.juniper.net/t5/SRX-Services-Gateway/Junos-upgrade-fails-on-SRX340-cluster-from-15-1X49-D170-4-to-17/td-p/467752

I was struggling to upgrade an SRX340 cluster to a newer Junos version and, with the help of some gurus, finally got both nodes upgraded to 18.3R2.7. Now, however, the HA status is sometimes fine, but at other times the HA LED is amber and the usual commands show the output below:

 

root@SPCFW-BRAVO> show chassis firmware  
node0:
--------------------------------------------------------------------------
Part                     Type       Version
FPC 0                    O/S        Version 18.3R2.7 by builder on 2019-05-03 09:17:52 UTC
FWDD                     O/S        Version 18.3R2.7 by builder on 2019-05-03 09:17:52 UTC

node1:
--------------------------------------------------------------------------
Part                     Type       Version
FPC 0                    O/S        Version 18.3R2.7 by builder on 2019-05-03 09:17:52 UTC
FWDD                     O/S        Version 18.3R2.7 by builder on 2019-05-03 09:17:52 UTC
root@SPCFW-BRAVO> show chassis cluster information 
node0:
--------------------------------------------------------------------------
Redundancy Group Information:

    Redundancy Group 0 , Current State: primary, Weight: 255

        Time            From                 To                   Reason
        Sep 11 20:57:13 hold                 secondary            Hold timer expired
        Sep 11 20:57:22 secondary            primary              Better priority (200/100)

    Redundancy Group 1 , Current State: primary, Weight: 0

        Time            From                 To                   Reason
        Sep 11 20:57:13 hold                 secondary            Hold timer expired
        Sep 11 20:57:24 secondary            primary              Remote yield (0/0)

Chassis cluster LED information:
    Current LED color: Amber
    Last LED change reason: Monitored objects are down
Control port tagging:                   
    Disabled

Failure Information:

    Coldsync Monitoring Failure Information:
        Statistics:
            Coldsync Total SPUs: 1
            Coldsync completed SPUs: 0
            Coldsync not complete SPUs: 1

    Fabric-link Failure Information:
        Fabric Interface: fab0
          Child interface   Physical / Monitored Status     
          ge-0/0/2              Up   / Down 

node1:
--------------------------------------------------------------------------
Redundancy Group Information:

    Redundancy Group 0 , Current State: secondary, Weight: 0

        Time            From                 To                   Reason
        Sep 11 20:57:21 hold                 secondary            Hold timer expired

    Redundancy Group 1 , Current State: secondary, Weight: -255

        Time            From                 To                   Reason
        Sep 11 20:57:22 hold                 secondary            Hold timer expired

Chassis cluster LED information:
    Current LED color: Amber
    Last LED change reason: Monitored objects are down
Control port tagging:
    Disabled

Failure Information:

    Coldsync Monitoring Failure Information:
        Statistics:
            Coldsync Total SPUs: 1
            Coldsync completed SPUs: 0
            Coldsync not complete SPUs: 1

    Fabric-link Failure Information:    
        Fabric Interface: fab1
          Child interface   Physical / Monitored Status     
          ge-5/0/2              Up   / Down 

{secondary:node1}
root@SPCFW-BRAVO> show chassis cluster status        
Monitor Failure codes:
    CS  Cold Sync monitoring        FL  Fabric Connection monitoring
    GR  GRES monitoring             HW  Hardware monitoring
    IF  Interface monitoring        IP  IP monitoring
    LB  Loopback monitoring         MB  Mbuf monitoring
    NH  Nexthop monitoring          NP  NPC monitoring              
    SP  SPU monitoring              SM  Schedule monitoring
    CF  Config Sync monitoring      RE  Relinquish monitoring
Cluster ID: 1
Node   Priority Status               Preempt Manual   Monitor-failures

Redundancy group: 0 , Failover count: 0
node0  200      primary              no      no       None           
node1  0        secondary            no      no       FL             

Redundancy group: 1 , Failover count: 0
node0  0        primary              yes     no       CS             
node1  0        secondary            yes     no       CS FL          
root@SPCFW-BRAVO> show chassis cluster interfaces 
Control link status: Up

Control interfaces: 
    Index   Interface   Monitored-Status   Internal-SA   Security
    0       fxp1        Up                 Disabled      Disabled  

Fabric link status: Down

Fabric interfaces: 
    Name    Child-interface    Status                    Security
                               (Physical/Monitored)
    fab0    ge-0/0/2           Up   / Down               Disabled   
    fab0   
    fab1    ge-5/0/2           Up   / Down               Disabled   
    fab1   

Redundant-ethernet Information:     
    Name         Status      Redundancy-group
    reth0        Down        Not configured   
    reth1        Up          1                
    reth2        Down        Not configured   
    reth3        Down        Not configured   
    reth4        Down        Not configured   
                                        
Redundant-pseudo-interface Information:
    Name         Status      Redundancy-group
    lo0          Up          0                

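For some reason I can't work out, the fabric link is unstable: fab0 (child interface ge-0/0/2) sometimes comes up and at other times goes down. In the output above, both fabric child interfaces show a physical status of Up but a monitored status of Down.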
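
To track when this happens, saved output of `show chassis cluster interfaces` could be checked with a small script. This is only a minimal sketch, assuming the output keeps the layout pasted above; the names `fabric_children` and `suspicious` are made up for illustration and are not part of any Juniper tool.

```python
import re

# Matches fabric child-interface rows such as:
#     fab0    ge-0/0/2           Up   / Down               Disabled
# Rows that carry only the fabric name (e.g. "    fab0   ") do not match.
ROW = re.compile(r"^\s*(fab\d+)\s+(\S+)\s+(Up|Down)\s*/\s*(Up|Down)\b")


def fabric_children(output):
    """Return (fab, child, physical, monitored) tuples from saved CLI text."""
    rows = []
    for line in output.splitlines():
        match = ROW.match(line)
        if match:
            rows.append(match.groups())
    return rows


def suspicious(rows):
    """Keep children that are physically Up but whose monitored status is Down."""
    return [row for row in rows if row[2] == "Up" and row[3] == "Down"]


# Sample text copied from the output in this post.
SAMPLE = """\
Fabric interfaces:
    Name    Child-interface    Status                    Security
                               (Physical/Monitored)
    fab0    ge-0/0/2           Up   / Down               Disabled
    fab0
    fab1    ge-5/0/2           Up   / Down               Disabled
    fab1
"""

if __name__ == "__main__":
    for fab, child, physical, monitored in suspicious(fabric_children(SAMPLE)):
        print(f"{fab} {child}: physical {physical}, monitored {monitored}")
```

Running it against output captured at intervals would show whether the Up / Down state lines up with the times the HA LED turns amber.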

 

Any help would be much appreciated

Thanks!

