OVHCloud Network Status

FS#4185 — rbx-46
Incident Report for Network & Infrastructure
Resolved
One of the two routing cards is not working properly. We have already switched
from #2 to #1, and the router switched back. But on #2, customers are reporting bandwidth problems.

May 11 02:32:00 rbx-46-c1.routers.ovh.net 2010 May 11 00:31:45 %SYS-5-SUP_MODSBY:Module 2 is in standby mode
May 11 02:32:27 rbx-46-c1.routers.ovh.net 2010 May 11 00:32:12 %SYS-5-PORT_SSUPOK:Ports on standby supervisor (module 2) are up
May 11 02:37:32 rbx-46-c1.routers.ovh.net 2010 May 11 00:37:17 %SYS-5-SUP_MODSBY:Module 1 is in standby mode
May 11 02:38:55 rbx-46-c1.routers.ovh.net 2010 May 11 00:38:40 %SYS-5-PORT_SSUPOK:Ports on standby supervisor (module 1) are up
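
For readers who want to follow the supervisor switchover in the log lines above, here is a minimal sketch that extracts the message mnemonic and the module number from each line. The helper name `parse_line` and the regular expressions are illustrative assumptions, not an OVH tool; they only rely on the `%FACILITY-SEVERITY-MNEMONIC:text` shape visible in the excerpt.

```python
import re

# The four log lines quoted in the report, copied verbatim.
LOG = """\
May 11 02:32:00 rbx-46-c1.routers.ovh.net 2010 May 11 00:31:45 %SYS-5-SUP_MODSBY:Module 2 is in standby mode
May 11 02:32:27 rbx-46-c1.routers.ovh.net 2010 May 11 00:32:12 %SYS-5-PORT_SSUPOK:Ports on standby supervisor (module 2) are up
May 11 02:37:32 rbx-46-c1.routers.ovh.net 2010 May 11 00:37:17 %SYS-5-SUP_MODSBY:Module 1 is in standby mode
May 11 02:38:55 rbx-46-c1.routers.ovh.net 2010 May 11 00:38:40 %SYS-5-PORT_SSUPOK:Ports on standby supervisor (module 1) are up
"""

# Assumed message shape: %FACILITY-SEVERITY-MNEMONIC:free text
MSG_RE = re.compile(
    r"%(?P<facility>[A-Z]+)-(?P<severity>\d)-(?P<mnemonic>[A-Z_]+):(?P<text>.*)"
)
MODULE_RE = re.compile(r"module (\d+)", re.IGNORECASE)


def parse_line(line):
    """Return a dict describing one log line, or None if it does not match."""
    match = MSG_RE.search(line)
    if match is None:
        return None
    event = match.groupdict()
    event["severity"] = int(event["severity"])
    module = MODULE_RE.search(event["text"])
    event["module"] = int(module.group(1)) if module else None
    return event


events = [e for e in map(parse_line, LOG.splitlines()) if e is not None]

# Modules that entered standby mode, in log order.
standby_sequence = [e["module"] for e in events if e["mnemonic"] == "SUP_MODSBY"]
print(standby_sequence)  # [2, 1]
```

The result shows module 2 going to standby first and module 1 about five minutes later, which matches the switch from #2 to #1 and back described above.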


Update(s):

Date: 2010-05-17 08:08:53 UTC
..1 chassis, 5 routing cards, 1 6148A card later...

The router is up again.

Yeah! Sometimes, it is better to think a bit ;)


Date: 2010-05-17 08:06:02 UTC
We will not wait until tonight to carry out this intervention.
The goal is to make the router operational again.

We will intervene in 10 minutes.

Date: 2010-05-16 14:58:38 UTC
Well.

2 solutions:
- we prepare a completely new router and replace everything
- we move the router to another bay, because this one is a cursed place.

We will start with option 1.

Date: 2010-05-14 09:20:40 UTC
We have replaced the chassis. It still crashes.
We replaced it with another one. Same thing.
Without any card. Same thing.

The router does not work with both routing cards installed,
despite the replacement of all the cards.

yeah...

The router is stable for the moment. It is running on a single routing
card.

Well...

We need to think.



Date: 2010-05-14 09:12:34 UTC
We are replacing the chassis.
Service should be back to normal in 30 minutes.

Date: 2010-05-14 09:11:09 UTC
We are replacing card #1.
Card #2 has crashed twice. It is restarting.

Date: 2010-05-14 09:07:13 UTC
2010 May 12 19:45:01 %SYS-3-MOD_PORTINTFOUTOFSYNC:Port Interface not
sbifSyncOnSendTwoSeqZeroPkts failed
PANIC: Stack in process "SysLogTask" whose ID is 50 is overflown
System reset on software watchdog is disabled
InterruptStatus = 0x00000001 last_timeout_func = 0x80972dc0
Check for nested intrrupt
sp is 0x81801ea0

Breakpoint Exception occurred on May 12 2010 19:45:01
Software version = 8.4(4)
Process ID #32, Name = SysLogTask
*** Cache Error Exception ***
Cache Err Reg = 0xa0001ce1
data reference, primary cache, data field error , error not on SysAD Bus
PC = 0xbfc09b50, Cause = 0x80000400, Status Reg = 0x87d28f8e

*** Cache Error Exception ***
Cache Err Reg = 0xa0001b91
data reference, primary cache, data field error , error not on SysAD Bus
PC = 0xbfc09b50, Cause = 0x400, Status Reg = 0x87d28f8e

*** Cache Error Exception ***
Cache Err Reg = 0xa0001a41
data reference, primary cache, data field error , error not on SysAD Bus
PC = 0xbfc09b50, Cause = 0x400, Status Reg = 0x87d28f8e

*** Cache Error Exception ***
Cache Err Reg = 0xa00018f1
data reference, primary cache, data field error , error not on SysAD Bus
PC = 0xbfc09b50, Cause = 0x400, Status Reg = 0x87d28f8e

*** Cache Error Exception ***
Cache Err Reg = 0xa00017a1
data reference, primary cache, data field error , error not on SysAD Bus
PC = 0xbfc09b50, Cause = 0x400, Status Reg = 0x87d28f8e

*** Cache Error Exception ***
Cache Err Reg = 0xa0001651
data reference, primary cache, data field error , error not on SysAD Bus
PC = 0xbfc09b50, Cause = 0x400, Status Reg = 0x87d28f8e

*** Cache Error Exception ***
Cache Err Reg = 0xa0001501
data reference, primary cache, data field error , error not on SysAD Bus
PC = 0xbfc09b50, Cause = 0x400, Status Reg = 0x87d28f8e

*** Cache Error Exception ***
Cache Err Reg = 0xa00013b1
data reference
*** Watch Dog Timeout ***
PC = 0xbfc084f8, SP = 0x81801020 frame = 0xa0005ea8
Cygnus_ResetSystem
InterruptStatus = 0x00000001
Total download memory used = 3035796
crash info filename is bootflash:crashinfo_100512-194509
Opening crash info file bootflash:crashinfo_100512-194509
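
The repeated cache error blocks in the dump above are easier to compare once the register values are pulled out. The sketch below is illustrative only: the helper name `cache_err_regs` and the regular expression are assumptions, and the input is just the `Cache Err Reg` lines copied from the dump. It counts the exceptions and computes the difference between consecutive register values, without attempting to interpret what that difference means.

```python
import re

# The "Cache Err Reg" lines from the crash dump, copied verbatim and in order.
CRASH_EXCERPT = """\
Cache Err Reg = 0xa0001ce1
Cache Err Reg = 0xa0001b91
Cache Err Reg = 0xa0001a41
Cache Err Reg = 0xa00018f1
Cache Err Reg = 0xa00017a1
Cache Err Reg = 0xa0001651
Cache Err Reg = 0xa0001501
Cache Err Reg = 0xa00013b1
"""

REG_RE = re.compile(r"Cache Err Reg = (0x[0-9a-fA-F]+)")


def cache_err_regs(text):
    """Return every Cache Err Reg value in the text as an integer."""
    return [int(value, 16) for value in REG_RE.findall(text)]


regs = cache_err_regs(CRASH_EXCERPT)

# Difference between each value and the one reported after it.
deltas = [earlier - later for earlier, later in zip(regs, regs[1:])]

print(len(regs))                  # 8
print({hex(d) for d in deltas})   # {'0x150'}
```

In this dump the register value drops by the same amount (0x150) between each of the eight reported exceptions, before the watchdog timeout resets the system.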

Date: 2010-05-14 09:06:45 UTC
The configurations were erased. After repeatedly inserting and removing
the cards that would not restart, the router crashed, and we
had to cut the power.

The configurations have been restored. Everything is up.

But I do not have a good feeling about it. It will crash again. Either both
cards were dead or the chassis is dead. I will do a failover
tonight.

Date: 2010-05-12 18:53:21 UTC
OVH Comment - Wednesday, 12th of May, 2010, 19:13

We are replacing card #2.


OVH Comment - Wednesday, 12th of May, 2010, 19:49

The router crashed.


OVH Comment - Wednesday, 12th of May, 2010, 20:26

pff...
Posted May 12, 2010 - 18:49 UTC