OVHCloud Network Status

FS#4421 — ams-1-6k
Incident Report for Network & Infrastructure
Resolved
After we configured "wrr-queue" on all 10G network interfaces of the router ams-1-6k, card 2 started to malfunction.

Jul 29 08:52:14 GMT: %PM_SCP-SP-2-LCP_FW_ERR_INFORM: Module 2 is experiencing the following error: RO[2] (166004 noncritical int in the last 10s, they are now disabled). ROINTMSK[2]:
2E9=0xC,00F=0x728,024=0x1FFF,0E8=0x4,052=0x0,04C=0x1E,049=0x0,09D=0x2FFF,009=0x0,00C=0x0,

Traffic passing through this card was impacted. We disconnected the port and traffic recovered. We are now restarting the card.


Update(s):

Date: 2010-07-29 22:43:05 UTC
Done.

However, we have disabled MPLS. With MPLS enabled, the router does not have enough RAM and would crash. It comes down to about 10 MB ... so we have disabled MPLS on ldn-1 as well.

The router is stable.

What a day ...

Date: 2010-07-29 16:22:32 UTC
We will reload the router and return it to production.


Date: 2010-07-29 16:15:18 UTC
We will replace the 10G card, then restart the router
and restore the traffic. We'll see whether the router crashes again. If
it does, which is the more likely outcome, we will replace the sup
card.




Date: 2010-07-29 12:54:32 UTC
A hardware fault is almost certainly the cause of these problems.
We will go on site to replace the hardware: either the 10G card,
the sup, or both. It is a 3-hour drive
from Roubaix. In the meantime, traffic will go through London and Frankfurt.





Date: 2010-07-29 12:51:16 UTC
Jul 29 11:40:34 40G.ams-1-6k.routers.ovh.net 920: Jul 29 10:40:10 GMT: %SYS-2-MALLOCFAIL: Memory allocation of 65536 bytes failed from 0x41044EEC, alignment 8
Jul 29 11:40:34 40G.ams-1-6k.routers.ovh.net 921: Pool: Processor Free: 1395584 Cause: Memory fragmentation
Jul 29 11:40:34 40G.ams-1-6k.routers.ovh.net 922: Alternate Pool: None Free: 0 Cause: No Alternate pool
Jul 29 11:40:34 40G.ams-1-6k.routers.ovh.net 923: -Process= "IP RIB Update", ipl= 0, pid= 164
Jul 29 11:40:34 40G.ams-1-6k.routers.ovh.net 924: -Traceback= 4102C83C 4103246C 41044EF4 413C2334 413C2578 4228B548 40641B40 42307BD0 409D3998 4098445C 4098457C
Jul 29 11:40:34 40G.ams-1-6k.routers.ovh.net 925: Jul 29 10:40:13 GMT: %FIB-3-NORPXDRQELEMS: Exhausted XDR queuing elements while preparing message for slot/cpu 6/0
Jul 29 11:40:34 40G.ams-1-6k.routers.ovh.net 926: -Process= "IP RIB Update", ipl= 0, pid= 164
Jul 29 11:40:34 40G.ams-1-6k.routers.ovh.net 927: -Traceback= 413C273C 4228B548 40641B40 42307BD0 409D3998 4098445C 4098457C
Jul 29 11:40:34 40G.ams-1-6k.routers.ovh.net 928: Jul 29 10:40:13 GMT: %FIB-3-UPDATEFAIL: Update of prefix 124.138.241.0/-256 failed, resulting in it being deleted.
Jul 29 11:40:48 40G.ams-1-6k.routers.ovh.net 929: Jul 29 10:40:17 GMT: %FIB-3-NOMEM: Malloc Failure, disabling DCEF
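
Cisco IOS messages such as the ones above follow the `%FACILITY-SEVERITY-MNEMONIC:` layout, where a lower severity number means a more serious event (2 is critical, 3 is error). As a minimal sketch, assuming Python and hypothetical names (`IOS_MSG`, `parse_ios_messages`) that are not part of any Cisco or OVH tooling, the following extracts those three fields from each line so the messages can be tallied by severity. The sample lines are copied from the log above with the syslog relay prefix trimmed.

```python
import re
from collections import Counter

# %FACILITY[-SUBFACILITY...]-SEVERITY-MNEMONIC:
# The facility part may itself contain hyphens (e.g. PM_SCP-SP).
IOS_MSG = re.compile(
    r"%(?P<facility>[A-Z0-9_]+(?:-[A-Z0-9_]+)*)"
    r"-(?P<severity>[0-7])"
    r"-(?P<mnemonic>[A-Z0-9_]+):"
)


def parse_ios_messages(lines):
    """Return (facility, severity, mnemonic) for each line carrying an IOS message.

    Continuation lines (Pool:, -Process=, -Traceback=) have no %... tag
    and are skipped.
    """
    found = []
    for line in lines:
        m = IOS_MSG.search(line)
        if m:
            found.append((m["facility"], int(m["severity"]), m["mnemonic"]))
    return found


sample = [
    "Jul 29 10:40:10 GMT: %SYS-2-MALLOCFAIL: Memory allocation of 65536 bytes failed from 0x41044EEC, alignment 8",
    "Pool: Processor Free: 1395584 Cause: Memory fragmentation",
    "Jul 29 10:40:13 GMT: %FIB-3-NORPXDRQELEMS: Exhausted XDR queuing elements while preparing message for slot/cpu 6/0",
    "Jul 29 10:40:13 GMT: %FIB-3-UPDATEFAIL: Update of prefix 124.138.241.0/-256 failed, resulting in it being deleted.",
    "Jul 29 10:40:17 GMT: %FIB-3-NOMEM: Malloc Failure, disabling DCEF",
]

messages = parse_ios_messages(sample)
by_severity = Counter(severity for _, severity, _ in messages)

for facility, severity, mnemonic in messages:
    print(f"sev {severity}  {facility:<6} {mnemonic}")
```

Read this way, the sequence starts with a critical memory allocation failure in the SYS facility, followed by three error-level FIB messages, the last of which disables DCEF.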

Date: 2010-07-29 12:50:57 UTC
ams-1-6k#sh mem stat
               Head    Total(b)     Used(b)     Free(b)   Lowest(b)  Largest(b)
Processor  44B219D0   927819312   715646936   212172376           0     1836520
      I/O   8000000    67108864    12821888    54286976    54219792    54104760
ams-1-6k#reload

System configuration has been modified. Save? [yes/no]:
% Please answer 'yes' or 'no'.

System configuration has been modified. Save? [yes/no]:
% Please answer 'yes' or 'no'.

System configuration has been modified. Save? [yes/no]:
% Please answer 'yes' or 'no'.

System configuration has been modified. Save? [yes/no]: no
Proceed with reload? [confirm]
Connection closed by foreign host.

The router crashed again. We managed to reload it.




Date: 2010-07-29 12:49:29 UTC
The router is back up. We will put it back into the backbone.



Date: 2010-07-29 12:48:42 UTC
We are taking this opportunity to upgrade IOS to a newer version, 17a.



Date: 2010-07-29 08:25:42 UTC
Isolating the router caused a service interruption.

We have control of the router again. We are saving the configuration and restarting it.



Date: 2010-07-29 08:22:52 UTC
The router has crashed.

Jul 29 10:07:09 40G.ams-1-6k.routers.ovh.net 6774: Jul 29 09:06:51 GMT: %C6KFIB-4-DISABLED: Hardware FIB forwarding disabled, reverting to only software forwarding.
Jul 29 10:07:13 40G.ams-1-6k.routers.ovh.net 6775: Jul 29 09:06:53 GMT: %FIB-2-FIBDOWN: CEF has been disabled due to a low memory condition.
Jul 29 10:07:13 40G.ams-1-6k.routers.ovh.net 6776: It can be re-enabled by configuring "ip cef [distributed]"

We are isolating it from the network.

Date: 2010-07-29 08:22:11 UTC
Jul 29 10:02:54 40G.ams-1-6k.routers.ovh.net 6703: Jul 29 09:02:34 GMT: %QM-2-TCAM_BAD_LOU: Bad TCAM LOU operation in ACL

Date: 2010-07-29 08:22:04 UTC
ams-1-6k#sh mem stat
               Head    Total(b)     Used(b)     Free(b)   Lowest(b)  Largest(b)
Processor  44B1D6B0   927836496   891342240    36494256           0     4132240
      I/O   8000000    67108864    11948344    55160520    53479168    55056824
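
One rough way to read the `sh mem stat` output above is to compare how much of the Processor pool is in use and how large the largest contiguous free block is relative to total free memory. The sketch below assumes Python, assumes the column order matches the header shown, and uses hypothetical names (`PoolStats`, `parse_mem_stat`) that do not belong to any Cisco tool.

```python
from dataclasses import dataclass


@dataclass
class PoolStats:
    name: str
    total: int
    used: int
    free: int
    lowest: int   # lowest free amount recorded, in bytes
    largest: int  # largest contiguous free block, in bytes

    @property
    def used_pct(self) -> float:
        return 100.0 * self.used / self.total

    @property
    def largest_to_free(self) -> float:
        # A small ratio means free memory is split into many small blocks.
        return self.largest / self.free if self.free else 0.0


def parse_mem_stat(output: str) -> dict:
    """Parse the data rows of 'sh mem stat' into PoolStats keyed by pool name."""
    pools = {}
    for line in output.splitlines():
        fields = line.split()
        # Data rows: pool name, head address (hex), then five decimal counters.
        if len(fields) != 7 or not all(f.isdigit() for f in fields[2:]):
            continue
        name, _head, *counters = fields
        pools[name] = PoolStats(name, *map(int, counters))
    return pools


# Values copied from the 08:22:04 UTC update.
snapshot_0822 = """\
Head Total(b) Used(b) Free(b) Lowest(b) Largest(b)
Processor 44B1D6B0 927836496 891342240 36494256 0 4132240
I/O 8000000 67108864 11948344 55160520 53479168 55056824
"""

proc = parse_mem_stat(snapshot_0822)["Processor"]
print(f"Processor pool used: {proc.used_pct:.1f}%")
print(f"Largest block / free: {proc.largest_to_free:.1%}")
```

For this snapshot the Processor pool is roughly 96% used, and the largest free block is only about 11% of the remaining free memory, which is consistent with the "Memory fragmentation" cause reported in the MALLOCFAIL messages.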

Date: 2010-07-29 08:21:56 UTC
Jul 29 10:00:31 40G.ams-1-6k.routers.ovh.net 6687: Jul 29 09:00:10 GMT: %SYS-3-CPUHOG: Task is running for (2000)msecs, more than (2000)msecs (33/3),process = CEF Reloader.
Jul 29 10:00:31 40G.ams-1-6k.routers.ovh.net 6688: -Traceback= 41D7B360 41042F5C 413C3E60 413C487C 413C4F48 41044C40 41044C2C
Jul 29 10:00:33 40G.ams-1-6k.routers.ovh.net 6689: Jul 29 09:00:12 GMT: %SYS-2-MALLOCFAIL: Memory allocation of 65536 bytes failed from 0x410433D8, alignment 8
Jul 29 10:00:33 40G.ams-1-6k.routers.ovh.net 6690: Pool: Processor Free: 7057952 Cause: Memory fragmentation
Jul 29 10:00:33 40G.ams-1-6k.routers.ovh.net 6691: Alternate Pool: None Free: 0 Cause: No Alternate pool
Jul 29 10:00:33 40G.ams-1-6k.routers.ovh.net 6692: -Process= "CEF Reloader", ipl= 0, pid= 146
Jul 29 10:00:33 40G.ams-1-6k.routers.ovh.net 6693: -Traceback= 4102AD28 41030958 410433E0 413C26A0 413C3E04 413C487C 413C4F48 41044C40 41044C2C
Posted Jul 29, 2010 - 08:21 UTC