OVHCloud Network Status

FS#4279 — p19-53/p19-54
Incident Report for Network & Infrastructure
Resolved
Since 7 am this morning, we have had a strange problem between the Nexus 5000 and the Catalyst 6500.




Update(s):

Date: 2010-06-13 10:41:28 UTC
p19-57-6k#sh inter t6/4 | i 30 sec
30 second input rate 1276919000 bits/sec, 316372 packets/sec
30 second output rate 1149697000 bits/sec, 204839 packets/sec
p19-57-6k#sh inter t7/1 | i 30 sec
30 second input rate 364833000 bits/sec, 68728 packets/sec
30 second output rate 1142250000 bits/sec, 205848 packets/sec
p19-57-6k#sh inter t7/3 | i 30 sec
30 second input rate 1404992000 bits/sec, 344054 packets/sec
30 second output rate 1042913000 bits/sec, 199855 packets/sec
p19-57-6k#sh inter t7/4 | i 30 sec
30 second input rate 342808000 bits/sec, 65034 packets/sec
30 second output rate 1081846000 bits/sec, 194390 packets/sec

p19-57-6k#sh inter cou err mod 6
Port Align-Err FCS-Err Xmit-Err Rcv-Err UnderSize OutDiscards
Te6/4 0 0 0 271 0 0
Port Single-Col Multi-Col Late-Col Excess-Col Carri-Sen Runts Giants
Te6/4 0 0 0 0 0 0 0
Port SQETest-Err Deferred-Tx IntMacTx-Err IntMacRx-Err Symbol-Err
Te6/4 0 0 0 0 0
p19-57-6k#sh inter cou err mod 7
Port Align-Err FCS-Err Xmit-Err Rcv-Err UnderSize OutDiscards
Te7/1 0 0 0 0 0 0
Te7/3 0 0 0 142 0 0
Te7/4 0 0 0 0 0 0
Port Single-Col Multi-Col Late-Col Excess-Col Carri-Sen Runts Giants
Te7/1 0 0 0 0 0 0 0
Te7/3 0 0 0 0 0 0 0
Te7/4 0 0 0 0 0 0 0
Port SQETest-Err Deferred-Tx IntMacTx-Err IntMacRx-Err Symbol-Err
Te7/1 0 0 0 0 0
Te7/3 0 0 0 0 0
Te7/4 0 0 0 0 0


Date: 2010-06-13 10:41:03 UTC
With 4x10G towards the 6500, things go well overall. However, as soon
as a 10G port receives more than 300,000 packets/second from an N5,
it still shows errors on input.

We will escalate the errors to Cisco.
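
As a rough cross-check of that observation, the sketch below parses the input rates posted in the 10:41:28 update and compares them with the Rcv-Err counters from the same update. The function names and the RCV_ERR table are illustrative (the counters are copied by hand from the posted output), and the 300,000 packets/second threshold is simply the figure quoted above, not a documented hardware limit.

```python
import re

# Input-rate lines copied from the 10:41:28 update (output-rate lines omitted).
RATE_OUTPUT = """\
p19-57-6k#sh inter t6/4 | i 30 sec
30 second input rate 1276919000 bits/sec, 316372 packets/sec
p19-57-6k#sh inter t7/1 | i 30 sec
30 second input rate 364833000 bits/sec, 68728 packets/sec
p19-57-6k#sh inter t7/3 | i 30 sec
30 second input rate 1404992000 bits/sec, 344054 packets/sec
p19-57-6k#sh inter t7/4 | i 30 sec
30 second input rate 342808000 bits/sec, 65034 packets/sec
"""

# Rcv-Err counters from the same update, copied by hand (illustrative table).
RCV_ERR = {"t6/4": 271, "t7/1": 0, "t7/3": 142, "t7/4": 0}

# Threshold quoted in the report; an observation, not a documented limit.
PPS_THRESHOLD = 300_000


def parse_input_pps(text):
    """Return {port: input packets/sec} from 'sh inter <port> | i 30 sec' output."""
    rates = {}
    port = None
    for line in text.splitlines():
        prompt = re.search(r"#sh inter (\S+)", line)
        if prompt:
            port = prompt.group(1)
            continue
        rate = re.search(r"input rate \d+ bits/sec, (\d+) packets/sec", line)
        if rate and port is not None:
            rates[port] = int(rate.group(1))
    return rates


def summarize(rates, errors, threshold=PPS_THRESHOLD):
    """Return (port, pps, over_threshold, has_errors) tuples, one per port."""
    return [
        (port, pps, pps > threshold, errors.get(port, 0) > 0)
        for port, pps in rates.items()
    ]


if __name__ == "__main__":
    for port, pps, over, has_err in summarize(parse_input_pps(RATE_OUTPUT), RCV_ERR):
        print(f"{port}: {pps} pps, over threshold={over}, Rcv-Err>0={has_err}")
```

On these four ports, the two that exceed the threshold are the same two that show receive errors. That is consistent with the observation, though four samples prove nothing on their own.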

Date: 2010-06-12 07:48:47 UTC
The errors are still increasing, yet there is no MAC detection problem
between the load-balancing cards and the servers.

p19-57-6k#sh inter counters errors module 6

Port Align-Err FCS-Err Xmit-Err Rcv-Err UnderSize OutDiscards
Te6/1 0 0 0 0 0 0
Te6/2 0 0 0 0 0 0
Te6/3 0 0 0 0 0 0
Te6/4 0 0 0 146 0 0

Port Single-Col Multi-Col Late-Col Excess-Col Carri-Sen Runts Giants
Te6/1 0 0 0 0 0 0 0
Te6/2 0 0 0 0 0 0 0
Te6/3 0 0 0 0 0 0 15
Te6/4 0 0 0 0 0 0 0

Port SQETest-Err Deferred-Tx IntMacTx-Err IntMacRx-Err Symbol-Err
Te6/1 0 0 0 0 0
Te6/2 0 0 0 0 0
Te6/3 0 0 0 0 0
Te6/4 0 0 0 0 0
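
To make "the errors are still increasing" concrete, here is a minimal sketch that diffs two snapshots of the first counter table: the one posted in the 07:32:51 update and the one above. The helper names are hypothetical, the parser assumes the whitespace-separated layout shown in this report, and the comparison is only meaningful if the counters were not cleared between the two captures.

```python
def parse_error_table(text):
    """Parse 'sh inter counters errors' style text into {port: {column: count}}.

    Assumes each block starts with a 'Port ...' header line followed by one
    whitespace-separated row per port, as in the output posted in this report.
    """
    counters = {}
    header = None
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        if fields[0] == "Port":
            header = fields[1:]
        elif header is not None and len(fields) == len(header) + 1:
            values = (int(v) for v in fields[1:])
            counters.setdefault(fields[0], {}).update(zip(header, values))
    return counters


def counter_deltas(before, after):
    """Return {(port, column): increase} for every counter that changed."""
    deltas = {}
    for port, columns in after.items():
        for name, value in columns.items():
            diff = value - before.get(port, {}).get(name, 0)
            if diff:
                deltas[(port, name)] = diff
    return deltas


# First table of the 07:32:51 update.
EARLIER = """\
Port Align-Err FCS-Err Xmit-Err Rcv-Err UnderSize OutDiscards
Te6/1 0 0 0 0 0 0
Te6/2 0 0 0 0 0 0
Te6/3 0 0 0 0 0 0
Te6/4 0 0 0 74 0 0
"""

# First table of the 07:48:47 update.
LATER = """\
Port Align-Err FCS-Err Xmit-Err Rcv-Err UnderSize OutDiscards
Te6/1 0 0 0 0 0 0
Te6/2 0 0 0 0 0 0
Te6/3 0 0 0 0 0 0
Te6/4 0 0 0 146 0 0
"""

if __name__ == "__main__":
    print(counter_deltas(parse_error_table(EARLIER), parse_error_table(LATER)))
```

Only Te6/4 changes between the two snapshots, and only in the Rcv-Err column.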

We will add a second 10G port to the port channel
between the 6k and the N5. If we distribute the traffic
across two 10G ports instead of one, it should work well.
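
For a rough sense of how the number of ports relates to the per-port packet rate, the sketch below uses the input rates posted in the 2010-06-13 10:41:28 update and the 300,000 packets/second figure from the 10:41:03 update. It assumes a perfectly even split across ports, which is an idealization: a port channel hashes flows onto member links, and the posted per-port figures were in fact uneven. The function names are illustrative.

```python
import math

# Input packets/sec per 6k port, from the 2010-06-13 10:41:28 update.
INPUT_PPS = {"Te6/4": 316372, "Te7/1": 68728, "Te7/3": 344054, "Te7/4": 65034}

# Per-port rate above which errors were observed (figure quoted in the report).
PPS_THRESHOLD = 300_000


def even_share(rates):
    """Per-port rate if the total were spread perfectly evenly (idealized)."""
    return sum(rates.values()) / len(rates)


def min_ports_for_even_split(total_pps, threshold):
    """Smallest port count that keeps an even per-port share at or under threshold."""
    return math.ceil(total_pps / threshold)


def busiest_port_share(rates):
    """Fraction of the total carried by the most loaded port."""
    return max(rates.values()) / sum(rates.values())


if __name__ == "__main__":
    total = sum(INPUT_PPS.values())
    print("total input pps:", total)
    print("even share over 4 ports:", even_share(INPUT_PPS))
    print("ports needed with an even split:", min_ports_for_even_split(total, PPS_THRESHOLD))
    print("busiest port share:", round(busiest_port_share(INPUT_PPS), 3))
```

With an even split, the total measured on June 13 would sit well under the threshold on four ports, but the busiest port actually carried over 40% of it. Adding links helps only as far as the hashing spreads the flows.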


Date: 2010-06-12 07:46:13 UTC
On the N5 we can enable flow control on the port channel but not on the physical ports, and on the 6k we can enable flow control on the physical port but not on the port channel.
I love it.



Date: 2010-06-12 07:34:35 UTC
We will check whether things improve with flow control enabled.


Date: 2010-06-12 07:32:51 UTC
We have input packet errors on the 6k, and none on the Nexus 5000.

We restarted p19-53-n5, then p19-54-n5.
They are clustered, so each took over for the other while it restarted. The problem persists.

The problem affects all 6k ports connected to the N5s.


p19-57-6k#sh inter counters errors module 6

Port Align-Err FCS-Err Xmit-Err Rcv-Err UnderSize OutDiscards
Te6/1 0 0 0 0 0 0
Te6/2 0 0 0 0 0 0
Te6/3 0 0 0 0 0 0
Te6/4 0 0 0 74 0 0

On the N5s we use a virtual port channel (vPC) spanning the two devices.


187 input errors, 0 CRC, 0 frame, 0 overrun, 0 ignored

It is as if the Nexus switches are sending so much traffic that the 6k cannot absorb it...

Hmm... we will increase the input queue size on the 6k:

wrr-queue bandwidth 255 255 255 255 255 255 255
wrr-queue queue-limit 100 100 100 100 100 100 100
wrr-queue threshold 1 100 100 100 100 100 100 100 100
wrr-queue threshold 2 100 100 100 100 100 100 100 100
wrr-queue random-detect min-threshold 3 100 100 100 100 100 100 100 100
wrr-queue random-detect max-threshold 1 100 100 100 100 100 100 100 100
wrr-queue random-detect max-threshold 2 100 100 100 100 100 100 100 100
wrr-queue cos-map 1 4 0 1
wrr-queue cos-map 3 1 6
wrr-queue cos-map 7 8 7
rcv-queue bandwidth 255 255 255 255 255 255 255 255
rcv-queue queue-limit 100 100 100 100 100 100 100 100
rcv-queue threshold 1 100 100 100 100 100 100 100 100
rcv-queue random-detect min-threshold 1 100 100 100 100 100 100 100 100
rcv-queue random-detect max-threshold 1 100 100 100 100 100 100 100 100
rcv-queue cos-map 1 1 2 3
rcv-queue cos-map 1 8 0 1
rcv-queue cos-map 2 8 4 6
rcv-queue cos-map 7 8 7
rcv-queue cos-map 8 8 5
Posted Jun 12, 2010 - 07:08 UTC