OVHcloud Network Status

FS#15238 — BHS
Incident Report for Network & Infrastructure
The links between Newark and our Beauharnois datacenter are down.
We will update this task as soon as we have more information.


Date: 2015-11-11 16:45:59 UTC

On Monday, November 2, 2015, the BHS datacenter suffered a new network incident affecting all BHS datacenter customers. The incident was similar to the one 5 months ago, when 3 pairs of our optical fibers were cut between Beauharnois and Montreal on the North route.

Two years ago, we began adding redundancy over the 103 km of fiber on the South loop. The project has since suffered long delays, bringing the completion time to 24 months instead of the planned 12. Work was finished this week, and only in response to the urgency of the situation: we forced through the installation of the new South fiber optic line, thereby reducing the downtime of the public network from 10 hours and 51 minutes to 4 hours and 55 minutes.

The datacenter is now fully redundant, connected by both the North loop and the South loop. We have 5 directions: 2 towards New York, 1 towards Chicago and 2 towards Montreal.

Three years after starting the project, we are still working on the 2nd route towards New York, where all that remains is to splice the optical fiber at the Canada/USA border. Negotiations are underway to connect Toronto and Detroit, MI.

Here is what our dark fiber and DWDM network connecting BHS looks like:

We own:
- 4x10 Gbps between BHS and Montreal via the North loop
- 4x10 Gbps between BHS and Montreal via the South loop
- 6x100 Gbps towards New York/Newark via the South loop
We rent:
- 10x10 Gbps towards Chicago via the South loop
- 20x10 Gbps towards Newark via the North loop
- 4x10 Gbps towards London (through Canada) via the South loop
- 4x10 Gbps towards Paris (through USA) via the North loop
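
As a rough cross-check of the figures above, the listed capacities can be summed per ownership and per loop. This is only an illustrative sketch: the `LINKS` table and the `capacity_by` helper are hypothetical names, and the totals are nominal link capacities taken from the list, not measured throughput.

```python
from collections import defaultdict

# Links as listed above: (destination, link count, Gbps per link, loop, ownership).
LINKS = [
    ("Montreal", 4, 10, "North", "owned"),
    ("Montreal", 4, 10, "South", "owned"),
    ("New York/Newark", 6, 100, "South", "owned"),
    ("Chicago", 10, 10, "South", "rented"),
    ("Newark", 20, 10, "North", "rented"),
    ("London", 4, 10, "South", "rented"),
    ("Paris", 4, 10, "North", "rented"),
]

# Column index of each grouping field in a LINKS row.
FIELDS = {"destination": 0, "loop": 3, "ownership": 4}


def capacity_by(links, field):
    """Sum nominal capacity (Gbps) grouped by the given field."""
    totals = defaultdict(int)
    for row in links:
        count, gbps = row[1], row[2]
        totals[row[FIELDS[field]]] += count * gbps
    return dict(totals)


if __name__ == "__main__":
    print(capacity_by(LINKS, "ownership"))  # owned vs rented
    print(capacity_by(LINKS, "loop"))       # North vs South
```

Grouping by loop shows how much nominal capacity depends on each physical path, which is the quantity that matters when one loop is cut.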

Within 3 to 4 weeks, we will finish the work on our own 2nd route towards New York/Newark and we are going to activate an optical failover of 6x100 Gbps through the following 2 routes: BHS/NYC-NWK. This will allow us to terminate the rental of the 20x10 Gbps through Newark.

We are going to increase the Montreal/Chicago capacity to 2x100 Gbps, then 4x100 Gbps, while waiting for our own route between BHS and Chicago via Toronto/Detroit to be finished. We cannot yet provide an ETA, because we are still missing 800 km between Toronto and Detroit, MI. The BHS to Toronto and Detroit, MI to Chicago, IL segments are already in place.

At the same time, we are working on the network in the USA between New York and Newark (NJ), then Chicago (IL) and Ashburn (VA). This new USA network is planned for deployment in Q2 2016 and will not only provide redundancy from BHS towards Chicago through New York but also allow a datacenter in the USA to be connected.

We sincerely apologize for the latest BHS downtime. We underestimated the time and the complexity of fiber optic network construction projects in Canada, which created an unreasonable delay in securing each kilometer connecting the BHS datacenter. Even though the work is now done and we have 100% redundancy, we are really sad to announce it under such circumstances.

Within the next 10 days, we will offer compensation which conforms to our SLA for the month of November.

The exact times concerning the incident:
- down at 11:53 AM GMT-5 (5:53 PM GMT+1)
- at 4:48 PM GMT-5 (10:48 PM GMT+1), traffic to the BHS datacenter from the outside returned to normal thanks to the South optical fiber
- in contrast, the links between BHS and the other OVH datacenters and the VAC (Anti-DDoS) were affected until the North fiber was completely repaired. Traffic was able to resume at 10:44 PM GMT-5 (4:44 AM GMT+1).
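
The two downtime figures quoted earlier (4 hours 55 minutes and 10 hours 51 minutes) follow directly from these timestamps. A minimal sketch of the arithmetic, assuming all three times fall on November 2, 2015 in GMT-5; the variable and function names are illustrative only:

```python
from datetime import datetime, timedelta, timezone

GMT_MINUS_5 = timezone(timedelta(hours=-5))
GMT_PLUS_1 = timezone(timedelta(hours=1))

# Timestamps from the list above, assumed to be on 2015-11-02 in GMT-5.
down = datetime(2015, 11, 2, 11, 53, tzinfo=GMT_MINUS_5)
public_restored = datetime(2015, 11, 2, 16, 48, tzinfo=GMT_MINUS_5)
fully_restored = datetime(2015, 11, 2, 22, 44, tzinfo=GMT_MINUS_5)


def fmt(delta):
    """Format a timedelta as 'H h MM min'."""
    minutes = int(delta.total_seconds()) // 60
    return f"{minutes // 60} h {minutes % 60:02d} min"


if __name__ == "__main__":
    print("public network downtime:", fmt(public_restored - down))
    print("downtime until North fiber repair:", fmt(fully_restored - down))
    # The same instant expressed in GMT+1 falls on the next calendar day.
    print("full restoration (GMT+1):", fully_restored.astimezone(GMT_PLUS_1))
```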

Best Regards,

Date: 2015-11-03 14:29:50 UTC
===== 2015-11-02 @ 08:09PM CET (02:09PM EST)
Putting 2 new fibers into production should allow us to reconnect the 6x100G links between the BHS DC and the Newark POP.

===== 2015-11-02 @ 09:37PM CET (03:37PM EST)
Work continues on two fronts:
1) Commissioning of the new BHS <> NWK fiber pair is still ongoing. The missing cable run (cross-connect) is being installed in Montreal. All our optical equipment is pre-wired and ready to bring up the link.
2) Repair work in the tunnel where the cut occurred is also being prepared. Several operators are affected.
Here's the latest update from the maintainer of the cable:
"Our first team is on site and investigating the repair of the break. The cable is already on its way to Melocheville. We are mobilizing a second crew to pull the cable through the ducts. Two splicing teams will join the first shortly. Flaggers will also be there."

===== 2015-11-02 @ 09:55PM CET (03:55PM EST)
The installation of the cable in Montreal should start within a few minutes. Within 30-40 minutes we expect to have more visibility on when the Beauharnois <> Newark links could come up via this new redundant path.

===== 2015-11-02 @ 11:06PM CET (05:06PM EST)
The Beauharnois <> Newark link is up on the new fiber from the east. Of the 3 pairs undergoing acceptance testing on this link, two were already operational, which allowed us to put them into production urgently.
The BHS <> Chicago links and the other BHS <> Montreal links are still down, which affects VAC traffic, vRack traffic to Europe and some OVH internal links.

===== 2015-11-02 @ 11:51PM CET (05:51PM EST)
We shut down the peerings (Bell / Videotron / Tata) in MTL to avoid saturating the 20G NWK <> MTL and 10G MTL <> BHS links. Traffic will flow through other peerings; we are checking whether some routes need to be modified.

===== 2015-11-03 @ 00:28AM CET (06:28PM EST)
We did the same for TOR by shutting down TORIX to avoid saturating the TOR <> NWK link.

===== 2015-11-03 @ 00:52AM CET (06:52PM EST)
The optical characteristics of the new link are very different from those of the existing link because it is much shorter. Too much optical power is being received in Montreal, which makes 4 of the 6 100G links unstable. We need to add an attenuator on the line to stabilize the link. This will affect traffic again for between 30 seconds and several minutes.
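
To illustrate why a shorter span can overload a receiver, here is a deliberately simplified power-budget sketch. Every number in it (launch power, fiber loss per km, span lengths, receiver overload threshold, margin) is a hypothetical placeholder, not a value from this incident, and the model ignores amplifiers, connectors and splices that a real DWDM line would include.

```python
def received_power_dbm(tx_dbm, length_km, loss_db_per_km, attenuator_db=0.0):
    """Received power for a plain fiber span: launch power minus losses."""
    return tx_dbm - length_km * loss_db_per_km - attenuator_db


def required_attenuation_db(rx_dbm, rx_overload_dbm, margin_db=0.0):
    """Attenuation needed to bring rx_dbm below the overload threshold."""
    target_dbm = rx_overload_dbm - margin_db
    return max(0.0, rx_dbm - target_dbm)


# Hypothetical values, for illustration only.
TX_DBM = 3.0            # launch power
LOSS_DB_PER_KM = 0.25   # fiber attenuation
RX_OVERLOAD_DBM = -10.0 # receiver overload threshold
MARGIN_DB = 2.0         # safety margin below the threshold

if __name__ == "__main__":
    for name, length_km in (("long existing span", 100.0), ("short new span", 40.0)):
        rx = received_power_dbm(TX_DBM, length_km, LOSS_DB_PER_KM)
        pad = required_attenuation_db(rx, RX_OVERLOAD_DBM, MARGIN_DB)
        print(f"{name}: rx = {rx:.1f} dBm, attenuator needed = {pad:.1f} dB")
```

With these placeholder values the long span needs no attenuator while the short span does, which is the qualitative effect described above.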

===== 2015-11-03 @ 01:31AM CET (07:31PM EST)
Installing the attenuator has stabilized the link. We have recovered 600G of capacity between BHS and Newark. Links are stable.
ETA for the repair of the fiber on the west side (BHS <> MTL links, BHS <> Chicago and inter-DC / vRack traffic): 3h

===== 2015-11-03 @ 06:10AM CET (00:10AM EST)
The repair of the cable on the western route is complete. All the cut links are now up (BHS <> Chicago, BHS <> Montreal, internal network to Europe and vRack). Traffic that had been rerouted away from the Bell / Videotron peerings has been normalized and performance is back to normal.
The new fiber on the new Montreal East route was due to go into production this same week. Bad timing. Although unplanned and carried out entirely under the urgency of the situation, bringing it into production allowed us to avoid several extra hours of outage and to permanently increase capacity to Newark. In the coming days we will migrate some other links to this fiber to secure the capacity to Chicago, Montreal and the inter-datacenter network to Europe. We are also expecting a 3rd route from BHS to Newark soon, which will further secure the connectivity of the BHS DC. We apologize again for this incident.

Date: 2015-11-02 20:30:15 UTC
We still don't have an ETA regarding the repairs on the existing fiber.

However, it has been confirmed that 2 out of the 3 new pairs on the Eastern route (which were being set up) are now usable. We are working with our providers to get a full connection as quickly as possible: the only thing currently missing is the cross-connect linking OVH to our provider in Montreal.

Date: 2015-11-02 18:48:54 UTC
Our teams are dispatched and are evaluating the damage. We'll have a concrete ETA within the next hour.

Date: 2015-11-02 18:38:30 UTC
The cut appears to be inside the tunnel near the power plant.

Our supplier is sending a team on site to repair it.

Date: 2015-11-02 17:58:23 UTC
The cut is located 108 km from Montreal, on the fiber that passes north of the lake. Our technicians are following the path from BHS to find the cause of the fault (probably a fiber cut approx. 9 km from the datacenter).

Date: 2015-11-02 17:57:42 UTC
The delivery of additional network links was planned for the coming days; we are doing everything we can to speed up the process and bring these links up quickly.

In parallel, we are in contact with our suppliers to find the cause of the failure.

Date: 2015-11-02 17:16:46 UTC
Montreal <> Beauharnois down (6x10G)
Posted Nov 02, 2015 - 17:16 UTC