We will reinsert a FWSM card into one of the VoIP routers in order to restore the failover function between the 2 cards.
Update(s):
Date: 2014-01-25 21:37:14 UTC We have put the new firewall into production.
The switchover went well, except that
one of the routers did not take the MAC address
change into account. We forced it,
and this fixed the issue.
The situation is stable and we have no further
issues.
Date: 2014-01-25 21:35:08 UTC We are disconnecting the old firewall.
Date: 2014-01-25 21:34:49 UTC No connections pass through the old
firewall any more. Everything goes through the new one.
We can see the connections coming back
on the infra. We are checking how to manage the internal
congestion in order to avoid an internal switchover
of the infra.
Date: 2014-01-25 21:32:14 UTC It's done.
Date: 2014-01-25 21:32:04 UTC The 3000 phones using the IP that was
switched from the old firewall to the new one are
UP.
We now have to switch the main IP, which carries SIP/MGCP,
to the new firewall.
Date: 2014-01-25 21:30:42 UTC We will switch an IP that handles only MGCP to the
new firewall.
Date: 2014-01-25 21:30:06 UTC We are disconnecting the slave firewall to return to the state
before the maintenance.
Date: 2014-01-25 06:36:29 UTC Details on the maintenance carried out tonight on the VoIP network infra:
At the routing level, the VoIP infra is working properly as of about 2h15. We had several issues on the firewall cards, which led to an interruption of the SIP and MGCP service. The purpose of the maintenance was to put into production a spare card delivered by Cisco in the afternoon, in order to return to active/passive failover on this part of the infra.
This new card, which had been updated and configured beforehand, was supposed to synchronise the state of all sessions with the card in production and then take over the traffic.
However, it seems that some sections of the configuration had not been applied correctly, and the synchronisation step in fact deleted these sections from the active card.
The traffic was impacted at that point. The rollback performed a few minutes later did not immediately restore the initial state, because the configuration was by then incorrect on both cards. We had to manually re-push the whole configuration to recover the traffic.
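To illustrate the synchronisation problem described above, a check run before the synchronisation could compare the configuration sections of the two cards and list those that the active card would lose. This is only a minimal sketch, not the procedure used during the maintenance: the configuration format shown (a non-indented line opens a section, indented lines belong to it) and the names split_sections, sections_at_risk, ACTIVE and STANDBY are hypothetical.

```python
def split_sections(config_text):
    """Group a config dump into sections.

    Assumed (hypothetical) format: a non-indented line starts a section,
    and the indented lines that follow belong to it.
    """
    sections = {}
    current = None
    for line in config_text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        if not line[0].isspace():
            current = line.strip()
            sections[current] = []
        elif current is not None:
            sections[current].append(line.strip())
    return sections


def sections_at_risk(active_text, standby_text):
    """Return the sections of the active card that are missing from, or
    different on, the card about to be synchronised with it."""
    active = split_sections(active_text)
    standby = split_sections(standby_text)
    return sorted(
        name for name, body in active.items() if standby.get(name) != body
    )


# Made-up example data: the new card lacks the MGCP section.
ACTIVE = "section sip\n  port 5060\nsection mgcp\n  port 2427\n"
STANDBY = "section sip\n  port 5060\n"

print(sections_at_risk(ACTIVE, STANDBY))  # ['section mgcp']
```

A non-empty result would be a reason to stop and fix the new card before letting the two cards synchronise.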
Second problem: for a reason that we do not yet know, the memory use on the cards rose abnormally until the RAM was saturated, which prevented the traffic from flowing normally.
The solution was to reboot both firewall cards simultaneously, in order to completely reset the state that is synchronised between the two cards and get back to normal performance.
For now, all the SIP phones are registered, but not all the MGCP ones. Further details on the purely VoIP side are in this task:
http://status.ovh.net/?do=details&id=6182
Date: 2014-01-25 02:35:46 UTC We have reactivated the firewall on v1.
Date: 2014-01-25 02:35:25 UTC p19-v2-6k has crashed. The traffic is flowing through p19-v1-6k,
but without the firewall card.
Date: 2014-01-25 02:34:36 UTC The slave firewall card cannot service requests:
3d22h: Processor 0 of module in slot 4 cannot service session requests.
Neither can the master:
2d23h: SP: The PC in slot 4 is shutting down. Please wait ...
2d23h: SP: PC shutdown completed for module 4
We have switched the traffic to the slave.
Date: 2014-01-25 00:39:40 UTC We are carrying out tests on the new card.
Date: 2014-01-25 00:39:18 UTC The card is ready. We will test it and reinsert it into production tonight at 00:00.