[Comtec Announce] [Core Network] Service Alert 22nd May 2018 - Update 1

David Croft david at comtec.net.uk
Mon Jun 11 08:33:01 BST 2018


This is an incident notification regarding an outage on our core network.

Date: Tuesday, 22nd May 2018
Start time: 14:46 BST
End time: 16:44 BST (Final clear)

Services affected:

Intermittent partial and total outages across our IP network

Report:

A failure of a network device in our core IP network led to a cascade
of additional failures across the network.

Controlled shutdowns of portions of our network were necessary to
bring it back to a stable state in order to fully restore service.

Both the failure and the controlled shutdowns caused packet loss and
temporary routing failures for services hosted on or delivered through
our network.

Root Cause Analysis:

Memory exhaustion on an edge transit router caused it to restart its
BGP process. The consequent withdrawal and re-announcement of all
routes from BGP caused other devices on the network to suffer similar
failures in a cascading fashion.

Next Steps:

We are bringing forward our upcoming planned network upgrades, which
will now take place in July. Emergency maintenance sessions will be
announced once lab testing has been completed.

In the meantime we will continue to observe the change freeze on our
network to prevent a further incident.

Regards,

David Croft

-- 
David Croft
Lead Engineer

Comtec Enterprises Ltd
Comtec House
46a Albert Road North
Reigate Industrial Estate
Reigate
Surrey RH2 9EL

Tel: 0845 899 1400
Fax: 0845 899 1401
www.comtec.com

For urgent operational issues please always contact noc at comtec.com
or 0845 899 1423 and not any named individual.

