Response from our data center Provider in our Montreal region:
We wanted to provide an update on our initial investigation results following the outage of your service this morning. It is our understanding that a network device was incorrectly chosen to act as a spanning tree root by automatic spanning-tree processes. After multiple attempts to manually correct the issue, a full restart of the spanning tree process on the suspect device was required to recover connectivity.
As a result of this action, core network traffic was briefly interrupted at the facility. Although we anticipated the possibility of an interruption during reconvergence, the process took approximately 3 minutes to stabilize instead of a few seconds. While the symptoms during this period were similar, we would like to be clear that this was not related to the incident of May 19th.
Although our investigation is ongoing, we have not found a configuration error or clear reason to indicate why this happened. We are engaging our vendor support teams to ensure we have a thorough understanding of the issue and to determine whether the cause could be a software bug. We will provide additional updates and further details in an upcoming incident report.
We will be putting additional measures in place as well to ensure this issue does not occur again.