Overview
SIREN is short for “Système d’Inscription Regroupant les Evènements Notoires” in French, or “Subscribable Interface Regrouping Events of Note” in English (in an attempt to translate an already far-fetched acronym). Its purpose is to act as a status page for the services hosted by Arcanite, as well as to provide a tool to manage communication about incidents and maintenances.
The core elements of SIREN are components, which are entities that have a name and a status. A component can be a service intended for end users (e.g., a webserver), an internal service (e.g., a proxy or Redis), or even a bigger entity such as a whole datacenter. A component's status is one of the following: Unknown, Operational, Performance issues, Partial outage, Major outage or In maintenance.
The components’ statuses are displayed on a page, at a chosen URL. This page also contains data on their past state during the last month (displayed as vertical colored bars), as well as an indication of their uptime during this period. The uptime is computed as the proportion of the total time during which the component was Operational. Periods during which the component had an Unknown or In maintenance status are not counted in the total time.
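The uptime rule above can be sketched as follows. This is only an illustration under assumptions, not SIREN's actual implementation: the `Status` enum, the `uptime` function and the `(start, end, status)` interval format are hypothetical names chosen for the example.

```python
from datetime import datetime, timedelta
from enum import Enum


class Status(Enum):
    UNKNOWN = "Unknown"
    OPERATIONAL = "Operational"
    PERFORMANCE_ISSUES = "Performance issues"
    PARTIAL_OUTAGE = "Partial outage"
    MAJOR_OUTAGE = "Major outage"
    IN_MAINTENANCE = "In maintenance"


# Statuses that are left out of the total time.
EXCLUDED = {Status.UNKNOWN, Status.IN_MAINTENANCE}


def uptime(intervals):
    """Return the fraction of counted time spent Operational.

    `intervals` is an iterable of (start, end, status) tuples (a
    hypothetical format). Returns None when no time is counted at all.
    """
    operational = timedelta()
    total = timedelta()
    for start, end, status in intervals:
        if status in EXCLUDED:
            continue  # Unknown and In maintenance do not count
        duration = end - start
        total += duration
        if status is Status.OPERATIONAL:
            operational += duration
    if total == timedelta():
        return None
    return operational / total


t0 = datetime(2024, 1, 1)
history = [
    (t0, t0 + timedelta(hours=18), Status.OPERATIONAL),
    (t0 + timedelta(hours=18), t0 + timedelta(hours=20), Status.IN_MAINTENANCE),
    (t0 + timedelta(hours=20), t0 + timedelta(hours=22), Status.MAJOR_OUTAGE),
]
print(f"{uptime(history):.1%}")  # 18 h Operational out of 20 h counted: 90.0%
```

In this example the two hours of maintenance are ignored, so the outage weighs 2 hours out of 20 rather than 2 out of 22.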
Components on a page can be grouped into sections (with an optional description), or even subsections, to order the page into semantically related groups of components.
Events
In addition to the current and past statuses of components, a page can display information about events. An event is either an Incident (i.e., an unexpected event that disrupts operational processes), or a Maintenance (i.e., a planned event generally involving software updates and/or hardware replacement).
An event has a name, a start date, and possibly an end date (in the case of incidents, one might not know when they will end). In addition, an event has a status (Scheduled, Ongoing or Finished for maintenances; Investigating or Resolved for incidents), as well as a timeline of event updates. An event update contains a message and represents a new piece of information about the event, generally to announce its beginning, its end or an unforeseen issue. For instance, a possible update for an Incident could be “Problem is identified and fixed”, accompanied by a status change from Investigating to Resolved.
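A minimal sketch of this event model, assuming a simple in-memory representation. The class and field names (`Event`, `EventUpdate`, `add_update`, `new_status`) are hypothetical and do not come from SIREN itself; only the statuses and the example message are taken from the description above.

```python
from dataclasses import dataclass, field
from datetime import datetime
from enum import Enum
from typing import List, Optional


class EventKind(Enum):
    INCIDENT = "Incident"
    MAINTENANCE = "Maintenance"


# Statuses allowed for each kind of event, as listed above.
ALLOWED_STATUSES = {
    EventKind.INCIDENT: {"Investigating", "Resolved"},
    EventKind.MAINTENANCE: {"Scheduled", "Ongoing", "Finished"},
}


@dataclass
class EventUpdate:
    message: str
    posted_at: datetime
    new_status: Optional[str] = None  # None means the status is unchanged


@dataclass
class Event:
    kind: EventKind
    name: str
    start: datetime
    status: str
    end: Optional[datetime] = None  # may be unknown for incidents
    timeline: List[EventUpdate] = field(default_factory=list)

    def __post_init__(self):
        self._check(self.status)

    def _check(self, status):
        # Reject statuses that do not belong to this kind of event.
        if status not in ALLOWED_STATUSES[self.kind]:
            raise ValueError(f"{status!r} is not valid for {self.kind.value}")

    def add_update(self, update):
        # Apply the optional status change, then record the update.
        if update.new_status is not None:
            self._check(update.new_status)
            self.status = update.new_status
        self.timeline.append(update)


# Hypothetical incident, resolved by a single update.
incident = Event(EventKind.INCIDENT, "Database latency",
                 datetime(2024, 1, 1, 9, 0), "Investigating")
incident.add_update(EventUpdate("Problem is identified and fixed",
                                datetime(2024, 1, 1, 10, 30), "Resolved"))
```

Keeping the allowed statuses in one table per event kind makes it hard to give an incident a maintenance status by mistake.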
While an event is ongoing (i.e., the present moment is between its start and end date), a banner is shown on each page containing components that are affected by this event. A note shows which components on that page are concerned by the event, and another shows the latest event update message, with an option to see the full timeline of event updates. Additionally, a section at the end of the page can be expanded to show the events that occurred during the last week.
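The banner logic described above could look roughly like the sketch below. The function names, the dictionary returned by `banner` and the `(posted_at, message)` update format are all hypothetical. Treating an event with no end date as ongoing once it has started is also an assumption, since the text only defines "ongoing" in terms of a start and an end date.

```python
from datetime import datetime


def is_ongoing(start, end, now):
    """True if `now` lies between `start` and `end`.

    Assumption: an event without an end date (end is None) counts as
    ongoing as soon as it has started.
    """
    if now < start:
        return False
    return end is None or now <= end


def banner(page_components, affected_components, updates, start, end, now):
    """Return the banner data for one page and one event, or None.

    `updates` is a list of (posted_at, message) tuples.
    """
    # Components of this page that the event concerns, in page order.
    concerned = [c for c in page_components if c in affected_components]
    if not concerned or not is_ongoing(start, end, now):
        return None
    # The latest update is the one with the most recent timestamp.
    last_message = max(updates, key=lambda u: u[0])[1] if updates else None
    return {"components": concerned, "last_update": last_message}


# Hypothetical page and event.
start = datetime(2024, 1, 1, 9, 0)
updates = [
    (datetime(2024, 1, 1, 9, 5), "We are investigating"),
    (datetime(2024, 1, 1, 9, 40), "Root cause identified"),
]
result = banner(["webserver", "proxy", "redis"], {"redis", "proxy"},
                updates, start, None, datetime(2024, 1, 1, 10, 0))
print(result)
```

A page with none of the affected components gets no banner, even while the event is ongoing.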