{"id":659,"date":"2022-05-31T20:15:27","date_gmt":"2022-05-31T10:15:27","guid":{"rendered":"https:\/\/sysmit.com\/cf22\/?p=659"},"modified":"2023-12-13T15:28:02","modified_gmt":"2023-12-13T05:28:02","slug":"starting-an-sre-team-from-scratch-quick-guide","status":"publish","type":"post","link":"https:\/\/sysmit.com\/cf22\/starting-an-sre-team-from-scratch-quick-guide\/","title":{"rendered":"Building the case for starting a software reliability team"},"content":{"rendered":"\n
This article aims to help engineering leaders think through the key issues before starting a software reliability team. Since I am an advocate for Site Reliability Engineering (SRE), I will refer to such a team as the “SRE team” from here on. <\/p>\n\n\n\n
Besides creating a new team, leaders face many responsibilities that are often invisible<\/strong> to individual contributors and their reporting managers. <\/p>\n\n\n None of this work is ever straightforward. You’re working against continually moving targets that require ongoing assessment and readjustment.<\/p>\n\n\n\n It’s better to be well prepared because the need for a reliability team may come suddenly<\/strong>. <\/p>\n\n\n\n Software reliability may be a low priority for an organization one day and a mission-critical need the next.<\/p>\n\n\n\n This account by Wayne Bridgman of the origin story of BT (British Telecom)’s SRE team describes how the need for reliability can fall into your lap: “I was sitting at my desk when our digital engineering director came over and asked a seemingly casual question, ‘Have you ever heard of SRE?’”. <\/em><\/p>\n\n\n\n That conversation snowballed into a flurry of meetings, which led senior leadership to declare that BT must get into all things reliability through a new SRE team.<\/p>\n\n\n\n Starting and funding an SRE team first involves uncovering the burning platform in the organization. <\/p>\n\n\n\n “But wait, what is a burning platform?”<\/em> In brief, a burning platform is a problem that is both urgent and severe enough to force a strategic change effort. <\/p>\n\n\n\n Magic can happen when this burning platform becomes apparent. Senior leadership gets actively involved, funding appears seemingly out of nowhere, and more. A more thorough explanation of the phrase “burning platform” can be found here<\/a>.<\/p>\n\n\n\nA software reliability leader’s responsibilities include: <\/h2>\n\n\n
\n