Job Description
Define and own the SRE transformation roadmap aligned with business objectives and platform priorities.
Strong understanding of SRE principles (SLO/SLI, error budgets, toil management, observability).
Lead SRE maturity assessments across Observability, Incident Management, Problem Management, Shift Left, and Operational Readiness.
Establish and govern SRE operating models, including engagement with Dev, Ops, Security, and Architecture teams.
Drive adoption of SLOs, SLIs, SLAs, and error budgets across critical services.
Act as the primary interface between engineering teams, service management, leadership, and external partners.
Manage multi-workstream SRE programs, ensuring delivery against scope, timelines, risks, and dependencies.
Prepare executive-level status updates, dashboards, and steering committee communications.
Oversee improvements in incident reduction, MTTR, availability, and resiliency.
Ensure blameless postmortems, root cause analysis (RCA), and action tracking are consistently executed.
Govern automation initiatives (runbooks, self-healing, alert tuning, capacity management).
Track and report on reliability KPIs, toil reduction, and automation ROI.
Partner with platform teams on observability and SRE tooling strategy (monitoring, logging, tracing, APM).
Ensure effective integration of ServiceNow, alerting platforms, and SRE tools.
Support training, enablement, and onboarding of teams into SRE practices and mindset.
Qualifications
10 years in IT delivery, operations, reliability, DevOps, or platform engineering roles.
5 years managing large-scale programs or transformations (SRE, DevOps, ITIL, or Cloud Ops).
Experience in mainframe and other legacy technologies is highly preferred.
Familiarity with distributed systems, cloud platforms, and enterprise application landscapes.
Proven experience driving cross-functional change across engineering and operations teams.
Experience Required: 10 years and above