Senior SRE/Production Reliability Specialist
Avaron AB, Stockholms län, Stockholm
Previous experience is desired
4 days left to apply for the job
At Avaron, you get the security of permanent employment combined with the variety of working on-site at different clients. We recruit specialists in everything from technology, IT, and industry to project management and business support. Regardless of the assignment, you have a consultant manager who supports you and your development.
About the Role
You will play a key role in building a new SRE/Production Reliability capability for modern, business-critical applications and platforms in a complex banking environment. This is not just about operations, but about creating a long-term sustainable operational model with clear responsibilities, ways of working, and governance in an environment with high demands for availability, security, and compliance.
This role suits you if you thrive at the intersection of technology, change management, and operational reality. You will combine strategic oversight with practical execution and help shape how a new team, new ways of working, and a stable production capability function day to day. This is an exciting opportunity if you want to influence structure, delivery, and technical maturity in a business-critical environment.
Responsibilities
- You develop the target vision, establishment plan, and implementation structure for a new SRE/Production Reliability capability.
- You define the team's mission, responsibilities, service boundaries, and interaction with application teams, service desk, operations partners, platform owners, and security functions.
- You build the team, covering roles, competencies, staffing, onboarding, training needs, and readiness models, and act as interim team lead.
- You ensure the team can operate effectively in modern environments with container platforms, OpenShift/Kubernetes, Azure, pipelines, networking, IAM, secrets, storage, and databases.
- You establish ways of working for incident management, alerting, escalation, runbooks, problem management, improvement work, and operational follow-up.
- You introduce a model for production maturity and onboarding of applications to the SRE team, including readiness criteria, documentation, observability, support boundaries, and recovery capabilities.
- You drive the development of monitoring, dashboards, observability, and metrics, preferably using the LGTM stack (Loki, Grafana, Tempo, Mimir) or similar solutions.
- You contribute to reducing operational risk through higher stability, shorter recovery times, fewer recurring incidents, more automation, and clearer governance.
Requirements
- At least 8–10 years of experience in senior roles within IT operations, production operations, platform, cloud, infrastructure, or SRE.
- Documented experience in establishing, transforming, or leading operational capabilities or teams within operations, platform, or SRE.
- Documented experience working in complex, business-critical IT environments with many dependencies and stakeholders.
- Experience in defining or introducing operating models, division of responsibilities, collaboration models, processes, and governance across multiple functions.
- Good understanding of container-based platforms such as Kubernetes and/or OpenShift.
- Good understanding of cloud environments, preferably Azure, as well as platform-adjacent areas such as networking, IAM, secrets, storage, and databases.
- Good understanding of CI/CD, pipelines, automation, and modern delivery flows.
- Experience with incident management, readiness, alerting, problem management, and operational follow-up.
- Experience in establishing or developing monitoring, observability, dashboards, alerts, and metrics.
- Experience working in environments with high demands for security, compliance, documentation, access control, and auditability.
- Experience in change management and implementation in organizations where new ways of working need buy-in from several stakeholders.
- Excellent command of Swedish and English, both spoken and written.
Nice to have
- Experience in banking, finance, insurance, or other regulated industries.
- Experience in establishing or leading a 24/7 organization with on-call or readiness duties.
- Experience in introducing onboarding models or readiness criteria for applications or services to operations or SRE organizations.
- Experience with OpenShift in an enterprise environment.
- Experience with Azure, including private endpoints, networking, key and secrets management, and relevant platform services.
- Experience with LGTM or another modern observability stack.
- Experience in vendor management and collaboration in ecosystems with internal and external operations partners.
- Experience with resilience, continuity, backup and recovery, or disaster recovery (DR).
- Experience in driving automation and toil reduction in operational environments.
- Certifications or training in areas such as SRE, ITIL, Kubernetes, OpenShift, Azure, DevOps, or information security.
Benefits
- Permanent employment at Avaron AB
- Pension plan
- Wellness allowance of 5,000 SEK per year
We review applications on a rolling basis, so please apply as soon as possible.
Tue, 26 May 2026 - 12:00