Rootly

Senior Site Reliability Engineer

Sorry, this job was removed at 06:10 a.m. (MST) on Thursday, Apr 23, 2026
In-Office
Toronto, ON

About Rootly

At Rootly, we are on a mission to be the go-to way companies respond when things go wrong, helping every organization be more reliable. We do this by building an industry-leading incident management platform that allows companies around the world to resolve incidents consistently and quickly. We are not simply transforming an industry; we are carving out an entirely new $B+ segment ourselves, and we need incredible talent to achieve this ambitious goal together.

Customers love Rootly. Some of the fastest-growing companies in the world, such as NVIDIA, Figma, Canva, Tripadvisor, and Squarespace, rely on Rootly to power their critical incident management process. They obsess over our delightful enterprise-ready platform and unique partnership model. See why our customers have reviewed us 5 stars on G2.

Investors love Rootly. We are backed by some of the most respected funds in the world, from Y Combinator to operators like the CTO of Dropbox and GitHub. We'd be happy to disclose our entire funding and profitability picture live during the interview. As a culture, we relentlessly put transparency first. We conduct monthly financial reviews as a team so everyone has a pulse on the health of the business, and we publish what we are building in our weekly changelog.

About the Role

This is an opportunity to join Rootly as an early SRE leader and shape our technical foundation. You will experience the balance of being scrappy and operating at scale. What you’ll be doing one day could look very different the next. You will be empowered to identify opportunities that will help us grow and to own them. In short, this role is designed for individuals who crave ownership and stimulating technical challenges, love shipping fast, and are mission-driven. We won’t sugarcoat it: the work will be challenging, but it will also be one of the most rewarding learning experiences of your career.

  • Embed with product teams to enhance observability, reliability, and performance of their services.
  • Own our CI/CD pipelines, observability tooling, monitoring systems, and incident response processes.
  • Build tools and automation to eliminate manual toil, improve engineering velocity and developer experience, and improve system reliability.
  • Collaborate deeply across engineering to understand systems at the code level and surface cross-cutting reliability, performance, and scaling concerns.
  • Architect and scale our infrastructure, ensuring best-in-class performance, availability, and operational excellence.
  • Drive capacity planning efforts to ensure our infrastructure is resilient and scalable as we grow.
  • Define and manage SLOs and error budgets in partnership with Engineering teams who own production services.
  • Be vocal: act as a strong voice and force for reliability, quality, performance, and scalability.
What You'll Need

Minimum Qualifications
  • 5+ years of experience in an SRE, Platform, or Infrastructure Engineering role.
  • 5+ years of experience writing software in a production environment.
  • Strong technical knowledge of cloud infrastructure, distributed systems, and reliability practices.
  • Strong understanding of observability, performance tuning, and scaling strategies.
  • Deep familiarity with incident response, monitoring, and CI/CD systems.
  • Hands-on experience supporting web or RPC services at meaningful scale.
  • You write code to solve infrastructure problems: not just shell scripts, but production-grade software.
Preferred Qualifications
  • You have a big-picture systems mindset and a proactive approach to reliability.
  • You’ve embedded with product teams and influenced design and architecture decisions.
  • You’re comfortable taking ownership of complex problems—and seeing them through.
  • Experience with Ruby and Go is a plus.
Why Rootly?

We’re not just another startup. We’re building something category-defining and want teammates who crave ownership, love solving hard problems, and thrive in a high-bar, high-impact environment.

Here’s what you can expect when you join Rootly:

  • Competitive compensation and early equity in a fast-growing, venture-backed company.
  • Comprehensive medical, dental, and vision coverage.
  • 3 weeks of vacation, plus unlimited sick and mental health days, and a company-wide end-of-year shutdown to recharge.
  • $500 stipend for home office setup.
  • A fast-moving, high-impact environment where your leadership and ideas directly shape the future of the company.

If this sounds like the kind of challenge and opportunity you’re looking for, apply now and let’s build something great together.

Rootly is an equal opportunity employer. We aim to create an environment where every team member at Rootly feels like they belong so they can have a greater impact on our business and customers. We do not discriminate on the basis of race, religion, color, national origin, gender, sexual orientation, age, marital status, veteran status, or disability status.
