Top Reliability Engineer Jobs in Calgary
SRE with a Delivery: Deployments specialization at GitLab, focusing on improving the delivery platform, release management tooling, and processes. Responsibilities include automating the release process, creating new tools, building new release features, collaborating with teams to implement solutions, and developing monitoring and alerting systems.
Reliability Engineer role at Chelsea Avondale with a focus on maintaining high-performance, scalable, and reliable web systems in AWS environment. Responsibilities include designing, implementing, and maintaining infrastructure, monitoring systems, and collaborating with teams to enhance reliability. Preferred qualifications include a Bachelor's degree in Computer Science, 5+ years of experience in a similar role, proficiency in Python, hands-on experience with AWS, and knowledge of NGINX and Unix/Windows server configuration.
Monitoring, troubleshooting, and enhancing database infrastructure using MySQL and MongoDB. Automating deployments with Terraform, developing dashboards with Grafana, and working on PaaS like AWS RDS. Responsible for documentation and operational runbooks.
This role is responsible for equipment reliability and maintenance strategy of utility and manufacturing equipment through continuous monitoring and implementing improvement actions. The position involves developing engineering solutions to repetitive failures and analyzing equipment trends for effective maintenance programs. Additionally, the individual will serve as a technical lead, collaborate with engineering teams, and ensure compliance with regulations in a dynamic biotechnology environment.
As a Site Reliability Engineer at Behavox, you will be responsible for the deployment and maintenance of high-load and large-scale distributed storage and data processing systems in public clouds. You will also monitor applications, troubleshoot issues, automate operations, and administer Linux systems and networks.
Perplexity is seeking a Site Reliability Engineer (SRE) to lead the design, implementation, and scaling of infrastructure supporting web and mobile products. Responsibilities include designing highly available systems, maintaining databases, scaling web server backends, monitoring systems, and collaborating with engineering teams. Qualifications include strong experience with AWS, database management, troubleshooting skills, containerization, and 4+ years of SRE experience.
The Site Reliability Engineer will be responsible for developing software and solutions focused on observability, incident response, reliability, and performance. This role includes participation in 24x7 Site Reliability rotations, troubleshooting, and collaborating with development teams to ensure operational requirements are met.
Join Affirm as a Senior Staff Software Engineer (Reliability) to champion reliability practices, influence infrastructure teams, drive incident management, and foster technical excellence within the organization. Use your expertise in software development, SRE, infrastructure scaling, k8s, and AWS to enhance operational reliability and support the growth of the Infrastructure team.
Featured Jobs
Babylist is looking for a Staff Software Engineer, Site Reliability to play a vital role in ensuring system stability, scalability, and reliability. The role involves supporting shared infrastructure and developer tools and optimizing systems through site reliability engineering, AWS cloud infrastructure, and modern DevOps practices.
Join GlossGenius as a Senior Site Reliability Engineer responsible for maintaining reliable infrastructure, scaling AWS footprint, driving incident management, and promoting SRE culture. Collaborate with product and engineering teams for optimal performance and scalability. Remote or hybrid work in the US or Canada.
The Weekend Site Reliability Engineer will work on site reliability and security, deployment, configuration, monitoring, and improving infrastructure processes. They will focus on availability, latency, change management, emergency response, and capacity management of services in production.
Sporty's is looking for a LatAm Site Reliability Engineer to focus on site reliability and security, deployment, configuration, monitoring, and managing services in production. Responsibilities include improving infrastructure and deployment processes, monitoring cloud infrastructure, and mentoring team members. Required skills include experience in SRE/DevOps, cloud platforms like AWS, Kubernetes, networking protocols, cybersecurity, cache systems, cloud-native monitoring solutions, and troubleshooting. Bonus skills include working with other cloud platforms, CI/CD workflows, system automation tools, and microservices and service mesh concepts.
Proactively ensure the stability, resilience, and scale of services through automation, testing, and engineering. Collaborate with cross-functional teams to deliver high-quality solutions aligned with technical requirements. Foster a culture of innovation, knowledge sharing, and continuous improvement.
Seeking a Site Reliability Engineer with 2+ years of experience to support internal stakeholders and teams, ensuring system reliability and efficiency. Responsibilities include developing automation, participating in incident response, and collaborating with development teams. Required skills include C#, Java, Go, Python, cloud computing platforms, observability technologies, RDBMS, NoSQL data stores, and containerization technologies.
Seeking a Principal Software Engineer with deep technical expertise in observability, CI/CD, systems architecture, and software engineering. Responsible for guiding technology strategy, driving operational excellence, leading teams, and ensuring delivery of high-quality systems.
As an Engineering Manager for Site Reliability Engineering at Replicant, you will lead a team in scaling a robust cloud infrastructure and ensuring operational excellence for the AI platform. You will mentor and guide team members, oversee on-call processes, manage work prioritization, provide technical leadership, and foster a collaborative remote-working culture.
As a Site Reliability Engineer, your role is to maintain high availability of production and non-production work environments and automate tasks for continuous deployment and continuous integration.
Senior Site Reliability Engineer responsible for designing, building, and maintaining scalable cloud infrastructure at FOSSA. Role involves scaling cloud infrastructure, deploying new services, ensuring platform security, and improving development tools like CI/CD pipelines and monitoring. Strong experience with AWS, Terraform, Helm, Kubernetes, Docker, CI/CD tools, and logging/monitoring tools is required.
Senior Site Reliability Engineer supporting Cloud databases for high-value customers, focusing on reliability, system configuration, and incident response. Requires strong engineering background, SRE experience, communication skills, and expertise with Kubernetes, Helm, and various programming languages.
Design, implement, and maintain observability systems in data centers, ensure uptime of logging and metric systems, collaborate with engineering teams, automate tasks, communicate globally, work in an on-call rotation, and support services across various platforms.
Ground floor opportunity to be the first SRE at Rootly, responsible for supporting critical services, defining SLOs, building tools, and enhancing observability and reliability of services. Requires strong technical knowledge of cloud infrastructure and distributed systems.
Seeking a Site Reliability Engineer to improve platform reliability and uptime, implement automation, and support internal and external customers' needs. Responsibilities include collaborating with engineering teams, maintaining live services, scaling systems sustainably, and implementing SRE best practices.
Join BeyondTrust as a Senior Site Reliability Engineer responsible for designing, developing, and maintaining a reliable microservices-based cloud software solution. Lead the standardization of automated infrastructure and service provisioning and orchestration. Collaborate with engineering teams to deliver high-quality products and services.
Seeking an Engineering Manager to lead the Site Reliability Engineering team at a Contact Center Automation company specializing in AI technology. Responsibilities include scaling cloud infrastructure, ensuring operational excellence, and mentoring a remote engineering team.
Senior Site Reliability / GitOps Engineer at Canonical responsible for driving operations automation, infrastructure as code, and software operation automation. Collaborate with teams, maintain core services, and provide troubleshooting support.
Top Calgary Companies Hiring Reliability Engineers
Popular Job Searches
AI Jobs in Calgary
Automation Engineer Jobs in Calgary
AWS Jobs in Calgary
Azure Jobs in Calgary
Cloud Jobs in Calgary
Database Administrator Jobs in Calgary
DevOps Jobs in Calgary
Engineering Jobs in Calgary
Engineering Manager Jobs in Calgary
Front End Developer Jobs in Calgary
Full Stack Developer Jobs in Calgary
Java Developer Jobs in Calgary
Linux Jobs in Calgary
Machine Learning Jobs in Calgary
.NET Jobs in Calgary
Network Engineer Jobs in Calgary
Project Engineer Jobs in Calgary
Python Developer Jobs in Calgary
Quality Assurance Jobs in Calgary
Quality Engineer Jobs in Calgary
Reliability Engineer Jobs in Calgary
Software Engineer Jobs in Calgary
Software Testing Jobs in Calgary
Web Developer Jobs in Calgary