Top Reliability Engineer Jobs in Calgary
Artificial Intelligence • Fintech • Information Technology • Logistics • Payments • Business Intelligence • Generative AI
Lead design, automation, and maintenance of cloud-based database infrastructure (primarily SQL Server and MySQL). Improve reliability with monitoring, HA/DR, automation, troubleshooting, on-call support, and mentoring of junior engineers while collaborating across teams.
Top Skills:
Aurora, AWS, Bash, Failover Clustering, MySQL, New Relic, Orchestrator, Pmm, Python, Rds, Ruby, SQL Server, Vividcortex
Artificial Intelligence • Big Data • Healthtech • Machine Learning • Analytics • Biotech • Generative AI
The Site Reliability Engineer will manage cloud infrastructure, automate tasks, collaborate in agile teams, and ensure service reliability and quality.
Top Skills:
Aurora Mysql, AWS, Azure, Bash, Docker, GCP, Go, Kubernetes, Postgres, Python, Ruby, Terraform
Artificial Intelligence • Cloud • Consumer Web • Productivity • Software • App development • Data Privacy
The Staff Site Reliability Engineer will develop Dropbox's reliability strategy, enhance operational excellence, and lead cross-team initiatives. Responsibilities include improving monitoring and incident response systems, mentoring engineers, and aligning stakeholders on reliability priorities.
Top Skills:
Ai-Enabled Software Delivery, Debugging Tools, Distributed Systems, Incident Response, Observability
eCommerce • Healthtech • Kids + Family • Retail • Social Media
Own and evolve Babylist's AWS infrastructure, Terraform IaC, Kubernetes/EKS clusters, CI/CD, and observability for a platform serving millions. Lead incident response, improve developer tooling, and set reliability standards across engineering teams.
Top Skills:
AWS, Cdn, CircleCI, Cloud Networking, Cronitor, Datadog, Dns, Eks, Github Actions, Kubernetes, Load Balancers, MySQL, Pagerduty, Rds, Redis, Ruby On Rails, Sentry, Sidekiq, Terraform
Information Technology • Software
The Reliability Engineer will develop and implement reliability test plans for IVD medical devices, conduct analyses, and lead cross-functional projects while ensuring compliance with regulatory standards.
Top Skills:
Jmp, Matlab, Minitab, Python, R
Aerospace
Lead reliability strategy and environmental validation for satellite user terminal hardware. Design and run HALT/HASS/ESS tests, oversee weatherproofing and thermal/vibration testing, perform failure analysis (X-ray, microscopy, cross-sectioning), and own DFMEA and physics-of-failure models to predict MTBF and warranty risk. Collaborate with electrical and mechanical teams and external labs to drive hardware improvements and regulatory compliance.
Top Skills:
Cfd, Cross-Sectioning, Dfmea, Electronic Thermal Cycle Systems, Environmental Chambers, Ess, Fea, Halt, Hass, Ip67, Jmp, Microscopy, Minitab, Power Delivery Network (Pdn) Analysis, Reliasoft, Salt Fog Corrosion Testing, Uv Exposure Testing, Vibration/Shaker Tables, Weibull Life-Data Analysis, X-Ray Inspection
Big Data • Fintech • Mobile • Payments • Financial Services
Design and build a centralized reliability platform for production systems, integrating distributed-systems engineering with AI-assisted tools. Implement AI agents for incident triage, log/trace summarization, and developer-facing APIs. Own projects end-to-end and collaborate with product, infra, data, and SRE teams to iterate and improve system health and debuggability.
Top Skills:
Claude, Cursor, Distributed Systems, Github Copilot, Llms, Python
Blockchain • eCommerce • Fintech • Payments • Software • Financial Services • Cryptocurrency
The Senior Site Reliability Engineer will enhance reliability of Block's platform, improve incident response using AI tools, and coordinate incident management. Responsibilities include building reliable systems, standardizing tools, and leading high-severity incidents during on-call rotations.
Top Skills:
Amazon Web Services, Datadog, DynamoDB, Grpc, HTTP, Istio, Java, JSON, Kotlin, Kubernetes, Launchdarkly, MySQL, Protocol Buffers, Terraform, Vitess
Cloud • Security • Software • Cybersecurity • Automation
As an Intermediate Site Reliability Engineer in Environment Automation, you'll automate operations across many GitLab environments, maintain infrastructure reliability using Kubernetes, and enhance IT practices with Terraform and Ansible, while collaborating with senior engineers.
Top Skills:
Ansible, Cloud Services, Devsecops, Gitlab, Go, Infrastructure As Code, Kubernetes, Terraform
Artificial Intelligence • Consumer Web • Digital Media • Information Technology • Social Impact • Software
Lead SRE work to keep Circle highly available and performant: respond to incidents, own monitoring/alerting/log management, manage and optimize MySQL/Postgres/ClickHouse/Redis databases, maintain server infrastructure and deployment pipelines, collaborate with engineering teams, and build internal SRE tooling and automation.
Top Skills:
AWS, Clickhouse, Kubernetes, Llm-Based Tools (Copilots), MySQL, Postgres, Redis
Information Technology • Software • Database • Automation
Owner of on-prem reliability and escalations: reproduce and resolve L2/L3 issues across heterogeneous Kubernetes environments, build diagnostics and automation, improve CI and e2e test stability, establish performance baselines, harden install/upgrade flows, and write tooling in Python/Go/Rust to reduce repeat incidents.
Top Skills:
Benchmarking, Ci, Ci/Cd, Containers, E2E Testing, Go, Health Checks, Helm, Installers, Integration Testing, Kubernetes, Load Generation, Logs, Metrics, Networking, Observability, Packaging, Profiling, Python, Rbac, Rust, Storage, Support Bundles, Traces
Insurance
As a Reliability Engineer, you'll design, implement, and maintain AWS cloud environments, ensuring systems' reliability and performance, while enhancing monitoring and incident response capabilities.
Top Skills:
AWS, Nginx, Python, Unix, Windows
Artificial Intelligence • Software • Generative AI
The Founding Platform & Reliability Engineer will design and operate reliable, scalable infrastructure for an AI storytelling platform, involving hands-on implementation and strategic decision-making.
Top Skills:
Amplitude, AWS, Cloud Run, Firebase, GCP, Modal, Next.Js, Node.js, Python, React, Redis, Sentry, Typescript, Upstash
Artificial Intelligence • Cloud • Social Impact • Software • Wearables
Senior SRE focused on building cloud-native platforms, testable automation, and reliability tooling. Partner with Identity and Security to strengthen authentication/authorization, Okta integrations, and compliance. Design tests, write maintainable code (Go/Python), and improve observability and operational practices.
Top Skills:
Aks, Apm, AWS, Azure, C#, Ci/Cd, Eks, Go, Iac, Infrastructure As Code, Java, Kubernetes, Logging, Metrics, Observability Tools, Oidc, Okta, Python, SAML, Secrets Management, Tracing
Artificial Intelligence • Fintech • Software • Financial Services
The SRE will own reliability for a cloud-native platform, optimizing performance, availability, and observability, while mentoring engineering teams.
Top Skills:
AWS, Clickhouse, Go, Kafka, Kubernetes, Pulumi, Python, Terraform
Cloud • Software
The Senior Site Reliability / Gitops Engineer will drive automation and collaboration within the IS team, enhancing Canonical's IT operations and services while managing infrastructure as code and cloud technologies.
Top Skills:
Cloud Computing, Docker, Elasticsearch, Gitops, Grafana, Iac, Kubernetes, Linux, Prometheus, Python
Cloud • Software
As a Site Reliability / Gitops Engineer, you will automate operations, develop Infrastructure as Code, maintain core services, and collaborate on service architecture.
Top Skills:
Ci/Cd, Cloud Computing, Elasticsearch, Grafana, Infrastructure As Code, Linux, Prometheus, Python
Cloud • Software
The Site Reliability Engineer will ensure reliable cloud operations by applying Python for infrastructure automation, managing OpenStack and Kubernetes, and practicing DevSecOps in a fast-paced environment.
Top Skills:
Kubernetes, Linux, Openstack, Python
Cloud • Software
The Senior Site Reliability Engineer will automate operations using Python, manage Kubernetes and OpenStack clusters, and ensure high availability for enterprise infrastructures.
Top Skills:
Kubernetes, Linux, Openstack, Python
Artificial Intelligence • Cloud • Information Technology • Software
Design and operate large-scale GPU infrastructure for distributed AI training, ensuring reliability, performance, and efficient customer partnerships.
Top Skills:
Ansible, Cuda, Deepspeed, Fsdp, Gpu, Helm, Infiniband, Kubernetes, Linux, Megatron, Nccl, Nvidia A100, Nvidia B200, Nvidia H100, Nvlink, PyTorch, Roce, Terraform
Artificial Intelligence • Cloud • Information Technology • Software
The Site Reliability Engineer will provision and manage Kubernetes clusters, build automation tools, debug customer issues, and improve infrastructure reliability.
Top Skills:
Ansible, Bash, Datadog, Go, Grafana, Helm, Kubernetes, Loki, Prometheus, Python, Terraform
Information Technology • Software
Maintain and automate large-scale bare-metal infrastructure using MAAS/Ironic and Linux; design networks; automate provisioning (PXE/MAAS/Cloud-init) with Ansible/Bash/Python; deploy observability (Prometheus/Grafana, ELK/Graylog/Loki); integrate APIs and manage virtualization platforms like OpenStack, Proxmox, or VMware.
Top Skills:
Ansible, Bash, Bios, Cloud-Init, Cloudflare Api, Debian, Elk, Git, Grafana, Graylog, Ipmi, Ironic, Kolla-Ansible, L2 Routing, L3 Routing, Linux, Loki, Maas, Openstack, Preseed, Prometheus, Proxmox Ve, Pxe, Python, Raid, Ubuntu, Unifi, Vlan, Vmware Esxi, Vpn
Information Technology • Software
Maintain and automate large-scale bare-metal infrastructure using MAAS/Ironic and Linux. Design networks (VLANs, L2/L3, VPNs), provision servers (PXE/Preseed/Cloud-init), build observability (Prometheus/Grafana, ELK/Graylog/Loki), integrate APIs, and operate virtualization platforms (OpenStack/Proxmox/VMware).
Top Skills:
Ansible, Bash, Bios, Cloud-Init, Cloudflare Api, Container Orchestration, Debian, Elk, Git, Gitops, Grafana, Graylog, Ipmi, Ironic, Kolla-Ansible, L2 Routing, L3 Routing, Linux, Loki, Maas, Openstack, Preseed, Prometheus, Proxmox Ve, Pxe, Python, Raid, Ubuntu, Unifi, Vlans, Vmware Esxi, Vpns
Cloud • Security • Software • Cybersecurity
Lead reliability for global load-balancing infrastructure: build observability pipelines, define SLOs/SLIs, lead incident response, automate deployment safety, review designs, and develop tooling (Python/Go) and IaC to reduce operational toil.
Top Skills:
Ansible, Containerization, Feature Flags, Go, Kubernetes, L4 Load Balancing, L7 Load Balancing, Linux, Nb/Nlb, Netstat, Observability, Python, Saltstack, Tcpdump, Terraform
Artificial Intelligence • Information Technology • Software • Database
As a Site Reliability Engineer, you will design, implement, and maintain scalable infrastructure, ensure system reliability, automate processes, and collaborate with engineering teams.
Top Skills:
Docker, Elk Stack, Go, Grafana, Java, Kubernetes, Node.js, Prometheus, Pulumi, Python, Ruby, Terraform