Hive.co

Senior Reliability Engineer

Posted 7 Days Ago
Remote
Hiring Remotely in Canada
Senior level
As a Senior Reliability Engineer, you will enhance system performance, reliability, and maintainability. You will drive SLO adoption, improve application performance, tackle technical challenges, and lead security initiatives, while collaborating with development and DevOps teams to optimize cloud infrastructure and incident management.
The summary above was generated by AI

Hive is a fast-growing SaaS company offering marketing solutions to live event promoters across North America. Our Engineering Team builds and maintains the systems that empower our customers to do powerful things simply and intuitively. We operate with agility—shipping minimum viable products, deploying multiple times daily, and rapidly iterating based on customer feedback.

At Hive, we handle impressive technical scale: ingesting high-volume data in real-time from 20+ integrations (including Ticketmaster and Eventbrite), storing and querying billions of customer data points, and delivering over 200 million emails and SMS messages monthly to our clients' customers. Our technology stack includes Python, React, Redis, MongoDB, SQL, Elasticsearch, Clickhouse, and various AWS services.

As we continue to scale, we're seeking a Senior Reliability Engineer to join our Reliability Team—the foundation that enables our product and engineering teams to deliver exceptional experiences while maintaining system performance, security, and cost efficiency.

The Role

As a Senior Reliability Engineer at Hive, you'll be part of a team responsible for the performance, reliability, and maintainability of our systems. This role bridges infrastructure, operations, and application engineering to ensure our services are scalable, performant, secure, and cost-effective as we tackle increasingly complex technical challenges.

Tech Stack

AWS, Docker, Kubernetes, Karpenter, Terraform, Python, Django, Redis, MySQL, Clickhouse, MongoDB, Elasticsearch, Datadog, Sentry

What You'll Do

  • Champion system observability improvements through implementation, maintenance, process refinement, and automation for business-critical services

  • Drive SLO adoption and improvement to ensure excellent customer satisfaction across key value streams

  • Enhance application performance at every level, from infrastructure foundations to runtime environments

  • Tackle and resolve complex technical challenges across the entire stack

  • Partner with development teams to design and implement scalable, reliable solutions

  • Lead security and compliance initiatives as integral components of our engineering practice

  • Craft and refine developer tools that boost team productivity and efficiency

  • Develop and implement strategies to optimize cloud infrastructure costs

  • Collaborate with DevOps to maintain and enhance deployment pipelines in our cloud environments

  • Contribute to incident management by defining meaningful metrics, executing against targets, and improving response times and overall system stability

What We're Looking For

  • 7+ years of software engineering experience, with at least 5 years focused on reliability, infrastructure, or platform engineering

  • 3+ years of experience with AWS and proven ability to build effective monitoring, alerting, and observability solutions

  • Track record of implementing, maintaining, and improving SLOs and uptime KPIs for critical services

  • Expert knowledge of Linux, Docker, and distributed systems principles with their real-world applications

  • Solid programming skills in both application and infrastructure languages (Python, Go, etc.)

  • Strong grasp of security best practices and a data-driven approach to enhancing stability and availability

  • Excellent communication skills with the ability to collaborate effectively across teams and explain complex technical concepts clearly

Bonus points if you have...

  • Proven experience scaling complex AWS environments and optimizing performance across the full technology stack during periods of significant growth

  • Experience creating developer platforms and CI/CD pipelines that enhance team productivity

  • Skillful approach to cloud cost optimization and resource management

  • Experience in establishing and improving incident management processes

What We Offer

  • Meaningful salary and equity. You're rewarded based on impact

  • Work fully remote from the comfort of your home in Canada

  • Opportunity to shape reliability practices at a rapidly scaling company

  • Collaborative team of experienced engineers passionate about building reliable systems

  • Flexible work hours with minimal meetings

  • Health & Dental coverage

  • Open vacation/PTO policy so you can be happy and healthy!

  • Generous parental leave top-up with a flexible return-to-work plan

About Hive

Hive.co is a marketing platform for event marketers. We help brands personalize and automate their campaigns, using email and SMS, to empower them to sell out so they can focus on making their events unforgettable.

By integrating with ticketing partners like Ticketmaster and e-commerce partners like Shopify, we enable brands to access and act on all their customer data, so they can easily segment their list in thousands of ways, and send more customized, timely email campaigns that land in inboxes.

We started our company inside a University of Waterloo computer lab in early 2014, graduated from Y Combinator that summer (S14 batch), and have been growing ever since. Originally based in Kitchener, our team is now 100% remote and located all across Canada! We strive to provide an online work environment that allows team members to have a strong work-life balance while still feeling connected to their team and Hive’s mission.

To learn more about our team, check out the About Us page on our website: https://www.hive.co/company

Top Skills

AWS
Clickhouse
Datadog
Django
Docker
Elasticsearch
Karpenter
Kubernetes
MongoDB
MySQL
Python
React
Redis
Sentry
SQL
Terraform
