Grafana Labs

Staff Backend Engineer - Databases Tempo | Canada | Remote

Remote
Hiring Remotely in Canada
Expert/Leader
As a Staff Backend Engineer at Grafana Labs, you'll lead technical initiatives for Tempo's development, ensuring operational excellence and integrating with other products while mentoring engineers and contributing to open source.
The summary above was generated by AI

Grafana Labs, the company behind the open observability cloud, is founded on the principles of open source, open standards, open ecosystems, and open culture. Grafana Cloud, our fully managed observability platform, is flexible and built for scale. With Grafana Cloud's actually useful AI, organizations can see, understand, and act on all their disparate data to move at the speed of their ambitions. Today, more than 35 million users and 7,000+ customers – including Anthropic, Bloomberg, NVIDIA, Microsoft, and Salesforce – trust Grafana Labs to ensure reliability of their applications and systems, resolve incidents quickly, and optimize their telemetry to reduce noise and cost. We are a 100% remote company with 1,600+ team members across 40+ countries, and we’re backed by leading investors including Lightspeed Venture Partners, Sequoia Capital, GIC, Coatue, J.P. Morgan, CapitalG, and Lead Edge Capital. Learn more at grafana.com and follow us on LinkedIn and X.

We’re scaling fast and staying true to what makes us different: an open-source legacy, a global collaborative culture, and a passion for meaningful work. Our team thrives in an innovation-driven environment where transparency, autonomy, and trust fuel everything we do.

You may not meet every requirement, and that’s okay. If this role excites you, we’d love you to raise your hand for what could be a truly career-defining opportunity.

This is a remote position. We are seeking candidates in the US and Canada.

The Opportunity: 

We build Tempo, the open-source distributed tracing backend behind Grafana Cloud Traces and Grafana Enterprise Traces (GET). Tempo makes it easy to search traces, generate metrics from spans, and connect tracing data with logs, metrics, and profiles across the Grafana stack.

2026 is an inflection point for Tempo. After a major architectural upgrade and the launch of TraceQL metrics, we are shifting from foundational work to product and operational excellence, and evolving Tempo from a SaaS database into a platform that powers Grafana’s next generation of observability products (App Observability, Asserts, Traces Drilldown, and AI-driven assistants).

Over the next year, you will help us:

  • Make Grafana Cloud Traces “just work” for customers by eliminating rough edges, confusing limits, and hidden failure modes.
  • Achieve operational excellence at scale as we grow from close to 50 cells today into triple digits this year, with autoscaling, parameterized rollouts, and aggressive toil reduction.
  • Evolve Tempo into a platform enabler: higher-density APIs, trace aggregation, TraceQL metrics math, and machine/LLM-friendly interfaces that downstream products and agents can build on.
  • Push performance further: lower query latency at hundreds of MB/s of ingestion and performant 30-day query ranges to match competitors.
  • Prepare Tempo for an agent-driven world: larger, burstier, higher-cardinality workloads, and new categories of AI-powered workflows, such as assistant-driven triage and “why is this slow?”-style investigations.

What You’ll Be Doing:

As a Staff Engineer on Tempo, you will set technical direction on the hardest problems in our roadmap and raise the bar across the team.

  • Lead multi-quarter technical initiatives from problem framing through rollout, e.g., trace aggregation APIs, Limitless Tempo, autoscaling cells and customer limits, or query engine improvements.
  • Own the architecture of core Tempo components: ingestion, storage, query, and metrics generation. Drive design reviews, make sharp trade-offs on performance, cost, and complexity, and document the “why” for the team.
  • Design APIs for humans and agents. Shape the next generation of Tempo’s interfaces (structured, deterministic, discoverable) so that Act 3 products, LLM-driven assistants, and external integrators can build on Tempo reliably.
  • Drive operational excellence. Own outcomes against concrete SLOs (P99 write latency, incident recurrence, TCO per ingested GB) and push the team toward Zero Ops through automation, parameterized rollouts, and actionable alerts.
  • Partner with Product and sibling teams. Work closely with PMs and with App Observability, Asserts, Drilldown, and Grafana Assistant teams to understand how Tempo gets consumed and to ship what unblocks them.
  • Mentor engineers. Raise the engineering bar through code review, design feedback, pairing on hard problems, and writing that leaves the team smarter than you found it.
  • Participate in on-call for the services you help build, and be a force multiplier in incident response and post-incident learning.
  • Contribute to open source. Tempo is OSS. You will engage the community, review external contributions, and help steer the project in the open.

We invest heavily in developer productivity. You can use modern AI coding assistants as part of your daily workflow (your choice of tools, within security guidelines), backed by a company-funded usage budget so you can iterate quickly without unnecessary friction.

We encourage pragmatic AI-assisted development: faster prototyping, test generation, refactors, documentation, and incident follow-ups—always paired with strong code review and quality standards.

You’ll also have access to frontier models (e.g., GPT-Codex 5/3, Claude Opus 4.6, Gemini 3 Pro).

Example problems you could work on

These are the kinds of projects landing in 2026. Any one of them is a Staff-sized problem:

  • Trace aggregation and higher-density APIs: extend TraceQL metrics, design LLM-friendly response types, and make Tempo a first-class data source for Grafana’s AI assistant.
  • Autoscaling end to end: customer limits and Tempo cells, with hysteresis, predictive scaling for spikes, and safe scale-down.
  • Agent-scale ingestion and query: guardrails for bursty, high-cardinality, agent-generated workloads.
  • Query performance: new data formats, smarter query pipelines, targeted optimizations for common Drilldown and Traces workflows, and 30-day query ranges.
  • Rollouts and multi-cell operations: parameterized rollouts, push-button deploys, and the tooling to grow safely into triple-digit cell counts without a proportional increase in alert noise.
  • Limits and self-service: drive customer-facing configuration and observability so escalations trend toward zero.

What Makes You a Great Fit:
  • Technical leadership. A track record of leading complex, multi-quarter initiatives that spanned design, delivery, and operations, and made the teams around you better.
  • Deep systems experience. Substantial hands-on experience building and operating distributed data systems in production: ingestion pipelines, storage engines, query execution, or similar.
  • Strong software craftsmanship. You write clean, robust, performant software that others can maintain, and you know when to optimize vs. when to ship.
  • Strong Go, or a path to it. We write Tempo in Go. Deep experience in other systems languages (Rust, C, C++) translates well.
  • Operational mindset. You’ve owned production services, carried a pager, reduced toil, and treated SLOs as a product feature, not a chore.
  • Customer focus and pragmatism. You break complex problems into short feedback loops: analyze, design, deliver an MVP, learn, iterate.
  • Leadership through writing and collaboration. You lead through design docs, reviews, and shipped code, not hierarchy. You communicate clearly in a fully remote, asynchronous environment.

Bonus Points For:
  • Experience with tracing, OpenTelemetry, or large-scale observability systems.
  • Experience designing query languages, SQL/TraceQL-like engines, or APIs intended to be consumed programmatically (by services or agents).
  • Experience with columnar storage formats (e.g., Parquet) or purpose-built on-disk formats for analytical workloads.
  • Experience operating multi-tenant, multi-cell SaaS infrastructure at scale on Kubernetes.
  • Experience building for AI/LLM consumers: structured APIs, metadata/discovery endpoints, deterministic outputs, evaluation harnesses.
  • Open-source contribution or maintainership, and comfort engaging a community in the open.
  • Experience as an on-call user of Grafana, Prometheus, Loki, or Tempo in a previous role (or on a homelab).
  • Experience in a fully remote, globally distributed team.

How we work

We are a remote-first team that meets regularly over video and does most of its work asynchronously, in writing. We value creativity, diverse perspectives, and clear communication. Tempo is relied upon by prominent global organizations to monitor critical applications and infrastructure, and we expect everyone on the team, including our Staff engineers, to contribute ideas that make it a more reliable, more useful, and more loved product.

In Canada, the compensation range for this role is $186,368 - $223,642 CAD. Actual compensation may vary based on level, experience, and skillset as assessed throughout the interview process. All of our roles include Restricted Stock Units (RSUs), giving every team member ownership in Grafana Labs' success. We believe in shared outcomes: RSUs help us stay aligned and invested as we scale globally.


*Compensation ranges are country-specific. If you are applying for this role from a location other than the one listed above, your recruiter will discuss your specific market’s defined pay range and benefits at the beginning of the process.

Why You’ll Thrive at Grafana Labs:

  • 100% Remote, Global Culture – As a remote-only company, we bring together talent from around the world, united by a culture of collaboration and shared purpose.
  • Scaling Organization – Tackle meaningful work in a high-growth, ever-evolving environment.
  • Transparent Communication – Expect open decision-making and regular company-wide updates.
  • Innovation-Driven – Autonomy and support to ship great work and try new things.
  • Open Source Roots – Built on community-driven values that shape how we work.
  • Empowered Teams – High trust, low ego culture that values outcomes over optics.
  • Career Growth Pathways – Defined opportunities to grow and develop your career.
  • Approachable Leadership – Transparent execs who are involved, visible, and human.
  • Passionate People – Join a team of smart, supportive folks who care deeply about what they do.
  • In-Person Onboarding – We want you to thrive from day 1, so you will join your fellow new ‘Grafanistas’ in person to learn all about what we do and how we do it.
  • Balance is Key – We operate a global annual leave policy of 30 days per annum. Three days of your annual leave entitlement are reserved for Grafana Shutdown Days to allow the team to really disconnect. *We will comply with local legislation where applicable.

Equal Opportunity Employer: Grafana Labs is an equal opportunities employer. We welcome applications from everyone regardless of race, colour, nationality, origin, caste, sex, gender reassignment identity or expression, sexual orientation, age, religion or belief, disability, veteran status, genetic information, pregnancy, maternity, marital, family or carer status, or any other characteristic which is protected by local law. We believe that equality and diversity build a strong organisation, and we work hard to ensure that is the foundation of our organisation as we grow.

Grafana Labs may utilize AI tools in its recruitment process to assist in matching information provided in CVs to job postings. The recruitment team will continue to review inbound CVs manually to identify alignment with current openings.

For information about how your personal data is used once you’ve applied to a job, check out our privacy policy. 