Design and implement SkyPilot's commercial multicloud platform: architect control and data plane separation, tenant/user management, scaling, monitoring, and alerting. Build production-grade cloud-native platform services and APIs using Go, Kubernetes, gRPC, PostgreSQL, and Terraform, prioritizing reliability, security, and great user experience.
SkyPilot is building the future of multicloud AI infra. We are the Berkeley founding team commercializing SkyPilot (9.5K+ GitHub stars, 200+ contributors) to enable AI to run on different cloud infrastructures in a portable, cost-optimizing, and highly available way.
SkyPilot is deployed at 100s of companies, including Fortune 500s and top AI-natives (Shopify, Redis, Abridge, Hippocratic, Applied Compute, etc.). In 2025, adoption grew by more than 600%, and SkyPilot now launches more GPUs per month than the biggest neocloud’s fleet. Currently in stealth, SkyPilot was founded in 2024 by UC Berkeley PhDs and professors (incl. Databricks cofounders). We’re building a top-tier engineering team, with current talent from Databricks, Google, Crusoe, ByteDance, and PingCAP.
What You’ll Do
You’ll play an instrumental role in designing and implementing SkyPilot’s commercial cloud platform, which will power a reimagined multicloud AI experience:
- Architect SkyPilot’s commercial cloud platform from the ground up: Control plane and data plane separation, tenant/user management, control plane scaling, monitoring, alerting.
- Build core, production-grade platform services: Design and implement APIs and services in a cloud-native stack (e.g., Go, Kubernetes, microservices), balancing reliability, security, and simplicity.
Ideal Candidates
You are a seasoned engineer with experience building SaaS/cloud platforms from zero to one.
- 6+ years of experience building SaaS platforms at startups: You have taken SaaS platforms from inception to launch to scale, and you are intimately familiar with the best-in-class tools and vendors a SaaS platform needs.
- SaaS platform expertise: You have hands-on experience building user and organization management, authentication and RBAC, API gateway, usage metering and billing integration, CI/CD pipelines, and other core platform services, using technologies like gRPC, Go, Kubernetes, PostgreSQL, and Terraform.
- Great product taste: You believe great products must deliver both a solid platform foundation and a great user experience.
What We Offer
- Competitive equity, compensation, and health benefits.
- Chance to work with some of the best minds in cloud, distributed, and AI systems, with significant autonomy and ownership.
- Front-row seat at the latest open-source infra startup from Berkeley.