For over 25 years, NVIDIA has been at the forefront of transforming computer graphics, PC gaming, and accelerated computing, driven by a legacy of continuous innovation and exceptional talent. We are now leveraging the immense potential of AI to usher in the next era of computing, where our GPUs power the "brains" of computers, robots, and autonomous vehicles that can comprehend the world. This pioneering work demands vision, innovation, and the world's best talent. Join our diverse and supportive environment, where NVIDIANs are inspired to excel and make a profound global impact.
We're hiring a Senior Staff Software Engineer to own the engineering efforts across NVIDIA enterprise systems. You'll partner with IT leadership to transform reactive support into strategic, AI-infused automated resolution systems and prevent problems before they occur, balancing speed, security, and an exceptional user experience for NVIDIANs.
What you'll be doing:
Design and implement agentic AI workflows using LLM-based agents, tool calling, RAG patterns, and orchestration frameworks. Push the boundaries of what AI-assisted operations can achieve.
Build robust integrations and automation pipelines across ServiceNow, identity management, monitoring platforms, and enterprise SaaS. Own the full stack from infrastructure to user-facing tools.
Triage and resolve enterprise issues, with a focus on automation and on improving mitigation and resolution times
Manage and troubleshoot enterprise-scale collaboration, productivity, AI, and infrastructure systems.
Trace and root-cause complex, multi-system failures, identify patterns in recurring tickets, and build automation or self-service solutions
Build and maintain runbooks, troubleshooting guides, and knowledge base articles that elevate team capabilities
Mentor team members on troubleshooting methodology and systems thinking
What we need to see:
Bachelor’s or Master’s degree in Computer Science, Engineering, IT, or related field (or equivalent experience)
12+ years of overall experience in SRE, enterprise support, or DevOps
Experience with SaaS, hybrid cloud, and AI/ML environments
Experience building production-grade agentic workflows (e.g., multi-agent systems and MCP servers)
Software engineering fundamentals with deep experience in building products and operating large-scale systems.
Expertise in two or more backend languages such as Go, Python, or Java with a track record of owning complex production systems.
Full-stack engineering experience, including building user-facing web applications and operational dashboards using modern frontend frameworks such as React.js, along with backend APIs and data pipelines.
Systems thinker who naturally traces dependencies, considers second-order effects, and asks "why did this break?" not just "how do I fix it?"
Strong incident management skills: triage, root-cause analysis, blameless postmortems, pattern recognition
Expert troubleshooting across the enterprise hybrid stack, such as Jira, Microsoft, OS (Apple, Linux, and Windows), and infrastructure systems such as compute, AI, and storage.
NVIDIA is widely considered to be one of the technology world’s most desirable employers. We have some of the most forward-thinking and hardworking people in the world working for us. If you're creative, results-oriented and enjoy learning while having fun, then what are you waiting for? Apply today!
Your base salary will be determined based on your location, experience, and the pay of employees in similar positions. The base salary range is 200,000 CAD - 250,000 CAD. You will also be eligible for equity and benefits.
This posting is for an existing vacancy.
NVIDIA uses AI tools in its recruiting processes.


