
WitnessAI

ML Engineer - Infrastructure

Posted 8 Days Ago
7 Locations
Mid level
The ML Infrastructure Engineer will develop scalable GPU infrastructures, optimize ML workflows, and collaborate with teams for model deployment and efficiency.

ML Engineer (Infrastructure)

Location: San Francisco Bay Area (Hybrid)

About the Role:

WitnessAI is a leader in providing innovative networking solutions designed to enhance security, performance, and reliability for businesses of all sizes. We are seeking an ML Infrastructure Engineer to optimize, deploy, and scale machine learning models in production environments. You will play a critical role in scaling GPU resources, building continuous-learning pipelines, and integrating a variety of inference frameworks. Your expertise in model quantization, pruning, and other optimization techniques will ensure our models run efficiently and effectively.

You will contribute to our mission through the following:

  • Develop and Optimize: Design and manage scalable GPU infrastructures for model training and inference. Build automated pipelines that accelerate ML workflows, implement feedback loops for continuous learning, and enhance model efficiency in resource-constrained environments.

  • Implement Advanced Inference Solutions: Evaluate and integrate inference platforms like NVIDIA Triton and vLLM to ensure high availability, scalability, and reliability of deployed models.

  • Collaborate for Impact: Work closely with applied scientists, software engineers, and DevOps professionals to deploy models that drive our company's mission forward. Document best practices to support team knowledge sharing and improve code quality and reproducibility.

The ideal candidate will have expertise in designing, developing, and maintaining scalable ML infrastructure components, including data pipelines and deployment systems. You should have a demonstrated track record of optimizing ML workflows for performance and resource utilization, and you should stay current with best practices for model management and reproducibility. Strong communication skills and the ability to collaborate across functions to execute complex projects are essential.

Qualifications

  • Bachelor's or Master's degree in Computer Science, Engineering, or a related field.

  • 2+ years of experience building and scaling machine learning systems.

  • Proven experience in scaling GPU resources for machine learning applications.

  • Experience with inference platforms like NVIDIA Triton, vLLM, or similar.

  • Demonstrated expertise in model quantization, pruning, and other optimization techniques with frameworks such as TensorRT, ONNX, or similar.

  • Skilled in automating data collection, preprocessing, model retraining, and deployment.

  • Proficient with cloud platforms such as AWS (preferred), GCP, or Azure, especially in deploying and managing GPU instances.

  • Strong skills in Python; familiarity with other scripting languages is a plus.

  • Experience with CUDA packages.

  • Experience with PyTorch, TensorFlow, or similar frameworks.

  • Proficient in Docker and Kubernetes.

  • Experience with Jenkins, GitHub CI/CD, or similar tools.

  • Experience with Prometheus, Grafana, or similar monitoring solutions.

Soft Skills

  • Strong problem-solving and analytical abilities.

  • Excellent communication and teamwork skills.

  • Ability to work independently and manage multiple tasks effectively.

  • Proactive attitude toward learning and adopting new technologies.

Benefits:

  • Hybrid work environment.

  • Competitive salary.

  • Health, dental, and vision insurance.

  • 401(k) plan.

  • Opportunities for professional development and growth.

  • Generous vacation policy.

Salary range:

$140,000-$170,000

Top Skills

AWS
Azure
CUDA
Docker
GCP
GitHub CI/CD
Grafana
Jenkins
Kubernetes
NVIDIA Triton
ONNX
Prometheus
Python
PyTorch
TensorFlow
TensorRT
vLLM

