Lead Software Engineer - AI/ML, AWS Neuron, Multimodal Inference Job in Seattle


Company Size: 5k+

Company Type: Services

Exp Level: Senior

Job Type: Full-Time

Language: English

Visa sponsorship

Requirements

Must:

- A minimum of 5 years of professional software development experience, excluding internships.
- A Bachelor’s degree in Computer Science or a related field.
- At least 5 years of experience in designing or architecting new and existing systems focusing on design patterns, reliability, and scaling.
- Solid understanding of machine learning fundamentals and large language models (LLMs), including architecture, training, and inference lifecycles, along with practical experience in executing optimizations for enhanced model performance.
- Proficient in software development using C++ and Python (expertise in at least one language is essential).
- Strong grasp of system performance, memory management, and principles of parallel computing.
- Proficient in debugging, profiling, and applying best software engineering practices in large-scale environments.

Technologies

AWS

Architect

Backbone

CUDA

GitHub

Hardware

Support

LLM

Machine Learning

PyTorch

Responsibilities

- Lead efforts to build distributed inference support for PyTorch within the Neuron SDK.
- Optimize models to achieve peak performance and efficiency when running on customer AWS Trainium and Inferentia hardware.
- Design, develop, and enhance machine learning models and frameworks for deployment on specialized ML hardware accelerators.
- Engage in all phases of the ML system development lifecycle including designing distributed computing architectures, implementation, performance analysis, hardware-specific optimizations, testing, and production deployment.
- Construct infrastructure for systematic analysis and onboarding of diverse model architectures.
- Create and implement high-performance kernels and features for ML operations, using the Neuron architecture and programming models.
- Analyze system-level performance across different generations of Neuron hardware.
- Execute detailed performance analysis utilizing profiling tools to pinpoint and rectify bottlenecks.
- Implement optimizations like fusion, sharding, tiling, and scheduling.
- Conduct thorough testing, including unit and end-to-end model testing with continuous deployment through pipelines.
- Collaborate directly with customers to optimize their ML models on AWS accelerators.
- Work with teams to develop innovative optimization techniques.

Description

As part of the Inference Enablement and Acceleration team, we cultivate a culture of innovation where experimentation is encouraged and measurable impact is our focus. We prioritize collaboration, technical ownership, and mutual learning. Our team spans various experience levels, and we foster an environment that supports knowledge-sharing and mentorship. Our senior engineers provide one-on-one mentorship and constructive code reviews, and we support your professional growth by assigning projects that enhance your engineering capabilities.

In this role, you will collaborate with a diverse team of applied scientists, systems engineers, and product managers to push the boundaries of inference capabilities for Generative AI applications. Your day-to-day work will involve troubleshooting performance issues, optimizing memory usage, and shaping the evolution of Neuron's inference stack across both Amazon and the Open Source Community. You will design and code solutions that make the software architecture more efficient, create metrics, implement automation, and resolve software defects. Additionally, you will deliver impactful solutions to our expansive customer base, engage in design discussions, conduct code reviews, and communicate with internal and external stakeholders.

In a startup-like environment, you will always work on the most critical initiatives, driving business decisions supported by your technical input. Join us in tackling significant and intriguing infrastructure challenges within the AI/ML domain.

