Title: ESI GPU/ML/DL Performance Test Engineer
Location: Austin, TX 78753
Duration: 9 months
Primary Focus:
Devices: Graphics Processing Unit (GPGPU/GPU), Intelligence Processing Unit (IPU)
Architectures: Deep Learning (DL), Machine Learning (ML)
Frameworks: TensorFlow, Caffe, MXNet, PyTorch, Caffe2, Uber Horovod
Models: ResNet50, Inception V3, GoogLeNet
Platform/API: NVIDIA CUDA, NVIDIA NCCL Library
- Mandatory: skills and experience in Deep Learning, Machine Learning, GPUs, and IPUs.
Skills, Experience, and Tools:
- Strong experience analyzing requirements, functional specifications, design specifications, and other technical documentation, and using them to develop test strategies and plans.
- Experience installing, debugging, and executing open source software frameworks in a Linux environment (a focus on GPUs, machine learning, and deep learning is ideal).
- Ability to develop Linux / BASH shell scripts and Perl and Python packages to support testing and deployment activities in different environments (a focus on GPUs, machine learning, and deep learning is ideal).
- Understanding of Server/Storage products, including devices and technologies such as Server BIOS and BMC, Dell iDRAC, PCIe Switches, NVMe and SAS/SATA drives, GPGPUs, PDUs, Serial Switches, etc., and of deployment / management tools and capabilities (a focus on GPUs, machine learning, and deep learning is ideal). Knowledge of DMTF’s Redfish is a plus.
- Good understanding of Linux and Windows environments, and of the applicability of Linux- and Windows-based solutions on enterprise-class dense and traditional server platforms.
- Ability to work with the development teams to triage and root-cause firmware issues.
- Strong experience in testing embedded server control and management systems.
- Experience with FPGA and/or CPLD architecture and coding, and with related system architectural design and implementation, is a plus.
- Working understanding of PCIe topologies and system impacts of PCIe device interaction.
- Working knowledge of the following standards: SAS/SATA, PCIe, DDR3/4, I2C, NC-SI, Ethernet.
- Strong ability to identify potential product weaknesses and to provide design guidance and recommendations for improvements in subsequent redesigns.
- Ability to work independently under aggressive timelines.
Generally, 8-10 years of overall engineering experience.