Phizenix

Posted 2 months ago

Open

Manager - Software Engineering Kernels

Bengaluru, On-site, Full-time

AI Summary

Leads development and optimization of software kernels for AI hardware, maps algorithms to hardware architectures, and collaborates with compiler, ML, and hardware teams to deliver scalable software on tight timelines.

About this role

What you will do:

You will be part of the team that productizes the software stack for our AI compute engine. As a member of the software team, you will be responsible for developing, enhancing, and maintaining software kernels for next-generation AI hardware. You have experience building software kernels for hardware architectures, a very strong understanding of various hardware architectures, and the ability to map algorithms to them. You understand how to map computational graphs generated by AI frameworks to the underlying architecture. You have worked across all aspects of the full-stack toolchain and understand the nuances of optimizing and trading off various aspects of hardware-software co-design. You can build and scale software deliverables in a tight development window. You will work with a team of compiler experts to build out the compiler infrastructure, collaborating closely with other software (ML, systems) and hardware (mixed signal, DSP, CPU) experts in the company.

What you will bring:

Minimum:
MS or PhD in Computer Engineering, Math, Physics, or a related field, with 10+ years of industry experience.
Strong grasp of computer architecture, data structures, system software, and machine learning fundamentals.
Proficient in C/C++ and Python development in a Linux environment, using standard development tools.
Experience implementing algorithms in high-level languages such as C/C++ and Python.
Experience implementing algorithms for specialized hardware such as FPGAs, DSPs, GPUs, and AI accelerators, using libraries such as CUDA.
Experience implementing operators commonly used in ML workloads: GEMMs, convolutions, BLAS, and SIMD operators for operations like softmax, layer normalization, and pooling.
Experience developing for embedded SIMD vector processors such as Tensilica.
Self-motivated team player with a strong sense of ownership and leadership.

Preferred:
Prior startup, small team, or incubation experience.
Experience with ML frameworks such as TensorFlow and/or PyTorch.
Experience working with ML compilers and algorithms, such as MLIR, LLVM, TVM, Glow, etc.
Experience with a deep learning framework (such as PyTorch or TensorFlow) and ML models for CV, NLP, or recommendation.
Work experience at a cloud provider or AI comp.

Skills

AI Accelerators, BLAS, C++, Compiler Infrastructure, Convolutions, CUDA, DSPs, Embedded SIMD Vector Processors, FFT / Linear Algebra Routines, FPGAs, GEMM, Glow, GPUs, Layer Normalization, Linux Development, LLVM, ML Frameworks, MLIR, ML Models for CV/NLP/Recommendation, Pooling, Python, PyTorch, SIMD, Softmax, Tensilica, TensorFlow, TVM
