
Job Title


GPU Optimization Engineer


Company : BitOoda


Location : Bengaluru, Karnataka


Created : 2025-01-09


Job Type : Full Time


Job Description

Experience Level : Senior

About Us : We are an innovative company at the forefront of high-performance computing (HPC) and AI, building cutting-edge solutions powered by GPUs and specialized accelerators. We’re looking for a highly skilled GPU Optimization Engineer to design, develop, and optimize software running directly on bare-metal systems, leveraging the full potential of NVIDIA GPUs, AMD GPUs, and other accelerators.

Responsibilities :
- Architect, develop, and optimize high-performance software for GPU-accelerated systems.
- Design and implement software that directly interacts with GPU hardware, including NVIDIA CUDA, AMD ROCm, or OpenCL.
- Optimize existing ML/DL frameworks (e.g., PyTorch, TensorFlow) for maximum performance on NVIDIA and AMD GPUs.
- Work with heterogeneous systems, integrating specialized accelerators like AMD GPUs or custom chips (e.g., SambaNova, Cerebras).
- Conduct profiling and tuning to maximize GPU utilization, minimize bottlenecks, and achieve peak system performance.
- Collaborate with hardware engineers to exploit features like NVLink, NVSwitch, and RDMA for seamless GPU interconnectivity.
- Develop scalable compute pipelines and contribute to performance benchmarking.

Requirements :
- Education : Bachelor’s or Master’s degree in Computer Science, Engineering, or a related field (or equivalent experience).
- Experience : 5+ years in GPU-accelerated software development, HPC, or related domains.

Technical Skills :
- GPU Programming : Proficiency in CUDA for NVIDIA GPUs. Experience with AMD ROCm, HIP, or OpenCL for AMD GPUs.
- Frameworks : In-depth knowledge of PyTorch or TensorFlow with GPU optimization. Familiarity with ONNX and hardware-agnostic ML frameworks.
- Languages : Proficient in Python, C++, and/or C for performance-critical applications.
- Profiling & Optimization : Hands-on experience with GPU profilers such as NVIDIA Nsight, nvprof, and rocprof. Ability to identify and mitigate bottlenecks in GPU pipelines.
- System Knowledge : Strong understanding of bare-metal systems, GPU drivers, and OS-level configuration. Experience with containerized GPU environments (NVIDIA Docker, AMD ROCm containers).

Preferred Skills :
- Experience with multiple accelerator platforms, such as SambaNova or Graphcore.
- Familiarity with distributed computing and interconnects like NVLink, InfiniBand, or PCIe.
- Knowledge of compiler optimization (LLVM, TVM, XLA).
- Familiarity with Kubernetes for GPU cluster management.

What We Offer :
- Competitive salary and benefits package.
- Opportunity to work on cutting-edge HPC and AI systems.
- Collaborative and innovative work environment with talented engineers.
- Flexible work arrangements (remote or hybrid options).

Join us and help redefine the limits of GPU computing!