TL;DR

Senior Software Engineer (GPU Inference Optimization): Building and optimizing high-performance software for large-scale GPU inference of language models, with an emphasis on low-level performance tuning and integration with novel AI hardware. Focus on designing robust software in C/C++ and Python, identifying bottlenecks, and implementing kernel-level improvements for AI models in search advertising.

Location: Onsite in Beijing, China

Company

Microsoft AI focuses on building an online advertising ecosystem and intelligent systems using web-scale data to drive user satisfaction and advertiser ROI.

What you will do

  • Design, develop, and maintain high-performance software for GPU inference of language models.
  • Optimize model inference and training pipelines for speed, throughput, and memory efficiency.
  • Collaborate with platform teams to integrate and tune solutions on emerging accelerator stacks.
  • Profile workloads, identify bottlenecks, and implement kernel-level and system-level performance improvements.
  • Partner with stakeholders to translate requirements into scalable performance features.
  • Validate performance, stability, and correctness through benchmarking and testing.

Requirements

  • 4+ years of technical engineering experience with coding in C, C++, Python, CUDA, or ROCm.
  • 3+ years of practical experience optimizing GPU performance for applications.
  • Practical experience writing new GPU kernels.
  • Cross-team collaboration skills and a desire to work in a team of researchers and developers.
  • Bachelor’s Degree in Computer Science or related technical field.

Nice to have

  • Master’s Degree in Computer Science or related technical field with 2+ years of experience.
  • Experience in low-level performance analysis using GPU profiling tools such as NVIDIA Visual Profiler or Nsight Compute.
  • Familiarity with inference optimization frameworks such as TensorRT-LLM, SGLang, or vLLM.
  • Exposure to deep neural network inference and experience with PyTorch, TensorFlow, or ONNX Runtime.

Culture & Benefits

  • Committed to cultivating an inclusive work environment.
  • Values of respect, integrity, and accountability to create a culture of inclusion.
  • Growth mindset, innovation, and collaboration to empower others.
  • Microsoft is an equal opportunity employer.
  • Assistance with religious accommodations and/or reasonable accommodations due to a disability.