JOB DETAILS

GPU Communication - Team Lead

Company: DriveNets
Location: Tel Aviv
Work Mode: Remote
Posted: May 17, 2026
About The Company
DriveNets is a leader in high-scale networking software for AI infrastructure and service providers. The company pioneered a disaggregated networking architecture that transforms the economics of large-scale networks while maximizing performance, utilization, and operational efficiency. DriveNets-powered networks are deployed by global leaders, including AT&T and Comcast, supporting more than 30% of total U.S. internet traffic. DriveNets AI Fabric delivers full-stack networking for AI infrastructures, providing the highest-performance, Ethernet-based alternative to InfiniBand. The solution is deployed by hyperscalers, NeoClouds, and enterprises worldwide.

Founded by Ido Susan and Hillel Kobrinsky, two successful telco entrepreneurs, DriveNets offers Network Cloud, the leading open disaggregated networking solution based on cloud-native software running over standard white boxes. Over three funding rounds, DriveNets raised $587 million. Its solutions are used by tens of service providers globally and are in proof-of-concept and lab trials at dozens of operators and hyperscalers, consistently ranking #1 in trials for breadth of capabilities and solution quality. AT&T, operator of the largest backbone in the US, deployed DriveNets Network Cloud across its core network, and DriveNets is currently transporting more than 52% of AT&T’s core network traffic. DriveNets is engaged with over 100 Tier-1 operators and cloud providers on large projects in North America, Asia, and Europe.
About the Role

Location: Tel Aviv

#Hybrid

DriveNets is a leader in high-scale disaggregated networking solutions. Founded in 2015, DriveNets modernizes the way service providers, cloud providers, and hyperscalers build networks. DriveNets supports the largest network in the world: more than half of AT&T’s backbone traffic runs on DriveNets’ Network Cloud open disaggregated architecture. Having raised $587 million in three funding rounds, DriveNets is disrupting the networking market, from high-scale architecture to AI platforms, and is bringing on board the most talented people. We are seeking people who want to make an impact on the world’s leading communication networks and who are experienced in networking architecture or AI infrastructure solutions.

Job Summary

We are seeking an experienced technical leader to head our collective communication library development team. This role involves leading a team of engineers in developing high-performance collective communication implementations for multi-NPU and multi-node AI workloads.

Key Responsibilities

  • Lead the design and development of collective communication primitives (All-Reduce, All-to-All, Gather/Scatter, etc.)
  • Architect scalable communication protocols for multi-NPU and multi-node systems
  • Optimize communication performance for NPU architectures
  • Provide technical leadership to team members in NPU programming, distributed systems, and communication protocols
  • Work with a success-driven international team (Network, NPU, QA, AI, DL/ML Framework)
  • Define project milestones, deliverables, and technical roadmaps
  • Ensure compatibility with major AI frameworks (PyTorch, TensorFlow, JAX) 

Requirements

Required Qualifications

  • BSc/MSc in computer science/computer engineering or equivalent
  • 8+ years of experience in systems programming and distributed computing
  • 5+ years of leadership experience managing technical teams
  • Expert-level C/C++ programming with focus on performance optimization
  • Experience with NPU programming (Triton / CUDA / HIP / OpenCL)
  • Deep understanding of distributed systems, communication protocols, and network programming
  • Experience with DL/ML frameworks (PyTorch, TensorFlow) and distributed training and inference
  • Experience with performance profiling and optimization tools
  • Strong communication and interpersonal skills

Preferred Qualifications 

  • Experience with NPU communication library development
  • Contributions to open-source projects (PyTorch, TensorFlow, communication libraries)
  • Familiarity with containerization and orchestration
  • Interoperability experience with partners, vendors and external teams


Key Skills
C/C++ Programming, NPU Programming, Distributed Systems, Communication Protocols, Network Programming, Deep Learning, Machine Learning, Performance Optimization, Technical Leadership, AI Frameworks, Project Management, Interpersonal Skills, Containerization, Orchestration, Open-Source Contributions, Performance Profiling
Categories
Technology, Engineering, Management & Leadership, Data & Analytics, Software
Job Information
Core Responsibilities: Lead the design and development of collective communication primitives and architect scalable communication protocols for multi-NPU and multi-node systems. Provide technical leadership to team members and ensure compatibility with major AI frameworks.
Job Type: Full-time
Experience Level: 10+
Company Size: 544
Visa Sponsorship: No
Language: English
Working Hours: 40 hours