Platform Engineer – AI Supercompute Infrastructure (Networking & Systems) | Cloud & Engineering - 19400

General Information

Position

Platform Engineer – AI Supercompute Infrastructure (Networking & Systems) | Cloud & Engineering

Work arrangement

Full-time

City

Gdańsk, Katowice, Kraków, Łódź, Poznań, Rzeszów, Szczecin, Warsaw, Wrocław

Country

Poland

Department

Consulting

Team

Engineering, AI & Data

Area of interest

AI - Artificial Intelligence, Cloud, Consulting

Way of work

Remote

Description & Requirements

Who we are looking for

We are a technology consulting firm building and operating next-generation AI supercompute infrastructure for the world's most ambitious organizations. As AI Platform Architect and Engineer, you will work hands-on across the full platform stack, including infrastructure and AI platform software. Your particular focus will be architecting and implementing the physical and logical platform layers that let large-scale GPU clusters perform at their theoretical limits and serve the AI platform to its users.

As a repeatedly awarded NVIDIA Consulting Partner of the Year in EMEA, we hold one of the deepest and most recognized NVIDIA partnerships in the region. This gives our engineers privileged access to adoption programmes and NVIDIA's engineering teams.

You will work with technology and at a scale that most engineers won't encounter for years.

This is a role for someone at a mid-career stage in platform and infrastructure architecture and engineering. You have solid foundations and real hands-on experience, you can drive architecture and client discussions and then set up the AI platform, and you are ready to level up by working on problems of genuine complexity and scale. You know enough to know what you don't know yet, and you are hungry to close that gap fast.

What We Expect

· 5–8 years of hands-on experience in systems architecture and engineering

· Understanding of overall systems architecture and platforms, from infrastructure through operating systems and containerization to monitoring

· Ability to drive or participate in architecture discussions internally and with clients, and to close knowledge gaps quickly where needed

· Familiarity with Linux networking: interfaces, bridges, bonding, namespaces, tc/qdisc, and kernel network tuning

· Basic hands-on experience with Kubernetes or Slurm: enough to navigate cluster operations, understand pod scheduling, and troubleshoot node-level issues

· Knowledge of and experience with AI platforms and MLOps

· Experience with at least one monitoring stack: Prometheus, Grafana, Zabbix, or similar

· Experience with network automation and IaC

· Comfort working directly with physical hardware: servers, switches, cabling, and data centre environments

· Understanding of networking fundamentals: OSI model, switching and routing (BGP, OSPF), VLANs, MTU, and traffic engineering

· Working knowledge of high-performance networking technologies: InfiniBand, RDMA, RoCE, or equivalent HPC interconnects

· EU Work Permit

Bonus Experience

· Exposure to NVIDIA products: hardware and AI platform

· Familiarity with NCCL tuning, collective communication patterns, or distributed training networking requirements

· Hands-on time with DCGM, iperf3, perftest, or ibdiagnet for infrastructure benchmarking and validation

· Exposure to container networking

· Any experience in a consulting or client-facing technical role

Your future role

Technical leadership. In this role you will oversee the full project lifecycle, starting with technical advisory: leading customer discussions and mapping business needs onto technical platform capabilities. This is followed by the technical part of strategy, solution deployment, and optimization and maintenance services.
Pre-sales work. Drive architectural discussions about platform capabilities and physical and logical requirements, right-size the solution, drive the design, and supervise physical installations.
Physical platform configuration. You will not be pulling cables or mounting servers yourself, but you will own the correctness of it. You will define standards, review configurations, oversee data centre teams and third-party contractors performing physical installation, and be accountable for the outcome: every cluster, every server, every port, every cable, every label.
Logical platform configuration. You will design and implement the full environment, from virtualization, automation, OS, and networking up to the container level, including performance tests.
Network Fabric Management. Configure and operate InfiniBand (IB subnet manager, QoS policies, adaptive routing) and RoCEv2 fabrics purpose-built for GPU-to-GPU communication and distributed training.
Network Performance Optimisation. Profile and tune network throughput, latency, and congestion for AI workloads; work with NCCL, GPUDirect RDMA, and high-bandwidth interconnects including NVLink and NVSwitch.
Cluster Platform Operations. Support deployment, day-2 operations, and troubleshooting of Kubernetes and Slurm clusters; contribute to OS-level configuration, driver management, and node lifecycle automation.
Monitoring & Observability. Instrument network and cluster health using Prometheus, Grafana, and DCGM Exporter; build dashboards that surface GPU utilisation, link errors, and fabric saturation, with rigor and clear documentation.

Most platform engineers spend years in environments where the network is someone else's problem. Here, it is front and centre because at GPU cluster scale, the network is the performance. You will configure fabrics that move terabits per second between hundreds of GPUs, diagnose issues that block multi-million-dollar training runs, and build the intuition for distributed systems that takes most engineers a decade to develop. Backed by our standing as NVIDIA's top EMEA partner, you will learn directly from NVIDIA's hardware and software teams and grow faster than you could in almost any other environment in the industry.

What we offer

Flexible working hours,
Permanent employment or contract,
Medical and health insurance,
Multisport and other lifestyle benefits,
Language courses,
Friendly coworkers & team spirit,
Multiple geographies and clients,
Work for well-known brands,
Exposure to trailblazing business and technology projects,
A place on the front line of digital transformation,
Everyday opportunities to influence how and where we do our business,
A development path to fit your needs.

Recruitment process

We kindly ask you to upload your CV in English.

Shortlisted candidates will be contacted about the interview process.

If your CV matches what we are looking for, the process has a few steps:

* Quick HR call or meeting;

* One or two HR and technical interviews with our colleagues from the respective team;

* Final decision.

About Deloitte

https://www.deloitte.com/pl/pl/careers/deloitte-life/culture-book.html

Deloitte means a diversity of people, experiences, industries, and services that we deliver in 150 countries around the world. It means intellectual challenges, a good career start, opportunities for continuous development, and the chance to gain valuable life experience. You need to take the first step: put a full stop at the end of the CV you send, and then at the end of the contract you sign. Deloitte is simply a good choice. Full stop.

About the team

Our Cloud Engineering teams design and deliver interesting cloud projects for clients in Poland and abroad in areas such as cloud development, DevOps, integration, migration, data management, and infrastructure. We help our clients strategize, design, implement, and migrate solutions using modern cloud technologies.
