- Role: AI Solution Architect
- Location: 100% Remote
- Contract Type: Employment contract
- Salary: up to 31 000/month
About the company:
We’re hiring for a company that delivers cutting-edge cloud, networking, and on-premises data center solutions, with a strong focus on ultra-low-latency performance, live streaming, and high-performance infrastructure.
Your responsibilities:
- Design and deploy large-scale GPU clusters (both on-prem and cloud-based) using Ansible, Terraform, Kubernetes, and Slurm
- Lead technical workshops, discovery sessions, and presentations for clients and internal teams
- Develop and maintain Infrastructure-as-Code modules for automating provisioning and monitoring of GPU-based resources
- Create technical content such as white papers, runbooks, webinars, and training materials
- Collaborate with engineering and product teams to refine solutions based on customer feedback and operational needs
What we’re looking for:
- At least 3 years of experience in DevOps or GPU-based AI infrastructure
- Proven track record of deploying multi-node, multi-GPU clusters at scale
- Hands-on experience with infrastructure automation tools such as Ansible and Terraform
- Solid knowledge of Kubernetes and Slurm for orchestrating and scheduling GPU workloads
- Programming experience in Python or Go
- Familiarity with the ML ecosystem — models, tools, and production-grade pipelines
- Strong communication skills and the ability to work effectively with cross-functional teams
Nice to have:
- Experience building scalable, high-availability inference infrastructure
- Hands-on involvement with MLOps pipelines (MLflow, REST APIs, PyTorch, TensorFlow, JAX)
- Ability to move ML pipelines from proof-of-concept to production environments
- Knowledge of GitOps, Docker, Helm, and CI/CD processes in the ML context
- Familiarity with tools like Hugging Face Transformers, scikit-learn, and experiment-tracking best practices