- Role: AI Solution Architect
- Location: 100% Remote
- Contract Type: Employment contract
- Salary: up to 31 000/month
About the company:
We’re hiring for a company that delivers cutting-edge cloud, networking, and on-premises data center solutions, with a strong focus on ultra-low-latency performance, live streaming, and high-performance infrastructure.
Your responsibilities:
- Design and deploy large-scale GPU clusters (both on-prem and cloud-based) using Ansible, Terraform, Kubernetes, and Slurm
- Lead technical workshops, discovery sessions, and presentations for clients and internal teams
- Develop and maintain Infrastructure-as-Code modules for automating provisioning and monitoring of GPU-based resources
- Create technical content such as white papers, runbooks, webinars, and training materials
- Collaborate with engineering and product teams to refine solutions based on customer feedback and operational needs
What we’re looking for:
- At least 3 years of experience in DevOps or GPU-based AI infrastructure
- Proven track record of deploying multi-node, multi-GPU clusters at scale
- Hands-on experience with infrastructure automation tools such as Ansible and Terraform
- Solid knowledge of Kubernetes and Slurm for orchestrating and scheduling GPU workloads
- Programming experience in Python or Go
- Familiarity with the ML ecosystem — models, tools, and production-grade pipelines
- Strong communication skills and the ability to work effectively with cross-functional teams
Nice to have:
- Experience building scalable, high-availability inference infrastructure
- Hands-on involvement with MLOps pipelines (MLflow, REST APIs, PyTorch, TensorFlow, JAX)
- Ability to move ML pipelines from proof-of-concept to production environments
- Knowledge of GitOps, Docker, Helm, and CI/CD processes in the ML context
- Familiarity with tools like Hugging Face Transformers, scikit-learn, and experiment-tracking best practices