AI Ops Engineer
Company: Insight42
Location: Remote (within Germany)
Employment Type: Full-time
About Insight42
Insight42 is a pioneering data and AI technology company that builds intelligent automation frameworks, secure AI infrastructure, and data platforms for next-generation digital products. Our mission is to merge cloud engineering, MLOps, and generative AI capabilities into resilient, scalable, and efficient systems.
Role Overview
We are seeking an experienced AI Ops Engineer to design and manage the infrastructure powering Insight42’s GenAI and large language model (LLM) workloads. This role involves architecting self-hosted GenAI environments, optimizing GPU/CPU orchestration, and automating model deployment across edge and cloud systems.
Responsibilities
- Design and maintain self-hosted GenAI model deployments in secure cloud environments
- Build and automate workflows for AI model lifecycle management (training, tuning, and versioning)
- Implement observability and reliability solutions for AI workloads
- Collaborate with MLOps, DevOps, and Data teams to align infrastructure with model requirements
- Integrate models into agentic frameworks and autonomous system ecosystems
- Optimize infrastructure for performance, cost efficiency, and scalability
- Establish CI/CD and GitOps practices for AI and ML system delivery
Requirements
- 4+ years of experience in DevOps, MLOps, or AI infrastructure engineering
- Hands-on experience deploying self-hosted GenAI models (LLMs, multimodal, or diffusion models)
- Strong understanding of containerization, microservices, and orchestration (Docker, Kubernetes)
- Familiarity with agentic ecosystems, AI orchestration frameworks, and vector databases
- Knowledge of GPU orchestration, distributed inference, or on-prem AI serving solutions
- Solid programming and automation skills (Python, Bash, or Go)
- Cloud experience with Azure, GCP, or hybrid infrastructure setups
- Fluent communication in English; German is a plus
Nice to Have
- Experience with open-source AI serving and orchestration tools (Ollama, vLLM, FastAPI, LangChain, Haystack)
- Understanding of AI observability and monitoring tools
- Exposure to model compression, quantization, and inference optimization
- Awareness of data privacy, security, and compliance in AI systems
What We Offer
- Fully remote work flexibility (within Germany)
- Competitive salary and performance-based incentives
- Opportunity to build and operate AI infrastructure at scale
- Collaborative, forward-thinking work culture driven by innovation