AI Ops Engineer
Company: Insight42
Location: Remote (within Germany)
Employment Type: Full-time
About Insight42
Insight42 is a pioneering data and AI technology company that builds intelligent automation frameworks, secure AI infrastructure, and data platforms for next-generation digital products. Our mission is to merge cloud engineering, MLOps, and generative AI capabilities into resilient, scalable, and efficient systems.
Role Overview
We are seeking an experienced AI Ops Engineer to design and manage the infrastructure powering Insight42’s GenAI and large language model (LLM) workloads. This role involves architecting self-hosted GenAI environments, optimizing GPU/CPU orchestration, and automating model deployment across edge and cloud systems.
Responsibilities
- Design and maintain self-hosted GenAI model deployments in secure cloud environments
- Build and automate workflows for AI model lifecycle management (training, tuning, and versioning)
- Implement observability and reliability solutions for AI workloads
- Collaborate with MLOps, DevOps, and Data teams to align infrastructure with model requirements
- Integrate models into agentic frameworks and autonomous system ecosystems
- Optimize infrastructure for performance, cost efficiency, and scalability
- Establish CI/CD and GitOps practices for AI and ML system delivery
Requirements
- 4+ years of experience in DevOps, MLOps, or AI infrastructure engineering
- Hands-on experience deploying self-hosted GenAI models (LLMs, multimodal, or diffusion models)
- Strong understanding of containerization, microservices, and orchestration (Docker, Kubernetes)
- Familiarity with agentic ecosystems, AI orchestration frameworks, and vector databases
- Knowledge of GPU orchestration, distributed inference, or on-prem AI serving solutions
- Solid programming and automation skills (Python, Bash, or Go)
- Cloud experience with Azure, GCP, or hybrid infrastructure setups
- Fluent communication in English; German is a plus
Nice to Have
- Experience with open-source AI serving and orchestration tools (Ollama, vLLM, FastAPI, LangChain, Haystack)
- Understanding of AI observability and monitoring tools
- Exposure to model compression, quantization, and inference optimization
- Awareness of data privacy, security, and compliance in AI systems
What We Offer
- Fully remote work flexibility (within Germany)
- Competitive salary and performance-based incentives
- Opportunity to build and operate AI infrastructure at scale
- Collaborative, forward-thinking work culture driven by innovation