* Build and maintain CI/CD pipelines that support fast, reliable integration and deployment of GenAI systems, such as AI agents and MCP clients/servers, across complex environments.
* Develop and deploy AI solutions and agentic frameworks, creating both command-line tools and interactive notebooks that support our team’s research and automation needs.
* Build and manage data pipelines and databases (SQL) for large-scale data handling, including collecting and curating datasets (e.g. adversarial prompts for model red-teaming) to support AI training and evaluation.
* Integrate research into production by implementing the latest generative AI advancements (including white-box LLM development) and developing evaluation frameworks (for example, agentic reasoning tests) as part of our automation workflow.
About ActiveFence:
ActiveFence is the leading provider of security and safety solutions for online experiences, safeguarding more than 3 billion users, top foundation models, and the world’s largest enterprises and tech platforms every day. As a trusted ally to major technology firms and Fortune 500 brands that build user-generated and GenAI products, ActiveFence empowers security, AI, and policy teams with low-latency Real-Time Guardrails and a continuous Red Teaming program that pressure-tests systems with adversarial prompts and emerging threat techniques. Powered by deep threat intelligence, unmatched harmful-content detection, and coverage of 117+ languages, ActiveFence enables organizations to deliver engaging and trustworthy experiences at global scale while operating safely and responsibly across all threat landscapes.
Must
* B.Sc. in Computer Science, Computer Engineering, or a related field (or equivalent hands-on experience).
* 5+ years of experience managing large-scale, high-performance cloud infrastructure in AWS in production environments.
* Expertise in scripting and automation using Python and shell, with a proven ability to write clean infrastructure-as-code (IaC) and CI/CD pipelines.
* Strong understanding of Linux systems, networking, and distributed system design.
* Ability to break down monolithic systems into scalable, loosely coupled services and microservices.
* Strong cross-functional communication and collaboration skills, with a DevOps mindset to drive best practices across teams.
* Hands-on experience deploying and scaling generative AI models (e.g. large language models or other AI/ML models) in production.
* Hands-on experience deploying MCP clients/servers.
Nice-to-Have
* M.Sc. in Computer Science, Computer Engineering, or a related field.
* Experience with large-scale cluster management or HPC scheduling tools (e.g. Slurm).
* Knowledge of advanced observability and monitoring tools such as Prometheus and Grafana.
* Experience with cloud cost optimization (FinOps concepts) and exposure to infrastructure security tools or configuration management for compliance.
* Hands-on experience working with the A2A protocol.