AI Quality and Evaluation Manager

Company

Lorien

Location

London

Employment Hours

Full Time

Employment Type

Contract

Salary

Job Requirements/Description

AI Quality & Evaluation Manager (Contract)

Location: Hybrid working - Blackfriars 3 days per week
Contract: 6 months, Outside IR35

Are you passionate about building the future of AI quality? Do you thrive in hands-on roles where you can shape frameworks from the ground up and make a real impact? We're looking for an experienced AI Quality & Evaluation Manager to join our team on a contract basis and lay the foundations for robust, reliable, and user-focused AI services across our business.

What You'll Do

Design and implement a comprehensive AI testing and evaluation framework for all AI solutions, including LLM-based tools, RAG systems, and third-party platforms.
Define and document quality standards for semantic accuracy, factual consistency, bias, tone, and relevance.
Develop reusable testing templates, data sets, and evaluation methods that can be scaled and maintained by internal teams.
Run hands-on testing of AI prototypes and production tools to assess technical performance and business value.
Collaborate with business users to guide practical testing and feedback processes.
Deliver training and upskilling materials to empower internal staff to sustain the framework after your contract ends.
Support vendor evaluations and POC assessments with robust test protocols.
Establish baseline metrics and dashboards to measure ongoing AI quality and relevance.
Work closely with engineering and product leads to embed testing into delivery workflows.
Champion responsible AI practices to ensure fairness, transparency, and user trust.

What You'll Bring

Strong hands-on experience in testing and evaluation of AI or software systems, ideally with NLP or LLM-based applications.
Understanding of prompt evaluation, semantic search, and LLM behaviour (accuracy, hallucination, bias, tone, etc.).
Familiarity with tools like TruLens, Humanloop, PromptLayer, or similar; experience designing QA approaches for GenAI environments.
Knowledge of modern AI architectures (RAG pipelines, embeddings, API integrations such as OpenAI, Azure OpenAI, Anthropic).
Experience designing and implementing structured test regimes in fast-evolving contexts.
Excellent communication and facilitation skills, engaging both technical and business audiences.
Proven ability to create sustainable frameworks, documentation, and training materials.

Who You Are

A builder who loves creating practical, scalable solutions.
Hands-on and analytical, balancing experimentation with process.
Collaborative and empathetic, bridging technical and non-technical teams.
User-focused, driven by delivering real value.
Committed to responsible AI, fairness, and transparency.

Ready to shape the future of AI quality with us? Apply now and help us ensure our AI-enabled services are accurate, consistent, and trusted by all.

Carbon60, Lorien & SRG - The Impellam Group STEM Portfolio are acting as an Employment Business in relation to this vacancy.
