TL;DR

AI QA Engineer: Develop and execute AI model evaluation strategies and implement automated and manual testing for LLM-based applications, with an emphasis on detecting biases and hallucinations. Focus on optimizing model performance and ensuring high-quality AI outputs through debugging and root cause analysis.

Location: Hybrid in Armenia

What you will do

Develop and execute AI model evaluation pipelines, ensuring accuracy, consistency, and fairness.
Implement automated and manual testing for LLM-based applications.
Work closely with AI engineers to debug failures, identify root causes, and optimize model performance.
Collaborate with AI engineers to integrate testing into early-stage development.
Build and manage test datasets, ensuring high-quality, diverse, and balanced samples.
Develop synthetic data pipelines to enhance model evaluation.

Requirements

Experience with AI/ML testing frameworks and LLM evaluation methodologies.
Strong understanding of LLM behaviors, biases, failure modes, and edge cases.
Proficiency in Python and familiarity with Python testing frameworks (e.g., pytest, unittest).
Experience with test dataset management and annotation tools.
Familiarity with synthetic data generation and adversarial testing techniques.
Strong problem-solving and debugging skills to analyze AI failures and inconsistencies.
English: B2 level required, with the ability to evaluate AI-generated text and improve prompts.

Culture & Benefits

Krisp is an Equal Opportunity Employer.
All applicants are considered regardless of race, color, religion, national origin, age, sex, marital status, ancestry, physical or mental disability, veteran status, gender identity, or sexual orientation.
No tolerance for discrimination or harassment of any kind.
All employees and contractors treat each other with respect and empathy.

Hiring process

Apply through the provided form.
Only shortlisted candidates will be contacted for the next stages.
