Manual QA Engineer

About Alivo

At Alivo, we’re transforming how businesses connect with their customers. Using cutting-edge AI, we help companies deliver instant, intelligent, and personalized communication that drives growth and customer loyalty. Our mission is simple: empower businesses to thrive by making every interaction unforgettable.

Who We’re Looking For

We have an automated test suite that runs on every change. What we need is a sharp, methodical human to own the two things automation can't cover: manual and exploratory testing of the product, and the quality of our AI's conversations. A core part of our product runs on phone and chat communication, so judging whether those interactions are actually good (natural, accurate, effective, on-brand) sits at the heart of this role, alongside making sure features work and releases are safe. You'll be the human quality gate before releases ship, and the person who owns the question "is our AI good enough to go out?"

What You'll Do

  • Manually test new features and UI changes across the web app before release: exploratory testing, edge cases, and the question "does this actually work for a real user?"
  • Verify behavior in staging / cloud environments, not just locally.
  • Write and follow test charters and checklists so each release gets a systematic, repeatable manual pass.
  • Review PRs and the accompanying demo videos, and verify the changes hold up.
  • File clear, reproducible bug reports — precise steps, expected vs. actual, screenshots/recordings.
  • Partner with developers on peer testing and help tighten the team's manual QA process over time.
  • Evaluate the phone and chat conversations the AI handles: are they natural, accurate, and effective, and does the AI handle objections and edge cases the way a strong human would?
  • Build and maintain evaluation sets: curated conversations and scoring rubrics that let us measure AI quality consistently and catch regressions when prompts or models change.
  • Red-team the AI — probe for failure modes, off-script behavior, hallucinations, and awkward or incorrect responses.
  • Track AI quality over time and feed clear, specific findings back to the team to improve prompts, behavior, and model choices.
  • Help define what "good" means for our conversations — the bar, the rubric, the acceptance criteria for shipping.

Qualifications

  • QA / software testing experience, or a strong, demonstrable aptitude for methodical, detail-oriented testing.
  • Sharp attention to detail and a habit of thinking about how things break.
  • Excellent English, written and spoken — you'll be evaluating real phone and chat conversations, so near-native fluency and an ear for natural, correct, effective language are essential.
  • Sales or customer-facing communication experience — you can tell when a conversation flows well, handles objections, and lands the way a strong rep would.
  • Good judgment about AI/LLM behavior — you can articulate why a response is good or bad, not just that it feels off.
  • Clear, unambiguous written communication: bug reports and quality findings that leave no guesswork (especially important for remote collaboration).
  • Reliable, organized, and self-directed.
  • Hands-on experience evaluating LLM outputs or building evals (rubrics, golden datasets, LLM-as-judge, regression testing).
  • Familiarity with test management / bug tracking tools (e.g. Linear, Jira, TestRail).
  • Ability to read CI output and understand what the automated suite already covers, so manual effort goes where it's actually needed.
  • Basic API testing (e.g. Postman), light SQL, or scripting.
  • Exposure to building or shipping AI / LLM products.

Interested?

To apply, email your resume to jobs@alivo.ai.