About The Job
Mercor connects elite creative and technical talent with leading AI research labs. We are headquartered in San Francisco, and our investors include Benchmark, General Catalyst, Peter Thiel, Adam D'Angelo, Larry Summers, and Jack Dorsey.
Position: Grafana Evaluation Task Designer
Type: Contract
Compensation: $90–$150/hour
Commitment: 10–15 hours/week
Role Responsibilities
- Design realistic, multi-step Grafana workflows, including dashboards, alerting rules, and data source configuration.
- Perform each workflow on a hosted Grafana instance to produce a reference trajectory.
- Write clear, specific task prompts with measurable outcomes for programmatic verification.
- Implement programmatic graders to check task completion accuracy.
- Review AI agent attempts, identify failures, and tag root causes.
- Calibrate task difficulty to ensure challenges are solvable, iterating on prompts based on model performance.
Qualifications
Must-Have
- 2+ years of daily, professional Grafana experience.
- Deep familiarity with PromQL, dashboard templating, alerting pipelines, and data source configuration.
- Ability to describe workflows precisely enough that their outcomes can be verified programmatically.
- Comfort writing basic grading scripts in Python.
Preferred
- Experience with Grafana API automation.
- Kubernetes/infrastructure monitoring background.
- Familiarity with AI evaluation or benchmarking.
Application Process (takes 20–30 minutes to complete)
- Upload resume
- AI interview based on your resume
- Submit form
Resources & Support
- For details about the interview process and the platform, see: https://talent.docs.mercor.com/welcome
- For any help or support, reach out to: [email protected]
PS: Our team reviews applications daily. Please complete your AI interview and application steps to be considered for this opportunity.