We are building a benchmark dataset to evaluate AI models on professional document understanding and instruction following within the Education domain.
Tasks are complex, multi-step requests that draw on real-world workspace files (technical drawings, project specifications, engineering reports), web search, and code execution. Each task is paired with a clearly defined ground-truth output and an objective evaluation rubric. You will be responsible for authoring tasks that test an AI's ability to interpret engineering documentation, follow multi-step instructions, and produce precise, well-structured outputs.
We expect a minimum commitment of 15–20 hours per week.
Ideal candidates have 3+ years of hands-on experience in one or more of the following sub-domains:
We consider all qualified applicants without regard to legally protected characteristics and provide reasonable accommodations upon request.
Mercor partners with leading AI labs and enterprises to train frontier models using human expertise. You will work on projects that focus on training and enhancing AI systems. You will be paid competitively, collaborate with leading researchers, and help shape the next generation of AI systems in your area of expertise.