Staff Machine Learning Engineer

Worldwide Salaried Open

Job Description

Staff ML Engineer: Agentic AI
Team: AI Agents | Location: Melbourne / Sydney / Remote (AU)

What we have built

We run production AI agents that autonomously resolve customer service tickets across 100,000+ Zendesk accounts. These agents take a customer issue, plan a multi-step resolution, execute real actions (refunds, order modifications, escalations) through live APIs, and close the ticket without a human in the loop.

The agent core uses a proprietary iterative architecture: the agent decomposes goals into plans, pulls reusable skills from a registry, executes, evaluates the outcome, and refines. Each iteration feeds back into the next attempt. We have a working self-learning mechanism where successful resolution patterns are synthesized into new skills and fed back into the registry, so the system improves from its own execution history.

On multi-step tool-use benchmarks (GAIA-class), our agents perform at parity with the best published results. Our internal evaluation suite runs 158+ scenario-based tests from real Zendesk tickets, scored continuously through Braintrust with regression detection on every deploy.

What we need help with

Pushing the architecture further. The iterative planner works, but there are open questions we have not solved yet: how to handle plan decomposition when the goal is ambiguous, how to manage interference between memory tiers under concurrent sessions, how to make skill acquisition more selective (the agent acquires skills too eagerly today), and how to design multi-agent delegation patterns where one agent hands off subtasks to specialized agents via A2A (the Agent-to-Agent protocol).

Domain-specialized agent models. We are building toward training our own models, specialized for customer service resolution via RL on production trajectories. The data pipeline is already being instrumented (resolution outcomes, escalation patterns, user satisfaction signals). The next step is the RL training infrastructure itself: reward curricula, rollout systems, and the feedback loops that turn a capable base model into a specialist that matches or beats frontier models on our task distribution at significantly lower inference cost. This is a 6-12 month build, and we need someone who can own both the science and the systems.

Hardening evaluation. We run 158+ scenario evals continuously with regression detection, but multi-turn evaluation and automated trajectory analysis (pinpointing where reasoning diverged) are still early. We need quality gates that block deploys when agent performance drops, and we need them integrated into CI, not run as an afterthought.

Guardrails at enterprise scale. The threat surface for autonomous agents includes tool misuse, cascading action chains, prompt injection, and hallucination loops that burn tokens before anyone notices. We need multi-layered defenses with supervisor patterns, capabilities-based access control, and output validation that works across thousands of concurrent sessions without adding meaningful latency.

What we are looking for

5+ years building production ML/AI systems, with hands-on experience in agent architectures (planning, tool dispatch, memory, error recovery). If you have only used LangChain tutorials, this is not the right fit.

Strong evaluation instincts. You understand why public benchmarks diverge from production performance and you have built internal evals to close that gap.

OPTIONAL: Experience with or genuine depth in RL for language models: reward shaping, online/offline tradeoffs, reward hacking as a diagnostic signal. We are building toward domain-specialized training and need someone who can lead that work.

Python and PyTorch fluency.

Familiarity with at least one agent framework, combined with the judgment to know when to build custom.
The intelligent heart of customer experience

Zendesk software was built to bring a sense of calm to the chaotic world of customer service. Today we power billions of conversations with brands you know and love.

Zendesk believes in offering our people a fulfilling and inclusive experience. Our hybrid way of working enables us to purposefully come together in person, at one of our many Zendesk offices around the world, to connect, collaborate and learn, whilst also giving our people the flexibility to work remotely for part of the week.

As part of our commitment to fairness and transparency, we inform all applicants that artificial intelligence (AI) or automated decision systems may be used to screen or evaluate applications for this position, in accordance with Company guidelines and applicable law.

Zendesk is an equal opportunity employer, and we’re proud of our ongoing efforts to foster global diversity, equity, & inclusion in the workplace. Individuals seeking employment and employees at Zendesk are considered without regard to race, color, religion, national origin, age, sex, gender, gender identity, gender expression, sexual orientation, marital status, medical condition, ancestry, disability, military or veteran status, or any other characteristic protected by applicable law. We are an AA/EEO/Veterans/Disabled employer. If you are based in the United States and would like more information about your EEO rights under the law, please click here.

Zendesk endeavors to make reasonable accommodations for applicants with disabilities and disabled veterans pursuant to applicable federal and state law. If you are an individual with a disability and require a reasonable accommodation to submit this application, complete any pre-employment testing, or otherwise participate in the employee selection process, please send an e-mail to [email protected] with your specific accommodation request.
