2026-03-17
Partnering with Product: How R&D Leaders Can Shape Customer-Centric AI Roadmaps
There's a pattern I've seen play out across the industry: the CEO reads about AI, tells the CTO to "add AI to the product," and engineering gets a vague mandate to build something impressive. Six months later, there's a chatbot nobody uses, a recommendation engine that recommends the wrong things, and a frustrated product team wondering why the AI features don't move the metrics they care about.
The failure isn't technical. The models work. The infrastructure scales. The engineers are talented. The failure is alignment—a fundamental disconnect between what R&D builds and what customers actually need.
At PageUp, I partner with our CTO and SVP Engineering to own product roadmaps across the Strategic AI Tribe, aligning cross-functional R&D and Product teams to deliver AI features for enterprise clients operating across 190+ countries. This work—from our recruiter co-pilot Paige to transparent skill matching and resume intelligence—has taught me that the R&D-Product partnership is the single most important factor in whether AI features succeed or fail. Here's how we approach it.
The R&D-Product Alignment Challenge
The core tension in AI product development is that R&D and Product often optimise for different things. Product teams think in terms of user problems, customer value, and business metrics. R&D teams think in terms of model capabilities, technical feasibility, and system architecture. Neither perspective is wrong—both are essential. But without deliberate alignment, they pull in different directions.
Nearly 90% of organisations now use AI regularly, yet fewer than 20% have successfully scaled beyond pilot projects. The gap isn't technology—it's execution. And execution failures almost always trace back to misalignment between what's technically possible, what's commercially valuable, and what's operationally sustainable.
Common misalignment patterns:
- Technology-first thinking: R&D builds what's technically interesting rather than what solves the highest-value customer problem
- Demo-driven development: Features optimised for impressive demos rather than production value, leading to the "AI pilot purgatory" that traps so many organisations
- Metric mismatch: R&D measures model accuracy while Product measures user engagement, and neither connects directly to business outcomes
- Scope creep through ambition: AI capabilities expand during development because the technology makes it possible, not because the customer needs it
- Cost blindness: Features are prioritised without understanding the infrastructure cost of running them at scale—leading to situations where successful adoption actually hurts the business
Gartner has warned that over 40% of agentic AI projects will be scrapped by the end of 2027, citing escalating costs, unclear business value, and inadequate risk controls. The scoping discipline that avoids that fate has to come from tight R&D-Product partnership.
Three-Horizon AI Roadmapping
Traditional product roadmaps don't work well for AI features because AI development has different uncertainty characteristics than conventional software. A standard feature might be uncertain in scope but predictable in quality—you know what it will do once built, even if you're unsure how long it will take. AI features are uncertain in both scope and quality—you might not know whether the feature will meet quality thresholds until you've invested significant effort.
We use a three-horizon model that organises AI work by confidence level rather than fixed timelines:
Horizon 1 (0-6 weeks): Committed features. These have proven technical approaches, available training data, defined quality thresholds, and clear customer demand. The AI capability has been validated through prototyping or prior work. We commit to delivering these with high confidence. Example: extending our resume summarisation to support additional document formats based on our existing, validated summarisation pipeline.
Horizon 2 (6 weeks to 3 months): Planned features with documented risks. The technical approach is promising but not fully validated. There are identified risks—data quality uncertainties, model performance questions, or integration challenges—with documented mitigation plans. We plan for these but communicate the uncertainty. Example: building interview question generation from job descriptions, where the core capability works but calibrating quality across diverse industries requires iteration.
Horizon 3 (3-6 months): Strategic exploration. These represent research directions, not committed features. The outcome is insight and validated learning, not a shipped product. We invest in these to build understanding and reduce uncertainty for future commitments. Example: exploring multi-agent workflows for end-to-end recruitment automation, where the technology direction is promising but production readiness is months away.
This framework gives Product the predictability they need for customer commitments while giving R&D room for an honest assessment of technical uncertainty.
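To make the horizon assignment concrete, here's a minimal sketch of how the confidence signals could drive classification. The field names and rules are illustrative assumptions for this post, not our production schema; the point is that the horizon is computed from evidence, not asserted from ambition.

```python
from dataclasses import dataclass
from enum import Enum


class Horizon(Enum):
    H1_COMMITTED = "0-6 weeks"
    H2_PLANNED = "6 weeks to 3 months"
    H3_EXPLORATION = "3-6 months"


@dataclass
class AIFeature:
    name: str
    approach_validated: bool          # proven via prototyping or prior work
    training_data_available: bool
    quality_threshold_defined: bool
    customer_demand_confirmed: bool
    risks_documented: bool            # known risks with mitigation plans


def classify_horizon(f: AIFeature) -> Horizon:
    """Derive the horizon from confidence signals rather than asserting it."""
    committed = all([f.approach_validated, f.training_data_available,
                     f.quality_threshold_defined, f.customer_demand_confirmed])
    if committed:
        return Horizon.H1_COMMITTED
    if f.risks_documented and f.quality_threshold_defined:
        return Horizon.H2_PLANNED
    return Horizon.H3_EXPLORATION
```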
Translating Business Outcomes to Technical Plans
The most important habit in R&D-Product partnership is starting with business outcomes, not technical capabilities. When Product says "we need AI-powered candidate matching," the first question shouldn't be "which model should we use?" It should be "what business outcome are we trying to drive, and how will we measure success?"
We use a four-layer execution framework that creates clear line-of-sight from business strategy to engineering work:
Layer 1: Business outcomes. Define winning in business terms. Not "build AI skill matching" but "reduce time-to-shortlist by 40% for enterprise customers." The business outcome is what the customer pays for and what our sales team sells.
Layer 2: OKRs. Connect strategy to executable quarterly goals. The business outcome translates to objectives with measurable key results that R&D and Product co-own. This shared ownership is critical—if Product owns the objective and R&D owns the key results, or vice versa, the partnership fractures.
Layer 3: Product and engineering plans. Translate business outcomes into feasible technical roadmaps. This is where R&D expertise shapes what's possible, proposes alternatives, and identifies risks. The technical plan serves the business outcome, not the other way around.
Layer 4: Metrics and accountability. Establish clear ownership and progress tracking that connects engineering delivery to business impact. Every AI feature should have metrics that answer: "Is this delivering the value we promised?"
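For readers who think in code, here's a hypothetical sketch of what that line-of-sight can look like as a data model. Every name here is invented for illustration; the structural point is that an engineering plan with no traceable business outcome shouldn't be constructible.

```python
from dataclasses import dataclass


@dataclass
class BusinessOutcome:
    statement: str                    # e.g. "reduce time-to-shortlist by 40%"


@dataclass
class OKR:
    objective: str
    key_results: list[str]
    outcome: BusinessOutcome          # every OKR serves a business outcome
    co_owners: tuple[str, str] = ("Product", "R&D")   # co-owned, never split


@dataclass
class EngineeringPlan:
    feature: str
    okr: OKR                          # every plan traces to an OKR
    success_metrics: list[str]        # Layer 4: "is this delivering the value?"


def trace(plan: EngineeringPlan) -> str:
    """Walk an engineering plan back to the business outcome it serves."""
    return f"{plan.feature} -> {plan.okr.objective} -> {plan.okr.outcome.statement}"
```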
The Automation vs Augmentation Decision
One of the most consequential decisions for any AI feature is whether it automates a human task or augments human capability. This isn't a technical decision—it's a product and business strategy decision that has profound implications for user experience, customer trust, and organisational change management.
Automation means the AI performs the task independently, with humans reviewing exceptions. Example: automatically screening out candidates who don't meet mandatory requirements.
Augmentation means the AI assists humans in performing the task, with humans making the final decision. Example: summarising resumes and highlighting relevant skills so recruiters can make faster, better-informed decisions.
The distinction matters because it reshapes workflows, user interfaces, and the entire value proposition. Augmentation requires different UX patterns (side-by-side comparison, explanation, easy override) than automation (exception queues, confidence thresholds, audit logs). It also has different regulatory implications—the EU AI Act's human oversight requirements are particularly relevant when AI influences hiring decisions.
At PageUp, we've deliberately chosen augmentation over automation for most of our AI features. Our transparent skill matching shows recruiters exactly how a candidate's skills align with role requirements, moving beyond "black box" scoring to provide clear visibility into the AI's reasoning. This isn't just a product decision—it's a trust-building decision that reflects our understanding of how enterprise customers want to use AI in high-stakes decisions.
Making the decision explicit for each feature prevents the dangerous drift where an augmentation tool gradually becomes an automation tool as users learn to trust it blindly. If a feature is designed for augmentation, the UX should maintain human engagement, not enable passive acceptance.
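To illustrate the automation-side patterns mentioned above (confidence thresholds, exception queues, audit logs), here's a hedged sketch in Python. The threshold value and the routing policy are assumptions for the example, not production settings.

```python
import logging
from dataclasses import dataclass

logging.basicConfig(level=logging.INFO)
audit_log = logging.getLogger("screening.audit")

AUTO_THRESHOLD = 0.95   # illustrative; real thresholds come from validation data


@dataclass
class ScreeningResult:
    candidate_id: str
    meets_mandatory_requirements: bool
    confidence: float   # the model's confidence in its own assessment


def route(result: ScreeningResult) -> str:
    """Automation pattern: act only above threshold, queue the rest for humans."""
    if result.confidence >= AUTO_THRESHOLD:
        decision = ("auto_advanced" if result.meets_mandatory_requirements
                    else "auto_screened_out")
    else:
        decision = "human_review_queue"   # the exception queue
    # Every decision is logged for audit, whether the model or a human acts next
    audit_log.info("candidate=%s decision=%s confidence=%.2f",
                   result.candidate_id, decision, result.confidence)
    return decision
```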
Avoiding AI Pilot Purgatory
AI pilot purgatory is where promising prototypes live forever in "just one more iteration" loops, never graduating to production. It's one of the most common failure modes in enterprise AI, and it's almost always a symptom of R&D-Product misalignment.
Symptoms of pilot purgatory:
- The demo keeps getting better, but there's always a reason it's not ready for production
- Success metrics keep shifting—first it was accuracy, then latency, then coverage, then edge cases
- The pilot serves a small group of enthusiastic users but faces resistance from the broader user base
- Nobody can articulate what "done" looks like in concrete, measurable terms
How to escape (or avoid) purgatory:
- Define graduation criteria before starting the pilot. What specific metrics, at what thresholds, with what consistency, measured over what period, constitute success? Write these down. Get Product and R&D to sign off on them together (see the sketch after this list).
- Set a time limit. Pilots that run indefinitely consume resources without delivering value. Set a clear evaluation date: at this point, we either ship, pivot, or kill the initiative.
- Align metrics on business outcomes, not model accuracy. A model with 95% accuracy that doesn't move business metrics is less valuable than a model with 85% accuracy that measurably improves recruiter productivity.
- Include operational readiness in the criteria. A pilot isn't successful just because the model works. It's successful when the infrastructure, monitoring, support, and training are in place for production operation.
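Here's a minimal sketch of graduation criteria written down as code, assuming a daily metrics feed. The metric name, threshold, window, and pass rate are placeholders; the discipline is in making all four explicit before the pilot starts.

```python
from dataclasses import dataclass


@dataclass
class GraduationCriterion:
    metric: str           # what we measure, e.g. "time_to_shortlist_reduction"
    threshold: float      # at what level
    window_days: int      # over what period
    min_pass_rate: float  # with what consistency (share of days at threshold)


def has_graduated(c: GraduationCriterion, daily_values: list[float]) -> bool:
    """True only if the metric held its threshold consistently over the window."""
    window = daily_values[-c.window_days:]
    if len(window) < c.window_days:
        return False   # not enough data yet; the pilot can't self-certify early
    pass_rate = sum(v >= c.threshold for v in window) / len(window)
    return pass_rate >= c.min_pass_rate


# Example: a 40% reduction, held on 90% of the last 30 days
criterion = GraduationCriterion("time_to_shortlist_reduction", 0.40, 30, 0.90)
```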
Deep Collaboration: Skill Taxonomies as a Case Study
One of the best examples of deep R&D-Product collaboration in our work is the development of AI-powered skill matching using industry-standard taxonomies. This project illustrates how R&D expertise and product insight combine to create something neither could build alone.
The problem: recruiters need to match candidates to jobs based on skills, but skills are described differently across industries, regions, and educational systems. A "software engineer" in one context might be a "developer" in another. "Project management" encompasses vastly different skill sets depending on the domain. Simple keyword matching produces poor results.
The R&D insight: industry-standard skill taxonomies like ESCO (European Skills, Competences, Qualifications and Occupations) and O*NET (the US Occupational Information Network) provide structured frameworks mapping thousands of occupations to tens of thousands of skills. ESCO covers 3,000+ occupations and 13,000+ skills across EU markets; O*NET provides detailed US occupational data. Using these taxonomies as foundations for AI-powered skill matching produces dramatically better results than training on unstructured job posting data alone.
The Product insight: recruiters don't care about taxonomy structures. They care about finding the right candidates quickly. The skill matching needs to be transparent—showing why the AI thinks a candidate is a good match—and flexible enough to handle the messy reality of how people describe their experience. The taxonomy provides the backbone, but the user experience determines whether anyone actually uses it.
The collaboration: R&D built the skill matching pipeline using ESCO and O*NET as foundational knowledge, with AI models handling the semantic matching between candidates' free-text descriptions and structured skill frameworks. Product designed the transparent matching interface that shows recruiters exactly which skills were identified, how they map to the role requirements, and where gaps exist. Together, we created a feature that's both technically sophisticated and genuinely useful.
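For the technically curious, here's a simplified sketch of the semantic-matching idea. The `embed` function below is a deterministic stand-in for a real embedding model, and the taxonomy is a toy slice; what matters is the shape of the output, per-skill scores a UI can surface, rather than a single opaque number.

```python
import hashlib

import numpy as np

# Toy slice of a skill taxonomy; real ESCO/O*NET data is orders of magnitude larger
TAXONOMY_SKILLS = ["software development", "project management",
                   "stakeholder communication", "data analysis"]


def embed(text: str) -> np.ndarray:
    """Deterministic stand-in for a real sentence-embedding model."""
    seed = int(hashlib.md5(text.encode()).hexdigest()[:8], 16)
    v = np.random.default_rng(seed).normal(size=64)
    return v / np.linalg.norm(v)


def match_skills(candidate_text: str, top_k: int = 3) -> list[tuple[str, float]]:
    """Score free-text experience against taxonomy skills, transparently.

    Returning (skill, score) pairs is what lets the UI show recruiters which
    skills drove a match and where the gaps are, instead of one opaque number.
    """
    cand = embed(candidate_text)
    scores = [(skill, float(cand @ embed(skill))) for skill in TAXONOMY_SKILLS]
    return sorted(scores, key=lambda s: s[1], reverse=True)[:top_k]
```

In production, the embedding model, taxonomy coverage, and matching logic all carry far more nuance; this only shows the shape of the collaboration's technical half.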
This kind of deep collaboration—where R&D's technical knowledge and Product's customer insight continuously inform each other—is what separates AI features that deliver value from AI features that just demonstrate capability.
Unit Economics and Sustainable Scaling
Here's a trap that catches many organisations: your AI feature succeeds, adoption grows, and your infrastructure costs grow faster than your revenue. Successful AI adoption can be economically unsustainable if you haven't planned for the unit economics of scale.
The scaling cost challenge:
AI features often have variable costs that scale with usage—more users means more inference requests, more tokens processed, more infrastructure consumed. Unlike traditional software where marginal cost per user is near zero, AI features can have meaningful marginal costs that compound as adoption grows.
How we manage this:
- Cost per interaction budgets: Every AI feature has a target cost per user interaction. If costs exceed this target, we investigate optimisation opportunities before scaling further.
- Tiered model strategies: Not every user interaction needs a frontier model. We route requests based on complexity, using cost-effective models for straightforward tasks and reserving premium models for complex ones (see the routing sketch after this list).
- Value-based prioritisation: Features that deliver measurable business value justify higher infrastructure costs. Features with uncertain value get tighter cost constraints until value is demonstrated.
- Proactive capacity planning: We model infrastructure costs at 2x, 5x, and 10x current usage to understand the cost curve before scaling into it.
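Here's an illustrative sketch combining two of those levers: tiered routing and cost projection at multiples of current usage. The per-token prices, complexity heuristic, and budget figure are invented for the example.

```python
# Illustrative per-1K-token costs; real prices vary by provider and change often
MODEL_COSTS = {"small": 0.0002, "frontier": 0.0150}
COST_BUDGET_PER_INTERACTION = 0.01   # target unit economics, in dollars


def choose_model(prompt: str) -> str:
    """Tiered routing: a toy heuristic here; production routers use classifiers."""
    looks_complex = len(prompt) > 2000 or "multi-step" in prompt.lower()
    return "frontier" if looks_complex else "small"


def interaction_cost(model: str, tokens: int) -> float:
    return MODEL_COSTS[model] * tokens / 1000


def project_daily_costs(interactions_per_day: int, avg_tokens: int,
                        frontier_share: float) -> dict[int, float]:
    """Model the cost curve at 2x, 5x, and 10x usage before scaling into it."""
    blended_per_interaction = (
        frontier_share * interaction_cost("frontier", avg_tokens)
        + (1 - frontier_share) * interaction_cost("small", avg_tokens))
    return {mult: mult * interactions_per_day * blended_per_interaction
            for mult in (2, 5, 10)}


# A simple request stays well inside the per-interaction budget
assert interaction_cost(choose_model("summarise this resume"), 800) \
    <= COST_BUDGET_PER_INTERACTION
```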
The conversation about unit economics needs to happen early—during roadmap planning, not after launch. R&D brings the cost modelling expertise; Product brings the revenue and value context. Together, you can ensure that features you're investing in are economically viable at the scale you're planning for.
Conclusion: Partnership as Strategy
The partnership between R&D and Product isn't a process improvement—it's a strategic capability. In a world where most organisations struggle to move AI beyond pilots, the ability to consistently deliver AI features that solve real customer problems at sustainable economics is a genuine competitive advantage.
This capability doesn't come from better models or more engineers. It comes from alignment: shared understanding of what success looks like, honest assessment of what's technically feasible, disciplined prioritisation based on business value, and continuous communication between the people who understand the technology and the people who understand the customer.
If you're an R&D leader building AI products, invest as much in your relationship with Product as you invest in your technical architecture. Build shared frameworks for decision-making. Create forums for honest conversation about uncertainty and risk. And remember that the goal isn't to build impressive AI—it's to build AI that makes your customers' lives better.
The roadmap that matters isn't the one with the most AI features. It's the one where every AI feature was built because a customer needed it, validated because the data supported it, and scaled because the economics sustained it. That roadmap only exists when R&D and Product build it together.