Fulkerson Advisors

When Not to Hire McKinsey for AI

An honest structural argument from a former McKinsey associate. The Big 3 still own work no boutique can match. AI implementation is not that work.

Bilal Bitar, Co-Founder & Managing Partner
12 min read, May 30, 2026

Last year, a Chief AI Officer at a Fortune 200 industrial company asked me a question across a conference-room table that I would have answered very differently five years ago. He had a slide on the wall behind him from a Big 3 firm, the kind I used to help build: a three-by-three matrix of AI use cases, scored on value and feasibility, with a tidy phase-one through phase-three roadmap underneath. He pointed at it and said: 'I've had this deck for fourteen months. I have not shipped a single thing on it. What am I missing?'

I spent the first chapter of my career at McKinsey. I know how that deck was built, who built it, what their incentives were, and why it is, on its own terms, a genuinely good piece of work. The deck was not the problem. The problem is that the deck is the deliverable, and AI implementation is not a deliverable. It is an operating reality you live with for years. And the engagement model that produced the deck is not the engagement model that ships the use case in production by month nine.

What follows is not a takedown. I have colleagues at the Big 3 whose intellectual horsepower I would still hire any day of the week for the right scope. This is, instead, a structural argument: when the Big 3 are exactly the right call, when they are structurally the wrong one, and what a buyer should actually ask before signing the SOW. The question is not which firm. The question is which engagement model fits the work.

When the Big 3 Are the Right Call

Begin with what they do exceptionally well, because the moat is real. If you are a CEO trying to set enterprise-wide AI strategy at the board level, the Big 3 are usually the correct choice. The reason is not the analytical horsepower, though that exists. The reason is institutional weight. When a McKinsey partner presents an AI strategy to your board, the board takes the conversation differently than they would from a thirty-person boutique. That is not vanity; it is a real procurement reality, and pretending otherwise is naive.

There are at least five scenarios where I will recommend a buyer go to a Big 3 firm without hesitation. First: regulatory exposure. If your AI program lives inside a regulated function (banking, insurance, pharma, defense) and you need the brand of the advisor on the cover page to satisfy a regulator or a board risk committee, hire the Big 3. The cover page matters. It is one of the few places where the logo is substance.

Second: enterprise-scale change management across more than three business units. Anything that touches twenty thousand employees, requires a meaningful org redesign, and has a multi-year horizon plays to the staffing depth that only the largest firms have. We at Fulkerson do not have eighty consultants to embed across six geographies. McKinsey does. Use them for that.

Third: politically sensitive decisions where you need an external voice to absorb the blame. CFOs and CEOs sometimes need to make a hard call (kill a beloved internal initiative, restructure a function, exit a business line), and they need an external advisor whose institutional weight makes the decision survivable. This is a perfectly legitimate use of consulting capital and the Big 3 are the right tool. Fourth: post-merger integration where AI is one workstream among many. Fifth: market-entry analysis where you need a global research footprint.

In each of these, the engagement model fits the work. The deliverable genuinely is the analysis and the alignment. The pyramid of analysts under a partner is the right shape because the work is research-intensive, slide-intensive, and consensus-building-intensive. The billable hours buy you what they say they buy you.

Where the Engagement Model Stops Fitting

Now consider a different scope. The same Chief AI Officer wants to deploy a model that auto-maps unstructured customer data into the firm's internal schema. The success criterion is not a recommendation. It is a system in production with an SLA, owned by a team, that survives the departure of any individual contributor and degrades gracefully when the upstream data format changes. This is not strategy work. This is engineering work with a strategy wrapper.

Three structural features of the Big 3 model make this kind of work hard for them to do well, none of which reflect on the quality of the people. First, pyramid staffing. The economics of a Big 3 engagement require a leveraged pyramid: one partner, two engagement managers, six to twelve associates and analysts. The associates are extraordinary at synthesis, hypothesis testing, and stakeholder management. They are not, with rare exceptions, the people you want writing the evaluation harness that determines whether your model is shippable. The pyramid is shaped for slideware, not for production code.

Second, billable-hour incentives. Big 3 partners are measured on the size of the engagement and the gross margin per consultant-week. This creates a real and human pressure to scope engagements that justify the team you have on the bench, not the team the problem requires. I have watched, and at one point participated in, engagement scoping conversations where the question 'what does the client actually need' was not the first question asked. The first question was 'what can we sell.' I do not say this with anger. I say it because the incentives are doing what incentives do.

Third, the deliverable is the slide deck. This is the deepest structural mismatch. A McKinsey engagement ends, in nearly all cases, with a final readout deck and a transition memo. The model lives or dies after the team rolls off. On AI work, the period between week twelve and month nine is where the value is created or destroyed, and that is precisely the period when a Big 3 team is, by design, no longer present. The economics of the firm cannot support a five-person team embedded for nine months in a single client; the partners need to be on the next sale.

A Case in Texture

A Caribbean conglomerate we worked with last year had been through two prior engagements with brand-name firms before we arrived. They had two excellent strategy decks. They had a fully scored use-case prioritization matrix. They had an org-design recommendation for a Center of Excellence. What they did not have was a single demand forecasting model in production, a hiring pipeline for the data scientists the Center would need, or a training curriculum for the operators who would consume the forecast on Monday mornings.

The work we did was not more intellectually sophisticated than what the prior firms had done. The prior decks, frankly, were excellent. The work was different in kind: we wrote the model, we built the evaluation harness against the historical data, we wrote the job descriptions for the four hires the Center needed, we ran the first cohort of the training, and we stayed on call through the first three monthly forecast cycles after handoff. That work is not pyramidal. It is forward-deployed. And it does not generate the gross margin per consultant-week that a Big 3 firm requires.

Six Questions to Ask Any Advisor

If you are a buyer evaluating an AI advisor, Big 3 or boutique, here are the six questions I would ask. They are designed to be uncomfortable. Discomfort is the point.

One: on the last three AI engagements you closed, name the system that is in production today and the team that owns it. Not the use case. The system, by name, and the person on the client side whose performance review depends on it. If the advisor cannot name three, the engagement model is producing decks, not systems.

Two: who specifically will write the production code? Not the architecture diagrams; the code. Get a name, a GitHub handle if possible, and a sample of their prior work. If the answer is 'we have a deep bench of engineers we can scope onto the project,' the answer is no one. The bench is fictional until you can interview them by name.

Three: what is your evaluation methodology before a model ships? Not 'how do you measure ROI'; that is a different question. I mean: how do you decide a model is good enough to be allowed into production? What is the eval set? Who labels it? What is the threshold? If the advisor does not have a clear and specific answer, they have not shipped enough models to know that this is the question that separates research from production.
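To make question three concrete, here is a minimal sketch in Python of what a pre-ship gate can look like. Everything in it is an illustrative assumption, not a description of any firm's actual harness: the names EvalExample, accuracy, and ship_gate are hypothetical, exact-match accuracy stands in for whatever metric fits the use case, the toy model and its field names are invented, and the 0.9 threshold is a placeholder that the client and advisor would have to agree on.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EvalExample:
    """One labeled example from the held-out eval set (hypothetical shape)."""
    input_text: str
    expected_label: str  # supplied by a named human labeler, not by the model


def accuracy(predict, eval_set):
    """Fraction of eval examples where the model output matches the label."""
    if not eval_set:
        raise ValueError("eval set is empty; nothing to measure")
    correct = sum(1 for ex in eval_set if predict(ex.input_text) == ex.expected_label)
    return correct / len(eval_set)


def ship_gate(predict, eval_set, threshold):
    """Return (ok, score); ok is True only if the score meets the threshold."""
    score = accuracy(predict, eval_set)
    return score >= threshold, score


# Toy stand-in for a real model: maps a raw field name to a schema field.
def toy_predict(text):
    known = {"cust_nm": "customer_name", "addr1": "address_line_1"}
    return known.get(text, "unknown")


eval_set = [
    EvalExample("cust_nm", "customer_name"),
    EvalExample("addr1", "address_line_1"),
    EvalExample("tel", "phone"),
    EvalExample("zip", "postal_code"),
]

ok, score = ship_gate(toy_predict, eval_set, threshold=0.9)
print(ok, score)  # the toy model gets 2 of 4 right, so the gate stays closed
```

The point of the sketch is the shape, not the metric: an eval set someone owns, a label source you can name, and a threshold written down before the model is scored. An advisor with a real answer to question three can describe each of those three pieces for a system they have shipped.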

Four: tell me about a pilot you killed. Every honest AI advisor has killed pilots. We have killed two in the last eighteen months. If the answer is 'we have not killed any, all our pilots have gone to production,' the advisor is either lying or is so risk-averse in pilot selection that they are only doing trivially easy work. Neither is the partner you want.

Five: what is your engagement model after the model is live? Specifically: in month nine, when something breaks at 2 a.m. and a frontline operator does not know whether to call you or the internal team, who is on the phone? If the answer is 'we transition fully at month six,' you are buying a deck, not a system.

Six: walk me through the worst project in your portfolio. Not the success stories. Where did you most underperform, and what did you change in your engagement model as a result? The Big 3 partners I respect most can answer this in detail. The ones I respect least pivot to a success story. The advisor's answer to this question tells you whether you are dealing with a learning organization or a sales organization.

The Boutique Advantage, Stated Honestly

I am wary of how boutiques typically describe themselves. 'Nimble.' 'Hands-on.' 'Senior team.' These are weak claims because they are unfalsifiable, and most of them are not actually true of most boutiques. Let me try to be more specific about what a serious specialist firm can offer that the Big 3, structurally, cannot.

First: the senior people stay on the engagement. At a boutique built right, the partner who sold the work is the partner who is in the working session on a Wednesday afternoon in month four. This is not because we are virtuous. It is because we are small enough that we cannot delegate down a pyramid we do not have. The economics force the senior presence; do not credit us for what the structure imposes.

Second: forward-deployed engineering. The people writing the code, designing the evaluation harness, integrating with the client's data infrastructure, are the same people in the executive readouts. There is no translation layer between the engineering reality and the strategic narrative, because they are the same person. This is what BCG's X unit and Accenture's Applied Intelligence are trying to build internally; they are doing serious work in that direction. But they are doing it inside a firm whose dominant gross-margin engine is still the pyramid, and that tension shows up in pricing, staffing, and partner attention.

Third: skin in the game on what gets shipped. A boutique that lives or dies on three or four large engagements at a time cannot afford to ship a model that fails. The Big 3 can. A failed AI pilot at McKinsey is a learning, written into a knowledge-management database, and the partner moves on to the next mandate. A failed pilot at Fulkerson is the reference call our next prospect makes. The asymmetry is real, and it is in the buyer's favor when they choose the boutique.

Fourth, and this is the one most often misunderstood: a willingness to tell a client to stop. We have, on two separate engagements in the last two years, recommended that the client kill a project we were being paid to scope and ship. In both cases the work was not going to generate enough value to justify the operating cost. A Big 3 partner whose engagement is funded for nine months has a real and structural disincentive to deliver that recommendation in month three. We have a structural incentive to deliver it, because our reputation is a more durable asset than the remaining six months of fees.

What the Big 3 Still Own

I want to be honest about what no boutique, ours included, can match. The Big 3 have benchmark data across hundreds of clients in the same industry. When a CFO asks 'what is the right cost-to-serve ratio for an AI-enabled procurement function in industrial chemicals,' McKinsey has the data to answer that with confidence and ours, candidly, is thinner. The benchmarks are a moat.

They have geographic reach. A global client with operations in fourteen countries who needs simultaneous workstreams in each can be staffed from the Big 3 in a way no thirty-person boutique can match. They have C-suite relationships built across decades; a McKinsey partner who has known the CEO since the CEO was a VP can have conversations a boutique founder cannot, and that access is worth real money in the right scope.

They have brand cover for hard decisions, as discussed. They have institutional research; the McKinsey Global Institute and BCG Henderson Institute do work no boutique funds. And they have the political legitimacy in front of regulators and boards that is, for some decisions, the entire point of the engagement. A boutique that pretends these are not real advantages is selling against a strawman.

How to Actually Choose

The reframe I would offer a buyer is to stop asking which firm and start asking which engagement model. There are at least four kinds of scope in the AI advisory market today, and the right engagement model for each is determined by the work, not the brand on the cover page.

Strategy at the top of the house: pick a Big 3 firm. The pyramid model fits the work. You are buying analysis, alignment, and institutional weight, and that is what they sell. Change management at enterprise scale: pick a Big 3 firm, possibly with a boutique specialist in support for the technical workstreams. Implementation of a specific, value-tied AI use case in production: pick a specialist boutique with forward-deployed engineers, and pick them on the strength of the six questions, not on the strength of the logo. Build-versus-buy roadmaps with renewal-economics implications: pick a specialist boutique that has done it before and can name the pilots they killed.

There is a fifth scenario worth naming because it is increasingly common: clients who hire a Big 3 firm for the strategy and a boutique for the implementation, deliberately staged. This is a perfectly rational architecture and we work in it often. The Big 3 partner is not threatened, in my experience, when this is structured honestly. The partner I trust the most at one of those firms recommended Fulkerson into an engagement last quarter; the engagement model fit and he knew it. Mature buyers and mature advisors can do this without ego.

The Reframe

I left McKinsey eleven years ago. I think about my time there with real respect and a fair amount of nostalgia for the colleagues. Nothing in this essay is intended to suggest that the work they do is not serious; it is among the most serious work in the consulting industry. What I have come to believe, after a decade in development finance and now in AI advisory, is that the engagement model a firm sells is not a feature of the firm; it is the firm. Pyramid staffing, billable hours, slide-deck deliverables, partner-on-the-sale-not-on-the-execution are not bugs in the McKinsey model. They are the model, and they work brilliantly for the work they are designed to do.

AI implementation, the part of AI that determines whether the technology actually changes the P&L, is not the work that model was designed to do. It requires a different shape of team, a different pricing structure, a different definition of done, and a different appetite for staying on the line at month nine when the model misbehaves. If you are a buyer with that scope, the question is not whether McKinsey is good. McKinsey is good. The question is whether the engagement model fits the work.

When it does, hire them. When it does not, hire someone whose structure aligns with the outcome you are actually trying to produce. That is not disruption; it is procurement. And procurement, done well, is the most underrated skill in the AI buyer's toolkit.

Frequently asked

When should I hire McKinsey, BCG, or Bain for AI work rather than a boutique?
Hire the Big 3 when the scope is board-level AI strategy, enterprise-wide change management touching multiple business units, regulated-industry decisions where institutional brand provides cover, post-merger integration where AI is one workstream of many, or politically sensitive decisions that require an external voice. In each, the pyramid staffing model and slide-based deliverable fit what the work actually requires.
Why is the Big 3 engagement model often a poor fit for AI implementation?
Three structural reasons. Pyramid staffing puts associates rather than senior engineers on the production code; billable-hour incentives push toward larger engagements rather than tightly scoped ones; and the deliverable is the slide deck, so the team rolls off just before the period, typically between week twelve and month nine, when the model's success or failure is actually determined. None of this reflects on the quality of the people, only on the alignment of the model with the work.
What questions should I ask any AI advisor before hiring them?
Six. Name three systems in production today from your last three engagements, and the owner on the client side. Who specifically will write the production code, by name. What is your evaluation methodology before a model ships. Tell me about a pilot you killed. What is your engagement model in month nine when something breaks. Walk me through your worst project. Advisors who cannot answer these specifically are selling slideware.
What does a specialist boutique offer that a Big 3 firm structurally cannot?
Senior partners who stay on the engagement through execution rather than rolling off after the sale; forward-deployed engineers who write the code and present the readouts; a structural incentive to ship working systems because the reference matters more than the next month of fees; and a willingness to recommend killing a project mid-engagement when the economics no longer justify it.
What do Big 3 firms still do better than any boutique?
Cross-client benchmark data, global geographic reach, decades-long C-suite relationships, brand cover for politically sensitive decisions, institutional research output from groups like the McKinsey Global Institute, and the regulatory and board-level legitimacy that some decisions genuinely require. A boutique that denies these advantages is selling against a strawman.
Can I hire a Big 3 firm and a boutique together on the same AI program?
Yes, and it is increasingly common. A Big 3 firm leads the strategy and change-management workstream; a specialist boutique runs the implementation and engineering workstream. Mature partners on both sides work in this architecture without friction. The key is structuring it deliberately upfront so the workstreams reinforce rather than duplicate each other.

Working through a similar problem? We’d be glad to compare notes.