On 24 February 2022, Russia invaded Ukraine. Within seventy-two hours, every portfolio model built on historical correlation patterns broke. Russian and European hedge funds plunged 36% in the first two months of 2022, per HFR data. BlackRock’s Emerging Frontiers Fund — which had increased its Russia exposure just before the invasion — dropped 10% in February alone, its worst monthly loss in over a decade. The funds that navigated the dislocation best were not the ones with the most sophisticated quantitative models. They were the ones whose senior partners had lived through previous regime changes — 2008, the dot-com crash, the Asian financial crisis — and had the muscle memory to act decisively when the models broke.

Now imagine a fund where those partners have not made an unassisted investment decision in three years because their AI system has handled everything. Would they still have that muscle memory?

The Atrophy Problem

Skill atrophy under automation is well documented in aviation. A landmark NASA study by Casner and colleagues, published in Human Factors in 2014, found that while pilots’ instrument-scanning and aircraft-control skills were reasonably well retained under automation, the cognitive skills behind manual flight (tracking position without a map display, deciding on the next navigational step, recognising instrument failures) degraded measurably. Volz and Dorneich (2020) extended the finding to flight planning. Airmanship erosion was also implicated in analyses of the 737 MAX crashes: the autopilot did not make those crews worse pilots; the lack of practice did.

The same dynamic applies to investment judgment. An AI system that handles deal screening, market research, and portfolio monitoring is useful — until it encounters something it has never seen before. That is when the human partner needs to take the controls. And if they have not exercised their unassisted judgment in months, they will hesitate at exactly the wrong time.

The Wrong Solution and the Right One

The wrong solution is less AI. Reducing agent coverage to “keep humans sharp” is like telling airline pilots to turn off autopilot during cruise. It is inefficient and dangerous — the routine work is exactly where automation adds the most value.

The right solution is a structured exercise programme. Manthan Intelligence implements this as a quarterly protocol.

The Human Exercise. Once every quarter, the senior decision-maker receives a raw company brief — pitch deck, public financials, website, Crunchbase profile. No deal memo. No multi-agent assessment. No knowledge graph context. Just the same raw materials that any analyst would start with. The decision-maker writes their own assessment: verdict, confidence level, top three risks, top three strengths, two-sentence thesis. Then — and only then — does the full agent pipeline run on the same company.
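To make the protocol concrete, here is a minimal sketch of what one blind assessment might look like as a structured record, written in Python. Everything in it (the Verdict and BlindAssessment names, the field layout, the validation rules) is an illustrative assumption rather than Manthan Intelligence’s actual schema.

```python
# A minimal sketch of the quarterly human-exercise record.
# All names and fields are illustrative assumptions, not Manthan
# Intelligence's actual schema.
from dataclasses import dataclass
from datetime import date
from enum import Enum


class Verdict(Enum):
    INVEST = "invest"
    PASS = "pass"
    WATCH = "watch"


@dataclass
class BlindAssessment:
    """One partner's unassisted read on a raw company brief."""
    company: str
    assessed_on: date
    verdict: Verdict
    confidence: float           # self-reported, 0.0 to 1.0
    top_risks: list[str]        # exactly three, per the protocol
    top_strengths: list[str]    # exactly three, per the protocol
    thesis: str                 # the two-sentence thesis

    def __post_init__(self) -> None:
        # Enforce the protocol's structure before the record enters the log.
        if len(self.top_risks) != 3 or len(self.top_strengths) != 3:
            raise ValueError("protocol requires exactly three risks and three strengths")
        if not 0.0 <= self.confidence <= 1.0:
            raise ValueError("confidence must be between 0.0 and 1.0")
```

The point of forcing structure on the record is that it makes the human’s output directly comparable with the agent pipeline’s memo in the diagnostic step below.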

The Diagnostic. Four outcomes, each valuable. Convergence with high confidence validates calibration. Convergence with different reasoning surfaces complementary paths and feeds both into the knowledge graph. Divergence where the human was right is a calibration gift: it exposes a blind spot in the system to fix. Divergence where the system was right is the most important outcome: it maps where the human’s unassisted judgment has degraded.
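Under the same caveat, the diagnostic itself can be sketched as a classification over the exercise log. The DiagnosticOutcome names, the reasoning-overlap measure, and the 0.5 threshold below are assumptions for illustration; in practice, comparing a human thesis with an agent memo is a judgment call, and who was right on a divergence is usually only knowable months later.

```python
# Classifying one human-vs-system pair into the four diagnostic outcomes.
# Names and the overlap threshold are illustrative assumptions.
from enum import Enum


class DiagnosticOutcome(Enum):
    CONVERGENCE_VALIDATED = "convergence, shared reasoning: calibration validated"
    CONVERGENCE_COMPLEMENTARY = "convergence, different reasoning: feed both paths into the knowledge graph"
    HUMAN_RIGHT = "divergence, human right: a system blind spot to fix"
    SYSTEM_RIGHT = "divergence, system right: human coverage has degraded"


def diagnose(human_verdict: str, system_verdict: str,
             reasoning_overlap: float, human_was_right: bool) -> DiagnosticOutcome:
    """Map one quarterly exercise to the four-outcome diagnostic.

    reasoning_overlap: fraction of the human's cited risks and strengths
    that also appear in the agent memo (0.0 to 1.0). human_was_right is
    only consulted on divergence, and is typically judged later against
    the company's actual trajectory.
    """
    if human_verdict == system_verdict:
        # Convergence: same call. Distinguish shared from complementary reasoning.
        if reasoning_overlap >= 0.5:  # illustrative threshold
            return DiagnosticOutcome.CONVERGENCE_VALIDATED
        return DiagnosticOutcome.CONVERGENCE_COMPLEMENTARY
    # Divergence: the signal is who turned out to be right.
    return (DiagnosticOutcome.HUMAN_RIGHT if human_was_right
            else DiagnosticOutcome.SYSTEM_RIGHT)


# Example: both said "invest" but for largely different reasons.
print(diagnose("invest", "invest", reasoning_overlap=0.2, human_was_right=True))
```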

What This Means for Tail Events

Nassim Taleb’s core insight was not that black swans exist — everyone knows that. It was that systems become fragile precisely because they optimise for normal conditions. The more efficient your process is for the 95% of routine decisions, the more vulnerable you are to the 5% that do not fit the pattern. Mark Spitznagel’s Universa Investments, the most prominent tail-risk fund of the past two decades, is built on exactly this premise: the time to prepare for the dislocation is years before it arrives.

The quarterly exercise does not prevent black swans. Nothing does. It maintains the decision-maker’s capacity to recognise when they are in one. The partner who assessed a company cold last quarter retains the ability to think from first principles when the models break. The partner who has not made an unassisted decision in two years will reach for the agent dashboard during a market dislocation and find it has nothing useful to say.

The Charaka View

Calibration measures performance against historical patterns. Black swans are, by definition, outside those patterns. The human exercise is not testing whether the human can beat the system on routine decisions — they cannot, and they should not try. It is testing whether the human can still think independently when the system has nothing to offer. The fund that can demonstrate a systematic human exercise programme is a fund that has thought seriously about AI risk. Every LP conversation about “what if the AI is wrong?” has a concrete answer: the quarterly log of human-vs-system assessments, where they diverged, and what was learned. That is not a defensive answer. It is a differentiated one — and it is the operational counterpart to our agent-to-human ratio thesis.


This analysis draws on Hedgeweek’s reporting on hedge fund losses during the Ukraine invasion, Casner et al.’s 2014 Human Factors study on manual flying skill retention, Volz and Dorneich’s 2020 research on cognitive skill degradation under automation, and Jetwhine’s analysis of automation and airmanship. Human editorial oversight applied.

This analysis is informational and does not constitute investment advice, a research report, or a recommendation to buy, sell, or hold any security.

Charaka Notes by Manthan Intelligence.