GFO

Sequoia, 2023

From Generative AI's Act Two

Data moats for generative AI startups are on shaky ground; future generations of foundation models may obliterate any data advantages startups currently build.

Credibility: Composite credibility score, a weighted blend of Specificity, Accuracy, and Calibration. Higher means more credible.

42/100

Evaluated

Specificity: Was the claim falsifiable? 100 means a precise, dated, quantitative prediction. 0 means an unfalsifiable platitude.

35

Accuracy: Did the predicted thing happen by today? 100 means clearly yes, 0 means clearly no, 50 means mixed or partial.

45

Calibration: Was the magnitude and timing right? 100 means right number and date. 0 means off by an order of magnitude or many years.

40
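The page describes the composite score only as a "weighted blend" of the three sub-scores and does not state the weights. As a minimal sketch of how such a blend could be computed, the function below uses a hypothetical weighting of 1:3:1 (Specificity:Accuracy:Calibration). These weights are an assumption chosen for illustration; they happen to reproduce the displayed 42, but many other weightings would as well.

```python
def composite_credibility(specificity, accuracy, calibration, weights=(1, 3, 1)):
    """Blend three 0-100 sub-scores into one 0-100 score.

    `weights` is a hypothetical (specificity, accuracy, calibration)
    weighting; the source does not publish the real one.
    """
    w_s, w_a, w_c = weights
    total = w_s + w_a + w_c
    if total <= 0:
        raise ValueError("weights must sum to a positive number")
    # Weighted average, normalized so the result stays on the 0-100 scale.
    blended = (w_s * specificity + w_a * accuracy + w_c * calibration) / total
    return round(blended)


# Sub-scores shown on this page: Specificity 35, Accuracy 45, Calibration 40.
print(composite_credibility(35, 45, 40))                     # 42 under the assumed 1:3:1 weights
print(composite_credibility(35, 45, 40, weights=(1, 1, 1)))  # 40 under equal weights
```

Equal weighting gives 40 rather than 42, which suggests the actual blend leans toward one of the higher sub-scores, though the exact weights cannot be recovered from a single example.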

Reasoning

Sequoia's 2023 prediction was partially correct in direction but overstated how thoroughly foundation models would erase startup data moats. The claim is vague (no specific date, no quantitative threshold), which earns it a low specificity score. As of mid-2026, the evidence is mixed rather than a clean validation.

On one hand, the prediction correctly identified that thin-wrapper startups relying on generic data advantages are vulnerable: early-stage AI funding has slowed for undifferentiated startups, and the VC community broadly agrees that 'simply utilizing a third-party LLM to perform basic task automation is no longer a viable business strategy.' Foundation models have also moved up the stack, with OpenAI and Anthropic shipping agentic products that compete directly with application-layer startups.

On the other hand, the prediction's stronger claim, that foundation models would 'obliterate' data advantages, has not materialized. The consensus across multiple 2025–2026 investor reports (Madrona, Foundation Capital, CB Insights, Menlo Ventures) is that proprietary data moats remain the primary source of defensibility, especially when tied to unique, hard-to-replicate datasets (e.g., regulated patient records, domain-specific interaction loops). CB Insights explicitly notes that for certain startups, their data is something 'no competitor can replicate, regardless of what models get released.' Madrona states flatly that 'data isn't a byproduct of product usage — it is the moat.'

The market view has since become more differentiated: static or generic data moats are fragile, but dynamic, proprietary, domain-specific data flywheels are strengthening, not eroding. The prediction was partially right about the fragility of shallow data advantages but wrong about the wholesale obliteration of data moats as a competitive strategy.

Last evaluated 6/1/2026, 7:18:30 PM, claude-sonnet-4-6