
Data Is Not Software

What Anthropic’s AI Analytics Project Reveals About the Real Challenge

Our read on what it takes to get AI working on real business data — and why most companies can’t replicate it alone.

By Bruno Ruyú, June 2026

This week, Anthropic published a detailed account of how they built self-service data analytics internally using Claude. It’s an unusually candid piece — and the candor is what makes it valuable. Not because it showcases what Claude can do, but because it documents what it actually took, and where the limits are.

We read it carefully. Here’s our perspective.

1. The problem is real

Anthropic didn’t publish this piece to market their product. They published it because connecting AI to real business data turned out to be genuinely hard — hard enough that their own engineering team had to build a significant internal infrastructure project just to make it work for themselves.

That’s worth pausing on. The company that builds the most capable AI models in the world needed months of specialized engineering work to answer their own internal business questions reliably. Not because Claude isn’t powerful. But because the problem isn’t a model problem. It’s a data problem.

If the leading AI lab in the world needs to invest at this level to solve this for themselves, that tells you something fundamental about the nature of the challenge.

2. LLMs alone can’t solve it — because data is not software

This is not a new insight for us. In 2021, I read the original Transformer paper (“Attention Is All You Need”) and tested GPT-2. The models were impressive, but it was immediately clear that the hard problem wasn’t the model. It was the data underneath it. That became the founding premise of Teramot. It’s what I’ve said in every pitch and every talk, and it has guided every architecture decision since.

In June 2026, Anthropic put that same premise in a section title on their own blog. That’s not a coincidence. It’s a confirmation.

Code is deterministic and self-describing. You can read a function and understand what it does. Data is different. It carries business context, organizational history, ambiguity, and constant change that accumulates over years inside a company. A model — no matter how capable — cannot infer what “revenue” means in your specific context, which entities your team refers to as “the Q2 launch,” or why two departments calculate the same metric differently.
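To make the ambiguity concrete, here is a minimal sketch in Python. The orders, the field names, and both “revenue” definitions are hypothetical, invented only for illustration. The point is that two teams can compute different numbers from the same rows, and nothing in the data itself says which one is right.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Order:
    amount: float    # gross amount charged
    refunded: float  # amount refunded afterwards
    is_trial: bool   # order placed during a trial period


# Hypothetical rows; nothing here comes from a real system.
ORDERS = [
    Order(amount=100.0, refunded=0.0, is_trial=False),
    Order(amount=250.0, refunded=50.0, is_trial=False),
    Order(amount=80.0, refunded=0.0, is_trial=True),
]


def revenue_finance(orders):
    """Assumed finance definition: net of refunds, trial orders excluded."""
    return sum(o.amount - o.refunded for o in orders if not o.is_trial)


def revenue_sales(orders):
    """Assumed sales definition: gross bookings, trial orders included."""
    return sum(o.amount for o in orders)


finance_total = revenue_finance(ORDERS)
sales_total = revenue_sales(ORDERS)
print(finance_total, sales_total)  # 300.0 430.0
```

Both functions are internally correct. Choosing between them is a business decision, which is why a model reading only the table cannot settle it.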

Anthropic tested this assumption directly. When they tried to bootstrap their semantic layer using LLMs — having the model auto-generate metric definitions from raw tables and query logs — the output was “net-negative on our evals versus a smaller, human-curated layer.” The model produced plausible-looking definitions that encoded the very ambiguities they were trying to eliminate.

You can’t prompt your way out of a data problem.

3. Their solution required a world-class team — and they’re still not satisfied

What Anthropic built is technically impressive: governed canonical datasets, a human-curated semantic layer, a business context graph with indexed documents and organizational structure, colocated artifacts across layers with CI checks that protect cross-layer integrity, and a provenance footer on every response showing where the data came from and how fresh it is.
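As an illustration of the last item, a provenance footer can be as simple as a small record rendered under each answer. The sketch below is our own and does not reproduce Anthropic’s implementation. The class, the field names, and the dataset and metric identifiers are all hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class Provenance:
    dataset: str            # hypothetical canonical dataset name
    metric: str             # hypothetical metric identifier
    refreshed_at: datetime  # when the dataset was last rebuilt


def format_footer(p: Provenance, now: datetime) -> str:
    """Render a one-line footer stating the source and the data age in hours."""
    age_hours = (now - p.refreshed_at).total_seconds() / 3600
    return f"Source: {p.dataset} | Metric: {p.metric} | Data age: {age_hours:.1f}h"


# Fixed timestamps so the example is reproducible.
refreshed = datetime(2026, 6, 1, 6, 0, tzinfo=timezone.utc)
now = datetime(2026, 6, 1, 9, 30, tzinfo=timezone.utc)

footer = format_footer(Provenance("gold.monthly_revenue", "net_revenue", refreshed), now)
print(footer)  # Source: gold.monthly_revenue | Metric: net_revenue | Data age: 3.5h
```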

That’s exactly the right architecture. The issue is what it took to build and maintain it.

It required a specialized team working over multiple months. Every incremental change to the metadata triggers evaluation runs that take an hour each and cost significant resources. Skills need to be reviewed and updated every month as business logic and data models evolve.

And after all of that investment, they report 95% accuracy in production responses, and they are candid that they don’t yet have a robust solution for what they call “silent failures”: the cases where the answer is wrong, looks plausible, and gets used without anyone noticing. At 95% accuracy, roughly one answer in twenty is wrong. If those answers can reach a senior leader undetected, that isn’t a production system. It’s a very well-built prototype.
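The arithmetic behind that figure is worth spelling out. The sketch below takes the reported 95% accuracy and assumes, purely for illustration, that errors are independent across answers. That assumption is ours, not something the Anthropic piece states.

```python
ACCURACY = 0.95  # accuracy figure reported in the piece


def expected_wrong(n_answers: int, accuracy: float = ACCURACY) -> float:
    """Expected number of wrong answers among n_answers."""
    return n_answers * (1 - accuracy)


def prob_at_least_one_wrong(n_answers: int, accuracy: float = ACCURACY) -> float:
    """Chance that at least one of n_answers is wrong (independence assumed)."""
    return 1 - accuracy ** n_answers


print(round(expected_wrong(20), 2))           # 1.0
print(round(prob_at_least_one_wrong(20), 2))  # 0.64
```

Under that assumption, someone who reads twenty answers has close to a two-in-three chance of having seen at least one wrong one.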

4. Every critical layer requires humans in the loop

Perhaps the most telling detail in the piece: after all the engineering investment, human oversight remains structurally necessary at every critical layer.

The semantic layer cannot be automated — as their own experiments showed, model-generated definitions are worse than smaller human-curated ones. The business context layer, which they describe as “the layer most teams skip,” requires continuous human curation to stay current with how the business actually works. Every response destined for leadership requires explicit human sign-off before it’s trusted.

Their conclusion is direct: generate documentation with AI, but have a human own the definition.

That’s the right answer for Anthropic. They have the team to make it work. But it raises the obvious question: what does a company do if it doesn’t have that team?

For any organization that wants to replicate what Anthropic built, the honest answer is that it needs to hire and maintain a dedicated group of specialized data and AI engineers, not to build the product, but just to keep the AI data layer running. That’s not a product strategy. That’s a cost center. And it’s one most companies can’t sustain.

What this means — and how Teramot approaches it differently

The architecture Anthropic describes is not wrong in itself. The problem is the operating model it requires.

The reason their approach depends on ongoing human maintenance is structural: they built an AI layer on top of existing data infrastructure, and compensated for the lack of a structured foundation with skills, reference documentation, and human curation. When the underlying data changes — which it does, constantly — the text-based knowledge layer drifts, and humans have to realign it.
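Drift of this kind is easy to picture. The sketch below uses Python’s standard `sqlite3` module to compare column names mentioned in a piece of documentation against the live schema. The table, the columns, and the `documented` mapping are hypothetical, and the check is only meant to show the failure mode, not any particular team’s tooling.

```python
import sqlite3


def missing_columns(conn, documented):
    """Return (table, column) pairs that the docs mention but the schema lacks."""
    missing = []
    for table, columns in documented.items():
        # PRAGMA table_info yields one row per column; index 1 is the column name.
        actual = {row[1] for row in conn.execute(f'PRAGMA table_info("{table}")')}
        missing.extend((table, c) for c in columns if c not in actual)
    return missing


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, amount_usd REAL, status TEXT)")

# Hypothetical documentation written before `amount` was renamed to `amount_usd`.
documented = {"orders": ["id", "amount", "status"]}

drift = missing_columns(conn, documented)
print(drift)  # [('orders', 'amount')]
```

Nothing fails when the column is renamed. The text simply becomes wrong, and it stays wrong until a person or a check like this one notices.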

Teramot starts one layer below. Before anything reaches an AI agent, we deploy a full data lakehouse with Bronze, Silver, and Gold layers, where every transformation is deterministic SQL and all metadata lives in a structured relational database, not in text files. Once that foundation is in place, the AI doesn’t infer; it queries. The same question over the same data returns the same result every time.
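The layering can be illustrated in a few lines. This is a toy sketch using Python’s standard `sqlite3` module, not Teramot’s actual implementation. The table, view, and metric names (`bronze_orders`, `silver_orders`, `gold_revenue`, `metric_catalog`) are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Bronze: raw records as ingested (hypothetical schema).
CREATE TABLE bronze_orders (id INTEGER, amount TEXT, status TEXT);
INSERT INTO bronze_orders VALUES
  (1, '100.0', 'paid'), (2, '250.0', 'PAID'), (3, '80.0', 'cancelled');

-- Silver: typed and normalized, still one row per order.
CREATE VIEW silver_orders AS
SELECT id, CAST(amount AS REAL) AS amount, LOWER(status) AS status
FROM bronze_orders;

-- Gold: the business-facing aggregate.
CREATE VIEW gold_revenue AS
SELECT SUM(amount) AS net_revenue FROM silver_orders WHERE status = 'paid';

-- Metadata lives in a table rather than in a text file.
CREATE TABLE metric_catalog (
  metric TEXT PRIMARY KEY, source_view TEXT, description TEXT
);
INSERT INTO metric_catalog VALUES
  ('net_revenue', 'gold_revenue', 'Sum of paid order amounts');
""")


def answer(metric: str) -> float:
    """Resolve a metric through the catalog, then run a fixed query."""
    row = conn.execute(
        "SELECT source_view FROM metric_catalog WHERE metric = ?", (metric,)
    ).fetchone()
    if row is None:
        raise KeyError(f"unknown metric: {metric}")
    # Both identifiers come from our own catalog, not from free-form user text.
    return conn.execute(f'SELECT "{metric}" FROM "{row[0]}"').fetchone()[0]


first = answer("net_revenue")
second = answer("net_revenue")
print(first, second)  # 350.0 350.0
```

In this sketch the agent’s only job is to pick a metric name from the catalog. The number itself comes from SQL that does not change between calls.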

Does this require human iteration to build? Yes, and we think it’s honest to say so. Business knowledge doesn’t emerge automatically. The difference is that with Teramot, that iteration is fast (typically under an hour to deploy a first functional version, and five minutes for each new iteration) and stable over time. You don’t rebuild it every month. You build it once, and it holds. Because the knowledge is structured and relational, changes to the data model propagate correctly instead of silently breaking a text file somewhere downstream.

After three years of R&D and production deployments across companies in Latin America, in February 2026 we launched the version that makes this fully self-serve: a company can go from zero to a governed, AI-ready data lakehouse by signing up on our website and entering a credit card. No onboarding from our side. No specialized engineers on theirs.

What Anthropic described is the right problem, solved the right way given their constraints. What we’re building is the same outcome — AI that actually works on real business data — designed for the companies that don’t have Anthropic’s team.
