The Semantic Layer problem keeps coming back. Every few years, the industry rediscovers that data teams don't have a working Semantic Layer. The label changes. The tools change. The outcome doesn't. For thirty years, every attempt to solve the Semantic Layer problem has failed for the same reason. Now, for the first time, that reason is changing - and AI Analytics is the thing that finally moves it. This piece is about why past iterations failed, what's actually different now, and the three new problems that show up the moment business users start authoring semantic context themselves.
Why Past Semantic Layer Attempts Failed
In the 90s, BusinessObjects introduced "universes" - a layer where business definitions lived above the database. The promise: analysts pick metrics by name, the platform handles the SQL. The reality: universes lived in the hands of the BI team. The people who actually knew what "revenue" meant in each region weren't the ones maintaining them. Definitions drifted. Then they became stagnant and inflexible. And then they were ignored.
Twenty years later, the modern data stack tried again. Looker's LookML promised the semantic layer would finally live in version control. dbt added a metrics layer. Cube. AtScale. The vocabulary changed. The fundamental setup didn't. The layer was still authored by data engineers, who had to interview business stakeholders to find out what anything meant. The bottleneck wasn't tooling. It was that business knowledge had to be translated through the data team before it could become operational.
Then AI showed up. Recently, Andreessen Horowitz's Jason Cui and Jennifer Li published a piece arguing that the Context Layer - structured business meaning - is the missing foundation for reliable AI agents. They're right that context is the bottleneck. But the implied solution was to extract context from queries, logs, and metadata. Same move as before, with better algorithms. Automation cannot capture context that hasn't been authored yet.
Three eras. Three iterations. Same failure mode every time: the people who carry the context were never the people building the layer.
The Semantic Layer Problem Was Never About Technology
Ask a senior data engineer and they'll be honest about it. They don't know what an "MQL" means at this specific company. They don't know which of the five marketing-spend granularities is the one anyone should trust. They can run any query you describe. They cannot tell you what the company means when it says "active customer."
That knowledge lives with the head of growth, the product manager, the marketing analyst, the finance lead.
A VP of Data at a B2C company recently put it bluntly: "I don't believe in a natural semantic layer." His reasoning: marketing spend at his company comes through in five different granularities - ad, ad-group, network, channel, daily aggregate - and they don't reconcile. Same metric, five answers. Which one matters depends entirely on the question.
This is the part the industry has been quietly avoiding for thirty years. The Semantic Layer as a concept assumed there was one definitive set of meanings to capture. There isn't. Meaning lives in the question being asked, the moment it's asked, and the person asking it. That's not something to store in a YAML file maintained by a team one step removed from the business.
So the whole project failed for the same reason every time. The data team owned the layer. The business owned the meaning. The two were never going to meet.
How Claude Code Skills Are Changing the Semantic Layer for AI Analytics
For the first time, business users have a programmable surface they open every day. Not a config screen. Not a CMS. Not a place to log into and edit definitions. They open Cursor, Claude Code, or Claude with a plugin, and they write text files that explain things.
They call them skills. A skill is a short document that tells the AI: "When someone asks about the funnel, look at these four tables. Column X means user-on-trial. Don't merge mobile sessions and web sessions - here's why."
The marketing analyst writes the marketing skill. The product analyst writes the product skill. The finance lead writes the finance skill. Each person describes their corner of the business in their own words, points at the right tables, and names the gotchas. They open a PR. The data team reviews it. If it makes sense, it merges. The skill ships to the entire organization.
This is happening in production right now. One B2C company - whose VP of Data shared the details - rolled out an internal Claude Code plugin to twenty users two weeks ago. Company-wide rollout is days away. The plugin ships with eight or nine skills authored by the data team. But the design is explicit: those aren't the skills that matter. The next skills get written by the analysts themselves, code-reviewed by the data team, and merged into the repo.
For the first time, the people who carry the context are also the people authoring it. The data team's role moves from translator to reviewer. The Semantic Layer stops being a document to maintain and starts being a living conversation in version control.
This is the people-and-process problem finally giving way. Not a new tool. A new author.
Three New Problems: Governance, Semantic Drift, and Cost
The moment business users start authoring skills, three new problems open up.
Governance
The data team still needs to know which skills are producing wrong answers, where the agent is going off the rails, and where to intervene. The current answer at most teams is "tail a Slack channel and read the logs." One VP of Data acknowledged it directly: "We don't have monitoring on quality today." They rely on marketers catching wrong numbers because they happen to know them by heart. That works for twenty users. It does not work for two hundred.
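One way to move past tailing a Slack channel is to count, per skill, how often users flag an answer as wrong. The sketch below is a minimal illustration, not a description of any team's actual setup: the event format (skill name plus a "flagged wrong" boolean), the function name, and both thresholds are assumptions made for the example.

```python
from collections import defaultdict

def skills_needing_review(events, min_runs=5, max_error_rate=0.2):
    """Return skills whose share of flagged-wrong answers exceeds a threshold.

    events: iterable of (skill_name, flagged_wrong) pairs. This shape is
    hypothetical; real feedback might come from thumbs-down clicks or tickets.
    """
    runs = defaultdict(int)
    wrong = defaultdict(int)
    for skill, flagged_wrong in events:
        runs[skill] += 1
        if flagged_wrong:
            wrong[skill] += 1
    # Ignore skills with too few runs to say anything meaningful.
    return sorted(
        skill
        for skill in runs
        if runs[skill] >= min_runs and wrong[skill] / runs[skill] > max_error_rate
    )

# Illustrative data only.
events = (
    [("marketing", True)] * 2
    + [("marketing", False)] * 3
    + [("finance", False)] * 5
    + [("product", True)]
)
print(skills_needing_review(events))
```

Here only the marketing skill is flagged: two of its five runs were reported wrong, while the product skill has too few runs to judge. The point is that the data team gets a review queue instead of relying on someone remembering the right number.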
Semantic drift
An analyst writes a skill in May and hardcodes a table and column name. In July, an upstream pipeline ships a rename. The skill silently breaks. Nobody finds out until a stakeholder asks a question and gets a wrong answer. As one VP described it: the analyst writes "this is the table, this is the column" - tomorrow the field changes, nobody notices, it fails the first time it runs. Multiply that across hundreds of skills authored by people who aren't tracking pipeline changes, and the result is slow, quiet rot.
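A drift check can be as simple as comparing the tables and columns a skill names against the current schema. The sketch below assumes a convention that is not in the original: skills reference columns as `schema.table.column` in backticks, and the schema snapshot is a plain dictionary. The table and column names are hypothetical; in practice the snapshot would come from the warehouse's own metadata.

```python
import re

# Hypothetical schema snapshot: table name -> set of column names.
SCHEMA = {
    "analytics.trials": {"user_id", "is_on_trial", "started_at"},
    "analytics.sessions": {"session_id", "user_id", "platform"},
}

# Assumed convention: column references look like `schema.table.column`.
REF_PATTERN = re.compile(r"`([a-z_]+\.[a-z_]+)\.([a-z_]+)`")

def find_broken_references(skill_text, schema):
    """Return (table, column, reason) for every reference the schema lacks."""
    broken = []
    for table, column in REF_PATTERN.findall(skill_text):
        if table not in schema:
            broken.append((table, column, "missing table"))
        elif column not in schema[table]:
            broken.append((table, column, "missing column"))
    return broken

# A skill written before `device_type` was renamed to `platform`.
skill = (
    "When someone asks about the funnel, use `analytics.trials.is_on_trial` "
    "and `analytics.sessions.device_type`."
)
print(find_broken_references(skill, SCHEMA))
```

Run on every schema change or on a schedule, a check like this turns a silent break into a failing job that names the skill and the stale reference.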
Cost
Nobody looks at token usage… yet. At one B2C company, token usage already hit roughly $10K per person at peak before active cost management kicked in - and that was with a sophisticated team paying attention to model selection. When analysts write skills, they're not thinking about token efficiency. A poorly structured skill that pulls the whole world into context every time is invisible to its author and very visible on the bill.
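A skill's footprint can be made visible before it merges. The sketch below is a rough estimate under stated assumptions: it uses a four-characters-per-token heuristic rather than a real tokenizer, and the budget, invocation count, and price per million tokens are placeholder inputs, not figures from any provider or from the company described above.

```python
# Heuristic only: roughly 4 characters per token. A real tokenizer will differ.
CHARS_PER_TOKEN = 4

def estimate_tokens(text):
    """Approximate token count for a piece of skill text (rounded up)."""
    return -(-len(text) // CHARS_PER_TOKEN)

def monthly_context_cost(skill_text, invocations_per_month, usd_per_million_tokens):
    """Estimated monthly cost of loading this skill into context on every use."""
    tokens = estimate_tokens(skill_text)
    return tokens * invocations_per_month * usd_per_million_tokens / 1_000_000

def over_budget(skills, max_tokens):
    """Names of skills whose estimated size exceeds a token budget."""
    return sorted(
        name for name, text in skills.items() if estimate_tokens(text) > max_tokens
    )

# Illustrative inputs only.
skills = {
    "funnel": "x" * 2_000,          # about 500 tokens
    "everything": "x" * 400_000,    # about 100,000 tokens
}
print(over_budget(skills, max_tokens=5_000))
print(monthly_context_cost(skills["everything"], 1_000, 3.0))
```

Surfacing a number like this in the PR review gives the author the feedback the bill would otherwise deliver weeks later.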
These three are the new shape of the work. The old work - go interview the business and translate meaning into YAML - is going away. The new work is governance, drift control, and cost-aware enablement of the people who now author the layer themselves.
Where This Leaves AI Analytics
The Semantic Layer problem isn't solved. But for the first time in thirty years, it's moved. The bottleneck has shifted from "the data team has to write everything" to "everyone can write, and now there needs to be a system that keeps that working at scale." That's a much better problem to have - and it's the one that determines whether AI Analytics actually works inside an organization.
This is the shift Upriver, the AI Data Engineering Platform, is building for.
FAQs: The Semantic Layer and AI Analytics
What is a Semantic Layer?
A Semantic Layer is a layer above raw data that defines business meaning - what a metric like "revenue" or "active customer" actually means inside a specific organization, how different concepts relate, and which tables and columns to use to compute them. It's what lets a person, or an AI agent, ask "what was revenue last quarter?" without having to know SQL or the schema.
Why has the Semantic Layer been so hard to solve?
Every attempt - BusinessObjects in the 90s, Looker LookML and dbt in the modern stack, the AI-era Context Layer - has put the responsibility for authoring meaning on the data team. But the data team doesn't carry that meaning. It lives with business users. The translation gap between the two is what has caused every iteration to fail.
How are Claude Code skills changing the Semantic Layer for AI Analytics?
Claude Code skills let business users author small text files that describe their corner of the business directly - which tables to use for a funnel question, what specific columns mean, the gotchas. They submit a PR, the data team reviews, and the skill ships. For the first time, the people who carry the context are the people authoring it.
What is semantic drift?
Semantic drift is what happens when a skill is authored against the current state of a data environment and the environment changes underneath it. An analyst hardcodes a table and column; an upstream rename ships months later; the skill silently breaks. It's an emerging failure mode unique to federated semantic context authoring.
Why is token cost suddenly a concern for AI Analytics?
When analysts and business users write skills, they're not thinking about token efficiency. Poorly structured skills can pull large amounts of context into every interaction, and at scale this compounds. At one B2C company, token usage hit around $10K per person at peak before active cost management. Most organizations haven't started measuring it yet.
How is this different from an AI coding assistant?
AI coding assistants are built for code-first workflows. Solving the Semantic Layer for AI Analytics requires reasoning over the data environment - warehouse, orchestrator, code, and the business definitions on top - and a system to govern the skills business users author. That's the work an AI Data Engineering Platform is built for, not a code assistant.