Part 1
Agentic data approaches promise faster answers and fewer analyst bottlenecks. Here's what it actually takes to make that promise hold in production, and why getting it right matters more than getting it fast.
Every organization I talk to has the same problem dressed up in different clothes. Somewhere in the business, a decision maker is sitting on a question that the data could answer, yet the answer is days away, routed through a ticketing system, a sprint cycle, a release window. By the time it arrives, the moment has often passed.
Over the last couple of years we’ve been working on shortening that gap, giving business users a more direct path to the answers that data analysts shouldn’t have to spend their time on. That’s what “Talk to data” means in practice, and we’ve now built and deployed it for real organizations in real production environments. What I want to share here is what we’ve learned about what actually works and what needs to be true before you should trust it with your users.
The slow route between question and answer
Most organizations offer decision makers two ways to get data. They can go to a data analyst, who knows the domain, knows the models, and can produce a reliable answer, but who is also juggling ten other requests and may not get back to you until next week. Or they can open a BI report, which answers the questions it was designed to answer and stops there. The moment someone needs something slightly different, they’re back to creating a ticket.
The critical variable in all of this is time to answer. In most organizations, that’s measured in days, sometimes weeks, which in a fast-moving business environment carries a real and often invisible cost.
“For the low-to-medium complexity questions, we can leverage agents and let them handle it. For the truly complex analyses, we still route to the analyst. The goal isn’t to replace expertise but to stop wasting it on simple questions.”
– Jaroslav Reken
Agentic approaches work precisely because they absorb that low-to-medium tier of questions, the ones that are genuinely answerable from the data but that currently sit in an analyst’s backlog because there’s no other route. When that layer starts to self-serve, analysts get their time back for the work that actually requires them, and users get answers on a timeline that’s useful rather than merely eventual.

Why the semantic layer is the difference-maker
An agent without context is roughly as useful as asking a very fast junior analyst who has never seen your business. What makes an agent genuinely reliable and allows it to answer questions accurately rather than just plausibly is a well-structured semantic layer sitting between it and the raw data.
Think about what a good data analyst brings to the table: knowledge of the business domain, familiarity with where the data lives, an understanding of the relationships between tables, the logic behind calculated fields, the gap between what a column is called and what it actually means in practice. The semantic layer is the structured representation of that knowledge, and the richer it is, the more of that analyst-level context the agent can draw on when constructing an answer.
In a Microsoft environment, Power BI’s semantic model is the most natural starting point. It already encodes relationships, measures, and business logic that most organizations have invested significant time building, which means you’re working with a foundation that already exists rather than constructing one from scratch. Gartner called 2026 the year of context, and in our experience, organizations that had already invested in their semantic models had a meaningful head start when it came to getting agents to perform reliably.
What a semantic layer provides an agent
- Business definitions of metrics and dimensions, not just column names
- Relationships between data sources, pre-modelled and validated
- Calculated measures that encode business logic
- A shared vocabulary that maps user language to data structures
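To make the list above concrete, here is a minimal sketch of that context as a data structure. The class and field names are illustrative assumptions, not the schema of Power BI's actual semantic model; in a real deployment this context lives in the model itself as relationships, DAX measures, and descriptions.

```python
from dataclasses import dataclass, field

# Illustrative sketch only: names and structure are assumptions,
# not the API of any real semantic-layer product.

@dataclass
class Measure:
    name: str            # business-facing name, e.g. "On-Time Delivery Rate"
    expression: str      # the encoded business logic (DAX in a Power BI model)
    description: str     # plain-language definition the agent can quote back

@dataclass
class SemanticModel:
    tables: dict[str, list[str]] = field(default_factory=dict)   # table -> columns
    relationships: list[tuple[str, str]] = field(default_factory=list)
    measures: list[Measure] = field(default_factory=list)
    synonyms: dict[str, str] = field(default_factory=dict)       # user term -> model term

    def resolve(self, user_term: str) -> str:
        """Map the user's vocabulary onto the model's vocabulary."""
        return self.synonyms.get(user_term.lower(), user_term)

model = SemanticModel(
    tables={"Shipments": ["ShipmentID", "PlannedDate", "ActualDate"]},
    relationships=[("Shipments.CustomerID", "Customers.CustomerID")],
    measures=[Measure(
        name="On-Time Delivery Rate",
        expression="DIVIDE([OnTimeShipments], [TotalShipments])",
        description="Share of shipments delivered on or before the planned date.",
    )],
    synonyms={"otd": "On-Time Delivery Rate"},
)

print(model.resolve("OTD"))  # -> On-Time Delivery Rate
```

The point of the sketch is the shape of the context, not the code: every field here is something a good analyst carries in their head, and the agent can only use what has been written down.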
One thing that surprised us in practice: model size matters far less than the quality of context you provide. We’ve worked with models ranging from a handful of tables to well over 600, across industries from logistics to insurance to manufacturing, and the agent’s ability to perform doesn’t degrade with scale in the way people tend to assume.
What this looks like when it works
Two deployments shaped how we think about this, one at a client and one inside our own team. They illustrate quite different situations where the approach made sense.
Case one: Logistics at HOPI
The logistics department at HOPI Holding had a capacity problem that will be familiar to a lot of organizations: a single highly capable person was fielding most of the data questions from colleagues who lacked the technical skills to query the models themselves. When that person was busy or overloaded, the questions simply accumulated.
We built a data agent integrated directly into Microsoft Teams, connected to the underlying Power BI models, so that users could ask questions the way they’d ask a colleague. The agent handled query construction, retrieved the data, and returned an interpreted answer with context — an explanation of what the number meant, not just the number itself.
The outcome went beyond time saved. Users who had never engaged directly with data started asking questions they would never have raised with a human analyst, partly because the barrier was lower and partly because they could iterate in real time. Data adoption increased, and new analytical scenarios emerged that the team hadn’t anticipated when we started the project.
Case two: Sales team at Joyful Craftsmen
Our own sales team inherited a reporting model built for a different version of the business. They were unfamiliar with its structure, had limited Power BI experience, and faced a straightforward choice: invest months rebuilding the reporting layer to something they understood, or find a faster path to the insights they needed.
Within days of the agent going live, the team was extracting meaningful insights from a model they hadn’t built and couldn’t have queried directly. When the person who originally built that model moved on, the knowledge didn’t leave with them; it was embedded in the semantic layer and remained accessible through the agent.
Enterprise readiness is where most projects stall
The gap between a convincing demo and a trustworthy production deployment is where most agentic projects run into trouble.
Getting something impressive running in an afternoon is genuinely straightforward: there are frameworks, APIs, and enough tutorials to take you there quickly. Deploying something you can trust in front of real users, with real data, inside a real security environment — that’s the work that most proof-of-concepts never get tested against.
We’ve defined a set of criteria that any production agent deployment has to satisfy before it goes anywhere near a real user base. None of it is glamorous, but every item on the list represents something we’ve either been burned by or narrowly avoided.
Enterprise readiness criteria
- Security perimeter: data access must respect existing permission structures, not bypass them
- Infrastructure as code: the deployment must be reproducible, versionable, and auditable
- Controlled change management: updates to models or instructions must go through a defined process
- Response validation: there must be a mechanism to verify the agent is producing correct answers
- Compatibility with existing data models: the agent works with what you have, not what you wish you had
- Extensibility: the architecture can grow without being rebuilt
- Monitoring: usage, failures, and drift are tracked and visible
Building trust through validation, not reassurance
People often ask whether an AI agent can provide incorrect answers, and the honest answer is yes, though so can your data analyst. Errors in both cases tend to come from the same sources: gaps in data quality, missing context, or a question that was ambiguous enough to support multiple valid interpretations. The agent is fallible in ways that are, if anything, more systematic and therefore more detectable than the errors a human might make.
What makes this manageable is automated response validation. We build a curated set of questions with known correct answers, expressed both as DAX queries and as reference datasets. Every time the underlying model changes, every time the agent’s instructions are updated, every time a new data source is connected, those tests run automatically. If the agent’s answers drift from expected results, we know before users do.
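At its core, that validation loop can be as simple as a table of golden questions with known answers, replayed on every change. The questions, values, and `ask_agent` interface below are illustrative placeholders; as described above, the real expectations are expressed as DAX queries and reference datasets.

```python
# Minimal sketch of the regression-style validation described above.
# `ask_agent` is a placeholder for whatever interface your agent exposes;
# the question set and tolerances are illustrative assumptions.

GOLDEN_QUESTIONS = [
    # (question, expected value, absolute tolerance)
    ("What was total revenue in 2023?", 1_250_000.0, 0.01),
    ("How many active customers do we have?", 4_812, 0),
]

def validate(ask_agent) -> list[str]:
    """Run every golden question and collect drift failures."""
    failures = []
    for question, expected, tol in GOLDEN_QUESTIONS:
        answer = ask_agent(question)
        if abs(answer - expected) > tol:
            failures.append(f"{question!r}: expected {expected}, got {answer}")
    return failures

# A stub agent standing in for the real deployment:
stub_answers = {
    "What was total revenue in 2023?": 1_250_000.0,
    "How many active customers do we have?": 4_811,   # off by one -> flagged
}
failures = validate(lambda q: stub_answers[q])
print(failures)  # the customer-count question is reported as drifted
```

Wiring a loop like this into the deployment pipeline, so it runs on every model change, instruction update, or new data source, is what turns "we hope it's still right" into "we know before users do."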
This matters as much organizationally as it does technically. Users build trust in the agent through demonstrated reliability over time, and that trust is fragile in the early stages of a rollout. A wrong answer that circulates before it’s caught can set an adoption programme back months. The teams that succeed are the ones who invest in catching those problems before they become incidents.

When it comes to adoption sequence, we start with the IT and data team since they built the models, know the edge cases, and will spot a wrong answer immediately. Then a small group of key users who know the business domain well enough to push back when something looks off. Only once those two groups have stress-tested the agent under normal conditions do we expand to a wider audience. The timeline is slower than some clients would prefer, but it’s the difference between a rollout that compounds confidence and one that quietly undermines it.
The Microsoft stack: trade-offs you need to consider
Most of the organizations we work with are in the Microsoft ecosystem, so it’s worth highlighting the key considerations when choosing between the main options. The right choice depends on what infrastructure you already have, what level of control and testability you need, and how much implementation work you’re prepared to take on.
In practice, the decision often comes down to speed vs. control. Out-of-the-box options like Fabric and Copilot tools are faster to get started with and require less engineering effort, but come with limitations in extensibility, validation, and model control. Custom-built approaches take more time and require Azure infrastructure, but offer greater flexibility, testability, and the ability to tailor the solution to specific business needs.
One constraint that cuts across these approaches: your models need to be AI-ready — properly described, well-structured, with clean naming and documentation. If that investment hasn’t been made yet, it needs to happen first, and it takes real time. Custom solutions can compensate to some extent with additional instructions and context, but there’s a ceiling to how much you can patch over a poorly prepared semantic layer.
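One way to gauge that readiness before connecting an agent is a simple lint over the model's metadata. The metadata shape below is a toy assumption for illustration; a real check would read the semantic model's own descriptions and naming conventions.

```python
# Hedged sketch of an "AI-readiness" lint. The metadata dict is a toy
# stand-in for a real model's catalog; thresholds are arbitrary examples.

def lint_model(tables: dict[str, dict]) -> list[str]:
    """Flag undescribed or cryptically named objects that starve an agent of context."""
    issues = []
    for table, meta in tables.items():
        if not meta.get("description"):
            issues.append(f"table {table!r}: missing description")
        for column, desc in meta.get("columns", {}).items():
            if not desc:
                issues.append(f"column {table}.{column}: missing description")
            if len(column) <= 3:
                issues.append(f"column {table}.{column}: cryptic name")
    return issues

toy_model = {
    "Orders": {
        "description": "One row per customer order.",
        "columns": {"OrderDate": "Date the order was placed.", "qty": ""},
    },
}
for issue in lint_model(toy_model):
    print(issue)
```

Running something like this before the first agent connects makes the preparation work visible and countable, which is usually what it takes to get that investment prioritized.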
My honest advice is to start with the approach that gets you to a testable prototype fastest, and treat the migration decision as something you’ll make once you understand your actual requirements. You’ll understand them much better after six weeks of real user feedback than you will upfront. The tooling landscape is moving quickly enough that agility in your architecture choices matters more than picking the theoretically optimal path on day one.
Coming in Part 2
From proof of concept to production: the conversation we didn't expect to have
In the second part of this series, I sit down with Jan Laš, CIO at HOPI Holding, to talk through what the deployment actually looked like from the inside: the moments of doubt, the security surprises, the users who made it work and the ones who didn’t, and what we’d do differently today. It’s a more honest conversation than most case studies allow.

My mission is to change how you make everyday decisions by grounding them in proper data. I do this by helping corporate data leaders create and adopt data strategies built on Microsoft platforms using artificial intelligence. Over my 15+ year career, I have gained experience that I use to mentor clients and pass on to the community as a speaker and data leader in the industry.
Jaroslav Reken
Co-Founder & Data Strategist
The post Why your data still can’t answer a simple question appeared first on Joyful Craftsmen.