SQLServerCentral Article

Vector Search in SQL Server 2025: Storing Embeddings, Querying Them, and What to Watch Out For

,

Your team wants semantic search on data the company already owns and stores in SQL Server. Maybe it is product descriptions, support tickets, internal documents, or knowledge-base articles. Someone has done a proof of concept with embeddings, and now the conversation is turning into "where do we put these vectors and how do we query them in production." If you would rather not stand up a separate vector database next to your existing SQL Server, this article is for you.

I am going to focus on the SQL Server-specific mechanics: where vectors live, how the three flavors of distance behave, how hybrid filters interact with vector ranking, and what the new vector index actually buys you. I will not re-explain what an embedding is. If you need that, the Microsoft Learn page on vector search and vector indexes in the SQL Database Engine has a reasonable primer.

Everything below targets SQL Server 2025 (17.x). Most of it also works on Azure SQL Database, Azure SQL Managed Instance configured with the Always-up-to-date update policy, and SQL database in Microsoft Fabric. I will call out where a feature is preview rather than GA. As of this writing, the VECTOR data type with the default float32 base, VECTOR_DISTANCE, and VECTORPROPERTY are GA. The float16 base type, VECTOR_SEARCH, CREATE VECTOR INDEX, and AI_GENERATE_EMBEDDINGS are preview features and require ALTER DATABASE SCOPED CONFIGURATION SET PREVIEW_FEATURES = ON. Verify the current status on Microsoft Learn before depending on any of this in production.

One more thing before we start. The demo embeddings in this article are tiny (4 to 8 dimensions) and handcrafted with the values chosen to make the behavior visible. In real systems they come out of an embedding model and are typically 384, 768, 1024, or 1536 dimensions. I am keeping them small here so you can paste, run, and read the result without staring at a 1,536-element literal. Wherever it matters, I will mention what changes at production scale.

Storing embeddings alongside relational data (GA)

The new VECTOR(n) type stores a fixed-length array of floats in an internal binary format and surfaces them to the client as a JSON array, which is convenient because it round-trips through any driver that understands a string. The dimension count is fixed at the column level. With the default float32 base type the maximum is 1,998 dimensions. With the preview float16 base type the maximum is 3,996 dimensions and storage halves, at the cost of precision.

Here is the demo schema we will reuse. Realistic enough to be useful, small enough to read.

CREATE TABLE dbo.demo_products
(
    product_id            INT IDENTITY(1,1) PRIMARY KEY,
    sku                   NVARCHAR(20)  NOT NULL,
    name                  NVARCHAR(100) NOT NULL,
    category              NVARCHAR(40)  NOT NULL,
    is_active             BIT           NOT NULL,
    description           NVARCHAR(400) NULL,
    description_embedding VECTOR(4)     NULL
);

In a real system that VECTOR(4) would be VECTOR(1536) or whatever your embedding model produces. The dimension count is part of the type, so it has to match the model. You cannot mix a 768-dimensional embedding into a column declared 1,536, and you cannot quietly upgrade a model that produces a different number of dimensions without a column change.

The sample data is five rows: three input devices that should cluster together and two pieces of furniture that should sit far away from them. KB-01 and KB-02 are intentional near-ties so we can see ranking sensitivity; CH-02 is also is_active = 0 so it has a job in the hybrid section.

INSERT INTO dbo.demo_products (sku, name, category, is_active, description, description_embedding)
VALUES
    ('KB-01', 'Mechanical keyboard, blue switches', 'peripherals', 1,
     'Tenkeyless mechanical keyboard, clicky blue switches.',
     CAST('[0.90, 0.10, 0.05, 0.00]' AS VECTOR(4))),
    ('KB-02', 'Wireless keyboard, low-profile',     'peripherals', 1,
     'Compact low-profile wireless keyboard, scissor switches.',
     CAST('[0.85, 0.12, 0.10, 0.00]' AS VECTOR(4))),
    ('MS-01', 'Optical mouse, 6 buttons',           'peripherals', 1,
     'Wired optical mouse with two side buttons.',
     CAST('[0.80, 0.20, 0.05, 0.00]' AS VECTOR(4))),
    ('CH-01', 'Office chair, mesh back',            'furniture',   1,
     'Mid-back mesh office chair, adjustable arms.',
     CAST('[0.05, 0.05, 0.10, 0.95]' AS VECTOR(4))),
    ('CH-02', 'Standing desk, electric',            'furniture',   0,
     'Electric sit-stand desk, discontinued model.',
     CAST('[0.10, 0.05, 0.05, 0.90]' AS VECTOR(4)));

Two quick functions are useful for sanity checks. VECTORPROPERTY returns the dimension count and base type, so you can confirm a column matches the model you think it matches.

SELECT
    sku,
    VECTORPROPERTY(description_embedding, 'Dimensions') AS dims,
    VECTORPROPERTY(description_embedding, 'BaseType')   AS base_type
FROM dbo.demo_products
ORDER BY product_id;
sku    dims  base_type
-----  ----  ---------
KB-01  4     float32
KB-02  4     float32
MS-01  4     float32
CH-01  4     float32
CH-02  4     float32

The old way of doing this was to store embeddings as varbinary(max) with a hand-rolled byte layout, or as a JSON string in nvarchar(max). Both work. Both make you carry the dimension count out-of-band, force per-row deserialization on every query, and give you no native protection against accidentally inserting a 769-element array into what was supposed to be a 768-dimensional column. The VECTOR type fixes those three problems and gets a more efficient binary path on TDS for clients on a recent driver.

What not to do with vector columns:

  • No constraints other than NULL / NOT NULL. No PRIMARY KEY, UNIQUE, FOREIGN KEY, CHECK, or default through a computed column. Equality and uniqueness on vectors do not really make sense anyway.
  • No B-tree or columnstore index on a vector column. The dedicated CREATE VECTOR INDEX path is the only index option, and that is a preview feature; more on it below.
  • Not allowed in memory-optimized tables.
  • Not supported with Always Encrypted.
  • Not allowed as a base type for an alias type via CREATE TYPE, similar to xml and json.

One subtle gotcha: sp_describe_first_result_set currently reports a vector column as varchar or nvarchar. Some ORMs and code generators rely on that procedure for metadata, so check what your data access layer thinks the column type is before assuming end-to-end binary transport. If the client is on an older driver, the server falls back to JSON-array text over the wire anyway. See Microsoft Learn: VECTOR data type.

Performance note: the on-disk and in-memory representation is the optimized binary form. Old varbinary or JSON-string approaches will be doing string parsing or byte unpacking on every query; the typed column does not.

Exact similarity with VECTOR_DISTANCE (GA)

VECTOR_DISTANCE(metric, a, b) computes a distance between two vectors of the same dimensionality. Three metrics are supported: cosine, euclidean, and dot. The function returns a distance, not a similarity, so smaller is more similar. For cosine the value is 1 - cos_sim, which lives in roughly [0, 2]. For Euclidean it is the L2 distance. For dot product it is the negated dot product, so that smaller still means more similar.

The metric you pick changes the ranking. To make this visible, here is the same query vector run under all three metrics. The query vector is [0.88, 0.11, 0.07, 0.00], deliberately closest to KB-01 in direction. CH-01 and CH-02 are far in every metric and are excluded from this listing for readability.

DECLARE @q VECTOR(4) = CAST('[0.88, 0.11, 0.07, 0.00]' AS VECTOR(4));

SELECT
    m.metric,
    p.sku,
    CAST(VECTOR_DISTANCE(m.metric, p.description_embedding, @q)
         AS DECIMAL(10,6)) AS distance
FROM dbo.demo_products AS p
CROSS APPLY (VALUES ('cosine'), ('euclidean'), ('dot')) AS m(metric)
WHERE p.category = 'peripherals'
ORDER BY m.metric, distance;

The exact decimal values will vary slightly across hardware and driver versions because of float32 rounding, so do not chase the last digit. The point is that all three metrics rank KB-01 first here, but they do not have to. If KB-02 had a magnitude of 2.0 in the same direction, dot product would put KB-02 ahead while cosine would still put KB-01 first.

metric     sku    distance
---------  -----  ----------
cosine     KB-01  0.000168
cosine     KB-02  0.001702
cosine     MS-01  0.005402
dot        KB-01  -0.804500
dot        KB-02  -0.762700
dot        MS-01  -0.722500
euclidean  KB-01  0.024495
euclidean  KB-02  0.045826
euclidean  MS-01  0.110000

Cosine ignores magnitude and only cares about direction; Euclidean cares about both; dot product is direction times magnitude. Use cosine when your embeddings are normalized or you only care about meaning, Euclidean when geometric distance matters, and dot product only when the model documentation explicitly says to (some retrieval models output un-normalized vectors and expect dot product as the natural inner-product score).

A typical "show me the top three nearest" query is straightforward.

DECLARE @q VECTOR(4) = CAST('[0.88, 0.11, 0.07, 0.00]' AS VECTOR(4));

SELECT TOP (3)
    sku,
    name,
    CAST(VECTOR_DISTANCE('cosine', description_embedding, @q)
         AS DECIMAL(10,6)) AS distance
FROM dbo.demo_products
ORDER BY VECTOR_DISTANCE('cosine', description_embedding, @q);

Here are the top 3 results.

sku    name                                  distance
-----  ------------------------------------  ---------
KB-01  Mechanical keyboard, blue switches    0.000168
KB-02  Wireless keyboard, low-profile        0.001702
MS-01  Optical mouse, 6 buttons              0.005402

The old way of doing this, before the type existed, was to keep embeddings as a JSON string in nvarchar(max) and do the math by hand. It looked roughly like the following, and it is worth reading once just to appreciate that you do not have to write it again.

-- Cosine distance hand-rolled over an nvarchar(max) JSON column.
-- Don't actually use this. It's here for contrast.
SELECT TOP (3) p.sku,
       1 - (
            SUM(a.[value] * b.[value])
            / NULLIF(
                SQRT(SUM(a.[value] * a.[value])) *
                SQRT(SUM(b.[value] * b.[value])), 0)
       ) AS distance
FROM   dbo.demo_products_legacy AS p
CROSS  APPLY OPENJSON(p.embedding_json)
            WITH ([value] FLOAT '$') AS a
CROSS  APPLY OPENJSON('[0.88, 0.11, 0.07, 0.00]')
            WITH ([value] FLOAT '$') AS b
WHERE  a. = b.
GROUP  BY p.sku, p.embedding_json
ORDER  BY distance;

Beyond being verbose, this approach has no protection against length mismatches, no efficient binary representation, and the optimizer cannot reason about it as anything more interesting than a CROSS APPLY over a string parse. See Microsoft Learn: VECTOR_DISTANCE (Transact-SQL).

The most common gotcha with VECTOR_DISTANCE is sorting backwards. People reflexively write ORDER BY VECTOR_DISTANCE(...) DESC from muscle memory and then wonder why the worst matches show up first. Distance is small-is-better for all three metrics. The other gotcha is reaching for dot because it sounds simple and forgetting that without normalization, magnitude leaks into the score; large-norm vectors win regardless of their direction.

Performance note: VECTOR_DISTANCE is exact and never uses a vector index even if one exists on the column. In a plan it shows up as a Compute Scalar over whatever feeds it, and cost scales linearly with the number of rows the optimizer hands to it.

Hybrid search: filters plus vector ranking (GA)

This is the section most teams actually need. Real semantic search almost never runs over the entire table. It runs over "the documents this user is allowed to see, in this tenant, that are still active, in the last 90 days." The vector ranking is the last thing you do, after relational filtering has done most of the work.

Let me extend the demo with a tenant column and a few more rows so the filter has something to bite into.

ALTER TABLE dbo.demo_products
    ADD tenant_id INT NOT NULL CONSTRAINT df_demo_products_tenant DEFAULT (1);
GO

UPDATE dbo.demo_products SET tenant_id = 1;

INSERT INTO dbo.demo_products
    (sku, name, category, is_active, description, description_embedding, tenant_id)
VALUES
    ('KB-T2-01', 'Gaming keyboard, RGB',        'peripherals', 1,
     'RGB mechanical gaming keyboard, hot-swap switches.',
     CAST('[0.89, 0.11, 0.06, 0.00]' AS VECTOR(4)), 2),
    ('KB-T2-02', 'Compact 60% keyboard',        'peripherals', 1,
     'Compact 60 percent layout, programmable.',
     CAST('[0.87, 0.13, 0.08, 0.00]' AS VECTOR(4)), 2),
    ('MS-02',    'Trackball mouse, ergonomic',  'peripherals', 1,
     'Thumb-operated trackball, ergonomic shape.',
     CAST('[0.78, 0.22, 0.06, 0.00]' AS VECTOR(4)), 1);

The two new tenant-2 keyboards are the dangerous ones. Their vectors are right in the middle of the keyboard cluster (they would rank near the top by pure cosine distance) but they belong to a different tenant and absolutely must not appear in tenant 1's results.

DECLARE @q         VECTOR(4) = CAST('[0.88, 0.11, 0.07, 0.00]' AS VECTOR(4));
DECLARE @tenant_id INT       = 1;

SELECT TOP (3)
    sku,
    name,
    CAST(VECTOR_DISTANCE('cosine', description_embedding, @q)
         AS DECIMAL(10,6)) AS distance
FROM dbo.demo_products
WHERE tenant_id = @tenant_id
  AND is_active = 1
ORDER BY VECTOR_DISTANCE('cosine', description_embedding, @q);

The tenant-2 rows are gone, even though two of them would have ranked second and third on raw distance, and CH-02 is gone because it is inactive. This is the entire reason most teams should think twice before adopting a separate vector database. With VECTOR_DISTANCE over a normal table the optimizer applies the relational predicate first, exactly as you would expect; vectors are scanned only over the matching rows, and the filter participates in the plan as an ordinary Index Seek or Scan with predicate pushdown.

sku    name                                  distance
-----  ------------------------------------  ---------
KB-01  Mechanical keyboard, blue switches    0.000168
KB-02  Wireless keyboard, low-profile        0.001702
MS-01  Optical mouse, 6 buttons              0.005402

The old way of doing hybrid search with an external vector store was painful. You would call the vector store with the query embedding and a top-K like 50 or 100, get back a list of IDs, then go to SQL Server and run WHERE id IN (...) AND tenant_id = 1 AND is_active = 1. Often the relational filter would knock most of the candidates out and you would discover you needed K of 500 or 1,000 to end up with three usable results, and even that was guesswork. Sync drift between the two stores meant tenant 1 occasionally got results that no longer existed or no longer belonged to them. Putting the filter and the ranking in the same engine is the single biggest reason to keep this in SQL Server.

The gotcha here is subtle and worth slowing down for. With VECTOR_DISTANCE alone, no vector index, the filter is applied first and you scan-and-rank only the matching rows. As soon as you add a vector index, filter behavior depends on the index version. Earlier vector index versions effectively do the approximate search first and then post-filter, which means a narrow filter can leave you with very few results, sometimes zero, even when many qualifying rows exist in the corpus. The latest vector index version supports iterative filtering during the ANN traversal, which is much closer to what you would intuitively expect, but as of this writing the latest index version is only available on Azure SQL Database and SQL database in Microsoft Fabric. Box-product SQL Server 2025 has the earlier behavior. Plan accordingly: if your typical filter is highly selective (one tenant out of thousands), exact VECTOR_DISTANCE over the filtered set may produce both better and more predictable results than ANN over the unfiltered set.

Performance note: with no vector index this is a regular table scan or index seek over the relational predicate, with a Compute Scalar adding the distance and a Top N Sort doing the ranking. Cost scales linearly with the number of rows passing the filter, which is usually fine until that number gets large.

Vector indexes and approximate search (Preview)

Once your filtered candidate set gets into the hundreds of thousands or millions, exact distance over every row gets expensive. The vector index, built on a DiskANN-derived structure, gives you an approximate nearest-neighbor (ANN) search in sublinear time, with a small recall trade-off. CREATE VECTOR INDEX and the ANN search syntax are preview features and require PREVIEW_FEATURES = ON.

Here is a different demo schema, support tickets, which is closer to what most readers will actually be searching.

ALTER DATABASE SCOPED CONFIGURATION SET PREVIEW_FEATURES = ON;
GO

CREATE TABLE dbo.demo_tickets
(
    ticket_id      INT IDENTITY(1,1) PRIMARY KEY,
    tenant_id      INT           NOT NULL,
    status         NVARCHAR(20)  NOT NULL,
    created_at     DATETIME2(0)  NOT NULL,
    subject        NVARCHAR(200) NOT NULL,
    body           NVARCHAR(MAX) NULL,
    body_embedding VECTOR(8)     NOT NULL
);

Eight rows, three loose intent clusters: login/auth issues, billing issues, and one outlier hardware return. The first two auth tickets are intentional near-ties.

INSERT INTO dbo.demo_tickets
    (tenant_id, status, created_at, subject, body, body_embedding)
VALUES
    (1, 'open',   '2026-04-01', 'Cannot log in after password reset',
     'User reset password, now sees ''invalid credentials''.',
     CAST('[0.90,0.10,0.05,0.00,0.00,0.00,0.00,0.00]' AS VECTOR(8))),
    (1, 'open',   '2026-04-02', 'Login fails with correct password',
     'User reports valid password rejected on web login.',
     CAST('[0.88,0.12,0.06,0.00,0.00,0.00,0.00,0.00]' AS VECTOR(8))),
    (1, 'closed', '2026-03-15', 'MFA prompt loops on mobile',
     'Authenticator app prompt repeats after approval.',
     CAST('[0.80,0.18,0.10,0.00,0.00,0.00,0.00,0.00]' AS VECTOR(8))),
    (1, 'open',   '2026-04-05', 'Invoice charged twice this month',
     'Customer billed twice for same subscription period.',
     CAST('[0.00,0.00,0.00,0.85,0.15,0.05,0.00,0.00]' AS VECTOR(8))),
    (1, 'open',   '2026-04-06', 'Refund not received after cancel',
     'Cancelled plan, refund shows pending after 10 days.',
     CAST('[0.00,0.00,0.00,0.82,0.18,0.06,0.00,0.00]' AS VECTOR(8))),
    (2, 'open',   '2026-04-04', 'Cannot sign in to portal',
     'Different tenant, login problem on partner portal.',
     CAST('[0.89,0.11,0.05,0.00,0.00,0.00,0.00,0.00]' AS VECTOR(8))),
    (1, 'closed', '2026-02-20', 'Hardware return, defective unit',
     'RMA request for keyboard with stuck key.',
     CAST('[0.05,0.05,0.05,0.05,0.05,0.05,0.85,0.10]' AS VECTOR(8))),
    (1, 'open',   '2026-04-07', 'Coupon code rejected at checkout',
     'Promotional code returns ''expired'' but is valid.',
     CAST('[0.00,0.00,0.00,0.78,0.20,0.08,0.00,0.00]' AS VECTOR(8)));

Now create the vector index. The metric is fixed at create time, so pick the one your application actually uses.

CREATE VECTOR INDEX vi_demo_tickets_body
    ON dbo.demo_tickets (body_embedding)
    WITH (METRIC = 'cosine', TYPE = 'diskann');

Run the same logical query two ways. First with exact distance, then with the approximate-search syntax that engages the index.

DECLARE @q VECTOR(8) =
    CAST('[0.89,0.11,0.06,0.00,0.00,0.00,0.00,0.00]' AS VECTOR(8));

-- Exact (no index used)
SELECT TOP (3)
    ticket_id,
    subject,
    CAST(VECTOR_DISTANCE('cosine', body_embedding, @q)
         AS DECIMAL(10,6)) AS distance
FROM dbo.demo_tickets
WHERE tenant_id = 1
ORDER BY VECTOR_DISTANCE('cosine', body_embedding, @q);

-- Approximate (uses the vector index)
SELECT TOP (3) WITH APPROXIMATE
    ticket_id,
    subject,
    CAST(VECTOR_DISTANCE('cosine', body_embedding, @q)
         AS DECIMAL(10,6)) AS distance
FROM dbo.demo_tickets
WHERE tenant_id = 1
ORDER BY VECTOR_DISTANCE('cosine', body_embedding, @q);

The exact result is:

ticket_id  subject                              distance
---------  -----------------------------------  ---------
2          Login fails with correct password    0.000113
1          Cannot log in after password reset   0.000168
3          MFA prompt loops on mobile           0.011880


Approximate result:

The approximate result:

ticket_id  subject                              distance
---------  -----------------------------------  ---------
2          Login fails with correct password    0.000113
1          Cannot log in after password reset   0.000168
3          MFA prompt loops on mobile           0.011880

The two results are identical, and that is the lesson, not a coincidence. With eight rows the ANN search has nothing to skip; it traverses essentially the whole graph and returns the same answer as exact search. The cost of building and maintaining the index has bought you nothing yet. As a rule of thumb, below roughly 50,000 vectors per filtered query domain, exact VECTOR_DISTANCE usually beats ANN on both predictability and total operational cost. Above that, the index starts to earn its keep, but you should measure it on your hardware and your data before committing.

The older table-valued VECTOR_SEARCH function with a TOP_N argument still works for backward compatibility, but Microsoft Learn calls it deprecated for new code; the SELECT TOP (N) WITH APPROXIMATE syntax shown above is what to use going forward. See Microsoft Learn: VECTOR_SEARCH (Transact-SQL) and Microsoft Learn: CREATE VECTOR INDEX (Transact-SQL).

The old way of doing this without a SQL-resident index was to bolt on a separate vector store: Pinecone, Milvus, Qdrant, Weaviate, or pgvector if you also happened to be on Postgres. They all index well, all have a search API, and all of them mean a second system to back up, secure, monitor, patch, and keep in sync with the source of truth. For teams whose data already lives in SQL Server, the operational case for keeping vectors in the same engine is strong even if a dedicated store would beat it on raw ANN speed.

Several gotchas with vector indexes:

  • The latest vector index version, with full DML support and iterative filtering inside the WHERE clause, is currently only available on Azure SQL Database and SQL database in Microsoft Fabric. On box-product SQL Server 2025 the index is read-mostly: inserts, updates, and deletes against a vector-indexed table behave more restrictively, and you maintain the index by rebuild. Check Microsoft Learn for the current state on your platform before designing around it.
  • An index is built for one metric. If you need both cosine and Euclidean ranking on the same column, you need two indexes, or you fall back to VECTOR_DISTANCE for the off-metric one.
  • Building a vector index is not free; rebuilding is also not free. Budget the time and IO the way you would for a columnstore rebuild.
  • Recall is not 100 percent. ANN trades a small chance of missing a true top-K result for sublinear search time. If your application cannot tolerate occasional misses, you want exact search.

Performance note: ANN search is sublinear in row count, but the constant factors and the recall/latency trade-off depend heavily on graph parameters, hardware, and dimensionality. I am deliberately not quoting numbers I have not measured.

AI_GENERATE_EMBEDDINGS, briefly (Preview, extra setup required)

You can generate embeddings inside T-SQL with AI_GENERATE_EMBEDDINGS, which calls a registered EXTERNAL MODEL. This is a preview feature and requires PREVIEW_FEATURES = ON plus an embedding model endpoint configured for the database. I am keeping this section short on purpose because most readers cannot run it without first setting up that endpoint.

-- Sketch only. Requires an EXTERNAL MODEL registration
-- and PREVIEW_FEATURES = ON.
SELECT AI_GENERATE_EMBEDDINGS(
           N'mechanical keyboard with blue switches'
           USE MODEL my_embedding_model);

Because this depends on a live model endpoint, results are not deterministic across model versions and the dimensionality is set by the model, not by you. If you are running the demos in this article without an embedding model wired up, use the handcrafted vectors I have shown throughout; they exist for exactly this reason. See Microsoft Learn: AI_GENERATE_EMBEDDINGS (Transact-SQL).

The gotcha here is one that bites every team that ships a semantic search system: embeddings drift across model versions. If you re-deploy or upgrade your embedding model, the vectors you stored last week are no longer in the same space as the vectors you generate today. You have to re-embed the entire corpus and re-embed every query against the same model version, or distances become meaningless. Vector dimensions are also tied to the model: a 1,536-dimensional column cannot accept 768-dimensional output, so a model swap can be a schema change.

Operational gotchas roundup

A loose collection of things I have either run into myself or watched other teams run into.

Driver and TDS versions. For native binary transport of vectors, you need Microsoft.Data.SqlClient 6.1.0 or later, Microsoft JDBC Driver 13.1.0 or later, or the mssql-python driver. Older drivers still work, but they receive vectors as varchar(max) JSON over the wire, which is slower and easy to miss in code review when an ORM does the conversion silently. Check the actual driver version your application uses, not just the one your developer machines have.

Dimension limits. 1,998 for float32, 3,996 for float16 (preview). Most production embedding models stay well below either limit, so this is rarely the binding constraint, but it does cap a few research-grade models.

float16 trade-offs (preview). Half precision halves storage and reduces memory pressure. Recall on common embedding models is usually close to float32 in my limited testing, but it is not free, and it is preview, so benchmark on your own data and your own queries before committing. The base type is part of the column declaration; you cannot mix float16 and float32 within a column.

Re-embedding when source rows change. Pick a strategy and write it down. The two patterns that work in practice are: (1) a last_modified watermark column plus a job that re-embeds rows whose text changed since the embedding was generated, or (2) a CDC or trigger-based queue that pushes changed rows to an embedding worker. Whichever you pick, make sure inserts go through the same path. The most common failure mode in this area is "we re-embedded everything in March, then forgot to re-embed inserts after April 1st," which produces a corpus where some rows have stale embeddings and rank badly even though their text is obviously relevant.

Backup and restore size. A 1,536-dimensional float32 embedding is 1,536 × 4 bytes, roughly 6 KB per row before page overhead. A million rows with embeddings is about 6 GB just for the vectors. Plan database file growth, backup windows, and restore time accordingly. Compression helps less than you might expect because the binary representation is already dense.

Tooling metadata. sp_describe_first_result_set reports vector columns as varchar or nvarchar. ORMs, code generators, and diff tools that rely on it can produce surprising results. Confirm what your tooling thinks the column type is before you assume it is going to round-trip cleanly.

Rolling this out: when to adopt, when to wait

If you have under 100,000 candidate rows after your typical relational filter, you do not need a vector index yet. Build the demo, get the team comfortable with picking the right metric, write the hybrid query, and watch the plans. Exact VECTOR_DISTANCE over a properly filtered set is fast enough for a lot of workloads, and the operational story is simpler.

Stage the embedding pipeline separately from the query path. The two should be able to run independently so a model upgrade or an embedding job failure does not take search down. Keep a held-out labeled set of "this query should return these documents" pairs so you can measure recall over time; ANN can silently regress when the index is rebuilt with different parameters, and you will not notice without that ground truth.

Watch the GA-versus-preview boundary before you depend on VECTOR_SEARCH or CREATE VECTOR INDEX for production traffic on box-product SQL Server 2025. Preview behavior can change before GA, and the latest vector index features (full DML support, iterative filtering) are still cloud-first. Verify the current state on Microsoft Learn at the time you adopt.

Wrap-up

The honest summary is that for most teams already on SQL Server, the new vector type and exact-distance search cover the use case, the hybrid filter story is genuinely better than separate-vector-store designs, and the index is worth adopting once the corpus actually justifies it. Treat the preview features as preview, measure on your own data, and do not let the "AI" framing rush you past good database hygiene.

Rate

You rated this post out of 5. Change rating

Share

Share

Rate

You rated this post out of 5. Change rating