Blog Post

Using a Regular Expression to Detect a Number–#SQLNewBlogger

,

Another post for me that is simple and hopefully serves as an example for people trying to get blogging as #SQLNewBloggers. Here are some hints to get started.

I had a customer recently that was looking to work with Data Masker for SQL Server and had questions about how to handle some situations. In this case, they needed to detect a number type in a field that was overloaded with multiple types of data. Here’s an example of what they had in their “string” (varchar) field. Look at the stringvalue column below:

2022-06-07 08_23_13-SQLQuery1.sql - ARISTOTLE.SimpleTalk (ARISTOTLE_Steve (58))_ - Microsoft SQL Ser

If the string was a “nnn nnn nnnn” number value, then they wanted to change it. If it had other values, then leave it alone. This is really a query problem and a WHERE clause to structure.

One would think this is where you use ISNUMERIC() and try that. If I run this, I get zero rows back.

SELECT d.stringvalue
FROM dbo.ddmdemo AS d
WHERE ISNUMERIC(d.stringvalue) = 1

This isn’t really a number, as the sequence has spaces. What if we try this:

SELECT d.stringvalue
FROM dbo.ddmdemo AS d
WHERE ISNUMERIC(REPLACE(' ', '', d.stringvalue)) = 1

It also returns no values.

Really, this appears to really be a regular expression type of query, so I could do this, using LIKE.

SELECT *
FROM dbo.ddmdemo AS d
WHERE d.stringvalue LIKE '[0-9]%'

That, however, gives me two rows in this set of data. I see these results:

2022-06-07 08_31_14-SQLQuery1.sql - ARISTOTLE.SimpleTalk (ARISTOTLE_Steve (58))_ - Microsoft SQL Ser

The reason is that I am matching the first character only. The argument is a pattern and using square brackets implies a single character in a range. Since there are a lot of different patterns, and the “234223 Test” matches that, I ought to be more specific.

This particular pattern from the customer is 3 numbers, space, 3 numbers, space, 4 numbers. Anything else is non matching. Since there could be trailing spaces, I’d really want this:

SELECT d.stringvalue
FROM dbo.ddmdemo AS d
WHERE d.stringvalue LIKE '[0-9][0-9][0-9] [0-9][0-9][0-9] [0-9][0-9][0-9][0-9]'

This returns my single row. It would match any row that is of the pattern “nnn nnn nnnn” where n is a numerical value from 0-9.

There are other considerations here, and certainly this is likely to be a complex set of masking rules, but this shows a relatively simple way to detect a numerical pattern in a string.

SQL New Blogger

This was an interesting case. I initially thought  LIKE and an expression, but thought maybe there was a quicker way with isnumeric(). I didn’t find one, so I explained that and then the way that did work for me.

To me, this gives someone who glances at my blog a bit of insight into how I think and what I considered. This might be how they think, or someone on their team thinks. This might get me an interview.

Write about the problems you solve and how/why you do it.

Original post (opens in new tab)
View comments in original post (opens in new tab)

Rate

You rated this post out of 5. Change rating

Share

Share

Rate

You rated this post out of 5. Change rating