The PARSE() function was introduced to T-SQL in SQL Server 2012. It can be used for converting string values to date/time or number data types. We already had CAST and CONVERT for this though, so what was different about this new conversion function?
There are two main use cases for PARSE:
- Where you need greater flexibility (or for the conversion to be more forgiving of dubious formats) when translating a text value
- Where you need to be able to convert text values expressed according to a different set of cultural rules – or in a different language when it comes to dates.
NOTE: PARSE() can have performance implications, so read the text to the end and carefully test any code using this function.
The syntax of PARSE is:
PARSE ( string_value AS data_type [ USING culture ] )
Without the optional 'culture' parameter PARSE follows the same syntax as the CAST function e.g.
PARSE('1,000' AS int);
The USING statement allows you to also specify the culture/language the string value is formatted in. For instance, if I want to convert a date from French. My current language is set to British so this fails:
SELECT PARSE('25 Mai 2018' AS date);
But if I specify the culture:
SELECT PARSE ('25 Mai 2018' AS date USING 'fr-FR');
Then I get the result I am looking for:
A common use of PARSE is to interpret dates more flexibly than the existing functions. For instance, the following conversion attempts fail as the date format isn't supported by the CAST function:
SELECT CAST('Friday, 25 May 2018' AS date); SELECT CAST('Fri 25 May 2018' AS date); SELECT CAST('May/25/2018' AS date);
Whereas if we use PARSE it works fine:
SELECT PARSE ('Friday, 25 May 2018' AS date); SELECT PARSE ('Fri 25 May 2018' AS date); SELECT PARSE('May/25/2018' AS date);
All return the correct date.
As a side note, you might think that with the examples above that specify the day of the week, PARSE could just be ignoring that and looking at the rest of the text to work out the date. That's not the case, if we try:
SELECT PARSE('Thursday, 25 May 2018' AS date);
We get an error. PARSE isn't a magic wand to convert dates of any style, it understands what it understands, but there are still formats it will not recognise. For instance:
SELECT PARSE('25th May 2018' AS date);
Returns an error.
You can also use PARSE to convert to numeric types and here it is also more forgiving on format that CAST. For instance:
SELECT CAST('1,0,0,0,,' AS int);
Fails with an error, whereas:
SELECT PARSE('1,0,0,0,,' AS int);
Returns 1,000. Whether this is a good thing or not you'll have to judge depending on your use case. I suspect the reason PARSE allows this is that the use of separators to break up numbers varies from culture to culture and isn't always used consistently, it's easier just to go with the assumption that, as long as you're using your equivalent of the thousands separator, then anything goes.
The biggest strength of PARSE is probably in the converting from different formats based on culture. If you've had this sort of issue then you know it can be a real pain. This most often happens when converting between dates in the US format and that used by most of the rest of the world i.e.
mm/dd/yyyy - USA
dd/mm/yyyy - Rest of World (almost)
Using PARSE, if I get a column of text data that represents dates, and I know the source was a US system I can easily convert it as such:
SELECT PARSE(SomeColumn AS date USING 'en-US');
And if I send data to someone in the US, they can use PARSE the other way around:
SELECT PARSE(SomeColumn AS date USING 'en-GB');
It's not just dates where we have this sort of issue. As alluded to above, when it comes to expressing numbers we have our differences around the world too. The UK and the USA side together on defining the decimal point as a dot (.) and the thousands separator as a comma (,). Most of the rest of the world however, has it the other way around (though it's different again in Arabic). So:
SELECT PARSE('1,000' AS decimal(10,2) USING 'en-US');
Gives me “One Thousand” (the fact that I feel I need the clarity of explaining the value in words demonstrates how tricky this sort of problem can be when you're operating in multiple cultures).
SELECT PARSE('1,000' AS decimal(10,2) USING 'fr-FR');
I'm not going to try and exhaustively list all the possible confusions and scenarios. The likelihood is that, if you have to deal with this, then you know about it, and you're already seeing PARSE as the solution to your problems. I'd advise though that you test your specific use cases, PARSE isn't perfect.
For instance Wikipedia (https://en.wikipedia.org/wiki/Decimal_separator) states that the 22nd General Conference on Weights and Measures declared in 2003 that "numbers may be divided in groups of three in order to facilitate reading; neither dots nor commas are ever inserted in the spaces between groups".
That means we would never do this - 1,000,000, or this - 1.000.000, but we would instead express one million as 1 000 000. The same article specifies that this has become the recommendation of other technical organisations including the US National Institute of Standards and Technology. As such we might feel this is the format to use if we want to be unambigious globally. Unfortunately, PARSE doesn't see that as an international standard. So:
SELECT PARSE('1 000 000' AS decimal(10,2) USING 'fr-FR');
Returns one million – but:
SELECT PARSE('1 000 000' AS decimal(10,2) USING 'en-GB'); SELECT PARSE('1 000 000' AS decimal(10,2) USING 'en-US');
Both return errors.
As shown in my example at the beginning of this post, PARSE isn't just about dealing with formatting difference. Where you have the month of a date expressed in a local language, PARSE can be used to translate that:
SELECT PARSE('25 Mai 2018' AS date USING 'fr-FR'); --French SELECT PARSE('25 Mai 2018' AS date USING 'de-DE'); --German SELECT PARSE('25 Mei 2018' AS date USING 'nl-NL'); --Dutch SELECT PARSE('25 Maggio 2018' AS date USING 'it-IT'); --Italian SELECT PARSE('25 Mayo 2018' AS date USING 'es-ES'); --Spanish
All return the correct date without error.
As a final point it is worth quickly discussing PARSE and performance. For two reasons PARSE doesn't perform as well as the existing CAST and CONVERT functions, the main reason is that is implemented via CLR rather than being a native SQL function which makes it a lot slower. Secondly, if you just stop to think about it logically – there's often a bit more work for PARSE to do to work out how to convert your value, so it's likely this also adds overhead.
In a quick test I converted 100,000 dates expressed as text. Using CAST the CPU consumption was 172ms. Using PARSE it was 29735ms – so between 100 and 200 times slower. As such it's best avoiding for bulk operations and using the existing CAST or CONVERT where you can.
The full Microsoft reference for PARSE including a list of all the cultures you can use is here: https://docs.microsoft.com/en-us/sql/t-sql/functions/parse-transact-sql?view=sql-server-2017
The Wikipedia article referenced above that discusses local differences in use of decimal separators: https://en.wikipedia.org/wiki/Decimal_separator
The same topic is discussed on the Microsoft site here: https://docs.microsoft.com/en-us/globalization/locale/number-formatting
Wikipedia reference on date formats by country: https://en.wikipedia.org/wiki/Date_format_by_country