Do You Deal with UTF-8?

Microsoft is adding UTF-8 support to Azure SQL Database, and it is coming in SQL Server 2019. If you don't know what UTF-8 is, you may want to read a bit about it, as it can save space when you need to store Unicode characters. The encoding uses a variable number of bytes (one to four) for each character, and it is widely used on the web and in email. My question today is:

Are you looking to store data in UTF-8?

The way this works with SQL Server can be complex. In fact, not everyone thinks it has been done well, as there are some bugs in the initial versions. From watching people try to work with it, I can see this is a confusing and complex topic. I thought this might be a simple "SQL Server handles everything" collation, but it doesn't appear that this will be the case. Calculating the space needed for data isn't as simple as I expected. Not having to prefix strings with N is nice, but I'm not sure that this will actually work in practice.

I've seen some discussions of how to work with this, and it's complicated. In fact, it's not easy to tell how much storage you might need for characters. The storage differences can be confusing, because the number of bytes per character depends on the code range you work with. Most of us know that our users will try to add data we would never expect to our database, so we might run into issues with not enough space. Those of us specifying the size for our columns now need to know how many bytes are in use, not how many characters.
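To make the bytes-versus-characters point concrete, here is a small Python sketch. It is not SQL Server code, and the helper names `byte_sizes` and `fits_in_bytes` are invented for this illustration. It simply encodes a few sample characters and compares their size in UTF-8 with their size in UTF-16, the encoding I understand nvarchar to use.

```python
# Rough illustration only: how many bytes a character needs in UTF-8
# versus UTF-16. The sample characters and helper names are hypothetical.

SAMPLES = {
    "ASCII letter": "A",               # U+0041
    "Latin accented": "\u00e9",        # U+00E9, e with acute accent
    "CJK ideograph": "\u4e2d",         # U+4E2D
    "Emoji": "\U0001F600",             # U+1F600, outside the Basic Multilingual Plane
}


def byte_sizes(text):
    """Return (utf8_bytes, utf16_bytes) for a string, with no byte-order mark."""
    return len(text.encode("utf-8")), len(text.encode("utf-16-le"))


def fits_in_bytes(text, limit):
    """True if the UTF-8 encoding of text fits within `limit` bytes."""
    return len(text.encode("utf-8")) <= limit


if __name__ == "__main__":
    for label, ch in SAMPLES.items():
        utf8_len, utf16_len = byte_sizes(ch)
        print(f"{label}: U+{ord(ch):04X} utf8={utf8_len} utf16={utf16_len}")

    # Four characters, but five bytes in UTF-8 because of the accented letter.
    print(fits_in_bytes("caf\u00e9", 4))
```

In this sketch the ASCII letter is smaller in UTF-8, the accented letter is the same size in both, and the CJK ideograph is actually larger in UTF-8. That is why the savings depend so heavily on what your users type, and why a column sized for four characters may not hold a four-character word.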

This is likely easy for those of us who work in the English-speaking world and stick with varchar, but maybe not. I'm curious today how many of you will attempt to work with UTF-8 (or are waiting for it). It would also be good to hear about any challenges or issues you've had working with the encoding in other systems or languages.
