Removing Null Characters (0x00) From Values in SQL Server

I recently came across a problem while crawling a BCS data source in SharePoint 2010. Most of the 130,000 records were getting indexed, but I was getting an error on 25 records:

Error while crawling LOB contents. ( ‘.’, hexadecimal value 0x00, is an invalid character. Line 25, position 10. )

When I first took a look at the offending records in the database I couldn’t see any actual problems.  The issue was that the null character in the data was being treated as a string terminator, so it was not being returned with a standard select query. I was finally able to detect the null values by casting the suspected offending column to varbinary:


SELECT CAST(TheColumn AS varbinary(max)) FROM TheTable WHERE TheID = 123

Casting to varbinary returns the character values in the column as hexadecimal.  Once the values were converted to hexadecimal I could see the null values represented as 0000 (In my case the offending column was nvarchar, if your column is varchar you’ll be looking for 00).

Now that I could see the problem, I needed a way to quickly and easily remove the characters from any records that contained them in the database. I found a post on StackOverflow that finally pointed me in the right direction:

http://stackoverflow.com/questions/2828333/what-is-the-null-character-literal-in-tsql

Simply searching for char(0) didn’t return any values in my case. However if the data is cast from nvarchar to varchar and the collation is set to ‘SQL_Latin1_General_CP1_CI_AS’ it does work. Note that the cast to varchar was required for the null character to be detected. Luckily in my case the offending records didn’t contain Unicode characters.  Here’s what I finally ended up with to search for and remove the null characters:


UPDATE TheTable
SET TheColumn = REPLACE(CAST(TheColumn as varchar(max)) COLLATE SQL_Latin1_General_CP1_CI_AS, CHAR(0), '')
WHERE CAST(TheColumn as varchar(max)) COLLATE SQL_Latin1_General_CP1_CI_AS like '%' + CHAR(0) + '%'

Comments

7 responses to “Removing Null Characters (0x00) From Values in SQL Server”

  1. Bruce Avatar
    Bruce

    This works great, unless you need to retain the unicode characters. The conversion to VARCHAR rips that away. (or more accurately, converts it to the ascii character that occupies the same byte position in the ascii character table)

    Not really a solution for dealing with data in an international user environment.

  2. Karthik Avatar
    Karthik

    This solution is working fine …..with the columns those having null value

    Thanks for the post……

  3. Varun Avatar
    Varun

    Perfect answer. After looking around and googling for hours, I got it here. Many thanks.

  4. IG Avatar
    IG

    10 out 10. Exactly what I was looking for. Worked perfectly and saved my bacon. Thanks! : )

  5. Alex Avatar
    Alex

    I goggled and successful used this way of eliminating the null character:
    select Replaced = replace(‘BAD_STRING_OR_COLUMN_NAME’ COLLATE Latin1_General_BIN, nchar(0x00) COLLATE Latin1_General_BIN , ”)

    To see the difference run this:
    select Replaced = ‘BAD_STRING_OR_COLUMN_NAME’ COLLATE Latin1_General_BIN + nchar(0x00) + ‘|||’
    select Replaced = replace(‘BAD_STRING_OR_COLUMN_NAME’ COLLATE Latin1_General_BIN + nchar(0x00) + ‘|||’, nchar(0x00) COLLATE Latin1_General_BIN , ”)

  6. Nadeem Ilyas Avatar
    Nadeem Ilyas

    Thanks mate, works fine for me 🙂

    REPLACE(CAST(COLUMN_NAME as varchar(max)) COLLATE SQL_Latin1_General_CP1_CI_AS, CHAR(0), ”)

  7. juan pablo Gomez Serna Avatar
    juan pablo Gomez Serna

    Great still works on 2022

Leave a Reply to IG Cancel reply

Your email address will not be published. Required fields are marked *