Posts Tagged ‘nvarchar boundary’

Tricky string concatenation

March 24, 2009

I had a bug today in one of my stored procedures. The error message indicated that somewhere a string truncation was happening. Easy enough right?


When generating dynamic sql code into some string variable which will get executed, I prefer using nvarchar(max). After all you never know how big can a string get right?. But let’s say some sp gets as input parameter a table name. No point in declaring it as nvarchar(max) since the maximum length for a database object can be 128. So I use that.

The trick is that when using constants for concatenation sql server has a limitation: 4000 for nvarchar and 8000 for varchar:

declare @sql nvarchar(max)
select @sql = replicate(N’a’,3000) + replicate(N’b’, 3000)
select len(@sql)

The result is 4000. Sql Server examines both operands before the assignement, sees them as constants and decides to use the maximum limit for nvarchar(n), even though @sql is declared as max.

Another example where varchar is used with nvarchar to illustrate data type precedence, combined with implicit conversion from varchar to nvarchar

declare @sql varchar(max), @sql1 nvarchar(max)
set @sql1 = ”
select @sql = replicate(‘a’,3000) + replicate(‘b’, 3000)
select len(@sql)
–result is 6000

declare @sql varchar(max), @sql1 nvarchar(max)
set @sql1 = ”
select @sql = replicate(‘a’,3000) + replicate(‘b’, 3000) + @sql1
select len(@sql)
–result is 4000

Sql Server converted the varchar variable @sql (length = 6000) to nvarchar. It truncated it because max length for nvarchar is 4000. it used for this datatype precedence

Sql server will not let you assign a string constant with a length greater than 4000. It will truncate:
declare @sql nvarchar(max)
select @sql = replicate(N’a’,5000)
select len(@sql) –result is 4000

The tricky part
What happens when I concatenate one string smaller than 4000 with other string variables?

The bug I had is not that obvious:

declare @sql nvarchar(max), @str1 nvarchar(128), @str2 nvarchar(max)
SET @str1 = N’x’
set @str2 = N’y’
set @sql = REPLICATE(N’a’,4000) + @str1 + @str2

select len(@sql)
select @sql

The resutl of the len is 4001. Oh, it’s greater than the 4000 limit so you would think everything is ok right? Not so fast cowboy. The @sql result ends up with ‘ay’, and in fact the length should be 4002 right? So, where did the ‘x’ character go?
Well, sql server will start evaluating the expression before assignment (also it doesn’t know that the result is a nvarchar(max)). It evaluates from left to right: first replicate(N’a’, 4000) + @str1. @str is nvarchar(128) so it assumes the result should be a constant of maximum 4000, thus it truncates the ‘x’. Next it adds that to @str2 which is a nvarchar(max) so it assumes the result is a nvarchar(max0, thus it adds the ‘y’

How can you fix it:

  1. use nvarchar(max) for all variables which are concatenated
  2. before assignment, initialize @str like this: set @sql = N” and then @sql = @sql + strings_to_concatenate. This way sql server will convert the result of concatenation from nvarchar(max) from the beginning, because it evaluates from left to right

I prefer using the second approach; no point in declaring variables using larger data types if you don’t really need it