Understanding varchar(max) 8000 column and why I can store more than 8000 characters in it
I can see why you're misunderstanding this - it's a little tricky. These are all valid:
- VARCHAR(1) - one character string
- VARCHAR(4000) - 4,000 characters
- VARCHAR(8000) - 8,000 characters - and if you use a number for this field's definition, that's the highest NUMBER you can use, but watch this:
- VARCHAR(MAX) - that one holds up to 2GB.
And yes, if you try to get data out of a VARCHAR(MAX) field, and somebody stored 2GB in there, buckle up.
Question 1: How does the 8000 characters come into play and where should I be aware of it?
Setting n to 8000 causes 8000 characters to come into play. You need to be aware of the Precision, Scale, and Length (Transact-SQL) references about char, nchar, nvarchar, and varchar. In contrast, setting n to max (no quotes) causes SQL Server to store (and return) the maximum number of bytes (as mentioned in your quote).
Question 2 : will a .net datareader query to this column always return the full result with 100 000+ character?
This is a .Net (not a SQL Server) question, but the .Net datareader fetches a stream of bytes. A byte is not a character, and SQL Server returns bytes (not characters). If n is set to 8000 and the data type is nvarchar, SQL Server returns up to 8000 bytes, which the .Net datareader can interpret to be 4000 Unicode characters. If n is set to 8000 and the data type is varchar, SQL Server returns up to 8000 bytes, which the .Net datareader can interpret to be up to 8000 ANSI characters. If n is set to max and the data type is nvarchar, SQL Server returns up to 2^31-1 bytes, which the .Net datareader can interpret to be up to (2^31-1)/2 characters. If n is set to max and the data type is varchar, SQL Server returns up to 2^31-1 bytes, which the .Net datareader can interpret to be up to 2^31-1 ANSI characters.
If you choose use char or varchar (instead of nchar or nvarchar) because they can store more "characters" (more accurately: bytes), you need to be aware that many Unicode characters have no equivalent ANSI character (thus a large portion of our world's users will be unable to see their localized/native characters in your app).