When a new text column is defined to have a maximum length of 4,000 characters (as in the pdf_forms sample solution pdf_form_values “field_value” column), how does the database treat that field when each new record is created? Does it actually reserve 4,000 characters of disk space for that field each time a new record is created, even if the field will only contain 2 characters for a particular record?
I am using Sybase, so I am particularly interested in how Sybase does this, but I would also like to know whether all SQL databases tend to handle this situation the same way.
With the pdf_forms sample solution, I also noticed that the “actual_pdf_form” media column shows a field length of 2147483647. That is a very big number. Where did it come from?
As you can probably tell from my questions, I am concerned about bloating my database files unnecessarily by assigning columns lengths that are longer than necessary.
SQL databases have a number of different datatypes for text fields. One is called “CHAR” (Character) and one is called “VARCHAR” (Variable Character). When you use a “CHAR” type of field - and define it to 4,000 - and your data is one character long - then it will “pad” the field with 3,999 spaces - to “fill up” the maximum of 4,000 characters. This WILL take up space in the database.
With VARCHAR (the default type of column created when you create a new column using the Dataproviders dialog) - then with a max length of 4,000 and data of one character - all that is stored is that ONE character.
The datatype of a media field (BLOB - Binary Large OBject) - is usually a “VARBINARY” (Variable Binary) data type. This has a large maximum size (as you would want!). If you take a look at your table in Sybase Central - you can see the actual sizes. Don’t worry about making your database “bloat” - using VARCHAR and VARBINARY will make sure it will only be as big as the data in there.
With SQL Anywhere char data type implementation actually does not take up space unless it is needed. char[4000] storing “hello there” will take up 11 bytes plus some overhead. This definition can be misleading.