Plenty of us are using in our databases Bit column as information if record is valid or not. But when you're selecting only valid records, you must filter table with Del=0 or something. So we have plenty of selects, which are using full scan thank to "Del" column low selectivity. Yes, we can delete records and all associated records in child tables if we want, but there two defects can arise.
First of them is absence of history related to our data collections. Second one is higher CPU and I/O load.
So I decided to choose two columns in my tables to localize records in time. First of them is column with GETDATE() default value (you know it probably :)), "CreateDate" name and nonclustered index on it, in descend order.
A several week ago I spoked with one software company owner about NULL values in columns. "Here is no reason to use delegated values instead of NULLs", he said. "When database engine comes with new version, here can be plenty of difficulties with those values.Why you have choosed this kind of mechanism?"
I thought, that seeking some values in datetime column would be faster then seeking columns which has NULL instead of value. Scanning NULLs is slower then scanning values as I noticed in my last article. I answered him that database engine always use index scan and never use index seek. "You're wrong", he said. "Server always use index scan."
I talked about it with my boss and he said: "Compare those two mechanisms".
So, I decided to write testing script. I create table with integer data type primary key with identity incrementation and two columns with date time data type. I insert about 1 million records into it and then I select all records with delegate values, records with null values and then I seek records in some random interval. Interval is random for every iteration for those two tests. With selecting range over NULL enabled column is more slower, because SARG went for a walk :).
Programmer, I have written about him in second paragraph was right and I was wrong with my decision. But when you select limited ranges of data, using delegate values will be always slower.
Note: You can make some steps to history, when you select data like this:
FROM dbo.Table
WHERE CreateDate <= @ParameterDate AND DeleteDate >= @ParameterDate
Here you can download testing script.
