One of the questions that I have had to visit with over the last few years has been what is the difference between a null and empty string ‘’ and a space ‘ ‘. To a laymen or someone who does not work with databases often these may all be the same thing. The way they look at it, it’s just empty, it’s just something that is not there but really what does this mean to the SQL Server guys and gals?
I remember listening in on a conversation years ago that I found really interesting. Let’s take a look at this from a usage stand point. I have a questioner that I have all perspective employees fill out. I do not hire based on gender, but it may be a requirement if I am ever sued because someone claims that I do not hire enough of one gender or another. So I leave the question to the point where if they want to answer it they can. If not then they do not have to. Based on this “case”, how would or how could I use the three values?
The first two values are easy enough to determine. Someone could put a value of ‘M’ for male or ‘F’ for female. Pretty simple and to the point, now for an example purpose only lets assume that candidates are entered into the database as soon as we see a resume. The Gender is not listed on the resume and we do not want to make an assumption on the gender. This would be a value of NULL. In short the question has not been presented yet, there has not been and attempt to get the data.
Now lest say the candidate has come in to the office to turn in an application, on the application we will say there is a place to check the gender in addition to male and female they can choose ‘not to answer’, or ‘prefer not to answer’. However, the candidate did not select any of the boxes. They were given a chance to supply the data but simply did not check one of the three boxes. So really we made an attempt to get the question but were unable to retain a value. Therefore we could represent this as a empty space or a ‘’. Now if the candidate selected the option of “Prefer not to answer”, we could place a ‘U’ for unknown in there. The ‘U’ is a value just like a ‘ ‘ (space) is. So either would work, the space would represent that the question had been asked and the candidate choose not to give an answer.
Now as silly as this may seam to all of us really there could be handled differently, this was just an example. In the world of design I would consider using other values for the options that it could be. In short my design could may look something like this:
- M –Male
- F – Female
- U- Prefer not to answer the question
- Z – Has not been prompted with the question
And then I would have made the column on the database to not accept nulls but would have the default of ‘U’ be assigned.
I hope that this example has helped explain some of the differences that we as DBA’s should see between Null, ‘’, and ‘ ‘.