The Difference Between Null, Empty And Zero-Length Data / Strings

A common problem faced by new programmers is understanding the difference between null, empty and zero-length variables, especially when working with database records.

While, for most intents and purposes, the three things have the same effect — either you have some data you can work with, or you don’t — they arise from different circumstances. Understanding how null, empty and zero-length are different can help you avoid data errors in your programs.

The short version is this: If a variable simply doesn’t exist — usually because it hasn’t been declared, but sometimes because it hasn’t been assigned a value — it’s null. If the variable exists but contains no data, it’s empty. And if a structured data variable, such as an array, exists but doesn’t contain any items, it’s zero-length.

An analogy is in order.

Think of a database as a house. A house has rooms, and in each of these rooms are furniture and accessories that are appropriate to that room.

For example, the kitchen has a stove, refrigerator and sink. The bedroom has a dresser, bed and armoire. The living room has a couch, television and coffee table.

Where’s The Fridge?

Let’s refocus the analogy. We’ll liken a database table to your kitchen, and the refrigerator as a column in that table.

As a general rule, refrigerators hold food. What specific food is in there doesn’t really matter, so long as it’s food. (I know the fridge doesn’t hold every kind of food and often contains non-food items. Bear with me, please, and just accept for the purpose of the analogy that a fridge holds food.)

That’s much like the way a database column works. For example, a column might be a VARCHAR(50). The column doesn’t care what you are storing in it; it only cares that you are storing text that is not longer than 50 characters.

But let’s suppose you just bought a house (i.e., you are creating a new database table). There is a kitchen, but the old owners took the refrigerator and left the stove.

There’s a space where the refrigerator used to be. But there’s no refrigerator there. That’s null.

In other words, just because you’ve set aside some space where the refrigerator ought to be, and just because you say to visitors, “the fridge is going to be right over there,” that doesn’t make one magically appear.

It’s the same thing with database columns (and, for what that’s worth, all scalars [i.e., variables]). Declaring that you intend to have data at some point doesn’t mean there is data. And when there’s no data, you get a null.

Let’s bring the analogy back around. You have a database table called kitchen. In kitchen is a column named refrigerator, and refrigerator will hold food data.

When you first create the refrigerator column in your database, you’ve basically said, “I will eventually have some food, and I will need to put it someplace, and that place will be a fridge, which will go right there.” But until you actually get around to all that, there is no fridge.

Basically, the database says, “I know that’s where you want to put the fridge. But until you get one, that’s just a void space.”

So there is no fridge. Or, in programming parlance, it’s null.

Nothing From Nothing Means Nothing; You Gotta Have Something …

You can’t get blood from a turnip, especially when you don’t even have a turnip. The same is true of nulls. Null isn’t zero; it’s nonexistence.

If you ask me, “How big is your swimming pool?,” I can’t answer that question; I don’t have a swimming pool. Never have, probably never will.

I can’t say a nonexistent swimming pool has zero size. I can’t say it’s infinite or infinitesimal, either. The largest and smallest possible measures are still measures, and you can’t quantify things that don’t exist.

That’s why nulls cause so much trouble for new programmers. They write their programs with the presumption that the data they are referencing actually exists.

Consider our house database again, with its kitchen table (no pun intended). Let’s suppose I’m baking cookies in the stove and washing dishes in the sink, but I don’t have a fridge. The one record in the table might look like this:

Column Value
stove cookies
sink dishes
fridge

Let’s then suppose I want to serve milk with my cookies. I might use this code in VB.NET to tell me if I have milk:

Dim objConn As New SqlConnection("connection string")
Dim objCmd As New SqlCommand("SELECT * FROM kitchen", objConn)

objConn.Open()
Dim objReader As SqlDataReader = objCmd.ExecuteReader()
If objReader.HasRows() Then
	objReader.Read()
	If objReader("fridge") = "milk" Then
		Console.WriteLine("Cookies and milk! Yummy!")
	Else
		Console.WriteLine("There's no milk in the fridge. :(")
	End If
Else
	Console.WriteLine("There are no matching records.")
End If

objConn.Close()
objCmd.Dispose()
objConn.Dispose()

Notice this line:

If objReader("fridge") = "milk" Then

We are using the equality operator to determine if the value of objReader(“fridge”) is an empty string. But remember: there is no fridge.

Asking about what is inside a fridge that doesn’t exist is far more straightforward to a computer than it is to a person: The database basically says back to you, “What is this fridge of which you speak?”

Yet in the average person’s mind, not having any milk because you don’t have a fridge is the same thing as not having any milk in a fridge you do have. Either way, no milk.

This touches on the idea of identity, and it’s a good place to begin describing emptiness.

I Spit Out Like A Sewer Hole, Yet Still Receive Your Kiss

Let’s go back to our house database analogy.

Suppose, even though I don’t have any food, I decide to go ahead and buy a refrigerator. I put it in the place where I said I was going to put it, but there’s no food in it.

That’s emptiness. I have a fridge, but it is empty.

Column Value
stove cookies
sink dishes
fridge

So, if I go back to line 8 of the example above:

If objReader("fridge") = "milk" Then

Visual Basic now knows there is, indeed, a fridge in the kitchen. So if I ask it whether there’s milk in the fridge, it can go look inside the fridge and see no, there isn’t.

The difference between null and empty is especially problematic with weakly typed programming languages, such as PHP.

PHP will implicitly assign a null variable an empty value if you use a comparison operator, such as == or !=. (In other words, if PHP doesn’t have a refrigerator, it will automatically conjure one up in order to tell you the fridge is empty, since it figures that’s what you really want to know.)

That reinforces the idea in some programmers’ minds that null is the same thing as empty. To get around this, one uses PHP’s identity operators, === and !==.

The identity operator attempts to bring strong typing to the weakly typed data used by PHP. Identity means, in a nutshell, “are these two variables both equal, and of the same data type?”

Generally speaking, we can trust PHP to handle automatically converting our data types — such as changing nulls into empty strings — so that the comparison operator works 99 percent of the time. But once in a while, we run into a situation where PHP’s implicit comparisons aren’t enough, and that’s when we need to use identity.

This can easily be demonstrated with the strpos() function. Ideally, when strpos() does not find the needle in the haystack, it returns the boolean false; when it finds the needle, it returns the position of the needle as an integer.

But sometimes, the value returned by strpos() if the needle isn’t found will be a value other than false, such as zero (if the needle is found at the beginning of the haystack) or an empty string.

So, we shouldn’t use the comparison operator ==, because PHP won’t implicitly cast zero or an empty string as boolean false. Instead, we use the identity operator ===, which basically says, “are these values equal, and the same type?” And that, in turn, instructs PHP to explicitly cast each of the variables as a mutual type — in our case, booleans — in order to make its comparison.

Consider the example below, which you can run for yourself here:

<?php
$foo = "Valerie";
$bar = "V";

echo "strpos() position of $bar in $foo is: ";
echo strpos($foo, $bar);
echo "<br />";

echo "strpos() return value of $bar in $foo is of type: ";
echo gettype(strpos($foo, $bar));
echo "<br />";

if(strpos($foo, $bar) == false) {
	echo "strpos() comparison operator says $bar is <strong>not</strong> in $foo";
}
else {
	echo "strpos() comparison operator says $bar is in $foo";
}
echo "<br />";

if(false === strpos($foo, $bar)) {
	echo "strpos() identity operator says $bar is <strong>not</strong> in $foo";
}
else {
	echo "strpos() identity operator says $bar is in $foo";
}
echo "<br />";
?>

(Many thanks to Valerie at theprogrammerskit.com for the variables.)

In the example above, strpos() finds “V” at the very start of “Valerie.” It does this, essentially, by converting “Valerie” into an array of characters, then iterating through each until it finds “V”. Since all PHP arrays are indexed from 0, strpos() returns 0.

When PHP’s comparison operator makes an implicit typecast from an integer to a boolean, it considers zero to be false and all other integer values to be be true. So because 0 is false, but it is also the index position of “V”, PHP erroneously reports the letter was not found.

However, when we use the identity operator, we are telling PHP, “First, find the type of the first operand. Next, convert the type of the second operand to be the same as the first. Finally, compare the values.”

While the implicit casting with the comparison operator makes the integer 0 equal to boolean false, the identity operator equates 0 with true, because it converts data using a different algorithm. (As to the why and how of that, I am not sure, except that PHP developers must have figured out that they needed a way to convert the integer 0 to boolean false, and decided the identity operator would take on that job.)

So, when strpos() returns a zero, PHP knows that it must first explicitly convert zero into a boolean. In this case, it understands zero to be the same as boolean true. And so, the identity operator properly states that the letter was found.

In strongly typed languages such as .NET, however, there’s an inherently huge difference between the ethereal and temporal.

In PHP, for the most part null and empty are apples and oranges; they’re not the same thing, but they’re at least both fruits. In .NET, null and empty are your marriage to Marisa Miller and oranges. There’s literally no comparison to be made.

However, because humans equate nothing with zero, thanks to our ontological capacities, almost all programming languages have some construct that allows us to equate null and empty. In .NET, for example, the IsNullOrEmpty() method of the String class:

Dim MyString As String

If MyString.IsNullOrEmpty() Then
	Console.WriteLine("MyString is null or empty, says IsNullOrEmpty")
End If

In PHP, again, the comparison operators will implicitly convert your nulls into empty strings. However, there is also the built-in is_null() function, which will also return a boolean if your variable is explicitly cast as null; you can couple that with an empty-string comparison to equate null and emptiness:

<?php
$foo = NULL;

if(is_null($foo) || $foo == "") {
echo '$foo is null or empty';
}
?>

Emptiness Is Loneliness, And Loneliness Is Cleanliness

Which brings us to zero-length, which is basically the condition of existing, and being something, but not anything useful.

This is most common with arrays and similar structured data (e.g., object properties). An array has zero length if it has been instantiated but contains no data. A zero-length data structure, such as an array, itself isn’t null, and it has type; but it contains nothing.

It’s like having a refrigerator with its door chained shut; it’s there, but you can’t look inside it or take anything out. Of course, you can always unchain the door, and then put things in or out, or just look at what’s there.

Consider this PHP code, the output of which you can see here:

<?php
$foobar = array();
echo 'the length of $foobar is: ' . count($foobar);
echo "<br />";

foreach($foobar as $foo) {
	echo "$foo<br />";
}

$foobar[] = "Hello";
$foobar[] = 46;
echo 'the new length of $foobar is: ' . count($foobar);
echo "<br />";

foreach($foobar as $foo) {
	echo "$foo<br />";
}
?>
  • Line 3 prints 0; $foobar has zero length because we didn’t add anything to it.
  • Lines 6-8 are ignored, because $foobar has zero length; there is no each.
  • Line 12 prints 2, because there are now two items in the array.
  • Lines 15-17 print “Hello” and 46, because foreach now has items to iterate.

Null vs. Empty vs. Zero Length Recap

So, in review: Null means the variable does not exist. Empty means it exists, but it contains no value. Zero length means an array or similar structure has been declared — and, if the language is strongly typed, has type — but contains no values.

Leave a Reply

This site uses Akismet to reduce spam. Learn how your comment data is processed.

  • Check out the Commenting Guidelines before commenting, please!
  • Want to share code? Please put it into a GitHub Gist, CodePen or pastebin and link to that in your comment.
  • Just have a line or two of markup? Wrap them in an appropriate SyntaxHighlighter Evolved shortcode for your programming language, please!