The Trouble With PHP’s Weak Data Types: An Example Examined And Explained

Asked recently on Yahoo! Answers:

Multiple nested MySQL functions in PHP?

I was going through code today, trying to make some things more compact. The application worked without problems, so I knew that mysql error statements were superfluous.

This is the type of statement I was changing, I figured that I’d remove the seemingly unnecessary $result2 variable

$result2 = mysql_query($carts) or die(mysql_error());<br />
while ($row2 = mysql_fetch_array($result2)) {<br />
	echo "<option>$row2[0]</option>";<br />
}

so this is what I changed it into (I basically replaced where it said $result2, to what $result2 had contained, and removed the error check)

while ($row2 = mysql_fetch_array(mysql_query($carts))) {<br />
	echo "<option>$row2[0]</option>";<br />
}

but this code returned infinite loops, much to my surprise. Why is it doing this? is there a way around it?

Recently, I wrote about how PHP’s weak data types often can lead to problems for new programmers who don’t understand the difference between null, empty and zero-length variables. Here’s another opportunity to expose why strong data typing is essential for best programming practices, and to show how PHP’s weak data types — normally, a source of comfort for beginning programmers — can be the source of extensive frustration.

Let’s look at the first block of code, which was to be “cleaned up” by this questioner:

$result2 = mysql_query($carts) or die(mysql_error());
while ($row2 = mysql_fetch_array($result2)) {
	echo "<option>$row2[0]</option>";
}

The code here is straightforward. Somewhere prior to this block, the questioner connected to a database server and selected a database. Now, he wants to populate a select with the results set from a SELECT query (stored in a variable named $carts).

Let’s examine each line:

  1. mysql_query() instructs a connected MySQL database server to execute the specified query and return a resource or boolean, depending on the type of query run and any errors encountered by the function. In this case, assuming the database server does not encounter an error, a result resource will be returned.
  2. mysql_fetch_array() takes the result returned by the query, instructs the database server to populate the noted array variable with the contents of the current row, and to advance the internal pointer in the result to the next row. while() instructs this process to continue until the internal row pointer of the resource reaches the end of the result.
  3. The echo statement outputs a new option element to the page, setting as its value the value of the first cell of the $row2 array.

Grab some data, use it as the value(s) for some element(s) on the page. This is known as data binding, and it’s as routine a task as Web programming allows.

But in the process of data binding with PHP’s built-in functions, we actually work with several unmentioned objects and data types, which would have been a lot more transparent had we used a more structured programming language.

How Data Binding Works In Strongly Typed Languages

Bear with me for a moment while I show how the code block above would be different if we used a more strongly typed, more object-oriented programming language to accomplish the same task.

For example, if I want to populate a select list in ASP.NET from a database, I might declare a DropDownList control on my page, and use a SQLDataSource control to populate it:

<asp:DropDownList runat="server" ID="MyDropDown" DataSourceID="MyDataSource" DataTextField="some_column" DataValueField="some_other_column" />
<asp:SqlDataSource runat="server" ID="MyDataSource" SelectCommand="SELECT * FROM some_table" ConnectionString="<%$ ConnectionStrings:MyNorthwind%>" />

This level of abstraction hides from me all the objects and data types I might otherwise need to explicitly declare in order to add option elements to a select in ASP.NET. (In other words, it works like PHP in terms of writing it, because I don’t need to concern myself with data types or other pesky technical details; ASP.NET, like PHP, takes care of that for me.)

Actually, it’s better than PHP, because I am not writing control loops or working with things like result resources; I simply plug in some parameters and ASP.NET does the grunt work for me. This is abstraction at the height of its power.

An alternative approach in ASP.NET would be to use a subroutine to bind data from the query to a DropDownList control. If I wanted to do that, I would first declare a DropDownList control:

<asp:DropDownList runat="server" ID="MyDropDown"  />

Then, in my code behind / server script block, I would declare all the data types and variables needed to actually connect to, and extract from, a database server the records I want:

Imports System.Data
Imports System.Data.SqlClient

Sub Page_Load(Sender As Object, E As EventArgs) Handles Me.Load
	If Not Page.IsPostBack Then
		MyDropDown_dataBind()
	End If
End Sub

Sub MyDropDown_dataBind()
	Dim objConn As New SqlConnection("My Connection String")
	Dim objCmd As New SqlCommand("SELECT * FROM table", objConn)

	Try
		objConn.Open()
		Dim objReader As SqlDataReader = objCmd.ExecuteReader()

		If objReader.HasRows Then
			MyDropDown.DataSource = objReader
			MyDropDown.DataTextField = "some_column"
			MyDropDown.DataValueField = "some_other_column"
			MyDropDown.DataBind()
		Else
			MyDropDown.Items.Insert(0, New ListItem("No records returned!"))
		End If

		objConn.Close()
	Catch Ex As Exception
		MyDropDown.Items.Insert(0, New ListItem("There was a database error getting the records: " & Ex.Message))
	Finally
		objCmd.Dispose()
		objConn.Dispose()
	End Try
End Sub

Notice that the code above is far more explicit and complex than the previous code blocks. It employs several classes: SqlConnection, SqlCommand, SqlDataReader, DropDownList (referenced by calling MyDropDown), ListItem and Exception, to name several. Each of these classes are their own data types, and each employs primitive types, such as integers and strings, or even each other, as properties and / or arguments for methods they employ.

Strong typing means knowing exactly what you’re working with as you program. Abstraction, when done to the extent ASP.NET provides, means not having to worry one bit about what you’re working with.

PHP’s approach fits somewhere in the middle: You don’t really need to know what you’re working with, but not knowing can cause lots of trouble.

What’s Wrong With The Second Code Block

Let’s look now at the second code block and sort out its troubles.

while ($row2 = mysql_fetch_array(mysql_query($carts))) {
	echo "<option>$row2[0]</option>";<br />
}

Line 1 accomplishes several tasks, which we will evaluate from the right to the left, since that’s the order PHP will use:

  • We query the database; again, a mysql result resource is returned on success.
  • We read the first line of that result into an array named $row2 The internal record pointer of the result resource is advanced to the next record.
  • We repeat the process due to the while loop.

In other words, so long as the query always returns at least one record, the while condition is always satisfied, so the loop never stops executing. It just keeps spitting out the first record of the results set, over and over again.

The while condition always evaluates to true because the structure of the code does not consider how a result resource works. And that, in large part, is the fault of PHP’s weak data types.

When mysql_query() returns a result resource, what it’s really doing is telling the database server to hold, in the database server’s memory, the records requested.

The resource returned by the MySQL server isn’t itself an PHP variable, the way a string or integer is; it’s a reference PHP uses to tell the database server, “Remember the records I asked for with that query? The ones you bookmarked with the value stored in this reference? I want you to work with those records.”

Looking again at the questioner’s first code block:

$result2 = mysql_query($carts) or die(mysql_error());
while ($row2 = mysql_fetch_array($result2)) {
	echo "<option>$row2[0]</option>";
}
  • In Line 1, $carts is the string we will pass as a query; e.g., “SELECT * FROM table”. $carts is an internal variable — that is, a variable that exists within PHP itself, which PHP can manipulate directly.
  • In Line 1, $result2 is a resource. In this case, it will be a result, which is a reference (or a kind of bookmark) that MySQL gives to PHP after MySQL sets aside, in its memory, the records we requested. PHP doesn’t have these records; they remain on the database server. Whenever PHP wants to work with those records, it sends the reference (bookmark) held in $result2 back to MySQL, and MySQL gets, from its memory, the records referenced (bookmarked) by that PHP variable. That makes $result2 an external variable; PHP does not directly work with the information contained in $result2, but uses it to tell MySQL to perform tasks on its behalf.
  • Line 2 instructs MySQL to send, to PHP, the current record referenced by $result2’s row poiner (more on that in a moment). Once MySQL sends that record, PHP assigns that record’s columns to an associative array named $row2. In this case, PHP uses the external variable $result2 to remind MySQL which set of records PHP wants, and to tell MySQL to send to PHP information it can assign to an internal variable, $row2.
  • When PHP asks MySQL to send the information it assigned to $row2, it also tells MySQL to advance the row pointer to the next record in the result resource. The row pointer, which is part of the result resource, is how MySQL knows where PHP left off when it last worked with the records in the result resource. Because advancing the row pointer happens silently — that is, MySQL does this implicitly, as part of its mysql_fetch_array() tasks; you don’t explicitly command MySQL to advance the row pointer — it’s easy to forget about the row pointer altogether.

Inefficient, Inelegant, But By Design

So, how do we achieve the efficiency desired by this questioner?

In short, we don’t. By design, PHP requires a fairly verbose process to bind data to a page or HTML element; there’s really no more efficient manner of going about it. In fact, the block below is about as compact as it gets when binding option elements to a select, using only PHP’s built-in functions:

<select id="myselect" name="myselect">
<?php
	$link = mysql_connect("host", "user", "pass") or die("cannot connect to db server");
	mysql_select_db("dbname") or die("cannot select database");
	$rs = mysql_query("SELECT * FROM table") or die("cannot parse query");
	while($row = mysql_fetch_array($rs)) {
		echo "<option>$row[0]</option>";
	}
?>
</select>

If you have lots and lots of controls to bind in the same way, you might consider using a class. Among the benefits of a class, as we have seen in the past, is that we can reuse code for similar tasks.

Below is a class, named optionMaker, that takes as its constructor arguments a database query and link resource, respectively; its public method, makeOptions(), spits out option elements from the resulting record set.

<?php
class optionMaker {
	public $link, $query;
	private $rs, $row;

	function __construct($q, $l) {
		$this->query = $q;
		$this->link = $l;
		$this->getResults();
	}

	private function getResults() {
		if(!$this->rs = mysql_query($this->query, $this->link)) {
			unset($this->rs);
		}
		elseif(mysql_num_rows($this->rs) == 0) {
			unset($this->rs);
		}
	}

	public function makeOptions() {
		if(!is_null($this->rs)) {
			while($this->row = mysql_fetch_array($this->rs)) [
				echo "<option>" . $this->row[0] . "</option>n";
			}
		}
		else {
			echo "<option>No matching records or database error getting records</option>n";
		}
	}
}
?>

Note that this class isn’t very secure; it doesn’t check if the $link resource is valid, for example; it doesn’t sanitize the SQL it is sent; it will pass UPDATE, DELETE, DROP or other possibly harmful queries along. While these security holes shouldn’t be of major concern — presumably, you wouldn’t poison your own well by sending bad parameters to this class — it’s always best to secure your applications against unintended uses and attacks. Use this at your own risk.

(Herein lies another benefit of object-oriented languages: In ASP.NET, there are a number of ways to prevent controls from sending potentially harmful queries to a database, sanitize input, etc. with little or no extra coding, compared to PHP.)

The easiest way to use the class above is to save it to a php file; call it optionMaker.php. Then, include that PHP file wherever you want to use it. Your use of the class to populate three select elements might look like this:

<?php
require_once("optionMaker.php");
$link = mysql_connect("server", "user", "pass") or die("cannot connect to db server");
mysql_select_db("dbname") or die("cannot select database");
?>

<select id="select1" name="select1">
<?php
$options1 = new optionMaker("SELECT column1 FROM table1", $link);
$options1->makeOptions();
?>
</select>

<select id="select2" name="select2">
<?php
$options2 = new optionMaker("SELECT column1 FROM table2 WHERE column2 = 'foo'", $link);
$options2->makeOptions();
?>
</select>

<select id="select3" name="select3">
<?php
$options3 = new optionMaker("SELECT a.column1 FROM table3 AS a LEFT JOIN table4 AS b ON a.id = b.foreign_key WHERE b.column1 = 'foo' AND b.foreign_key IS NOT NULL", $link);
$options3->makeOptions();
?>
</select>

If the database server can’t parse the query you supply, or if the query returns no records, the class simply nulls out the result resource returned by the database server; the makeOptions() method, in turn, returns a single option, noting the problem, if the result resource is null. (Strong typing, once again, solves a problem: What happens if the class has errors or no results? By using null, we escape such situations gracefully, and provide useful feedback to the user.)

Note also that this class will always use only one column — the first column in a successful result resource — as the value of the option elements it outputs.

I distribute all code under the GNU GPL version 3.

5 Comments

  1. Wow, thanks for the whole article. I’ve learned alot from you, and I think I’ve made it to a blog post twice now!

    I haven’t delved too deeply into object and classes quite yet, but the solution seems to be a very good one to me!

    Maybe I should look into ASP.NET sometime too.

    Thanks again

  2. ASP.NET isn’t necessarily a better language than PHP.

    For example, weak typing makes it easier, when working in PHP, to pass variables around between functions and even to MySQL or other databases. PHP implicitly converts data to a needed type much more naturally and easily than ASP.NET does (this is especially notable, again, when dealing with nulls). As such, you generally encounter fewer data collisions in PHP.

    PHP code tends to be more concise and to incur considerably less overhead than ASP.NET, which is another plus in its column.

    Generally speaking, the more complex your program, the more important strong typing, abstraction, encapsulation and other object-oriented principles become. Or, the bigger your problem, the more ASP.NET is the solution.

    I generally recommend PHP for beginning Web programmers, for reasons I’ve noted in previous blog entries, but as I’ve also noted before, there are things ASP.NET simply does better than PHP.

    Which programming languages you use should be a matter of your personal preferences; you need to like the languages in which you program in order to be successful using them.

Leave a Reply

  • Check out the Commenting Guidelines before commenting, please!
  • Want to share code? Please put it into a GitHub Gist, CodePen or pastebin and link to that in your comment.
  • Just have a line or two of markup? Wrap them in an appropriate SyntaxHighlighter Evolved shortcode for your programming language, please!