Automatically Hash Tagging Text With PHP And MySQL Part 2: Adding New Hash Tags To The Database Table

Yesterday I wrote a short script that will automatically tag a subject string with terms contained in a database table. Today, I’m going to alter that script, so that user-designated hash tags will be added to the database.

For example, suppose we have a hash tag database table that contains the terms winter, summer and sun. If we apply those to the opening line of Richard III, it looks like this:

Now is the #winter of our discontent made glorious #summer by this #sun of York;
and all the clouds that lour'd upon our house in the deep bosom of the ocean buried.

But what if we also want to hash tag clouds, both now and in the future? We could tag it by hand every time, but a convenient way to add that term to the database would be far more helpful.

That’s what today’s script will do: Find all the words we have manually hashtagged in the subject string and add them to the database, if they are not in the database already and we confirm we want them added.

Alterations To The HTML Form

The HTML form is fundamentally the same as before. We add a pair of radio buttons, however, that determine whether new terms should be added to the database.

<form id="tform" name="tform" action="<?php echo htmlspecialchars($_SERVER['PHP_SELF']); ?>" method="post">
	<textarea id="ttext" name="ttext" cols="50" rows="3"><?php echo isset($_POST['ttext']) ? htmlspecialchars($_POST['ttext']) : ''; ?></textarea>
	<br />
	Add new terms to the database?<label><input type="radio" name="tadd" value="1" /> Yes</label> <label><input type="radio" name="tadd" value="0" checked="checked" /> No</label> 
	<br />
	<input type="submit" name="submit" id="submit" value="Submit" />
</form>

Changes To The Database Global Constants

Because we’re going to run two queries, I’ve changed the database constants a bit, modifying the previous select query to instead be the name of the table and the name of the terms column, so that I can build both select and insert queries.

//your database server variables
define('MYSQL_HOST', 'hostname');
define('MYSQL_USER', 'db_user');
define('MYSQL_PASS', 'db_password');
define('MYSQL_DB', 'db_name');
define('MYSQL_TABLE', 'taxonomy_table_name');
define('MYSQL_TERM_COLUMN', 'taxonomy_column_name');

“Taxonomy” is a more formal name for tags / categories / labels. I’m using it here specifically for link bait. So sue me.
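
To illustrate how a table-name constant and a column-name constant can feed both kinds of query, here is a minimal sketch. The helper names at_select_sql and at_insert_prefix are hypothetical and do not appear in the original script; the constant values are the same placeholders used above.

```php
<?php
// Placeholder values mirroring the constants above; substitute your own.
if (!defined('MYSQL_TABLE')) {
	define('MYSQL_TABLE', 'taxonomy_table_name');
}
if (!defined('MYSQL_TERM_COLUMN')) {
	define('MYSQL_TERM_COLUMN', 'taxonomy_column_name');
}

// Hypothetical helper: a SELECT that fetches every stored term.
function at_select_sql() {
	return 'SELECT ' . MYSQL_TERM_COLUMN . ' FROM ' . MYSQL_TABLE . ';';
}

// Hypothetical helper: the opening of an INSERT, to be followed by value groups.
function at_insert_prefix() {
	return 'INSERT INTO ' . MYSQL_TABLE . ' (' . MYSQL_TERM_COLUMN . ') VALUES ';
}

echo at_select_sql(), "\n";
echo at_insert_prefix(), "\n";
```

Because both strings are assembled from the same two constants, renaming the table or column only requires a change in one place.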

Changes To The Autotagging PHP Function

Because we now expect to see hashtags in the input text, we need to remove any extras that show up, since the process of autotagging may now add extraneous hash tags. (In truth, I should have checked for the presence of hash tags in the original function. Oops.)

function autotag($input, $terms) {
	//tags $input with $terms
	//returns false on error, tagged string on success

	if(strlen(trim($input)) < 1) {
		trigger_error('function autotag: string to be tagged is empty', E_USER_WARNING);
		return false;
	}
	if(!is_array($terms)) {
		trigger_error('function autotag: terms is not an array', E_USER_WARNING);
		return false;
	}

	$tmp = array();
	foreach($terms as $term){
		//matches will be terms exactly as in database,
		//followed by whitespace or end of string;
		//preg_quote escapes any regex metacharacters in a term
		$tmp[] = '/(' . preg_quote($term, '/') . ')(\s|$)/i';
	}
	$out = preg_replace($tmp, '#$0', $input);
	$out = preg_replace('/#{2,}/', '#', $out);
	return $out;
}
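
To see why the second preg_replace matters, here is a condensed sketch of the same tagging steps applied to text that already carries a hash tag. The name autotag_sketch is hypothetical, chosen so it does not clash with the real function. The error checks are omitted, and the preg_quote call is an added precaution.

```php
<?php
// Condensed restatement of the tagging steps; not the full function.
function autotag_sketch($input, array $terms) {
	$patterns = array();
	foreach ($terms as $term) {
		// each term, followed by whitespace or end of string, case-insensitive
		$patterns[] = '/(' . preg_quote($term, '/') . ')(\s|$)/i';
	}
	// prefix every match with "#"
	$out = preg_replace($patterns, '#$0', $input);
	// collapse runs of "#" created when a term was already tagged by hand
	return preg_replace('/#{2,}/', '#', $out);
}

$terms = array('winter', 'summer', 'sun');
$line  = 'Now is the #winter of our discontent made glorious summer by this sun of York';
echo autotag_sketch($line, $terms), "\n";
```

Without the final step, the already-tagged word would come back as ##winter, because the first pass adds a hash mark in front of every match regardless of what precedes it.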

A PHP Function To Save New Tags To The Database

Now, for the function that will save new terms to the database.

  • We’ll identify all hashtagged words in the subject string, then place them in an array.
  • We’ll compare that array against the current terms array; any terms in the subject that aren’t in the original terms list should be added to the database.
  • A SQL statement is built and then passed to the database server.
  • The function returns the added terms, if any.
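
The first two steps can be sketched on their own, with no database involved. The helper name find_new_terms is hypothetical; the regular expression is the same one save_tags uses.

```php
<?php
// Hypothetical helper covering only the extract-and-compare steps.
function find_new_terms($input, array $known_terms) {
	$matches = array();
	// "#" plus word characters, followed by whitespace or end of string
	preg_match_all('/#\w+(\s|$)/', $input, $matches);

	$found = array();
	foreach ($matches[0] as $tag) {
		// strip the hash mark and surrounding whitespace, then lowercase
		$found[] = strtolower(trim(str_replace('#', '', $tag)));
	}
	// array_diff keeps the original keys, so reindex the result
	return array_values(array_diff($found, $known_terms));
}

$known = array('winter', 'summer', 'sun');
$text  = 'Now is the #winter of our discontent, and all the #clouds upon our house';
print_r(find_new_terms($text, $known));
```

Note that a tag followed directly by punctuation, such as #clouds, with a trailing comma, does not match this pattern, since the pattern requires whitespace or the end of the string after the word.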

function save_tags($input, $terms) {
	//save new tags to database
	//returns Boolean false on error, 
	//string of added terms on success
	
	if(strlen(trim($input)) < 1) {
		trigger_error('function save_tags: string to be tagged is empty', E_USER_WARNING);
		return false;
	}
	if(!is_array($terms)) {
		trigger_error('function save_tags: terms is not an array', E_USER_WARNING);
		return false;
	}
	
	//get terms from subject string
	$tmp = array();
	$new_terms = array();
	preg_match_all('/#\w+(\s|$)/', $input, $tmp);
	foreach($tmp[0] as $term) {
		$new_terms[] = trim(strtolower(str_replace('#', '', $term)));
	}
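	//drop duplicates and compare case-insensitively against existing terms
	$tmp = array_diff(array_unique($new_terms), array_map('strtolower', $terms));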
	
	//if new terms, add to database
	if(count($tmp) > 0) {
		//build sql string
		$sql = "INSERT INTO " . MYSQL_TABLE . " (" . MYSQL_TERM_COLUMN . ") VALUES ";
		foreach($tmp as $term) {
			$sql .= "('$term'), ";
		}
		$sql = substr($sql, 0, strlen($sql) - 2);
		$sql .= ";";
	
		//insert to database table
		if(!$link = mysql_connect(MYSQL_HOST, MYSQL_USER, MYSQL_PASS)) {
			trigger_error('function save_tags: Cannot connect to database server. Please check your host name and credentials', E_USER_WARNING);
			return false;
		}
		
		if(!mysql_select_db(MYSQL_DB)) {
			trigger_error('function save_tags: Cannot select the database. Please check your database name', E_USER_WARNING);
			return false;
		}
		
		if(!$rs = mysql_query($sql)) {
			trigger_error('function save_tags: Error parsing query. MySQL error: ' . mysql_error(), E_USER_WARNING);
			return false;
		}
	}
	return implode(", ", $tmp);
}
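
The mysql_* functions used above were deprecated in PHP 5.5 and removed in PHP 7.0. On newer PHP versions, the database step could be sketched with PDO and a prepared statement instead. This is only a sketch under assumptions: the helper names build_insert_sql and save_terms_pdo are hypothetical, and it assumes a PDO connection configured with PDO::ERRMODE_EXCEPTION already exists.

```php
<?php
// Hypothetical helper: builds an INSERT with one "?" placeholder per term.
// $table and $column must come from trusted configuration (such as the
// constants above), because identifiers cannot be bound as parameters.
function build_insert_sql($table, $column, $count) {
	$placeholders = implode(', ', array_fill(0, $count, '(?)'));
	return "INSERT INTO $table ($column) VALUES $placeholders;";
}

// Hypothetical PDO variant of the database step.
// Assumes $pdo throws exceptions on failure (PDO::ERRMODE_EXCEPTION).
function save_terms_pdo(PDO $pdo, $table, $column, array $terms) {
	if (count($terms) < 1) {
		return 0; // nothing to insert
	}
	$stmt = $pdo->prepare(build_insert_sql($table, $column, count($terms)));
	// positional placeholders expect a zero-indexed list of values
	$stmt->execute(array_values($terms));
	return $stmt->rowCount();
}

echo build_insert_sql('taxonomy_table_name', 'taxonomy_column_name', 2), "\n";
```

Binding the terms as parameters keeps them out of the SQL string entirely, so the query stays safe even if the extraction pattern is later loosened to allow more than word characters.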

The Invocation Code

To have the script add the new terms to the database, we simply check whether the value of the radio button is 1. If so, we’ll go ahead and call save_tags.

$terms = at_get_terms();
$content = "Enter text in the textarea below, then click Submit. The text will be automatically tagged with terms contained in the database. Any newly tagged terms will be added to the database.";

if(isset($_POST['submit'])) {
	$content = "<strong>Hashtagged string:</strong> " . autotag(htmlspecialchars($_POST['ttext']), $terms);
	if($_POST['tadd'] == '1') {
		$content .= "<br /><strong>Terms added to database:</strong> " . save_tags(htmlspecialchars($_POST['ttext']), $terms);
	}
}
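
Form values arrive as strings, and the tadd key will be missing if a request does not come from the form. A small guard can make the check explicit; this is a sketch, and the helper name should_save_terms is hypothetical.

```php
<?php
// Hypothetical helper: true only when the "Yes" radio button was submitted.
function should_save_terms(array $post) {
	return isset($post['tadd']) && $post['tadd'] === '1';
}

var_dump(should_save_terms(array('tadd' => '1')));
var_dump(should_save_terms(array('tadd' => '0')));
var_dump(should_save_terms(array()));
```

In the invocation code, the condition would then read if(should_save_terms($_POST)).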

And once again, that’s it.

I have a demo running that goes through the motions of adding new hashtags to the database, but doesn’t actually do so (for obvious reasons; some of you can’t be trusted to keep it clean, in a technical or grammatical sense). It’s at http://demo.dougv.com/php_auto_hashtag_update/

Source code on github: https://github.com/dougvdotcom/php_auto_hashtag. Specifically, you want index2.php, not index.php.

All links in this post on delicious: http://www.delicious.com/dougvdotcom/automatically-hash-tagging-text-with-php-and-mysql-part-2-adding-new-hash-tags-to-the-database-table
