Since I’ve been working with PHP for almost 8 years now, I’ve developed some tools which I find hard to do without, and which would be nice to see included in the core of PHP.
As I have forked off various stand-alone projects which don’t have my usual library of PHP code attached to them, I’ve found myself copying these functions around as they are truly essential to coding almost anything.
You’ll note a few programming principles in the following examples:
- Sensible defaults for values are always allowed to be provided by the caller
- Handle multiple input values (arrays) seamlessly when it makes sense
- Support functional programming such that return values generally allow further execution without if statements
And awaaaaaaay we go …
Number 5. To quote, or to unquote, that is the question …
This one’s a simple one, included mostly because it delivers a lot of bang for the buck. It’s another of those simple parsing techniques which turns out to be useful far more often than you’d expect.
I’ve used this to remove the quotes around a database table name in MySQL (`TUser`), to remove the parentheses from a table’s varchar size value “varchar(23)” (after extraction), and to remove quotes from HTML attributes I’m parsing using preg_match as flexibly as I can for bad HTML:
attribute := /("[^"]*"|'[^']*'|[^'"]+)/
Not for every circumstance, but that’s why it’s number 5.
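As a rough sketch of how an attribute pattern like that might be used with preg_match_all: the name= prefix, the sample tag, and the extra exclusion of whitespace and ">" in the unquoted branch are assumptions for illustration, not part of the pattern above.

```php
<?php
// Sketch only: the name= prefix and the sample tag are assumptions.
// A value is "double quoted", 'single quoted', or a bare run of characters.
$pattern = '/(\w+)\s*=\s*("[^"]*"|\'[^\']*\'|[^\'"\s>]+)/';
$html = '<a href="x.html" class=\'big\' id=main>';

$attributes = array();
if (preg_match_all($pattern, $html, $matches, PREG_SET_ORDER)) {
    foreach ($matches as $match) {
        // $match[1] is the attribute name, $match[2] the value with its quotes intact
        $attributes[$match[1]] = $match[2];
    }
}
print_r($attributes);
```

The values still carry whatever quotes they arrived with, which is the part unquote takes care of.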
Note that giving the reference parameter a default value (&$left_quote = null) requires PHP 5 or later.
/**
 * Unquote a string and optionally return the quote removed.
 *
 * @param mixed $s A string (or array of strings) to unquote
 * @param string $quotes A list of quote pairs to unquote
 * @param mixed $left_quote Returns the left quote removed, or false if nothing was removed
 * @return mixed Unquoted string, or same string if quotes not found
 */
function unquote($s, $quotes = "''\"\"", &$left_quote = null) {
    if (is_array($s)) {
        $result = array();
        foreach ($s as $k => $ss) {
            // $left_quote ends up describing the last element only
            $result[$k] = unquote($ss, $quotes, $left_quote);
        }
        return $result;
    }
    $left_quote = false;
    if (strlen($s) < 2) {
        return $s;
    }
    $qleft = strpos($quotes, substr($s, 0, 1));
    // Opening quotes sit at even offsets of $quotes; an odd offset is a closing quote
    if ($qleft === false || $qleft % 2 !== 0) {
        return $s;
    }
    // Square brackets replace the $quotes{...} offset syntax, which PHP 8 no longer accepts
    if (substr($s, -1) === $quotes[$qleft + 1]) {
        $left_quote = $quotes[$qleft];
        return substr($s, 1, -1);
    }
    return $s;
}
Usage of this would be:
$testimonial = unquote(trim($_POST['testimonial']), '""\'\'');
$table = unquote($table, '``');
$varchar_size = unquote($size, "()");
$php_quote_style = null;
$php_string = unquote($php_string, '""\'\'', $php_quote_style);
Hopefully, you get the idea.
Number 4. pair: The swiss army knife of parsing
And no, I didn’t misspell PEAR. The function is called pair. And its twin, pairr. The code is below, but first, how often do you parse stuff? Me, all the time. As the previous function unquote shows, I do it a lot.
While preg_match is good for almost every parsing need, sometimes you just want to rip something apart without all the extra baggage of regular expressions, and just want simple splitting of a string by a simple one-or-more character delimiter.
As a simple example, in the database code, it’s nice to be able to specify a database and table, instead of just a table at times. For simple systems, you would never do this, but once you start scaling your web site, you would do this all the time as you partition data into different databases. That said, I want to be able to pass database.table into a function, and parse it as it should be parsed. This is where the pair function comes into play:
/**
 * Breaks a string in half at a given delimiter, and returns default values if delimiter is not found.
 *
 * Usage is generally:
 *
 * list($table, $field) = pair($thing, ".", $default_left, $default_right);
 *
 * @param string $a A string to parse into a pair
 * @param string $delim The delimiter to break the string apart
 * @param string $left The default left value if delimiter is not found
 * @param string $right The default right value if delimiter is not found
 * @return array A size 2 array containing the left and right portions of the pair
 */
function pair($a, $delim = '.', $left = false, $right = false) {
    $n = strpos($a, $delim);
    return ($n === false)
        ? array($left, $right)
        : array(substr($a, 0, $n), substr($a, $n + strlen($delim)));
}
Now, I’m sure there are some talented PHP developers out there saying, “Why, I could do the same thing with explode($delim, $a, 2)! Ha ha to you, Mister Smarty Pants!”
To which I would say, “Yes, Grasshopper, but explode placed into an expression using list($a, $b) causes a warning in PHP when the delimiter is not found (Doh!), and it doesn’t have the neat defaults built in.”
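To make the explode objection concrete, here is a small sketch with a made-up input string. The array_pad workaround is my own illustration of what you would need without pair, not something from the library described here.

```php
<?php
// No "." in the input, so explode() returns a single-element array.
$parts = explode(".", "users", 2);

// list($db, $table) = $parts; would read the missing index 1 at this point
// and raise an undefined offset/key notice or warning, depending on PHP version.

// One workaround without pair(): pad the result to two elements first.
list($db, $table) = array_pad($parts, 2, null);

var_dump($db);     // string(5) "users"
var_dump($table);  // NULL
```

Unlike pair, this leaves the whole string on the left instead of substituting a default for each side.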
A nice practical example we have:
function sql_table_quote($table) {
    list($dbname, $table) = pair($table, ".", null, $table);
    $table = unquote($table, "``");
    return empty($dbname) ? "`$table`" : '`' . unquote($dbname, "``") . "`.`$table`";
}
If you’re wondering what the twin, pairr, looks like, it’s identical to pair, with strrpos on the first line to do a reverse-search for that delimiter.
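Going by that description, a minimal sketch of pairr might look like the following. The file names in the usage lines are invented for the example.

```php
<?php
/**
 * Sketch of pairr: like pair, but splits at the LAST occurrence of the delimiter.
 */
function pairr($a, $delim = '.', $left = false, $right = false) {
    $n = strrpos($a, $delim);
    return ($n === false)
        ? array($left, $right)
        : array(substr($a, 0, $n), substr($a, $n + strlen($delim)));
}

// Splits at the final dot: $name is "archive.tar", $extension is "gz"
list($name, $extension) = pairr("archive.tar.gz", ".");

// No delimiter found: the defaults "README" and null come back
list($name2, $extension2) = pairr("README", ".", "README", null);
```

Splitting a file name from its extension is a typical case where the last delimiter, not the first, is the one that matters.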
Number 3. path: Slash before … wait, after … oh forget it
Since one’s job as a developer is to avoid making mistakes, I’ll admit to one of the most common ones I’ve made in writing code: the infamous trailing slash after directory names … or not. You end up writing your code like one of the following lines:
$f = file_get_contents("$path/web-app.conf");
/* or */
$f = file_get_contents($path . "web-app.conf");
… depending on whether $path is “/my/web/directory/conf/” or “/my/web/directory/conf”. How many times have you found those phantom files just lingering above the directory where the files were destined to go?
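A quick sketch of how each style goes wrong when the assumption about the slash is off, using the two example values of $path from above (the variable names are mine):

```php
<?php
$with_slash = "/my/web/directory/conf/";
$without_slash = "/my/web/directory/conf";

// Plain concatenation assumes a trailing slash that is not there:
$phantom = $without_slash . "web-app.conf";
// "/my/web/directory/confweb-app.conf", a file one directory too high

// Interpolating with a slash doubles it up when one is already present:
$doubled = "$with_slash/web-app.conf";
// "/my/web/directory/conf//web-app.conf"

echo $phantom, "\n", $doubled, "\n";
```

The first result is the phantom file: it lands in /my/web/directory instead of the conf directory.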
It’s happened to me enough to want to avoid the issue completely if possible. Is there a slash after this variable when it’s in the configuration file I loaded on that machine there? I don’t know, nor do I care anymore.
Since engineers often use paths with trailing slashes (or without) in different contexts, I’ve completely bypassed the situation by adding a simple function, path, to my code repertoire, which avoids this issue completely.
You can write any path, and pass an array, an array and a string, or just a list of strings, and it will concatenate them together with exactly one slash between each pair of components.
Here it is:
/**
 * Create a file path and ensure only one slash appears between path entries
 * @param mixed path Variable list of path items, or array of path items to concatenate
 * @return string with a properly formatted path
 */
function path(/* dir, dir, file */) {
    $mixed = func_get_args();
    $r = array_shift($mixed);
    if (is_array($r)) {
        $r = call_user_func_array("path", $r);
    }
    foreach ($mixed as $p) {
        if (is_array($p)) {
            $p = call_user_func("path", $p);
        }
        $r .= (substr($r, -1) === "/" || substr($p, 0, 1) === "/") ? $p : "/$p";
    }
    $r = str_replace("/./", "/", $r);
    $r = preg_replace("|//+|", "/", $r);
    return $r;
}
Since manipulating files is, like, everything, this one is used all the time:
$file_extension = "csv";
$f = file_get_contents(path($site_root, "images/extension-icons", "$file_extension.gif"));
Number 2. Type conversion with defaults. to_bool, to_integer, to_double
As good developers, we all know that you check all user input into any script and make sure to screen it for bad values, right?
Well, maybe not.
That said, I try to, and after I’ve written some new functionality I walk through the code and make sure I’ve cleaned any $_REQUEST, $_GET, or $_POST values and asserted what type they should be … (Actually, I don’t use $_REQUEST either, but that’s another article, maybe …)
So this is a little cheating because it’s not one function but many, and they all do the same thing: Convert to a type, and return a default value if the conversion doesn’t work. Simple, yet essential.
I write my code knowing that “X should be an integer, and if it’s not, I’ll give a reasonable default and the code will still work correctly.”
/**
 * Ensures a value is an integer value. If not, the default value is returned.
 *
 * @param mixed $s Value to convert to integer
 * @param mixed $def The default value. Not converted to integer.
 * @return mixed The integer value, or $def if it can not be converted to an integer
 */
function to_integer($s, $def = null) {
    if (is_numeric($s)) {
        return intval($s);
    }
    return $def;
}

/**
 * Ensures a value is a double value. If not, the default value is returned.
 *
 * @param mixed $s Value to convert to double
 * @param mixed $def The default value. Not converted to double.
 * @return mixed The double value, or $def if it can not be converted to a double
 */
function to_double($s, $def = null) {
    if (is_numeric($s)) {
        return doubleval($s);
    }
    return $def;
}

/**
 * Parses $value for a boolean value. Intended for parsing developer or user
 * inputs, which include the values (case-insensitive):
 *
 * - True, False, T, F
 * - 0, 1
 * - Yes, No, Y, N
 * - On, Off
 * - Enabled, Disabled
 * - Null and the empty string (both parse as false)
 *
 * @param mixed $value A value to parse to find a boolean value.
 * @param mixed $default A value to return if parsing is unsuccessful
 * @return mixed Returns true or false, or $default if parsing fails.
 */
function to_bool($value, $default = false) {
    if (is_bool($value)) {
        return $value;
    }
    if (!is_scalar($value)) {
        return $default;
    }
    if (strpos(";1;t;y;yes;on;enabled;true;", ";" . strtolower($value) . ";") !== false) {
        return true;
    }
    if (strpos(";0;f;n;no;off;disabled;false;null;;", ";" . strtolower($value) . ";") !== false) {
        return false;
    }
    return $default;
}
I don’t know if it’s cheating to convert scalars to a string and then do the strpos, but I figure it’s probably faster than in_array.
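For comparison, here is a sketch of what the in_array route could look like. The name to_bool_lookup is hypothetical and this is not the code I actually use; I also make no claim about which version is faster. One behavioural difference is worth knowing: a value that itself contains a semicolon, such as "1;t", matches the strpos list as a substring but does not match here.

```php
<?php
// Hypothetical in_array variant of to_bool, for comparison only.
function to_bool_lookup($value, $default = false) {
    if (is_bool($value)) {
        return $value;
    }
    if (!is_scalar($value)) {
        return $default;
    }
    $v = strtolower((string) $value);
    // Strict comparison so that only exact string matches count
    if (in_array($v, array("1", "t", "y", "yes", "on", "enabled", "true"), true)) {
        return true;
    }
    if (in_array($v, array("0", "f", "n", "no", "off", "disabled", "false", "null", ""), true)) {
        return false;
    }
    return $default;
}

var_dump(to_bool_lookup("Yes"));          // bool(true)
var_dump(to_bool_lookup("off"));          // bool(false)
var_dump(to_bool_lookup("maybe", null));  // NULL
var_dump(to_bool_lookup("1;t", null));    // NULL
```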
For all three functions, note that the default passed in is returned unmodified by the function. This is essential to allow values to be returned which detect that the parsing failed. That is:
$a = to_integer($window_width, null);
if ($a === null) {
    echo "Window width is funky: $window_width";
}
If we converted the default to an integer, it would return zero, and how would that be distinguished from a $window_width of zero? Well, it wouldn’t. So there.
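The ambiguity is easy to see with PHP's own intval, which is what a converted default would boil down to. A tiny sketch, with "funky" standing in for any non-numeric input:

```php
<?php
// intval() alone cannot tell a real zero from garbage:
var_dump(intval("0"));          // int(0)
var_dump(intval("funky"));      // int(0)

// is_numeric() is what separates the two cases:
var_dump(is_numeric("0"));      // bool(true)
var_dump(is_numeric("funky"));  // bool(false)
```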
And the Number 1 most useful non-native PHP function that I use is … drum roll, please …
Number 1: avalue. Clean up that ugly isset code!
If you’re like me, you don’t like seeing notices, errors, or any sort of warnings at all when you execute your PHP code. How often have you imported a PEAR library, loaded some open source code, and found that when you run it you get a slew of warnings?
I don’t like warnings, errors or notices. Perhaps I’m just like that, but once you turn them on and clean up the code, you get in the habit, I believe, of writing much better code in the long run. I set error_reporting to E_ALL at the start of every page, script, or anything PHP, unless it’s written by someone else and will fill my logs with a bunch of lame errors.
How often have you seen code like this:
if (isset($_GET['q'])) {
    $query = $_GET['q'];
} else {
    $query = "dog food";
}
if (isset($_GET['language'])) {
    $lang = $_GET['language'];
} else {
    $lang = "en";
}
etc. For a bunch of form values. All the time. How, um. Fugly. How about this instead:
$query = avalue($_GET, 'q', 'dog food');
$lang = avalue($_GET, 'language', 'en');
Ahhh. Clean. A lot of the time, though, you’ll just see warning-filled code like the following:
$query = $_GET['q'];
$lang = $_GET['language'];
etc.
The warnings are there for a reason: It’s not a good idea to write bad code. In the case of arrays, you shouldn’t be accessing values which aren’t there, unless you’re sure of what you’re doing. Enter the avalue function, which I use everywhere. It’s as simple as it gets, yet I haven’t found anything more useful.
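function avalue($a, $k, $default = null) {
    // _backtrace() is my own stack-trace helper and is not shown in this article
    assert(is_array($a));
    if (!is_array($a)) {
        // trigger_error, not error_log: error_log's second argument is a message type, not an error level
        trigger_error("avalue: array expected, got " . gettype($a) . _backtrace(), E_USER_WARNING);
        return $default;
    }
    assert(is_string($k) || is_numeric($k));
    if (!is_string($k) && !is_numeric($k)) {
        trigger_error("avalue: key is of type " . gettype($k) . _backtrace(), E_USER_WARNING);
        return $default;
    }
    return array_key_exists($k, $a) ? $a[$k] : $default;
}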
This is also useful for validating input values… Huh?
What ‘cha talking about Willis?
How about this:
$def_lang = "en";
$languages = array(
    "en" => path(I18N_ROOT, "en_GB"),
    "fr" => path(I18N_ROOT, "fr_FR"),
    "de" => path(I18N_ROOT, "de_DE")
);
$language_path = avalue($languages, avalue($_GET, "language", $def_lang), $languages[$def_lang]);
$language_path is guaranteed to be a valid value in $languages, assuming $def_lang is a valid key in $languages.
If you add just one function to your PHP “core code”, try avalue; it’s a time- and life-saver for me.
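For what it’s worth, PHP 7.0 and later ship a null coalescing operator, ??, which covers the simplest avalue calls natively. A small sketch follows, with $input standing in for $_GET (an assumption, so that it runs anywhere). Note that ?? follows isset semantics, so a key holding null counts as missing, whereas the array_key_exists check in avalue would hand back the null.

```php
<?php
// $input stands in for $_GET in this sketch.
$input = array('q' => 'cats', 'page' => null);

$query = $input['q'] ?? 'dog food';   // key present: "cats"
$lang  = $input['language'] ?? 'en';  // key missing: "en", and no notice is raised
$page  = $input['page'] ?? 1;         // key present but null: falls back to 1

var_dump($query, $lang, $page);
var_dump(array_key_exists('page', $input)); // bool(true)
```

It does not give you the debugging hook that a function does, so I would treat it as a complement rather than a replacement.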
2 replies on “Top 5 Most Useful non-native PHP functions”
How about replacing avalue() with:
$name = isset($_GET['name']) ? $_GET['name'] : 'default';
or
$name = @$_GET['name'] ? $_GET['name'] : 'default';
or (php 5.3) shorter
$name = @$_GET['name'] ?: 'default';
or better still (php 5.3) longer
$name = isset($_GET['name']) ?: 'default';
personally if you’re running php 5.3+ then i would use
$name = isset($_GET['name']) ?: 'default';
instead of avalue() ;-)
Kind regards,
Scott
Absolutely, good tip, but you don’t want the “isset” result, you want the value from $_GET['name'].
So, I think what you want is:
$name = @$_GET['name'] ?: 'default';
isset returns a boolean, so your latter
isset($arr[$name]) ?: 'default'
returns true or ‘default’.
I’m going to say that I prefer the syntactic sugar of using avalue. While I use it a lot, it’s not a huge CPU hog and it adds a nice hook for debugging. I put in assert statements to make sure that $name is not NULL and is a proper scalar. This finds a variety of errors.
Translating the above $language_path to your syntax becomes:
$language_path = @$languages[@$_GET['language'] ?: $def_lang] ?: $languages[$def_lang];
Which isn’t too shabby, actually, though it may confuse some…
Thanks for the comments!