I’ve got a homebrewed PHP CMS running on my Slice, and I’ve always meant to look into memcached to improve pageload times. The other day, I finally got around to it, and it was very easy! I should have done this years ago. So come in, the water’s fine!
Are you running on a Debian/Ubuntu Slice?
apt-get install memcached php5-memcache
This will install both memcached itself and the php extension, and start the memcached server running on your Slice, listening on localhost:11211 and using up to 64 MB of your RAM. If you want to increase or decrease this, edit /etc/memcached.conf.
telnet localhost 11211
Type “stats” and hit enter; memcached should answer with a list of STAT lines, which tells you the server is up. Now type Ctrl+], then Ctrl+d to quit the telnet session.
Memcached has three main functions that you care about: get, set, and delete. It stores your data like a Python dictionary, as key-value pairs. So the “key” is the name you want to give the data, and the value is the actual data.
One difference from a Python dictionary is that Memcached has no way of retrieving a list of all the keys it has stored. So make sure you’re not storing values in memcached with keys that you’re generating on the fly, such as from the current timestamp or something, because you’re not going to be able to ask memcached to give you a list of keys. I couldn’t believe this when I learned it, but it’s true. So give your keys predictable names that you won’t have to guess at in your code.
For instance, I’ve got a key called “recent_comments”, and the value of that is some HTML representing recent comments that were left on my blog. Another key is called “recent_entries”, and it contains the 12 most recent blog posts. I’m also memcaching the comments on individual blog posts. So the key for a blog post’s comments might be “entry_comments_11298” if that blog post has the primary key in its id column of “11298”. The key is unique, so I won’t be storing one thing and retrieving it somewhere inappropriate, but it’s easy to figure out what the key should be for a given blog post, because all I do is concatenate the prefix “entry_comments_” to the entry_id.
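To keep keys like that predictable, it can help to build them in exactly one place. Here's a minimal sketch of what I mean; the function name entry_comments_key is made up for this example and isn't part of my CMS or of the memcache extension.

```php
<?php
// Hypothetical helper: builds the cache key for one blog post's comments.
function entry_comments_key($entry_id) {
    // Cast to int so that "11298" and 11298 always produce the same key.
    return "entry_comments_" . (int) $entry_id;
}

echo entry_comments_key(11298), "\n"; // prints entry_comments_11298
```

Any code that stores, fetches, or deletes a post's comments can then call the same helper instead of repeating the string concatenation.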
I found a handy-dandy PHP class that helps you connect to your memcached server on this blog. Here’s that guy’s code, slightly modified by me because I didn’t like his idea of having a separate “constants.php” file to include, so I just inlined his constants at the top of my version:
<?php
define("__MEMHOST","localhost");
define("__MEMPORT",11211);
class clsMem extends Memcache {
    static private $m_objMem = NULL;
    static function getMem() {
        if (self::$m_objMem === NULL) {
            self::$m_objMem = new Memcache;
            // connect to the memcached server on
            // host __MEMHOST, listening on port __MEMPORT
            self::$m_objMem->connect(__MEMHOST, __MEMPORT)
                or die ("Dave's not here, man.");
        }
        return self::$m_objMem;
    }
}
Stick that in a file called memcache.php, and require_once it in the head section of your main template file. I use a head.php file that all my other PHP pages include, so I just stuck this line in it:
require_once("/var/www/servers/whatever.com/memcache.php");
Now you have a braindead easy way to connect to Memcached and operate/manipulate your cache. Just call methods on clsMem::getMem(), like so:
$variable = clsMem::getMem()->get("some_key");
That fetches the contents of some_key from Memcached into $variable, so you can echo it or do something else with it. Need to set a variable? No problem:
clsMem::getMem()->set("some_key", $variable, false, 600);
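The “false” argument is the flag parameter, and a false (zero) value means your $variable is stored uncompressed; pass MEMCACHE_COMPRESSED there instead if you want it compressed. The 600 is the number of seconds before memcached will expire your cached data.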
After a while, I noticed that there was a pattern to how I was using Memcached, so I decided to abstract it out into a convenience function, which I added as a static method inside the clsMem class in memcache.php.
static function cache_or_get($cachekey, $create_item_func, $timeout=2592000){
    $cacheitem = clsMem::getMem()->get($cachekey);
    // get() returns false on a miss, so compare strictly;
    // a cached empty string or 0 still counts as a hit
    if ($cacheitem !== false) {
        return $cacheitem;
    }
    $cacheitem = $create_item_func();
    if (!clsMem::getMem()->set($cachekey, $cacheitem, false, $timeout)) {
        //maybe we should log something here, or send an email to an administrator
    }
    return $cacheitem;
}
Now everywhere that I would have run a computationally expensive function to fetch rows from the database, build some HTML, and echo a chunk of one of my pages, I call cache_or_get. If the item is already cached, I get it. If it’s not, I go ahead and run my expensive function to generate it, then cache it for a month (unless I pass in a different number of seconds).
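If you want to convince yourself that the expensive function really only runs on a miss, here's a rough sketch you can run without a memcached server. FakeCache and cache_or_get_with are stand-ins I made up for this demo: FakeCache only mimics the get/set method names and the "false on a miss" behavior, and it ignores flags and expiry entirely. The sketch compares against false strictly, so a cached empty string still counts as a hit.

```php
<?php
// Hypothetical in-memory stand-in for a Memcache connection (demo only).
// It returns false on a miss and ignores the flag and expiry arguments.
class FakeCache {
    private $data = array();
    public function get($key) {
        return array_key_exists($key, $this->data) ? $this->data[$key] : false;
    }
    public function set($key, $value, $flag = 0, $expire = 0) {
        $this->data[$key] = $value;
        return true;
    }
}

// Same idea as cache_or_get, but the cache object is passed in as a parameter.
function cache_or_get_with($cache, $cachekey, $create_item_func, $timeout = 2592000) {
    $cacheitem = $cache->get($cachekey);
    if ($cacheitem !== false) {
        return $cacheitem; // hit: skip the expensive work
    }
    $cacheitem = call_user_func($create_item_func); // miss: build the item
    $cache->set($cachekey, $cacheitem, 0, $timeout);
    return $cacheitem;
}

// Count how many times the "expensive" function actually runs.
$calls = 0;
$expensive = function () use (&$calls) {
    $calls++;
    return "<ul><li>entry</li></ul>";
};

$cache  = new FakeCache();
$first  = cache_or_get_with($cache, "recent_entries", $expensive);
$second = cache_or_get_with($cache, "recent_entries", $expensive);

echo $calls, "\n"; // prints 1: the second call was served from the cache
```

Passing the cache in as a parameter is just a convenience for experimenting; on the real site the method keeps calling clsMem::getMem() directly.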
Let’s say I have a naive, expensive, uncached function called get_recent_entries.
function get_recent_entries(){
    $var = do_expensive_stuff();
    return $var;
}
I just create a new function with the same name, and rename my old function, like so:
function uncached_get_recent_entries(){
    $var = do_expensive_stuff();
    return $var;
}
function get_recent_entries(){
    // pass the function name as a string; a bare name is an undefined constant
    return clsMem::cache_or_get("recent_entries", "uncached_get_recent_entries");
}
It’s like a surgical precision strike. None of your other code even needs to know that you’re using memcached! Brilliant, if I do say so myself. Here’s another version where I manually set the expiration time, because it’s generating a calendar that changes once every 24 hours:
function uncachedMonthCal(){
//do some long boring calculations, and then...
return $monthCal;
}
function seconds_to_midnite(){
$time = time();
return mktime(0,0,0,date('m', $time), date('j', $time) + 1, date('Y', $time)) - $time;
}
function monthCal(){
    // pass the function name as a string; a bare name is an undefined constant
    return clsMem::cache_or_get("monthCal", "uncachedMonthCal", seconds_to_midnite());
}
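The midnight arithmetic is easier to check if the current time is passed in instead of read from the clock. Here's a small sketch of that idea; seconds_until_midnight is a hypothetical variant of my function, and I'm assuming UTC so the result doesn't depend on the server's timezone or on daylight saving.

```php
<?php
// Assumption: UTC, so the answer is the same on any server.
date_default_timezone_set('UTC');

// Hypothetical variant of seconds_to_midnite() that takes the time as a parameter.
function seconds_until_midnight($now) {
    // mktime() rolls an out-of-range day (e.g. 32) over into the next month.
    $midnight = mktime(
        0, 0, 0,
        (int) date('n', $now),
        (int) date('j', $now) + 1,
        (int) date('Y', $now)
    );
    return $midnight - $now;
}

$now = mktime(23, 0, 0, 6, 15, 2020);      // 23:00:00 on 15 June 2020
echo seconds_until_midnight($now), "\n";   // prints 3600
```

On a server set to a timezone with daylight saving, the day a clock change happens will be an hour longer or shorter than 86400 seconds, which is why the sketch pins UTC.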
Now let’s say you need to delete a cache item. You’ve posted a new blog entry, or one of your readers has posted a comment, so the cache needs to be deleted in order for the next call to cache_or_get() to recreate it. Otherwise, your page will be painfully outdated as soon as what’s in your database is newer than what’s being pulled from Memcached and displayed. What to do? Well, determine exactly which keys you need to delete. Then delete them. Here’s a snippet of my code that runs right after a new comment is inserted into the database:
//delete the memcached blog-wide recent comments
clsMem::getMem()->delete("recent_comments");
//delete the memcached comments listing for this blog post
clsMem::getMem()->delete("entry_comments_" . $comment_entry_id);
So even though I had told Memcached to store those two keys for a month, I’ve manually deleted them and they’ll be rebuilt the next time a pageload happens that calls cache_or_get() for either or both of them. That’s how you make sure your page stays as fresh as your data.
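One way to keep track of which keys go stale together is to wrap the deletes in a single function. This is only a sketch under my own assumptions: InMemoryCache is a made-up stand-in that copies the get/set/delete method names so the example runs without a server, and invalidate_comment_caches is a hypothetical helper, not something from the memcache extension.

```php
<?php
// Hypothetical in-memory stand-in with the same get/set/delete method names (demo only).
class InMemoryCache {
    private $data = array();
    public function get($key) {
        return array_key_exists($key, $this->data) ? $this->data[$key] : false;
    }
    public function set($key, $value, $flag = 0, $expire = 0) {
        $this->data[$key] = $value;
        return true;
    }
    public function delete($key) {
        unset($this->data[$key]);
        return true;
    }
}

// Hypothetical helper: drop every cached fragment that a new comment makes stale.
function invalidate_comment_caches($cache, $comment_entry_id) {
    $cache->delete("recent_comments");
    $cache->delete("entry_comments_" . $comment_entry_id);
}

$cache = new InMemoryCache();
$cache->set("recent_comments", "<ul>old</ul>");
$cache->set("entry_comments_11298", "<ol>old</ol>");
$cache->set("recent_entries", "<ul>posts</ul>");

invalidate_comment_caches($cache, 11298);

var_dump($cache->get("recent_comments"));      // bool(false): will be rebuilt on next request
var_dump($cache->get("entry_comments_11298")); // bool(false)
var_dump($cache->get("recent_entries"));       // still cached, a comment doesn't affect it
```

With the real thing you'd pass clsMem::getMem() as the first argument right after the comment is inserted into the database.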
I hope you can see from my examples how profoundly easy and painless it can be to implement memcached and speed up your site. I only wish someone had taken me by the hand and shown me just how simple it is. Good luck! If you come up with any improvements to the way I’m doing things, please share!