SCRIPTSTER. East Java Baker: 2007-09-02

Friday, September 7, 2007

PHP editor review: gPHPEdit

gPHPEdit is not the most complete editor you will ever use, but it has a simple and uncluttered interface which makes writing code easy. It’s based on the Scintilla editor, takes it a step further than SciTE and uses Gnome 2. What it lacks in features it makes up for with pure speed: it opens at exactly the place you last left it before you can blink.

The main features are:

Syntax highlighting for all functions up to PHP 4.3
Code assistance (function, parameter assistance PHP 4.3 only)
Syntax checking
Tabbed viewing
Support for HTML and CSS

The syntax highlighting works, but if you’re like me (as you can tell from the site) you like to work with a black background; otherwise I get very sore eyes after a while. After changing all the settings there still remained some portions of the code with the default white background. On searching the site I found that this bug has been fixed; the fix will be included in the 1.0 release or can be downloaded via CVS. Another problem with a black background: when you select white text you can’t read it any more, as there is no way to change the highlighting color.

The PHP code assistance is great and, just like the application itself, is displayed very quickly and can be adjusted. The syntax checker is manual and has to be called from the menu or with F9, which can be quite annoying, and it does not check any functions or parameters within the file.

The search facilities are very basic: only one file can be searched, and the search only highlights the first occurrence at the top of the file, so you must step through each occurrence in turn. This can be quite tedious for large files with many partial matches.

If you’re working on a large project with multiple folders, navigation can be quite tedious. On smaller projects, where all files are in a small number of folders, the handy list of classes with their functions can be displayed on the left of the screen for easy navigation. But beware of files that contain only functions, as these will be listed too, so a folder containing numerous files of this type can be very difficult to navigate.

Overall, this editor is quick and easy to work with, but without some more advanced features it feels a bit naked; the lack of automated indentation and automatic syntax checking becomes more and more frustrating after long-term use. Quick folder navigation would be an enormous improvement, and these three features alone would push this application onto the desktops of many more developers. Despite the lack of more advanced features it isn’t a tool I would quickly uninstall. If you’re an Eclipse user, for example, and you want to quickly open a couple of files to view and perform small edits, this is where it excels. In the time that Eclipse takes to start you can open, edit and save a file and be doing something else. This application is approaching version 1.0 and is showing some good signs; with some further development and added features it would be a much more useful application.

CakePHP & CodeIgniter Benchmark

After reading How fast is your framework I was rather intrigued to compare CakePHP and CodeIgniter (I have also added Symfony). The two frameworks are very similar, and one of the best comparisons I’ve heard is “Cake is like a strict teacher while CodeIgniter is your favorite sports teacher”, though I can’t remember where I found it. For me this sums up the two: Cake has very strict naming conventions and so on, while CodeIgniter allows you more leeway. This is a particular advantage if you’re using databases that you do not control or do not have permission to modify. Anyway, back to the subject of this post: the benchmark of both these frameworks.

Update (11 Dec 2006):
After some kind pointers in the right direction from Larry E. Masters (aka PhpNut) and Nate from CakePhp.org, who showed me small modifications and oversights in my testing procedure, I was happy to make corrections. (Thanks for the help, guys.) You will notice the addition of var $helpers = null; to the helloworld controller, along with some modifications to app/config/core.php: setting the AutoSession value to false greatly improved the performance. After some discussion it is clear that the results of this quick benchmark don’t really reflect a real-world environment, so a more complete set of benchmarks for a simple app will be added shortly. These results will not only be more useful but will also test the more advanced features of each framework. The change to production mode also cleared up the failed requests.
System:

System: 3.0 GHz Intel, 512 MB RAM
OS: Ubuntu Edgy
Webserver: Apache 2.0
PHP: 5.1.6

Framework:

CakePHP 1.1.11.4064
CodeIgniter 1.5.1
Symfony 1.0 beta1

Test:

Each framework is required to produce the output “HelloWorld!” from a view, so a controller obviously has to be created as well. The code I have used is shown below along with the benchmarking results. No caching or logging is enabled and no databases are connected. I’ve used ApacheBench for the test, called with 10 concurrent users and running for 60 seconds:
ab -c 10 -t 60 http://framework/helloworld

CakePHP

Controller: helloworld_controller.php

<?php
class HelloWorldController extends AppController {
    var $layout = null;
    var $autoLayout = false;
    var $uses = array();
    var $helpers = null;
    function index()
    {
    }
}

CodeIgniter

Controller: helloworld.php

<?php
class Benchmark extends Controller {
    function Index()
    {
        $this->load->view('benchmark_view');
    }
}

Symfony

Controller: action.class.php

<?php
class mymoduleActions extends sfActions
{

  public function executeIndex()
  {
  }
}

Also, for Symfony the layout.php was modified to contain only

<?php echo $sf_data->getRaw('sf_content') ?>

as it was adding a large amount of HTML content. The views are identical, just containing ‘Hello World!’

Results:

I have moved the results to another page to aid reading because as I add each framework it’s starting to get quite long. For the full results: Web Framework Benchmarking Results
Requests per second for each framework:

CodeIgniter:  58.51
CakePHP:      37.46 29.67
Symfony:      22.78

The results leave no doubt that CodeIgniter is much faster than CakePHP. There must also be some concern about CakePHP in that 213 requests failed. Symfony comes last, but as it’s still in beta maybe this will improve; when there’s a stable release I will run the benchmark again.

If you would like some other frameworks tested, leave a comment and I will try to add them later. I will try to add Django later this week, and then RoR.

In response to Chris Hartjes' "More Framework Fun," aka Why The Zend Framework is a Bad Idea

posted by: Nate

With a few notable exceptions, I tend to avoid speaking publicly about my feelings on the Zend Framework, for reasons which will become clear shortly if they aren't already. However, Chris Hartjes and I were engaged in a discussion about the merits of the Zend Framework vs. CakePHP, which Chris recently blogged about here. While we agreed on most points, and most importantly the conclusion, Chris' post didn't quite capture the central points of my argument, or more likely, I didn't put them across very clearly.

What started off as a follow-up comment to his post quickly turned into a post in its own right. I know I haven't posted in a while, and a lot has been going on which I need to catch up on / post about, but I promise I'll get to that shortly. For now, without further ado, I give you the crux:

First, I don't think the addition of a console makes Cake any less MVC-oriented. While I could theoretically build non-MVC console-based (obviously) apps off of it, the console tools themselves are ancillary to the actual framework structure, and are primarily designed to assist in tasks focused on building apps off of that structure.

This distinction is important and intentional: the classes and components in the CakePHP core were all designed together, the better to work together (unlike other PHP projects where multiple independent solutions are merely "glued" together in hopes of the best). Not only that, but they were designed to work together within a particular structure and hierarchy. The ability to develop applications quickly derives from this synergy, and the further away from that you move, the more you begin to slide down the slippery slope of becoming Yet Another PHP Component Library.

Second, in terms of comparing Cake to ZF, the ORM layer is important, but it's really secondary to a couple of other factors, specifically the underlying philosophy and core architecture. CakePHP's core architecture is far more developed and vastly superior to the Zend Framework's. I contend that this is a direct result of having a clear definition (or lack thereof) of each project's philosophy, and actually sticking to it (or not, respectively).

While CakePHP has a very solidified, well-defined and focused philosophy, the Zend Framework is based on the mildly-specific-at-best principles of "Extreme Simplicity" and ye olde 80/20 Rule, i.e. "20% of stuff that 80% of people are likely to use" (or is it "80% of the stuff that 20% of people will use"? Someone please correct me, as this is likely a misquote).

The other important point here is that, when it comes to philosophy, we actually stick to our guns. While Cake is very robust and fairly comprehensive, it is still very light compared to most other frameworks or libraries in its class. The Zend Framework, on the other hand, has, as of this writing, core components for Amazon and Flickr web services, a measurement unit conversion component, and an Audioscrobbler component, yet it has no ORM layer. Even the simplest database interaction in the Zend Framework essentially consists of manually writing SQL queries in PHP syntax.

In comparing each framework's size in terms of code and functionality, the Zend Framework is significantly bigger than Cake. Here in the U.S., we are typically of the mind that "bigger is better" (often to our own collective detriment). It logically follows then, that the Zend Framework is more capable, and therefore better. However, as we have seen, not only is it missing key elements of functionality which are highly relevant in the web development arena, but it lacks a clear vision or discernible goals, which is why additional features and components continue to pile up in a seemingly ad-hoc fashion. It has some very spiffy features, to be sure, but as a project lacks the vision or direction to decide what to do with or about all of them. You could say the framework is all dressed up with nowhere to go.

However, even without a specific direction, it seems that the Zend Framework is proceeding rather quickly towards its as-yet-unknown destination. A cursory review shows that they have nearly 200 contributors, with over 100 commits in the past week. So perhaps a more accurate analogy would be sitting behind the steering wheel of a Formula 1 racer, only without the steering wheel.

In a good framework, components that provide specific, application-level functionality are ancillary to the framework's core, and for good reason. Suppose you have a web application which you would like to extend by adding an XML API, but your framework of choice does not natively support outputting data as XML (for the record, CakePHP does support XML output). In a well-designed MVC application, it is a simple matter to integrate an external library with your view tier to handle XML output. Such is the importance of a strong core architecture: when an application is provided with structure by default, it is given a solid foundation for expansion. The architecture defines a place within the structure for each element of the application, leading to well-designed, loosely-coupled, agile, maintainable code. Therefore, the question of whether or not a framework supports feature X becomes far less important than the question of how to properly integrate feature X, and how easy it is to do so.

A great (or terrible, depending on how you look at it) thing about developing in PHP is that, for any reasonably well-known technology, one can quite easily find half a dozen PHP packages to integrate it (the fact that these packages may vary significantly in degrees of quality is beyond the scope of this paragraph).

So, the fact that the Zend Framework packages dozens of helpful components which may or may not be useful in any given application doesn't really add a lot of value, since, if it becomes necessary to implement some specific functionality, one could just as easily find several other libraries of equivalent or roughly-equivalent functionality. It's also important to point out that at this stage in the ZF's development, the average PHP library for any given task will likely be more mature than its ZF counterpart.

For too long, the PHP community has suffered from try-to-make-everybody-happy-itis (which often masks itself as reinvent-the-wheel-to-make-myself-happy-itis). This disease is prevalent in many major PHP projects, e.g. PEAR, but is practically epitomized by the Zend Framework. The result of this disease is, of course, that nobody is really happy, and you end up with a lackluster, unfocused product.

To summarize: I have, in the preceding paragraphs, clearly laid out criteria for evaluating two frameworks, one against the other. As such, I contend that based on said criteria, choosing CakePHP over the Zend Framework is a measurably and objectively better decision for the vast majority of web applications.

Thanks for reading, and have a nice day. (At least I tried to end on a positive note).

Form Building: More Auto-Magic Than You Can Handle? (for CAKEPHP BAKER)

posted by: Nate

One thing that's been getting some attention in Cake 1.2 is the building of forms. Recently, I was rewriting some old code from an application which I've been developing on and off for quite a while now, and the difference was remarkable. Consider the following: (Old)

<code><form id="TaskEditForm" method="post" action="">" onSubmit="return false;"></code>

Here we see a hand-coded form tag with embedded PHP, echoing the ID of the current Task object. Now consider the following code, updated for 1.2, which produces effectively the same output: (New)

<code><?=$form->create(array('default' => false)); ?></code>

Okay, let's start with what we don't see. You'll notice first that we don't see any reference to a URL; neither a controller nor an action; not even a model name. We also don't see a DOM ID, or any reference to the JavaScript event in the preceding code. Let's start with the bit about the model. In Cake 1.2, we're transitioning to an approach to form building that is more directly model-oriented, and according to the API, the first parameter to <code>FormHelper::create()</code> is actually supposed to be the name of a model, i.e.: <code><?=$form->create('Task', array('default' => false)); ?></code>

However, if you don't provide one, it is assumed to be the default model for the controller (in this case TasksController). So how does it even know whether this is an add or edit form? Simple: it checks <code>$this->data</code> for a value for the primary key for the given model. If it is set and not empty, it is assumed to be an edit form, otherwise it is an add form. According to convention, 'add' and 'edit' are the default names for form-related actions. Most of the rest of the attributes are pretty straightforward: as with form elements, DOM IDs are now auto-generated for forms themselves, based on the model and type (add/edit). The last remaining bit is <code>'default' => false</code>. This is new for both forms and links, and provides you a simple way to disable the default action without actually having to write any JavaScript. We've also replaced most of the form-related methods in HtmlHelper with roughly equivalent ones in FormHelper. We've also added some wrapper methods to FormHelper, which you can provide with a few hints about how you want your form elements to render, and they take care of everything for you. Here's an example: (Old)

<pre><code>

// Controller:

$this->set('contacts', $this->Task->Contact->generateList());

// View:

<div class="input">

<label for="TaskContactId">Assigned to</label>

<?=$html->selectTag('Task/contact_id', $contacts, $this->data['Task']['contact_id'], null, null, false); ?>

</div>

</code></pre> (New)

<pre><code>

// Controller:

$this->set('contacts', $this->Task->Contact->generateList());

// View:

<?=$form->input('contact_id', array('label' => 'Assigned to', 'empty' => false)); ?>

</code></pre> Again, both code listings produce effectively the same output. <code>FormHelper::input()</code> takes various hints about what type of form element you want it to generate based not only on the information you provide it, but also on the field's model data, and on the view environment itself. Let's first compare the <code>$form->input()</code> part with the <code>$html->selectTag()</code> (you can ignore the controller code, it's the same in both). The first thing is the lack of the 'Task/' part in the field definition that is passed to the method. We're still using that syntax, but if you don't provide a model name, that's okay, because Cake already knows that we're creating a form for the Task model, so it is used automatically. The next thing you'll notice is that <code>$contacts</code>, our list of options to actually use in the <select> element, is not present in the view code at all. Since Cake knows that contact_id is a foreign key to another model, it checks the view for a variable called <code>$contacts</code> (the plural of the key name, with the '_id' part removed), and if it is found (and it is an array), Cake uses it as the option list. 
The other factors that determine the type of form field that <code>input()</code> will generate are as follows: <ul><li>As in the above example, if the field is a foreign key, and the corresponding options variable is specified, a select menu will be rendered.</li><li>If you specify an <code>'options'</code> key in the second parameter, a select menu will be rendered.</li><li>If the field name is <code>'password'</code> or <code>'passwd'</code>, a password field will be rendered.</li><li>Text and varchar fields are rendered as textareas and text inputs, respectively.</li><li>Booleans render as checkboxes (and the label is rendered to the right of the element).</li><li>Date, time, and datetime fields all render with the corresponding group of select menus.</li><li>If the field is the primary key of the model, it renders as a hidden field.</li></ul> Of course, you can always manually specify the type of element rendered using the <code>'type'</code> key of the <code>$options</code> array, which can be set to any one of the following values: <code>'hidden', 'checkbox', 'text', 'password', 'file', 'select', 'time', 'date', 'datetime',</code> or <code>'textarea'</code>. If the <code>'type'</code> key is unspecified, and the field type is not one of the above, it will render as a textarea. In addition to rendering the input element itself, <code>input()</code> will also render a label, a wrapper <div> and any associated validation messages, if present. This is all by default, but you can customize the behavior, as in the following: <code><?=$form->input('terms', array('label' => array('text' => 'Terms of service', 'class' => 'title'))); ?></code>

Here, the <code>'label'</code> key has been defined as an array in order to specify custom rendering options for the <label> tag. You can also define <code>'label'</code> as a string (the text of the label itself), or false, to disable the rendering of the label. The <code>'div'</code> key works similarly for controlling the rendering of the wrapper div, except that assigning it a string sets the class name. Now, at the risk of being too meta, there's also a wrapper method for the wrapper method: <code>FormHelper::inputs()</code>. This method takes a single array parameter, which can be indexed, associative, or mixed. The array is simply a list of fields to be rendered with <code>FormHelper::input()</code>. By default, all fields are rendered with the default options, but you can use the associative array syntax to specify options for individual fields, as in the following: <code><?=$form->inputs(array('first_name', 'middle_initial' => array('label' => 'MI'))); ?></code>

Here, the <code>first_name</code> field would be rendered with defaults, but the <code>middle_initial</code> field would be rendered with custom label text. Using this syntax it is possible (though not always recommended ;-)) to render an entire form in one line of code.
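The type-selection rules listed earlier for input() can be pictured as a small decision function. What follows is only a standalone sketch, not CakePHP's actual implementation: the function name guessInputType(), the hint keys 'primaryKey' and 'viewVars', the naive "append an s" pluralisation and the order in which the rules are checked are all assumptions made for illustration.

```php
<?php
// Hypothetical sketch, not CakePHP's implementation: choose an input type
// from a field name, its column type, and a few optional hints.
function guessInputType($field, $columnType, array $hints = array())
{
    // An explicit 'type' hint overrides every other rule.
    if (isset($hints['type'])) {
        return $hints['type'];
    }
    // The model's primary key renders as a hidden field.
    if (isset($hints['primaryKey']) && $hints['primaryKey'] === $field) {
        return 'hidden';
    }
    // An explicit option list always means a select menu.
    if (isset($hints['options'])) {
        return 'select';
    }
    // Foreign key: look for a view variable named after the key,
    // e.g. contact_id -> contacts (naive pluralisation, an assumption).
    if (substr($field, -3) === '_id') {
        $var = substr($field, 0, -3) . 's';
        if (isset($hints['viewVars'][$var]) && is_array($hints['viewVars'][$var])) {
            return 'select';
        }
    }
    // Password fields are recognised by name.
    if ($field === 'password' || $field === 'passwd') {
        return 'password';
    }
    // Remaining rules depend on the column type.
    $byColumn = array(
        'text'     => 'textarea',
        'varchar'  => 'text',
        'boolean'  => 'checkbox',
        'date'     => 'date',
        'time'     => 'time',
        'datetime' => 'datetime',
    );
    // Anything unrecognised falls back to a textarea, as the post describes.
    return isset($byColumn[$columnType]) ? $byColumn[$columnType] : 'textarea';
}

$viewVars = array('contacts' => array(1 => 'Bob', 2 => 'Alice'));
echo guessInputType('contact_id', 'integer', array('viewVars' => $viewVars)), "\n";
echo guessInputType('passwd', 'varchar'), "\n";
echo guessInputType('notes', 'text'), "\n";
```

The three calls correspond to the foreign-key, password and text rules; passing array('type' => 'file') as the hints would bypass all of them.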

PHP namespaces

For those of you who are not following the PHP internals mailing list, the namespaces subject has been brought up again, with a simple implementation from Dmitry Stogov.

It's hard to say whether it will actually get integrated into the source, as it's a touchy subject and many implementations have come before this one; but I love the way it's implemented.

Quotes from Dmitry:

Namespaces are defined the following way:
<?php
namespace Zend::DB;

class Connection {
}

function connect() {
}
?>
Namespace definition does the following:
All class and function names inside are automatically prefixed with namespace name. Inside namespace, local name always takes precedence over global name. It is possible to use the same namespace in several PHP files.
The namespace declaration statement must be the very first statement in file.

Every class and function from namespace can be referred to by the full name
- e.g. Zend::DB::Connection or Zend::DB::connect - at any time.
<?php
require 'Zend/Db/Connection.php';
$x = new Zend::DB::Connection;
Zend::DB::connect();
?>
Namespace or class name can be imported:
<?php
require 'Zend/Db/Connection.php';
import Zend::DB;
import Zend::DB::Connection as DbConnection;

$x = new Zend::DB::Connection();
$y = new DB::connection();
$z = new DbConnection();
DB::connect();
?>

PHP: Arrays vs. Objects

By Evert

In a lot of cases arrays are used in PHP to store object-like information, like the results of a database query. I do this a lot too, but I kind of want to change things around to make use of VOs. I feel this makes a lot more sense, since most of the applications I build are heavily OOP anyway, and I get all the added OOP benefits, like type-hinting, inheritance.. well, you know the deal.

I wanted to see what the differences would be in terms of memory consumption, so I set up the following test:

 
<?php
   // first test simple associative arrays
   $memory1 = xdebug_memory_usage( );

   $data = array();

   for($i=0;$i<1000;$i++) {

       $data[] = array(
            'property1' => md5(microtime()),
            'property2' => md5(microtime()),
            'property3' => md5(microtime()),
       );

   }

   $array = xdebug_memory_usage() - $memory1;

   // Now do the same thing, but with a class.. 

   class Test {

       public $property1;
       public $property2;
       public $property3;

   }

   $data = array();

   $memory1 = xdebug_memory_usage( );

   for($i=0;$i<1000;$i++) {

       $test = new Test();
       $test->property1 = md5(microtime());
       $test->property2 = md5(microtime());
       $test->property3 = md5(microtime());
       $data[] = $test;


   }

   $object = xdebug_memory_usage()-$memory1;

   echo 'Arrays: ' . $array . "\n";
   echo 'Objects: ' . $object;

?>

My results were

 Arrays: 536596
Objects: 521932

I knew there was a good chance objects would take up less memory, because arrays need to store both the property name (or key) and the value for every record, while the object only needs to store the values, because the property names are stored centrally in the class definition. What I didn't expect was that using arrays takes more than 20 times more memory. This is hardly an accurate formula, but it does tell you something.

Right, that was stupid.. I had my testing code wrong and I did the $data=array(); right after the second xdebug_memory_usage(). The actual conclusion here is that there's not much difference (under 3% in the numbers above). I was hoping the objects would make a significant difference, but it's minimal.
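For anyone without Xdebug installed, roughly the same comparison can be run with PHP's built-in memory_get_usage(). This is a sketch under a few assumptions: the class name MemoryTestRecord and the helper measureBytes() are made up for the example, closures need PHP 5.3 or later, and the absolute numbers will differ between PHP versions, so none are quoted here.

```php
<?php
// Value object with the same three properties as the Test class above.
class MemoryTestRecord
{
    public $property1;
    public $property2;
    public $property3;
}

// Runs $build, keeps its result alive, and returns the bytes it added.
function measureBytes($build)
{
    $before = memory_get_usage();
    $data = $build();
    $used = memory_get_usage() - $before;
    unset($data);                 // free the data before the next measurement
    return $used;
}

$arrayBytes = measureBytes(function () {
    $data = array();
    for ($i = 0; $i < 1000; $i++) {
        $data[] = array(
            'property1' => md5('a' . $i),
            'property2' => md5('b' . $i),
            'property3' => md5('c' . $i),
        );
    }
    return $data;
});

$objectBytes = measureBytes(function () {
    $data = array();
    for ($i = 0; $i < 1000; $i++) {
        $record = new MemoryTestRecord();
        $record->property1 = md5('a' . $i);
        $record->property2 = md5('b' . $i);
        $record->property3 = md5('c' . $i);
        $data[] = $record;
    }
    return $data;
});

echo 'Arrays: ' . $arrayBytes . "\n";
echo 'Objects: ' . $objectBytes . "\n";
```

Wrapping each run in its own function keeps the setup of one test out of the measurement of the other, which is exactly the mistake described above.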

Caching in PHP using the filesystem, APC and Memcached

By Evert

Caching is very important and really pays off in big internet applications. When you cache the data you're fetching from the database, in a lot of cases the load on your servers can be reduced enormously.

One way of caching is simply storing the results of your database queries in files.. Opening a file and unserializing is often a lot faster than doing an expensive SELECT query with multiple joins.

Here's a simple file-based caching engine.

 
<?php
// Our class
class FileCache {

  // This is the function you store information with
  function store($key,$data,$ttl) {

    // Opening the file
    $h = fopen($this->getFileName($key),'w');
    if (!$h) throw new Exception('Could not write to cache');
    // Serializing along with the TTL
    $data = serialize(array(time()+$ttl,$data));
    if (fwrite($h,$data)===false) {
      throw new Exception('Could not write to cache');
    }
    fclose($h);

  }

  // General function to find the filename for a certain key
  private function getFileName($key) {

      return '/tmp/s_cache' . md5($key);

  }

  // The function to fetch data returns false on failure
  function fetch($key) {

      $filename = $this->getFileName($key);
      if (!file_exists($filename) || !is_readable($filename)) return false;

      $data = file_get_contents($filename);

      $data = @unserialize($data);
      if (!$data) {

         // Unlinking the file when unserializing failed
         unlink($filename);
         return false;

      }

      // checking if the data was expired
      if (time() > $data[0]) {

         // Unlinking
         unlink($filename);
         return false;

      }
      return $data[1];
    }

}

?>

Key strategies

All the data is identified by a key. Your keys have to be unique system-wide; it is therefore a good idea to namespace your keys. My personal preference is to name the key after the class that's storing the data, combined with, for example, an id.

example

Your user-management class is called My_Auth, and all users are identified by an id. A sample key for cached user-data would then be "My_Auth:users:1234". '1234' is here the user id.
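A tiny helper can keep such keys consistent. This is only a sketch; the function name makeCacheKey() is made up for the example and is not part of the cache class.

```php
<?php
// Hypothetical helper: joins the owning class, a data type and an id
// into one namespaced cache key.
function makeCacheKey($class, $type, $id)
{
    return $class . ':' . $type . ':' . $id;
}

echo makeCacheKey('My_Auth', 'users', 1234), "\n"; // My_Auth:users:1234
```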

Some reasoning behind this code

~~I chose 4096 bytes per chunk, because this is often the default inode size in linux and this or a multiple of this is generally the fastest.~~ Much later I found out file_get_contents is actually faster.

Lots of caching engines based on files actually don't specify the TTL (the time it takes before the cache expires) at the time of storing data in the cache, but while fetching it from the cache. This has one big advantage; you can check if a file is valid before actually opening the file, using the last modified time (filemtime()).

The reason I did not go with this approach is because most non-file based cache systems do specify the TTL on storing the data, and as you will see later in the article we want to keep things compatible. Another advantage of storing the TTL in the data, is that we can create a cleanup script later that will delete expired cache files.
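For comparison, the approach I did not take (passing the TTL when fetching and checking the modification time) could look roughly like this. It is a sketch only: fetchByMtime() is a made-up name, and it assumes the file holds nothing but the serialized data, without an embedded expiry time.

```php
<?php
// Hypothetical alternative to fetch(): the TTL is passed when reading,
// and the file's modification time decides whether it is still valid.
function fetchByMtime($filename, $ttl)
{
    clearstatcache();                    // avoid a stale cached mtime
    if (!is_file($filename)) {
        return false;
    }
    // Expired: the file is older than $ttl seconds, so skip opening it.
    if (filemtime($filename) + $ttl < time()) {
        unlink($filename);
        return false;
    }
    // unserialize() itself returns false when the contents are not valid.
    return @unserialize(file_get_contents($filename));
}

// Usage: store plain serialized data, then read it with a 10 minute TTL.
$file = tempnam(sys_get_temp_dir(), 'mtime_cache_');
file_put_contents($file, serialize(array('name' => 'evert')));
var_dump(fetchByMtime($file, 600));
unlink($file);
```

The expiry check happens before the file is opened, which is the advantage mentioned above; the price is that the writer no longer decides how long its data lives.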

Usage of this class

The number one place in web applications where caching is a good idea is on database queries. MySQL and others usually have a built-in cache, but it is far from optimal, mainly because they have no awareness of the logic of your application (and they shouldn't have), and the cache is usually flushed whenever there's an update on a table. Here is a sample function that fetches user data and caches the result for 10 minutes.

 
<?php
 // constructing our cache engine
 $cache = new FileCache();

 function getUsers() {

    global $cache;

    // A somewhat unique key
    $key = 'getUsers:selectAll';

    // check if the data is not in the cache already
    if (!$data = $cache->fetch($key)) {
       // there was no cache version, we are fetching fresh data

       // assuming there is a database connection
       $result = mysql_query("SELECT * FROM users");
       $data = array();

       // fetching all the data and putting it in an array
       while($row = mysql_fetch_assoc($result)) { $data[] = $row; }

       // Storing the data in the cache for 10 minutes
       $cache->store($key,$data,600);
    }
    return $data;
}

$users = getUsers();

?>

The reason I picked the mysql_ set of functions here is that most readers will probably know them.. Personally I prefer PDO or another abstraction library. This example assumes there's a database connection, a users table and so on.
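The fetch-or-rebuild logic in getUsers() does not depend on the database at all, so it can be pulled out into a small helper. The following is a sketch under stated assumptions: remember() and ArrayCacheStub are made-up names, the stub only exists so the example runs on its own (it ignores the TTL), and the closure stands in for the real query. Note the strict comparison with false, which means an empty result set still counts as a cache hit.

```php
<?php
// Minimal in-memory stand-in for the cache class (hypothetical), so the
// example runs on its own. It ignores the TTL.
class ArrayCacheStub
{
    private $items = array();

    function fetch($key)
    {
        return isset($this->items[$key]) ? $this->items[$key] : false;
    }

    function store($key, $data, $ttl)
    {
        $this->items[$key] = $data;
    }
}

// Returns the cached value for $key, or calls $loader, stores its
// result for $ttl seconds and returns it.
function remember($cache, $key, $ttl, $loader)
{
    $data = $cache->fetch($key);
    if ($data === false) {            // strict check: an empty array is a valid hit
        $data = call_user_func($loader);
        $cache->store($key, $data, $ttl);
    }
    return $data;
}

$cache = new ArrayCacheStub();
$calls = 0;
$loader = function () use (&$calls) {
    $calls++;                         // stands in for the expensive query
    return array(array('id' => 1, 'name' => 'alice'));
};

$first  = remember($cache, 'getUsers:selectAll', 600, $loader);
$second = remember($cache, 'getUsers:selectAll', 600, $loader);
echo $calls, "\n"; // 1: the loader ran only once
```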

Problems with the library

The first problem is simple: the library will only work on Linux, because it uses the /tmp folder. Luckily we can use the php.ini setting 'session.save_path' instead.

 
  private function getFileName($key) {

      return ini_get('session.save_path') . '/s_cache' . md5($key);

  }

?>
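One caveat with this approach is that session.save_path can be empty, or can carry a prefix such as "N;/path", depending on the configuration. A possible alternative, shown here only as a sketch, is PHP's sys_get_temp_dir() (available since PHP 5.2.1). The standalone function name cacheFileName() is made up for the example; in the class it would replace the body of getFileName().

```php
<?php
// Hypothetical variant of getFileName(), written as a plain function.
function cacheFileName($key)
{
    // sys_get_temp_dir() returns the platform's temporary directory.
    $dir = rtrim(sys_get_temp_dir(), '/\\');
    return $dir . DIRECTORY_SEPARATOR . 's_cache' . md5($key);
}

echo cacheFileName('My_Auth:users:1234'), "\n";
```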

The next problem is a little more complex. If one of our cache files is being read while it is being written by another process, you can get really unusual results. Caching bugs can be hard to find because they only occur in very specific circumstances; you might never see this issue happen yourself, but somewhere out there one of your users will.

PHP can lock files with flock(). Flock operates on an open file handle (opened by fopen) and either locks a file for reading (shared lock, everybody can read the file) or writing (exclusive lock, everybody waits till the writing is done and the lock is released). Because file_get_contents is the most efficient, and we can only use flock on filehandles, we'll use a combination of both.
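As a minimal, standalone illustration of the two lock types, here is a sketch with two made-up functions, lockedWrite() and lockedRead(). It differs from the cache class in two ways that are my own choices for the example: the file is opened in 'c' mode (PHP 5.2.6 or later), and the reader uses stream_get_contents() on the locked handle rather than file_get_contents() on the filename.

```php
<?php
// Writer: takes an exclusive lock, then replaces the file contents.
function lockedWrite($filename, $contents)
{
    $h = fopen($filename, 'c');          // create if missing, do not truncate yet
    if (!$h) {
        return false;
    }
    if (!flock($h, LOCK_EX)) {           // blocks until nobody else holds a lock
        fclose($h);
        return false;
    }
    ftruncate($h, 0);                    // safe: we now hold the exclusive lock
    $ok = fwrite($h, $contents) !== false;
    fflush($h);
    flock($h, LOCK_UN);
    fclose($h);
    return $ok;
}

// Reader: takes a shared lock and reads through the locked handle.
function lockedRead($filename)
{
    $h = @fopen($filename, 'r');
    if (!$h) {
        return false;
    }
    if (!flock($h, LOCK_SH)) {           // several readers may hold this at once
        fclose($h);
        return false;
    }
    $contents = stream_get_contents($h);
    flock($h, LOCK_UN);
    fclose($h);
    return $contents;
}

$file = tempnam(sys_get_temp_dir(), 'lock_demo_');
lockedWrite($file, 'hello');
echo lockedRead($file), "\n"; // hello
unlink($file);
```

Truncating only after the exclusive lock is held is the important detail: opening with 'w' would empty the file before the lock is acquired, so a reader could see an empty file.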

The updated store and fetch methods will look like this

   // This is the function you store information with
  function store($key,$data,$ttl) {

    // Opening the file in read/write mode
    $h = fopen($this->getFileName($key),'a+');
    if (!$h) throw new Exception('Could not write to cache');

    flock($h,LOCK_EX); // exclusive lock, will get released when the file is closed

    fseek($h,0); // go to the beginning of the file

    // truncate the file
    ftruncate($h,0);

    // Serializing along with the TTL
    $data = serialize(array(time()+$ttl,$data));
    if (fwrite($h,$data)===false) {
      throw new Exception('Could not write to cache');
    }
    fclose($h);

  }

  function fetch($key) {

      $filename = $this->getFileName($key);
      if (!file_exists($filename)) return false;
      $h = fopen($filename,'r');

      if (!$h) return false;

      // Getting a shared lock 
      flock($h,LOCK_SH);

      $data = file_get_contents($filename);
      fclose($h);

      $data = @unserialize($data);
      if (!$data) {

         // If unserializing somehow didn't work out, we'll delete the file
         unlink($filename);
         return false;

      }

      if (time() > $data[0]) {

         // Unlinking when the file was expired
         unlink($filename);
         return false;

      }
      return $data[1];
   }

?>

Well, that actually wasn't too hard.. Only 3 new lines.. The next issue we're facing is updates of data. When somebody updates, say, a page in the CMS, they usually expect the respective page to update instantly.. In those cases you can update the data using store(), but in some cases it is simply more convenient to flush the cache.. So we need a delete method.

 
    function delete( $key ) {

        $filename = $this->getFileName($key);
        if (file_exists($filename)) {
            return unlink($filename);
        } else {
            return false;
        }

    }

?>

Abstracting the code

This cache class is pretty straightforward. The only methods in there are delete, store and fetch. We can easily abstract that into the following base class. I'm also giving it a proper prefix (I tend to prefix everything with Sabre; name yours whatever you want). A good reason to prefix all your classes is that they will never collide with other class names if you need to include other code. The PEAR project made a stupid mistake by naming one of their classes 'Date'; by doing this and refusing to change it, they actually prevented an internal PHP date class from being named Date.

<?php
    abstract class Sabre_Cache_Abstract {

        abstract function fetch($key);
        abstract function store($key,$data,$ttl);
        abstract function delete($key);

    }

?>

The resulting FileCache (which I'll rename to Filesystem) is:

<?php
class Sabre_Cache_Filesystem extends Sabre_Cache_Abstract {

  // This is the function you store information with
  function store($key,$data,$ttl) {

    // Opening the file in read/append mode: unlike 'w', this creates the
    // file if needed without truncating it before we hold the lock
    $h = fopen($this->getFileName($key),'a+');
    if (!$h) throw new Exception('Could not write to cache');

    flock($h,LOCK_EX); // exclusive lock, will get released when the file is closed

    fseek($h,0); // go to the start of the file

    // truncate the file
    ftruncate($h,0);

    // Serializing along with the TTL
    $data = serialize(array(time()+$ttl,$data));
    if (fwrite($h,$data)===false) {
      throw new Exception('Could not write to cache');
    }
    fclose($h);

  }

  // The function to fetch data returns false on failure
  function fetch($key) {

      $filename = $this->getFileName($key);
      if (!file_exists($filename)) return false;
      $h = fopen($filename,'r');

      if (!$h) return false;

      // Getting a shared lock 
      flock($h,LOCK_SH);

      $data = file_get_contents($filename);
      fclose($h);

      $data = @unserialize($data);
      if (!$data) {

         // If unserializing somehow didn't work out, we'll delete the file
         unlink($filename);
         return false;

      }

      if (time() > $data[0]) {

         // Unlinking when the file was expired
         unlink($filename);
         return false;

      }
      return $data[1];
   }

   function delete( $key ) {

      $filename = $this->getFileName($key);
      if (file_exists($filename)) {
          return unlink($filename);
      } else {
          return false;
      }

   }

  private function getFileName($key) {

      return ini_get('session.save_path') . '/s_cache' . md5($key);

  }

}

?>

There you go, a complete, proper OOP, file-based caching class... I hope I explained things well.
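Because every backend shares the same fetch/store/delete interface, calling code can be written once against Sabre_Cache_Abstract. The sketch below shows the usual "fetch, and on a miss compute and store" pattern. The helper cache_remember() and the in-memory class Example_Cache_Array are hypothetical and exist only so the example runs on its own; in practice you would pass in Sabre_Cache_Filesystem or one of the other backends. It assumes PHP 5.3 or later for the closure.

```php
<?php
// Base class repeated from the article so the sketch is self-contained
abstract class Sabre_Cache_Abstract {
    abstract function fetch($key);
    abstract function store($key,$data,$ttl);
    abstract function delete($key);
}

// Hypothetical in-memory backend, for demonstration only
class Example_Cache_Array extends Sabre_Cache_Abstract {

    private $items = array();

    function fetch($key) {
        if (!isset($this->items[$key])) return false;
        list($expires, $data) = $this->items[$key];
        if (time() > $expires) {
            unset($this->items[$key]);
            return false;
        }
        return $data;
    }

    function store($key,$data,$ttl) {
        $this->items[$key] = array(time()+$ttl, $data);
        return true;
    }

    function delete($key) {
        if (!isset($this->items[$key])) return false;
        unset($this->items[$key]);
        return true;
    }
}

// Hypothetical helper: return the cached value, or compute and store it
function cache_remember(Sabre_Cache_Abstract $cache, $key, $ttl, $producer) {
    $data = $cache->fetch($key);
    if ($data === false) {
        $data = call_user_func($producer);
        $cache->store($key, $data, $ttl);
    }
    return $data;
}

$calls = 0;
$producer = function () use (&$calls) {
    $calls++;                      // stands in for an expensive query
    return 'expensive result';
};

$cache  = new Example_Cache_Array();
$first  = cache_remember($cache, 'report', 600, $producer);
$second = cache_remember($cache, 'report', 600, $producer);
// The producer ran only once; the second call was served from the cache
```

One limitation carries over from the classes in this article: since fetch() signals a miss with false, a cached value of false cannot be told apart from a miss.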

Memory based caching through APC

If files aren't fast enough for you, and you have enough memory to spare, memory-based caching might be the solution. Obviously, storing and retrieving stuff from memory is a lot faster. The APC extension not only does opcode caching (it speeds up your PHP scripts by caching the parsed script), but also provides a simple mechanism to store data in shared memory.

Using shared memory in APC is extremely simple. I'm not even going to explain it; the code should say enough.

<?php
    class Sabre_Cache_APC extends Sabre_Cache_Abstract {

        function fetch($key) {
            return apc_fetch($key);
        }

        function store($key,$data,$ttl) {

            return apc_store($key,$data,$ttl);

        }

        function delete($key) {

            return apc_delete($key);

        }

    }

?>

~~My personal problem with APC is that it tends to break my code. So if you want to use it, give it a test run. I have to admit that I haven't checked it since they fixed 'my' bug.~~ This bug is now fixed; APC is amazing for single-server applications and for really frequently used data.

Memcached

Problems start when you are dealing with more than one webserver. Since there is no shared cache between the servers, situations can occur where data is updated on one server and it takes a while before the other server is up to date. It can be really useful to have a really high TTL on your data and simply replace or delete the cache whenever there is an actual update. When you are dealing with multiple webservers, this scheme is simply not possible with the previous caching methods.

Introducing memcached. Memcached is a cache server originally developed by the LiveJournal people and now being used by sites like Digg, Facebook, Slashdot and Wikipedia.

How it works

Memcached consists of a server and a client part. The server is a standalone program that runs on your servers, and the client is in this case a PHP extension.
If you have 3 webservers which all run Memcached, all webservers connect to all 3 memcached servers. The 3 memcached servers are all in the same 'pool'.
Each cache server contains only part of the cache. In other words, the cache is not replicated between the memcached servers.
To find the server where a cache item is stored (or should be stored), a so-called hashing algorithm is applied to the key. This way the 'right' server is always picked.
Every memcached server has a memory limit. It will never consume more memory than the limit. If the limit is reached, older cache items are automatically thrown out (whether or not their TTL has been exceeded).
This means it cannot be used as a place to simply store data. The database does that part. Don't confuse the purpose of the two!
Memcached runs the fastest (like many other applications) on a Linux 2.6 kernel.
By default, memcached is completely open. Be sure to have a firewall in place to lock out outside IPs, because this can be a huge security risk.
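To make the hashing idea above a bit more concrete, here is a deliberately simplified sketch: hash the key to a number and use it to pick one entry from the server list. The function pick_server() is hypothetical, and this is not necessarily the algorithm the real client uses; it only shows why the same key always lands on the same server as long as every webserver has the same list in the same order.

```php
<?php
// Simplified illustration only; the real client has its own hashing strategy.
function pick_server($key, array $servers) {
    // Mask the sign bit so the result is non-negative on 32-bit builds too
    $hash = crc32($key) & 0x7fffffff;
    return $servers[$hash % count($servers)];
}

$pool = array('www1', 'www2', 'www3');

$a = pick_server('my_key', $pool);
$b = pick_server('my_key', $pool);
// $a and $b are always the same server, whichever webserver asks

// A webserver with a different list may look on a different machine
$other = pick_server('my_key', array('www1', 'www2'));
```

With a plain modulo like this, adding or removing a server remaps many keys at once, which is acceptable for a cache but will cause a burst of misses.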

Installing

When you are on Debian/Ubuntu, installing is easy:

apt-get install memcached

You are stuck with the packaged version, though; Debian tends to be slow with updates. Other distributions might also have a pre-built package for you. Otherwise you might need to download Memcached from the site and compile it with the usual:

./configure
make
make install

There's probably a README in the package with better instructions.

After installation, you need the PECL extension. All you need to do for that (usually) is:

pecl install Memcache

You also need the zlib development library. On Debian, you can get this by entering:

apt-get install zlib1g-dev

However, 99% of the time automatic PECL installation fails for me. Here are the alternative installation instructions.

pecl download Memcache
tar xfvz Memcache-2.1.0.tgz #version might be changed
cd Memcache-2.1.0
phpize
./configure
make
make install

Don't forget to enable the extension in php.ini by adding the line extension=memcache.so and restarting the webserver.
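A quick way to confirm that the extension was picked up is to ask PHP itself. This is a small sketch using extension_loaded(); note that the command-line PHP and the webserver may read different php.ini files, so it is worth running it through the webserver as well.

```php
<?php
// Reports whether the Memcache extension is available to this PHP process
$loaded = extension_loaded('memcache');

echo $loaded
    ? "memcache extension is loaded\n"
    : "memcache extension is NOT loaded, check php.ini\n";
```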

The good stuff

Once the Memcached server is installed and running, and PHP has the Memcache extension loaded, you're off. Here's the Memcached class.

<?php
    class Sabre_Cache_MemCache extends Sabre_Cache_Abstract {

        // Memcache object
        public $connection;

        function __construct() {

            $this->connection = new MemCache;

        }

        function store($key, $data, $ttl) {

            return $this->connection->set($key,$data,0,$ttl);

        }

        function fetch($key) {

            return $this->connection->get($key);

        }

        function delete($key) {

            return $this->connection->delete($key);

        }

        function addServer($host,$port = 11211, $weight = 10) {

            $this->connection->addServer($host,$port,true,$weight);

        }

    }

?>

Now, the only thing you have to do in order to use this class is add servers. Add servers consistently! Every webserver should add the exact same memcached servers, so the keys will be distributed in the same way from every webserver.

If a server has double the memory available for memcached, you can double the weight. The chance that data will be stored on that specific server will also be doubled.
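One way to picture how a weight doubles the chance of being picked: give every server as many 'buckets' as its weight and hash the key onto the bucket list. The functions build_buckets() and pick_weighted() below are hypothetical, and the sketch is an assumption about the general idea rather than a description of the extension's internals.

```php
<?php
// Hypothetical illustration of weighted distribution.

// Repeat every server name as often as its weight
function build_buckets(array $weights) {
    $buckets = array();
    foreach ($weights as $server => $weight) {
        for ($i = 0; $i < $weight; $i++) {
            $buckets[] = $server;
        }
    }
    return $buckets;
}

// Hash the key onto the bucket list
function pick_weighted($key, array $buckets) {
    $hash = crc32($key) & 0x7fffffff;
    return $buckets[$hash % count($buckets)];
}

$buckets = build_buckets(array('www1' => 10, 'www2' => 20, 'www3' => 10));
$counts  = array_count_values($buckets);
// www2 owns 20 of the 40 buckets, www1 and www3 own 10 each

$server = pick_weighted('my_key', $buckets);
```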

Example

<?php
    $cache = new Sabre_Cache_MemCache();
    $cache->addServer('www1');
    $cache->addServer('www2',11211,20); // this server has double the memory, and gets double the weight
    $cache->addServer('www3',11211);

    // Store some data in the cache for 10 minutes
    $cache->store('my_key','foobar',600);
    
    // Get it out of the cache again
    echo($cache->fetch('my_key'));
   
?>

Some final tips

Be sure to check out the docs for Memcache and APC and try to determine what's right for you.
Caching can help everywhere SQL queries are done. You'd be surprised how big the difference can be in terms of speed.
In some cases you might want the cross-server abilities of memcached, but you don't want to use up your memory or have your items automatically flushed out. Wikipedia came across this problem and traded in fast memory caching for virtually infinite file-based caching by creating a memcached-compatible engine called Tugela Cache. You can still use the PECL Memcache client with it, so switching should be pretty easy. I don't have experience with it and don't know how stable it is.
If you have different requirements for different parts of your cache, you can always consider using the different types alongside each other.
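As a sketch of the last tip, two backends can be chained behind the same interface: a fast local tier that is checked first and a shared tier behind it. The classes Example_Cache_Chain and Example_Cache_Array are hypothetical; the array class merely stands in for real backends such as Sabre_Cache_APC and Sabre_Cache_MemCache so the example runs on its own.

```php
<?php
// Base class repeated from the article so the sketch is self-contained
abstract class Sabre_Cache_Abstract {
    abstract function fetch($key);
    abstract function store($key,$data,$ttl);
    abstract function delete($key);
}

// Hypothetical in-memory stand-in for a real backend
class Example_Cache_Array extends Sabre_Cache_Abstract {
    private $items = array();

    function fetch($key) {
        if (!isset($this->items[$key])) return false;
        if (time() > $this->items[$key][0]) return false;
        return $this->items[$key][1];
    }

    function store($key,$data,$ttl) {
        $this->items[$key] = array(time()+$ttl, $data);
        return true;
    }

    function delete($key) {
        unset($this->items[$key]);
        return true;
    }
}

// Hypothetical two-tier cache: local tier first, shared tier second
class Example_Cache_Chain extends Sabre_Cache_Abstract {
    private $fast;
    private $slow;
    private $fastTtl;

    function __construct(Sabre_Cache_Abstract $fast, Sabre_Cache_Abstract $slow, $fastTtl = 30) {
        $this->fast = $fast;
        $this->slow = $slow;
        $this->fastTtl = $fastTtl;
    }

    function fetch($key) {
        $data = $this->fast->fetch($key);
        if ($data !== false) return $data;

        $data = $this->slow->fetch($key);
        if ($data !== false) {
            // Keep a short-lived local copy for the next request
            $this->fast->store($key, $data, $this->fastTtl);
        }
        return $data;
    }

    function store($key,$data,$ttl) {
        $this->fast->store($key, $data, min($ttl, $this->fastTtl));
        return $this->slow->store($key, $data, $ttl);
    }

    function delete($key) {
        $this->fast->delete($key);
        return $this->slow->delete($key);
    }
}

$local  = new Example_Cache_Array();
$shared = new Example_Cache_Array();
$cache  = new Example_Cache_Chain($local, $shared);

$shared->store('my_key', 'foobar', 600);
$value  = $cache->fetch('my_key');   // found in the shared tier
$copied = $local->fetch('my_key');   // now also present locally
```

The local TTL is kept short on purpose: a delete on one webserver cannot reach the local tier of the other webservers, so stale copies should expire quickly.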

Thursday, September 6, 2007

Build your own CMS with TikiWiki

Takeaway: Need a CMS system? TikiWiki takes the wiki one giant step further. Jack Wallen gives a primer on this flexible system.

I've covered a lot of CMS systems, but none of those systems had me as wide-eyed as TikiWiki. Why? This system has a plethora of options and tools. There is so much here that, upon installation, I wasn't exactly sure where to start first. It's almost too much, but not too much to be useful. You'll probably never use 100 percent of TikiWiki's offerings; just choosing the features you'll stick with will take you a while.

What's TikiWiki?

What is TikiWiki, really? As you can infer from the name, it's a wiki: a collaborative bit of technology used to gather information, or an open source dictionary, if you will. Visitors (those with permissions, at least) can add, remove, and edit content as they see fit. Wikipedia is probably the most popular wiki. However, albeit enormous in scope, Wikipedia is limited to what it does for the public.

TikiWiki takes the wiki one giant step further: TikiWiki has a feature set that looks like it belongs to a large-scale system. Take a look:

Articles

Blog

Calendar

Charts

Chat

Contact

Directory

Featured Links

File Gallery

Forum

Friendship Network

Gmap

HTML Page

Image Gallery

Live Support

Newsletter

Newsreader

Notepad

Personal content

Personal Messaging

Polls

Quizzes

Shoutbox

Slideshow

Spreadsheet

Tasks

Tracker (forms and database generator)

User Files

Webhelp

Webmail

Wiki

That's just the listing under Content Features, so we're obviously not going to cover the entire system. This article will help you configure your system to suit your needs.

Getting and installing

For the purposes of this article, I'll be installing TikiWiki on a fresh Fedora 7 installation. This install comes straight off the Live CD with the most recent updates applied. The only additional software I had to add was:

MySQL (4.0.x)

PHP 5.2.x

phpMyAdmin (Optional, but makes the creation of the database simple)

Other than those items listed above, there are few requirements for TikiWiki. You will need hardware that meets the following:

512 MB of RAM

100 MB of free hard drive space

Minimum 32 MB of memory allocated for PHP

You will also need a Web server; Apache is the recommended server.
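The PHP memory requirement above can be checked from a small script before you start the installer. This is only a sketch: ini_bytes() is a hypothetical helper that converts php.ini shorthand such as "32M" to bytes, and the 32 MB threshold is simply the figure quoted in the list.

```php
<?php
// Hypothetical helper: convert php.ini shorthand ("32M", "1G", "512K") to bytes
function ini_bytes($value) {
    $value  = trim($value);
    $number = (int) $value;
    $unit   = strtoupper(substr($value, -1));

    // Each case deliberately falls through to the next one
    switch ($unit) {
        case 'G': $number *= 1024;
        case 'M': $number *= 1024;
        case 'K': $number *= 1024;
    }
    return $number;
}

$limit = ini_get('memory_limit');

// A value of -1 means PHP imposes no memory limit at all
$enough = ($limit == -1) || ini_bytes($limit) >= ini_bytes('32M');

echo 'memory_limit = ', $limit, $enough ? " (ok)\n" : " (below 32 MB)\n";
```

If the value is too low, raise memory_limit in php.ini and restart the Web server.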

Once you have met all the requirements, download the software from Sourceforge and move the file to your Web server's document root. Since I am using Fedora with Apache, my document root will be /var/www/html. Unpack the file (the method of unpacking will depend on the file you download) with one of these three commands:

tar xvzf tikiwiki-RELEASE-NUMBER.tar.gz

bunzip2 tikiwiki-RELEASE-NUMBER.tar.bz2;tar xvf tikiwiki-RELEASE-NUMBER.tar

unzip tikiwiki-RELEASE-NUMBER.zip

With tiki unpacked, you will find a directory called tikiwiki-RELEASE-NUMBER. You can change the name of that directory to something easier to remember at this point. If you don't, your users will have to remember http://yourdomain/tikiwiki-RELEASE-NUMBER/ every time they need to drop by your installation. For continuity, I am not going to rename the folder in this article.

The next step is to create your database. As I mentioned earlier, the easiest way to create the database is to install phpMyAdmin and create it from there. That saves you the headache of creating a database the hard way. Remember the name you used for your database, because you're going to have to enter it during the installation process.

The first thing you need to do now is open your browser to http://yourdomain/tikiwiki-1.9.7/tiki-install.php. This will begin the installation process. The first step in this installation process is to enter the information for your database. Figure A illustrates the information needed.

Figure A

You'll have this information from the database you created either manually or with the help of phpMyAdmin.

Once you've entered your information, press Submit Query to be transported to the next step. The next step is to select the database profile you want. You have four choices:

Default Installation Profile

BasicEnable Profile for easy use

Fishclub profile

Slashdot profile

The database profile that is the best choice for CMS use is the BasicEnable profile. Select that and press Create. A new window should now open and scroll as each installation operation succeeds (or fails). In my case, no new window opened and I was greeted with a blank screen. Out of curiosity, I entered the URL http://localhost/tikiwiki-1.9.7/ and found the opening TikiWiki screen! So if you are not greeted by an operation execution window, fear not.

Naturally, the next step is to log in as the administrator to start administering the system. If you remember back to the installation process, an administrator account was never set up. Once again, never fear -- the TikiWiki team has this covered. To log in as the administrator, use the username admin and the password admin. To make matters safe, the first thing you will be required to do is change the administrator password. You won't have a choice in this case.

The creation of your TikiWiki site is done and ready for you.

Let's administer

One of the first things you will notice is how enormous the menu is. Figures B and C highlight just how many entries you have to play with.

Figure B

There are some administration steps to be taken within these menu entries.

Figure C

There's the administration menu.

Even within the Admin menu, there are many administration tasks to be done. We are, however, going to focus on the Admin home entry. Select the Admin home link to reveal the full set of admin tools, as shown in Figure D.

Figure D

Before you start configuring, you'll want to enable or disable the features you want.

Notice the Tip above the icons. This Tip is particularly useful because it gives you a link to the page where you Enable/Disable features on your TikiWiki site. There will be features you will want to use, and some you won't. Naturally, it's best that your users can't see those features you don't want to use.

In order to take care of that, you'll want to first visit the link offered in the above tip (in my case, http://localhost/tikiwiki-1.9.7/tiki-admin.php?page=features). The enable/disable features page offers a huge number of features. Figure E shows only a single section of the feature listing.

Figure E

This is a listing of only the basic features.

Go through each section (Features, Content Features, Administration Features, User Features, and General Layout Options) and select everything you want on your site. If the feature has the checkbox checked, that feature will be included. Once you have selected everything you know you want, select Change Preferences to move on. Note: For a complete description of each feature, visit the TikiWiki Features page.

Once you have enabled your features, go back to the Admin Home page to start your configuration.

One of the features I enabled was the Site Identity feature. This feature allows you to customize (or "brand") your TikiWiki site to fit your company. Press the Site Identity icon to open up the configuration options for this feature.

Within this feature, there are the following sections:

Custom Code (You can use custom XHTML or Smarty code). Note: If you enable this option, make sure your code is absolutely correct, or you risk messing up your installation.

Site Breadcrumbs (Site location bar)

Site Logo (Your company logo here)

Site Ads and Banners (Advertising dollar opportunities)

Site Menu Bar (Requires that PHP Layers dynamic menus be enabled in the Features section)

The above list has some nice configurations. The Site Menu Bar, for example, adds a nice mouse-over menu bar at the top of the page. When your mouse hovers over a category, a clickable menu appears. Nice touch.

Now that you have your site identity set up, let's move on to the General Administration section to get into the nuts and bolts of your installation.

There are several sections to take care of in this administration page:

General Preferences

General Settings

Date and Time Formats

Other (Icons and separators)

Change Admin password

Most of the above are self-explanatory, but a few items are worth mentioning. Within the General Preferences, the Use Group Homepages option refers to TikiWiki's ability to assign certain home pages to certain groups. Naturally, this can only be used if you are creating and using various groups. It can be very helpful if you are using TikiWiki within a large organization.

Also note that there are quite a large number of themes to choose from. Go through them until you find one to suit your needs.

One last thing to mention on this page is the Mail end of line setting. There are two options: LF and CRLF. Although the default is LF, choose CRLF. There may be cases where you're using an MTA that requires only LF (the line feed end-of-line character), but most likely you will be using an MTA that requires CRLF (carriage return plus line feed).
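For readers unsure what the two options actually mean, the difference is a single extra byte per line. The sketch below is only an illustration in plain PHP, not TikiWiki code; the sample header text is made up.

```php
<?php
// Sample text with LF-only line endings (made up for this example)
$lf = "Subject: Hello\nTo: user@example.com\n";

// Normalize any mix of line endings to CRLF
$crlf = preg_replace('/\r\n|\r|\n/', "\r\n", $lf);

// LF is the single byte 0a, CRLF is the two bytes 0d 0a
echo bin2hex("\n"), ' vs ', bin2hex("\r\n"), "\n"; // prints "0a vs 0d0a"
```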

Once you have made all your necessary configurations, press Change Preferences to move on.

Changing your home page

With basic administration out of the way, you may want to go back and edit your home page. As it stands, the home page is empty. If you press the Home link, you will see the page has no content, as shown in Figure F.

Figure F

You have to actually edit the home page before it's really usable.

To edit this page, select the Edit link and a familiar window will appear, as shown in Figure G.

Figure G

This content editing window is quickly becoming second nature to most Web users.

The creation and editing of pages is as straightforward as it gets. There are two nice features added here: Edit Summary and Import Page. The Edit Summary allows the administrator to see a listing of every edit that has happened to a page. Select the History link (it appears underneath the Edit window when you are editing a page) to reveal the history, as shown in Figure H.

Figure H

You can select the b link to roll back the page to a previous version.

As administrator, you also have the following powers via the links underneath the home page text:

Remove the home page

Lock the home page

Change the Permissions of the home page

View the History of the home page

View Similar pages

Undo the home page

Add a comment to the home page

Attach a file to the home page

Naturally, the editing of other pages is done in the same manner.

Final words

TikiWiki is enormous. There is so much to do within the confines of this system that covering it as a whole would require a book. But fear not, we'll dig deeper into this outstanding CMS system.

The more I use TikiWiki, the more impressed I am. I have found this system to be one of the more flexible systems available for Content Management. Even though it falls within the framework of a wiki, it can easily act as a more standard Web site.

Install TikiWiki and have a look around it. The time you spend on it will be time well spent.