Drupal has a number of tasks that need to run on a regular basis. The site's search index needs to be rebuilt regularly so that all new content is added to it. The XML sitemap can be submitted the same way. All these "housekeeping items" can be initiated by cron.
Every Drupal installation comes with a file called cron.php, located in the root folder of the installation. You can call the file from a web browser at domainname.com/cron.php. What you get is a blank screen; it does not produce any output.
Most people are not going to visit that URL in their browser every hour, so most Linux/Unix hosting providers offer cron jobs through a control panel of some sort, such as cPanel. There you can specify that this file should run on a regular basis.
So that's what I did. I set up the cron job based on the example on my hosting provider's site:
GET http://yourdomain.com/path_to_file/cgi-bin/file.cgi > /dev/null
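GET here is the small command-line client that ships with Perl's libwww-perl, and not every host has it. As a rough sketch, assuming wget or curl is installed instead, the same quiet request could be built like this. The function name cron_fetch_cmd and the URL http://yourdomain.com/cron.php are placeholders of mine, not something from my provider's documentation:

```shell
#!/bin/sh
# Hypothetical helper: print a quiet fetch command for the given tool.
# It only prints the command; paste the result into the control panel.
cron_fetch_cmd() {
  tool="$1"
  url="$2"
  case "$tool" in
    wget) echo "wget -q -O /dev/null $url" ;;   # -q: quiet, -O: write the body to /dev/null
    curl) echo "curl -s -o /dev/null $url" ;;   # -s: silent, -o: write the body to /dev/null
    *)    echo "GET $url > /dev/null" ;;        # fall back to the libwww-perl client
  esac
}

# Placeholder URL; replace it with your own domain.
cron_fetch_cmd wget http://yourdomain.com/cron.php
```

Whichever client is used, the request still goes through the web server, so it still shows up in the access log.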
I've recently changed the frequency with which it runs, so now I have entries for cron.php in my traffic report. That report is built from my Apache log files; Google Analytics is obviously not affected by it.
PHP can run as an Apache module as well as from the command line. So I tried the command-line version, but calling cron.php directly ends up in errors.
After a little searching on the Drupal website I managed to configure it correctly. I created a file called cron.sh, uploaded it to the folder above the document root, and made it executable. Here is the content:
#!/bin/sh
ROOT=/your document root goes here
cd $ROOT
/bin/env - HTTP_HOST=websitenamegoeshere.com SCRIPT_FILENAME=$ROOT/cron.php \
/usr/local/bin/php -q $ROOT/cron.php
In the cron settings of the website control panel I've specified that this command should run every 15 minutes:
$HOME/cron.sh > $HOME/cronlog.log
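If your host gives you a raw crontab instead of a control panel form, the entry might look like the line built below. This is only a sketch: it assumes a cron that accepts the */15 step syntax (Vixie-style cron does), and the 2>&1 is my own addition so that error messages land in the log as well.

```shell
#!/bin/sh
# Assumed schedule: every 15 minutes, every hour, every day.
# Single quotes keep $HOME literal here, so it is expanded by the shell
# that cron starts when the job runs, not when this line is printed.
CRON_LINE='*/15 * * * * $HOME/cron.sh > $HOME/cronlog.log 2>&1'

# Print the line for review; add it by hand with `crontab -e`.
echo "$CRON_LINE"
```

Note that the single > overwrites cronlog.log on every run, so the file only ever holds the output of the latest run.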
Your PHP binary may be located at /usr/bin/php instead of /usr/local/bin/php, though.
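If you are not sure which of the two applies, a small check along these lines could pick whichever exists. The function name find_php is made up for this sketch, and the two paths are simply the ones mentioned above; your host may use yet another location.

```shell
#!/bin/sh
# Hypothetical helper: print the first argument that is an executable file.
find_php() {
  for candidate in "$@"; do
    if [ -x "$candidate" ]; then
      echo "$candidate"
      return 0
    fi
  done
  return 1   # none of the candidates exist
}

# Try the two usual locations; PHP stays empty if neither is present.
PHP=$(find_php /usr/local/bin/php /usr/bin/php) || PHP=""
echo "PHP binary: ${PHP:-not found}"
```

The resulting $PHP could then stand in for the hard-coded /usr/local/bin/php path in cron.sh.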
Works like a charm.