Just Software Solutions

Reduce Bandwidth Usage by Supporting If-Modified-Since in PHP

Sunday, 30 September 2007

By default, pages generated with PHP are not cached by browsers or proxies, as they are generated anew every time the page is requested from the server. If you have repeat visitors to your website, or even many visitors that use the same proxy, this means that a lot of bandwidth is wasted transferring content that hasn't changed since last time. By adding appropriate code to your PHP pages, you can allow your pages to be cached, and reduce the required bandwidth.

As Bruce Eckel points out in RSS: The Wrong Solution to a Broken Internet, this is a particular problem for RSS feeds — feed readers are often overly enthusiastic in their checking rate, and given the tendency of bloggers to provide full feeds this can lead to a lot of wasted bandwidth. By using the code from this article in your feed-generating code you can save yourself a whole lot of bandwidth.

Caching and HTTP headers

Whenever a page is requested by a browser, the server response usually includes a Last-Modified header which indicates the last modification time. For static pages, this is the last modification time of the file, but for dynamic pages it is typically either omitted or set to the time the page was requested. Whenever a page is requested that has been seen before, browsers or proxies generally take the Last-Modified time from the cached version and populate an If-Modified-Since request header with it. If the page has not changed since then, the server should respond with a 304 (Not Modified) response code and no body to indicate that the cached version is still valid, rather than sending the page content again.
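The decision the server has to make can be sketched as a small pure function. This is only an illustration of the exchange described above, not the code used later in this article; the name statusForRequest and its parameters are made up for the example.

```php
<?php
// Hypothetical helper: decide the status code for a conditional GET.
// $ifModifiedSince is the raw request header value (null if absent);
// $lastModified is the page's modification time as a Unix timestamp.
function statusForRequest(?string $ifModifiedSince, int $lastModified): int
{
    if ($ifModifiedSince === null) {
        return 200; // unconditional request: send the full page
    }
    $since = strtotime($ifModifiedSince);
    if ($since === false) {
        return 200; // unparseable date: ignore the header
    }
    // Unchanged since the client's copy was fetched: 304, no body.
    return ($lastModified <= $since) ? 304 : 200;
}

$modified = gmmktime(12, 0, 0, 9, 30, 2007); // 30 Sep 2007 12:00:00 GMT
echo statusForRequest(null, $modified), "\n";                            // 200
echo statusForRequest('Sun, 30 Sep 2007 12:00:00 GMT', $modified), "\n"; // 304
echo statusForRequest('Sat, 29 Sep 2007 12:00:00 GMT', $modified), "\n"; // 200
```

The third call returns 200 because the client's copy dates from before the page was last modified, so the full page has to be sent again.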

To handle this correctly for PHP pages requires two things:

  • Identifying the last modification time for the page, and
  • Checking the request headers for the If-Modified-Since header.

Timestamps

There are two components to the last modification time: the date of the data used to generate the page, and the date of the script itself. Both are equally important, as we want the page to be updated when the data changes, and if the script has been changed the generated page may be different (for example, the layout could be different). My PHP code incorporates both by defaulting to the modification time of the script, and allowing the user to pass in the data modification time, which is used if it is more recent than the script. The last modification time is then used to generate a Last-Modified header, and returned to the caller.

Here is the function that adds the Last-Modified header. It uses both getlastmod() and filemtime(__FILE__) to determine the script modification time, on the assumption that this function is in a file included from the main script, and we want to detect changes to either.

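function setLastModified($last_modified=NULL)
{
    // Modification time of the main script being executed.
    $page_modified=getlastmod();
    
    if(empty($last_modified) || ($last_modified < $page_modified))
    {
        $last_modified=$page_modified;
    }
    // Modification time of this include file.
    $header_modified=filemtime(__FILE__);
    if($header_modified > $last_modified)
    {
        $last_modified=$header_modified;
    }
    // HTTP dates must be expressed in GMT; date("r") would emit the local UTC offset.
    header('Last-Modified: ' . gmdate('D, d M Y H:i:s',$last_modified) . ' GMT');
    return $last_modified;
}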
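HTTP expects the Last-Modified value in a fixed format, expressed in GMT, for example "Sun, 30 Sep 2007 12:00:00 GMT". As a minimal sketch, the formatting step can be pulled out into a helper of its own; formatHttpDate is a hypothetical name used only for this illustration.

```php
<?php
// Hypothetical helper: format a Unix timestamp as an HTTP date.
function formatHttpDate(int $timestamp): string
{
    // gmdate() formats in GMT regardless of the server's time zone.
    return gmdate('D, d M Y H:i:s', $timestamp) . ' GMT';
}

$timestamp = gmmktime(12, 0, 0, 9, 30, 2007);
echo formatHttpDate($timestamp), "\n"; // Sun, 30 Sep 2007 12:00:00 GMT
```

A value in this format can be parsed back with strtotime to give the original timestamp, which is what the comparison against If-Modified-Since relies on.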

Handling If-Modified-Since

If the If-Modified-Since request header is present, then it can be parsed to get a timestamp that can be compared against the modification time. If the modification time is no later than the time given in the header, a 304 response can be returned instead of generating the page.

In PHP, the HTTP request headers are generally stored in the $_SERVER superglobal with a name starting with HTTP_ based on the header name. For our purposes, we need the HTTP_IF_MODIFIED_SINCE entry, which corresponds to the If-Modified-Since header. We can check for this with array_key_exists, and parse the date with strtotime. There's a slight complication in that old browsers used to add additional data to this header, separated with a semicolon, so we need to strip that out (using preg_replace) before parsing. If the header is present, and the specified date is at least as recent as the last-modified time, we can just return the 304 response code and quit, with no further output required. Here is the function that handles this:

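function exitIfNotModifiedSince($last_modified)
{
    if(array_key_exists("HTTP_IF_MODIFIED_SINCE",$_SERVER))
    {
        // Strip any extra data after a semicolon (sent by some old browsers).
        $if_modified_since=strtotime(preg_replace('/;.*$/','',$_SERVER["HTTP_IF_MODIFIED_SINCE"]));
        // strtotime returns false if the date cannot be parsed.
        if(($if_modified_since !== false) && ($if_modified_since >= $last_modified))
        {
            header("HTTP/1.0 304 Not Modified");
            exit();
        }
    }
}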
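To see the header parsing on its own, without the side effects of header() and exit(), here is a small sketch. The function name parseIfModifiedSince and the sample header value with a "length" suffix are assumptions made for the illustration.

```php
<?php
// Hypothetical helper: turn an If-Modified-Since value into a Unix
// timestamp, or null if the value cannot be parsed.
function parseIfModifiedSince(string $header): ?int
{
    // Drop anything after a semicolon, e.g. "; length=1234".
    $date = trim(preg_replace('/;.*$/', '', $header));
    $timestamp = strtotime($date);
    return ($timestamp === false) ? null : $timestamp;
}

var_dump(parseIfModifiedSince('Sat, 29 Sep 2007 12:00:00 GMT; length=1234'));
var_dump(parseIfModifiedSince(''));
```

The first call yields the timestamp for the date before the semicolon. The second yields null, which a caller would treat the same as a missing header.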

Putting it all together

Using the two functions together is really simple:

     exitIfNotModifiedSince(setLastModified()); // for pages with no data-dependency
     exitIfNotModifiedSince(setLastModified($data_modification_time)); // for data-dependent pages
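Where $data_modification_time comes from depends on your data. As one possible sketch, assuming the page is generated from files on disk, you could take the newest modification time among them. The helper latestModificationTime and the temporary files below are hypothetical stand-ins for real data.

```php
<?php
// Hypothetical helper: newest modification time among the data files a
// page depends on, or null if none of them can be read.
function latestModificationTime(array $paths): ?int
{
    $latest = null;
    foreach ($paths as $path) {
        $mtime = is_file($path) ? filemtime($path) : false;
        if ($mtime !== false && ($latest === null || $mtime > $latest)) {
            $latest = $mtime;
        }
    }
    return $latest;
}

// Demonstration with two temporary files standing in for real data.
$older = tempnam(sys_get_temp_dir(), 'ims');
$newer = tempnam(sys_get_temp_dir(), 'ims');
touch($older, 1000000000);
touch($newer, 1190000000);
$data_modification_time = latestModificationTime([$older, $newer]);
echo $data_modification_time, "\n"; // 1190000000
unlink($older);
unlink($newer);
```

Since setLastModified() treats an empty value as "no data dependency", a null result simply falls back to the script's own modification time.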

Of course, you can use the functions separately if that better suits your needs.

Posted by Anthony Williams

16 Comments

Great bit of code, thanks

Would you put this code just after the session handler and before the page's document declaration?

by Silklink at 15:00:33 on Monday, 21 January 2019

The tip is great. I was checking something for a project of mine, and finally got it. I wanted a method to dynamically serve some js files in minified format, using the jsmin-php project from google code. I wanted to enhance the system using if-modified-since headers. Will post the direct url from my site once my project is over.

by php trivandrum at 15:00:33 on Monday, 21 January 2019

Hi completed my project.. and is available at http://www.php-trivandrum.org/code-snippets/reduce-bandwidth-usage-in-php.html, though the caption is the same, I went in a slightly different point. My requirements were to compress the javascript files.. Will need to make this a wordpress plugin, with the future expiry as a plugin option.

by php trivandrum at 15:00:33 on Monday, 21 January 2019

Great, thanks. I had tried the "revisit-after" meta tag to stop Googlebot from crawling my classifieds site too often, but with no success.

Then I found your solution, and it helps me a lot. Combined with a cached-file solution, this function is the best way to reduce server load.

thank again from tintuc

by tin tuc at 15:00:33 on Monday, 21 January 2019

Sorry, I have another question, related to caching files.

Normally I put your function before the other function that includes the cache file in the page, but the cached file may change over time.

Which is best for me: putting this function just before or after including the cached files?

regards

by game avatar at 15:00:33 on Monday, 21 January 2019

great, thankssssssss

by dred at 15:00:33 on Monday, 21 January 2019

I read this to better understand and apache mod Modified.

by Phim at 15:00:33 on Monday, 21 January 2019

great tips, I sent him a link to this article as well as used your email link to send Expedia a scolding on the topic

by May chieu acto at 15:00:33 on Monday, 21 January 2019

thanks . just apply to my game site.. it work great

by game ban sung at 15:00:33 on Monday, 21 January 2019

very useful info bro ... I am really thankful to you ... going to read it again and implement it

thanks again

by aun at 15:00:33 on Monday, 21 January 2019

Thanks for a marvelous posting! I quite enjoyed reading it, you may be a great author.I will ensure that I bookmark your blog and may come back down thee road. I want to encourage yourself to continue your great job, have a nice day!

by at 15:00:33 on Monday, 21 January 2019

Thanks for a marvelous posting! I quite enjoyed reading it, you may be a great author.I will ensure that I bookmark your blog and may come back down thee road. I want to encourage yourself to continue your great job, have a nice day!

by at 15:00:33 on Monday, 21 January 2019

Your article is very good, I will regularly visit this site to read.

by at 15:00:33 on Monday, 21 January 2019

Thank you for sharing, I would often to this site to read the information.

by at 15:00:33 on Monday, 21 January 2019

what does this mean when you write ($last_modified=NULL) in bracket?

by Rajesh at 15:00:33 on Monday, 21 January 2019

I presume you're referring to

function setLastModified($last_modified=NULL)

This says that the function setLastModified takes a single parameter called $last_modified, which is set to NULL if the caller doesn't provide a value. You can thus say setLastModified() to set the date to the page modification date, or setLastModified($some_date) to set the date to the contents of $some_date.

by Anthony Williams at 15:00:33 on Monday, 21 January 2019


Design and Content Copyright © 2005-2024 Just Software Solutions Ltd. All rights reserved. | Privacy Policy