A Perch XML Sitemap
One of the best things about the Perch CMS is its lack of boilerplate. In fact, Perch creator Rachel Andrew is quite opposed to such bloated, polyfill-ridden framework code. Why include a mass of default code you've yet to decide you even need? The performance hit is just not tolerable.
The trade-off (if you can call 'having tremendous fun authoring your own code' a trade-off) is that a number of features are simply not supported by default. At the time of writing, a dynamic XML sitemap for Perch's blog extension is one such feature. In WordPress, you'd simply install a plugin, click a few buttons in the admin GUI and assume that the 4.5-star community rating means it is dependable. In Perch, you have to be a real developer and do it mostly from scratch. The basic steps are as follows:
- Create a sitemap.php page
- Create a sitemap-item.html template
- Configure perch_blog_custom() in sitemap.php to use the template
- Manage extensions in the .htaccess file
- Refer to the sitemap location in your robots.txt
Create sitemap.php
The sitemap.php file is set up almost exactly like any other Perch-enabled page on your server, except that the Perch runtime include should appear after the content type header that defines the document as XML:
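<?php
// Declare the response as XML before anything is output
header('Content-Type: text/xml');
// Load Perch so its template functions are available
include('perch/runtime.php');
?>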
<urlset xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.sitemaps.org/schemas/sitemap/0.9 http://www.sitemaps.org/schemas/sitemap/0.9/sitemap.xsd" xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
<!-- ENTRIES HERE -->
</urlset>
With that taken care of, the next thing you ought to do is include all the static and category pages. Nothing fancy; just code them in manually using the following format:
<url>
<loc>http://www.your-site.com/your-url</loc>
<changefreq>monthly</changefreq>
<priority>0.2</priority>
</url>
Priority: The priority tag tells search engines how important a URL is relative to the other URLs on your own site, on a scale from 0.0 to 1.0. "1.0" is highest and is typically attributed to the home page. On a blog that lists excerpts of recent articles on the home page (like HeydonWorks), this pattern seems logical.
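If you have more than a handful of static pages, hand-coding each entry gets tedious. As an alternative, here is a minimal sketch that builds the same <url> entries from an array. It assumes PHP 7 or later; the render_static_urls() name, the page paths and the values are all placeholders invented for illustration, not part of Perch.

```php
<?php
// Hypothetical helper (not a Perch function): renders one <url> entry per page.
function render_static_urls(array $pages, string $base): string
{
    $xml = '';
    foreach ($pages as $page) {
        // Escape &, <, > and quotes so the output stays valid XML.
        $loc = htmlspecialchars($base . $page['path'], ENT_XML1 | ENT_QUOTES, 'UTF-8');
        $xml .= "<url>\n"
              . "<loc>{$loc}</loc>\n"
              . "<changefreq>{$page['changefreq']}</changefreq>\n"
              . '<priority>' . number_format($page['priority'], 1) . "</priority>\n"
              . "</url>\n";
    }
    return $xml;
}

// Placeholder data: swap in your own static and category pages.
$pages = [
    ['path' => '/',      'changefreq' => 'weekly',  'priority' => 1.0],
    ['path' => '/about', 'changefreq' => 'monthly', 'priority' => 0.2],
];

echo render_static_urls($pages, 'http://www.your-site.com');
```

Dropped inside the <urlset> element of sitemap.php, this prints entries in the same format as the hand-written ones. The escaping step matters mostly for URLs with query strings, where a bare ampersand would make the XML invalid.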
Sitemap-item.html
Our next task is to create a template for our post entries so that we can inject them amongst the hand-coded <url>s we devised in the last section. Take a single <url> snippet like the one described above and place it in a new file, saving it in templates/blog as 'sitemap-item.html'. After you've inserted your Perch content template tags, the code should look something similar to this:
<url>
<loc>http://www.your-site.com/article/<perch:blog id="postSlug" /></loc>
<changefreq>monthly</changefreq>
<priority>0.2</priority>
</url>
Next, we need to call the post URL information via our template in sitemap.php. This is a perfect case for the perch_blog_custom() function, configured as follows and placed inside the <urlset> element alongside the hand-coded entries.
<?php
// Output one <url> entry per blog post, newest first
$opts = array(
    'template'   => 'blog/sitemap-item.html',
    'sort'       => 'postDateTime',
    'sort-order' => 'DESC'
);
perch_blog_custom($opts);
?>
We've already told sitemap.php to render as XML, but the .php extension remains. Google will be looking for 'sitemap.xml', so we'll need to make a rule in our .htaccess file to serve 'sitemap.php' when 'sitemap.xml' is requested. Assuming mod_rewrite is already switched on with RewriteEngine On, add the following to your .htaccess:
RewriteRule ^sitemap\.xml$ sitemap.php [nocase,last]
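It can be hard to tell at a glance what a rewrite pattern will and won't catch, so it's worth trying it against a few sample paths. The sketch below is plain PHP rather than Apache: the rewrite() helper is hypothetical and only approximates how mod_rewrite applies a pattern and substitution in an .htaccess file. It compares a loose pattern, (.*).xml(.*), with one anchored to the sitemap alone.

```php
<?php
// Hypothetical helper: approximates a RewriteRule applied to a request path
// (leading slash removed, as in .htaccess). Returns the rewritten path,
// or null when the pattern does not match.
function rewrite(string $pattern, string $substitution, string $path): ?string
{
    // The "i" modifier stands in for the [nocase] flag.
    if (!preg_match('~' . $pattern . '~i', $path, $m)) {
        return null;
    }
    // Expand $1, $2, ... in the substitution; the result replaces the whole path.
    return preg_replace_callback('/\$(\d)/', function ($ref) use ($m) {
        return $m[(int) $ref[1]] ?? '';
    }, $substitution);
}

// Anchored pattern with an escaped dot: only the sitemap is rewritten.
var_dump(rewrite('^sitemap\.xml$', 'sitemap.php', 'sitemap.xml'));
var_dump(rewrite('^sitemap\.xml$', 'sitemap.php', 'feeds/rss.xml'));

// Loose pattern: every .xml request is rewritten, and the unescaped dot
// matches any character, so even "-xml" in a file name is caught.
var_dump(rewrite('(.*).xml(.*)', '$1.php$2', 'feeds/rss.xml'));
var_dump(rewrite('(.*).xml(.*)', '$1.php$2', 'notes-xml.txt'));
```

The loose pattern is fine on a site whose only XML file is the sitemap, but it will redirect any other .xml request to a .php file that may not exist. Anchoring the rule to sitemap.xml avoids that.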
As well as submitting your XML sitemap to Google via Google Webmaster Tools, it's also a good idea to place a reference in robots.txt:
Sitemap: http://www.your-site.com/sitemap.xml
That should be everything you need.
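Before submitting the sitemap, it's worth checking that what comes out really is well-formed XML, since one stray character in a post slug can break the whole file. Here is a minimal sketch of such a check, assuming PHP 7 or later with the DOM extension enabled; the sitemap_locations() name is made up for this example.

```php
<?php
// Hypothetical helper: returns every <loc> value in a sitemap,
// or throws if the XML is not well-formed.
function sitemap_locations(string $xml): array
{
    // Collect parse errors quietly instead of emitting PHP warnings.
    $previous = libxml_use_internal_errors(true);
    $doc = new DOMDocument();
    $ok = $doc->loadXML($xml);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    if (!$ok) {
        throw new RuntimeException('Sitemap is not well-formed XML');
    }

    $locs = [];
    $ns = 'http://www.sitemaps.org/schemas/sitemap/0.9';
    foreach ($doc->getElementsByTagNameNS($ns, 'loc') as $node) {
        $locs[] = trim($node->textContent);
    }
    return $locs;
}

// A small inline sample; in practice you would pass in the body
// fetched from your own sitemap.xml URL.
$sample = '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">'
        . '<url><loc>http://www.your-site.com/</loc></url>'
        . '</urlset>';

print_r(sitemap_locations($sample));
```

This only checks that the document parses and lists the URLs it found. It doesn't validate against the sitemap schema, so treat it as a quick sanity check rather than a guarantee.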