Wednesday, June 18, 2008

Effective use of PDFs in websites and intranets

My agency has historically provided documents within our website and intranet in three formats: HTML (web pages), RTF (Rich Text Format) and PDF. The rationale behind this has been to give customers choice.

It has also allowed us to look at relative usage over time to see which formats are most preferred by our customers and staff.

The ratio we see by visits roughly averages as follows:
100 HTML (web pages) : 12 PDF : 1 RTF

This does suggest that while, as you'd expect, most web users prefer to view web pages, there is a legitimate place in our website for PDF versions. (RTF we're considering dropping altogether.)
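
To put that ratio in percentage terms, here is a quick calculation. It is only an illustration; the figures are the rounded averages quoted above, not raw visit counts.

```python
# Relative visits per format, taken from the 100 : 12 : 1 ratio above.
visits = {"HTML": 100, "PDF": 12, "RTF": 1}

total = sum(visits.values())

# Share of all document visits for each format, as a percentage to one decimal place.
shares = {fmt: round(100 * count / total, 1) for fmt, count in visits.items()}

print(shares)  # {'HTML': 88.5, 'PDF': 10.6, 'RTF': 0.9}
```

In other words, roughly one visit in ten goes to a PDF, while RTF accounts for fewer than one in a hundred.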

There are significant incremental costs involved in delivering documents in different formats.

These include keeping the formats in step when content is updated and, particularly for PDF, managing accessibility and effective searching.

This leads on to the core issue:
If we have a legitimate need to provide different formats of documents in our website and there is a cost to doing so, how do we maximise the effectiveness of the different formats in order to maximise our ROI?

Here are some steps that my agency has taken.

Firstly, looking at PDF-specific issues, many PDFs are not designed to be found easily in search engines. Where they are findable, the text shown for the PDF in search results is often gobbledygook.

This is easily fixed by setting a couple of properties in each PDF, well explained in the article Make your PDFs work well with Google (and other search engines) in the Acrobat user group.
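
As a minimal sketch of how such properties could be checked before publishing, the helper below reports which descriptive fields are missing or blank. It assumes the properties have already been read into a plain dictionary by whatever PDF tool you use. The field names are standard keys in the PDF document information dictionary, but treating these four as the ones that matter for search is an assumption, not something taken from the article. The function name and the sample values are hypothetical.

```python
# Standard keys from the PDF document information dictionary.
# Which of them matter most for search is an assumption made for this sketch.
SEARCH_FIELDS = ("Title", "Subject", "Author", "Keywords")


def missing_search_fields(info):
    """Return the names of search-relevant fields that are absent or blank."""
    return [field for field in SEARCH_FIELDS
            if not str(info.get(field) or "").strip()]


# Hypothetical properties for one fact sheet.
example = {
    "Title": "Fact sheet: Applying online",
    "Subject": "",
    "Author": "My Agency",
}

print(missing_search_fields(example))  # ['Subject', 'Keywords']
```

A check like this could be run over every PDF before it is uploaded, so that documents with empty properties are caught early.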

Accessibility can also become an issue. While PDFs are actually quite good for accessibility purposes, many are never optimised for accessibility, due to either lack of knowledge or lack of time. Given that government has a legal obligation to deliver accessible websites, this could be quite a large issue for some agencies when audited.

Adobe's PDF creator comes with the ability to test the accessibility of a PDF and suggest improvements. I use this regularly on PDFs and find that it's both effective and provides useful suggestions. If you are unsure of what you can do to address PDF accessibility, simply running this report can provide you with a handle on what needs to be done.

The PDF creator also comes with a system for metatagging images within PDF documents with alternative text and structuring the order in which headings and text blocks are read to help people who cannot read the words, such as those who are vision impaired.

The most recent versions of Adobe Acrobat Reader also include a screen reader for the vision-impaired, and simply using this tool to listen to your documents with your eyes closed can give you a clearer insight into how accessible your PDFs really are.

Finally, in my opinion, PDFs are not a great format for online use. If you're on the web you expect to find web pages. PDF is a useful print alternative, but isn't really the format of choice for reading online. In my experience PDFs are primarily used when someone wants to print a document for later reference.

HTML web pages are quite simple and fast to update. However, PDF (and RTF) versions require significantly more attention and, often, specialist designers or tools.

This adds cost and time but not always significant value, particularly when changes are quite small and non-critical.

There are approaches that can reduce the cost and time required - and avoid those situations when your PDFs and web pages do not match.

My agency is in the process of implementing a CSS-based replacement for printable PDF fact sheets. Basically we've developed a fact sheet print template for web pages which can be used to generate more effective PDF-like pages.
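
A print template of this kind can be sketched as a page wrapper that carries its own print-only CSS rules. This is an illustration under stated assumptions, not our actual template: the function name, the selectors (nav, header, footer, .sidebar) and the type sizes are all hypothetical.

```python
from html import escape

# Print-only rules. The selectors and sizes are assumptions for this sketch.
PRINT_CSS = """
@media print {
  nav, header, footer, .sidebar { display: none; }  /* hide site navigation */
  body { font: 11pt/1.4 serif; color: #000; background: #fff; }
  h1, h2 { page-break-after: avoid; }                /* keep headings with their text */
  a[href]::after { content: " (" attr(href) ")"; }   /* show link targets on paper */
}
"""


def fact_sheet_page(title, body_html):
    """Wrap existing page content in a document that prints like a fact sheet."""
    return (
        "<!DOCTYPE html>\n"
        '<html lang="en"><head><meta charset="utf-8">'
        f"<title>{escape(title)}</title>\n"
        f"<style>{PRINT_CSS}</style></head>\n"
        f"<body><h1>{escape(title)}</h1>\n{body_html}\n</body></html>"
    )


page = fact_sheet_page("Payments & rates", "<p>Current rates are listed below.</p>")
```

Because the rules sit inside an @media print block, the page looks the same on screen and only changes when it is printed.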

Another approach we are looking at for the future is to use on-the-fly PDF generators, which allow the delivery of any web content as a PDF at the click of a button.

The advantage of this approach is that an agency can continue to provide PDF versions, but without the effort and cost of developing them. Only the website's HTML version needs to be maintained as the PDF version is basically generated on request from your website users.
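
The on-request idea can be sketched as follows. The sketch deliberately does not name a PDF product: the render function is a hypothetical stand-in for whichever HTML-to-PDF tool an agency chooses, and the caching by content hash is an added assumption to avoid re-rendering a page that has not changed.

```python
import hashlib


class OnDemandPdf:
    """Produce a PDF for a page only when a user asks for it."""

    def __init__(self, render):
        # render: hypothetical callable that turns an HTML string into PDF bytes.
        self._render = render
        self._cache = {}

    def get(self, html):
        # Key on the page content, so an edited page gets a fresh PDF
        # and an unchanged page reuses the copy already generated.
        key = hashlib.sha256(html.encode("utf-8")).hexdigest()
        if key not in self._cache:
            self._cache[key] = self._render(html)
        return self._cache[key]


# Stand-in renderer for demonstration only; it does not produce a real PDF.
calls = []

def fake_render(html):
    calls.append(html)
    return b"%PDF-placeholder:" + html.encode("utf-8")


pdfs = OnDemandPdf(fake_render)
first = pdfs.get("<h1>Fact sheet</h1>")
second = pdfs.get("<h1>Fact sheet</h1>")       # served from the cache
updated = pdfs.get("<h1>Fact sheet v2</h1>")   # content changed, so rendered again
```

The point of the design is that the HTML remains the single source: the PDF is always derived from the current page, so the two cannot drift apart.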

So findability, accessibility and more accurate and timely delivery are all achievable with PDFs with just a little thought. These lift the effectiveness of this format, helping our customers find and access the information.

Of course, most people will still prefer web pages, but if your agency is committed to offering PDF as an option - or the sole way to access documents - with some improvements to their effectiveness you'll be helping ensure that your customers get what they need.

