PDF files are inappropriate
Let it be said right away: PDF files are inappropriate in an SEO context compared to ordinary web pages. There are several reasons for this:
- PDFs are hard to make visible on Google.
- PDF files do not allow users to interact with your website.
- PDF files are not user-friendly on mobile devices.
- Readers of PDF files are hard to track in Google Analytics.
- Users often download PDF files, and if you update the content later, users may not see the new content because they have an old version lying around.
Our recommendation is therefore to avoid PDF files as much as possible and to display your content on regular pages on your website instead. If you are already using PDF files, you may want to move the content from the PDF files to one or more regular pages - in which case you will need to delete the PDF files and set up redirects (see page 39) from the PDF files to the new pages.
Nevertheless, a PDF file can be a great way to present content - be it product information, sales presentations, cases, e-books, white papers, services and more.
Search engine optimization of PDF files
Content-wise, the same guidelines apply for optimising PDF files as for optimising web pages. This means that the content must be sufficient and that the keyword for which the PDF file is to be found must be included in the body text, the title and selected subheadings. It is a good idea to link to your website in the body of the text so that readers can get to know your company better.
File names on PDF files should be relevant and concise. We recommend that file names:
- have a maximum length of 50 characters including spaces
- are identical to the title of the PDF file in question
- do not contain special characters (æ, ø, å, ö, é, etc.), punctuation or capital letters
- use hyphens to separate words (not underscores or spaces).
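The file name recommendations above can be sketched as a small helper that derives a file name from a PDF title. This is a minimal illustration rather than a definitive implementation: the function name `pdf_file_name`, the transliteration table (æ to ae, ø to oe, å to aa) and the decision to count the `.pdf` extension towards the 50-character limit are all assumptions made for the example.

```python
import re
import unicodedata

# Assumed transliterations for letters that do not decompose into
# "base letter + accent" under Unicode normalisation.
_TRANSLITERATIONS = {"æ": "ae", "ø": "oe", "å": "aa", "ß": "ss"}

MAX_LENGTH = 50  # recommended maximum length of the file name


def pdf_file_name(title: str, max_length: int = MAX_LENGTH) -> str:
    """Turn a PDF title into a lowercase, hyphen-separated file name."""
    text = title.lower()
    for char, replacement in _TRANSLITERATIONS.items():
        text = text.replace(char, replacement)
    # Split accented letters (ö, é) into base letter + accent, then drop the accent.
    text = unicodedata.normalize("NFKD", text)
    text = text.encode("ascii", "ignore").decode("ascii")
    # Replace every run of characters other than a-z and 0-9 with one hyphen.
    text = re.sub(r"[^a-z0-9]+", "-", text).strip("-")
    # Leave room for the extension (assumption) and avoid a trailing hyphen.
    stem = text[: max_length - len(".pdf")].rstrip("-")
    return f"{stem}.pdf"
```

For example, `pdf_file_name("Café Menu, 2024!")` gives `cafe-menu-2024.pdf`. Truncation simply cuts at the character limit, so long titles may end mid-word and are better shortened by hand.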
It can be advantageous to fill in metadata in PDF files. In Word, you do this by clicking on the 'File' tab and then 'Properties' and 'Summary'. The following should be filled in:
- Title. Title of the PDF file, possibly combined with a subtitle.
- Author. The name of the person or company behind the PDF file.
- Subject. A brief description of the contents of the PDF file.
There is no length limit on the title, author or subject. Other types of metadata can be filled in as needed, but they have hardly any effect on SEO.
The final file size should be as small as possible. You can ensure this by using only a few images and a few different fonts. Vector-based graphics (often in SVG format) generally take up significantly less space than raster images (often in PNG or JPG format).
Indexing and blocking PDF files
Google indexes PDF files in much the same way as regular web pages. PDF files appear alongside the regular search results on Google, differentiated only by a small 'PDF' tag. Images from PDF files can be indexed in Google Image Search. Text in scanned PDF files can also be read and indexed by Google in the vast majority of cases.
However, Google does not typically index PDFs as often as regular web pages. The reason is simply that the content of PDF files is usually very static, so Google does not need to crawl PDF files as often, saving precious server resources.
[Figure: example of a PDF file and a regular web page in Google's search results]
If you don't want Google to index your PDF files, you can either not link to them publicly (so Google can't find them) or add a small piece of code to your web server configuration that tells search engines not to index one or more of the files. The latter requires the help of your website's technical manager. On Apache web servers, the configuration file is the .htaccess file, and in this file search engines can be told not to index any PDF files with the following code (which sends an X-Robots-Tag HTTP header with every PDF file):
<Files ~ "\.pdf$">
  Header set X-Robots-Tag "noindex, nofollow"
</Files>
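Once the configuration is in place, it is worth confirming that the server actually sends the header with your PDF files. The following is a minimal sketch using only Python's standard library. The function names and the example URL are hypothetical, and the check is simplified: it looks only at the first `X-Robots-Tag` header and ignores directives aimed at a specific crawler (such as `googlebot: noindex`).

```python
import urllib.request


def blocks_indexing(headers) -> bool:
    """Return True if the X-Robots-Tag header contains a 'noindex' directive."""
    value = headers.get("X-Robots-Tag", "") or ""
    directives = [part.strip().lower() for part in value.split(",")]
    return "noindex" in directives


def fetch_headers(url: str):
    """Send a HEAD request (no file download) and return the response headers."""
    request = urllib.request.Request(url, method="HEAD")
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.headers


# Usage (hypothetical URL, requires network access):
# print(blocks_indexing(fetch_headers("https://www.example.com/brochure.pdf")))
```

If the check returns `False` for a PDF file that should be blocked, a likely cause on Apache is that the module that sets response headers (mod_headers) is not enabled.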
Visitor tracking in PDF files
There are limited possibilities for visitor tracking in PDF files, as it is not possible to insert tracking scripts in PDF files in the same way as on regular web pages. On the other hand, PDF files typically represent such a small part of the content of a website that the lack of tracking is negligible.
The only common solution is event tracking in Google Analytics, which allows tracking the number of clicks to PDFs (including PDF downloads). This solution can also be used for tracking clicks to video files, for example. Read more about setting up event tracking in Google Analytics here: https://support.google.com/analytics/answer/1012044.