This page contains affiliate links, which means that if you buy something using the affiliate links on this page, I earn a commission. Affiliate links help me get paid for the TONS of time, effort and energy I spend on creating free high-quality content for you (and they also pay for hosting and software costs).
Thank you for your support. Read more about my affiliate links policy here.
Did you know:
Google displays your PDF files in search results?
When a Google search term matches the contents of your PDF, the search result can contain a direct link to your PDF file.
This means that even people who aren't searching Google for free access to your PDFs can still view and download them ...
without signing up to your newsletter or buying your ebook.
As a result, you lose valuable leads and sales, if you don't protect PDF files on your site.
Now if you're like most people, you've simply uploaded the PDFs and didn’t really bother hiding them.
Because you figured: who can find them without knowing the exact URL?
And to be fair, you did nofollow all links leading to the download page and to the actual PDF itself, so you should be ok, right?
Why you should hide the PDFs on your website
Google indexes and displays PDF files in search results because your PDF contains keywords that Google might find useful in search results.
This means your lead magnets (case studies, cheat sheets, whitepapers, etc.) can easily be found by anyone using Google.
It’s even worse if your PDF is a paid ebook. People are getting access to what you are selling, for free!
And it’s not even difficult to search for PDFs on the internet:
Websites like PDFsearchengine provide an easy way to search for PDF files on the web.
Or you can use Google’s own advanced search.
All you have to do is enter the domain name of the website and select PDF as the file type.
Voila, you can find the PDF files indexed by Google.
Try it right now and see how many of your PDFs are visible to anyone willing to run a simple Google search.
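The same search can also be typed straight into the Google search box using the `site:` and `filetype:` operators. Here is a minimal Python sketch that builds such a search URL. The function name `pdf_search_url` and the domain `example.com` are placeholders I made up for illustration, so swap in your own domain.

```python
from urllib.parse import urlencode


def pdf_search_url(domain: str) -> str:
    """Build a Google search URL that lists the PDFs indexed for one domain."""
    # "site:" limits results to the domain, "filetype:" limits them to PDFs.
    query = f"site:{domain} filetype:pdf"
    return "https://www.google.com/search?" + urlencode({"q": query})


# example.com is a placeholder; use your own domain here.
print(pdf_search_url("example.com"))
```

Open the printed URL in your browser to see which PDFs Google currently lists for that domain.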
You should run the same search for your competitors' PDFs to learn how to gain a competitive advantage!
(They might have read your PDFs already!)
OR you can search for PDF files of topics you are interested in:
What if the PDFs aren’t used for lead gen AND aren’t a paid product?
Should I still hide them?
Short answer: Yes.
I understand, sometimes you want to give away a PDF to your website visitors without wanting anything from them.
In that case too you should hide your PDFs because they are likely to contain more or less the same keywords for which you want your pages to rank higher, not your PDFs.
If you don’t hide the PDFs, you risk duplicate content problems with Google, where the PDF competes with your own page for the same search terms.
And of course, you're not gonna go through all your PDFs, look for potential duplicate content, remove the duplicate content (probably ruining the PDF in the process) and upload them again.
All while making sure there are no broken links.
And you don't have to, if you follow the method below.
Why Use Download Pages
This section is a bit of a segue, but it fits in neatly with the Hide PDF method.
So just stick with me while I explain why you need download pages ...
Now, when someone signs up to your newsletter, you ask them to go to their inbox and click the link in your email to download the PDF.
(Please don't tell me you just provide a link to the PDF as soon as you get their email address ... because that's a surefire way to bloat your list with junk email addresses. Always make them open the first email. Make sure it's a real email address.)
Now, once they click the link, the best thing to do is send them to a download page on your site where they can download the PDF.
Why send them to a download page, you ask?
Because the download page allows you to do some nifty stuff like:
And if you don't have a unique download page for EACH of your lead magnets, you lose these opportunities.
PSST: Download pages on this site are customized too 🙂
How (not) to hide your PDFs?
Let's take a look at some solutions suggested elsewhere on the internet and see if they work:
You set up your lead magnet or purchased PDF to be delivered through your email marketing service like ActiveCampaign.
This way you completely avoid hosting the file on your server.
It seems like a good idea if, for some reason, you don't want the new subscriber to return to your site.
The problems with this approach are:
You could host your PDFs on a third party service like Dropbox and simply provide the link in your email.
But sadly, that doesn’t work either.
Unless you password protect your lead magnets or the folders they are in, Google will still index them.
And the PDFs will probably benefit from the higher authority rank of services like Dropbox and feature even more prominently in the search results.
That's a double whammy!
That's worse than just leaving the PDFs on your server.
Password protecting the PDF will make sure search engines can't read through it, and that'll cause Google to drop the file from its results.
But Google has already indexed the PDF file before you password protected it. And there’s no telling when Google will update its index to reflect the changes you made.
It could take weeks or even months. But that's not the problem.
The problem is that everyone who downloads the newly protected PDF file will encounter a password and if they are unable to access the file, it's bad for business.
See, when I sign up to a mailing list to download a lead magnet, I know what to expect. It's pretty standard:
1. Sign up to the list and be sent to a page which says something like: “Thank you for signing up, now check your inbox for the email which contains the link to the download file.”
2. Check my inbox and find the email. Click the link in it and I’m sent to a page from where I can download the lead magnet.
I’m sure you must have gone through this process so many times that you don’t even think about the steps anymore.
You don’t even read the instructions anymore because they are so standard.
You just do it automatically.
Guess what, it’s the same for your website visitors.
They don’t expect the lead magnets to be password protected.
So even if you provide them with the password, they are likely not to find it.
The whole password issue can be dealt with by providing the password in big bold fonts on your download page and also in your email.
No wait, there’s another issue!
The whole point of downloading a PDF is that I can access it later.
So 2 months later when I open your PDF lying on my desktop, what are the chances that I will remember the password?
You see how password protecting your PDF lead magnets is not a good idea?
If you already use robots.txt then you might think it’s a good idea to use it to block access to your PDF file using the code:
User-agent: *
Disallow: /*.pdf$ # Block PDF files. Wildcard pattern, supported by the major search engines.
Disallow: /pdfs/ # Block the /pdfs/ directory if you keep all your PDFs in one folder
The problem is that this only prevents Google from crawling your PDF files. It does nothing to remove them from its index or to stop them from being listed in search results.
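To see what a robots.txt rule actually decides, here is a small sketch using Python's built-in `urllib.robotparser`. The rules, the URLs and the helper name `may_crawl` are hypothetical examples of mine. Python's parser does not understand wildcard patterns, so the sketch only uses the directory rule.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content using only the directory rule.
# (Python's parser does not support wildcard patterns like /*.pdf$.)
ROBOTS_TXT = """\
User-agent: *
Disallow: /pdfs/
"""


def may_crawl(url, user_agent="Googlebot"):
    """Return True if the rules above allow the crawler to fetch the URL."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


# Placeholder URLs on example.com, for illustration only.
print(may_crawl("https://example.com/pdfs/cheat-sheet.pdf"))
print(may_crawl("https://example.com/blog/some-post/"))
```

Notice that the only question being answered is "may the crawler fetch this URL?". Nothing in robots.txt says "remove this URL from the index", which is exactly why PDFs that are already indexed stay there.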
While looking for resources to help me create this post, I came across many websites which claimed this to be the perfect solution.
A few are even listed in the top 5 results on Google.
Obviously they didn’t do their research and everyone who implemented their advice now THINKS that their PDF files are being protected from Google search.
In a nutshell, this will only work for new PDFs that you upload and won’t do anything to remove the old PDFs from any search engine’s database.
Given that this solution is still being touted as the best, all I want to say to the tech & marketing bloggers is that they should do their research before posting such stuff.
They aren’t just relying on wrong info themselves but also misleading anyone who reads their crappy advice.
Anyways, angry-man-rant over, let’s move on …
Noindex the “Here's Your Download” page (basically the page with the lead magnet download button) and nofollow all the links to that page.
But that doesn’t noindex your PDF file and Google still might be able to find it and index it.
Soooo ... that doesn’t work either.
Now, if only there was a way where you didn’t have to make too much of an effort and this problem could go away in just a couple of steps?
Fortunately there is 🙂
Here’s what you need to do:
Noindex the PDF files themselves
This means that Google will get the instruction to not index PDF files on your website and also to drop them from its index soon.
You should expect Google to do this fairly quickly because Google takes noindex tags quite seriously.
To do this you need to either:
<Files ~ "\.pdf$">
Header set X-Robots-Tag "noindex, nofollow"
</Files>
To block other file types: You can repeat this code as many times as you like and all you need to do is replace the .pdf part with another file type.
Blocking .mp4 files as an example:
<Files ~ "\.mp4$">
Header set X-Robots-Tag "noindex, nofollow"
</Files>
Take a deep breath and relax. It's done.
There’s no need to inform Google or any other search engines of the change.
They will catch on fairly quickly.
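If you want to confirm that your server really sends the header, here is a rough Python sketch that requests a file and looks at the `X-Robots-Tag` response header. The helper names `is_noindexed` and `check_url` are my own, and the check is deliberately simple: it only understands plain comma-separated values like the one set above, not values prefixed with a specific crawler name.

```python
from urllib.request import Request, urlopen


def is_noindexed(header_values):
    """Return True if any X-Robots-Tag value contains the noindex directive."""
    for value in header_values or []:
        directives = [part.strip().lower() for part in value.split(",")]
        if "noindex" in directives:
            return True
    return False


def check_url(url):
    """Send a HEAD request and report whether the response is marked noindex."""
    request = Request(url, method="HEAD")
    with urlopen(request, timeout=10) as response:
        # A server may send the header more than once, so collect every value.
        return is_noindexed(response.headers.get_all("X-Robots-Tag"))


# Usage (hypothetical URL, replace it with one of your own PDFs):
# print(check_url("https://example.com/pdfs/cheat-sheet.pdf"))
```

If the result is `False` for one of your PDFs, the rule is probably not being applied, for example because the code went into the wrong file or the server's headers module is not enabled.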
Of course, this still means that your PDFs can be downloaded without signing up for your email list if someone knows the exact URL of the PDF file or the “Here's Your Download” page.
For this reason I always recommend you noindex the “Here's Your Download” page.
This will prevent the download page from being indexed by Google.
So there you have it:
This is how you protect your PDF files from showing up in Google’s index.
Leave me a comment below if you found this useful.
Pulkit Gera - Blogging Done Better