User-generated PDF documents disclose private information - Update
User-generated PDF documents can potentially disclose information the author might not wish to reveal. Using Internet Explorer and a virtual PDF generator to print a PDF file from an HTML page causes the document's entire storage path, for example file://C:\\Users\dab\Downloads\document.pdf, to be stored in the document itself. Unlike the file information in the document header and footer, these details cannot be excluded.
A similar problem used to exist in Microsoft Word, which saved the entire storage path as well as the author's details in the document itself. This information can easily be read using Notepad. Recent versions of Word, however, no longer present this problem.
While the behaviour itself doesn't represent a security hole, it could become a privacy issue because it allows third parties to obtain information about the directory structure on the author's computer. Path names may reveal details about user names, the software installed, or the category a document is filed under.
The security specialist who discovered the problem, and who goes by the pseudonym Inferno, points out that a simple Google search produces millions of PDF documents containing such path information. The problem is caused by Internet Explorer inserting the full storage path and file name in the document title; which PDF writer is used is irrelevant. When tested by the heise Security team, the information was disclosed when combining IE8 with CutePDF, but PDF writers by Adobe (Distiller) and other vendors also caused the problem when used with the Microsoft browser. Firefox doesn't behave this way and only inserts the file name in the document title.
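Authors who want to check whether one of their own PDFs is affected can scan the raw file for path-like strings. The following Python snippet is a minimal sketch of that idea, not a PDF parser: the function name find_paths and the regular expression are illustrative assumptions, and the scan will miss strings that sit inside compressed streams or are stored as UTF-16.

```python
import re

# Heuristic pattern (an assumption for this sketch): a "file://" URL or a
# Windows drive-letter path such as C:\Users\..., read from raw bytes.
# The lookbehind keeps schemes like "http://" from being mistaken for a drive.
PATH_PATTERN = re.compile(
    rb"(?:file://|(?<![A-Za-z0-9]))[A-Za-z]:[\\/]+[^\s()<>\x00]+"
)


def find_paths(data: bytes) -> list[str]:
    """Return path-like strings found in the raw bytes of a file."""
    return [m.group(0).decode("latin-1") for m in PATH_PATTERN.finditer(data)]
```

It could be called on the contents of a file opened in binary mode, for example find_paths(open("document.pdf", "rb").read()). An empty result does not prove that a document is clean, only that no uncompressed, single-byte path string was found.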
Microsoft has reportedly been informed of the problem. The vendor apparently plans to fix it in Internet Explorer 9. As a workaround, Inferno suggests using an editor to remove any private information. However, this may cause the PDF to become corrupted and impossible to display in a reader.
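One plausible reason a hand edit corrupts the file is that PDF readers locate objects through a cross-reference table of byte offsets, so an edit that changes the file's length can shift everything after it. The Python sketch below illustrates a more cautious variant of the workaround under that assumption: it overwrites the private part of the path with filler bytes of exactly the same length. The function name redact_same_length and the filler character are hypothetical choices, and the approach only works if the path is stored as plain, uncompressed bytes; it is best tried on a copy of the document.

```python
def redact_same_length(data: bytes, secret: bytes, filler: bytes = b"x") -> bytes:
    """Overwrite every occurrence of `secret` with filler bytes of equal length.

    Because the result has the same length as the input, byte offsets
    elsewhere in the file stay unchanged.
    """
    if len(filler) != 1:
        raise ValueError("filler must be exactly one byte")
    return data.replace(secret, filler * len(secret))
```

The caller has to pass the bytes exactly as they appear in the file, which may differ from the path as displayed, for instance if backslashes are escaped inside the PDF string.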
Update - According to a post on faq-o-matic.net, a document's entire storage path can also be added inadvertently when converting PowerPoint presentations into PDFs. For example, PowerPoint stores the entire path of any embedded graphics as meta-information. Users can, however, disable this option under "Document Properties".
- Millions of PDF invisibly embedded with your internal disk paths, Inferno's Full Disclosure post.
- Millions of PDF invisibly embedded with your internal disk paths, Inferno's blog post.
- PowerPoint-PDFs zeigen interne Informationen an, German language post from faq-o-matic.net.