Note This post is outdated. See the 2018 version here.
I’ve been working on developing a paperless workflow. I want to save my digital files in PDF/A, a format designed specifically for archiving. (In particular, I want to create PDF/A-1b documents.) An important part of this workflow is the ability to print documents and emails to PDF/A. I found that with some tweaking, I can use CutePDF to do so. Here’s how.
Note This is a fairly advanced procedure and requires Administrator permissions.
CutePDF uses a program called Ghostscript to convert a printer file to PDF. If you don’t already have Ghostscript, CutePDF Writer 3.0 downloads Ghostscript 8.15. This PDF/A conversion needs Ghostscript 9.07, however, so install Ghostscript first.
1. If CutePDF is installed, uninstall it.
2. Download the GNU Affero-licensed version of Ghostscript 9.07 here. I found that the 32-bit version works fine even under 64-bit Windows 7. Install Ghostscript to the default directory, C:\Program Files (x86)\gs\gs9.07. At the end of the install, go ahead and let it Generate cidfmap for Windows CJK TrueType fonts.
3. Download the free CutePDF Writer 3.0 here. Install it, but be very careful to uncheck all the extra software it will try to install:
The CutePDF installer should automatically find your Ghostscript 9.07 installation and should not prompt you to download Ghostscript.
4. Create an empty folder on your C: drive called C:\GS_PDFA (Ghostscript PDF/A).
5. Download the Adobe ICC profiles here. An ICC profile describes a “color space.” We’ll use the simplest one, Adobe RGB (1998). From the downloaded zip archive, extract AdobeRGB1998.icc to the C:\GS_PDFA folder.
6. Ghostscript needs some special instructions for creating PDF/A files. These are partially contained in a PDFA_def.ps file. For more information, see the sample file in your Ghostscript installation (C:\Program Files (x86)\gs\gs9.07\lib\PDFA_def.ps). Also see this bug report.
I’ve modified the sample to work with the 3-color RGB color space. In the C:\GS_PDFA folder, create an empty text file named PDFA_def.ps and paste in the following text. (Since this is a derivative work of a file included with GNU Affero-licensed Ghostscript, please use it under the terms of the GNU Affero General Public License.)
/ICCProfile (C:/GS_PDFA/AdobeRGB1998.icc) % Customize.
def

[ /Title (Title) % Customize.
  /DOCINFO pdfmark

% Define an ICC profile :

[/_objdef {icc_PDFA} /type /stream /OBJ pdfmark
[{icc_PDFA} <</N systemdict /ProcessColorModel get /DeviceGray eq {1} {systemdict /ProcessColorModel get /DeviceRGB eq {3} {4} ifelse} ifelse >> /PUT pdfmark
[{icc_PDFA} ICCProfile (r) file /PUT pdfmark

% Define the output intent dictionary :

[/_objdef {OutputIntent_PDFA} /type /dict /OBJ pdfmark
[{OutputIntent_PDFA} <<
  /Type /OutputIntent % Must be so (the standard requires).
  /S /GTS_PDFA1 % Must be so (the standard requires).
  /DestOutputProfile {icc_PDFA} % Must be so (see above).
  /OutputConditionIdentifier (AdobeRGB1998) % Customize
>> /PUT pdfmark
[{Catalog} <</OutputIntents [ {OutputIntent_PDFA} ]>> /PUT pdfmark
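If you keep the ICC profile somewhere other than C:\GS_PDFA, note that the /ICCProfile line above is a PostScript string, and a backslash is an escape character inside PostScript strings. That is why the path is written with forward slashes. The following Python sketch shows one way to turn a Windows path into a safe PostScript string literal; the function name ps_path_literal is my own and is not part of Ghostscript or CutePDF.

```python
from pathlib import PureWindowsPath


def ps_path_literal(windows_path: str) -> str:
    """Return a PostScript string literal, e.g. (C:/dir/file.icc), for a Windows path.

    Hypothetical helper for illustration only.
    """
    # Forward slashes avoid the backslash, which PostScript treats as an escape.
    posix_style = PureWindowsPath(windows_path).as_posix()
    # Escape any remaining backslashes and parentheses. Balanced parentheses
    # are legal unescaped in PostScript strings, but escaping them is harmless.
    escaped = (
        posix_style.replace("\\", "\\\\")
        .replace("(", "\\(")
        .replace(")", "\\)")
    )
    return f"({escaped})"
```

For the folder used in this post, ps_path_literal(r"C:\GS_PDFA\AdobeRGB1998.icc") produces the same literal that appears in the first line of PDFA_def.ps.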
7. CutePDF also needs some instructions for printing to PDF/A. In the C:\GS_PDFA folder, create an empty text file named PDFWrite.pdfa.rsp. Open the file in Notepad and paste in this text:
-sDEVICE=pdfwrite
-q
-dPDFSETTINGS=/default
-dAutoRotatePages=/All
-dNOPAUSE
-dBATCH
-dPDFA
-dNOOUTERSAVE
-sProcessColorModel=DeviceRGB
-dUseCIEColor
-dPDFACompatibilityPolicy=1
"C:\GS_PDFA\PDFA_def.ps"
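At this point the C:\GS_PDFA folder should hold three files: the ICC profile, PDFA_def.ps, and PDFWrite.pdfa.rsp. A missing or misnamed file is an easy mistake to make (Notepad likes to append .txt), so here is a small Python sketch that lists whichever of the three is absent. The function name missing_pdfa_files is hypothetical; only the file names come from the steps above.

```python
from pathlib import Path

# File names taken from steps 5 to 7 of this post.
REQUIRED_FILES = ("AdobeRGB1998.icc", "PDFA_def.ps", "PDFWrite.pdfa.rsp")


def missing_pdfa_files(folder) -> list:
    """Return the names of required files that are not present in folder."""
    base = Path(folder)
    return [name for name in REQUIRED_FILES if not (base / name).is_file()]
```

Calling missing_pdfa_files(r"C:\GS_PDFA") on a correctly prepared machine should return an empty list.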
8. The second part of the instructions for CutePDF must be added to the CutePDF installation folder, C:\Program Files (x86)\Acro Software\CutePDF Writer. You will need Administrator permissions.
Create an empty text file in that folder named setup.ini. Run Notepad as Administrator, open the file, and paste in this text:
[Parameters]
Command="C:\Program Files (x86)\gs\gs9.07\bin\gswin32c.exe"
Arguments=-sOutputFile="%1" @"C:\GS_PDFA\PDFWrite.pdfa.rsp" -
This tells CutePDF where to find your Ghostscript installation and what arguments to pass to it. We are basically telling it to use the commands in our custom PDFWrite.pdfa.rsp, which will in turn load the special PDFA_def.ps file before loading the input stream. And yes, the last line in setup.ini ends in a hyphen (-)!
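To see what command line this setup.ini amounts to, the Python sketch below parses the same text and fills in the output file name. It rests on an assumption: that CutePDF replaces %1 with the path of the PDF being written and runs Command followed by Arguments. I have not seen CutePDF's internals, so treat this as an illustration, not a description of what CutePDF actually does. The name expand_command and the sample output path are made up.

```python
import configparser

# Same content as the setup.ini described above.
SETUP_INI = r"""[Parameters]
Command="C:\Program Files (x86)\gs\gs9.07\bin\gswin32c.exe"
Arguments=-sOutputFile="%1" @"C:\GS_PDFA\PDFWrite.pdfa.rsp" -
"""


def expand_command(ini_text: str, output_path: str) -> str:
    """Build the command line that setup.ini appears to describe.

    Assumption: %1 stands for the output PDF path.
    """
    # interpolation=None stops configparser from treating % as special.
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep the original key capitalization
    parser.read_string(ini_text)
    params = parser["Parameters"]
    arguments = params["Arguments"].replace("%1", output_path)
    return f'{params["Command"]} {arguments}'


if __name__ == "__main__":
    # Hypothetical output path, for illustration only.
    print(expand_command(SETUP_INI, r"C:\Temp\out.pdf"))
```

The trailing hyphen survives the expansion; it is the standard Ghostscript way of saying "read the input from standard input".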
Print to PDF/A
That’s it! Now, when you print using CutePDF, the file should be created in PDF/A-1b format. After opening in Adobe Reader, you’ll see a blue bar at the top:
Note that this message, in spite of what it says, does not guarantee compliance with the standard. To test compliance, you can use Adobe Acrobat’s pre-flight testing (Acrobat offers a 30-day free trial). You can also try a free online validator like the one at PDF-Tools.com or the one at intarsys.de (German). In my testing, files created with Ghostscript passed the Acrobat and PDF-Tools tests, but failed the intarsys test with font errors.
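The distinction between claiming PDF/A and complying with it can be made concrete. A PDF/A file carries an XMP metadata packet with pdfaid:part and pdfaid:conformance entries, and in PDF/A-1 that metadata stream must be stored unfiltered, so it is readable as plain text. The Python sketch below reads only that claim. It is not a validator, and I am assuming (not asserting) that a viewer's PDF/A banner is driven by the same entries. The function names are my own.

```python
import re


def _pdfaid_field(name: str, text: str):
    """Find a pdfaid entry written as an element or as an attribute."""
    # Handles <pdfaid:part>1</pdfaid:part> and pdfaid:part="1" / pdfaid:part='1'.
    pattern = r"pdfaid:" + name + r"""\s*(?:>|=\s*["'])\s*([A-Za-z0-9]+)"""
    match = re.search(pattern, text)
    return match.group(1) if match else None


def claimed_pdfa_level(pdf_bytes: bytes):
    """Return e.g. 'PDF/A-1b' if the file claims a PDF/A level, else None.

    This reports the claim in the metadata only; it does not test compliance.
    """
    text = pdf_bytes.decode("latin-1")  # latin-1 never fails on arbitrary bytes
    part = _pdfaid_field("part", text)
    if part is None:
        return None
    conformance = _pdfaid_field("conformance", text) or ""
    return f"PDF/A-{part}{conformance.lower()}"
```

To try it on a real file, pass the result of open(path, "rb").read(). A file can return "PDF/A-1b" here and still fail a validator, which is exactly the point of the paragraph above.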
Print to Plain PDF
If you want to go back to printing to plain PDF from CutePDF, in the CutePDF installation folder, just rename setup.ini to setup.pdfa.ini. CutePDF will then use its default settings. You can leave the C:\GS_PDFA folder set up so all you need to do for PDF/A printing is to rename setup.pdfa.ini back to setup.ini.
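If you switch back and forth often, the rename can be scripted. This is a minimal Python sketch of the toggle described above; the function name toggle_pdfa is hypothetical. Because the CutePDF folder lives under Program Files, the script would need to run with Administrator permissions.

```python
from pathlib import Path


def toggle_pdfa(install_dir) -> str:
    """Swap setup.ini and setup.pdfa.ini; return the mode now in effect."""
    folder = Path(install_dir)
    active = folder / "setup.ini"       # present: CutePDF prints PDF/A
    parked = folder / "setup.pdfa.ini"  # present: CutePDF uses its defaults

    if active.exists() and parked.exists():
        raise FileExistsError("Both setup.ini and setup.pdfa.ini exist; remove one.")
    if active.exists():
        active.rename(parked)
        return "plain PDF"
    if parked.exists():
        parked.rename(active)
        return "PDF/A"
    raise FileNotFoundError("Neither setup.ini nor setup.pdfa.ini was found.")
```

For the installation used in this post, the call would be toggle_pdfa(r"C:\Program Files (x86)\Acro Software\CutePDF Writer").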
Converting PDF to PDF/A
If you already have a text-based (searchable) PDF file, perhaps received from a bank or utility, you can convert it to PDF/A without printing with CutePDF. See Batch Convert PDF to PDF/A.
Sample Files and an Issue with Chrome
Added March 2, 2018
Here is a sample page printed on February 27, 2018 from Google Chrome 64.0.3282.186 using CutePDF 3.2.0.1 and GPL Ghostscript 9.22:
And here is a sample page printed on March 2, 2018 from Firefox 58.0.2 using CutePDF 3.2.0.1 and GPL Ghostscript 9.22:
It seems that the output from Chrome (170KB) is made up of a patchwork of small images; the text is not selectable and not searchable. In the output from Firefox (233KB), the text is selectable and searchable.
The whole point of this exercise is to go paperless by creating searchable PDF/As. I see that this was reported as a Chrome issue in 2011; the only suggested workaround is to use Chrome’s native Save as PDF printer, but then that’s not PDF/A, so you’d have to run it through the PDF/A batch conversion process. I hate to think how many non-searchable PDFs I’ve created since switching to Chrome. If anyone has a one-step workaround that allows printing searchable PDF/A files from Chrome, please post a comment.
Update March 24, 2018 Today I used Chrome to Save as PDF, which was searchable, but after conversion with the PDF/A script, it was much larger, difficult to read (blurry), and NOT searchable. The same page printed in Firefox directly to CutePDF was small and searchable. I found another relevant 2011 bug report on Chrome, but the suggested workaround, using Ctrl-Shift-P to bypass the Chrome Preview, still did not create searchable text when printed to CutePDF. There is some weirdness going on with rasterizing before printing, which may help font rendering but does not help in creating archival PDFs!
I know this trick is ancient in techie terms but it’s just what I needed for my current project so wanted to say thanks for sharing, works great.
Matt – I still use this every day. Glad it helps you too!
Hi Mark,
Would it be possible to share a PDF file you created this way? Have to involve the IT guys of my organisation to install this, so I would like to take a look at it beforehand. Thanks!
By the way, I would like to validate a file using veraPDF. I work for a heritage organisation that is involved in creating a good tool for PDF/A validation. The software for creating PDF/A files still seems to be a problem in a lot of cases, certainly for content based on e-mails. CutePDF might still be one of the easiest solutions…
Thanks for sharing your solution.
Stijn
Stijn,
I’ve posted a sample at the end of the article. Let me know how it fares on veraPDF!
I’ve noticed in the last year or so that the PDF/A conversion using Ghostscript (as described in the linked Batch Convert article) sometimes seems to degrade the PDF quality. Invoices from the cable company, for example, are harder to read after conversion. I haven’t delved into what is causing that–maybe some new options in PDF formatting that Ghostscript doesn’t like? Maybe the cable company uses some strange fonts or text-as-image stuff?
One advantage of the batch conversion approach, at least during testing, is that you can see the warnings and errors generated during the conversion. Those are not shown when using CutePDF, though I assume the same thing is going on in the background.
Thanks for adding the file. I tested it and it validates perfectly as PDF/A-1B.
I certainly hope to test your setup myself. Would like to tweak it as well, to create PDF/A-2B or even -2A documents. I’ll post it if we find anything interesting. :)
Stijn, thanks for the follow-up. Note that I’ve discovered what, for me, is an issue when printing to CutePDF from Chrome: the PDF/A is not searchable. See the modified section at the end of the post above.
Thanks for sharing the issue. That is indeed an issue. Do you have other experiences with printing from Outlook for example?
Stijn – just did a quick test print to CutePDF from Outlook 2016 and it is searchable. It seems the issue isn’t with CutePDF or the Ghostscript conversion, it’s with the source program. Chrome apparently prints using a patchwork of images–maybe to make smaller output–and that means there is no text for CutePDF to include.
Very good to know. Thanks a lot for the follow up!
Hello Stijn,
Do you speak Dutch?
I get a blank page when I print something to PDF. I have followed all the steps above.
Thanks for your answer.
BTW, we use GS version C:\Program Files\gs\gs9.18\bin\gswin64c.exe
Greetings Xander
Dear Xander,
I speak Dutch, but I’m afraid I cannot help you. So far I haven’t used the method described above; I have only tested with plain CutePDF. I guess it’s better to address your question to the author, Mark.
Stijn
This PDF/A thing has unfortunately become pretty unreliable. See my March 24, 2018 comment in the main article. In some cases, searchable PDFs are no longer searchable after converting with Ghostscript (version 9.22). In other cases, fonts become fuzzy and less legible. If I ever find the time, I will try to figure out what has changed. For now, keep in mind that it may not work.
I finally figured out what has been going on with PDF/A conversion: semi-transparent fonts were getting rasterized, causing fonts to go fuzzy and files to lose searchability and bloat in size. Recent versions of Ghostscript can handle PDF/A-2b, which supports transparency, which solves all that. See the rewritten post here:
https://www.mcbsys.com/blog/2018/10/use-cutepdf-to-print-to-pdf-a-for-free-2018-edition/