Comparing the text in two PDF files
Need to compare the text in two PDF files to find differences?
Try the following (for Windows):
- Download Xpdf for pdftotext.exe.
- Download GNU utilities for Win32 for diff.exe.
- Extract the text from each PDF file while preserving the layout (this writes file1.txt and file2.txt next to the PDF files) with:
pdftotext -layout file1.pdf
pdftotext -layout file2.pdf
- Compare the extracted text files (not the PDF files themselves) and store the differences side by side (-y) in a text file with:
diff -y --width=220 file1.txt file2.txt > file1_file2_diff.txt
- You might need to try different values after --width= to prevent long lines from being truncated.
- Open file1_file2_diff.txt to look at the differences.
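The steps above could also be scripted. The following is a minimal Python sketch, not a tested tool: it assumes pdftotext and diff are both on the PATH, and the function names (longest_line, diff_pdfs) and the width guess of twice the longest line plus a small margin are assumptions made for illustration, not GNU diff's exact column formula.

```python
import subprocess
from pathlib import Path


def longest_line(*texts):
    """Return the length of the longest line across the given strings."""
    return max((len(line) for text in texts for line in text.splitlines()), default=0)


def diff_pdfs(pdf1, pdf2, out_path, width=None):
    """Extract text from two PDFs and write a side-by-side diff to out_path.

    Returns True if the extracted texts differ, False if they are identical.
    """
    txt_paths = []
    for pdf in (pdf1, pdf2):
        txt = Path(pdf).with_suffix(".txt")
        # pdftotext accepts an optional second argument naming the output file.
        subprocess.run(["pdftotext", "-layout", str(pdf), str(txt)], check=True)
        txt_paths.append(txt)

    if width is None:
        # Rough starting point (an assumption): room for two copies of the
        # longest line plus a margin for the gutter. Adjust if lines are cut off.
        texts = [p.read_text(errors="replace") for p in txt_paths]
        width = 2 * longest_line(*texts) + 10

    with open(out_path, "wb") as out:
        # diff exits with 1 when the files differ, so check=True is not used here.
        result = subprocess.run(
            ["diff", "-y", f"--width={width}", str(txt_paths[0]), str(txt_paths[1])],
            stdout=out,
        )
    if result.returncode > 1:
        raise RuntimeError("diff reported an error")
    return result.returncode == 1
```

For example, diff_pdfs("file1.pdf", "file2.pdf", "file1_file2_diff.txt") would produce the same output file as the manual steps, with the width guessed from the extracted text instead of fixed at 220.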
miscnotes/diffpdf.txt · Last modified: 2012/07/12 16:23 by bas