Compare PDF Files using REST API in Python

PDF (Portable Document Format) is one of the most commonly used file types today. It is typically used to distribute read-only documents while preserving the page layout. In various cases, we may need to compare the contents of two or more PDF documents, or multiple versions of the same document. We can compare PDF documents programmatically to identify their similarities and differences. In this article, we will learn how to compare PDF files using a REST API in Python.

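The following topics are covered in this article:

  • REST API to Compare PDF Files and Python SDK
  • Compare Two PDF Files using a REST API in Python
  • Compare Multiple PDF Files in Python
  • Customize Comparison Results in Python
  • Get List of Changes in Python
  • Compare and Save with Password & Metadata in Python
  • Try Online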

REST API to Compare PDF Files and Python SDK

For comparing PDF documents, we will use the Python SDK of the GroupDocs.Comparison Cloud API. It allows you to compare two or more documents of the supported formats and find the differences between them. Please install it using the following command in the console:

pip install groupdocs_comparison_cloud

Please get your Client ID and Secret from the dashboard before following the steps below. Once you have your ID and secret, add them to your code so that the SDK can authenticate your requests.
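
A minimal sketch of this setup is given below. It assumes the SDK exposes a Configuration class that takes the Client ID and Secret, and that the API base URL is https://api.groupdocs.cloud. The environment variable names are hypothetical and were chosen only for this example. The SDK is imported inside the function so that the snippet can be loaded even where the package is not installed.

```python
import os


def make_configuration():
    """Build an SDK Configuration from credentials stored in the environment."""
    # Imported lazily so this sketch can be loaded without the SDK installed.
    import groupdocs_comparison_cloud

    # GROUPDOCS_CLIENT_ID / GROUPDOCS_CLIENT_SECRET are hypothetical variable
    # names used for illustration; any secure credential source works.
    client_id = os.environ["GROUPDOCS_CLIENT_ID"]
    client_secret = os.environ["GROUPDOCS_CLIENT_SECRET"]

    # Assumption: Configuration(client_id, client_secret) and api_base_url
    # behave as in the SDK's published samples.
    configuration = groupdocs_comparison_cloud.Configuration(client_id, client_secret)
    configuration.api_base_url = "https://api.groupdocs.cloud"
    return configuration
```

Reading the credentials from the environment keeps them out of source control; hard-coding them as string literals works as well for a quick test.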

Compare Two PDF Files using a REST API in Python

We can compare PDF documents on the cloud by following the simple steps given below:

  1. Upload the PDF files to the cloud
  2. Compare the PDF files
  3. Download the resultant PDF file

Upload the PDF Files

Firstly, we will upload the source and target PDF files to the cloud using the following code sample:
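
The upload step might look like the sketch below. It assumes the SDK provides FileApi.from_config(), UploadFileRequest(path, file, storage_name) and FileApi.upload_file(), and that an empty storage name selects the default storage. The helper name upload_files and the local paths are illustrative only.

```python
import os


def upload_files(configuration, local_paths, storage_name=""):
    """Upload local files to cloud storage and return their cloud paths."""
    # Imported lazily so this sketch can be loaded without the SDK installed.
    import groupdocs_comparison_cloud as gcc

    file_api = gcc.FileApi.from_config(configuration)
    cloud_paths = []
    for local_path in local_paths:
        # Store each file at the storage root under its own file name.
        cloud_path = os.path.basename(local_path)
        # Assumption: the request takes (cloud path, local file path, storage).
        request = gcc.UploadFileRequest(cloud_path, local_path, storage_name)
        file_api.upload_file(request)
        cloud_paths.append(cloud_path)
    return cloud_paths
```

For example, upload_files(configuration, ["docs/source.pdf", "docs/target.pdf"]) would upload both files and return their cloud paths, source.pdf and target.pdf.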

As a result, the uploaded files will be available in the files section of the dashboard on the cloud.

Compare PDF Files using Python

We can compare two PDF documents programmatically by following the steps given below:

  • Firstly, create an instance of the CompareApi.
  • Next, create an instance of the FileInfo.
  • Then, set the source PDF file path.
  • After that, create another instance of the FileInfo.
  • Then, set the target PDF file path.
  • Next, create an instance of the ComparisonOptions.
  • Then, assign source and target files.
  • Also, set the output file path.
  • After that, create an instance of the ComparisonsRequest with the ComparisonOptions object.
  • Finally, get the results by calling the CompareApi.comparisons() method with the ComparisonsRequest as an argument.

The following code sample shows how to compare two PDF files using a REST API in Python.
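
The steps above can be sketched as follows. The class and method names come from the steps themselves; the attribute names (file_path, source_file, target_files, output_path) and CompareApi.from_keys() are how we recall the SDK and should be checked against the installed version. The function name compare_two_pdfs is illustrative.

```python
def compare_two_pdfs(client_id, client_secret, source_path, target_path, output_path):
    """Compare two PDF files already uploaded to cloud storage."""
    # Imported lazily so this sketch can be loaded without the SDK installed.
    import groupdocs_comparison_cloud as gcc

    api = gcc.CompareApi.from_keys(client_id, client_secret)

    # Source document (cloud path).
    source = gcc.FileInfo()
    source.file_path = source_path

    # Target document (cloud path).
    target = gcc.FileInfo()
    target.file_path = target_path

    # Comparison options: what to compare and where to save the result.
    options = gcc.ComparisonOptions()
    options.source_file = source
    options.target_files = [target]
    options.output_path = output_path

    request = gcc.ComparisonsRequest(options)
    return api.comparisons(request)
```

A call such as compare_two_pdfs(client_id, client_secret, "source.pdf", "target.pdf", "result.pdf") would then write the comparison result to result.pdf in cloud storage.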

Figure: Compare two PDF files using a REST API in Python.

The resultant PDF file also contains a summary page at the end of the document, as shown below:

Figure: Summary page showing total deleted or inserted elements.

Download the Resultant File

The above code sample will save the differences in a newly created PDF file on the cloud. It can be downloaded using the following code example:
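
One possible shape for the download step is sketched below. It assumes the SDK provides DownloadFileRequest(path, storage_name) and that FileApi.download_file() returns the path of a temporary local file, as we recall from the SDK samples. The helper name download_result is hypothetical.

```python
import os
import shutil


def download_result(configuration, cloud_path, local_dir, storage_name=""):
    """Download a file from cloud storage into local_dir and return its path."""
    # Imported lazily so this sketch can be loaded without the SDK installed.
    import groupdocs_comparison_cloud as gcc

    file_api = gcc.FileApi.from_config(configuration)
    request = gcc.DownloadFileRequest(cloud_path, storage_name)

    # Assumption: download_file() returns the path of a temporary local file.
    temp_path = file_api.download_file(request)

    # Move the temporary file into the target directory under its cloud name.
    destination = os.path.join(local_dir, os.path.basename(cloud_path))
    return shutil.move(temp_path, destination)
```

Naming the destination explicitly avoids depending on whatever temporary file name the SDK chooses.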

Compare Multiple PDF Files in Python

We can compare multiple PDF documents by following the steps given below:

  • Firstly, create an instance of the CompareApi.
  • Next, create an instance of the FileInfo and set the source PDF file path.
  • Then, create another instance of the FileInfo and set the target PDF file path.
  • After that, repeat the above step to add more target files.
  • Next, create an instance of the ComparisonOptions.
  • Then, assign source/target files and set the output file path.
  • After that, create an instance of the ComparisonsRequest with the ComparisonOptions object.
  • Finally, get the results by calling the CompareApi.comparisons() method with the ComparisonsRequest as an argument.

The following code sample shows how to compare multiple PDF files using a REST API in Python.
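
A sketch of the multi-file case is given below. The only difference from a two-file comparison is that target_files receives several FileInfo objects. The attribute names and CompareApi.from_keys() are assumptions based on our recollection of the SDK, and compare_many_pdfs is an illustrative name.

```python
def compare_many_pdfs(client_id, client_secret, source_path, target_paths, output_path):
    """Compare one source PDF against several target PDFs in cloud storage."""
    # Imported lazily so this sketch can be loaded without the SDK installed.
    import groupdocs_comparison_cloud as gcc

    def file_info(path):
        # Wrap a cloud path in the SDK's FileInfo model.
        info = gcc.FileInfo()
        info.file_path = path
        return info

    options = gcc.ComparisonOptions()
    options.source_file = file_info(source_path)
    # One FileInfo per target document.
    options.target_files = [file_info(path) for path in target_paths]
    options.output_path = output_path

    api = gcc.CompareApi.from_keys(client_id, client_secret)
    return api.comparisons(gcc.ComparisonsRequest(options))
```

For instance, passing ["target1.pdf", "target2.pdf", "target3.pdf"] as target_paths compares the source against all three files in one request.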

Customize Comparison Results in Python

We can customize the style of changes found in the result of the comparison process by following the steps given below:

  • Firstly, create an instance of the CompareApi.
  • Next, create an instance of the FileInfo and set the source PDF file path.
  • Then, create another instance of the FileInfo and set the target PDF file path.
  • Next, create an instance of the Settings.
  • Then, set the comparison sensitivity and the properties that customize the style of each item.
  • Next, create an instance of the ComparisonOptions.
  • Then, assign source/target files and set the output file path.
  • After that, create an instance of the ComparisonsRequest with the ComparisonOptions object.
  • Finally, get the results by calling the CompareApi.comparisons() method with the ComparisonsRequest as an argument.

The following code sample shows how to customize comparison results using a REST API in Python.
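
The Settings part of these steps might be sketched as follows. The names ItemsStyle, sensitivity_of_comparison, inserted_items_style, deleted_items_style and changed_items_style, as well as the decimal colour codes passed as strings, follow our recollection of the SDK samples and are assumptions to verify. The colour values themselves are placeholders.

```python
def build_style_settings(sensitivity=100,
                         inserted_color="16711680",
                         deleted_color="14166746",
                         changed_color="14320170"):
    """Build a Settings object that styles inserted, deleted and changed items."""
    # Imported lazily so this sketch can be loaded without the SDK installed.
    import groupdocs_comparison_cloud as gcc

    settings = gcc.Settings()
    # Assumption: this attribute controls how sensitive the comparison is.
    settings.sensitivity_of_comparison = sensitivity

    # Inserted items: coloured and underlined.
    inserted = gcc.ItemsStyle()
    inserted.font_color = inserted_color
    inserted.underline = True
    settings.inserted_items_style = inserted

    # Deleted items: coloured and bold.
    deleted = gcc.ItemsStyle()
    deleted.font_color = deleted_color
    deleted.bold = True
    settings.deleted_items_style = deleted

    # Changed items: coloured and italic.
    changed = gcc.ItemsStyle()
    changed.font_color = changed_color
    changed.italic = True
    settings.changed_items_style = changed

    return settings
```

The returned object would then be assigned to the settings attribute of the ComparisonOptions (again, an attribute name we are assuming) before the ComparisonsRequest is created.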

Get List of Changes in Python

We can get a list of all the changes found during the comparison of PDF files by following the steps given below:

  • Firstly, create an instance of the CompareApi.
  • Next, create an instance of the FileInfo and set the source PDF file path.
  • Then, create another instance of the FileInfo and set the target PDF file path.
  • Next, create an instance of the ComparisonOptions.
  • Then, assign source/target files and set the output file path.
  • After that, create an instance of the PostChangesRequest with the ComparisonOptions object.
  • Finally, get the results by calling the CompareApi.post_changes() method with the PostChangesRequest as an argument.

The following code sample shows how to get a list of changes using a REST API in Python.
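
These steps can be sketched as below. PostChangesRequest and post_changes() are named in the steps; that the call returns an iterable of change objects with id, type and text attributes is our assumption about the response model. The function name list_changes is illustrative.

```python
def list_changes(client_id, client_secret, source_path, target_path, output_path):
    """Return (id, type, text) for every change found between two PDF files."""
    # Imported lazily so this sketch can be loaded without the SDK installed.
    import groupdocs_comparison_cloud as gcc

    source = gcc.FileInfo()
    source.file_path = source_path
    target = gcc.FileInfo()
    target.file_path = target_path

    options = gcc.ComparisonOptions()
    options.source_file = source
    options.target_files = [target]
    options.output_path = output_path

    api = gcc.CompareApi.from_keys(client_id, client_secret)
    changes = api.post_changes(gcc.PostChangesRequest(options))

    # Assumption: each change exposes id, type and text attributes.
    return [(change.id, change.type, change.text) for change in changes]
```

Each returned tuple can then be printed or filtered, for example to count the changes of a given type.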

Figure: Get list of changes in Python.

Compare and Save with Password & Metadata in Python

We can password-protect the resultant file and save it with metadata by following the steps given below:

  • Firstly, create an instance of the CompareApi.
  • Next, create an instance of the FileInfo and set the source PDF file path.
  • Then, create another instance of the FileInfo and set the target PDF file path.
  • Next, create an instance of the Settings.
  • Then, create an instance of the Metadata.
  • After that, set various metadata properties such as author, company, last_save_by, etc.
  • Then, set password and password_save_options.
  • Next, create an instance of the ComparisonOptions.
  • Then, assign source/target files and set the output file path.
  • After that, create an instance of the ComparisonsRequest with the ComparisonOptions object.
  • Finally, get the results by calling the CompareApi.comparisons() method with the ComparisonsRequest as an argument.

The following code sample shows how to save the resultant file with a password and metadata using a REST API in Python.
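
A hedged sketch of the Settings and Metadata part of these steps is given below. The attribute names meta_data and password_save_option (singular) and the value "User" are how we recall the SDK's Settings model; they differ slightly from the spelling used in the steps above, so please verify them against your installed version. The function name build_protected_settings is hypothetical.

```python
def build_protected_settings(password, author, company, last_saved_by):
    """Build a Settings object carrying a password and document metadata."""
    # Imported lazily so this sketch can be loaded without the SDK installed.
    import groupdocs_comparison_cloud as gcc

    # Metadata to store in the resultant file.
    metadata = gcc.Metadata()
    metadata.author = author
    metadata.company = company
    metadata.last_save_by = last_saved_by

    settings = gcc.Settings()
    # Assumption: the attribute is named meta_data in the SDK model.
    settings.meta_data = metadata
    settings.password = password
    # Assumption: "User" tells the service to apply the password given above.
    settings.password_save_option = "User"
    return settings
```

As with other settings, the result would be assigned to the ComparisonOptions (we assume through its settings attribute) before the ComparisonsRequest is created.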

Try Online

Please try the following free online PDF comparison tool, which is developed using the above API. https://products.groupdocs.app/comparison/pdf

Conclusion

In this article, we have learned how to compare PDF documents on the cloud. We have also seen how to compare multiple PDF files, customize the style of changes, get a list of changes, and save the result with a password and metadata in Python. This article also explained how to programmatically upload PDF files to the cloud and then download the resultant file. You can learn more about the GroupDocs.Comparison Cloud API in the documentation. We also provide an API Reference section that lets you visualize and interact with our APIs directly in the browser. If you have any questions, please feel free to contact us on the forum.
