Extract or Remove Annotations from Word Files using Python

As a Python developer, you can annotate any Word (.doc or .docx) file programmatically in the cloud. You can also extract or remove all the annotations from Word files using Python. Annotations include comments, popups, and various other graphical objects that provide additional information in the document. This article focuses on how to extract or remove annotations from DOCX files using a REST API in Python.

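This article covers the following topics:

  • Document Annotation REST API and Python SDK
  • Extract or Remove Annotations from DOCX Files using a REST API in Python
  • Try Online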

Document Annotation REST API and Python SDK

To extract or remove annotations from DOC or DOCX files, I will be using the Python SDK of GroupDocs.Annotation Cloud API. It allows you to programmatically build online document and image annotation tools. Such tools can add annotations, watermark overlays, text replacements, redactions, sticky notes, and text markups to business documents in all popular formats. The Cloud API's document annotation family also includes .NET, Java, PHP, Ruby, and Node.js SDKs.

You can install GroupDocs.Annotation Cloud in your Python project by running the following command in the console:

pip install groupdocs_annotation_cloud

Please get your Client ID and Client Secret from the dashboard before you start following the steps and code examples in this article. Once you have your ID and secret, add them to your code so that the SDK can authenticate your requests.
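Here is a minimal sketch of one way to wire the credentials in. The environment variable names GROUPDOCS_CLIENT_ID and GROUPDOCS_CLIENT_SECRET and both helper functions are hypothetical names chosen for illustration. The Configuration constructor and its api_base_url attribute are assumptions about the SDK, so please check them against the SDK documentation.

```python
import os


def load_credentials(env=None):
    """Return (client_id, client_secret) read from environment variables.

    The variable names are hypothetical; use whatever names suit your setup.
    """
    env = os.environ if env is None else env
    client_id = env.get("GROUPDOCS_CLIENT_ID")
    client_secret = env.get("GROUPDOCS_CLIENT_SECRET")
    if not client_id or not client_secret:
        raise RuntimeError(
            "Set GROUPDOCS_CLIENT_ID and GROUPDOCS_CLIENT_SECRET first"
        )
    return client_id, client_secret


def make_configuration(client_id, client_secret):
    """Build an SDK configuration object from the credentials."""
    # Imported here so that this file loads even before the SDK is installed.
    import groupdocs_annotation_cloud

    # Assumption: Configuration accepts the ID and secret positionally.
    configuration = groupdocs_annotation_cloud.Configuration(
        client_id, client_secret
    )
    # Assumption: the public endpoint of the Cloud API.
    configuration.api_base_url = "https://api.groupdocs.cloud"
    return configuration
```

Reading the credentials from the environment keeps them out of source control. You would call make_configuration(*load_credentials()) once and reuse the result for every API call.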

Extract or Remove Annotations from DOCX Files using a REST API in Python

You can extract or delete all the annotations from the DOCX files by following the simple steps mentioned below:

Upload the Document

First, upload the DOCX file to the cloud storage.
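The upload step could look roughly like the sketch below. It assumes that the SDK exposes FileApi.from_config, UploadFileRequest (taking the remote path, the local file, and a storage name), and upload_file. None of these names appear in this article, so treat them as assumptions. The upload_docx helper itself is hypothetical, and configuration is assumed to be an SDK Configuration object built from your Client ID and Client Secret.

```python
def upload_docx(configuration, local_path, remote_name="input.docx", storage=""):
    """Upload a local DOCX file to cloud storage under remote_name."""
    # Imported here so that this file loads even before the SDK is installed.
    import groupdocs_annotation_cloud as sdk

    # Assumption: FileApi can be created from a Configuration object.
    file_api = sdk.FileApi.from_config(configuration)
    # Assumption: argument order is (remote path, local file, storage name).
    # An empty storage name is assumed to select the default storage.
    request = sdk.UploadFileRequest(remote_name, local_path, storage)
    return file_api.upload_file(request)
```

A call such as upload_docx(configuration, "C:\\Files\\input.docx") would then place the file at input.docx in your storage.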

As a result, the uploaded DOCX file (input.docx) will be available in the files section of your dashboard on the cloud.

Extract Annotations from DOCX Files in Python

Please follow the steps mentioned below to extract annotations from the Word document programmatically.

  • Create an instance of AnnotateApi
  • Create a FileInfo instance
  • Set the file path
  • Create a request by creating an ExtractRequest instance
  • Get results by calling the AnnotateApi.extract() method

Together, these steps extract the annotations from the Word document through the REST API.
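The steps above can be sketched as follows. AnnotateApi, FileInfo, ExtractRequest, and extract() are the names used in the steps. The from_config factory and the file_path attribute are assumptions about the SDK, and the extract_annotations helper is hypothetical.

```python
def extract_annotations(configuration, remote_path="input.docx"):
    """Return the annotations found in a document stored in the cloud."""
    # Imported here so that this file loads even before the SDK is installed.
    import groupdocs_annotation_cloud as sdk

    # Create an instance of AnnotateApi (assumption: from_config factory).
    api = sdk.AnnotateApi.from_config(configuration)

    # Create a FileInfo instance and set the file path
    # (assumption: the attribute is named file_path).
    file_info = sdk.FileInfo()
    file_info.file_path = remote_path

    # Create the request and get the results.
    request = sdk.ExtractRequest(file_info)
    return api.extract(request)
```

You could print the returned value to inspect the annotations and note the IDs of the ones you want to remove later.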

The extract call returns an array of all the annotations in the document in JSON format.

Remove Annotations from DOCX Files in Python

Please follow the steps mentioned below to delete annotations from the Word document programmatically.

  • Create an instance of AnnotateApi
  • Create a FileInfo instance
  • Set the file path
  • Define RemoveOptions
  • Set file info to RemoveOptions
  • Provide annotation IDs to remove
  • Set output file path
  • Create a request by creating a RemoveAnnotationsRequest instance
  • Get results by calling the AnnotateApi.remove_annotations() method

When removing annotations, you need to provide the IDs of the annotations that should be removed from the document.
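A minimal sketch of the removal steps is given below. AnnotateApi, FileInfo, RemoveOptions, RemoveAnnotationsRequest, and remove_annotations() come from the steps above. The from_config factory and the attribute names file_path, file_info, annotation_ids, and output_path are assumptions about the SDK. The module-level remove_annotations helper is hypothetical.

```python
def remove_annotations(configuration, remote_path, annotation_ids,
                       output_path="output.docx"):
    """Remove the given annotation IDs and save the result in the cloud."""
    # Imported here so that this file loads even before the SDK is installed.
    import groupdocs_annotation_cloud as sdk

    # Create an instance of AnnotateApi (assumption: from_config factory).
    api = sdk.AnnotateApi.from_config(configuration)

    # Create a FileInfo instance and set the file path.
    file_info = sdk.FileInfo()
    file_info.file_path = remote_path

    # Define RemoveOptions (assumption: these attribute names).
    options = sdk.RemoveOptions()
    options.file_info = file_info
    options.annotation_ids = list(annotation_ids)  # IDs to remove
    options.output_path = output_path              # where to save the result

    # Create the request and get the results.
    request = sdk.RemoveAnnotationsRequest(options)
    return api.remove_annotations(request)
```

For example, remove_annotations(configuration, "input.docx", [1, 2]) would ask the API to drop the annotations with IDs 1 and 2, where the IDs are placeholders for values taken from the extract results.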


Download the Output File

The remove call saves the output DOCX file (output.docx) in the cloud after removing the annotations. You can then download it from the cloud storage to your machine.
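The download step might be sketched as shown below. It assumes that the SDK exposes FileApi.from_config and DownloadFileRequest (taking the remote path and a storage name), and that download_file returns the path of a temporary local file. All three points are assumptions to verify against the SDK documentation. The download_output helper is hypothetical.

```python
import os
import shutil


def download_output(configuration, remote_name="output.docx",
                    target_dir=".", storage=""):
    """Download a file from cloud storage and move it into target_dir."""
    # Imported here so that this file loads even before the SDK is installed.
    import groupdocs_annotation_cloud as sdk

    # Assumption: FileApi can be created from a Configuration object.
    file_api = sdk.FileApi.from_config(configuration)
    # Assumption: argument order is (remote path, storage name).
    request = sdk.DownloadFileRequest(remote_name, storage)
    # Assumption: download_file returns the path of a temporary file.
    temp_path = file_api.download_file(request)

    # Move the temporary file to its final location and return that path.
    destination = os.path.join(target_dir, os.path.basename(remote_name))
    return shutil.move(temp_path, destination)
```

Calling download_output(configuration, target_dir="C:\\Files") would leave output.docx in that folder.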

Try Online

Please try the following free online DOCX annotation tool, which is developed using the above API. https://products.groupdocs.app/annotation/docx

Conclusion

In this article, you have learned how to extract or remove annotations from Word documents in the cloud using Python. You also learned how to programmatically upload a DOCX file to the cloud and download a file from the cloud. You can learn more about GroupDocs.Annotation Cloud API from the documentation. We also provide an API Reference section that lets you visualize and interact with our APIs directly through the browser. If you have any questions, please feel free to contact us on the forum.
