Extract Images from PDF Files using Node.js

PDF documents preserve their content, including images and text, exactly as authored. In certain cases, we may need to extract images from PDF files to reuse them. We can extract all the embedded images, or only the images on specific pages, programmatically on the cloud. In this article, we will learn how to extract images from PDF files using a REST API in Node.js.

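The following topics are covered in this article:

  • Image Extractor REST API and Node.js SDK
  • Extract Images from PDF using a REST API in Node.js
  • Extract All Images from PDF File in Node.js
  • Save Images by Page Numbers from PDF Documents in Node.js
  • Extract Images From Document Attached with PDF in Node.js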

Image Extractor REST API and Node.js SDK

To extract images from PDF documents, we will use the Node.js SDK of the GroupDocs.Parser Cloud API. It supports extracting text and images, and parsing data by a template, from popular document formats. Install it by running the following command in the console:

npm install groupdocs-parser-cloud

Please get your Client ID and Client Secret from the dashboard before following the steps in this article. Once you have them, add them to your code so that the SDK can authenticate your requests.
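A minimal sketch of this setup is given below. It assumes the SDK exposes a Configuration class that takes the Client ID and Client Secret, and an apiBaseUrl property, as in the SDK's published examples. The helper name createConfiguration and the environment variable names are hypothetical. The SDK module is passed in as a parameter so that the helper does not depend on how the package is loaded.

```javascript
// `sdk` is expected to be the module returned by require("groupdocs-parser-cloud").
// Assumption: the SDK provides Configuration(clientId, clientSecret) and apiBaseUrl.
function createConfiguration(sdk, clientId, clientSecret) {
  if (!clientId || !clientSecret) {
    throw new Error("Both the Client ID and the Client Secret are required");
  }
  const configuration = new sdk.Configuration(clientId, clientSecret);
  configuration.apiBaseUrl = "https://api.groupdocs.cloud";
  return configuration;
}

// Usage (requires the installed package; variable names are placeholders):
// const sdk = require("groupdocs-parser-cloud");
// const configuration = createConfiguration(
//   sdk,
//   process.env.GROUPDOCS_CLIENT_ID,
//   process.env.GROUPDOCS_CLIENT_SECRET
// );
```

Reading the credentials from environment variables rather than hard-coding them keeps the secret out of source control.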

Extract Images from PDF using a REST API in Node.js

We can extract images from PDF documents by following the simple steps mentioned below:

Upload the Document

Firstly, we will upload the PDF document containing images to the cloud using the code sample given below:
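The upload step could look roughly like the sketch below. It assumes the SDK offers FileApi.fromConfig(), an UploadFileRequest taking the remote path, the file contents, and the storage name, and an uploadFile() method, as in the SDK's published examples. The helper name uploadPdf and the file and storage names in the usage comment are hypothetical.

```javascript
// `sdk` is the module returned by require("groupdocs-parser-cloud");
// `configuration` holds the Client ID and Client Secret.
// Assumption: FileApi.fromConfig, UploadFileRequest and uploadFile exist as used here.
async function uploadPdf(sdk, configuration, fileBuffer, remotePath, storageName) {
  if (!fileBuffer || fileBuffer.length === 0) {
    throw new Error("The PDF file is empty");
  }
  const fileApi = sdk.FileApi.fromConfig(configuration);
  const request = new sdk.UploadFileRequest(remotePath, fileBuffer, storageName);
  // Resolves with the API response once the file is stored on the cloud.
  return fileApi.uploadFile(request);
}

// Usage (requires the installed package and network access):
// const fs = require("fs");
// const buffer = fs.readFileSync("sample.pdf");
// uploadPdf(sdk, configuration, buffer, "sample.pdf", "MyStorage")
//   .then(() => console.log("Uploaded"))
//   .catch((err) => console.error("Upload failed:", err.message));
```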

As a result, the uploaded PDF file will be available in the files section of the dashboard on the cloud.

Extract All Images from PDF File in Node.js

Now, we will extract all the images from the uploaded PDF file programmatically by following the steps given below:

  • Firstly, create an instance of ParseApi.
  • Next, provide the uploaded PDF file path.
  • Then, define ImageOptions and assign the file.
  • After that, create the ImagesRequest with ImageOptions as an argument.
  • Finally, extract images by calling the images() method.

The following code sample shows how to extract all the images from a PDF file using a REST API in Node.js.
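The steps above can be sketched as follows. This is not the original sample: it assumes the options class is named ImagesOptions in the SDK (the steps call it ImageOptions), that FileInfo carries a filePath property, and that the response exposes an images array. The helper name extractAllImages is hypothetical.

```javascript
// `sdk` is the module returned by require("groupdocs-parser-cloud").
// Assumptions: ParseApi.fromConfig, FileInfo.filePath, ImagesOptions.fileInfo,
// ImagesRequest(options) and a response with an `images` array.
async function extractAllImages(sdk, configuration, filePath) {
  // Step 1: create an instance of ParseApi.
  const parseApi = sdk.ParseApi.fromConfig(configuration);

  // Step 2: provide the uploaded PDF file path.
  const fileInfo = new sdk.FileInfo();
  fileInfo.filePath = filePath;

  // Step 3: define the options and assign the file.
  const options = new sdk.ImagesOptions();
  options.fileInfo = fileInfo;

  // Step 4: create the request with the options as an argument.
  const request = new sdk.ImagesRequest(options);

  // Step 5: extract the images.
  const result = await parseApi.images(request);
  return (result && result.images) || [];
}

// Usage (requires the installed package and network access):
// extractAllImages(sdk, configuration, "sample.pdf")
//   .then((images) => console.log("Images found:", images.length))
//   .catch((err) => console.error("Extraction failed:", err.message));
```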

Download Extracted Images

The above code sample will save the extracted images on the cloud. We can download these images using the code sample given below:
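A download step might be sketched as follows, assuming the SDK provides a DownloadFileRequest taking the storage path and the storage name, and a downloadFile() method that resolves with the file contents. The helper name downloadImages, the storage name, and the local file naming in the usage comment are hypothetical.

```javascript
// `sdk` is the module returned by require("groupdocs-parser-cloud").
// `imagePaths` is a list of cloud storage paths of the extracted images.
// Assumption: DownloadFileRequest(path, storageName) and downloadFile exist as used here.
async function downloadImages(sdk, configuration, imagePaths, storageName) {
  const fileApi = sdk.FileApi.fromConfig(configuration);
  const downloads = [];
  // Download one image at a time to keep the request rate low.
  for (const path of imagePaths) {
    const request = new sdk.DownloadFileRequest(path, storageName);
    const data = await fileApi.downloadFile(request);
    downloads.push({ path, data });
  }
  return downloads;
}

// Usage (requires the installed package and network access):
// const fs = require("fs");
// const nodePath = require("path");
// downloadImages(sdk, configuration, images.map((image) => image.path), "MyStorage")
//   .then((files) => {
//     for (const file of files) {
//       fs.writeFileSync(nodePath.basename(file.path), file.data);
//     }
//   })
//   .catch((err) => console.error("Download failed:", err.message));
```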

Save Images by Page Numbers from PDF Documents in Node.js

We can extract images from specific pages of a PDF file instead of the whole document by following the steps given below.

  • Firstly, create an instance of ParseApi.
  • Next, provide the uploaded PDF file path.
  • Then, define ImageOptions and assign the file.
  • Next, set the start page number and the number of pages from which to extract images.
  • After that, create the ImagesRequest with ImageOptions as an argument.
  • Finally, extract images by calling the images() method.

The following code sample shows how to extract images by page numbers from a PDF document using a REST API in Node.js. Please follow the steps mentioned earlier to download the extracted images.
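As a sketch of the page-range variant, the request could be built as shown below. It assumes the options object accepts startPageNumber and countPagesToExtract properties, as in the SDK's published examples. Whether page numbering starts at 0 or 1 should be checked against the API documentation. The helper name buildPageRangeRequest is hypothetical.

```javascript
// `sdk` is the module returned by require("groupdocs-parser-cloud").
// Assumptions: ImagesOptions.startPageNumber and ImagesOptions.countPagesToExtract.
function buildPageRangeRequest(sdk, filePath, startPageNumber, countPagesToExtract) {
  if (!Number.isInteger(startPageNumber) || startPageNumber < 0) {
    throw new RangeError("startPageNumber must be a non-negative integer");
  }
  if (!Number.isInteger(countPagesToExtract) || countPagesToExtract < 1) {
    throw new RangeError("countPagesToExtract must be a positive integer");
  }
  const fileInfo = new sdk.FileInfo();
  fileInfo.filePath = filePath;

  const options = new sdk.ImagesOptions();
  options.fileInfo = fileInfo;
  options.startPageNumber = startPageNumber;         // first page to read
  options.countPagesToExtract = countPagesToExtract; // how many pages to read

  return new sdk.ImagesRequest(options);
}

// Usage (requires the installed package and network access):
// const parseApi = sdk.ParseApi.fromConfig(configuration);
// parseApi.images(buildPageRangeRequest(sdk, "sample.pdf", 1, 2))
//   .then((result) => console.log(result.images))
//   .catch((err) => console.error("Extraction failed:", err.message));
```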

Extract Images From Document Attached with PDF in Node.js

We can also extract images from a document that is embedded as an attachment in a PDF file, which acts as a container, by following the steps given below.

  • Firstly, create an instance of ParseApi.
  • Next, provide the uploaded PDF file path.
  • Then, define ImageOptions and assign the file.
  • Next, define ContainerItemInfo and provide the relative path of the attached document.
  • After that, create the ImagesRequest with ImageOptions as an argument.
  • Finally, extract images by calling the images() method.

The following code sample shows how to extract the images from a document inside a PDF document using a REST API in Node.js. Please follow the steps mentioned earlier to download the extracted images.
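For the attachment case, a request could be built along the lines below. The sketch assumes ContainerItemInfo has a relativePath property and is assigned to a containerItemInfo property on the options object, as in the SDK's published examples. The helper name buildAttachmentRequest and the file names in the usage comment are hypothetical.

```javascript
// `sdk` is the module returned by require("groupdocs-parser-cloud").
// Assumptions: ContainerItemInfo.relativePath and ImagesOptions.containerItemInfo.
function buildAttachmentRequest(sdk, containerPath, attachmentRelativePath) {
  if (!attachmentRelativePath) {
    throw new Error("The relative path of the attached document is required");
  }
  // The PDF that acts as the container.
  const fileInfo = new sdk.FileInfo();
  fileInfo.filePath = containerPath;

  // The attached document inside the PDF to extract images from.
  const containerItemInfo = new sdk.ContainerItemInfo();
  containerItemInfo.relativePath = attachmentRelativePath;

  const options = new sdk.ImagesOptions();
  options.fileInfo = fileInfo;
  options.containerItemInfo = containerItemInfo;

  return new sdk.ImagesRequest(options);
}

// Usage (requires the installed package and network access):
// const parseApi = sdk.ParseApi.fromConfig(configuration);
// parseApi.images(buildAttachmentRequest(sdk, "PDF with attachments.pdf", "attached-document.pdf"))
//   .then((result) => console.log(result.images))
//   .catch((err) => console.error("Extraction failed:", err.message));
```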

Try Online

Please try the following free online PDF Parsing tool, which is developed using the above API. https://products.groupdocs.app/parser/pdf

Conclusion

In this article, we have learned how to:

  • extract images from PDF files using Node.js on the cloud;
  • programmatically upload a PDF file to the cloud;
  • download the extracted images from the cloud.

In addition, you can learn more about the GroupDocs.Parser Cloud API from the documentation. We also provide an API Reference section that lets you explore and interact with our APIs directly in the browser. If you have any questions, please feel free to contact us on the forum.
