Merge, extract text, images and metadata

We’re pleased to share a preview of the upcoming GroupDocs.Parser Cloud API, a new addition to the groupdocs.cloud product list. GroupDocs.Parser Cloud is a document parsing solution. As a developer, you’ll be able to add document parsing features to your applications on any platform without depending on any third-party plugin or tool. The main feature of this REST API will be parsing documents against user-defined templates to extract data from your invoices, quotations, or other kinds of business documents.
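To make the idea of template-based parsing concrete, here is a minimal local sketch in Python. It assumes a template is simply a set of named fields, each located by a regular expression. The template layout, the field names, and the `parse_by_template` helper are all hypothetical and are not the GroupDocs.Parser Cloud template format, which has not been published yet.

```python
import re
from typing import Dict, Optional

# Hypothetical template: field name -> regular expression with one capture group.
# This is NOT the GroupDocs.Parser Cloud template format, only an illustration.
INVOICE_TEMPLATE: Dict[str, str] = {
    "invoice_number": r"Invoice\s+No\.?\s*:\s*(\S+)",
    "date": r"Date\s*:\s*(\d{4}-\d{2}-\d{2})",
    "total": r"Total\s*:\s*\$?([\d,]+\.\d{2})",
}


def parse_by_template(text: str, template: Dict[str, str]) -> Dict[str, Optional[str]]:
    """Return one value per template field, or None when the field is not found."""
    result: Dict[str, Optional[str]] = {}
    for field, pattern in template.items():
        match = re.search(pattern, text, flags=re.IGNORECASE)
        result[field] = match.group(1) if match else None
    return result


# Sample document text, made up for this example.
SAMPLE_INVOICE = """ACME Ltd.
Invoice No: INV-1042
Date: 2019-06-30
Total: $1,250.00
"""

fields = parse_by_template(SAMPLE_INVOICE, INVOICE_TEMPLATE)
```

The same template can be applied to every invoice that follows the same layout, which is the appeal of defining the fields once and reusing them.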

Some of the features supported in the upcoming API are listed below. The REST API will not be limited to these features; we will keep adding useful new ones.

Features

  • Parse Document by Template

  • Extract Text

    • Extract only text

    • Extract formatted text using the extraction mode option: Plain Text, HTML, or Markdown

    • Extract text from specific pages by setting the page range

  • Extract Images

  • Extract Document Information

  • Template Management
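As a rough illustration of how the text extraction options above (extraction mode and page range) might be expressed in a request body, here is a small Python sketch. Every name in it is an assumption made for the example: the `TextExtractOptions` class, the `filePath`, `mode`, `startPage`, and `pageCount` fields, and the mode strings are hypothetical, since the final API reference has not been released.

```python
import json
from dataclasses import dataclass
from typing import Optional

# The three output modes named in the feature list; the string values are assumed.
MODES = ("PlainText", "Html", "Markdown")


@dataclass
class TextExtractOptions:
    file_path: str                      # hypothetical: path of the document in cloud storage
    mode: str = "PlainText"
    start_page: Optional[int] = None    # hypothetical: zero-based first page
    page_count: Optional[int] = None    # hypothetical: number of pages to read

    def to_payload(self) -> dict:
        """Validate the options and return a JSON-serializable dictionary."""
        if self.mode not in MODES:
            raise ValueError(f"unsupported mode: {self.mode!r}")
        if (self.start_page is None) != (self.page_count is None):
            raise ValueError("start_page and page_count must be set together")
        if self.start_page is not None and (self.start_page < 0 or self.page_count < 1):
            raise ValueError("page range must be non-negative and non-empty")
        payload = {"filePath": self.file_path, "mode": self.mode}
        if self.start_page is not None:
            payload["startPage"] = self.start_page
            payload["pageCount"] = self.page_count
        return payload


# Ask for the first two pages of a document as Markdown.
options = TextExtractOptions("invoices/june.pdf", mode="Markdown", start_page=0, page_count=2)
body = json.dumps(options.to_payload())
```

Validating the options on the client side, before any request is sent, keeps obviously invalid combinations from costing a round trip.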

Supported formats

The first release of the GroupDocs.Parser Cloud API will support the following file formats:

  • DOC: Microsoft Word Document

  • DOT: Microsoft Word Document Template

  • DOCX: Office Open XML Document

  • DOCM: Office Open XML Macro-Enabled Document

  • DOTX: Office Open XML Document Template

  • DOTM: Office Open XML Document Macro-Enabled Template

  • TXT: Plain text

  • ODT: Open Document Text

  • OTT: Open Document Text Template

  • RTF: Rich Text Format

  • PDF: Portable Document Format File

  • HTML: Hypertext Markup Language File

  • XHTML: Extensible Hypertext Markup Language File

  • MHTML: MIME HTML File

  • MD: Markdown

  • XML: XML File

  • CHM: Compiled HTML Help File

  • EPUB: Digital E-Book File Format

  • FB2: FictionBook 2.0 File

  • XLS: Microsoft Excel Spreadsheet

  • XLT: Microsoft Excel Template

  • XLSX: Office Open XML Spreadsheet

  • XLSM: Office Open XML Macro-Enabled Spreadsheet

  • XLSB: Office Open XML Binary Spreadsheet

  • XLTX: Office Open XML Spreadsheet Template

  • XLTM: Office Open XML Macro-Enabled Spreadsheet Template

  • ODS: Open Document Spreadsheet

  • OTS: Open Document Spreadsheet Template

  • CSV: Comma Separated Values

  • XLA: Excel Add-In File

  • XLAM: Excel Open XML Macro-Enabled Add-In

  • NUMBERS: Apple iWork Numbers

  • PPT: PowerPoint Presentation

  • PPS: PowerPoint Slideshow

  • POT: PowerPoint Template

  • PPTX: Office Open XML Presentation

  • PPTM: Office Open XML Macro-Enabled Presentation

  • POTX: Office Open XML Presentation Template

  • POTM: Office Open XML Macro-Enabled Presentation Template

  • PPSX: Office Open XML Presentation Slideshow

  • PPSM: Office Open XML Macro-Enabled Presentation Slideshow

  • ODP: Open Document Presentation

  • OTP: Open Document Presentation Template

  • PST: Outlook Personal Information Store File

  • OST: Outlook Offline Data File

  • EML: E-Mail Message

  • EMLX: Apple Mail Message

  • MSG: Outlook Mail Message

  • ONE: OneNote Document

  • ZIP: Zipped File

Security and Authentication

The GroupDocs.Parser Cloud REST API is secured and requires authentication. You will need an AppSID and an AppKey to authenticate; both can be created in the dashboard.
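How the AppSID and AppKey are actually exchanged for access is not described in this post. The sketch below assumes, purely for illustration, a client-credentials token flow in which the AppSID is sent as `client_id` and the AppKey as `client_secret`. The token URL is a placeholder and both helper functions are hypothetical. The request is only built, never sent.

```python
import urllib.parse
import urllib.request

# Placeholder endpoint: the real token URL is not given in this post.
TOKEN_URL = "https://api.example.com/connect/token"


def build_token_request(app_sid: str, app_key: str) -> urllib.request.Request:
    """Build (but do not send) a client-credentials token request."""
    form = urllib.parse.urlencode({
        "grant_type": "client_credentials",
        "client_id": app_sid,       # assumed mapping: AppSID -> client_id
        "client_secret": app_key,   # assumed mapping: AppKey -> client_secret
    }).encode("ascii")
    return urllib.request.Request(
        TOKEN_URL,
        data=form,
        method="POST",
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )


def bearer_header(access_token: str) -> dict:
    """Header to attach to later API calls once a token has been issued."""
    return {"Authorization": f"Bearer {access_token}"}


request = build_token_request("my-app-sid", "my-app-key")
```

Whatever the final scheme turns out to be, it is worth keeping the AppKey out of source control and reading it from an environment variable or a secrets store.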

API Explorer

We will provide a web-based API Reference Explorer for GroupDocs.Parser Cloud, so you will be able to try the REST APIs right away in your browser. It will also give you information about all the resources in the API.

SDKs

GroupDocs.Parser Cloud will come with SDKs for all popular programming languages, hosted on our GitHub repository along with working examples, which will allow you to integrate it into existing systems. The SDKs will be wrappers around the REST API. Each SDK will take care of the low-level details of making requests and handling responses, letting you focus on writing code specific to your project.
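To show what "taking care of the low-level details" can look like, here is a minimal Python sketch of a thin client wrapper. It is not the GroupDocs SDK: the `ParserClient` class, the `text` route, the bearer-token header, and the response shape are all hypothetical. The HTTP layer is passed in as a function so the example runs offline against a fake transport.

```python
import json
from typing import Callable, Dict, Tuple

# A transport takes (method, url, headers, body) and returns (status, response_text).
Transport = Callable[[str, str, Dict[str, str], str], Tuple[int, str]]


class ParserClient:
    """Hypothetical thin wrapper; class, method, and route names are illustrative only."""

    def __init__(self, base_url: str, token: str, transport: Transport):
        self.base_url = base_url.rstrip("/")
        self.token = token
        self.transport = transport

    def _post(self, route: str, payload: dict) -> dict:
        # Build headers, serialize the payload, check the status, decode the JSON.
        headers = {
            "Authorization": f"Bearer {self.token}",  # assumed auth scheme
            "Content-Type": "application/json",
        }
        url = f"{self.base_url}/{route}"
        status, text = self.transport("POST", url, headers, json.dumps(payload))
        if status >= 400:
            raise RuntimeError(f"request failed with HTTP {status}: {text}")
        return json.loads(text)

    def extract_text(self, file_path: str) -> str:
        # Hypothetical route and response shape.
        return self._post("text", {"filePath": file_path})["text"]


def fake_transport(method, url, headers, body):
    """Stand-in for a real HTTP call so the sketch runs offline."""
    assert method == "POST" and url.endswith("/text")
    return 200, json.dumps({"text": "Hello from " + json.loads(body)["filePath"]})


client = ParserClient("https://api.example.com/v1/parser/", "token-123", fake_transport)
text = client.extract_text("docs/sample.docx")
```

With a wrapper like this, application code calls `extract_text` and never touches headers, serialization, or status codes directly.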

Our first version

We are currently finalizing the documentation and examples for GroupDocs.Parser Cloud. We plan to release the first version of the new product soon with the features described above. If you have any questions or suggestions, please feel free to write to us on the groupdocs.cloud Forum.

Please stay tuned to the groupdocs.cloud blog for further updates.