Classify Documents and Raw Text using C#

Text classification, also called text categorization, is the process of assigning tags to text so that it can be sorted into organized groups. As a C# developer, you can classify raw text or documents programmatically on the cloud. In this article, you will learn how to classify documents and raw text using a REST API in C#.

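The following topics are covered in this article:

  • Document Classification REST API and .NET SDK
  • Classify Word Documents using a REST API in C#
  • Classify Word Documents for Taxonomy using C#
  • Classify Raw Text using a REST API in C#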

Document Classification REST API and .NET SDK

For classifying text or documents, I will be using the .NET SDK of GroupDocs.Classification Cloud API. It enables you to classify your raw text as well as documents into predefined categories. The SDK supports multiple taxonomy types, such as IAB-2, Documents, and Sentiment. The classification result reports the best-matching classes along with their probability scores.

You can install GroupDocs.Classification Cloud into your Visual Studio project from the NuGet Package Manager, or by running the following command in the Package Manager Console:

Install-Package GroupDocs.Classification-Cloud

Get your Client ID and Client Secret from the dashboard before you follow the steps and code examples below. Once you have them, add both values to your code.
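One way to keep the Client ID and Client Secret out of source control is to read them from environment variables and hand them to the SDK configuration afterwards. The sketch below is a minimal illustration of that idea, not part of the SDK: the `ApiCredentials` record and the variable names `GROUPDOCS_CLIENT_ID` and `GROUPDOCS_CLIENT_SECRET` are assumptions made for this example.

```csharp
#nullable enable
using System;

try
{
    // In real use the lookup is the process environment.
    var credentials = ApiCredentials.Load(Environment.GetEnvironmentVariable);
    Console.WriteLine($"Loaded credentials for client ID {credentials.ClientId}");
}
catch (InvalidOperationException ex)
{
    // Reached when one of the variables is missing or empty.
    Console.WriteLine(ex.Message);
}

// Hypothetical helper type for this sketch; it is not an SDK class.
public sealed record ApiCredentials(string ClientId, string ClientSecret)
{
    // Assumed variable names, chosen for this example only.
    public const string ClientIdVariable = "GROUPDOCS_CLIENT_ID";
    public const string ClientSecretVariable = "GROUPDOCS_CLIENT_SECRET";

    // The lookup is injected so the method can be exercised without touching the real environment.
    public static ApiCredentials Load(Func<string, string?> lookup)
    {
        string Require(string name)
        {
            var value = lookup(name);
            if (string.IsNullOrWhiteSpace(value))
                throw new InvalidOperationException($"Environment variable {name} is not set.");
            return value;
        }

        return new ApiCredentials(Require(ClientIdVariable), Require(ClientSecretVariable));
    }
}
```

The loaded values would then be assigned to the client ID and client secret settings of the SDK before you create the API instance.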

Classify Word Documents using a REST API in C#

You can classify your Word documents by following the simple steps given below:

Upload the Document

First, upload the DOCX file to the cloud.

Once uploaded, the DOCX file will be available in the files section of your dashboard on the cloud.

Classify Word Documents using C#

You can classify Word documents programmatically by following the steps given below.

  • Create an instance of ClassificationApi
  • Create an instance of BaseRequest
  • Set the DOCX file path and assign it to the BaseRequest document
  • Create ClassifyRequest with BaseRequest
  • Set BestClassesCount
  • Get ClassificationResponse by calling the ClassificationApi.Classify() method

The following code sample shows how to classify a Word document using a REST API.
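As a rough, offline illustration of how the steps above fit together, the sketch below builds the request objects in the same order. The `Sketch...` types are local stand-ins written for this example; they only mirror the names used in the steps and are not the SDK's real classes. The file name `sample.docx` and the property name `BestClassesCount` are likewise assumptions. The final call to `ClassificationApi.Classify()` is left as a comment because it needs valid credentials and network access.

```csharp
#nullable enable
using System;

// Step 1 (creating ClassificationApi) needs your credentials and is omitted here.

// Steps 2-3: create the base request and point it at the uploaded DOCX file.
var baseRequest = new SketchBaseRequest
{
    Document = new SketchFileInfo(Name: "sample.docx", Folder: "")
};

// Steps 4-5: wrap it in a classify request and ask for the three best classes.
var classifyRequest = new SketchClassifyRequest(baseRequest) { BestClassesCount = 3 };

// Step 6 would pass the request to ClassificationApi.Classify() in the SDK.
// Here the request is only summarized so the sketch runs without network access.
Console.WriteLine(classifyRequest.Summarize());

// Hypothetical stand-ins: they mirror the names in the steps, not the SDK's real classes.
public sealed record SketchFileInfo(string Name, string Folder);

public sealed class SketchBaseRequest
{
    public SketchFileInfo? Document { get; init; }  // set for document classification
    public string? Description { get; init; }       // set for raw-text classification
}

public sealed class SketchClassifyRequest
{
    public SketchClassifyRequest(SketchBaseRequest baseRequest) =>
        BaseRequest = baseRequest ?? throw new ArgumentNullException(nameof(baseRequest));

    public SketchBaseRequest BaseRequest { get; }
    public int BestClassesCount { get; init; } = 1;
    public string? Taxonomy { get; init; }           // null means "use the service default"

    public string Summarize()
    {
        var source = BaseRequest.Document is { } doc ? $"document '{doc.Name}'" : "raw text";
        var taxonomy = Taxonomy ?? "default";
        return $"Classify {source}, best classes: {BestClassesCount}, taxonomy: {taxonomy}";
    }
}
```

Setting `Taxonomy` on the same request object corresponds to the extra step used when classifying for a specific taxonomy, and setting `Description` instead of `Document` corresponds to the raw-text case.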

Classify Word Documents for Taxonomy using C#

You can classify Word documents for a taxonomy programmatically by following the steps given below.

  • Create an instance of ClassificationApi
  • Create an instance of BaseRequest
  • Set the DOCX file path and assign it to the BaseRequest document
  • Create ClassifyRequest with BaseRequest
  • Set BestClassesCount
  • Set Taxonomy
  • Get ClassificationResponse by calling the ClassificationApi.Classify() method

The following code sample shows how to classify a Word document for the “documents” taxonomy using a REST API. Please follow the steps mentioned earlier to upload the file. The classification produces output like the following:

ClassName: ADVE
ClassProbability: 77.17
--------------------------------
ClassName: Resume
ClassProbability: 22.83
--------------------------------
ClassName: Scientific
ClassProbability: 0.01
--------------------------------

You can set the taxonomy to any of the following values when classifying documents:

  • default
  • iab2
  • documents
  • sentiment
  • sentiment3
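Because the taxonomy is passed as a plain string, a typo only surfaces once the request reaches the service. The sketch below shows one possible guard that checks a value against the list above before a request is built. The `Taxonomies` helper is hypothetical and not part of the SDK, and lower-casing the input is an assumption that the service expects the names exactly as listed.

```csharp
using System;
using System.Collections.Generic;

Console.WriteLine(Taxonomies.Normalize(" Documents "));

try
{
    Taxonomies.Normalize("iab-2");
}
catch (ArgumentException ex)
{
    // "iab-2" is not in the list above; the listed value is "iab2".
    Console.WriteLine(ex.Message);
}

// Hypothetical helper for this sketch; it is not an SDK class.
public static class Taxonomies
{
    // The taxonomy names listed in the article.
    private static readonly HashSet<string> Supported = new(StringComparer.OrdinalIgnoreCase)
    {
        "default", "iab2", "documents", "sentiment", "sentiment3"
    };

    // Returns the trimmed, lower-cased name, or throws if it is not one of the listed values.
    public static string Normalize(string taxonomy)
    {
        var trimmed = (taxonomy ?? string.Empty).Trim();
        if (!Supported.Contains(trimmed))
            throw new ArgumentException($"Unknown taxonomy '{trimmed}'.", nameof(taxonomy));
        return trimmed.ToLowerInvariant();
    }
}
```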

You can read more about the request parameters in the “Classify Request Parameters” section.

Classify Raw Text using a REST API in C#

You can classify any raw text programmatically by following the steps given below.

  • Create an instance of ClassificationApi
  • Create an instance of BaseRequest
  • Assign the raw text to the BaseRequest description
  • Create ClassifyRequest with BaseRequest
  • Set BestClassesCount
  • Get ClassificationResponse by calling the ClassificationApi.Classify() method

The following code sample shows how to classify raw text using a REST API.

ClassName: Hobbies_&_Interests
ClassProbability: 43.02
--------------------------------
ClassName: Business_and_Finance
ClassProbability: 26.64
--------------------------------
ClassName: Technology_&_Computing
ClassProbability: 18.25
--------------------------------
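Once you have the class names and probabilities, you will usually want to pick the best class and decide whether it is confident enough to act on. The sketch below does that for the three results shown above. The `ClassResult` record and the `ResultPicker` helper are hypothetical types for this example, the probabilities are treated as percentages as they appear in the output, and the threshold of 40 is an arbitrary value chosen for illustration.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;

// The three results shown in the output above.
var results = new List<ClassResult>
{
    new("Hobbies_&_Interests", 43.02),
    new("Business_and_Finance", 26.64),
    new("Technology_&_Computing", 18.25),
};

// Print every result in the same layout as the article's output.
foreach (var result in results)
{
    Console.WriteLine($"ClassName: {result.ClassName}");
    Console.WriteLine("ClassProbability: " + result.ClassProbability.ToString("0.00", CultureInfo.InvariantCulture));
    Console.WriteLine("--------------------------------");
}

// Accept the top class only if it reaches the chosen threshold.
var best = ResultPicker.Best(results, minimumProbability: 40.0);
Console.WriteLine(best is null
    ? "No class is confident enough."
    : $"Best class: {best.ClassName}");

// Hypothetical types for this sketch; they are not SDK classes.
public sealed record ClassResult(string ClassName, double ClassProbability);

public static class ResultPicker
{
    // Returns the most probable class, or null when it falls below the threshold
    // or when there are no results at all.
    public static ClassResult? Best(IEnumerable<ClassResult> results, double minimumProbability)
    {
        var top = results.OrderByDescending(r => r.ClassProbability).FirstOrDefault();
        return top is not null && top.ClassProbability >= minimumProbability ? top : null;
    }
}
```

With a stricter threshold, for example 50, none of the three classes would qualify and the helper would return null, which is a useful signal to fall back to manual review.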

Try Online

You can try the free online classification tool, which is built on the above API: https://products.groupdocs.app/classification/

Conclusion

In this article, you have learned how to classify Word documents and raw text on the cloud using C#. You also learned how to programmatically upload a DOCX file to the cloud. You can learn more about GroupDocs.Classification Cloud API from the documentation. We also provide an API Reference section that lets you visualize and interact with our APIs directly through the browser. If you have any questions, please feel free to contact us on the forum.