GovText Features | Singapore Government Developer Portal

GovText Web Portal: Topic modelling with interactive presentation of results

Topic modelling is a statistical technique for discovering hidden topics (collections of words and phrases centred on different themes) within a set of documents. GovText's visualisations of the results let users understand the topic models both at a high level and in detail: they can see the topics (and their constituent words) across the entire dataset, and the proportion of each topic in individual documents.

The GovText Web Portal offers two algorithms for topic modelling:

  • Latent Dirichlet Allocation (LDA)
  • Correlated Topic Model (CTM)
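
To make the two outputs described above concrete (the words that make up each topic, and the proportion of each topic in each document), the following is a minimal sketch of LDA fitted by collapsed Gibbs sampling. It is not GovText's implementation, which this page does not describe. The function name `fit_lda`, the hyperparameters `alpha` and `beta`, and the toy documents are all assumptions chosen for illustration.

```python
import random
from collections import Counter


def fit_lda(docs, num_topics, iterations=200, alpha=0.1, beta=0.01, seed=0):
    """Illustrative collapsed Gibbs sampler for LDA on tokenised documents."""
    rng = random.Random(seed)
    vocab_size = len({w for doc in docs for w in doc})

    # Count tables: topics per document, words per topic, tokens per topic.
    doc_topic = [[0] * num_topics for _ in docs]
    topic_word = [Counter() for _ in range(num_topics)]
    topic_total = [0] * num_topics

    # Start from a random topic assignment for every token.
    assignments = []
    for d, doc in enumerate(docs):
        z_doc = []
        for w in doc:
            z = rng.randrange(num_topics)
            z_doc.append(z)
            doc_topic[d][z] += 1
            topic_word[z][w] += 1
            topic_total[z] += 1
        assignments.append(z_doc)

    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the token's current assignment from the counts.
                z = assignments[d][i]
                doc_topic[d][z] -= 1
                topic_word[z][w] -= 1
                topic_total[z] -= 1
                # Resample its topic from the conditional distribution.
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][w] + beta)
                    / (topic_total[k] + vocab_size * beta)
                    for k in range(num_topics)
                ]
                z = rng.choices(range(num_topics), weights=weights)[0]
                assignments[d][i] = z
                doc_topic[d][z] += 1
                topic_word[z][w] += 1
                topic_total[z] += 1

    # Smoothed topic proportions for each document (each row sums to 1).
    proportions = [
        [(doc_topic[d][k] + alpha) / (len(doc) + num_topics * alpha)
         for k in range(num_topics)]
        for d, doc in enumerate(docs)
    ]
    # Most frequent words currently assigned to each topic.
    top_words = [
        [w for w, count in topic_word[k].most_common(3) if count > 0]
        for k in range(num_topics)
    ]
    return proportions, top_words


# Toy corpus (made up for this example), already tokenised.
docs = [
    "bus train fare bus route".split(),
    "train route delay bus fare".split(),
    "clinic doctor vaccine clinic nurse".split(),
    "doctor nurse vaccine clinic appointment".split(),
]
proportions, top_words = fit_lda(docs, num_topics=2)
for k, words in enumerate(top_words):
    print(f"Topic {k}: {words}")
for d, row in enumerate(proportions):
    print(f"Document {d}: {[round(p, 2) for p in row]}")
```

CTM differs from LDA mainly in that it models correlations between topics; it is not covered by this sketch.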

GovText Web Portal: Text summarization

Text summarization is the process of generating a shortened version of a document that is concise and yet retains the essential information in the original document.

The GovText Web Portal offers two types of summarization:

  • Abstractive summarization (Normal mode)
    • The key points of an article are paraphrased into a short and coherent paragraph, by a language model pre-trained on news articles and open datasets.
  • Extractive summarization (Quick mode)
    • The most important sentences from an article are extracted and combined, unchanged, into a summary. Nothing is paraphrased, but processing is faster.
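
The extractive approach can be illustrated with a short sketch. The page does not say how GovText decides which sentences are "most important", so the word-frequency scoring below is an assumption, as are the function names, the stopword list, and the sample text.

```python
import re
from collections import Counter

# Small illustrative stopword list (an assumption, not GovText's list).
STOPWORDS = {"a", "an", "and", "are", "as", "for", "in", "is", "it",
             "of", "on", "the", "to", "was", "were", "will", "with"}


def split_sentences(text):
    """Naive sentence splitter: break after '.', '!' or '?'."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p.strip() for p in parts if p.strip()]


def content_words(text):
    """Lowercased words with stopwords removed."""
    return [w for w in re.findall(r"[a-z']+", text.lower()) if w not in STOPWORDS]


def extractive_summary(text, num_sentences=2):
    """Keep the highest-scoring sentences, verbatim, in their original order."""
    sentences = split_sentences(text)
    freq = Counter(content_words(text))

    def score(sentence):
        tokens = content_words(sentence)
        # Average frequency, so long sentences are not favoured automatically.
        return sum(freq[w] for w in tokens) / len(tokens) if tokens else 0.0

    ranked = sorted(range(len(sentences)),
                    key=lambda i: score(sentences[i]), reverse=True)
    chosen = sorted(ranked[:num_sentences])
    return " ".join(sentences[i] for i in chosen)


# Made-up sample text for the example.
article = (
    "The town council will upgrade the lifts in twelve blocks this year. "
    "Residents were informed at a dialogue session last week. "
    "The lifts in older blocks break down often, and the upgrade will replace them. "
    "Refreshments were served after the session."
)
summary = extractive_summary(article, num_sentences=2)
print(summary)
```

Because the selected sentences are copied rather than rewritten, this kind of summary needs no language model, which is consistent with the faster processing noted for Quick mode.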

GovText Model-hosting Platform: Development, deployment, and hosting services of customised AI models

The GovText team can help agencies and central products develop, deploy and host customised text analytics models on the GovText Model-hosting Platform, so that client systems can use the text analytics services via API.
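
As a rough sketch of what "via API" could look like from a client system, the code below builds a JSON prediction request with Python's standard library. The API itself is not documented on this page, so everything specific here is hypothetical: the placeholder URL, the `text` payload field, the bearer-token header, and the function names. The actual contract would come from the GovText team.

```python
import json
import urllib.request

# Placeholder endpoint (hypothetical; not a real GovText URL).
API_URL = "https://govtext.example.invalid/models/feedback-classifier/predict"


def build_prediction_request(text, api_key, url=API_URL):
    """Build a POST request carrying the text to analyse as JSON."""
    payload = json.dumps({"text": text}).encode("utf-8")  # field name is assumed
    return urllib.request.Request(
        url,
        data=payload,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",  # auth scheme is assumed
        },
        method="POST",
    )


def predict(text, api_key, url=API_URL, timeout=10):
    """Send the request and decode the JSON response body."""
    request = build_prediction_request(text, api_key, url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Building the request does not touch the network; predict() would.
request = build_prediction_request(
    "The street lamp outside Block 123 is not working.", "DUMMY_KEY"
)
print(request.get_method(), request.full_url)
```

Separating request construction from sending keeps the payload easy to inspect and test without a live endpoint.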

GovText Model-hosting Platform: Prediction of use case results by customised AI models

The models predict results specific to each use case. Examples of models deployed live include:

  • Feedback case classifier to auto-predict case attributes such as categories and sub-categories.
  • Agency/case owner classifier to auto-predict the responding agency or department that will handle the feedback.
  • Information extractor to auto-extract feedback details.
  • Summarizer to shorten long feedback email chains to one or two sentences for quick reading and inclusion in management reports.

Last updated 25 November 2022