UTAP: A platform for analyzing unstructured data


 

UTAP is an innovative text analytics platform that helps businesses cut the cost and turnaround time of extracting valuable insights from unstructured text data. It provides an easy-to-use interface with customized screens for each phase of the text analytics process, which speeds up turnaround and improves data handling when classifying textual data.

 

 

What UTAP does


  • Automated text classification: Query classification for deeper and better understanding of customer queries
  • Makes text classification 60% faster
  • A platform that enables scalability and makes text classification easier and faster
  • Preview screens for data extraction and data cleansing
  • Sliders and input boxes for easy definition of parameters and data split
  • Pre-defined classification and sampling performed with the click of a button
  • Predefined classifiers, and sampling techniques to choose from

 

Benefits of the tool


Economic

Low cost per comment, making it economical for large data volumes

Scalability

Ability to handle millions of rows of textual data per day

Automation

Faster turnaround time, as each step of the unstructured text classification process is prebuilt

Data Sources

Comprehensive coverage of data sources (structured and unstructured). Ability to handle 1 million rows of data per day.

Actionable

Newsletters and brand story delivered at the end of the exercise with actionable insights. Roadmap for further text analytics will be provided.

Significant actionable insights to help you develop a deeper understanding of issues, and identify problem areas

Data Extraction

  • Multiple formats supported (SQL file, .csv, .txt, .xls)
  • Ability to select specific tables and columns of the data file.
  • Concatenate multiple columns as a single entity
  • Preview screen of the selected data set
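
The column selection, concatenation, and preview steps above can be sketched in a few lines of Python. This is only an illustration using the standard library on a CSV input; the function name `extract_columns`, the field name `text`, and the sample data are assumptions, not part of UTAP.

```python
import csv
import io

def extract_columns(csv_text, columns, joined_name="text", sep=" "):
    """Keep only `columns` and concatenate them into one field per row."""
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = []
    for record in reader:
        # Skip empty cells so the joined text has no stray separators.
        parts = [record[c].strip() for c in columns if record.get(c)]
        rows.append({joined_name: sep.join(parts)})
    return rows

# Hypothetical input file content.
sample = "id,subject,body\n1,Refund,Where is my refund?\n2,Login,Cannot sign in\n"
rows = extract_columns(sample, ["subject", "body"])

# A simple "preview" of the selected data set.
for row in rows[:2]:
    print(row)
```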

Variable Selection

  • Simplify the problem to predictor and dependent variables
  • Create a new table using only the relevant fields, and variables
  • The data is reduced to 70% of its original size, as irrelevant metadata is removed.
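
As a rough illustration of reducing a data set to its predictor and dependent variables, the sketch below builds a new table from two fields and drops the rest. The record layout and the field names (`comment`, `category`) are hypothetical.

```python
def select_variables(records, predictor, target):
    """Build a new table holding only the predictor and dependent fields."""
    return [{predictor: r[predictor], target: r[target]} for r in records]

# Hypothetical records with metadata that is irrelevant to classification.
records = [
    {"ticket_id": "101", "agent": "A7", "comment": "Where is my refund?", "category": "billing"},
    {"ticket_id": "102", "agent": "B2", "comment": "Cannot sign in", "category": "login"},
]

table = select_variables(records, predictor="comment", target="category")
print(table[0])
```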

Data Sampling

  • Sliders and input boxes for easy definition of parameters and data split
  • Predefined classifiers, and sampling techniques to choose from
  • Parameters can be modified based on each dataset
  • Sampling performed with the click of a button
  • Preview screen shows how data has been distributed
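
One common sampling technique is a stratified train/test split, where each class keeps roughly the same share in both parts. The sketch below is an assumption about what such a step could look like, not a description of UTAP's own sampling options; `test_fraction` and `seed` stand in for the parameters a user would set.

```python
import random
from collections import Counter, defaultdict

def stratified_split(records, label_key, test_fraction=0.2, seed=0):
    """Split records into train/test while preserving class proportions."""
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for record in records:
        by_label[record[label_key]].append(record)

    train, test = [], []
    for group in by_label.values():
        group = group[:]              # copy so the caller's order is untouched
        rng.shuffle(group)
        n_test = round(len(group) * test_fraction)
        test.extend(group[:n_test])
        train.extend(group[n_test:])
    return train, test

# Hypothetical data: 30 "billing" rows and 10 "login" rows.
records = [{"text": f"msg {i}", "label": "billing" if i % 4 else "login"}
           for i in range(40)]
train, test = stratified_split(records, "label", test_fraction=0.2)

# "Preview" of how the data has been distributed.
print("train:", Counter(r["label"] for r in train))
print("test: ", Counter(r["label"] for r in test))
```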

Data Cleansing

  • Pre-processing or text cleansing is done with the click of a button, as the lists of junk words and stop words are predefined
  • Preview screen of the cleaned data set
  • The list of stop words, junk words, and regular expressions can be updated in an Excel sheet and uploaded
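
A minimal sketch of this kind of cleansing, assuming a stop-word list and a list of junk regular expressions, might look like the following. The lists here are short illustrative placeholders hard-coded in the script; loading them from an uploaded spreadsheet is left out.

```python
import re

# Illustrative lists only; real lists would be much longer.
STOP_WORDS = {"the", "is", "a", "my", "to"}
JUNK_PATTERNS = [r"https?://\S+", r"[^a-z\s]"]   # URLs first, then non-letters

def clean_text(text, stop_words=STOP_WORDS, junk_patterns=JUNK_PATTERNS):
    """Lowercase, strip junk patterns, then drop stop words."""
    text = text.lower()
    for pattern in junk_patterns:
        text = re.sub(pattern, " ", text)
    tokens = [t for t in text.split() if t not in stop_words]
    return " ".join(tokens)

print(clean_text("Where is my REFUND?? See http://example.com/t/42"))
```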

Model Training

  • Custom parameters can be set for building the TF-IDF representation
  • Set of classifiers to choose from (pre-coded in the background)
  • Parameters can be modified based on each dataset
  • Accuracy score is given with the click of a button
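
To make the TF-IDF and classifier steps concrete, here is a small standard-library sketch. It uses a smoothed IDF with a `min_df` parameter and a nearest-centroid classifier scored by cosine similarity; the classifier was chosen only because it fits in a few lines, and the source does not say which classifiers UTAP ships. All function names and the toy data are assumptions.

```python
import math
from collections import Counter, defaultdict

def fit_idf(docs, min_df=1):
    """Smoothed IDF for terms found in at least `min_df` documents."""
    df = Counter(term for doc in docs for term in set(doc.split()))
    n = len(docs)
    return {t: math.log((1 + n) / (1 + c)) + 1.0
            for t, c in df.items() if c >= min_df}

def tfidf(doc, idf):
    """L2-normalised TF-IDF vector for one document, as a sparse dict."""
    counts = Counter(t for t in doc.split() if t in idf)
    vec = {t: c * idf[t] for t, c in counts.items()}
    norm = math.sqrt(sum(v * v for v in vec.values())) or 1.0
    return {t: v / norm for t, v in vec.items()}

def train(docs, labels, idf):
    """Nearest-centroid training: one normalised mean vector per label."""
    sums = defaultdict(lambda: defaultdict(float))
    for doc, label in zip(docs, labels):
        for term, weight in tfidf(doc, idf).items():
            sums[label][term] += weight
    centroids = {}
    for label, vec in sums.items():
        norm = math.sqrt(sum(v * v for v in vec.values())) or 1.0
        centroids[label] = {t: v / norm for t, v in vec.items()}
    return centroids

def predict(doc, idf, centroids):
    """Return the label whose centroid is most similar to the document."""
    vec = tfidf(doc, idf)
    def similarity(label):
        return sum(w * centroids[label].get(t, 0.0) for t, w in vec.items())
    return max(centroids, key=similarity)

# Hypothetical, already-cleaned training data.
docs = ["refund not received", "refund charged twice",
        "cannot login account", "password reset login"]
labels = ["billing", "billing", "login", "login"]

idf = fit_idf(docs, min_df=1)
centroids = train(docs, labels, idf)
predictions = [predict(d, idf, centroids) for d in docs]
accuracy = sum(p == y for p, y in zip(predictions, labels)) / len(labels)
print(f"training accuracy: {accuracy:.2f}")
```

Accuracy measured on the training data is optimistic; a held-out split gives a more honest figure.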

Validation and Testing

  • Get the accuracy score, confusion matrix, and classification report, downloadable in .pdf/.doc format
  • Predefined classifiers, and sampling techniques to choose from
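
For readers unfamiliar with these metrics, the sketch below computes an accuracy score, a confusion matrix, and a per-class precision/recall report from true and predicted labels. It is a plain-Python illustration with made-up labels; exporting the results to .pdf or .doc is not shown.

```python
from collections import Counter

def confusion_matrix(y_true, y_pred):
    """Rows are true labels, columns are predicted labels (sorted order)."""
    labels = sorted(set(y_true) | set(y_pred))
    counts = Counter(zip(y_true, y_pred))
    return labels, [[counts[(t, p)] for p in labels] for t in labels]

def classification_report(y_true, y_pred):
    """Per-class precision, recall, and support."""
    labels, matrix = confusion_matrix(y_true, y_pred)
    report = {}
    for i, label in enumerate(labels):
        tp = matrix[i][i]
        predicted = sum(row[i] for row in matrix)   # column total
        actual = sum(matrix[i])                     # row total
        report[label] = {
            "precision": tp / predicted if predicted else 0.0,
            "recall": tp / actual if actual else 0.0,
            "support": actual,
        }
    return report

# Hypothetical test-set labels and model predictions.
y_true = ["billing", "billing", "login", "login", "login"]
y_pred = ["billing", "login", "login", "login", "billing"]

labels, matrix = confusion_matrix(y_true, y_pred)
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
report = classification_report(y_true, y_pred)

print("labels:", labels)
print("matrix:", matrix)
print("accuracy:", accuracy)
print("report:", report)
```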