North Link

Easing the load

Labeling data by hand is tedious, repetitive work. The goal of this project was to assist the people who assign categories to companies by generating recommendations automatically and, in some cases, categorizing companies without manual input. This removes part of the burden and lets people focus on more productive tasks.
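One way to combine recommendations with selective automatic categorization is to route on model confidence. The sketch below is illustrative only: the threshold value, the `Decision` type, and the `decide` function are assumptions made for this example, not details of the project itself.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

# Hypothetical cut-off: above this score the top category is applied
# without human review. A real value would be tuned on validation data.
AUTO_ASSIGN_THRESHOLD = 0.95


@dataclass
class Decision:
    auto_assigned: Optional[str]   # category applied automatically, if any
    recommendations: List[str]     # ranked suggestions shown to a human


def decide(scores: Dict[str, float], top_k: int = 3,
           threshold: float = AUTO_ASSIGN_THRESHOLD) -> Decision:
    """Rank categories by score and auto-assign only when confidence is high."""
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    recommendations = [category for category, _ in ranked[:top_k]]
    if ranked and ranked[0][1] >= threshold:
        return Decision(auto_assigned=ranked[0][0],
                        recommendations=recommendations)
    return Decision(auto_assigned=None, recommendations=recommendations)
```

With a design like this, low-confidence cases still reach a human, who sees a short ranked list instead of the full set of categories.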

Data is king

In machine learning, a good model is worth nothing without good training data, so the first step of this project was to build a pipeline that collects as much relevant data about companies as possible. This includes augmenting the data and extending it with external data sources.

Understanding natural language

Much of the data used as input to the model was text written by humans. Before a model can work with human language, the text has to be processed and converted into a form the computer can handle. To do this we used state-of-the-art NLP techniques to normalize and tokenize the text, taking into account spelling errors and other mistakes.
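As a rough illustration of normalizing and tokenizing text while tolerating spelling errors, here is a minimal sketch using only the Python standard library. It is not the technique used in the project: the vocabulary, the function names, and the similarity cutoff are all assumptions chosen for the example, and fuzzy matching with `difflib` stands in for whatever spelling handling the project actually applied.

```python
import difflib
import re
import unicodedata
from typing import List, Sequence

# Hypothetical vocabulary of known terms; a real system would derive
# this from its own corpus.
VOCABULARY = ["software", "consulting", "restaurant", "construction", "retail"]


def normalize(text: str) -> str:
    """Strip accents, lowercase, and replace punctuation with spaces."""
    decomposed = unicodedata.normalize("NFKD", text)
    without_accents = "".join(
        ch for ch in decomposed if not unicodedata.combining(ch)
    )
    return re.sub(r"[^a-z0-9\s]", " ", without_accents.lower())


def tokenize(text: str) -> List[str]:
    """Split normalized text on whitespace."""
    return normalize(text).split()


def correct(token: str, vocabulary: Sequence[str] = VOCABULARY,
            cutoff: float = 0.8) -> str:
    """Map a token to its closest vocabulary entry, or keep it unchanged."""
    matches = difflib.get_close_matches(token, vocabulary, n=1, cutoff=cutoff)
    return matches[0] if matches else token


def preprocess(text: str) -> List[str]:
    """Normalize, tokenize, and apply fuzzy spelling correction."""
    return [correct(token) for token in tokenize(text)]
```

For example, `preprocess("Sofware Consultng & Café!")` maps the misspelled tokens to `software` and `consulting`, and leaves `cafe` unchanged because nothing in the vocabulary is close enough.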

Categorization of companies using ML and NLP

Automatic company categorization
