Industry Classification Algorithm
Project scope
Categories
Data analysis
Skills
algorithms, workflow automation, statistical methods, automation
Description
We need an algorithm that classifies businesses into industries. Based on previous work, we expect this to involve scoring each business with a probability of belonging to a given category, using correlations among the inputs in our database.
We will provide all of the data necessary to complete the project.
The classification involves 3 inputs:
- A list of terms that appear on the business's website
- A third-party description of the business
- A third-party list of industries with which the business is associated
The output is, for each business, the probability that it belongs to each of our ~100 industry classifications.
We have a training set of several hundred businesses that have been manually assigned.
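To make the expected input and output concrete, here is a minimal sketch of one possible approach: a multinomial naive Bayes classifier written with the Python standard library only. It is an illustration, not the required method. The field names (`website_terms`, `description`, `third_party_industries`), the two toy businesses, and the industry labels are all hypothetical stand-ins for the real data.

```python
import math
from collections import Counter, defaultdict


def tokens(record):
    """Merge the three inputs into one bag of tokens, prefixed by source."""
    out = [f"term:{t.lower()}" for t in record["website_terms"]]
    out += [f"desc:{w.lower()}" for w in record["description"].split()]
    out += [f"ind:{i.lower()}" for i in record["third_party_industries"]]
    return out


def train(labeled):
    """labeled: list of (record, industry) pairs from the manually assigned set."""
    class_counts = Counter()
    token_counts = defaultdict(Counter)
    vocab = set()
    for record, industry in labeled:
        class_counts[industry] += 1
        for tok in tokens(record):
            token_counts[industry][tok] += 1
            vocab.add(tok)
    return class_counts, token_counts, vocab


def predict_proba(model, record, alpha=1.0):
    """Return {industry: probability}; alpha is the Laplace smoothing constant."""
    class_counts, token_counts, vocab = model
    total = sum(class_counts.values())
    # Tokens never seen in training carry no information and are skipped.
    toks = [t for t in tokens(record) if t in vocab]
    log_scores = {}
    for industry, n in class_counts.items():
        denom = sum(token_counts[industry].values()) + alpha * len(vocab)
        score = math.log(n / total)
        for tok in toks:
            score += math.log((token_counts[industry][tok] + alpha) / denom)
        log_scores[industry] = score
    # Normalise the log scores so the probabilities sum to 1.
    peak = max(log_scores.values())
    exp = {k: math.exp(v - peak) for k, v in log_scores.items()}
    z = sum(exp.values())
    return {k: v / z for k, v in exp.items()}


# Hypothetical toy training set; the real one has several hundred businesses.
labeled = [
    ({"website_terms": ["payments", "invoice", "ledger"],
      "description": "Online invoicing and payments software",
      "third_party_industries": ["FinTech"]}, "Financial Software"),
    ({"website_terms": ["patients", "clinic", "scheduling"],
      "description": "Scheduling software for medical clinics",
      "third_party_industries": ["Health Care"]}, "Healthcare Software"),
]
model = train(labeled)
new_business = {"website_terms": ["invoice", "billing"],
                "description": "Billing and payments platform",
                "third_party_industries": ["FinTech"]}
probs = predict_proba(model, new_business)
best = max(probs, key=probs.get)
```

Naive Bayes is used here only because it is short and works with small training sets. Its probabilities tend to be overconfident, so a real solution would likely need calibration or a different model.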
Roughly 50,000 businesses need assignment, so we will build your algorithm into an automated workflow (you get bonus points for providing a methodology for automation).
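As a rough picture of what such an automated workflow might look like, the sketch below streams business records through a scoring function and writes the top-ranked industries for each, flagging low-confidence results for manual review. Everything in it is an assumption made for illustration: the CSV layout, the `business_id` column, the `top_k` and `review_below` parameters, and `score_stub`, which is a placeholder for whatever trained classifier is delivered.

```python
import csv
import io


def score_stub(row):
    """Hypothetical placeholder; the trained classifier would be called here."""
    return {"Financial Software": 0.7, "Healthcare Software": 0.3}


def assign_batch(reader, writer, score, top_k=3, review_below=0.5):
    """Write the top_k industries per business and flag weak top scores."""
    writer.writerow(["business_id", "industry", "probability", "needs_review"])
    for row in reader:
        probs = score(row)
        ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)[:top_k]
        # A low best score suggests the business should be checked by hand.
        needs_review = ranked[0][1] < review_below
        for industry, p in ranked:
            writer.writerow([row["business_id"], industry, f"{p:.4f}", needs_review])


# In-memory stand-ins for the real input and output files.
source = io.StringIO("business_id,description\n101,Billing and payments platform\n")
out = io.StringIO()
assign_batch(csv.DictReader(source), csv.writer(out), score_stub)
rows = list(csv.reader(io.StringIO(out.getvalue())))
```

Processing one row at a time keeps memory use flat across the full 50,000 businesses, and the review flag gives a natural hand-off point to manual assignment.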
Qualifications
Data science background, including comfort with statistical methods. Experience implementing algorithms in code is also helpful, but not required.
Anyone working on this project will be given access to our company Slack channel so they can quickly get answers to their questions. Also, we highly recommend a weekly video call for status updates and more complex questions.
About the company
Basis State finds acquirers for software companies. We do this through a workflow designed to identify and quantify the value of the technology to potential acquirers, and through an outreach process that follows marketing best practices such as segmentation and account-based marketing. We have developed proprietary tools to assess fit with potential acquirers, derive valuations, and provide process transparency to our clients.