If you’re thinking about starting a company around proprietary algorithms… don’t!
The types of companies I’m most concerned about are:
- recommendation engine companies
- predictive analytics companies
- natural language processing companies
Any strictly technical advantage is a very tough thing to maintain, but I believe that algorithms are the most tenuous advantage of them all. Why? Algorithms are often easy to swap, their advantages can be esoteric, high-quality open source algorithms for all of the applications above are available right now for free, and open innovation is continuing at a dizzying pace.
This last point, “open,” is the most important. If you sell algorithms, then you probably sell them closed-source, which means your customers can’t, you know, customize them. So your value prop is a “good enough” implementation that’ll get your customers to a solution fast. The problem is this: that’s perfect only for early-stage customers and low-value use cases. Yes, today you could arbitrage inefficiencies in the data science labor market, but as more people with data science training enter the field, a poorly tailored algorithm is quickly becoming a competitive disadvantage.
Here is a case in point: the company I work at, Mortar, is a platform company (not an algorithm company), yet we’ve built recommendation engines for Viacom, Associated Press, StubHub, and others. Why? Because we’re not making money on recommendations, and we offer flexible, open recommenders that their in-house data science teams can own, extend, and adjust as needed. We even packaged up our code and tutorials so that anyone can build their own recommendation engine for free… that’s how little we think the algorithms are worth on their own.
Here’s another bad sign: established algorithm companies are sweating and pivoting because their moat is being drained by an open source army. (I don’t want to call out individual companies that are having a hard time, but a Google search will give you examples.) Ted Dunning and Ellen Friedman just wrote a short ebook about building recommendation engines on open technologies (Mahout + Solr). 0xdata has an open source prediction engine for big data, and prediction.io has an open source prediction engine for regular data. Cloudera open sourced Oryx to generate and serve recommendations. Weka is a high-quality collection of open machine learning algorithms. Python has a great collection of [natural language processing](http://www.nltk.org/) and machine learning tools. The list goes on and on. The main value that black-box algorithm vendors provide is making it easier to operate their algorithms end to end, and the main reason they stay afloat isn’t customer satisfaction; it’s inertia… and that’s not the type of company that anyone wants to start.