AI != Big Data

Consider the following two cases:

(1) Using 1TB of data to train a model that will then process 10MB of data in real time

(2) Processing batch after batch of 1TB of data in real time

Both are arguably Big Data, but they are very different problems.

A classic example of (1) is image recognition. One might need terabytes of data to train the model, but the trained model processes only 10MB or so at any given time. A GPU with limited memory is perfect for this; a CPU can handle it more slowly, but still effectively.
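To make the asymmetry concrete, here is a minimal Python sketch. The sizes, the frame shape, and the classify() stand-in are illustrative assumptions, not a real model:

import numpy as np

# Case (1): the training corpus is huge, but the deployed model only ever
# touches a small, fixed-size input at inference time.
TRAIN_SET_BYTES = 1 * 1024**4                       # ~1 TB, used once, offline
frame = np.zeros((1080, 1920, 3), dtype=np.uint8)   # one HD frame, ~6 MB

def classify(frame: np.ndarray) -> int:
    """Stand-in for a trained image classifier (hypothetical)."""
    # In reality, the model weights plus this one frame fit in GPU memory.
    return int(frame.mean()) % 1000

print(f"training data:   {TRAIN_SET_BYTES / 1024**3:,.0f} GiB (offline, once)")
print(f"inference input: {frame.nbytes / 1024**2:.1f} MiB (real-time)")
print("predicted class:", classify(frame))

The terabyte only matters while the model is being built; once deployed, the working set is tiny, which is why a memory-constrained GPU is enough.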

For (2), think about the compliance requirements of banks, which demand daily or even hourly results across tens of millions of customers and the terabytes of data that come with them. That is not something a GPU, with its pathetic access to memory, can handle. Even a generic CPU with tons of DRAM might not keep up.
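By contrast, here is a minimal Python sketch of case (2): a streaming reduction over a transaction log, where only small per-user running totals are kept in memory. The file name and column names are made-up assumptions:

import csv
from collections import defaultdict

def hourly_totals(path: str) -> dict[str, float]:
    """Scan the log once, keeping only per-user running totals in memory."""
    totals: dict[str, float] = defaultdict(float)
    with open(path, newline="") as f:
        for row in csv.DictReader(f):   # rows are streamed, never all in RAM
            totals[row["user_id"]] += float(row["amount"])
    return totals

# e.g. totals = hourly_totals("transactions_last_hour.csv")  # hypothetical path

Even with careful streaming, one machine cannot scan terabytes every hour; the same reduction has to be sharded across many nodes (for example, by user_id) to finish within the deadline.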

In summary:

(1) is an algorithm problem

(2) is a computation problem

 

Optionality and entrepreneurship

QZ.com's quote of me on the subject of Xiaomi's valuation