Solved – Linear Regression Model with Many Features – Real Life Example

machine-learning, regression

I am learning Machine Learning (linear regression) from Prof. Andrew Ng's lectures. In the discussion of when to use the normal equation versus gradient descent, he says that when the number of features is very large (around $10^6$), we should use gradient descent.

Everything is clear to me, but I wonder: can someone give me real-life examples where we use such a huge number of features?

[A slide from the Coursera ML lecture]

Best Answer

Natural language processing comes to mind. For instance, you might predict the amount of money someone spends on your website from their review. The review is text, encoded with an n-gram model: the $(i, j)$ element of your training matrix is the count of the $j$th n-gram in the $i$th customer's review. An n-gram is a string of contiguous words, like "the product was excellent."
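To make the feature explosion concrete, here is a minimal sketch of n-gram counting using scikit-learn's CountVectorizer; the three toy reviews are made up for illustration, and on a real corpus the n-gram vocabulary (the number of columns) easily reaches $10^5$–$10^6$.

```python
from sklearn.feature_extraction.text import CountVectorizer

# Tiny illustrative corpus: one (made-up) review per customer.
reviews = [
    "the product was excellent",
    "the product was terrible and arrived late",
    "excellent value, would buy again",
]

# Count unigrams through trigrams; each distinct n-gram becomes a column.
vectorizer = CountVectorizer(ngram_range=(1, 3))
X = vectorizer.fit_transform(reviews)  # returns a SciPy sparse count matrix

print(X.shape)                                   # (num_customers, num_ngrams)
print(vectorizer.get_feature_names_out()[:5])    # a few of the n-gram columns
```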

Typically, n-gram data is encoded in a sparse matrix. Several data structures are suitable; one of the easiest to explain is a coordinate list, which is simply a list of tuples [(i, j, value)] recording the nonzero entries. Since the matrix is mostly zeros, this is much more efficient than allocating a dense array.
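As a sketch of the coordinate-list idea, here is one concrete implementation using SciPy's coo_matrix; the row/column indices and counts below are invented for illustration, and the $3 \times 10^6$ shape is chosen only to show how little memory the sparse representation needs.

```python
import numpy as np
from scipy.sparse import coo_matrix

# Coordinate list: (row, column, value) triples for the nonzero counts only,
# e.g. customer 0 used n-gram 2 once, customer 1 used n-gram 0 three times, ...
rows   = np.array([0, 1, 1, 2])
cols   = np.array([2, 0, 5, 3])
counts = np.array([1, 3, 2, 1])

# A 3 x 1,000,000 count matrix, but only the four listed entries are stored.
X = coo_matrix((counts, (rows, cols)), shape=(3, 1_000_000))

print(X.nnz)                         # 4 stored values vs. 3,000,000 dense cells
print(X.tocsr()[:, :6].toarray())    # peek at the first few columns
```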