I had the same problem understanding it. It seems that the output score vector will be the same for all C terms, but the error against each of the C one-hot encoded target vectors will be different. These error vectors are then used in back-propagation to update the weights.
Please correct me if I'm wrong.
Source: https://iksinc.wordpress.com/tag/skip-gram-model/
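For what it's worth, here is a rough NumPy sketch of that idea (one skip-gram training step in the style the linked post describes); all sizes, values, and names like W_in/W_out are made up for illustration, not taken from any real implementation:

```python
import numpy as np

V, N, C = 10, 4, 3            # toy vocab size, hidden size, context words
rng = np.random.default_rng(0)
W_in = rng.normal(scale=0.1, size=(V, N))    # input->hidden weights
W_out = rng.normal(scale=0.1, size=(N, V))   # hidden->output weights
lr = 0.05

center = 2                     # index of the input (center) word
context = [1, 5, 7]            # indices of the C true context words

h = W_in[center]               # hidden layer = input word's row
u = h @ W_out                  # one score vector, shared by all C outputs
y = np.exp(u - u.max()); y /= y.sum()   # softmax

# The predicted distribution y is identical for every context position;
# only the one-hot targets differ, so the error vectors differ too.
EI = np.zeros(V)
for c in context:
    t = np.zeros(V); t[c] = 1.0
    EI += y - t                # accumulate the per-position error vectors

# Back-propagate the summed error to update both weight matrices.
W_out -= lr * np.outer(h, EI)
W_in[center] -= lr * (W_out @ EI)
```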
Any code that iterates over 2*k target words, or 2*k context words, to create a total of 2*k (context-word)->(target-word) pairs for training, is "skip-gram". Some of the diagrams or notation in the original paper may give the impression that skip-gram uses multiple context words at once, or predicts multiple target words at once, but in fact it's always just a 1-to-1 training pair, involving pairs of words in the same (window-sized) neighborhood.
(Only CBOW, which actually sums/averages multiple context words together, truly uses a combined range of words w(i-k), ..., w(i+k) as a single NN-training example.)
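For contrast, a toy sketch of just the CBOW input step (sizes, names, and indices all illustrative):

```python
import numpy as np

V, N = 10, 4
rng = np.random.default_rng(0)
W_in = rng.normal(scale=0.1, size=(V, N))

context = [1, 5, 7, 8]                 # indices of w(i-k), ..., w(i+k)
h = W_in[context].mean(axis=0)         # whole window averaged into ONE vector
# h then feeds a single softmax prediction of the center word w(i),
# so the combined window really is one NN-training example.
```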
If I recall correctly, the original word2vec paper described skip-gram in one way, but then at some point for CPU cache efficiency the Google-released word2vec.c code looped over the text in the opposite way – which has sometimes caused confusion for people reading that code, or other code modeled on it.
But whether you view skip-gram as predicting a central target word from individual nearby context words, or as predicting surrounding individual target words from a central context word, in the end each original text sample results in the exact same set of desired (context-word)->(target-word) predictions – just in a slightly different training order. Each ordering is reasonably called 'skip-gram' and winds up with similar results, at the end of bulk training.
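To make that concrete, here's a toy sketch (the tiny corpus and function names are just for illustration) showing that both loop orderings generate exactly the same set of pairs:

```python
tokens = ["the", "quick", "brown", "fox", "jumps"]
k = 2  # window size

def pairs_center_as_target(tokens, k):
    # view 1: predict the central word from each individual nearby word
    out = set()
    for i in range(len(tokens)):
        for j in range(max(0, i - k), min(len(tokens), i + k + 1)):
            if j != i:
                out.add((tokens[j], tokens[i]))   # (context, target)
    return out

def pairs_center_as_context(tokens, k):
    # view 2: predict each individual surrounding word from the central word
    out = set()
    for i in range(len(tokens)):
        for j in range(max(0, i - k), min(len(tokens), i + k + 1)):
            if j != i:
                out.add((tokens[i], tokens[j]))   # (context, target)
    return out

# Same pairs either way; only the iteration order differs.
assert pairs_center_as_target(tokens, k) == pairs_center_as_context(tokens, k)
```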
Best Answer
Some basic concepts stay valid through the years :) and are used in many solutions, naturally contributing to the naming of those solutions...
N-gram is the basic concept of a (sub)sequence of consecutive words taken out of a given sequence (e.g. a sentence).
k-skip-n-gram is a generalization where 'consecutive' is dropped. It is 'just' a subsequence of the original sequence, e.g. every other word of a sentence forms a 2-skip-n-gram.
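As a small illustration, here is a sketch of one common definition (where at most k words in total may be skipped inside the subsequence); note that exact definitions of 'k' vary a bit across papers, so treat this as an assumption:

```python
from itertools import combinations

def skip_ngrams(tokens, n, k):
    # all length-n subsequences that skip at most k words in total
    grams = set()
    for idxs in combinations(range(len(tokens)), n):
        skipped = idxs[-1] - idxs[0] - (n - 1)   # words skipped inside the span
        if skipped <= k:
            grams.add(tuple(tokens[i] for i in idxs))
    return grams

sent = "the quick brown fox".split()
print(skip_ngrams(sent, 2, 0))  # k=0 reduces to plain consecutive bigrams
print(skip_ngrams(sent, 2, 1))  # 1-skip-bigrams: may skip one word
```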
word2vec is a more complicated beast; the buzzword :) here is 'embeddings'. Here is the original paper: https://arxiv.org/pdf/1301.3781.pdf. It uses the concept of consecutive words with skipping, and so 'skip' and 'gram' made their way into the name of the algorithm. BTW there are two alternative architectures used by the word2vec solution: skip-gram and CBOW.