Solved – Skip-gram algorithm confusion

deep-learning, language-models, machine-learning, natural-language

As a newbie to NLP, I am (deeply) confused by the middle step in the following diagram explaining the skip-gram algorithm. The lecture where this diagram was presented can be found at:
https://www.youtube.com/watch?v=ERibwqs9p38 (many thanks to Stanford University and Prof. Manning for sharing the video)

Two questions I have struggled with:

  1. At the step where the center word vector is multiplied by the context word matrix to generate three u_o^T v_c vectors: given the same center word vector and the same context matrix, why does this produce three different vectors?

  2. The output of the softmax function is a list of probabilities, showing the likelihood of each word in the vocabulary being a context word of the given center word. Wouldn't one such vector be enough? Why does the diagram show three (different) such vectors, one per context word? (I try to make my confusion concrete in the sketch after this list.)
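To make my confusion concrete, here is a minimal NumPy sketch of how I understand the middle step (the names U and v_c and the sizes are my own assumptions, not from the lecture):

```python
import numpy as np

V, d = 10000, 300           # assumed vocabulary size and embedding dimension
U = np.random.randn(V, d)   # context ("output") word matrix, one row u_o per word
v_c = np.random.randn(d)    # embedding of the center word

scores = U @ v_c                       # u_o^T v_c for every word o, shape (V,)
probs = np.exp(scores - scores.max())  # softmax over the whole vocabulary...
probs /= probs.sum()                   # ...giving ONE probability vector, shape (V,)
```

As far as I can tell, the same U and the same v_c can only ever give the same scores vector, so I don't see where three different vectors could come from.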

[diagram from the lecture: the skip-gram architecture, with the center word vector feeding three separate softmax outputs]

I found a similar (though perhaps not identical) question at the following link, but it still didn't feel fully answered.

Best Answer

The skip-gram model produces one softmax output shared by all the context words: the score for each word o is u_o^T v_c, which does not depend on the position within the window, so the softmax over the vocabulary is computed once and yields a single probability vector. The drawing from the lecture is incorrect in showing three different output vectors.
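To see this concretely, here is a minimal sketch (sizes and indices are illustrative assumptions, not from the lecture): the probability assigned to a word depends only on u_o^T v_c, so every observed context word in the window simply indexes into the same distribution.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())  # subtract max for numerical stability
    return e / e.sum()

V, d = 10000, 300               # illustrative sizes
U = np.random.randn(V, d)       # context-word ("output") matrix
v_c = np.random.randn(d)        # center-word embedding

probs = softmax(U @ v_c)        # a single distribution over the vocabulary, shape (V,)

# Each observed context word in the window reads its probability off this
# same vector; the per-window loss is just the summed negative log-probs:
context_word_indices = [17, 4242, 9]   # hypothetical 3-word window
loss = -sum(np.log(probs[o]) for o in context_word_indices)
```

Drawing the output three times is at best a visual shorthand for the window having three context positions; the three probability vectors would be identical.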