An Application of the word2vec Model

Implementing the Odd One Out Algorithm

Shivam Verma
3 min read · Jul 21, 2022

Word2Vec is one of the most popular techniques for learning word embeddings using a shallow neural network. Tomas Mikolov developed it at Google in 2013. For the Odd One Out algorithm we will implement shortly, we will use the pre-trained Google News model, ‘GoogleNews-vectors-negative300.bin’, which can be downloaded from here. The model can be loaded with the gensim module using the following code:

The model contains 300-dimensional vectors for 3 million words and phrases.

((300,), (300,)) # printed result: both vectors are 300-dimensional.

To get a good idea of what word2vec is, you can refer to this article.

In this implementation, we will use KeyedVectors (from the gensim module) and the cosine similarity function (provided by sklearn). Import these two with the following code:

from gensim.models import KeyedVectors
from sklearn.metrics.pairwise import cosine_similarity
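As a quick sanity check of `cosine_similarity` before using it in the algorithm, here is a toy example (the 3-dimensional values below are made up purely for illustration; sklearn expects 2-D arrays, hence the nested brackets):

```python
import numpy as np
from sklearn.metrics.pairwise import cosine_similarity

# Toy 3-dimensional "embeddings" (assumed values, just for illustration)
a = np.array([[1.0, 2.0, 3.0]])
b = np.array([[2.0, 4.0, 6.0]])   # same direction as a
c = np.array([[-1.0, 0.0, 1.0]])  # points elsewhere

print(cosine_similarity(a, b))  # -> 1.0: parallel vectors
print(cosine_similarity(a, c))  # -> a value well below 1.0
```

Cosine similarity depends only on the angle between vectors, not their lengths, which is why `b` (twice `a`) still scores a perfect 1.0.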

Now, let’s talk about the cool application of word2vec I mentioned: an algorithm named Odd One Out. What do I mean by Odd One Out? Let’s take an example so you can understand better. Assume we have a list of five words: [“apple”, “mango”, “banana”, “red”, “papaya”]. If we have to tell which one of these five words is the odd one out, we would pick “red”: every other word is a fruit, while “red” is a color.
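The idea can be sketched end to end. The toy 3-dimensional embeddings below are assumed values standing in for the 300-dimensional Google News vectors; with the real model you would read each vector as `model[word]` instead. The algorithm averages the vectors of all the words and flags the word least similar (by cosine) to that average:

```python
import numpy as np
from sklearn.metrics.pairwise import cosine_similarity

# Toy embeddings standing in for the Google News vectors (assumed values,
# chosen so that "red" points away from the fruit words)
word_vectors = {
    "apple":  np.array([0.9, 0.8, 0.1]),
    "mango":  np.array([0.8, 0.9, 0.2]),
    "banana": np.array([0.9, 0.7, 0.2]),
    "red":    np.array([0.1, 0.2, 0.9]),
    "papaya": np.array([0.8, 0.8, 0.1]),
}

def odd_one_out(words, vectors):
    """Return the word least similar (by cosine) to the average of all words."""
    # The average vector captures the common "theme" of the list
    avg = np.mean([vectors[w] for w in words], axis=0).reshape(1, -1)
    odd_word, min_sim = None, 2.0  # cosine similarity is at most 1.0
    for w in words:
        sim = cosine_similarity(vectors[w].reshape(1, -1), avg)[0][0]
        if sim < min_sim:
            odd_word, min_sim = w, sim
    return odd_word

print(odd_one_out(["apple", "mango", "banana", "red", "papaya"], word_vectors))
# prints "red"
```

With the real pre-trained vectors, the same function should also single out “red” for this list, since the four fruit words cluster together in embedding space while “red” does not.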

