Overview of Word Embedding using Embeddings from Language Models (ELMo)

Last Updated : 16 Mar, 2021

What are word embeddings?

Word embeddings are representations of words as vectors. These vectors capture information about the words such that words lying close together in the vector space tend to have similar meanings. There are various methods for creating word embeddings, for example Word2Vec (with its Continuous Bag of Words (CBOW) and Skip-gram architectures), GloVe, and ELMo.
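The idea of "closeness" in the vector space can be sketched with cosine similarity. The snippet below is only an illustration: the three-dimensional vectors are hypothetical, hand-picked values, not the output of any real embedding model, and `cosine_similarity` is a helper name introduced here.

```python
import math

def cosine_similarity(u, v):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Hypothetical toy vectors, chosen by hand for illustration only.
toy_vectors = {
    "king":  [0.9, 0.8, 0.1],
    "queen": [0.85, 0.9, 0.15],
    "apple": [0.1, 0.2, 0.95],
}

sim_king_queen = cosine_similarity(toy_vectors["king"], toy_vectors["queen"])
sim_king_apple = cosine_similarity(toy_vectors["king"], toy_vectors["apple"])

# Related words should score higher than unrelated ones.
print(sim_king_queen > sim_king_apple)  # True
```

Real embeddings have hundreds of dimensions, but the comparison works the same way.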

Embeddings from Language Models (ELMo):

ELMo is a contextual word representation developed by researchers at the Allen Institute for AI and distributed with their AllenNLP library. ELMo word vectors are computed from a two-layer bidirectional language model (biLM). Each layer has a forward pass and a backward pass over the sentence.
Unlike GloVe and Word2Vec, ELMo computes the embedding for a word from the complete sentence containing that word. ELMo embeddings therefore capture the context in which the word is used, and the same word can receive different embeddings in different sentences.


For example:

I love to watch Television.
I am wearing a wrist watch.

In the first sentence, watch is used as a verb, while in the second sentence it is a noun. Words like this, whose meaning depends on the context, are called polysemous words. ELMo can handle this, whereas static embeddings such as GloVe or FastText assign the word the same vector in both sentences.
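The limitation of static embeddings can be sketched with a plain lookup table. This is a toy illustration, not how GloVe or FastText are implemented: `static_table`, `UNKNOWN`, and `static_embed` are hypothetical names, and the vector values are made up.

```python
# Hypothetical static embedding table: one fixed vector per word.
static_table = {
    "watch": [0.2, -0.1, 0.7],
}

# Placeholder vector for words missing from the toy table.
UNKNOWN = [0.0, 0.0, 0.0]

def static_embed(sentence, table):
    """Look up each whitespace-separated token; context is never consulted."""
    return [table.get(token.lower(), UNKNOWN) for token in sentence.split()]

first = static_embed("I love to watch TV", static_table)
second = static_embed("I am wearing a wrist watch", static_table)

# "watch" is token 3 in the first sentence and token 5 in the second.
# A static lookup returns the same vector for the verb and the noun.
print(first[3] == second[5])  # True
```

A contextual model such as ELMo would instead produce two different vectors at these positions.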

Implementation of word embeddings using ELMo:

The code below was tested on Google Colab. Run these commands in your terminal to install the necessary libraries before running the code.

pip install "tensorflow>=2.0.0"
pip install --upgrade tensorflow-hub

Code:

Python3

# Import the necessary libraries
import tensorflow_hub as hub
import tensorflow.compat.v1 as tf

# hub.Module is a TF1-style API, so run in graph mode
tf.disable_eager_execution()

# Load the pre-trained ELMo model
elmo = hub.Module("https://tfhub.dev/google/elmo/3", trainable=True)

# Compute contextual embeddings for every token of the two sentences;
# the "elmo" output has shape [batch_size, max_length, 1024]
embeddings = elmo(
    [
        "I love to watch TV",
        "I am wearing a wrist watch"
    ],
    signature="default",
    as_dict=True)["elmo"]

# Initialize the model variables inside a session
init = tf.global_variables_initializer()
sess = tf.Session()
sess.run(init)

# Print the embeddings for the word WATCH in the two sentences
# (token index 3 in the first sentence, 5 in the second)
print('Word embeddings for word WATCH in first sentence')
print(sess.run(embeddings[0][3]))
print('Word embeddings for word WATCH in second sentence')
print(sess.run(embeddings[1][5]))

Output:

Word embeddings for word WATCH in first sentence
[ 0.14079645 -0.15788531 -0.00950466 ...  0.4300597  -0.52887094
  0.06327899]
Word embeddings for word WATCH in second sentence
[-0.08213335  0.01050366 -0.01454147 ...  0.48705393 -0.54457957
  0.5262399 ]

Explanation: The output shows that the same word, WATCH, receives different embeddings when it is used in different contexts in the two sentences.
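The hard-coded indices 3 and 5 in the code above are the positions of "watch" within each sentence. As a minimal sketch, and assuming the module's default signature tokenizes by splitting on spaces, those positions could be computed rather than counted by hand. `token_index` is a hypothetical helper introduced here, not part of TensorFlow Hub.

```python
def token_index(sentence, word):
    """Return the position of `word` among the space-separated tokens.

    Raises ValueError if the word does not occur in the sentence.
    """
    tokens = [token.lower() for token in sentence.split(" ")]
    return tokens.index(word.lower())

sentences = [
    "I love to watch TV",
    "I am wearing a wrist watch",
]

# One position per sentence, usable as embeddings[i][positions[i]].
positions = [token_index(sentence, "watch") for sentence in sentences]
print(positions)  # [3, 5]
```

Note that this matches whole tokens only, so "wristwatch" written as one word would not be found when searching for "watch".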