Cosine Similarity using BERT

Question

Nicholas Ang on 30 Jun 2021

https://it.mathworks.com/matlabcentral/answers/868608-cosine-similarity-using-bert

Commented: Nicholas Ang on 30 Jun 2021

Accepted Answer: Divyam Gupta

I am using BERT to calculate similarities for question answering. I have encoded my question data using

data.Tokens = encode(mdl.Tokenizer,data.Questions)

which returns a cell array.

Next, I encoded new text to test its similarity with the already encoded questions in the database:

testTokens = encode(mdl.Tokenizer,text)

However, I am unable to call cosineSimilarity(data.Tokens,testTokens); I receive an error that says:

Input must be a matrix, a tokenizedDocument array, a bagOfWords model, a bagOfNgrams model, a string array of words, or a cell array of character vectors.

Do I need to pad or reshape my cell vectors?

Answer 1

Divyam Gupta on 30 Jun 2021

https://it.mathworks.com/matlabcentral/answers/868608-cosine-similarity-using-bert#answer_736543

Hi Nicholas, I see that you are running into an issue while computing the cosine similarity from the output of a text encoder. According to the documentation at https://www.mathworks.com/help/textanalytics/ref/cosinesimilarity.html#d123e8335, the cosineSimilarity function takes a matrix to compute the similarity between documents.

Since the encoded vectors for the questions have different sizes, constructing a matrix might be difficult. Instead, you can do a pairwise comparison between data.Tokens and testTokens to compute the similarities. This can be achieved by running a nested loop and storing the similarity score for each pair as you go.
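The pairwise comparison described above could be sketched roughly as follows. This is a minimal sketch in base MATLAB, not the code that was actually run in this thread. The token vectors are made-up stand-ins for the output of encode, and the names storedTokens, queryTokens, padTo, scores and bestMatch are hypothetical. Because cosine similarity is only defined for vectors of equal length, each pair is zero-padded to a common length first; that padding step is an assumption the answer does not spell out.

```matlab
% Hypothetical stand-ins for data.Tokens and testTokens: cell arrays of
% numeric row vectors of different lengths (made-up token codes).
storedTokens = {[102 2055 2004 103], [102 2129 2079 1045 103], [102 2055 103]};
queryTokens  = {[102 2055 2004 103], [102 3000 103]};

% Zero-pad a row vector on the right to length n (assumption: pad with 0).
padTo = @(v, n) [v, zeros(1, n - numel(v))];

% scores(i,j) will hold the similarity between stored item i and query j.
scores = zeros(numel(storedTokens), numel(queryTokens));
for i = 1:numel(storedTokens)
    for j = 1:numel(queryTokens)
        % Bring both vectors of this pair to the same length.
        n = max(numel(storedTokens{i}), numel(queryTokens{j}));
        a = padTo(double(storedTokens{i}), n);
        b = padTo(double(queryTokens{j}), n);
        % Cosine similarity: dot product divided by the product of the norms.
        scores(i, j) = dot(a, b) / (norm(a) * norm(b));
    end
end

% For each query (column), the row index of the most similar stored item.
[~, bestMatch] = max(scores, [], 1);
```

With this layout, each column of scores corresponds to one new text and each row to one stored question, so bestMatch(j) picks the closest stored question for query j.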

Hope this helps.

1 Comment

Nicholas Ang on 30 Jun 2021

Thank you! This worked!
