How to draw a co-occurrence network using only nouns in MATLAB Text Analytics Toolbox?

Hello,
I am having trouble with a text analytics task in MATLAB.
I want to 1) draw a co-occurrence network diagram using only the 100 most frequent nouns, and 2) draw a frequency table or bar plot of the most frequent nouns.
My code is below. I have added the part-of-speech (POS) details, but I can't work out how to proceed from there.
Thanks in advance!
T = readtable('D:/OneDrive/evpostridereview.csv');
t.desc = T.review;
cleanedDocuments = tokenizedDocument(t.desc); % crashed once, worked on the second attempt
cleanedDocuments = addPartOfSpeechDetails(cleanedDocuments);
% Remove a list of stop words then lemmatize the words. To improve
% lemmatization, first use addPartOfSpeechDetails.
cleanedDocuments = removeStopWords(cleanedDocuments); % ran successfully
stopwords =["전기차","하이브리드","현대","기아","아이오닉","쏘나타","카렌스","sm5","소나타","아이오","테슬라","를","의","이","중고차","휴게소","자동차"];
cleanedDocuments = removeWords(cleanedDocuments,stopwords);
cleanedDocuments = normalizeWords(cleanedDocuments,'Style','lemma'); % crashed once, then worked
% Erase punctuation.
cleanedDocuments = erasePunctuation(cleanedDocuments); % worked on the first try
% Remove words with 2 or fewer characters, and words with 15 or more
% characters.
cleanedDocuments = removeShortWords(cleanedDocuments,2); % worked on the first try
cleanedDocuments = removeLongWords(cleanedDocuments,15);
tdetails = tokenDetails(cleanedDocuments);
head(tdetails)
% Extract nouns
nouns = tdetails.Token(tdetails.PartOfSpeech=='noun');
% Wordcloud for nouns
figure
wordcloud(nouns)
title("EVPost 전기차 주행기 워드 클라우드") % "EVPost EV driving review word cloud"
% Co-Occurrence Network
bag = bagOfWords(cleanedDocuments);
counts = bag.Counts;
cooccurence=counts.'*counts;
figure
G = graph(cooccurence,bag.Vocabulary,'omitselfloops');
LWidths = 5*G.Edges.Weight/max(G.Edges.Weight);
plot(G,'LineWidth',LWidths)
title("Co-occurrence Network")
% Center Keyword Setting
word = "디자인" % "design"
idx = find(bag.Vocabulary == word);
nbrs = neighbors(G,idx);
bag.Vocabulary(nbrs)'
H = subgraph(G,[idx; nbrs]);
LWidths = 5*H.Edges.Weight/max(H.Edges.Weight);
plot(H,'LineWidth',LWidths)
title("Co-occurrence Network - Word: """ + word + """");
  2 Comments
Piyush Dubey on 26 Jun 2023
The code looks algorithmically correct. Can you elaborate on the issue you are facing when creating the co-occurrence network?
상원 음 on 27 Jun 2023
Dear Piyush, as you mentioned, this code does produce a co-occurrence network. However, I want to draw the co-occurrence network using only nouns, restricted to the 100 most frequent nouns.


Answers (1)

Saksham on 18 Aug 2023
Hi 상원 음,
I understand that you already have code for a co-occurrence network and want to build the network from only the 100 most frequent nouns.
I also see that the code already extracts the nouns into the variable "nouns". After the comment "% Co-Occurrence Network", please pass the variable "nouns" to the bagOfWords function.
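One caveat on this step: the flat "nouns" vector no longer records which review each noun came from, and co-occurrence needs that per-document grouping. A minimal sketch of one possible alternative follows, assuming the table returned by tokenDetails contains the Token, DocumentNumber and PartOfSpeech variables. The toy table and the names nounRows, nounVocab, nounCounts and nounCooccurrence are hypothetical and only stand in for the real data.

```matlab
% Toy stand-in for: tdetails = tokenDetails(cleanedDocuments);
% Assumption: the real table has Token, DocumentNumber and PartOfSpeech.
Token          = ["design"; "battery"; "drive"; "design"; "range"; "battery"; "range"; "charge"];
DocumentNumber = [1; 1; 1; 2; 2; 3; 3; 3];
PartOfSpeech   = categorical(["noun"; "noun"; "verb"; "noun"; "noun"; "noun"; "noun"; "verb"]);
tdetails = table(Token, DocumentNumber, PartOfSpeech);

% Keep noun tokens only, remembering which document each one came from.
nounRows = tdetails(tdetails.PartOfSpeech == 'noun', :);

% Map every noun to a column index in a noun-only vocabulary.
[nounVocab, ~, col] = unique(nounRows.Token);

% Document-by-noun count matrix; sparse() sums repeated (row, column) pairs.
numDocs = max(tdetails.DocumentNumber);   % with real data: numel(cleanedDocuments)
nounCounts = sparse(nounRows.DocumentNumber, col, 1, numDocs, numel(nounVocab));

% Same co-occurrence construction as in the question, restricted to nouns.
nounCooccurrence = full(nounCounts.' * nounCounts);
G = graph(nounCooccurrence, cellstr(nounVocab), 'omitselfloops');
LWidths = 5 * G.Edges.Weight / max(G.Edges.Weight);
figure
plot(G, 'LineWidth', LWidths)
title("Noun co-occurrence network")
```

Building the count matrix directly from the token table avoids counting non-noun uses of a word that also appears as a noun, which selecting columns of the full bag-of-words by vocabulary would include.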
To find the 100 most frequent nouns, you could compute the frequency of each word and then filter the words accordingly. To learn more about counting word frequency, please follow the link below:
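As a rough illustration of the counting step, here is a sketch in base MATLAB. The toy "nouns" vector stands in for the one extracted in the question, and topN, freq and freqTable are hypothetical names chosen for this example.

```matlab
% Toy stand-in for: nouns = tdetails.Token(tdetails.PartOfSpeech == 'noun');
nouns = ["design"; "battery"; "design"; "range"; "battery"; "design"; "charger"];

% Count how often each distinct noun occurs.
[uniqueNouns, ~, idx] = unique(nouns);
freq = accumarray(idx, 1);

% Sort by descending frequency and keep at most topN nouns.
topN = 3;                                  % use 100 for the real data
[freq, order] = sort(freq, 'descend');
k = min(topN, numel(uniqueNouns));
freqTable = table(uniqueNouns(order(1:k)), freq(1:k), ...
    'VariableNames', {'Noun', 'Count'});
disp(freqTable)

% Bar plot of the same counts.
figure
bar(freqTable.Count)
xticks(1:k)
xticklabels(freqTable.Noun)
ylabel("Occurrences")
title("Most frequent nouns")
```

The list in freqTable.Noun can then be used with ismember to keep only those nouns before the co-occurrence matrix is built. If a bagOfWords model of the nouns is available, the toolbox function topkwords(bag,k) should give a similar table.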
I hope this suggestion and resource are useful to you.
Sincerely,
Saksham

Release

R2022b
