How to make a list of users' reputations? :)


3 Comments

Well, how did you do it?
No10 :), I found this on ImageShack ;)
At a glance, it's missing a few well-known significant contributors. And since the MATLAB newsgroup (unlike the FEX) doesn't allow comment rating, it's hard to say how one would go about this process.


Accepted Answer

S = urlread('http://www.mathworks.com/matlabcentral/answers/contributors/2710900'); % 2710900 is your user number.
I = findstr(S,'<div class="value">');
regexp(S(I+19:I+22),'\d+','match')
For the name:
I = findstr(S,'<h1 class="fn">');
regexp(S(I:I+40),'(?<=\">)\w+','match')
So you need to get the user numbers and loop through them. The only issue is that the user numbers don't appear to be consecutive, which would mean wrapping urlread in try/catch for numbers 1 through ??.
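As a minimal sketch of the extraction step, the snippet below runs the same kind of pattern matching on a hard-coded HTML fragment instead of a live page. The fragment and the user name in it are made up for illustration, modeled only on the two markers used above (`<h1 class="fn">` and `<div class="value">`); the real page markup may differ. It uses regexp tokens rather than a fixed-width window, so the name and number can be of any length.

```matlab
% Hypothetical fragment imitating a contributor page (not real page source).
S = ['<h1 class="fn">Jane Doe</h1>' ...
     '<div class="value">1234</div>'];

% Each token captures the text between the opening tag and the next '<'
% (name) or the run of digits after the tag (reputation).
nameTok = regexp(S, '<h1 class="fn">([^<]+)<', 'tokens', 'once');
repTok  = regexp(S, '<div class="value">\s*(\d+)', 'tokens', 'once');

userName = strtrim(nameTok{1});   % display name as a char vector
userRep  = str2double(repTok{1}); % reputation as a number
fprintf('%s: %d\n', userName, userRep);
```

On a real page the same two regexp calls would be applied to the string returned by urlread, assuming the markers are still present.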
EDIT
I ran the code below for about 8 minutes and got only 3 users:
JoshIsCool - number 1, Patrick - number 2, Rhonda - number 385
All have reputation 0. The code was on number 589 when I hit Ctrl+C. So this method would need an overnight run the first time; after that, reusing the stored user numbers would make it much faster!
cnt = 1;
for ii = 1:3000000
    try
        S = urlread(['http://www.mathworks.com/matlabcentral/answers',...
            '/contributors/',sprintf('%i',ii)]);
    catch
        continue % No page for this number; try the next one.
    end
    I = findstr(S,'<h1 class="fn">');
    N = regexp(S(I:I+40),'(?<=\">)\w+','match') % dump to command...
    if ~isempty(N)
        NM{cnt} = N; % Growing is nothing compared to urlread!
        I = findstr(S,'<div class="value">');
        R = regexp(S(I+19:I+22),'\d+','match');
        NUM{cnt} = ii; % Store the user number to make this easier next time!
        REP{cnt} = R;
        cnt = cnt + 1;
    end
end
EDIT2
O.k., so here is what I ended up doing. I went to Google and ran this search:
"inurl:matlabcentral answers contributors" site:mathworks.com
With 100 results per page, there were only four pages, so I saved them to disk. From there I ran the code below:
Uold = 'gggggg';
cnt = 1;
for ii = 1:4
    % This is the saved file.
    s = urlread(['file:///', 'C:\Users\matt fig\Documents\search',sprintf('%i',ii),'.htm']);
    % Read the links to the pages.
    I = findstr(s,'www.mathworks.com/matlabcentral/answers/contributors/');
    for jj = 1:length(I)
        % Find the specific jjth page.
        U = regexp(s(I(jj):I(jj)+90),'www.+?(?=["|+])','match');
        if strcmp(U,Uold)
            continue
        end
        s2 = urlread(['http://',U{1}]);
        I2 = findstr(s2,'<h1 class="fn">'); % Looking for the name.
        N = regexp(s2(I2:I2+40),'(?<=\">)\w+\s*\w*','match'); % The name
        if ~isempty(N)
            disp(N) % Display name.
            NM{cnt} = N{1}; % Store the name
            I2 = findstr(s2,'<div class="value">'); % Looking for the reputation.
            R = regexp(s2(I2+19:I2+22),'\d+','match');
            REP{cnt} = R{1}; % Store the reputation
            cnt = cnt + 1;
        end
        Uold = U;
    end
end
[NM,JJ] = unique(NM);
REP = REP(JJ);
REP = cellfun(@str2double,REP);
[REP,G] = sort(REP,'descend');
NM = NM(G);
fid = fopen('answersnames.txt','w+');
for ii = 1:length(NM)
    fprintf(fid,'%s %i\n',NM{ii},REP(ii));
end
fclose(fid);
This printed everything to a nice text file in about 5 minutes. Now if anyone knows how to drive Google programmatically, this would work on autopilot.

11 Comments

How to get the user IDs?
Again, I don't know. That is why I said you would have to loop through the numbers. There may be another way, but I don't know it.
If you browse through the contributors to questions with a *meta* tag, you'll probably get most of the leaders.
sprintf('http://www.mathworks.com/matlabcentral/answers/?dir=asc&sort=asked&page=%d', ii)
This will iterate through the questions, giving you one page (50) at a time. Parse each page to get the question URLs. Then iterate over the question URLs, pulling the pages and parsing them for answers/contributors/ references, each of which will have an associated reputation text popup. You might get different values for the same contributor, reflecting the fact that their reputation might have changed during the run.
The one potential flaw is in hidden old comments. I haven't looked at whether that is handled at the server side or not.
I never said it would be fast, but it should be much much faster than iterating over the 700000 or so user numbers.
Oh yes, the order the questions will be fetched in for the above is oldest question first.
Good idea Walter. I will change my code.
I suggest creating a thread where everyone posts just once, so that you could simply query that thread and retrieve the reputation scores, without looping through all the pages.
With time it will be sufficiently populated to get a 'full' list.
I suggest using
N = regexp(S(I:I+40),'(?<=\">)\w+\s\w+','match')
to get the full name.
Adding #comments-toggle to the URL used to retrieve an individual answer appears to expand comments. Strangely, though, the same URL collapses them again, so there must be some JavaScript going on.
sprintf('http://www.google.com/search?q=site:http://www.mathworks.com/matlabcentral/answers/contributors/&num=100&start=%d',ii)
where ii is 100, 200, 300, etc.
I don't know if this would show the 100th on both the first page (bottom) and the second page (top), but there are only about 6 pages worth to pull out.
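A rough sketch of how the paging and the possible overlap between result pages could be handled: build one URL per page, then collect the contributor numbers from every page and pass them through unique so that a repeated entry is counted only once. The page count of 6, the start value of 0 for the first page, and the two page strings below are assumptions for illustration only; no request is actually sent.

```matlab
% Build the result-page URLs (start = 0 for the first page is an assumption).
fmt = ['http://www.google.com/search?q=site:http://www.mathworks.com/' ...
       'matlabcentral/answers/contributors/&num=100&start=%d'];
nPages = 6; % "about 6 pages", per the comment above
urls = arrayfun(@(k) sprintf(fmt, 100*k), 0:nPages-1, 'UniformOutput', false);

% Hypothetical page contents; contributor 222 shows up on both pages.
pages = {'x/answers/contributors/111-a y/answers/contributors/222-b', ...
         'x/answers/contributors/222-b y/answers/contributors/333-c'};

ids = {};
for k = 1:numel(pages)
    tok = regexp(pages{k}, 'answers/contributors/(\d+)', 'tokens');
    ids = [ids, [tok{:}]]; %#ok<AGROW> % flatten nested token cells
end
ids = unique(ids); % drops the duplicate, returns sorted strings
disp(ids)
```

In a real run, each entry of pages would be the text returned by urlread for the corresponding entry of urls.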
Linux console command:
wget -q -O - http://www.mathworks.com/matlabcentral/answers/?page=1 | grep Reputation
This gives interesting lines such as:
by <a href="/matlabcentral/answers/contributors/2721858-komala" title="Reputation: 1">komala </a>
Each line contains a user name and his/her reputation score.
So, to get a (redundant) list of users, you could browse all pages (from 1 to ...) of /matlabcentral/answers
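For illustration, the sample line quoted above can be taken apart with a single regexp call. This is only a sketch on that one line; the pattern assumes the href and title attributes always appear in this order, which may not hold for every page. Note that the link also carries the numeric user ID, so this answers the earlier question of how to get the IDs.

```matlab
% The sample line from the wget | grep output above.
htmlLine = ['by <a href="/matlabcentral/answers/contributors/2721858-komala" ' ...
            'title="Reputation: 1">komala </a>'];

% Tokens: (1) numeric user ID, (2) reputation, (3) display name.
pat = 'contributors/(\d+)[^"]*" title="Reputation: (\d+)">([^<]+)<';
tok = regexp(htmlLine, pat, 'tokens', 'once');

userId   = str2double(tok{1});
userRep  = str2double(tok{2});
userName = strtrim(tok{3}); % remove the trailing blank before </a>
fprintf('%d %s %d\n', userId, userName, userRep);
```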


More Answers (3)

Greg Bacon on 24 Feb 2011

9 votes

A new Contributor page was added to MATLAB Answers today. The Contributor page lists user reputation and is sortable along a few different axes.
You can get to the new page by clicking on the new Contributor link in the left nav. The URL is http://www.mathworks.com/matlabcentral/answers/contributors
Here's yet another variation that uses Walter's idea of going through the pages of questions, fetching the question links, then fetching user Reputation data from each question page. It's fully automated and takes about 3 minutes to run on my machine.
NOTE: This code will only find users who have posted at least one question, answer, or comment. Users who have an Answers account but haven't posted anything (like this guy or this guy) will not show up in the ranking list, but they should have 0 Rep anyway so it doesn't really matter. ;)
function [userData,nQuestions] = answers_rankings
    % Initializations:
    userData = cell(0,2);
    nQuestions = 0;
    pagesLeft = true;
    iPage = 1;
    % Loop over question pages:
    while pagesLeft
        nextPage = ['http://www.mathworks.com/matlabcentral/answers/' ...
            '?dir=asc&sort=asked&page=' int2str(iPage)];
        [pageText,pageFound] = urlread(nextPage);
        questionLinks = regexp(pageText,['href="(/matlabcentral/answers/' ...
            '\d+[^"]+)"'],'tokens');
        if pageFound && ~isempty(questionLinks)
            questionLinks = strcat('http://www.mathworks.com',...
                vertcat(questionLinks{:}));
            for iQuestion = 1:numel(questionLinks)
                [pageText,pageFound] = urlread(questionLinks{iQuestion});
                if pageFound
                    nQuestions = nQuestions+1;
                    data = regexp(pageText,'title="Reputation: (\d+)">([^<]+)<',...
                        'tokens');
                    userData = [userData; vertcat(data{:})]; %#ok<AGROW>
                end
            end
            iPage = iPage+1;
        else
            pagesLeft = false;
        end
    end
    updateTime = now;
    % Format the user data:
    userReps = cellfun(@str2double,userData(:,1)); % Convert Rep to integer
    [userNames,~,index] = unique(userData(:,2)); % Find unique user names
    userReps = accumarray(index,userReps,[],@max); % Take the max Rep found
    [userReps,sortIndex] = sort(userReps,'descend'); % Sort by Rep
    userNames = userNames(sortIndex);
    userData = [userNames num2cell(userReps)].';
    % Display the results:
    maxLength = max([9; cellfun('prodofsize',userNames)]);
    fprintf('\nMATLAB Answers user rankings as of %s:\n\n',...
        datestr(updateTime));
    fprintf('%*s: %s\n',maxLength,'User name','Reputation');
    fprintf('%s\n',repmat('-',1,maxLength+13));
    fprintf(['%' int2str(maxLength) 's: %4d\n'],userData{:});
end
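To see what the formatting step in the function above does when the same user is found several times with different reputation values, here is a small sketch on made-up data. The names and numbers are hypothetical; only the unique/accumarray/sort sequence mirrors the function.

```matlab
% Hypothetical scraped rows: {reputation string, user name}; 'alice' twice.
userData = {'10','alice'; '3','bob'; '12','alice'};

userReps = cellfun(@str2double, userData(:,1));   % [10; 3; 12]
[userNames,~,index] = unique(userData(:,2));      % names, and a group index per row
userReps = accumarray(index, userReps, [], @max); % keep the highest value per user
[userReps,sortIndex] = sort(userReps, 'descend'); % rank by reputation
userNames = userNames(sortIndex);

for k = 1:numel(userNames)
    fprintf('%s: %d\n', userNames{k}, userReps(k));
end
```

Taking the maximum is a design choice that assumes reputation only grows during the run, so the largest value seen is the most recent one.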
For Linux users (because system() is used to call wget and grep):
str1='Reputation: ';
str2='">';
ANSWERS=cell(10000,2);
ANSWERS{1,1}=''; ANSWERS{1,2}=0;
lastuser=1;
for f=1:20
    fs=num2str(f);
    urlstr=['http://www.mathworks.com/matlabcentral/answers/?page=' fs];
    commandstr=['wget -q -O - ' urlstr ' | grep Reputation:'];
    [sta res]=system(commandstr);
    linie=textscan(res, '%s','delimiter','\n');
    for ff=1:numel(linie{1})
        linia=linie{1}{ff};
        i1=findstr(str1, linia);
        i2=findstr(str2, linia);
        REPUT=str2double(linia( (i1+12):(i2-1)));
        NICK=linia((i2+2):(end-4));
        disp([NICK]);
        ni=0; % Set to 1 if this user is already in the list.
        for g=1:lastuser
            if strcmp(ANSWERS{g,1}, NICK)
                ANSWERS{g,2}=REPUT;
                ni=1;
            end
        end
        if ~ni
            lastuser=lastuser+1;
            ANSWERS{lastuser,1}=NICK;
            ANSWERS{lastuser,2}=REPUT;
        end
    end
end
ANSWERS=ANSWERS(2:lastuser,:);
rep=cell2mat(ANSWERS(:,2));
[v i]=sort(rep,'descend');
t=uitable('Data',ANSWERS(i,:),'Units','normalized', ...
    'Position',[0.1 0.1 0.85 0.85]);
set(t,'Columnwidth',{150,80},'ColumnName',{'User','Reputation'})
