Well, how did you do it?
MATLAB: How to make a list of users' reputation? :)
Related Solutions
EDIT @ 4:30pm EST: replaced strfind with regexp and a negative lookbehind, to avoid matching "nbsp;".
Here is a simple crawler. It is not what I originally had in mind, which was a mechanism at the MathWorks level rather than at the user (one of us) level. I implemented a few criteria that differ from those listed above, because the crawler has to work with content that has already been parsed and "preformatted" by the forum.
The implemented criteria should be improved. In particular, the function call/definition detection is too simple and generates false positives when users write function names followed by parentheses in normal text.
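To make the false-positive problem concrete, here is a minimal sketch that applies the same kind of indicator (a word character directly followed by an opening parenthesis) to ordinary prose. The sample sentence is made up for illustration.

```matlab
% Plain prose that merely mentions a function by name (made-up sample).
prose = 'You can use plot(x) and then call hold on.' ;

% Indicator: a word character directly followed by "(".
hit = regexp(prose, '\w\(', 'match', 'once') ;

if ~isempty(hit)
    fprintf('Flagged as code because of "%s" (false positive).\n', hit) ;
end
```

Any post that names a function this way in running text would be reported, which is why this criterion should be read as a hint rather than a verdict.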
Anyhow, this is just a simple demo.
The whole code below (both functions) should be saved in forumCrawler.m, and you can set pageDepth to control how many forum pages you want to process.
----------------------------------------------------------------------------------------------------------------
function forumCrawler
    pageDepth = 1 ;
    baseURL   = 'http://www.mathworks.com' ;
    for pageId = 1 : pageDepth
        fprintf('\n=== Processing page %d..\n', pageId) ;
        url = sprintf('%s/matlabcentral/answers/?page=%d', baseURL, pageId) ;
        % Relative URLs of the threads listed on this page.
        thread = regexp(urlread(url), '(?<=<h3><a href=").*?(?=")', 'match') ;
        nThread = length(thread) ;
        for tId = 1 : nThread
            fprintf(' - Analyzing thread %d/%d..\n', tId, nThread) ;
            url = sprintf('%s%s', baseURL, thread{tId}) ;
            htmlBuffer = urlread(url) ;
            % - Scan question.
            question = regexp(htmlBuffer, ...
                '(?<=class="question-body ).*?(?=</div>)', 'match') ;
            [tf, msg] = isLikelyUnformatted(question{1}) ;
            if tf
                fprintf(' [<a href="%s">question</a>] %s.\n', url, msg) ;
            end
            % - Scan answers.
            answer = regexp(htmlBuffer, ...
                '<div id="([^"]+)" class="answer-body">(.*?)</div>', 'tokens') ;
            for cId = 1 : length(answer)
                [tf, msg] = isLikelyUnformatted(answer{cId}{2}) ;
                if tf
                    answerUrl = sprintf('%s#%s', url, answer{cId}{1}) ;
                    fprintf(' [<%s answer> ] %s.\n', ...
                        answerUrl, msg) ;
                end
            end
            % - Scan comments.
            comment = regexp(htmlBuffer, ...
                '<div id="([^"]+)" class="comment-body">(.*?)</div>', 'tokens') ;
            for cId = 1 : length(comment)
                [tf, msg] = isLikelyUnformatted(comment{cId}{2}) ;
                if tf
                    commentUrl = sprintf('%s#%s', url, comment{cId}{1}) ;
                    fprintf(' [<%s comment> ] %s.\n', ...
                        commentUrl, msg) ;
                end
            end
        end
    end
end

function [tf, msg] = isLikelyUnformatted(content)
    tf = true ;
    % Eliminate content within <pre>..</pre> and <tt>..</tt> tags,
    % so we work on what is meant to be text.
    buffer  = regexp(content, '<pre.*?</pre>', 'split') ;
    content = [buffer{:}] ;
    buffer  = regexp(content, '<tt.*?</tt>', 'split') ;
    content = [buffer{:}] ;
    % Check for a few indicators.
    if ~isempty(regexp(content, '\w:\w', 'ONCE'))
        msg = 'range def. found' ;
        return ;
    end
    if ~isempty(regexp(content, '\w\(', 'ONCE'))
        msg = 'function call(s)/def(s) found' ;
        return ;
    end
    if ~isempty(regexp(content, '(?<!nbsp);</p>', 'ONCE'))
        msg = '";</p>" found' ;
        return ;
    end
    tf = false ;
    msg = '' ;
end
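The last check relies on a negative lookbehind so that a semicolon belonging to an "nbsp;" entity at the end of a paragraph is not mistaken for a statement terminator. The short sketch below contrasts it with a plain strfind search; both sample strings are made up for illustration.

```matlab
pattern = '(?<!nbsp);</p>' ;

% Made-up samples: a code-like line and a paragraph ending with an entity.
codeLike  = '<p>x = 3 ;</p>' ;          % ";" terminates a statement
entityEnd = '<p>some text&nbsp;</p>' ;  % ";" belongs to the nbsp; entity

isFlaggedCode    = ~isempty(regexp(codeLike,  pattern, 'once')) ;
isFlaggedEntity  = ~isempty(regexp(entityEnd, pattern, 'once')) ;

% A plain substring search cannot tell the two cases apart.
flaggedByStrfind = ~isempty(strfind(entityEnd, ';</p>')) ;

fprintf('regexp:  code=%d, entity=%d\n', isFlaggedCode, isFlaggedEntity) ;
fprintf('strfind: entity=%d\n', flaggedByStrfind) ;
```

With the lookbehind, only the code-like sample is reported, whereas strfind would also report the paragraph that simply ends with the entity.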
I found the instruction in a comment by Walter to this answer in the "wish list" asking for the ability to add a link. I got there by re-reading this answer in "how to write a good answer". Now the question is what to do with the information now that I have found it again. Maybe it belongs in "how to write a good answer" or "how to format your question".
For now I will repeat Walter's instructions here:
I have figured out how to use the existing interface to get the tags to a particular answer.
Hover over the "add a comment" link for the answer you wish to construct a direct URL to. You will see a link such as,
Pull out the number before "comments" and construct the anchor as #answer_ followed by the number, such as #answer_1459. That is the relative anchor, to which you would prefix the page URL, such as this page's URL
The net result would be
This URL will not change when the display algorithm reorders the answers due to voting or however it determines what should go first on the page.
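The steps above can be sketched in a few lines of MATLAB. The comment link and page URL below are hypothetical placeholders, since the real ones are not shown here. The sketch only assumes that the answer number is the last run of digits before the word "comments" in the link.

```matlab
% Hypothetical values, for illustration only.
commentLink = 'http://www.example.com/answers/1459/comments/new' ;
pageUrl     = 'http://www.example.com/questions/some-question' ;

% Capture the run of digits that precedes "comments"
% (any non-digit separator is allowed in between).
tok = regexp(commentLink, '(\d+)\D*comments', 'tokens', 'once') ;

if isempty(tok)
    error('No answer number found before "comments" in the link.') ;
end

anchor    = sprintf('#answer_%s', tok{1}) ;  % relative anchor
directUrl = [pageUrl, anchor] ;              % page URL + anchor
fprintf('%s\n', directUrl) ;
```

If the actual link uses a different layout, only the pattern passed to regexp would need adjusting.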
@Walter if you want the rep, feel free to add the answer and I will accept yours.
Best Answer