Showing posts with label history flow. Show all posts

Wikipedia History Flow tool now in GitHub

Inspired by a comment on my post Visualising edit history of a Wikipedia page, the code I use to make history flow diagrams like the one below is now in GitHub at https://github.com/rdmpage/wikihistoryflow.

Historyflow

There is also a live version at http://iphylo.org/~rpage/wikihistoryflow. If you enter the name of a Wikipedia page the tool will display the edit history with columns representing page versions and individual contributors (people and bots) distinguished by different colours.

This tool will fall over for pages with a lengthy history of edits, and requires a web browser that can support SVG, but it's a fun visualisation, and may inspire someone to do this properly.
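For anyone wanting to experiment, the raw material for a history flow diagram is a page's list of revisions. The sketch below builds a request URL for the public MediaWiki API; whether the tool itself fetches revisions this way is an assumption, and the function name `revisions_url` is made up for illustration. Capping `rvlimit` is one simple way to avoid being overwhelmed by pages with a long edit history.

```python
from urllib.parse import urlencode

API_ENDPOINT = "https://en.wikipedia.org/w/api.php"

def revisions_url(title, limit=50):
    """Build a MediaWiki API URL asking for the oldest `limit` revisions of a page.

    Hypothetical helper: it only builds the URL, it does not send the request.
    """
    params = {
        "action": "query",
        "prop": "revisions",
        "titles": title,
        "rvprop": "ids|timestamp|user|content",  # who edited, when, and the page source
        "rvslots": "main",
        "rvlimit": limit,      # cap the number of revisions requested
        "rvdir": "newer",      # oldest revision first
        "format": "json",
    }
    return API_ENDPOINT + "?" + urlencode(params)

print(revisions_url("Afrotheria"))
```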

When taxonomists wage war in Wikipedia

Stumbled across Alex Wild's post Pyramica vs Strumigenys: why does it matter?, which takes as its starting point a minor edit war on the Wikipedia page for Pyramica.

Alex gives the background to the argument about whether Pyramica is a synonym of Strumigenys, and investigates the issue using the surprisingly small amount of data available in GenBank. The tree he found (shown below) suggests this issue will require some work to resolve:

phylogeny1.jpg


For fun I constructed a history flow diagram for the edits to the Pyramica page in Wikipedia:

5.png


The diagram shows the two occasions when the page has been stripped of content (and subsequently restored) as contributors dispute whether Pyramica is a synonym of Strumigenys. It would be useful to have one or more metrics of how controversial a page (and/or a contributor) is, both to identify controversial pages and to see how controversial taxonomic pages are compared to other Wikipedia topics. The paper On Ranking Controversies in Wikipedia: Models and Evaluation by Ba-Quy Vuong et al. (doi:10.1145/1341531.1341556) would be a good place to start (a video of the presentation of this paper is available here).
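As a very crude first stab at such a metric, one could count how often a page is returned to a state it has been in before, which is what happens when stripped content is restored. The sketch below is an assumption for illustration only: it is not one of the models from the Vuong et al. paper, and the function names are invented here.

```python
import hashlib

def count_reverts(revision_texts):
    """Count revisions whose text exactly matches an earlier revision.

    A revision identical to the one immediately before it is treated as a
    null edit, not a revert.
    """
    seen = set()
    previous = None
    reverts = 0
    for text in revision_texts:
        digest = hashlib.sha1(text.encode("utf-8")).hexdigest()
        if digest in seen and digest != previous:
            reverts += 1
        seen.add(digest)
        previous = digest
    return reverts

def revert_ratio(revision_texts):
    """Fraction of revisions that are reverts (0.0 for an empty history)."""
    if not revision_texts:
        return 0.0
    return count_reverts(revision_texts) / len(revision_texts)

# Toy history: content is removed and then restored once.
history = ["full article", "", "full article", "full article plus more"]
print(count_reverts(history), revert_ratio(history))
```

Exact matching misses partial reverts, so this would at best be a lower bound on how contested a page is.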

Visualising edit history of a Wikipedia page

Quick post (really should be doing something else). Reading Jeff Atwood's post Mixing Oil and Water: Authorship in a Wiki World led me to IBM's wonderful history flow tool to visualise the edit history of a Wikipedia page.

Imagine a scenario where three people will make contributions to a Wiki page at different points in time. Each person edits the page and then saves their changes to what becomes the latest version of that page.

history-flow-animation.gif

History Flow connects text that has been kept the same between consecutive versions. Pieces of text that do not have correspondence in the next (or previous) version are not connected and the user sees a resulting "gap" in the visualization; this happens for deletions and insertions. (animated GIF from Jeff Atwood's post).
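The matching step described above can be sketched with Python's standard `difflib`. This is a minimal illustration under the assumption that a page is compared line by line; it is not IBM's algorithm, and `match_lines` is a name made up for this example.

```python
from difflib import SequenceMatcher

def match_lines(old_lines, new_lines):
    """Compare two consecutive revisions line by line.

    Returns (kept, deleted, inserted):
      kept     -- (old_index, new_index) pairs for lines present in both
      deleted  -- indices in old_lines with no counterpart in new_lines
      inserted -- indices in new_lines with no counterpart in old_lines
    """
    matcher = SequenceMatcher(a=old_lines, b=new_lines, autojunk=False)
    kept, deleted, inserted = [], [], []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            kept.extend(zip(range(i1, i2), range(j1, j2)))
        else:
            # "delete", "insert" and "replace" all leave unmatched lines,
            # which show up as gaps in the visualisation.
            deleted.extend(range(i1, i2))
            inserted.extend(range(j1, j2))
    return kept, deleted, inserted

old = ["A", "B", "C"]
new = ["A", "X", "C", "D"]
print(match_lines(old, new))
```

The `kept` pairs are the connections drawn between two versions; everything in `deleted` or `inserted` is left unconnected.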


There's a nice paper describing history flow (doi:10.1145/985692.985765, free PDF here). Inspired by this I decided to try and implement history flow in PHP and SVG. Here's a preliminary result:

afrotheria.png

This is the edit history for the Afrotheria page. Click on the image above (or here) to see the SVG image. You need a decent web browser for this; IE users will need an SVG plugin.

The SVG image is clickable. The columns represent revisions; click on one to go to that revision. The columns are evenly spaced (i.e., the gaps don't correspond to time). The bands between revisions trace individual blocks of text (in this case lines in the Wikipedia page source). If you click on a band you get taken to that Wikipedia user's page.
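To make the layout concrete, here is a rough sketch in Python (not the PHP used for the image above) of how such a clickable SVG could be assembled. The data structures, pixel sizes, and function names are all assumptions for illustration: each revision is a dict with a `url` and one author per line, and `links[i]` lists the `(old_line, new_line)` pairs kept between revision `i` and revision `i + 1`.

```python
from urllib.parse import quote
from xml.sax.saxutils import quoteattr

COLUMN_WIDTH = 10  # px, width of a revision column
COLUMN_GAP = 40    # px, constant gap: spacing does not encode time
LINE_HEIGHT = 4    # px per line of page source

def column_x(index):
    """Left edge of the column for the revision at position `index`."""
    return index * (COLUMN_WIDTH + COLUMN_GAP)

def user_url(author):
    return "https://en.wikipedia.org/wiki/User:" + quote(author)

def render_svg(revisions, links, colours):
    """Return an SVG string with clickable columns and bands."""
    parts = ['<svg xmlns="http://www.w3.org/2000/svg" '
             'xmlns:xlink="http://www.w3.org/1999/xlink">']
    # Bands: one polygon per kept line, linked to the contributor's user page.
    for i, pairs in enumerate(links):
        x1 = column_x(i) + COLUMN_WIDTH
        x2 = column_x(i + 1)
        for old, new in pairs:
            author = revisions[i]["authors"][old]
            y1, y2 = old * LINE_HEIGHT, new * LINE_HEIGHT
            points = (f"{x1},{y1} {x2},{y2} "
                      f"{x2},{y2 + LINE_HEIGHT} {x1},{y1 + LINE_HEIGHT}")
            href = quoteattr(user_url(author))
            fill = quoteattr(colours[author])
            parts.append(f'<a xlink:href={href}>'
                         f'<polygon points="{points}" fill={fill}/></a>')
    # Columns: one rectangle per revision, linked to that revision.
    for i, revision in enumerate(revisions):
        height = len(revision["authors"]) * LINE_HEIGHT
        href = quoteattr(revision["url"])
        parts.append(f'<a xlink:href={href}><rect x="{column_x(i)}" y="0" '
                     f'width="{COLUMN_WIDTH}" height="{height}" fill="black"/></a>')
    parts.append("</svg>")
    return "\n".join(parts)
```

Because `column_x` depends only on the position of a revision in the list, two edits made a minute apart and two made a year apart are drawn the same distance from each other.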

This is all done in a rush, but it gives an idea of what can be done. The history flow carries all sorts of information about how an article has developed over time, highlights major changes (such as the introduction of Taxoboxes), and makes the content of a page traceable, in the sense that you can see who contributed what to a page.