Accounting Careers


More BHL app ideas

Following on from my previous post on BHL apps and a Twitter discussion in which I appealed for a "sexier" interface for BHL (to which @elyw replied that is what BHL Australia were trying to do), here are some further thoughts on improving BHL's web interface.
Build a new interface
A fun project would be to create a BHL website clone using just the BHL API. This would give you the freedom to explore interface ideas without having to persuade BHL to change its site. In a sense, the app would provide the persuasion.

Third party annotations
It would be nice if the BHL web site made use of third party annotations. For example, BHL itself is extracting some of the best images and putting them on Flickr. How about if you go to the page for an item in BHL and you see a summary of the images from that item in Flickr? At a glance you can see whether the item has some interesting content. For example, if you go to http://biodiversitylibrary.org/item/109846 you see this:

which gives you no idea that it contains images like this:

Tables of contents
Another source of annotations is my own BioStor project, which finds articles in scanned volumes in BHL. If you are looking at an item in BHL it would be nice to see a list of articles that have been found in that item, perhaps displayed in a drop down menu as a table of contents. This would help provide a way to navigate through the volume.

Who links to BHL?
When I suggested third party annotations on Twitter @stho002 chimed in asking about Wikispecies, Species-ID, ZooBank, etc. These resources are different, in that they aren't repurposing BHL content but are linking to it. It would be great if a BHL page for an item could display reverse links (i.e., the pages in those external databases that link to that BHL item).

Implementing reverse links (essentially citation linking) can be tricky, but two ways to do it might be:

Use BHL web server logs to find and extract referrals from those projects
Perhaps more elegantly, encourage external databases to link to BHL content using an OpenURL which includes the URL of the originating page. OpenURL can be messy, but especially in Mediawiki-based projects such as Wikispecies and Species-ID it would be straightforward to make a template that generated the correct syntax. In this way BHL could harvest the inbound links and display them on the item page.
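To make the second option concrete, here is a minimal sketch of what such a link could look like and how the originating page could be recovered from it. The resolver address, the function names and the use of the OpenURL `rfr_id` key to carry a page URL are all assumptions made for illustration (`rfr_id` is normally a referrer identifier, not necessarily a URL); this is not a description of how BHL actually handles OpenURL requests.

```python
from urllib.parse import urlencode, urlparse, parse_qs

# Assumed resolver address, used here for illustration only.
RESOLVER = "http://www.biodiversitylibrary.org/openurl"


def openurl_with_referrer(target_url, referring_page_url):
    """Build an OpenURL-style link that also records the page it came from."""
    params = {
        "url_ver": "Z39.88-2004",      # OpenURL 1.0 version key
        "rft_id": target_url,          # the thing being linked to
        "rfr_id": referring_page_url,  # assumption: the referrer is a page URL
    }
    return RESOLVER + "?" + urlencode(params)


def referring_page(openurl):
    """Recover the originating page from an inbound link, or None if absent."""
    query = parse_qs(urlparse(openurl).query)
    values = query.get("rfr_id")
    return values[0] if values else None
```

A wiki template would generate the link on the wiki side, and the receiving site would call something like `referring_page` on each inbound request to build up its list of reverse links.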

Why 3D phylogeny viewers don't work

Matt Yoder (@mjyoder) and I had a Twitter conversation yesterday about phylogeny viewers, prompted by my tweeting about my latest displacement activity, a 2D tree browser using the tiling approach made popular by Google Maps.

As part of that conversation, Matt tweeted:

RT @rdmpage: @mjyoder - I think 3D is the worse thing we could do, there's no natural mapping to 3D. <- meh, where's the imagination?

Well, Matt's imagination has gone into overdrive, and he's blogged about his ideas.

This issue deserves more exploration, but here are some quick thoughts. 3D has been used in a number of phylogeny browsers, such as Mike Sanderson's Paloverde, Walrus, and the Wellcome Trust's Tree of Life. I don't find any terribly successful, pretty as they may be. I think there are several problems with trees in general, and 3D versions in particular.

Trees aren't real
Trees aren't real in the same way that the physical world is (or even imagined physical worlds). Trees are conceptual structures. The history of web interfaces is littered with attempts to visualise conceptual space, for example to summarise search results. These have been failures; a simple top ten list as used by Google wins. I don't think this is because Google's designers lack imagination; it's because it works. Furthermore, this is actually a very successful visualisation:

I think elaborate attempts to depict conceptual spaces on screens are mostly going to fail.

Trees are empty
Compared to, say, a geographic map, trees are largely empty space. In a map every pixel counts, in that it potentially represents something. Think of the satellite view in Google Maps. Each pixel on the screen has information. Trees are largely empty, hence much of the display space is wasted. Moving trees to 3D just gives us more space to waste.

Trees don't have a natural ordering
Even if we accept that trees are useful visualisations, they have problems. Given the tree ((1,2),(3,4)); we have a lot of (perhaps too much) freedom in how we can depict that tree. For example, both diagrams below depict this tree. In the x-axis there is a partial order of internal nodes (the ancestor of {1,2} must be to the right of the ancestor of {1,2,3,4}), but the tree ((1,2),(3,4)); says nothing about the relative ordering of {1,2} versus {3,4}. We are free to choose. A natural linear ordering would be divergence time, but estimates of those times can be contested, or unavailable.

Phylogenies are unordered trees in the sense that I can rotate any node about its ancestor and still have the same tree (compare the two trees above). Phylogenies are like mobiles:

The practical consequence of this is that different tree viewers can render the same tree in very different ways, making navigation across viewers unpredictable. Compare this to maps. Even if I use different projections, the maps remain recognisably similar, and most maps retain similar relationships between areas. If I look at a map of Glasgow and move left I will end up in the Atlantic Ocean, no matter if I use Google Maps or Microsoft Maps. Furthermore, trees grow in a way that maps don't (at least, not much). If I add nodes to a tree it may radically change shape, destroying navigation cues that I may have relied on before. Typically maps change by the addition of layers, not by moving bits around (paleogeographic maps excepted).
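The "mobile" property is easy to check for the small tree ((1,2),(3,4)). The sketch below (function names are mine, and it assumes a strictly binary tree written as nested tuples) enumerates every drawing you get by flipping the children at internal nodes, and confirms that they are all the same unordered tree even though the left-to-right order of the leaves differs in each one.

```python
from itertools import product


def canonical(tree):
    """Order-independent form: an internal node becomes a frozenset of its children."""
    if isinstance(tree, tuple):
        return frozenset(canonical(child) for child in tree)
    return tree


def rotations(tree):
    """Yield every drawing of a binary tree obtained by flipping children at internal nodes."""
    if not isinstance(tree, tuple):
        yield tree
        return
    left, right = tree
    for l, r in product(list(rotations(left)), list(rotations(right))):
        yield (l, r)
        yield (r, l)


def leaf_order(tree):
    """Left-to-right order of the leaves in one particular drawing."""
    if isinstance(tree, tuple):
        return [leaf for child in tree for leaf in leaf_order(child)]
    return [tree]


tree = ((1, 2), (3, 4))
drawings = list(rotations(tree))
same_tree = all(canonical(d) == canonical(tree) for d in drawings)
distinct_orders = {tuple(leaf_order(d)) for d in drawings}
```

With three internal nodes there are 2 x 2 x 2 = 8 drawings, each with a different leaf order, and a viewer is free to pick any of them.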

Trees aren't 3D
There's nothing intrinsically 3D about trees, which means any mapping to 3D space is going to be arbitrary. Indeed, most 3D viewers simply avoid any mapping and show a 2D tree in 3D space, which seems rather pointless.

Perhaps it's because I don't play computer games much (went through an Angry Birds phase, and occasionally pick up an X-Box controller, only to be mercilessly slaughtered by my son), but I'm not inspired by the analogy with computer games. I'm not denying that there are useful things to learn from games (I'm sure the controls in Google Earth owe something to games). But games also rely on a visceral connection with the play, and an understanding of the visual vocabulary (how to unlock treasure, etc.). Matt's 3D model requires users to learn a whole visual vocabulary, much of which (e.g., "Fruit on your tree? Someone has left comment(s) or feedback. ") seems forced.

My sense is that the most successful interfaces make the minimal demands on users, don't fight their intuition, and don't force them to accept a particular visualisation of their own cognitive space.

I'll write more about this once I get my 2D tree viewer into a shape where it can be shown. It will be a lot less imaginative than Matt's vision; all I'm shooting for is that it is usable.

Quantum treemaps meet BHL and the Australian Faunal Directory

One of the things I'm enjoying about the Australian Faunal Directory on CouchDB is the chance to play with some ideas without worrying about breaking lots of code or, indeed, upsetting any users ('cos, let's face it, there aren't any). As a result, I can start to play with ideas that may one day find their way into other projects.

One of these ideas is to use quantum treemaps to display an author's publications. For example, below is a treemap showing publications by G A Boulenger in my Australian Faunal Directory on CouchDB project. The publications are clustered by journal. If a publication has been found in BioStor the treemap displays a thumbnail of that publication, otherwise it shows a white rectangle. At a glance we can see where the gaps are. You can view a publication's details simply by clicking on it.

The entomologist W L Distant has a more impressive treemap, and clearly I need to find quite a few of his publications.

I quite like the look of these, so may think about adding this display to BioStor. I may also think about using treemaps in my ongoing iPad projects. If you want to see where I'm going with this then take a look at Good et al. A fluid treemap interface for personal digital libraries.

Notes
The quantum treemap is computed using some rather ugly PHP I wrote, based on this Java code. I've not implemented all the refinements of the original Java code, so the quantum treemaps I create are sometimes suboptimal. To avoid too much visual clutter I haven't drawn a border around each cell; instead I use CSS gradients to indicate the area of the cell (if you're using Internet Explorer the gradient will be vertical rather than going from top left to bottom right). The journal name is overlain on the cell contents, but if you are using a decent browser (i.e., not Internet Explorer) you can still click through this text to the underlying thumbnail because the text uses the CSS property
.overlay { pointer-events: none; }
I learnt this trick from the Stack Overflow question Click through div with an alpha channel.

Navigating the Encyclopedia of Life tree on the desktop and the iPhone

This week seems to be API week. The Encyclopedia of Life API Beta Test has been out since August 12th. By comparison with the Mendeley API that I've spent rather too much time trying to get to grips with, the EOL API release seems rather understated.

However, I've spent the last couple of days playing with it in order to build a simple tree navigating widget, which you can view at http://iphylo.org/~rpage/eoltree/.

The widget resembles Aaron Thompson's Taxonomy (formerly called KPCOFGS) iPhone app in that it uses the iPhone table view to list all the taxa at a given level in a taxonomic tree. Clicking on a row in this table takes you to the descendants of the corresponding taxon; clicking "Back" takes you back up the tree. If you've reached a leaf node (typically a species) the widget displays a snippet of information about that taxon. It also resembles Javier de la Torre's taxonomic browser written in Flex.

Here's a screen shot of the widget running in a desktop web browser:

Here's the same widget in the iPhone web browser:

Using the API
The EOL API is pretty straightforward. I call the http://www.eol.org/api/docs/hierarchy_entries API to get the tree rooted at a given node, then populate each child of that node using http://www.eol.org/api/docs/pages. The result is a simple JSON file that I cache locally to speed up performance and avoid hitting the EOL servers for the same information. Because I'm locally caching the API calls I need a couple of PHP scripts to do this, but everything else is HTML and Javascript.
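The caching step is the only server-side logic involved, and the idea fits in a few lines. This is a sketch in Python rather than the PHP I actually used, and the function name, the file naming scheme and the `fetch` callback (whatever makes the real API call and returns decoded JSON) are all assumptions for illustration.

```python
import hashlib
import json
import os


def cached_json(url, fetch, cache_dir):
    """Return the JSON for url, calling fetch(url) only if no local copy exists."""
    os.makedirs(cache_dir, exist_ok=True)
    # One cache file per URL, named by a hash so any URL is a safe file name.
    name = hashlib.sha1(url.encode("utf-8")).hexdigest() + ".json"
    path = os.path.join(cache_dir, name)
    if os.path.exists(path):
        with open(path, encoding="utf-8") as handle:
            return json.load(handle)
    data = fetch(url)  # hits the remote server only on a cache miss
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(data, handle)
    return data
```

This sketch never expires anything, which is fine for a tree that rarely changes but would need a time limit for data that does.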

iPhone and iPad
I've not really developed this for the iPhone. I've cobbled together some crude Javascript to simulate some iPhone-like effects, but if I was serious about the phone I'd look into one of the Javascript kits available for iPhone development. However, I did want something that was similar in size to the iPhone screen. The reason is I'm looking at adding taxonomic browsing to the geographic browser I described in the post Browsing a digital library using a map, so I wanted something easy to use but which didn't take up too much space. In the same way that the Pygmybrowse tree viewer I played with in 2006 was a solution to viewing a tree on a small screen, I think developing for the iPhone forces you to strip things down to the bare essentials.

I'm also keeping the iPad in mind. In portrait mode some apps display lists in a popover like this:

This popover takes up a similar amount of screen space to the entire iPhone screen, so if I was to have a web app (or native app) that had taxonomic navigation, I'd want it to be about the size of the iPhone.

Let me know what you think. Meantime I need to think about bolting this onto the map browser, and providing a combined taxonomic and geographic perspective on a set of documents.

Browsing a digital library using a map

Every so often I revisit the idea of browsing a collection of documents (or specimens, or phylogenies) geographically. It's one thing to display a map of localities for a single document (as I did most recently for Zootaxa); it's quite another to browse a large collection.

Today I finally bit the bullet and put something together, which you can see at http://biostor.org/maps/. The website comprises a Google Map showing localities extracted from papers in BioStor, and a list of the papers that have one or more points visible on the map.

In building this I hit a few obstacles. The first is the number of localities involved. I've extracted several thousand point localities from articles in BioStor. Displaying all these on a Google Map is going to be tedious. Fortunately, there's a wonderful library called MarkerCluster, part of the google-maps-utility-library-v3, that handles this problem. MarkerCluster clusters markers together based on zoom level. If you zoom out the markers cluster together; as you zoom in these clusters will start to resolve into their component points. Very, very cool.
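To give a feel for why clustering depends on zoom level, here is a toy grid-based version. It is not the library's code, and the function name and the choice of a cell size that halves with each zoom level are my own assumptions; it just bins points into grid cells and treats each occupied cell as one cluster.

```python
from collections import defaultdict


def cluster_points(points, zoom, base_cell_degrees=360.0):
    """Group (lat, lng) points into grid cells whose size halves with each zoom level."""
    cell = base_cell_degrees / (2 ** zoom)
    clusters = defaultdict(list)
    for lat, lng in points:
        # Shift to positive ranges so that integer cell indices start at zero.
        key = (int((lat + 90) // cell), int((lng + 180) // cell))
        clusters[key].append((lat, lng))
    return list(clusters.values())


# Glasgow, Edinburgh and Sydney (approximate coordinates).
points = [(55.86, -4.25), (55.95, -3.19), (-33.87, 151.21)]
world_view = cluster_points(points, zoom=0)
hemisphere_view = cluster_points(points, zoom=1)
city_view = cluster_points(points, zoom=10)
```

At zoom 0 everything collapses into a single cluster, at zoom 1 the two Scottish cities share a cluster and Sydney gets its own, and by zoom 10 all three points are separate.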

The second challenge was to have the list of references update automatically as we move around or zoom in and out on the map. To do this I need to know the bounding box currently being displayed in the map; I can then query the MySQL database underlying BioStor for the localities within the bounding box, using MySQL's spatial extensions. The query is easy enough to implement using ajax, but the trick was knowing when to call it. Initially, listening for the bounds_changed event seemed a good idea. However, this event is fired as the map is being moved (i.e., if the user is panning or dragging the map a whole series of bounds_changed events are fired), whereas what I want is something that signals that the user has stopped moving the map, at which point I can query the database for articles that correspond to the region that the map is currently displaying. Turns out that the event I need to listen for is idle (see Issue 1371: map.bounds_changed event fires repeatedly when the map is moving), so I have a function that captures that event and loads the corresponding set of articles.

Another "gotcha" occurs when the region being viewed crosses longitude 180° (or -180°) (see diagram below from http://georss.org/Encodings).

In this case the polygon used to query MySQL would be incorrectly interpreted, so I create two polygons, each with 180° or -180° as one of the boundaries, and merge the articles with points in either of those two polygons.
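The splitting logic is simple enough to sketch. The function names are hypothetical, and the sketch assumes the query polygon is passed to the database as WKT text with longitude as x and latitude as y; it is an illustration of the idea, not the code behind the site.

```python
def _wkt(south, west, north, east):
    """WKT rectangle; coordinates are 'x y' (longitude latitude) and the ring is closed."""
    corners = [(west, south), (east, south), (east, north), (west, north), (west, south)]
    return "POLYGON((" + ",".join(f"{x} {y}" for x, y in corners) + "))"


def bbox_polygons(south, west, north, east):
    """Return one polygon for an ordinary bounding box, two if it crosses longitude 180."""
    if west <= east:
        return [_wkt(south, west, north, east)]
    # The western edge is numerically greater than the eastern edge, so the
    # view wraps around the antimeridian: split it at 180 / -180.
    return [_wkt(south, west, north, 180), _wkt(south, -180, north, east)]
```

For a view running from longitude 170 eastwards to -170, `bbox_polygons(-20, 170, 5, -170)` gives one rectangle from 170 to 180 and another from -180 to -170; the articles found in either are then merged.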

I've made a short video showing the map in action. Although I've implemented this for BioStor, the code is actually pretty generic, and could easily be adapted to other cases where we want to navigate through a set of objects geographically.

Browsing a digital library using a map from Roderic Page on Vimeo.

TreeBASE II makes me pull my hair out

I've been playing a little with TreeBASE II, and the more I do the more I want to pull my hair out.

Broken URLs
The old TreeBASE had a URL API, which databases such as NCBI made use of. For example, the NCBI page for Amphibolurus nobbi has a link to this taxon in TreeBASE. The link is http://www.treebase.org/cgi-bin/treebase.pl?TaxonID=T31183&Submit=Taxon+ID. Now, this is a fragile looking link to a Perl CGI script, and sure enough, it's broken. Click on it and you get a 404. In moving to the new TreeBASE II, all these inward links have been severed. At a stroke TreeBASE has cut itself off from an obvious source of traffic from probably the most important database in biology. Please, please, throw in some mod_rewrite and redirect these CGI calls to TreeBASE II.

New identifiers
All the TreeBASE studies and taxa have new identifiers. Why? Imagine if GenBank decided to trash all the accession numbers and start again from scratch. TreeBASE II does support "legacy" StudyIDs, so you can find a study using the old identifier (you know, the one people have cited in their papers). But there's no support for legacy TaxonIDs (such as T31183 for Amphibolurus nobbi). I have to search by taxon name. Why no support for legacy taxon IDs?

Dumb search
Which brings me to search. The search interface for taxa in TreeBASE is gloriously awful:

So, I have to tell the computer what I'm looking for. I have to tell it whether I'm looking for an identifier or doing a text search, then within those categories I need to be more specific: do I want a TreeBASE taxon ID (new ones of course, because the old ones have gone), NCBI id, or uBio? And this is just the "simple" search, because there's an option for "Advanced search" below.

Maybe it's just me, but I get really annoyed when I'm asked to do something that a computer can figure out. I shouldn't have to tell a computer that I'm searching for a number or some text, nor should I have to tell it what that number or text means. Computers are pretty good at figuring that stuff out. I want one search box, into which I can type "Amphibolurus nobbi", or "Tx1294" or "T31183" or "206552" or "6457215" or "urn:lsid:ubio.org:namebank:6457215" (or a DOI, or a text string, or pretty much anything) and the computer does the rest. I don't ever want to see this:

Computers are dumb, but they're not so dumb that they can't figure out if something is a number or not. What I want is something close to this:

Is this really too much to ask? Can we have a search interface that figures out what the user is searching for?

Note to self: Given that TreeBASE has an API, I wonder how hard it would be to knock up a tool that took a search query, ran some regular expressions to figure out what the user might be interested in, then hit the API with that search, and returned the results?
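As a first stab at that note to self, here is roughly what the regular expression step might look like. The category names and patterns are my guesses based on the example queries above, not anything defined by TreeBASE, and a bare number stays ambiguous (it could be an NCBI or a uBio identifier), so a real tool would have to try both.

```python
import re

# Ordered from most to least specific; the first matching pattern wins.
PATTERNS = [
    ("lsid", re.compile(r"^urn:lsid:[^:\s]+:[^:\s]+:[^:\s]+(:[^:\s]+)?$", re.IGNORECASE)),
    ("doi", re.compile(r"^(doi:)?10\.\d{4,9}/\S+$", re.IGNORECASE)),
    ("treebase_taxon_id", re.compile(r"^Tx\d+$")),  # new-style ID such as Tx1294
    ("legacy_taxon_id", re.compile(r"^T\d+$")),     # old-style ID such as T31183
    ("number", re.compile(r"^\d+$")),               # NCBI or uBio, cannot tell which
]


def classify(query):
    """Guess what kind of thing the user typed into a single search box."""
    query = query.strip()
    for kind, pattern in PATTERNS:
        if pattern.match(query):
            return kind
    return "text"  # fall back to a plain name search
```

So `classify("Tx1294")` is treated as a new TreeBASE taxon ID, `classify("T31183")` as a legacy one, and `classify("Amphibolurus nobbi")` falls through to a text search. The result would then decide which API call to make.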

My concern here is that TreeBASE II is important, very important. Which means it's important to make it usable, which means don't break existing URLs, don't make old identifiers disappear, and don't have a search interface that makes me want to pull my hair out.

Zotero: creating bibliographies in the cloud

Lately I've become more and more interested in moving data off my machine(s) and into the cloud. I'm keen to do this partly to avoid having data in one place (e.g., a machine at work) when I need it someplace else (e.g., at home), and there are great tools for doing this (such as the wonderful Dropbox).

As a developer, the cloud appeals, not so much because of the compute power that some are salivating over, but because it may free me from having to create my own software. For example, some time ago I created an OpenURL resolver to help me find articles online. I harvest a bunch of sources, such as CrossRef, PubMed, some OAI repositories, etc., but there are always times when I find a reference online that I'd like to add, and that reference doesn't have an identifier such as a DOI.

Typically I add these manually, or by importing a file. I could write some interface code to add (and edit) a bibliographic reference (and, indeed I did some time back), but wouldn't it be great if somebody else had done this for me?

Well, there are some tools out there for handling bibliographies online, such as Connotea, Mendeley, and Zotero (a Firefox add-on). Initially I was skeptical of Zotero (and I'm not a big Firefox user), but now that I'm looking for a place to store obscure papers it's rapidly growing on me. I like the fact that I can add references in situ, and that I can upload PDFs (which can be stored remotely on a WebDAV disk such as an iDisk). But what makes Zotero even more attractive is that it generates an RSS feed of my bibliography, which I can then harvest just as I harvest other resources.

Using a resource like Zotero saves me the hassle of having to write my own bibliographic editor, plus I benefit from using a tool that's a lot more polished than one I could make. Because of this, and my experience with the Google Spreadsheets API, I'm ultimately aiming to never have to write a user interface again. If I write services, and rely on third parties to make tools that can either generate services I can use, or consume my services, then my life becomes a lot simpler.

OK, perhaps I exaggerate. I like making interfaces, such as my eBio09 entry, or the experiment with SpaceTree. However, I can imagine a situation where I don't have to write a data entry interface ever again.
