Tuesday, November 24, 2009

Libraries and crowdsourcing - 6 examples

Crowdsourcing is defined as "the act of taking tasks traditionally performed by an employee or contractor, and outsourcing them to a group of people or community in the form of an open call." The key here is that the work has to be done by non-employees, so a group of librarians from your institution working on a subject guide wiki would not be crowdsourcing, unless you opened the wiki for your users to add resources.

Traditionally, librarians have been suspicious of crowdsourcing, but this is slowly changing. Still, there is an issue of "trust" involved.

Libraries have used crowdsourcing for:

  1. Digitisation of newspapers and correction of OCR errors (National Library of Australia)
  2. Tagging of photos on Flickr (Library of Congress)
  3. Rating, tagging and contribution of user reviews of books (via Endeca, Encore, Aquabrowser, Koha and other next-generation OPACs etc)
  4. Frequently asked questions (Social reference tools like LibAnswers)
  5. Comments and suggestions (GetSatisfaction, UserVoice etc)
  6. Collaborative Cataloguing (LibraryThing, Biblios.net and Open Library)

1. Digitisation of newspapers and correction of OCR errors (National Library of Australia) - PDF

The Australian Newspapers Digitisation Program (ANDP) is responsible for digitising historic Australian newspapers from 1803 to 1954 - the Historic Australian Newspapers archive.

This in itself isn't unusual; what makes this project unique is that it has a mechanism in place to enlist the help of the public in correcting OCR (Optical Character Recognition) errors. Users were also allowed to tag articles, which, as the paper (pdf) notes, is a bit unusual for text where full-text search is available.

The full report (pdf) is worth a read, but during the 6 months of the trial (Aug 2008-Jan 2009):

  • 2 million lines of text were corrected in 100,000 articles
  • The top corrector corrected 101,481 lines in 2,594 articles
  • 1,300 registered users
  • No vandalism of text was detected in 6 months
  • 19,354 articles were tagged
  • 46,230 tags were created.
In general the project seems to be a success. Many users initially did not expect to be able to do text correction, but once they figured it out many found text correction addictive, while others drew parallels to Wikipedia.

2. Tagging of photos on Flickr (Library of Congress) - PDF

In Jan 2008, the Library of Congress began sharing two collections of historical photos on Flickr.

Again, the final report of the project, dated October 2008, is worth a read, but between Jan 2008 and Oct 2008:

  • 67,176 tags were added by 2,518 unique Flickr accounts
  • 4,548 of the 4,615 photos have at least one community-provided tag
  • 7,166 comments were left on 2,873 photos by 2,562 unique Flickr accounts.
  • 79% of the 4,615 photos have been made a “favorite” (i.e., are incorporated into personal Flickr collections).
This looks like another successful case of crowdsourcing.

3. Tagging, user reviews, rating (Endeca, Encore, Aquabrowser, Koha and other next-generation OPACs etc)

Today many libraries have deployed or are planning to deploy next-generation OPACS. 
Users are able to tag or rate items, and typically they are also allowed to contribute user reviews. There are too many libraries to list that have OPACs or discovery layers with such features, but generally I think they haven't taken off, most probably because most institutions do not have the critical mass necessary for tagging to become useful. Below is an example of a Harry Potter book being tagged, rated and reviewed by users of the Ann Arbor District Library SOPAC.

As John Blyberg, author of SOPAC, noted:

"Free tagging in the Ann Arbor catalog as it pertains to the discovery of material is indeed a failure, simply because the data set hasn’t grown large enough to become terribly meaningful."
He goes on to note that

"If you want to make the argument that local tags are representative of the local community, then I can understand–that was my original argument. But if a community like Ann Arbor cannot support a meaningful tagging system, I’m skeptical that many others would–maybe large metropolitan systems, but anything smaller? Probably not."

As I write this, Singapore's National Library Board has just launched NLBSearchPlus (Primo, I think); it will be interesting to see if they have enough critical mass for user-contributed tags to take off.

In comparison, as John Blyberg notes, LibraryThing for Libraries is much more successful, because you can seed your library OPAC with hundreds of thousands of reviews and tags. Currently over 160 libraries use this system. Items in the catalogue are matched using ISBN, and the data can be inserted into any OPAC that accepts HTML.

Below is an example from the University of Hong Kong.

The information of course is drawn (filtered and curated) from LibraryThing, which can be seen as a way to crowdsource cataloguing and tagging.

LibraryThing for Libraries also provides recommendations (as do other recommender systems similar to Amazon's "users who bought this also bought..."); I wonder if that counts as crowdsourcing recommendations.

There are many papers analysing the results of implementing LibraryThing for Libraries in various institutions, but here's one.

4. Frequently asked questions (Social reference tools like LibAnswers)

Frequently asked questions are often constructed based on librarians' judgment of what is frequently asked. LibAnswers and similar Q&A/knowledge-base systems take the guesswork out of this by (1) listing results by most viewed, and (2) allowing users to post a question if it's not found in the FAQ. Librarians can then answer the question and post it automatically to the FAQ if desired.
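The "most viewed" ordering is trivial to picture. Here's a toy sketch (the entries and view counts are invented; this says nothing about how LibAnswers is actually built):

```python
# Hypothetical FAQ entries; ordering by view count replaces guesswork
# about what is actually "frequently" asked.
faq = [
    {"question": "How do I renew a book?", "views": 412},
    {"question": "Where is the quiet study area?", "views": 87},
    {"question": "How do I access e-journals off campus?", "views": 655},
]

# List the FAQ by most viewed, as these systems do.
most_viewed = sorted(faq, key=lambda entry: entry["views"], reverse=True)
```

The point is simply that the ranking comes from user behaviour, not librarian intuition.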

When the librarian answers the question, LibAnswers also lists related answers from other libraries on LibAnswers, so in a sense you can crowdsource answers as well.

This service, which only started in early 2009, is currently too new to be assessed.

There are other collaborative networks for libraries out there, including QuestionPoint, the Library Society of the World room on FriendFeed, NLB's Collaborative Reference & Network Service (CRNS), etc.

5. Comments and suggestions (GetSatisfaction, UserVoice etc)

In feedback for libraries - GetSatisfaction, UserVoice, Yelp, I wrote about the use of crowdsourcing tools such as GetSatisfaction and UserVoice to crowdsource feedback, ideas and comments.

I reviewed some of the accounts again; most library accounts have not yet managed to achieve critical mass, except for Cook Library's UserVoice page, which they used to get feedback on their website design.

6. Collaborative Cataloguing (LibraryThing, Biblios.net and Open Library)

Collaborative cataloguing isn't a particularly new idea; libraries have always shared cataloguing records to reduce the amount of original cataloguing necessary.

Typically such cataloguing efforts are open only to librarians, but new approaches allow the public to get involved. LibraryThing (with its cataloguing flash mobs), Biblios.net and Open Library are some examples. Open Library in particular mimics Wikipedia-style pages that allow anyone to add or edit pages for each book.

7. What else can be crowd sourced?

In theory, subject guides done on wikis can be crowdsourced. But as seen from the listing of library wikis, the vast majority of wikis are edited only by librarians in the same institution.

As far as I know, none allow users to edit. While the LibraryWikis page listed Chad Boeninger of Ohio University Libraries' Biz Wiki as one that allowed users to do so, a quick Meebo chat with Chad cleared that up: he used to allow users to edit 2 years ago, but nobody edited except for spammers.

I guess that while users are fine with editing Wikipedia, they still aren't comfortable with editing library subject pages, for various reasons.


Why are some crowdsourcing projects successful while others aren't? Some of it is obviously due to the size of the user base: the Library of Congress and the National Library of Australia obviously have larger user bases than other libraries, so their crowdsourcing projects have a better chance of taking off.

There are other reasons. This article, which pretty much hits the nail on the head, advises that "Your workers are unpaid, so make it fun." This could range from listing top performers, to awarding titles or ranks for targets met, to making it into a game of some kind.

Are there other crowdsourcing projects that libraries have engaged in that I have not mentioned?


Saturday, November 14, 2009

Bayesian filtering of RSS feeds - can you automatically find interesting journal articles?


In this long rambling post (too bad the name Rambling Librarian is taken), I write about filtering RSS feeds (in particular, tables of contents from online journals) using 3 services: SuxOr, FeedZero and FeedScrub. I ramble on about social filtering versus Bayesian filters, spam filtering versus filtering of RSS feeds, some very brief initial thoughts, etc.

Readers who are aware of the background should just skip to the sections on SuxOr, FeedZero and FeedScrub

In Aggregating sources for academic research in a web 2.0 world, I wrote about keeping up with your research using RSS feeds from

"traditional databases (citation alerts, table of contents of favourite journals), library opac feeds of searches and new additions, book vendor sites (e.g Amazon) book sharing sites (e.g LibraryThing), social bookmarking sites both generic (e.g. Delicious) and research 2.0 sites (e.g. citeulike), Google alerts and more"

The main problem with this, of course, is that you quickly get overwhelmed with results. In many cases you can't create a custom RSS feed (e.g. many libraries provide RSS feeds of "new additions" only in broad subject areas like Economics), and even in instances where you can, say with an EBSCOhost database search in RSS, even the most finely tuned search query can bring up quite a lot of irrelevant results.

Social filtering

The answer here, of course, is to do some sort of filtering. Currently, with all the focus on social media, the idea of social search or social filtering is all the rage, with Google and Bing competing to add features, many of which attempt to leverage the social web, not least Google's Social Search.

The basic idea is simple: look at what other people in your field are sharing (through social bookmarks, blogs, tweets, likes, etc.).

Or as Chris Anderson (of The long tail fame) puts it

" "Social filtering" is a great way to describe this process. Instead of going directly to the source, we are only going to content that our network suggests is going to be interesting or relevant to us."

Twitter is probably the most famous example, where the people you follow act as your social filter. Many claim that they don't even use their RSS reader anymore; they just look at what the people they follow on Twitter retweet or favorite (e.g. Favstar.fm lists tweets that people favourite). Why manually read all your RSS feeds, if your friends, who have similar tastes to you, dutifully share and highlight everything that is likely to be interesting to you as well?

This idea of social filtering and social search is embedded in the "wisdom of the crowds" approach: from the old-school Digg/Slashdot approach, which aggregated the votes of all users, to more individualized/collaborative filtering approaches like FriendFeed's "best of the day", which takes into account only the actions of your friends (or friends' friends and so on), or the actions of users similar to yourself (e.g. Netflix). There are hundreds of examples of web 2.0 services using such approaches (basically every social network and recommendation system out there), so I won't bother to list them.

The problem with just social filtering or social search is this: it presumes that your tastes are similar to those of the average masses (in cases like Digg, where every vote is aggregated) or that there are people in the network or among your contacts who have similar tastes/interests. But that isn't always the case, particularly when interests are defined very narrowly.

Taking myself as an example, I'm broadly interested in "library stuff" - essentially library 2.0, social media, bibliometrics, some aspects of linked data, etc. I look at retweets, faves, Delicious and so on, and at this broad level I get interesting and useful recommendations, particularly from the sources everyone monitors (Mashable, popular library blogs like The Shifted Librarian, etc.).

But I'm also a PhD student (in theory anyway!) working in an obscure area involving valuation of library services using a specific technique. There are far fewer people working in this area, and hence social filtering is not a reliable method. I can't really expect someone to read an interesting journal article on that specific topic and share it; possibly no such person exists, or even if he did he might not be generous enough to share.

Note to self: check whether the few people who have written in the area I'm looking at have a strong online social presence.

Bayesian filtering

So what's the answer? If you can't expect a human to do the hard work for you, you could train a machine to do it! There are many machine learning techniques, but in this post I focus on Bayesian filtering, a technique well known for being effective against spam.

The idea of Bayesian filtering for spam is usually credited to Paul Graham's "A Plan for Spam". For those not into mathematics: essentially, Bayes' Theorem allows you to calculate the probability of a given event given that some other event has occurred.

Here's a simple layman explanation.

First you need to train your spam filter on, say, 200 spam mails and 200 ordinary mails. You tell the filter, "this mail is spam", "this one isn't", "this one is", and so on.

The filter will eventually "learn" that when certain words appear in the message (e.g. "viagra") the mail is more likely to be spam, while other words like "library" tend to appear in mail you want to read. (Some words will be neutral, because they appear in both types of mail or in neither.)

Bayes' Theorem lets you calculate the exact probability that a mail is "spam" or "ham" (good mail), and beyond a certain threshold of "spamminess" you can be pretty sure it's spam and classify it as such.
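The layman explanation above can be sketched in a few lines of Python. This is a toy naive Bayes scorer with invented training phrases and crude +1 smoothing, not how POPFile or SpamBayes actually implement it:

```python
from collections import Counter

def train(messages):
    """Count word occurrences per class from (text, label) pairs."""
    counts = {"spam": Counter(), "ham": Counter()}
    totals = {"spam": 0, "ham": 0}
    for text, label in messages:
        counts[label].update(text.lower().split())
        totals[label] += 1
    return counts, totals

def spam_probability(text, counts, totals):
    """Naive Bayes: combine per-word evidence into P(spam | words)."""
    p_spam = totals["spam"] / (totals["spam"] + totals["ham"])  # prior
    p_ham = 1 - p_spam
    for word in text.lower().split():
        # +1 smoothing so unseen words don't zero out the product
        p_spam *= (counts["spam"][word] + 1) / (sum(counts["spam"].values()) + 2)
        p_ham *= (counts["ham"][word] + 1) / (sum(counts["ham"].values()) + 2)
    return p_spam / (p_spam + p_ham)
```

Train it on a handful of labelled mails and `spam_probability("viagra offer", ...)` comes out high, while "library books" comes out low - the "certain threshold of spamminess" is then just a cut-off on this number.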

The history of the use of Bayesian filters for spam filtering is a fascinating one, but I won't recount it here, except to say that they have been very effective despite attempts by spammers to beat them using various tricks.

I used filters such as POPFile and SpamBayes in the past before switching mostly to Gmail, and after a couple of weeks of training I achieved spam-catching success rates of 99% and up, with almost negligible false positive rates.

Bayesian filtering of RSS feeds

There's nothing essentially special about the categories "spam" and "ham"; you can teach a standard Bayesian filter to classify text into any 2 or more categories. Though many Bayesian spam filters allow only 2 categories, some, like POPFile, allow you to classify into as many "buckets" as you want.

By now you know where I'm going with this: why not use Bayesian filtering to recognise "interesting" vs "not interesting" articles in RSS feeds?

In the examples that follow, I experiment with Bayesian filtering of RSS feeds from tables of contents of electronic journals. While I could do this on any RSS feed, I feel it makes little sense to do Bayesian filtering on popular blog feeds, since such feeds will definitely be read and filtered (retweeted or otherwise shared) by humans, while sharing of journal articles, particularly from less popular journals, is far less likely.

A wild idea here is to feed in an RSS feed that is already pre-filtered, adding another layer of filtering on top of purely social filters - e.g. try Bayesian filtering on a CiteULike RSS feed!

To make things simple, I used the free ticTOCs service (there are other similar services) to get 20 RSS feeds from various library and information science journals, and then fed them into the Bayesian filters.
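Getting the text to classify out of a journal table-of-contents feed is straightforward, since RSS is just XML. A minimal standard-library sketch (the feed snippet and article titles are invented; real ticTOCs feeds vary in structure):

```python
import xml.etree.ElementTree as ET

# A fragment of the kind of RSS a journal TOC feed provides (hypothetical).
RSS = """<?xml version="1.0"?>
<rss version="2.0"><channel>
  <title>Some LIS Journal: current issue</title>
  <item>
    <title>Valuing library services with contingent valuation</title>
    <description>Abstract: we estimate the value of reference services...</description>
  </item>
  <item>
    <title>Editorial board announcement</title>
    <description></description>
  </item>
</channel></rss>"""

def items(rss_text):
    """Yield (title, description) pairs - the text a filter would classify."""
    root = ET.fromstring(rss_text)
    for item in root.iter("item"):
        yield item.findtext("title", ""), item.findtext("description", "")
```

Each (title, description) pair is what gets fed to the Bayesian filter; note the second item here has no abstract at all, a problem I come back to later.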

How to do Bayesian filtering of RSS feeds

I'm aware of 3 services that do Bayesian filtering of RSS feeds. Two are commercial web services (FeedZero and FeedScrub) and one is an open source project (SuxOr).

Google Reader provides a new sort option "magic", which  " is personalized for you, and gets better with time as we learn what you like best — the more you "like" and "share" stuff, the better your magic sort will be". It's unclear if this uses Bayesian filtering or some similar technique.

I have also been experimenting with converting RSS feeds into NNTP or IMAP/POP3 and then using POPFile (an open source Bayesian filter designed mainly for classifying email into any number of arbitrary categories) to classify the items, but I doubt it's going to be a real solution.


I first became aware of SuxOr thanks to a tweet by the amazing Ostephens about the Bayesian Feed Filtering (BayesFF) project:

"The Bayesian Feed Filtering (BayesFF) project will be trying to identify those articles that are of interest to specific researchers from a set of RSS feeds of Journal Tables of Content by applying the same approach that is used to filter out junk emails.
We will develop and investigate the performance of a tool that will aggregate and filter a range of RSS and ATOM feeds selected by a user. The algorithm used for the filtering is similar to that used to identify spam in many email filters only in this case it will be “trained” to identify items that are interesting and should be highlighted, not those that should be junked." (Full proposal)

The research project, carried out at Heriot-Watt University, uses the open source SuxOr project. They get actual researchers to go through the process of using Bayesian filtering on journal tables of contents, and they will try to evaluate the effectiveness of the method.

The research project is currently in progress, but their blog already has a lot of interesting information.

It's an open source project, so you can download it and set up your own server. Alternatively, if you don't want to go through the hassle of doing this, you can sign up at http://icbl.macs.hw.ac.uk/sux0r206/home to play with it (do note that there are no guarantees for long-term use).

SuxOr is a very sophisticated system, but I was disappointed that there was no way to import a package of RSS feeds using OPML (a bundle of RSS feeds you can export from RSS feed readers like Google Reader), so you have to add each RSS feed manually.

Another problem is that adding RSS feeds that are not already listed isn't automatic; you need the administrator to approve them first (though I read they are working on changing this).

Fortunately for me, the administrator, Lisa Rogers, was kind enough to quickly approve 20 or so library and information science journal tables of contents (e.g. Library Quarterly, Journal of Documentation, Journal of the American Society for Information Science and Technology), so I was able to set it up to look for articles of interest to me (basically on valuation of library services).

Once you have your RSS feeds setup the next thing you need to do is to set up your categories for filtering.

Click on your name and then "edit bayesian".

First you need to create one or more "vectors". This confused me for a while. I understood, of course, that you could create categories for the filter to classify into. So you could create categories "interesting" and "not interesting" (which is basically what the research project is doing), or even specific categories like "Library 2.0", "Cataloguing", "Information Literacy", etc.

But what is a "vector"?

As far as I could make out, a "vector" is a dimension along which you can classify items.
So in theory you could have a vector "relevance" with two categories, "relevant" and "not relevant", as well as another vector "subject" with categories "Library 2.0", "Cataloguing" and "Others".

Personally I just stuck to having one vector and 2 categories to give the filter the best chance of working.
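As far as I understand it, each vector is just an independent classifier over the same items. A rough sketch of that mental model, with a crude word-frequency scorer standing in for SuxOr's real Bayesian machinery (all names and training phrases are mine, purely illustrative):

```python
from collections import Counter

class Vector:
    """One classification dimension, with its own categories and training."""
    def __init__(self, *categories):
        self.words = {c: Counter() for c in categories}

    def train(self, text, category):
        self.words[category].update(text.lower().split())

    def classify(self, text):
        # Score each category by smoothed word frequency; highest wins.
        def score(category):
            total = sum(self.words[category].values()) or 1
            s = 1.0
            for w in text.lower().split():
                s *= (self.words[category][w] + 1) / (total + 2)
            return s
        return max(self.words, key=score)

# Two independent vectors over the same items, as described above.
relevance = Vector("relevant", "not relevant")
subject = Vector("Library 2.0", "Cataloguing", "Others")
```

The same article then gets one verdict per vector: "relevant" along `relevance`, and, say, "Cataloguing" along `subject`.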

Normally, you train on existing items by classifying items in the feeds you import, but you can also copy and paste text from anywhere else to train the filter. So, for instance, you can copy and paste abstracts of relevant articles from your bibliographic database to quickly train the filter to watch out for those words.

There are other advanced features, like the ability to share training data and possibly even share your filters, so someone could run text through your filter to quickly see if you would be interested in an article, but I didn't explore that.

The image below shows items in a feed being classified along the vector "interestingness" into 2 categories, "interesting" and "not interesting". I set the filter threshold to an arbitrary 77%, so it shows articles in the feed where the filter calculates a 77% or greater chance of being "interesting".

Of course, the filter is not going to be perfect, particularly at the beginning when it has not been sufficiently trained, so you can change the verdict for each item by clicking on the pull-down menu and changing the category; the filter will then "learn" by adjusting the weights given to the text in the document.


FeedZero is a web-based service that allows you to enter feeds and then train them with a "thumbs up" or "thumbs down". This is a Bayesian filter, of course.

It has fewer options than SuxOr, in that you can only filter into 2 categories, but it offers more import options - in particular OPML import, which I think is a must-have, so you can import a bundle of RSS feeds from, say, Google Reader. A great time saver.
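OPML is just XML, which is why import is such a time saver: one file carries your whole feed list. A sketch of what an importer does with a Google Reader-style bundle (feed titles and URLs invented):

```python
import xml.etree.ElementTree as ET

# Minimal OPML of the kind a feed reader exports (hypothetical feeds).
OPML = """<?xml version="1.0"?>
<opml version="1.0"><body>
  <outline text="Journal of Documentation TOC"
           type="rss" xmlUrl="http://example.org/jdoc.rss"/>
  <outline text="Library Quarterly TOC"
           type="rss" xmlUrl="http://example.org/lq.rss"/>
</body></opml>"""

def feed_urls(opml_text):
    """Pull every feed URL out of an OPML bundle in one pass."""
    root = ET.fromstring(opml_text)
    return [o.get("xmlUrl") for o in root.iter("outline") if o.get("xmlUrl")]
```

Twenty journal TOC feeds become one file to upload, instead of twenty manual additions.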



As I write this, FeedScrub is in closed beta testing and requires an invitation code to test. I managed to get one from here; I'm not sure if the code still works.

The free version is very limited: you can only add 5 RSS feeds. The paid version allows unlimited feeds and lets you import feeds using OPML.

Spam filtering vs RSS filtering- some thoughts

I'm aware of some discussions on how to train Bayesian filters, but they apply mostly to handling spam.
It's unclear whether this carries over to Bayesian filtering of arbitrary text (I'm sure there is work on pure text classification using Bayesian filters, but I haven't looked at it).

But the two basic methods are "train on error" and "Train on exhaustion"  (see here also)

The former involves

"scanning a corpus of known spam and non-spam messages; only those that are misclassified, or classed as unsure, get registered in the training database. It's been found that sampling just messages prone to misclassification is an effective way to train"

This is the method I suspect most users will use, since it's the most intuitive; the other method is just too much work.
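A "train on error" loop is easy to sketch. The classifier below is a deliberately crude word-count stand-in, not a real Bayesian filter; the point is the training policy of registering only misclassified items:

```python
from collections import Counter

class TinyFilter:
    """Stand-in classifier: picks the category whose words overlap most."""
    def __init__(self):
        self.words = {"interesting": Counter(), "not interesting": Counter()}

    def train(self, text, label):
        self.words[label].update(text.lower().split())

    def classify(self, text):
        def score(label):
            return sum(self.words[label][w] for w in text.lower().split())
        return max(self.words, key=score)

def train_on_error(filter_, labelled_items):
    """Train-on-error: only items the filter gets wrong are registered."""
    mistakes = 0
    for text, label in labelled_items:
        if filter_.classify(text) != label:
            filter_.train(text, label)
            mistakes += 1
    return mistakes
```

"Train on exhaustion", by contrast, would call `train()` on every single item regardless of the verdict - which is exactly why it feels like too much work by hand.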

Many spam filters are purely "naive Bayesian" filters, and I believe results with these filters should be essentially the same as with Bayesian RSS filters.

But spam has many special characteristics, and many spam filters have added features to handle it.

Firstly, spammers are trying to beat the Bayesian filters using various techniques, from using images to adding random words to mail to poison or fool the filter - the so-called "word salad" method (see The Spammers' Compendium for more tricks). For instance, POPFile uses pseudo-words to handle the special characteristics of spam.

Secondly, because in spam filtering a false positive (classifying good mail as spam) is more serious than a false negative (letting spam through as okay), filters such as SpamBayes have an "unsure" category.

It's unclear if filtering of RSS feeds (in particular those from tables of contents of online journals) is easier or harder than spam filtering. For sure you don't have the problem of an adversary trying to beat the filter (unless you count authors using interesting but misleading terms in their abstracts to try to interest you!), so it seems it might be easier.

On the other hand, trying to judge whether a full article is interesting based on just the abstract is challenging; it's unclear if there is enough information.

In fact it is worse than that: if you look at the image above, which showed filtering of tables of contents, you will notice that some RSS feeds of online journal tables of contents list just the title and author, with no abstracts; in such cases it's unclear if Bayesian filters will be effective. While many RSS feeds have "full text" (e.g. blogs with full feeds), many do not (e.g. books from OPAC feeds)...

The other issue is that of "false positives". Assuming one filters RSS feeds into 2 categories, where one category plays the role of "good mail" (interesting items) and the other of "spam mail" (not interesting items), should one try to minimize the false positive rate (i.e. the chance of missing an interesting item because the filter wrongly classified it as not interesting)?

SuxOr allows you to select probability thresholds, but the question is: what threshold of "interestingness", or rather "uninterestingness", should you use?

I also wonder how, in terms of evaluation, one evaluates the effectiveness of the filter. This moves us into the realm of information retrieval.

As this report on spam filtering points out, you calculate either

a) the recall and precision rates for not interesting items, which is the standard in the information retrieval literature (and what all librarians were taught in library school), or

b) the miss rate and false positive rate, which are commonly used in reporting spam filter results.

I think the recall rate for not interesting items plus the miss rate will always equal 1, since both are computed over the same set of truly not interesting items. The precision rate for not interesting items and the false positive rate, however, are computed over different sets (items flagged as not interesting versus genuinely interesting items), so they won't in general sum to 1.
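These relationships can be checked on a toy confusion matrix (the counts below are invented), treating "not interesting" as the spam-like class:

```python
# Toy confusion matrix: "not interesting" plays the spam role.
tp = 8   # truly not interesting, flagged not interesting
fn = 2   # truly not interesting, let through as interesting (misses)
fp = 1   # truly interesting, wrongly flagged not interesting
tn = 9   # truly interesting, correctly let through

recall = tp / (tp + fn)      # IR convention, over truly not-interesting items
miss_rate = fn / (tp + fn)   # spam-filter convention, same denominator
precision = tp / (tp + fp)   # over items the filter flagged
fp_rate = fp / (fp + tn)     # over genuinely interesting items

# recall + miss_rate is always exactly 1 (same denominator);
# precision + fp_rate is not 1 in general (different denominators).
```

With these numbers, recall + miss rate is 1.0, while precision + false positive rate comes to about 0.99 - close, but only by coincidence of the chosen counts.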

Still mulling over this and trying to remember what I studied about information retrieval in library school, but I think it is critical to aim for a low false positive rate - or, conversely, a high precision for uninteresting items - to avoid missing interesting ones. The converse problem, of seeing an item marked interesting that really isn't, is a lesser problem, since I'd rather err on the side of caution.

It's interesting to speculate how well bayesian filtering of RSS feeds will do.

My limited experience so far is that it's a bit tricky classifying articles as "interesting" vs "not interesting" or "relevant" vs "not relevant". For spam vs non-spam it is usually a no-brainer (barring the occasional semi-spammy mailing list), but given that relevance or interestingness is a multi-dimensional concept, I find myself debating over whether a certain item is really relevant or interesting. An article might be interesting only if you haven't read it before, but I still classify it as interesting anyway because of the keywords in there.

Another issue is that with tables of contents you don't get the full text, so if the abstract looks interesting but the full text isn't, is it to be classified as interesting or not? Maybe classifying by topic, e.g. "valuation of libraries" and "Library 2.0" plus "others", might be better?


It's still early days, and it will be interesting to see how successful Bayesian filtering turns out to be, but you can follow the discussion here on FriendFeed. According to MrGunn, FriendFeed (based on social filters) still beats Bayesian filtering for him, but he hasn't trained the Bayesian filters for long yet, and it's possible that social filters work well for him because he has strong social networks, thanks to the strong community of life scientists on FriendFeed.

For me, I'm almost certain that with enough training Bayesian filtering will be useful above and beyond the usual social filters, but that's because I have weaker social ties. Though of course, if science 2.0 sites - social networks for researchers like Mendeley - take off, it might become easier to find other like-minded researchers. Currently, though, this doesn't seem to be happening.

But it's not really a question of social filters versus Bayesian filters; as MrGunn notes above, a "Bayesian filter helps you find best stuff you already know you like, whereas Social search helps you find stuff you didn't know you were looking for."

I would add that one could combine the two systems: in a 2-category system, your "likes", "faves" or "thumbs up" give data that can be used both for Bayesian filtering of articles and for fuelling your social graph (to find people who have "liked" similar articles); the two systems could work in parallel or be chained together.
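A sketch of that combination, where a single "thumbs up" feeds both systems at once (all names and structures here are mine, purely illustrative):

```python
from collections import Counter, defaultdict

word_counts = {"liked": Counter(), "other": Counter()}  # Bayesian training data
likes = defaultdict(set)                                # user -> liked item ids

def record_like(user, item_id, text):
    """One 'thumbs up' event feeds both systems."""
    word_counts["liked"].update(text.lower().split())   # trains the filter
    likes[user].add(item_id)                            # grows the social graph

def overlap(user_a, user_b):
    """Crude social similarity: shared likes over total likes (Jaccard)."""
    a, b = likes[user_a], likes[user_b]
    return len(a & b) / len(a | b) if a | b else 0.0
```

The same click trains the Bayesian side (word counts) and the social side (like overlap, for finding people with similar tastes), so neither system costs the user extra effort.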

In many ways this is similar to email filtering systems like SpamAssassin, which use a host of filtering techniques, from Bayesian filtering to DNS blacklisting, greylisting, checksum-based filtering and more.

I particularly like the following diagram from the CiteULike blog.

In this post, I'm basically describing a "more like this" feature, but there are other kinds of recommendations that can be done.

Monday, November 2, 2009

Zooming into presentations - Zoomit, Prezi & pptPlex

In this blog post, I describe 3 different ways to zoom in and out of presentations to increase visibility, and to create a bit of action. The three tools are Zoomit, Prezi and pptPlex. I describe my experiences with them in detail and their pros and cons as presentation tools.

UPDATE: John, in his comments below, suggests that I should have included Ahead, an even newer tool that in some respects is similar to Prezi, but which he claims is easier to use. I took a quick look; it does look interesting, but it's still in beta. I will review it in a future blog post in comparison with Prezi.


Recently I gave my first ever talk at a local library conference in Singapore - Libraries of Tomorrow. The content of my talk wasn't anything particularly interesting (it was about "Subject Guides 2.0", which many readers of this blog will recognize covers much of the same ground I blogged about in past posts, and which I later realized wasn't particularly ground-breaking anyway), so I won't talk about it here; instead I will talk about the presentation tool I used.

While the conference was run efficiently and I was extremely impressed by the work of my fellow presenters and posters, I felt that the venue wasn't the best place for giving talks, as there was sunlight coming from behind the screen (we were at the top level of a building with clear transparent windows), and the glare made it difficult (at least for me, though bear in mind I have very poor eyesight!) to see the PowerPoint presentations.

Hazman from NTU Library probably fared the best, as a lot of his slides were simple one-message points with huge fonts, but few presenters have mastered this technique; personally, I'm as guilty as anyone of trying to squeeze small unreadable fonts onto slides.

In situations like this, it would be good to have some way to zoom while doing presentations or create "zoomable presentations" to overcome problems of poor visibility.

Right now, I'm aware of 3 solutions. In order of simplicity (in terms of use and preparation time), they are ZoomIt, a portable zooming app from Windows Sysinternals (now owned by Microsoft); pptPlex, a Microsoft PowerPoint add-on that creates zoomable slides; and Prezi, a presentation tool that creates zoomable Flash presentations.


I'm sure that while giving presentations as librarians you have come across situations where you need to zoom in to whatever you are showing, so that everyone - even those right at the back or with poor eyesight - can see what you are doing.

If it's content on the browser page itself, you could zoom using built-in browser functions (hold Ctrl and use the mouse wheel), but there are drawbacks to zooming this way. In any case, you can't do this when doing a demonstration with, say, EndNote, nor does it work if you want to type a URL in the address bar - you can't zoom in for that using normal browser functions.

This is where ZoomIt comes in handy. It's a small portable application that you can carry on your thumb drive. Hold Ctrl and press 1, and it zooms in on the part of the screen where your mouse cursor is; right-click and it zooms out. Other functions let you annotate: you can type text or draw freehand in different colors and sizes.

If you are using Windows Vista, it supports "live zoom" (hold Ctrl and press 4). Unlike the earlier zoom, this one is actually live: you can continue to work in zoomed mode (type a URL, click on buttons, etc.). Below is a short video showing how it works.

The advantage of ZoomIt is that it works for any type of presentation, and almost no preparation is needed. You can of course use ZoomIt to zoom into normal PowerPoint presentations, but it's somewhat clunky used this way. Instead, there are two ways to create "zoomable presentations" that let you zoom in or out over several levels, or do flashy animation by spinning and twisting text, images, etc.


Prezi is a very new presentation tool that began making waves early this year. I got my beta pass in April and have been trying it out since then. It's really hard to describe Prezi, so rather than try, let me embed a few presentations created by librarians on Prezi (more here).

The first is by Lynnwood Library to publicise their summer reading program.

The second is a library orientation for graduate students at Western Libraries in Canada.

The third is an example of teaching research, from Carol Skalko of University of New Haven Libraries.

I could go on; there are hundreds of excellent examples from libraries and librarians. Besides the standard conference presentations on web 2.0 (example, example 2), there are some unique ideas (floorplans? covers of new additions? a tongue-in-cheek quiz?) and presentations to management (example), though announcements of changes (example) and orientations and tutorials (example, example 2, example 3) seem to predominate.

You can embed pictures, movies, even PDFs and PowerPoint slides! The free version:

  • gives you 100 MB of space, which is plenty until you start embedding videos 
  • lets you run the presentation in an online version (the embeds above) 
  • lets you download an offline player that requires only Flash to run and works even without an internet connection 
  • does not allow private presentations (the Pro version does) 
  • restricts you to the online editor (the Pro version adds a desktop editor, so you don't need internet access to build your presentations)

My very preliminary use was to present quick results of a LIBQUAL survey internally, but that was just a test.

Prezi isn't easy to use

One issue with Prezi is that it isn't very intuitive to use, particularly since the UI seems to take delight in being stylish rather than functional. You don't access functions through a normal toolbar (or ribbon, these days); instead you get a "bubble menu".

Resizing items via the "transformation zebra" is particularly painful.

Don't get me started on pathing (the focus movement when you click on the forward arrow).

All in all, it took me a minimum of 3-4 weekends before working in Prezi became halfway natural. That doesn't sound so bad, but there's still another major problem.

People who know how to make PowerPoint "dance to their tune", adding text in different fonts and animation of all sorts, still produce bad presentations because they don't know what to do with those capabilities.

Prezi has this problem tenfold. It requires a very different way of thinking: there are no slides (though there are frames that let you group items), you can add items at various levels of scale (for surprise value), and you can add different rotation angles. People who learn Prezi tend to go overboard with this, but how much rotation to use, and how best to "tell a story with zoom", is an open question.

Even after I could get Prezi to do pretty much what I wanted without difficulty, I still spent hours looking at good Prezi presentations, trying to tease out general principles on why they worked and which rotation techniques work and which don't. The tutorials are a start, but there are still a lot of unanswered questions about what actually makes for good Prezi design. I wouldn't hold my breath for answers, since this is very new ground, though I wager the official blog has more answers than any other place.

I struggled with Prezi for a long while, and though I wasn't satisfied, I eventually presented using it. I think there was a bit of a stir when I started; I believe people were reacting to Prezi rather than my content, and indeed I got questions about Prezi in private after I finished. Apparently, all that rotation was exciting enough to "wake up" some of the audience!

Still, I think if I had just stuck with PowerPoint, I might have had more time to work on my preparation rather than the final mess (including a lack of preparation that showed) that emerged. Oh well, I'll do better next time.

pptPlex - Making PowerPoint Prezi-like

After the conference, while doing research for this blog post, I discovered pptPlex, an experimental add-in for PowerPoint 2007 from Office Labs (I didn't realize Microsoft Office had a "lab")! The idea is that with pptPlex you can easily convert PowerPoint presentations into zoomable presentations.

Installing it gives you a new pptPlex tab.

Essentially, you divide your PowerPoint slides into sections (click on "insert new section" - see below), and each section can then be treated as an entity that can be zoomed in or out.

In the video below, I show an example of an old library orientation presentation that I quickly converted using pptPlex. In this example, the "Introduction" section is made up of 3 slides. The arrow keys allow you to go forwards and backwards.

By default, pressing the forward arrow zooms to the section title; a second press zooms to the first slide in the section, a third press moves to the second slide, and so on until the last slide in the section, at which point it zooms back out to the section title before moving on to the next section.

This of course mirrors the old adage: "Tell them what you will tell them, tell them, then tell them what you told them."

Still, you can change this behavior in the advanced options.

You can also view in "free form": click on any section to zoom in, click on any slide to zoom in further, and you can even zoom in on any part of a slide. Right-click to zoom out.

So far, the above example uses a "blank canvas"; the next two examples use pre-created canvases.

It's trivial to add a pre-created canvas: just click on "canvas background" (see below).

Adding a canvas simply adds a special slide at the head of your presentation, which you can edit (see below).

You can move sections around and resize each section, or even create your own canvas.

pptPlex - Not for everyone?

The main advantage of pptPlex is that it's almost trivial to use. Really, it took me only 15 minutes to convert each presentation. When people create PowerPoint slides, they already naturally sub-divide groups of slides into sections most of the time, so what pptPlex does here isn't that alien.

It's not as flexible as Prezi: you can't do rotations, add movies, etc.

As a matter of fact, it doesn't currently work with normal PowerPoint animation. Also, as noted, it works only with PowerPoint 2007 and not older versions (but see Counterpoint?).

Another negative point is that it takes a while to convert a presentation to a zoomable one with pptPlex (my 90-slide EndNote presentation took over 3 minutes), and there doesn't seem to be any way to package it so it works on all systems. This can be a problem if you have to present on systems you don't control: machines running older versions of PowerPoint, or even machines with PowerPoint 2007 that don't have the pptPlex plugin installed (which is basically 99% of systems).

Okay, what about remoting into your own system that has the plugin installed? Not a good idea: the moment you try this, you get a warning that pptPlex can be very slow over remote access. I tried it, and it is indeed slow, even after building the presentation.


I reviewed 3 ways to create zoomable presentations: ZoomIt, Prezi and pptPlex.

Are there other ways to make presentations more interesting and more importantly clearer for your audience? If you have tried any of the 3, I would love to hear your experiences with them.


I was introduced to the idea of using ZoomIt by Kenneth Lim, another techy librarian from NUS Libraries, and have been using it for my EndNote training sessions. Thanks Kenneth!


Creative Commons License
This work is licensed under a Creative Commons Attribution-ShareAlike 3.0 Unported License.