Monday, May 20, 2013

My experience visiting China for the SerialsSolutions Greater China User Group Meeting

Last week, I had the amazing opportunity to visit Xi'an, China from 9 May to 11 May 2013 to attend the Greater China SerialsSolutions User Group meeting.



Regular readers of my blog will know that since 2011 I have been reading (see the list of articles I am curating), thinking and blogging about discovery systems, leading up to the implementation of SerialsSolutions' Summon at my institution in 2012.

I have tried to keep up to date with what pioneering librarians and libraries around the world have done with discovery, and have interacted with and learnt much from librarians in the UK, US, Australia etc. via Facebook, Twitter and blogs.

Obviously this was a very Anglo-Saxon view of things, but hard to avoid, given the nature of the social networks I was on.

But this User group meeting was in China! I was excited to have a chance to have contact with librarians in China to see what they were doing with Summon and learn about librarianship in China as I had never been to China before in my life.



Preparing for the User Group Meeting

Of course by now, I have attended a few library conferences overseas and am even fairly adept at giving talks at conferences (eg Internet Librarian International last Nov), but this time it was particularly tricky because the whole meeting would be in Chinese and I would have to present in Chinese.



For the benefit of international readers,  let me explain why that would be tricky.

While it is true that Singapore is majority ethnic Chinese (about 75%), and Singaporean Chinese like myself study Mandarin in school as our mother tongue, English is our first language (though that may not be apparent from my odd lapses in written and spoken English, I bet) and the medium of instruction in schools. We also use it in the workplace to communicate with all Singaporeans, including non-Chinese Singaporeans.

We are supposed to be bilingual in theory, but for many, including myself, it effectively works out that while I can use Mandarin for everyday conversations, e.g. to talk about shopping, food, movies and Chinese songs (I listen to Chinese pop songs as well as English ones!), I struggle when it comes to professional terms, as I studied librarianship in English.

Quick, what's "catalogue" in Chinese? Or even "metadata"?

Initially it was suggested that I do the presentation in English and SerialsSolutions staff from China would translate (there were other presentations by American and Australian SerialsSolutions staff done that way) but I decided to stretch myself and try to give it in Chinese.

I generally don't write out every word I want to say in a presentation, though this time I thought it was prudent to do so. I translated what I wanted to say from English with the help of Google Translate and additional help from colleagues from our Chinese Library, but still ended up with a pretty simplified presentation because I thought it best to keep things simple given my limited command of Chinese.

As a sidenote, I was quite impressed by how well Google Translate worked. It was pretty good at translating even very technical terms, and while it sometimes got the grammar and word order wrong, it was usually spot on.

I also read a couple of articles on discovery in Chinese and this helped me pin down terms like "Unified search platform".

My institution also has a Chinese library, but I must admit that until recently I didn't really focus on Chinese-language searches. Before leaving for China, I looked up what queries people were doing in Chinese (about 6-8% of queries were in Chinese).

I was also reminded of a feature of Summon that I had read about before but forgotten: changing the interface language doesn't just change the text labels of the UI, it also changes the search algorithm applied. In most cases it seemed to make no difference to the ordering of search results, but in some cases you might get better results if you changed the interface to Chinese and searched in Chinese, as opposed to searching in Chinese using the English interface.


The user group meeting






The user group meeting was hosted by Xi’an Jiaotong University at the Nanyang Hotel. I was nervous as I was the third presenter, after presentations by Peking University (the flagship Summon library in China) and Xi’an Jiaotong University.





I was not sure what I expected but I did discover two things.

Firstly, I generally had no problems understanding the presentations even though they were in Mandarin (save one extremely technical presentation about some complicated custom integration of Summon with an OPAC system, which I suspect would be difficult for me to grasp even in English).

When they said the term for say "relevancy ranking of search results" in Chinese, I had no problems knowing what they said, though the reverse doesn't apply and if I wanted to say that in Chinese  I often came unstuck :)

Secondly, it became apparent to me that the Librarians in China were mostly facing many of the same issues as librarians around the world.

I had no problems understanding, and in some but not all cases nodding in agreement with, the points made, e.g. the difficulty of selecting appropriate packages in 360 Core and relevancy ranking issues.




On the second day during the round table session, requests were made by reference librarians in China for features including the ability to sort by citation count, the ability to filter by database, social sharing features etc. Again, these requests weren't unique to users in China; I have heard such requests from our own users and librarians.

But by now I am familiar enough with the philosophy of Summon to know such requests were unlikely to be supported without strong evidence these would be used by searchers.

Of course, like every local market, China has unique requirements and features, including censorship, discussions about working with the China Academic Library and Information System (CALIS) - the China consortium group - to create packages for selection, and libraries presenting on Chinese ebook batch loading.

And of course there was concern that while Summon had very good coverage of Chinese material, compared to some local Chinese discovery systems it was still weaker, and a discussion on whether this was truly a problem.

From the admittedly simplistic point of view of a librarian outside China, it seems to me that if the best university in China - Peking University - has chosen Summon, there is some assurance at least that Summon has reached a certain level here, though obviously it can be improved further, particularly if Chinese material is your main concern.

It was also impressed on me how much Summon benefited from collaborating with Peking University. The university helped Summon with relevancy ranking of searches in Chinese and, I think, helped to provide a thesaurus/dictionary list of 2.7 million Chinese names etc.

There were also discussions of the possibility of using Summon's API to populate institutional repositories (probably not), and of future developments. Unfortunately I promised not to blog about some of the possible future developments mentioned, though I think I can say that SerialsSolutions is working hard on further improving relevancy ranking.

It was also announced that 10 universities in China are currently signed up with Summon as well as other high profile signups around the world including Yale and Cambridge (I think).

Somewhat amusing is that I also sat through my third talk on the upcoming Summon 2.0, by 3 different presenters to boot at 3 different occasions. :)

It was not all about Summon, as this was a SerialsSolutions user group meeting; there were also presentations and discussions on 360Marc, 360Counter and Intota.

Interaction with librarians

Sadly, even at the best of times I am quite introverted, and this time my doubts about my command of Chinese made it even harder for me. Thankfully, some librarians from China took the initiative to talk to me, and I tried to converse in my poor Chinese about librarianship in general, e.g. the image and perception of librarians in China.

Some librarians I spoke to were also from Universities that traditionally have strategic alliances with Universities in Singapore and a few others also mentioned colleagues currently working in libraries in Singapore.

This led me to think about the possibilities of exchanges and strategic alliances between libraries in Singapore and China as well as in other countries.

Coincidentally, upon returning I read about the online collaborative projects between Chinese librarian Hua Sun and American librarian Mark Douglas Puterbaugh, in a paper entitled Using Social Media to Promote International Collaboration. The paper describes how interaction via the Facebook group Library related people led to fruitful international collaborations.


As a sidenote, there's a certain librarian in Singapore who seemed pretty famous in China as I was asked by at least 3 librarians whether I knew her and asked to pass on their best wishes. :)



Besides Chinese librarians, I also had the chance to meet and chat with John Law, Vice President, Discovery Services, SerialsSolutions, who was at the user group meeting as well. The librarians in China were calling him "the Father of Summon" and it was interesting to hear his take on why he came up with Summon.





Travel & Sightseeing

















As is traditional for me, I combined work with sightseeing: I extended my trip by a couple of days and took the opportunity to tour Xi'an after the user group meeting. This was my first ever visit to China, and Xi'an is an ancient city that was the capital and seat of power of many past Chinese dynasties.

I visited the Terracotta Warriors (twice!), Huaqing Hot Spring (Huaqing Palace) etc. Since this is a librarian blog, not a sightseeing blog, I won't describe further what I saw and experienced, but I will say that if you are into culture and history, Xi'an is definitely a good place to visit.

Obviously, it was a very interesting and educational trip for me, my very first trip to China!

I would like to thank the staff of SerialsSolutions and Xi’an Jiaotong University for graciously hosting us and showing us around.












Thursday, April 25, 2013

How are discovery systems similar to Google? How are they different?

Like many academic libraries, we recently launched our discovery service Summon. Having worked intensively on this project since 2011 during the evaluation followed in 2012 by the implementation phase, I had an opportunity to delve into the topic perhaps deeper than many of my colleagues not on the team.

I would guess most librarians probably see Summon and its competitors as "Google, but for academic research", or see them as "Google Scholar like". For sure, users see it that way and so did I.

In a way this isn't a bad way to understand Summon. Similar to Google, Summon builds a centralised index that it queries whenever you search, so you can get almost instant results. Of course this isn't how older library federated search products work: those pull in results in real time from multiple sources (the library equivalent of web metasearch services like ixquick) rather than storing the data beforehand in a single index.
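To make the contrast concrete, here is a minimal sketch in Python. It is a toy under stated assumptions, not how Summon or any federated search product is actually built: the two "sources", their record ids and all function names are made up for illustration.

```python
from collections import defaultdict

# Hypothetical toy "sources": each maps a record id to its text.
SOURCES = {
    "catalogue": {
        "c1": "history of singapore",
        "c2": "discovery systems in libraries",
    },
    "journals": {
        "j1": "relevancy ranking in discovery systems",
    },
}


def build_central_index(sources):
    """Harvest every source ahead of time into one inverted index
    mapping each term to the set of record ids that contain it."""
    index = defaultdict(set)
    for records in sources.values():
        for record_id, text in records.items():
            for term in text.split():
                index[term].add(record_id)
    return index


def search_central(index, query):
    """Discovery-style search: answer from the prebuilt index only."""
    terms = query.split()
    if not terms:
        return set()
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result


def search_federated(sources, query):
    """Federated-style search: visit every source at query time
    and merge whatever each one returns."""
    terms = query.split()
    hits = set()
    for records in sources.values():
        for record_id, text in records.items():
            words = text.split()
            if terms and all(term in words for term in terms):
                hits.add(record_id)
    return hits
```

Both functions return the same records for the same query. The difference is when the work happens: the central index pays the harvesting cost once, up front, while the federated version has to visit every source on every search, which is why it is slower in practice.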

Similar to Google, Summon generally isn't restricted to searching just the metadata or the bibliographic record and searches through full-text of most journal articles and many books - if these are available or provided.

Other similarities include the holy grail of "the one search box" that searches "everything" (or close enough) and heavy focus on relevancy ranking to surface desired results.

As a sidenote, relevancy ranking isn't really new to library catalogues by now (for example our "next generation catalogue" Encore has relevancy ranking and so does the older webopac), but one thing that is often missed by librarians is that because Summon searches full text rather than just the metadata/library record, Summon's relevancy ranking owes more to how typical web search engines work and is to a large extent unpredictable.

Even if you knew the exact formulas and weightings of each factor, you would have to crunch the numbers, and it probably doesn't work such that "no matter what, this journal must appear on top because it matched 245$a and....". For sure you can't "explain" why this result appears on top rather than another.

As stated in my post, How is Google different from traditional Library OPACs & databases?, Summon is probably as close to Google/Google Scholar as any library-associated search currently out there, with features like autostemming and search over full text. Summon 2.0 will come even closer by adding automatic query expansion that will automatically search synonyms.



Other upcoming features like "topic explorer", which pulls in short entries from reference sources such as Britannica Online and Wikipedia, remind me of a very primitive form of Google's Knowledge Graph, at least visually (as far as I know Summon has no semantic search). For example, compare the following result from Google for "heart attack".



With the topic explorer in Summon 2.0

http://www.serialssolutions.com/en/services/summon/summon-2.0

I would add that such "topic pages" are not unique to Summon; for example, EBSCO Discovery Service is adding topic pages.

Summon 2.0's content spotlighting that "Groups newspaper content for easy identification" and "Local collection and image spotlighting" reminds me of how Google's "universal search" dynamically shows content from Google News and images when necessary.

Below is a Google search with news items being distinctly grouped and highlighted.


In short, both functionally and visually, Summon is getting very close to Google, with the main exception that it does not do a soft AND, i.e. it doesn't occasionally drop terms from the search.

A sidenote: there are metadata fields in Summon that are never displayed to the user but are indexed and matched, so occasionally Summon might appear to do a soft AND and pull out results that do not match all terms (taking into account stemming), but it's just an illusion.
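For readers unsure what a "soft AND" means in practice, here is a rough Python sketch of the two behaviours. It is only an illustration: the documents, the function names and the `max_dropped` parameter are invented, and neither Google nor Summon is claimed to work this way internally.

```python
def strict_and(docs, query):
    """Return ids of documents that contain every query term."""
    terms = query.lower().split()
    return [
        doc_id
        for doc_id, text in docs.items()
        if all(term in text.lower().split() for term in terms)
    ]


def soft_and(docs, query, max_dropped=1):
    """Return ids of documents that contain most of the query terms,
    tolerating up to max_dropped missing terms. Documents matching
    more terms come first; ties are broken by id."""
    terms = query.lower().split()
    needed = max(1, len(terms) - max_dropped)
    scored = []
    for doc_id, text in docs.items():
        words = set(text.lower().split())
        matched = sum(1 for term in terms if term in words)
        if matched >= needed:
            scored.append((matched, doc_id))
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [doc_id for _, doc_id in scored]


# Hypothetical three-record collection.
DOCS = {
    "a": "heart attack symptoms",
    "b": "heart disease",
    "c": "attack of the clones",
}
```

With these records, `strict_and(DOCS, "heart attack")` only returns record `a`, while `soft_and(DOCS, "heart attack")` also lets in `b` and `c`, each of which matches only one of the two terms.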

As such, I think that while most librarians know how Summon is similar to Google/Google Scholar, what is often not mentioned is how different Summon is from Google. These differences are often technical, but I suspect they drive a lot of unhappiness towards discovery services because they can't meet "Google level expectations".

I am not a technical expert, but I believe the main difference between Google/Google Scholar and Summon stems from the fact that

Google mostly obtains knowledge of webpages/articles by crawling such pages and harvesting them directly using spiders, while Summon generally doesn't.





See also Google's Inside Search





This difference has 2 effects

1) Less stability in links
2) Less capability in relevancy ranking

Have you ever wondered why Google or Google Scholar seem to have a much lower broken links rate despite covering so much ground?

Essentially, Google has bots that go out to different webpages and capture the information on those pages, and from those pages the bots crawl on to other pages via the links they find.
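The crawling idea can be sketched in a few lines of Python. This is a deliberately simplified illustration: instead of fetching real pages it walks a made-up, in-memory "web" (the `TOY_WEB` dictionary and its URLs are hypothetical), and real crawlers deal with far more, such as robots.txt, politeness delays and re-crawling.

```python
from collections import deque

# Hypothetical miniature "web": page URL -> links found on that page.
TOY_WEB = {
    "a.example/home": ["a.example/paper1", "b.example/home"],
    "a.example/paper1": ["b.example/paper2"],
    "b.example/home": ["b.example/paper2", "a.example/home"],
    "b.example/paper2": [],
    "c.example/orphan": [],
}


def crawl(web, seed):
    """Breadth-first crawl: start at the seed page, record each page
    visited, and follow links to pages not seen before."""
    seen = {seed}
    queue = deque([seed])
    visited_in_order = []
    while queue:
        page = queue.popleft()
        visited_in_order.append(page)
        for link in web.get(page, []):
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return visited_in_order
```

Note that `c.example/orphan` is never reached from `a.example/home` because nothing links to it. A crawler only knows about pages it can get to by following links (or that it is told about directly), which is the flip side of the same mechanism.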

Google scholar is similar 

"Google Scholar uses automated software, known as "robots" or "crawlers", to fetch your files for inclusion in the search results. It operates similarly to regular Google search. Your website needs to be structured in a way that makes it possible to "crawl" it in this manner. In particular, automatic crawlers need to be able to discover and fetch the URLs of all your articles, as well as to periodically refresh their content from your website."

I recently led a workshop on using Google Scholar for bibliometrics, and despite my best efforts, based on the questions asked, I suspect many attendees just couldn't wrap their minds around how Google Scholar obtains entries for indexing compared to how Scopus and Web of Science work.

http://www.google.com/intl/en/scholar/inclusion.html pretty much sets out the inclusion guidelines for what Google Scholar will index.

Essentially, a PDF file that looks vaguely article-like (e.g. title in a big font, authors on one line below it, a section titled References etc.) and sits on an .edu domain will be considered scholarly and included in Google Scholar if the spider comes across it.

I believe Summon generally does not find information to index this way (I could be wrong).

This difference means that in general Summon relies mostly, if not fully, on the quality of the information given by publishers etc. (whether via FTP/USB/OAI-PMH) and does not really "know" if that information is correct, as it has not really "seen" the page or article in question on the site.

While Summon and competitors in its class try to obtain full text as well as metadata whenever possible, they rely heavily on the cooperation of the content owner. So often Summon may just have the metadata but not the full text, particularly for smaller, less technically capable content owners. By comparison, Google Scholar can pretty much grab "everything", full text and all, as long as its spiders are allowed in. My anecdotal testing shows this sometimes makes a big difference. For example, compare the results for the search Summon eds discovery in Google Scholar versus in Summon, and you will notice more relevant results appearing in Google Scholar due to more full-text indexing, even though most of the articles shown in Google Scholar are indexed (metadata/abstract only) in Summon as well.

This also means that, unlike in Google, linking in Summon is going to be less reliable. Let's leave aside the complication of journal articles residing in different locations and the need to use OpenURL resolvers, and assume each article resides only at one publisher.

Google is generally sure that when it displays a link, the webpage exists; at least at the point in time when the bot harvested the page, it was definitely there. And because Google directly checks whether the page exists, it can easily do link checks and fight link rot. It can even tell which domains tend to have more broken links and penalize such sites.

Imagine if Summon had such data and could use it to automatically adjust OpenURL database ordering when there are multiple copies available.

I don't think Summon has a way of knowing which links are broken, though. Even though Summon has "Index-Enhanced Direct Linking", which uses information from the publisher for more reliable linking compared to OpenURL linking, it is still not directly checking to see if the article exists. For instance, I notice many of these partnerships seem to be using DOIs, and believe it or not, DOIs occasionally still do not resolve properly.

The other thing that people like to moan about is the relevancy ranking. Why isn't Summon's as good? Don't get me wrong, Summon's is very good, but I doubt anyone would say it's better than Google's, and I would guess many if not most would say it isn't as good. I also have anecdotal evidence: so far the dedicated Google Scholar users I know of have not switched to Summon, though they acknowledge Summon is a very good effort, signalling that at the very least Summon isn't so much better as to be worth switching.

Google has a very sophisticated ranking system of course. It can rank based on social signals, usage, click tracking data etc., which leads to fears of filter bubbles where you get totally different results depending on who you are, when you search, where you are when searching etc.

In any case, I don't believe Summon currently uses any of this, though I would love to see Summon take into account click data, usage etc., whether at an institutional level or a global level, if it hasn't already, similar to how Summon generates "related search" suggestions.

Summon related searches


But the better relevancy also stems from the fact that because Google directly crawls each page, it can study the linkage patterns between webpages, leading to the famous PageRank algorithm. As you know, each inbound link is a "vote of approval" from the source page that the destination page is important. While this may not be as dominant a factor as it used to be, given other "signals", it's easy to believe it is still very useful for Google.

There's a beautiful explanation here.

"The Web is a complex network of interlinked documents and files. It's vast. It's open. Although much of its data is not very well-structured, it does at least share a common structure (HTML, XML) and a common infrastructure. You can write a program that crawls from document to document on the Web and automatically gleans lots of contextual information based on what links to what, the text in which the link is embedded, and lots of other contextual clues. The contextual data might not be 100% accurate, but it's incredibly rich."

Then it goes on to explain why library data is different.

"Library data, on the other hand, consists mostly of various separate pools of records/resources that, 1. have little (if any) contextual data, 2. are not linked together in any meaningful way (not universally and not with unambiguous, machine-readable links), 3. do not share a common structure, 4. do not share a common infrastructure, and 5. are generally not freely/openly available. So much of what Google has leveraged to make Web search work well is simply not part of library data. "

This is for Google, but I would guess it applies to Google Scholar as well, to a lesser degree.
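The "each inbound link is a vote" idea behind PageRank, mentioned earlier, can be sketched with the textbook power-iteration formulation. This is only the classic simplified version, assuming a tiny made-up link graph and the commonly quoted damping factor of 0.85; it is not a description of what Google runs today.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Simplified PageRank by power iteration.

    links maps each page to the list of pages it links to.
    Returns a dict mapping each page to its rank; ranks sum to 1.
    """
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Every page gets a small baseline share (the "random jump").
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its vote equally among the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its vote over all pages.
                share = damping * rank[page] / n
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank


# Hypothetical graph: a, b and d all link to c; c links back to a.
LINKS = {"a": ["c"], "b": ["c"], "c": ["a"], "d": ["c"]}
```

In this toy graph, `c` ends up with the highest rank because three pages "vote" for it, and `a` comes second because its single inbound link comes from the highly ranked `c`. A page's importance depends on who links to it, which is exactly the kind of signal that is hard to get without crawling.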

For Summon, the closest equivalent we have is citation data from Web of Science/Scopus. I have no information on how this is used, but regardless, given that most articles are not even cited once (at least as seen in the citation indexes of Scopus or Web of Science), this citation web is a very poor substitute for the link analysis Google uses.

I would add it's well known that Google Scholar generally shows more cites than Web of Science for the same article, due to the "looseness" of what is considered a cite, so this technique of weighting results based on cites is far more effective for them.

Can Summon further improve the relevancy ranking? Yes. For example, Google is famous for personalizing search results, using either the fact that you are logged into a Google account or a long-term non-expiring cookie, as well as hundreds of other cues including social media related ones.

Google Scholar, as far as I can tell from doing the same search on different systems and IPs, isn't that personalised, but that's beside the point.

Could Summon do personalised results? In theory it could take into account, for logged-in users, what discipline they are in, what level of study they are at etc., similar to what Primo's ScholarRank claims to do.

But this would still lack the link analysis Google can do by studying the web as a graph of inter-related articles.

One wonders if adding data from citation managers like CiteULike and Mendeley could help improve relevancy ranking, though of course if altmetrics takes off (in many ways this would be the "social signals" of scholarly works), Summon could exploit that as well.

Beyond that, I am not sure what the solution is for better relevancy. Perhaps moving towards a "linked resource discovery environment" (a concept I don't fully grasp) would help, but that would be a fundamental change compared to the shift towards web scale discovery services. And as more and more content gets sucked into Summon and its competitors, this problem of relevancy ranking is not going to get better.

Conclusion

This post is just my educated guess on how Summon and Google work and I might be totally wrong. If you have more knowledge and are aware of errors, please help share what you know in the comments.

Monday, April 1, 2013

More good library-related videos that spoof movies or TV

Some of my most popular blog posts in 2010 include 12 good library videos that spoofs movies or tv and Funniest library related movies made using Xtranormal.

It has been almost 3 years since then, and libraries have been hard at work creating more interesting yet professional videos. These are some of my favourites, including some I missed the last time around.

1. The Research Games - by Texas A&M University Libraries 




Everyone loves a good spoof, and this one by Texas A&M University Libraries is a high quality spoof of The Hunger Games. The theme fits beautifully, with librarians taking the roles of mentors/ex-victors, giving advice.

It's a very high quality production. If there is any weakness, it is that if you haven't read the book or watched the movie (which I hadn't at the time this video was released), you may not catch all the references.

After watching the movie and reading the book, I really appreciated how clever this was.

Don't miss the concluding episode here.



2. Research Rescue  - by The Harold B. Lee Library Multimedia Unit 




In our original 12 good library videos that spoofs movies or tv, we included at #8 a "Cops" like spoof. But this one by The Harold B. Lee Library Multimedia Unit  looks even better.

It actually makes a librarian look really cool, I want a "Research Rescue" badge too! Incidentally it made me realise the phrase "Research rescue" is actually used by a few libraries!

Watch Episode 2 "Book Fort" and concluding Episode 3 "And We're Done"


3. BR | Harold B. Lee Library Book Repair by The Harold B. Lee Library Multimedia Unit




Seriously, I could fill this list with just productions from The Harold B. Lee Library Multimedia Unit. Among the ones I liked are the short but effective videos that use unreliable sources like fortune tellers and used car salespersons to drive home the point of using reliable sources. See Library Databases | The Card Reader, Library Databases | The Used Car Salesman and Library Databases | YouTube Kid.

I also liked the warm, moving THE Library | What Changes Us video as well as the National Treasure-like Special Collections | Theatrical Trailer, not to mention the famous Old Spice spoofs.

But in the end the one I am going to showcase is BR, book repair, a spoof of the opening credits of the TV show ER. If you have ever watched the show you will marvel at how good this is. I would add this concept isn't new; see Arlington Heights Memorial Library's Technical Services for a less polished example.


4. The Science Network - A Social Network Parody



Not sure who did this one, but it's a brilliant spoof of the trailer for The Social Network (http://youtu.be/2RB3edZyeYw), but with PubMed instead of Facebook as the subject. Arguably Mendeley is a better fit :)


5. Find the Future at the New York Public Library Game Trailer by NYPL



I have written in the past about how adept the New York Public Library is at using social media; they of course also produce high quality videos. There's The Haunted Library and also the NYPL Milstein Suspense Trailer. Still, the Find the Future NYPL Game Trailer, with its X-Files type feel, is my favorite by far, though the NYPL Milstein Suspense Trailer comes close.

6. The Most Interesting Librarian in the World by Library and Information Science grad students at the iSchool of Syracuse University




By a group of library students, this spoofs the by now famous "Most interesting man in the world" ads, and is of course a famous meme. Here's another similar spoof involving a real-life librarian


7. Detection Trailer- Inception Parody/Spoof for Burlingame Public Library

Haven't seen any other Inception parodies involving libraries. This was done as a promotion for a detective-type game at Burlingame Public Library.

8. Victory Lap by The Harold B. Lee Library Multimedia Unit



Okay, I couldn't resist adding one last one by The Harold B. Lee Library Multimedia Unit. Entitled Victory Lap, it's so much fun that I had to include it.


Honorary Mentions

I have always been impressed by the level of professionalism by the Arizona State University Libraries "The Library Minute" series. Smart, cool and hip.

I know librarians modifying hit songs and doing musical style videos is so overdone (eg Lady Gaga, I will survive, Thriller etc) but I just have a soft spot for Read it Maybe (NYSRA 2012)  

Are there any other library videos you like? Let us know in the comments.

Monday, March 11, 2013

4 ways to bring users to your library resources from Wikipedia

Surveys of PhD students in the UK as well as researchers in the US, not to mention ordinary users, have shown that the academic library site is increasingly declining in importance as a starting point for searching.

Besides Google, the main site they go to is Wikipedia, either by going there directly or via Google, because it ranks highly in Google for most topics. There is even a name for it, GWR or Google > Wikipedia > References: the process where people Google, click on a Wikipedia result and look at the references.

I won't go through all the debates about Wikipedia among librarians, though Wikipedia is not wicked is probably the most spirited defense of the "pro side" of the matter. Suffice to say, librarians should look for ways to enable users to get from Wikipedia pages to library resources more easily.

But how? What follows are 4 ways I know that allow users to link back to library resources easily, using


1. Bookmarklet

2. LibX browser plugin

3. Wikipedia "book sources"

4. Wikipedia "Library resource box"

If you have no time, I highly recommend you look at the 4th method. It is a must read.


1. Bookmarklet

I suppose if you read this blog, you know what a bookmarklet is, but in case you don't, it's just a simple bookmark containing some JavaScript that, when clicked, carries out a simple action.

Barbara Arnett and Valerie Forrestal, way back in 2010 in Bridging the gap from Wikipedia to scholarly sources: a simple library bookmarklet, showed us how to create a bookmarklet that does the following when clicked on a Wikipedia page.

1. It takes the Wikipedia title

2. It throws it into a search (you can edit it first), which brings the user to the library's search - in this case EBSCO Discovery Service.

Here it is in action



Obviously it's trivial to change this to Summon or whatever search you want. But that's not all: cleverly, they built in Google Analytics, so you can keep track of usage/clicks of the bookmarklet.

They helped me adapt this trick for our highly popular proxy bookmarklet, so now I can tell how popular it is.

This nifty trick was adapted by other libraries including MLibrary. There are some doubts about whether people would bother to set up a bookmarklet or remember to use it, but that's the beauty of this bookmarklet: you don't have to guess, the analytics are there.

I currently don't use this bookmarklet, and in the past I would probably have said that no one but a niche audience would use bookmarklets. But looking at the heavy usage of our proxy bookmarklet (possibly the subject of a future conference talk, so I won't say much except that it's insanely high), I wouldn't rule out the possibility of this bookmarklet being used.
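The two steps the bookmarklet performs (take the Wikipedia title, hand it to a library search) are simple enough to sketch. The real bookmarklet is JavaScript running in the browser; the following is just a Python illustration of the same logic, and the search address `library.example.edu` is a made-up placeholder that a library would replace with its own Summon, EDS or catalogue search URL.

```python
from urllib.parse import quote_plus, unquote, urlsplit

# Hypothetical search endpoint; substitute your own library's search URL.
LIBRARY_SEARCH_URL = "https://library.example.edu/search?q="


def title_from_wikipedia_url(url):
    """Step 1: pull the article title out of a Wikipedia article URL."""
    path = urlsplit(url).path
    prefix = "/wiki/"
    if not path.startswith(prefix):
        raise ValueError("not a Wikipedia article URL: " + url)
    # Wikipedia uses underscores for spaces and percent-encoding in URLs.
    return unquote(path[len(prefix):]).replace("_", " ")


def library_search_url(wikipedia_url, base=LIBRARY_SEARCH_URL):
    """Step 2: build a library search URL for that title."""
    return base + quote_plus(title_from_wikipedia_url(wikipedia_url))
```

For example, `library_search_url("https://en.wikipedia.org/wiki/Eulerian_path")` gives `https://library.example.edu/search?q=Eulerian+path`. The usage tracking the authors built in is not shown here.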


2. LibX browser plugin

So maybe bookmarklets are hard to remember, but what about browser plugins? LibX by Annette Bailey and Godmar Back of Virginia Tech is probably the most famous of them all.

It's a free service that any library can set up, and it gives you a host of functionality that makes it easy to go from any webpage to library resources.

Among my favourites are hot-linking of ISBN/ISSN/DOI/PMID (basically it converts such strings to clickable library searches), appending of EZproxy on pages or links, and support for COinS, link resolvers, xISBN etc.

The latest version even integrates with Summon, so you can mouse over unique identifiers and check availability.




Watch the screencast here

In short, it allows users to interact with library resources using multiple methods even if they are not on the library page.

Most of these features work on all web pages and are independent of Wikipedia, but the support of COinS means there is some Wikipedia support. COinS, without getting too technical, is a way to mark up citation data in HTML so tools like LibX and Zotero can understand or parse the citation and use it to connect to full-text via your link resolver.

Or rather there *was* support: as of Nov 2012, COinS support was sadly removed.
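For the curious, the COinS idea can be sketched as follows. A COinS citation is an empty `span` with class `Z3988` whose `title` attribute carries the citation as OpenURL key/value pairs. The snippet below pulls those out of a page and hands them to a link resolver. The resolver address `https://resolver.example.edu/openurl`, the sample HTML and the function names are made up for illustration, and this is only a sketch of the general approach, not how LibX or Zotero actually implement it.

```python
from html.parser import HTMLParser
from urllib.parse import parse_qs

# Hypothetical link resolver base URL; use your own library's resolver.
RESOLVER = "https://resolver.example.edu/openurl"


class CoinsParser(HTMLParser):
    """Collect the title attribute of every <span class="Z3988"> on a page."""

    def __init__(self):
        super().__init__()
        self.citations = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        classes = (attributes.get("class") or "").split()
        if tag == "span" and "Z3988" in classes and attributes.get("title"):
            # HTMLParser has already turned &amp; back into & for us.
            self.citations.append(attributes["title"])


def extract_coins(html):
    """Return the raw OpenURL key/value string of each COinS span."""
    parser = CoinsParser()
    parser.feed(html)
    return parser.citations


def resolver_url(kev, resolver=RESOLVER):
    """Build a link resolver URL from one COinS key/value string."""
    return resolver + "?" + kev


page = (
    '<p>A citation <span class="Z3988" title="ctx_ver=Z39.88-2004'
    "&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal"
    '&amp;rft.atitle=Bridging+the+gap&amp;rft.date=2010"></span></p>'
)
for kev in extract_coins(page):
    fields = parse_qs(kev)  # decoded fields, e.g. fields["rft.atitle"]
    print(fields["rft.atitle"][0], "->", resolver_url(kev))
```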


3. Wikipedia "book sources"

Both methods above rely on users installing something, but most users will not. How about something built into Wikipedia?

There's apparently a feature called "Book sources" in Wikipedia.

It says

"This page links to catalogs of libraries, booksellers, and other book sources where you will be able to search for the book with ISBN. If you arrived at this page by clicking an ISBN number link in a Wikipedia page, then the links below (those labeled "find this book") search for the specific book using that ISBN number."



Confused? Here's how it works. Go to, say, the Wikipedia article Eulerian path.

You will see the following




Now click on the ISBN and it brings you to the Book sources page, this page in particular.

If you jump to the section on Singapore you see




Click on it and, you guessed it, you get an ISBN search in your catalogue. In this case, it works nicely as we have the book.






I only realised there was such a feature when I was looking at referrers to our catalogue and noticed a fair amount of them from Wikipedia.

Some were ordinary links in the "external links" section, but some were ISBN links.

This is a nice, if fairly obscure, feature, but it really isn't very convenient to use if you ask me.


4. Wikipedia "Library resource box"

None of the things I mention above are new. But this last one is. Rather than explain, let me just show you a Wikipedia article into which I inserted the library resource box. In this case it is the Wikipedia article Japanese occupation of Singapore.

At the bottom of the page in external links section you see this including the box I added.




If you click on "resources in your library" and this is your first time doing it, you will be brought to a library selection page.






Obviously you pick the library you are with, or better yet "set a preferred library for future searches" and when you click it will use the Wikipedia title to do a search in the library you selected.

In this case, it is our Summon search.

As you can see it's a very nice search result, showing off the strength of our collection including local theses, books etc.




In fact, what it uses to search is sometimes much more complicated than just the Wikipedia title. For example, you can override the search to use a Library of Congress subject heading instead of the Wikipedia title by adding

|lcheading=xxxxxxx .

You can see this in effect here

Other times it uses the closest match from a file of LCSH mappings kept in the system. It can also use VIAF, I believe, and if all else fails it just uses the Wikipedia title for a general keyword search. I am not sure if I got the explanation 100% correct, but I think you get the idea.
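As I understand it (and, as said, I may not have this 100% right), the choice of search works roughly like a chain of fallbacks. The sketch below is my own illustration of that idea, not the actual code behind the library resource box: the function name, the mapping dictionary and its contents are all made up, and the VIAF step is left out entirely.

```python
def choose_search(title, lcheading=None, title_to_lcsh=None):
    """Pick a (search type, search term) pair for a Wikipedia article.

    lcheading mirrors the |lcheading= override described above;
    title_to_lcsh stands in for the file of LCSH mappings (hypothetical data).
    """
    # 1. An explicit override always wins.
    if lcheading:
        return ("subject", lcheading)
    # 2. Otherwise try a known mapping from article title to subject heading.
    mapped = (title_to_lcsh or {}).get(title)
    if mapped:
        return ("subject", mapped)
    # 3. If all else fails, fall back to a keyword search on the title.
    return ("keyword", title)


# Made-up mapping purely for demonstration.
mappings = {"Example article": "Example subject heading"}

print(choose_search("Example article", title_to_lcsh=mappings))
print(choose_search("Unmapped article", title_to_lcsh=mappings))
print(choose_search("Example article", lcheading="Override heading"))
```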

It also can do something special if the article on Wikipedia is on a person. Below shows the one I inserted on the Wikipedia article Goh Keng Swee . 





You will see that, because I changed the options, there is now an "About Goh Keng Swee" as well as a "By Goh Keng Swee".

If you click on the links below the two, you of course get different results.



The above is the result of the link "about" him. It's a normal keyword search. Note the search uses an LCSH even though the Wikipedia title isn't exactly that and I didn't manually override the title search with a specific LCSH; this is some automapping mechanism, I think.

The one below shows the results after clicking "by" him, which obviously does an author search.




Personally I think this is a wonderful idea. The author of this system, John Mark Ockerbloom, has in my opinion hit on a great idea. You can see the blog post where he sets out the idea here. Specific instructions on adding the library resource box are here.

But how do you get your library into a list of libraries that appears when you click on the link? You simply request it and John will add it. He has been very quick to add libraries (most libraries use standard systems, eg, we use Summon which is used by over 500 libraries) and has also kindly and patiently answered all my questions about this great idea.

The great thing about this is that once I add the box in an article, all libraries in the system benefit. Right now, we are the only Singapore Library available in the list but it's trivial for John to add other Singapore libraries such as the National Library Board's etc and we all benefit!

I've added the box to a little under 100 articles, mostly on Singapore topics. I was also cautious, checking whether the search would give reasonable results. Part of my strategy has been to look at the most common searches in Summon, Google the same keywords, see which Wikipedia articles appear, and add the library resource box to those articles.

Of course this resource box complements the strategy of inserting direct links to open access/free resources or libguides into Wikipedia, and using the Google site operator I can see this is a pretty popular strategy with some libraries. But in some cases you might not really have something unique to link to, or you may have lots of interesting items and it is too much effort to include them properly in the article.

I intend to add more as time permits, and obviously I am studying our Summon logs to see how much traffic is driven to Summon this way (hopefully not too much by this blog post). The skeptic in me wonders if people will click on such links, as the box is usually placed in the last section. Or even in short articles, would they want to click to search the library?

Only time will tell.

Note: I just finished blogging this, and noticed that the comments to the blog post pretty much encapsulate this blog post including the first 3 ideas , but I hope this was still useful.










Sunday, February 24, 2013

What does it mean to be a librarian? I am not sure.

Time for some navel-gazing!

Sarah Kennedy asks What does it mean to be a librarian?

You know what? I'm not sure what it says about me that after 5 years in this profession and hundreds of blog posts, I have not once come even close to this topic.

Many of you who are regular readers of my blog probably know or can tell that it seems I love my job and I have even fallen into the trap of describing myself with the "P word". But now that I think about it, I am not sure if this is even accurate.

I enjoy the fact that I am always learning and trying new things and playing with ideas, and currently I am in a position where I have sufficient autonomy to push for change. Also, having spent more hours than I want to think of studying, researching and experimenting on librarianship, I have gone past "The Plain of Suckitude" for many aspects of what I do daily, so I enjoy a feeling of competence in exercising my professional skill.

Still, none of this is specific to being a librarian. One of the most important aspects of being passionate is the belief that you are doing something worthwhile, whether it be to change the world or even just a single person's life.

I read inspiring stories from my colleagues (if I may call them that) in public libraries all around the world, about how they helped the less fortunate people of our society with job hunting, helped the less IT literate connect with their children using technology etc, and I feel somehow I have fallen short because I have few such stories to tell.

How about the fact that as a librarian, I am in one of the oldest and noblest professions and in the line of the "guardians of human knowledge"? Does that give me purpose?

Actually, such descriptions always make me giggle, making me think one has watched too much "The Librarian" (also another time someone asked about "Indiana Jones" type adventures librarians tend to get!).

Seriously, it is not every day that someone gets to save Timbuktu's priceless manuscripts. Even day to day, due to my job scope and current position, which have nothing to do with preservation and digitization, I have never felt as if I am one of the "guardians of human knowledge" whose role is to "Safeguard and preserve human knowledge so it can be safely passed down to the generations to come".

Though at times, when I page through old books, or see recordings/writings by past librarians, I do get a sense of history ..... and hope that I am not messing up too badly the work of my predecessors, particularly since my institution is the oldest academic library in Singapore, with a 100-year history.

I guess it would be only a slight stretch to say I have been trying to build services since I began my career, but sometimes I wonder, given the limited resources each library has, whether I am diverting resources and, more importantly, attention from collections for a short-term gain. Would future generations care whether our library had a chat service, had built a good community around social media, or even had a discovery service that was used for 5 years before it was replaced by yet another round of "superior" technology etc?

How about the role of librarians as activists who help make society better by making information available for all? Currently, if you ask me, there is one big cause in our profession that has the potential to make the world a better place - Open Access.

While there are many many librarians on the front-lines trying to push for open access such as Barbara Fister (whose writings always make me think), I have to admit to my shame, I barely noticed this aspect of librarianship until recently.

Though now that I have spent time learning and thinking about it, I think it is blindingly obvious that it is the right thing to do, not for ourselves as librarians but for the world. Yes, I just read that last sentence back and it felt so cliche.... I understand of course that the status quo, which includes not just publishers but also librarians and academics themselves, will make it hard to achieve, and that such change, if achieved, will have both winners and losers.

To be fair, librarians alone almost certainly can't push the academic world to open access, because ultimately the academics themselves will have to say they want it. There is also legitimate concern among librarians that open access will make some aspects of librarianship obsolete, and who knows, we might even risk eliminating our own profession. So again I am not sure, and really I don't do grandiose big causes..., though I have a certain smaller vision and goal I am working towards.

Truth is, I didn't come into this profession trying to change the world, or even in the belief that it would help people lead better lives. It isn't glamorous to say this, but I didn't grow up wanting to be a librarian (I know a few librarians like this; they tend to be a different breed). Like most librarians, I suspect, I stumbled upon this profession while looking for something to do, and in my case at least found that I took to it like a duck to water and enjoy the work, so why not?

I can't give you a big speech on why being a librarian is a noble profession (is it nobler than being a teacher, nurse etc?) , or why our work will change the world for the better, or even that I am definitely not in it for the money (though there are easier ways to earn money eg finance, if you manage to get to the senior management levels the pay is pretty comfortable with a lot less stress and competition than other professions).

I don't even have a well articulated statement on why I want to be a librarian, something I am told some LIS schools are teaching their students. (I am lucky, I got a job without it)

All I can say is I wake up every day because I have so many exciting things and ideas to work on, and I always try to improve and fix problems to make life a little easier for our community; I feel an absurd sense of pride when I see my colleagues using things I have been involved in introducing, and maybe that should be enough?

I would guess (and here's the cynical part), the vast majority of librarians don't worry about such philosophical questions either and many go on to achieve great success in their careers whether in terms of professional advancement and/or accomplishing great things for the profession.

So what do you think? Do you have a well articulated statement of why you are a librarian? Or are you similar to me, enjoy the work without thinking too hard about the ideals of librarianship?

PS: For those who read this blog for ideas and the latest on tech and not personal rambling, do check out Qwiki, a free app that reminds me a lot of Animoto and allows you to quickly make interesting slideshows with your pictures. Another nice app to look at is the Sunrise app, which is now my default calendar app, while Any.do is my default task list, and so far it looks like it is taking over as my task app.

















Sunday, February 3, 2013

8 Library related book & article recommender systems I am aware of

In a recent LibraryThing blog post entitled Pew study: Library patrons want personalized recommendations, they noted that in the Pew study, 64% of patrons were interested in a library service which suggested books, audiobooks and DVDs to them based on their own preferences.

Though I believe this report is about public library users, I suspect this applies also to academic library users to some extent.

This also reminded me of a panel I moderated at IATUL 2012 last year. The panel was designed to pick the brains of non-librarians, and one of the panelists asked a room full of academic librarians why libraries did not have Amazon-style recommendations.

The responses from the librarians in the room were interesting, ranging from privacy issues due to the Patriot Act to a lack of technology.

I think by now most libraries, or at least academic libraries, have finished implementing web scale discovery systems, which are doing what Lorcan Dempsey of OCLC would call "aggregating supply". The next logical step is to "aggregate demand", using recommender systems.

While such recommender systems are not yet common in library or library-related systems, in either public or academic libraries, they are not totally unknown. Here are some I am aware of.

I start with recommender systems for books (mostly) and then move on to article level recommenders. I focus mostly on recommenders that make use of circulation data, or usage data of other users "People who borrow/read X also...", though matching based on other item characteristics might be included as well. Personalised recommenders that either ask for your explicit action to say rate books and make recommendations based on that or those that pull data from your past actions (e.g past borrowing, past action in reading articles, adding to lists) and match against other users are also included.

Ones that simply recommend based on themes or best selling lists are of less interest to me as they are not as personalised, so I left out apps like Gimme and the YALSA Teen Book Finder App (thanks @CarliSpina), and also non-library related recommenders - BookRx, WhatshouldIreadnext (I am less strict for the article recommenders)


1.  BookPsychic

As stated in the already mentioned blog post

"What is BookPsychic? Launched in August, BookPsychic is an easy and fun personal recommender system for library patrons—like Netflix or Amazon, but all about what’s in and what’s popular at your library. As you rate books and DVDs, BookPsychic learns more and more about your tastes, and comes up with recommendation lists. And everything shown or recommended is available at your library. Simple “bookstore” genres, like “Recent fiction” and “History,” help you zero in on the books you want."




You pick a library that is enrolled in BookPsychic, rate the books from the library system that are presented to you by genre, and you will see recommendations. Pretty simple and effective.




A nice touch is you can import ratings from other systems such as LibraryThing itself and Goodreads.



One thought does strike me: with so many "social reading" systems like Goodreads, which Lorcan Dempsey of OCLC would no doubt consider to be at the web scale level, an alternative strategy would involve libraries supporting these systems rather than building their own recommenders, either providing those systems with holdings data or at the very least providing an easy way to check/link whether a book is available.

This parallels the support of Google Scholar etc with OpenURL resolvers, except that in the case of books this is much simpler, with ISBN searches in OPACs.

2. Huddersfield Book Recommender system

Strictly speaking, there are many types of recommenders and many ways to classify them. They range from those that basically show "similar books to X" on the item record of X to those that really track who you are, or at least recommend based on your explicit ratings or loan records, and adapt to your individual preferences (e.g. BookPsychic above). (There's perhaps a third type, where users explicitly give recommendations by manually adding "similar titles", or by allowing others to follow what they borrow or rate.)

I suppose "similar to X" recommendations for each title are, in a way, in most OPACs and discovery systems, since many allow a one-click jump to items with the same subject, author etc.

But "similar to x books" based on borrowing activities of people who borrowed X, is indeed an advancement.
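The "people who borrowed X also borrowed" idea can be sketched very simply: group loans by patron, then count which other titles turn up alongside X. The code below is a toy illustration with made-up loan records and is not how the Huddersfield system (or any real catalogue) is implemented; real systems would also need to handle privacy, popularity bias and sparse data.

```python
from collections import Counter, defaultdict


def also_borrowed(loans, title, top_n=3):
    """Rank other titles by how many borrowers of `title` also borrowed them.

    loans is an iterable of (patron id, title) pairs.
    """
    books_by_patron = defaultdict(set)
    for patron, book in loans:
        books_by_patron[patron].add(book)

    counts = Counter()
    for books in books_by_patron.values():
        if title in books:
            # Every other book this patron borrowed gets one vote.
            counts.update(books - {title})

    # Most co-borrowed first; ties broken alphabetically for stable output.
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))[:top_n]


# Made-up circulation data purely for demonstration.
loans = [
    ("p1", "Book A"), ("p1", "Book B"),
    ("p2", "Book A"), ("p2", "Book B"), ("p2", "Book C"),
    ("p3", "Book A"), ("p3", "Book C"),
    ("p4", "Book D"),
]
print(also_borrowed(loans, "Book A"))
```

With very little circulation data, a single patron's odd combination of loans can dominate the counts, which is one plausible reason such recommendations sometimes look strange.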


That's what Dave Pattern has done in creating a homebrew recommender system for the University of Huddersfield catalogue.

Below shows a recommendation for the book in the catalogue.  Example is suggested by Dave himself.






Sometimes the recommendations give odd results. Below is one for Men are from Mars, women are from Venus; this could be due to the lack of circulation data.


Dave has written a few blog posts on this topic, including one showing the impact of this service.


3. National Library Board (Singapore) Recommender - Sharealike system

The National Library Board (NLB) in Singapore here also has a book recommender system though admittedly I don't know much about how it works (please note I work at the National University of Singapore Libraries which is independent of the National Library Board).

There are some details in this recorded talk , where it is mentioned that NLB is unique in having lots of circulation data to mine through due to high borrowing levels in Singapore. So definitely some of the recommender system is based on circulation records, though I get the sense some is based on similarity in item characteristics?

In any case as a user I do see the following popup when I check my loan account.

I blocked out the titles I borrowed, but you can probably guess what I borrowed in the past.



Are these recommendations based on mining the circulation records of other users who have borrowed items similar to yours, perhaps akin to the Huddersfield one, or are they based on other characteristics of the items I have borrowed? Not sure here.

There are kiosks in the public libraries that allow you to slot books in and will recommend similar books - aka the "Title Recommendation System". Same question as above: is this based on circulation data?



That said, the National Library Board also has Read on site that allows you to check directly book recommendations and those are definitely based on circulation data.





Here's the entry for The Hobbit, where you can clearly see "Our patrons also borrowed the following".




I am unsure how "Quick Picks" is determined, but I would guess this is based on similarity in subject, author etc?




I was a bit curious about how they integrated this with normal catalogue and it seems it's handled in their new Primo system dubbed NLB SearchPlus

On each item record, there is a link to a "Recommendation" tab that actually links out.









In the catalogue accessible on the mobile web site, you can see an "other related items" section, but I don't think that's based on "our patrons also borrowed..".




4. Ex Libris Bx Recommender + ScholarRank

Academic library services, like their public library counterparts, generally don't have recommender systems. Offerings by Ex Libris seem to be an exception.

In the video below they talk about ScholarRank and how ranking is done. The first few points are pretty standard, but around 3:05 they talk about how ranking is based on user characteristics.

The example given is how a search for "Mercury" would give a different relevancy ranking if done by a chemistry student as opposed to a student majoring in music. The system also takes into account whether the person searching is a PhD student or a fresh undergraduate.




Very interesting, though I suppose this can only happen if the user has already authenticated?

Still the above merely changes the relevancy ranking, but what about outright recommendations for articles?

This is where the bx recommender comes into play.



Essentially, this leverages usage data from users of SFX, perhaps one of the most popular OpenURL resolvers in the world. By associating articles accessed by researchers via SFX in the same session, they are able to create recommendations, essentially "researchers who searched/accessed this article also.....".

The interesting question is where to embed these recommendations. The video mentions that Primo and some other ILS can put such recommendations next to results. Below is an example from Central Michigan University Library using Primo Central.

If Recommendations are available there will be a recommendations tab you can click on.



But more commonly I see it appearing in link resolver screens, typically for libraries using SFX by the same company. It can be done on other link resolver systems too. Here's an example by QUT using 360 Link.



5. Google Scholar Citations

One of the things that initially escaped my notice was that after creating a Google Scholar citation profile, Google Scholar will start recommending items.





The main limitation of this is that it is recommending based off articles or works you added to your Google Scholar citation profile.

Below it recommends an article on Wikipedia because I have a published article on a related topic, written when I was in library school.




The issue with basing recommendations just off published works you have done is, often by the time you have published something you pretty much scoped out the area and are actually least in need of the recommendations because you did the literature review already!

It's true this feature helps you keep up to date with developments in the field after that, but it hardly helps at the start, and for new researchers just starting off, or even established researchers seeking to expand into a new area, this feature is a non-starter.

I wondered if one could work around this problem by adding works you find interesting to a profile to see what recommendations pop up while keeping the profile private, but this doesn't work since recommendations work only for public accounts.


6. CiteULike + Mendeley

Instead of trying to abuse Google Scholar Citations, one can probably use CiteULike (which was recently acquired by Springer) to generate recommendations.

Unlike Google, where you can add only what you have published, this is based on the items you add to your library of things you are interested in.




How does it work? The settings have some information



The other system that is very similar is Mendeley, which I have written on before.

There are 2 types of recommendations. The first is available via the web version and shows "related research" link for each item shown. Note, when you click on the links, some will have no recommendations.



Mendeley also includes a "Mendeley Suggest" feature, though it is only available to premium account holders in the desktop version.

Below are some details.





7. Read by QxMD

Currently the Krafty Librarian blog is tracking and reviewing 4 medical-related iPad app services, namely

These four are pretty much positioned as Flipboard for medical journals (or all academic journals in the case of Browzine) and are an important development due to the high tech usage in the medical world.

However, for the purposes of this blog post, I am more interested in apps that are positioned as a Zite-like app for academic articles. Not only must the app aggregate articles and display them in an easy-to-read newspaper/newszine-like format, it must also learn from what the users select to read and from their thumbs up and down, and use machine learning techniques to customize the articles displayed.

I am not sure which of these apps is closest to Zite, but as noted in the review on the Krafty Librarian blog, QxMD, the makers of Read, seem very proud of their "machine learning" etc techniques:

“Rather than simply relying on our users to tell us which journals they want to read, we use a combination of machine learning, semantic analysis, crowd-sourcing and proprietary algorithms to figure out which articles our users should likely be reviewing.”

comments at http://www.imedicalapps.com/2013/01/flipboard-medical-journals-read-qxmd/

Read together with the privacy policy, and the ability to "thumb up" or "thumb down" it seems on some level they must be doing recommendations though I am unsure how personalized it is.



8. Others/pilot

There are plenty of other pilots and proof of concepts for recommender systems in the library world of course. This includes


In years gone by, I also blogged about attempts to make your own recommender using a Bayesian filter of RSS feeds, though with more and more APIs available, there are more options now, including an interesting idea here to combine this approach with information drawn from your Mendeley library using the Mendeley API.
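For those wondering what a Bayesian filter of feed items might look like, here is a minimal sketch: you label a few item titles as interesting or boring, and a naive Bayes score then guesses which new titles you would want to read. This is my own toy illustration, not the code from the posts mentioned; the class name, labels and training titles are made up, and a real filter would fetch titles from RSS feeds (or the Mendeley API) rather than hard-coding them.

```python
import math
import re
from collections import Counter

LABELS = ("interesting", "boring")


def tokenize(text):
    """Lower-case a title and split it into simple word tokens."""
    return re.findall(r"[a-z]+", text.lower())


class NaiveBayesFilter:
    """Tiny naive Bayes classifier over feed item titles."""

    def __init__(self):
        self.word_counts = {label: Counter() for label in LABELS}
        self.doc_counts = Counter()

    def train(self, text, label):
        self.word_counts[label].update(tokenize(text))
        self.doc_counts[label] += 1

    def score(self, text):
        """Log-odds that the text is interesting (positive means interesting)."""
        vocabulary = set().union(*self.word_counts.values())
        total_docs = sum(self.doc_counts.values())
        log_prob = {}
        for label in LABELS:
            counts = self.word_counts[label]
            # Add-one smoothing; the extra +1 reserves room for unseen words.
            denominator = sum(counts.values()) + len(vocabulary) + 1
            value = math.log((self.doc_counts[label] + 1) / (total_docs + 2))
            for word in tokenize(text):
                value += math.log((counts[word] + 1) / denominator)
            log_prob[label] = value
        return log_prob["interesting"] - log_prob["boring"]


# Made-up training titles purely for demonstration.
feed_filter = NaiveBayesFilter()
feed_filter.train("Discovery systems in academic libraries", "interesting")
feed_filter.train("Summon discovery usage", "interesting")
feed_filter.train("Celebrity gossip news", "boring")
feed_filter.train("Football match results", "boring")

for title in ("New discovery service for libraries", "Football gossip"):
    verdict = "read" if feed_filter.score(title) > 0 else "skip"
    print(title, "->", verdict)
```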



Conclusion

I have no idea how good in general these recommendations are, in any case when I asked on Twitter for any library related recommender systems my network might be aware of, one of the wry replies was "yes, the librarian".

Indeed, just as the meme that states Librarians are the original search engine, Librarians are also the original recommender systems.

There was also another piece of irony that passed me by until now: to write this post on recommender systems, I had to ask for recommendations from people on my Twitter network. So should I include Twitter in here?



