Sunday, November 10, 2013

Is Summon alone good enough for systematic reviews? Some thoughts.

Edit: Read these speculations with caution; actual tests need to be done. After posting these speculations, I did a couple of actual tests duplicating the *exact* limited searches done for Google Scholar but in Summon. In a few examples (not all), the number of results exploded even with restrictions to journal articles and a limited set of disciplines (e.g. Medicine), so precision with Summon might be even *worse* than with Google Scholar for the very same search statement!

In other cases, Summon yielded fewer results than Google Scholar with the exact same search statement, but at the cost of a big decrease in recall.

Attempts to use Summon's more advanced search features to include wildcards and longer search statements, which are not possible in Google Scholar, actually exploded the number of results even further.

Even though I am not a medical librarian, I have read with interest the recent paper "Google Scholar as replacement for systematic literature searches: good relative recall and precision are not enough" by Martin Boeker, Werner Vach and Edith Motschall.

The paper

  • Translated search strategies used to find relevant papers in past systematic reviews into equivalent Google Scholar search statements (as close as possible anyway)
  • Checked how many relevant papers were found (the papers found in the original systematic review are the "gold standard" of what is considered relevant)
  • Calculated the recall and precision of using Google Scholar as compared to traditional systematic review methods of searching multiple databases (typically Medline, Web of Science, Cochrane Library etc.)
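
To make the two measures concrete, here is a minimal sketch of how recall and precision could be computed against such a gold standard. The function name and the paper identifiers are made up for illustration; they are not taken from the paper.

```python
def recall_precision(retrieved, gold_standard):
    """Return (recall, precision) of a result set against a gold standard.

    recall    = relevant papers retrieved / all relevant papers
    precision = relevant papers retrieved / all papers retrieved
    """
    retrieved = set(retrieved)
    gold_standard = set(gold_standard)
    hits = retrieved & gold_standard
    recall = len(hits) / len(gold_standard) if gold_standard else 0.0
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    return recall, precision


# Hypothetical identifiers, for illustration only.
gold = {"p1", "p2", "p3", "p4"}                # papers in the original review
found = {"p1", "p2", "p3", "x1", "x2", "x3",   # what a single search returned
         "x4", "x5", "x6", "x7"}

recall, precision = recall_precision(found, gold)
print(f"recall={recall:.2f} precision={precision:.2f}")
```

In this made-up case the search finds three of the four relevant papers but buries them among seven irrelevant ones, which is the pattern of high recall and low precision discussed in the rest of this post.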

The results aren't particularly surprising. As many other papers and blog posts have argued, Google Scholar's large, nearly comprehensive coverage of studies allows it to pick up most of the papers using just one source (93% recall in this paper), but it has many weaknesses that make it unsuitable for use alone in systematic reviews. In particular, the lack of precision due to the lack of advanced search features is a big one.

As I read through the paper, which is the most comprehensive one I have seen detailing the various weaknesses of Google Scholar for systematic reviews, I couldn't help but think how many of the critiques in there would also apply to Summon.

In the past, I have blogged about "How Google is different from traditional databases", and later I mused about how library web scale discovery services, in particular Summon, are closer to Google and Google Scholar but not quite there yet.

On the one hand, Summon has many of the same characteristics as Google Scholar. With breadth unmatched by traditional databases, it was also designed to maximise recall at the cost of precision, with features like auto-stemming that make it feel Google-like.

But on the other hand, Summon does have more advanced search features (though a bit well hidden), more stable results and more transparent sources.

So how does Summon stack up? Let me go through the critiques against Google Scholar and see if they apply to Summon.

Here's a short summary of some issues in Google Scholar.

  • Maximum 1,000 results, 20 results per page - Summon 1.0 has the same 1,000-result limit, but shows up to 50 results per page
  • No bulk export - Summon has the same limitation; Zotero allows export of results page by page for both
  • Lack of search history - Summon has the same limitation
  • Limited advanced search interface - Summon 1.0 is the same, Summon 2.0 is better
  • Lack of truncation and advanced field searches - Summon is better
  • Inability to nest logical operators more than one level - Summon is better
  • Query length limited to 256 characters - Summon does not have this limitation
  • Autostemming leads to lack of control - Summon has the same limitation
  • Reliability and stability of index - Summon is better, with a more transparent listing of sources

Overall: There are reasons to believe one could translate traditional complicated search strategies to Summon better than to Google Scholar, which might result in better recall and precision (assuming the full index of Summon is comparable to Google Scholar's). An actual study is needed to confirm this, though, and it would take a bit of expertise to translate the search strategies and even more time to look through the results.

But as with Google Scholar, limitations like the maximum of 1,000 results and the lack of bulk export might make this moot anyway.

For more detail, read on!

Let's start with graphical interface features.

Quotes below are from the provisional PDF of "Google Scholar as replacement for systematic literature searches: good relative recall and precision are not enough" by Martin Boeker, Werner Vach and Edith Motschall, used as allowed under the BioMed Central Open Access license agreement.

"Not more than 1000 results of the complete result set can be displayed in steps of maximum 20 results per page."

Google Scholar can show 20 results per page

At 20 results per page, Google Scholar stops at page 50 = 1,000 results

Many people are surprised to learn that regardless of the number of results Google Scholar finds, you can see at most 1,000 of them. (Google is similar, though the point at which it stops showing results varies.)

In Summon 1.0, one can increase maximum number of results to 50 per page compared to 20 per page for Google Scholar. But you can't get more than 1,000 results.

At 50 results per page, Summon 1.0 stops at page 20, click "next" and you get an error.

Error when you go past 1,000 results in Summon 1.0

In Summon 2.0, there is no concept of pages because of the so-called "infinite scroll" feature. I can't tell whether it is also limited to 1,000 results, but this might be moot (see below).

Frankly if there is any one reason not to use Google Scholar, this one alone would be sufficient, since many searches done would have >1,000 results. Still, let's press on.

"No bulk export of results is available. Results can only  be exported into reference management software (e.g. ZOTERO)"

Yet another killer, since you need to mass export the results you get, ideally all results with one export.

With Google Scholar, you can export item by item; for mass export, the only option is to pair it with Zotero so you can export results in bulk, one page at a time. Unfortunately, Google Scholar shows a maximum of 20 results per page, so it may take a while to export everything. (Wild idea: use the Publish or Perish software to get everything in one shot? Though that is still limited to 1,000 results, plus the search limitations.)

Using Zotero with Google Scholar to mass export all the results on one page at a time

Summon 1.0 is exactly the same with no bulk export. You can use it with Zotero, exactly as in Google Scholar, so you can bulk export all the results in one page. Here it works slightly better than with Google Scholar, since you can set the page to display up to 50 results per page as mentioned.

Using Zotero with Summon 1.0 to mass export all the results on one page (50) at a time
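
As a rough back-of-the-envelope sketch, the page sizes and the 1,000-result cap mentioned above translate into the following number of page-by-page Zotero exports. The function name and the 5,000-hit search are hypothetical.

```python
import math

RESULT_CAP = 1000  # both tools stop showing results after 1,000, as noted above


def page_exports_needed(total_hits, per_page, cap=RESULT_CAP):
    """Number of page-by-page exports needed to capture every reachable result."""
    reachable = min(total_hits, cap)  # anything past the cap cannot be viewed at all
    return math.ceil(reachable / per_page)


# A hypothetical search reporting 5,000 hits
print(page_exports_needed(5000, per_page=20))  # Google Scholar page size
print(page_exports_needed(5000, per_page=50))  # Summon 1.0 page size
```

For any search that hits the cap, that is 50 separate exports in Google Scholar versus 20 in Summon 1.0, and in both cases everything beyond the first 1,000 results stays out of reach.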

Summon 2.0, as mentioned, has "infinite scroll". I think this feature pretty much kills the possibility of quick bulk export. Or would I search, keep scrolling down... until the end, and then export in bulk with Zotero?

Kinda moot now, since Zotero does not currently work with Summon 2.0.

Lack of "a history function which temporarily stores retrieval results for incremental refinement of search strategies"

This is important in systematic reviews, of course: it gives you more control and makes controlled searches and the exploration of search strategies easier. But this is currently lacking in Summon as well. As a side note, some web scale discovery services like EDS do support this.

"It is not achievable to construct all possible expressions in the advanced search interface due to the limited number of available entry fields. Only one field for each type of expression (conjunction, disjunction and conjunction of phrases) is available"

Google Scholar advanced search screen

Assuming I understand this objection properly, Summon 1.0 is even worse: you can't even do that in advanced search, since all the fields are combined with AND.

Summon 1.0 advanced search screen

In Summon 2.0, the advanced search is much improved, with a pull-down menu covering
  • abstract
  • title
  • publication title
  • author
  • date
  • full-text
  • subject term etc
It also allows you to add additional boxes if necessary connected by logical operators.

Summon 2.0 advanced search screen

"Some fields of the advanced search interface are not available in a search expression as a keyword or field indicator. Whereas authors can be specifically searched for with the field indicator ‘author’ in an expression like ‘author:“author name”’, the date is not accessible by a field indicator."

I think this refers to the oddity that while you can use the advanced search in Google Scholar to restrict by date or publication title, there is no equivalent way of doing so with keyword syntax.

Summon doesn't have this oddity, though in Summon 1.0, the advanced search only gives you limited fields to search with
  • title
  • author
  • publication title
  • isbn/issn
  • date
All of them can also be searched using search syntax alone in the basic search.

In fact, many other fields can be searched if you know the syntax. So, as with Google Scholar, to get the best use of Summon 1.0 you had to construct the long, complicated search in a text editor and then transfer it to the basic search.

As noted above, the advanced search in Summon 2.0 is much improved, and its pull-down menu also covers fields such as
  • abstract
  • full-text
  • subject term
  • doi
  • etc
It also allows you to add additional boxes if necessary.

The paper mentions that a search expression builder and a search history would be desirable, though advanced power users can tolerate their absence.

So let's start looking at other issues.

"Search expressions were limited to a length of 230 characters due to the restriction of a total of 256 characters"

Google Scholar's character limit on search queries is a barrier to creating complicated search strategies. This is a big deal because many search strategies needed for systematic reviews are extremely long and complicated.

The paper notes that the median length of the Medline searches was 777.5 characters. But because of the character limit in Google Scholar, the authors had to simplify the searches, and the "translated" Google Scholar searches in the study had a median length of only 187.5 characters!

This is a big limitation of course.

As far as I can tell by some testing, Summon does not have the same limitation as Google Scholar. If the query is long enough, browser limitations on processing long URLs (typically >2000 characters) will start to come into play, but this isn't a limitation of Summon per se.
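
A quick way to see which of these limits a given strategy runs into is to measure it before pasting it anywhere. The sketch below is only that, a sketch: the 256-character figure is the one quoted from the paper, the 2,000-character URL threshold is the rough browser figure mentioned above, and `check_query`, the placeholder base URL and the placeholder terms are all made up for illustration.

```python
from urllib.parse import quote_plus

GS_QUERY_LIMIT = 256    # Google Scholar character limit quoted from the paper
URL_SOFT_LIMIT = 2000   # rough browser URL length; varies by browser


def check_query(query, base_url="https://example.org/search?q="):
    """Report the raw query length and the length of the URL it would produce.

    base_url is a placeholder, not a real Summon or Google Scholar endpoint.
    """
    encoded_url = base_url + quote_plus(query)
    return {
        "query_chars": len(query),
        "fits_google_scholar": len(query) <= GS_QUERY_LIMIT,
        "url_chars": len(encoded_url),
        "fits_url_soft_limit": len(encoded_url) <= URL_SOFT_LIMIT,
    }


# Hypothetical long strategy: 60 OR-ed placeholder terms with truncation
long_query = " OR ".join(f"term{i}*" for i in range(60))
report = check_query(long_query)
print(report)
```

Note that URL encoding makes the address longer than the raw query (spaces and `*` are escaped), so a strategy can hit the browser limit somewhat earlier than its raw length suggests.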

"Terms in Google Scholar are complete single words (truncation is not possible)"

Based on support files, Summon allows truncation (but not within quotes) and proximity operators (without taking into account order).
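
For readers less familiar with truncation, here is a toy illustration of what it buys you. It uses Python's `fnmatch` as a stand-in and assumes the common `*` wildcard convention; it says nothing about how Summon implements truncation internally, and the word list is made up.

```python
from fnmatch import fnmatchcase


def truncation_matches(pattern, words):
    """Return the words matched by a truncated term such as 'vaccin*'."""
    pattern = pattern.lower()
    return [w for w in words if fnmatchcase(w.lower(), pattern)]


vocabulary = ["vaccine", "vaccines", "vaccination", "vaccinated", "vacuum"]
matched = truncation_matches("vaccin*", vocabulary)
print(matched)
```

One truncated term stands in for every variant you would otherwise have to type out and OR together, which matters a lot when the query length is capped.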

"Google Scholar applies automatic stemming to terms where the stem is recognizable for Google Scholar. However, this mechanism might not be reliable for domain specific language (e.g. the medical language)."

Summon does autostemming too. My understanding is that adding quotes around terms in Summon gives a higher relevancy boost to items with the exact terms in quotes but does not switch off autostemming per se. So here it is similar to Google Scholar and may not be as precise as you would want.

"Logical operators can be used, though only without nesting of logical subexpressions deeper than one level."

The paper also warns that "correct interpretation of logical connectors" still needs to be improved, with Google Scholar often giving illogical numbers of results.

As far as I can test, you can nest Boolean operators to more than one level in Summon. Still, the support files suggest that complicated nested Boolean searches might sometimes give odd results, and ask users to report them if seen.
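
When translating an existing strategy, it can help to check how deeply it nests before deciding where it can run. Here is a minimal sketch that simply counts parenthesis depth; the function name and both example queries are hypothetical, and it deliberately ignores parentheses that might appear inside quoted phrases.

```python
def max_nesting_depth(query):
    """Return the deepest level of parenthesis nesting in a Boolean query."""
    depth = 0
    deepest = 0
    for ch in query:
        if ch == "(":
            depth += 1
            deepest = max(deepest, depth)
        elif ch == ")":
            depth -= 1
            if depth < 0:
                raise ValueError("unbalanced ')' in query")
    if depth != 0:
        raise ValueError("unbalanced '(' in query")
    return deepest


flat = "(aspirin OR ibuprofen) AND (headache OR migraine)"
nested = "((aspirin OR ibuprofen) AND adult*) OR (paracetamol AND (child* OR infant*))"

print(max_nesting_depth(flat))    # one level: fine under the restriction quoted above
print(max_nesting_depth(nested))  # two levels: would need flattening for Google Scholar
```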

A well-known example was that adding quotes would occasionally give MORE results, as reported in this article. So for example

sheep dip flies

would give you fewer results than

“sheep dip” flies

My understanding of the issue was that Summon by default did implied proximity matching (within 200 words) for searches of three or more terms if they were found in the full text, in an attempt to filter out totally irrelevant results where the words were extremely far apart, but switched this off when quotes were used.

In any case, this issue is resolved in the latest versions.
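
To make the mechanism concrete, here is a toy model of the old behaviour as described above. It is emphatically not Summon's actual code: the exact-word matching (no stemming), the window check and all function names are assumptions made purely for illustration.

```python
from itertools import product


def within_window(positions_per_term, window):
    """True if one occurrence of every term fits inside a span of `window` words."""
    # Brute force over all combinations of occurrences; fine for a toy example.
    return any(max(combo) - min(combo) < window
               for combo in product(*positions_per_term))


def unquoted_match(doc_words, terms, window=200):
    """All terms must occur; with three or more terms they must also be close together."""
    positions = [[i for i, w in enumerate(doc_words) if w == t] for t in terms]
    if any(not p for p in positions):
        return False
    if len(terms) >= 3:
        return within_window(positions, window)
    return True


def quoted_match(doc_words, phrase, other_terms):
    """The phrase must occur as adjacent words; other terms may occur anywhere."""
    n = len(phrase)
    has_phrase = any(doc_words[i:i + n] == phrase
                     for i in range(len(doc_words) - n + 1))
    return has_phrase and all(t in doc_words for t in other_terms)


# A made-up full-text document: the phrase at the start, "flies" 500 words later
doc = ["sheep", "dip"] + ["filler"] * 500 + ["flies"]

print(unquoted_match(doc, ["sheep", "dip", "flies"]))  # False: the terms are too far apart
print(quoted_match(doc, ["sheep", "dip"], ["flies"]))  # True: no proximity filter applied
```

Under this model, the document is dropped by the unquoted search but kept by the quoted one, which is how adding quotes could end up returning more results.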

"The currency of Google Scholar may not be very high for some resources. The update period for certain resources is up to nine months. Although research results indicate very high coverage of Google Scholar, the exact coverage is not known. Google itself states that it does not index journals, only articles, and does not claim to be exhaustive."

I am not aware of any study measuring how current Summon's indexing is (though no doubt libraries evaluating Summon and its rivals would have done some testing). That said, Serials Solutions claims to index periodically, as appropriate to the type of material (e.g. if the journal is monthly, it will be indexed monthly).

Also, Summon claims to index at the journal issue level, which differs from Google Scholar, so one would expect more consistency here.

Reliability and stability of search results over time and place is not sufficient

A somewhat related critique of Google Scholar for systematic review searches is how unstable the results are. Because Google Scholar works by crawling the web (including sites allowed by publishers, institutional repositories and even authors' personal homepages), articles may suddenly drop out of the index when a page becomes unavailable between one crawl and the next.

"GS’ changing content, unknown updating practices and poor reliability make it an inappropriate sole choice for systematic reviewers. As searchers, we were often uncertain that results found one day in GS had not changed a day later"

Summon and most web scale discovery services would presumably be less prone to this, since they don't trawl the web for articles (except for institutional repositories, which are harvested using OAI-PMH).

Also, while Summon isn't as transparent about what is covered in its index as some would wish, it does produce a list showing, by journal title, the coverage and the level of indexing (metadata only or full text).

Difficulty of translation of standard search strategies from Medline to Google Scholar syntax

This pretty much drives the conclusion (see next point)

Google Scholar is extremely limited in terms of what can be done in a search. This results in extremely imprecise searches compared to what can be done in Medline (whether via Ovid or PubMed), Web of Science etc.

In this respect, Summon seems to lie in between Google Scholar and other databases.

Unlike Google Scholar, Summon can handle
  • Truncation
  • Proximity
  • Multiple levels of nesting
  • Search queries >256 characters
  • More fields for searching including subject terms, abstract (Summon 2.0)
  • Filtering by content types (e.g. journal articles), discipline (Medical, Economics etc.)
But it's still limited compared to the search syntax available in Medline. If, say, you want to use MeSH headings, thesauri etc., Summon doesn't have a controlled vocabulary at all for focusing, exploding etc.

On the other hand, as noted in an earlier paper critiquing Google Scholar, the lack of a Google Scholar "search filtering option to limit the scope of search results ‘by discipline’ such as ‘health and medicine’" can lead to an explosion of results.

Summon does have a discipline facet for medicine, biology etc., though it's unclear how accurate it is in use.

As I am not a medical librarian, I can't tell if Summon's search features are sufficient, though looking at the paper's discussion of examples where the translated Google Scholar strategy fails, the main issues seem to be
  • Lack of truncation support in Google Scholar
  • Google Scholar's query length restriction
So it's likely Summon's search features might in fact be sufficient, since neither applies to Summon.

Google Scholar has good recall but much worse precision

The meat of the whole paper is this. The results show that, as expected, recall is good, with 93% of relevant results retrieved. But the lack of precision in the searches allowed by Google Scholar (see above) means a lot more effort is needed to wade through the results to pick them up. (In fact, the same limitations prevent 100% recall.)

"Our investigation suggests that due to the low precision of Google Scholar searches a user has to check about 20 times more references on relevance compared to the standard approach using multiple searches in traditional literature databases. In the majority of cases this implies for checking 10,000 or more references."

I got into a Twitter discussion on how even this statement is misleading, or at least impractical, because Google Scholar cannot currently get to more than 1,000 results. That's a very good point, though in defense of the paper, it does recognize this calculation as "completely hypothetical".

I feel the paper does make an interesting point here that is easily missed due to the title.

The low precision of Google Scholar is not necessarily the main argument to avoid using it for systematic reviews!

Why? While you might need to check 20x more references when using Google Scholar alone, compared to traditional systematic review techniques, you save on time in other ways.

For example, traditional methods require that you query multiple databases and translate the same search for each of them. You need to spend time deduping results from these different sources etc. All this is additional time that might offset the 20x lack of precision.
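
To give a feel for the dedupe step that a multi-database search requires, here is a minimal sketch. It assumes each exported record is a dict with optional `doi` and `title` keys; the field names, the matching rule and the sample records are all hypothetical, and real reference managers use more careful matching than this.

```python
import re


def dedupe_key(record):
    """Prefer the DOI as the identity of a record; fall back to a normalised title."""
    doi = (record.get("doi") or "").strip().lower()
    if doi:
        return ("doi", doi)
    title = re.sub(r"[^a-z0-9]+", " ", (record.get("title") or "").lower()).strip()
    return ("title", title)


def dedupe(records):
    """Keep the first record seen for each key, preserving the input order."""
    seen = set()
    unique = []
    for rec in records:
        key = dedupe_key(rec)
        if key not in seen:
            seen.add(key)
            unique.append(rec)
    return unique


# Hypothetical records exported from different databases
records = [
    {"source": "Medline", "doi": "10.1000/abc", "title": "A trial of X"},
    {"source": "Web of Science", "doi": "10.1000/ABC", "title": "A Trial of X."},
    {"source": "Cochrane Library", "doi": "", "title": "Another study: Y"},
    {"source": "Medline", "doi": None, "title": "Another Study - Y"},
]

unique = dedupe(records)
print(len(unique), [r["source"] for r in unique])
```

A known gap in this sketch is that a record with a DOI will not be merged with a copy of the same paper that lacks one. Catching cases like that is part of the extra effort a single-source search avoids.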

By the same argument, Summon, like Google Scholar, might still be worth using even if it is less precise than traditional methods; it all depends on the numbers.

But how would the recall and precision figures for Summon stack up compared to Google Scholar?

There is in fact reason to suspect that the precision might be better, due to the ability to craft more controlled searches, but again, without a formal study this is not definite. It is possible that Google Scholar's better relevancy ranking offsets this, so if one is restricted to only the top 1,000 results (which is in fact the case), Summon might be worse.

Also, is there reason to expect that Summon would yield as high a recall as Google Scholar? Sure, the ability to create long, comprehensive search statements would allow one to pick up more papers (e.g. with a long list of drug names), but is the index coverage of Summon as good as Google Scholar's?

This is unclear to me. Presumably in Summon you would use "add results beyond your library collection" to search the full Summon index, and perhaps exclude content types that are not relevant. Combine that with a login to your institution to get in as much A&I content as possible (mostly Web of Science plus ProQuest A&Is; I don't think Scopus results are in yet despite this announcement), and maybe use the discipline facet to narrow the results down further.

But even that would be a relatively small index compared to Google Scholar's.

I am also unsure whether the Summon index includes the whole of Medline, or even the PubMed index, the Cochrane Library etc.

But again, we need a formal study.


All in all, the results are not surprising. Despite lacking many of Google Scholar's flaws, Summon isn't good enough to use alone for systematic reviews, though there are reasons to suspect it might allow more precise searches, given its superiority over Google Scholar in truncation, nested searches and much longer queries.

The degree to which this is true, and the degree to which it will help precision, is hard to tell without actually redoing the study with Summon, as Summon still lacks some features like rich metadata, controlled vocabulary and the related advanced search features for focusing/exploding subject terms.

Still, the following missing features, which mirror Google Scholar's, make it almost totally unusable even if it is true that Summon has much better precision than, and similar recall to, Google Scholar.
  • Maximum 1,000 results in Summon 1.0 (infinite scroll in 2.0?)
  • No Search History
  • No bulk export
It is unclear whether Summon will add such features. They don't seem particularly hard to implement, but like Google, Summon isn't seen as a typical power-user database, so such features might not be considered appropriate.

The interesting thing is, while web scale discovery services are typically similar enough to discuss in a broad brush, in this case, what I discuss for Summon does not necessarily apply to other discovery services like EDS or Primo Central.

For instance, EDS does include search history, does not seem to be subject to the 1,000-result limit, of course has a totally different set of facets and search filters, and may include important medical sources like Medline and the Cochrane Library databases.

BTW If you want to keep up with articles, blog posts, videos etc on web scale discovery, do consider subscribing to my custom magazine curated by me on Flipboard.


Creative Commons License
This work is licensed under a Creative Commons Attribution-ShareAlike 3.0 Unported License.