Following yesterday’s post about Google Books’ failure to present links to the full online text of Le Vingtième Siècle, a futuristic novel by Albert Robida published in 1883, I was thinking about just how difficult it had been to find those links. Lots of clicking and human natural language processing were needed, which suggests that search engines still have a way to go before they can provide good results for such queries.
Eric Rumsey made an interesting comment that got me thinking it might be worthwhile to compare different search engines to see which, if any, is best for finding online books. Since Le Vingtième Siècle was rather difficult to find, it seemed like a good test case.
To my knowledge there are two places where you can find the full text of this book online: the primary source at Gallica, which has image and plain text versions, and this page on Gloubik, which also links back to Gallica. The question is: will any of the Internet search engines find them?
Here are the results:
The search terms I used for all searches were: Le Vingtième Siècle Albert Robida
Test with Google Book Search:
- No links to full text.
- The Twentieth Century – Google Books Result (link to the preview of the English translation)
Test with Google Web Search:
- No links to full text in first 10 results.
- #6: Albert Robida : Le Vingtième Siècle – La Vie Électrique (link to pdf of a different book with a similar title)
- #10 The Twentieth Century – Google Books Result (link to the preview of the English translation)
Test with Google.fr Web Search (since the work in question is in French):
- #6: Albert Robida : Le Vingtième Siècle – La Vie Électrique (same as above)
- #11 Albert Robida (this page has a link to the Gloubik page that has the pdf links)
Test with Bing:
- #7 Le Vingtième Siècle. La Vie Électrique/I/1 (another pdf of #6 above)
- #9: Albert Robida : Le Vingtième Siècle – La Vie Électrique (same link as #6 above)
- #18 Albert Robida (same page as above that links to Gloubik page with pdf links)
Test with Yahoo.fr:
- #5 Albert Robida (link to Gloubik page with pdf links and link to Gallica)
- #6 [PDF] Albert Robida – Le Vingtième Siècle (broken link to third part of pdf)
Test with ask.com:
- #6: Albert Robida : Le Vingtième Siècle – La Vie Électrique (same as above)
Just for fun:
Test with DuckDuckGo:
- #12 Albert Robida : Le Vingtième Siècle – La Vie Électrique (same as above, the result is #13 if you count my blog post, which comes up in the search)
Test with Hakia:
- No links to full text in first 10 results.
Test with Yebol:
- #8 Albert Robida (link to Gloubik page with pdf links and link to Gallica)
- And this tweet from Mike Cane which links to Le Vingtième Siècle-La Vie Électrique (and which will probably disappear from the search results shortly as it is already 4 days old):
@doctorlaura BING!! –> Albert Robida : Le vingtième siècle – La vie électrique http://t.co/uVzmsBw
Finally, here’s a special case, Evri. It’s special because while it did not return any results with a link to the book, it seems to be the only search engine in addition to Google Books that knew Le Vingtième Siècle is a book. In fact, it showed me this before I had even finished typing in the query.
While Evri didn’t show me any full text links, and 9 out of the top 10 links were to Wikipedia pages that mention Le Vingtième Siècle, I think Evri has a lot of potential. I’ve been planning to do a post about it, but perhaps now someone else will do it so I won’t have to.
So what’s the bottom line? Of the ten web searches, two listed the one page that has links to all the available online pdfs: yahoo.fr and Yebol. None of the searches presented any direct links within the first dozen results to what should be the most authoritative and reliable source, the Bibliothèque Nationale de France’s digital library site Gallica.
Of course, it’s not possible to draw any general conclusions from a single test, but if you’re looking for full text online, it might be worth trying different search engines and comparing the results. If you have done similar experiments, I’d love to hear about them.

What a frikkin mess. What all of the search engines should have done was ask first, “Do you mean the book?” This cries out for semantic web and solid metadata and all of that.
Also, did you try the Wikipedia entry on Robida? Sometimes those have the best direct links!
Exactly, and that’s what Evri does, except that it doesn’t have all the indexing and solid metadata. Yet.
I did check the Wikipedia entries, but unless I overlooked something, I did not see any links to the online text.
Very interesting, thank you!
A few specific observations first:
*** I searched in Google Books for a specific phrase in the book (p 6: “du bureau et les communications furent”) and GBS does find it in snippet view, apparently in a 1981 republication.
*** In the Yebol search result that you cite — the #8 link does not make it easy to see that it includes a link to “Le Vingtième Siècle” — It seems likely that most people would miss it.
*** For the Yahoo.fr search — I repeatedly get an error message for this:
#6 [PDF] Albert Robida – Le Vingtième Siècle (pdf of part 3)
General comments:
Within the (admittedly provincial) American Google world, this book seems pretty obscure (is it not so in France?) — I hope you’ll repeat your tests with something more in Google’s usual scope.
As I mentioned before I think, Gallica and Gloubik are relatively unknown to the world of Google and Wikipedia. So I hope you’ll write more about them.
Thanks for pointing out the error in the yahoo.fr search result link. I must’ve mixed up the links when I tested them, and I thought it was referring to a valid Gloubik page. I’ve corrected the post.
Interesting too your search for a specific phrase in Google Books. I had tried the same thing, with something very specific to the book: I searched for “Colobry,” the last name of one of the characters. I’ve just repeated this search, and oddly it does not find the 1981 edition. It does list the English translation, but I can’t actually preview the text because the page isn’t available.
Here’s a Google Translation of the #8 search result from Yebol. Some of it is fairly mangled, but the first sentence starts “The copy of the twentieth century put online by the National Library…” which seems fairly clear, as do the links to “Full Text,” “Part I,” “Part II” and “third party,” each of which lists the file size. Even in French, it’s not easy to miss that.
Finally, I do admit that this book is rather obscure (unfairly so!), but I think that makes it all the better as a test. In any case, it does make things more tractable as there are not too many versions online to be found. At the moment, unless something new comes up, I’m not planning to do any more experiments like this.
I have found Buzzdock realtime search to be really helpful when I’m looking for text from books. Buzzdock has Amazon as one of its sites that it searches when it gathers results so it automatically picks up results from Amazon that might contain the keywords. It hasn’t failed me yet, so I really like it.
As far as I can tell, Buzzdock seems to be yet another browser plug-in that does something similar to Google Custom Search, letting you search across multiple sites, including Twitter, within a simple results window located under the search box on your results page. So, it’s not a search engine, but rather a search aggregator and results presentation engine.
That being said, the subject of this article is finding full digital versions of public domain books online. For this purpose, the only two Buzzdock search applications likely to be of interest are Amazon and Evri, the latter arguably doing something similar to Buzzdock but adding value through semantic processing of the results. You are correct to point out Amazon as a potential source for public domain works, and it does provide search within a number of books; however, Amazon’s public domain collection is quite limited, since Amazon is more interested in selling books than in giving you a free one. In the majority of cases, and especially if the book is popular, Amazon does not provide a free version of public domain books that are also for sale in Kindle editions. So unless you are planning on buying a Kindle version of a public domain book, Amazon doesn’t seem to be a very likely place to look for it.
In the present case, Amazon does not have a free version of Le Vingtième Siècle, although you can happily search within the for-sale English translation.