Google Alerts are a popular way of keeping up with favoured topics on the Web. However, they are not comprehensive or complete, so I’m also in the habit of doing daily searches for specific topics, and narrowing down by date. Unfortunately, it is very easy for website owners to manipulate things so that their pages reappear in search results day after day, even when the user restricts results to pages changed in the past 24 hours and the pages haven’t actually changed. It is possible to bypass this problem, to some extent, by using browser plugins or Google’s block feature, although the latter doesn’t always work.

Based on my experience, while most of the sites whose pages are “updated” every 24 hours appear to do this as a deliberate strategy to pull in more visitors, some seem to be legitimate sites that are just badly designed.

This past week or so, my daily searches have become unusable. From a fairly consistent set of 30 or so results every day, I’m now getting 150 or so results using the same search parameters. All due to one very legitimate site. I won’t mention the name of the site or the strategy it is using as I don’t want to provide information that may be used to trick users. (Although I’ll tell you if you email.)

Anyway, I’ve blocked this site from my searches; shame, as it often contains very relevant information.


In the Younger Techapillan’s cancer journey this past while, Techapilla has been constantly amazed at the depths to which some people will sink to attract traffic to their websites. The latest scam to trigger the Techapillan ire is the existence of “content mills”, which provide free content that can be copied and pasted into websites without acknowledgement of the content mill. Frequently, such content carries the byline of the fraudulent website owner.

As a web-savvy academic librarian, Techapilla is well aware of the existence of content mills and similar sites, such as those that sell essays for assignments. This time, having noticed that Google Alerts was throwing up the same content for “osteosarcoma”, with exactly the same misspelt word, on an almost daily basis, Techapilla was inspired to do some further investigation.

Techapilla is not going to promote content mills or fraudulent websites, so no links are provided. But here are the Techapillan findings:

  • 2970 Google hits for the misspelt phrase (“when doctors access osteoarthritis and osteoporosis”). This gets whittled to 43 when similar results are omitted. Examination of these 43 results reveals that all articles obviously come from a single source
  • 335 Google hits for the corrected phrase (“when doctors * osteoarthritis and osteoporosis”). This gets whittled to 63 when similar results are omitted. Once again, examination reveals that all 63 articles have a common source. Note that this search should have yielded more than 2970 results, since it is a broader search than the first; presumably the difference is due to Google using a different algorithm for wildcard searches. This search also highlighted slight wording differences among the articles, either from editing or from running through a translator. Techapilla strongly suspects the original article was written in a foreign language and run through a translator, given the awkward language of most of the articles
  • 4 obvious content mill sites in the first 40 Google hits for the misspelt phrase
  • The majority of sites which had used the farmed article were ostensibly health sites
  • Quite a lot of “This site may harm your computer” links in the hits
  • Searches on slight rewordings of the misspelt phrase yielded additional hits, including the same article that had been cleaned up a bit more or rejigged to “fit” another disease (e.g. osteomyelitis).
  • Visiting a sampling of the sites revealed that while a few obviously tried to be legitimate health sites (shame about their lack of medical knowledge), most sites were fraudulent, with links on the site all leading to commercial sites (“affordable weddings”, “hot winter vacations”)
  • Techapilla is not a medical professional, but has learnt enough about osteosarcoma in the past year or two to confidently state that the article/s examined as part of this task are complete and utter junk
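
For readers who would like to check this kind of copying more systematically than by eyeballing search results, here is a minimal sketch in Python. It assumes the text of each suspect article has already been saved as a plain string, and it compares two texts by their overlapping three-word sequences ("shingles") using Jaccard similarity. The function names and the sample strings are made up for illustration; this is not how Google or Turnitin work, just one simple way to spot lightly reworded copies.

```python
import re


def shingles(text, size=3):
    """Return the set of overlapping word sequences (shingles) in text."""
    # Lowercase and keep only word-like tokens so punctuation is ignored.
    words = re.findall(r"[a-z0-9']+", text.lower())
    if len(words) < size:
        # Very short text: treat the whole thing as a single shingle.
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def jaccard(a, b, size=3):
    """Jaccard similarity of two texts' shingle sets, from 0.0 to 1.0."""
    sa, sb = shingles(a, size), shingles(b, size)
    if not sa and not sb:
        return 0.0
    return len(sa & sb) / len(sa | sb)


if __name__ == "__main__":
    # Hypothetical sample sentences, not taken from any real site.
    original = "when doctors access osteoarthritis and osteoporosis today"
    reworded = "when doctors assess osteoarthritis and osteoporosis today"
    print(jaccard(original, reworded))
```

A score near 1.0 suggests two pages are essentially the same article, while a score near 0.0 suggests unrelated text. Where to draw the line between "copied" and "coincidence" is a judgment call and would need tuning on real examples.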

It would be an interesting exercise to trace back the original article, and to run one of the offspring through Turnitin. An exercise for another day.

In the meantime, some guidelines to help Techapillan readers evaluate the quality of information resources.

Was surprised to discover that only one of my favourite author’s books is available in Google Books in full text. A goodly proportion are now out of copyright, and many are available on Project Gutenberg.

There are a few other free ebook sites similar to Google Books and Gutenberg out there on the Web, including one at UPenn. See also Bruce’s Oz Free Books page. You need lots of time to explore all the goodies on these sites. One day when I have more time …