Sunday, December 14, 2008

Comment On Letters To Nature: Detecting Influenza epidemics using search engine query data

Google's R&D identified the top search queries that accurately model the CDC's data on influenza-like illness (ILI) physician contacts in various geographic regions. 

See, Nature: Detecting influenza epidemics using search engine query data.

CDC's US Influenza Sentinel Provider Surveillance Network publishes physician-related ILI activity on its Web site with a 1-2 week reporting lag. Google's automated procedure reduces the reporting lag to about a day. See, Google Flu Trends.

In my opinion, there is a period of time, probably a few days to a week or more, between the onset of flu-like symptoms and actual physician contact. Folks can feel sick, or have a cough or a runny nose that will turn out to be flu, well before they actually contact a physician.

R&D might be able to identify the top search queries that occur a week or two before either CDC or Google Flu Trends data show ILI physician contacts.

Going from detection to prediction would be a useful step to prevent influenza-like illness.
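One way to look for queries that lead physician contacts is a lagged correlation: pair each week's query volume with the ILI count a week or two later and see which shift fits best. The sketch below is only an illustration of that idea, not the method used by Google or the CDC. The function names and the weekly counts are made up for the example.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation of two equal-length series (assumes neither is constant)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

def best_lead(query_counts, ili_counts, max_lead_weeks=3):
    """Return (lead, r): the lead in weeks at which query volume best tracks later ILI counts."""
    best = (0, float("-inf"))
    for lead in range(max_lead_weeks + 1):
        # Pair query volume in week t with ILI contacts in week t + lead.
        q = query_counts[: len(query_counts) - lead]
        i = ili_counts[lead:]
        r = pearson(q, i)
        if r > best[1]:
            best = (lead, r)
    return best

# Made-up weekly counts: the ILI series is the query series delayed by two weeks.
weekly_queries = [1, 3, 7, 12, 8, 4, 2, 1, 1, 1]
weekly_ili = [1, 1, 1, 3, 7, 12, 8, 4, 2, 1]
lead, r = best_lead(weekly_queries, weekly_ili)
print(lead)
```

With this toy data the best lead is 2 weeks. Real query data would be far noisier, and a high correlation alone would not prove that a query predicts illness.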

 btw check out my original SETI research

Tuesday, August 12, 2008

How To Improve Mobile Language Translation

A Google R&D post describes a cool new translation application for iPhones. While you are touring, it enables you to text in a word or a phrase and get it instantly translated.
You can translate key phrases such as these:
  • Can you take me to the airport please?
  • How much does it cost?
  • May I have a large coffee please?
This translation tool has a clear need for predictive input. It would be really cool if you could get common phrases displayed quickly by entering a single word: airport, cost, coffee.
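Here is a minimal sketch of that kind of predictive input, assuming the app keeps a small list of stock phrases. The function name suggest_phrases and the phrase list are my own illustration, not part of the Google application.

```python
# Stock phrases taken from the examples above.
PHRASES = [
    "Can you take me to the airport please?",
    "How much does it cost?",
    "May I have a large coffee please?",
]

def suggest_phrases(keyword, phrases=PHRASES):
    """Return the stock phrases that contain the keyword as a whole word."""
    key = keyword.strip().lower()
    results = []
    for phrase in phrases:
        # Strip punctuation so "cost?" still matches the keyword "cost".
        words = [w.strip("?.,!").lower() for w in phrase.split()]
        if key in words:
            results.append(phrase)
    return results

print(suggest_phrases("coffee"))
```

Typing a single word then brings up the full phrase, ready to be translated.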

However…asking folks questions is only part of the communication hurdle. While mobile, you also need to understand in real time the answers you get – and they are in the same foreign language. 

Suggestion: In addition to providing a phrase translation, display a set of the most common answers. 

This way you may have a good chance of understanding at least part of what a taxi driver, sales person, or waiter says after you ask: ¿Puedo tener un gran café por favor?

Thursday, July 17, 2008

Understanding Queries: Keywords And Natural Language

A Google R&D post Technologies behind Google ranking describes how search keywords are algorithmically interpreted.

The post states: “our algorithms understand that in the query [new york times square church] the user is looking for the well-known church in Times Square and not for articles from the New York Times.”

Suggestion: Assume in this example that many users are actually looking for a well-known church in Times Square when they enter the query: [new york times square church].
Now try the following simple experiment – run both of these searches:
The first search clearly returns higher quality results than the second search. The results list the well-known church in Times Square. 

The problem is that according to the post, the second set of search keywords, which is closer to natural language than the first set of keywords, expresses more precisely what the user is attempting to search for.

Why doesn't the set of keywords that expresses more precisely what the user is attempting to search for return superior results?

Tuesday, June 17, 2008

Collaborative Software As A Green Initiative

Google is supporting electric cars: Plug-ins converge on Washington.

Commuting by car, whether by a gas-guzzler or an electric vehicle, consumes energy and wastes scads of time.

A different way to reduce our dependency on oil is to enable telecommuting for jobs that are primarily computer-based. 

However, remote connections suffer from a lack of transparency and provide a less than robust environment for face-to-face meetings and other serendipitous interactions.

Isn't it a touch more Googley to put some more zing into social collaboration apps than to promote the use of electric vehicles?

Sunday, June 15, 2008

Relevance: SERPs and Related Searches

Google R&D discusses an enhancement to its related search links in an article: Fresher related search suggestions.

Let’s try to search for the animal: cat.

If you enter into Google search the keyword cat, the following links to relevant SERPs display down the page:

  • Caterpillar Inc.
  • Cat - Wikipedia, the free encyclopedia
  • Centre for Alternative Technology Home Page
  • Lolcats ‘n’ Funny Pictures of Cats

The following related search links are displayed at the top of the page:

  • cat deeley
  • cat stevens
  • cat unix
  • cat exam

Why is there such a divergence between the concept of “relevance” for SERP links and for related searches links?

If Cat Deeley, a DJ, is relevant to related searches, why doesn’t she appear as a SERP link?
If the Unix command cat is relevant to related searches, why isn’t it listed as a SERP link?
Similarly, if Caterpillar Inc. is a relevant SERP link, why isn't there a related search link to building equipment?

In my opinion, applying two widely divergent standards for “relevance” to SERPs and to related search links introduces the risk of reducing the coherence of the Google PageRank algorithm.

Relevance to a keyword (or keywords) such as cat means one thing across the top of a page, and something else down the page.


Wednesday, June 11, 2008

Does Google Trends Data Equate To Consumer Interest?

Google Labs is as passionate about providing cool apps as I am about using them. That said, a new Google Trends post, A new flavor of Google Trends, draws, in my opinion, a clearly incorrect conclusion.

The post tries to relate search volume to consumer interest. According to the post, ice cream shops and supermarkets should be sure to stock up on chocolate ice cream:

"Google Trends is not only a fun tool; it also offers some practical uses as well. Suppose you own an ice cream shop and don't know which flavors to serve, or suppose you're responsible for stocking supermarkets across the country; Trends can help you explore the popularity and seasonality of your products."

"As the numbers on the top of the graph indicate, vanilla ice cream has about 30 percent less search traffic than chocolate ice cream."

However, if you perform a Google search for ice cream popular flavors and drill down to a sampling of articles, you can easily see that, globally, vanilla ice cream is between 2 and 3 times more popular among consumers than chocolate ice cream.

Tuesday, May 20, 2008

Frontiers To Computer Search!

Google provides the public with a glance into how it is improving search: "A peek into our search factory". Breaking through search limits is, of course, a worthy goal.

Increasing the amount/types of accessible data, enhancing user input, and improving search retrieval algorithms should result in a continuous improvement of search result pages.
However… there may be a few limits to creating a perfect search engine:
  • Some media types are probably totally inaccessible. For example: human memory. Billions of folks are walking around with enormous amounts of valuable, but unsearchable, human memory. It will be quite some time into the future before Google can search human memory.
  • Some media types are partially inaccessible. For example: proprietary intellectual property, geophysical and space data, and the contents of an incoming foreign cargo ship.
  • Some search results may be nearly impossible to achieve because of search algorithm limits. For example: searching for a cure for AIDS.

Tuesday, May 13, 2008

Connecting Members And Excluding Nonmembers

Google's Friend Connect is a plug-and-play mashup of Web site hubs with social networking features, gadgets, and apps. 

Visitors who sign in to a Web site get to network along the theme of the Web site - for instance sharing guacamole recipes, or mountain bike maps.

Web masters are supposed to benefit by attracting new visitors as members bring along their dozens or hundreds of friends from their social networks.

Google provides some easy to use social gadgets that are described here:
  • Sign-in with their existing Google, Yahoo, AIM, or OpenID account
  • Invite and show activity to existing friends from social networks such as Facebook, Google Talk, hi5, orkut, Plaxo, and more
  • Browse member profiles across social networks
  • Connect with new friends on your site
According to Google "The key gadget is the members gadget".

Although membership can enhance the connection of friends to a Web site, it can also exclude those visitors on the outside who do not want to become Web site members.

Probably, the number of Web site members, including all of their social networking friends, will usually be much, much smaller than the number of Web site visitors who do not want to sign in to become a member.

The problem: If a Web site offers its rich content only after member sign-in, won't that limit the total number of useful visitors to the Web site?

Will Friend Connect be a layer that actually excludes?

Tuesday, March 11, 2008

Privacy Of Aggregated Non-Personal Information When Performing Google Searches

A Google VP of Engineering discusses this interesting topic: How Google keeps your information secure.

I have a comment about Google’s privacy policy.
When a user performs Google searches, there is an expectation of privacy regarding at least two types of information:
  • Personally identifiable information
  • Searches
Google's privacy policy for using aggregated search data is described here:
"We may share with third parties certain pieces of aggregated, non-personal information, such as the number of users who searched for a particular term, for example, or how many users clicked on a particular advertisement. Such information does not identify you individually."

"Aggregate non-personal information is information that is recorded about users and collected into groups so that it no longer reflects or references an individually identifiable user."

Does the Google policy on the use of aggregate non-personal information prevent Google from using search information in the following example scenario?
A team (i.e. an aggregate) of research scientists is doing research to patent a new router, or a lunar spacecraft engine.

It appears that Google's privacy policy allows the searches of aggregated research scientists to be shared with third parties.

Is this a huge flaw in the privacy policy?

Tuesday, February 19, 2008

Units Of Knowledge: Google "Knols"

Google is creating an articles knowledge-base it calls "Knol". Google describes it here: Encouraging people to contribute knowledge.

I use the online encyclopedia Wikipedia quite frequently, and I find it very useful. Knol is planned to compete with Wikipedia by storing units of knowledge Google calls "knols".

Knol is in pre-release, so I cannot comment on its content. However, I will comment on Google's units of knowledge term "knol":

1. KNOL is the NASDAQ symbol for Knology, Inc., an advanced broadband communications services provider. Why reuse a NASDAQ symbol in another context?

2. I suggest improving it by changing it to "knowl".


Sunday, February 17, 2008

Smirnoff Cranberry Twist

The Executive Chef at Google provides a tasty gourmet Valentine's Day recipe for preparing crab cakes.

In my opinion, it would be useful to add to the Google Search box a new recipes capability.

It would work like this: You enter the new recipe operator (recipe:) and name of a dish (e.g. Peach Cobbler), and Google search would instantly display a set of top recipes - without the need to drill-down.

Google search could also enable simple natural language recipe queries, such as these, to return a tasty set of results:

1. recipe for Smirnoff cranberry twist
2. what is the recipe for Chicago pizza
3. recipe Spanish olive crusted salmon
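
To show how both forms of recipe query could be recognized, here is a rough sketch. The recipe: operator is only my proposal, not an existing Google operator, and the function name parse_recipe_query and the pattern it uses are assumptions made for the example.

```python
import re

# Matches "recipe for X", "what is the recipe for X", and "recipe X".
_NATURAL = re.compile(
    r"^(?:what is the )?recipe(?: for)?\s+(?P<dish>.+)$", re.IGNORECASE
)

def parse_recipe_query(query):
    """Return the dish name if the query looks like a recipe request, else None."""
    q = query.strip()
    # Proposed operator form, e.g. "recipe: Peach Cobbler".
    if q.lower().startswith("recipe:"):
        dish = q[len("recipe:"):].strip()
        return dish or None
    match = _NATURAL.match(q)
    return match.group("dish").strip() if match else None

print(parse_recipe_query("what is the recipe for Chicago pizza"))
```

Once the dish name is extracted, the search engine could route the query to a recipe index and show the top recipes directly.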

Tuesday, February 12, 2008

Need For A Google Calculator Time Zone Capability

The Google search box can perform calculations and conversions involving math, units of measure, physical constants, and currency conversion.

A function that seems to be missing is the capability to convert times and dates between time zones.

Google does return your local time if you enter the keyword: "time", and it will also display localized time if, for example, you enter the search keywords: "time france".

However, the following searches currently fail: they do not display times.

5:00 PM New York time in Rome
7:30 AM Paris is what time in Munich
10:00 PM Chicago in Geneva

In my opinion, Google calculator could be enhanced by including a simple time zone conversion function.
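
The conversion itself is simple once each city is mapped to a time zone. Below is a small sketch using Python's standard zoneinfo module (Python 3.9 or later, with a time zone database available on the system). The CITY_ZONES table and the convert_time function are my own illustration, not anything Google provides, and a date has to be supplied because the offset between zones can change with daylight saving time.

```python
from datetime import datetime
from zoneinfo import ZoneInfo  # standard library since Python 3.9

# Hypothetical city-to-zone table; a real feature would need a far larger gazetteer.
CITY_ZONES = {
    "new york": "America/New_York",
    "rome": "Europe/Rome",
    "paris": "Europe/Paris",
    "munich": "Europe/Berlin",
    "chicago": "America/Chicago",
    "geneva": "Europe/Zurich",
}

def convert_time(day, clock, from_city, to_city):
    """Convert a local time such as ('2008-02-12', '5:00 PM', 'New York') to the time in to_city."""
    naive = datetime.strptime(f"{day} {clock}", "%Y-%m-%d %I:%M %p")
    source = naive.replace(tzinfo=ZoneInfo(CITY_ZONES[from_city.lower()]))
    return source.astimezone(ZoneInfo(CITY_ZONES[to_city.lower()]))

result = convert_time("2008-02-12", "5:00 PM", "New York", "Rome")
print(result.strftime("%Y-%m-%d %H:%M"))
```

For the date of this post, 5:00 PM in New York comes out as 23:00 in Rome.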

Thursday, February 7, 2008

Google QA Testing: The Stroop Effect

The Google testing blog presents the following demo of the Stroop effect:

How quickly can you...
  1. Read all 25 words out loud: RED, GREEN, BLUE, ... (Try it now!)

  2. Say all 25 colors out loud: GREEN, YELLOW, WHITE... (Try it now!)
However, IMO, for a more precise comparison both tasks should use the same kind of response: either saying the label or saying the color.

The example provided by the Google testing blog confounds the Stroop effect with the quite probably different reaction times needed for the tasks of 1) reading labels and 2) converting color samples to their color names.

Another demo of the Stroop effect from Wikipedia is presented here:

Say the color of these words as fast as you can:

Green Red Blue
Yellow Blue Yellow

Blue Yellow Red
Green Yellow Green

According to the Stroop effect, the first set of colors would have had a faster reaction time.

Saturday, February 2, 2008

Google Suggest: Predictive Text Entry For Search Keywords

Search engine result pages (SERPs) can be improved by enhancing 1) the search engine, 2) the available data (i.e. what's free out there for spiders crawling the Web), and 3) user input. We know the impact of GIGO.
Google Suggest is a predictive text entry tool that is designed to improve user input by displaying real-time suggestions as a user types keywords.

Google Suggest serves suggestions that include the inputted string as a prefix. This makes all of its suggestions "narrowing", and excludes phrases that don't start with the prefix the user entered.
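
The prefix behavior can be sketched in a few lines. This is only my guess at the general approach, not Google's actual implementation. The function name prefix_suggestions and the query counts are made up for the example.

```python
def prefix_suggestions(prefix, logged_queries, limit=10):
    """Return up to `limit` logged queries that start with `prefix`, most frequent first."""
    p = prefix.lower()
    matches = [
        (query, count)
        for query, count in logged_queries.items()
        if query.lower().startswith(p)
    ]
    # Sort by descending count, then alphabetically to break ties.
    matches.sort(key=lambda item: (-item[1], item[0]))
    return [query for query, _ in matches[:limit]]

# Hypothetical query log: query text mapped to how often it was searched.
query_log = {
    "bicycle parts": 50,
    "bicycle tire": 30,
    "cruiser bicycle": 40,
    "history of the bicycle": 20,
}
print(prefix_suggestions("bicycle", query_log))
```

Note that "cruiser bicycle" and "history of the bicycle" can never be suggested, however popular they are, because they do not start with the typed prefix.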

If you type "bicycle" in Google Suggest you get:

bicycle parts
bicycle victoria
bicycle trainer
bicycle magazine
bicycle tire
bicycle casino
bicycle trainers
bicycle chain
bicycle retailer
bicycle village

If you enter "bicycle" in the standard Google search box, at the bottom of the search results page you get a completely different set of suggestions.

Searches related to bicycle:

bicycle accessories
bicycle lyrics
cruiser bicycle
history of the bicycle
bicycle review
bicycle games
bicycle safety

It is unclear why these two Google features purporting to do the same task - provide related search keywords - generate different results. It seems they are using different algorithms.

To improve, Google Suggest would have to incorporate expert systems that can analyze the terms a user types and offer knowledge-based suggestions.

An expert system could perform very well, working with detailed knowledge of the field in which the user wants to search.

Before these expert systems are developed, it may be a good idea to use a thesaurus database that can offer satisfying suggestions such as:

bicycle manufacturers
bicycle accessories
mountain bike
racing bike
BMX bike
Tour de France
Maillot Jaune
Floyd Landis
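
A thesaurus lookup of this kind could be as simple as the sketch below. The THESAURUS and SYNONYMS tables are tiny hand-built examples of my own, and related_suggestions is a hypothetical name. A real thesaurus database would be far larger.

```python
# Hypothetical hand-built thesaurus: each concept lists related search phrases.
THESAURUS = {
    "bicycle": [
        "bicycle manufacturers",
        "bicycle accessories",
        "mountain bike",
        "racing bike",
        "BMX bike",
        "Tour de France",
    ],
}

# Hypothetical synonym table mapping what the user typed onto a concept.
SYNONYMS = {"bike": "bicycle", "cycle": "bicycle"}

def related_suggestions(typed, limit=5):
    """Return related phrases for the typed term, whether or not they share its prefix."""
    term = typed.strip().lower()
    concept = SYNONYMS.get(term, term)
    return THESAURUS.get(concept, [])[:limit]

print(related_suggestions("bike", limit=3))
```

Unlike prefix matching, this can suggest "mountain bike" to someone who typed "bicycle".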

Google has focused on improving the Google search engine SERPs by creating cutting edge information retrieval algorithms.

Perhaps, to provide an effective and satisfying search experience, substantial effort should also be directed to designing software for enhancing user input.


Tuesday, January 22, 2008

Google Maps: Get Directions And Find Businesses

Get Directions: Enables you to search for driving directions between two locations. You can use it to find a driving route between, say, Urbana-Champaign and Miami.

Find Businesses: Enables you to search for businesses according to location. It will instantly show you where all the Starbucks in, or near, Miami are located.

A useful new feature would combine Google's Get Directions and Find Businesses.

If you travel from Urbana-Champaign to Miami, you could then see on Google Maps exactly where all the Starbucks are along your trip.
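
Here is a rough sketch of how the two features might be combined, assuming the driving route is available as a list of sampled (latitude, longitude) points and each business has a known location. None of this reflects the actual Google Maps API. The function names, the 5 km threshold, the approximate route coordinates, and the two cafes are all made up for the example.

```python
from math import radians, sin, cos, asin, sqrt

def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points given in degrees."""
    lat1, lon1, lat2, lon2 = map(radians, (a[0], a[1], b[0], b[1]))
    h = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * 6371.0 * asin(sqrt(h))

def businesses_along_route(route_points, businesses, max_km=5.0):
    """Keep the businesses that lie within max_km of any sampled route point."""
    return [
        name
        for name, location in businesses
        if any(haversine_km(location, point) <= max_km for point in route_points)
    ]

# Approximate, coarsely sampled route from Urbana-Champaign to Miami (illustrative only).
route = [(40.11, -88.21), (36.16, -86.78), (33.75, -84.39), (25.76, -80.19)]
# Made-up businesses: one close to the route, one far from it.
shops = [("Cafe A", (36.17, -86.79)), ("Cafe B", (30.00, -95.00))]
print(businesses_along_route(route, shops))
```

Only "Cafe A" is kept. A real route would need to be sampled much more densely, every kilometer or so, for this check to catch every business along the way.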