Bill Roth, Ulitzer Editor-at-Large


Recommending Search

We have all bought items like movies, books and music online and seen the now customary recommended titles suggested by the retailer. This is seen as a good way to upsell products that should interest us, based on what we have previously bought or browsed. When I thought about the processes that enable good recommendations, it made me think of search, or more precisely, what good search should be. Let me explain.

There are a number of ways recommender systems can work, and for those of you who are really interested, an overview can be found here. Essentially there are two main approaches. In the first, recommendations are based on what other ‘similar’ customers purchased. For example, if I bought the DVDs for The Sopranos, Casino and The Untouchables, and other similar customers had bought Casino, The Untouchables and Goodfellas, it makes sense to recommend Goodfellas to me. This is known as a collaborative filtering approach.
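To make the collaborative filtering idea concrete, here is a minimal sketch in Python. The customer names and purchase lists are made up for illustration, and scoring by Jaccard overlap is just one simple way of deciding who counts as a ‘similar’ customer; real retailers will use something far more elaborate.

```python
from collections import Counter

# Hypothetical purchase histories; the customers are invented for illustration.
purchases = {
    "me": {"The Sopranos", "Casino", "The Untouchables"},
    "customer_a": {"Casino", "The Untouchables", "Goodfellas"},
    "customer_b": {"Casino", "The Untouchables", "Goodfellas"},
    "customer_c": {"Barbie", "Disney Princess"},
}


def jaccard(a, b):
    """Overlap between two sets, from 0.0 (disjoint) to 1.0 (identical)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def recommend(user, purchases, top_n=3):
    """Score titles the user lacks by the similarity of the customers who own them."""
    mine = purchases[user]
    scores = Counter()
    for other, theirs in purchases.items():
        if other == user:
            continue
        similarity = jaccard(mine, theirs)
        if similarity == 0:
            continue  # customers with nothing in common contribute nothing
        for title in theirs - mine:
            scores[title] += similarity
    return [title for title, _ in scores.most_common(top_n)]


print(recommend("me", purchases))  # ['Goodfellas']
```

Note that customer_c, who shares nothing with me, has no influence on the result at all.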

A problem is that these systems often focus mostly on your recent purchases, and this colours their view of what you are interested in. I remember buying DVDs for my young daughter on her birthday, and for the next few months, whenever I went back to buy movies for myself, I was recommended Barbie & Disney Princess movies! Not that useful. The recommendation engine didn’t understand the context in which I was buying the movies (my daughter’s birthday), and this was a major flaw. So understanding context is critical.

Alternatively, recommendations can be made based on characteristics of the items purchased. These are called content-based approaches, and they rely heavily on the ability of the system to extract quality features that are predictive of the item in question. Reusing our movie example, we can extract predictive features such as actors, director, genre and so on. Given that a user likes Casino and The Untouchables, which share a genre (Mafia) and an actor (Robert De Niro), the system can recommend other Mafia movies featuring this actor, e.g. A Bronx Tale.
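A content-based recommender can be sketched in a few lines of Python. The toy catalogue below only carries the genre and actor features mentioned above, and the `genre:` / `actor:` labels and the simple counting score are my own assumptions for illustration, not any particular product's method.

```python
from collections import Counter

# Toy catalogue: each title is described by a set of extracted features.
catalog = {
    "Casino": {"genre:mafia", "actor:Robert De Niro"},
    "The Untouchables": {"genre:mafia", "actor:Robert De Niro"},
    "A Bronx Tale": {"genre:mafia", "actor:Robert De Niro"},
    "Disney Princess movie": {"genre:family"},
}


def build_profile(liked, catalog):
    """Count how often each feature appears among the titles the user likes."""
    profile = Counter()
    for title in liked:
        profile.update(catalog[title])
    return profile


def recommend_by_content(liked, catalog, top_n=3):
    """Rank unseen titles by how strongly their features match the profile."""
    profile = build_profile(liked, catalog)
    scores = {}
    for title, features in catalog.items():
        if title in liked:
            continue
        score = sum(profile[feature] for feature in features)
        if score > 0:
            scores[title] = score
    return sorted(scores, key=scores.get, reverse=True)[:top_n]


print(recommend_by_content({"Casino", "The Untouchables"}, catalog))
# ['A Bronx Tale']
```

Everything here depends on the quality of the features: if the extraction step misses the genre or the cast, the recommender has nothing to match on.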

Pandora, an Internet music radio service, is an interesting application of recommendation technology. Users type a song or artist that they like, and the Pandora service identifies & plays songs that are musically similar by examining the attributes of the song and comparing them with those of other songs in its database. It therefore falls broadly under a content-based approach. Users can provide feedback on whether they liked the songs chosen; this modifies the system’s perception of them, and Pandora can then take it into account for future selections.

For example, if you type in ‘Whole Lotta Love’, the Led Zeppelin classic, you can see the types of features Pandora identifies for this song, e.g. a subtle use of vocal harmony, repetitive melodic phrasing, demanding instrumental part writing, mixed acoustic and electric instrumentation, minor key tonality, a dirty electric guitar solo, and so on. Recommendations for this song included songs by artists such as Cream, Jimi Hendrix and The Rolling Stones. So you get the idea.
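The feedback loop mentioned earlier can be pictured as a set of attribute weights that move up or down with each thumbs-up or thumbs-down. The sketch below is only my guess at the general shape of such a loop, not Pandora's actual algorithm; the candidate songs and the ‘synth-heavy production’ attribute are hypothetical.

```python
def apply_feedback(weights, song_attributes, liked, step=1.0):
    """Return new attribute weights after a thumbs-up (liked=True) or thumbs-down."""
    updated = dict(weights)
    delta = step if liked else -step
    for attribute in song_attributes:
        updated[attribute] = updated.get(attribute, 0.0) + delta
    return updated


def score(weights, song_attributes):
    """Sum the listener's weights over the attributes a song has."""
    return sum(weights.get(attribute, 0.0) for attribute in song_attributes)


# The seed song counts as an implicit thumbs-up for its attributes.
seed = {"minor key tonality", "dirty electric guitar solo", "repetitive melodic phrasing"}
weights = apply_feedback({}, seed, liked=True)

# Two hypothetical candidate songs.
candidate_a = {"minor key tonality", "dirty electric guitar solo"}
candidate_b = {"repetitive melodic phrasing", "synth-heavy production"}

# The listener gives candidate_b a thumbs-down.
weights = apply_feedback(weights, candidate_b, liked=False)

print(score(weights, candidate_a))  # 2.0
print(score(weights, candidate_b))  # -1.0
```

After the thumbs-down, songs that resemble candidate_b drop down the list, while the attributes the listener has not objected to keep their weight.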

In my view the cool thing about Pandora isn’t the recommendation technology itself but the technology that automatically identifies and extracts the attributes of each song, which is what makes a recommendation possible in the first place.

Irrespective of how recommendations are made, there is one principle that is important to build into any recommendation engine: diversity. It is important to add a little diversity to any recommendation set, because without it the customer never sees anything new or innovative. There is no creativity in what is presented to them; all they get is more and more of the same. From a commercial point of view this is very limiting. It is in the retailer’s interest to encourage customers to try new things: if customers discover a new line of products they are interested in, this opens up new revenue streams for the retailer and also increases customer satisfaction. A Pandora customer may love Led Zeppelin, but what if Pandora threw up a Black Eyed Peas track? There is just a chance they may surprise themselves and like it, opening up an entirely new line of music for themselves and additional revenue for the retailer.
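One common way to build diversity into a ranked list is maximal marginal relevance (MMR), which trades a candidate's relevance against its similarity to items already chosen. I am using MMR here purely as an illustration; the track names, relevance scores and feature sets are invented, and nothing suggests this is what any retailer mentioned here actually does.

```python
def jaccard(a, b):
    """Overlap between two feature sets, from 0.0 to 1.0."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def rerank_with_diversity(candidates, relevance, features, k=3, trade_off=0.5):
    """Greedy MMR: pick items that are relevant but unlike those already picked."""
    selected = []
    remaining = list(candidates)
    while remaining and len(selected) < k:
        def mmr(item):
            redundancy = max(
                (jaccard(features[item], features[s]) for s in selected),
                default=0.0,
            )
            return trade_off * relevance[item] - (1 - trade_off) * redundancy

        best = max(remaining, key=mmr)
        selected.append(best)
        remaining.remove(best)
    return selected


# Hypothetical candidates for a classic rock fan.
relevance = {"rock_track_a": 0.9, "rock_track_b": 0.85,
             "rock_track_c": 0.8, "hip_hop_track": 0.5}
features = {
    "rock_track_a": {"electric guitar", "blues rock"},
    "rock_track_b": {"electric guitar", "blues rock"},
    "rock_track_c": {"electric guitar", "blues rock"},
    "hip_hop_track": {"rap vocals", "dance beat"},
}

print(rerank_with_diversity(list(relevance), relevance, features, trade_off=1.0))
# ['rock_track_a', 'rock_track_b', 'rock_track_c']
print(rerank_with_diversity(list(relevance), relevance, features, trade_off=0.5))
# ['rock_track_a', 'hip_hop_track', 'rock_track_b']
```

With `trade_off=1.0` the list is pure relevance, more and more of the same. Lowering it lets the less obvious track into the top three.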

So what has this to do with search, and more specifically Sophia? Well, everything really, because if we think about it, what is search if not the process of recommending relevant information to users based on their interests (where interests are described by queries)? The problem is that most search tools can’t do the job properly! They only do a partial job of ‘recommending’ what they deem relevant. Often the definition of relevant is extremely limited, restricted to those items (documents) that simply contain the user’s search term(s). What about items that contain related concepts? Are these not similar also? If I search for ‘cars’, do I not also want ‘automobiles’? Yet too often our search tools cannot provide even this basic level of service without relying on taxonomies or ontologies to power them. Other key components are missing too, including the ability to understand context and the ability to introduce diversity into the mix so as to drive discovery and innovation in organisations. Considering the amount of time and money spent by some of the world’s top companies trying to improve search, it’s incredible that enterprise users remain so dissatisfied with their search results.
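The ‘cars’ versus ‘automobiles’ gap is easy to demonstrate. The Python sketch below compares literal term matching with a crude form of distributional similarity, where two words count as related if they keep the same company across documents. The three documents are invented, the 0.9 threshold is arbitrary, and this is a textbook stand-in, not a description of how Sophia works.

```python
import math
from collections import Counter, defaultdict

# Three made-up documents.
docs = {
    "d1": "cars for sale with low mileage",
    "d2": "automobiles for sale with low mileage",
    "d3": "piano lessons for beginners",
}


def literal_search(term, docs):
    """Return only the documents that contain the exact query term."""
    return sorted(d for d, text in docs.items() if term in text.split())


def context_vectors(docs):
    """For each word, count the other words it shares a document with."""
    vectors = defaultdict(Counter)
    for text in docs.values():
        words = text.split()
        for i, word in enumerate(words):
            for j, other in enumerate(words):
                if i != j:
                    vectors[word][other] += 1
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def related_terms(term, vectors, threshold=0.9):
    """Words whose contexts closely resemble the contexts of the query term."""
    target = vectors.get(term, Counter())
    return sorted(w for w in vectors
                  if w != term and cosine(target, vectors[w]) >= threshold)


def concept_search(term, docs, threshold=0.9):
    """Match documents containing the term or any closely related term."""
    terms = {term, *related_terms(term, context_vectors(docs), threshold)}
    return sorted(d for d, text in docs.items() if terms & set(text.split()))


print(literal_search("cars", docs))   # ['d1']
print(concept_search("cars", docs))   # ['d1', 'd2']
```

No taxonomy was written by hand here: ‘automobiles’ is picked up only because it appears in the same contexts as ‘cars’. On a corpus this small the trick is fragile, which is exactly why doing it well is hard.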

Sophia, by contrast, is a new generation search & discovery tool that understands context, so it can be used just as effectively as a fully fledged recommender system as it can for search. It’s the Pandora of the textual world. It recommends similar items based on the meaning & context of their textual descriptions. It doesn’t return items simply because they share the user’s query terms. It goes much deeper than that.

Given descriptions of items such as movies, TV shows, books, apps, magazine articles or even web pages, Sophia can automatically extract predictive features, understand the context of the text, identify related items and introduce diversity into the mix to enable users to discover other items they should be interested in but were unaware of. These features are the core attributes that any standard recommender system should have and, in my opinion, the core features of a good search product.

So let’s look at an example recommendation created by Sophia based on data from the App store. We’ll take each of the key features of a recommendation engine in turn using the query ‘piano’ as a use case to show what I mean…

1) Extract predictive content. What is the core essence of an app description? What is it about? Sophia automatically understands the meaning of texts. It is this meaning that Sophia extracts & uses as descriptive features for each item.

2) Understand context. Words can have many different meanings depending on their context. Because Sophia understands these different meanings, given a query term it can intelligently organise the apps around each meaning of the query (we call these different meanings themes or contexts). This reduces the amount of irrelevant information to sift through.

Example themes discovered by Sophia on the app store data include

a) Theme 1 Playing piano on a virtual keyboard

b) Theme 2 Listening to piano pieces & viewing their sheet music as they play

c) Theme 3 Lessons / tutoring – learning the piano

d) Theme 4 Learning Piano Chords
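To give a feel for what organising results into themes involves, here is a deliberately naive Python sketch that groups app descriptions by word overlap. The app names and descriptions are hypothetical, and word overlap is only a crude stand-in for the understanding of meaning described above; it is not Sophia's method.

```python
STOPWORDS = {"a", "and", "for", "on", "the", "to", "while"}


def content_words(description, query):
    """Words of a description, minus stopwords and the query term itself."""
    return {w for w in description.lower().split()
            if w not in STOPWORDS and w != query}


def jaccard(a, b):
    """Overlap between two word sets, from 0.0 to 1.0."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def group_by_theme(apps, query, threshold=0.4):
    """Single-pass grouping: join the first theme whose seed app is similar enough."""
    themes = []  # each theme is a list of app names; the first one is its seed
    for name, description in apps.items():
        words = content_words(description, query)
        for theme in themes:
            seed_words = content_words(apps[theme[0]], query)
            if jaccard(words, seed_words) >= threshold:
                theme.append(name)
                break
        else:
            themes.append([name])  # nothing similar yet, so start a new theme
    return themes


# Hypothetical apps returned for the query 'piano'.
apps = {
    "Virtual Piano": "play piano on a virtual keyboard",
    "Pocket Keys": "virtual keyboard to play piano anywhere",
    "Sheet Player": "view sheet music while the piano plays",
    "Score Reader": "listen and view sheet music for piano",
    "Piano Tutor": "piano lessons for beginners",
    "Chord Coach": "learn piano chords",
}

for theme in group_by_theme(apps, "piano"):
    print(theme)
```

On this toy data the six apps fall into four groups that line up with the four themes listed above. Real descriptions are far messier, and word overlap alone would quickly break down.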

3) Identify similar items. Because Sophia understands context and meaning, it can easily identify related items that have a similar meaning within the same context and present them to the user. These apps may contain related concepts rather than the query term itself. Taking the four themes identified above in turn, let’s delve deeper and see what other recommendations Sophia finds for the query ‘piano’.

a) Theme 1 - Playing piano on a virtual keyboard

  • Suggested related apps – playing keyboards, organs or guitars from a virtual keyboard

b) Theme 2 - Listening to piano pieces & viewing their sheet music as they play

  • Suggested related apps – apps that allow you to play along while they play the song & show sheet music as you play.

c) Theme 3 - Lessons / tutoring – learning the piano

  • Suggested related apps – learning to play other instruments such as keyboards & organ

d) Theme 4 - Learning Piano Chords

  • Suggested related apps – learning to play chords on organs, keyboards & guitars

4) Introduce diversity. Sophia can then use its knowledge of context & meaning to suggest other items that have a similar meaning in different contexts. In this way it presents new items that are ‘somewhat similar’, opening up ideas users would not have thought about. An example for our use case would be a theme containing apps that let you record various band instruments, mix the sound, overlay other tracks, and save the finished sound for replaying & distributing. This is related to the original query, but at the same time it is a diverse suggestion that may be of value.

The next time you are purchasing online and see recommendations, think of your company’s search capabilities and ask, ‘Am I getting as good a recommendation from it when I search?’ And if you are not, ask ‘Why not?’


Bill Roth is a Silicon Valley veteran with over 20 years in the industry. He has played numerous product marketing, product management and engineering roles at companies like BEA, Sun, Morgan Stanley, and EBay Enterprise. He was recently named one of the World's 30 Most Influential Cloud Bloggers.