Tools to play with
- Aaron’s tool has some interesting ‘Phrase Match’ data, but it is only marginally effective for this exercise and would need sorting.
- KW Map is interesting, but it is also only marginally effective and has no export option to speak of. Close, but no cigar.
- Vseo Tool – also not the greatest, but it certainly presents some reasonable semantic concepts and can be exported.
- WordStream – also comes close (I am helping develop a tool, though), but nothing by default really groups deeper semantic relations for our purposes. It emails the list to you for sorting purposes.
- Nichebot – these guys almost have it with the poorly named ‘LSI’ tool. It probably produces some of the best lists for our purposes. Fully exportable for sorting.
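Since most of these tools hand you a flat export that still needs sorting, here is a minimal sketch of one way to do a first pass: group the exported phrases by the terms they share. It assumes the export is a CSV with a column named `keyword`; that column name, the tiny stopword list, and both function names are illustrative assumptions, not something any of the tools above guarantees.

```python
import csv
import io
from collections import defaultdict

# Illustrative stopword list (an assumption; extend it for real data).
STOPWORDS = frozenset({"a", "an", "the", "for", "of", "to", "in"})


def read_keywords(csv_text, column="keyword"):
    """Read phrases from CSV text; 'keyword' is an assumed column name."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [row[column] for row in reader if row.get(column)]


def group_by_shared_term(phrases, stopwords=STOPWORDS):
    """Map each non-stopword term to the sorted phrases that contain it.

    Only terms that appear in at least two distinct phrases are kept,
    since a term used once does not relate anything to anything.
    """
    groups = defaultdict(set)
    for phrase in phrases:
        cleaned = phrase.strip().lower()
        for term in cleaned.split():
            if term not in stopwords:
                groups[term].add(cleaned)
    return {
        term: sorted(found)
        for term, found in sorted(groups.items())
        if len(found) > 1
    }


if __name__ == "__main__":
    sample = (
        "keyword\n"
        "latent semantic indexing\n"
        "semantic search\n"
        "phrase based indexing\n"
    )
    for term, phrases in group_by_shared_term(read_keywords(sample)).items():
        print(term, "->", phrases)
```

This only catches phrases that literally share a word, so it is a sorting aid rather than any kind of semantic analysis; the deeper relations still have to come from the tools themselves and your own judgment.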
Googly Tools
- Keyword Tool – about as use(less?) as the others. It has some insights, but not deep enough for this exercise, although it is easier to sort and does support downloads.
- Search-based Keyword Tool – not as good as the above KW tool in the testing I did recently for this. It does support exporting though.
- Google Sets – this one isn’t obvious right away, but it is handy. If you look at the ‘description’ element, you can start to see some supporting terms that might come in handy (since Googly is recommending them). The problem is that it doesn’t give results for granular/obscure terms. (Also try Google Squared.)
Semantic relations
- Onelook reverse dictionary – returns the list of related terms, each word linked to its definition (more tricks from Ann here) – does a reasonable job but doesn’t have an export function.
- Reference.com reverse dictionary – clusters related terms into groups by their meaning and gives the actual definition for each cluster: barely usable.
- Rhyme Zone – define your term and find rhymes, synonyms and antonyms. Using the ‘Find related terms’ option you can get some pretty usable lists; unfortunately, they are not exportable.
Good Geeky Reading
Posts
- What you need to know about phrase based IR
- Phrase based IR one more time
- Lost Google patent on Phrase Based IR
- Google awarded another Phrase based IR patent
- Phrase based optimization resources
- Probabilistic latent semantic analysis
- Latent Dirichlet allocation
- Hidden Topic Markov Models
- Phrase Based Information Retrieval and Spam Detection
- Google Phrase Based Indexing Patent Granted
Google Patents
- Determining query term synonyms within query context
- Domain Dictionary Creation (NLP for non-roman character sets)
- Word decompounder
- Integrating external related phrase information into a phrase based indexing IR system
- Semantic unit recognition
- Phrase-based generation of document descriptions
- Segmenting words using scaled probabilities
- Inferring search category synonyms from user logs
Microsoft Patents
- System and method for identifying base noun phrases
- Consistent phrase relevance measures
- Semantic canvas
- Method and system for performing phrase/word clustering and cluster merging
- System for automatically annotating training data for a natural language understanding system
- Method for finding semantically related search engine queries
- Ranking parser for a natural language processing system
- Context-based key phrase discovery and similarity measurement utilizing search engine query logs
- Flexible keyword searching
Videos for Geeks
- Extracting Semantic Relations from Query Logs - Ricardo Baeza-Yates, Yahoo! Research
In this paper we study a large query log of more than twenty million queries with the goal of extracting the semantic relations that are implicitly captured in the actions of users submitting queries and clicking answers. Previous query log analyses were mostly done with just the queries and not the actions that followed after them.
- Machine learning and translation – Google tech talks
This is an interesting presentation on probabilistic learning and dealing with better understandings of user intent. Kind of heavy lifting for the search geeks, but still worth watching for any SEO.
- Machine Learning, Probability and Graphical Models - Sam Roweis, Department of Computer Science, University of Toronto
- What’s the future of semantic search? – Matt Cutts video discussing the differences and his take on where it’s going