On Search: XML
Categorised under: Search tools
Tim Bray has written another article on search, this time on searching XML. To quote:
Back when people were doing the initial sales job for XML (and its predecessor SGML) one big part of the pitch was how this was going to make search so much better: “Searching in the context of a <title> or <product-name> or <metaphysical-paradigm> is going to be ever so much more precise and powerful than boring old brute-force full-text search.” And in principle, it should be.
But there are a couple of things wrong with this picture. First, people don’t want to compose queries and do flexible, powerful structure-sensitive searches. As I’ve written here previously, people in general want to type the minimal number of keystrokes into a search window and say Go, and have the system figure it out for them. Secondly, descriptive markup is a form of metadata, and there is no cheap metadata, and XML is no exception. If your text inventory is in Word or HTML, XMLifying it in any useful way is going to be very, very expensive. Which is to say, XML may not be cost-effective strictly in terms of making search run better.
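To make the contrast in the quote concrete, here is a minimal sketch of the two styles of search it describes: brute-force full-text search versus a search scoped to one element. The two sample documents, the `DOCS` inventory and the function names are assumptions invented for illustration; they do not come from Bray's article.

```python
import xml.etree.ElementTree as ET

# Hypothetical two-document inventory, invented for this illustration.
DOCS = {
    "doc1": "<doc><title>Searching XML</title>"
            "<body>Notes on markup and metadata.</body></doc>",
    "doc2": "<doc><title>Metadata costs</title>"
            "<body>Why searching XML is hard.</body></doc>",
}


def full_text_search(term):
    """Brute-force search: match the term anywhere in the document text."""
    term = term.lower()
    hits = []
    for doc_id, xml in DOCS.items():
        root = ET.fromstring(xml)
        # Flatten all text in the document, ignoring the markup entirely.
        text = " ".join(root.itertext()).lower()
        if term in text:
            hits.append(doc_id)
    return hits


def element_search(tag, term):
    """Structure-sensitive search: match the term only inside <tag> elements."""
    term = term.lower()
    hits = []
    for doc_id, xml in DOCS.items():
        root = ET.fromstring(xml)
        # Only text inside elements with the requested tag is considered.
        if any(term in "".join(el.itertext()).lower() for el in root.iter(tag)):
            hits.append(doc_id)
    return hits


print(full_text_search("xml"))         # both documents mention XML somewhere
print(element_search("title", "xml"))  # only doc1 has it in the title
```

The element-scoped query is more precise, but it only works because someone has already marked up the titles, and the searcher has to know to ask for `title`. Those are the two costs the quote points to.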

James Robertson is the Managing Director of