Google's Bots Read Like Humans, Now

When the first version of Google’s search engine was released, it relied on one thing and one thing only: links in a page. A database of links was built up by stripping out all other page elements until just the metadata and the links remained.

The bots have undergone many iterations since then, but one thing has remained the same: web pages are reformatted to be more comfortable for robot reading. That has meant no JavaScript and no complex page elements, and any pages that didn’t focus on machine readability were penalized.


But apparently that has changed. While digging through his Apache logs, a developer discovered that Google’s bots now execute any JavaScript they encounter. If the bots weren’t using the results to improve how the page is read, running that JavaScript would only make it harder for them to get to the content.

The only reason for the code to be executed is for the bots to get an actual idea of what the page looks like.
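For readers who want to look for the same pattern in their own logs, here is a minimal sketch in Python. It assumes Apache's "combined" log format and a hypothetical URL, /ajax/comments.json, that only the page's script ever requests; neither detail comes from the developer's report. A request for such a URL from a client identifying itself as Googlebot suggests the script was actually run. The user-agent string can be spoofed, so treat a hit as a hint, not proof.

```python
import re

# Apache "combined" log format:
# host ident user [time] "request" status bytes "referer" "user-agent"
LOG_PATTERN = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" '
    r'(?P<status>\d{3}) \S+ '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"$'
)

# Hypothetical example: URLs on your own site that are requested only
# by JavaScript and never linked from plain HTML.
JS_ONLY_PATHS = {"/ajax/comments.json"}


def js_triggered_bot_hits(lines, bot_token="Googlebot",
                          js_only_paths=JS_ONLY_PATHS):
    """Return (time, path) pairs for bot requests to script-only URLs."""
    hits = []
    for line in lines:
        match = LOG_PATTERN.match(line.strip())
        if match is None:
            continue  # skip lines that are not in combined format
        path = match["path"].split("?", 1)[0]  # ignore any query string
        if bot_token in match["agent"] and path in js_only_paths:
            hits.append((match["time"], path))
    return hits
```

To try it, pass an open log file to the function, for example js_triggered_bot_hits(open("access.log")), where access.log stands in for wherever your server writes its log.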

Further, it looks like the bots aren’t just skimming out the URLs anymore, but mimicking how actual users click the links. That is big, because it means the bots are using the web like we do. The inability of bots to do so has been a major obstacle to the development of true semantic networks on the web, and this might be how Google is planning on taking its new Knowledge Graph to the next level.

We still know very little about what Google is working on in this regard. After all, the evidence exists purely in one developer’s logs, and even then it is only a record of a bot doing something it shouldn’t have been doing. But the fact that this is being done at all has a far-reaching impact on the future of the web.

Source: Engadget. Photo by: Carlos Luna