
This is how confusing the results are
So, following on from the previous post “Can the Googlebot read JavaScript? Ajax? Cookies?”, the wait is over, the results are in…
Overall the results are quite confusing, but they lead me to some strange, tentative conclusions:
- Googlebot CAN read JavaScript
- Googlebot CAN execute, and read the results of, AJAX requests
- Googlebot CAN NOT store/read Cookies
But the killer is:
- Googlebot DOES NOT use this information in search relevancy calculations
If we take a look at the cached version of the page, it doesn’t actually tell us anything about the Googlebot. The only string that can be seen is the one that uses inline JavaScript. That is pretty much expected: the Ajax requests use relative rather than absolute URLs, so they fail on the cached version.
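To make the relative-path point concrete, here is a minimal Python sketch of how a relative Ajax path resolves differently depending on where the page is served from. Every URL and file name in it is made up for illustration (the real test URLs are not reproduced here), and it assumes the cached copy carries no `<base>` element pointing back at the original site.

```python
from urllib.parse import urljoin

# Hypothetical URLs, for illustration only; not the real test pages.
ORIGINAL_PAGE = "http://example.com/blog/googlebot-test/"
CACHED_PAGE = "http://webcache.example.net/search?q=cache:example.com/blog/googlebot-test/"
AJAX_PATH = "ajax/test-string-2.txt"  # hypothetical relative path used in the Ajax call


def resolve(page_url, relative_path):
    """Resolve a relative path the way a browser would, against the page URL."""
    return urljoin(page_url, relative_path)


original_target = resolve(ORIGINAL_PAGE, AJAX_PATH)
cached_target = resolve(CACHED_PAGE, AJAX_PATH)

print(original_target)  # points at the file on the original site
print(cached_target)    # points at the cache host, where the file does not exist
```

On the original page the request lands on the real file; on the cached copy the same relative path ends up on the cache host, so the request fails and nothing is displayed.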
And, as we predicted, the cookie is not set, telling us that the Googlebot didn’t/can’t read cookies.

How do they know the string exists if they didn't parse the JavaScript?
The ability of the Googlebot to read the JS and Ajax is not immediately obvious.
JavaScript
Indeed if we take the first string:
and search for it in Google, we get no results. Try it yourself.
However…
A search for the line immediately before the JavaScript text (“so… Relentless Marauder becomes…”) returns the page.
When you check out the page preview, the search query “so… Relentless Marauder becomes…” is highlighted.
Right below it is our JavaScript text….
This is really interesting as it shows the JavaScript text in the right context on the page. It also shows that:
Googlebot is able to read the JavaScript as a string of text.
But the only place that you can see evidence of that is in the page preview section, which is quite strange.
What makes this more interesting is that the JavaScript string is conspicuously missing in the snippet of this search result.
Given the exact search query we are presented with a snippet that shows the context of the query on the target page, but the JavaScript text is simply missing altogether.

The JavaScript string is missing in the snippet.
This would suggest to me that there are perhaps 2 kinds of Googlebot: The classic text based crawler and a more advanced, browser type bot that can handle JavaScript, CSS, etc.
On the results page here we see Google showing us two different interpretations of the content of our page. OK, in this case the difference is only two words, but theoretically it could be huge.
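As a rough illustration of what a “classic text based crawler” would see, here is a small Python sketch that pulls the visible text out of raw HTML without executing any scripts. The HTML and the JavaScript string are placeholders I have made up, and this is only my guess at the general idea, not a description of how Google actually does it.

```python
from html.parser import HTMLParser


class RawTextExtractor(HTMLParser):
    """Collects text from raw HTML, skipping anything inside <script> tags."""

    def __init__(self):
        super().__init__()
        self._in_script = False
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            self._in_script = True

    def handle_endtag(self, tag):
        if tag == "script":
            self._in_script = False

    def handle_data(self, data):
        # Script source is never executed, so its output never appears here.
        if not self._in_script and data.strip():
            self.chunks.append(data.strip())


def indexable_text(raw_html):
    parser = RawTextExtractor()
    parser.feed(raw_html)
    parser.close()
    return " ".join(parser.chunks)


# Placeholder markup standing in for the test page.
RAW_HTML = """<p>so... Relentless Marauder becomes...</p>
<script>document.write("placeholder javascript string");</script>
<p>Text after the script.</p>"""

print(indexable_text(RAW_HTML))
```

The text written by the script never makes it into the extracted text, which would match a snippet that skips straight over the JavaScript string. A browser-type bot that runs the script would see it in place, as the page preview does.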
Ajax Requests
So we know that the GoogleBot can read the JavaScript as a string of text, but chooses not to use it in calculating relevancy to a search query. But what about the Ajax requests we make?

GoogleBot reading the Ajax result, in context on the page
If we take the second test string:
and run another Google search for that phrase, we see a strange result. Try it yourself.
As you can see, the search returns the file that contains that string. On the main test page we call this file with an Ajax call and show the contents. This file is not linked from any other page which would lead me to think that the GoogleBot has understood the structure of the Ajax request and sent off a spider to grab the contents of the file.
Whilst I can’t completely rule out that the GoogleBot got there via an external link, I think given the time frame and obscurity of the file location, this is very unlikely.
Again if we search for the text immediately prior to the string we are testing, we get a results page with the string showing up in the page preview and not in the snippet. This is exactly the same as with the JavaScript text and again we have Google showing us two different interpretations of the page content.
With the third test string we also run some referrer filtering to make sure that the text is only output when it is called via an Ajax request into the main page.
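For anyone wondering what that referrer filtering might look like, here is a minimal Python sketch of the idea. The page URL, the test string and the `respond` function are all hypothetical stand-ins, and returning an empty body for other requests is my assumption for the sketch rather than a record of what the test file really did.

```python
# Hypothetical values, for illustration only.
MAIN_PAGE_URL = "http://example.com/blog/googlebot-test/"
TEST_STRING = "placeholder test string"


def respond(headers):
    """Return (status, body), outputting the test string only for requests
    whose Referer header points at the main test page."""
    # Header names are case-insensitive, so normalise them before the lookup.
    normalized = {name.lower(): value for name, value in headers.items()}
    referer = normalized.get("referer", "")
    if referer.startswith(MAIN_PAGE_URL):
        return 200, TEST_STRING
    # Assumption: any other request gets an empty body.
    return 200, ""


print(respond({"Referer": MAIN_PAGE_URL}))  # request made from the main page
print(respond({}))                          # direct request with no referrer
```

The Referer header is trivially easy to fake, so a check like this only filters out casual direct visits; it proves nothing about who is really asking.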
Again, this file was found and indexed. See here.
But, again, when we run a search for the text prior to the expected string, we are presented with the Ajax text shown in the page preview but not in the snippet. Try it yourself. This is perhaps the strongest evidence I have seen yet that, in some way, the Googlebot DOES execute Ajax requests and CAN read the resulting output.
Cookies
The fourth test yielded no results whatsoever, backing up the idea that the Googlebot CAN NOT store and read cookies (which is good, as it’s the main premise behind my other post “Faking Backlinks using the Referrer”).
Conclusion
So, in conclusion, it would seem that the Googlebot has the ability to parse JavaScript content and to read the results of Ajax requests, but for some reason these elements are not being used to calculate search relevancy in the same way that on-page text is.
Why is this?

Happy Face Man is Happy.
Finally, I would like to say a big thanks to everyone who helped spread the previous post. It was great to get feedback from other great SEOs out there.
Thanks for taking the time to run these experiments. I love reading stuff like this.
I have a few things to add that might clarify your results.
As you pointed out, Google seems to understand JavaScript when it renders Instant Previews, but not when calculating relevance scores or displaying SERP snippets. However, this shouldn’t be interpreted as one single entity (Google) acting differently under different circumstances–rather, it should be interpreted as two separate/independent processes.
The process that generates Instant Previews is programmed to execute/render a web page (including JavaScript) just like a browser would. [http://support.google.com/webmasters/bin/answer.py?hl=en&answer=1062498]
On the other hand, Google calculates relevance scores from the raw HTML code returned by the server after the initial request. External resources are neither fetched nor executed prior to indexing. Doing so would be much too expensive in terms of memory and CPU (at least for now). This is why Instant Previews are often generated on the fly: Google doesn’t want to spend its resources on that until a user explicitly requests it.
The text contained in the Instant Preview snippets is also calculated on the fly, since it changes depending on the search query.
With regards to AJAX requests, I think you’re giving Google too much credit. You concluded this:
…and this:
There’s an important detail here that should be emphasized: your blog post does NOT rank for phrases 2 or 3; only the external files do. This means that Google did NOT understand the nature of the AJAX request, because the content of those external files was NOT associated with the page (i.e., the blog post) that embedded them.
In other words, all Google did was find a string that looked like a URL…tried it…saw that it worked…and treated it like a new web page. I’ve written a post about this, if you’re interested:
http://www.seomofo.com/advanced/do-not-let-google-crawl-javascript.html
Cheers,
SEO Mofo
Hey,
Thanks for the comment.
I think the line:
“As you can see, the search returns the file that contains that string. On the main test page we call this file with an Ajax call and show the contents. This file is not linked from any other page which would lead me to think that the GoogleBot has understood the structure of the Ajax request and sent off a spider to grab the contents of the file.”
is ambiguous. I agree it is more likely that the Googlebot simply found a string that looks like a URL and spidered it, although in this case the strings were relative paths to the files, which is slightly confusing. What I meant to say is that the Googlebot figured out that the content was there somehow (I guessed by quasi-interpreting the AJAX request) and then spidered the external files.
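To be fair, relative paths would not stop a crawler that knows which page the script came from. Here is a rough Python sketch of the “find a string that looks like a URL and try it” idea. The regular expression, the file extensions and the example script are all my own assumptions; nobody outside Google knows what heuristic they actually use.

```python
import re
from urllib.parse import urljoin

# Assumption: a quoted string ending in a common web file extension "looks like a URL".
URL_LIKE = re.compile(r"""["']([^"'\s]+\.(?:php|html?|txt|js))["']""")


def candidate_urls(page_url, script_source):
    """Find URL-looking string literals in a script and resolve them against the page URL."""
    return [urljoin(page_url, match) for match in URL_LIKE.findall(script_source)]


# Hypothetical page URL and script, for illustration only.
PAGE_URL = "http://example.com/blog/googlebot-test/"
SCRIPT = 'xhr.open("GET", "ajax/test2.php", true); xhr.send();'

print(candidate_urls(PAGE_URL, SCRIPT))
```

No JavaScript is executed at any point here. The script is treated as plain text, which fits the observation that the external files were indexed as standalone pages rather than as part of the post that embedded them.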
As for interpreting the GoogleBot/Crawling as being two separate/independent processes, this is kind of the conclusion I was trying to draw but I think the conclusion became a little lost in the sweeping generalisations I made.
But when I said:
“Googlebot CAN execute, and read the results of, AJAX requests”
I was referring to the process used in the generating of the instant previews.
Theoretically Google could use this data to calculate search relevancy in the future.
The main point was that Google is able to process the AJAX and JS stuff and understand the outputs as a string of text, but it doesn’t do this on a normal crawl of the site, only in the process used to generate the preview, and it seems that this isn’t influencing relevancy at the moment.