Google and Microsoft are Manipulating your Log Files

During the last few months, we came across a weird phenomenon. When we analyzed our website traffic, we found that we were receiving large amounts of traffic referred from Google and Live.com (MSN) for generic expressions that fit our website well, yet we held no positions for these terms, neither organic nor paid.

After reading and searching for a while, we found out that we are not alone, and we also found the explanation. It appears that these so-called “visits” came from IP addresses owned by Microsoft and Google, and that they are some type of crawl or user emulation performed by the search engines.

So apart from having to re-analyze our data, we had to get rid of these visits.
What we did in our web analytics software (we use and recommend ClickTracks) was to exclude the IP addresses and blocks listed below.

How do you know if you have been hit?

1. Check your referral data and see whether Google / Live.com are sending you very generic (short-tail) search phrases that you don’t rank for.

2. Check the log and see whether the string includes the URL parameter FORM=LVSP or, more recently, FORM=LIVSOP.

3. Check the IP address against the details below.

Microsoft (Live.com):
65.55.165.*
131.107.0.*

Google:
209.85.63.133
72.52.140.4
72.232.16.50
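
Checks 2 and 3 above can be sketched in a few lines of Python. This is only an illustration, not something ClickTracks or the search engines provide: the function names are made up, the wildcard ranges are written as /24 networks, and it assumes the FORM parameter shows up in the referrer URL recorded in your log.

```python
import ipaddress
import re

# Addresses and blocks copied from the list above.
# "65.55.165.*" is expressed as the /24 network 65.55.165.0/24.
SUSPECT_NETWORKS = [
    ipaddress.ip_network("65.55.165.0/24"),    # Microsoft (Live.com)
    ipaddress.ip_network("131.107.0.0/24"),    # Microsoft (Live.com)
    ipaddress.ip_network("209.85.63.133/32"),  # Google
    ipaddress.ip_network("72.52.140.4/32"),    # Google
    ipaddress.ip_network("72.232.16.50/32"),   # Google
]

# Matches FORM=LVSP or FORM=LIVSOP as a query-string parameter.
FORM_MARKER = re.compile(r"[?&]FORM=(LVSP|LIVSOP)\b")


def is_suspect_ip(ip: str) -> bool:
    """Return True if the IP falls inside one of the listed blocks."""
    try:
        addr = ipaddress.ip_address(ip)
    except ValueError:
        return False  # not a valid IP address, so it cannot match
    return any(addr in net for net in SUSPECT_NETWORKS)


def has_form_marker(url: str) -> bool:
    """Return True if the URL carries one of the FORM parameters."""
    return FORM_MARKER.search(url) is not None


def is_emulated_visit(ip: str, referrer: str) -> bool:
    """Flag a hit when either the IP check or the FORM check matches."""
    return is_suspect_ip(ip) or has_form_marker(referrer)
```

For a hit from 65.55.165.12, is_emulated_visit returns True whatever the referrer is; for any other IP it returns True only when the referrer contains one of the two FORM values.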

Important note:
Both engines have stated that blocking their IPs will not be “welcome” (how arrogant…). So, in order not to get hurt, I definitely do not recommend blocking their IPs on your server, but rather excluding them from your analysis.
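
If your analytics tool has no IP exclusion feature, one way to exclude without blocking is to filter a copy of the log before analysis. The sketch below is a rough illustration under two assumptions: the client IP is the first whitespace-separated field of each log line (as in the common Apache log formats), and the wildcard ranges are treated as /24 networks. The function names are hypothetical.

```python
import ipaddress

# Addresses and blocks from the list in this post; wildcards become /24.
EXCLUDED_NETWORKS = [
    ipaddress.ip_network(block)
    for block in (
        "65.55.165.0/24",
        "131.107.0.0/24",
        "209.85.63.133/32",
        "72.52.140.4/32",
        "72.232.16.50/32",
    )
]


def keep_line(line: str) -> bool:
    """Return False for log lines whose client IP is in an excluded block."""
    fields = line.split(None, 1)
    if not fields:
        return True  # blank line: nothing to exclude
    try:
        addr = ipaddress.ip_address(fields[0])
    except ValueError:
        return True  # first field is not an IP; leave the line untouched
    return not any(addr in net for net in EXCLUDED_NETWORKS)


def filter_log(lines):
    """Return only the lines that should stay in the analysis."""
    return [line for line in lines if keep_line(line)]
```

The server keeps answering these requests as usual; only the copy of the log you feed into your reports is cleaned.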
