W3Techs Logo
provided by
Q-Success

Google can't track every single click of your web surfing. Only most of them.

Posted by Matthias Gelbmann on 27 February 2012 in News, Blogger, DoubleClick, Google +1, Google AdSense, Google Analytics, Google Servers

Summary:

If you don't trust Google, you may want to avoid it while surfing the web. Good luck to you.

If you are anything like me, you love a lot of what Google offers. As soon as I fire up my Google Chrome browser, I head over to Google Search, Google Maps, Gmail, Google Calendar, Google Docs or Picasa. And whenever I stop wasting my time on Google+, I continue doing so on YouTube. These services are mostly free and reliable, so why should I think twice about using them?

There is a reason. Google most likely has more data about people in its databases than any other organization in the world. More than the former Soviet KGB could have hoped to get in its wildest dreams. If you have teenage kids with Android phones, then Google almost certainly knows quite a few things about them that you don't. Google may know where they are at any moment via Google Latitude, who all their friends and acquaintances are via their synchronized contact list, what they did last night via their uploaded pictures, and what they say about you via Google Talk.

Now, one might say if you are worried about this, then simply stop using these Google services and you are off the hook.

Really?

If you don't go near the Internet, then that's probably the case. But if you happen to live in the 21st century, Google will still collect data from your website visits via services it provides for webmasters. We collect statistics about a number of such services for our surveys. These services are:

Service                                  Percentage of websites using it
Google Analytics                         55.6%
AdSense                                  18.3%
DoubleClick                              1.6%
Teracent                                 < 0.1%
Google Web Servers                       1.0%
Blogger                                  0.9%
Google Sites                             < 0.1%
Google +1 (incl. the old Google Buzz)    11.3%
Google Library API                       soon to be published


Taking these figures, we investigated how many sites are not using any of these services. We had to take the overlaps into account: some sites use both Analytics and AdSense, for example, so we cannot simply add the usage figures. This is what we found out: the percentage of websites that use any of these Google services is 63.5%. In other words:

Only 36.5% of the web is Google-free.
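
To illustrate why the usage figures cannot simply be added, here is a minimal Python sketch. It is not how the survey figure was computed; that requires per-site data, which is not given in this article. From the published shares alone, one can only bound the share of sites using at least one service: it is at least the largest single share and at most the sum of all shares. The entries listed as "< 0.1%" are left out here, which is an assumption of this sketch.

```python
# Usage shares (percent of websites), copied from the table above.
# Assumption: entries listed as "< 0.1%" are omitted (treated as zero).
usage = {
    "Google Analytics": 55.6,
    "AdSense": 18.3,
    "DoubleClick": 1.6,
    "Google Web Servers": 1.0,
    "Blogger": 0.9,
    "Google +1": 11.3,
}


def union_bounds(shares):
    """Bound the share of sites using at least one service.

    Lower bound: every other service is used only on sites that
    already use the most popular one (complete overlap).
    Upper bound: no site uses two services (no overlap), capped at 100.
    """
    shares = list(shares)
    return max(shares), min(100.0, sum(shares))


lower, upper = union_bounds(usage.values())

# Figure reported in the article, derived from per-site survey data.
reported_union = 63.5
google_free = 100.0 - reported_union

print(f"at least one service: between {lower:.1f}% and {upper:.1f}%")
print(f"reported: {reported_union:.1f}% (Google-free: {google_free:.1f}%)")
```

The reported 63.5% lies well below the naive sum, which shows how much the services overlap: many of the sites that use AdSense or Google +1 also use Google Analytics.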

This is a very conservative estimate, because there are several popular Google services that we don't monitor: embedded YouTube videos, embedded Google Maps, Google Site Search, Google Checkout and Feedburner are some examples. However, the services that we left out tend to be used on individual web pages only, whereas the services from our surveys are typically used on all or on most pages of a site. Therefore, the percentage of web pages that are Google-free is almost certainly even lower than 36.5%, but probably not much lower.

What does that mean? Suppose somebody wants to stay away from Google out of concern for privacy or for any other reason. Suppose that person does some research on the web and visits any 5 websites that are not owned by Google. Then the chance that none of these sites uses any Google service, so that no traces are left on any Google server, is 0.365^5, or roughly 0.65%, assuming the five sites are chosen independently.

The probability of providing data to Google
when visiting 5 random websites,

without actively using any Google service,
is 99.35%.
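
The arithmetic behind that figure can be reproduced in a few lines of Python. This is a minimal sketch that assumes each visited site is an independent random draw from the surveyed websites, so that each one is Google-free with probability 36.5%. The variable names are illustrative only.

```python
# Share of websites that use none of the surveyed Google services
# (figure from the article).
p_google_free = 0.365

# Number of websites visited.
n_sites = 5

# Assumption: the sites are independent random draws, so the chance
# that all of them are Google-free is the product of the individual
# probabilities.
p_no_trace = p_google_free ** n_sites
p_some_trace = 1.0 - p_no_trace

print(f"no Google service on any of {n_sites} sites: {p_no_trace:.2%}")
print(f"at least one site reports to Google: {p_some_trace:.2%}")
```

In practice the independence assumption is a simplification: people tend to visit popular sites, and the usage share of these services may differ there from the overall average.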

There are a few things one could discuss concerning that figure; I will try to address some of them:

  • The various Google services run on separate servers, so it is not possible to combine all these data.

    While it is technically not possible to have something like a super-cookie covering all Google properties and thus readily identifying a visitor along the way, techniques such as browser fingerprinting, combined with all the other data a website visitor leaves behind, can achieve pretty much the same. I think of this like a jigsaw puzzle, where Google tries to bring all these little data points together. They will never find and properly locate all the pieces, but it's sufficient to have plenty of them in place in order to recognize the picture. I'm quite confident that the smart guys at Google know a thing or two about digging into large amounts of data.
     
  • You can turn off JavaScript and use Ad Blockers, so that you are not affected.

    Disabling some (but not all) of the Google data collection is possible. Google itself provides tools such as the Analytics Opt-out Browser Add-on, and there are any number of third-party tools. However, selecting, configuring and updating these tools on several browsing platforms such as PCs, smartphones and tablets is more effort than most people and most companies' IT administrators are willing to spend.
     
  • Who cares?

    Some people do, others don't. I personally must say that I trust Google more than I trust any government in the world, including my own, but that's a low bar. Call me naive, but I don't believe terrible misuse of the data is planned at the Googleplex at this moment. I think that Google knows best that one can lose people's confidence only once, and as soon as a Google ad is generally perceived as a severe privacy issue, that would pretty much be the end of the company.

    But that doesn't mean that things can't go wrong at some stage. Certainly, seeing all those mountains of data in one place does leave a nervous feeling. Mistakes are made, even at Google, as has been known to happen again and again. There could be data leaks, or outright criminal conduct, or a change of Google's policies at any time. And while we are on the subject of the governments of the world: all that data being available may well, under certain circumstances, give them, too, more information about me than I want them to have.

Whatever your personal conclusions are, I hope that this little investigation will contribute to making data collectors, surfers, webmasters and lawmakers alike aware of the magnitude of the problem. We have reached a critical point where it's next to impossible for an individual to decide where and when he or she wants to give away some data to the biggest data collector. It all happens with or without you.

_________________
Please note that all trends and figures mentioned in this article were valid at the time of writing. Our surveys are updated frequently, and these trends and figures are likely to change over time.



6 comments

Andrew Schwartzmeyer on 27 February 2012

And then you also have people who use Google Voice (me included), which gives them all of your SMS and phone call communication. They give you the option of recording calls, so they certainly have the ability to do so as well.

Stefan on 28 February 2012

> I personally must say that I trust Google more than I trust any government in the world, including my own

Don't forget that Google has to obey US legislation, and the legislation of the other countries it operates in. So, even if you trust Google, do you trust governments that can make Google comply with user data requests? I don't :)

http://www.google.com/transparencyreport/governmentrequests/userdata/

Jamie S. on 28 February 2012

Is there a list of the domains in that chart given in the article?

Reply by author Matthias Gelbmann on 28 February 2012

@Jamie: We have that list, it is the basis of our surveys, but you can't download it, if that's what you mean.

You can, however, check the technology usage of any site via our site info page: http://w3techs.com/sites

NorthernCanuck on 1 March 2012

Yes, Google must obey US legislation. Presumably this includes the Patriot Act, which requires US companies to hand over data, including personal data, on anybody such companies may be dealing with, including people from other countries. The Act does not require *any* notification that people's data has been passed on.

You can bet data held by Google, Facebook, etc. is routinely requested by US authorities.

This is not necessarily a bad thing.  I'm just saying...

 

NC

Pam on 25 April 2012

This was so dead on, almost scary actually! I often refer to them as "The Google God's" on my website, guess I didn't realize how literal that was! :)

Thanks for the great read Matthias!




This entry is closed for comments.

Copyright © 2009-2014 Q-Success