Netcraft, Stats, and other works of fiction…

In my early days on the web, in 1996 and 1997, I wrote my first online service at www.freestats.com. In exchange for advertising on their sites, I gave away web stats tracking to sites that didn’t have access to log files or other means of tracking their visitors. (Freestats is now owned by United Online, the NetZero guys, and is still in operation today.) I had access to an enormous amount of data back then. I even tracked stats for some sites that were huge at the time, such as altavista.com. The collective data gave me an insight into the web that others simply didn’t have. I could see trends in browsers, JavaScript, operating systems, etc. as they were happening. I loved this information because it was valuable and ACCURATE.

Today there are many ways to track stats for your site, and there are services that sell collective trend information to companies that are willing to pay. One such company is Netcraft. The reason I mention them directly is that they are often considered “the standard” for information about market share for web companies, web server market share, and so on. Here is the problem. A couple of different companies have shown us our “status” according to Netcraft, which attempts to break down how many domains we manage, our hosting market share, and other critical market information. The information is flat-out wrong. In fact, it’s not even close. We are MUCH larger than they show. This got me looking closely at other services that peddle this type of information. Some were accurate, but most were off in their data by 30-50%. Why is this so hard to pin down?

Sometimes the information you need just doesn’t exist and you have to approximate the data. Other times the information is made to appear a certain way because large customers must show gains in their respective areas. This is common in the web server market breakdown, where Microsoft will do anything to show gains in IIS, whether by not counting Google’s results as using Apache or by paying off Go Daddy to use Microsoft servers for its default parked pages before a real site is loaded.

Information and data are only as good as the view into that information. It’s always better to know more than less, but looking at information with a little skepticism is sometimes a good thing.

Matt Heaton / President Bluehost.com

4 Responses to “Netcraft, Stats, and other works of fiction…”

  1. You reminded me about a famous quote
    “There are three kinds of lies: lies, damned lies, and statistics.” - Benjamin Disraeli

  2. Andy Lax says:

    Matt,

    I agree that one needs to exercise a certain amount of caution when examining statistics, reading the conclusions of a given study, and reviewing information that is purportedly objective. The “who” — the party that is dispensing the material — may be even more meaningful than the “what” — the information that is provided.

    It is truly disconcerting to discover that so-called “reliable” information or data is erroneous, especially if this reflects an author’s deliberate attempt to mislead or lie to readers.

  3. Thomas says:

    At the end of the day, these “statistics” as seen from Netcraft or webhosting.info are nothing more than algorithms which rarely, in my own experience, match the internal figures of a web hosting organization.

    Three things really matter: income statement, balance sheet, and customer retention. For privately held firms these don’t get scrutinized, analyzed, or “massaged” by the workings of data miners.

    Hold those three close, and let the “works of fiction” fall by the wayside.

    -Thomas

  4. I won’t comment on the accuracy of the Netcraft numbers in relation to your company, as I don’t know the inside of that… however, I’m not sure about your source that says that Netcraft doesn’t count Google numbers. I will agree that there could be better ways of tracking metrics overall, however.
