We occasionally hear from users asking why Alexa's traffic data for a site doesn't match the data from their site's logs. After all, a site's logs are the definitive source of visit and pageview data, right? Yes. Of course. Sort of...
Few individuals are sophisticated enough to read their logs and understand what they mean. Even with the help of expensive log analysis programs, people can be lulled into believing they are looking at definitive stats for their site. But the sad fact is that logs are deceiving and that no two log analysis programs count things the same way.
Why is it so hard? First, let's tackle the biggest elephant in the room: what is a visit? In the simplest terms, a visit is when an individual person visits your site. But beyond that simple definition there is no agreement as to what a visit actually means. Consider:
- Cookies. Some sites use cookies. Some sites don't. Cookies help identify when the same user comes back on the same computer with the same browser. That's good. But sometimes people use multiple browsers and multiple computers... I know I do. Cookies don't help there.
- Log-ins. Some sites require their visitors to log in. This is a great way to track visitors, but few sites actually require users to log in each time they visit.
- Repeat visits. What if the person goes away then comes back? Is that 1 visit or 2? What if the person gets up for a cup of coffee and comes back 15 minutes later? What if that person comes back 5 times in a day? There is little agreement as to what constitutes a visit in these cases. How does your log program count them?
- What if that person comes back with a different IP address or logs in as a different user?
- Crawlers? The Web is littered with crawlers. Sometimes the vast majority of pageviews on Alexa's own site come from crawlers. Does your log program recognize them and remove them?
- Raw Logs vs. Web Bugs. Do you analyze your raw Web server logs, or do you have an embedded JavaScript bug on your page that logs the visits with a third-party service? The differences between those two methods can be vast. If you are curious, try both and you'll see.
- Fraud. It is easy to create fraudulent visits and pageviews in logs. Unfortunately, it can also be very profitable. Planning to sell a domain or trying to increase click-throughs? Just start clicking around. Clear your cookies and do it again. Go to a different computer and do it again. It ALL shows up in the logs.
This is just the tip of the iceberg. The point is simply this: a visit is NOT a visit, even if you are using the same log analysis program. You can't reliably detect fraud or crawlers or many of the other factors mentioned above that have a drastic impact on your reported visitor number.
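To make the "coffee break" question concrete, here is a minimal sketch of how the same handful of hits produces different visit counts depending on the idle timeout a log program happens to use. The hit data, the visitor names, and the `count_visits` helper are all made up for illustration; real log analysis programs each have their own rules.

```python
from datetime import datetime, timedelta

# Hypothetical hits for illustration: (visitor_id, timestamp).
HITS = [
    ("alice", datetime(2024, 1, 1, 9, 0)),
    ("alice", datetime(2024, 1, 1, 9, 10)),
    ("alice", datetime(2024, 1, 1, 9, 25)),  # back after a 15-minute coffee break
    ("alice", datetime(2024, 1, 1, 14, 0)),  # back again in the afternoon
    ("bob", datetime(2024, 1, 1, 9, 5)),
]


def count_visits(hits, timeout):
    """Count visits, starting a new one whenever a visitor was idle longer than `timeout`."""
    last_seen = {}
    visits = 0
    for visitor, ts in sorted(hits, key=lambda hit: hit[1]):
        previous = last_seen.get(visitor)
        if previous is None or ts - previous > timeout:
            visits += 1
        last_seen[visitor] = ts
    return visits


visits_10_min = count_visits(HITS, timedelta(minutes=10))
visits_30_min = count_visits(HITS, timedelta(minutes=30))
print(visits_10_min, visits_30_min)
```

With a 10-minute timeout the coffee break starts a new visit; with a 30-minute timeout it does not. Neither answer is wrong, which is exactly the problem.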
This is where a panel, like Alexa, can be a useful solution. Here's why:
- We count all sites the same. A visit is a visit is a visit. Did you get up to get coffee or come back to the site 20 times? No big whoop. We count a max of one visit per user per 24 hours.
- Cookies, log-ins, different IP addresses, etc. All are counted in the same way, regardless of how the site is set up. Alexa identifies unique visits made by individual visitors.
- Crawlers? What crawlers? All Alexa visits are made by humans with toolbars installed into their browsers. Crawlers are simply not part of the equation.
- Fraud. Despite some dubious claims to the contrary (Jason) it is next to impossible to generate fraudulent traffic on Alexa. We've spent years working on this problem and can guarantee you this: The visits counted on Alexa are much less likely to be influenced by fraud than a site's own logs.
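The "max of one visit per user per 24 hours" rule can be sketched as follows. This is only an illustration, not Alexa's actual implementation: it assumes that "24 hours" means a calendar day and that each panel member has a stable identifier, and the `count_daily_visits` helper and sample data are hypothetical.

```python
from datetime import datetime


def count_daily_visits(hits):
    """Count at most one visit per (user, calendar day) pair.

    Assumption: "per 24 hours" is treated as a calendar day.
    """
    return len({(user, ts.date()) for user, ts in hits})


# Hypothetical panel data: alice returns to the site several times on day one.
panel_hits = [
    ("alice", datetime(2024, 1, 1, 9, 0)),
    ("alice", datetime(2024, 1, 1, 9, 25)),
    ("alice", datetime(2024, 1, 1, 14, 0)),
    ("alice", datetime(2024, 1, 2, 8, 30)),
    ("bob", datetime(2024, 1, 1, 9, 5)),
]

daily_visits = count_daily_visits(panel_hits)
print(daily_visits)
```

However many times alice comes back on the first day, she contributes one visit for that day, so the count does not depend on session timeouts or on how the site itself is set up.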
Don't get me wrong. Web site logs are still important and definitely have their place. But they require vastly more sophistication from users if they are to be interpreted properly and/or compared with one another. The alternative is a panel, like Alexa. It isn't perfect, but provided that your site is adequately represented in our panel, it is often better than your own logs.
If you would like to join Alexa's panel so that your visits are counted too, install the Alexa Toolbar and distribute it to your users. The Web's very first toolbar is still the best and getting better all the time.