Tuesday, October 30, 2007

Google vs. YouTube?

About a month ago, Microsoft Live passed MSN in daily rank; you should expect to see this reflected in Alexa's three-month rank any day now. The next sites in line for succession --- MySpace and Facebook --- currently lack the reach to upset the Top5. In the middle of this Microsoft dance are Google and YouTube. The following graph illustrates some interesting trends.
As you can see, for the last five weekends YouTube (in teal) has out-ranked not only Live (in black) but also Google (in red), making it the #2 site on the internet. This raises two questions: 1.) Really? and, 2.) Why? To understand this trend, let's first take a look at the corresponding reach and traffic graphs.

These graphs show that Google always has more unique users per day than YouTube but that every weekend, Google loses traffic whereas YouTube gains. This is not so far-fetched: the other search engines in the Top5 --- Yahoo! and Live --- show the same general trend. In fact, youtube.com has (for several months) had more overall traffic than google.com. This is also not far-fetched because of the different natures of these sites. When people go to Google, they are searching for content that is likely elsewhere. On the other hand, people that go to YouTube are going for the content on YouTube. Taking into account that (in general) search engines are a means and video sites are an end, it is not unreasonable that YouTube could get more pageviews per user (and more traffic) than Google.

This isn't the whole story, though. Another big difference between Google and YouTube is the way they use international domains, or ccTLDs. If I go to http://www.google.co.uk, I get the British version of Google and that traffic counts for google.co.uk, not for google.com. If I go to http://www.youtube.co.uk, I get redirected to http://uk.youtube.com and that traffic counts for youtube.com. In the Alexa Top100, Google is represented by their .com plus 19 ccTLDs. Each of these international sites pulls in between 0.1% and 0.3% of the daily traffic Alexa sees. Adding that to the 2.2% pulled in by google.com, Google's search sites get at least 6.0% of daily pageviews, which beats YouTube's 3.0% hands down. (Also of note: this beats Yahoo!'s 4.8% and they pull the same ccTLD trick that YouTube does, when possible. There are some exceptions --- yahoo.co.jp is an example --- due to individual registry rules.)
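The arithmetic above can be sanity-checked with a short sketch. It uses only the figures quoted in this post (2.2% for google.com and 19 ccTLDs at 0.1% to 0.3% each); the function and variable names are made up for illustration. The quoted per-site range only bounds the total, and the 6.0% figure, which presumably comes from the actual per-site values, falls inside those bounds.

```python
# Figures quoted in the post; the names are illustrative only.
GOOGLE_COM_SHARE = 2.2           # percent of daily pageviews for google.com
CCTLD_COUNT = 19                 # Google ccTLDs in the Alexa Top100
CCTLD_SHARE_RANGE = (0.1, 0.3)   # percent per ccTLD: (lowest, highest)


def brand_share_bounds(com_share, cctld_count, per_site_range):
    """Return (low, high) bounds on the combined share of .com plus ccTLDs."""
    low, high = per_site_range
    return (
        round(com_share + cctld_count * low, 1),
        round(com_share + cctld_count * high, 1),
    )


low, high = brand_share_bounds(GOOGLE_COM_SHARE, CCTLD_COUNT, CCTLD_SHARE_RANGE)
print(f"Google search sites: between {low}% and {high}% of daily pageviews")
```

Even the low end of that range is larger than YouTube's 3.0%, so the conclusion does not depend on the exact per-site numbers.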

So the answers to the questions posed above depend on what you mean by "Google," i.e. whether it is a site, a brand, or a company. The first would be only google.com; the second would include google.com, google.fr, google.com.br, google.de, etc. The last would include those properties plus YouTube, Orkut, Blogger, et al. The point is that those are important questions to ask when doing traffic analysis on the web. Be wary of analyses that fail to take these kinds of issues into account, especially if the analyst incorrectly uses statistics. (Luckily, some people get it right.)
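To make the site/brand/company distinction concrete, here is a minimal sketch that rolls the same per-domain numbers up at different levels. The shares for google.com and youtube.com are the ones quoted above; the ccTLD values are placeholders picked from the 0.1% to 0.3% range, and the `OWNERSHIP` table and `aggregate` helper are hypothetical, not anything Alexa provides.

```python
from collections import defaultdict

# domain -> (brand, company). The grouping follows the examples in the post.
OWNERSHIP = {
    "google.com": ("Google", "Google"),
    "google.fr": ("Google", "Google"),
    "google.de": ("Google", "Google"),
    "youtube.com": ("YouTube", "Google"),
}

# Percent of daily pageviews per domain.
# google.com and youtube.com are quoted in the post; the ccTLD values
# are placeholders, not measured data.
SHARES = {
    "google.com": 2.2,
    "google.fr": 0.2,
    "google.de": 0.2,
    "youtube.com": 3.0,
}


def aggregate(shares, level):
    """Sum per-domain shares by 'brand' or 'company'."""
    index = {"brand": 0, "company": 1}[level]
    totals = defaultdict(float)
    for domain, share in shares.items():
        totals[OWNERSHIP[domain][index]] += share
    return {name: round(total, 1) for name, total in totals.items()}


print("by site:   ", SHARES)
print("by brand:  ", aggregate(SHARES, "brand"))
print("by company:", aggregate(SHARES, "company"))
```

The same traffic gives three different answers to "how big is Google?" depending on the level you pick, which is why an analysis should say which one it means.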

Friday, October 19, 2007

Beyond the New Black

As a follow-up to my post about "The New Black," here are more graphs for your consideration. I actually intended to write this up last week but I guess late is the new early. Jumping right in, as a computer scientist, I'm interested in programming language trends. The graph below was seeded from the list of programming languages on Wikipedia.


One of my co-workers was offended that "Python is the new Basic," so I guess I should talk about my interpretation of the data. The statement does not say that Python imitates, replaces, or has the same warts as BASIC. My translation is "Whereas BASIC was popular (or well-known) in the past, Python is now popular (and well-known) in a similar way, today." This is certainly the case for "HTML is the new assembly," given that assembly is a basic building block for compiled programs and HTML is a basic building block for the web. People (and computers) haven't forgotten about assembly; rather, the web has its own version of something similar.

The strongest links in the graph are, in order:
  1. C is the new assembly
  2. Java is the new COBOL
  3. SQL is the new HTML
  4. Ruby is the new PHP
These mirror the way people think about most of these languages already. C has been called a portable assembler, Java and COBOL are both often thought of as business languages, SQL is behind a lot of dynamically generated HTML, and Ruby (on Rails) is competing with PHP for building web applications.

The only double-edge in the graph is between Java and C (including C++), which is indicative of the long-standing feud between developers on both sides. (It does raise the question as to why .NET is nowhere to be found.) The other cycle is from XML to XML, a testament to the eXtensibility of that standard.

And for those of you whom I lost at the top of the post, I took the liberty of putting together a graph of different sports and leagues, below. Enjoy!

Thursday, October 18, 2007

What can Alexa do for you?

Summer is over and I am putting together my to-do list for Alexa, trying to figure out what we are going to be working on for the next year. But before I break out my pencil, I thought I'd ask you first.

If you have about a minute and a half, take this brief survey and tell me what's on the top of your Alexa wish list:

Survey

Thanks.

Friday, October 05, 2007

Shake Up in the Alexa Top 10

Some of you noticed that this week, Google and MSN traded places in Alexa's Top 10 list. Actually, it's better to say that google.com and msn.com traded places, in order to avoid getting confused. This is mostly due to a significant drop in traffic for msn.com; in fact, because Alexa's Top Sites list is based on three months of data, it's been clear for weeks that MSN and Google were going to cross paths. All of this makes sense when you take into account Microsoft's campaign to promote their Live brand, with live.com at #5 and rising.

Fewer noticed that Hi5 and Baidu switched places. This would have happened sooner, except for August 30th. What happened that day? In some Latin American countries it's the Feast of Saint Rose of Lima. It's no coincidence that August 30th is a national holiday in Peru, from which Hi5 gets 12% of its users. So is it fair to compare the two when Baidu gets 90% of its users from China? It depends on what you're looking for.

For instance, youtube.com (currently at #4) gets 13% of its users from the US whereas Google's .com gets 26% of its users from the US. Compare this with myspace.com, which is globally at #6 but gets 45% of its users from the US where it is currently #3. These demographics are almost certainly important when making business decisions, which is why Alexa produces top sites lists for countries, languages, and categories.

Taking a look at our current Top 10, we have

  1. Yahoo!
  2. Google
  3. MSN
  4. YouTube
  5. Windows Live
  6. MySpace
  7. Orkut
  8. Facebook
  9. Wikipedia
  10. Hi5
So four of the top five are web portals, with Yahoo! still the king of the hill. Of the other top spots, five are social networking sites and one is a collaborative encyclopedia. Those sparklines contain 4 months' worth of 1-week averages; the traffic drop you can see a quarter from the right is Labor Day Weekend. I found it interesting that Orkut took a dip but, unlike the other sites, did not recover. The drop is due to fewer pageviews without a loss in reach.

If it's hard to imagine that the US could have such a big impact on a site that gets most of its reach from Brazil, consider what would happen if a small number of users who were responsible for a greater-than-average number of pageviews suddenly disappeared. The demographic would not significantly change even though there was a significant impact to traffic. I suspect that many Americans (responsible for more than their fair share of hits on that site) went to BBQs that weekend, realized the people at the party were on other networks, and made the switch. Is it significant that Facebook, which gets 30% of its users from the US, had a drop in unique users that Friday with a less-than-proportional drop in pageviews the same day?
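A toy calculation shows how that can happen. All of the numbers below are invented for illustration (they are not Orkut data): a site with 1,000 daily users, of whom 50 are heavy users viewing 100 pages each while the rest view 5 pages each. Half of the heavy users then leave.

```python
def summarize(pageviews_per_user):
    """Return (reach, pageviews) for a list of per-user pageview counts."""
    return len(pageviews_per_user), sum(pageviews_per_user)


def percent_drop(before, after):
    """Percentage decrease from `before` to `after`."""
    return 100 * (before - after) / before


# Hypothetical audience: 950 light users and 50 heavy users.
before = [5] * 950 + [100] * 50
# Same audience after 25 of the heavy users switch to another site.
after = [5] * 950 + [100] * 25

reach_before, views_before = summarize(before)
reach_after, views_after = summarize(after)

reach_drop = percent_drop(reach_before, reach_after)
views_drop = percent_drop(views_before, views_after)

print(f"reach fell by {reach_drop:.1f}%")
print(f"pageviews fell by {views_drop:.1f}%")
```

In this made-up case, losing 2.5% of the users removes roughly a quarter of the pageviews, so reach and demographics barely move while traffic drops sharply.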

So what's the point? Fundamentally: Alexa ranks websites, not companies. In the global Top 10, Microsoft takes two spots and Google takes three. This traffic can only be properly understood by taking global demographics into account. Finally, a site's rank is dependent on its traffic in relation to other sites. That is, your site's rank can change even if your traffic does not --- or vice versa --- due to relative traffic changes on other sites with similar rank.
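The last point is easy to see in a small sketch. The sites and traffic numbers are hypothetical, and ranking by a single traffic figure is a simplification: Alexa's Top Sites list is based on three months of data, which this does not attempt to reproduce.

```python
def rank_sites(traffic):
    """Rank sites by traffic, highest first; returns {site: rank}."""
    ordered = sorted(traffic, key=traffic.get, reverse=True)
    return {site: position for position, site in enumerate(ordered, start=1)}


# Hypothetical traffic for three sites in two consecutive weeks.
week1 = {"a.example": 120, "b.example": 100, "c.example": 90}
# Only c.example changes; b.example's traffic is identical.
week2 = {"a.example": 120, "b.example": 100, "c.example": 110}

print("week 1:", rank_sites(week1))
print("week 2:", rank_sites(week2))
```

Here b.example slips from #2 to #3 without losing a single visit, simply because a neighbor with similar rank gained traffic.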

Tuesday, October 02, 2007

Battle of the Game Consoles

Last week, the Google Blog posted a story about search trends and video game systems, namely a comparison between searches for Xbox 360, Playstation 3, and Wii.

Looking at the Google Trends graph next to an Alexa Traffic Graph (reach) for xbox.com, playstation.com, and wii.com, you can see mostly the same trends. In 2005-Q4, the Xbox is clearly the winner in both searches and traffic, for very good reason: of the three, it was the only one actually available. The traffic for “Playstation 3” correlates to the Tokyo Game Show, at which several games for the (then non-functional) system were demoed. The name “Wii” was only officially announced in 2006-Q2, which shows up as a bump in searches, at flag A. This does not translate into traffic for wii.com because the domain was parked until at least July of that year, which can be verified by taking a look at the Wayback Machine.

All three consoles have search and traffic surges in 2006-Q4; since then, Xbox 360 and Playstation 3 have been neck-and-neck in both traffic and searches, though their reach appears to be increasing. Wii, however, has remained level, after the initial peak. If you compare wii.com and nintendo.com, the name announcement and the console launch are both clearly visible. In actuality, the news volume (which is down) more closely resembles the traffic for Nintendo (and Wii) than the number of searches. So the question becomes: what will happen to Wii this winter? Is it just a late bloomer?