Tuesday, October 30, 2007

Google vs. YouTube?

About a month ago, Microsoft Live passed MSN in daily rank; you should expect to see this reflected in Alexa's three-month rank any day now. The next sites in line for succession --- MySpace and Facebook --- currently lack the reach to upset the Top5. In the middle of this Microsoft dance are Google and YouTube. The following graph illustrates some interesting trends.
As you can see, for the last five weekends YouTube (in teal) has out-ranked not only Live (in black) but also Google (in red), making it the #2 site on the internet. This raises two questions: 1.) Really? and, 2.) Why? To understand this trend, let's first take a look at the corresponding reach and traffic graphs.

These graphs show that Google always has more unique users per day than YouTube but that every weekend, Google loses traffic whereas YouTube gains. This is not so far-fetched: the other search engines in the Top5 --- Yahoo! and Live --- show the same general trend. In fact, youtube.com has (for several months) had more overall traffic than google.com. This is also not far-fetched, given the different natures of these sites. When people go to Google, they are searching for content that is likely elsewhere. On the other hand, people who go to YouTube are going for the content on YouTube. Taking into account that (in general) search engines are a means and video sites are an end, it is not unreasonable that YouTube could get more pageviews per user (and more traffic) than Google.

This isn't the whole story, though. Another big difference between Google and YouTube is the way they use international domains, or ccTLDs. If I go to http://www.google.co.uk, I get the British version of Google and that traffic counts for google.co.uk, not for google.com. If I go to http://www.youtube.co.uk, I get redirected to http://uk.youtube.com and that traffic counts for youtube.com. In the Alexa Top100, Google is represented by their .com plus 19 ccTLDs. Each of these international sites pulls in between 0.1% and 0.3% of the daily traffic Alexa sees. Adding that to the 2.2% pulled in by google.com, Google's search sites get at least 6.0% of daily pageviews, which beats YouTube's 3.0% hands down. (Also of note: this beats Yahoo!'s 4.8% and they pull the same ccTLD trick that YouTube does, when possible. There are some exceptions --- yahoo.co.jp is an example --- due to individual registry rules.)
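
The arithmetic above can be bracketed with a quick sketch. It uses only the figures quoted in this post; the per-site values for the 19 ccTLDs are not listed here, so the code computes only a lower and an upper bound, and the 6.0% figure presumably comes from summing the actual values. The function and variable names are illustrative.

```python
# Figures quoted in the post (percent of daily pageviews seen by Alexa).
GOOGLE_COM = 2.2                 # google.com alone
CCTLD_COUNT = 19                 # Google ccTLDs in the Alexa Top100
CCTLD_MIN, CCTLD_MAX = 0.1, 0.3  # range quoted for each ccTLD
YOUTUBE = 3.0                    # youtube.com (already includes its ccTLD traffic)


def brand_share_bounds(base, count, low, high):
    """Return (lower, upper) bounds on the combined share of base + count sites."""
    return round(base + count * low, 1), round(base + count * high, 1)


low, high = brand_share_bounds(GOOGLE_COM, CCTLD_COUNT, CCTLD_MIN, CCTLD_MAX)
print(f"Google search sites: between {low}% and {high}% of daily pageviews")
print(f"Ahead of YouTube even at the lower bound: {low > YOUTUBE}")
```

Even the most pessimistic reading of the quoted range puts Google's search sites ahead of YouTube; the comparison with Yahoo! depends on the actual per-site numbers.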

So the answers to the questions posed above depend on what you mean by "Google," i.e. whether it is a site, a brand, or a company. The first would be only google.com; the second would include google.com, google.fr, google.com.br, google.de, etc. The last would include those properties plus YouTube, Orkut, Blogger, et al. The point is that those are important questions to ask when doing traffic analysis on the web. Be wary of analyses that fail to take these kinds of issues into account, especially if the analyst incorrectly uses statistics. (Luckily, some people get it right.)

Friday, October 19, 2007

Beyond the New Black

As a follow-up to my post about "The New Black," here are more graphs for your consideration. I actually intended to write this up last week but I guess late is the new early. Jumping right in, as a computer scientist, I'm interested in programming language trends. The graph below was seeded from the list of programming languages on Wikipedia.


One of my co-workers was offended that "Python is the new Basic," so I guess I should talk about my interpretation of the data. The statement does not say that Python imitates, replaces, or has the same warts as BASIC. My translation is "Whereas BASIC was popular (or well-known) in the past, Python is now popular (and well-known) in a similar way, today." This is certainly the case for "HTML is the new assembly," given that assembly is a basic building block for compiled programs and HTML is a basic building block for the web. People (and computers) haven't forgotten about assembly; rather, the web has its own version of something similar.

The strongest links in the graph are, in order:
  1. C is the new assembly
  2. Java is the new COBOL
  3. SQL is the new HTML
  4. Ruby is the new PHP
These mirror the way people think about most of these languages already. C has been called a portable assembler, Java and COBOL are both often thought of as business languages, SQL is behind a lot of dynamically generated HTML, and Ruby (on Rails) is competing with PHP for building web applications.

The only double-edge in the graph is between Java and C (including C++), which is indicative of the long-standing feud between developers on both sides. (It does raise the question as to why .NET is nowhere to be found.) The other cycle is from XML to XML, a testament to the eXtensibility of that standard.

And for those of you whom I lost at the top of the post, I took the liberty of putting together a graph of different sports and leagues, below. Enjoy!

Thursday, October 18, 2007

What can Alexa do for you?

Summer is over and I am putting together my to-do list for Alexa, trying to figure out what we are going to be working on for the next year. But before I break out my pencil, I thought I'd ask you first.

If you have about a minute and a half, take this brief survey and tell me what's on the top of your Alexa wish list:

Survey

Thanks.

Monday, October 15, 2007

The Sweet Sound of Music

Many of you probably noticed inrainbows.com on our movers and shakers graph last week. This site is where Radiohead released their new album for free download in mp3 format on Oct 10th. If you want the official hard copy CD you can name your own price. This release and distribution method is a first by a well-known heavyweight artist and you can bet the record industry will be watching closely.

The iPod changed the music industry from a tightly controlled (some would say corrupt) business into a user-choice market that focuses on singles rather than whole albums. Combined with the reach of the Internet, reality TV (American Idol), and new listening formats (streaming, satellite, websites selling mp3s), the ways in which the public can purchase and hear new music are changing right before our ears.

Last week was a big week for that change. Two days before Radiohead released their new album, Trent Reznor of Nine Inch Nails posted that he had dropped the record label in favor of "...a direct relationship with the audience as I see fit and appropriate." Soon afterward both Jamiroquai and Oasis were rumored to be interested in releasing their new albums on a pay-as-you-wish system. How did the Internet react?


Radiohead did extremely well, as their traffic spiked on the pre-order and release days. NIN also saw a sharp increase in web traffic but Oasis and Jamiroquai remained unchanged. Another big music story of last week came on Wednesday (the day of Radiohead's release), when Madonna announced that she had left Warner Bros Records for the concert promotion company Live Nation in a $120 million deal. Live Nation did see a traffic boost during the week but Madonna’s site stayed relatively flat:


The fans have spoken and the message is clear. People want music their way, and that explains the success of the iTunes Store and the recent launch of the Amazon Mp3 Downloads store, which is completely DRM-free. Yahoo Music VP of Product Development Ian Rogers also said last week that they would not sell any more music with DRM: "I want to delight consumers, not bum them out." Artists have heard the message and are no longer waiting for the recording industry. They are moving ahead, with or without it.

Welcome to the revolution.

Tuesday, October 09, 2007

This Post is the New Black

Inspired by a graph from DIAGRAM and an old meme, I decided to take a look and see if the internet could help me figure out what is the new what. Variations on "this is the new that" are fairly common on the web; using Alexa's Web Search Service to search for the phrase "* is the new *" turned up several million instances in documents that are on hand in our archive. Unlike other search services, Million Search Results returns up to 10 million hits, which makes it useful for data-mining.

For less than $100, I was able to find, extract, and aggregate pairs of words with an "is the new" relationship. It turns out that, as of this post, the most common pair happened to be "pink is the new blog" but there are some other interesting relationships in the data, which are easily plotted with GraphViz. Since the meme started with colors, so did I.

As you can see, black is, apparently, the new black. The graph should be read such that purple is the new blue, which is also the new black. Darker arrows indicate a stronger relationship, so green is the new red more than red is the new green, both of which are the new black. In fact, according to our data, black has a monopoly on being so last year. So what does the internet have to say about other domains?

Taking the question literally, I took a look at the companies behind the sites in our Top 10 list, plus the word "internet," adding other companies to fill out the graph. As an example, even though it isn't in the Top 10, a lot of things are the new AOL. In order of rank, Yahoo!, Google, YouTube, and Facebook are all the new internet. As above, the thicker arrows indicate stronger relationships in the data; so, Facebook is the new MySpace, which is the new Friendster, which is only sort of the new Facebook.

What features jump out at you?
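
For the curious, the find, extract, aggregate, and plot steps described in this post might look roughly like the sketch below. This is not the code that was actually used: the sample sentences are made up, the one-word-per-side pattern is a simplifying assumption, and line thickness (`penwidth`) stands in for the darker or thicker arrows in the graphs.

```python
import re
from collections import Counter

# Hypothetical sample text; the real input was documents from the Alexa archive.
DOCUMENTS = [
    "Everyone says pink is the new black this season.",
    "Green is the new red, and red is the new green.",
    "Forget it: green is the new red.",
]

# Assumption: exactly one word on each side of the phrase.
PATTERN = re.compile(r"\b(\w+) is the new (\w+)\b", re.IGNORECASE)


def count_pairs(documents):
    """Count (new, old) word pairs joined by the phrase 'is the new'."""
    pairs = Counter()
    for text in documents:
        for new, old in PATTERN.findall(text):
            pairs[(new.lower(), old.lower())] += 1
    return pairs


def to_dot(pairs):
    """Render the pair counts as a GraphViz digraph; heavier edges are more common."""
    lines = ["digraph is_the_new {"]
    for (new, old), n in sorted(pairs.items()):
        lines.append(f'  "{new}" -> "{old}" [penwidth={n}];')
    lines.append("}")
    return "\n".join(lines)


pairs = count_pairs(DOCUMENTS)
print(to_dot(pairs))
```

The printed text can be saved to a file and rendered with GraphViz's `dot` tool.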

Friday, October 05, 2007

Shake Up in the Alexa Top 10

Some of you noticed that this week, Google and MSN traded places in Alexa's Top 10 list. Actually, it's better to say that google.com and msn.com traded places, in order to avoid getting confused. This is mostly due to a significant drop in traffic for msn.com; in fact, because Alexa's Top Sites list is based on three months of data, it's been clear for weeks that MSN and Google were going to cross paths. All of this makes sense when you take into account Microsoft's campaign to promote their Live brand, with live.com at #5 and rising.

Fewer noticed that Hi5 and Baidu switched places. This would have happened sooner, except for August 30th. What happened that day? In some Latin American countries it's the Feast of Saint Rose of Lima. It's no coincidence that August 30th is a national holiday in Peru, from which Hi5 gets 12% of its users. So is it fair to compare the two when Baidu gets 90% of its users from China? It depends on what you're looking for.

For instance, youtube.com (currently at #4) gets 13% of its users from the US whereas Google's .com gets 26% of its users from the US. Compare this with myspace.com, which is globally at #6 but gets 45% of its users from the US, where it is currently #3. These demographics are almost certainly important when making business decisions, which is why Alexa produces top sites lists for countries, languages, and categories.

Taking a look at our current Top 10, we have

  1. Yahoo!
  2. Google
  3. MSN
  4. YouTube
  5. Windows Live
  6. MySpace
  7. Orkut
  8. Facebook
  9. Wikipedia
  10. Hi5
So four of the top five are web portals, with Yahoo! still the king of the hill. Of the other top spots, five are social networking sites and one is a collaborative encyclopedia. Those sparklines contain 4 months' worth of 1-week averages; the traffic drop you can see a quarter from the right is Labor Day Weekend. I found it interesting that Orkut took a dip but, unlike the other sites, did not recover. The drop is due to fewer pageviews without a loss in reach.

If it's hard to imagine that the US could have such a big impact on a site that gets most of its reach from Brazil, consider what would happen if a small number of users who were responsible for a greater-than-average number of pageturns suddenly disappeared. The demographic would not significantly change even though there was a significant impact on traffic. I suspect that many Americans (responsible for more than their fair share of hits on that site) went to BBQs that weekend, realized the people at the party were on other networks, and made the switch. Is it significant that Facebook, which gets 30% of its users from the US, had a drop in unique users that Friday with a less-than-proportional drop in pageviews the same day?
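
To make that thought experiment concrete, here is a small sketch. Every number in it is hypothetical, chosen only to show how a handful of heavy users can move pageviews a lot while barely moving reach or the country mix; none of it is Alexa data.

```python
# All numbers are hypothetical, chosen only to illustrate the argument.
SEGMENTS = [
    # (label, country, users, pageviews per user)
    ("typical Brazilian users", "BR", 900, 10),
    ("typical US users", "US", 80, 10),
    ("heavy US users", "US", 20, 200),
]


def totals(rows):
    """Return (unique users, total pageviews, share of users from the US)."""
    users = sum(u for _, _, u, _ in rows)
    pageviews = sum(u * pv for _, _, u, pv in rows)
    us_users = sum(u for _, country, u, _ in rows if country == "US")
    return users, pageviews, us_users / users


before = totals(SEGMENTS)
# The heavy users leave; everyone else stays.
after = totals([row for row in SEGMENTS if row[0] != "heavy US users"])

reach_drop = 1 - after[0] / before[0]
pageview_drop = 1 - after[1] / before[1]

print(f"Unique users drop: {reach_drop:.1%}")
print(f"Pageviews drop:    {pageview_drop:.1%}")
print(f"US share of users: {before[2]:.1%} before, {after[2]:.1%} after")
```

In this made-up case the site loses a small fraction of its users and a much larger fraction of its pageviews, while the US share of users shifts by only a couple of points.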

So what's the point? Fundamentally: Alexa ranks websites, not companies. In the global Top 10, Microsoft takes two spots and Google takes three. This traffic can only be properly understood by taking global demographics into account. Finally, a site's rank is dependent on its traffic in relation to other sites. That is, your site's rank can change even if your traffic does not --- or vice versa --- due to relative traffic changes on other sites with similar rank.
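
The last point, that rank is relative, can be shown with a toy ranking. The site names and traffic numbers are invented, and the sort-by-traffic ranking is a simplification for illustration, not Alexa's actual ranking method.

```python
def rank(traffic):
    """Return {site: rank}, where rank 1 is the site with the most traffic."""
    ordered = sorted(traffic, key=traffic.get, reverse=True)
    return {site: position for position, site in enumerate(ordered, start=1)}


# Hypothetical traffic figures (arbitrary units).
week1 = {"site-a.example": 120, "site-b.example": 100, "site-c.example": 90}
# Only site-c changes; site-b's traffic is exactly the same in both weeks.
week2 = {**week1, "site-c.example": 110}

print("site-b rank, week 1:", rank(week1)["site-b.example"])
print("site-b rank, week 2:", rank(week2)["site-b.example"])
```

Here site-b slips a place even though its own traffic never moved, because a neighbor with similar rank gained.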

Tuesday, October 02, 2007

Battle of the Game Consoles

Last week, the Google Blog posted a story about search trends and video game systems, namely a comparison between searches for Xbox 360, Playstation 3, and Wii.

Looking at the Google Trends graph next to an Alexa Traffic Graph (reach) for xbox.com, playstation.com, and wii.com, you can see mostly the same trends. In 2005-Q4, the Xbox is clearly the winner in both searches and traffic, for very good reason: of the three, it was the only one actually available. The traffic for “Playstation 3” correlates to the Tokyo Game Show, at which several games for the (then non-functional) system were demoed. The name “Wii” was only officially announced in 2006-Q2, which shows up as a bump in searches, at flag A. This does not translate into traffic for wii.com because the domain was parked until at least July of that year, which can be verified by taking a look at the Wayback Machine.

All three consoles have search and traffic surges in 2006-Q4; since then, Xbox 360 and Playstation 3 have been neck-and-neck in both traffic and searches, though their reach appears to be increasing. Wii, however, has remained level, after the initial peak. If you compare wii.com and nintendo.com, the name announcement and the console launch are both clearly visible. In actuality, the news volume (which is down) more closely resembles the traffic for Nintendo (and Wii) than the number of searches. So the question becomes: what will happen to Wii this winter? Is it just a late bloomer?