Wikipedia: 10 billion page views per month
Wikipedia statistician Erik Zachte has analysed the log files from the Wikimedia Foundation proxy servers for a new survey. According to his analysis, Wikipedia readers viewed 10.175 billion pages in September alone. Over the same period its sister projects racked up a further 456 million views.
Unsurprisingly, the English-language wikis dominate the statistics. The English Wikipedia totted up more than 5.4 billion page views, which is to be expected: with 2.5 million articles, it is by far the largest Wikimedia Foundation project. The Japanese version came in second with 998 million views. The German version chalked up 818 million, even though it contains 30,000 more articles than the Japanese version. Of the other Wikimedia Foundation projects, Wikimedia Commons, a multimedia archive, has its nose in front with 173 million page views, followed by the collaborative dictionary Wiktionary with 151 million views.
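To put these figures in proportion, the arithmetic can be sketched in a few lines of Python. This is only an illustration based on the numbers quoted in the article; the variable and function names are made up for the example, and the English figure is a lower bound because the article says "more than 5.4 billion".

```python
# Figures quoted in the article (September page views).
WIKIPEDIA_VIEWS = 10_175_000_000      # all Wikipedia language editions
SISTER_PROJECT_VIEWS = 456_000_000    # Commons, Wiktionary and the other projects

# Per-edition figures; the English value is a lower bound.
EDITION_VIEWS = {
    "English": 5_400_000_000,
    "Japanese": 998_000_000,
    "German": 818_000_000,
}


def share_percent(views: int, total: int) -> float:
    """Return views as a percentage of total."""
    return 100 * views / total


# Wikipedia and its sister projects taken together.
combined_views = WIKIPEDIA_VIEWS + SISTER_PROJECT_VIEWS

if __name__ == "__main__":
    print(f"All projects combined: {combined_views:,} views")
    for edition, views in EDITION_VIEWS.items():
        pct = share_percent(views, WIKIPEDIA_VIEWS)
        print(f"{edition}: {pct:.1f}% of Wikipedia views")
```

On these numbers the English edition accounts for a little over half of all Wikipedia page views, while the Japanese and German editions each stay below a tenth.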
The underlying raw data can be accessed via Wikimedia board member Domas Mituzas' server. Users wanting to process the data will need plenty of storage capacity: even compressed and aggregated, the log files for just one hour require 50 megabytes of space. Page views by registered users, whose edits generally bypass the proxy servers, are excluded from the statistics. On the whole, however, the numbers are slightly inflated, as page requests by automated scripts are also counted. The version of Wikipedia in the African language Herero, for instance, still totted up 5,700 requests in September, despite the fact that the project was closed more than a year ago.
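A rough idea of the storage involved can be worked out from the 50 megabytes per hour quoted above. The following Python sketch assumes that every hour takes up about the same space, which the article does not state, and uses illustrative names throughout.

```python
MB_PER_HOUR = 50         # compressed, aggregated logs for one hour (from the article)
HOURS_PER_DAY = 24
DAYS_IN_SEPTEMBER = 30


def storage_mb(days: int, mb_per_hour: int = MB_PER_HOUR) -> int:
    """Estimate log storage in megabytes, assuming a constant hourly size."""
    return days * HOURS_PER_DAY * mb_per_hour


september_mb = storage_mb(DAYS_IN_SEPTEMBER)
year_mb = storage_mb(365)

if __name__ == "__main__":
    # Decimal units: 1 GB = 1,000 MB.
    print(f"September: {september_mb:,} MB (about {september_mb / 1000:.0f} GB)")
    print(f"One year:  {year_mb:,} MB (about {year_mb / 1000:.0f} GB)")
```

Under that assumption, the logs for September alone come to roughly 36 gigabytes, and a full year to well over 400 gigabytes.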
Users interested in immersing themselves more deeply in the statistics will find extensive analysis at THEwikiStics. There are detailed statistics for every single article for the last few months, including a list of the most sought-after articles. Even more detail is offered by the Wikipedia article traffic statistics, which give daily usage statistics for every article. The article on Austrian politician Jörg Haider, for example, was accessed over 235,000 times on the day he died. But it’s not only headlines that whet the appetite for knowledge. On 3rd October, for example, more than 240,000 readers wanted to know what German Unity Day, celebrated that day, was all about.
Gregory Kohs, who is not a popular figure in the Wikipedia community, has found another use for the data. He has published a study of incidents of vandalism of Wikipedia articles about US politicians. Using the traffic statistics, he has extrapolated how many readers will have viewed incorrect information in these articles before it was corrected. According to his calculations, knowingly inserted misinformation was visible for 6.8 percent of the time during the period sampled, although most of it was obvious vandalism or incidental information.
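The article does not describe how the extrapolation was carried out. One simple way to approximate such a figure, shown here purely as an assumption and not as Kohs's actual method, is to scale an article's page views by the share of time the damaged version was online, on the premise that traffic is spread evenly over the period. The function name and the sample numbers are hypothetical.

```python
def estimated_damaged_views(views_in_period: float,
                            damaged_minutes: float,
                            period_minutes: float) -> float:
    """Estimate how many views hit a damaged version of an article.

    Assumes views are spread evenly over the period, so the share of
    damaged views equals the share of time the damage was visible.
    """
    if period_minutes <= 0:
        raise ValueError("period_minutes must be positive")
    if not 0 <= damaged_minutes <= period_minutes:
        raise ValueError("damaged_minutes must lie within the period")
    return views_in_period * damaged_minutes / period_minutes


if __name__ == "__main__":
    # Hypothetical example: 1,440 views in one day (1,440 minutes),
    # with the vandalised version online for 60 minutes.
    print(estimated_damaged_views(1440, 60, 1440))
```

Real traffic is not spread evenly across the day, so an estimate of this kind is only a first approximation.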
To serve this volume of data, the Wikimedia Foundation currently operates over 400 servers, distributed across three data centres. To reduce the administrative burden, Wikimedia has, under chief technical officer Brion Vibber, now migrated the entire server farm to Ubuntu Linux 8.04. The computers had previously been running Red Hat and various versions of Fedora Linux.