Student collects 15 million Gmail addresses
On his blog, a student from the University of Amsterdam reports that he gathered around 15 million Gmail addresses from Google user profiles within a month. Matthijs Koot analysed just under 35 million profile links from Google's profile site map, which is freely accessible on the company's servers. Koot says he sent all of the 35 million queries from the same IP address, yet Google made no attempt to stop the mass download. A Google spokesperson told British IT news source The Register that the site map does not make any information available that is not already publicly accessible.
The site map contains URLs to more than 7,100 text files with 5,000 profile links each. Site maps help other web services map a website's structure; in this case, they support the indexing of Google profiles. In many cases, Koot was able to obtain not only the Google user name (from which the person's Gmail address can be derived), but also the person's real name, information about education, employment history, current employer, place of residence, links to Twitter and LinkedIn accounts, and the profile holder's Picasa photo albums. Spammers, for example, could use this data for personalised advertising attacks.
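As a rough illustration of the site map layout described above, the following Python sketch reads one such text file (one profile link per line) and checks whether a given profile is listed. The host name, the sample entries and the function names are hypothetical, and the sketch assumes that the profile identifier is the last path segment of each link, which the report does not confirm.

```python
from urllib.parse import urlparse


def profile_ids(sitemap_text):
    """Return the last path segment of every link in a one-URL-per-line file.

    Assumption: the profile identifier is the final path segment of the link.
    """
    ids = []
    for line in sitemap_text.splitlines():
        url = line.strip()
        if not url:
            continue  # skip blank lines
        path = urlparse(url).path.rstrip("/")
        if path:
            ids.append(path.rsplit("/", 1)[-1])
    return ids


def is_listed(sitemap_text, profile_id):
    """Check whether a profile identifier appears in the site map file."""
    return profile_id in set(profile_ids(sitemap_text))


# Hypothetical sample file; a real file would hold 5,000 links.
SAMPLE = """\
https://profiles.example.com/alice.example
https://profiles.example.com/bob.example

https://profiles.example.com/carol.example/
"""

if __name__ == "__main__":
    print(len(profile_ids(SAMPLE)), "links in sample file")
    print("bob.example listed:", is_listed(SAMPLE, "bob.example"))
```

A user could run a check like this against the text files of a site map to see whether their own profile link is being published.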
Sample tests conducted by The H's associates at heise Security revealed that users often have publicly accessible photos. To see whether you have already created a Google profile, check your account settings. There, you can use the option "Help others find my profile in search results" to determine whether Google puts your profile's URL in its site maps.
Last summer, a hacker managed to collect more than 170 million Facebook data sets, which he made available for download via BitTorrent. At SchülerVZ, a German social network for pupils, a teenager used a crawler to collect more than a million profiles in autumn 2009; this raised the question of whether collecting publicly accessible profile information is a crime and, if so, how serious. In the case of SchülerVZ, the suspect was arrested on suspicion of stealing data; however, the charges were dropped because the suspect committed suicide in his cell.