Google will now retain users’ search requests for half the amount of time that it used to, promising to scrub its server logs for personally identifiable data after nine months, instead of 18.
The change was announced on its official blog Monday evening.
Google says it was initially reluctant to shorten its retention period because doing so would degrade the quality of its research data – based heavily on search engine logs – which it claims is a “critical ingredient” in its ongoing innovation.
“Over the last two years, policymakers and regulators – especially in Europe and the U.S. – have continued to ask us (and others in the industry) to explain and justify this shortened logs retention policy,” reads the post, which CNET attributes to Google’s global privacy counsel Peter Fleischer. “We responded by [explaining] how we were trying to strike the right balance between sometimes conflicting factors like privacy, security, and innovation. Some in the community of EU data protection regulators continued to be skeptical of the legitimacy of logs retention and demanded detailed justifications for this retention. Many of these privacy leaders also highlighted the risks of litigants using court-ordered discovery to gain access to logs, as in the recent Viacom suit.”
As a result, Google will be “significantly shortening our previous 18-month retention policy to address regulatory concerns and to take another step to improve privacy for our users.”
The decision did not come lightly, however. The company initially believed that reducing retention below 18 months “would degrade the utility of [our] data too much and outweigh the incremental privacy benefit for users.”
“We didn't stop working on this computer science problem, though,” it continues. “After months of work our engineers developed methods for preserving more of the data's utility while also anonymizing IP addresses sooner. We haven't sorted out all of the implementation details, and we may not be able to use precisely the same methods for anonymizing as we do after 18 months, but we are committed to making it work.”
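Google has not said which method it will use. As a rough illustration only, one common way to anonymize an IPv4 address is to zero out its final octet once a log entry passes the retention cutoff. The sketch below assumes that approach; the function names, the log-entry layout, and the 270-day approximation of nine months are all hypothetical and are not drawn from Google's post.

```python
import ipaddress
from datetime import datetime, timedelta

# Assumption: "nine months" approximated as 270 days for illustration.
RETENTION = timedelta(days=270)


def anonymize_ip(address: str, keep_bits: int = 24) -> str:
    """Keep the first `keep_bits` bits of an IPv4 address and zero the rest."""
    network = ipaddress.ip_network(f"{address}/{keep_bits}", strict=False)
    return str(network.network_address)


def scrub_if_expired(entry: dict, now: datetime) -> dict:
    """Return a copy of a log entry, with its IP truncated if it is past the cutoff."""
    if now - entry["timestamp"] < RETENTION:
        return dict(entry)  # still inside the retention window: leave untouched
    scrubbed = dict(entry)
    scrubbed["ip"] = anonymize_ip(entry["ip"])
    return scrubbed


if __name__ == "__main__":
    # Hypothetical log entry using a documentation-reserved IP range.
    entry = {
        "ip": "203.0.113.77",
        "query": "weather in paris",
        "timestamp": datetime(2007, 9, 1),
    }
    print(scrub_if_expired(entry, now=datetime(2008, 9, 9)))
```

Truncating one octet still leaves a record tied to a block of 256 addresses, which is part of why the trade-off between data utility and privacy is contested.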
Google says it retains search data in order to combat spam, internet fraud, and malicious web sites, and to comply with “valid legal orders” from government agencies, reports the BBC.
Privacy advocates fear that Google’s ubiquity could be turned against its users, as complete server logs – which contain, at the least, the user’s IP address and search terms – would allow one to build a comprehensive profile of almost any given user.
Such was the case for AOL in 2006, when company researchers released, and then retracted, the private search histories of more than 650,000 of its subscribers. Using only that data, which consisted primarily of an anonymized user ID and search request, investigators were able to trace individual searches with an almost frightening precision. One such user's name and search history eventually landed in the New York Times.
Google’s myriad services present an even larger set of possibilities, however: search records can be crosslinked with Gmail access logs, for example, to build a comprehensive e-mail/search profile of a given user. Take those same records and crosslink them with AdWords, and a nearly complete web-surfing history comes into focus – accurate to a single IP address.
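To make the concern concrete, the crosslinking described above amounts to grouping records from separate logs by a shared key, here the IP address. The following is a minimal sketch under invented assumptions: the log names, field names, and sample records are hypothetical and do not reflect how any Google system actually stores data.

```python
from collections import defaultdict


def build_profiles(*logs):
    """Merge several (source_name, records) logs into one timeline per IP address."""
    profiles = defaultdict(list)
    for source, records in logs:
        for record in records:
            profiles[record["ip"]].append((record["time"], source, record["detail"]))
    # Sort each timeline chronologically (ISO-style timestamps sort as strings).
    return {ip: sorted(events) for ip, events in profiles.items()}


# Hypothetical sample logs, using documentation-reserved IP addresses.
search_log = [
    {"ip": "198.51.100.7", "time": "2008-09-08T09:01", "detail": "query: knee pain"},
    {"ip": "192.0.2.15", "time": "2008-09-08T09:02", "detail": "query: pizza"},
]
mail_log = [
    {"ip": "198.51.100.7", "time": "2008-09-08T09:03", "detail": "mailbox login"},
]
ad_log = [
    {"ip": "198.51.100.7", "time": "2008-09-08T09:05", "detail": "ad shown on a news site"},
]

if __name__ == "__main__":
    profiles = build_profiles(
        ("search", search_log), ("mail", mail_log), ("ads", ad_log)
    )
    for when, source, detail in profiles["198.51.100.7"]:
        print(when, source, detail)
```

Once the IP address in each record is truncated or removed, a join of this kind no longer resolves to a single address, which is the practical effect anonymization is meant to have.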
Google also announced that, after 24 hours, it would anonymize data from its Google Suggest service, which powers the Omnibox in Google Chrome and the Google Search utilities in Firefox and on the iPhone.