Researchers Sitting On 'Largest Known Database Of Twitter User Locations'

This article is more than 9 years old.

Researchers have been trying various techniques to determine Twitter users' locations even when they don't purposefully give them away. In one of the more intriguing papers, which had gone under the radar until this week, researchers claimed they could “geolocate the overwhelming majority of active Twitter users” by looking at their contacts' locations, and in their tests were able to “geotag over 80 per cent of public tweets”. As a result, they believe they are now sitting on “the largest known database of Twitter user locations”, though they told FORBES they could not share the data.

As social ties are often formed over short geographic distances, it is possible to get an approximate geotag of a Twitter user by examining the known locations of their contacts, according to the study from Ryan Compton, David Jurgens and David Allen of the Information and System Sciences Laboratory at HRL Laboratories. The team created a technique that looks at a given Twitter user’s friends and how often they interact over @mentions, then determines where those contacts have either purposefully or inadvertently given away their location. They claimed they could then use that information to get as accurate an estimate as possible of the target individual’s location. Compton and his colleagues believe their “variation minimization” technique is good enough to get a solid idea of where a user is based, with a median error of 6.38km.
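The general idea of placing a user near their most-mentioned contacts can be illustrated with a short Python sketch. This is not the researchers' algorithm: it swaps in a much simpler weighted medoid, picking the contact location that minimises the mention-weighted distance to all other contacts. The function names, the input format and the sample coordinates are assumptions made purely for illustration.

```python
import math

EARTH_RADIUS_KM = 6371.0


def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points given in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def estimate_location(contacts):
    """Guess a user's location from their contacts (hypothetical helper).

    contacts: list of ((lat, lon), mention_count) for contacts whose
    location is already known. Returns the contact location with the
    smallest mention-weighted total distance to all contacts, or None
    if no contact has a known location.
    """
    if not contacts:
        return None

    def cost(candidate):
        return sum(weight * haversine_km(candidate, loc) for loc, weight in contacts)

    return min((loc for loc, _ in contacts), key=cost)


# Made-up example: three frequently mentioned contacts in London, one in New York.
sample_contacts = [
    ((51.5074, -0.1278), 5),
    ((51.5155, -0.0922), 3),
    ((51.4545, -0.1000), 2),
    ((40.7128, -74.0060), 1),
]
estimate = estimate_location(sample_contacts)
```

Because the choice is median-like rather than an average, the single distant contact in New York does not drag the estimate into the Atlantic; the guess stays in London, where most of the interaction is.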

To test their findings they took a sample of 25,312,399,718 @mentions from public tweets collected between April 2012 and April 2014. This amounted to 76.9TB of data related to 110,893,747 users. That data contained 13,899,315 users who had tweeted with GPS-revealing data at least three times. The researchers claimed the majority of GPS-known users have at least one GPS-known friend within 10km.

“We were able to infer location for 971,731 test users with a median error of 6.38km and a mean error of 289.00km,” they noted in their paper. They tested their algorithm on those who had leaked their GPS location.
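The wide gap between a median error of 6.38km and a mean error of 289.00km is what one would expect when most guesses are close and a few are off by a very long way. A small Python sketch with invented error values (not figures from the paper) shows how a single large miss inflates the mean while leaving the median almost untouched.

```python
import statistics

# Hypothetical per-user errors in km: four good guesses and one user
# placed on the wrong continent. These numbers are invented.
errors_km = [2.0, 4.5, 6.0, 8.0, 1500.0]

median_error = statistics.median(errors_km)  # middle value, ignores the outlier's size
mean_error = statistics.mean(errors_km)      # pulled upward by the 1500km miss

print(f"median error: {median_error:.2f}km")
print(f"mean error:   {mean_error:.2f}km")
```

Here the median is 6km while the mean is just over 300km, the same pattern the researchers report at much larger scale.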

There’s no need for users to panic, however. The researchers’ aim was to get “high-volume static location inference with city-level accuracy”, so they couldn’t uncover tweeters’ exact addresses from their techniques. They admitted their technique was “only useful for static location inference” and that “fast-moving users with large activity radii will be tagged incorrectly by our method”. Whilst this might not prove useful for law enforcement, it'll likely be attractive to marketers who want to provide local offers or track trends in particular areas. It could also be useful for more altruistic projects, such as tracking regional flu trends or social unrest.

Previous attempts to determine the location of Twitter users who don’t share GPS information, such as those using phraseology and natural language analysis, have proven inaccurate. Professor Alan Woodward, security expert from the University of Surrey, was intrigued by the latest research, even if he wanted to see more proof of the researchers’ tracking capabilities. “In the absence of directly associated GPS data or the text expressly telling you where someone is, this new technique is one of the most interesting I have seen.

“If you could do this on an industrial scale I suspect the accuracy might be surprisingly accurate… It brings a whole new meaning to the phrase keep your friends close and your enemies closer still.”

George Danezis, at the Information Security Group of the Computer Science department at University College London, told FORBES the premise of the research appeared sound and, “after the fact, quite obvious”. “Humans are creatures of habit - they move only between few dwelling locations, and are usually interacting with people they know close to them. Therefore inferring locations of users or tweets on the basis of the location of other users or related tweets will work in many cases…  Inferring attributes from friends, including sexual orientation and politics, has already been explored.”

He says the ultimate lesson here for Twitter users is to establish norms with friends using the micro-blog. “It might be easy to be careful with your own privacy, but what your friends and contacts say and do can also be revealing about you. It seems that their location is too.”

This week also saw the release of separate research from the Brookings Project on US Relations With The Islamic World, looking at how extremist group ISIS was using Twitter. It estimated that from September through December 2014, there were at least 46,000 Twitter accounts used by ISIS supporters, though not all were active at the same time. The think tank said hundreds sent tweets with location metadata embedded, whilst noting many of the accounts were linked (see graph below). HRL Laboratories' research may well have another use in studying that particular group.