Friday, July 10, 2009

Fascinating paper

This paper shows how to predict Social Security numbers (SSNs) from birth date and birthplace information. The real irony is that the US government implemented a program requiring newborn babies to apply for a Social Security number along with the birth certificate. This was intended to reduce fraud in assigning Social Security numbers. Unfortunately, it also means that all the babies born, say, in Concord, North Carolina, on a given Tuesday will have very similar Social Security numbers. Using publicly available data on the Social Security numbers of people who have died, it turns out that a basic two-variable regression model can recover the full 9-digit SSN in fewer than 1000 guesses, and often in fewer than 100.

Not so good.
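To put the guess counts above in perspective, here is a rough back-of-the-envelope sketch in Python. It assumes, purely for illustration, that all 10^9 nine-digit strings are possible SSNs (in practice some are never issued, so the true space is somewhat smaller). The function name `search_space_reduction` is made up for this example and is not from the paper.

```python
import math

# Assumption for illustration: every 9-digit string is a candidate SSN.
TOTAL_SSNS = 10 ** 9


def search_space_reduction(guesses: int, total: int = TOTAL_SSNS) -> dict:
    """Compare blind guessing over `total` values with an attack needing `guesses` tries."""
    return {
        # How many times smaller the effective search space becomes.
        "shrink_factor": total / guesses,
        # Uncertainty in bits before and after the prediction narrows things down.
        "bits_before": math.log2(total),
        "bits_after": math.log2(guesses),
    }


for guesses in (1000, 100):
    r = search_space_reduction(guesses)
    print(
        f"{guesses} guesses: search space {r['shrink_factor']:,.0f}x smaller, "
        f"{r['bits_before']:.1f} bits -> {r['bits_after']:.1f} bits"
    )
```

Under that assumption, getting from a billion candidates down to 1000 is a million-fold reduction, which is why a number that looks unguessable turns out not to be.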

Predicting Social Security numbers from public data
Alessandro Acquisti and Ralph Gross


Information about an individual's place and date of birth can be exploited to predict his or her Social Security number (SSN). Using only publicly available information, we observed a correlation between individuals' SSNs and their birth data and found that for younger cohorts the correlation allows statistical inference of private SSNs. The inferences are made possible by the public availability of the Social Security Administration's Death Master File and the widespread accessibility of personal information from multiple sources, such as data brokers or profiles on social networking sites. Our results highlight the unexpected privacy consequences of the complex interactions among multiple data sources in modern information economies and quantify privacy risks associated with information revelation in public forums.

Paper here

