Protecting passwords from statistical-guessing attacks
A new paper by Stuart Schechter from Microsoft Research, Cormac Herley from Microsoft Research, and Michael Mitzenmacher from Harvard University titled “Popularity is everything” discusses a new method for protecting passwords from statistical-guessing attacks.
The main concept behind the paper is that the most common type of attack against a password-protected system is a brute-force attack which relies on a popularity-sorted dictionary. The proposed protection is to prevent any password from becoming popular enough that a statistical attack is likely to succeed.
There are a number of problems with this idea, however. First, a goal would likely be to prevent any password similar to a popular password from being used. However, if passwords are stored as hash-values (or salted hash-values) there would be no way to compare and find similarity. Storing the passwords as plain-text would solve this problem but introduce a vulnerability if the attacker obtained the password-database. Second, if the popularity is stored as a numeric value an attacker who obtains the database can determine that N users have the same password and use that information to pick a likely passwords. Similarly, an attacker can attempt to create a new password and determine if that password is popular by consulting the database, which would allow the attacker to concentrate on only the most popular passwords.
Herley, et al address these issues by proposing that popularity be stored as a count-min sketch and ensuring false-positives in the data. The count-min sketch is a data structure which relies on the results of multiple hash functions. An array is kept for each hash function recording the number of times that particular hash-value is used. When checking if a password is popular, the lowest number of times reported for the hash-value determines the number of times the password has been used. Collisions in the hash functions allow protection against similar passwords (see Figure 2). This number can be compared against a threshold to determine if it is too popular. False positives in the data help insure that an attacker who gains access to the sketch will not be too greatly aided (section 3.3). False positives are contributed by the collision in hash functions as well as restricting the counters in the sketch to have a maximum increment. Passwords which use these counters will all appear equally popular, forcing an attacker to waste time trying all of them.
A problem with this approach is that the number of organizations which can create an effective popularity count-min sketch is limited to organizations with a large number of users. A possible solution would be to have a shared count-min sketch among organizations, but that would make it easier for an attacker to obtain.
A proof-of-concept implementation of a count-min sketch will follow soon.
