Password storage 101

I decided to write this basic tutorial because, over the past few weeks, I have been shocked several times to learn how many large organizations store their users' passwords. It's easy to assume that website operators have a basic knowledge of computer security, but I have found that assumption to be false.

This post is mainly for website operators. For average users, I offer two pieces of advice:

1. Ignore the advice to use a unique, strong password for every site (at least 8 characters long, containing special characters, etc.): you won't be able to remember 50 different passwords. Instead, use a moderately strong password that you can actually remember, such as your initials plus your date of birth, or the full name of your favorite actress. On when it is acceptable to reuse the same password, see advice 2.

2. Beware that many companies and governments don't store your password safely. A database administrator at the company, or a hacker who gains access to the database, may be able to see your password in plain text (i.e., exactly what you typed into the form). A good (enough) test is to register your account with a random temporary password, then click the "Forgot my password" link. If they send you an email containing your password, be very careful! The best practice in that case is probably to keep that random password for this site only and write it down somewhere convenient. If instead they email you a link to reset your password, or a temporary password to log on, then the site is probably safe, and you can reuse your usual password with caution.

Now, for operators of services that require logging in:

1. Never store your users' passwords in plain text!

The reason is simple: unscrupulous database administrators, and hackers who gain access to the database, will be able to pretend to be your users. Worse yet, if a user trusted you enough to give you the same login details she uses for all her other services, all her accounts become vulnerable to impersonation. Remember that the firewall doesn't even need to be breached, as people within the company can view the data as part of their daily job. Imagine a disgruntled employee posting all the details of your clients on Wikileaks.

Also, use plenty of common sense, and do not blindly rely on security consultants (especially not ones who ask you to do stupid things). It is you who will suffer the reputation loss when bad things happen.

2. Never encrypt users' passwords

Same reason as in point 1: your employees, and hackers who get hold of the key, know how to decrypt. Security by obscurity has been shown again and again not to work.

3. Never apply a hashing algorithm directly to the password

Good effort, but not good enough. Since employees and hackers know how the password is processed, they can build a rainbow table (a precomputed table that maps hashes back to candidate passwords) and look up every stored hash in it. It may take a few days or a few months to produce the table, but after that all 30 million passwords are attacked together. If a common hashing algorithm is used, there may well be a rainbow table freely available in a magical place called the Internet.
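
To make the problem concrete, here is a minimal Python sketch. It uses a plain lookup dictionary rather than a real rainbow table (which stores hash chains to save space), but the effect on unsalted hashes is the same. The user names, the candidate list and the helper names are all made up for illustration.

```python
import hashlib


def unsalted_hash(password: str) -> str:
    """Hash the password directly, with no salt (the practice warned against)."""
    return hashlib.sha256(password.encode("utf-8")).hexdigest()


# Hypothetical leaked table: user name -> unsalted hash.
leaked = {
    "alice": unsalted_hash("letmein"),
    "bob": unsalted_hash("letmein"),
    "carol": unsalted_hash("a much less common passphrase"),
}

# The attacker hashes a list of candidate passwords once, in advance.
candidates = ["123456", "password", "letmein"]
lookup = {unsalted_hash(p): p for p in candidates}


def crack(leaked_rows: dict, table: dict) -> dict:
    """Recover every password whose hash appears in the precomputed table."""
    return {user: table[h] for user, h in leaked_rows.items() if h in table}


recovered = crack(leaked, lookup)
print(recovered)
```

Note that alice and bob share a hash because they share a password, so the work of building the table is paid once and reused against every account.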

4. Use a long, randomized salt

If the same salt is used across all accounts, we run into the same problem described in point 3: the hacker can generate his own rainbow table for your site. Whenever a new account is created or a password is reset, generate a new long, random salt and store it in the database as well (it's not meant to be secret), and now we are onto something. When storing the password in the database, apply the hashing function to the plain text password with the salt appended, i.e., hash(passwd.Append(salt)), or, equally valid, append the salt to the hash of the password and then hash again, i.e., hash(hash(passwd).Append(salt)). Now a single rainbow table no longer works against the whole database, because every account would need its own. The problem is that one password can still be recovered in a few days or months if a simple, fast hashing function is used, but it takes 30 million times a few days or months to recover all the passwords of your 30 million customers. Not bad, but there is a long way to go.
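
The per-account salt described above might look like the following Python sketch. The function names and the 16-byte salt length are assumptions chosen for illustration, and the code deliberately still uses a single fast hash, which is exactly the remaining weakness just mentioned.

```python
import hashlib
import hmac
import secrets


def new_salt(num_bytes: int = 16) -> bytes:
    """Generate a fresh random salt (16 bytes is an assumed length)."""
    return secrets.token_bytes(num_bytes)


def salted_hash(password: str, salt: bytes) -> bytes:
    """hash(passwd.Append(salt)): one fast SHA-256 over password + salt."""
    return hashlib.sha256(password.encode("utf-8") + salt).digest()


def create_record(password: str) -> tuple:
    """Return (salt, hash); both are stored, and the salt is not secret."""
    salt = new_salt()
    return salt, salted_hash(password, salt)


def verify(password: str, salt: bytes, stored: bytes) -> bool:
    """Recompute the hash with the stored salt and compare in constant time."""
    return hmac.compare_digest(salted_hash(password, salt), stored)


salt_a, hash_a = create_record("letmein")
salt_b, hash_b = create_record("letmein")
print(hash_a != hash_b)  # same password, different salts, different hashes
```

Because two users with the same password now get different hashes, a table precomputed for one salt is useless against any other account.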

5. Use a slow hashing algorithm

Once you use a collision-resistant hashing algorithm (read: not MD5, possibly not SHA-1 either; I would personally use SHA-256 or better), the main remaining worry is a brute-force attack. The aim is to generate the hash in the slowest possible (and still sensible) way. There are algorithms designed specifically for this purpose; bcrypt is a good example. Another way is to apply the hashing algorithm many times. For instance, hashedPw = hash(passwd.Append(salt)); for (int i=0; i<1000000; i++) hashedPw = hash(hashedPw);
Be careful of cycles. Remember that if the hashing algorithm takes 1 ms, a hacker can try 1000 passwords a second. If it takes 200 ms, he can only try 5. And a difference of 199 ms makes no difference to the user. Do not use sleep() to slow things down artificially: the hacker doesn't need to put that in his code. Read on if you worry about the extra server load.
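
As a rough illustration, the sketch below swaps the hand-written loop for PBKDF2, an iterated construction that ships in Python's standard hashlib (bcrypt would need a third-party package). The iteration count of 200,000 and the function names are assumptions; pick a count by timing it on your own hardware.

```python
import hashlib
import hmac
import secrets

ITERATIONS = 200_000  # assumed value; tune so one hash costs what you can afford


def slow_hash(password: str, salt: bytes, iterations: int = ITERATIONS) -> bytes:
    """Derive a deliberately slow hash with PBKDF2-HMAC-SHA256."""
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)


def verify(password: str, salt: bytes, stored: bytes,
           iterations: int = ITERATIONS) -> bool:
    """Recompute the slow hash and compare in constant time."""
    return hmac.compare_digest(slow_hash(password, salt, iterations), stored)


def guesses_per_second(ms_per_hash: float) -> float:
    """The arithmetic from the text: how many guesses fit in one second."""
    return 1000.0 / ms_per_hash


salt = secrets.token_bytes(16)
stored = slow_hash("letmein", salt)
print(verify("letmein", salt, stored))
print(guesses_per_second(1), guesses_per_second(200))
```

The slowness comes from real computation that an attacker has to repeat for every guess, which is why it cannot be skipped the way a sleep() call can.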

6. Use a challenge-response protocol

Don't validate the hash at the server! First, you'll need supercomputers to keep up with a large number of users, and second, even if you are rich enough to buy the supercomputers, there may be an eavesdropper ready to steal your user's password during transmission. Use challenge-response authentication instead. The server generates a random string and encrypts it using the stored hash as the key. Since the string is used only once, the encryption technique can be one that's fast and simple (e.g., DES). If the random string is "apple", the server sends encrypt(hash, "apple") to the client. The client's computer then computes the hash from the password (the server sends the salt along, since it is not secret), decrypts the message, and sends "apple" back to the server.
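
Here is a minimal sketch of the same idea in Python. Note the swap: the standard library has no DES, so instead of encrypting and decrypting the random string, the client proves it knows the hash by returning an HMAC of the challenge. All function names and the iteration count are assumptions, and the network transport is left out entirely.

```python
import hashlib
import hmac
import secrets


def derive_key(password: str, salt: bytes, iterations: int = 100_000) -> bytes:
    """The slow hash; the server stores this, the client recomputes it."""
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)


def server_make_challenge() -> bytes:
    """Server side: a fresh random string, used for one login attempt only."""
    return secrets.token_bytes(32)


def client_respond(password: str, salt: bytes, challenge: bytes,
                   iterations: int = 100_000) -> bytes:
    """Client side: do the expensive hashing locally, then answer the challenge."""
    key = derive_key(password, salt, iterations)
    return hmac.new(key, challenge, hashlib.sha256).digest()


def server_check(stored_key: bytes, challenge: bytes, response: bytes) -> bool:
    """Server side: one cheap HMAC and a constant-time comparison."""
    expected = hmac.new(stored_key, challenge, hashlib.sha256).digest()
    return hmac.compare_digest(expected, response)


salt = secrets.token_bytes(16)
stored_key = derive_key("letmein", salt)        # saved at registration
challenge = server_make_challenge()             # sent to the client with the salt
response = client_respond("letmein", salt, challenge)
print(server_check(stored_key, challenge, response))
```

The password never crosses the wire and the slow hashing happens on the client, but anyone who obtains the stored hash can also answer challenges, so the database still needs protecting.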

7. Other miscellaneous points

Invest in a good firewall to keep the users' information safe. Information other than passwords can still be valuable to a criminal organization.
Allow passwords of unlimited length: storage doesn't matter, because you are only saving the hash, which has a fixed length.
Allow every possible character on the planet and support Unicode, because there is no reason not to.
Limit the number of failed log-ins, and exponentially increase the timeout after each failed log-in, to prevent brute-force attacks.
Read the whole post again, double-check that you have done everything mentioned, and only then consider forcing your users to use secure passwords.
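
The exponentially increasing timeout from the list above could be computed along these lines. This is only a sketch: the three free attempts, the one-second base and the one-hour cap are assumed values, and storing the failure count per account is left to you.

```python
def lockout_seconds(failed_attempts: int,
                    free_attempts: int = 3,
                    base_seconds: float = 1.0,
                    max_seconds: float = 3600.0) -> float:
    """Return how long to wait before the next log-in attempt is allowed.

    The first `free_attempts` failures cost nothing; after that the delay
    doubles with every further failure, up to `max_seconds`.
    """
    if failed_attempts <= free_attempts:
        return 0.0
    # Cap the exponent so a huge failure count cannot overflow the float.
    exponent = min(failed_attempts - free_attempts - 1, 30)
    return min(base_seconds * (2 ** exponent), max_seconds)


for n in (3, 4, 5, 6, 20):
    print(n, lockout_seconds(n))
```

The cap keeps a forgetful user from being locked out forever while still making an online brute-force attack impractical.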

8. Centralize risk

If you are not sure that you understand everything mentioned above and have implemented it correctly, don't create your own log-in page. There are people who know and understand how to manage users' information, and who are willing to help. Almost all your users will have an account with an OpenID provider. It's easy to add OpenID to your site!
If you can't secure your users' information, definitely use OpenID. Even if you can, still consider using OpenID because it's so much easier.

Bo Tian
