Background & Summary

Existing websites and applications that use an older password hashing algorithm such as MD5 or SHA-1 must be upgraded to a more secure algorithm. Both of these older algorithms are obsolete and breakable; if an attacker obtains those hashes from a lost backup tape or a website vulnerability, the attacker could make quick work of recovering your users’ passwords via rainbow tables. And since 55% of users use the same password on many different websites, your compromise exposes your users elsewhere too. Ideally, this migration from insecure to secure hashes should be done without downtime and without forcing a password reset across the entire user base.

It is relatively straightforward to upgrade to a newer hashing algorithm, such as SHA-512 or Scrypt. This blog post gives you the steps for the upgrade and shows how to add salting while you’re retrofitting the password system. We also give advice on how to migrate over time instead of all at once. If you’re already using Spring Security, this migration can be done nearly transparently, and we will show you how to implement it.

This blog post was co-written by Siddharth Coontoor and Jay Ball after teaching a security class where the students in all three sessions asked the same question: “How do we upgrade MD5 password hashes in our database without impacting our end users?”

Existing Implementation

Generally, on most websites, the code to implement username & password validation might look something like this pseudo-code:

From the above, replacing the MD5 step with passwordhash = hexstr( sha512(password) ) seems tempting, but that would invalidate all existing passwords instantly. Remember that all passwords in the database are currently MD5 hashes, so the quick code replacement can’t be done. There must be an interim step: we need to determine the type of hash currently stored.
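For reference, a legacy check of this kind can be sketched in Java as follows. This is only an illustration: the USERS map is a hypothetical stand-in for the users table, and the names LegacyLogin, md5Hex, and login are ours, not part of any library. The md5Hex call marks the hashing step that cannot simply be swapped out.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HashMap;
import java.util.Map;

final class LegacyLogin {
    // Hypothetical stand-in for the users table: username -> hex MD5 digest.
    static final Map<String, String> USERS = new HashMap<>();

    // Unsalted MD5 of the password, hex-encoded (32 characters).
    static String md5Hex(String password) {
        try {
            byte[] digest = MessageDigest.getInstance("MD5")
                    .digest(password.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder();
            for (byte b : digest) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 unavailable", e);
        }
    }

    static boolean login(String username, String password) {
        String stored = USERS.get(username);
        if (stored == null) {
            return false; // unknown user
        }
        String passwordHash = md5Hex(password); // the legacy hashing step
        // Constant-time comparison of the two hex strings.
        return MessageDigest.isEqual(
                stored.getBytes(StandardCharsets.UTF_8),
                passwordHash.getBytes(StandardCharsets.UTF_8));
    }
}
```

Every row in such a table holds a 32-character hex string, which is why changing only the hashing call would lock out every existing user.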

There are a few mechanisms for determining the type of hash. One approach is to create a new column in the database that records the hash type. When the new column is added, set its value to 1 for all users, since everyone starts out on the legacy MD5. So, the code might be:
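A minimal Java sketch of that dispatch follows. Only the value 1 for legacy MD5 comes from the text above; the value 2 for SHA-512 and the names HashTypeColumn, hex, and verify are assumptions made for illustration. Salting is left out here to keep the focus on the type column.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

final class HashTypeColumn {
    static final int TYPE_MD5 = 1;    // legacy value assigned to every existing row
    static final int TYPE_SHA512 = 2; // hypothetical value for upgraded rows

    // Hex-encoded digest of the password under the named JDK algorithm.
    static String hex(String algorithm, String password) {
        try {
            byte[] digest = MessageDigest.getInstance(algorithm)
                    .digest(password.getBytes(StandardCharsets.UTF_8));
            StringBuilder out = new StringBuilder();
            for (byte b : digest) {
                out.append(String.format("%02x", b));
            }
            return out.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalArgumentException("Unsupported algorithm: " + algorithm, e);
        }
    }

    // hashType and storedHash are the two columns read from the user's row.
    static boolean verify(int hashType, String storedHash, String password) {
        String algorithm;
        switch (hashType) {
            case TYPE_MD5:
                algorithm = "MD5";
                break;
            case TYPE_SHA512:
                algorithm = "SHA-512";
                break;
            default:
                throw new IllegalArgumentException("Unknown hash type: " + hashType);
        }
        return MessageDigest.isEqual(
                storedHash.getBytes(StandardCharsets.UTF_8),
                hex(algorithm, password).getBytes(StandardCharsets.UTF_8));
    }
}
```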

That is one mechanism; another is to check the length of the existing hash. This method has the advantage that no additional database column is needed to determine the hash type. For the common hashes, this table shows the lengths of each:

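| Hash | Digest Bits | Bytes | Hex Bytes | Base64 Bytes | Effective Security Bits | MCF Prefix |
| --- | --- | --- | --- | --- | --- | --- |
| DES | 56 | 7 | 14 | 12 | 39-43 | (none) |
| MD5 | 128 | 16 | 32 | 24 | <64 | $1$ |
| SHA-1 | 160 | 20 | 40 | 28 | <63 | |
| SHA-2 256 | 256 | 32 | 64 | 44 | 128 | $5$ |
| SHA-2 512 | 512 | 64 | 128 | 88 | 256 | $6$ |
| SHA-2 512/256 | 256 | 32 | 64 | 44 | 128 | |
| Bcrypt | 184 | 23 | 46 | 32 | | $2$, $2a$, $2x$, $2y$, $2b$ |
| Scrypt | (variable) | | | | | |
| PBKDF1 | 160 | 20 | 40 | 28 | | |
| PBKDF2 | (variable) | | | | | |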

And this code block would use the hash length to determine the hash type:
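One way to sketch this in Java is shown here, assuming hashes are stored hex-encoded and that each length maps to exactly one algorithm in your system. The class and method names are hypothetical.

```java
final class HashLengthDetector {
    // Maps the length of a hex-encoded hash to a JDK MessageDigest algorithm name.
    // Lengths come from the "Hex Bytes" column of the table above.
    static String algorithmFor(String storedHexHash) {
        switch (storedHexHash.length()) {
            case 32:
                return "MD5";
            case 40:
                return "SHA-1";   // assumption: PBKDF1 is not in use, it has the same length
            case 64:
                return "SHA-256"; // assumption: SHA-512/256 is not in use, same length
            case 128:
                return "SHA-512";
            default:
                throw new IllegalArgumentException(
                        "Unrecognized hash length: " + storedHexHash.length());
        }
    }
}
```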

As can be seen from the table, there are conflicts: for example, SHA-1 and PBKDF1 are both 160 bits. Thus, if using this mechanism to differentiate hash types, the designers must ensure that any future implementation uses a different size; otherwise, older passwords will be lost during the migration process. Alternatively, additional logic could be employed to distinguish newer passwords of length 160 from older ones of length 160 (such as checking the last password change date).

A third option is to embed the hash type into the password column itself along with the password hash. Before we describe this, let’s take a short detour into the database schema. Many applications are designed to limit the size of database columns to only what is required. If the system had allocated only enough space for MD5, the password column could be a CHAR(32), which is just enough for a 32-hex-digit string. Thus, when switching the password hash algorithm, you must also remember to increase the size of the password column in the database appropriately.

Embedding the hash type and password hash together requires a defined format that encompasses both. A delimiter between the hash type and the password hash must be used, and for that we suggest a dollar sign ($). The choice of this glyph is meaningful, as $ is not a valid base64 or hex character. Another choice could be a colon, but the colon was already used in the Unix passwd and shadow files in the form username:password_hash. In modern contexts, the colon is also used in Basic-Auth credentials and in JSON. Colons would require special escape sequences, which would vary depending on the context. The dollar sign also allows migration from older algorithms to newer ones, since a legacy hex or base64 hash never contains it.

For example, $algorithm$salt_followed_by_hashed_password$ could be placed in the password column. The first part indicates the hash algorithm whose value might be SHA-512 and the remaining part would be the hash itself. If no dollar sign is in the column, then the password hash is assumed to be whatever legacy algorithm has been implemented by the system. As passwords are changed, the new storage format will be used.
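As a rough Java sketch of reading such a column: a value without a leading dollar sign is treated as the legacy algorithm, and anything else is split into its algorithm name and payload. The class name PrefixedHash, its fields, and the legacyAlgorithm parameter are hypothetical names chosen for this example.

```java
final class PrefixedHash {
    final String algorithm; // e.g. "SHA-512", or the legacy algorithm name
    final String payload;   // salt followed by hash, or the bare legacy hash

    private PrefixedHash(String algorithm, String payload) {
        this.algorithm = algorithm;
        this.payload = payload;
    }

    // Parses "$algorithm$salt_followed_by_hashed_password$" or a bare legacy hash.
    static PrefixedHash parse(String column, String legacyAlgorithm) {
        if (!column.startsWith("$")) {
            // No dollar sign: the row still holds the legacy format.
            return new PrefixedHash(legacyAlgorithm, column);
        }
        int end = column.indexOf('$', 1);
        if (end < 0) {
            throw new IllegalArgumentException("Missing second '$' delimiter");
        }
        String payload = column.substring(end + 1);
        if (payload.endsWith("$")) {
            payload = payload.substring(0, payload.length() - 1); // drop trailing delimiter
        }
        return new PrefixedHash(column.substring(1, end), payload);
    }
}
```

For instance, PrefixedHash.parse("$SHA-512$abc123$", "MD5") yields the algorithm SHA-512 with payload abc123, while a bare 32-character hex value is reported as MD5.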

The idea of the dollar-delimited fields can be expanded. On many Unix systems, passwords are stored using the Modular Crypt Format (MCF), which is basically an expanded version of the above. These versions of Unix modified the standard system crypt() function call to support alternate algorithms. The crypt function expects two arguments: the password and the salt. If the salt begins with $, the argument is parsed to determine which algorithm to use and processed appropriately; otherwise, the legacy crypt/DES mechanism is used. By implementing hashing in this manner, even older software could be updated to use modern hashes with (possibly) no code changes. In addition, because it was a silent modification to the system call, higher-level languages like Python automatically inherited the advanced features.

However, MCF is not standardized across the various Unix platforms. Not all versions of crypt support all algorithms. Some implementations place a $ at the end, while others have different names for the same algorithm or the same name for different algorithms (as shown in the MCF column in the table above). Overall, MCF is a starting point, not something we recommend.

The authors of Passlib, a Python-based password hashing library, attempted to document the various implementations of MCF and eventually concluded it needed to be replaced. Thus, they now recommend using the Password Hashing Competition (PHC) string format, which is basically:

    $<algorithm_id>[$<param>=<value>(,<param>=<value>)*][$<salt>[$<hash>]]

with each component having an explicit definition and format.

  • algorithm_id is the symbolic name for the function
  • param is a parameter name
  • value is a parameter value
  • salt is an encoding of the salt (modified base64)
  • hash is an encoding of the hash output (modified base64)

Here is an example of a PHC string from the Passlib folks:

    $5$rounds=80000$60Y7mpmAhUv6RDvj$AdseAOq6bKUZRDRTr/2QK1t38qm3P6sYeXhXKnBAmg0

Which corresponds to

  • algorithm_id=5, which is SHA2-256
  • param=value of rounds=80000, meaning run SHA2-256(SHA2-256(… 80000 times
  • salt of 60Y7mpmAhUv6RDvj
  • hash of AdseAOq6bKUZRDRTr/2QK1t38qm3P6sYeXhXKnBAmg0

This entire PHC password code was generated from a plaintext password of “fooey” and the salt above.
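To make the layout concrete, here is a small Java sketch that splits a string of this shape into the components listed above. It is a simplified reader written for this post, not Passlib’s parser: it assumes the parameter segment is the only one containing an equals sign and does no validation of character sets. The names PhcString and parse are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class PhcString {
    final String algorithmId;
    final Map<String, String> params;
    final String salt; // null when absent
    final String hash; // null when absent

    private PhcString(String id, Map<String, String> params, String salt, String hash) {
        this.algorithmId = id;
        this.params = params;
        this.salt = salt;
        this.hash = hash;
    }

    static PhcString parse(String encoded) {
        // "$5$rounds=80000$salt$hash" splits into ["", "5", "rounds=80000", "salt", "hash"].
        String[] parts = encoded.split("\\$", -1);
        if (parts.length < 2 || !parts[0].isEmpty() || parts[1].isEmpty()) {
            throw new IllegalArgumentException("Expected a string starting with $<id>");
        }
        Map<String, String> params = new LinkedHashMap<>();
        int i = 2;
        if (i < parts.length && parts[i].contains("=")) {
            for (String pair : parts[i].split(",")) {
                String[] kv = pair.split("=", 2);
                if (kv.length != 2) {
                    throw new IllegalArgumentException("Bad parameter: " + pair);
                }
                params.put(kv[0], kv[1]);
            }
            i++;
        }
        String salt = i < parts.length ? parts[i++] : null;
        String hash = i < parts.length ? parts[i] : null;
        return new PhcString(parts[1], params, salt, hash);
    }
}
```

Parsing a string built from the components above gives algorithmId 5, a rounds parameter of 80000, and the salt and hash as separate fields, which is everything a verifier needs to recompute the hash.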

While Passlib can manage PHC hash strings and offers bindings for non-Python languages like Java, it might not be the best choice for legacy applications. For your application, it may not even be necessary to implement an official PHC-compliant specification, only one that uses the concepts shown above. Before “rolling your own” implementation, see if another library implements some portions of the above.

Spring Security

Spring Security uses some of these concepts as part of its PasswordEncoder structure and includes support for SHA2-256, PBKDF2, Bcrypt, and Scrypt with automatic salts and extra rounds of hashing. It also stores all of these parameters in the password field in the database. This password hashing mechanism is also integrated with the login process.

For instance, we have an existing Spring application containing all user passwords stored as instances of MD5 hashes in the database.

With Spring Security running as an EE filter chain, there are a few ways we can migrate our users’ passwords from MD5 hashes to an iterative hashing function like PBKDF2.

  1. Approach 1: Upgrade algorithm upon authentication. This approach has minimal or no impact on the end user. When a user attempts to log in to the application, we retrieve the user’s stored password hash from the database and identify whether the hashing algorithm is MD5. If it is MD5, we hash the user-provided password from the request using MD5 and compare it with the hash retrieved from the database. If they match, the user is authenticated, and the software automatically rehashes the user-provided password with PBKDF2 and replaces the MD5 version in the database.
  2. Approach 2: Force password change, then upgrade algorithm. This approach impacts the user: everyone who authenticates to the application is redirected to a password change page and must change their password before continuing. The password change module uses PBKDF2 to hash the user-provided passwords and stores them in the database. One advantage is that application developers get a chance to enforce new password complexity policies immediately. This approach also assures application owners that any passwords which were compromised, or password hashes which were leaked in the past, can no longer be leveraged to hijack a victim’s account. A large disadvantage is that all users will change their passwords in quick succession, which could increase helpdesk calls.
  3. Approach 3: Upgrade now, change password later. In this approach, the application team runs a custom stored procedure on the database to force a password change at a later date. The stored procedure looks at the last login time for each user and sets a column like enforceChangePwdOnNextLogin in the users table to true, depending on whether they changed their password before or after the roll-out of the password migration process. When users authenticate, the application checks this column and, if it is true, redirects them to a change password page, where the application hashes the new password with PBKDF2 and stores it in the database. At this point, ensure that enforceChangePwdOnNextLogin is also set to false so that the application does not redirect the user to the change password page the next time they log in.

Depending on your business requirement, Spring Security can support you with all of these approaches irrespective of whether your application is legacy J2EE or a Spring MVC application.

For our demonstration, we will go with the first approach. The password migration process requires you to tweak the existing Spring Security authentication workflow.

One way to do the migration is to roll out a custom authentication module that implements the AuthenticationProvider interface.

Once you implement AuthenticationProvider, you will need to implement all of its abstract methods including the authenticate() function. This authenticate() function is responsible for the following:

  1. Identifying whether existing user password hash is MD5 or PBKDF2.
  2. Authenticating users who are still on MD5 and migrating them to PBKDF2.
  3. Authenticating users who have already migrated to PBKDF2.

The implementation of this is:

An authentication request is processed by the AuthenticationProvider and a fully authenticated object with full credentials is returned. The authenticate() function does the following:

  1. Retrieves username and password from the authentication request.
  2. Checks whether the provided username exists in the database or not.
    a. This is done by implementing the UserDetailsService and overriding loadUserByUsername(), which retrieves the user from the database using JDBC and returns a user domain object.
  3. The user object contains the password hash retrieved from the database. Check whether the user password is MD5 or PBKDF2 based on the password length (MD5_STRING_LENGTH). Note that this check can vary depending on your existing implementation of hashed passwords.
  4. If the retrieved password is determined to be MD5, then hash the user-provided password with MD5 and check whether the hashes match. If they match, hash the user-provided password with PBKDF2, store the result in the database, and return an authentication object with full credentials.
  5. If the password is determined to be PBKDF2, then use the PBKDF2 encoder to verify the hashes and return an authentication object with full credentials.
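The steps above can be sketched without Spring, using only the JDK. This is an illustration of the upgrade-on-login logic, not the authentication provider itself: the USERS map is a hypothetical stand-in for the users table, the "$pbkdf2$salt$hash" storage layout and the iteration count are assumptions made for the example, and a real Spring application would delegate the PBKDF2 work to its configured password encoder.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.MessageDigest;
import java.security.SecureRandom;
import java.util.Base64;
import java.util.HashMap;
import java.util.Map;
import javax.crypto.SecretKeyFactory;
import javax.crypto.spec.PBEKeySpec;

final class UpgradeOnLogin {
    static final int MD5_STRING_LENGTH = 32; // hex-encoded MD5
    static final int ITERATIONS = 10_000;    // illustrative only, tune for production
    // Hypothetical stand-in for the users table: username -> stored hash.
    static final Map<String, String> USERS = new HashMap<>();

    static String md5Hex(String password) {
        try {
            byte[] digest = MessageDigest.getInstance("MD5")
                    .digest(password.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder();
            for (byte b : digest) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    static byte[] pbkdf2(String password, byte[] salt) {
        try {
            PBEKeySpec spec = new PBEKeySpec(password.toCharArray(), salt, ITERATIONS, 256);
            return SecretKeyFactory.getInstance("PBKDF2WithHmacSHA256")
                    .generateSecret(spec).getEncoded();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // Assumed storage layout: $pbkdf2$<base64 salt>$<base64 hash>
    static String encodePbkdf2(String password) {
        byte[] salt = new byte[16];
        new SecureRandom().nextBytes(salt);
        Base64.Encoder b64 = Base64.getEncoder();
        return "$pbkdf2$" + b64.encodeToString(salt) + "$"
                + b64.encodeToString(pbkdf2(password, salt));
    }

    static boolean matchesPbkdf2(String password, String stored) {
        String[] parts = stored.split("\\$"); // ["", "pbkdf2", salt, hash]
        byte[] salt = Base64.getDecoder().decode(parts[2]);
        byte[] expected = Base64.getDecoder().decode(parts[3]);
        return MessageDigest.isEqual(expected, pbkdf2(password, salt));
    }

    static boolean authenticate(String username, String password) {
        String stored = USERS.get(username);        // step 2: look up the user
        if (stored == null) {
            return false;
        }
        if (stored.length() == MD5_STRING_LENGTH) { // step 3: legacy row
            boolean ok = MessageDigest.isEqual(
                    stored.getBytes(StandardCharsets.UTF_8),
                    md5Hex(password).getBytes(StandardCharsets.UTF_8));
            if (ok) {
                USERS.put(username, encodePbkdf2(password)); // step 4: upgrade in place
            }
            return ok;
        }
        return matchesPbkdf2(password, stored);     // step 5: already migrated
    }
}
```

Note that the upgrade is only possible at login time, because that is the only moment the application holds the plaintext password needed to compute the new hash.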

The default PBKDF2 password encoder will use 360000 iterations and output a hash size of 160 bytes. These defaults aim for roughly 0.5 seconds per password validation in order to slow down brute-force attacks. Application owners should tune password verification on their own systems to their needs.
The PBKDF2 password encoder outputs hex encoding by default; however, it can also be configured to output base64.
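One rough way to pick an iteration count for your own hardware is to time PBKDF2 directly and keep doubling until a target duration is reached. The sketch below uses the JDK’s PBKDF2WithHmacSHA256 as a stand-in, so the number it reports may not transfer exactly to the Spring encoder; the starting count, the upper bound, and the names Pbkdf2Calibration, millisFor, and calibrate are all assumptions for illustration.

```java
import java.security.GeneralSecurityException;
import javax.crypto.SecretKeyFactory;
import javax.crypto.spec.PBEKeySpec;

final class Pbkdf2Calibration {
    // Wall-clock milliseconds for one PBKDF2 derivation at the given iteration count.
    static long millisFor(int iterations) {
        try {
            PBEKeySpec spec = new PBEKeySpec(
                    "benchmark-password".toCharArray(), new byte[16], iterations, 256);
            SecretKeyFactory factory = SecretKeyFactory.getInstance("PBKDF2WithHmacSHA256");
            long start = System.nanoTime();
            factory.generateSecret(spec);
            return (System.nanoTime() - start) / 1_000_000;
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // Doubles the iteration count until one derivation takes at least targetMillis.
    static int calibrate(long targetMillis) {
        int iterations = 10_000;
        while (millisFor(iterations) < targetMillis && iterations < 50_000_000) {
            iterations *= 2;
        }
        return iterations;
    }
}
```

Calling Pbkdf2Calibration.calibrate(500) on the production host gives a starting point for the 0.5 second target; a single timing is noisy, so it is worth repeating the measurement under realistic load before settling on a value.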

Below is the code which implements UserDetailsService to obtain the user object from the database.

The method for handling authorization would vary in different applications; however it is important that we return a user object through this service.

If the application is legacy, then invoke the custom authentication provider through configuration in web.xml using namespace:

In case you want to invoke the custom authentication provider through Java code, then it is possible by implementing WebSecurityConfigurerAdapter as shown below:

Sample Run

Now, with all this code and configuration, let’s get back to seeing the seamless password migration process. As seen earlier, here is the users table before the migration:

Let us log in and see the custom authentication provider in action. First, the login page.

And after authenticating, we get to the main page:

Once authenticated successfully, let’s check the user table to see if user “sid” has now been migrated to PBKDF2.


Yes, “sid” has been upgraded while “jay”, as usual, is slow on the uptake.

Now You Try It!

We have provided all code on our GitHub Repo for you to fork and use. Try it yourself, make changes, ask questions, file change requests, and overall, upgrade your own software using this as a basis. If you do use our code, please give a shout-out.

Conclusion

Defense in Depth is essential to keeping your systems secure. If an attacker were to acquire the password hash from a lost backup tape, network tap, or website compromise, they could attempt to reverse them using rainbow tables. To remove these threats, add a salt to the password hashing process and upgrade to a more secure hashing algorithm.

We have provided a working template for a Spring application leveraging Spring Security with simple integration and limited impact on the end users. The code is short and easily customizable to your existing configurations.

We are interested in hearing how you used this code and what changes and enhancements you made to add it to your website. Please give us feedback at the addresses below.

References

Various things we looked at in writing this article.

About The Authors

This blog post was co-written by Siddharth Coontoor and Jay Ball and co-published on Jay’s blog at veggiespam.com; it will be mirrored to Sid’s blog in the future. Both Sid and Jay are #infosec professionals who do penetration testing and security threat modeling in many of the skyscrapers throughout Manhattan and Jersey City. Both participate in local infosec organizations such as OWASP, 2600, HackNYC, and more. They can be reached via DM @theClumsyCoder and @veggiespam or via email.