Password Entropy: What it is and Why it Matters…
TL;DR:
> It’s randomness, and it helps protect against brute-force attacks from hackers.
> Make sure your password isn’t on some leaked password database.
If you could pick your own password, and most people do, what password would you choose? What format would you use? Would you include special characters? How long would you make it? Finding the perfect password, one that is both easy to remember and safe, is hard. Especially in today’s world, it seems like passwords need to get longer and longer. How can we measurably determine the safety of a password? Well, the measure is called entropy, specifically Information Entropy. In a nutshell, password entropy is the measure of a password’s “guessability” (I don’t think that’s a real word).
Why is it important?
Entropy is important for a few reasons, but none more important than stopping a bad actor from guessing your password in a brute-force attack. A brute-force attack is when a hacker uses software to run through many possible passwords, trying to guess which one unlocks whichever database/account/service they are after. The number of passwords per second they can guess depends on their budget and how motivated they are; this post breaks down the price for an attacker with a certain budget, time, and costs. Because this hardware uses electricity and costs money, an attacker needs to be skilled, funded, and motivated to start an attack. These facts are why it is worth putting in even just 15 minutes to work on upping your security, and with software, we can do that relatively easily.
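To get a feel for the scale involved, here is a rough sketch of how long an exhaustive search might take on average. The function name and the rate of 10 billion guesses per second are assumptions picked purely for illustration, not measurements of any real attacker, and the math only holds for passwords chosen at random.

```python
def average_crack_seconds(pool_size: int, length: int, guesses_per_second: float) -> float:
    """Average time to brute-force a uniformly random password.

    pool_size: number of characters the password could be built from
    length: number of characters in the password
    """
    keyspace = pool_size ** length  # every possible password of this length
    # On average the attacker finds the password after trying half of them.
    return (keyspace / 2) / guesses_per_second


ASSUMED_RATE = 1e10  # assumption: 10 billion guesses per second
SECONDS_PER_DAY = 86_400

# Lowercase-only passwords (26 characters) of growing length.
for length in (6, 8, 10, 12):
    days = average_crack_seconds(26, length, ASSUMED_RATE) / SECONDS_PER_DAY
    print(f"length {length}: about {days:.6f} days on average")
```

Every extra character multiplies the work by the size of the character set, which is why length matters so much.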
How do I measure entropy?
Firstly, entropy is measured in bits. The more bits, the better your chances of surviving an attack, because the attacker is forced to spend more computing power and time to guess the correct password. Each extra bit doubles the number of guesses needed. Using some math, we can calculate how many bits a password has. This means we can, in a way, estimate whether a password is weak, decent, or strong. The formula for measuring password entropy is as follows:
E = log2(R)*L or E = log2(R^L)
E stands for entropy
R is the number of available characters
L is the length of the password
You can also get the entropy of the password by first calculating the number of available characters (R) to the power of the number of characters in the password (L), and then calculating the binary logarithm (log2) of the result (E = log2(R^L)).
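As a minimal sketch, both forms of the formula can be written as small Python functions. The function names are my own, and the formula assumes every character is picked independently and uniformly at random from the available set.

```python
from math import isclose, log2


def entropy_bits(pool_size: int, length: int) -> float:
    """E = log2(R) * L"""
    return log2(pool_size) * length


def entropy_bits_alt(pool_size: int, length: int) -> float:
    """E = log2(R^L), the same value computed the other way around."""
    return log2(pool_size ** length)


# Lowercase-only password of 8 characters: R = 26, L = 8
bits = entropy_bits(26, 8)
print(bits)
print(isclose(bits, entropy_bits_alt(26, 8)))  # both forms agree
```

The first form is the one to reach for in practice, since it avoids building the huge number R^L before taking the logarithm.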
Any platform worth its salt will typically ask you to create a password that meets these criteria:
Let’s break this down further, just so we can get to calculating, but before we do, keep this chart in mind:
Now that you have seen the chart, let’s focus on R in this formula, because it is important to know what they mean by available characters. Lowercase letters include [a-z], so that would be 26 available characters. Uppercase would include [A-Z], so that leaves us with another 26. So as of now, if our password included ONLY uppercase and lowercase letters, our character set (R) would be 52.
Let’s start including numbers as well. On a keyboard, there are 10 digits available, [0–9], so 52 + 10 = 62. So let’s whip out the handy dandy… Python interpreter, using Python’s `math` library. You can use a calculator instead, but I want to visualize it for y’all.
# Calculating the entropy of a password of 8 characters
# using upper/lowercase and numbers
>>> from math import log2
>>> log2(62)*8
47.633570483095
A password of 8 characters, using 62 available ASCII alphanumeric characters, will provide about 47.63 bits of entropy. This is a relatively weak password in my opinion. A good password should have about 60(ish) bits or more, ideally. To achieve this, we can do one of two things: increase the length of the password or include special characters. According to the ASCII table (character codes 32–127, excluding codes 32 and 127, which are space and delete), that leaves us with 94 printable characters. The number of characters included in password generators varies (more on that later in the post), depending on the software provider. But, for the sake of the calculation, let's stick with the slightly more conservative 93 as the number of available characters in a password that includes alphanumeric characters and special characters. Let's calculate the entropy of a password using these 93 characters in our available set of characters, with a length of 8 characters.
>>> from math import log2
>>> log2(93)*8
52.31327048886425
Just by including special characters, we can increase the entropy of our password by about 4.7 bits. A pretty good jump in entropy considering it’s the same length.
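The other option mentioned earlier is making the password longer. Here is a small sketch of how many characters are needed to reach a target number of bits for a given character set. The helper name `min_length` is my own, and the 60-bit target is just the rough figure used in this post.

```python
from math import ceil, log2


def min_length(pool_size: int, target_bits: float) -> int:
    """Smallest length L such that log2(pool_size) * L >= target_bits."""
    return ceil(target_bits / log2(pool_size))


for pool_size in (62, 93):
    length = min_length(pool_size, 60)
    print(f"{pool_size} characters: at least {length} long for 60 bits")
# 62 characters: at least 11 long for 60 bits
# 93 characters: at least 10 long for 60 bits
```

So the larger character set saves only a single character here, which hints that length is doing most of the work.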
Why are some password generators different?
This can be for a myriad of reasons, but let’s use the Firefox browser’s built-in password generator as an example. Luckily, its code is Free and Open Source (more on F/OSS in another post), so let us take a look:
// Some characters are removed due to visual similarity:
const LOWER_CASE_ALPHA = "abcdefghijkmnpqrstuvwxyz"; // no 'l' or 'o'
const UPPER_CASE_ALPHA = "ABCDEFGHJKLMNPQRSTUVWXYZ"; // no 'I' or 'O'
const DIGITS = "23456789"; // no '1' or '0'
const SPECIAL_CHARACTERS = "-~!@#$%^&*_+=)}:;\"'>,.?]";
The comments in the block above tell us why some characters were excluded: Firefox's generator removes characters that are easy to confuse visually. I presume they want a generator whose passwords the end user could write down for later use? I’ll have to ask them. Using this generator, we’d get different results due to the smaller character set. We can’t say, though, that these passwords are less secure just because the character set is smaller. Below, I will *attempt to* explain why.
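To put a number on that smaller character set, here is a quick sketch that copies the four constants into Python and counts them. It assumes every character is drawn uniformly from the combined set; the real generator may apply extra rules that this ignores, and the lengths compared below are my own picks for illustration.

```python
from math import log2

# Character sets copied from the Firefox snippet above.
LOWER_CASE_ALPHA = "abcdefghijkmnpqrstuvwxyz"
UPPER_CASE_ALPHA = "ABCDEFGHJKLMNPQRSTUVWXYZ"
DIGITS = "23456789"
SPECIAL_CHARACTERS = "-~!@#$%^&*_+=)}:;\"'>,.?]"

pool = LOWER_CASE_ALPHA + UPPER_CASE_ALPHA + DIGITS + SPECIAL_CHARACTERS
pool_size = len(set(pool))  # set() guards against accidental duplicates
print(pool_size)            # 80

# Compare 8 characters from the 93-character set with
# 9 characters from this smaller set.
bits_full_8 = log2(93) * 8
bits_small_9 = log2(pool_size) * 9
print(round(bits_full_8, 2), round(bits_small_9, 2))
print(bits_small_9 > bits_full_8)  # True
```

One extra character more than makes up for the 13 characters that were dropped for readability.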
Entropy is a factor, but NOT the only factor…
While password entropy is a big factor in developing a strong password, it’s important to emphasize that it is not the ONLY factor. I’d even argue that it’s not the most important factor. But saying that is not the same as saying you shouldn’t place any importance on it. Let’s go over a scenario: I came up with a password myself, not generated by software. I created the password PassW0rD!!AbCd. This password is lengthy, at 14 characters, and includes alphanumeric characters and special symbols. Let’s calculate the bits of entropy:
>>> from math import log2
>>> log2(93)*14
91.54822335551243
As we can see from the code block, this password would give us about 91.5 bits of entropy. 91ish bits of entropy is, by almost all standards, a strong password. But what some may not realize is that this password probably lives somewhere on the internet, in a database of leaked passwords or a database of common password permutations. This poses a problem because, while this password has enough entropy to pass most password strength checkers, it's probably already on the internet as plaintext, somewhere. The formula also assumes every character was picked at random, which is not true of a password built from a common word. This means an attacker would have no problem cracking it in a couple of hours. Maybe faster, depending on their skills and resources. We will go over threat models in another post.
How can I make sure my password is safe?
The safest approach is to check your password against a leaked password database, and to make sure it has at least around 75(ish) bits of entropy. KeePass can estimate the quality of your passwords in bits of entropy, and Kaspersky offers an online password checker. I would be very wary of putting one of your personal passwords into an online checker, but you need to be able to check them against all available databases (Kaspersky claims they do not store any of the passwords, so a level of trust is needed here).
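Both checks can be sketched offline in a few lines. Everything here is hypothetical: the `assess` function, the tiny `SAMPLE_LEAKED` set (a made-up stand-in for a real leaked-password list you would load locally), and the test passwords. This is not how KeePass or Kaspersky work internally.

```python
from math import log2

# Hypothetical stand-in for a real leaked-password list loaded from disk.
SAMPLE_LEAKED = {"password", "123456", "PassW0rD!!AbCd"}


def assess(password: str, pool_size: int, leaked: set, min_bits: float = 75.0) -> str:
    """Return 'leaked', 'too weak', or 'ok' for a candidate password."""
    # A leaked password is unsafe no matter how many bits the formula gives it.
    if password in leaked:
        return "leaked"
    # The entropy formula only means something for randomly generated passwords.
    bits = log2(pool_size) * len(password)
    return "ok" if bits >= min_bits else "too weak"


print(assess("PassW0rD!!AbCd", 93, SAMPLE_LEAKED))  # leaked
print(assess("abcdefgh", 26, SAMPLE_LEAKED))        # too weak
print(assess("Qz7#Lm2!Rk9$Vw", 93, SAMPLE_LEAKED))  # ok
```

Doing the leak check first matters: the 14-character example from earlier passes the entropy threshold easily, yet still fails.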