
Every Tuesday we will be going through a
technical term from the bitcoin ecosystem. Today we are going to start with
understanding what a “hash” is.

What is a Hash?

In bitcoin, a hash is the result of running a set of
data through a cryptographic hash function. But what does a hash function
really do? A hash function turns data of arbitrary size
into a value of fixed size, called the hash.
An ideal cryptographic hash function has 5 main
properties:

·    It is deterministic. This means that if I put
the same value through a hash function, it will always generate the same hash
value.

·    It is quick to generate a hash
value. For a hash function to be ideal, it has to be possible to compute the hash fairly
quickly.

·    It’s infeasible (almost
impossible) to find the original set of data from its hash value. It is only
possible to find it by guessing all the possible values of the original data,
and running them through the function.

·    It is collision resistant. It is
infeasible (almost impossible) to find two sets of data that have the same hash
value.

·    Changing even the smallest
value in the set of data should change the hash value drastically.

A real-life example of a hash
function could be built with a rope of a predetermined length and an analog clock.
If you always started at 12 and wrapped the rope around the clock until you ran
out, you would always end up in the same spot, let’s say for example 3:15. But
just knowing 3:15, you would never be able to find out how long the rope actually
is, because ropes of many different lengths end up in that same spot.
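
The rope-and-clock idea can be sketched in a few lines of Python. This is only an illustration: the function name rope_position and the choice of a dial with a circumference of 60 units (one per minute mark) are assumptions made for this example, not part of any real hashing scheme.

```python
def rope_position(rope_length: int, circumference: int = 60) -> int:
    """Return the minute mark (0-59) where a rope wrapped from 12 runs out.

    Assumes the dial's circumference is 60 units, one unit per minute mark.
    """
    # Every full lap brings us back to 12, so only the remainder matters.
    return rope_length % circumference


# Ropes of very different lengths can end on the same mark.
print(rope_position(75))   # 15
print(rope_position(135))  # 15
print(rope_position(615))  # 15
```

Knowing only the result, 15, does not tell you whether the rope was 75, 135 or 615 units long.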

But how does a hash algorithm work in the computer world?

Real hashing algorithms are fairly complex
and would be too hard to explain simply in just a blog post.
So as an example I am going to try to show
how a hashing algorithm could work with a simplified version.
Let’s say I created a hashing function that
assigned a number to every letter in the alphabet, corresponding to its position
in the alphabet (for example: a=1, b=2, c=3 etc…), and then the function summed
these values until it got a one-digit number (if the sum has more than one digit, its digits are added together again).
If I ran this algorithm on the
word “abba” I would get a hash value of 6 (1+2+2+1 = 6).
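
Here is one way this toy function could look in Python. The name toy_hash is made up for this post, and "summed until it got a one digit number" is read here as: add up the letter values, then keep adding the digits of the result until a single digit remains. Characters other than a to z are simply ignored, which is another assumption.

```python
def toy_hash(word: str) -> int:
    """Toy hash: sum letter positions (a=1 ... z=26), then reduce to one digit."""
    total = sum(ord(ch) - ord("a") + 1 for ch in word.lower() if "a" <= ch <= "z")
    # Keep summing the decimal digits until a single digit is left.
    while total >= 10:
        total = sum(int(digit) for digit in str(total))
    return total


print(toy_hash("abba"))     # 6  (1 + 2 + 2 + 1)
print(toy_hash("bitcoin"))  # 9  (2+9+20+3+15+9+14 = 72, then 7 + 2)
```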

This algorithm satisfies a few properties
of a good hashing function:
·    If I run the word “abba”
through the function over and over again, the result will always be 6, so it is
deterministic.

·    It is a simple computation, so
it is quick to generate.

·    From the hash value of 6 it is
not possible to find the original value of the data it represents. It is only
possible to find it by guessing all the possible values of the original data,
and running them through the function.

·    Because the result of the hash
function is so small (one digit), many sets of data will have the same value. If
for example I entered the value “abc” in the function I would also get 6 as a
hash value. To solve this issue the hash value needs to come from a range so big that
finding two inputs with the same value is not practical. A 256-bit hash, for example, has 2^256 possible values.

·    Because this function is so
simple, the hash value does not change drastically when a small change is
applied to the original data. To change the hash value drastically, a hash function
should perform more complex operations on the set of data.

This example only satisfied 3 out of the 5 requirements to be a good hashing function.
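
The two failed requirements are easy to check for yourself. The sketch below is a minimal Python version of the toy function described above. The name toy_hash and the reading of "summed until one digit" as repeated digit sums are assumptions made for this example.

```python
from itertools import product
from string import ascii_lowercase


def toy_hash(word: str) -> int:
    """Sum letter positions (a=1 ... z=26), then reduce the sum to one digit."""
    total = sum(ord(ch) - ord("a") + 1 for ch in word.lower() if "a" <= ch <= "z")
    while total >= 10:
        total = sum(int(digit) for digit in str(total))
    return total


# Collisions: different inputs, same hash value.
print(toy_hash("abba"), toy_hash("abc"), toy_hash("baab"))  # 6 6 6

# A small change in the input only nudges the output.
print(toy_hash("abba"), toy_hash("abbb"))  # 6 7

# Many inputs share one value: list the two-letter words that hash to 6.
matches = ["".join(pair) for pair in product(ascii_lowercase, repeat=2)
           if toy_hash("".join(pair)) == 6]
print(len(matches), matches[:5])
```

With only a handful of possible outputs, collisions are unavoidable, and swapping one letter for its neighbour moves the result by just one step.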

An example of a good hashing function is SHA256, which is the most used hashing function in bitcoin. SHA256 successfully
satisfies all five requirements and makes for a good cryptographic hashing function.
Explaining exactly what calculations SHA256 does is beyond the scope of this article, but if you want to see what SHA256 hashes are
generated from different values you can visit https://passwordsgenerator.net/sha256-hash-generator/ and try them out for yourself.
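
If you would rather try this locally than on a website, Python's standard hashlib module includes SHA-256. The snippet below is a small sketch for experimenting, not how bitcoin software is written. The helper names sha256_hex and differing_bits are invented for this example.

```python
import hashlib


def sha256_hex(text: str) -> str:
    """Return the SHA-256 hash of a string as 64 hexadecimal characters."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def differing_bits(hex_a: str, hex_b: str) -> int:
    """Count how many bits differ between two hex-encoded hashes."""
    return bin(int(hex_a, 16) ^ int(hex_b, 16)).count("1")


first = sha256_hex("abc")
print(first)
# ba7816bf8f01cfea414140de5dae2223b00361a396177a9cb410ff61f20015ad

# Deterministic: the same input always gives the same hash.
print(first == sha256_hex("abc"))  # True

# Fixed size: inputs of any length give 64 hex characters (256 bits).
print(len(sha256_hex("a")), len(sha256_hex("a" * 10_000)))  # 64 64

# Small change, big difference: compare "abba" with "abbb".
print(differing_bits(sha256_hex("abba"), sha256_hex("abbb")))
```

For a well-mixed 256-bit hash you would expect roughly half of the bits, about 128, to differ on the last line.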