The purpose of this post is to be an introduction to a fundamental cryptographic mechanism at play in cryptocurrency. If you’ve wondered how crypto works underneath the hood, but haven’t taken the time to research it for whatever reason, then hopefully this post will be informative and enjoyable for you. For the purposes of this post I’m going to use ethereum as an example, simply because it’s what I’m most familiar with. I should also note that while I’ve taken the time to research this prior to posting, cryptography is a super interesting field in which I am not an expert. I love using it, and I’m passionate about it, but from a mathematical perspective, I’m still learning. If there are any mistakes please feel free to shout them out in a comment below.
For the unfamiliar, it is easy to assume encryption is the cryptography in cryptocurrencies. Encryption here means the act of turning a message into a code that is only meant to be decoded by a specific audience. It may be tempting for new people to see a raw transaction, think the weird bundle of letters and numbers starting with 0x is the cryptography in crypto, and call it a day. Consider for a moment, however, that anyone can read those messages (if you already know the difference between encryption and encoding, just bear with me). If they’re all readable and unhidden by design, then what really is the point of all of this? Let’s take a look at the following raw transaction:
0xf901a1808459682f07831e8480949901ead10f6a379758c866febf9ad62184bfa643843b9aca00b901374e6f742065766572792077616c6c657420616c6c6f777320796f7520746f20696e636c75646520796f7572206f776e20646174612c2062757420746865206162696c697479206973206275696c7420696e746f20657468657265756d2e20596f75206a7573742074616b6520796f7572206d65737361676520616e6420636f6e766572742069742066726f6d205546543820746f2068657861646563696d616c2e20554654206973206772656174206265636175736520796f752063616e20696e636c7564652063686172616374657273207375636820617320e697a5e69cace8aa9e20666f72206578616d706c652e0a0a416e797761792c204920686f7065207468617420736561726368696e6720666f722074686973206561737465722065676720696e206d7920706f7374207761732066756e2e2ea0b02c1518b611a5a0bec7ab7aec4629730d9c437ed0dd7b752113741ced9a73b2a049b9221c1780e956a7f565889bca08859c794dafeb60846462782385d1d8f72b
That strange list of numbers and letters is called hexadecimal, and it’s a way of writing whatever you want using a base 16 system.
In this system the number 0x1 is 1 in our base 10 system, 0xF = 15 and 0x10 = 16.
Why use this system? Because it is much easier to write than plain binary and much more compact as well. One hexadecimal character stands for exactly four binary digits, so the data takes fewer characters to write but is still easily converted to binary.
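To make the base 16 idea concrete, here is a small Python sketch. The four sample bytes are made up for illustration and have nothing to do with the transaction above.

```python
# Hexadecimal literals are just another way of writing ordinary integers.
values = [0x1, 0xF, 0x10]
print(values)                      # [1, 15, 16]

# Four arbitrary example bytes (hypothetical data, chosen for illustration).
raw = bytes([222, 173, 190, 239])

as_hex = raw.hex()                                  # two characters per byte
as_binary = "".join(f"{b:08b}" for b in raw)        # eight characters per byte

print(as_hex, len(as_hex))         # deadbeef 8
print(as_binary, len(as_binary))   # 32 characters of ones and zeros

# Converting back loses nothing.
round_trip = bytes.fromhex(as_hex)
```

The same four bytes need 32 characters in binary but only 8 in hexadecimal, which is the whole reason the format is so popular.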
A UTF-8 message in hexadecimal can be read by anyone. 436F6E67726174756C6174696F6E7320666F722074616B696E67207468652074696D6520746F206C6F6F6B207468697320757021 The ethereum message isn’t UTF-8 though.
What the ethereum message actually consists of is a structured message that is signed and then encoded with RLP (Recursive Length Prefix). The resulting bytes are then written in hexadecimal. RLP is designed to be a two way street: anything encoded with it can be decoded right back. The message structure is as follows:
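If you want to try the text-to-hexadecimal round trip yourself, a minimal Python sketch looks like this. The sample sentence is my own placeholder, not the message above (no spoilers), but the same `bytes.fromhex(...).decode("utf-8")` call should work on any UTF-8 message written in hexadecimal.

```python
# A placeholder message, including a few non-ASCII characters.
message = "Hello 日本語"

# Text -> UTF-8 bytes -> hexadecimal string.
encoded = message.encode("utf-8").hex()
print(encoded)   # 48656c6c6f20e697a5e69cace8aa9e

# Hexadecimal string -> bytes -> text. No key or secret required.
decoded = bytes.fromhex(encoded).decode("utf-8")
print(decoded)   # Hello 日本語
```

Notice that each of the three Japanese characters takes three bytes (six hexadecimal characters), while the plain ASCII letters take one byte each.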
nonce: # the sequential number of this transaction from the point of view of the from address
gasPrice:
gasLimit:
to:
value: # the value of the transaction
data: # the data included as part of the transaction
v:
r:
s:
With that information anyone can decode the message. The example I provided can be decoded by https://antoncoding.github.io/eth-tx-decoder/ for example (no affiliation, just thought his static page was cool and convenient to prove a point). This decoder website is just an in-browser app, and will work offline. This illustrates the point that this can be decoded by anyone with relative ease, no networking required.
The fact of the matter is that these messages are designed to be decoded. If they weren’t decodable by everyone, none of this would work. You see, up until now we’ve been working with encoding and decoding, which is simply transferring from one format to another. So far we’ve not encrypted anything. Nor will we.
The real magic of cryptography happens with the values v, r and s. Those three variables together form the signature of the transaction. For ethereum transactions, the message is signed before RLP encoding, and signing just adds those three fields. You can decode via RLP and get all the fields without any advanced knowledge of the signature. ECDSA, the way ethereum uses it, gives every signed transaction and message a built in way to recover the public key (and, in the case of cryptocurrency, the wallet address of that key).
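If you’d rather see the decoding spelled out than trust a web page, here is a minimal sketch of an RLP decoder in plain Python. The function names `decode_item` and `_payload_bounds` are my own, it does no validation of malformed input, and the sample is a hand-built three item list that reuses only the first three fields (nonce, gasPrice, gasLimit) of the transaction above rather than the whole thing.

```python
def decode_item(data: bytes):
    """Decode the first RLP item in data; return (item, remaining bytes)."""
    prefix = data[0]
    if prefix < 0x80:                       # a single byte below 0x80 encodes itself
        return data[:1], data[1:]
    if prefix < 0xC0:                       # byte string
        start, length = _payload_bounds(data, 0x80)
        return data[start:start + length], data[start + length:]
    start, length = _payload_bounds(data, 0xC0)   # list of nested items
    payload, items = data[start:start + length], []
    while payload:
        item, payload = decode_item(payload)
        items.append(item)
    return items, data[start + length:]


def _payload_bounds(data: bytes, base: int):
    """Return (payload offset, payload length) for a string (0x80) or list (0xC0)."""
    short = data[0] - base
    if short <= 55:                         # length is stored in the prefix itself
        return 1, short
    len_of_len = short - 55                 # length is stored in the following bytes
    return 1 + len_of_len, int.from_bytes(data[1:1 + len_of_len], "big")


# Hand-built sample: a list holding the first three fields of the transaction above.
sample = bytes.fromhex("ca808459682f07831e8480")
fields, rest = decode_item(sample)
nonce, gas_price, gas_limit = (int.from_bytes(f, "big") for f in fields)
print(nonce, gas_price, gas_limit)          # 0 1500000007 2000000
```

Feeding the full raw transaction (minus the leading 0x) through `bytes.fromhex` and `decode_item` should give back a list of nine byte strings in the order listed above, with no secret needed at any step.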
Using those 3 values together with the rest of the message you can calculate the public key. The wallet address is derived from the public key: it is the last 20 bytes of the Keccak-256 hash of the key. So if the public key you calculate hashes to the wallet address that sent this transaction, you have proof that it was signed by the correct private key.
The mechanism that makes this work is called the Elliptic Curve Digital Signature Algorithm (ECDSA). ECDSA is awesome, and a bit complex. The gist of it is you have a curve on a graph, with x and y values, and there are rules for moving points around on this curve. Your private key is the number of steps you move when you interact with this pattern. The message has been hashed into a number that can be used on this graph. That hash is knowable by everyone because, like we’ve shown, we already have the message and can deconstruct it (it’s how we got the values for v, r, s). The public key is a point on that graph as well. Here’s a visual representation of the curve:
[Image: ECDSA curve]
The graph uses a finite field, so if a number gets very large, it just wraps around every time it passes the maximum size of the field. When a private key signs a message, it takes the message hash, a starting point, and a secret very large number (K), and uses them to calculate the r and s values. The v value was added for ethereum for two reasons: to select which of the possible public keys to use (yes, recovering a key from r and s alone gives you two candidate public keys, and you gotta pick one), and to protect against cross chain replay attacks (by using two different numbers per chainId, namely chainId * 2 + 35 and chainId * 2 + 36).
Something interesting, and extremely important, is that the value K must always be unique. If that K value, the other secret very large number, is ever used for two different messages, the two signatures give away enough information to calculate the secret key. Sony used the same value across a bunch of PlayStation 3 signatures, and its signing key was recovered as a result, I believe around 2010 or 2011.
The interesting thing here is that our secret key is just a number of steps, and as long as K is unique each time, the signature will always contain just enough information to reveal the corresponding public key and nothing more. It’s funny to think the number of steps is a secret key, but it makes sense. If you have a graph filled with seemingly random plot points and are dealing with a large range of numbers, asking how many steps it takes to go from one point to another is very difficult to answer, unless you already know.
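Here is a rough sketch of signing and public key recovery in plain Python, on secp256k1 (the curve ethereum uses). A few loud assumptions: the helper names (`add`, `mul`, `sign`, `recover`) and the sample key are mine, SHA-256 stands in for the Keccak-256 hash ethereum actually uses (it isn’t in Python’s standard library), and real wallets do extra work I skip here, such as normalising s. It needs Python 3.8 or newer for `pow(x, -1, m)`. Treat it as a learning toy, not something to sign real transactions with.

```python
import hashlib
import secrets

# secp256k1: y^2 = x^3 + 7 over the finite field of integers modulo P.
P = 2**256 - 2**32 - 977
N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141  # number of points
G = (0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
     0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8)  # starting point


def add(a, b):
    """Add two curve points; None plays the role of the point at infinity."""
    if a is None:
        return b
    if b is None:
        return a
    (x1, y1), (x2, y2) = a, b
    if x1 == x2 and (y1 + y2) % P == 0:
        return None
    if a == b:
        m = 3 * x1 * x1 * pow(2 * y1 % P, -1, P) % P      # tangent slope
    else:
        m = (y2 - y1) * pow((x2 - x1) % P, -1, P) % P     # chord slope
    x3 = (m * m - x1 - x2) % P
    return x3, (m * (x1 - x3) - y1) % P


def mul(k, point):
    """Take k steps from point, using double-and-add to do it quickly."""
    result = None
    while k:
        if k & 1:
            result = add(result, point)
        point = add(point, point)
        k >>= 1
    return result


def sign(z, d, k):
    """Sign hash z with private key d and one-time secret k."""
    R = mul(k, G)
    r = R[0] % N
    s = pow(k, -1, N) * (z + r * d) % N
    return r, s, R[1] & 1          # the last value says which candidate key is ours


def recover(z, r, s, parity):
    """Rebuild the public key from the signature (ignores the rare case R.x >= N)."""
    y = pow(pow(r, 3, P) + 7, (P + 1) // 4, P)            # square root modulo P
    if y & 1 != parity:
        y = P - y
    R = (r, y)
    return mul(pow(r, -1, N), add(mul(s, R), mul(-z % N, G)))


d = 0x1F2E3D4C5B6A79881F2E3D4C5B6A7988                  # made-up private key
public_key = mul(d, G)
z = int.from_bytes(hashlib.sha256(b"example message").digest(), "big")
k = secrets.randbelow(N - 1) + 1                          # fresh secret for every signature
r, s, parity = sign(z, d, k)

print(recover(z, r, s, parity) == public_key)             # True
print(recover(z, r, s, parity ^ 1) == public_key)         # False: the other candidate
```

The last two lines are the "two public keys" situation in action: r and s alone fit two different keys, and the extra bit (which ethereum folds into v) tells you which one is the signer’s.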
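To see why reuse is so deadly, here is a hedged Python sketch (3.8 or newer) that signs two different messages with the same K and then recovers the private key using nothing but the two signatures and the two hashes. As before, the helper names and messages are my own inventions, and SHA-256 stands in for Keccak-256.

```python
import hashlib
import secrets

# secp256k1 parameters: y^2 = x^3 + 7 modulo P, with N points and starting point G.
P = 2**256 - 2**32 - 977
N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141
G = (0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
     0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8)


def add(a, b):
    """Add two curve points; None plays the role of the point at infinity."""
    if a is None:
        return b
    if b is None:
        return a
    (x1, y1), (x2, y2) = a, b
    if x1 == x2 and (y1 + y2) % P == 0:
        return None
    if a == b:
        m = 3 * x1 * x1 * pow(2 * y1 % P, -1, P) % P
    else:
        m = (y2 - y1) * pow((x2 - x1) % P, -1, P) % P
    x3 = (m * m - x1 - x2) % P
    return x3, (m * (x1 - x3) - y1) % P


def mul(k, point):
    """Take k steps from point using double-and-add."""
    result = None
    while k:
        if k & 1:
            result = add(result, point)
        point = add(point, point)
        k >>= 1
    return result


def sign(z, d, k):
    """Return (r, s) for hash z, private key d and one-time secret k."""
    r = mul(k, G)[0] % N
    return r, pow(k, -1, N) * (z + r * d) % N


def digest(message: bytes) -> int:
    """Stand-in hash (SHA-256, not ethereum's Keccak-256), reduced modulo N."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % N


d = secrets.randbelow(N - 1) + 1        # the victim's private key
k = secrets.randbelow(N - 1) + 1        # the mistake: one K used for both signatures
z1, z2 = digest(b"first message"), digest(b"second message")
r1, s1 = sign(z1, d, k)
r2, s2 = sign(z2, d, k)

# The attacker only sees (r1, s1, z1) and (r2, s2, z2). Equal r values give the reuse away.
k_found = (z1 - z2) * pow((s1 - s2) % N, -1, N) % N
d_found = (s1 * k_found - z1) * pow(r1, -1, N) % N
print(r1 == r2, d_found == d)           # True True
```

Two subtractions and two modular divisions, and the private key falls out. That’s all it takes once the same K shows up twice.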
I enjoyed taking the time to organize my thoughts on this topic and write this up. I hope it was entertaining and informative for at least a few people out there. Let me know if you found my second “hidden” message in this post.
Sources / Additional Reading:
https://medium.com/@codetractio/inside-an-ethereum-transaction-fa94ffca912f
https://medium.com/mycrypto/the-magic-of-digital-signatures-on-ethereum-98fe184dc9c7
https://github.com/ethereumjs/rlp
https://github.com/ethereumjs/ethereumjs-util
https://web3js.readthedocs.io/en/v1.2.11/web3-eth-accounts.html
Videos:
https://youtu.be/NmM9HA2MQGI (Secret Key Exchange (Diffie-Hellman) – Computerphile)
https://youtu.be/NF1pwjL9-DE (Elliptic Curves – Computerphile)
https://youtu.be/dCvB-mhkT0w (Elliptic Curve Cryptography Overview)
https://youtu.be/QzUThXGRFBU (Elliptic Curve Digital Signature Algorithm ECDSA | Part 10 Cryptography Crashcourse)
https://youtu.be/6FlK3AZTz5k (Elliptic Curves and ECDSA – Bitcoin, Blockchain and Cryptoassets)