Commit ea43784

[mabel] Add more exercises and edits to security hash and encrypt
1 parent b2f8076 commit ea43784

File tree

2 files changed

+221
-48
lines changed

docs/backend-web-development/security-encrypt-decrypt.md

Lines changed: 205 additions & 27 deletions
Original file line numberDiff line numberDiff line change
@@ -1,5 +1,13 @@
11
# Security Encryption/Decryption
22

3+
Cryptology (from the Greek words κρυπτός (kryptos), meaning "hidden", and -λογία (-logia), denoting "study of"; hence the study of hidden writings) is a very broad subject. It is often split into two sections: Cryptography (where γράφειν (graphein) means "writing") and Steganography (where στεγανός (steganos) means "covered" or "protected").
4+
5+
Steganography is the hiding of a message by a physical means.
6+
7+
Cryptography is split into two ways of changing the message systematically to confuse anyone who intercepts it:
8+
9+
The task of the cryptographer is to create a system which is easy to use, both in encryption and decryption, but remains secure against attempts to break it.
10+
311
What we are going to cover:
412

513
- Principles of a good cipher
@@ -18,7 +26,76 @@ A **cipher** is an algorithm for performing encryption or decryption.
1826
Why must we learn this?
1927
We could easily just use whatever industry-standard algorithms are available without really understanding them. However, we would like to understand the algorithms so that we know what we are using, why we are using it, and how to use the algorithms correctly.
2028

21-
There are two major category of encryption/decryption algorithms, symmetric and asymmetric.
29+
## Principles of a good cipher
30+
31+
Shannon's confusion and diffusion are two properties of a secure cipher.
32+
33+
### Confusion
34+
35+
Relationship between ciphertext and key is obscured.
36+
37+
One aim of confusion is to make it very hard to find the key even if one has a large number of plaintext-ciphertext pairs produced with the same key.
38+
39+
This property makes it difficult to find the key from the ciphertext and if a single bit in a key is changed, most or all the bits in the ciphertext will be affected.
40+
41+
### Diffusion
42+
43+
Relationship between ciphertext and plaintext is obscured.
44+
The influence of one plaintext bit is spread over many ciphertext bits.
45+
46+
The statistics of the plaintext are "dissipated" in the statistics of the ciphertext.
47+
48+
## Simple ciphers
49+
50+
### Monoalphabetic Substitution Ciphers
51+
52+
A substitution cipher works by replacing each letter of the plaintext with another letter.
53+
54+
![substitution cipher](https://crypto.interactive-maths.com/uploads/1/1/3/4/11345755/4433929_orig.jpg)
55+
56+
For example, if each letter is encrypted as the next letter in the alphabet, "a simple message" becomes "B TJNQMF NFTTBHF". Such ciphers were used for a long time but are now very easy to break. They played a big part in the development of cryptography.
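
A minimal sketch of a monoalphabetic substitution may help make this concrete. The helper name `substitute` and the `nextLetterAlphabet` string are assumptions made for illustration; they are not part of the course material.

```js
// Hypothetical helper for illustration: encrypts with a monoalphabetic
// substitution, where cipherAlphabet[i] replaces the i-th letter of a-z.
const PLAIN_ALPHABET = "abcdefghijklmnopqrstuvwxyz";

function substitute(plaintext, cipherAlphabet) {
  return plaintext
    .toLowerCase()
    .split("")
    .map((ch) => {
      const index = PLAIN_ALPHABET.indexOf(ch);
      // Characters outside a-z (spaces, punctuation) are left unchanged.
      return index === -1 ? ch : cipherAlphabet[index].toUpperCase();
    })
    .join("");
}

// "Next letter" cipher alphabet: a -> b, b -> c, ..., z -> a.
const nextLetterAlphabet = "bcdefghijklmnopqrstuvwxyza";
console.log(substitute("a simple message", nextLetterAlphabet));
// prints "B TJNQMF NFTTBHF"
```

Any other permutation of the 26 letters can be passed as `cipherAlphabet`; the "next letter" alphabet is just the simplest case.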
57+
58+
### Caesar Shift Cipher
59+
60+
<iframe width="560" height="315" src="https://www.youtube-nocookie.com/embed/sMOZf4GN3oc?controls=0" frameborder="0" allow="accelerometer; autoplay; encrypted-media; gyroscope; picture-in-picture" allowfullscreen></iframe>
61+
62+
### Exercises
63+
64+
Let's code our own caesar shift cipher!
65+
66+
Julius Caesar protected his confidential information by encrypting it using a cipher. Caesar's cipher shifts each letter by a number of letters. If the shift takes you past the end of the alphabet, just rotate back to the front of the alphabet. In the case of a rotation by 3, w, x, y and z would map to z, a, b and c.
67+
68+
abcdefghijklmnopqrstuvwxyz => defghijklmnopqrstuvwxyzabc
69+
70+
Create two functions, `caesarCipher` and `decryptCaesarCipher`:
71+
72+
```js
73+
expect(caesarCipher("apple", 3)).toBe("dssoh");
74+
expect(decryptCaesarCipher("dssoh", 3)).toBe("apple");
75+
expect(caesarCipher("abcde-fghij", 3)).toBe("defgh-ijklm"); // non-alphabetic characters like `-` should be left unchanged
76+
```
77+
78+
## Simple attacks
79+
80+
### Ciphertext-only attack
81+
82+
A ciphertext-only attack tries to figure out the plaintext or, better still, the key from the ciphertexts alone.
83+
84+
#### Frequency analysis
85+
86+
Collecting many ciphertexts allows us to do frequency analysis to figure out the plaintexts.
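
As a rough sketch of the idea, the snippet below counts letters in a ciphertext and guesses the shift of a Caesar-style cipher. It assumes the plaintext is English and that "e" is its most common letter, which only holds reliably for longer texts. The function names and the sample text are made up for illustration.

```js
// Count how often each letter a-z appears in a ciphertext.
function letterFrequencies(ciphertext) {
  const counts = {};
  for (const ch of ciphertext.toLowerCase()) {
    if (ch >= "a" && ch <= "z") {
      counts[ch] = (counts[ch] || 0) + 1;
    }
  }
  return counts;
}

// Assumption: the plaintext is English, so its most common letter is "e".
// Under that assumption, guess the shift of a Caesar-style cipher.
function guessShift(ciphertext) {
  const counts = letterFrequencies(ciphertext);
  const mostCommon = Object.keys(counts).reduce((a, b) =>
    counts[b] > counts[a] ? b : a
  );
  return (mostCommon.charCodeAt(0) - "e".charCodeAt(0) + 26) % 26;
}

// "meet me here at seven" with every letter shifted by 3.
const sample = "phhw ph khuh dw vhyhq";
console.log(letterFrequencies(sample).h); // prints 7
console.log(guessShift(sample)); // prints 3
```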
87+
88+
### Known-plaintext attack
89+
90+
Otherwise we can break the encryption using a **known-plaintext attack**.
91+
92+
Knowing both the plaintext and ciphertext, an analyst will be able to figure out the key or keys used.
93+
94+
It could be a brute-force method: try all keys, decrypt the ciphertext and see if it matches the plaintext. Given enough time, this works for every cipher and will give you the matching key.
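
For a cipher with a tiny key space, such as a letter shift with only 26 keys, the brute-force search can be sketched in a few lines. The helpers `shiftLetters` and `bruteForceKey` are hypothetical names, and this sketch handles lowercase letters only.

```js
// Hypothetical helper: shift every letter a-z forward by `shift` positions.
function shiftLetters(text, shift) {
  return text.replace(/[a-z]/g, (ch) =>
    String.fromCharCode(((ch.charCodeAt(0) - 97 + shift) % 26) + 97)
  );
}

// Known-plaintext brute force: try every key, decrypt the known ciphertext
// and check whether the result matches the known plaintext.
function bruteForceKey(knownPlaintext, knownCiphertext) {
  for (let key = 0; key < 26; key++) {
    // Decrypting with `key` is the same as shifting forward by 26 - key.
    if (shiftLetters(knownCiphertext, 26 - key) === knownPlaintext) {
      return key;
    }
  }
  return null; // no key in the key space matches
}

console.log(bruteForceKey("apple", "dssoh")); // prints 3
```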
95+
96+
For a modern cipher like AES (with key sizes of 128 bits or more), the key space is so large that checking a significant portion of all keys would take far longer than is practical (effectively until the end of time).
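
A back-of-the-envelope calculation shows the scale. The rate of 10^12 keys per second is an assumption chosen for illustration, not a measured figure.

```js
// Assumption for illustration: an attacker who can test 10^12 keys per second.
const keysPerSecond = 10n ** 12n;
const secondsPerYear = 31557600n; // 365.25 days

// Number of possible keys for a given key length in bits.
function keySpace(bits) {
  return 2n ** BigInt(bits);
}

// Years needed to try every key (integer division, so this rounds down).
function yearsToExhaust(bits) {
  return keySpace(bits) / keysPerSecond / secondsPerYear;
}

console.log(keySpace(128).toString());
// prints "340282366920938463463374607431768211456"
console.log(yearsToExhaust(128).toString()); // on the order of 10^19 years
```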
97+
98+
There are two major categories of encryption/decryption algorithms involving keys: symmetric and asymmetric.
2299

23100
## Symmetric Encryption / Decryption
24101

@@ -29,6 +106,78 @@ Symmetric encryption/decryption, where one secret key is used for both encryptio
29106

30107
A secret key is just a number. AES can work with keys of three different sizes: 128 bits, 192 bits, and 256 bits. When we say AES-128 or AES-256, we are referring to the size of the key; AES itself always operates on 128-bit blocks.
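
The following sketch, using Node's built-in `crypto` module, illustrates that the key size changes while the block size stays at 16 bytes (128 bits). The helper name `encryptOneBlock` and the sample input are assumptions made for this example.

```js
const crypto = require("crypto");

// Encrypt exactly one 16-byte block with each AES key size.
function encryptOneBlock(keyBits) {
  const key = crypto.randomBytes(keyBits / 8); // 16, 24 or 32 bytes
  const iv = crypto.randomBytes(16); // the IV always matches the 128-bit block size
  const cipher = crypto.createCipheriv(`aes-${keyBits}-cbc`, key, iv);
  cipher.setAutoPadding(false); // the input is already exactly one block
  const block = Buffer.from("0123456789abcdef"); // 16 bytes
  return Buffer.concat([cipher.update(block), cipher.final()]);
}

for (const keyBits of [128, 192, 256]) {
  // The ciphertext is 16 bytes long for every key size.
  console.log(keyBits, encryptOneBlock(keyBits).length);
}
```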
31108

109+
Real-life uses of AES include [Encryption at Rest for databases](https://docs.mongodb.com/manual/core/security-encryption-at-rest/).
110+
111+
### AES
112+
113+
[Stickman AES Explanation](http://www.moserware.com/2009/09/stick-figure-guide-to-advanced.html)
114+
115+
### Initialization Vector (IV)
116+
117+
Having a unique IV per encrypted file / data is crucial.
118+
119+
The IV adds randomness to the start of your encryption process. When using cipher block chaining (CBC) mode (where each block of encrypted data incorporates the prior block of encrypted data), we're left with a problem regarding the first block, which is where the IV comes in.
120+
121+
If you had no IV, and used cipher block chaining (CBC) with just your key, two files that begin with identical text would produce identical first blocks. If the input files diverged midway through, then the two encrypted files would begin to look different from that point through to the end of the encrypted file.
122+
123+
If someone noticed the similarity at the beginning, and knew what one of the files began with, they could deduce what the other file began with. Knowing what the plaintext began with and what its corresponding ciphertext is gives that person a known plaintext-ciphertext pair, which against a weak cipher could help them determine the key and then decrypt the entire file.
124+
125+
Now let's add the IV. If each file used a random IV, their first block would be different. The above scenario has been thwarted.
126+
127+
Now what if the IV were the same for each file? Well, we have the same problem scenario again. The first block of each file will encrypt to the same result. Practically, this is no different from not using the IV at all.
128+
129+
Therefore you require a unique IV.
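
The scenario above can be reproduced with Node's built-in `crypto` module. This is a minimal sketch; the two "files" and the helper `firstBlock` are made up for illustration.

```js
const crypto = require("crypto");

const key = crypto.randomBytes(32);

// Encrypt with AES-256-CBC and return the first 16-byte ciphertext block as hex.
function firstBlock(plaintext, iv) {
  const cipher = crypto.createCipheriv("aes-256-cbc", key, iv);
  const encrypted = Buffer.concat([cipher.update(plaintext, "utf8"), cipher.final()]);
  return encrypted.subarray(0, 16).toString("hex");
}

// Two files that share their first 16 bytes ("Dear customer, y").
const fileA = "Dear customer, your balance is 100";
const fileB = "Dear customer, your balance is 999";

// Same IV for both files: the identical opening text leaks through.
const reusedIv = crypto.randomBytes(16);
const sameIvLeaks = firstBlock(fileA, reusedIv) === firstBlock(fileB, reusedIv);

// Fresh random IV per file: the first blocks no longer match
// (barring a negligible chance of drawing the same IV twice).
const freshIvLeaks =
  firstBlock(fileA, crypto.randomBytes(16)) === firstBlock(fileB, crypto.randomBytes(16));

console.log(sameIvLeaks, freshIvLeaks); // prints true false
```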
130+
131+
Answer adapted from https://stackoverflow.com/questions/9049789/aes-encryption-key-versus-iv
132+
133+
### Electronic Code Book (ECB) vs Cipher Block Chaining (CBC)
134+
135+
The disadvantage of ECB is its lack of diffusion: it encrypts identical plaintext blocks into identical ciphertext blocks, so it does not hide data patterns well.
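
A short sketch with Node's built-in `crypto` module makes the difference visible. The plaintext is an invented example consisting of the same 16-byte block twice, and `encryptBlocks` is a hypothetical helper.

```js
const crypto = require("crypto");

const key = crypto.randomBytes(16);
// Two identical 16-byte plaintext blocks.
const plaintext = Buffer.from("ATTACK AT DAWN!!ATTACK AT DAWN!!");

// Encrypt and return the ciphertext split into 16-byte blocks (hex strings).
function encryptBlocks(algorithm, iv) {
  const cipher = crypto.createCipheriv(algorithm, key, iv);
  const encrypted = Buffer.concat([cipher.update(plaintext), cipher.final()]);
  const blocks = [];
  for (let i = 0; i < encrypted.length; i += 16) {
    blocks.push(encrypted.subarray(i, i + 16).toString("hex"));
  }
  return blocks;
}

const ecb = encryptBlocks("aes-128-ecb", null); // ECB takes no IV
const cbc = encryptBlocks("aes-128-cbc", crypto.randomBytes(16));

console.log(ecb[0] === ecb[1]); // prints true: the repetition is visible
console.log(cbc[0] === cbc[1]); // prints false (with overwhelming probability)
```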
136+
137+
For a visual explanation of why ECB mode is not suitable for encryption, see [ECB vs CBC](https://pthree.org/2012/02/17/ecb-vs-cbc-encryption/).
138+
139+
Zoom was using an [AES-128 key in ECB mode](https://www.zdnet.com/article/zoom-concedes-custom-encryption-is-sub-standard-as-citizen-lab-pokes-holes-in-it/).
140+
141+
![ECB encryption](http://upload.wikimedia.org/wikipedia/commons/c/c4/Ecb_encryption.png)
142+
Image from Wikipedia.
143+
144+
![CBC encryption](https://upload.wikimedia.org/wikipedia/commons/thumb/8/80/CBC_encryption.svg/1202px-CBC_encryption.svg.png)
145+
Image from Wikipedia.
146+
147+
## Exercises
148+
149+
Try encrypting and decrypting with AES. Code adapted from https://hackthestuff.com/article/how-to-encrypt-and-decrypt-data-in-node-js-using-crypto
150+
151+
```js
152+
const crypto = require("crypto");
153+
154+
// The aes-256-cbc algorithm requires a 256-bit (32-byte) key and a 128-bit (16-byte) IV (initialisation vector).
155+
156+
// Node.js encryption example with CBC
158+
const algorithm = "aes-256-cbc";
159+
const key = crypto.randomBytes(32);
160+
// A fresh random IV is generated inside encrypt(), so every message gets a unique IV.
161+
162+
function encrypt(text) {
163+
const iv = crypto.randomBytes(16);
let cipher = crypto.createCipheriv(algorithm, Buffer.from(key), iv);
164+
let encrypted = cipher.update(text);
165+
encrypted = Buffer.concat([encrypted, cipher.final()]);
166+
return { iv: iv.toString("hex"), encryptedData: encrypted.toString("hex") };
167+
}
168+
169+
function decrypt(text) {
170+
let iv = Buffer.from(text.iv, "hex");
171+
let encryptedText = Buffer.from(text.encryptedData, "hex");
172+
let decipher = crypto.createDecipheriv("aes-256-cbc", Buffer.from(key), iv);
173+
let decrypted = decipher.update(encryptedText);
174+
decrypted = Buffer.concat([decrypted, decipher.final()]);
175+
return decrypted.toString();
176+
}
177+
```
178+
179+
## Problems with purely using symmetric encryption
180+
32181
The main problem with symmetric encryption is, how can the sender and receiver agree on a key securely on the Internet? If we send the key through email, it might be intercepted by a third-party. We cannot send a key over a public channel.
33182

34183
We shall try to solve this problem with asymmetric encryption.
@@ -57,24 +206,62 @@ If Amy wants to prove that Bob is the one who sent the message, Bob needs to sen
57206

58207
### Public key cryptography with digital signatures
59208

60-
The digital signature along with the message is then encrypted with the public key of the recipient.
61-
62-
See [an example using OpenSSL](https://pagefault.blog/2019/04/22/how-to-sign-and-verify-using-openssl/).
209+
The digital signature created by Bob, along with the message, is then encrypted with the public key of the recipient (Amy). Amy can then decrypt the digital signature and message with her private key.
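
The signing and verifying part can be sketched with Node's built-in `crypto` module. This sketch generates Bob's key pair on the spot and leaves out the final step of encrypting the message and signature for Amy; the message text is invented.

```js
const crypto = require("crypto");

// Bob's key pair (generated here only for the sake of the example).
const bob = crypto.generateKeyPairSync("rsa", { modulusLength: 2048 });

const message = Buffer.from("Meet at noon. - Bob");

// Bob signs a SHA-256 digest of the message with his private key.
const signature = crypto.sign("sha256", message, bob.privateKey);

// Amy verifies the signature with Bob's public key.
const isGenuine = crypto.verify("sha256", message, bob.publicKey, signature);

// Any change to the message makes verification fail.
const tampered = Buffer.from("Meet at nine. - Bob");
const isTamperedAccepted = crypto.verify("sha256", tampered, bob.publicKey, signature);

console.log(isGenuine, isTamperedAccepted); // prints true false
```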
63210

64211
### RSA
65212

66-
RSA multiplies two large prime numbers.
213+
- RSA stands for Rivest, Shamir, Adleman
214+
- widely used; one of the first public-key algorithms, published in 1977
215+
- can be used for both public key cryptography and digital signatures
216+
- security is based on an assumption: factoring a very large integer is hard
217+
218+
RSA multiplies two large prime numbers for its algorithm.
67219

68220
How does one generate large prime numbers? Pick a very large random number and test it for primality. If that number fails the test, add 1 and test again.
69221

70222
We take these two random prime numbers (p and q) and then multiply them together to create a modulus (N). The value of N is then part of the public and the private key. For RSA-2048 we use two 1024-bit prime numbers, and RSA-4096 uses two 2048-bit prime numbers.
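
To see how p, q and N fit together, here is a toy sketch with deliberately tiny primes. The values p = 61, q = 53, e = 17 and d = 2753 are a common textbook example chosen for illustration; they are far too small to be secure, and real RSA also pads the message before encrypting.

```js
// Modular exponentiation by repeated squaring: (base ** exponent) % modulus.
function modPow(base, exponent, modulus) {
  let result = 1n;
  base %= modulus;
  while (exponent > 0n) {
    if (exponent & 1n) result = (result * base) % modulus;
    base = (base * base) % modulus;
    exponent >>= 1n;
  }
  return result;
}

// Toy values only: real RSA-2048 primes are 1024 bits each.
const p = 61n;
const q = 53n;
const N = p * q; // 3233, the modulus shared by both keys
const e = 17n; // public exponent
const d = 2753n; // private exponent: (e * d) % ((p - 1) * (q - 1)) === 1

const message = 65n;
const ciphertext = modPow(message, e, N); // encrypt with the public key (e, N)
const decrypted = modPow(ciphertext, d, N); // decrypt with the private key (d, N)

console.log(N, ciphertext, decrypted); // prints 3233n 2790n 65n
```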
71223

224+
![RSA algorithm](https://i.ytimg.com/vi/-jSX9fNJiN8/maxresdefault.jpg)
225+
72226
Multiplication is easy, but the difficult part is factoring the product of those two primes to get back the original two primes.
73227

74228
Factoring N back into the two prime numbers that make it up will take a very long time. However, generating and checking those two primes is relatively easy.
75229

76230
RSA-2048 is used commonly within digital certificates and for TLS.
77231

232+
### Digital certificates
233+
234+
<iframe width="560" height="315" src="https://www.youtube-nocookie.com/embed/heacxYUnFHA?controls=0" frameborder="0" allow="accelerometer; autoplay; encrypted-media; gyroscope; picture-in-picture" allowfullscreen></iframe>
235+
236+
## Exercises
237+
238+
### Sign and verify with RSA using Node RSA
239+
240+
1. Download Node RSA.
241+
https://www.npmjs.com/package/node-rsa
242+
243+
2. Using the instructions on GitHub, use Node RSA to generate a public / private key pair.
244+
245+
3. Encrypt using the public key and decrypt using the private key for the following paragraph:
246+
247+
```
248+
SINGAPORE: The number of COVID-19 cases in Singapore crossed the 13,000 mark on Sunday (Apr 26), after another 931 cases were confirmed as of noon.
249+
250+
The vast majority of the latest cases are work permit holders residing in foreign worker dormitories, the Ministry of Health (MOH) said in its preliminary release of figures.
251+
252+
Fifteen cases are Singaporeans or permanent residents, added MOH.
253+
254+
The new cases bring the national total to 13,624.
255+
```
256+
257+
4. Export your private and public keys and share your public keys with the class.
258+
5. Try to modify your code to read a private key from a key.pem file and a public key from a key.pub file, rather than generating a pair each time.
259+
260+
## Problems with purely using asymmetric cryptography
261+
262+
- It is very slow compared with symmetric encryption.
263+
- The amount of data that can be encrypted in a single operation is very limited.
264+
78265
## Asymmetric cryptography with symmetric key
79266

80267
Due to the problems of each type of cryptography, a symmetric key (session key) is often used together with asymmetric public / private key cryptography.
@@ -85,37 +272,28 @@ Bob can now decrypt the session key using his private key. Now he has the sessio
85272

86273
Since asymmetric cryptography is used to encrypt the session key and not the entire plain text, it would take less time.
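
The hybrid approach can be sketched with Node's built-in `crypto` module. This is an illustrative outline under simplifying assumptions (Bob's key pair is generated in place, the IV is passed along in the clear, and there is no integrity check), not a complete protocol.

```js
const crypto = require("crypto");

// Bob's RSA key pair (generated here only for the sake of the example).
const bob = crypto.generateKeyPairSync("rsa", { modulusLength: 2048 });

// --- Sender side ---
const plaintext = "A long message ".repeat(1000); // far larger than RSA alone can encrypt
const sessionKey = crypto.randomBytes(32); // symmetric AES-256 session key
const iv = crypto.randomBytes(16);

const cipher = crypto.createCipheriv("aes-256-cbc", sessionKey, iv);
const encryptedData = Buffer.concat([cipher.update(plaintext, "utf8"), cipher.final()]);

// Only the small session key is encrypted with Bob's public key.
const encryptedKey = crypto.publicEncrypt(bob.publicKey, sessionKey);

// --- Receiver side (Bob) ---
const recoveredKey = crypto.privateDecrypt(bob.privateKey, encryptedKey);
const decipher = crypto.createDecipheriv("aes-256-cbc", recoveredKey, iv);
const decrypted = Buffer.concat([
  decipher.update(encryptedData),
  decipher.final(),
]).toString("utf8");

console.log(decrypted === plaintext); // prints true
```

The slow RSA operation runs once on 32 bytes, while the fast AES cipher handles the 15,000-character message.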
87274

88-
## Principles of a good cipher
275+
Protocols like HTTPS are very fast and can be used to encrypt very large streams of data because they only use RSA to exchange a symmetric AES key (a shared secret) and then continue the session with that shared secret.
89276

90-
Shannon's confusion and diffusion are two properties of a secure cipher.
277+
### HTTPS over SSL/TLS Certificates
91278

92-
### Confusion
279+
Hypertext Transfer Protocol Secure (HTTPS) is an extension of the HTTP for secure communication over a computer network, and is widely used on the Internet. In HTTPS, the communication protocol is encrypted using Transport Layer Security (TLS), or formerly, its predecessor, Secure Sockets Layer (SSL). The protocol is therefore also often referred to as HTTP over TLS, or HTTP over SSL.
93280

94-
Relationship between ciphertext and key is obscured.
281+
When you visit a website, you may notice a green lock icon on the URL address bar. That indicates the website has an SSL/TLS certificate to prove its identity.
95282

96-
One aim of confusion is to make it very hard to find the key even if one has a large number of plaintext-ciphertext pairs produced with the same key
283+
Each certificate is tied to a private key and a public key. The web server hosting the website holds the private key, and browsers download the certificate, which contains the public key.
97284

98-
This property makes it difficult to find the key from the ciphertext and if a single bit in a key is changed, most or all the bits in the ciphertext will be affected.
285+
Then the browser can send a challenge (**nonce**) to the web server. The challenge is basically some random number encrypted with the public key of the certificate. If the web server holds the corresponding private key, it should be able to decrypt the challenge and return the correct random number to the browser.
99286

100-
### Diffusion
287+
Upon receiving the correct random number sent in the challenge, the browser can confirm the web server holds the correct private key, hence its identity (as declared in the certificate) can be trusted.
101288

102-
Relationship between ciphertext and plaintext is obscured.
103-
The influence of one plaintext bit is spread over many ciphertext bits.
104-
105-
## Simple attacks
106-
107-
### Ciphertext-only attack
108-
109-
We can use ciphertext-only attacks if we can figure out the plaintext or better still, the key from the ciphertexts.
289+
This random number is then used as a secret to encrypt/decrypt the HTTP requests/responses between the browser and the server.
110290

111-
#### Frequency analysis
291+
SSL and TLS simply refer to this handshake that takes place between a client and a server. The handshake doesn’t actually do any encryption of the actual data itself, it just agrees on a shared secret and type of encryption that is going to be used.
112292

113-
Collecting many ciphertexts allow us to do frequency analysis to figure out the plaintexts.
293+
<iframe width="560" height="315" src="https://www.youtube-nocookie.com/embed/4nGrOpo0Cuc?controls=0" frameborder="0" allow="accelerometer; autoplay; encrypted-media; gyroscope; picture-in-picture" allowfullscreen></iframe>
114294

115-
### Known-plaintext attack
295+
![TLS 1.2 Handshake](https://commons.wikimedia.org/wiki/File:Full_TLS_1.2_Handshake.svg)
116296

117-
Otherwise we can break the encryption using **known-plaintext attack**.
297+
Image from Wikipedia.
118298

119-
It could be a brute force method: Try all keys, decrypt the ciphertext and see if it matches the plaintext. This always works for every cipher and will give you the matching key.
120-
121-
For every modern cipher like AES (with key sizes of 128 bit or more) the key space is so large that you need much more time (until the end of the time) to check a significant portion of all keys.
299+
![TLS](https://www.imperva.com/wp-content/uploads/sites/13/2020/03/diagram-52@3x.png)

docs/backend-web-development/security-hash-others.md

Lines changed: 16 additions & 21 deletions
Original file line numberDiff line numberDiff line change
@@ -32,6 +32,16 @@ Common cryptographic hash functions including:
3232

3333
(From: https://hackernoon.com/cryptographic-hashing-c25da23609c3)
3434

35+
### MD5 failure
36+
37+
> One basic requirement of any cryptographic hash function is that it should be computationally infeasible to find two distinct messages that hash to the same value. MD5 fails this requirement catastrophically; such collisions can be found in seconds on an ordinary home computer.
38+
39+
In 2004, it was shown that MD5 is not collision-resistant. Thus MD5 is not suitable for applications like SSL certificates or digital signatures that rely on this property for digital security.
40+
41+
### SHA-2 Hashing
42+
43+
<iframe width="560" height="315" src="https://www.youtube-nocookie.com/embed/DMtFhACPnTY?controls=0" frameborder="0" allow="accelerometer; autoplay; encrypted-media; gyroscope; picture-in-picture" allowfullscreen></iframe>
44+
3545
### Protecting Passwords
3646

3747
Websites store their user passwords in their databases. For security reasons, those passwords are not stored in plaintext. Instead, the values stored in the databases are actually the hashes of the original passwords, generated using cryptographic hash functions like bcrypt.
@@ -65,32 +75,17 @@ We have three aims:
6575

6676
- Non-repudiation: If the recipient passes the message and the proof to a third party, can the third party be confident that the message originated from the sender?
6777

68-
| Security | Hash | MAC | Digital |
69-
| --------------- | ---- | --------- | ---------- |
70-
| Integrity | Yes | Yes | Yes |
71-
| Authentication | No | Yes | Yes |
72-
| Non-repudiation | No | No | Yes |
73-
| Kind of keys | none | symmetric | asymmetric |
78+
| Security | Hash | MAC | Digital signature |
79+
| --------------- | ---- | --------- | ----------------- |
80+
| Integrity | Yes | Yes | Yes |
81+
| Authentication | No | Yes | Yes |
82+
| Non-repudiation | No | No | Yes |
83+
| Kind of keys | none | symmetric | asymmetric |
7484

7585
With a MAC, the receiver can forge any message from the sender because the key is shared, which is why a MAC does not provide non-repudiation.
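
A small sketch using HMAC-SHA256 from Node's built-in `crypto` module illustrates the point. HMAC is one common MAC construction, picked here as an assumption; the messages and helper names are invented.

```js
const crypto = require("crypto");

// The sender and the receiver share the same secret key.
const sharedKey = crypto.randomBytes(32);

// Compute an HMAC-SHA256 tag for a message.
function mac(message) {
  return crypto.createHmac("sha256", sharedKey).update(message).digest();
}

// Recompute the tag and compare it in constant time.
function verify(message, tag) {
  return crypto.timingSafeEqual(mac(message), tag);
}

// Sender: sends the message together with its tag.
const message = "Pay Amy $10";
const tag = mac(message);
console.log(verify(message, tag)); // prints true: integrity and authentication

// The receiver holds the same key, so it can produce an equally valid tag
// for a message the sender never wrote. A third party cannot tell them apart.
const forgedMessage = "Pay Amy $1000";
const forgedTag = mac(forgedMessage);
console.log(verify(forgedMessage, forgedTag)); // prints true
```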
7686

7787
## Other Use Cases of Cryptography
7888

79-
### SSL Certificates
80-
81-
When you visit a website, you may notice a green lock icon on the URL address bar. That indicates the website has a certificate to prove its identity.
82-
83-
Each certificate has a private key and public key. The web server hosting the website holds the private key, and the browsers download the certificates which contains the public key.
84-
85-
Then the browser can send a challenge (**nonce**) to the web server. The challenge is basically some random number encrypted with the public key of the certificate. If the web server holds the corresponding private key, it should be able to decrypt the challenge and return the correct random number to the browser.
86-
87-
Upon receiving the correct random number sent in the challenge, the browser can confirm the web server holds the correct private key, hence its identify (as declared in the certificate) can be trusted.
88-
89-
### HTTPS
90-
91-
Hypertext Transfer Protocol Secure (HTTPS) is an extension of the HTTP for secure communication over a computer network, and is widely used on the Internet. In HTTPS, the communication protocol is encrypted using Transport Layer Security (TLS), or formerly, its predecessor, Secure Sockets Layer (SSL). The protocol is therefore also often referred to as HTTP over TLS, or HTTP over SSL.
92-
A website that supports HTTPS uses the public/private key associated with its SSL certificate to create a random number and share it with the browser. That random number is used as a secret to encrypt/decrypt the HTTP requests/responses between the browser and the server.
93-
9489
### Blockchain
9590

9691
Blockchain is a [distributed ledger](https://medium.com/@vijay.betigiri/blockchain-explained-like-im-5-yrs-5f04b91b059c) that ensures the information recorded in the system is never modified. This is achieved by using some of the encryption/decryption algorithms such as RSA.
