Cryptography Basics

Cryptography is a word derived from the Greek word kryptos (“hidden”), and the use of cryptography pre-dates the computer age by thousands of years. In fact, the history of cryptography stretches back over 4,000 years, to its first alleged use in Egypt. Keeping secrets has long been a concern of human beings, and the purpose of cryptography is to hide information or change it so that it is incomprehensible to people for whom it is not intended. Cryptographic techniques include:

  • Encryption: Involves applying a procedure called an algorithm to plaintext to turn it into something that will appear to be gibberish to anyone who doesn’t have the key to decrypt it (see the sketch after this list).
  • Steganography: A means of hiding the existence of the data, not just its contents. This is usually done by concealing it within other, innocuous data.
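
To make the encryption bullet above concrete, here is a minimal sketch of symmetric encryption and decryption. It assumes the third-party Python cryptography package; its Fernet recipe is just one convenient choice, not the only way to do this.

```python
# Minimal symmetric encryption/decryption sketch using the third-party
# "cryptography" package (pip install cryptography). Fernet wraps AES
# encryption together with an integrity check.
from cryptography.fernet import Fernet

key = Fernet.generate_key()      # the secret key; whoever holds it can decrypt
cipher = Fernet(key)

token = cipher.encrypt(b"attack at dawn")   # ciphertext looks like gibberish
print(token)
print(cipher.decrypt(token))                # b'attack at dawn'
```

Anyone who intercepts the token but not the key sees only unreadable ciphertext, which is exactly the confidentiality goal described below.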

The first goal of cryptography is confidentiality. Through the use of cryptography, users are able to ensure that only an intended recipient can “unlock” (decrypt) an encrypted message. Most modern algorithms are secure enough that those without access to the message “key” cannot read the message. Thus, it is extremely important to keep the secret key or private key completely secret. If a secret or private key is compromised, the message essentially loses all confidentiality.


Guaranteeing message integrity is another important aspect of cryptography. Most asymmetric algorithms have built-in ways to validate that the output received is equivalent to the original input. Usually, this validation is referred to as a message digest, which on occasion can be vulnerable to man-in-the-middle (MITM) attacks.

Digital signatures serve to enforce data integrity and non-repudiation. A digital signature ensures that the message received was the message sent, because a hash was performed on the original message using a hashing algorithm. The hash value created by this process is encrypted by the author’s private key and appended to the message. To verify that the message has not been modified, the recipient uses the author’s public key to decrypt the hash created by the author. The recipient also creates a hash of the message body. If the recipient’s hash matches the hash created by the author of the message, the recipient knows that the message is unaltered.
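
A minimal sketch of this sign-and-verify flow is shown below, using RSA with PSS padding from the third-party Python cryptography package. The specific key size, padding, and hash choices are illustrative assumptions, not something mandated by the text.

```python
# Hash-and-sign with the author's private key, verify with the public key.
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric import rsa, padding

private_key = rsa.generate_private_key(public_exponent=65537, key_size=2048)
public_key = private_key.public_key()

message = b"wire $100 to account 12345"

# Author: the message is hashed (SHA-256 here) and the digest is signed
# with the private key; the signature is appended to the message.
signature = private_key.sign(
    message,
    padding.PSS(mgf=padding.MGF1(hashes.SHA256()), salt_length=padding.PSS.MAX_LENGTH),
    hashes.SHA256(),
)

# Recipient: re-hash the received message and check it against the signature
# using the author's public key. verify() raises InvalidSignature if either
# the message or the signature was altered in transit.
public_key.verify(
    signature,
    message,
    padding.PSS(mgf=padding.MGF1(hashes.SHA256()), salt_length=padding.PSS.MAX_LENGTH),
    hashes.SHA256(),
)
print("signature verified")
```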

Some types of asymmetric algorithms are immune to MITM attacks, which are only successful the first time two people try to communicate. In a MITM attack, a third party intercepts the communications between the two parties and uses his own credentials to impersonate each of the original communicators.

Beware of the key exchange mechanism used by any PKE system. If the key exchange protocol does not authenticate at least one and preferably both sides of the connection, it may be vulnerable to MITM-type attacks. Authentication systems generally use some form of digital certificates and require a public key infrastructure (PKI).

Also, note that MITM-based attacks can only occur during the initial correspondence between two parties. If their first key exchange goes unimpeded, then each party will authenticate the other’s key against prior communications to verify the sender’s identity.

Because there isn’t any authentication built into the Diffie-Hellman algorithm, implementations that use Diffie-Hellman-type key exchanges without some sort of authentication are vulnerable to MITM attacks. Since the protocol itself does not authenticate the client or the server, it’s possible for someone to eavesdrop on the communications. This deficiency was one of the main reasons that the SSH-2 protocol was completely redeveloped from SSH-1. The SSH-2 protocol authenticates both the client and the server, and warns of or prevents any possible MITM attacks, depending on configuration, so long as the client and server have communicated at least once. However, even SSH-2 is vulnerable to MITM attacks prior to the first key exchange between the client and the server.
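
To see why raw Diffie-Hellman is exposed, here is a toy, unauthenticated exchange in Python. The parameters are textbook-sized assumptions, far too small for real use; the point is only that neither side proves its identity, so an attacker who substitutes its own values is indistinguishable from the intended peer.

```python
# Toy unauthenticated Diffie-Hellman exchange (illustration only).
import secrets

p, g = 23, 5                        # public parameters; real use needs 2048+ bit groups

a = secrets.randbelow(p - 2) + 1    # Alice's secret exponent
b = secrets.randbelow(p - 2) + 1    # Bob's secret exponent

A = pow(g, a, p)                    # Alice sends A; nothing proves it came from Alice
B = pow(g, b, p)                    # Bob sends B; a MITM could swap in its own value here

print(pow(B, a, p) == pow(A, b, p)) # True: both sides derive the same shared secret
```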


External Links:

Cryptography at Wikipedia

Hash Functions

Introduction to Hash Functions

Hashing is a technique in which an algorithm (also called a hash function) is applied to a portion of data to create a unique digital “fingerprint” of a fixed size. If anyone changes the data by so much as one binary digit, the hash function will produce a different output (called the hash value or message digest), and the recipient will know that the data has been changed. Hashing can thus ensure integrity and provide authentication. A hash function cannot be reverse-engineered: in other words, you cannot use the hash value to discover the original data that was hashed. All of the encryption algorithms studied so far, both symmetric and asymmetric, are reversible, but there is no reverse function for hashing, so the original material cannot be recovered. For this reason, hashing algorithms are commonly referred to as one-way hashing functions. A good hash function will not return the same result from two different inputs (returning the same result from two different inputs is known as a collision), and the output space of the function should be large enough to make a collision extremely unlikely. Irreversible though they are, hashing techniques are useful for determining data integrity and authentication: it is easy to generate a hash value from input data and easy to verify that the data matches the hash, but hard to fake a hash value to hide malicious data. This is the principle behind the data-validation scheme used by Pretty Good Privacy (PGP).
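
A quick sketch of these properties using Python's standard hashlib module; SHA-256 is just one example of a suitable hash function.

```python
import hashlib

# Changing a single character of the input produces a completely different,
# fixed-size digest (the one-way "fingerprint" behavior described above).
print(hashlib.sha256(b"transfer $100 to Alice").hexdigest())
print(hashlib.sha256(b"transfer $900 to Alice").hexdigest())

# The digest length stays the same no matter how large the input is.
print(hashlib.sha256(b"x" * 1_000_000).digest_size)   # 32 bytes (256 bits)
```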


Sometimes it is not necessary or even desirable to encrypt a complete set of data. Suppose someone wants to transmit a large amount of data, such as a CD image. If the data on the CD is not sensitive, they may not care that it is openly transmitted, but when the transfer is complete, they want to make sure the received image is identical to the original image. The easiest way to make this comparison is to calculate a hash value on both images and compare the results. If there is a discrepancy of even a single bit, the hash values will be radically different. Provided a suitable hashing function is used, it is extremely unlikely that two different inputs will produce an identical output, or collision. The hashes created, usually referred to as digital fingerprints, are of a small, easily readable fixed size. Sometimes these hashes are referred to as secure checksums, because they perform a similar function to normal checksums but are inherently more resistant to tampering.
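
Here is a sketch of that comparison in Python, hashing a large file in chunks so the whole image never has to fit in memory. The file names are placeholders, and SHA-256 is an assumed choice of hash.

```python
import hashlib

def file_fingerprint(path: str, chunk_size: int = 1 << 20) -> str:
    """Compute a SHA-256 'secure checksum' of a file, one chunk at a time."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

# The sender publishes the fingerprint of the original image; the recipient
# recomputes it locally and compares. Any mismatch means the copies differ.
# (File names below are placeholders.)
if file_fingerprint("image.iso") == file_fingerprint("image-copy.iso"):
    print("images are identical")
```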

Passwords are often stored as hashes rather than in their original form. When a password is set for a system, it is generally passed through a hashing function and only the resulting hash is stored. When a person later attempts to authenticate, the password entered is hashed and that hash is compared to the stored hash. If the two match, the user is authenticated; otherwise, access is rejected. In theory, if someone were to obtain a password list for a system, it would be useless, since by definition it is impossible to recover the original information from its hashed value. However, attackers can use dictionary and brute-force attacks by methodically comparing the output hashes of known input strings to the stolen hashes. If they match, the password has been cracked. Thus, proper password length and selection are crucial.
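
A minimal sketch of salted password hashing with Python's standard library follows. PBKDF2 with these iteration and salt-size parameters is one reasonable assumption; the random salt is what blunts precomputed dictionary attacks against a stolen hash list.

```python
import hashlib
import hmac
import os

def hash_password(password: str, salt: bytes | None = None) -> tuple[bytes, bytes]:
    """Return (salt, derived hash); only these two values are stored, never the password."""
    salt = salt if salt is not None else os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 600_000)
    return salt, digest

def verify_password(password: str, salt: bytes, stored: bytes) -> bool:
    """Re-hash the attempt with the stored salt and compare in constant time."""
    _, candidate = hash_password(password, salt)
    return hmac.compare_digest(candidate, stored)

salt, stored = hash_password("correct horse battery staple")
print(verify_password("correct horse battery staple", salt, stored))  # True
print(verify_password("letmein", salt, stored))                       # False
```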

There are several different types of hashing, including division-remainder, digit rearrangement, folding, and radix transformation. These classifications refer to the mathematical process used to obtain the hash value. Here are two popular hashing algorithms:

  • Message Digest 4/Message Digest 5 (MD4/MD5): The message digest (MD) class of algorithms was developed by Ron Rivest for use with digital signatures. Both have a fixed 128-bit hash length, but the MD4 algorithm is flawed, and the MD5 hash was adopted as its replacement. However, the security of the MD5 hash function has itself been severely compromised, and a collision attack exists that can find collisions within seconds on a computer with a 2.6 GHz Pentium 4 processor. MD6 was introduced in 2008 and is designed to take advantage of future processors with tens or even thousands of cores instead of conventional single-core systems.
  • Secure Hash Algorithm (SHA): This hashing algorithm was created by the U.S. government (NIST and the National Security Agency [NSA]) and operates similarly to the MD algorithms. The most common variant is SHA-1, which is typically used in IPsec installations and has a fixed hash length of 160 bits. SHA-2 produces hash lengths of up to 512 bits; its larger variants use a 64-bit word size, as does SHA-3 (see the digest-length sketch after this list).
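
The digest-length sketch referenced above, using Python's standard hashlib module to print the fixed output sizes of the algorithms just discussed:

```python
import hashlib

# Fixed digest lengths, in bits, for the hash algorithms discussed above.
for name in ("md5", "sha1", "sha256", "sha512", "sha3_512"):
    print(f"{name:>8}: {hashlib.new(name).digest_size * 8} bits")
```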


External Links:

Cryptographic hash function at Wikipedia

El Gamal and RSA Algorithms

In the previous article, we discussed the Diffie-Hellman algorithm. In this article, we will cover two more asymmetric encryption algorithms, El Gamal and RSA.

El Gamal Algorithm

The El Gamal algorithm is essentially an updated and extended version of the original Diffie-Hellman algorithm based on discrete logarithms. The security of the algorithm is roughly on par with that of the Rivest-Shamir-Adleman (RSA) algorithm. El Gamal has a few drawbacks, mainly its larger output and random input requirement. Encrypted El Gamal ciphertext is much longer than the original plaintext input, so it should not be used in places where bandwidth is a limiting factor, such as over slow wide area network (WAN) links. The El Gamal algorithm also requires a suitable source of randomness to function properly. It is worth noting that the Digital Signature Algorithm (DSA) was based on the El Gamal algorithm. DSA is a complementary protocol to RSA that is widely used in the OpenSSH implementation of the Secure Shell (SSH) protocol.
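
A toy sketch of ElGamal encryption in Python is shown below, with deliberately tiny parameters (real systems use primes of 2048 bits or more and carefully chosen generators). It also illustrates the drawbacks just mentioned: every message needs fresh randomness, and the ciphertext is two values, roughly doubling the size of the plaintext.

```python
# Toy ElGamal encryption/decryption, for illustration only.
import secrets

p = 467           # public prime modulus (toy-sized)
g = 2             # public generator
x = 127           # private key
y = pow(g, x, p)  # public key

def encrypt(m: int) -> tuple[int, int]:
    k = secrets.randbelow(p - 2) + 1              # fresh randomness per message
    return pow(g, k, p), (m * pow(y, k, p)) % p   # two-part ciphertext: output doubles

def decrypt(c1: int, c2: int) -> int:
    s = pow(c1, x, p)                 # shared secret g^(k*x) recovered with the private key
    return (c2 * pow(s, -1, p)) % p   # multiply by the modular inverse to recover m

c1, c2 = encrypt(42)
print(decrypt(c1, c2))  # 42
```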

RSA Algorithm

Shortly after the appearance of the Diffie-Hellman algorithm, Ron Rivest, Adi Shamir and Leonard Adleman proposed another public key encryption system. Their proposal is now known as the RSA algorithm, named for the last initials of the researchers.


The RSA algorithm shares many similarities with the Diffie-Hellman algorithm in that RSA is also based on multiplying and factoring large integers. However, RSA is significantly faster than Diffie-Hellman, leading to a split in the asymmetric cryptography field that refers to Diffie-Hellman and similar algorithms as Public Key Distribution Systems (PKDS) and to RSA and similar algorithms as Public Key Encryption (PKE). PKDS systems are used as session-key exchange mechanisms, while PKE systems are considered fast enough to encrypt small messages. However, PKE systems like RSA are not considered fast enough to encrypt large amounts of data, such as entire file systems or high-speed communications lines.

RSA involves a public key and a private key. The public key can be known by everyone and is used for encrypting messages. Messages encrypted with the public key can only be decrypted in a reasonable amount of time using the private key. The keys for the RSA algorithm are generated the following way:

  1. Choose two distinct prime numbers p and q: For security purposes, the integers p and q should be chosen at random and should be of similar bit-length.
  2. Compute n = pq: n is used as the modulus for both the public and private keys; its length is the key length.
  3. Compute φ(n) = φ(p)φ(q) = (p-1)(q-1), where φ is Euler’s totient function.
  4. Choose an integer e such that 1 < e < φ(n) and gcd(e, φ(n)) = 1; i.e. e and φ(n) are coprime.
  5. Determine d as d ≡ e^-1 (mod φ(n)); i.e., d is the multiplicative inverse of e (modulo φ(n)).

The public key consists of the modulus n and the public (or encryption) exponent e. The private key consists of the modulus n and the private (or decryption) exponent d, which must be kept secret. p, q and φ(n) must also be kept secret because they can be used to calculate d.
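
As a sanity check, here is a toy walk-through of the five steps above in Python. The primes are the classic textbook values and far too small for real security; the message is simply an integer smaller than n.

```python
# Toy RSA key generation and encryption, following the steps above.
p, q = 61, 53                 # step 1: two distinct primes (far too small for real use)
n = p * q                     # step 2: modulus, n = 3233; its length is the key length
phi = (p - 1) * (q - 1)       # step 3: Euler's totient, 3120
e = 17                        # step 4: 1 < e < phi and gcd(e, phi) == 1
d = pow(e, -1, phi)           # step 5: d = 2753, the multiplicative inverse of e mod phi

m = 65                        # message encoded as an integer smaller than n
c = pow(m, e, n)              # encrypt with the public key (n, e) -> 2790
print(pow(c, d, n))           # decrypt with the private key (n, d) -> 65
```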


External Links:

ElGamal encryption on Wikipedia

RSA on Wikipedia

© 2013 David Zientara. All rights reserved.