An Overview of Cryptographic Systems and Encrypting Database Data
By Erich Peterson and Siqing Li
A Multi-Part Article on Cryptographic Systems and Encrypting Database Data
This article is the first of a three-part series exploring cryptographic systems and encrypting data within a Microsoft SQL
Server 2005 database. It provides an overview of crytography and techniques used for encrypting data within databases.
The three parts proceed as follows:
Introduction
As the attacks in which hackers use become more and more sophisticated, and the programs in which they attack become increasingly
complex, encryption is becoming the last line of defense in database management system (DBMS) security. Since Microsoft
announced their Trustworthy Computing security initiatives four years ago, the industry has been waiting to see how these
initiatives would be implemented in upcoming products. With the introduction of Microsoft's newest DBMS, SQL Server 2005,
it does indeed seem as though they have provided what they have promised.
What follows is a two-part article series that provides an in-depth examination of encrypting data in SQL Server 2005. In
this article we will exploring key challenges facing database systems and the motivations for providing robust encryption
mechanisms directly within the database system. We will also look at encryption fundamentals and SQL Server 2005's encryption
capabilities. A future article will step through creating a database table with encrypted data and working with that
data from an ASP.NET 2.0 web application. Read on to learn more!
Encryption - the "Last Line of Defense"
As the attacks in which hackers use become more and more sophisticated, and the programs in which they attack become
increasingly complex - which in turn increases the possibility that holes will be left open - encryption is becoming the
last line of defense in database management system (DBMS) security. You might be thinking, "What is the use of encryption
in modern-day database systems?" That is, if the database administrator (DBA) has done his/her job (i.e. applying all the
latest security patches, securing database information with least privilege access in mind, enforcing strong passwords, etc...),
then why do we need to add encryption on top of all that! The reason is that attackers can be clever and security patches
are usually put out only after an attacker has exploited a new found weakness. Moreover, perhaps the database administrator
overlooked something, or didn't configure the database properly. Therefore, encryption can be your last line of defense
when it comes to exploits that have not yet been patched or for those holes that are left open by accident.
There are other motivating factors an organization might consider when evaluating whether
to encrypt database data. One of those might be the need for the corporation to conform to state, federal, or
international regulations. There are a myriad of regulations when storing personally identifiable information in a databases,
such as the Sarbanes-Oxley Act of 2002, the Health Insurance Portability and Accountability Act (HIPAA) of 1996, and the
European Union Data Protection Directive, among others[1].
Because encryption can incur considerable processing overhead, care must be taken when implementing any cryptographic
processes, especially when processing large amounts of data. Moreover, one must consider the increased storage
requirements, which are directly related to the cryptographic algorithm chosen, the size of the keys used, and the size of
the unencrypted data (called plaintext or sometimes clear text).
Cryptography Fundamentals
First, we should define what exactly is encryption and decryption. Encryption could be defined as:
"The conversion of plaintext or data into unintelligible form by means of a reversible translation, based on a translation
table or algorithm."[2] In other words, encryption is the process of taking some data (usually
called plaintext) and obfuscating it (usually with the use of some key) so that it cannot be read by
others (those who do not have the correct key). The resultant data is usually called ciphertext. Conversely, the
processes of taking the ciphertext and deciphering it back to its original form (usually using a key) is called
decryption. Thus, decryption is defined as:
"The translation of encrypted text or data (called ciphertext) into original text or data (called plaintext)."[2]
Figure 1 gives provides a classical view of encryption. In it we can see the plaintext being fed into an algorithm of some
sort, a key applied as an input as well, and ciphertext being produced as output. Then that same ciphertext can be fed into
the decrypting algorithm, with the same key supplied as an input, to produce the plaintext output.
Figure 1: Classical View of Encryption
Beyond the classical type of encryption we have discussed, there are other types that have their own unique properties, and
understanding them will help in the understanding of SQL Server 2005 encryption later.
Types of Encryption Algorithms
We will be covering two major types of encryption algorithms in this section: symmetric encryption (also called private-key
or shared-key encryption) and asymmetric encryption (also called public-key encryption). The former type we have already
briefly introduced near the end of the last section. This form of encryption gets the name symmetric, because of the fact
that the same key is used for both encrypting and decrypting data. Some well known algorithms which use this type of encryption
are:
One major problem inherent in using this form of encryption is key distribution. The key must be shared with two or more
principles (in most cases), and because of this fact, we need someway of securely distributing the key to all of them.
However, symmetric encryption algorithms are historically computationally fast, which makes them a good choice when
encrypting large amounts of data or when key distribution is not a concern.
The other major type of encryption known as asymmetric encryption has revolutionized security systems, and is
depicted in Figure 2.
Figure 2: Asymmetric Encryption
In asymmetric encryption two keys are generated per entity, one which is kept private and the other which is publicly
known. The interesting and most important attribute of each of the two keys is that: each key undoes the other's operations,
and it is computationally infeasible to generate the opposite key given one of the keys. As Figure 2 illustrates, if Bob
wants to send a message to Alice, he uses Alice's public key to encrypt the message with. Later, when Alice wants to decrypt
the message, she only needs to use her private key to do so. This method ensures that only Alice is able to read the
messages encrypted with her public key. One drawback to asymmetric encryption algorithms is that they are known to be
computationally slower (when compared to symmetric encryption algorithms). However, because asymmetric encryption takes
care of the major problem of key distribution, it is very desirable in some instances. Some well known algorithms which
use this type of encryption are:
It is also possible to provide authentication and
non-repudiation in a cryptosystem through the use of asymmetric
encryption. To do so, the sender of a message can sign the message via a digital signature, proving the message only could
have come from the sender. An example would be if Bob wanted to sign and encrypt a message, and then send it to Alice. To do this,
Bob would first encrypt the message with his private key (sign the message), encrypt it again using Alice's public key, and
then send the message to Alice. The resultant message would look resemble the package shown in Figure 3.
How does this provide authentication and non-repudiation? Because Bob encrypted the message with his private key and only
he owns that key, when Alice uses her public key to decrypt the message, she can be sure the message could only have come
from Bob. Moreover, because Bob again encrypts the resultant message, and encrypts it again using Alice's public key, Bob
can be sure only Alice can decrypt the message and read its contents.
Figure 3: Example of Digital Signature
When Alice receives the message, she first decrypts the message using her private key, then decrypts it again using Bob's
public key. Not only do digital signatures provide a means to show non-repudiation and authentication, but also provide
proof of message integrity. Because Bob is the only one who has ownership of the private key, we can be sure that the
message has not been changed since its encryption.
It is not too hard to imagine that we might want to combine the two major types of encryption (known as hybrid encryption),
to reap the benefits of both. In one hybrid scheme, Bob could choose a random key x, and then use symmetric encryption to
encrypt the message m with x. Next, to distribute the key x to Alice, Bob could encrypt the key x
with Alice's public key. Lastly, Bob would send the two encrypted package he encrypted to Alice. When Alice receives the first
package, she can decrypt it using her private key, thus obtaining the random key x. Then she can use that key to
decrypt the second package, which contains the actual message. The process of preparing the two packages can be seen in
Figure 4.
Figure 4: Hybird Encryption Example
Last in our discussion of cryptography fundamentals, is that of certificates. Certificates provide (among other things) the
ability to bind a public key to an identity. This is an important feature, because while asymmetric encryption may work
in securing data, that security could be lost if you are using a public key to encrypt data for an entity you don't know the
identity of. A CA (certificate authority) is a trusted third party which can issue a certificate, proving the identity of an
entity and its associated public key. Moreover, certificates provide a mechanism by which keys can expire and/or be revoked.
Database Encryption Concepts
For a DBMS to provide the necessary facilities or mechanisms to properly implement a database cryptosystem,
it must offer the following:
Encryption Algorithms - of course any DBMS that wishes to implement encryption must contain the actual encryption
and decryption algorithm(s)
Key Management - the system must provide the facility for the management of the keys used for encryption and
decryption
Authentication and Authorization - before we allow a particular user to encrypt or decrypt data, generate or
use a key, and/or generate or use a certificate, the DBMS should authenticate the user and then make sure the user
has permission to perform the particular action. Furthermore, a check should be performed to ensure that the principle
(the entity that is accessing the database) has access to view the encrypted data (ciphertext).
There are two major categories of data which can be encrypted, and an organization will have to determine if it wants
protect both or just one type. These two major types are :
Data-in-motion
Data-at-rest
Data-in-motion refers to data that is being sent to and from the database (i.e. to/from a client application over the
Internet or local intranet). DBMSs usually make use of SSL (Secure Sockets Layer), TLS (Transport Layer Security), and/or
IPSEC (Secure Internet Protocol) to protect data-in-motion. The protection of data-in-motion is necessary to keep secret
the data being sent to or from a client, and to protect against certain types of attacks (i.e. session hijacking and
replay attacks).
The other category of encrypted data, data-at-rest, has to do with the encryption of data inside of the
database itself. As the "Encryption of Data at Rest" white paper[3] points out, "most attacks do
not occur on data-in-motion. Most attacks occur against the end points of data, where data sits for long periods of time."
Moreover, as we have explained earlier, encryption can be a DBA's last line of defense - if an attacker gets a hold of the
actual data at rest.
If an organization decides to encrypt its data-at-rest, it must decide at what degree of granularity to encrypt the data.
A list of possible encryption levels are listed below, along with their possible advantages and disadvantages:
Database Level - this would entail encrypting at least one (usually large) file, which represents the
database in its entirety. One advantage to this level of encryption is the fact that all data is secure. However,
its biggest advantage turns out to be its biggest disadvantage. Every time data must be written or read from the
entire database must be decrypted. Obviously, this will incur a very high overhead and is a less than desirable
solution.
Row or Column Level - row or column encryption is the encryption of only certain rows or columns inside a
table. It is this level of data encryption that tries to addresses the disadvantages of the previously mentioned type.
However, care should be given to the choice of columns to encrypt. For example, if you chose to encrypt a column that
is being referenced within a SQL statement's WHERE clause or is a primary key, the database will be forced
to decrypt that column's data before it can do any comparisons.
Encryption via SQL Server 2005's Database Engine
This article limits its discussion to just the core SQL Server 2005 database product. Thus, we will not be covering any
related products such as: Reporting Services, Notifications Services, Integration Services, Analysis Services, and so on.
Since Microsoft announced their Trustworthy Computing security initiatives four years ago, the industry has been waiting to
see how these initiatives would be implemented in upcoming products. With the introduction of Microsoft's newest DBMS, SQL
Server 2005, it does indeed seem as though they have provided what they have promised. Let us now take a look at what SQL
Server 2005 has to offer.
The database engine allows for the encryption at the column-level. SQL Server 2005 currently supports three types of
encryption:
Symmetric Encryption - the encryption algorithms supported include: AES, DES, RC2, and RC4. Symmetric keys
can be encrypted by:
Another symmetric key created by a database user, which was derived from a user-supplied password
An asymmetric key
Another symmetric key
A certificate's public key
Asymmetric Encryption - the RSA encryption algorithm is used to support asymmetric encryption and can be
used with a 512-, 1,024-, or 2,048-bit key. Usually asymmetric encryption is used to encrypt symmetric keys, which are
then stored in the database. Asymmetric keys can be encrypted by:
A symmetric key created by a database user, which was derived from a user-supplied password
The Database Master Key
Certificates: SQL Server 2005 supports the IETF's (Internet Engineering Task Force) certificate standard
X.509 version 3 (X.509v3). Moreover, this support can be used to generate certificates from within SQL Server,
or to use any externally issued certificates.
Key Hierarchy
In an upcoming article we'll explore the specifics of encrypting data in SQL Server 2005, examining the key generation process
and how to encrypt and decrypt data through T-SQL commands. We'll also look at working with encrypted data through an ASP.NET
2.0 web application.
Conclusion
Because encryption is playing more and more of an important role in database management systems, database administrators,
database developers, and management are using or are starting to seriously consider using encryption in their databases.
Microsoft has seen and recognized the need for a so called, "last line of defense" for companies and organizations with
their introduction of SQL Server 2005. A future article will explore the specific steps needed to implement SQL Server 2005's
encryption capabilities, including interfacing with encrypted data through an ASP.NET 2.0 web application. Until then...
References [1] Microsoft, "Improving data security by using SQL Server 2005: Using SQL Server 2005 to help protect data," Technical White Paper, October 2005. [2] W. Stallings, "Cryptography and network security principles and practices", Fourth Edition, Prentice Hall, 2005. [3] Application Security, Inc., "Encryption of data at rest," White Paper, Accessed November 2006 at http://www.appsecinc.com/presentations/Encryption_of_Data_at_Rest.pdf.