# Sovereign Keys – A proposal for fixing attacks on CAs and DNSSEC

The EFF presented their proposal how to improve the security of SSL/TLS and the internet PKI infrastructure. To understand their proposal, one needs to understand how PKI in the internet works today:

### Public Key Cryptography

In cryptography, you can usually encrypt and decrypt  data. In the past, encryption and decryption used the same key. Starting from the 70s, a new class of encryption/decryption algorithms was invented, the public key encryption algorithm. Instead of using the same key for en- and decryption, these algorithms use different keys for en- and decryption. During key generation, two keys are generated: A public key, that is used to encrypt data, and can be given out to everybody in the word, and a corresponding secret key, that must be kept hidden by the owner. Everybody who has access to the public key can encrypt data, but only the owner of the secret key is able to decrypt it.

There are many algorithms, for example RSA and ElGamal are the most famous public key encryption algorithms, while other algorithms like McEliece and Rabin are less well known.

Besides encryption, there are also digital signature algorithms. Again, a public and a private key is generated. The private key can be used to generate a digital signature on a document. The public key can then be used to verify the signature on the document. A signature on a document should guarantee that the document was really signed by the holder of the private key, and was not altered afterwards.

### PKI

These ideas sound simple at the first look, but in practice, getting a public key of a person or company is not that easy. Just publishing your public key in some kind of web forum or on your facebook page is not enough. Everybody would be able to create a facebook page for another person, and then posting a fake public key on that page, or under that persons name on a web forum. So we need a way to establish a binding between a public key and a person or identity (a company name, a domain name or an email address). One solution would be to meet everybody in person who you want to communicate with, but it doesn’t scale well, and not everybody wants to fly to San Jose, California, just to get the public key for paypal.com.

For these job, Public Key Infrastructue (PKI) and X.509 Certificates have been invented. A Certification Authority (CA) is an organization, that verifies the identity of a person, and that this person is in possession of a private key. After this has been confirmed, the CA issues a X.509 certificate. That certificate contains the corresponding public key of that person, and it’s identity, and this information is signed using the CAs private key. Everybody who thinks that this CA does a good job in verifying the identity of persons, and is in possession of that CAs public key can verify that signature. As from now on, one only needs to trust a CA. One can simply give away the certificate issued by a CA, and everybody can get the public key from the certificate, and verify that it really belongs to that person, by verifying the signature of the CA. Today, there are hundreds of CAs active on the internet, and every web browser comes with a pre-installed list of trustworthy CAs and their public keys.

### SSL/TLS

To encrypt HTTP traffic and to prove the autenticity of a website, the SSL/TLS protocol was created. When a session to a web server is established, the web server usually provides a digital certificate containing the public key of that web server. The web browser verifies the signature on that certificate, and that the identity in that certificate matches with the servers name it want’s to connect to. If everything is fine, the public key in that certificate is used to establish a secure session with that web server using some kind of key derivation scheme. (I won’t go into detail here)

### The current state of PKI in the internet

At the first look, this sounds like a perfect solution. Whenever I want to talk privately with paypal, I just point my web browser to https://www.paypal.com/, it automatically connects to the server, gets a certificate, verifies that is has been correctly signed by a trustworthy CA, and the identidy in the certificate matches the expected servers hostname.

### Attack vectors

However, there are multiple problems with that system. Just to mention one example: There are hundred of CAs active in the internet, and your web browser trusts every single one of them. Every CA is allowed to issue a certificate for every domain name in the internet. For example the national Chinese stat CA is allowed to issue a certificate for http://www.defense.gov/, which is the website of the ministry of defense of the united states of america. Also, the verification done by most CAs is minimal. For many CAs, it is sufficient if you can receive a mail for hostmaster@domain.tld, to get a certificate for domain.tld. There are multiple ways how you can attack this:

#### CAs

First of all, you may find a bug in the CAs website or email server, that allows you to get access to the certificate issuing software, bypassing these checks.

#### DNS

Also, you might be able to attack a DNS server serving the zone-file for domain.tld, that allows you to reroute mail for hostmaster@domain.tld on the DNS level. This allows you to get a certificate for domain.tld too.

#### Routers

Routers, especially those using BGB or a similar protocol might be tricked into rerouting the traffic for the mail server of domain.tld to your network. This way, you can intercept the mail and get your certificate too.

#### Crypto

Besides that, weak cryptography  algorithms like MD5 have been used by some CAs, and this has been used to generate a rouge certificate too.

### The EFF solution

To improve the security of PKI, the EFF has presented a proposal: Sovereign Keys

Sovereign Keys should make it harder for an attacker to generate a new certificate for an HTTPS website, without the cooperation of the legitimate site operator. The main building block of Sovereign Keys are so called timeline servers. These timeline servers are append-only databases, meaning that one can only add entries to the database, but never modify or delete them. These timeline servers could be operated by different entities like the EFF itself, or Mozilla, Google or Microsoft.

To use Sovereign Keys, the side administrator obtains an X.509 certificate as usual. Then he generates a new key, the so called sovereign key. He uploads the key with the certificate to a timeline server. The server operator checks, if that certificate is really issued by a valid CA and no other sovereign key has been added previously, and adds the sovereign key with the hostname of the certificate to the database.

When a client connects to the website, he also requests all database entries belonging to that hostname from a timeline server. In parallel to that, a SSL/TLS connection is established. The Server delivers the server certificate to the client, with an additional signature created with the sovereign key. The client can then check, if this signature can be verified with the sovereign key retrieved from the timeline server.

### More details

The full protocol is a little bit more complex, because it needs to deal with revocation, privacy, mirroring and load balancing the timeline servers and many more things. It has not yet been finalized, but a draft of the protocol can be downloaded from: https://git.eff.org/?p=sovereign-keys.git;a=blob_plain;f=sovereign-key-design.txt;hb=master

### Summary

For me, this looks like one of two solutions you need to improve the general security of SSL/TLS. Sovereign keys is a great solution for website operators that care about the security of their users. It will not help a user, if the website he connects to does not use it. For these cases, a different solution should be used, like checking if multiple computers in the internet get the same certificate from the server.

# Bitcoin – An Analysis

Kay Hamacher and Stefan Katzenbeisser presented their analysis of Bitcoin at 28C3. Besides the usual cryptographic analysis, they pointed out some important aspects of the system:

## Bitcoin doesn’t scale well, and no update mechanism has been designed!

Bitcoin was created as a total decentralized system. There is also no central authority or update mechanism, that could alter the system. As soon as Bitcoin hits it’s boundaries, all Bitcoin users need to agree on a change of the system and integrate it. Because every user of Bitcoin needs to keep a full log of all transactions, that have ever been executed on the Bitcoin network, it is clear that Bitcoin will hit it’s scaleability boundaries rather soon, if it gets more popular.

## Bitcoins can bet lost and the number of Bitcoins that can be generated is fixed!

Bitcoins are a digital currency, that are stored on the current owners computer. If for example the harddisk crashes, or it is encrypted an the password is lost, these Bitcoins cannot be spend anymore. They still exist in the network, and also cannot be regenerated. One could also say, these Bitcoins are lost. Also, the total amount of Bitcoins, that can ever be generated is fixed. So we need to expect, that the total amount of Bitcoins will start to decline, as soon as all Bitcoins have been generated. Just assume, that 1% of all Bitcoins are lost per year, then after $\log_{0.99} (0.5) \approx 69$ years, only 50% of the Bitcoins will still be there.

## Bitcoin is not untraceable and anonymous!

Because Bitcoin keeps a log of all transactions, and this log is available to the public, one can trace which address has received how much money, and where it went. Bitcoin allows a user to have more than one identity, but as soon as money from more than one address is used in a single transaction, one can assume that these addresses belong to the same user. One can for example see, who spend money to Wikileaks, and where Wikileaks transferred that money. Also if you assume, that a person ears money only from Bitcoin, you know his total income. You also know when he transfers money on Bitcoin and when not, so you might find out when he sleeps and in which time zone he lives in.

## What has not been found…

As mentioned at the beginning of the post, there is no central update mechanism. So if somebody would find a bug in the design of the system, that allows him to steal money from it, it cannot easily be fixed. So far, no attack one the basics of the bitcoin system has been found and bitcoin is running and getting more and more popular.

# Time is on my Side – Exploiting Timing Side Channel Vulnerabilities on the Web

Sebastian Schinzel gave an interesting talk today at 28C3, about timing side channel attacks against web applications. (Timing-) Side channel attacks are known in the cryptography world for a long time, and many algorithms like RSA or AES have been successfully attacked. In a nutshell, an attacker measures the time a device needs to process a request (usually an encryption or decryption), and can draw conclusions from that to the values of secret input parameters (a plaintext or a secret key).

Sebastian showed, that this can be used against none cryptography web applications as well. Instead of just presenting his attacks, he presented general methods how to do timing measurements against web applications first. For example, a web application could perform the following sequence of checks during a user login:

1. Does the account exist?
2. Is the account of the user locked?
3. Has the account expired?
4. Is the password correct?

If one of these checks fails, the procedure is aborted and an error page is send to the user. Of course, each of these steps requires some time, and from the time it takes from the request to the generation of the error message, one might guess, which of these steps went wrong.

The second attack presented in this talk was a timing attack on an implementation of the XML encryption standard using a PKCS#1.5 padding. Here, the server needs a longer time to process a request, depending on the padding inside the encrypted payload.

For me, my personal highlight was the extension of timing based side channel attacks to none cartographic web applications. I assume, if one would check some famous web applications, many of such timing leaks could be found, because web developers usually don’t care about timing side channels. The timing difference could also be used to assist blind SQL injection attacks, where the timing difference could be the only channel back to the attacker.

Unfortunately, the slides are not (yet) available, but a previous paper describing the methods can be found at http://sebastian-schinzel.de/_download/cosade-2011-extended-abstract.pdf.

# Effective DoS attacks against Web Application Plattforms – #hashDoS [UPDATE3]

Julian Wälde (zeri) and Alexander Klink (alech) presented a very nice way how to bring down many popular websites at 28C3. The idea is quiet simple, effective and general. Many programming languages, especially scripting languages, frequently use hash tables to store all kinds of data.

## Hash Tables

Hash tables (which are very well explained on wikipedia) are data structures, that store key-value pairs very efficiently. Adding new entries to the table, looking up entries in the table for a given key, and deleting entries are usually executed in $O(1)$ in best and average case, which means that the time for each operation is usually constant, and doesn’t depend on the number of entries stored in the table. Other data structures like binary tree type structures need $O(\log(n))$ entries in the average case, which means that they need more time for these operations when a lot of entries are stored. (n is the number of entries stored) However, there is a drawback, hash tables need $O(n)$ operations for these operations in the worst case, compared to $O(\log(n))$ operations for binary trees, which means that they are much slower in very rare cases.

## How the attack works

Many hash table implementations are very similar. They use a deterministic hash function to map a key k (usually a string or another complex data structure) to a single 32 or 64 bit value h. Let l be the size of the current table. Then  h%l is used as an index in this table, where the entry is stored. If more than one entry is mapped to the same index, then a linked list of entries is stored at this index position in the table. If many different keys are all mapped to the same hash value h, then all these entries are stored at the same index of the table, and the performance of the hash table goes down to a simple linked list, where all operations need $O(n)$ time.

## Performance

Just to get an impression, how much CPU time it takes to process such a request on a Core i7 CPU with PHP, assuming that the processing time for a request is not limited (usually, PHP limits the processing time for a request to 1 minute):

• 8 MB of POST data –  288 minutes of CPU time
• 500k of POST data – 1 minute of CPU time
• 300k  of POST data – 30 sec of CPU time

So you can keep about 10.000 Core i7 CPU cores busy processing PHP requests using a gigabit internet connection. Alternatively for ASP.NET, 30,000 Core2 CPU cores, or for Java Tomcat 100,000 Core i7 CPU cores, or for CRuby 1.8 1,000,000 Core i7 CPU cores, can be kept busy with a single gigabit connection.

Even though this blog is named cryptanalysis.eu, these hash functions used for these tables are not cryptographic hash functions like SHA256, instead simple hash functions like djb2 are used, and it is very easy to find many inputs mapping to the same output. Julian and Alexander did a great job with checking many programming languages used for web applications for their hash table implementation and hash functions. For all of them they checked, they managed to find a lot of keys mapping to the same output, except for Perl. Perl uses a randomized hash function, i.e. the hash doesn’t only depend on the key, but also on an additional value, that is chosen at startup of the application at random. All other languages also store the query parameters send in an HTTP GET or POST request in an hash table, so that a request with many query parameters all mapping to the same hash value will slowly fill such a hash table, before the first line of code written by the application programmer will be executed. Filling this hash table will usually take several minutes on a decent CPU, so that even a fast web server can be kept busy using a slow connection.

## Countermeasures

One may ask, can this problem be fixed by using a modern hash function like SHA256? Unfortunately, this will not solve the problem, because hash tables are usually very small, like $2^{32}$ entries, and one can still find a lot of values where SHA256(m) maps to the same value modulus $2^{32}$. Also fixing this in the parser for the parameters string seems to be a bad solution, because many programmers use hash tables to store data retrieved from a user or another insecure source. From my point of view (which is also the opinion of the speakers), the best solution is to use randomized hash functions for hash tables. However, there are several kind of fixes deployed by the vendors:

### Limiting HTTP POST and GET request lengths (Microsoft ASP.NET)

Microsoft suggests to limit the HTTP request length. Most applications don’t require very long HTTP requests, except for file uploads.

```<configuration>
<system.web>
<httpRuntime maxRequestLength="200”/>
</system.web>
</configuration>```

As long as you don’t process data in a hash table from any other sources, except from the HTTP request (like external URLs), this should prevent the basic form of the attack.

Microsoft has also released an emergency patch against this attack: http://technet.microsoft.com/en-us/security/bulletin/ms11-100.mspx

### Limiting the number of different HTTP request parameters (PHP, Tomcat)

PHP has added a new configuration variable max_input_vars, that can limit the number of parameters. This is similar to the first solution, except that here, not the total length of the request is limited, instead the number of different parameters that can be submitted in a single request is limited. This can be easier to use than limiting the length of the HTTP request, because application programmers usually know, how many parameters they need, while the maximum length of the request is harder to predict. However, if a single request parameter contains a data structure, that is later parsed and put into a hash table, the application might still be vulnerable.

Apache Tomcat also deployed a similar fix, that limits the number of parameters to 10000 by default.

### Using different data structures

As mentioned before, other data structures (binary tree type structures like an AVL-tree gurantee, that all operations are always executed in $O(\log(n))$, and never need $O(n)$ time. This is what for example Daniel J. Bernstein suggests. This completely fixes the vulnerability, but requires fundamental changes to the runtime/compiler/interpreter, and might be harder to implement.

## More Information

More information can be found on Twitter, using the hashtag #hashDoS or from the user @hashDoS!

oCERT has a good summary about affected and fixed software versions: http://www.ocert.org/advisories/ocert-2011-003.html

## Video and Paper

The video of the talk is available on YouTube. There is an advisory on full disclosure that describes the attack in detail: http://permalink.gmane.org/gmane.comp.security.full-disclosure/83694