Archive

Author Archive

SSL/TLS broken again – A weakness in the RC4 stream cipher

March 15th, 2013 No comments

A few days ago, a new attack against SSL/TLS has been published by Nadhem AlFardan, Dan Bernstein, Kenny Paterson, Bertram Poettering and Jacob Schuldt. Many attacks on SSL/TLS in the past relied on the protocol design itself, broken implementations leaking side channels, or the X.509 certificate system and improper issuing of certificates. In contrast to these, the new attack is focused on the RC4 stream cipher, that can be used against SSL/TLS.

RC4 usage in SSL/TLS

SSL/TLS doesn’t rely on a single cryptographic primitive. Instead when a new SSL/TLS session is established, both parties negotiate a ciphersuite. A ciphersuite consists of a key exchange algorithm, an encryption method and an integrity protection method. To agree on a common ciphersuite, the SSL/TLS client sends a list of all ciphersuites it supports, and the server choses one of the methods from the list, usually the first one which is also supported by the server. The RC4 stream cipher is one of the possible choices for the encryption method, and also wildly supported and used. Compared to many other encryption methods, RC4 is very fast in software, very easy to implement, and also very efficient because it doesn’t require any padding as many encryption methods based on block ciphers do.

What makes RC4 a bad stream cipher

So what would be expect from a (secure) stream cipher? A stream cipher should take a key, and transform this key into a (pseudo-) random sequence of bytes, chosen from a uniform distribution. However, RC4 has also serious weaknesses. For example, RC4 is also used in the famous WEP protocol, to protect WiFi networks. Here, many similar RC4 keys are used that differ only in the first 3 bytes. These 3 bytes of the key are used as an initialization vector for the cipher, and are transmitted in clear with the encrypted packet. The remaining bytes of the key are shared among all packets, but should only be known to the operator of the network. What has been shown for WEP is, that an attacker, who knows the first 3 bytes of the key, and also has access to some bytes of the key stream can predict the next bytes of the key. The probability for predicting a single byte correct is only slightly above a random guess, but repeating the procedure for many packets (like 10.000) will reveal the secret key that protects the network with almost 100% probability. In a nutshell, if you know a part of an RC4 key, and some bytes of the keystream, then you can predict parts of the remaining key better than just guessing.

How can this be applied to SSL/TLS

But for SSL/TLS, the situation is entirely different. Here, a new key is chosen almost randomly for every new connection and no parts of the key are shown to an attacker. As a result, the methods used to attack RC4 in WEP cannot be applied for SSL/TLS connections. But there are more problems in RC4:  Even if a key for RC4 is randomly chosen, the keystream bytes of RC4 have some biases. For example the second by of output will be 0 with a probability of more than 1/256. What recently has been discovered is, that much more of such biases exist. If a plaintext is transmitted over SSL/TLS, one gets a small hint about the plaintext. If the same plaintext is transmitted over and over again using TLS with RC4 encryption, one can recover the first bytes of the plaintext using these biases in the keystream generated by RC4. No assumptions about other plaintext or keystream bytes must be made, and no knowledge of parts of the key is required. Also this works independant of the implementation used, because it doesn’t require any timings or similar side channels from the implementation.

Consequences of the attack

The number of sessions required depends on how much is known about the plaintext, and what should be recovered. A full plaintext can be recovered using 2^{30} sessions. If only a single byte needs to be recovered, about 2^{24} sessions might be sufficient, depending on the position of the byte in the plaintext. If one only wants to distinguish two plaintexts only, then less than 2^{24} sessions might be enough.

Countermeasures

One may ask now how to counter the attack. One of the best solutions would be to just simply disable RC4 support on a client or on a server. As long as a client and a server still share another common ciphersuite, they will still operate properly. Alternatively, both the server and the client can be patched to send empty application layer records, until the first 256 or 512 bytes of output of RC4 have been used. Currently, the attack works best with the first bytes of output of RC4, but less well with the following bytes.

Final remarks

I have also previously been active in research on RC4 and WEP attacks.

Categories: protocol, symmetric Tags:

Secure Function Evaluation – There is an issue with OTR and plausible denability

May 27th, 2012 No comments

OTR is a crypto overlay protocol for instant messaging. Instead of encrypting the connection to an instant messaging service like Gtalk, MSN, Skype or ICQ, OTR encrypts messages send over an arbitrary instant messaging service end-to-end. The message leaves your messaging client encrypted, and is later decrypted by the receivers client. Only the communicating clients are in possession of the keys necessary to decrypt the message, and the instant messaging service cannot read the message in clear.

Plausible deniability

Between other very nice properties, the OTR protocol also offers Plausible Deniability as well as Authenticity. This means, that when Alice and Bob are chatting, Alice and Bob can be sure that the messages they receive have really been send by their chat partners, and have not been altered. On the other hand, both Alice and Bob cannot prove to a third party, that any of their chat partner send a message with a specific content.

There is a trivial attack on these kinds of protocols. Assume that Alice chats with Bob, and Bob always asks Alice, to help him rob a bank. Now Alice would like to prove to a judge, that Bob really asks her to rob a Bank. A trivial way of doing this is handing over all keys of Alice to the Judge, so that the Judge can impersonate Alice and say Hi to Bob. Because Bob thinks, he is talking to Alice, he asks her again to rob a bank.

However, Alice might not be willing to hand over her keys to a Judge. Recently, greg found out, that there is a way how Alice can prove to a Judge, that Bob told her to rob a bank, without handing over her private keys. His approach uses Secure Function Evaluation: The concept of secure function evaluation is known for some time now: Assume that you have a function or an arbitrary computer program, that processes two inputs a and b, and generates an output c. Then two parties can jointly compute that function, each providing one of the inputs. The other party doesn’t learn anything about the inputs, and both parties get the output c.

Breaking plausible deniability

Effectively, Alice and the Judge can now jointly compute Alice sides of the protocol. Alice provides as input her private key, and the Judge provides all other inputs, including the messages send by Bob. The output of the jointly computed function can be either the short term communication keys, which Alice and Bob are using in the conversation, or the decrypted protocol messages send by Bob. In fact, this very generic approach can still be optimized exploiting some properties that are specific to the OTR protocol and the DH key exchange used in OTR.

Countermeasures

I assume that it will be very hard to counter this type of attack, because secure function evaluation is a very generic method, that is not bound to any specific properties of OTR.

However, please keep in mind, that this attack is only possible while Bob is still chatting with Alice. As soon as the communication is over, Alice cannot decide to go evil afterwards. Also, while Alice is able to prove the authenticity of the messages send by Bob to the Judge, the Judge cannot prove the authenticity of these messages to another party like a jury.

Categories: protocol, pubkey Tags:

Ron was wrong, Whit is right – Weak keys in the internet

April 6th, 2012 No comments

Today, Maxime Augier gave a great talk about the state of security of the internet PKI infrastructure. The corresponding paper written by Lenstra, Hughes, Augier, Bos, Kleinjung, and Wachter was uploaded to eprint.iacr.org archive a few weeks ago. In a nutshell, they found out that some RSA keys, that is often used in the SSL/TLS protocol to secure internet traffic, are generated by bad pseudo random number generators and can be easily recovered, thous providing no security at all.

The RSA cryptosystem

RSA is one of the oldest and most famous asymmetric encrytion schmes. The key generation for RSA can be summarized as follows:

For a given bitlength l (for example l = 1024 or l = 2048 bits), choose randomly two prime numbers p and q of of bitlength l/2. Choose a number 1 < e < (p-1)*(q-1), that has no divisor in common with (p-1)*(q-1). Many people choose e = 2^{16+1} here for performance reasons, but other choices are valid as well. Now, the number n = p*q and e form the public key, while d = e^{-1} \bmod (p-1)*(q-1) is the private key. Sometimes, the numbers p and q are stored with the private key, because they can be used to accelerated decryption.

To encrypt a message m, one just computes , and to decrypt a message, one computes m = c^d \bmod n. However, we don’t need that for the rest of this text can safely ignore that.

How random do these numbers need to be?

When generating cartographic keys, we need to distinguish between just random numbers and cryptographically secure random numbers. Many computers cannot generate real random numbers, so they generate random numbers in software. For many applications like computer games and for example simmulations of experiments, we only need number that seem to be random. Functions like “rand()” from the standard c library provide such numbers and the generation of these numbers is often initialized from the current system time only.

For cryptographic applications, we need cryptographically secure random numbers. These are numbers that a generated in a way, that there is no efficient algorithm, that distinguish them from real random numbers. Generating such random numbers on a computer can be very hard. In fact, there have been a lot of breaches of devices and programs, that used a bad random number generator for cryptographic applications.

What has been found out?

From my point of view, the paper contains two noteably results:

Many keys are shared by several certificates

6 185 228 X.509 certificates have been collected by the researchers. About 4.3% of them contained an RSA public key, that was also used in another certificate. There could be several reasons for this:

  • After a certificate has expired, another certificate is issued, that contains the same public key. From my point of view, there is nothing wrong with doing that.
  • A company changes their name or is taken over by another company. To reflect that change, a new certificate is issued, that contains another company name, but still uses the same key. I don’t see any problems here either.
  • A product comes with a pre-installed key, and the consumer has to request an certificate for that key. The same key is shipped to several customers. From my point of view, this is really a bad idea
  • Or there might be really a bad random number generator in some key generation routines, that two entities that are not related come up with the same RSA public (and private) key. This is a security nightmare.

Some keys share a common divisor

This is definitely not supposed to happen. If two RSA keys are generated, that share a common divisor by the same or by different key generation routines, the private key for both public keys can be easily determined, and the key generation routine is deeply flawed.

What are the consequences?

For those, who use an RSA public key, that shares a modulus with another different RSA public key, their key provides no protection at all. All implementations, that generated these keys definitely need to be updated and the certificates using the weak keys need to be revoked.

Which devices and vendors are affected?

Because disclosing the list of affected devices and vendors would compromise the security of these systems immediately, and allow everyone to recover their secret RSA keys, this has not been disclosed.

Categories: misc Tags:

GMR-1 cipher specifications are now public

February 7th, 2012 No comments

A reference implementation of the GMR-1 cipher has now been released. You can download the sourcecode from http://cgit.osmocom.org/cgit/osmo-gmr/tree/src/l1/a5.c.

Here are the most important facts in a nutshell:

  • 4 Linear Feedback Shift registers in Fibonacci configuration.
  • 81 bits of state
  • Registers 1, 2, and 3 of length 19, 22, and 23 bits control the output
  • Register 4 of length 17 bits controls the clocking of registers 1, 2, and 3, and does not contribute to the output
  • Clocking control is a simple majority clocking, so that registers are clocked 0 or 1 times
  • The output combiner is a quadratic function

So in fact, this cipher looks like a A5/2 clone from GSM.

Categories: misc Tags:

Don’t trust satellite phones – The GMR-1 and GMR-2 ciphers have been broken [UPDATE]

February 2nd, 2012 No comments

(cc) Iridium phone

Today, February 2nd 2012, Benedikt Driessen and Ralf Hund gave a very interesting talk at Ruhr Universität Bochum about their work on satellite phone security. In a nutshell, they were able to reverse engineer and to break the secret ciphers used in many satellite phone systems, namely the GMR-1 and the GMR-2 ciphers.

 

What are satellite phones?

We all know cell phones, also known as GSM, UMTS or 3G phones. These phones communicate with their network operators stations. Usual ranges are a few hundred meters in a city, or up to 40 kilometers in very sparsely populated areas. In many countries in Europe, there is like 99% coverage, so that it is hard to find a place outside with no network coverage. Of course, cell phones tend not to work in tunnels or basements, but outside, the network coverage is usually fine.

There are many countries in the world like the United States of America, Russia or China, with many underpopulated areas, without any network coverage. Also there is no network coverage on the oceans and in many third world countries. But there is a solution, phones communicating with a satellite instead of a ground station can be used there.

How do satellite phones differ?

All phones in Germany use a common standard, and phones can usually be used on different networks, like the O2, Vodafone, E-plus or T-Mobile network. In contrast to that, there are also many different providers in the United States, but they use different systems. For example AT&T and T-Mobile USA are running GSM networks, but Sprint uses a different system, so that you can’t use your T-Mobile phone on the Sprint network. Also for satellite phones, there are many different standards.

A very common standard made by ETSI is GMR-1, which is used by Thuraya and many more operators. The GMR-1 specifications, except for the encryption algorithms are public and available on http://www.etsi.org/. Like many systems maintained by ETSI, GMR-1 uses a proprietary stream cipher cipher, that is not available to the public. Another standard for satellite phones is GMR-2, which is used by Immarsat. Again, a proprietary stream cipher is used for encryption.

GMR-1 and GMR-2 ciphers have been reverse engineered

A few days ago, it was announced that the GMR-1 and GMR-2 ciphers have been reverse engineerd. From the announcement:

…The first main cont­ri­bu­ti­on is that we were able to completely re­ver­se en­gi­neer the en­cryp­ti­on al­go­rith­ms em­ploy­ed. Both ciph­ers had not been pu­bli­cly known pre­vious­ly. We de­scri­be the de­tails of the re­co­very of the two al­go­rith­ms from fre­e­ly avail­able DSP-firm­ware up­dates for sat­pho­nes, which in­clu­ded the de­ve­lop­ment of a cust­om di­sas­sem­bler and tools to ana­ly­ze the code, and ex­ten­ding prior work on bi­na­ry ana­ly­sis to ef­fi­ci­ent­ly iden­ti­fy cryp­to­gra­phic code. We note that these steps had to be re­pea­ted for both sys­tems, be­cau­se the avail­able bi­na­ries were from two en­t­i­re­ly dif­fe­rent DSP pro­ces­sors…

Just that a cipher has been reverse engineered and is now public is not an indication for any kind of weakness of the cipher. Many popular encryption systems, like SSL/TLS that is used for online banking use cryptographic algorithms that are known to the public and have never been designed as closed source systems. In fact, one of the golden rules in cryptography, known as “Kerckhoffs’s principle” states that the security of a system should never be based on keeping the system private. Instead it should just be based on keeping the secret keys used in the system private.

The reverse engineering of GMR-1 has been done by disassembling firmware updates for the Thuraya SO-2510 phone. Most modern phones consist of 2-3 different processors. A main CPU for all the applications and the graphics and the user interface. Another CPU (DSP) is mostly responsible for signal processing and speech coding, and some very low level parts are implemented in dedicated hardware. Because the market for satellite phones is quiet small compared the GSM market, it is usually cheaper to implement these stream ciphers in software on the DSP, instead of hardware, so that standard chips with a custom firmware can be used, instead of a more expensive custom chip design. If a firmware can be extracted from the chip, or is available as a firmware update, this also eases the reverse engineering of these encryption algorithms. For the for the Thuraya SO-2510, the stream cipher has been found inside the DSP code, but not on the ARM host code. Here, a Texas Instruments C55x DSP was used.

Because one may assume, that these stream ciphers are similar to the GSM ciphers, they decided to search through the dissembled DSP code, and look for functions using a lot of XOR and shift operations. These two instructions are mainly used for bit manipulations when implementing linear feedback shift registers in software, but are not so common in none crypto code.

To reverse GMR-2, a Immarsat IsaPhonePro was used. Again, the firmware was analyzed, and the code for a Blackfin DSP was extracted. Surprisingly, the reverse engineered GMR-2 cipher is very different from GMR-1, and also very different from any cipher used in GSM or DECT. Instead of bits, the cipher operates on 8 bit registers (byte-registers). Also surprisingly, two S-Boxes from DES are used in this cipher.

Attacks on GMR-1 and GMR-2 are possible

The GMR-1 cipher is very similar to A5/2, 4 LFSRs are used, but the clocking is only controlled by a single register R4. As a result, one can even use ciphertext only attacks against GMR-1, which means only valid cipher texts are required for the attack. This type of attack is much more powerful than just a known plaintext attack, because it works with any kind of ciphertext, even without knowing the plaintext. This is possible, because there are parity checks inside the plaintext, that reveal the structure of the encrypted plaintext. The attack is similar to the attack described in http://cryptome.org/gsm-crack-bbk.pdf. The whole attack can be executed in less than 30 minutes on a standard PC.

For GMR-2, a known-plaintext attack is possible. In a nutshell, with a certain probability, some bits of the keystream will only be generated from Bytes 0 and 4 of the key. One can observe many keystreams and come up with a highly likely hypothesis for these two bytes of the key, and do a brute force search on the remaining key bytes. In total, keystreams from 5-6 frames (50-65 bytes of the keystream) are required to come up with a good hypothesis. After this, they key can be recovered within a second on a standard PC. However, so far it is unknown how difficult it is, to recover plaintexts.

UPDATE: The full paper is now available: http://gmr.crypto.rub.de/paper/paper-1.pdf

Categories: symmetric Tags:

Sovereign Keys – A proposal for fixing attacks on CAs and DNSSEC

January 18th, 2012 No comments

The EFF presented their proposal how to improve the security of SSL/TLS and the internet PKI infrastructure. To understand their proposal, one needs to understand how PKI in the internet works today:

Public Key Cryptography

In cryptography, you can usually encrypt and decrypt  data. In the past, encryption and decryption used the same key. Starting from the 70s, a new class of encryption/decryption algorithms was invented, the public key encryption algorithm. Instead of using the same key for en- and decryption, these algorithms use different keys for en- and decryption. During key generation, two keys are generated: A public key, that is used to encrypt data, and can be given out to everybody in the word, and a corresponding secret key, that must be kept hidden by the owner. Everybody who has access to the public key can encrypt data, but only the owner of the secret key is able to decrypt it.

There are many algorithms, for example RSA and ElGamal are the most famous public key encryption algorithms, while other algorithms like McEliece and Rabin are less well known.

Besides encryption, there are also digital signature algorithms. Again, a public and a private key is generated. The private key can be used to generate a digital signature on a document. The public key can then be used to verify the signature on the document. A signature on a document should guarantee that the document was really signed by the holder of the private key, and was not altered afterwards.

PKI

These ideas sound simple at the first look, but in practice, getting a public key of a person or company is not that easy. Just publishing your public key in some kind of web forum or on your facebook page is not enough. Everybody would be able to create a facebook page for another person, and then posting a fake public key on that page, or under that persons name on a web forum. So we need a way to establish a binding between a public key and a person or identity (a company name, a domain name or an email address). One solution would be to meet everybody in person who you want to communicate with, but it doesn’t scale well, and not everybody wants to fly to San Jose, California, just to get the public key for paypal.com.

For these job, Public Key Infrastructue (PKI) and X.509 Certificates have been invented. A Certification Authority (CA) is an organization, that verifies the identity of a person, and that this person is in possession of a private key. After this has been confirmed, the CA issues a X.509 certificate. That certificate contains the corresponding public key of that person, and it’s identity, and this information is signed using the CAs private key. Everybody who thinks that this CA does a good job in verifying the identity of persons, and is in possession of that CAs public key can verify that signature. As from now on, one only needs to trust a CA. One can simply give away the certificate issued by a CA, and everybody can get the public key from the certificate, and verify that it really belongs to that person, by verifying the signature of the CA. Today, there are hundreds of CAs active on the internet, and every web browser comes with a pre-installed list of trustworthy CAs and their public keys.

SSL/TLS

To encrypt HTTP traffic and to prove the autenticity of a website, the SSL/TLS protocol was created. When a session to a web server is established, the web server usually provides a digital certificate containing the public key of that web server. The web browser verifies the signature on that certificate, and that the identity in that certificate matches with the servers name it want’s to connect to. If everything is fine, the public key in that certificate is used to establish a secure session with that web server using some kind of key derivation scheme. (I won’t go into detail here)

The current state of PKI in the internet

At the first look, this sounds like a perfect solution. Whenever I want to talk privately with paypal, I just point my web browser to https://www.paypal.com/, it automatically connects to the server, gets a certificate, verifies that is has been correctly signed by a trustworthy CA, and the identidy in the certificate matches the expected servers hostname.

Attack vectors

However, there are multiple problems with that system. Just to mention one example: There are hundred of CAs active in the internet, and your web browser trusts every single one of them. Every CA is allowed to issue a certificate for every domain name in the internet. For example the national Chinese stat CA is allowed to issue a certificate for http://www.defense.gov/, which is the website of the ministry of defense of the united states of america. Also, the verification done by most CAs is minimal. For many CAs, it is sufficient if you can receive a mail for hostmaster@domain.tld, to get a certificate for domain.tld. There are multiple ways how you can attack this:

CAs

First of all, you may find a bug in the CAs website or email server, that allows you to get access to the certificate issuing software, bypassing these checks.

DNS

Also, you might be able to attack a DNS server serving the zone-file for domain.tld, that allows you to reroute mail for hostmaster@domain.tld on the DNS level. This allows you to get a certificate for domain.tld too.

Routers

Routers, especially those using BGB or a similar protocol might be tricked into rerouting the traffic for the mail server of domain.tld to your network. This way, you can intercept the mail and get your certificate too.

Crypto

Besides that, weak cryptography  algorithms like MD5 have been used by some CAs, and this has been used to generate a rouge certificate too.

The EFF solution

To improve the security of PKI, the EFF has presented a proposal: Sovereign Keys

Sovereign Keys should make it harder for an attacker to generate a new certificate for an HTTPS website, without the cooperation of the legitimate site operator. The main building block of Sovereign Keys are so called timeline servers. These timeline servers are append-only databases, meaning that one can only add entries to the database, but never modify or delete them. These timeline servers could be operated by different entities like the EFF itself, or Mozilla, Google or Microsoft.

To use Sovereign Keys, the side administrator obtains an X.509 certificate as usual. Then he generates a new key, the so called sovereign key. He uploads the key with the certificate to a timeline server. The server operator checks, if that certificate is really issued by a valid CA and no other sovereign key has been added previously, and adds the sovereign key with the hostname of the certificate to the database.

When a client connects to the website, he also requests all database entries belonging to that hostname from a timeline server. In parallel to that, a SSL/TLS connection is established. The Server delivers the server certificate to the client, with an additional signature created with the sovereign key. The client can then check, if this signature can be verified with the sovereign key retrieved from the timeline server.

More details

The full protocol is a little bit more complex, because it needs to deal with revocation, privacy, mirroring and load balancing the timeline servers and many more things. It has not yet been finalized, but a draft of the protocol can be downloaded from: https://git.eff.org/?p=sovereign-keys.git;a=blob_plain;f=sovereign-key-design.txt;hb=master

Summary

For me, this looks like one of two solutions you need to improve the general security of SSL/TLS. Sovereign keys is a great solution for website operators that care about the security of their users. It will not help a user, if the website he connects to does not use it. For these cases, a different solution should be used, like checking if multiple computers in the internet get the same certificate from the server.

Categories: 28C3, protocol, pubkey Tags:

Bitcoin – An Analysis

December 29th, 2011 No comments

Kay Hamacher and Stefan Katzenbeisser presented their analysis of Bitcoin at 28C3. Besides the usual cryptographic analysis, they pointed out some important aspects of the system:

Bitcoin doesn’t scale well, and no update mechanism has been designed!

Bitcoin was created as a total decentralized system. There is also no central authority or update mechanism, that could alter the system. As soon as Bitcoin hits it’s boundaries, all Bitcoin users need to agree on a change of the system and integrate it. Because every user of Bitcoin needs to keep a full log of all transactions, that have ever been executed on the Bitcoin network, it is clear that Bitcoin will hit it’s scaleability boundaries rather soon, if it gets more popular.

Bitcoins can bet lost and the number of Bitcoins that can be generated is fixed!

Bitcoins are a digital currency, that are stored on the current owners computer. If for example the harddisk crashes, or it is encrypted an the password is lost, these Bitcoins cannot be spend anymore. They still exist in the network, and also cannot be regenerated. One could also say, these Bitcoins are lost. Also, the total amount of Bitcoins, that can ever be generated is fixed. So we need to expect, that the total amount of Bitcoins will start to decline, as soon as all Bitcoins have been generated. Just assume, that 1% of all Bitcoins are lost per year, then after  \log_{0.99} (0.5) \approx 69  years, only 50% of the Bitcoins will still be there.

Bitcoin is not untraceable and anonymous!

Because Bitcoin keeps a log of all transactions, and this log is available to the public, one can trace which address has received how much money, and where it went. Bitcoin allows a user to have more than one identity, but as soon as money from more than one address is used in a single transaction, one can assume that these addresses belong to the same user. One can for example see, who spend money to Wikileaks, and where Wikileaks transferred that money. Also if you assume, that a person ears money only from Bitcoin, you know his total income. You also know when he transfers money on Bitcoin and when not, so you might find out when he sleeps and in which time zone he lives in.

What has not been found…

As mentioned at the beginning of the post, there is no central update mechanism. So if somebody would find a bug in the design of the system, that allows him to steal money from it, it cannot easily be fixed. So far, no attack one the basics of the bitcoin system has been found and bitcoin is running and getting more and more popular.

Categories: 28C3 Tags:

Time is on my Side – Exploiting Timing Side Channel Vulnerabilities on the Web

December 29th, 2011 No comments

Sebastian Schinzel gave an interesting talk today at 28C3, about timing side channel attacks against web applications. (Timing-) Side channel attacks are known in the cryptography world for a long time, and many algorithms like RSA or AES have been successfully attacked. In a nutshell, an attacker measures the time a device needs to process a request (usually an encryption or decryption), and can draw conclusions from that to the values of secret input parameters (a plaintext or a secret key).

(cc)Sebastian showed, that this can be used against none cryptography web applications as well. Instead of just presenting his attacks, he presented general methods how to do timing measurements against web applications first. For example, a web application could perform the following sequence of checks during a user login:

  1. Does the account exist?
  2. Is the account of the user locked?
  3. Has the account expired?
  4. Is the password correct?

If one of these checks fails, the procedure is aborted and an error page is send to the user. Of course, each of these steps requires some time, and from the time it takes from the request to the generation of the error message, one might guess, which of these steps went wrong.

The second attack presented in this talk was a timing attack on an implementation of the XML encryption standard using a PKCS#1.5 padding. Here, the server needs a longer time to process a request, depending on the padding inside the encrypted payload.

For me, my personal highlight was the extension of timing based side channel attacks to none cartographic web applications. I assume, if one would check some famous web applications, many of such timing leaks could be found, because web developers usually don’t care about timing side channels. The timing difference could also be used to assist blind SQL injection attacks, where the timing difference could be the only channel back to the attacker.

Unfortunately, the slides are not (yet) available, but a previous paper describing the methods can be found at http://sebastian-schinzel.de/_download/cosade-2011-extended-abstract.pdf.

Categories: 28C3, protocol, sidechannel Tags:

Effective DoS attacks against Web Application Plattforms – #hashDoS [UPDATE3]

December 28th, 2011 No comments

Julian Wälde (zeri) and Alexander Klink (alech) presented a very nice way how to bring down many popular websites at 28C3. The idea is quiet simple, effective and general. Many programming languages, especially scripting languages, frequently use hash tables to store all kinds of data.

Hash Tables

Hash tables (which are very well explained on wikipedia) are data structures, that store key-value pairs very efficiently. Adding new entries to the table, looking up entries in the table for a given key, and deleting entries are usually executed in O(1) in best and average case, which means that the time for each operation is usually constant, and doesn’t depend on the number of entries stored in the table. Other data structures like binary tree type structures need O(\log(n)) entries in the average case, which means that they need more time for these operations when a lot of entries are stored. (n is the number of entries stored) However, there is a drawback, hash tables need O(n) operations for these operations in the worst case, compared to O(\log(n)) operations for binary trees, which means that they are much slower in very rare cases.

How the attack works

Many hash table implementations are very similar. They use a deterministic hash function to map a key k (usually a string or another complex data structure) to a single 32 or 64 bit value h. Let l be the size of the current table. Then  h%l is used as an index in this table, where the entry is stored. If more than one entry is mapped to the same index, then a linked list of entries is stored at this index position in the table. If many different keys are all mapped to the same hash value h, then all these entries are stored at the same index of the table, and the performance of the hash table goes down to a simple linked list, where all operations need O(n) time.

Performance

Just to get an impression, how much CPU time it takes to process such a request on a Core i7 CPU with PHP, assuming that the processing time for a request is not limited (usually, PHP limits the processing time for a request to 1 minute):

  • 8 MB of POST data –  288 minutes of CPU time
  • 500k of POST data – 1 minute of CPU time
  • 300k  of POST data – 30 sec of CPU time

So you can keep about 10.000 Core i7 CPU cores busy processing PHP requests using a gigabit internet connection. Alternatively for ASP.NET, 30,000 Core2 CPU cores, or for Java Tomcat 100,000 Core i7 CPU cores, or for CRuby 1.8 1,000,000 Core i7 CPU cores, can be kept busy with a single gigabit connection.

Even though this blog is named cryptanalysis.eu, these hash functions used for these tables are not cryptographic hash functions like SHA256, instead simple hash functions like djb2 are used, and it is very easy to find many inputs mapping to the same output. Julian and Alexander did a great job with checking many programming languages used for web applications for their hash table implementation and hash functions. For all of them they checked, they managed to find a lot of keys mapping to the same output, except for Perl. Perl uses a randomized hash function, i.e. the hash doesn’t only depend on the key, but also on an additional value, that is chosen at startup of the application at random. All other languages also store the query parameters send in an HTTP GET or POST request in an hash table, so that a request with many query parameters all mapping to the same hash value will slowly fill such a hash table, before the first line of code written by the application programmer will be executed. Filling this hash table will usually take several minutes on a decent CPU, so that even a fast web server can be kept busy using a slow connection.

Countermeasures

One may ask, can this problem be fixed by using a modern hash function like SHA256? Unfortunately, this will not solve the problem, because hash tables are usually very small, like 2^{32} entries, and one can still find a lot of values where SHA256(m) maps to the same value modulus 2^{32}. Also fixing this in the parser for the parameters string seems to be a bad solution, because many programmers use hash tables to store data retrieved from a user or another insecure source. From my point of view (which is also the opinion of the speakers), the best solution is to use randomized hash functions for hash tables. However, there are several kind of fixes deployed by the vendors:

Limiting HTTP POST and GET request lengths (Microsoft ASP.NET)

Microsoft suggests to limit the HTTP request length. Most applications don’t require very long HTTP requests, except for file uploads.

<configuration>
 <system.web>
 <httpRuntime maxRequestLength="200”/>
 </system.web>
</configuration>

As long as you don’t process data in a hash table from any other sources, except from the HTTP request (like external URLs), this should prevent the basic form of the attack.

Microsoft has also released an emergency patch against this attack: http://technet.microsoft.com/en-us/security/bulletin/ms11-100.mspx

Limiting the number of different HTTP request parameters (PHP, Tomcat)

PHP has added a new configuration variable max_input_vars, that can limit the number of parameters. This is similar to the first solution, except that here, not the total length of the request is limited, instead the number of different parameters that can be submitted in a single request is limited. This can be easier to use than limiting the length of the HTTP request, because application programmers usually know, how many parameters they need, while the maximum length of the request is harder to predict. However, if a single request parameter contains a data structure, that is later parsed and put into a hash table, the application might still be vulnerable.

Apache Tomcat also deployed a similar fix, that limits the number of parameters to 10000 by default.

Using different data structures

As mentioned before, other data structures (binary tree type structures like an AVL-tree gurantee, that all operations are always executed in O(\log(n)), and never need O(n) time. This is what for example Daniel J. Bernstein suggests. This completely fixes the vulnerability, but requires fundamental changes to the runtime/compiler/interpreter, and might be harder to implement.

More Information

More information can be found on Twitter, using the hashtag #hashDoS or from the user @hashDoS!

oCERT has a good summary about affected and fixed software versions: http://www.ocert.org/advisories/ocert-2011-003.html

Video and Paper

The video of the talk is available on YouTube. There is an advisory on full disclosure that describes the attack in detail: http://permalink.gmane.org/gmane.comp.security.full-disclosure/83694

 

Categories: 28C3 Tags:

Encrypted Traffic Mining (TM) – e.g. Leaks in Skype

December 28th, 2011 No comments

Stefan Burschka presented a nice attack against Skype on 28C3. The attack allows you to detect a sentence or a sequence of words in an encrypted Skype call, without having to break the cryptography used in Skype. Skype is a famous internet telephony application, that offers free voice calls. To save bandwidth, Skype uses an advanced audio codec. The used bandwidth and the size of the packets send by Skype depends heavily on the audio, that is encoded. The packet length is no hidden by the encryption of Skype and visible to an eavesdropper.  Just using these information, Burschka could distinguish between difference sentences in an encrypted Skype call.

Two methods were used to improve these results:

  • Dynamic Time Warping (DTW) is an algorithm used by many speech recognition systems, to compare an audio sample with a set of reference samples of slightly different length.
  • A Kalman Filter, which is a method to remove noise from measurements. I think here, noise is a slight variation of the packet lengths.

That the amount of communication between two parties and the actual timing of packets can leak information about the content of an encrypted communication was known for a long time, and has been successfully used for an attack against the TOR onion router network. However, this is the first time, that I have seen this method applied against Skype. The methods here are so generic, that they might be applied as well against other VoIP systems like SIP, if they are encrypted or communicate over a VPN.

The full paper is available at http://www.csee.usf.edu/~labrador/Share/Globecom/DATA/01-038-02.PDF and the slides can be downloaded from http://events.ccc.de/congress/2011/Fahrplan/attachments/1985_CCC.pptx.

Categories: 28C3, protocol Tags: