Stefan Burschka presented a nice attack against Skype on 28C3. The attack allows you to detect a sentence or a sequence of words in an encrypted Skype call, without having to break the cryptography used in Skype. Skype is a famous internet telephony application, that offers free voice calls. To save bandwidth, Skype uses an advanced audio codec. The used bandwidth and the size of the packets send by Skype depends heavily on the audio, that is encoded. The packet length is no hidden by the encryption of Skype and visible to an eavesdropper. Just using these information, Burschka could distinguish between difference sentences in an encrypted Skype call.
Two methods were used to improve these results:
- Dynamic Time Warping (DTW) is an algorithm used by many speech recognition systems, to compare an audio sample with a set of reference samples of slightly different length.
- A Kalman Filter, which is a method to remove noise from measurements. I think here, noise is a slight variation of the packet lengths.
That the amount of communication between two parties and the actual timing of packets can leak information about the content of an encrypted communication was known for a long time, and has been successfully used for an attack against the TOR onion router network. However, this is the first time, that I have seen this method applied against Skype. The methods here are so generic, that they might be applied as well against other VoIP systems like SIP, if they are encrypted or communicate over a VPN.
The full paper is available at http://www.csee.usf.edu/~labrador/Share/Globecom/DATA/01-038-02.PDF and the slides can be downloaded from http://events.ccc.de/congress/2011/Fahrplan/attachments/1985_CCC.pptx.