Sunday, October 27, 2013

Case #2: The alpha-padding permutation cipher

Requirements: PHP 5.4+ and Apache web server - both installed and running
Especially for this article, references will be noted by letters instead of numbers.

For this week, I'll be using a different approach of explaining the algorithm that I refer as "understanding in the storm". This will make use of special references within the codes. References will be made to some comments Also after reading some articles on the permutation cipher, I got inspired and developed an enhanced version of this last. Indeed the traditional permutation cipher was shown to be vulnerable to brute force attacks and frequency analysis.

First of all, let me quickly explain how permutation ciphers work (You can skip it though).

1.The traditional permutation cipher
We choose a keyword, and split the plaintext (the text to encrypt) into blocks that are the same length as the keyword. We write this in columns beneath the keyword( It can be represented as arrays in programming). We then label each keyword letter in alphabetical order (if there are duplicates we take them in order of appearance). Now we reorder the columns, so that the numbers are in order (the letters of the keyword are in alphabetical order). We now read across the rows[a].
For instance, suppose you want to process this plaintext using the key beac: kanel is awesome

The two tables show how it goes using the procedure described above.
   
            b       e       a      c
            2       5      1      3
            k        a        n        e
            l                   i         s
                      a        w       e
            s        o        m       e
  
====>              a       b       c      e
             1       2      3      5
             n         k       e        a
             i           l       s        
             w                 e        a
             m        s       e        o
The original plaintext broken into blocks having the same lenght than the key and sorted in labeled columns The blocks ordered after sorting the key's digits

 Therefore, after encrypting, the result (the ciphertext) will be: nkeails w eamseo. You see now the relation with anagrams. The plaintext and the ciphertext are anagrams in permutation ciphers.
However, this technique has a lot of inconviniences:
(a) The information relating to the length of the key is revealed by the length of the ciphertext
(b) Such permutation is breakable by trying out different permutations until you find a first block that makes sense. Once you have worked out the permutation, you can then use this to decrypt the rest of the intercept easily. That's brute force attack.
(c) Another way to break it is by frequency analysis. Basically, you consider some specific commonly used letters of the permutation's language (english, french,...) that are parts of the ciphertext, then you study their frequency [b] in order to determine which one is more likely in the plaintext and thus, infer the plaintext.

Looking at these, I said to myself: "With such flaws, such cipher is definitely useless". So I started working on a more mature useful permutation-type cipher that I called: the alpha-padding permutation cipher .

2.The alpha-padding permutation cipher

Inspired from the traditional transposition cipher, the essence of this encryption method resides in three(3) main pieces: an alphabet, a padding parameter and the key.

2.1 How it works:
First, you choose a key and a padding parameter in line with the following requirements:
- Both must be numbers
- Both must be contain at least two(2) digits 
Now during the encryption, as its predecessor, the original plaintext is broken into blocks having the same length than the key and sorted in labelled columns. But this time, before the blocks are ordered after sorting the key's digits, each character of the plaintext-side content of the columns is mapped to a number based on an alphabet and its position in the alphabet; and the position is then padded using the padding parameter. A child parameter called subpad, based on the key and the padding parameter, is computed and  debited substracted from the total. The subpad is the cumulative join of the padding parameter's half within the length of the key.

Example:
Consider again the same plaintext above: kanel is awesome but now for the key, we will take 2513 and for the padding parameter, we choose 10 for instance. We define our alphabet as follows:  "00"=>" ","01"=>"a","02"=>"b", 
"03"=>"c","04"=>"d","05"=>"e","06"=>"f", "07"=>"g","08"=>"h","09"=>"i",
 "10"=>"j","11"=>"k","12"=>"l","13"=>"m","14"=>"n","15"=>"o","16"=>"p", 
"17"=>"q","18"=>"r","19"=>"s","20"=>"t","21"=>"u","22"=>"v","23"=>"w",
"24"=>"x","25"=>"y","26"=>"z","27"=>"_");

  2   5   1   3  
  k    a    n    e
  l           i     s
        a    w   e
  s    o    m   e
=>   2   5   1   3    
  11  01  14  05
  12  00  09  19
  00  01  23  05
  19  15  13  05
=>     2   5   1   3    
     21  11  24  15
     22  10  19  29
     10  21  33  15
     29  25  23  15
=>   1   2   3   5
  24  21 15  11
  19  22 29  10
  33  10 15  11
  23  29 15  25
The original plaintext broken into blocks having the same lenght than the key and sorted in labeled columns Mapping each block's characters to the alphabet Padding/Adding a noise to the map by adding 10 to the content of each block The blocks ordered after sorting the key's digits

The resulting ciphertext will hence be: 24211511192229103310151123291525
Finally, the subpad  (5555) is substracted from the resulting ciphertext  and the new ciphertext is: 24211511192229103310151123285970

In general, one can say that the alpha-padding permutation cipher is a vertical padded sort followed by a horizontal substraction.

For the decryption process, we simply first add the subpad and split the ciphertext into block of lenght 2. Next, for each block, we substract the padding paramater from each block's numbers and map the result to their corresponding alphabet's character. Then we form super-block of length = key's length, index each super-block with the sorted key's digit. Finally, we bring the plaintext back by bringing the indexes(key's digits) to the original order of the key. Without knowledge of the key and the padding parameter, it's almost impossible to crack this cipher.

Pretty simple and powerful; but strict though. As a matter of fact, apart from the initial requirements, two more constraints are added: Because the key's digits serve as array indexes, the key shall not contain repetitive digits (e.g: 1000, 1199,....). Furthermore, how good are the ciphered plaintext and deciphered ciphertext depends on how good is your alphabet. So you might want to revise it.

2.2 The code:
Now, let's take a look at the class behind the alpha-padding process:

 class.alphapad.php

<?php

class alphaPad {
  
   private $key = 0;
   private $padding_par = 0;
   private $alphabet = array ("00"=>" ","01"=>"a","02"=>"b","03"=>"c", 
   "04"=>"d","05"=>"e","06"=>"f","07"=>"g","08"=>"h", "09"=>"i", 
   "10"=>"j","11"=>"k","12"=>"l","13"=>"m","14"=>"n","15"=>"o", 
   "16"=>"p","17"=>"q","18"=>"r","19"=>"s","20"=>"t","21"=>"u", 
   "22"=>"v","23"=>"w","24"=>"x","25"=>"y","26"=>"z","27"=>"_");
    private $sub_par = 0;
    private $length = 0;
    private $plaintext = "";
    private $ciphertext = "";
    private $key_a = array();
      
    public function __construct($key, $padding_par){
        $this->key = $key;
        $this->padding_par = $padding_par;
        $this->length = strlen($this->key);
        $this->key_a = str_split($this->key);
        $this->sub_par = $this-> _sp($this->padding_par, $this->length);
    }
  
    public function _encrypt($plaintext){
        try{
            $this->ciphertext = "";
            $this->plaintext = $plaintext;
          print "<div id='out'>Plaintext: ".$this->plaintext." <br>(key: ".$this->key." - padding parameter: ".$this->padding_par.")";
          
            //- split the plaintext into block of lenght = key lenght [1]
            //- index each block using the key [2]
            //- and sort each block by the key's index [3
]
            $plain_a = str_split($this->plaintext, $this->length); /*[1]*/
          
            foreach ($plain_a as $block) {  /*[2]*/
                //- if a block is not a multiple of $lenght. e.g: "plai ntex t" for $lenght=4
                $empty_space=strlen($block)% $this->length;
                if( $empty_space > 0) {for($j=1; $j<=($this->length-$empty_space); $j++) {$block .="-";}}
              
                $block_a = str_split($block);
                $tmp = array(); //- we put all the changes in a temp array first
                for($i = 0; $i < $this->length; $i++){
                    //- $tmp[$this->key_a[$i]] = $block_a[$i]; - For the primitive permutation
               $tmp[$this->key_a[$i]] = (int)array_search($block_a[$i], $this->alphabet)+$this->padding_par; /*[2]*/
                    //- unmatchable index keys will be converted to 0
                    unset($block_a[$i]);  
                }
                $block_a = $tmp; unset($tmp);
                ksort($block_a);  /*[3]*/
                $this->ciphertext .= implode("",$block_a);
            }
            $this->ciphertext = $this->ciphertext - $this->sub_par;
            print "<br>Ciphertext: ".number_format($this->ciphertext, 0, '.', '');
        }
        catch(Exception $e){
            die ("An error occured");
        }
      
    }
  
    private function alpha(&$item, $key, $parameter){
        $item = (int)$item;
        $item = $item - $parameter;
        if(strlen($item) == 1) $item ="0".$item;
        if (isset($this->alphabet[$item])) $item$this->alphabet[$item];
        else $item = "-";            
    }
  
    public function _decrypt($ciphertext){
        try{      
            $this->plaintext = "";
            $this->ciphertext = $ciphertext;
            $this->ciphertext = $this->ciphertext +$this->sub_par; 
            print "<div id='out'>Ciphertext: ".$this->ciphertext." <br>(key: ".$this->key." - padding parameter: ".$this->padding_par.")";

           //- split the ciphertext into block of lenght = string lenght of the items in $alphabet = 2 [1]
            //- for each block, unpad and map to the alphabet character [2]
            //- form super-block of lenght = key lenght [3]
            //- sort the key in ascending order [4]
            //- index each super-block with the sorted key [5]
            //- bring the plaintext back by bringing the indexes(key digits) to the original order of the key. [6]


            $cipher_a = str_split($this->ciphertext, 2); /*[1]*/
          
            if($this->length%count($this->ciphertext)==0) {
                      
                array_walk( $cipher_a , array($this,'alpha'), $this->padding_par); /*[2]*/
                $cipher_a = array_chunk( $cipher_a, $this->length); /*[3]*/
                asort($this->key_a); /*[4]*/
              
                foreach($cipher_a as $block_a){
                    $tmp = array(); $i= 0;
                    foreach($this->key_a as $k=>$v){  /*[5]*/
                        if(isset($block_a[$i])) {
                            $tmp[$v] = $block_a[$i];
                            unset($block_a[$i]);
                        }
                        else $tmp[$v] = "-"; //- e.g: 25 31 21 --
                        $i++;
                    }
                    $block_a = $tmp; unset($tmp);
                    $block_a = $this->referential_ksort($block_a,$this->key); /*[6]*/
                    $this->plaintext .= implode("", $block_a);
                 }
                 print "<br>Plaintext: ".$this->plaintext;
                
             }
             else echo "Invalid ciphertext";
        }
        catch(Exception $e){
            die ("An error occured");
        }
    }
  
    private function referential_ksort($array, $sequence){
        //- sorting by array index and also by following a given sequence
        //- thus all the array keys must be members of the given sequence

        $sequence_a = str_split($sequence);$temp = array();
        foreach ($sequence_a as $s) $temp[$s] = $array[$s];
        unset($array);
        return $temp;
    }
  
    private function _sp($pad,$len){
        $sp = "";
        for($i=0; $i<$len; $i++) $sp.=(int)($pad/2); //- so that the substraction will always be > 0
        return $sp;
    }
}

2.3 Complexities (Time and Space):

Let l be length of the key, n the length of the original input array
We note n/l as the round up of n/l

For the encryption:
#Time complexity: O(l n/l) 
In the method _encrypt(), during the "for" loop [2], n/l blocks (from the initial block division) are traversed. Then in each loop, we have another loop of l rounds giving a total of l n/l time units. The computation of the $sub_pad requires l time units because of the "for" loop involved in the process of the method _sp(). However it's not included in the time complexity since it's computed at the same time the class is instantiated.
#Space complexity: O(l+n)
Again in the method _encrypt(), the computation of $key_a requires l space units, all the $block_a/$plain_a instances after splitting the plaintext takes n space units, the transition from $block_a to $tmp  and the other fixed variables  take only one (1) unit space.

For the decryption:
#Time complexity: O(n/l+(2l)n/l) 
In the quest of applying the method alpha(), the method _decrypt() will take a single time unit to traverse each of the n/l blocks. Similarly to _encrypt, the super 'for' loop will traverse each of the n/l block via another loop of l rounds; but this time, we have a game changer called referential_ksort() in each the l-rounds loop causing a total consumption of 2l n/l time units for this 2-dimensional traversal. 
#Space complexity: O(2l+n)
As in encrypt(), $key_a costs l space units, all the $block_a/$cipher_a take n space units. However the variable $sequence_a in referential_ksort in the 2-dimensional loop will cost an additional l. And we have the transition from $block_a to $tmp and the other fixed variable demanding a single space unit.
 
2.4 The results:
The test of its code was done in a regular Apache-PHP environment, nothing special) and shows encouraging results

 View the results|Download source

Memory overhead: Given a specific cipher, we define the memory overhead as follow: 
overhead = (size_of_plaintext / size_of_ciphertext)-1
In the experiment, the following function was in charge of capturing the memory footprint during the ciphering:

function sizeofvar($var) {
   $start_memory = memory_get_usage();
   $tmp = unserialize(serialize($var));
   return memory_get_usage() - $start_memory;
}

Note that this is in no way a reliable method, you can't be sure that nothing else touches memory while assigning the variable, so this should only be used as an approximation. The figure bellow shows the results:

   Loading ...   

Execution time: To efficiently test the running time of the cipher over plaintexts of random length,  we measured 5 times the running time over 5 different values of the padding parameter (10,47,84,135,206,313) and over 5 different values of the key (12,98,105,376,1010). For each round, we increased the length of the input plaintext. The results are plotted in figure below.

   Loading ... 

False positives: By false positives, we mean the scenario where the encryption always works but decryption not. Indeed, during the experimental phase, we found that false positives occur when the key's length is 1 (one) and the value of the padding parameter is 79 and above. That's why we suggest again to use keys containing at least 2 (two) digits.

2.5 Some practices in real-world:

#Name-based URLs: The detailed technique in this article can be used to match a username to a specific code. For instance, in a social network or any other dynamic computer system where user's data are stored in a XML file named using a user-based code (such code can be created by passing the user name to our alpha-padding permutation cipher, you might have a scenario where the user "johndoe" request to see the data of  "smith" via the URL: http://www.example.com/smith. This request will load the file 1929302324181210-data.xml (1929302324181210 is the ciphertext for "smith" for the key 2513 and the padding parameter 10).

#Digital signatures: Another practical example of our cipher is found in the identity management area. It can be used to protect files using a collaboration with trusted time-stamping. This scenario will be the topic of our next article. The result of such collaboration will be called Private Trusted Identity. In this case,  a document is created by someone called "John Doe Sr.". After signing the document with something he is (his name/social security number/fingerprint/...), the saving procedure will used his name to create a code using our alpha-padding permutation cipher. This code will be then joined to a time-stamp provided by a time-stamp authority to form a private key or digital signature. The document, the private key and the time-stamp will be rowed in a database. During the access to the document by another bud, the digital signature will be verified first before the green light is given.

Extended work:
A deeper research on the alpha-padding permutation cipher includes, but is not limited to:
- The sub-pad parameter (the variable $sub_pad variable in the code): The resulting ciphertext can be really large for some plaintext and therefore might not be supported by some PHP environment. In our presented study, the sub-pad parameter cannot not be longer than the key. Thus, future works will aim to provide a larger sub-pad parameter in order to considerably reduce the length of the ciphertext and hence the inference power of the attacker.
-  Another vista to pay attention is the customization of the alphabet. A more diverse and mature alphabet will give a bigger input field from the user's perspective. However such tailoring will require a conscious  adjustment of the cipher.

References:
(a) http://crypto.interactive-maths.com/permutation-cipher.html
(b) http://homepages.math.uic.edu/~leon/mcs425-s08/handouts/breaking_tranposition_cipher.pdf
(c) http://web.cs.du.edu/~ramki/courses/security/2011Winter/notes/Lecture4.pdf
(d) http://everything2.com/title/Permutation+cipher
(e) http://stackoverflow.com/questions/2192657/how-to-determine-the-memory-footprint-size-of-a-variable 

Thanks for your time and don't forget to comment and share this article, add me on G+ and subscribe to the blog's RSS feeds

No comments:

Post a Comment