Kenneth Falck's Blog

Decoding PKCS#1 padding in Python

Posted on 2011-03-07 by Kenneth Falck

If you need to decrypt data that has been encrypted with the RSA algorithm as specified by PKCS#1 (RFC 3447), you can use the PyCrypto toolkit.

PyCrypto provides a simple API to load an RSA key and decrypt the ciphertext:

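from Crypto.PublicKey import RSA

# Read the DER-encoded private key in binary mode
key = RSA.importKey(open('privatekey.der', 'rb').read())
# The result still carries the PKCS#1 v1.5 padding
print(key.decrypt("<ciphertext encrypted elsewhere>"))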

However, key.decrypt() performs only the raw RSA operation and does not remove the PKCS1-v1_5 padding, so the decrypted data still contains the padding bytes in front of the message.

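Luckily, the unpadding algorithm defined in section 7.2.2 of RFC 3447 (RSAES-PKCS1-V1_5-DECRYPT) is very simple. The encoded message (EM) separates the message (M) from the padding string (PS) with a NUL byte (0x00). PS consists of at least eight nonzero bytes, so the first NUL after the 0x02 marker always marks the end of the padding: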

EM = 0x00 || 0x02 || PS || 0x00 || M.
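To make the layout concrete, here is a minimal sketch that builds such a block by hand. The function name build_em_for_illustration and the 32-byte block size are my own assumptions for the demo, and the way zero bytes are replaced in PS is slightly biased, so treat it as an illustration of the format rather than a padding routine to use for real encryption.

```python
import os


def build_em_for_illustration(message: bytes, k: int) -> bytes:
    """Lay out EM = 0x00 || 0x02 || PS || 0x00 || M for a k-byte modulus."""
    # RFC 3447 requires the message to leave room for at least 8 bytes of PS
    if len(message) > k - 11:
        raise ValueError("message too long")
    ps_len = k - len(message) - 3
    # PS must contain only nonzero bytes; map any zero from os.urandom to 1
    ps = bytes(b or 1 for b in os.urandom(ps_len))
    return b"\x00\x02" + ps + b"\x00" + message


# Hypothetical 32-byte block, far smaller than a real RSA modulus
em = build_em_for_illustration(b"hello", 32)
print(em.hex())
```

The padding bytes are random, so the printed value differs on every run, but the block always starts with 0x00 0x02 and ends with 0x00 followed by the message.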

We can simply look for the NUL character in the decoded data and extract the final message:

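def pkcs1_unpad(text):
    """Strip PKCS#1 v1.5 padding from a Python 2 byte string; None if invalid."""
    if len(text) > 0 and text[0] == '\x02':
        # PS contains no NUL bytes, so the first NUL marks the end of the padding
        pos = text.find('\x00')
        if pos > 0:
            return text[pos+1:]
    return None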

Note that the function returns None if the padding is invalid. The data returned by key.decrypt() does not contain the leading 0x00 byte (the result is converted from an integer, which drops leading zeros), so the code assumes it starts with 0x02.
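For readers on Python 3, where decrypted data is a bytes object, the same idea could be sketched as follows. The name pkcs1_unpad_bytes is my own, and the extra check that PS is at least eight bytes long comes from RFC 3447 rather than from the function above, so this is a stricter variant under those assumptions and not a drop-in replacement.

```python
def pkcs1_unpad_bytes(data: bytes):
    """Return M from 0x02 || PS || 0x00 || M, or None if the padding is malformed."""
    if not data.startswith(b"\x02"):
        return None
    # Look for the separator after the 0x02 marker
    sep = data.find(b"\x00", 1)
    # Index 0 holds 0x02 and PS needs at least 8 bytes, so a valid
    # separator sits at index 9 or later; find() returns -1 if it is missing
    if sep < 9:
        return None
    return data[sep + 1:]


# Hand-built sample input: 8 padding bytes, separator, then the message
padded = b"\x02" + b"\xff" * 8 + b"\x00" + b"hello\nworld"
print(pkcs1_unpad_bytes(padded))
```

Like the original, it returns None for every kind of malformed input, so callers only have one failure case to handle.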

Disclaimer: I'm not a cryptography expert, and you should be very careful when building your own encryption schemes with the technologies described in this article, as many things can go wrong.

Update 2011-12-07: The original re.match() used for unpadding would break when the message contained newline characters (\n). It has been replaced with manual string processing in the example code.
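The newline problem is easy to reproduce. The pattern below is a hypothetical stand-in, not the regular expression originally used in this post; it only shows the general pitfall that "." does not match a newline unless re.DOTALL is set, written here for Python 3 bytes.

```python
import re

# Hypothetical unpadding pattern: marker, nonzero padding, NUL, then the message
PATTERN = rb"^\x02[^\x00]+\x00(.*)$"

padded = b"\x02" + b"\xff" * 8 + b"\x00" + b"line1\nline2"

# Without DOTALL, "." stops at the newline and "$" cannot match there
no_flag = re.match(PATTERN, padded)

# With DOTALL, "." also matches newlines and the whole message is captured
with_flag = re.match(PATTERN, padded, re.DOTALL)

print(no_flag)
print(with_flag.group(1))
```

Plain string searching as in pkcs1_unpad() avoids the issue entirely, since find() treats every byte the same.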