Insecure Deserialization: How to Prevent Deserialization Attacks
Understand how gadget chains enable RCE via insecure deserialization in Java, PHP, and Python, and learn safe deserialization patterns to protect your app.
Insecure deserialization had its own entry (A8) in the OWASP Top 10 2017 and falls under A08, Software and Data Integrity Failures, in the 2021 edition. It occurs when an application deserializes data from an untrusted source without validation, allowing an attacker to manipulate the serialized object and trigger unexpected behavior, up to and including arbitrary code execution. The attacks are language-specific, but the underlying pattern is universal: the deserializer instantiates objects and invokes methods based entirely on attacker-controlled bytes.
What Deserialization Attacks Exploit
Serialization converts an in-memory object graph into a byte stream that can be stored or transmitted. Deserialization reverses the process. The problem is that during deserialization, the runtime calls constructors, field setters, and special lifecycle methods (like Java's readObject() or PHP's __wakeup()) on classes that are already present in the application's classpath or autoload path.
An attacker who can control the serialized payload does not need to inject new code. They chain together methods on existing classes — classes already loaded by the application — to achieve a malicious outcome. These chains of existing objects used in unintended ways are called gadget chains.
Java Gadget Chains
Java's native serialization format (read with ObjectInputStream) has historically been the most exploited. The landmark 2015 research by Frohoff and Lawrence ("Marshalling Pickles") demonstrated that a Java application that deserialized untrusted data with a vulnerable version of Apache Commons Collections on its classpath could be exploited via a gadget chain ending in Runtime.exec().
A simplified view of how a gadget chain works:
- The deserializer reads the class name from the stream and instantiates it.
- readObject() is called on the top-level object.
- That object's readObject() calls a method on another object it holds a reference to.
- The chain propagates through several classes until it reaches a class that executes a system command.
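The steps above can be sketched as a toy model in Python. Everything here is hypothetical: the class names, the REGISTRY, and the toy_deserialize and load helpers are invented for illustration, and the final "gadget" only records what it would run instead of executing anything.

```python
# Toy model of a gadget chain. Nothing here is real exploit code; the
# classes and the "deserializer" are hypothetical and only record calls.
calls = []


class CommandRunner:
    """Stands in for a class that would execute a system command."""

    def __init__(self, command):
        self.command = command

    def run(self):
        calls.append(f"would execute: {self.command}")


class LazyTransformer:
    """Harmless-looking helper that calls a method on the object it holds."""

    def __init__(self, target):
        self.target = target

    def transform(self):
        calls.append("LazyTransformer.transform")
        self.target.run()


class CachedMap:
    """Top-level object whose restore hook touches a nested object."""

    def __init__(self, transformer):
        self.transformer = transformer

    def read_object(self):
        calls.append("CachedMap.read_object")
        self.transformer.transform()


# Classes the application already ships; the attacker adds no new code.
REGISTRY = {c.__name__: c for c in (CommandRunner, LazyTransformer, CachedMap)}


def toy_deserialize(node):
    """Instantiate whatever class name the payload asks for, recursively."""
    cls = REGISTRY[node["class"]]
    args = [toy_deserialize(a) if isinstance(a, dict) else a for a in node["args"]]
    return cls(*args)


def load(payload):
    """Rebuild the object graph, then call the top-level restore hook."""
    obj = toy_deserialize(payload)
    if hasattr(obj, "read_object"):
        obj.read_object()
    return obj


# The attacker controls only the data, yet decides which methods run.
payload = {"class": "CachedMap", "args": [
    {"class": "LazyTransformer", "args": [
        {"class": "CommandRunner", "args": ["id"]}]}]}
load(payload)
```

After load(payload) returns, calls lists the three hops in order, ending at the command-executing class, even though the payload contained only class names and arguments.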
Common gadget chain libraries include Apache Commons Collections, Spring Framework, Apache Commons BeanUtils, Groovy, and JDK classes themselves. Tools like ysoserial automate the generation of exploit payloads for known chains.
Detection: Look for usages of ObjectInputStream.readObject() in your codebase. Any deserialization of data that originates outside your own backend (HTTP request bodies, cookies, message queue payloads) is a risk.
Prevention in Java:
// Use a look-ahead ObjectInputStream that validates class names before instantiation
import java.io.*;
import java.util.Set;

public class SafeObjectInputStream extends ObjectInputStream {
    private static final Set<String> ALLOWED_CLASSES = Set.of(
        "com.example.myapp.SafeDto",
        "java.lang.String",
        "java.util.ArrayList"
    );

    public SafeObjectInputStream(InputStream in) throws IOException {
        super(in);
    }

    @Override
    protected Class<?> resolveClass(ObjectStreamClass desc)
            throws IOException, ClassNotFoundException {
        if (!ALLOWED_CLASSES.contains(desc.getName())) {
            throw new InvalidClassException(
                "Unauthorized deserialization attempt: " + desc.getName()
            );
        }
        return super.resolveClass(desc);
    }
}
The Java serialization filter API (ObjectInputFilter, available since Java 9) provides a cleaner built-in mechanism:
ObjectInputStream ois = new ObjectInputStream(inputStream);
ois.setObjectInputFilter(info -> {
    Class<?> clazz = info.serialClass();
    if (clazz == null) return ObjectInputFilter.Status.UNDECIDED;
    if (clazz == MyDto.class) return ObjectInputFilter.Status.ALLOWED;
    return ObjectInputFilter.Status.REJECTED;
});
PHP Gadget Chains
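PHP's unserialize() function is similarly dangerous. PHP's magic methods can be chained into gadgets: __wakeup() runs during deserialization, __destruct() runs when the resulting object is later destroyed, and __toString() and __call() run when other code treats the object as a string or calls a method it does not define.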
Real-world examples: Drupal (CVE-2019-6340) and Laravel have both had critical RCE vulnerabilities through PHP deserialization. The phpggc tool generates PHP gadget chain payloads for dozens of popular frameworks.
Vulnerable pattern:
// Never do this with user-controlled data
$userdata = unserialize($_COOKIE['session']);
Prevention in PHP:
The safest approach is to never use unserialize() on untrusted data. Use JSON instead:
// Safe alternative
$userdata = json_decode($_COOKIE['session'], true);
If you must use unserialize(), pass the allowed_classes option:
// Restrict which classes can be instantiated
$data = unserialize($input, ['allowed_classes' => ['MyApp\SafeModel']]);
// Or disallow all classes entirely (objects come back as __PHP_Incomplete_Class; scalars and arrays are unaffected)
$data = unserialize($input, ['allowed_classes' => false]);
The allowed_classes option, introduced in PHP 7.0, is the critical safety control. Without it, unserialize() will instantiate any class reachable by the autoloader.
Python Deserialization (Pickle)
Python's pickle module is explicitly documented as unsafe for untrusted data. Pickle can serialize and deserialize arbitrary Python objects, including objects whose __reduce__ method returns a callable and arguments — effectively a built-in code execution primitive.
import pickle
import os

class Exploit:
    def __reduce__(self):
        return (os.system, ('id > /tmp/pwned',))

payload = pickle.dumps(Exploit())
# Any application that does pickle.loads(payload) will execute the command
Detection: Search your codebase for pickle.loads, pickle.load, yaml.load (without a safe Loader such as SafeLoader), marshal.loads, and jsonpickle.decode.
Prevention in Python:
Replace pickle with JSON or other safe serialization formats:
import json
# Serialization
data = json.dumps(my_dict)
# Deserialization — safe, no code execution possible
parsed = json.loads(data)
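Where pickle cannot be removed right away, the allow-list idea from the Java and PHP sections has a Python analogue: overriding Unpickler.find_class, as described in the pickle documentation. The sketch below is a stopgap under stated assumptions, not a replacement for switching formats; the SAFE_GLOBALS set and the restricted_loads helper are hypothetical names chosen for illustration.

```python
import builtins
import io
import pickle

# Hypothetical allow-list: only these (module, name) pairs may be resolved.
SAFE_GLOBALS = {
    ("builtins", "set"),
    ("builtins", "frozenset"),
    ("builtins", "range"),
}


class RestrictedUnpickler(pickle.Unpickler):
    """Unpickler that refuses to resolve any global outside the allow-list."""

    def find_class(self, module, name):
        if (module, name) in SAFE_GLOBALS:
            return getattr(builtins, name)
        raise pickle.UnpicklingError(f"global '{module}.{name}' is forbidden")


def restricted_loads(data: bytes):
    """Drop-in analogue of pickle.loads() that uses the restricted unpickler."""
    return RestrictedUnpickler(io.BytesIO(data)).load()


# Plain containers need no global lookup, so they load normally.
ok = restricted_loads(pickle.dumps({"ids": [1, 2, 3]}))

# Hand-written protocol 0 pickle that asks for os.system("echo pwned").
malicious = b"cos\nsystem\n(S'echo pwned'\ntR."
try:
    restricted_loads(malicious)
    blocked = False
except pickle.UnpicklingError:
    blocked = True  # find_class rejected os.system before anything ran
```

The rejection happens when the stream asks for the os.system global, before the call opcode is reached, so the command never runs. Every entry added to the allow-list widens the attack surface again, which is why a data-only format remains the safer choice.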
For more complex object graphs, use dataclasses with dacite or Pydantic for validated deserialization:
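from pydantic import BaseModel

class UserPayload(BaseModel):
    user_id: str
    role: str

raw_json = '{"user_id": "42", "role": "viewer"}'  # example input

# Validates structure and types, raises ValidationError on invalid input
# (model_validate_json is the Pydantic v2 API)
payload = UserPayload.model_validate_json(raw_json)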
For YAML, always use yaml.safe_load() instead of yaml.load():
import yaml
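# Vulnerable: yaml.Loader can construct arbitrary Python objects
# (it was the default for yaml.load(user_input) before PyYAML 5.1)
data = yaml.load(user_input, Loader=yaml.Loader)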
# Safe
data = yaml.safe_load(user_input) # Only basic types
Detecting Deserialization Vulnerabilities
Static analysis
Search for dangerous deserialization calls by language:
- Java: ObjectInputStream, XMLDecoder, XStream.fromXML, Kryo.readObject
- PHP: unserialize( without allowed_classes
- Python: pickle.loads, yaml.load(, marshal.loads
- Ruby: Marshal.load
- Node.js: node-serialize, serialize-javascript (eval-based)
Semgrep has community rules for detecting these patterns in CI pipelines.
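As a rough illustration of what such a search looks like, the sketch below greps source text for the calls listed above. It is a line-based regex scan, far cruder than a real static analyzer; the RISKY_CALLS table, scan_text, and scan_tree are hypothetical names, and the patterns are an assumption about how these calls usually appear in code.

```python
import re
from pathlib import Path

# Hypothetical pattern table keyed by file extension, based on the list above.
RISKY_CALLS = {
    ".java": [r"\bObjectInputStream\b", r"\bXMLDecoder\b",
              r"\bfromXML\s*\(", r"\breadObject\s*\("],
    # unserialize( with no allowed_classes option on the same line
    ".php": [r"\bunserialize\s*\((?![^)]*allowed_classes)"],
    ".py": [r"\bpickle\.loads?\s*\(", r"\byaml\.load\s*\(",
            r"\bmarshal\.loads?\s*\("],
    ".rb": [r"\bMarshal\.load\b"],
    ".js": [r"node-serialize", r"serialize-javascript"],
}


def scan_text(suffix, text):
    """Return (line_number, stripped_line) for each line matching a risky call."""
    patterns = [re.compile(p) for p in RISKY_CALLS.get(suffix, [])]
    hits = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        if any(p.search(line) for p in patterns):
            hits.append((line_no, line.strip()))
    return hits


def scan_tree(root):
    """Yield (path, line_number, line) for every hit under a directory."""
    for path in Path(root).rglob("*"):
        if path.is_file() and path.suffix in RISKY_CALLS:
            text = path.read_text(errors="ignore")
            for line_no, line in scan_text(path.suffix, text):
                yield path, line_no, line
```

A hit is only a lead to review by hand: the scan cannot tell whether the data being deserialized is attacker-controlled, and it misses calls split across lines.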
Dependency scanning
Many deserialization vulnerabilities are exploitable only because of gadget chains in third-party libraries. Keep your dependency tree updated and use tools like OWASP Dependency-Check or Snyk to identify libraries with known deserialization gadgets.
Runtime monitoring
The OWASP Deserialization Cheat Sheet describes hardening ObjectInputStream usage, for example with a look-ahead replacement such as SerialKiller or a Java agent such as NotSoSerial. The JDK also offers the built-in jdk.serialFilter system property to globally restrict deserialization:
-Djdk.serialFilter=com.example.**;!*
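This JVM argument allows only classes in the com.example package and its subpackages and rejects everything else, providing a global safety net even for libraries that use native serialization internally. Because the pattern also rejects JDK classes such as java.util.ArrayList, a real filter usually needs extra allow entries for the JDK types your objects contain. Quote the value when passing it on a shell command line, since it contains ;, ! and *.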
The Safest Pattern: Avoid Deserialization of Untrusted Data
The most reliable fix is architectural: never deserialize native binary object formats from sources you do not control. Use data-only formats (JSON, Protobuf, Avro, MessagePack) with explicit schemas. Validate all incoming data against a strict schema before using it. Treat the deserialized output as untrusted until validated, the same way you would treat raw user input.
If you must accept serialized objects (for example, in a session cookie or a message queue), sign the serialized data with an HMAC and verify the signature before deserialization. An attacker who cannot forge the HMAC cannot tamper with the payload, so the protection is only as strong as the secrecy of the signing key.
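A minimal sketch of the sign-then-verify pattern, using only Python's standard library. The SECRET_KEY value, the "tag.payload" wire layout, and the sign and verify_and_load helpers are assumptions made for illustration; a real deployment would load the key from a secret store and rotate it.

```python
import hashlib
import hmac
import json

# Hypothetical key for illustration; load a real one from a secret store.
SECRET_KEY = b"replace-with-a-random-32-byte-key"


def sign(payload: bytes, key: bytes = SECRET_KEY) -> bytes:
    """Prefix the payload with a hex HMAC-SHA256 tag: b'<tag>.<payload>'."""
    tag = hmac.new(key, payload, hashlib.sha256).hexdigest().encode("ascii")
    return tag + b"." + payload


def verify_and_load(blob: bytes, key: bytes = SECRET_KEY):
    """Check the tag first; only deserialize if it matches."""
    tag, sep, payload = blob.partition(b".")
    expected = hmac.new(key, payload, hashlib.sha256).hexdigest().encode("ascii")
    # compare_digest runs in constant time to avoid leaking the tag via timing
    if not sep or not hmac.compare_digest(tag, expected):
        raise ValueError("signature mismatch; refusing to deserialize")
    return json.loads(payload)


blob = sign(json.dumps({"user_id": "42", "role": "viewer"}).encode("utf-8"))
session = verify_and_load(blob)
```

The signature check runs before any parsing, so a tampered blob never reaches the deserializer. The sketch still pairs the HMAC with JSON rather than a native object format, so a leaked key does not immediately turn into code execution.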