trusting data you did not create
Lack of Input Validation
The Input Validation Rules of Thumb in one defect: reading expects trustworthy data, writing expects well-formed data, executing expects code for the right interpreter. Skip the check at the boundary and the whole program inherits the lie.
Read values are assumed correct, written values assumed safe, executed values assumed benign — until proven otherwise.
01 in the wild
In the wild
Executing Unvalidated Input
An executed value is code, and unvalidated input becomes code for the wrong interpreter.
example.py
# WRONG: user input concatenated straight into SQL
cur.execute("SELECT * FROM users WHERE name = '" + name + "'")
# name = "x'; DROP TABLE users; --" -> catastrophe
# RIGHT: parameterize -- the driver keeps data as data
cur.execute("SELECT * FROM users WHERE name = %s", (name,))
String-building SQL lets input cross from data into code. Parameters keep the interpreter from ever seeing input as syntax.
// observed
concatenated: input can rewrite the query
parameterized: input is always a bound value
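The contrast above can be reproduced end to end with Python's built-in sqlite3 module. This is a minimal sketch under stated assumptions: the in-memory `users` table, the helper names `find_user_unsafe` and `find_user_safe`, and the sample payload are all hypothetical, and sqlite3 uses `?` placeholders where the driver in the example above uses `%s`.

```python
import sqlite3


def find_user_unsafe(conn, name):
    # WRONG: the input is spliced into the SQL text and parsed as syntax.
    query = "SELECT name FROM users WHERE name = '" + name + "'"
    return conn.execute(query).fetchall()


def find_user_safe(conn, name):
    # RIGHT: the input travels as a bound parameter and is never parsed as SQL.
    return conn.execute(
        "SELECT name FROM users WHERE name = ?", (name,)
    ).fetchall()


# Hypothetical in-memory table, used only for this illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT)")
conn.executemany("INSERT INTO users VALUES (?)", [("alice",), ("bob",)])

payload = "x' OR '1'='1"

# The quote in the payload closes the string literal and rewrites the WHERE clause.
unsafe_rows = find_user_unsafe(conn, payload)

# The same payload is compared as a literal name, which no row has.
safe_rows = find_user_safe(conn, payload)
```

With the concatenated query, the payload matches every row in the table. With the bound parameter, it matches none, because it is only ever compared as a value.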
example.sh
# WRONG: eval on anything a user can influence
mode=$(cat /tmp/mode)
eval "run_$mode" # mode file is attacker-controlled
# RIGHT: validate against an allow-list, never eval input
mode=$(cat /tmp/mode)
case "$mode" in
  full|incremental) run_backup "$mode" ;;
  *) echo "unknown mode" >&2; exit 1 ;;
esac
eval turns text into commands. Matching outside input against an allow-list of known-good values means it is only ever compared, never executed.
// observed
eval: arbitrary command execution
case: only 'full' or 'incremental' accepted
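The same allow-list idea carries over to other languages. Below is a rough Python sketch, assuming two hypothetical handlers (`run_full`, `run_incremental`) and a hypothetical `run_mode` entry point. The input is only ever used as a dictionary key, so it selects a pre-approved function and is never evaluated as code.

```python
def run_full():
    # Hypothetical handler standing in for a real full backup.
    return "full backup"


def run_incremental():
    # Hypothetical handler standing in for a real incremental backup.
    return "incremental backup"


# Allow-list: the only modes the program will ever act on.
MODES = {"full": run_full, "incremental": run_incremental}


def run_mode(raw):
    """Look the requested mode up in the allow-list; reject anything else."""
    mode = raw.strip()
    handler = MODES.get(mode)
    if handler is None:
        raise ValueError(f"unknown mode: {mode!r}")
    return handler()
```

A value such as `"full; rm -rf /"` fails the lookup and raises `ValueError`. Passed to `eval`, the same string would have been interpreted as code.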
Reading an Untrusted Payload
Read values are assumed correct and up-to-date; assume nothing until you have validated the shape.
example.ts
// WRONG: cast the JSON and pray
const user = JSON.parse(body) as User; // a cast checks nothing
sendEmail(user.email.toLowerCase()); // crashes if email is missing
// RIGHT: parse, don't validate -- a schema returns a typed value
const user = UserSchema.parse(JSON.parse(body)); // throws on bad shape
sendEmail(user.email.toLowerCase());
'as User' is a compile-time fiction; at runtime the data can be anything. A schema (zod, io-ts) verifies the shape and hands back a value you can trust.
// observed
cast: TypeError: Cannot read properties of undefined (reading 'toLowerCase')
schema: throws 'email: Required' at the boundary
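For readers without a schema library at hand, the parse-at-the-boundary pattern can be approximated with the Python standard library alone. This is a sketch, not a substitute for a real schema validator: the `User` dataclass, the `parse_user` helper, and the simple `"@"` check are illustrative assumptions.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class User:
    email: str


def parse_user(body: str) -> User:
    """Turn raw JSON text into a User, or raise ValueError at the boundary."""
    data = json.loads(body)  # json.JSONDecodeError is a subclass of ValueError
    if not isinstance(data, dict):
        raise ValueError("payload must be a JSON object")
    email = data.get("email")
    # Illustrative check only; real email validation is stricter.
    if not isinstance(email, str) or "@" not in email:
        raise ValueError("email: required, must be a string containing '@'")
    return User(email=email)


user = parse_user('{"email": "Ada@Example.com"}')
normalized = user.email.lower()  # safe: email is known to be a str here
```

Code after `parse_user` works with a `User` whose `email` is known to be a string, so the missing-field failure surfaces once, at the edge, instead of wherever the value is first used.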
example.py
# WRONG: trust the inbound dict's keys and types
def handle(payload):
    return charge(payload["amount"] * 100)  # KeyError / TypeError roulette
# RIGHT: validate into a typed model at the edge (pydantic)
from pydantic import BaseModel, condecimal
class Order(BaseModel):
    amount: condecimal(gt=0)  # Decimal constrained to be strictly positive
def handle(payload):
    order = Order(**payload)  # raises ValidationError on bad input
    return charge(order.amount * 100)
Validating into a model turns a scattering of runtime errors into one clear failure at the door, with the type guaranteed afterward.
// observed
raw: KeyError('amount') deep in charge()
validated: ValidationError('amount: must be > 0')
02 cross-pollination
Where this compounds
Runtime Errors
- Integration Crash on Unexpected Input × Impure Functions
- NaN / Infinity Poisons a Consumer × Time, Money & Entropy
Data Corruption
- Stored / Second-Order Injection × File & Network Access
- Torn Write / Lost-Update Corruption × Race Conditions
- Cross-Session Contamination × Cross-Boundary State Exposure
03 weakness catalog
Mapped weaknesses (CWE)
On its own, this defect is catalogued by MITRE as one or more of these weaknesses. The exploitable vulnerability usually appears only when it chains or combines with another.