How To Properly Sanitize User Input In PHP?

Published October 3, 2024

Problem: Unsanitized User Input Risks

Unsanitized user input in PHP applications can cause security issues. These include SQL injection, cross-site scripting (XSS), and other attacks that use unfiltered data.

PHP Input Sanitization Techniques

Using Built-in PHP Functions

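PHP provides built-in functions for validating and sanitizing user input. filter_var() filters a value you already hold in a variable, and filter_input() reads a value directly from a request source such as INPUT_GET or INPUT_POST. Sanitize filters return a cleaned copy of the value, while validate filters (for example FILTER_VALIDATE_EMAIL) return the value if it is acceptable and false if it is not. You can use filter_var() to sanitize an email address or encode a string:

// Strips characters that cannot appear in an email address
$email = filter_var($user_email, FILTER_SANITIZE_EMAIL);
// Used here because FILTER_SANITIZE_STRING is deprecated as of PHP 8.1
$string = filter_var($user_input, FILTER_SANITIZE_FULL_SPECIAL_CHARS);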
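
To make the difference between sanitizing and validating concrete, here is a minimal sketch. The input value is hypothetical, and the variable names other than $user_email are introduced only for illustration.

```php
<?php
// Hypothetical raw value, as it might arrive from a form field.
$user_email = 'alice (work)@example.com';

// Sanitizing returns a cleaned copy: the space and parentheses are removed.
$sanitized = filter_var($user_email, FILTER_SANITIZE_EMAIL);

// Validating changes nothing: it returns the value, or false if it is malformed.
$validRaw = filter_var($user_email, FILTER_VALIDATE_EMAIL);
$validSanitized = filter_var($sanitized, FILTER_VALIDATE_EMAIL);

var_dump($sanitized);      // "alicework@example.com"
var_dump($validRaw);       // bool(false)
var_dump($validSanitized); // "alicework@example.com"
```

Note that the sanitized address is not the one the user typed. For values like email addresses it is often safer to reject malformed input and ask the user to correct it than to repair it silently.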

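The htmlspecialchars() function is useful when outputting data in an HTML context. It converts special characters to their HTML entities, so the browser displays them as text instead of interpreting them as markup. This prevents XSS in element content and quoted attribute values: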

$safe_output = htmlspecialchars($user_input, ENT_QUOTES, 'UTF-8');
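
As an illustration, the sketch below escapes a hypothetical malicious value and a value containing a single quote. The input strings and the $name and $safe_attr variables are made up for the example.

```php
<?php
// Hypothetical attacker-controlled value.
$user_input = '<script>alert("xss")</script>';
$safe_output = htmlspecialchars($user_input, ENT_QUOTES, 'UTF-8');

echo '<p>' . $safe_output . '</p>', PHP_EOL;
// <p>&lt;script&gt;alert(&quot;xss&quot;)&lt;/script&gt;</p>

// ENT_QUOTES also encodes single quotes, which matters inside attributes.
$name = "O'Reilly";
$safe_attr = htmlspecialchars($name, ENT_QUOTES, 'UTF-8');

echo "<input value='" . $safe_attr . "'>", PHP_EOL;
// <input value='O&#039;Reilly'>
```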

Tip: Sanitize and Validate

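Sanitize input before you validate it, so that validation runs on the exact value you will store or use. Sanitizing changes the data (for example, by trimming whitespace), while validating only accepts or rejects it. If a value still fails validation after sanitizing, reject it rather than trying to repair it further.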
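
A small sketch of that order, sanitize first and validate second. The cleanOrderCode() helper and the "AB-1234" order-code format are hypothetical and exist only for this example.

```php
<?php
// Hypothetical helper: order codes look like "AB-1234".
function cleanOrderCode(string $raw): ?string
{
    // Sanitize: drop all whitespace and normalize letter case.
    $clean = strtoupper(preg_replace('/\s+/', '', $raw));

    // Validate the sanitized value against the expected format.
    $valid = filter_var($clean, FILTER_VALIDATE_REGEXP, [
        'options' => ['regexp' => '/^[A-Z]{2}-\d{4}$/D'],
    ]);

    // Reject (null) instead of attempting further repair.
    return $valid === false ? null : $valid;
}

var_dump(cleanOrderCode(" ab-1234 \n")); // string(7) "AB-1234"
var_dump(cleanOrderCode('ab-12x4'));     // NULL
```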

Implementing Prepared Statements

Prepared statements prevent SQL injection attacks. They send the SQL text and the data to the database separately, so a bound value cannot alter the query structure.

Using PDO (PHP Data Objects):

$stmt = $pdo->prepare("SELECT * FROM users WHERE username = ?");
$stmt->execute([$username]);

Using MySQLi:

$stmt = $mysqli->prepare("SELECT * FROM users WHERE username = ?");
$stmt->bind_param("s", $username);
$stmt->execute();

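In both cases, the database treats the bound value as data rather than as part of the SQL command, which removes the risk of SQL injection through that value. Placeholders can only stand in for values. Table and column names cannot be bound, so check them against a fixed allow-list instead.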
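
The following sketch shows a classic injection string being treated as plain data. It assumes the pdo_sqlite extension is available and uses an in-memory SQLite database only to stay self-contained. The findUsers() helper and the sample rows are hypothetical.

```php
<?php
// Assumption: pdo_sqlite is installed; the in-memory database is for demonstration only.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE users (id INTEGER PRIMARY KEY, username TEXT)');
$pdo->exec("INSERT INTO users (username) VALUES ('alice'), ('bob')");

// Hypothetical helper wrapping the prepared statement from the PDO example.
function findUsers(PDO $pdo, string $username): array
{
    $stmt = $pdo->prepare('SELECT id, username FROM users WHERE username = ?');
    $stmt->execute([$username]);
    return $stmt->fetchAll(PDO::FETCH_ASSOC);
}

$normal = findUsers($pdo, 'alice');
// With string concatenation this value would match every row.
$attack = findUsers($pdo, "alice' OR '1'='1");

echo count($normal), PHP_EOL; // 1
echo count($attack), PHP_EOL; // 0
```

The injection attempt returns no rows because the whole string, quotes included, is compared against the username column as a literal value.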

Advanced Sanitization Strategies

Context-Specific Sanitization

When you sanitize user input, adapt your methods based on where the data will be used. Different contexts need different sanitization approaches:

  • For database queries, use prepared statements or escape functions for your database system.
  • For HTML output, apply htmlspecialchars() to prevent XSS attacks.
  • For URL parameters, use urlencode() to encode special characters in query-string values, or rawurlencode() for path segments.

Handle different data types with specific approaches:

  • For strings, remove or encode harmful characters.
  • For numbers, validate with filter_var() and FILTER_VALIDATE_INT or FILTER_VALIDATE_FLOAT, or cast to integer/float.
  • For emails, use filter_var() with the FILTER_VALIDATE_EMAIL filter.
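
The type-specific checks above might look like this in practice. The sample values and the 1 to 100 range are illustrative assumptions.

```php
<?php
// Validate an integer and enforce a range in one call.
$quantity = filter_var('42', FILTER_VALIDATE_INT, [
    'options' => ['min_range' => 1, 'max_range' => 100],
]);

// Validation rejects trailing garbage, while a cast silently keeps the leading digits.
$rejected = filter_var('42abc', FILTER_VALIDATE_INT);
$cast = (int) '42abc';

$price = filter_var('19.99', FILTER_VALIDATE_FLOAT);
$email = filter_var('not-an-email', FILTER_VALIDATE_EMAIL);

var_dump($quantity); // int(42)
var_dump($rejected); // bool(false)
var_dump($cast);     // int(42)
var_dump($price);    // float(19.99)
var_dump($email);    // bool(false)
```

Compare results with === false rather than a loose check, because a valid value such as 0 is also falsy.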

Example of context-specific sanitization:

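function sanitizeInput($input, $context) {
    switch ($context) {
        case 'html':
            return htmlspecialchars($input, ENT_QUOTES, 'UTF-8');
        case 'url':
            return urlencode($input);
        case 'integer':
            return filter_var($input, FILTER_SANITIZE_NUMBER_INT);
        // Add more contexts as needed
        default:
            // Fail loudly instead of silently returning null for an unknown context
            throw new InvalidArgumentException("Unknown sanitization context: $context");
    }
}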
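
Contexts can also nest. A URL placed inside an HTML attribute needs URL encoding for the parameter value first and HTML escaping for the finished URL second. The sketch below assumes a hypothetical buildSearchLink() helper and a made-up /search endpoint.

```php
<?php
// Hypothetical helper: builds a link whose query string contains user input.
function buildSearchLink(string $term): string
{
    // Inner context: the value is part of a URL query string.
    $href = '/search?q=' . urlencode($term) . '&page=1';

    // Outer context: the URL and the link text are both placed in HTML.
    return '<a href="' . htmlspecialchars($href, ENT_QUOTES, 'UTF-8') . '">'
        . htmlspecialchars($term, ENT_QUOTES, 'UTF-8') . '</a>';
}

echo buildSearchLink('cats & "dogs"'), PHP_EOL;
// <a href="/search?q=cats+%26+%22dogs%22&amp;page=1">cats &amp; &quot;dogs&quot;</a>
```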

Tip: Sanitize JSON Data

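When embedding JSON in an HTML page, for example inside a <script> block, use json_encode() with the JSON_HEX_TAG, JSON_HEX_AMP, JSON_HEX_APOS, and JSON_HEX_QUOT options. They escape <, >, &, ', and " as \uXXXX sequences, so the data cannot close the surrounding tag or attribute: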

$sanitizedJson = json_encode($data, JSON_HEX_TAG | JSON_HEX_AMP | JSON_HEX_APOS | JSON_HEX_QUOT);
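
As a sketch of why these options matter, the example below encodes a hypothetical payload that tries to close a surrounding script block. The $data contents and the profile variable name are assumptions, and JSON_THROW_ON_ERROR (PHP 7.3+) is added so that encoding failures are not silently ignored.

```php
<?php
// Hypothetical user-supplied data destined for an inline script.
$data = [
    'name' => '</script><script>alert(1)</script>',
    'note' => "Tom & Jerry's",
];

$flags = JSON_HEX_TAG | JSON_HEX_AMP | JSON_HEX_APOS | JSON_HEX_QUOT | JSON_THROW_ON_ERROR;
$sanitizedJson = json_encode($data, $flags);

// The encoded string contains no raw <, >, & or ' characters.
echo '<script>const profile = ' . $sanitizedJson . ';</script>', PHP_EOL;

// Decoding restores the original values unchanged.
$roundTrip = json_decode($sanitizedJson, true);
```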

Custom Sanitization Functions

Create custom sanitization functions to fit specific use cases:

function sanitizeUsername($username) {
    // Remove any characters that aren't alphanumeric or underscore
    $username = preg_replace('/[^a-zA-Z0-9_]/', '', $username);
    // Limit the length
    return substr($username, 0, 30);
}

When you create custom functions, balance security and functionality:

  • Be strict enough to remove threats.
  • Avoid being so restrictive that you limit usability.
  • Consider your application's specific needs.

Custom functions should add to, not replace, built-in PHP functions and prepared statements.
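
One way to layer a custom rule on top of the built-in tools is to validate with a custom function and still escape on output. In this sketch, validateUsername() is a hypothetical helper that rejects bad input instead of stripping characters from it.

```php
<?php
// Hypothetical validator: apart from trimming, it never rewrites the value.
function validateUsername(string $username): ?string
{
    $username = trim($username);
    // 3 to 30 characters: ASCII letters, digits, or underscore.
    return preg_match('/^[a-zA-Z0-9_]{3,30}$/D', $username) === 1 ? $username : null;
}

$accepted = validateUsername(' alice_01 ');
$rejected = validateUsername('alice<script>');

var_dump($accepted); // string(8) "alice_01"
var_dump($rejected); // NULL

// The custom check does not replace output escaping.
echo htmlspecialchars($accepted ?? 'Invalid username', ENT_QUOTES, 'UTF-8'), PHP_EOL;
```

Rejecting has one advantage over stripping: a stripping function would quietly turn "alice<script>" into "alicescript", a different name than the one submitted. An accepted value should still be passed to the database through a prepared statement.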