Context HTML Attribute

HTML attribute context (`escHtmlAttr`)

Use when the value lands inside an HTML attribute: `<span title="HERE">`, `<a href="HERE">`, `<input value=HERE>`.

What it does

`escHtmlAttr()` is strict by design. It whitelists `[A-Za-z0-9,.\-_]`; every other character is rewritten as an HTML entity. The output is therefore safe inside double-quoted, single-quoted, and unquoted attribute values.

Three rules drive the output:

1. C0 controls (U+0000–U+001F) except tab/LF/CR → `&#xFFFD;` (the Unicode replacement character).
2. DEL and the C1 controls (U+007F–U+009F) → `&#xFFFD;`. This check is performed on the decoded code point, so it also catches the multibyte UTF-8 form (`\xC2\x80` … `\xC2\x9F`).
3. Everything else outside the whitelist → named entity if one exists (`&quot;`, `&amp;`, `&lt;`, `&gt;`), otherwise a hexadecimal numeric reference (`&#xHH;` for code points ≤ 255, `&#xHHHH;` for the rest of the BMP, full hex for the supplementary planes).
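
The three rules can be sketched in plain PHP as follows. This is an illustrative reimplementation under stated assumptions, not the library's code: the function names `sketchCodePoint`, `sketchEncodeAttrChar` and `sketchEscHtmlAttr` are hypothetical, and the input is assumed to be well-formed UTF-8 already.

```php
<?php
declare(strict_types=1);

/** Hypothetical helper: decodes one well-formed UTF-8 character to its code point. */
function sketchCodePoint(string $char): int
{
    $b = array_values(unpack('C*', $char));
    switch (count($b)) {
        case 1:
            return $b[0];
        case 2:
            return (($b[0] & 0x1F) << 6) | ($b[1] & 0x3F);
        case 3:
            return (($b[0] & 0x0F) << 12) | (($b[1] & 0x3F) << 6) | ($b[2] & 0x3F);
        default:
            return (($b[0] & 0x07) << 18) | (($b[1] & 0x3F) << 12)
                | (($b[2] & 0x3F) << 6) | ($b[3] & 0x3F);
    }
}

/** Hypothetical helper: encodes one code point that is outside the whitelist. */
function sketchEncodeAttrChar(int $cp): string
{
    $named = [0x22 => '&quot;', 0x26 => '&amp;', 0x3C => '&lt;', 0x3E => '&gt;'];

    // Rule 1: C0 controls other than tab (0x09), LF (0x0A) and CR (0x0D).
    $isC0 = $cp <= 0x1F && !in_array($cp, [0x09, 0x0A, 0x0D], true);
    // Rule 2: DEL and C1 controls, tested on the decoded code point.
    $isDelOrC1 = $cp >= 0x7F && $cp <= 0x9F;
    if ($isC0 || $isDelOrC1) {
        return '&#xFFFD;';
    }

    // Rule 3: named entity when one exists, otherwise a hex reference
    // padded to two digits (<= 255) or four digits (rest of the BMP).
    if (isset($named[$cp])) {
        return $named[$cp];
    }
    if ($cp <= 0xFF) {
        return sprintf('&#x%02X;', $cp);
    }
    if ($cp <= 0xFFFF) {
        return sprintf('&#x%04X;', $cp);
    }
    return sprintf('&#x%X;', $cp);
}

/** Hypothetical sketch of the whole escaper for UTF-8 input. */
function sketchEscHtmlAttr(string $utf8): string
{
    $out = preg_replace_callback(
        '/[^A-Za-z0-9,.\-_]/u',
        static function (array $m): string {
            return sketchEncodeAttrChar(sketchCodePoint($m[0]));
        },
        $utf8
    );
    if ($out === null) {
        // With the /u flag, PCRE refuses input that is not valid UTF-8.
        throw new InvalidArgumentException('Input is not well-formed UTF-8');
    }
    return $out;
}

echo sketchEscHtmlAttr('" or 1=1'), PHP_EOL;
// &quot;&#x20;or&#x20;1&#x3D;1
```

The sketch tests the control ranges before the named-entity map, so a control character can never fall through to a numeric reference.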

Signature

public function escHtmlAttr(string $str): string;

Or via the facade:

Esc::esc(string $str, 'attr', ?string $encoding = null): string;

Exceptions

| Throws | When |
| --- | --- |
| `InvalidUtf8Exception` | `$str` is not valid UTF-8 (after any encoding conversion). |
| `EncodingConversionException` | iconv / mbstring fails during UTF-8 conversion. |

Both extend EscaperException.

Examples

Plain ASCII passes through

Esc::esc('plain', 'attr');     // plain
Esc::esc('abc,XYZ.-_0123', 'attr');  // abc,XYZ.-_0123

Space, `=`, parens, semicolon are rewritten

Esc::esc('with space', 'attr');
// with&#x20;space

Esc::esc('" or 1=1', 'attr');
// &quot;&#x20;or&#x20;1&#x3D;1

Esc::esc('faketitle onmouseover=alert(1);', 'attr');
// faketitle&#x20;onmouseover&#x3D;alert&#x28;1&#x29;&#x3B;

Defeating an unquoted-attribute injection

$untrusted = 'faketitle onmouseover=alert(1);';

echo '<span title=' . Esc::esc($untrusted, 'attr') . '>hello</span>';
// <span title=faketitle&#x20;onmouseover&#x3D;alert&#x28;1&#x29;&#x3B;>hello</span>

The browser sees a single attribute value, not an extra onmouseover handler: the encoded space is decoded as part of the value, so it no longer terminates the unquoted value.
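
For contrast, here is a small sketch of the same payload passed through PHP's built-in `htmlspecialchars()`, which rewrites only `&`, `"`, `'`, `<` and `>`. It is included to show why the strict whitelist matters for unquoted attributes. The variable name `$weak` is illustrative only.

```php
<?php
declare(strict_types=1);

$untrusted = 'faketitle onmouseover=alert(1);';

// htmlspecialchars() leaves the space and "=" untouched, so in an unquoted
// attribute the value ends at the space and a second attribute begins.
$weak = '<span title=' . htmlspecialchars($untrusted, ENT_QUOTES, 'UTF-8') . '>hello</span>';

echo $weak, PHP_EOL;
// <span title=faketitle onmouseover=alert(1);>hello</span>
```

Here the payload contains none of the five characters that `htmlspecialchars()` handles, so it passes through unchanged and the browser parses `onmouseover` as a separate attribute.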

Named entities are preferred

The four characters in the named-entity map use the shorter form:

Esc::esc('"', 'attr');  // &quot;
Esc::esc('&', 'attr');  // &amp;
Esc::esc('<', 'attr');  // &lt;
Esc::esc('>', 'attr');  // &gt;

Multibyte characters

Esc::esc('ş', 'attr');  // &#x015F;
Esc::esc('🚀', 'attr'); // &#x1F680;

Control characters

Esc::esc("\x00", 'attr');  // &#xFFFD;
Esc::esc("\x1B", 'attr');  // &#xFFFD;
Esc::esc("\x7F", 'attr');  // &#xFFFD;
Esc::esc("\xC2\x80", 'attr');  // &#xFFFD;   ← U+0080 in proper UTF-8
Esc::esc("\xC2\x9F", 'attr');  // &#xFFFD;   ← U+009F in proper UTF-8

Tab, LF, and CR are explicitly exempted from replacement (they are valid in HTML) and are escaped as ordinary numeric references:

Esc::esc("\t", 'attr');  // &#x09;
Esc::esc("\n", 'attr');  // &#x0A;
Esc::esc("\r", 'attr');  // &#x0D;

U+00A0 (NO-BREAK SPACE) sits one code point above the C1 range and is escaped as a normal character, not replaced:

Esc::esc("\xC2\xA0", 'attr');  // &#xA0;

Empty and digit-only input short-circuits

Esc::esc('', 'attr');       // ''
Esc::esc('12345', 'attr');  // 12345

Those inputs are already attribute-safe; the matcher is skipped entirely.
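
A minimal sketch of such a fast-path check, assuming it is a simple test performed before the regular expression runs. The function name `sketchCanSkipMatcher` is hypothetical and the library's actual check may differ.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical fast path: returns true when the matcher can be skipped
 * because the input is empty or consists only of ASCII digits.
 */
function sketchCanSkipMatcher(string $str): bool
{
    // ctype_digit() returns false for '', so the empty case is tested first.
    return $str === '' || ctype_digit($str);
}

var_dump(sketchCanSkipMatcher(''));       // bool(true)
var_dump(sketchCanSkipMatcher('12345'));  // bool(true)
var_dump(sketchCanSkipMatcher('12 345')); // bool(false)
```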

When not to use it

| Location | Why | Use instead |
| --- | --- | --- |
| `href=""`, `src=""`, `action=""` | `escHtmlAttr` only protects the attribute delimiters; a `javascript:` URL still works once the browser decodes the value. | `escUrl` + scheme whitelist |
| `style="..."` | The attribute is safe, but its content is CSS. | `escCss` for the inner value |
| `onclick="..."` and other `on*` | The attribute is safe, but its content is JavaScript. | `escJs` for the inner value |
| Bare `<script>` body | Wrong context entirely: entities aren't decoded in scripts. | `escJs` |
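
The "scheme whitelist" mentioned for URL attributes could look like the sketch below. It is not part of the library: the function name `sketchHasAllowedScheme` and the default allow-list are assumptions, and the sketch deliberately rejects relative URLs and anything else without an explicit scheme.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: accepts a URL only when it starts with a scheme
 * that appears on an explicit allow-list. Everything else is rejected.
 */
function sketchHasAllowedScheme(string $url, array $allowed = ['http', 'https', 'mailto']): bool
{
    // RFC 3986 scheme syntax: a letter followed by letters, digits, "+", "." or "-".
    if (preg_match('/^([a-z][a-z0-9+.\-]*):/i', $url, $m) !== 1) {
        return false;
    }
    return in_array(strtolower($m[1]), $allowed, true);
}

var_dump(sketchHasAllowedScheme('https://example.com/')); // bool(true)
var_dump(sketchHasAllowedScheme('JavaScript:alert(1)'));  // bool(false)
```

A value that passes this check still has to be escaped for the attribute it is written into. The check only decides whether the URL is acceptable at all.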

Why every value goes through UTF-8 internally

Even when the configured output encoding is, say, windows-1252, the matcher needs to address individual code points to apply the C0/C1 control-replacement and named-entity rules. The pipeline is:

[input in $encoding]
        ↓ iconv/mbstring
[input in UTF-8]
        ↓ preg_replace_callback with the matcher
[escaped in UTF-8]
        ↓ iconv/mbstring
[output in $encoding]

If iconv/mbstring fails at either end, EncodingConversionException is raised. If the converted input is not well-formed UTF-8, InvalidUtf8Exception is raised. See Encodings and Exceptions.
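
The pipeline can be sketched with `iconv()` and a UTF-8 validity check, as below. This is an illustration under assumptions rather than the library's implementation: `sketchEscapeInEncoding` is a hypothetical name, the escaper is passed in as a callable, and standard SPL exceptions stand in for `EncodingConversionException` and `InvalidUtf8Exception`.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical sketch of the convert, validate, escape, convert-back pipeline.
 *
 * @param callable(string): string $escapeUtf8 escaper that expects UTF-8 input
 */
function sketchEscapeInEncoding(string $input, string $encoding, callable $escapeUtf8): string
{
    // 1. Convert the input to UTF-8 (iconv returns false on failure).
    $utf8 = @iconv($encoding, 'UTF-8', $input);
    if ($utf8 === false) {
        throw new RuntimeException("Conversion from {$encoding} to UTF-8 failed");
    }

    // 2. Validate: an empty pattern with the /u flag only matches valid UTF-8.
    if (preg_match('//u', $utf8) !== 1) {
        throw new UnexpectedValueException('Input is not well-formed UTF-8');
    }

    // 3. Escape while every character is addressable as a code point.
    $escaped = $escapeUtf8($utf8);

    // 4. Convert the escaped string back to the requested encoding.
    $output = @iconv('UTF-8', $encoding, $escaped);
    if ($output === false) {
        throw new RuntimeException("Conversion from UTF-8 to {$encoding} failed");
    }
    return $output;
}

// Stand-in escaper for the demonstration: it only encodes spaces.
$standIn = static function (string $s): string {
    return str_replace(' ', '&#x20;', $s);
};

// "café au lait" in ISO-8859-1: the é is the single byte 0xE9.
$result = sketchEscapeInEncoding("caf\xE9 au lait", 'ISO-8859-1', $standIn);
```

Passing the escaper as a callable keeps the sketch independent of any particular escaping rules. In the library the matcher described above fills that role.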
