JavaScript htmlspecialchars_decode
Convert special HTML entities back to characters
function htmlspecialchars_decode (string, quote_style) {
    // Convert special HTML entities back to characters
    //
    // version: 912.1315
    // discuss at: http://phpjs.org/functions/htmlspecialchars_decode
    // +   original by: Mirek Slugen
    // +   improved by: Kevin van Zonneveld (http://kevin.vanzonneveld.net)
    // +   bugfixed by: Mateusz "loonquawl" Zalega
    // +      input by: ReverseSyntax
    // +      input by: Slawomir Kaniecki
    // +      input by: Scott Cariss
    // +      input by: Francois
    // +   bugfixed by: Onno Marsman
    // +    revised by: Kevin van Zonneveld (http://kevin.vanzonneveld.net)
    // +   bugfixed by: Brett Zamir (http://brett-zamir.me)
    // +      input by: Ratheous
    // +      input by: Mailfaker (http://www.weedem.fr/)
    // + reimplemented by: Brett Zamir (http://brett-zamir.me)
    // +   bugfixed by: Brett Zamir (http://brett-zamir.me)
    // *     example 1: htmlspecialchars_decode("<p>this -&gt; &quot;</p>", 'ENT_NOQUOTES');
    // *     returns 1: '<p>this -> &quot;</p>'
    // *     example 2: htmlspecialchars_decode("&amp;quot;");
    // *     returns 2: '&quot;'
    var optTemp = 0, i = 0, noquotes = false;
    if (typeof quote_style === 'undefined') {
        quote_style = 2;
    }
    string = string.toString().replace(/&lt;/g, '<').replace(/&gt;/g, '>');
    var OPTS = {
        'ENT_NOQUOTES': 0,
        'ENT_HTML_QUOTE_SINGLE' : 1,
        'ENT_HTML_QUOTE_DOUBLE' : 2,
        'ENT_COMPAT': 2,
        'ENT_QUOTES': 3,
        'ENT_IGNORE' : 4
    };
    if (quote_style === 0) {
        noquotes = true;
    }
    if (typeof quote_style !== 'number') { // Allow for a single string or an array of string flags
        quote_style = [].concat(quote_style);
        for (i = 0; i < quote_style.length; i++) {
            // Resolve string input to bitwise e.g. 'ENT_IGNORE' becomes 4
            if (OPTS[quote_style[i]] === 0) {
                noquotes = true;
            } else if (OPTS[quote_style[i]]) {
                optTemp = optTemp | OPTS[quote_style[i]];
            }
        }
        quote_style = optTemp;
    }
    if (quote_style & OPTS.ENT_HTML_QUOTE_SINGLE) {
        string = string.replace(/&#0*39;/g, "'"); // PHP doesn't currently escape if more than one 0, but it should
        // string = string.replace(/&apos;|&#x0*27;/g, "'"); // This would also be useful here, but not a part of PHP
    }
    if (!noquotes) {
        string = string.replace(/&quot;/g, '"');
    }
    // Put this in last place to avoid escape being double-decoded
    string = string.replace(/&amp;/g, '&');

    return string;
}
Examples
» Example 1
Running
htmlspecialchars_decode("<p>this -&gt; &quot;</p>", 'ENT_NOQUOTES');
Should return
'<p>this -> &quot;</p>'
» Example 2
Running
htmlspecialchars_decode("&amp;quot;");
Should return
'&quot;'
Dependencies
No dependencies; you can use this function standalone.
Open syntax issues
php.js uses JsLint to help us keep our code consistent and prevent some common bugs.
Eventually we want all code to pass or at least take into consideration most fixes suggested by JsLint, following this JsLint configuration we’ve decided on.
Authors
Thanks to the following developers, you get to have htmlspecialchars_decode goodness in JavaScript.
The htmlspecialchars_decode function in PHP doesn't work recursively, but this function is too recursive.
So a double-encoded "☻" (an entity whose ampersand has itself been escaped) is not converted by this function to the single-encoded entity; it is converted all the way to "☻".
The function in PHP, on the other hand, converts it only to the single-encoded entity.
@Mailfaker: Thanks. I've completely redone the two htmlspecialchars functions in Git, also to handle flags and arguments: http://github.com/kvz/phpjs/commit/881de8748cf986d025ecfad5f448fbbb8ba7710e . Btw, using replace was much faster for me (and easier) than using split and join.
Hi everyone,
this code wasn't working for me. I have done some changes and now it runs.
The problem is that, for decoding, hash_map table must be read in descending order. Or simply, you can do so:
function htmlspecialchars_decode (string) {
    var tmp_str = string.toString();
    tmp_str = tmp_str.split('&quot;').join('"');
    tmp_str = tmp_str.split('&lt;').join('<');
    tmp_str = tmp_str.split('&gt;').join('>');
    tmp_str = tmp_str.split('&amp;').join('&');
    return tmp_str;
}
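One way to avoid depending on replacement order at all is to decode every entity in a single pass, so that text produced by one replacement is never scanned again. The following is only a sketch of that idea, not php.js code: the names `decodeSinglePass` and `ENTITY_MAP` are made up for illustration, and quote_style handling is left out.

```javascript
// Hypothetical helper, not part of php.js: decodes the four basic
// entities in one pass over the input.
var ENTITY_MAP = {
    '&amp;': '&',
    '&lt;': '<',
    '&gt;': '>',
    '&quot;': '"'
};

function decodeSinglePass(string) {
    // Each match is replaced exactly once and the result is not rescanned,
    // so '&amp;quot;' becomes '&quot;' and stops there.
    return String(string).replace(/&(?:amp|lt|gt|quot);/g, function (entity) {
        return ENTITY_MAP[entity];
    });
}

console.log(decodeSinglePass('&lt;b&gt; &amp;quot; &quot;')); // <b> &quot; "
```

Because the lookup table is consulted per match, the order of its keys does not matter.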
@ Liviu Mirea: I added your example as a testcase, but I was unable to reproduce the problem.
What version & browser are you using?
I'm sorry but the messaging system seems to be messed up and I can't post my message. What I'm trying to say is that the above function is incorrect. If you try to decode "& amp; quot;" (remove spaces) it will output a double quotation mark instead of "& quot;" (remove spaces). Hope this message will be properly posted. :/
Erm, ignore my message below, the characters are messed up.
Here:
htmlspecialchars_decode(' &amp;quot; ');
In PHP it returns:
 &quot; 
The Javascript function above returns: "
Basically, it first decodes "&amp;" to "&", which results in "&quot;". Afterward, it decodes "&quot;", but it shouldn't.
1 | htmlspecialchars_decode(' &quot; '); |
In PHP it returns: "
The Javascript function above returns: "
Basically, it first decodes "&" to "&", thus resulting """. Afterward, it decodes """ but it shouldn't.
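The double decoding described in this thread comes down to the order of the replacements. A minimal sketch follows, using two hypothetical helpers (`decodeAmpFirst` and `decodeAmpLast` are illustrative names, not php.js functions) that differ only in when the ampersand entity is handled:

```javascript
// Illustrative only: two hypothetical decoders that differ in replacement order.
function decodeAmpFirst(s) {
    // Decoding &amp; first can create a fresh &quot; that the next step then decodes.
    return s.replace(/&amp;/g, '&').replace(/&quot;/g, '"');
}

function decodeAmpLast(s) {
    // Decoding &amp; last means nothing created by it is processed again.
    return s.replace(/&quot;/g, '"').replace(/&amp;/g, '&');
}

var input = '&amp;quot;';
console.log(decodeAmpFirst(input)); // "
console.log(decodeAmpLast(input));  // &quot;
```

Only the second ordering matches the PHP result reported here.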
There is a serious parse error in this function
string = string.replace(/&gt;/g '>');
should be (added a comma):
string = string.replace(/&gt;/g, '>');
There is an error in htmlspecialchars_decode().
There are single quotes around the regex in every replace() call except the one for >, which is the only one that works. This is in php.min.js.
[CODE="php"]
<?php
echo html_entity_decode("8")."\n";
?>
[/CODE]
returns 8.
This behavior is not documented in the PHP manual though, do you know what table is used here?
@ Bob Palin: Thank you for noticing. It is possible to declare global constants in javascript, but that would increase the number of dependencies throughout this project.
We have deliberately chosen to implement this a bit differently from the original PHP documentation to allow for more functions to be included separately.
The function description says that 'quote_style' is an int and lists constants; in fact the argument is a string, as shown in the code and example.
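For readers wondering how string flags and integers can coexist, here is a rough sketch of resolving flag names to a bitmask. The flag values are copied from the OPTS table in the listing at the top of this page; `resolveQuoteStyle` is a hypothetical name, and unlike the full function this sketch does not track ENT_NOQUOTES separately.

```javascript
// Flag values as they appear in the OPTS table of the listing on this page.
var OPTS = {
    ENT_NOQUOTES: 0,
    ENT_HTML_QUOTE_SINGLE: 1,
    ENT_HTML_QUOTE_DOUBLE: 2,
    ENT_COMPAT: 2,
    ENT_QUOTES: 3,
    ENT_IGNORE: 4
};

// Hypothetical helper: accepts a number, a flag name, or an array of flag names.
function resolveQuoteStyle(quoteStyle) {
    if (typeof quoteStyle === 'undefined') {
        return OPTS.ENT_COMPAT; // same default (2) as the listing
    }
    if (typeof quoteStyle === 'number') {
        return quoteStyle;
    }
    var flags = [].concat(quoteStyle), mask = 0, i;
    for (i = 0; i < flags.length; i++) {
        if (OPTS.hasOwnProperty(flags[i])) {
            mask |= OPTS[flags[i]]; // combine flags bitwise
        }
    }
    return mask;
}

console.log(resolveQuoteStyle('ENT_QUOTES'));                            // 3
console.log(resolveQuoteStyle(['ENT_HTML_QUOTE_SINGLE', 'ENT_IGNORE'])); // 5
console.log(resolveQuoteStyle());                                        // 2
```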
No problem :)
There's another bug in this function. First argument of called function string.replace() is a string object '/&/g'. It won't work, unless it's a regular expression object (should be /&/g - without the apostrophes).
Here's the correct code:
string = string.toString();

// Always decode
string = string.replace(/&amp;/g, '&');
string = string.replace(/&lt;/g, '<');
string = string.replace(/&gt;/g, '>');

// Decode depending on quote_style
if (quote_style == 'ENT_QUOTES') {
    string = string.replace(/&quot;/g, '"');
    string = string.replace(/&#039;/g, '\'');
} else if (quote_style != 'ENT_NOQUOTES') {
    // All other cases (ENT_COMPAT, default, but not ENT_NOQUOTES)
    string = string.replace(/&quot;/g, '"');
}

return string;
This is explained here:
http://developer.mozilla.org/en/docs/Core_JavaScript_1.5_Reference:Global_Objects:String:replace
http://developer.mozilla.org/en/docs/Core_JavaScript_1.5_Reference:Objects:RegExp
Btw. Most people involved in php2js project have their full names in credits. So, my name's Mateusz Zalega. Just saying :)
Shouldn't it be
string = string.replace(/&amp;/g, '&');
string = string.replace(/&lt;/g, '<');
string = string.replace(/&gt;/g, '>');
rather than
[CODE = "Javascript"]
string.replace('/&amp;/g', '&');
string.replace('/&lt;/g', '<');
string.replace(/&gt;/g, '>')
[/CODE]
?
Function (string object).replace() doesn't modify the string. It returns a new (replaced) string object.
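Both points raised in this comment can be checked in a few lines. This is just a small demonstration with made-up variable names: a quoted pattern is treated as literal text rather than as a regular expression, and the result of replace() is lost unless it is assigned.

```javascript
var encoded = '&lt;p&gt;';

// 1. A quoted pattern is a plain string, so the text "/&lt;/g" is searched
//    for literally and nothing matches.
var quoted = encoded.replace('/&lt;/g', '<');
console.log(quoted);  // &lt;p&gt;

// 2. replace() returns a new string; the original is left untouched.
encoded.replace(/&lt;/g, '<');
console.log(encoded); // &lt;p&gt;

// 3. A regex literal plus assignment of the return value works as intended.
var decoded = encoded.replace(/&lt;/g, '<').replace(/&gt;/g, '>');
console.log(decoded); // <p>
```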


Brett Zamir
Feb 13th