JavaScript htmlspecialchars
Convert special characters to HTML entities
function htmlspecialchars (string, quote_style, charset, double_encode) {
    // Convert special characters to HTML entities
    //
    // version: 1008.1718
    // discuss at: http://phpjs.org/functions/htmlspecialchars
    // +   original by: Mirek Slugen
    // +   improved by: Kevin van Zonneveld (http://kevin.vanzonneveld.net)
    // +   bugfixed by: Nathan
    // +   bugfixed by: Arno
    // +    revised by: Kevin van Zonneveld (http://kevin.vanzonneveld.net)
    // +    bugfixed by: Brett Zamir (http://brett-zamir.me)
    // +      input by: Ratheous
    // +      input by: Mailfaker (http://www.weedem.fr/)
    // +      reimplemented by: Brett Zamir (http://brett-zamir.me)
    // +      input by: felix
    // +    bugfixed by: Brett Zamir (http://brett-zamir.me)
    // %        note 1: charset argument not supported
    // *     example 1: htmlspecialchars("<a href='test'>Test</a>", 'ENT_QUOTES');
    // *     returns 1: '&lt;a href=&#039;test&#039;&gt;Test&lt;/a&gt;'
    // *     example 2: htmlspecialchars("ab\"c'd", ['ENT_NOQUOTES', 'ENT_QUOTES']);
    // *     returns 2: 'ab"c&#039;d'
    // *     example 3: htmlspecialchars("my \"&entity;\" is still here", null, null, false);
    // *     returns 3: 'my &quot;&entity;&quot; is still here'
    var optTemp = 0, i = 0, noquotes = false;
    if (typeof quote_style === 'undefined' || quote_style === null) {
        quote_style = 2; // default to ENT_COMPAT
    }
    string = string.toString();
    if (double_encode !== false) { // Put this first to avoid double-encoding
        string = string.replace(/&/g, '&amp;');
    }
    string = string.replace(/</g, '&lt;').replace(/>/g, '&gt;');

    var OPTS = {
        'ENT_NOQUOTES': 0,
        'ENT_HTML_QUOTE_SINGLE' : 1,
        'ENT_HTML_QUOTE_DOUBLE' : 2,
        'ENT_COMPAT': 2,
        'ENT_QUOTES': 3,
        'ENT_IGNORE' : 4
    };
    if (quote_style === 0) {
        noquotes = true;
    }
    if (typeof quote_style !== 'number') { // Allow for a single string or an array of string flags
        quote_style = [].concat(quote_style);
        for (i = 0; i < quote_style.length; i++) {
            // Resolve string input to bitwise e.g. 'ENT_IGNORE' becomes 4
            if (OPTS[quote_style[i]] === 0) {
                noquotes = true;
            }
            else if (OPTS[quote_style[i]]) {
                optTemp = optTemp | OPTS[quote_style[i]];
            }
        }
        quote_style = optTemp;
    }
    if (quote_style & OPTS.ENT_HTML_QUOTE_SINGLE) {
        string = string.replace(/'/g, '&#039;');
    }
    if (!noquotes) {
        string = string.replace(/"/g, '&quot;');
    }

    return string;
}
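Note that when double_encode is false, the function above skips ampersand escaping entirely, whereas PHP still escapes ampersands that do not start an existing entity. A minimal sketch of that PHP-like behaviour follows. The helper name escapeBareAmpersands is hypothetical and not part of php.js, and the pattern treats anything shaped like an entity (for example "&foo;") as an entity without checking that the name exists.

```javascript
// Hypothetical helper for illustration; not part of php.js.
// Escapes '&' only when it does not already start a named or numeric entity.
function escapeBareAmpersands(str) {
    return String(str).replace(
        // Negative lookahead: skip '&' followed by name;  #digits;  or  #xhex;
        /&(?![a-zA-Z][a-zA-Z0-9]*;|#[0-9]+;|#[xX][0-9a-fA-F]+;)/g,
        '&amp;'
    );
}

console.log(escapeBareAmpersands('&pound;5 & up')); // &pound;5 &amp; up
```

An approach like this could replace the plain /&/g substitution when double_encode is false, at the cost of a slightly more complex pattern.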
Examples
» Example 1
Running
htmlspecialchars("<a href='test'>Test</a>", 'ENT_QUOTES');
Should return
'&lt;a href=&#039;test&#039;&gt;Test&lt;/a&gt;'
» Example 2
Running
htmlspecialchars("ab\"c'd", ['ENT_NOQUOTES', 'ENT_QUOTES']);
Should return
'ab"c&#039;d'
Dependencies
No dependencies, you can use this function standalone.
Open syntax issues
php.js uses JsLint to help us keep our code consistent and prevent some common bugs.
Eventually we want all code to pass or at least take into consideration most fixes suggested by JsLint, following this JsLint configuration we’ve decided on.
Authors
Thanks to the following developers, you get to have htmlspecialchars goodness in JavaScript.
@hacksmw: When I try
alert(htmlspecialchars_decode('& amp;#9787;'))
...I do get & #9787; in our php.js JavaScript.
Make sure you are using the latest code (see http://github.com/kvz/phpjs/raw/master/functions/strings/htmlspecialchars_decode.js ).
The htmlspecialchars_decode function in PHP doesn't work recursively,
but this function is too recursive.
So "& amp; #9787;" will not be converted by this function to "& #9787;";
instead, it will be converted to "☻".
On the other hand,
the function in PHP will convert it to "&# 9787;".
(
i can't delete my old comment.
so, i wrote this comment once again :(
)
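The single-pass behaviour described in this comment can be sketched as follows. This is only an illustration under stated assumptions: decodeOnce is a hypothetical name, not the php.js implementation, and it handles only the entities that htmlspecialchars produces.

```javascript
// Hypothetical helper for illustration; not the php.js implementation.
// Decodes the entities produced by htmlspecialchars exactly once.
function decodeOnce(str) {
    return String(str)
        .replace(/&lt;/g, '<')
        .replace(/&gt;/g, '>')
        .replace(/&quot;/g, '"')
        .replace(/&#0*39;/g, "'")
        // '&amp;' goes last so the '&' it produces is never re-read
        // as the start of another entity.
        .replace(/&amp;/g, '&');
}

console.log(decodeOnce('&amp;#9787;'));       // &#9787;
console.log(decodeOnce('&amp;lt;b&amp;gt;')); // &lt;b&gt;
```

Because the ampersand is handled last, a doubly encoded string loses exactly one level of encoding per call, which matches what PHP does.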
@Felix: Thanks for the feedback. Yes, I pushed earlier to the git repo with the fix. Was my oversight as I was testing in Firefox which doesn't have a problem with trailing commas. htmlspecialchars_decode() also had the issue which I fixed as well. Thanks again!
problem solved.. wrong syntax in row 38/39.. after "'ENT_IGNORE' : 4" there's a comma but it shouldn't be there ^^
Hi,
seems that the script has problems with ie6 + 7 .. here the browser says "object expected" in line 41/42... ???
also.. does this function work with utf-8 ?
@ T.Wild: Hey man. Thanks a lot for testing this. I've patched it in SVN, and things will be online shortly
A Frank Forte posted over on strtr (http://phpjs.org/functions/strtr:556#comment_75192) that htmlspecialchars is encoding ampersands after encoding other characters.
so < test > becomes &amp;lt; test &amp;gt;
I've confirmed this myself, and his fix of moving the line
entities['38'] = '&amp;';
to the top of the entities list (before the line
if (useTable === 'HTML_ENTITIES')
) seems to work without affecting the other dependent functions:
htmlentities
html_entity_decode
htmlspecialchars_decode
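The ordering problem reported here can be reproduced with a short sketch. Both function names are hypothetical and exist only for this illustration; they are not part of php.js.

```javascript
// Hypothetical helpers for illustration; neither is part of php.js.

// Wrong order: '&' is escaped last, so the '&' in '&lt;' and '&gt;'
// is escaped a second time.
function escapeAmpersandLast(str) {
    return str
        .replace(/</g, '&lt;')
        .replace(/>/g, '&gt;')
        .replace(/&/g, '&amp;');
}

// Right order: '&' is escaped first, before any entity has been introduced.
function escapeAmpersandFirst(str) {
    return str
        .replace(/&/g, '&amp;')
        .replace(/</g, '&lt;')
        .replace(/>/g, '&gt;');
}

console.log(escapeAmpersandLast('< test >'));  // &amp;lt; test &amp;gt;
console.log(escapeAmpersandFirst('< test >')); // &lt; test &gt;
```

Moving the ampersand entry to the top of the translation table has the same effect as the second function: no replacement ever sees an ampersand that an earlier replacement created.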
If you try htmlspecialchars in PHP with this example, you're going to have a different conversion with javascript:
use the string : FS'IG'IKU"UJHFE
@ Ashley Broadley: Thanks for noticing!
I guess the &amp; character must be the last character when decoding, but the first when encoding!
OK, I seem to have fixed this problem.
It turned out that the & symbol was at the bottom of the ascii decimal array in 'get_html_translation_table'. I simply moved it to the top and now everything is fine.
Can you test and confirm by emailing me?
Thanks
Ashley
I really find the idea of php.js fantastic! I for one am very impressed with everyone's work!
I have noticed a problem with htmlentities (not sure if it applies to htmlspecialchars):
testing all the available symbols on my keyboard (£, <, >, ', " and such) I alert()'ed the supposedly encoded string and found that all ampersands were encoded, so "&pound;" would be "&amp;pound;" which then on an HTML page would echo "&pound;" and not "£" as it should.
I'm not a pro so I'm not sure what's causing the bug.
Just thought I would let you know!
@ atv: I'm not able to reproduce that behavior here. Also, if I run that test, my single quotes are being replaced by &#039; entities.
Are you sure you're running our latest version?
Today, 2008-11-11, this function encodes the string twice, so the output of such code
[CODE="Javascript"]
htmlspecialchars("<a href='test'>Test</a>", 'ENT_QUOTES')
[/CODE]
will be like this:
[CODE="text"]
&lt;a href='test'&gt;Test&lt;/a&gt;
[/CODE]
Fix this!
@ Philip Peterson: It's been decided some time ago that we do not want global dependencies (like constants). The method to implement these is to have the functions accept both the integer representation of the constants (leaving it compatible) and the constant as string (for usability).
I've done some work on merging get_html_translation_table, htmlentities & htmlspecialchars and their counterparts; check it out if you like.
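As a rough illustration of accepting both integers and constant names, the resolution step might look like the sketch below. QUOTE_FLAGS and resolveQuoteStyle are hypothetical names chosen for this example, and the sketch deliberately leaves out the separate noquotes handling that the real htmlspecialchars performs for ENT_NOQUOTES.

```javascript
// Hypothetical names for illustration; the values mirror PHP's constants.
var QUOTE_FLAGS = {
    ENT_NOQUOTES: 0,
    ENT_COMPAT: 2,
    ENT_QUOTES: 3
};

// Accepts an integer, a flag name, or an array of flag names
// and returns an integer bitmask.
function resolveQuoteStyle(quoteStyle) {
    if (typeof quoteStyle === 'number') {
        return quoteStyle; // already the integer form, so PHP-compatible calls still work
    }
    var names = [].concat(quoteStyle); // wrap a single string in an array
    var mask = 0;
    for (var i = 0; i < names.length; i++) {
        // Unknown names are ignored (an assumption of this sketch).
        mask = mask | (QUOTE_FLAGS[names[i]] || 0);
    }
    return mask;
}

console.log(resolveQuoteStyle(3));                              // 3
console.log(resolveQuoteStyle('ENT_QUOTES'));                   // 3
console.log(resolveQuoteStyle(['ENT_NOQUOTES', 'ENT_COMPAT'])); // 2
```

Keeping the lookup table local to the function is what avoids the global constants mentioned above.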
Here's a proposed implementation of get_html_translation_table. I do have a small problem though, which probably has a simple solution, and I used the actual integer values for constants instead of ENT_QUOTES, etc. ... would it not be more practical to do so, really, maybe have an optional "CONSTANTS" section in php.js?
Oh well, here's my code:
[CODE="Javascript"]
var HTML_SPECIALCHARS = 0;
var HTML_ENTITIES = 1;
var ENT_COMPAT = 2;
var ENT_QUOTES = 3;
function get_html_translation_table(table, quote_style)
{
    // Use an object, not an array, to map characters to entities
    var retarr = {};
    if (table == 0)
    {
        if (quote_style == 2 || quote_style == 3)
        {
            retarr = {'"': '&quot;', '\'': '&#39;', '<': '&lt;', '>': '&gt;', '&': '&amp;'};
        }
        if (quote_style == 2)
        {
            // ENT_COMPAT leaves single quotes alone, so remove the ' entry
            delete retarr['\''];
        }
    }
    else if (table == 1)
    {
        // Do the same thing as table == 0, but with the huge list of characters found by calling get_html_translation_table(1)
    }
    return retarr;
}
[/CODE]
I just see that your example here is wrong too. Here is the corrected version:
This is how you could call htmlspecialchars()
[CODE="Javascript"]
htmlspecialchars("<a href='test'>Test</a>", 'ENT_QUOTES');
[/CODE]
And that would return
[CODE="text"]
&lt;a href=&#039;test&#039;&gt;Test&lt;/a&gt;
[/CODE]

