JavaScript strip_tags
Strips HTML and PHP tags from a string
    function strip_tags (str, allowed_tags) {
        // Strips HTML and PHP tags from a string
        //
        // version: 909.322
        // discuss at: http://phpjs.org/functions/strip_tags
        // +   original by: Kevin van Zonneveld (http://kevin.vanzonneveld.net)
        // +   improved by: Luke Godfrey
        // +      input by: Pul
        // +   bugfixed by: Kevin van Zonneveld (http://kevin.vanzonneveld.net)
        // +   bugfixed by: Onno Marsman
        // +      input by: Alex
        // +   bugfixed by: Kevin van Zonneveld (http://kevin.vanzonneveld.net)
        // +      input by: Marc Palau
        // +   improved by: Kevin van Zonneveld (http://kevin.vanzonneveld.net)
        // +      input by: Brett Zamir (http://brett-zamir.me)
        // +   bugfixed by: Kevin van Zonneveld (http://kevin.vanzonneveld.net)
        // +   bugfixed by: Eric Nagel
        // +      input by: Bobby Drake
        // +   bugfixed by: Kevin van Zonneveld (http://kevin.vanzonneveld.net)
        // +   bugfixed by: Tomasz Wesolowski
        // *     example 1: strip_tags('<p>Kevin</p> <b>van</b> <i>Zonneveld</i>', '<i><b>');
        // *     returns 1: 'Kevin <b>van</b> <i>Zonneveld</i>'
        // *     example 2: strip_tags('<p>Kevin <img src="someimage.png" onmouseover="someFunction()">van <i>Zonneveld</i></p>', '<p>');
        // *     returns 2: '<p>Kevin van Zonneveld</p>'
        // *     example 3: strip_tags("<a href='http://kevin.vanzonneveld.net'>Kevin van Zonneveld</a>", "<a>");
        // *     returns 3: '<a href='http://kevin.vanzonneveld.net'>Kevin van Zonneveld</a>'
        // *     example 4: strip_tags('1 < 5 5 > 1');
        // *     returns 4: '1 < 5 5 > 1'

        var key = '', allowed = false;
        var matches = [];
        var allowed_array = [];
        var allowed_tag = '';
        var i = 0;
        var k = '';
        var html = '';

        var replacer = function (search, replace, str) {
            return str.split(search).join(replace);
        };

        // Build allowed tags associative array
        if (allowed_tags) {
            allowed_array = allowed_tags.match(/([a-zA-Z0-9]+)/gi);
        }

        str += '';

        // Match tags
        matches = str.match(/(<\/?[\S][^>]*>)/gi);

        // Go through all HTML tags
        for (key in matches) {
            if (isNaN(key)) {
                // IE7 Hack
                continue;
            }

            // Save HTML tag
            html = matches[key].toString();

            // Is tag not in allowed list? Remove from str!
            allowed = false;

            // Go through all allowed tags
            for (k in allowed_array) {
                // Init
                allowed_tag = allowed_array[k];
                i = -1;

                if (i != 0) { i = html.toLowerCase().indexOf('<'+allowed_tag+'>'); }
                if (i != 0) { i = html.toLowerCase().indexOf('<'+allowed_tag+' '); }
                if (i != 0) { i = html.toLowerCase().indexOf('</'+allowed_tag); }

                // Determine
                if (i == 0) {
                    allowed = true;
                    break;
                }
            }

            if (!allowed) {
                str = replacer(html, "", str); // Custom replace. No regexing
            }
        }

        return str;
    }
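The replacer helper in the listing above removes a tag by splitting the string on the literal tag text and joining the pieces again, which avoids having to escape regex metacharacters. A minimal sketch of that idea in isolation follows; the name replaceAllLiteral and the sample string are illustrative assumptions, not part of php.js.

```javascript
// Hypothetical helper name; mirrors the split/join approach of `replacer`.
function replaceAllLiteral(search, replace, str) {
    // split() with a string separator matches it literally, so characters
    // such as "(", ")" or "?" inside a tag need no escaping.
    return str.split(search).join(replace);
}

var sample = '<a onclick="f()">x</a> and <a onclick="f()">y</a>';
var result = replaceAllLiteral('<a onclick="f()">', '', sample);

console.log(result); // x</a> and y</a>
```

Because every occurrence is removed in one call, a disallowed tag that appears several times in the input is already gone after the first loop iteration that encounters it.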
Examples
» Example 1
Running
    strip_tags('<p>Kevin</p> <b>van</b> <i>Zonneveld</i>', '<i><b>');
Should return
    'Kevin <b>van</b> <i>Zonneveld</i>'
» Example 2
Running
    strip_tags('<p>Kevin <img src="someimage.png" onmouseover="someFunction()">van <i>Zonneveld</i></p>', '<p>');
Should return
    '<p>Kevin van Zonneveld</p>'
Dependencies
No dependencies; you can use this function standalone.
Open syntax issues
php.js uses JSLint to help us keep our code consistent and prevent some common bugs.
Eventually we want all code to pass, or at least to take into consideration most fixes suggested by JSLint, following this JSLint configuration we’ve decided on.
Authors
Thanks to the following developers, you get to have strip_tags goodness in JavaScript.
@Kevin: Thanks for the security fix, and sorry I'm too busy to look into it myself at the moment, but now the code snippets are showing less-than signs, etc. in entity form...
@ Tomasz Wesolowski: Very kind of you to provide the fix! I've added it to SVN along with the credits.
PS: oops indeed! fixed the comment issue
Oops, no HTML escaping in posts? Here's a cleaner repost:
---
That's some useful code. :)
Unfortunately it seems to fail on header tags h1..h7. I have probably fixed that by changing the line 42:
    // Build allowed tags associative array
    if (allowed_tags) {
        allowed_array = allowed_tags.match(/([a-zA-Z]+)/gi);
    }
to
    allowed_array = allowed_tags.match(/([a-zA-Z0-9]+)/gi);
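The effect of the two patterns can be checked directly. This is a small sketch using only the two regular expressions quoted in the comment above; the sample allowed_tags value is an assumption chosen for illustration.

```javascript
var allowedTags = '<h1><p>';

// Letters only: the digit is not part of any match, so "h1" is read as "h".
var lettersOnly = allowedTags.match(/([a-zA-Z]+)/gi);

// Letters and digits: "h1" is kept intact.
var lettersAndDigits = allowedTags.match(/([a-zA-Z0-9]+)/gi);

console.log(lettersOnly);      // [ 'h', 'p' ]
console.log(lettersAndDigits); // [ 'h1', 'p' ]
```

With the letters-only pattern the allowed list contains "h" instead of "h1", so an opening `<h1>` tag matches none of the allowed names and is stripped.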
That's some useful code. :)
Unfortunately it seems to fail on header tags ... I have probably fixed that by changing the line 42:
    // Build allowed tags associative array
    if (allowed_tags) {
        allowed_array = allowed_tags.match(/([a-zA-Z]+)/gi);
    }
to
    allowed_array = allowed_tags.match(/([a-zA-Z0-9]+)/gi);
@ Bobby Drake: Thanks for pointing that out. I fixed the bug and added your testcase to prevent future bugs. Thanks!
What does !! do here? Validate? Convert int to bool?
array_unique is using this function internally, but array_unique is not working for me (it returns undefined), and I'm trying to figure out why.
Thanks for the function. I added:
    var k = '', i = 0;
in your variable declarations, as I was using k and i outside the function, which put things into a nasty loop. Hope this helps someone.
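The loop problem described above arises when a function assigns to loop variables it never declares, so the assignments land on the caller's variables of the same name. A minimal sketch of the effect follows; helperWithoutDeclaration and helperWithDeclaration are hypothetical names used only for this illustration.

```javascript
var i = 10; // variable owned by the calling code

function helperWithoutDeclaration() {
    // No `var i` here, so this loop writes to the outer `i`.
    for (i = 0; i < 3; i++) {}
}

function helperWithDeclaration() {
    var i = 0; // local variable; the outer `i` is untouched
    for (i = 0; i < 3; i++) {}
}

helperWithDeclaration();
var afterDeclared = i;   // still 10

helperWithoutDeclaration();
var afterUndeclared = i; // now 3

console.log(afterDeclared, afterUndeclared); // 10 3
```

If the caller is itself looping on `i`, each call to the undeclared variant resets its counter, which is how the "nasty loop" can arise.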
You have a great collection of PHP-equivalent JavaScript functions. This is really helpful to developers. Thanks for sharing.
@ Alex: I wasn't aware of this implementation. And, you're right: it is our objective to mimic PHP as much as reasonably possible. Thanks for sharing, I've updated the function and credited you accordingly.
It looks like there's a small difference in your JS implementation of strip_tags from PHP's implementation:
PHP declares multiple allowable tags like this: strip_tags('<p><b>text</b></p>', '<p><b>')
The JS version is like this:
strip_tags('<p><b>text</b></p>', '<p>,<b>')
Note the comma separation in the JS version between the allowable tags. It's not a big deal, but I thought I'd point it out, as it tripped me up for a while (and I thought you'd want to know since you're attempting to make these functions work syntactically the same as their PHP equivalents). Thanks!
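With the tag-name pattern used in the listing at the top of this page, both notations produce the same list of names, since every character that is not a letter or digit is ignored. A small sketch, assuming that pattern (/([a-zA-Z0-9]+)/gi) is the one in use:

```javascript
// Pattern copied from the allowed_tags handling in strip_tags.
var tagNames = /([a-zA-Z0-9]+)/gi;

var phpStyle = '<p><b>'.match(tagNames);    // PHP notation
var commaStyle = '<p>,<b>'.match(tagNames); // comma-separated notation

console.log(phpStyle);   // [ 'p', 'b' ]
console.log(commaStyle); // [ 'p', 'b' ]
```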
@ Pul: Thank you for pointing that out. I've fixed the code and added your usage example so it will be tested in the future as well.
try
    strip_tags("<a href='index.html'>test</a>", "<a>");
please fix.. :P
The strip_tags() function appears to be broken in IE7. Upon detecting an opening tag, it completely removes ALL output. The same behavior appears on the test page on this site. It appears that in IE, the match() function returns a copy of the input string and a couple other extraneous values on a successful match, causing the entire string to be replaced by the first matched key (the original input).
To fix, I added this ugly piece of work inside the key loop:
    if (key == '0' || Number(key.toString())) {
        // replacement
    }
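The behavior described above is what one would expect when an array is walked with for...in, which visits every enumerable property and not only the numeric indices. The sketch below simulates an extra property by hand (the `input` property added here is an assumption standing in for whatever IE7 attached to the result) and contrasts for...in with an indexed loop.

```javascript
var str = '<p>Kevin</p>';
var matches = str.match(/(<\/?[\S][^>]*>)/gi) || [];

// Simulated extra property, standing in for the extraneous values
// reported for IE7; added manually here for the demonstration.
matches.input = str;

var viaForIn = [];
for (var key in matches) {
    viaForIn.push(matches[key]); // also picks up matches.input
}

var viaIndex = [];
for (var n = 0; n < matches.length; n++) {
    viaIndex.push(matches[n]); // numeric indices only
}

console.log(viaForIn); // [ '<p>', '</p>', '<p>Kevin</p>' ]
console.log(viaIndex); // [ '<p>', '</p>' ]
```

The isNaN(key) check in the function listing filters out such non-numeric keys while keeping the for...in loop; an indexed loop is an alternative that never sees them.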


Kevin van Zonneveld
4 Aug '09