Use PHP functions in JavaScript

JavaScript urlencode

URL-encodes a string.

function urlencode (str) {
    // URL-encodes string
    //
    // version: 1008.1718
    // discuss at: http://phpjs.org/functions/urlencode
    // +   original by: Philip Peterson
    // +   improved by: Kevin van Zonneveld (http://kevin.vanzonneveld.net)
    // +      input by: AJ
    // +   improved by: Kevin van Zonneveld (http://kevin.vanzonneveld.net)
    // +   improved by: Brett Zamir (http://brett-zamir.me)
    // +   bugfixed by: Kevin van Zonneveld (http://kevin.vanzonneveld.net)
    // +      input by: travc
    // +      input by: Brett Zamir (http://brett-zamir.me)
    // +   bugfixed by: Kevin van Zonneveld (http://kevin.vanzonneveld.net)
    // +   improved by: Lars Fischer
    // +      input by: Ratheous
    // +      reimplemented by: Brett Zamir (http://brett-zamir.me)
    // +   bugfixed by: Joris
    // +      reimplemented by: Brett Zamir (http://brett-zamir.me)
    // %        note 1: This reflects PHP 5.3/6.0+ behavior
    // %        note 2: Please be aware that this function expects to encode into UTF-8 encoded strings, as found on
    // %        note 2: pages served as UTF-8
    // *     example 1: urlencode('Kevin van Zonneveld!');
    // *     returns 1: 'Kevin+van+Zonneveld%21'
    // *     example 2: urlencode('http://kevin.vanzonneveld.net/');
    // *     returns 2: 'http%3A%2F%2Fkevin.vanzonneveld.net%2F'
    // *     example 3: urlencode('http://www.google.nl/search?q=php.js&ie=utf-8&oe=utf-8&aq=t&rls=com.ubuntu:en-US:unofficial&client=firefox-a');
    // *     returns 3: 'http%3A%2F%2Fwww.google.nl%2Fsearch%3Fq%3Dphp.js%26ie%3Dutf-8%26oe%3Dutf-8%26aq%3Dt%26rls%3Dcom.ubuntu%3Aen-US%3Aunofficial%26client%3Dfirefox-a'
    str = (str + '').toString();
    // Tilde should be allowed unescaped in future versions of PHP (as reflected below), but if you want to reflect current
    // PHP behavior, you would need to add ".replace(/~/g, '%7E');" to the following.
    return encodeURIComponent(str).replace(/!/g, '%21').replace(/'/g, '%27').replace(/\(/g, '%28').
        replace(/\)/g, '%29').replace(/\*/g, '%2A').replace(/%20/g, '+');
}
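
The comment inside the function mentions appending `.replace(/~/g, '%7E')` to reproduce PHP's tilde escaping. Below is a minimal sketch of that variant. The name `urlencodeTilde` is made up for this example and is not part of php.js. PHP's own documentation for urlencode() lists only `-`, `_` and `.` as characters left unescaped, so this variant may well be the closer match to what a PHP server produces.

```javascript
// Hypothetical variant (not part of php.js): the same replacements as
// urlencode above, plus the tilde replacement suggested in its comment.
function urlencodeTilde(str) {
    str = String(str);
    return encodeURIComponent(str)
        .replace(/!/g, '%21')
        .replace(/'/g, '%27')
        .replace(/\(/g, '%28')
        .replace(/\)/g, '%29')
        .replace(/\*/g, '%2A')
        .replace(/~/g, '%7E')
        .replace(/%20/g, '+');
}

console.log(urlencodeTilde('~user/my file')); // %7Euser%2Fmy+file
```

The tilde replacement is safe to chain in any position because `~` never appears inside a `%XX` escape.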

Examples

» Example 1

Running

urlencode('Kevin van Zonneveld!');

Should return

'Kevin+van+Zonneveld%21'

» Example 2

Running

urlencode('http://kevin.vanzonneveld.net/');

Should return

'http%3A%2F%2Fkevin.vanzonneveld.net%2F'

Dependencies

No dependencies, you can use this function standalone.
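
Since there are no dependencies, the function can be pasted into a script as-is. Here is a minimal sketch of one possible use: building a query string from a plain object. `buildQuery` is a hypothetical helper written only for this example; for real use, php.js ships its own http_build_query.

```javascript
function urlencode(str) {
    str = (str + '').toString();
    return encodeURIComponent(str).replace(/!/g, '%21').replace(/'/g, '%27').replace(/\(/g, '%28').
        replace(/\)/g, '%29').replace(/\*/g, '%2A').replace(/%20/g, '+');
}

// Hypothetical helper, for illustration only: encodes each key and value
// separately, then joins the pairs with '&'.
function buildQuery(params) {
    var pairs = [];
    for (var key in params) {
        if (Object.prototype.hasOwnProperty.call(params, key)) {
            pairs.push(urlencode(key) + '=' + urlencode(params[key]));
        }
    }
    return pairs.join('&');
}

console.log(buildQuery({ q: 'php.js & friends', lang: 'en-US' }));
// q=php.js+%26+friends&lang=en-US
```

Encoding keys and values one at a time, rather than the finished string, keeps the `=` and `&` separators intact.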

Open syntax issues

php.js uses JSLint to keep our code consistent and to prevent some common bugs.

Eventually we want all code to pass JSLint, or at least to take most of its suggested fixes into consideration, following the JSLint configuration we’ve decided on.


Authors

Thanks to the following developers, you get to have urlencode goodness in JavaScript.

Comments


Mohsen Haeri
19 Nov '09

Thank you very much...

Kevin van Zonneveld
7 Nov '09

Brett, thanks so much for your research. This stuff is rocking so hard.. : ) you make me proud man!

Brett Zamir
28 Oct '09

Ok,

I've updated git (at http://github.com/kvz/phpjs/commit/2691be636ea1d3f8d035bfbe11fb2e05657b48da ) with a simpler (and faster) implementation based on encodeURIComponent (for all the urlencode/decode functions), fully adjusted to how PHP is SUPPOSED to behave as of PHP 5.3/6.0 (though I haven't seen news of it yet). If you want PHP's current behavior, you should add the following to the encode functions (since encodeURIComponent() doesn't do it):

.replace(/~/g, '%7E');



...since PHP at present still encodes the tilde, while later RFCs have let it be unencoded.

(The decode functions in PHP already can decode the tilde ok, so no need to "correct" here.)

Two other lessons learned (I hope) from RFC3986 (at http://labs.apache.org/webarch/uri/rfc/rfc3986.html ):
1) "!", "'", "(", ")", and "*" are now reserved, even though they have no official URI-delimiting function. (They were not yet reserved when encodeURIComponent was added to JavaScript, which is why its output is outdated and has to be corrected.) As I understand it, reserving them indicates that they, like the other items in the group to which they belong (e.g., "&" and "="), are generally not safe to use as-is without escaping. I guess it also allows them to be used for unofficial purposes.
2) PHP has no analogue of JavaScript's encodeURI() (in the way that urlencode() and rawurlencode() pretty much correspond to encodeURIComponent), so we don't have to worry about it for php.js. Still, encodeURI() is another place where JavaScript is a little behind the times: it should stop escaping square brackets, which have been made reserved so that they can be used with IPv6 (as delimiters for an IP literal in the 'host'). One might thus "fix" encodeURI as follows (but NOT encodeURIComponent, which is SUPPOSED to escape delimiters like '/' and now '['):

function fixedEncodeURI (str) {
    return encodeURI(str).replace(/%5B/g, '[').replace(/%5D/g, ']');
}
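
To see what this bracket fix changes in practice, here is a small sketch that runs an IPv6 URL through both functions. The function is repeated with an explicit `str` parameter so the snippet runs on its own; the sample URL is made up for the example.

```javascript
function fixedEncodeURI(str) {
    // Let encodeURI do its work, then restore the square brackets
    // that it escaped.
    return encodeURI(str).replace(/%5B/g, '[').replace(/%5D/g, ']');
}

var url = 'http://[::1]:8080/some path';
console.log(encodeURI(url));      // http://%5B::1%5D:8080/some%20path
console.log(fixedEncodeURI(url)); // http://[::1]:8080/some%20path
```

Note that this also restores brackets anywhere else in the URL, not only in the host, so it is only a rough fix.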



For the record, we can do all of the straight replaces above because UTF-8 only uses bytes 0x00 to 0x7F for single-byte ASCII; these bytes can therefore be safely replaced back and forth between their escaped and unescaped forms without fear that they are being used as part of a multi-byte sequence.
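
A quick way to convince yourself of this is the sketch below, which pulls the byte values out of the escapes produced for two non-ASCII characters and checks that none of them falls in the ASCII range. `escapedBytes` is a hypothetical helper written just for this check.

```javascript
// Hypothetical helper: collect the byte value behind every %XX escape
// in an already-encoded string.
function escapedBytes(encoded) {
    var matches = encoded.match(/%[0-9A-F]{2}/g) || [];
    return matches.map(function (m) {
        return parseInt(m.slice(1), 16);
    });
}

var bytes = escapedBytes(encodeURIComponent('£€'));
console.log(bytes.map(function (b) {
    return b.toString(16).toUpperCase();
}).join(' ')); // C2 A3 E2 82 AC
console.log(bytes.every(function (b) {
    return b >= 0x80;
})); // true
```

Because every byte of a multi-byte sequence is 0x80 or higher, a replace such as `/%20/g` or `/!/g` can only ever touch a genuine ASCII character.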

Again, folks, before submitting patches, please keep in mind that our encoding/decoding here assumes UTF-8; you have to serve your PHP pages with a UTF-8 header (as you should) if you want comparable behavior on the PHP side.

Below is the old version, just for easy reference (e.g., if you happen to want to know how to make your own UTF-8 octets):

function urlencode (str) {
    // http://kevin.vanzonneveld.net
    // +   original by: Philip Peterson
    // +   improved by: Kevin van Zonneveld (http://kevin.vanzonneveld.net)
    // +      input by: AJ
    // +   improved by: Kevin van Zonneveld (http://kevin.vanzonneveld.net)
    // +   improved by: Brett Zamir (http://brett-zamir.me)
    // +   bugfixed by: Kevin van Zonneveld (http://kevin.vanzonneveld.net)
    // +      input by: travc
    // +      input by: Brett Zamir (http://brett-zamir.me)
    // +   bugfixed by: Kevin van Zonneveld (http://kevin.vanzonneveld.net)
    // +   improved by: Lars Fischer
    // +      input by: Ratheous
    // +      reimplemented by: Brett Zamir (http://brett-zamir.me)
    // +   bugfixed by: Joris
    // %        note 1: This reflects PHP 5.3/6.0+ behavior
    // *     example 1: urlencode('Kevin van Zonneveld!');
    // *     returns 1: 'Kevin+van+Zonneveld%21'
    // *     example 2: urlencode('http://kevin.vanzonneveld.net/');
    // *     returns 2: 'http%3A%2F%2Fkevin.vanzonneveld.net%2F'
    // *     example 3: urlencode('http://www.google.nl/search?q=php.js&ie=utf-8&oe=utf-8&aq=t&rls=com.ubuntu:en-US:unofficial&client=firefox-a');
    // *     returns 3: 'http%3A%2F%2Fwww.google.nl%2Fsearch%3Fq%3Dphp.js%26ie%3Dutf-8%26oe%3Dutf-8%26aq%3Dt%26rls%3Dcom.ubuntu%3Aen-US%3Aunofficial%26client%3Dfirefox-a'

    // Format one byte value as an uppercase %XX escape
    var hexStr = function (dec) {
        return '%' + (dec < 16 ? '0' : '') + dec.toString(16).toUpperCase();
    };

    var ret = '',
        unreserved = /[\w.-]/; // A-Za-z0-9_.- // Tilde is not here for historical reasons; to preserve it, use rawurlencode instead
    str = (str + '').toString();

    for (var i = 0, dl = str.length; i < dl; i++) {
        var ch = str.charAt(i);
        if (unreserved.test(ch)) {
            ret += ch;
        }
        else {
            var code = str.charCodeAt(i);
            if (0xD800 <= code && code <= 0xDBFF) { // High surrogate (could change last hex to 0xDB7F to treat high private surrogates as single characters); https://developer.mozilla.org/index.php?title=en/Core_JavaScript_1.5_Reference/Global_Objects/String/charCodeAt
                // Combine with the following low surrogate into a single code point
                code = ((code - 0xD800) * 0x400) + (str.charCodeAt(i + 1) - 0xDC00) + 0x10000;
                i++; // skip the next one as we just retrieved it as a low surrogate
            }
            // We never come across a low surrogate because we skip them, unless invalid
            // Reserved assumed to be in UTF-8, as in PHP
            if (code === 32) {
                ret += '+'; // %20 in rawurlencode
            }
            else if (code < 128) { // 1 byte
                ret += hexStr(code);
            }
            else if (code < 2048) { // 2 bytes
                ret += hexStr((code >> 6) | 0xC0);
                ret += hexStr((code & 0x3F) | 0x80);
            }
            else if (code < 65536) { // 3 bytes
                ret += hexStr((code >> 12) | 0xE0);
                ret += hexStr(((code >> 6) & 0x3F) | 0x80);
                ret += hexStr((code & 0x3F) | 0x80);
            }
            else { // 4 bytes (code points above 0xFFFF, built from a surrogate pair)
                ret += hexStr((code >> 18) | 0xF0);
                ret += hexStr(((code >> 12) & 0x3F) | 0x80);
                ret += hexStr(((code >> 6) & 0x3F) | 0x80);
                ret += hexStr((code & 0x3F) | 0x80);
            }
        }
    }
    return ret;
}

Brett Zamir
14 Oct '09

@Donovan Walker: Yeah, except that that's not perfectly equivalent to what urlencode() does.

Donovan Walker
13 Oct '09

Have to love this:
http://phpjs.org/functions/urlencode:573#comment_1090

Brett Zamir
18 Jun '09

@Martin, to add to my comment just now, I see escape() would do the trick, but that is deprecated, again because it assumes Latin-1.

Brett Zamir
18 Jun '09

Thanks, Martin. Can you explain why we don't want UTF-8 though? I see when I test this with PHP, if the file is encoded in UTF-8, I get the same results. Given the tide turning toward UTF-8, not to mention its compatibility with all languages, I think it's best to try for that, no?

I guess we could add a custom "phpjs." configuration option (triggered through our ini_set() which allowed for other character sets), but we'd probably want to use some generic algorithm to translate assuming Latin-1 input (or whatever) rather than adding character conversions case by case. What do you think?

Martin Allchin
18 Jun '09

The UK pound sign (£) encodes with multiple escape sequences giving:

%C2%A3



rather than

%A3



This is due to conversion into UTF-8. I suggest adding the following into the histogram array as a simple fix:

histogram['%C2%A3'] = '%A3';
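
The two outputs in this exchange are the same code point, U+00A3, written in two different encodings. The sketch below reproduces both. `latin1Escape` is a hypothetical helper, not part of php.js, and only illustrates the kind of generic Latin-1 handling discussed here instead of patching characters one by one.

```javascript
// Hypothetical helper: percent-encode one character as a single Latin-1 byte.
// Only code points up to 0xFF exist in Latin-1, so anything else throws.
function latin1Escape(ch) {
    var code = ch.charCodeAt(0);
    if (code > 0xFF) {
        throw new RangeError('Not representable in Latin-1: ' + ch);
    }
    return '%' + (code < 16 ? '0' : '') + code.toString(16).toUpperCase();
}

console.log(encodeURIComponent('£')); // %C2%A3 (two UTF-8 bytes)
console.log(latin1Escape('£'));       // %A3    (one Latin-1 byte)
```

Which output is "right" depends entirely on the character set the receiving PHP page is served in.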


Kevin van Zonneveld
10 Oct '08

@ bukura: Unfortunately we have had some bad experiences with escape, as it does not provide PHP-compatible output. Please also see the link we refer to in the script.

bukura
9 Oct '08

[CODE="Javascript"]
function urlencode (str) {
    var res = "";
    for (var i = 0; i < str.length; i++) {
        if (str.charAt(i) == ' ') {
            res += '+';
        } else {
            res += escape(str.charAt(i));
        }
    }
    return res;
}
[/CODE]

Kevin van Zonneveld
29 Aug '08

@ AJ: I've rewritten the urlencoding functions, should be a great improvement! Thanks for your input.

AJ
28 Aug '08

It's a good function, but it needs to encode the forward slash character as well. I'd recommend adding the following line before the return statement:

[CODE="Javascript"]
ret = ret.replace(/\//g,'%2F');
[/CODE]

Short of going into the PHP source, this seems to work reasonably similarly.

Kevin van Zonneveld
27 Aug '08

@ johnrembo: Hi John, thanks for your input again. We had some discussion about it earlier. On its own, encodeURIComponent doesn't mimic PHP's behaviour closely enough. The differences between JavaScript's encoding functions are compared here: http://xkr.us/articles/javascript/encode-compare/
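
For a rough feel of how the three built-in functions differ, here is a small sketch that runs one sample string through each of them. The sample string is arbitrary and only exercises a handful of characters; it is not a full comparison.

```javascript
var sample = 'a b&c/d!';

var results = {
    escape: escape(sample),
    encodeURI: encodeURI(sample),
    encodeURIComponent: encodeURIComponent(sample)
};

console.log(results.escape);             // a%20b%26c/d%21
console.log(results.encodeURI);          // a%20b&c/d!
console.log(results.encodeURIComponent); // a%20b%26c%2Fd!
```

None of the three turns the space into `+` or escapes both `/` and `!`, which is why urlencode needs its extra replaces on top of encodeURIComponent.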

johnrembo
27 Aug '08

[CODE="Javascript"]
function urlencode (str) {
return encodeURIComponent(str);
}
[/CODE]

Kevin van Zonneveld
18 Apr '08

Yeah I did it because Michael reached the conclusion that encodeURIComponent had better PHP compatibility.

I guess the tester doesn't work because in its current form it fails to handle \n characters, and maybe the exclamation mark gets translated twice; I have to double-check that.

Discussion on encodeURIComponent vs escape can be found here:
http://kevin.vanzonneveld.net/techblog/article/javascript_equivalent_for_phps_http_build_query/#comment_1071

If you reach a different conclusion, please let me know ok?

Philip Peterson
18 Apr '08

Woah... strike that, apparently it's because you replaced escape with encodeURIComponent? They function a little bit differently, and escape() is the most similar to PHP's functionality.

Philip Peterson
18 Apr '08

Just so you know, in phpjs_tester, the examples for urlencode and nl2br are both wrong (they don't just not work). ;-)


