Use PHP functions in JavaScript

JavaScript rawurlencode

URL-encodes a string.

function rawurlencode (str) {
    // URL-encodes string
    //
    // version: 911.718
    // discuss at: http://phpjs.org/functions/rawurlencode
    // +   original by: Brett Zamir (http://brett-zamir.me)
    // +      input by: travc
    // +      input by: Brett Zamir (http://brett-zamir.me)
    // +   bugfixed by: Kevin van Zonneveld (http://kevin.vanzonneveld.net)
    // +      input by: Michael Grier
    // +   bugfixed by: Brett Zamir (http://brett-zamir.me)
    // +      input by: Ratheous
    // +      reimplemented by: Brett Zamir (http://brett-zamir.me)
    // +   bugfixed by: Joris
    // +      reimplemented by: Brett Zamir (http://brett-zamir.me)
    // %        note 1: This reflects PHP 5.3/6.0+ behavior
    // %        note 2: Please be aware that this function expects to encode into UTF-8 encoded strings, as found on
    // %        note 2: pages served as UTF-8
    // *     example 1: rawurlencode('Kevin van Zonneveld!');
    // *     returns 1: 'Kevin%20van%20Zonneveld%21'
    // *     example 2: rawurlencode('http://kevin.vanzonneveld.net/');
    // *     returns 2: 'http%3A%2F%2Fkevin.vanzonneveld.net%2F'
    // *     example 3: rawurlencode('http://www.google.nl/search?q=php.js&ie=utf-8&oe=utf-8&aq=t&rls=com.ubuntu:en-US:unofficial&client=firefox-a');
    // *     returns 3: 'http%3A%2F%2Fwww.google.nl%2Fsearch%3Fq%3Dphp.js%26ie%3Dutf-8%26oe%3Dutf-8%26aq%3Dt%26rls%3Dcom.ubuntu%3Aen-US%3Aunofficial%26client%3Dfirefox-a'
    str = (str + '').toString();
    // Tilde should be allowed unescaped in future versions of PHP (as reflected below), but if you want to reflect current
    // PHP behavior, you would need to add ".replace(/~/g, '%7E');" to the following.
    return encodeURIComponent(str).replace(/!/g, '%21').replace(/'/g, '%27').replace(/\(/g, '%28').
        replace(/\)/g, '%29').replace(/\*/g, '%2A');
}
external links: original PHP docs | raw js source

Examples

» Example 1

Running

rawurlencode('Kevin van Zonneveld!');

Should return

'Kevin%20van%20Zonneveld%21'

» Example 2

Running

rawurlencode('http://kevin.vanzonneveld.net/');

Should return

'http%3A%2F%2Fkevin.vanzonneveld.net%2F'
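
» Why the extra replace calls

The function needs its chained replace calls because encodeURIComponent leaves `!`, `'`, `(`, `)` and `*` unescaped. The sketch below repeats the function from this page so that it runs standalone, and adds a hypothetical `rawurlencodeLegacy` helper (not part of php.js) for the tilde behavior that the source comments attribute to older PHP versions.

```javascript
// Same logic as the function on this page, repeated so the snippet is self-contained.
function rawurlencode(str) {
    str = String(str);
    return encodeURIComponent(str)
        .replace(/!/g, '%21')
        .replace(/'/g, '%27')
        .replace(/\(/g, '%28')
        .replace(/\)/g, '%29')
        .replace(/\*/g, '%2A');
}

// Hypothetical helper (not part of php.js): additionally escapes "~",
// following the hint in the source comments about pre-5.3 PHP behavior.
function rawurlencodeLegacy(str) {
    return rawurlencode(str).replace(/~/g, '%7E');
}

console.log(encodeURIComponent("!'()*")); // !'()*
console.log(rawurlencode("!'()*"));       // %21%27%28%29%2A
console.log(rawurlencode('~user'));       // ~user
console.log(rawurlencodeLegacy('~user')); // %7Euser
```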

Dependencies

No dependencies, you can use this function standalone.

Open syntax issues

php.js uses JSLint to keep the code consistent and to prevent some common bugs.

Eventually we want all code to pass JSLint, or at least to take most of its suggested fixes into consideration, following the JSLint configuration we have decided on.


Authors

Thanks to the following developers, you get to have rawurlencode goodness in JavaScript.

Comments


Joris van der Wel
29 Sep '09 Permalink

heh :)

Well, if a high surrogate is found, the i++; is just there so we do not loop over the low surrogate the next time.
It then goes all the way to

if (code >= 65536) { // 4 byte

to turn it into UTF-8.

That is just me accounting for the remote possibility that the specification changes (i.e. charCodeAt returning something bigger than 65535).
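
To illustrate the surrogate handling discussed here, the following is a minimal sketch that walks a string and combines each high/low surrogate pair into one code point, using the extra i++ to skip the low surrogate. The helper name `codePoints` is hypothetical and not part of php.js; unlike the version posted in this thread, it also checks that a low surrogate really follows.

```javascript
// Hypothetical helper: returns the Unicode code points of a UTF-16 string.
function codePoints(str) {
    var out = [];
    for (var i = 0; i < str.length; i++) {
        var code = str.charCodeAt(i);
        // High surrogate followed by a low surrogate: combine the pair.
        if (code >= 0xD800 && code <= 0xDBFF && i + 1 < str.length) {
            var low = str.charCodeAt(i + 1);
            if (low >= 0xDC00 && low <= 0xDFFF) {
                code = ((code - 0xD800) * 0x400) + (low - 0xDC00) + 0x10000;
                i++; // skip the low surrogate on the next iteration
            }
        }
        out.push(code);
    }
    return out;
}

var s = 'a\uD834\uDF06'; // "a" followed by U+1D306
console.log(s.length);                 // 3 (UTF-16 code units, not characters)
console.log(s.charCodeAt(1) >= 65536); // false: charCodeAt returns one code unit
console.log(codePoints(s));            // [ 97, 119558 ]
```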

Funny thing is, I actually wrote my own rawurlencode function before finding this one and it was nearly identical.

Brett Zamir
10 Sep '09 Permalink

@Joris: Good catch about the non-BMP code points; ironic you caught me making the mistake, since I was the one who edited the article you cited for the correction to point this problem out! :) That's what I get for adapting someone else's pattern without thinking... Anyways, your addition is good, except that it should not assign to "code" but instead to "ret" and then do a "continue" after the "i++" or ensure we are in a continuous else/else-if block (I chose the latter). Also, thanks for the catch on the hex needing two chars min... Fixed in git...

Joris
9 Sep '09 Permalink

This function does not work properly for 4-byte Unicode characters. Browsers use UTF-16 for strings, which means any Unicode character at or above 65536 (U+10000) is split into two surrogate values.

So "code >= 65536" is NEVER true.
Oh, and PHP always makes sure a percent-encoded value is composed of two hex digits.
Here is a version that does urlencode as if the string were really UTF-8:

function rawurlencode (str) {
    var hexStr = function (dec) {
        return '%' + (dec < 16 ? '0' : '') + dec.toString(16).toUpperCase();
    };

    var ret = '',
        unreserved = /[\w.~-]/; // A-Za-z0-9_.~-
    str = (str + '').toString();

    for (var i = 0, dl = str.length; i < dl; i++) {
        var ch = str.charAt(i);
        if (unreserved.test(ch)) {
            ret += ch;
        }
        else {
            var code = str.charCodeAt(i);
            // High surrogate (could change last hex to 0xDB7F to treat high private surrogates as single characters);
            // https://developer.mozilla.org/index.php?title=en/Core_JavaScript_1.5_Reference/Global_Objects/String/charCodeAt&revision=39
            if (0xD800 <= code && code <= 0xDBFF) {
                code = ((code - 0xD800) * 0x400) + (str.charCodeAt(i + 1) - 0xDC00) + 0x10000;
                i++; // skip the next one
            }
            // We never come across a low surrogate because we skip them

            // Reserved assumed to be in UTF-8, as in PHP
            if (code < 128) { // 1 byte
                ret += hexStr(code);
            }
            else if (code >= 128 && code < 2048) { // 2 bytes
                ret += hexStr((code >> 6) | 0xC0);
                ret += hexStr((code & 0x3F) | 0x80);
            }
            else if (code >= 2048 && code < 65536) { // 3 bytes
                ret += hexStr((code >> 12) | 0xE0);
                ret += hexStr(((code >> 6) & 0x3F) | 0x80);
                ret += hexStr((code & 0x3F) | 0x80);
            }
            else if (code >= 65536) { // 4 bytes
                ret += hexStr((code >> 18) | 0xF0);
                ret += hexStr(((code >> 12) & 0x3F) | 0x80);
                ret += hexStr(((code >> 6) & 0x3F) | 0x80);
                ret += hexStr((code & 0x3F) | 0x80);
            }
        }
    }
    return ret;
}


Gr. Joris
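
As a quick check on the byte layout used in this comment, here is a minimal sketch that percent-encodes a single code point as UTF-8 and compares the result with encodeURIComponent. `hexStr` follows the comment above; `utf8Percent` is a hypothetical name introduced only for this illustration.

```javascript
// Two-digit uppercase hex with a leading "%", as in the comment above.
function hexStr(dec) {
    return '%' + (dec < 16 ? '0' : '') + dec.toString(16).toUpperCase();
}

// Hypothetical helper: percent-encodes one Unicode code point as UTF-8 bytes.
function utf8Percent(code) {
    if (code < 0x80) { // 1 byte
        return hexStr(code);
    }
    if (code < 0x800) { // 2 bytes
        return hexStr((code >> 6) | 0xC0) +
               hexStr((code & 0x3F) | 0x80);
    }
    if (code < 0x10000) { // 3 bytes
        return hexStr((code >> 12) | 0xE0) +
               hexStr(((code >> 6) & 0x3F) | 0x80) +
               hexStr((code & 0x3F) | 0x80);
    }
    // 4 bytes
    return hexStr((code >> 18) | 0xF0) +
           hexStr(((code >> 12) & 0x3F) | 0x80) +
           hexStr(((code >> 6) & 0x3F) | 0x80) +
           hexStr((code & 0x3F) | 0x80);
}

console.log(hexStr(10));           // %0A (always two hex digits)
console.log(utf8Percent(0xE9));    // %C3%A9
console.log(utf8Percent(0x20AC));  // %E2%82%AC
console.log(utf8Percent(0x1D306)); // %F0%9D%8C%86
console.log(utf8Percent(0x1D306) === encodeURIComponent('\uD834\uDF06')); // true
```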

Brett Zamir
2 Jun '09 Permalink

Even encodeURIComponent differs. See http://www.devpro.it/examples/php_js_escaping.php

Kankrelune
2 Jun '09 Permalink

It's not exactly the same list of characters in escape and rawurlencode...

The escape and unescape functions do not work properly for non-ASCII characters and have been deprecated. In JavaScript 1.5 and later, use encodeURI or encodeURIComponent... ;o)

@ tchaOo°
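
A small sketch of the difference, assuming an environment where the legacy global escape is still available (browsers and Node.js): escape leaves characters such as "/", "+" and "@" untouched and does not produce UTF-8 byte sequences for non-ASCII input.

```javascript
// Non-ASCII: escape uses Latin-1 / %uXXXX forms, not UTF-8 bytes.
console.log(escape('\u00E9'));             // %E9
console.log(encodeURIComponent('\u00E9')); // %C3%A9
console.log(escape('\u20AC'));             // %u20AC
console.log(encodeURIComponent('\u20AC')); // %E2%82%AC

// ASCII punctuation: escape leaves "/", "+" and "@" unescaped.
console.log(escape('a/b+c@d'));             // a/b+c@d
console.log(encodeURIComponent('a/b+c@d')); // a%2Fb%2Bc%40d
```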

Me
1 Jun '09 Permalink

Isn't this simpler, and doesn't it achieve the same result?

escape(str);

Brett Zamir
21 Apr '09 Permalink

Good catch! I'm not sure how that happened, but it is now fixed in SVN. I've actually been meaning to review these functions, as I'm not 100% sure now that the recent changes to the histogram have all been correct, at least for all functions...

Michael Grier
21 Apr '09 Permalink

Not encoding spaces is not the behavior of rawurlencode or urlencode, for that matter.

urlencode and rawurlencode both encode anything that is not "A to Z", "a to z", "0 to 9", "-", "_" or "." ... the only difference between them is how spaces are encoded... urlencode encodes spaces as "+" and rawurlencode encodes spaces as "%20".
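
Taking the description above at face value (spaces being the only difference), a urlencode-style function could be sketched on top of rawurlencode as shown below. `urlencodeSketch` is a hypothetical name, not the php.js urlencode, and it deliberately ignores any other differences between the two PHP functions, such as tilde handling in newer PHP versions.

```javascript
// Same logic as the rawurlencode on this page, repeated so the snippet runs standalone.
function rawurlencode(str) {
    str = String(str);
    return encodeURIComponent(str)
        .replace(/!/g, '%21')
        .replace(/'/g, '%27')
        .replace(/\(/g, '%28')
        .replace(/\)/g, '%29')
        .replace(/\*/g, '%2A');
}

// Hypothetical sketch: a literal "+" or "%" in the input has already been
// encoded as %2B or %25, so rewriting %20 to "+" is unambiguous.
function urlencodeSketch(str) {
    return rawurlencode(str).replace(/%20/g, '+');
}

console.log(rawurlencode('Kevin van Zonneveld!'));    // Kevin%20van%20Zonneveld%21
console.log(urlencodeSketch('Kevin van Zonneveld!')); // Kevin+van+Zonneveld%21
console.log(urlencodeSketch('1 + 1 = 100%'));         // 1+%2B+1+%3D+100%25
```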


More functions

In this category

base64_decode
base64_encode
get_headers
get_meta_tags
http_build_query
parse_url
rawurldecode
» rawurlencode
urldecode
urlencode
