JavaScript parse_url
Parse a URL and return its components
function parse_url (str, component) {
    // Parse a URL and return its components
    //
    // version: 1109.2015
    // discuss at: http://phpjs.org/functions/parse_url
    // +      original by: Steven Levithan (http://blog.stevenlevithan.com)
    // + reimplemented by: Brett Zamir (http://brett-zamir.me)
    // + input by: Lorenzo Pisani
    // + input by: Tony
    // + improved by: Brett Zamir (http://brett-zamir.me)
    // %          note: Based on http://stevenlevithan.com/demo/parseuri/js/assets/parseuri.js
    // %          note: blog post at http://blog.stevenlevithan.com/archives/parseuri
    // %          note: demo at http://stevenlevithan.com/demo/parseuri/js/assets/parseuri.js
    // %          note: Does not replace invalid characters with '_' as in PHP, nor does it return false with
    // %          note: a seriously malformed URL.
    // %          note: Besides function name, is essentially the same as parseUri as well as our allowing
    // %          note: an extra slash after the scheme/protocol (to allow file:/// as in PHP)
    // *     example 1: parse_url('http://username:password@hostname/path?arg=value#anchor');
    // *     returns 1: {scheme: 'http', host: 'hostname', user: 'username', pass: 'password', path: '/path', query: 'arg=value', fragment: 'anchor'}

    // Names for the regex capture groups, in order (index 0 is the whole match)
    var key = ['source', 'scheme', 'authority', 'userInfo', 'user', 'pass', 'host', 'port',
                'relative', 'path', 'directory', 'file', 'query', 'fragment'],
        // Optional php.js settings; relies on `this` being the global object (non-strict call)
        ini = (this.php_js && this.php_js.ini) || {},
        mode = (ini['phpjs.parse_url.mode'] && ini['phpjs.parse_url.mode'].local_value) || 'php',
        parser = {
            // Empty "()" groups keep the group numbering aligned with `key` while leaving those fields unset
            php: /^(?:([^:\/?#]+):)?(?:\/\/()(?:(?:()(?:([^:@]*):?([^:@]*))?@)?([^:\/?#]*)(?::(\d*))?))?()(?:(()(?:(?:[^?#\/]*\/)*)()(?:[^?#]*))(?:\?([^#]*))?(?:#(.*))?)/,
            strict: /^(?:([^:\/?#]+):)?(?:\/\/((?:(([^:@]*):?([^:@]*))?@)?([^:\/?#]*)(?::(\d*))?))?((((?:[^?#\/]*\/)*)([^?#]*))(?:\?([^#]*))?(?:#(.*))?)/,
            loose: /^(?:(?![^:@]+:[^:@\/]*@)([^:\/?#.]+):)?(?:\/\/\/?)?((?:(([^:@]*):?([^:@]*))?@)?([^:\/?#]*)(?::(\d*))?)(((\/(?:[^?#](?![^?#\/]*\.[^?#\/.]+(?:[?#]|$)))*\/?)?([^?#\/]*))(?:\?([^#]*))?(?:#(.*))?)/ // Added one optional slash to post-scheme to catch file:/// (should restrict this)
        };

    var m = parser[mode].exec(str),
        uri = {},
        i = 14;
    // Copy every non-empty capture group into the result under its name
    while (i--) {
        if (m[i]) {
            uri[key[i]] = m[i];
        }
    }

    // Accept either 'PHP_URL_HOST' or 'host' style component names
    if (component) {
        return uri[component.replace('PHP_URL_', '').toLowerCase()];
    }
    if (mode !== 'php') {
        var name = (ini['phpjs.parse_url.queryKey'] && ini['phpjs.parse_url.queryKey'].local_value) || 'queryKey';
        parser = /(?:^|&)([^&=]*)=?([^&]*)/g;
        uri[name] = {};
        // Fall back to '' so a URL without a query string does not throw here
        (uri[key[12]] || '').replace(parser, function ($0, $1, $2) {
            if ($1) {uri[name][$1] = $2;}
        });
    }
    delete uri.source;
    return uri;
}
Examples
Running
parse_url('http://username:password@hostname/path?arg=value#anchor');
Should return
{scheme: 'http', host: 'hostname', user: 'username', pass: 'password', path: '/path', query: 'arg=value', fragment: 'anchor'}
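The function also takes an optional second argument, component, which the code strips of a leading 'PHP_URL_' and lowercases before using it as a key into the result. The following is a condensed sketch of that behavior, assuming only the default php mode; parseUrlPhpMode is a hypothetical name used for illustration and is not part of php.js. Note that the component is passed as a string here, not as a PHP-style constant.

```javascript
// Condensed sketch of parse_url limited to the default 'php' mode.
// parseUrlPhpMode is a hypothetical name used only for this illustration.
function parseUrlPhpMode(str, component) {
    var key = ['source', 'scheme', 'authority', 'userInfo', 'user', 'pass', 'host', 'port',
                'relative', 'path', 'directory', 'file', 'query', 'fragment'];
    // Same 'php' mode regex as in parse_url
    var re = /^(?:([^:\/?#]+):)?(?:\/\/()(?:(?:()(?:([^:@]*):?([^:@]*))?@)?([^:\/?#]*)(?::(\d*))?))?()(?:(()(?:(?:[^?#\/]*\/)*)()(?:[^?#]*))(?:\?([^#]*))?(?:#(.*))?)/;
    var m = re.exec(str), uri = {}, i = 14;
    while (i--) {
        if (m[i]) { uri[key[i]] = m[i]; }
    }
    if (component) {
        // 'PHP_URL_HOST' and 'host' both resolve to the key 'host'
        return uri[component.replace('PHP_URL_', '').toLowerCase()];
    }
    delete uri.source;
    return uri;
}

var url = 'http://username:password@hostname/path?arg=value#anchor';
var host = parseUrlPhpMode(url, 'PHP_URL_HOST'); // PHP-style name
var query = parseUrlPhpMode(url, 'query');       // plain key name
var port = parseUrlPhpMode(url, 'PHP_URL_PORT'); // this URL has no port
console.log(host, query, port);
```

A component that is absent from the URL, such as the port above, comes back as undefined rather than null or false.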
Dependencies
No dependencies, you can use this function standalone.
Open syntax issues
php.js uses JSLint to keep the code consistent and to catch some common bugs.
Eventually we want all code to pass JSLint, or at least to take most of its suggested fixes into account, following the JSLint configuration we've decided on.
Authors
Thanks to the following developers, you get to have parse_url goodness in JavaScript.
@Aaron: Can you provide more details? I tried it on Apache, and it works fine. Or did you mean you tried it in SSJS? What regex and sample input are you using? One thing you can try is replacing each "()" in the php mode regex with the equivalent but longer "(.{0})". Maybe Firefox 4's new regex parser doesn't handle "()" properly under some conditions. If that doesn't work, let me know whether changing the default mode from "php" to "strict" or "loose" helps (see the line of code containing "|| 'php'").
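For anyone wanting to try the "(.{0})" substitution suggested above, here is a minimal sketch of one way to do it. It rewrites the php mode pattern programmatically and checks that both versions capture the same groups for the sample URL from the Examples section. The variable names are made up for this illustration, and the sketch does not show whether the substitution avoids the Firefox 4 error.

```javascript
// The 'php' mode regex from parse_url, unchanged
var phpRegex = /^(?:([^:\/?#]+):)?(?:\/\/()(?:(?:()(?:([^:@]*):?([^:@]*))?@)?([^:\/?#]*)(?::(\d*))?))?()(?:(()(?:(?:[^?#\/]*\/)*)()(?:[^?#]*))(?:\?([^#]*))?(?:#(.*))?)/;

// Replace every empty group "()" with "(.{0})", which also captures the empty string
var longSource = phpRegex.source.replace(/\(\)/g, '(.{0})');
var longRegex = new RegExp(longSource);

var sample = 'http://username:password@hostname/path?arg=value#anchor';
var shortMatch = Array.from(phpRegex.exec(sample));
var longMatch = Array.from(longRegex.exec(sample));

// Compare the two capture lists (unmatched groups serialize as null in both)
var sameCaptures = JSON.stringify(shortMatch) === JSON.stringify(longMatch);
console.log(sameCaptures, longMatch[6]);
```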
I've been using this script for two years. I just switched to FF4 and it now fails with the error "regular expression too complex". Firebug points to the line "var m = parser[mode].exec(str),". It fails only on a web server; it works fine locally.
@radekk: Can you clarify?
@Tony & @Lorenzo Pisani: Finally getting to this. The issue was simply that the loose mode had been chosen by default. The function should now work more like PHP. I also cleaned it up a bit and allowed custom ini settings to change the parsing mode (set by "phpjs.parse_url.mode"). "loose" mode is more useful when trying to guess at a user's imperfect input, but faulty, as you found out; "strict" parses the same way as the default "php" mode but offers more properties, including parsing the query string further.
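To make the ini mechanism described above concrete, here is a rough sketch of the settings shape the function looks for and of the extra query-string step that runs when the mode is not "php". The lookup expression and the query regex are taken from the function body; the standalone php_js variable and the parseQueryKeys helper are assumptions made for this illustration (in the function itself the settings are read from this.php_js).

```javascript
// Hypothetical stand-in for the settings object that parse_url reads via this.php_js.ini
var php_js = {
    ini: {
        'phpjs.parse_url.mode': { local_value: 'strict' }
    }
};

// Same lookup logic as in parse_url, applied to the object above
var ini = (php_js && php_js.ini) || {};
var mode = (ini['phpjs.parse_url.mode'] && ini['phpjs.parse_url.mode'].local_value) || 'php';

// Hypothetical helper isolating the query-splitting step used outside 'php' mode
function parseQueryKeys(query) {
    var pairs = {};
    (query || '').replace(/(?:^|&)([^&=]*)=?([^&]*)/g, function ($0, $1, $2) {
        if ($1) { pairs[$1] = $2; } // skip entries with an empty key
    });
    return pairs;
}

// In 'strict' or 'loose' mode the result gains a queryKey object alongside query
var queryKey = mode !== 'php' ? parseQueryKeys('arg=value&flag') : undefined;
console.log(mode, queryKey);
```

A key without a value, such as flag above, is stored with an empty string. The property name "queryKey" can itself be overridden through the "phpjs.parse_url.queryKey" ini setting, as the function body shows.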


Patrick
May 11th