Use PHP functions in JavaScript

JavaScript get_meta_tags

Extracts all meta tag content attributes from a file and returns an array

function get_meta_tags (file) {
    // Extracts all meta tag content attributes from a file and returns an array
    //
    // version: 905.3122
    // discuss at: http://phpjs.org/functions/get_meta_tags
    // +   original by: Brett Zamir (http://brett-zamir.me)
    // %        note 1: This function uses XmlHttpRequest and cannot retrieve resource from different domain.
    // %        note 2: Synchronous so may lock up browser, mainly here for study purposes.
    // -    depends on: file_get_contents
    // *     example 1: get_meta_tags('http://kevin.vanzonneveld.net/pj_test_supportfile_2.htm');
    // *     returns 1: {description: 'a php manual', author: 'name', keywords: 'php documentation', 'geo_position': '49.33;-86.59'}
    var fulltxt = '';

    if (false) {
        // Use this for testing instead of the line in the else branch:
        fulltxt = '<meta name="author" content="name">'+
        '<meta name="keywords" content="php documentation">'+
        '<meta name="DESCRIPTION" content="a php manual">'+
        '<meta name="geo.position" content="49.33;-86.59">'+
        '</head>';
    } else {
        // [\s\S] matches any character, including newlines, so the match can span lines
        fulltxt = this.file_get_contents(file).match(/^[\s\S]*<\/head>/i);
    }

    // patt finds each meta tag; patt1 handles name before content, patt2 content before name
    var patt = /<meta[^>]*?>/gim;
    var patt1 = /<meta\s+.*?name\s*=\s*(['"]?)(.*?)\1\s+.*?content\s*=\s*(['"]?)(.*?)\3/gim;
    var patt2 = /<meta\s+.*?content\s*=\s*(['"]?)(.*?)\1\s+.*?name\s*=\s*(['"]?)(.*?)\3/gim;
    var txt, match, name, arr={};

    while ((txt = patt.exec(fulltxt)) !== null) {
        while ((match = patt1.exec(txt)) !== null) {
            // Non-word characters in the name become underscores, as in PHP
            name = match[2].replace(/\W/g, '_').toLowerCase();
            arr[name] = match[4];
        }
        while ((match = patt2.exec(txt)) !== null) {
            name = match[4].replace(/\W/g, '_').toLowerCase();
            arr[name] = match[2];
        }
    }
    return arr;
}

Examples

Running

get_meta_tags('http://kevin.vanzonneveld.net/pj_test_supportfile_2.htm');

Should return

{description: 'a php manual', author: 'name', keywords: 'php documentation', 'geo_position': '49.33;-86.59'}
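
The example above needs a page on the same domain. As a rough sketch for trying the parsing step offline, the following applies the same regular-expression approach to an in-memory string. The helper name metaTagsFromString and the sample markup are assumptions made for illustration; they are not part of php.js.

```javascript
// Hypothetical helper (not part of php.js): parses meta tags from a string
// instead of fetching a file, using the same regular-expression approach.
function metaTagsFromString(html) {
    // Keep only the text up to the closing </head> tag, if there is one.
    var head = html.match(/^[\s\S]*<\/head>/i);
    var fulltxt = head ? head[0] : '';

    var tagPattern = /<meta[^>]*?>/gi;
    // One pattern for name-before-content, one for content-before-name.
    var nameFirst = /<meta\s+.*?name\s*=\s*(['"]?)(.*?)\1\s+.*?content\s*=\s*(['"]?)(.*?)\3/i;
    var contentFirst = /<meta\s+.*?content\s*=\s*(['"]?)(.*?)\1\s+.*?name\s*=\s*(['"]?)(.*?)\3/i;

    var result = {};
    var tag, m;
    while ((tag = tagPattern.exec(fulltxt)) !== null) {
        if ((m = nameFirst.exec(tag[0])) !== null) {
            result[m[2].replace(/\W/g, '_').toLowerCase()] = m[4];
        } else if ((m = contentFirst.exec(tag[0])) !== null) {
            result[m[4].replace(/\W/g, '_').toLowerCase()] = m[2];
        }
    }
    return result;
}

// Sample markup (an assumption, modeled on the test string in the source).
var sampleHead = '<head>' +
    '<meta name="author" content="name">' +
    '<meta name="keywords" content="php documentation">' +
    '<meta name="DESCRIPTION" content="a php manual">' +
    '<meta name="geo.position" content="49.33;-86.59">' +
    '</head><body></body>';

console.log(metaTagsFromString(sampleHead));
```

This sketch only handles quoted attribute values reliably; unquoted values are a known weak spot of the lazy patterns.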

Dependencies

In order to use this function, you also need: file_get_contents

Open syntax issues

php.js uses JSLint to help us keep our code consistent and prevent some common bugs.

Eventually we want all code to pass, or at least to take into consideration most of the fixes suggested by JSLint, following the JSLint configuration we’ve decided on.


Authors

Thanks to the following developers, you get to have get_meta_tags goodness in JavaScript: Brett Zamir (http://brett-zamir.me)

Comments


Raphael (Ao) RUDLER
17 Jun '09

Works effectively better :)

Brett Zamir
8 Jun '09

OK, I came across a better trick than the already-pretty-safe negation I was using: just use [\s\S], which allows a single character to be either whitespace (including a newline) or non-whitespace, in other words anything...
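
A small sketch of the [\s\S] trick, assuming a standard JavaScript engine; the sample string and variable names are made up for illustration:

```javascript
// Sample document with newlines before the closing </head> tag.
var page = '<html>\n<head>\n<title>t</title>\n</head>\n<body></body>\n</html>';

// "." does not match line terminators, so this cannot span the
// lines between the start of the text and </head>.
var withDot = page.match(/^.*<\/head>/i);

// [\s\S] matches whitespace or non-whitespace, i.e. any character,
// so the match can run across lines up to </head>.
var withClass = page.match(/^[\s\S]*<\/head>/i);

console.log(withDot);        // null
console.log(withClass[0]);   // everything up to and including </head>
```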

Brett Zamir
8 Jun '09

Sorry, there had been several bugs in the latest versions of get_meta_tags() and of file_get_contents(), on which it depends. Those should all be fixed now, so please use the latest copies of these. However, your own suggested fix will not work because it is no longer a regular expression but a string (that text will no doubt never be found).

Explorer apparently has a problem if a negated character class in a regular expression is empty (e.g., [^]). We use a negated character class because 1) we want something equivalent to "." (any character) until we reach the text after it that we do want, but 2) we want to reach across multiple lines (and the 'm' flag does not, as is frequently supposed, do this). Although it doesn't look like any character is explicitly forbidden in HTML (only XHTML), since we have to add some character, I added the null control character \u0000. If someone knows a better unlikely character or approach, let us know, but I think that should be a safe bet for now.

Thanks for reporting the issue.
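
To illustrate the point about the 'm' flag and the negated character class, here is a minimal sketch, assuming a standard JavaScript engine; the sample text and variable names are invented for the example:

```javascript
var text = 'first line\nsecond line';

// The "m" flag only lets ^ and $ match at line boundaries.
var anchorWithM = text.match(/^second/m);     // matches at the start of line 2
var anchorWithoutM = text.match(/^second/);   // null: ^ only matches at position 0

// The "m" flag does not make "." match a newline.
var dotWithM = text.match(/first.*second/m);  // null

// A negated class excluding one unlikely character (here \u0000)
// matches everything else, newlines included.
var negated = text.match(/first[^\u0000]*second/);

console.log(anchorWithM !== null, anchorWithoutM, dotWithM, negated[0]);
```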

Raphael (Ao) RUDLER
7 Jun '09

Hi,

I noticed warnings in IE7 on Vista for this function.
I had to replace:

fulltxt = this.file_get_contents(file).match(/^[^]*<\/head>/i);

with

fulltxt = this.file_get_contents(file).match('/^[^]*<\/head>/i');

By the way, thanks for all the work you have done.
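
A minimal sketch of what the quoted version does, assuming a standard modern JavaScript engine: String.prototype.match compiles a string argument into a regular expression, so the slashes and the trailing "i" become literal characters to find rather than delimiters and a flag. The sample markup and variable names are made up for illustration.

```javascript
var markup = '<head><title>t</title></head><body></body>';

// A regular-expression literal: matches up to the closing </head> tag.
var asRegex = markup.match(/^[\s\S]*<\/head>/i);

// The same text in quotes is a plain string. match() turns it into a
// pattern that looks for a literal "/" before the start-of-input anchor,
// which can never succeed.
var asString = markup.match('/^[^]*<\/head>/i');

console.log(asRegex[0]);   // '<head><title>t</title></head>'
console.log(asString);     // null
```

So the warning disappears only because nothing is matched any more.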

