Is it possible to use jQuery selectors/DOM manipulation on the server-side using Node.js?
Update: (late 2013) The official jQuery team finally took over the management of the jquery package on npm, but omitted any node usage instructions in the README.
An example of usage can be found at http://github.com/coolaj86/node-jquery.
Since November 4th, 2010, you can simply do this:
npm install jquery
This package internally uses the modules jsdom and xmlhttprequest. The package manager will handle the installing of dependencies. Read the thread here.
Update: (March 1, 2015) Note that the most recent version of jsdom (>= 4.0.0) does not work with with node.js (it only works with io.js). The example at http://github.com/coolaj86/node-jquery does work in node.js, however, if you install jsdom 3.1.2 instead of the most recent version:
npm install jquery
npm install jsdom@3.1.2
Yes you can, using a library I created called nodeQuery https://github.com/tblobaum/nodeQuery
var Express = require('express')
, dnode = require('dnode')
, nQuery = require('nodeQuery')
, express = Express.createServer();
var app = function ($) {
$.on('ready', function () {
// do some stuff to the dom in real-time
$('body').append('Hello World');
$('body').append('<input type="text" />');
$('input').live('click', function () {
console.log('input clicked');
// ...
});
});
};
nQuery
.use(app);
express
.use(nQuery.middleware)
.use(Express.static(__dirname + '/public'))
.listen(3000);
dnode(nQuery.middleware).listen(express);
Using jsdom you now can. Just look at their jquery example in the examples directory.
At the time of writing there also is the maintained Cheerio.
Fast, flexible, and lean implementation of core jQuery designed specifically for the server.
I believe the answer to this is now yes.
http://github.com/tmpvar/jsdom/blob/master/example/jquery/run.js
var navigator = { userAgent: "node-js" };
var jQuery = require("./node-jquery").jQueryInit(window, navigator);
This is my formula to make a simple crawler in Node.js. It is the main reason for wanting to do DOM manipulation on the server side and probably it's the reason why you got here.
First, use request to download the page to be parsed. When the download is complete, handle it to cheerio and begin DOM manipulation just like using jQuery.
Working example:
var
request = require('request'),
cheerio = require('cheerio');
function parse(url) {
request(url, function (error, response, body) {
var
$ = cheerio.load(body);
$('.question-summary .question-hyperlink').each(function () {
console.info($(this).text());
});
})
}
parse('http://stackoverflow.com/');
This example will print to the console all top questions showing on SO home page. This is why I love Node.js and its community. It couldn't get easier than that :-)
Install dependencies:
npm install request cheerio
And run (assuming the script above is in file crawler.js):
node crawler.js
Some pages will have non-english content in a certain encoding and you will need to decode it to UTF-8. For instance, a page in brazilian portuguese (or any other language of latin origin) will likely be encoded in ISO-8859-1 (a.k.a. "latin1"). When decoding is needed, I tell request not to interpret the content in any way and instead use iconv-lite to do the job.
Working example:
var
request = require('request'),
iconv = require('iconv-lite'),
cheerio = require('cheerio');
var
PAGE_ENCODING = 'utf-8'; // change to match page encoding
function parse(url) {
request({
url: url,
encoding: null // do not interpret content yet
}, function (error, response, body) {
var
$ = cheerio.load(iconv.decode(body, PAGE_ENCODING));
$('.question-summary .question-hyperlink').each(function () {
console.info($(this).text());
});
})
}
parse('http://stackoverflow.com/');
Before running, install dependencies:
npm install request iconv-lite cheerio
And then finally:
node crawler.js
The next step would be to follow links. Say you want to list all posters from each top question on SO. You have to first list all top questions (example above) and then enter each link, parsing each question's page to get the list of involved users.
When you start following links, a callback hell can begin. To avoid that, you should use some kind of promises, futures or whatever. I always keep async in my toolbelt. So, here is a full example of a crawler using async:
var
url = require('url'),
request = require('request'),
async = require('async'),
cheerio = require('cheerio');
var
baseUrl = 'http://stackoverflow.com/';
// Gets a page and returns a callback with a $ object
function getPage(url, parseFn) {
request({
url: url
}, function (error, response, body) {
parseFn(cheerio.load(body))
});
}
getPage(baseUrl, function ($) {
var
questions;
// Get list of questions
questions = $('.question-summary .question-hyperlink').map(function () {
return {
title: $(this).text(),
url: url.resolve(baseUrl, $(this).attr('href'))
};
}).get().slice(0, 5); // limit to the top 5 questions
// For each question
async.map(questions, function (question, questionDone) {
getPage(question.url, function ($$) {
// Get list of users
question.users = $$('.post-signature .user-details a').map(function () {
return $$(this).text();
}).get();
questionDone(null, question);
});
}, function (err, questionsWithPosters) {
// This function is called by async when all questions have been parsed
questionsWithPosters.forEach(function (question) {
// Prints each question along with its user list
console.info(question.title);
question.users.forEach(function (user) {
console.info('\t%s', user);
});
});
});
});
Before running:
npm install request async cheerio
Run a test:
node crawler.js
Sample output:
Is it possible to pause a Docker image build?
conradk
Thomasleveil
PHP Image Crop Issue
Elyor
Houston Molinar
Add two object in rails
user1670773
Makoto
max
Asymmetric encryption discrepancy - Android vs Java
Cookie Monster
Wand Maker
Objective-C: Adding 10 seconds to timer in SpriteKit
Christian K Rider
And that's the basic you should know to start making your own crawlers :-)
The module jsdom is a great tool. But if you want to evaluate entire pages and do some funky stuff on them server side I suggest running them in their own context:
vm.runInContext
So things like require / CommonJS on site will not blow your Node process itself.
You can find documentation here. Cheers!
No. It's going to be quite a big effort to port a browser environment to node.
Another approach, that I'm currently investigating for unit testing, is to create "Mock" version of jQuery that provides callbacks whenever a selector is called.
This way you could unit test your jQuery plugins without actually having a DOM. You'll still have to test in real browsers to see if your code works in the wild, but if you discover browser specific issues, you can easily "mock" those in your unit tests as well.
I'll push something to github.com/felixge once it's ready to show.
WARNING
This solution, as mentioned by Golo Roden is not correct. It is just a quick fix to help people to have their actual jQuery code running using a Node app structure, but it's not Node philosophy because the jQuery is still running on the client side instead of on the server side. I'm sorry for giving a wrong answer.
You can also render Jade with node and put your jQuery code inside. Here is the code of the jade file:
!!! 5
html(lang="en")
head
title Holamundo!
script(type='text/javascript', src='http://code.jquery.com/jquery-1.9.1.js')
body
h1#headTitle Hello, World
p#content This is an example of Jade.
script
$('#headTitle').click(function() {
$(this).hide();
});
$('#content').click(function() {
$(this).hide();
});
Not that I know of. The DOM is a client side thing (jQuery doesn't parse the HTML, but the DOM).
Here are some current Node.js projects:
http://wiki.github.com/ry/node
And SimonW's djangode is pretty damn cool...
An alternative is to use Underscore.js. It should provide what you might have wanted server-side from JQuery.