I'm using Node.js with cheerio to scrape data from a website and build an object from it. That object then needs to be passed to another function.
The issue is that the object is being created, but the next function runs before cheerio has parsed the data and populated the object. Here's my code:
    function getInfo(obj, link){
        request(link, function(err, resp, body) {
            if (err) {
                console.log("Uh-oh: " + err);
                throw err;
            }
            $ = cheerio.load(body);
            function createProduct(obj, callback){
                var product = {
                    name : $('#name').text(),
                    gender : obj.gender,
                    infoLink : link,
                    designer : $('.label').first().text(),
                    price : $('#price').first().text(),
                    description : $('.description').text(),
                    date : new Date()
                }
                product.systemName = (function(){
                    return product.name.replace(/\s+/g, ' ');
                }());
                callback(product);
            }
            createProduct(obj, function(product){
                lookUp(product);
            });
        });
    }
I'm getting mixed results here. Some product objects are sent to the function just fine, with all the details filled in. Some are missing descriptions, and others are missing all of the cheerio-populated content. Others have some of the scraped content but are missing certain bits. The gender and date attributes are always there. The other properties exist, but they're blank (e.g. product.name returns "" rather than undefined).
I've checked each offending link, and all of the pages contain the selectors being scraped.
How can I set up the callback so that it ONLY runs once the product object has been populated?
There are two possible asynchronous steps that could give you these results:
1. cheerio.load has not finished before createProduct is called.
2. createProduct has not populated the product, or has only partially populated it (the description, for example), before the callback is called (not sure).
You can use the async library to make the functions run one after another (with async.series). If createProduct is asynchronous as well, you will have to sequence it in a similar way.
    async.series([
        function(callback){
            $ = cheerio.load(body);
            callback();
        },
        function(callback){
            createProduct(obj, function(product){
                lookUp(product);
            });
            callback();
        }
    ]);
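As far as I know, cheerio.load returns synchronously, so it may also be worth ruling out that the body you received was incomplete or different from what you see in the browser. Here is a minimal sketch of that check, under a few assumptions: buildProduct and missingFields are hypothetical helper names (they are not part of cheerio, request or async), the field list is taken from your question, and $ is passed in explicitly instead of being shared as a global.

```javascript
// Hypothetical helper: builds the product from an already-loaded cheerio
// instance passed in as `$`, so nothing depends on a shared global.
function buildProduct($, obj, link) {
  var name = $('#name').text();
  return {
    name: name,
    gender: obj.gender,
    infoLink: link,
    designer: $('.label').first().text(),
    price: $('#price').first().text(),
    description: $('.description').text(),
    date: new Date(),
    // Same whitespace collapsing as in the question's code.
    systemName: name.replace(/\s+/g, ' ')
  };
}

// Hypothetical check: returns the names of the scraped fields that came
// back empty or whitespace-only.
function missingFields(product) {
  return ['name', 'designer', 'price', 'description'].filter(function (key) {
    return !product[key] || product[key].trim() === '';
  });
}
```

Inside the request callback this could be used as var $ = cheerio.load(body); var product = buildProduct($, obj, link); and then lookUp(product) would only be called when missingFields(product) is empty. When it is not empty, logging the link together with resp.statusCode and body.length should show whether the server actually returned the full page for that request.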