This is my first question on Stack Overflow, but I have been using SO for the last 2 years and it has provided a wealth of information.
I have recently picked up NodeJS and I have run into a dilemma. I am trying to work out the best way to load/insert about 2000 rows of data using NodeJS, ideally using an asynchronous approach. I have to pick up data from an API, then take the JSON data and load it into 3 tables so that I can use the data later. The file has 17 Country objects, then 77 State objects, and then about 2000 County objects.
The JSON file format that I am parsing is:
[
  {
    "Country": {
      "Name": "...",
      "CountryId": "...",
      "States": [
        {
          "Name": "...",
          "StateId": "...",
          "Counties": [
            {
              "Name": "...",
              "CountyId": "..."
            }
          ]
        }
      ]
    }
  },
  {
    "Country": {
      ...
    }
  }
]
So with my PHP background, I would instinctively create three functions in JavaScript, like so:
function Country(data){
    for(var z in data){
        var country = data[z];
        InsertInCountryDB(country.CountryId, country.Name);
        State(country.States);   // drill down into this country's states
    }
}

function State(data){
    for(var z in data){
        var state = data[z];
        InsertInStateDB(state.StateId, state.Name);
        County(state.Counties);  // drill down into this state's counties
    }
}

function County(data){
    for(var z in data){
        var county = data[z];
        InsertInCountyDB(county.CountyId, county.Name);
    }
}
My first attempt was to create an event for each object type and then use those events to drill down into the nested objects and insert the data. That worked well for inserting the Country and State details into the DB, but the County inserts would error out.
I am not looking for an answer to my specific issue, but rather for coding tips to help me move away from my three synchronous functions and towards asynchronous functionality.
Thanks
2000 rows shouldn't be too bad; even with the country and state rows you are only looking at a couple of thousand inserts in total. The big rule is to run them in a single transaction (if you can use prepared statements, even better) and commit once at the end.
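To sketch what that looks like on the Node side — this assumes the node-postgres ("pg") driver and hypothetical country/state/county tables, so adjust the names to your actual schema and driver:

const { Client } = require('pg');

async function loadAll(data) {               // data = the parsed array from the API
    const client = new Client();             // connection settings come from the environment
    await client.connect();
    try {
        await client.query('BEGIN');         // one transaction for the whole load
        for (const item of data) {
            const c = item.Country;          // each element wraps one Country document
            await client.query('INSERT INTO country (country_id, name) VALUES ($1, $2)',
                [c.CountryId, c.Name]);
            for (const s of c.States) {
                await client.query('INSERT INTO state (state_id, name) VALUES ($1, $2)',
                    [s.StateId, s.Name]);
                for (const cty of s.Counties) {
                    await client.query('INSERT INTO county (county_id, name) VALUES ($1, $2)',
                        [cty.CountyId, cty.Name]);
                }
            }
        }
        await client.query('COMMIT');        // commit once at the end
    } catch (err) {
        await client.query('ROLLBACK');      // undo everything if any insert fails
        throw err;
    } finally {
        await client.end();
    }
}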
Now, since your question is more on the architecture side, and assuming you are using PostgreSQL 9.1 or later (and can install an additional compiled extension from source), my recommendation would be quite different.
I would create a temporary table to hold the raw documents (on 9.1 the json type needs an extension built from source):
CREATE TEMPORARY TABLE upload_holding (payload json);
Then I would just insert the "rows" as JSON documents in there. I would then write a stored procedure in plv8js (which you must install separately, possibly from source; you could use pl/perl instead if you don't want to work in Javascript) which would process the JSON and extract each of the pieces. You can then chain the functions together to process the rows as sets and do your inserts together.
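The staging step from Node could look roughly like this (again assuming the "pg" driver; upload_holding is the table created above, and the table/column names on your side are up to you):

const { Client } = require('pg');

async function stageDocuments(data) {        // data = the parsed array from the API
    const client = new Client();
    await client.connect();
    try {
        await client.query('BEGIN');
        for (const item of data) {
            // one row per Country document; the json column holds the whole nested structure
            await client.query('INSERT INTO upload_holding (payload) VALUES ($1)',
                [JSON.stringify(item.Country)]);
        }
        await client.query('COMMIT');
    } finally {
        await client.end();                  // closing without COMMIT aborts the transaction
    }
}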
A full working example isn't really possible here, but consider that with Perl or Javascript you could take each entity in JSON form, extract it into a tuple, and then process that tuple further; you could fairly easily chain your inserts onto the JSON processing inside the database.
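As a rough sketch of that idea only (it assumes plv8 is installed and that country, state and county tables with these columns exist — none of this is tested against your schema):

CREATE OR REPLACE FUNCTION process_upload_holding() RETURNS void AS $$
    var rows = plv8.execute('SELECT payload FROM upload_holding');
    rows.forEach(function (row) {
        // depending on the plv8 version the json column may come back as a string or an object
        var c = (typeof row.payload === 'string') ? JSON.parse(row.payload) : row.payload;
        plv8.execute('INSERT INTO country (country_id, name) VALUES ($1, $2)',
            [c.CountryId, c.Name]);
        (c.States || []).forEach(function (s) {
            plv8.execute('INSERT INTO state (state_id, name) VALUES ($1, $2)',
                [s.StateId, s.Name]);
            (s.Counties || []).forEach(function (cty) {
                plv8.execute('INSERT INTO county (county_id, name) VALUES ($1, $2)',
                    [cty.CountyId, cty.Name]);
            });
        });
    });
$$ LANGUAGE plv8;

After staging the documents you would call it once, e.g. SELECT process_upload_holding();, so the whole load becomes one insert per document plus a single function call.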
This would be similar except that most of your functionality would be in the database and therefore would require much less planning overhead to make use of effectively.