I have an app that stores user-uploaded spreadsheets as tables in PostgreSQL. Every time a user uploads a spreadsheet, I create a record in a Dataset table containing the physical table name, its alias and the owner. I can retrieve a given Dataset's information with
GET domain.com/v1/Datasets/{id}
AFAIK, the relation between rows in Dataset and the physical tables can't be enforced by a FK, or at least I haven't seen anyone creating FKs on PostgreSQL's information_schema, and FKs can't drop tables, or can they? So it's common to end up with orphan tables, or with records in Dataset that point to tables that no longer exist. I have managed this with business logic and cleaning tasks.
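For the cleaning tasks, a raw query that compares Dataset against information_schema is enough. A rough sketch of what such a task could look like (the "db" data source name, the dataset table and its table_name column, and the connector's raw execute call are assumptions; adjust them to your schema):
// Deletes Dataset records whose physical table no longer exists.
// The data source name ("db") and the table/column names are assumptions.
function pruneOrphanDatasets(app, cb) {
  var ds = app.dataSources.db;
  var sql = [
    'DELETE FROM dataset d',
    'WHERE NOT EXISTS (',
    '  SELECT 1 FROM information_schema.tables t',
    "  WHERE t.table_schema = 'public'",
    '    AND t.table_name = d.table_name',
    ')'
  ].join('\n');
  ds.connector.execute(sql, [], cb); // raw SQL through the connector
}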
Now, to access one of those physical tables, for example one called nba_teams, I would need to declare an NbaTeams model in LoopBack and restart the app, then query its records with
GET domain.com/v1/NbaTeams/{id}
But that can't scale, especially when I'm already getting around 100 uploads a day. So from where I'm standing, there are two ways to go:
1.- Create one model, then add four custom methods that accept a table name as a string and perform the corresponding CRUD operation on that table via raw queries. For example, to list the records:
GET domain.com/v1/Datasets/getTable/NbaTeams
or, to update one team
PUT domain.com/v1/Datasets/getTable/NbaTeams/{teamId}
This sounds inelegant, but it should work.
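A rough sketch of how that could look as a single remote method on Dataset (the table_name column on Dataset and the connector's execute call are assumptions; looking the name up in Dataset first keeps arbitrary strings out of the SQL):
module.exports = function (Dataset) {
  // GET /v1/Datasets/getTable/:tableName -> list the rows of that physical table
  Dataset.getTable = function (tableName, cb) {
    // only accept tables Dataset actually knows about, so the name is safe to interpolate
    Dataset.findOne({where: {table_name: tableName}}, function (err, ds) {
      if (err) return cb(err);
      if (!ds) return cb(new Error('Unknown table: ' + tableName));
      var sql = 'SELECT * FROM "' + ds.table_name + '"';
      Dataset.dataSource.connector.execute(sql, [], cb);
    });
  };

  Dataset.remoteMethod('getTable', {
    accepts: {arg: 'tableName', type: 'string', required: true, http: {source: 'path'}},
    returns: {arg: 'rows', type: 'array', root: true},
    http: {path: '/getTable/:tableName', verb: 'get'}
  });
};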
2.- Create a custom method that accepts a table name as a string, which in turn creates an ephemeral model and forwards the HTTP verb and the rest of the arguments to it:
// dataSource is the app's PostgreSQL data source, e.g. app.dataSources.db
dataSource.discoverAndBuildModels('nba_teams', {
  owner: 'uploader'
}, function (err, models) {
  console.log(models);
  models.NbaTeams.find(function (err, act) {
    if (err) {
      console.error(err);
    } else {
      console.log(act);
    }
    dataSource.disconnect();
  });
});
This second one I haven't gotten to work yet, and I don't know how much overhead it might add, but I'm sure it's doable.
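If I go that way, I suppose the discovery overhead can be kept down by caching whatever discoverAndBuildModels returns, so each table only gets discovered once. Something like this (the app.dataSources.db name is an assumption):
var discovered = {}; // tableName -> model built by discoverAndBuildModels

function getModelForTable(app, tableName, cb) {
  if (discovered[tableName]) return cb(null, discovered[tableName]);
  app.dataSources.db.discoverAndBuildModels(tableName, {
    owner: 'uploader'
  }, function (err, models) {
    if (err) return cb(err);
    // models is keyed by the generated model name, e.g. NbaTeams for nba_teams
    var model = models[Object.keys(models)[0]];
    discovered[tableName] = model;
    cb(null, model);
  });
}

// usage: forward the request to the ephemeral model, e.g. a list operation
getModelForTable(app, 'nba_teams', function (err, NbaTeams) {
  if (err) return console.error(err);
  NbaTeams.find(function (err, rows) {
    console.log(err || rows);
  });
});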
So before I dig in deeper, I came to ask: has anybody dealt with this row-to-table relation? What are the good practices here?
In the end, I did my own hacky workaround, and I thought it might help someone some day.
What I did was add a middleware (with regular Express syntax) that listens for /v1/dataset{id_dataset}, creates the model on the fly, and passes execution on to the next middleware:
app.use('/v1/dataset:id_dataset', function (req, res, next) {
  var idDataset = req.params.id_dataset;
  app.getTheTable(idDataset, function (err, result) {
    if (err) {
      console.error(err);
      res.json({"error": "couldn't retrieve related table"});
    } else {
      next();
    }
  });
});
Inside the app.getTheTable function, I'm creating a model dynamically and setting it up before invoking the callback:
app.getTheTable = function (idDataset, callback) {
  var Table = app.models.Dataset,
      modelName = 'dataset' + idDataset,
      dataSource = Table.dataSource; // reuse the data source Dataset is attached to
  Table.findById(idDataset, function (err, resultados) {
    if (err || !resultados) {
      callback(new Error('Unauthorized'));
    } else {
      if (app.models[modelName]) {
        callback(null, modelName); // model already exists
      } else {
        // column definitions and table mapping for the uploaded table
        // (resultados holds the physical table name; adjust the property name to yours)
        var properties = {}, // e.g. column definitions stored with the Dataset or discovered
            options = {postgresql: {table: resultados.tableName}};
        var theDataset = dataSource.createModel(modelName, properties, options);
        theDataset.settings.plural = modelName;
        theDataset.setup();
        app.model(theDataset);
        // mount the REST middleware again so the new model gets its routes
        var restApiRoot = app.get('restApiRoot');
        app.use(restApiRoot, app.loopback.rest());
        callback(null, modelName);
      }
    }
  });
};
It's hacky, I know, and I believe there must be some performance penalty for mounting the restApiRoot middleware again every time a new model shows up, but it's still better than creating 500 models on startup to cover all possible dataset requests.