How to speed up serving static content with node.js?

I know it is better to serve static content with nginx in front of a node.js web application, but this question has a more general context: the same techniques would apply if, for example, we wanted to use a node.js application as a smart proxy server.

I have a simple node.js web application that serves static files with in-memory caching:

var express = require('express');
var path = require('path');
var fs = require('graceful-fs');
var http = require('http');

var app = express();

var cache = Object.create(null);

app.get('/', function(req, res){
  res.send('It works!');
});

app.get('/images/:file', function(req, res) {
    var file = req.params.file;
    var filename = path.join(__dirname, 'public', 'images', file);

    var ext = path.extname(filename).slice(1); // 'gif', 'jpg', ...

    if (ext === 'gif' || ext === 'jpg' || ext === 'jpeg')
        res.set('Content-Type', ext === 'gif' ? 'image/gif' : 'image/jpeg');

    if (cache[filename]) {
        res.send(cache[filename]);
    } else {
        fs.readFile(filename, function(error, data) {
            if (error) return res.status(404).end();
            cache[filename] = data; // readFile already yields a Buffer
            res.send(data);
        });
    }
});


var server = http.createServer(app).listen(82, function () {
    console.log('Express server listening on port ' + server.address().port);
});

Now let's test this application under a moderate load:

 ab -n 10000 -c 1 -r http://127.0.0.1:82/images/test.gif

I see the following result:

Document Path:          /images/test.gif
Document Length:        264138 bytes

Concurrency Level:      1
Time taken for tests:   15.267 seconds
Complete requests:      10000
Total transferred:      2643210000 bytes
HTML transferred:       2641380000 bytes
Requests per second:    655.00 [#/sec] (mean)
Time per request:       1.527 [ms] (mean)
Time per request:       1.527 [ms] (mean, across all concurrent requests)
Transfer rate:          169071.47 [Kbytes/sec] received

Percentage of the requests served within a certain time (ms)
  50%      1
  66%      1
  75%      1
  80%      1
  90%      2
  95%      2
  98%      3
  99%      6
 100%     29 (longest request)

If we allow 1000 concurrent connections:

  ab -n 10000 -c 1000 -r http://127.0.0.1:82/images/test.gif

Results:

Concurrency Level:      1000
Time taken for tests:   15.710 seconds
Complete requests:      10000
Total transferred:      2643210000 bytes
HTML transferred:       2641380000 bytes
Requests per second:    636.53 [#/sec] (mean)
Time per request:       1571.024 [ms] (mean)
Time per request:       1.571 [ms] (mean, across all concurrent requests)
Transfer rate:          164304.33 [Kbytes/sec] received

Percentage of the requests served within a certain time (ms)
  50%    455
  66%    547
  75%   1405
  80%   1430
  90%   1647
  95%   3453
  98%   7468
  99%   7603
 100%   8238 (longest request)

And now I am comparing the results with the same file served by Apache httpd on the same single-core server:

 ab -n 10000 -c 1 -r http://127.0.0.1:80/images/test.gif

Results:

Concurrency Level:      1
Time taken for tests:   6.938 seconds
Complete requests:      10000
Total transferred:      2644210000 bytes
HTML transferred:       2641380000 bytes
Requests per second:    1441.40 [#/sec] (mean)
Time per request:       0.694 [ms] (mean)
Time per request:       0.694 [ms] (mean, across all concurrent requests)
Transfer rate:          372202.50 [Kbytes/sec] received

Percentage of the requests served within a certain time (ms)
  50%      0
  66%      0
  75%      0
  80%      0
  90%      0
  95%      1
  98%      1
  99%      1
 100%    642 (longest request)

And with 1000 concurrent requests:

 ab -n 10000 -c 1000 -r http://127.0.0.1:80/images/test.gif

Result:

Concurrency Level:      1000
Time taken for tests:   24.067 seconds
Complete requests:      10000
Failed requests:        42
Total transferred:      2639186001 bytes
HTML transferred:       2636361378 bytes
Requests per second:    415.51 [#/sec] (mean)
Time per request:       2406.682 [ms] (mean)
Time per request:       2.407 [ms] (mean, across all concurrent requests)
Transfer rate:          107090.60 [Kbytes/sec] received

Percentage of the requests served within a certain time (ms)
  50%     54
  66%     61
  75%     78
  80%    148
  90%   1059
  95%   3057
  98%  11426
  99%  13288
 100%  21031 (longest request)

We see that Apache is twice as fast for sequential requests and much faster under concurrency (80% of concurrent requests to Apache were served within 148 ms, while only 66% of requests to the node.js application were served within 547 ms).

Are there ways to speed up this node.js application? I am on a single-core VPS, so I don't think using the cluster module makes sense.