Use Buffers when decoding

Decoding a string is probably the most common mistake when working with legacy-encoded resources. Why? Let's see.


This is wrong:

var http = require('http'),
    iconv = require('iconv-lite');

http.get("", function(res) {
  var body = '';
  res.on('data', function(chunk) {
    body += chunk; // implicit chunk.toString(): the bytes are decoded as UTF-8 here
  });
  res.on('end', function() {
    var decodedBody = iconv.decode(body, 'win1252'); // body is already a string
  });
});

Before being decoded with the iconv.decode function, the original resource was (unintentionally) already decoded in body += chunk via JavaScript type conversion. What really happens here is:

  res.on('data', function(chunkBuffer) {
    body += chunkBuffer.toString('utf8');
  });

The same conversion is done behind the scenes if you call res.setEncoding('utf8');.

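Not only does double decoding lead to wrong results, it is also nearly impossible to restore the original bytes, because the UTF-8 conversion is lossy: every byte sequence that is not valid UTF-8 is replaced with the same replacement character. So even iconv.decode(new Buffer(body, 'utf8'), 'win1252') will not help.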
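The loss can be seen with Node's built-in Buffer alone. The following is a minimal sketch, not taken from the library's documentation; the sample bytes are an assumption chosen for illustration, and it uses Buffer.from, the current replacement for new Buffer.

```javascript
// "café" encoded as win1252. The byte 0xE9 ('é') is not valid UTF-8 on its own.
var original = Buffer.from([0x63, 0x61, 0x66, 0xE9]);

// This is the conversion that `body += chunk` performs implicitly.
var asString = original.toString('utf8');

// Re-encoding the string does not bring the original byte back:
// 0xE9 was replaced by U+FFFD, which encodes as ef bf bd.
var restored = Buffer.from(asString, 'utf8');

console.log(original); // <Buffer 63 61 66 e9>
console.log(restored); // <Buffer 63 61 66 ef bf bd>

// Two different source bytes collapse into the same string, so the
// original byte cannot be recovered afterwards.
var e8 = Buffer.from([0xE8]).toString('utf8');
var e9 = Buffer.from([0xE9]).toString('utf8');
console.log(e8 === e9); // true
```

Once two distinct bytes map to the same character, no later call can tell them apart, which is why the fix has to happen before the first string conversion.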

Note: theoretically, if you first decode to strings with the 'binary' encoding and then feed those strings to iconv.decode, you get correct results. This is bad practice because it is slower, it mixes concepts, and the 'binary' encoding is deprecated.
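For illustration only, here is a small sketch of why the 'binary' route is lossless while the UTF-8 route is not. It relies only on Node's Buffer; the sample bytes are an assumption, and this is not a recommendation to use the approach.

```javascript
// "café" encoded as win1252.
var original = Buffer.from([0x63, 0x61, 0x66, 0xE9]);

// 'binary' maps every byte to the character with the same code (0-255),
// so no byte is ever replaced or merged with another.
var asBinaryString = original.toString('binary');

// Converting back yields exactly the same bytes.
var roundTripped = Buffer.from(asBinaryString, 'binary');

console.log(roundTripped.equals(original)); // true
```

The round trip works, but the intermediate string is just bytes dressed up as characters, which is the mixing of concepts the note warns about.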


Keep the original Buffers and pass them to iconv.decode. Use Buffer.concat() to join the chunks if needed.

In general, keep in mind that all JavaScript strings are already decoded and should not be decoded again.

http.get("", function(res) {
  var chunks = [];
  res.on('data', function(chunk) {
    chunks.push(chunk); // keep the raw Buffer, no string conversion
  });
  res.on('end', function() {
    var decodedBody = iconv.decode(Buffer.concat(chunks), 'win1252');
  });
});

// Or, with a version of iconv-lite that has streaming support and Node v0.10+, you can use the `collect` helper
http.get("", function(res) {
  res.pipe(iconv.decodeStream('win1252')).collect(function(err, decodedBody) {
    // decodedBody is the fully decoded string
  });
});
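To see why the chunks are joined with Buffer.concat() before decoding instead of being decoded one at a time, consider a multibyte character split across two chunks. The sketch below is an assumption-laden illustration: it uses Node's built-in UTF-8 decoding so that it runs without any dependency, but the same reasoning applies to multibyte legacy encodings handled by iconv.decode.

```javascript
// The euro sign is three bytes in UTF-8: e2 82 ac.
// Simulate a stream that happens to split it across two chunks.
var chunk1 = Buffer.from([0xE2, 0x82]);
var chunk2 = Buffer.from([0xAC]);

// Decoding each chunk separately corrupts the split character.
var piecewise = chunk1.toString('utf8') + chunk2.toString('utf8');

// Joining the raw bytes first keeps the character intact.
var joined = Buffer.concat([chunk1, chunk2]).toString('utf8');

console.log(joined === '\u20AC');    // true
console.log(piecewise === '\u20AC'); // false
```

A decoding stream such as the one returned by iconv.decodeStream is the other way to get this right, since it is designed to be fed chunk by chunk.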

What if you know what you're doing and just want to mute the warning that iconv-lite prints when decode is given a string?

iconv.skipDecodeWarning = true;