Thursday, September 17, 2015

double encoding of UTF8 strings

Turns out I was focusing on the wrong leads. These strings became values in a hash which was run through JSON->encode before saving to db. What prompted the re-encoding was not the values of the hash, but the keys. These were string literals in my program which were not marked as UTF8. Perl looked at my keys, looked at my values, saw a mixed bag, and decided to run the whole thing through the shredder again, just to be on the safe side.

The solution was simple:
use utf8

Because my keys were source-code string literals, I want my source code to be UTF8. use utf8 assures that.

thanks to: http://perlgerl.com/2011/04/08/utf8-follies/