computer science, math, programming and other stuff
a blog by Christopher Swenson

Please, please use UTF-8

I am currently migrating a site from an older WordPress setup to a fresh new server. I obtained MySQL backups and all of the files from the site.

My problem is that posts were usually written in Microsoft Word, then copy-pasted through Firefox or IE into a version of WordPress set up for UTF-8, which ran on a PHP installation using ISO-8859-1 and stored everything in a MySQL database set up as Latin1. All of that was then exported into the dump, and now I have to figure out how to clean up the dump or the MySQL instance so that I get "don't" and not "don’t".
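To make the failure concrete, here is a minimal Python sketch of how one round of this damage can be undone. It assumes the text was UTF-8 whose bytes were misread exactly once as Windows-1252 (which is what MySQL's latin1 roughly corresponds to). The function name `fix_mojibake` is my own invention for illustration, not something from WordPress or MySQL, and a real dump may be mangled more than once or in other ways.

```python
def fix_mojibake(text: str) -> str:
    """Undo one round of UTF-8 bytes that were misread as Windows-1252.

    Assumption: the damage happened exactly once. Text that does not
    fit that pattern is returned unchanged.
    """
    try:
        # Turn the wrongly decoded characters back into the original
        # bytes, then decode those bytes as the UTF-8 they really were.
        return text.encode("cp1252").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        # Either a character has no cp1252 byte, or the bytes are not
        # valid UTF-8, so this is not the kind of damage we can undo.
        return text


# U+00E2, U+20AC, U+2122 are the three characters you see when the
# UTF-8 bytes E2 80 99 (a right single quote) are read as cp1252.
garbled = "don\u00e2\u20ac\u2122t"
repaired = fix_mojibake(garbled)
print(repaired)
```

Note that this gives back the curly apostrophe that Word originally produced, not a plain ASCII one. Some characters need extra care: a right double quote ends in byte 0x9D, which Python's `cp1252` codec leaves undefined, so this sketch would return such text untouched.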

So please, I beg of you, read this article and always use UTF-8 in everything you do.

Edit: Also, if you run into such problems, the two most helpful resources I found were here (though you may want to modify it a bit, since it changes smart quotes to dumb quotes) and here.