Fixing Wordpress and MySQL Charset Problem (Wordpress Posts Using Several Languages)

Note: I think this post needs some modifications, as not everything works as expected. Any tips and comments are very welcome. (I think I have resolved the issue; check the UPDATE section at the bottom.)

I am in the process of moving my blogs from wordpress.com to my own wordpress.org hosting, and while doing so I got stuck on a problem. My blog posts use several languages besides English, mainly Russian and Hindi. After I exported the posts from wordpress.com using the built-in feature and imported them into the newly created blog, the non-English posts were not shown correctly: I was getting ????? instead of letters. So, how do you fix this?

There are two ways. Method I converts the database itself to utf8. It left me with a small problem at the end (see the remark after I), and I am not sure it gives any extra benefit for future imports/exports, but it does sound like the cleaner way, especially if you are starting a new blog. Method II only edits the wp-config.php file.

Notice: I installed Wordpress using the Fantastico script provided by my hosting company, and my hosting provider has phpMyAdmin 2.11.4, running MySQL 4.1.22.

(Choose only one of the following methods; don't do both.)

I. Check the main Wordpress configuration in wp-config.php. You should have these lines:

define('DB_CHARSET', 'utf8');
define('DB_COLLATE', '');

Open phpMyAdmin and check the settings for the default charset and collation.

Set the collation to utf8_general_ci:

  • MySQL charset: UTF-8 Unicode (utf8)
  • MySQL connection collation: utf8_general_ci

Now check the settings for the tables. You may see that the tables do not use the utf8 charset but latin1 (or some other charset), even though the default collation for the database is already utf8.
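Why a latin1 table ends up showing question marks can be illustrated outside MySQL. The sketch below is only a Python analogy, not MySQL itself: it assumes the database behaves like a lossy conversion into a single-byte charset, where every character latin1 cannot represent is replaced with ?. The function name to_latin1_lossy is made up for this illustration.

```python
def to_latin1_lossy(text):
    """Mimic storing text in a latin1 column: unsupported characters become '?'."""
    # errors="replace" substitutes '?' for each character latin1 cannot encode.
    return text.encode("latin-1", errors="replace").decode("latin-1")


if __name__ == "__main__":
    print(to_latin1_lossy("Hello"))    # plain ASCII survives unchanged
    print(to_latin1_lossy("Привет"))   # Cyrillic has no latin1 equivalent
```

Once a character has been replaced this way, the original letter is gone, which is why changing the charset afterwards cannot bring old posts back on its own.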

Next step: export the Wordpress database and fix the collation and charset throughout the exported file.

Press the Export button, tick the “Save as file” checkbox, and press “Go”.

Once the file is downloaded, open it in any text editor, such as kwrite or Notepad, and use find and replace to change every latin1 (or whatever your current charset is) to utf8. Save the file.

Upload the edited file with the Import button.
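For a large dump, the same replacement can be scripted. This is a minimal sketch, assuming the dump declares charsets in the usual `CHARSET=latin1`, `CHARACTER SET latin1` and `COLLATE latin1_...` forms; the function name convert_dump_charset and its parameters are hypothetical. Unlike a blind find and replace, it leaves the word latin1 alone when it appears inside post content.

```python
import re


def convert_dump_charset(sql_text, old="latin1", new="utf8",
                         new_collation="utf8_general_ci"):
    """Rewrite charset and collation declarations in an SQL dump."""
    old_re = re.escape(old)
    # CHARSET=latin1 and CHARACTER SET latin1 -> utf8
    sql_text = re.sub(rf"(CHARSET\s*=\s*|CHARACTER SET\s+){old_re}\b",
                      rf"\g<1>{new}", sql_text)
    # COLLATE latin1_swedish_ci (or COLLATE=...) -> utf8_general_ci
    sql_text = re.sub(rf"(COLLATE\s*=?\s*){old_re}_\w+",
                      rf"\g<1>{new_collation}", sql_text)
    return sql_text


if __name__ == "__main__":
    line = "CREATE TABLE `wp_posts` (`ID` int) DEFAULT CHARSET=latin1;"
    print(convert_dump_charset(line))
```

Note that this only changes what the tables are declared as. It does not re-encode the rows already in the dump, so text that was damaged before the export stays damaged.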

Remark: This allowed me to write new posts in other languages, but the older imported content still showed ???? instead of letters (!)

II. Editing the wp-config.php file:

Get a copy of your wp-config.php file and change the following:

define('DB_CHARSET', 'utf8');
define('DB_COLLATE', '');

to:

//define('DB_CHARSET', 'utf8');
//define('DB_COLLATE', '');

With only this hack, you should be able to write posts in your native language and also see all the content imported from your older blog correctly.
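One possible explanation for why this hack works, offered as an assumption rather than something verified on this setup: with DB_CHARSET commented out, Wordpress may not tell MySQL that the connection is utf8, so the UTF-8 bytes of a post travel into and out of a latin1 table untouched. The Python sketch below only imitates that round trip; the function name through_latin1_pipe is hypothetical.

```python
def through_latin1_pipe(text):
    """Send UTF-8 bytes through a latin1 'pipe' and read them back."""
    raw = text.encode("utf-8")             # what the blog sends
    stored = raw.decode("latin-1")         # how a latin1 table would label those bytes
    read_back = stored.encode("latin-1")   # the same bytes come back unchanged
    return stored, read_back.decode("utf-8")


if __name__ == "__main__":
    stored, shown = through_latin1_pipe("Привет")
    print(repr(stored))  # looks like garbage when viewed as latin1
    print(shown)         # the blog still displays the original text
```

If this is what happens, the posts look fine in the blog but appear garbled in phpMyAdmin and in SQL dumps, which may matter for later exports.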

Troubleshooting: Import error saying that the tables or database already exist

Press the Drop button and the selected database will be deleted (careful here :-)). Now try to import the database once again.

If your hosting provider has removed the Drop button from phpMyAdmin, you can delete the tables this way:

Open the database (Browse), select all tables, and choose Drop from the menu.

You may also need to delete this statement from your edited copy of the database dump:

CREATE DATABASE `your_database_name` DEFAULT CHARACTER SET utf8 COLLATE utf8_general_ci;

It is highly recommended to leave this line in the file:

USE `your_database_name`;
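Removing the CREATE DATABASE statement while keeping USE can also be done with a short script. This is a rough sketch, assuming the statement starts at the beginning of a line and ends at the first semicolon; the function name strip_create_database is made up for illustration.

```python
import re


def strip_create_database(sql_text):
    """Remove CREATE DATABASE statements, keeping USE and everything else."""
    # Match from the start of a line up to the first semicolon, so a
    # statement split across two lines is removed as a whole.
    pattern = re.compile(r"^[ \t]*CREATE\s+DATABASE\b.*?;[ \t]*\n?",
                         re.IGNORECASE | re.DOTALL | re.MULTILINE)
    return pattern.sub("", sql_text)


if __name__ == "__main__":
    dump = ("CREATE DATABASE `your_database_name` DEFAULT\n"
            "CHARACTER SET utf8 COLLATE utf8_general_ci;\n"
            "USE `your_database_name`;\n")
    print(strip_create_database(dump))
```

CREATE TABLE statements are not affected, because the pattern requires the word DATABASE right after CREATE.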

I hope this tutorial was of help, as I myself spent a few hours resolving this issue.


UPDATE (a few hours later):

After trying out a few combinations, I arrived at the following procedure for a new blog with content imported either from wordpress.com or from wordpress.org version 2.2.x or older (I think). This is what worked:

1. Set the collation to utf8_general_ci, as mentioned in Step I.

2. Open your database and DROP all tables, as explained in the Troubleshooting section (leaving the database empty).

3. Head to the domain/URL/web address where you installed your Wordpress blog.

4. Go through the few install steps (change the admin password after setup, if you want).

5. Import the XML file that you downloaded from your older Wordpress blog (using the Wordpress import tool, not MySQL).

You will now notice that all your tables are set to utf8 and that the older imported posts show all characters properly. It seems the Fantastico script for installing wordpress.org is giving headaches to many users across the web.

THAT IS IT … it takes time to ….

7 Responses to “Fixing Wordpress and MySQL Charset Problem (Wordpress Posts Using Several Languages)”

  1. Fixing Wordpress and MySQL Charset Problem (Wordpress Posts Using Several Languages) « Linux and Open Source Blog Says:

    [...] Read More [...]

  2. Lloyd Budd Says:

    Great blog post!

    define('DB_CHARSET', 'utf8');
    to
    define('DB_CHARSET', '');
    may be easier for some.

    http://www.mydigitallife.info/2007/06/18/wordpress-charset-encoding-problem-after-upgrading-to-version-22/

  3. E@zyVG Says:

    Yah, it is easier, but I am afraid that it can break future import/export procedures. I was trying to find a future-proof method.

  4. Deusdies Says:

    Thank you very much! It worked! Thanks!!!

  5. Venu Says:

    I too recently moved to a new host and faced a similar problem. I had set the collation to utf8_unicode_ci (even though the blog was all in English :) ).

    Even then, some of the special characters and HTML entities got corrupted again, though only in a few posts. Now my question would be: should I try utf8_general_ci?

    I hope there is some kind of wordpress plugin which can “tidy” up all these issues :)

  6. admin Says:

    OK, if you hotlinked my images and grabbed my post, maybe you could give credit?
    Thanks

  7. E@zyVG Says:

    The hotlink was a mistake; I didn't want it here anyway, so I removed it completely. Credits, sure, your post was of help, though I ended up changing things and finding the solution another way. Anyway, thanks.
