TransWikia.com

Excel: Change default encoding (file origin) of Text Import Wizard to UTF-8 (65001 : Unicode)

Super User Asked by Dizzley on December 22, 2020

I am using a variety of tools to regularly prepare data for the web. One stage requires me to open a CSV in Excel, make changes and save the file.

Is there a way to force Excel to accept UTF-8 encoding, and to save its files with that encoding?

8 Answers

I answered a similar question at Default character encoding for Excel Text Wizard?.

I found my answer at Changing default text import origin type in Excel.

  1. Close Excel, if it is open.
  2. Open the Registry Editor.
  3. Navigate to HKEY_CURRENT_USER → Software → Microsoft → Office → ▒▒ → Excel → Options, where ▒▒ is your version of Office, most likely the largest number you see there.
  4. Right-click an empty space on the right side and select New → DWORD.
  5. Name the item DefaultCPG, and press Enter to save.
  6. Right-click on DefaultCPG and select Modify.
  7. Set the Base to Decimal.
  8. For Value data, enter 65001 to set your default to UTF-8. For some other encoding, use the code page identifier, which you can find in the Text Import Wizard in Excel or in this list.
  9. Click OK.
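
The steps above can also be captured in a .reg file instead of being clicked through by hand. This is only a minimal sketch, assuming a POSIX shell such as the one in Git for Windows or Cygwin. The function name `make_defaultcpg_reg` is hypothetical, and the Office version key is an argument you have to look up in the registry yourself. Note that .reg files store DWORD values in hexadecimal, so decimal 65001 appears as `0000fde9`, which is the same decimal-versus-hex distinction as in Step 7.

```shell
# Print a .reg file that sets DefaultCPG under the Excel Options key.
# $1 = Office version key exactly as it appears in your registry
#      (assumption: you have looked it up under ...\Microsoft\Office)
# $2 = code page identifier in decimal (defaults to 65001, i.e. UTF-8)
make_defaultcpg_reg() {
    office_version="$1"
    codepage="${2:-65001}"
    printf 'Windows Registry Editor Version 5.00\r\n\r\n'
    printf '[HKEY_CURRENT_USER\\Software\\Microsoft\\Office\\%s\\Excel\\Options]\r\n' "$office_version"
    # .reg files expect a DWORD as 8 hex digits, so convert from decimal here.
    printf '"DefaultCPG"=dword:%08x\r\n' "$codepage"
}
```

For example, `make_defaultcpg_reg 16.0 > excel-utf8.reg` would write a file for an installation whose version key is `16.0` (an assumed value for illustration). Inspect the file before importing it, and import it while Excel is closed.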

Like Vasille says in the comment to this question, if your file is not actually in UTF-8 format, you may technically want to convert the characters within the file to the encoding you want before opening in Excel. For my purposes, though, UTF-8 does a good enough job of displaying non-corrupted characters.
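
If the file really is in another encoding, one way to convert it first is `iconv`, which ships with Cygwin and Git for Windows. The sketch below is illustrative only: the wrapper name `to_utf8` is hypothetical, and the source encoding (for example `CP1252`) is an assumption you need to confirm for your own data.

```shell
# Convert a file from a known source encoding to UTF-8.
# $1 = source encoding name as understood by iconv (e.g. CP1252; an assumption,
#      check what your exporting tool actually produces)
# $2 = input file
# $3 = output file (must be a different file from the input)
to_utf8() {
    iconv -f "$1" -t UTF-8 "$2" > "$3"
}
```

A call such as `to_utf8 CP1252 export.csv export-utf8.csv` leaves the original untouched and writes the converted copy next to it.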

Not working? Make sure you set Base to Decimal (Step 7).

[Screenshot: Edit DWORD (32-bit) Value dialog with Base set to Decimal]

Correct answer by Robert H. on December 22, 2020

It seems that a byte order mark (BOM) is required by Microsoft Office software. Either of the following adds one:

  1. Using Notepad++, convert the CSV using menu: Encoding -> Convert to UTF8-BOM.

  2. Using the sed Unix utility, available in cmder or Git for Windows. See Adding BOM to UTF-8 files.

    sed -i '1s/^/\xef\xbb\xbf/' file.csv

  • Tested with Excel 2016
  • This is what avwtp meant in their answer.
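
Whichever method adds the BOM, it can help to confirm the first three bytes before opening the file in Excel, not least because running the sed command twice would prepend a second BOM. A minimal sketch, assuming `head`, `od`, and `tr` are available (as they are in Git for Windows and Cygwin); the function name `has_utf8_bom` is hypothetical:

```shell
# Succeed (exit status 0) if the file starts with the UTF-8 BOM bytes EF BB BF.
# $1 = file to inspect
has_utf8_bom() {
    # Dump the first three bytes as hex and strip od's spacing and newline.
    [ "$(head -c 3 "$1" | od -An -tx1 | tr -d ' \n')" = "efbbbf" ]
}
```

It can be used as a guard, for example `has_utf8_bom file.csv || sed -i '1s/^/\xef\xbb\xbf/' file.csv`.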

Answered by Cristian on December 22, 2020

Assuming you have a Unix shell available (Cygwin), I prepend a small 3-byte UTF-8 identifier (the byte order mark) to the CSV file. It then reads fine into Excel and, more importantly, is processed properly as input when using VB Excel automation.

UTF-8.txt

EFBBBF << hex-edit these bytes into a 3-byte file.

cat UTF-8.txt file-to-use.csv > new_file.csv
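
If no hex editor is at hand, the 3-byte file can probably be produced from the shell itself. The following is a sketch under that assumption; the function names `write_bom_file` and `prepend_bom` are hypothetical, and the second simply wraps the `cat` command above.

```shell
# Write the three BOM bytes (EF BB BF) using octal escapes, which POSIX printf
# supports; \x escapes are not portable across shells.
# $1 = path of the BOM file to create
write_bom_file() {
    printf '\357\273\277' > "$1"
}

# Prepend the BOM file to a CSV, writing the result to a new file.
# $1 = BOM file, $2 = input CSV, $3 = output CSV (must differ from the input)
prepend_bom() {
    cat "$1" "$2" > "$3"
}
```

With these, `write_bom_file UTF-8.txt` followed by `prepend_bom UTF-8.txt file-to-use.csv new_file.csv` reproduces the two steps in this answer.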

Answered by avwtp on December 22, 2020

You need to start with a blank document, use the File > Import option, and specify UTF-8 there.

This is far from a way to make UTF-8 the default for all files, but it does make it unnecessary to route the files through Google Drive or LibreOffice. The defaults are just badly chosen, and the inability to change them is irritating.

Related

  1. How to set character encoding when opening Excel (it does not cover changing the defaults so that all files open automatically as UTF-8 instead of Macintosh format on OS X)

  2. How to change default file option to UTF-8 away from Macintosh file option in Apple.SE.

Answered by hhh on December 22, 2020

It's not the best solution, but it's an option: upload your Excel file to Google Drive, open it with Google Sheets, and download it as a CSV file. It worked for me.

Answered by Viktor on December 22, 2020

There's an Excel add-in available here for working with Unicode CSV files that should help you.

Here's the developer Jaimon Mathew's note:

Excel treats .csv files as text files and will replace all Unicode characters with “?” when saved normally. If you want to preserve the Unicode characters, you would need to save the file as “Unicode text (*.txt)”, which is a Tab delimited file. Since I couldn’t find any existing solution to preserve Unicode characters in CSV format, I thought I’ll give it a go in creating an Excel Addin to do just that.

Answered by Karan on December 22, 2020

I tried to solve a similar problem before, without success. However, you can use LibreOffice, which uses UTF-8 by default.

Answered by Lluser on December 22, 2020

One easy way to change a CSV file's encoding from ANSI to UTF-8 is to open the .csv file in Notepad, then select File > Save As. At the bottom of the dialog you will see that Encoding is set to ANSI. Change it to UTF-8, save the file under a new name, and you're done.

Answered by Moudiz on December 22, 2020
