I want to keep my sections in independent .txt files in my assignment folder, so I use \input{text.txt}. The problem is that I write in Danish, and we have the characters 'æ', 'ø' and 'å', which LaTeX doesn't recognize out of the box. I fixed this in the master .tex file, so when I type the letters there, they show up nicely in my document. They do not, however, show up nicely if I put one of those characters in a file name or inside the .txt files. I don't mind having to avoid them in file names, but I really need them to be recognized within the files.
\documentclass[10pt,a4paper]{article}
\usepackage[danish]{babel}
\renewcommand{\danishhyphenmins}{22}
\usepackage{lmodern}
\usepackage[T1]{fontenc}
\usepackage[utf8x]{inputenc}
\usepackage[danish=quotes]{csquotes}
\begin{document}
æ ø å
\input{text.txt}
\end{document}
I can't upload the .txt file, but if you copy the three letters into a file called text.txt and put it in the same folder, you'll see what I mean.
If I delete \input{text.txt} from the document, or avoid those three letters in the input file, everything works perfectly and 'æ ø å' shows up correctly in the output.
I asked this question on another forum, but they weren't much help. They said the file was probably in the wrong encoding, and that if I saved it as a .tex file there would be a dialogue letting me change the encoding of the input document; however, there is no such option in the 'Save as' dialogues I get from TeXworks. I also have the encoding set to utf8 in my TeXworks preferences, but that doesn't seem to influence anything.
Best Answer
I agree with the people on the other forum - the issue is likely that the text file is in the wrong encoding - but I disagree with their solution. Depending on your operating system, I suggest two different approaches:
Under Linux
First a disclaimer: I use Ubuntu, and the exact commands might be slightly different under other distributions. The general idea is the same, however, so you should be able to iron out any kinks with the help of Google...
Confirming the diagnosis
To confirm that encoding is in fact the issue, cd to the folder where your files reside and run

    file *

That should give you output like the following (and so on for more files):

    text.txt: ISO-8859 text

If the text file is listed as ISO-8859 (or something similar) - in fact, anything other than "UTF-8 text" - then encoding is your problem.

Fixing the problem
To convert ISO-8859 (a.k.a. "Latin 1") to UTF-8, you can use the following command:

    iconv -f latin1 -t utf8 text.txt > text-utf8.txt

iconv is an encoding conversion utility. -f latin1 and -t utf8 are arguments to iconv that tell the program which encoding the file is currently in and which encoding you want it in. For a complete list of possible encoding names, run iconv --list. The last argument is the file name of the input file (i.e. the one in the "wrong" encoding). iconv writes the file, in the new encoding, to stdout, so we redirect the output into a new file (don't use the same file name - you'll overwrite your file with an empty one).

Under Windows
Confirming the diagnosis
My standard way of confirming encoding problems under Windows is to open the file in Notepad and select Save as... - there is a little drop-down list that lets you choose the encoding of the file, and if you don't change it, it shows the file's current encoding. Files that I find problematic when using UTF-8 usually turn out to be saved in ANSI, Microsoft's legacy encoding (in practice Windows-1252, an extension of ASCII). If encoding is your problem, this drop-down list shows something other than "UTF-8".
Fixing the problem
To fix it, simply select UTF-8 in the drop-down list, (optionally) choose a new file name for your input file, and hit Save.
Notepad converts the file intelligently, and if you run into problems you can usually just reverse the process to get back the file you started with and try something else.
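If Notepad's conversion misbehaves, or you want something scriptable that works the same way on any platform, the re-encoding can also be done with a few lines of Python. This is only a sketch under the assumption that the source file really is Latin 1; the file names ("demo.txt", "demo_utf8.txt") are placeholders:

```python
from pathlib import Path

def latin1_to_utf8(src, dst):
    """Read src as Latin-1 and write the same text to dst as UTF-8."""
    text = Path(src).read_text(encoding="latin-1")
    Path(dst).write_text(text, encoding="utf-8")

# Demo: fake a Latin-1 encoded input file containing the Danish letters,
# then convert it to a UTF-8 copy (never overwrite the original).
Path("demo.txt").write_bytes("æ ø å".encode("latin-1"))
latin1_to_utf8("demo.txt", "demo_utf8.txt")
print(Path("demo_utf8.txt").read_text(encoding="utf-8"))  # æ ø å
```

The resulting UTF-8 file can then be pulled in with \input as before.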