Unicode ́ (U+301) error in biblatex, but not in main text: {\'{\i}}
When compiling my document, which embeds references using biblatex, I get the error message:
Package inputenc Error: Unicode char ́ (U+301)
(inputenc) not set up for use with LaTeX
With the help of the various unicode/biblatex questions on this site, I identified the character {\'{\i}} in one of the references as the culprit. Interestingly, setting {\'{\i}} in the main text does not throw an error message:
\begin{filecontents}{biblio.bib}
@Article{Zheng2016,
%author = {Qinsi Zheng and Steffen Jockusch and Gabriel G. Rodr{\'{\i}}guez-Calero and Zhou Zhou and Hong Zhao and Roger B. Altman and H{\'e}ctor D. Abru{\~n}a and Scott C. Blanchard},
author = {Qinsi Zheng and Gabriel G. Rodr{\'i}guez-Calero and Steffen Jockusch and Zhou Zhou and Hong Zhao and Roger B. Altman and H{\'e}ctor D. Abru{\~n}a and Scott C. Blanchard},
title = {Intra-molecular triplet energy transfer is a general approach to improve organic fluorophore photostability},
journal = {Photochemical {\&} Photobiological Sciences},
year = {2016},
volume = {15},
number = {2},
pages = {196--203},
doi = {10.1039/c5pp00400d},
publisher = {Royal Society of Chemistry ({RSC})},
}
@Article{Pennacchietti2018,
author = {Francesca Pennacchietti and Ekaterina O. Serebrovskaya and Aline R. Faro and Irina I. Shemyakina and Nina G. Bozhanova and Alexey A. Kotlobay and Nadya G. Gurskaya and Andreas Bod{\'{e}}n and Jes Dreier and Dmitry M. Chudakov and Konstantin A. Lukyanov and Vladislav V. Verkhusha and Alexander S. Mishin and Ilaria Testa},
title = {Fast reversibly photoswitching red fluorescent proteins for live-cell {RESOLFT} nanoscopy},
journal = {Nature Methods},
year = {2018},
volume = {15},
number = {8},
month = {jul},
pages = {601--604},
doi = {10.1038/s41592-018-0052-9},
publisher = {Springer Nature America, Inc},
}
\end{filecontents}
\documentclass[pdfa,a4paper,11pt,
bibliography=totoc,
numbers=noenddot,
abstracton,
twoside,openright,
parskip=half]{scrartcl}
\usepackage[english]{babel} % provides the dictionary for proper hyphenation
\frenchspacing % single space after full stop
\raggedbottom
\usepackage[utf8]{inputenc} % input encoding of the source file
\usepackage{filecontents}
\usepackage{csquotes} % needed for babel / polyglossia
\usepackage[
natbib = true, % allows usage of \citet, \citep etc. commands
citestyle = authoryear, bibstyle = authoryear, %
backend = biber, %
sortcites = true, % sorts multiple refs in one cite command
hyperref = true, %backref = true, %
giveninits = true, %
terseinits = false, % if true: D. E. => DE
%uniquelist = true,
maxbibnames = 30, maxcitenames = 2, %
uniquename = init, uniquelist = minyear, % uniquelist = minyear only cites 2nd author if first author and year are identical
date = year,
url = false, isbn = false]{biblatex} % package for the bibliography
\addbibresource{biblio.bib}
\usepackage{hyperref} % crossreferencing
\begin{document}
\section{Introduction}
\citep{Zheng2016}
\citep{Pennacchietti2018}
S\'{\i}
\printbibliography
\end{document}
Trying to solve the problem, I found different approaches on this site. Using {\'i}, as suggested in this answer, works. However, for automatically imported bibliography entries, it's tedious to find all of the offending characters, especially when the error might occur with different combinations of precomposed characters, as suggested here.
I therefore tried to configure Biber using the --output-safechars option, as suggested in this answer. Compiling manually from the terminal, it seems to work OK. However, I prefer to use latexmk for compilation (especially when compilation workflows require multiple runs of various compilers). I then found this answer, explaining how to pass biber options to latexmk. I created the file .latexmkrc in the local directory, containing the line
$biber='biber --output-safechars';
This finally works.
I am afraid, however, that this whole workflow is beyond my boss's willingness to put up with the quirks of LaTeX.
So I guess I have two questions here:
1) Is there any way to remove the offending characters automatically? I found this answer, but I am afraid that it's way beyond my understanding.
2) If there isn't, is there any way to force latexmk/biber to compile such characters properly that does not require any additional files or setup? Ideally, I'm looking for some magic commands that I could "sneak in unnoticed" at the beginning of the .tex file.
Edit:
I just tested the workflow using the .latexmkrc on my whole document, which now throws the error
Undefined control sequence.
in the line just after the \printbibliography command. Apparently some entry in my bibliography of 200+ entries clashes with the --output-safechars option.
I'll look into it, but it seems this workflow might also not work for me in the end.
biblatex unicode latexmk
It's a guess here, for I don't really know if the sourcemap will kick in before the error occurs, or not. But, after the attempts you described, I would try a regex replacement of {\'{\i}} with {\'i} using a sourcemap.
– gusbrs
Jan 10 at 13:12
@gusbrs: what's a sourcemap? :)
– Wiebke
Jan 10 at 13:15
Wiebke, see the answer moewe just provided. ;-)
– gusbrs
Jan 10 at 13:17
Asked by Wiebke on Jan 10 at 13:07; edited Jan 10 at 13:59.
1 Answer
The best solution™ is of course to use the correct Unicode characters in the source (ideally precomposed characters rather than base letters followed by combining marks: for Åström that means Å as U+00C5 and ö as U+00F6).
author = {Qinsi Zheng and Steffen Jockusch and Gabriel G. Rodríguez-Calero
and Zhou Zhou and Hong Zhao and Roger B. Altman and Héctor D. Abruña
and Scott C. Blanchard},
The benefit of this solution is that it is easier to read, just works, and avoids the additional braces that BibTeX needs. Those braces are retained in Biber for simplicity and backwards compatibility, but they could destroy kerning and are otherwise unnecessary for Biber; see How to write “ä” and other umlauts and accented letters in bibliography? for why BibTeX needs them.
If that is not possible and you can't replace {\'{\i}} with {\'i} in the source, you can try a sourcemap as shown in PLK's answer to Input encoding error after upgrading from Biber 1.9 to Biber 2.1.
The practical drawback of that approach is that you need to add a substitution rule for every possible problematic combination.
To offer some additional benefit over PLK's answer, the code below uses the new loop functionality to replace \`{\i}, \'{\i}, \^{\i} and \"{\i} (all Latin-1 dotless-i combinations) for (hopefully) all fields where it makes sense.
\documentclass{article}
\usepackage[english]{babel}
\usepackage[utf8]{inputenc}
\usepackage{csquotes}
\usepackage[style = authoryear, backend = biber, maxbibnames=999]{biblatex}
\addbibresource{\jobname.bib}
\DeclareDatafieldSet{setall}{
  \member[datatype=literal]
  \member[datatype=name]
  \member[field=journal]% journal is special since it is
                        % actually journaltitle
}
\DeclareSourcemap{
  \maps[datatype=bibtex]{
    \map[overwrite, foreach={setall}]{
      % \`{\i}
      \step[fieldsource=\regexp{$MAPLOOP},
            match=\regexp{\x{0131}\x{0300}},
            replace=\regexp{\x{00EC}}]
      % \'{\i}
      \step[fieldsource=\regexp{$MAPLOOP},
            match=\regexp{\x{0131}\x{0301}},
            replace=\regexp{\x{00ED}}]
      % \^{\i}
      \step[fieldsource=\regexp{$MAPLOOP},
            match=\regexp{\x{0131}\x{0302}},
            replace=\regexp{\x{00EE}}]
      % \"{\i}
      \step[fieldsource=\regexp{$MAPLOOP},
            match=\regexp{\x{0131}\x{0308}},
            replace=\regexp{\x{00EF}}]
    }
  }
}
\usepackage{filecontents}
\begin{filecontents}{\jobname.bib}
@article{itest,
  author  = {Lo{\"{\i}}c Rodr{\'{\i}}guez-Calero},
  title   = {Lor{\"{\i}}m {\'{\i}}psum and {\`{\i}}v{\^{\i}}n},
  journal = {Dol{\"{\i}}r s{\'{\i}}t},
  note    = {Am{\"{\i}}t cons{\'{\i}}ctur},
  date    = {2018},
}
@article{Zheng2016,
  author  = {Qinsi Zheng and Steffen Jockusch
             and Gabriel G. Rodr{\'{\i}}guez-Calero
             and Zhou Zhou and Hong Zhao and Roger B. Altman
             and H{\'e}ctor D. Abru{\~n}a and Scott C. Blanchard},
  title   = {Intra-molecular triplet energy transfer is a general
             approach to improve organic fluorophore photostability},
  journal = {Photochemical {\&} Photobiological Sciences},
  year    = {2016},
  volume  = {15},
  number  = {2},
  pages   = {196--203},
  doi     = {10.1039/c5pp00400d},
}
\end{filecontents}
\begin{document}
\parencite{Zheng2016}
\cite{itest}
\printbibliography
\end{document}
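As noted above, every further problematic combination needs its own rule. The fragment below is only a sketch of what one such extra rule might look like, following the same pattern as the four steps in the sourcemap. It assumes that Biber decodes \~{\i} to dotless i (U+0131) followed by the combining tilde (U+0303), by analogy with the cases above; that has not been tested here, and whether the precomposed result ĩ (U+0129) can be typeset depends on the font encoding in use.

```latex
% Sketch only: one additional step, to be placed inside the existing
% \map[overwrite, foreach={setall}]{...} block next to the other steps.
% Assumption: \~{\i} arrives as U+0131 followed by U+0303.
% \~{\i}
\step[fieldsource=\regexp{$MAPLOOP},
      match=\regexp{\x{0131}\x{0303}},
      replace=\regexp{\x{0129}}]
```

Since \DeclareSourcemap can only be used once per document, such a step has to be added to the existing declaration rather than put into a second one.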
Why is this Unicode business such an issue?
Unicode combines characters by adding the combining marks after the base glyph. LaTeX works exactly the other way round: the accents are added before the glyph (as a macro that gets the base glyph as its argument).
Biber 'parses' the LaTeX character macros and converts them to Unicode characters for sorting and the like. That is done according to simple translations of macros into Unicode code points and the complex Unicode rules.
Combining characters involving i are particularly complicated, since LaTeX usually bases its accented characters upon the 'dotless i' (\i, ı, U+0131) to avoid clashes of accent and tittle, whereas Unicode seems to prefer its combining characters to be based on the 'small i' (i, U+0069); see http://unicode.org/faq/char_combmark.html#22. That means that \'i gets converted to í (U+00ED), but \'\i to ı́ (U+0131 + U+0301, a combination of the dotless i and the combining acute accent).
LaTeX's inputenc can only deal with a sensible subset of Unicode and fails to account for ı́ (U+0131 + U+0301), while it handles í (U+00ED) just fine.
See also PLK's explanation in the linked answer as well as the comments in https://github.com/plk/biber/issues/65 and https://github.com/plk/biblatex/issues/819.
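This also explains why the accent works in the main text but not in the bibliography: in the main text LaTeX sees the macro itself, whereas the bibliography data written by Biber contains the decoded Unicode characters. A minimal sketch to reproduce the difference, assuming pdfLaTeX with inputenc's utf8 option (the file is an illustration and not part of the original answer):

```latex
\documentclass{article}
\usepackage[utf8]{inputenc}
\begin{document}
S\'{\i}  % accent macro applied to the dotless i: handled by LaTeX itself

Sí       % precomposed U+00ED: known to inputenc

% Sı́    % U+0131 followed by U+0301: uncommenting this line should
         % trigger the "Unicode char (U+301) not set up" error
\end{document}
```

The commented-out line corresponds to what ends up in the bibliography data when the entry contains {\'{\i}} and no sourcemap is applied.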
Another solution that needs no such tricks, but might not be compatible with your workflow, is to use a proper Unicode engine like LuaLaTeX or XeLaTeX and a font that has properly kerned accents (Linux Libertine, for example).
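A possible preamble for the Unicode-engine route could look like the sketch below. It assumes that the file is compiled with lualatex or xelatex followed by biber, that the font is available to fontspec under the name Linux Libertine O, and that the bibliography file is called biblio.bib as in the question; all three are assumptions for illustration.

```latex
% Compile with lualatex or xelatex, then biber, then the engine again.
\documentclass{article}
\usepackage{fontspec}            % Unicode font loading; inputenc is not needed
\setmainfont{Linux Libertine O}  % assumption: font installed under this name
\usepackage[english]{babel}
\usepackage{csquotes}
\usepackage[style=authoryear, backend=biber]{biblatex}
\addbibresource{biblio.bib}      % assumption: file name taken from the question
\begin{document}
\parencite{Zheng2016}
\printbibliography
\end{document}
```

With this setup no sourcemap should be required, because the engine reads the combining sequence directly and the font takes care of placing the accent.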
I didn't know "best solution" was trade marked! I hope you don't have to pay any royalties for the ones you use to provide. ;-)
– gusbrs
Jan 10 at 13:18
+1: Not just because of the humor. Are there specific interpreters necessary? Python or Perl for example?
– Dr. Manuel Kuehner
Jan 10 at 13:22
Every time I look, your answer is getting more complicated :) Unicode TeX engines will probably not work (the document is being prepared for journal submission). That's also why I don't want to replace the offending characters with proper Unicode characters: I try to keep a single bibliography, and some journals still require BibTeX, so I need the backwards compatibility. The sourcemaps work for me. But, following up on the question above: do I have to expect this to fail on the journal's side, because the journal might not have the specific interpreters that might be needed?
– Wiebke
Jan 10 at 14:40
@Dr.ManuelKuehner Sorry, I only just saw the question in your comment. No, you only need Biber (which is written in Perl, hence this is all being very Perl-y; Biber is usually installed as a stand-alone executable, which brings its own Perl modules, so no Perl or Python installation is required).
– moewe
Jan 10 at 14:58
@Wiebke Well, I would not take that gamble. I took the liberty of googling the phrase you quoted and the hits lead me to Elsevier journals. Elsevier have their own document class (which usually loads natbib and is incompatible with biblatex) as well as BibTeX bibliography styles and say: "You are recommended to use the Elsevier article class elsarticle.cls to prepare your manuscript and BibTeX to generate your bibliography. Our LaTeX site has detailed submission instructions, templates and other information."
– moewe
Jan 10 at 16:06
|
show 5 more comments
Your Answer
StackExchange.ready(function() {
var channelOptions = {
tags: "".split(" "),
id: "85"
};
initTagRenderer("".split(" "), "".split(" "), channelOptions);
StackExchange.using("externalEditor", function() {
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled) {
StackExchange.using("snippets", function() {
createEditor();
});
}
else {
createEditor();
}
});
function createEditor() {
StackExchange.prepareEditor({
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: false,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: null,
bindNavPrevention: true,
postfix: "",
imageUploader: {
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
},
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
});
}
});
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2ftex.stackexchange.com%2fquestions%2f469555%2funicode-u301-error-in-biblatex-but-not-in-main-text-i%23new-answer', 'question_page');
}
);
Post as a guest
Required, but never shown
1 Answer
1
active
oldest
votes
1 Answer
1
active
oldest
votes
active
oldest
votes
active
oldest
votes
The best solution™ is of course to use the correct Unicode characters (and ideally the precomposed characters: Åström, not a combination of the combining characters: Åström) in the source.
author = {Qinsi Zheng and Steffen Jockusch and Gabriel G. Rodríguez-Calero
and Zhou Zhou and Hong Zhao and Roger B. Altman and Héctor D. Abruña
and Scott C. Blanchard},
The benefit of this solution is that it is easier to read, just works and avoids the additional braces that BibTeX needs (and that are retained in Biber for simplicity and backwards compatibility, those braces could destroy kerning and are otherwise unnecessary for Biber, see How to write “ä” and other umlauts and accented letters in bibliography? for why they are needed for BibTeX).
If that is not possible and you can't replace {'{i}}
with {'i}
in the source, you can try a sourcemap as shown in PLK's answer to Input encoding error after upgrading from Biber 1.9 to Biber 2.1.
The logistic drawback of that approach is that you need to add a substitution rule for every possible problematic combination.
To offer some additional benefit over PLK's answer, the code below uses the new loop functionality to replace `{i}
, '{i}
, ^{i}
and "{i}
(all Latin-1 dotless-i combinations) for (hopefully) all fields where it makes sense.
documentclass{article}
usepackage[english]{babel}
usepackage[utf8]{inputenc}
usepackage{csquotes}
usepackage[style = authoryear, backend = biber, maxbibnames=999]{biblatex}
addbibresource{jobname.bib}
DeclareDatafieldSet{setall}{
member[datatype=literal]
member[datatype=name]
member[field=journal]% journal is special since it is
% actually journaltitle
}
DeclareSourcemap{
maps[datatype=bibtex]{
map[overwrite, foreach={setall}]{
% `{i}
step[fieldsource=regexp{$MAPLOOP},
match=regexp{x{0131}x{0300}},
replace=regexp{x{00EC}}]
% '{i}
step[fieldsource=regexp{$MAPLOOP},
match=regexp{x{0131}x{0301}},
replace=regexp{x{00ED}}]
% ^{i}
step[fieldsource=regexp{$MAPLOOP},
match=regexp{x{0131}x{0302}},
replace=regexp{x{00EE}}]
% "{i}
step[fieldsource=regexp{$MAPLOOP},
match=regexp{x{0131}x{0308}},
replace=regexp{x{00EF}}]
}
}
}
usepackage{filecontents}
begin{filecontents}{jobname.bib}
@article{itest,
author = {Lo{"{i}}c Rodr{'{i}}guez-Calero},
title = {Lor{"{i}}m {'{i}}psum and {`{i}}v{^{i}}n},
journal = {Dol{"{i}}r s{'{i}}t},
note = {Am{"{i}}t cons{'{i}}ctur},
date = {2018},
}
@article{Zheng2016,
author = {Qinsi Zheng and Steffen Jockusch
and Gabriel G. Rodr{'{i}}guez-Calero
and Zhou Zhou and Hong Zhao and Roger B. Altman
and H{'e}ctor D. Abru{~n}a and Scott C. Blanchard},
title = {Intra-molecular triplet energy transfer is a general
approach to improve organic fluorophore photostability},
journal = {Photochemical {&} Photobiological Sciences},
year = {2016},
volume = {15},
number = {2},
pages = {196--203},
doi = {10.1039/c5pp00400d},
}
end{filecontents}
begin{document}
parencite{Zheng2016}
cite{itest}
printbibliography
end{document}
Why is this Unicode business such an issue?
Unicode composes characters by placing the combining marks after the base character. LaTeX works exactly the other way round: the accent comes before the base character (as a macro that takes the base character as its argument).
Biber 'parses' the LaTeX character macros and converts them to Unicode characters for sorting and the like. That is done with simple translations of macros into Unicode code points, followed by the complex Unicode composition rules.
Combinations involving i are particularly complicated since LaTeX usually bases its accented characters on the 'dotless i' (\i, ı, U+0131) to avoid clashes of accent and tittle, whereas Unicode seems to prefer to base its combining sequences on the 'small i' (i, U+0069); see http://unicode.org/faq/char_combmark.html#22. That means that \'i gets converted to í (U+00ED), but \'\i to ı́ (U+0131 + U+0301, a combination of the dotless i and the combining acute accent).
LaTeX's inputenc can only deal with a sensible subset of Unicode and fails to account for ı́ (U+0131 + U+0301), while it handles í (U+00ED) just fine.
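To make the difference concrete, here is a minimal sketch for pdfLaTeX with inputenc. It is an illustrative test file, not part of the original example, and it assumes a fairly standard setup with T1 font encoding. The third variant is left commented out because it should reproduce an error of the form "Unicode char ... (U+301) not set up for use with LaTeX"; the exact wording may differ between LaTeX versions.

```latex
\documentclass{article}
\usepackage[T1]{fontenc}
\usepackage[utf8]{inputenc}
\begin{document}
% 1. LaTeX accent macro applied to the dotless i: handled by LaTeX itself
Rodr\'{\i}guez

% 2. Precomposed character U+00ED: inputenc knows this one
Rodríguez

% 3. Dotless i (U+0131) followed by the combining acute accent (U+0301).
%    This is the sequence Biber writes for \'{\i}; uncommenting the next
%    line should trigger the U+301 error under pdfLaTeX.
% Rodrı́guez
\end{document}
```

Variants 1 and 2 are expected to give the same printed result; only the decomposed sequence in variant 3 is a problem for inputenc.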
See also PLK's explanation in the linked answer as well as comments in https://github.com/plk/biber/issues/65 and https://github.com/plk/biblatex/issues/819.
Another solution that needs no such tricks, but might not be compatible with your workflow, is to use a proper Unicode engine like LuaLaTeX or XeLaTeX and a font that has properly kerned accents (Linux Libertine, for example).
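As a rough sketch of the Unicode-engine route: the preamble below assumes LuaLaTeX or XeLaTeX, the fontspec package, and that the Linux Libertine OpenType font is installed under the name "Linux Libertine O" (adjust the name to whatever your system provides). The file name biblio.bib is taken from the question and is only a placeholder here.

```latex
% Compile with lualatex (or xelatex), then biber, then lualatex again
\documentclass{article}
\usepackage{fontspec}
\setmainfont{Linux Libertine O}% assumption: font is installed under this name
\usepackage[english]{babel}
\usepackage{csquotes}
\usepackage[style=authoryear, backend=biber]{biblatex}
\addbibresource{biblio.bib}% placeholder: the .bib file from the question
\begin{document}
\parencite{Zheng2016}
\printbibliography
\end{document}
```

Neither inputenc nor a sourcemap appears here, because these engines read UTF-8 natively. How well a decomposed sequence such as ı́ is rendered then depends on the font.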
I didn't know "best solution" was trademarked! I hope you don't have to pay any royalties for the ones you use to provide. ;-)
– gusbrs
Jan 10 at 13:18
+1: Not just because of the humor. Are there specific interpreters necessary? Python or Perl for example?
– Dr. Manuel Kuehner
Jan 10 at 13:22
Every time I look, your answer is getting more complicated :) Unicode TeX engines will probably not work (the document is prepared for journal submission). That's why I also don't want to replace the offending characters with proper Unicode characters: I try to keep a single bibliography, and some journals still require BibTeX, so I need the backwards compatibility. The sourcemaps work for me. But, following up on the question above: do I have to expect this to fail on the journal's side, because the journal might not have the specific interpreters that might be needed?
– Wiebke
Jan 10 at 14:40
@Dr.ManuelKuehner Sorry, I only just saw the question in your comment. No, you only need Biber (which is written in Perl hence this is all being very Perl-y; Biber is usually installed as a stand-alone executable, which brings its own Perl modules, so no Perl or Python installation is required).
– moewe
Jan 10 at 14:58
@Wiebke Well, I would not take that gamble. I took the liberty of googling the phrase you quoted and the hits lead me to Elsevier journals. Elsevier have their own document class (which usually loads natbib and is incompatible with biblatex) as well as BibTeX bibliography styles and say: "You are recommended to use the Elsevier article class elsarticle.cls to prepare your manuscript and BibTeX to generate your bibliography. Our LaTeX site has detailed submission instructions, templates and other information."
– moewe
Jan 10 at 16:06
edited Jan 10 at 14:19
answered Jan 10 at 13:16
moewe
It's a guess here, for I don't really know whether the sourcemap will kick in before the error occurs. But, after the attempts you described, I would try a regex replace of {\'{\i}} with {\'i} using a sourcemap.
– gusbrs
Jan 10 at 13:12
@gusbrs: what's a sourcemap? :)
– Wiebke
Jan 10 at 13:15
Wiebke, see the answer moewe just provided. ;-)
– gusbrs
Jan 10 at 13:17