Unicode ́ (U+301) error in biblatex, but not in main text: {\'{\i}}



























When compiling my document embedding references using biblatex, I get the error message:




Package inputenc Error: Unicode char ́ (U+301)
(inputenc)                not set up for use with LaTeX




With the help of the various unicode/biblatex questions on this site, I identified the character {\'{\i}} in one of the references as the culprit. Interestingly, setting {\'{\i}} in the main text does not throw an error message:



\begin{filecontents}{biblio.bib}
@Article{Zheng2016,
  %author = {Qinsi Zheng and Steffen Jockusch and Gabriel G. Rodr{\'{\i}}guez-Calero and Zhou Zhou and Hong Zhao and Roger B. Altman and H{\'e}ctor D. Abru{\~n}a and Scott C. Blanchard},
  author = {Qinsi Zheng and Gabriel G. Rodr{\'i}guez-Calero and Steffen Jockusch and Zhou Zhou and Hong Zhao and Roger B. Altman and H{\'e}ctor D. Abru{\~n}a and Scott C. Blanchard},
  title = {Intra-molecular triplet energy transfer is a general approach to improve organic fluorophore photostability},
  journal = {Photochemical {\&} Photobiological Sciences},
  year = {2016},
  volume = {15},
  number = {2},
  pages = {196--203},
  doi = {10.1039/c5pp00400d},
  publisher = {Royal Society of Chemistry ({RSC})},
}

@Article{Pennacchietti2018,
  author = {Francesca Pennacchietti and Ekaterina O. Serebrovskaya and Aline R. Faro and Irina I. Shemyakina and Nina G. Bozhanova and Alexey A. Kotlobay and Nadya G. Gurskaya and Andreas Bod{\'{e}}n and Jes Dreier and Dmitry M. Chudakov and Konstantin A. Lukyanov and Vladislav V. Verkhusha and Alexander S. Mishin and Ilaria Testa},
  title = {Fast reversibly photoswitching red fluorescent proteins for live-cell {RESOLFT} nanoscopy},
  journal = {Nature Methods},
  year = {2018},
  volume = {15},
  number = {8},
  month = {jul},
  pages = {601--604},
  doi = {10.1038/s41592-018-0052-9},
  publisher = {Springer Nature America, Inc},
}
\end{filecontents}



\documentclass[pdfa,a4paper,11pt,
  bibliography=totoc,
  numbers=noenddot,
  abstracton,
  twoside,openright,
  parskip=half]{scrartcl}

\usepackage[english]{babel} % provides the dictionary for proper hyphenation
\frenchspacing % single space after full stop
\raggedbottom
\usepackage[utf8]{inputenc} % for font encoding

\usepackage{filecontents}

\usepackage{csquotes} % needed for babel / polyglossia
\usepackage[
  natbib = true, % allows usage of \citet, \citep etc. commands
  citestyle = authoryear, bibstyle = authoryear, %
  backend = biber, %
  sortcites = true, % sorts multiple refs in one cite command
  hyperref = true, %backref = true, %
  giveninits = true, %
  terseinits = false, % if true: D. E. => DE
  %uniquelist = true,
  maxbibnames = 30, maxcitenames = 2, %
  uniquename = init, uniquelist = minyear, % uniquelist = minyear only cites 2nd author if first author and year are identical
  date = year,
  url = false, isbn = false]{biblatex} % package for the bibliography
\addbibresource{biblio.bib}
\usepackage{hyperref} % crossreferencing

\begin{document}

\section{Introduction}
\citep{Zheng2016}
\citep{Pennacchietti2018}

S\'{\i}

\printbibliography

\end{document}


Trying to solve the problem, I found different approaches on this site:

  • Using {\'i}, as suggested in this answer, works. However, for automatically imported bibliography entries it's tedious to find all of the offending characters, especially since the error might occur with different combinations of precomposed characters, as suggested here.

  • I therefore tried running Biber with the --output-safechars option, as suggested in this answer. Compiling manually from the terminal, it seems to work OK.

  • However, I prefer to use latexmk for compilation (especially when compilation workflows require multiple runs of various compilers). I then found this answer, explaining how to pass Biber options to latexmk. I created the file .latexmkrc in the local directory, containing the line $biber='biber --output-safechars';. This finally works.

I am afraid, however, that this whole workflow is beyond my boss's willingness to put up with the quirks of LaTeX.

So I guess I have two questions here:

1) Is there any way to remove the offending characters automatically? I found this answer, but am afraid that it's way beyond my understanding.

2) If there isn't, is there any way to force latexmk/Biber to compile such characters properly that does not require any additional files or setup? Ideally, I'm looking for some magic commands that I could "sneak in unnoticed" at the beginning of the .tex file.

Edit:
I just tested the workflow using the .latexmkrc on my whole document, which now throws the error

Undefined control sequence.

in the line just after the \printbibliography command. Apparently some entry in my 200+ bibliography clashes with the --output-safechars option.

I'll look into it, but it seems this workflow might not work for me in the end either.



































  • It's a guess here, for I don't really know whether the sourcemap will kick in before the error occurs or not. But, after the attempts you described, I would try a regex replacement of {\'{\i}} with {\'i} using a sourcemap.

    – gusbrs
    Jan 10 at 13:12











  • @gusbrs: what's a sourcemap? :)

    – Wiebke
    Jan 10 at 13:15











  • Wiebke, see the answer moewe just provided. ;-)

    – gusbrs
    Jan 10 at 13:17
















biblatex unicode latexmk














asked Jan 10 at 13:07 by Wiebke, edited Jan 10 at 13:59























1 Answer














The best solution™ is of course to use the correct Unicode characters in the source (and ideally the precomposed characters: Åström with Å as U+00C5 and ö as U+00F6, not a combination with combining characters: A + U+030A and o + U+0308).



author    = {Qinsi Zheng and Steffen Jockusch and Gabriel G. Rodríguez-Calero
and Zhou Zhou and Hong Zhao and Roger B. Altman and Héctor D. Abruña
and Scott C. Blanchard},


The benefit of this solution is that it is easier to read, just works, and avoids the additional braces that BibTeX needs. Those braces are retained in Biber for simplicity and backwards compatibility, but they could destroy kerning and are otherwise unnecessary for Biber; see How to write “ä” and other umlauts and accented letters in bibliography? for why they are needed for BibTeX.





If that is not possible and you can't replace {\'{\i}} with {\'i} in the source, you can try a sourcemap as shown in PLK's answer to Input encoding error after upgrading from Biber 1.9 to Biber 2.1.

The practical drawback of that approach is that you need to add a substitution rule for every possible problematic combination.

To offer some additional benefit over PLK's answer, the code below uses the new loop functionality to replace \`{\i}, \'{\i}, \^{\i} and \"{\i} (all Latin-1 dotless-i combinations) in (hopefully) all fields where it makes sense.



\documentclass{article}
\usepackage[english]{babel}
\usepackage[utf8]{inputenc}
\usepackage{csquotes}
\usepackage[style = authoryear, backend = biber, maxbibnames=999]{biblatex}
\addbibresource{\jobname.bib}

\DeclareDatafieldSet{setall}{
  \member[datatype=literal]
  \member[datatype=name]
  \member[field=journal]% journal is special since it is
                        % actually journaltitle
}

\DeclareSourcemap{
  \maps[datatype=bibtex]{
    \map[overwrite, foreach={setall}]{
      % \`{\i}
      \step[fieldsource=\regexp{$MAPLOOP},
            match=\regexp{\x{0131}\x{0300}},
            replace=\regexp{\x{00EC}}]
      % \'{\i}
      \step[fieldsource=\regexp{$MAPLOOP},
            match=\regexp{\x{0131}\x{0301}},
            replace=\regexp{\x{00ED}}]
      % \^{\i}
      \step[fieldsource=\regexp{$MAPLOOP},
            match=\regexp{\x{0131}\x{0302}},
            replace=\regexp{\x{00EE}}]
      % \"{\i}
      \step[fieldsource=\regexp{$MAPLOOP},
            match=\regexp{\x{0131}\x{0308}},
            replace=\regexp{\x{00EF}}]
    }
  }
}

\usepackage{filecontents}
\begin{filecontents}{\jobname.bib}
@article{itest,
  author  = {Lo{\"{\i}}c Rodr{\'{\i}}guez-Calero},
  title   = {Lor{\"{\i}}m {\'{\i}}psum and {\`{\i}}v{\^{\i}}n},
  journal = {Dol{\"{\i}}r s{\'{\i}}t},
  note    = {Am{\"{\i}}t cons{\'{\i}}ctur},
  date    = {2018},
}
@article{Zheng2016,
  author  = {Qinsi Zheng and Steffen Jockusch
             and Gabriel G. Rodr{\'{\i}}guez-Calero
             and Zhou Zhou and Hong Zhao and Roger B. Altman
             and H{\'e}ctor D. Abru{\~n}a and Scott C. Blanchard},
  title   = {Intra-molecular triplet energy transfer is a general
             approach to improve organic fluorophore photostability},
  journal = {Photochemical {\&} Photobiological Sciences},
  year    = {2016},
  volume  = {15},
  number  = {2},
  pages   = {196--203},
  doi     = {10.1039/c5pp00400d},
}
\end{filecontents}

\begin{document}
\parencite{Zheng2016}
\cite{itest}

\printbibliography
\end{document}


Rodríguez-Calero, Loïc (2018). “Lorïm ípsum and ìvîn”. In: Dolïr sít. Amït consíctur.





Why is this Unicode business such an issue?



Unicode combines characters by adding the combining marks after the base glyph. LaTeX works exactly the other way round: the combining accents are added before the glyph (as a macro that gets the base glyph as its argument).



Biber 'parses' the LaTeX character macros and converts them to Unicode characters for sorting and the like. That is done according to simple translations for macros into Unicode points and the complex Unicode rules.



Combining characters involving i are particularly complicated, since LaTeX usually bases its characters upon the 'dotless i' (\i, ı, U+0131) to avoid clashes of accent and tittle, whereas Unicode seems to prefer its combining characters based on the 'small i' (i, U+0069); see http://unicode.org/faq/char_combmark.html#22. That means that \'i gets converted to í (U+00ED), but \'\i to ı́ (U+0131 + U+0301, a combination of the dotless i and the combining acute accent).



LaTeX's inputenc can only deal with a sensible subset of Unicode and fails to account for ı́ (U+0131 + U+0301) while it handles í (U+00ED) just fine.
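As a rough illustration of why the main text of the question compiles while the bibliography does not, here is a minimal sketch (not taken from the question; the test word is made up). In the document body LaTeX expands the accent macros itself, so no combining character ever reaches inputenc. The commented-out line is an assumption about what Biber effectively hands back via the .bbl file.

```latex
% Minimal sketch: compile with pdflatex.
\documentclass{article}
\usepackage[utf8]{inputenc}
\begin{document}
% All three forms are handled by LaTeX's accent machinery or by
% inputenc's mapping for the precomposed U+00ED, so none of them errors:
S\'{\i} \quad S\'i \quad Sí

% Assumption: this is roughly what arrives from the .bbl file.
% Uncommenting a line containing U+0131 followed by U+0301 should
% reproduce the "Unicode char (U+301) not set up" error:
% Sı́
\end{document}
```

The point of the sourcemap above is to turn the second kind of input into the first kind before it is written to the .bbl file.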



See also PLK's explanation in the linked answer as well as comments in https://github.com/plk/biber/issues/65 and https://github.com/plk/biblatex/issues/819.





Another solution that needs no such tricks, but might not be compatible with your workflow, is to use a proper Unicode engine like LuaLaTeX or XeLaTeX and a font that has properly kerned accents (Linux Libertine, for example).
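A minimal sketch of such a setup might look as follows. It assumes that the biblio.bib file from the question is in the working directory and that the font is installed under the name "Linux Libertine O"; both the font name and the reduced biblatex options are assumptions to adjust for your system.

```latex
% Sketch only: compile with lualatex (or xelatex), then biber, then lualatex again.
\documentclass{article}
\usepackage{fontspec}            % replaces inputenc/fontenc on Unicode engines
% Assumption: the font is available under this name on your system.
\setmainfont{Linux Libertine O}
\usepackage[english]{babel}
\usepackage{csquotes}
\usepackage[style=authoryear, backend=biber]{biblatex}
\addbibresource{biblio.bib}      % the .bib file from the question

\begin{document}
\parencite{Zheng2016}

\printbibliography
\end{document}
```

No sourcemap is included here, since a Unicode engine does not go through inputenc at all.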
































  • I didn't know "best solution" was trade marked! I hope you don't have to pay any royalties for the ones you use to provide. ;-)

    – gusbrs
    Jan 10 at 13:18











  • +1: Not just because of the humor. Are there specific interpreters necessary? Python or Perl for example?

    – Dr. Manuel Kuehner
    Jan 10 at 13:22













  • Every time I look, your answer is getting more complicated :) Unicode TeX engines will probably not work (the document is prepared for journal submission). That's also why I don't want to replace the offending characters with proper Unicode characters: I try to keep a single bibliography, and some journals still require BibTeX, so I need the backwards compatibility. The sourcemaps work for me. But, following up on the question above: do I have to expect this to fail on the journal's side, because the journal might not have the specific interpreters that might be needed?

    – Wiebke
    Jan 10 at 14:40






  • @Dr.ManuelKuehner Sorry, I only just saw the question in your comment. No, you only need Biber (which is written in Perl, hence this is all being very Perl-y; Biber is usually installed as a stand-alone executable, which brings its own Perl modules, so no Perl or Python installation is required).

    – moewe
    Jan 10 at 14:58






  • @Wiebke Well, I would not take that gamble. I took the liberty of googling the phrase you quoted and the hits lead me to Elsevier journals. Elsevier have their own document class (which usually loads natbib and is incompatible with biblatex) as well as BibTeX bibliography styles and say: "You are recommended to use the Elsevier article class elsarticle.cls to prepare your manuscript and BibTeX to generate your bibliography. Our LaTeX site has detailed submission instructions, templates and other information."

    – moewe
    Jan 10 at 16:06












Your Answer








StackExchange.ready(function() {
var channelOptions = {
tags: "".split(" "),
id: "85"
};
initTagRenderer("".split(" "), "".split(" "), channelOptions);

StackExchange.using("externalEditor", function() {
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled) {
StackExchange.using("snippets", function() {
createEditor();
});
}
else {
createEditor();
}
});

function createEditor() {
StackExchange.prepareEditor({
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: false,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: null,
bindNavPrevention: true,
postfix: "",
imageUploader: {
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
},
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
});


}
});














draft saved

draft discarded


















StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2ftex.stackexchange.com%2fquestions%2f469555%2funicode-u301-error-in-biblatex-but-not-in-main-text-i%23new-answer', 'question_page');
}
);

Post as a guest















Required, but never shown

























1 Answer
1






active

oldest

votes








1 Answer
1






active

oldest

votes









active

oldest

votes






active

oldest

votes









7














The best solution™ is of course to use the correct Unicode characters (and ideally the precomposed characters: Åström, not a combination of the combining characters: Åström) in the source.



author    = {Qinsi Zheng and Steffen Jockusch and Gabriel G. Rodríguez-Calero
and Zhou Zhou and Hong Zhao and Roger B. Altman and Héctor D. Abruña
and Scott C. Blanchard},


The benefit of this solution is that it is easier to read, just works and avoids the additional braces that BibTeX needs (and that are retained in Biber for simplicity and backwards compatibility, those braces could destroy kerning and are otherwise unnecessary for Biber, see How to write “ä” and other umlauts and accented letters in bibliography? for why they are needed for BibTeX).





If that is not possible and you can't replace {'{i}} with {'i} in the source, you can try a sourcemap as shown in PLK's answer to Input encoding error after upgrading from Biber 1.9 to Biber 2.1.



The logistic drawback of that approach is that you need to add a substitution rule for every possible problematic combination.



To offer some additional benefit over PLK's answer, the code below uses the new loop functionality to replace `{i}, '{i}, ^{i} and "{i} (all Latin-1 dotless-i combinations) for (hopefully) all fields where it makes sense.



\documentclass{article}
\usepackage[english]{babel}
\usepackage[utf8]{inputenc}
\usepackage{csquotes}
\usepackage[style = authoryear, backend = biber, maxbibnames=999]{biblatex}
\addbibresource{\jobname.bib}

\DeclareDatafieldSet{setall}{
  \member[datatype=literal]
  \member[datatype=name]
  \member[field=journal]% journal is special since it is
                        % actually journaltitle
}

\DeclareSourcemap{
  \maps[datatype=bibtex]{
    \map[overwrite, foreach={setall}]{
      % \`{\i}
      \step[fieldsource=\regexp{$MAPLOOP},
            match=\regexp{\x{0131}\x{0300}},
            replace=\regexp{\x{00EC}}]
      % \'{\i}
      \step[fieldsource=\regexp{$MAPLOOP},
            match=\regexp{\x{0131}\x{0301}},
            replace=\regexp{\x{00ED}}]
      % \^{\i}
      \step[fieldsource=\regexp{$MAPLOOP},
            match=\regexp{\x{0131}\x{0302}},
            replace=\regexp{\x{00EE}}]
      % \"{\i}
      \step[fieldsource=\regexp{$MAPLOOP},
            match=\regexp{\x{0131}\x{0308}},
            replace=\regexp{\x{00EF}}]
    }
  }
}

\usepackage{filecontents}
\begin{filecontents}{\jobname.bib}
@article{itest,
  author  = {Lo{\"{\i}}c Rodr{\'{\i}}guez-Calero},
  title   = {Lor{\"{\i}}m {\'{\i}}psum and {\`{\i}}v{\^{\i}}n},
  journal = {Dol{\"{\i}}r s{\'{\i}}t},
  note    = {Am{\"{\i}}t cons{\'{\i}}ctur},
  date    = {2018},
}
@article{Zheng2016,
  author  = {Qinsi Zheng and Steffen Jockusch
             and Gabriel G. Rodr{\'{\i}}guez-Calero
             and Zhou Zhou and Hong Zhao and Roger B. Altman
             and H{\'e}ctor D. Abru{\~n}a and Scott C. Blanchard},
  title   = {Intra-molecular triplet energy transfer is a general
             approach to improve organic fluorophore photostability},
  journal = {Photochemical {\&} Photobiological Sciences},
  year    = {2016},
  volume  = {15},
  number  = {2},
  pages   = {196--203},
  doi     = {10.1039/c5pp00400d},
}
\end{filecontents}

\begin{document}
\parencite{Zheng2016}
\cite{itest}

\printbibliography
\end{document}


Rodríguez-Calero, Loïc (2018). “Lorïm ípsum and ìvîn”. In: Dolïr sít. Amït consíctur.





Why is this Unicode business such an issue?



Unicode combines characters by adding the combining marks after the base glyph. LaTeX works exactly the other way round: the accent comes before the glyph (as a macro that gets the base glyph as its argument).

Biber 'parses' the LaTeX character macros and converts them to Unicode characters for sorting and the like. That is done according to simple translations of macros into Unicode code points and the complex Unicode rules.

Combining characters involving i are particularly complicated since LaTeX usually bases its accented characters upon the 'dotless i' (\i, ı, U+0131) to avoid clashes of accent and tittle, whereas Unicode seems to prefer its combining characters based on the 'small i' (i, U+0069); see http://unicode.org/faq/char_combmark.html#22. That means that \'i gets converted to í (U+00ED), but \'\i to ı́ (U+0131 + U+0301, a combination of the dotless i and the combining acute accent).



LaTeX's inputenc can only deal with a sensible subset of Unicode and fails to account for ı́ (U+0131 + U+0301) while it handles í (U+00ED) just fine.
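This also explains why the same {\'{\i}} is harmless in the main text: there the macro is handled by LaTeX directly and never passes through inputenc as a Unicode character. A minimal sketch to see the difference, assuming pdfLaTeX and a file saved as UTF-8 (the T1 font encoding is my addition and not part of the question):

```latex
\documentclass{article}
\usepackage[T1]{fontenc}
\usepackage[utf8]{inputenc}
\begin{document}
% Macro input: LaTeX builds the accented letter itself,
% so no Unicode character is involved.
Rodr\'{\i}guez and Rodr\'iguez

% Precomposed U+00ED: known to inputenc.
Rodríguez

% Decomposed U+0131 + U+0301, i.e. what ends up in the .bbl for \'{\i}.
% Uncommenting the next line is expected to reproduce the
% "Unicode char (U+301) not set up for use with LaTeX" error.
% Rodrı́guez
\end{document}
```

The first two lines should compile without complaint; only the decomposed sequence trips inputenc.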



See also PLK's explanation in the linked answer as well as comments in https://github.com/plk/biber/issues/65 and https://github.com/plk/biblatex/issues/819.





Another solution that needs no such tricks, but might not be compatible with your workflow, is to use a proper Unicode engine like LuaLaTeX or XeLaTeX and a font that has properly kerned accents (Linux Libertine, for example).
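A minimal sketch of such a setup, to be compiled with lualatex or xelatex followed by biber. The font name Linux Libertine O assumes the OpenType version is installed under that name on your system, and biblio.bib is the file name from the question; adjust both as needed.

```latex
% Compile with: lualatex (or xelatex), biber, lualatex, lualatex
\documentclass{article}
\usepackage{fontspec}            % Unicode font loading, replaces inputenc/fontenc
\setmainfont{Linux Libertine O}  % assumption: installed under this name
\usepackage[english]{babel}
\usepackage{csquotes}
\usepackage[style=authoryear, backend=biber]{biblatex}
\addbibresource{biblio.bib}      % file name taken from the question

\begin{document}
\parencite{Zheng2016}

\printbibliography
\end{document}
```

No sourcemap is needed here, since the engine reads the combining sequence natively and leaves the placement of the accent to the font.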
































  • I didn't know "best solution" was trademarked! I hope you don't have to pay any royalties for the ones you use to provide. ;-)

    – gusbrs
    Jan 10 at 13:18











  • +1: Not just because of the humor. Are there specific interpreters necessary? Python or Perl for example?

    – Dr. Manuel Kuehner
    Jan 10 at 13:22













  • Every time I look, your answer is getting more complicated :) Unicode TeX engines will probably not work (the document is prepared for journal submission). That's also why I don't want to replace the offending characters with proper Unicode characters: I try to keep a single bibliography, and some journals still require BibTeX, so I need the backwards compatibility. The sourcemaps work for me. But, following up on the question above: do I have to expect this to fail on the journal's side, because the journal might not have any specific interpreters that might be needed?

    – Wiebke
    Jan 10 at 14:40






  • @Dr.ManuelKuehner Sorry, I only just saw the question in your comment. No, you only need Biber (which is written in Perl, hence this is all very Perl-y; Biber is usually installed as a stand-alone executable, which brings its own Perl modules, so no Perl or Python installation is required).

    – moewe
    Jan 10 at 14:58






  • @Wiebke Well, I would not take that gamble. I took the liberty of googling the phrase you quoted and the hits led me to Elsevier journals. Elsevier have their own document class (which usually loads natbib and is incompatible with biblatex) as well as BibTeX bibliography styles and say: "You are recommended to use the Elsevier article class elsarticle.cls to prepare your manuscript and BibTeX to generate your bibliography. Our LaTeX site has detailed submission instructions, templates and other information."

    – moewe
    Jan 10 at 16:06
















edited Jan 10 at 14:19

answered Jan 10 at 13:16 by moewe












