Commit 0b0faf92 authored by User expired

Restrict allowed field names to avoid wrong matches

Only the English alphabet, - and _ are allowed in field names
in order to avoid erroneous detection in multi line contents.
parent d6ff9a20
Pipeline #26392 passed with stage in 2 minutes and 32 seconds
......@@ -3,7 +3,7 @@ Command line tools for bibtex citations
=======================================
:Author: Bjoern Bastian <bjoern.bastian@uibk.ac.at>
-:Date: 2019-09-09
+:Date: 2019-11-26
This projects contains several shell scripts for the following tasks.
......@@ -15,18 +15,19 @@ This projects contains several shell scripts for the following tasks.
- Extract single records
- Output text citations for copy and paste
.. contents::
The html version of this file is created with ``python-docutils``::
make README.html
Usage
=====
See online documentation in `bibtools.pdf`_ or build ``doc/bibtools.pdf``
yourself with ``make doc``.
-Setup
-=====
+Setup on Linux
+==============
- GNU AWK is required. For Ubuntu you may install the ``gawk`` package::
sudo apt-get install gawk
......@@ -47,4 +48,13 @@ Setup
make uninstall
+Notes on the BibTeX format
+==========================
+- BibTeX entries are formatted in a unique way with sorted tags (field names)
+  and contents enclosed in curly braces.
+- Tags must start with a letter from the English alphabet and otherwise may
+  only contain the additional characters ``-`` and ``_``.
+- New BibTeX files may enclose contents in quotation marks, but string
+  concatenation with ``#`` is not supported.
.. _bibtools.pdf: https://bbastian.pavo.uberspace.de/files/bibtools/bibtools.pdf
......@@ -103,9 +103,9 @@ do
# Use curly brackets {} instead of double quotes "" or no quotes.
# FIXME: Multi line entries with double quotes are not treated correctly.
- sed -i -r 's/^ ([A-Za-z][^ ={("]*[^ ={"]) = "/ \1 = {/;s/" *,$/},/;T;s/"$/}/' "$tmpfile"
+ sed -i -r 's/^ ([A-Za-z][A-Za-z_-]*) = "/ \1 = {/;s/" *,$/},/;T;s/"$/}/' "$tmpfile"
sed -i '$s/"}$/}\n}/' "$tmpfile"
- sed -i -r 's/^ ([A-Za-z][^ ={("]*[^ ={"]) = ([^{].*[^},])(,*)$/ \1 = {\2}\3/' "$tmpfile"
+ sed -i -r 's/^ ([A-Za-z][A-Za-z_-]*) = ([^{].*[^},])(,*)$/ \1 = {\2}\3/' "$tmpfile"
# Remove preceding and trailing whitespace in field contents.
sed -i 's/ *\(},\?\)$/\1/' "$tmpfile"
......
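The effect of the stricter tag pattern can be checked in isolation. The following is a minimal sketch, not the project's script: it feeds one hypothetical continuation line from a multi-line field (the sample text and the names `line`, `old_tag`, `new_tag` and `wrap` are assumptions made for illustration) through a simplified copy of the brace-wrapping substitution from the diff, once with the old tag pattern and once with the new one. It uses `sed -E`, which GNU sed treats the same as `-r`.

```shell
#!/bin/sh
# Hypothetical continuation line inside a multi-line field (not from the project).
line=' k_B*T = 25 meV at room temperature and'

# Tag patterns modelled on the removed and the added sed lines.
old_tag='[A-Za-z][^ ={("]*[^ ={"]'
new_tag='[A-Za-z][A-Za-z_-]*'

wrap() {
    # If the line looks like "tag = value", wrap the unquoted value in braces.
    # $1 is the tag pattern to use; lines that do not match pass through unchanged.
    printf '%s\n' "$line" |
        sed -E "s/^ ($1) = ([^{].*[^},])(,*)\$/ \\1 = {\\2}\\3/"
}

old_result=$(wrap "$old_tag")
new_result=$(wrap "$new_tag")

printf 'old: %s\n' "$old_result"
printf 'new: %s\n' "$new_result"
```

With the old pattern, `k_B*T` is accepted as a field name and the rest of the line is wrapped in braces, so the output is `old:  k_B*T = {25 meV at room temperature and}`. With the new pattern the `*` cannot be part of a tag, the substitution does not apply, and the line is printed unchanged.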