Citations and Bibliographies
Suppose you want to be able to define some source, and easily reference it in your document. You also want to produce a list of all such sources — a bibliography.
We’ll show how to implement a simple system for this.
Challenge: tags that need to talk to each other
This is an “interesting” problem to work on because 1) the solution depends so much on the nature and structure of your project, and 2) the tag functions need to be aware of each other somehow.
Consider the following example Pollen markup:
#lang pollen Paragraphs are important. ◊cite[1] ◊define-citation[1]{Chicago Manual of Style, 15th edition} ◊insert-bibliography[]
In order for the ◊cite
tag to produce the
right output, it needs to be able to access information in the ◊define-citation
tag that has the same number. And the
◊insert-bibliography
tag needs to do the same for
all the ◊define-citation
tags.
Also notice that our ◊define-citation
comes
after the ◊cite
tag that references it.
Remember: the book is a program. This Pollen markup isn’t just a static document; it’s a series
of expressions that are evaluated in order. How is the cite
tag function supposed
to access the output of a define-citation
function call that hasn’t even been
reached yet?
The Tree of Knowledge
At the point a tag function is called, it “knows” about two things:
- Anything
provide
d bypollen.rkt
- Its own attributes and elements
So whenever you find yourself trying to create a tag function that doesn’t simply transform its own attributes and elements — i.e., that needs to draw on information outside itself — you really have two options:
- Construct (and maintain and provide) the information you need in
pollen.rkt
- Save state somewhere and defer processing to a later tag function that can see more of
the doc (usually
root
)
Both of these approaches are valid and idiomatic. The first is simpler, and suited to a project where the same information is used across multiple Pollen sources. But sometimes it isn’t an option; sometimes the information you need can only be found elsewhere in the same document. That’s when the second approach is needed.
Our implementation
I’m going to explain how this book puts all of the above into practice. Remember, to see the code itself, check out the contents of citations.rkt.
Defining a citation: The ◊define-citation
tag emits a cite-def
X-expression
with a ref
attribute.
Referencing a citation: The ◊cite
tag likewise emits a cite
X-expression with a
ref
attribute.
Inserting a bibliography: The ◊insert-bibliography
tag inserts '(bib)
. That’s it,
that’s the tweet.
These tag functions don’t actually do much; they just leave behind little “marker” X-expressions in the doc that don’t actually give us the functionality we want — not by themselves.
Recall we’re taking the second approach above: saving the information we’ll need, and
deferring further processing until the root
function. Where do we save the
information? We’re saving it right in the doc.
Here’s what the doc from our example markup looks like after these tag functions
have been called, but before the root
function is called:
'(root "Paragraphs are important. " (cite [[ref "1"]]) (cite-def [[ref "1"]] "Chicago Manual of Style, 15th edition") (bib))
When the root
function is called, it has visibility into the entire doc. If you
look at this function in pollen.rkt you’ll see it calls citations-root-handler
from
citations.rkt. This function does three things:
- Splits out the
cite-def
X-expressions from the doc usingsplitf-txexpr
and converts them into a hash table - Using
decode
, replaces all thecite
X-expressions with tooltips containing the matching entry from the hash table; and replaces any occurrence of'(bib)
with a list of all the entries from that hash table.
What else could we do?
- Omit the
◊insert-bibliography
tag and just insert a bibliography at the end automatically if there are any uses of◊define-citation
. - There is a good reason for the
◊cite
tag to leave behind a marker in the document: the location in the document where the citation occurred is part of the info we need to keep track of. But the same is not true of◊define-citation
: the location where the citation is defined is not useful or needed. So rather than leaving a marker,◊define-citation
could simply return an empty string and add its contents to a hash table. Then you wouldn’t need step 1 above; by the timeroot
is called, the hash table would be all ready for you and there’d be nocite-def
X-expressions lying around to clean out of the doc.