Citations and Bibliographies
Suppose you want to be able to define some source, and easily reference it in your document. You also want to produce a list of all such sources — a bibliography.
We’ll show how to implement a simple system for this.
Challenge: tags that need to talk to each other
This is an “interesting” problem to work on because 1) the solution depends so much on the nature and structure of your project, and 2) the tag functions need to be aware of each other somehow.
Consider the following example Pollen markup:
#lang pollen
Paragraphs are important. ◊cite[1]
◊define-citation[1]{Chicago Manual of Style, 15th edition}
◊insert-bibliography[]
In order for the ◊cite tag to produce the right output, it needs to be able to access information in the ◊define-citation tag that has the same number. And the ◊insert-bibliography tag needs to do the same for all the ◊define-citation tags.
Also notice that our ◊define-citation comes after the ◊cite tag that references it.
Remember: the book is a program. This Pollen markup isn’t just a static document; it’s a series of expressions that are evaluated in order. How is the cite tag function supposed to access the output of a define-citation function call that hasn’t even been reached yet?
The Tree of Knowledge
At the point a tag function is called, it “knows” about two things:
- Anything provided by pollen.rkt
- Its own attributes and elements
So whenever you find yourself trying to create a tag function that doesn’t simply transform its own attributes and elements — i.e., that needs to draw on information outside itself — you really have two options:
- Construct (and maintain and provide) the information you need in pollen.rkt
- Save state somewhere and defer processing to a later tag function that can see more of the doc (usually root)
Both of these approaches are valid and idiomatic. The first is simpler, and suited to a project where the same information is used across multiple Pollen sources. But sometimes it isn’t an option; sometimes the information you need can only be found elsewhere in the same document. That’s when the second approach is needed.
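To make the first approach concrete, here is a minimal sketch of what it could look like. It is not this book's code: the sources table, the span markup, and the "unknown source" fallback are all assumptions made for illustration.

```racket
#lang racket/base
;; Hypothetical pollen.rkt fragment illustrating the first approach.
(require racket/format) ; for ~a

(provide sources cite)

;; Assumed project-wide table of sources, keyed by reference id.
(define sources
  (hash "1" "Chicago Manual of Style, 15th edition"))

;; ◊cite[1] arrives as (cite 1). The table already exists before any
;; Pollen source is evaluated, so the tag can resolve itself right away.
(define (cite ref)
  (define key (~a ref))
  `(span ((class "citation")
          (title ,(hash-ref sources key "unknown source")))
         ,(format "[~a]" key)))
```

With this arrangement every Pollen source in the project sees the same table, which is why the approach only fits information that can live outside the individual document.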
Our implementation
I’m going to explain how this book puts all of the above into practice. Remember, to see the code itself, check out the contents of citations.rkt.
Defining a citation: The ◊define-citation tag emits a cite-def X-expression with a ref attribute.
Referencing a citation: The ◊cite tag likewise emits a cite X-expression with a ref attribute.
Inserting a bibliography: The ◊insert-bibliography tag inserts '(bib). That’s it, that’s the tweet.
These tag functions don’t actually do much; they just leave behind little “marker” X-expressions in the doc that don’t actually give us the functionality we want — not by themselves.
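As a rough sketch, marker-emitting tag functions of this kind could be written as below. This is an approximation based on the description above, not the actual contents of citations.rkt; in particular, converting the reference to a string with ~a is an assumption.

```racket
#lang racket/base
;; Approximate marker tag functions; the real citations.rkt may differ.
(require racket/format) ; for ~a

;; ◊cite[1] arrives as (cite 1). Record only which source was cited.
(define (cite ref)
  `(cite ((ref ,(~a ref)))))

;; ◊define-citation[1]{...} arrives as (define-citation 1 "...").
;; Keep the elements so a later pass can read them back out of the doc.
(define (define-citation ref . elements)
  `(cite-def ((ref ,(~a ref))) ,@elements))

;; ◊insert-bibliography[] just marks where the list should go.
(define (insert-bibliography)
  '(bib))
```

None of these functions looks at anything beyond its own arguments, which is why the order of the tags in the source file no longer matters.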
Recall we’re taking the second approach above: saving the information we’ll need, and deferring further processing until the root function. Where do we save the information? We’re saving it right in the doc.
Here’s what the doc from our example markup looks like after these tag functions have been called, but before the root function is called:
'(root "Paragraphs are important. " (cite [[ref "1"]]) (cite-def [[ref "1"]] "Chicago Manual of Style, 15th edition") (bib))
When the root function is called, it has visibility into the entire doc. If you look at this function in pollen.rkt you’ll see it calls citations-root-handler from citations.rkt. This function does two things:
- Splits out the cite-def X-expressions from the doc using splitf-txexpr and converts them into a hash table
- Using decode, replaces all the cite X-expressions with tooltips containing the matching entry from the hash table; and replaces any occurrence of '(bib) with a list of all the entries from that hash table.
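A handler along these lines might look roughly like the sketch below. It assumes the txexpr and pollen packages are installed; the span and ul markup, the helper names tag-is? and source-text, and the plain string ordering of the bibliography are illustrative guesses rather than what citations.rkt actually does.

```racket
#lang racket/base
;; Sketch of a root handler; not the actual code from citations.rkt.
(require txexpr         ; splitf-txexpr, attr-ref, get-tag, get-elements
         pollen/decode) ; decode

;; Hypothetical helper: predicate matching tagged X-expressions by tag.
(define (tag-is? name)
  (λ (x) (and (txexpr? x) (eq? (get-tag x) name))))

;; Hypothetical helper: tooltip text is built from string elements only.
(define (source-text elements)
  (apply string-append (filter string? elements)))

(define (citations-root-handler doc)
  ;; Pull every cite-def out of the doc and index them by ref.
  (define-values (stripped-doc cite-defs)
    (splitf-txexpr doc (tag-is? 'cite-def)))
  (define sources
    (for/hash ([def (in-list cite-defs)])
      (values (attr-ref def 'ref) (get-elements def))))
  ;; Swap each remaining marker for its final markup.
  (define (resolve tx)
    (case (get-tag tx)
      [(cite)
       (let ([ref (attr-ref tx 'ref)])
         `(span ((class "citation")
                 (title ,(source-text (hash-ref sources ref '()))))
                ,(format "[~a]" ref)))]
      [(bib)
       `(ul ((class "bibliography"))
            ,@(for/list ([ref (in-list (sort (hash-keys sources) string<?))])
                `(li ,(format "[~a] " ref) ,@(hash-ref sources ref))))]
      [else tx]))
  (decode stripped-doc #:txexpr-proc resolve))
```

Because the table is built from the whole doc before any cite is resolved, a ◊define-citation that appears after its ◊cite is handled the same as one that appears before it.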
What else could we do?
- Omit the ◊insert-bibliography tag and just insert a bibliography at the end automatically if there are any uses of ◊define-citation.
- There is a good reason for the ◊cite tag to leave behind a marker in the document: the location in the document where the citation occurred is part of the info we need to keep track of. But the same is not true of ◊define-citation: the location where the citation is defined is not useful or needed. So rather than leaving a marker, ◊define-citation could simply return an empty string and add its contents to a hash table. Then you wouldn’t need step 1 above; by the time root is called, the hash table would be all ready for you and there’d be no cite-def X-expressions lying around to clean out of the doc.
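The second idea could be sketched as follows. The citations table is a hypothetical module-level variable, and whether it starts out empty for each page depends on how the project is rendered, so treat that part as an assumption to verify.

```racket
#lang racket/base
;; Sketch of the hash-table variant; all names here are illustrative.
(require racket/format) ; for ~a

;; Assumed module-level table that the root function would read later.
(define citations (make-hash))

;; Record the source and leave nothing behind in the doc.
(define (define-citation ref . elements)
  (hash-set! citations (~a ref) elements)
  "")

;; ◊cite still leaves a marker, because its position in the doc matters.
(define (cite ref)
  `(cite ((ref ,(~a ref)))))
```

The root function would then look entries up in citations directly instead of first splitting cite-def X-expressions out of the doc.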