1. BIBFRAME and Moving Away From MARC
Linked Data: what cataloguers need to know #cigld
CILIP Cataloguing and Indexing Group (CIG)
20 February 2015
Thomas Meehan
tom@aurochs.org @orangeaurochs
4. Why? (1)
• On the Record / Library of Congress Working
Group on the Future of Bibliographic Control
(January 2008)
• Report and Recommendations of the U.S. RDA
Test Coordinating Committee (June 2011)
"Demonstrate credible progress towards a
replacement for MARC".
5. Why? (2)
Storage
Manipulation
Display
Input
Exchange and distribution
Publication
"Lingua franca of library cataloguing"
"foundation for the future of bibliographic description
that happens on the web and in the networked world"
7. Why? (3)
Conversion of MARC or replacement of MARC?
For all libraries everywhere?
For all bibliographic data?
8. Who
Library of Congress
Partners (Early Experimenters),
among others:
• British Library,
• Deutsche Nationalbibliothek,
• George Washington University,
• National Library of Medicine,
• OCLC,
• and Princeton University
Consultants:
• Zepheira
Implementers and Testers,
among others:
• National Library of Medicine
(decided to fork)
• George Washington University
• Biblioteca Nacional de Cuba
“José Martí” (BNJM)
• Stanford University
10. FRBR and BIBFRAME Models
FRBR: Work, Expression, Manifestation, Item
BIBFRAME: Work, Instance, Annotation
11. BIBFRAME Model: Resource
A BIBFRAME Resource can be anything: a Work, Instance, Authority, or
Annotation
bf:authorizedAccessPoint
bf:identifier
bf:label
bf:relatedTo
http://bibframe.org/vocab/Resource.html
12. BIBFRAME Model: Work 1/2
Work: A resource reflecting a conceptual essence of the cataloging resource. (A FRBR Work/Expression)
bf:classificationLcc
bf:contains
bf:creator
bf:hasDerivative
bf:note
bf:language
bf:originalVersion
bf:relatedWork
bf:series
bf:title
bf:workTitle
bf:subject
14. BIBFRAME Model: Instance 1/2
Instance: A resource reflecting an individual, material embodiment of
the Work. (A FRBR Manifestation)
bf:contributor
bf:dimensions
bf:extent
bf:isbn10
bf:isbn13
bf:isbn
bf:publication
bf:instanceOf
bf:titleStatement
http://bibframe.org/vocab/Instance.html
15. BIBFRAME Model: Instance 2/2 (pub)
ex:wk17082740 a bf:Work ;
bf:hasInstance _:inst001 .
_:inst001 a bf:Instance ;
bf:publication _:pub002 .
_:pub002 a bf:Provider ;
bf:copyrightDate "c1980.";
bf:providerName _:provName003 ;
bf:providerPlace _:provPlace004 .
_:provName003 a bf:Organization ;
bf:label "D.W. Teske" .
_:provPlace004 a bf:Place ;
bf:label "Manchester, Iowa" .
(Example adapted from http://bibframe.org/vocab/publication.html)
16. BIBFRAME Model: Authority (Person)
Authority: Representation of a key concept or thing. Works and Instances, for
example, have defined relationships to these concepts and things.
bf:hasAuthority
bf:authorizedAccessPoint
http://bibframe.org/vocab/Authority.html
http://bibframe.org/documentation/bibframe-authority/
17. More on BIBFRAME Authorities
Direct Approach
ex:wk666 a bf:Work ;
bf:creator <http://id.loc.gov/authorities/names/n79049248> .
Indirect Approach, or, The lightweight abstraction layer
ex:wk666 a bf:Work ;
bf:creator ex:person99 .
ex:person99 a bf:Person ;
bf:authorizedAccessPoint "Waugh, Evelyn, 1903-1966." ;
bf:hasAuthority <http://id.loc.gov/authorities/names/n79049248> .
18. Even More on BIBFRAME Authorities
Direct: Work → creator → LC Authority
Indirect: Work → creator → Bibframe Authority → hasAuthority → LC Authority
19. What is a BIBFRAME Person?
A. A person
B. An identity
C. A controlled personal name
D. An RDF document
20. BIBFRAME Model: Annotation
Annotation: Resource that asserts additional information about other BIBFRAME
resource.
bf:annotates
bf:annotationAssertedBy
bf:annotationBody
bf:annotationSource
bf:annotationDate
http://bibframe.org/vocab/Annotation.html
http://bibframe.org/documentation/annotations/
21. BIBFRAME Annotations Example
ex:wk005 a bf:Work ;
bf:hasAnnotation ex:ann010 .
ex:ann010 a bf:Summary ;
bf:annotates ex:wk005 ;
bf:annotationAssertedBy <http://id.loc.gov/vocabulary/organizations/ukluc> ;
bf:annotationSource <http://dbpedia.org/resource/Dbpedia> ;
bf:summary <http://dbpedia.org/resource/Decline_and_Fall> ;
bf:annotationDate "20131125" ;
bf:startOfSummary "Decline and Fall is a novel by the English author Evelyn Waugh, first published in 1928. It was Waugh's first published novel; an earlier attempt, entitled The Temple at Thatch, was destroyed by Waugh while still in manuscript form. Decline and Fall is based in part on Waugh's undergraduate years at Hertford College, Oxford, and his experience as a teacher in Wales. It is a social satire that employs the author's characteristic black humour in lampooning various features of British society in the 1920s. The novel's title is a contraction of Edward Gibbon's The History of the Decline and Fall of the Roman Empire." .
22. BIBFRAME Example
MARC
01260cam a2200265 a 4500
001 14920419
005 20090827103824.0
008 070709s2008 ilua b 001 0 eng
010 $a 2007027845
020 $a0838909507 (pbk. : alk. paper)
020 $a9780838909508 (pbk. : alk. paper)
040 $aDLC$cDLC$dDLC
050 00 $aZ666.6$b.M39 2008
082 00 $a025.3$222
100 1 $aMaxwell, Robert L.,$d1957-
245 10 $aFRBR :$ba guide for the perplexed /$cRobert L. Maxwell.
260 $aChicago :$bAmerican Library Association,$c2008.
300 $avii, 151 p. :$bill. ;$c23 cm.
500 $aIncludes bibliographic references and index.
505 0 $aThe entity-relationship model -- The FRBR entities -- Relationships -- User tasks -- The
FRBR model and the existing MARC and AACR2-based cataloging model.
650 0 $aFRBR (Conceptual model)
856 41 $3Table of contents only$uhttp://www.loc.gov/catdir/toc/ecip0722/2007027845.html
906 $a7$bcbc$corignew$d1$eecip$f20$gy-gencatlg
925 0 $aacquire$b2 shelf copies$xpolicy default
955 $alh39 2007-07-06$ilh39 2007-07-06$elh39 2007-07-09 to CIP (Dewey completed)$aps10
2008-02-14 1 copy rec'd., to CIP ver.$fpv06 2008-02-29 (telework) CIP ver to BCCD$ald11 2008-09-04
copy 2 added
23. BIBFRAME Example
Turtle
@prefix bf: <http://bibframe.org/vocab/> .
@prefix madsrdf: <http://www.loc.gov/mads/rdf/v1#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix relators: <http://id.loc.gov/vocabulary/relators/> .
@prefix xml: <http://www.w3.org/XML/1998/namespace> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
<http://id.loc.gov//resources/bibs/14920419> a bf:Text,
bf:Work ;
bf:authorizedAccessPoint "Maxwell, Robert L., 1957- FRBR :a guide for the perplexed",
"maxwellrobertl1957frbraguidefortheperplexedengworktext"@x-bf-hash ;
bf:classification [ a bf:Classification ;
bf:classificationEdition "22",
"full" ;
bf:classificationNumber "025.3" ;
bf:classificationScheme <http://id.loc.gov/authorities/classSchemes/ddc> ;
bf:label "025.3" ] ;
bf:classificationLcc <http://id.loc.gov/authorities/classification/Z666.6> ;
bf:creator [ a bf:Person ;
bf:authorizedAccessPoint "Maxwell, Robert L., 1957-" ;
bf:hasAuthority [ a madsrdf:Authority ;
madsrdf:authoritativeLabel "Maxwell, Robert L., 1957-" ] ;
bf:label "Maxwell, Robert L., 1957-" ] ;
bf:derivedFrom <http://id.loc.gov//resources/bibs/14920419.marcxml.xml> ;
bf:hasAnnotation [ a bf:TableOfContents ;
bf:annotates <http://id.loc.gov//resources/bibs/14920419> ;
bf:label "Table of contents only" ;
bf:tableOfContents <http://www.loc.gov/catdir/toc/ecip0722/2007027845.html> ],
[ a bf:Annotation ;
bf:annotates <http://id.loc.gov//resources/bibs/14920419> ;
bf:changeDate "2009-08-27T10:38" ;
bf:derivedFrom <http://id.loc.gov//resources/bibs/14920419.marcxml.xml> ;
bf:descriptionConventions <http://id.loc.gov/vocabulary/descriptionConventions/aacr2> ;
bf:descriptionModifier <http://id.loc.gov/vocabulary/organizations/dlc> ;
bf:descriptionSource <http://id.loc.gov/vocabulary/organizations/dlc> ;
bf:generationProcess "DLC transform-tool:2015-01-16-T11:00:00" ] ;
bf:hasInstance [ a bf:Instance,
bf:Monograph ;
bf:contentsNote "The entity-relationship model -- The FRBR entities -- Relationships -- User tasks -- The FRBR model and the existing MARC and AACR2-based cataloging model." ;
bf:derivedFrom <http://id.loc.gov//resources/bibs/14920419.marcxml.xml> ;
bf:dimensions "23 cm." ;
bf:format "paperback" ;
bf:heldItem [ a bf:HeldItem ;
bf:label "Z666.6 .M39 2008" ;
bf:shelfMarkLcc "Z666.6 .M39 2008" ] ;
bf:illustrationNote "ill. ;" ;
bf:instanceOf <http://id.loc.gov//resources/bibs/14920419> ;
bf:instanceTitle [ a bf:Title ;
bf:subtitle "a guide for the perplexed " ;
bf:titleValue "FRBR :" ] ;
bf:isbn10 <http://isbn.example.org/0838909507> ;
bf:isbn13 <http://isbn.example.org/9780838909508> ;
37. Make Your Own BIBFRAME Examples
1. BIBFRAME Compare:
http://bibframe.org/tools/compare/
2. Enter an LC system number (e.g. 10342843).
3. Click Run Comparison
4. Figure out what's going on.
5. Print out and colour in.
38. Things you could do next…
Learn more about linked data
Read the BIBFRAME list
Consider the implications
Add $0 to headings
Concentrate on headings
SPARQL (e.g. BL and Oslo Public Library)
Look at BIBFRAME transformed examples
Look at Worldcat examples
Find other library linked data
Find non-library linked data
Play!
https://www.flickr.com/photos/34905030@N00/14532041398/
39. Find Out More
BIBFRAME.org
http://bibframe.org/
BIBFRAME Model and vocabulary.
http://bibframe.org/vocab/
BIBFRAME Model Primer (pdf).
http://www.loc.gov/bibframe/pdf/marcld-report-11-21-2012.pdf
BIBFRAME mailing list.
http://listserv.loc.gov/listarch/bibframe.html
40. References
BIBFRAME
• BIBFRAME AV Modeling Study: Defining a Flexible Model for Description of Audiovisual Resources / Kara Van Malssen, AVPreserve (May 2014). http://www.loc.gov/bibframe/pdf/bibframe-avmodelingstudy-may15-2014.pdf
• BIBFRAME diagram. http://bibframe.org/vocab/
• A Bibliographic Framework for the Digital Age (October 31, 2011). http://www.loc.gov/bibframe/news/framework-103111.html
• NLM announcement from BIBFRAME mailing list (November 2014). http://listserv.loc.gov/cgi-bin/wa?A2=ind1411&L=bibframe&T=0&P=12997
• On the Record / Library of Congress Working Group on the Future of Bibliographic Control (January 2008).
http://www.loc.gov/bibliographic-future/news/lcwg-ontherecord-jan08-final.pdf
• Report and Recommendations of the U.S. RDA Test Coordinating Committee. Executive Summary (June 2011).
http://www.loc.gov/bibliographic-future/rda/source/rda-execsummary-public-13june11.pdf
• FRBR as Cake / Karen Coyle (Coyle's InFormation, 2011) http://kcoyle.blogspot.co.uk/2011/04/frbr-as-cake.html
• Open Annotations Model / W3C. http://www.w3.org/ns/oa
TEL, Oslo, Bibliograph
• The European Library. Research Libraries UK Linked Open Data.
http://www.theeuropeanlibrary.org/tel4/access/data/lod
• OCLC. BiblioGraph.net. http://www.bibliograph.net
• Oslo Public Library SPARQL endpoint. http://data.deichman.no/sparql
• RDF Linked Data Cataloguing at Oslo Public Library / Asgeir Rekkavik (July 2014).
http://digital.deichman.no/blog/2014/07/06/rdf-linked-data-cataloguing-at-oslo-public-library/
Editor's Notes
I'm going to look at the:
Background
The model
Some of the issues as we go along.
Some other ways of doing it!
Short for BIBliographic FRAMEwork initiative
Not an acronym!
Also,
Bibflow
LibHub
MarcNext
I alluded earlier to the problems with MARC and RDA in particular that LC and US testers identified…
"The library community's data carrier, MARC, is based on forty-year old techniques for data management and is out of step with programming styles of today. No community other than the library community uses this record format, severely compromising its utility to other communities as a data transmission tool. Bibliographic applications being developed outside of the library environment are not making use of, and may not be compatible with, records encoded in MARC. New and anticipated uses of bibliographic data require a format that will accommodate and distinguish expert-, automated-, and user-generated metadata, including annotations (reviews, comments) and usage data. Flexible design should allow for the selective (modular) use of metadata in different environments (e.g., use of controlled vocabularies appropriate to specific domains). The existing Z39.2/MARC “stack” is not an appropriate starting place for a new bibliographic data carrier because of the limitations placed upon it by the formats of the past."—On the Record. Jan. 2008
"Many survey respondents expressed doubt that RDA changes would yield significant benefits without a change to the underlying MARC carrier.
"Most felt any benefits of RDA would be largely unrealized in a MARC environment. MARC may hinder the separation of elements and ability to use URIs in a linked data environment.
"Demonstrate credible progress towards a replacement for MARC"
-- Report and Recommendations of the U.S. RDA Test Coordinating Committee. Executive Summary –June 2011
"Bibliographic framework is intended to indicate an environment rather than a "format"", according to A Bibliographic Framework for the Digital Age (October 2011). However, most people, when they talk about Bibframe, are in fact talking about this idea, which is what I'm going to focus on.
Storage - Possibly
Manipulation - Yes
Display – No (beyond actual data). Not like AACR2/RDA and MARC do.
Input – MARC is basically an input screen in its own right. Bibframe editor could have been done years ago.
Exchange and distribution – Yes, on the web
Publication – Yes, on the web
"Lingua franca of library cataloguing" – Hopefully not!
MARC has a lot more purposes than it needs. This makes it hard for Bibframe if it tries to replace MARC.
"foundation for the future of bibliographic description that happens on the web and in the networked world"
AMBITIOUS! Is it too ambitious?
Conversion of MARC or replacement of MARC? – See the implications of the previous slide. Many purposes.
For all libraries everywhere? For LC or for all academic libraries? For national libraries? Your local public library, school libraries?
For all bibliographic data? – For repositories, digital image libraries, commercial data, LibraryThing, article databases, discovery system knowledge bases, etc. For books, AV, serials, anything held by a library? On what system: ILSs, as an additional nicety, something else entirely? Is this all wise?
BIBFRAME is fundamentally "an initiative of the Library of Congress".
Zepheira are consultants with expertise and experience with the Semantic Web. Worked with OCLC on schema.org as well as a new project with University of California, Davis to "investigate the future of research library operations, particularly the production of metadata — or data on data — and deployment on the Web." Its president, Eric Miller, has been prominent in the development of the Semantic Web and RDF.
Interestingly not a formal collaborative effort.
Zepheira are no longer formally involved although they have a number of related projects, including training. The emphasis in 2015 has been on keeping the format stable and encouraging libraries to test.
I understand that something akin to the MARC management structure is likely at some point.
This is the basic Bibframe model. Bits of it look like FRBR, but the Expression in particular is missing.
The Sir Humphrey version:
"BIBFRAME has worked on modelling works as Works within the BIBFRAME model, similar to the RDA modelling work, itself modelled on the work on the FRBR model of Works and Expressions. A BIBFRAME Work is a creative work, perhaps a FRBR Work, or an RDA FRBR Work but it also expresses a FRBR Expression, and of course an RDA FRBR Expression. A Work may express another Work based on others’ work, not just a FRBR Work or an RDA Work. That also works. FRBR Works or RDA Works expressed as BIBFRAME Works can relate to FRBR Expressions (BIBFRAME Works or RDA Expressions). So, Works are works that can be Works but also Expressions linked to Works that really are Works."
Also similar to schema.org Work/? Split
Schema.org has vaguely similar Work/Instance split.
Oslo PL has similar Work/Manifestation split.
The LC A/V Bibframe Modelling report suggested something different, creating an (unintentional/natural) Event category on the same level as Work and placing them both within something called Content. This report also has references to a lot of models that are not strictly FRBR or Bibframe or …
A Bibframe resource can be anything, in the same way that an RDF resource is anything that can be given a URI.
authorizedAccessPoint: "Controlled string form of a resource label intended to help uniquely identify it, such as a unique title or a unique name plus title." For the resource, not one of its access points. [Literal]
identifier: Number or code that uniquely identifies an entity. [identifier]
label: "Text string expressing the property value." [Literal]
relatedTo: "Any relationship between resources." [URL]
classificationLcc: Other classifications are available.
contains: "Work that is a discrete component of a larger work."
creator: Creator role / Generalized creative responsibility role.
hasDerivative: "Work has a modification for which it is the source. Work that has been translated, i.e., the text expressed in a language different from that of the original work."
NB. expressionOf and hasExpression!
hasInstance: "Work has a related Instance/manifestation."!
bf:title: a string
workTitle: a Title object
contributor: Contributor role / Generalized expressive responsibility role.
dimensions: "Measurements of the carrier or carriers and/or the container of a resource."
isbn, isbn10, isbn13: More specific than bf:identifier. Identifier is a specific type of resource, with these properties:
identifierAssigner: Identifier assigner. [Literal]
identifierQualifier: Identifier qualifier. [Literal]
identifierScheme: Identifier scheme. [URL]
identifierStatus: Identifier status. [Literal]
identifierValue: Identifier value. [Literal]
bf:publication: Seems to have replaced the previous attributes bf:placePub, bf:provider, bf:pubDate to provide a more event-based model (like the BL's). See example on next slide.
titleStatement: "Title transcribed from an instance."
For FRBR, note instanceOf. Also, at least in examples, the split between creator for work, and contributor for instances.
This is a brief example to give you an idea of the model as a whole, including Publication modelled as an event.
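To make the event-style shape concrete, here is a minimal sketch in Python that writes out the same pattern as the slide: the Instance points at a publication node, which in turn points at separate name and place nodes. The function names are my own invention for illustration, the blank node labels are simply copied from the slide, and the output assumes a bf: prefix has been declared elsewhere.

```python
def turtle_literal(text):
    """Quote a Python string as a simple Turtle literal, escaping backslashes and quotes."""
    return '"' + text.replace("\\", "\\\\").replace('"', '\\"') + '"'


def publication_turtle(place, name, copyright_date):
    """Render one publication 'event' as Turtle text (hypothetical helper, not an LC tool).

    The Instance does not carry place, publisher and date directly; it links to a
    bf:Provider node, which links on to an Organization and a Place.
    """
    lines = [
        "_:inst001 a bf:Instance ;",
        "    bf:publication _:pub002 .",
        "_:pub002 a bf:Provider ;",
        f"    bf:copyrightDate {turtle_literal(copyright_date)} ;",
        "    bf:providerName _:provName003 ;",
        "    bf:providerPlace _:provPlace004 .",
        "_:provName003 a bf:Organization ;",
        f"    bf:label {turtle_literal(name)} .",
        "_:provPlace004 a bf:Place ;",
        f"    bf:label {turtle_literal(place)} .",
    ]
    return "\n".join(lines)


# Values taken from the slide example.
print(publication_turtle("Manchester, Iowa", "D.W. Teske", "c1980."))
```

The point of the sketch is only the nesting: three linked nodes where MARC 260 has three subfields in one field.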
Example of Bibframe Authority.
Previous properties included resourceRole, isni, orcid, and viaf. resourceRole is tricky.
EXAMPLE ISSUE 1: AUTHORITIES
There was some debate on this issue, of how they should be handled.
The first is what most people would assume would happen and is the classic linked data approach. Indeed, it's basically a recreation of the first example I showed you this morning.
The second is Bibframe's attempt to deal with some essentially practical problems:
Local indexing
Local variation
Allows augmentation from other sources
Deals with people not in authority files without a pre-existing URI
Many of these things are, however, possible simply by making use of existing techniques. There is also the danger that the notion of an authority is confused with that of a person. They are not the same thing.
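As a rough illustration of what the extra layer means for anyone consuming the data, the sketch below holds the two approaches as plain (subject, predicate, object) tuples and resolves a Work's creator to the LC authority URI either way. The tuple representation and the function name are assumptions of mine for the example, not anything defined by Bibframe.

```python
LC_WAUGH = "http://id.loc.gov/authorities/names/n79049248"

# Direct approach: the Work points straight at the LC authority URI.
direct = [
    ("ex:wk666", "bf:creator", LC_WAUGH),
]

# Indirect approach: the Work points at a local bf:Person, which has an authority.
indirect = [
    ("ex:wk666", "bf:creator", "ex:person99"),
    ("ex:person99", "bf:authorizedAccessPoint", "Waugh, Evelyn, 1903-1966."),
    ("ex:person99", "bf:hasAuthority", LC_WAUGH),
]


def creator_authority(triples, work):
    """Return the authority URI for a Work's first creator, following bf:hasAuthority if present."""
    def objects(subject, predicate):
        return [o for (s, p, o) in triples if s == subject and p == predicate]

    for creator in objects(work, "bf:creator"):
        linked = objects(creator, "bf:hasAuthority")
        # Indirect: hop through the local node. Direct: the creator is already the authority.
        return linked[0] if linked else creator
    return None


print(creator_authority(direct, "ex:wk666"))
print(creator_authority(indirect, "ex:wk666"))
```

Both calls end up at the same URI; the indirect form costs one extra hop but leaves room for a local access point on the intermediate node.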
Which of these is it? There is evidence to suggest that it could be several.
Person page says: "Individual or identity established by an individual (either alone or in collaboration with one or more other individuals)" but…
Authority says: "Representation of a key concept or thing. Works and Instances, for example, have defined relationships to these concepts and things."
There are also properties hinting at ambiguity or the traditional approach.
authorityAssigner Authority assigner / Entity that assigned the information. Agent
authoritySource Authority source / Authority list from which a value is taken. Literal
hasAuthority Authority information / Link to controlled form of name or subject and other information about.
When knowing WHO is asserting the additional information is important!!!
There is other work which approaches the same problems: named graphs, and an already established Open Annotations Model from the W3C whose outline Bibframe follows but not exactly. This could prove troublesome.
Mooted as a possible way of modelling Holdings and items, but not without criticism.
bf:describes is a subproperty of bf:annotates
bf:annotationAssertedBy: a URI of whoever asserted this was the case. The important bit for Annotations! In this case, UCL
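A small sketch of why the asserter matters: if each Annotation records who asserted it, a consumer can choose whose statements to accept. The dictionaries below are a simplified stand-in for the RDF, and the second annotation and its asserter URI are invented purely for illustration.

```python
UCL = "http://id.loc.gov/vocabulary/organizations/ukluc"

annotations = [
    # From the slide example.
    {"id": "ex:ann010", "type": "bf:Summary", "annotates": "ex:wk005",
     "assertedBy": UCL, "date": "20131125"},
    # Hypothetical second annotation from an unknown source.
    {"id": "ex:ann011", "type": "bf:Summary", "annotates": "ex:wk005",
     "assertedBy": "http://example.org/someone-else", "date": "20140101"},
]


def annotations_from(annotations, target, trusted):
    """Keep only the annotations on `target` whose asserter is in the `trusted` set."""
    return [a for a in annotations
            if a["annotates"] == target and a["assertedBy"] in trusted]


for ann in annotations_from(annotations, "ex:wk005", {UCL}):
    print(ann["id"], ann["type"])
```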
This is an LC MARC record for FRBR: a guide for the perplexed by Robert L. Maxwell.
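If you want to pick the record apart yourself, the following is a minimal sketch for the display-style lines on the slide, where subfields are marked with a dollar sign. It is not a real MARC parser (it knows nothing about ISO 2709, leaders or character encodings), and the function name is just an assumption for the example.

```python
def parse_marc_line(line):
    """Split one display-style MARC line into tag, indicators and subfields.

    Control fields (no '$') come back with a plain 'value' instead of subfields.
    """
    tag, rest = line[:3], line[3:]
    if "$" not in rest:
        return {"tag": tag, "value": rest.strip()}
    head, _, body = rest.partition("$")
    subfields = [(chunk[0], chunk[1:].strip()) for chunk in body.split("$") if chunk]
    return {"tag": tag, "indicators": head.strip(), "subfields": subfields}


field = parse_marc_line("245 10 $aFRBR :$ba guide for the perplexed /$cRobert L. Maxwell.")
print(field["tag"], field["indicators"])
for code, value in field["subfields"]:
    print(code, value)
```

Note how the ISBD punctuation (the colon and slash) travels with the subfield values, which is the same punctuation that turns up inside the Bibframe title strings later.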
This is the same book but in Turtle. Difficult to fit on the screen!! I'll show you the whole thing, but split up and highlight a few things as we go along.
Note the LC URI for the Work at the top.
No link to an external authority here, presumably because it's generated on the fly as a conversion.
There are two Annotations here (look for the square brackets).
The first is a link to a table of contents.
The second appears to be information about the creation of the record.
Instance, so has all the physical description, instance title, etc.
Note the ambiguity of the bf:derivedFrom statement, which implies that the Instance is derived from a MARC XML record.
Publication information is here.
One thing you may have noticed is that all the properties are BF specific…
EXAMPLE ISSUE 3: VOCABULARIES
I've coloured in each line of a Bibframe "record" and you can see that most of it is red, which is the BF vocab.
The orange is RDF
The blue is MADS
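The colouring-in can be approximated with a few lines of Python. This is only a sketch: it strips out URIs and quoted strings with regular expressions and then counts the prefixes of what is left, treating the Turtle keyword "a" as RDF (it is shorthand for rdf:type). It is not a Turtle parser, and the sample is a trimmed fragment of the record shown earlier.

```python
import re
from collections import Counter

SAMPLE = '''
<http://id.loc.gov//resources/bibs/14920419> a bf:Text, bf:Work ;
    bf:creator [ a bf:Person ;
        bf:hasAuthority [ a madsrdf:Authority ;
            madsrdf:authoritativeLabel "Maxwell, Robert L., 1957-" ] ;
        bf:label "Maxwell, Robert L., 1957-" ] .
'''


def count_prefixes(turtle):
    """Count prefixed names per vocabulary prefix in a Turtle snippet (rough regex sketch)."""
    # Drop full URIs and string literals so their contents are not miscounted.
    stripped = re.sub(r'<[^>]*>|"[^"]*"', " ", turtle)
    counts = Counter(re.findall(r"\b([A-Za-z][\w-]*):[A-Za-z]", stripped))
    # A standalone 'a' is Turtle shorthand for rdf:type.
    counts["rdf"] += len(re.findall(r"(?<!\S)a(?!\S)", stripped))
    return counts


for prefix, n in count_prefixes(SAMPLE).most_common():
    print(prefix, n)
```

Run over a whole converted record, a tally like this makes the same point as the colours: nearly everything comes from the bf: vocabulary.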
We heard about the BNB earlier and this is the same book…
The red here are the terms coined by the BL. All the others are external vocabularies.
I have highlighted the equivalents of the 100 and 245
This is The European Library. They recently converted a huge amount of RLUK data to linked data.
The red here is in fact the RDA element set!
The white is the EDM, and you can see it never made it beyond the prefixes.
The red here are the PL's own terms.
OPL are working on producing and using LD like this for their internal purposes, as MARC is not fit for their purposes and they felt they couldn't wait for Bibframe to be ready. Do read the article!
This and OCLC in particular point in an intriguing way towards not waiting for a vendor to come along with something ready for us to use.
Note here that
The Title Proper is from the DCT vocabulary
The Subtitle is from the Fabio vocabulary
The SOR is from the RDA vocabulary!
No information about the Creator here.