Data Basics and Open Refine for Conservation Professionals

  • Registration Closed

September 16 - October 14, 2021
Live sessions on Thursdays, 12:00 - 2:00 p.m. EDT
Led by Kelly Davis

This five-week workshop introduces conservation professionals to handling data and to Open Refine. The first lesson focuses on data etiquette and best practices, along with the formats and data types that should inform your data collection and research, and builds a shared vocabulary for conversing with technical colleagues. The remainder of the workshop focuses on using Open Refine, a free data management platform, to explore and analyze your data. This includes faceting and editing within the program, as well as using the built-in reconciliation function to reconcile terms and creators against external vocabularies. Considerable time will be spent working and discussing in small groups.

The workshop concludes with a mini-project and ‘office hours’ to discuss participants' specific concerns one-on-one with the instructor/assistant. The final live session will feature case studies and may allow time for participants to present their projects. By the end of the workshop, participants will be able to manage their research data according to best practices and to analyze that data with Open Refine. The workshop works best if participants bring a dataset of their own, but one can be provided if necessary.

The live sessions for the workshop will take place on Zoom, and automated captions will be available. All live sessions will be recorded and made accessible to participants shortly after each session ends. See the Instructor and Workshop Outline tabs for more information.


This program is supported by a collaborative agreement between the National Center for Preservation Technology and Training (NCPTT) and the Foundation for Advancement in Conservation (FAIC). FAIC was created by a grant from The Andrew W. Mellon Foundation and is supported by donations from members of the American Institute for Conservation and its friends. Workshops are made possible with the assistance of many AIC members, but no AIC membership dues were used to create or present this course. Without support, the registration fee for this workshop would be $395. FAIC relies on your contributions to support these and its many other programs. Learn more about donating to the foundation here.

Thank you to the dedicated members of the FAIC Collaborative Workshops in Photograph Conservation Advisory Committee for their work in organizing this workshop.

Kelly Davis

Kelly Davis is a data manager for the Getty Provenance Index, a database for the field of collecting and provenance. She completed her Master’s in Library Science and Art History at Pratt Institute in 2014. Her work focuses on updating and maintaining an excellent research tool by standardizing metadata, conforming to internal schema, and reconciling entities. She’s known as an “Open Refine guru” at the Getty, and works frequently with that program and other data science methods.

Working with Data (“1s and 0s”): September 16

  • Data types (strings, booleans, floating point, integers, etc.)
  • Spreadsheet terminology (cells, rows, tables, headers, etc.)
  • Character encoding - what is it and why does it matter?
  • Data formats/serialization - why it’s good to use .csv, etc.
  • Nulls vs. blank/empty - what is the difference and why does it matter?
  • Regular expressions (regex) - what do they do and where do they appear?
  • Activity: Identifying data types with provided worksheet
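
As an illustration of several of the topics in this lesson, here is a minimal Python sketch (not part of the workshop materials). It assumes a small, made-up CSV sample and a hypothetical helper named infer_type. It shows that every CSV cell arrives as a string, that an empty cell is not the same as the literal text "NULL", how simple regular expressions can identify numeric values, and how reading bytes with the wrong character encoding garbles text.

```python
import csv
import io
import re

# Made-up sample data. Row 2 has an empty "year" cell; row 3 contains the
# literal text "NULL", which CSV treats as an ordinary string.
SAMPLE_CSV = (
    "object_id,title,year,treated\n"
    "1,Portrait,1921,true\n"
    "2,Landscape,,false\n"
    "3,Still life,NULL,true\n"
)

INTEGER_PATTERN = re.compile(r"^-?\d+$")
FLOAT_PATTERN = re.compile(r"^-?\d+\.\d+$")


def infer_type(cell):
    """Guess a data type for one CSV cell (hypothetical helper)."""
    if cell == "":
        return "empty"
    if cell.lower() in ("true", "false"):
        return "boolean"
    if INTEGER_PATTERN.match(cell):
        return "integer"
    if FLOAT_PATTERN.match(cell):
        return "float"
    return "string"


# csv.DictReader yields one dict per row, keyed by the header row.
rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
for row in rows:
    print({column: infer_type(value) for column, value in row.items()})

# Character encoding: text saved as UTF-8 but read as Latin-1 is garbled.
word = "caf\u00e9"
mojibake = word.encode("utf-8").decode("latin-1")
print(word, "read with the wrong encoding becomes", mojibake)
```

Note that the "NULL" cell is reported as a string, not as a missing value: whether it means "no data" is a convention you have to decide on and document for your own dataset.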

Exploring and Standardizing Data with Open Refine: September 23

  • Live demonstration: creating a project in Open Refine
  • Faceting - built-in functions and some basic GREL (General Refine Expression Language) expressions
  • Editing and clustering - built-in functions and some basic GREL expressions
  • Activity: Following short list of recipes for Open Refine with participants' own dataset or a provided dataset
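
To give a feel for what faceting and clustering do, here is a rough Python sketch (not part of the workshop materials). A facet is essentially a count of each distinct value in a column. The clustering step below is a simplified approximation of a "fingerprint" key-collision method, and the program's own implementation may differ in its details. The sample values and the function name fingerprint are assumptions made for illustration.

```python
import string
import unicodedata
from collections import Counter, defaultdict

# Made-up values from a hypothetical "creator" column.
creators = [
    "Adams, Ansel",
    "adams, ansel",
    "Ansel Adams",
    "Adams,  Ansel ",
    "Cameron, Julia Margaret",
    "Julia Margaret Cameron",
    "Atget, Eug\u00e8ne",
    "Atget, Eugene",
]


def fingerprint(value):
    """Reduce a value to a normalized key so that variants collide."""
    value = value.strip().lower()
    # Decompose accented characters, then drop the combining marks.
    value = unicodedata.normalize("NFKD", value)
    value = "".join(ch for ch in value if not unicodedata.combining(ch))
    # Remove ASCII punctuation.
    value = value.translate(str.maketrans("", "", string.punctuation))
    # Split on whitespace, de-duplicate and sort the tokens, and rejoin.
    return " ".join(sorted(set(value.split())))


# Facet: how often does each distinct raw value occur?
facet = Counter(creators)
print(len(facet), "distinct raw values")

# Cluster: group the raw values that share the same fingerprint.
groups = defaultdict(set)
for value in creators:
    groups[fingerprint(value)].add(value)

clusters = {key: sorted(values) for key, values in groups.items() if len(values) > 1}
for key, values in clusters.items():
    print(key, "<-", values)
```

Each cluster is only a suggestion: a person still has to decide whether the grouped values really refer to the same thing and which spelling to keep.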

Vocabularies and Reconciliation: September 30

  • What vocabularies are and why you should be aware of them - benefits of aligning your data with others
  • Standard vocabularies in the field, how to access and use them
  • Live demonstration, Open Refine reconciliation service
  • Activity: Explore pros/cons of reconciliation options (VIAF, Getty Vocabs, Wikidata, etc.)
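
The idea behind reconciliation can be sketched locally in Python. This is only an illustration under stated assumptions: it does not call VIAF, the Getty Vocabularies, Wikidata, or Open Refine's reconciliation function. The vocabulary, its "EX-" identifiers, the reconcile function, and the 0.8 threshold are all invented for the example. The sketch compares each term to the vocabulary labels with a string-similarity score and returns candidate matches, best first.

```python
from difflib import SequenceMatcher

# Invented controlled vocabulary: the identifiers are placeholders, not
# real IDs from any published vocabulary.
VOCABULARY = {
    "EX-001": "albumen print",
    "EX-002": "gelatin silver print",
    "EX-003": "cyanotype",
}


def reconcile(term, vocabulary, threshold=0.8):
    """Return (identifier, label, score) candidates, best match first."""
    normalized = term.strip().lower()
    candidates = []
    for identifier, label in vocabulary.items():
        # ratio() is 1.0 for identical strings and lower for weaker matches.
        score = SequenceMatcher(None, normalized, label).ratio()
        if score >= threshold:
            candidates.append((identifier, label, round(score, 2)))
    return sorted(candidates, key=lambda candidate: candidate[2], reverse=True)


for term in ["Cyanotype ", "gelatin silver prints", "daguerreotype"]:
    print(repr(term), "->", reconcile(term, VOCABULARY))
```

A term with no candidate above the threshold comes back with an empty list, which mirrors the practical situation where a value has to be reviewed by hand or matched against a different vocabulary.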

Mini Project (no live session)

  • Participants work with their own dataset or one provided and draft a project document that describes the dataset along with cleaning and reconciliation goals. They create a Refine project and complete as many cleaning steps as possible, focusing on the columns to be reconciled (artist names, materials, or something else), and define which standards to reconcile to and why they chose them. Participants may schedule a meeting with the instructor/assistant to discuss progress and challenges.

Case Studies and Discussion: October 14

  • How have these ideas been applied in practice?
  • What were the experiences from the mini-project?
  • What implications does this content have for your ways of working?
  • Optional: Submit results from the mini project to get feedback from instructor/assistant

Participant List

Lesson 1

  • Lesson 1 Overview
  • Lesson 1 Data Types Worksheet
  • Lesson 1 Reading: Tidy Data
  • Session 1: 09/16/2021 at 12:00 PM (EDT), recorded 09/17/2021
  • Lesson 1 Slides

Lesson 2

  • Lesson 2 Overview
  • Class 2 Group Activity
  • FAIC dataset (recorded 09/14/2021)
  • Cheat Sheet #1
  • Cheat Sheet #2
  • Cheat Sheet #3
  • Session 2: 09/23/2021 at 12:00 PM (EDT), recorded 09/23/2021
  • Lesson 2 Slides

Lesson 3

  • Lesson 3 Overview
  • Lesson 3 Reconciliation Activity
  • Session 3: 09/30/2021 at 12:00 PM (EDT), recorded 09/30/2021
  • Lesson 3 Slides
  • Mini Project Assignment

Mini Project Week

  • Mini Project Week Overview
  • Office Hours Sign-up
  • Office Hours: 10/07/2021 at 12:00 PM (EDT), 120 minutes

Lesson 4

  • Lesson 4 Overview
  • Session 4: 10/14/2021 at 12:00 PM (EDT), recorded 10/15/2021