📝🐙 Organising and documenting collaborative digital methods projects

Collaborative digital methods projects often involve researchers and students working together with external issue experts. In such projects, how does one keep track of who is doing what and who has done what?

While natural and computer scientists may talk of reproducibility, how can more interpretive styles of collective inquiry be accounted for?

This document outlines different practices and approaches for organising, documenting and accounting for collaborative digital methods projects.

🐜 Organising collaborative work at a distance

How do we organise ourselves online?

When doing collaborative work online it is useful to have a channel for text chat in order to share links and coordinate (e.g. Slack, Teams, Mattermost, IRC). While you may not want the channel to be “public”, you may also want to choose something that external guests and issue experts are able to join.

As well as having a channel for everyone involved, you may also wish to have separate channels per project or even per sub-project to facilitate more focused exchanges while doing interpretive work.

In addition to a text-based chat channel you may wish to use video conferencing (e.g. Zoom, Jitsi) for introductions, instructions, group discussions, presentations and other things. Video channels can either be “left open” (for ongoing ambient connection), used for scheduled check-ins, or some combination.

Text-based chat can be useful for coordinating work across different time zones. It can be accompanied by video moments when participants first join, or for “handovers” between participants in different time zones.

📖 Taking notes together

What was the third thing we found again?

Throughout all of these individual, small group and collective moments it can be helpful to keep a diary as you discuss, notice things and learn together.

You may have a “group diary”, where you keep a chronological record of everything you do in the different working sessions, as well as breaking out into other documents (e.g. per dataset). If you do fork off, it can help to add a link from the main group diary so that you are able to more easily retrace your steps.

🗂 Set up a shared workspace

How can I share this with you?

An important step is having somewhere to put documents, datasets and other materials so that you can work on them together. This could be using free cloud services (Google Drive is a common choice), university infrastructure or otherwise.

What should be included? Here is a common template from the Digital Methods Initiative, suggesting sub-folders for “data”, “graphics” and “slides” as well as a spreadsheet with contact details (for keeping in touch), documentation from the project and a report template.

One can also include other kinds of document templates (e.g. slide presentations), or more elaborate folder structures.
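As a rough illustration, the folder layout described above could be set up with a short script. This is a minimal sketch, not part of the Digital Methods Initiative template itself: the sub-folder names come from the description above, while the function name, the file names (`contacts.csv`, `documentation.md`, `report.md`) and the contact columns are hypothetical placeholders to adapt.

```python
import csv
from pathlib import Path

# Sub-folder names follow the template described above. The file names and
# column headers below are hypothetical placeholders, not part of that template.
SUBFOLDERS = ["data", "graphics", "slides"]
CONTACT_COLUMNS = ["name", "email", "affiliation"]


def create_workspace(root: Path) -> Path:
    """Create a shared project folder with sub-folders and starter files."""
    for name in SUBFOLDERS:
        (root / name).mkdir(parents=True, exist_ok=True)

    # Contact sheet for keeping in touch (header row only, never overwritten).
    contacts = root / "contacts.csv"
    if not contacts.exists():
        with contacts.open("w", newline="", encoding="utf-8") as f:
            csv.writer(f).writerow(CONTACT_COLUMNS)

    # Empty placeholders for the project documentation and the report.
    for filename in ("documentation.md", "report.md"):
        (root / filename).touch(exist_ok=True)
    return root
```

Calling `create_workspace(Path("my-project"))` inside a synced folder would give everyone the same starting structure. Running it again leaves existing files in place.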

The most important thing is that you agree on an approach that works for everyone and that lets all involved have access to different materials, rather than exchanging them in a more “ad hoc” way via other channels.

Once you have agreed on how to do this, everything should be gathered in the shared workspace rather than being scattered across different machines. This is especially important when it comes to revisiting or building on the fruits of collective labour. 🍓

💾 Documenting datasets

How can we work on this at the same time?

For a start, if you are using digital methods tools to collect or export datasets, you may want to ensure that you are working with the same datasets, versions of datasets or datasets with the same parameters.

Also, some tools may be slower if multiple people are running multiple operations at the same time.

🗄 Archiving original exports

What is this based on?

For these and other reasons you may consider archiving the original exports from your tools in a dedicated “exports” folder.

Rather than editing these originals, you can keep them in the “exports” folder and make copies in separate folders in order to modify, analyse and work with them.
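If your shared workspace is also available as a local or synced folder, this habit could be scripted. The following is a sketch under assumptions: the `data/exports` location, the function name and the idea of marking originals read-only are illustrative choices, not a convention prescribed by this document.

```python
import shutil
import stat
from pathlib import Path


def archive_and_copy(export_file: Path, project_root: Path, purpose: str) -> Path:
    """Archive an original export untouched and return an editable working copy."""
    exports_dir = project_root / "data" / "exports"   # hypothetical location
    working_dir = project_root / "data" / purpose     # e.g. "cleaning"
    exports_dir.mkdir(parents=True, exist_ok=True)
    working_dir.mkdir(parents=True, exist_ok=True)

    original = exports_dir / export_file.name
    if not original.exists():
        shutil.copy2(export_file, original)
        # Mark the archived original read-only to discourage accidental edits.
        original.chmod(stat.S_IREAD | stat.S_IRGRP | stat.S_IROTH)

    working_copy = working_dir / export_file.name
    if not working_copy.exists():
        shutil.copy2(original, working_copy)
        # copy2 also copies the read-only permission, so make the copy writable.
        working_copy.chmod(working_copy.stat().st_mode | stat.S_IWRITE)
    return working_copy
```

An existing working copy is never overwritten, so edits made by collaborators are not lost if the function is called twice for the same export and purpose.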

🔖 Making a data diary

What changes did you make?

As well as these original exports, you may wish to copy, transform, cut, combine and otherwise change them.

So that others can see what you’ve done (and so that you can trace your steps later), you can consider keeping a “data diary” or data “README” to document transformations and operations. This can also be used as a device to attend to and surface the micro-decisions you’ve made in assembling and curating your data in order to address different questions or align with different analytical concerns.
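For those who prefer to log operations from a script rather than by hand, a data diary can be as simple as a text file that grows by one line per operation. This is a minimal sketch: the pipe-separated line format, the function name and its parameters are assumptions made for illustration, not the template used in the projects mentioned in this document.

```python
from datetime import datetime
from pathlib import Path


def log_operation(diary: Path, who: str, dataset: str, operation: str, why: str = "") -> str:
    """Append one timestamped entry to a plain-text data diary and return it."""
    timestamp = datetime.now().strftime("%Y-%m-%d %H:%M")
    entry = f"- {timestamp} | {who} | {dataset} | {operation}"
    if why:
        # Recording the reason helps surface the micro-decisions behind a dataset.
        entry += f" | why: {why}"
    with diary.open("a", encoding="utf-8") as f:
        f.write(entry + "\n")
    return entry
```

A call such as `log_operation(Path("data/README.txt"), "Ana", "tweets.csv", "removed retweets", why="focus on original posts")` would add one line to the diary, assuming the `data` folder already exists. The names and values in that call are made up for the example.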

Here is an example template from digital methods work at KCL, with “data operations” in the first section and a second section with observations, findings, questions and working notes on the level of the dataset:

Here is an example of brief “data notes” on the export, cleaning and curation of a Twitter dataset from TCAT:

📐 Using spreadsheets to track transformations

Do you have an earlier version?

As well as taking note of your data operations, you may wish to “go back” to versions of your datasets somewhere in between the exports and the versions that have been edited for particular purposes or lines of inquiry. For example, you may want to look at 100 rather than 50 URLs.

There are lots of ways of keeping track of changes to datasets, but one simple way is to use separate sheets within a spreadsheet in order to keep track of different “versions” or successive transformations of a dataset.
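The same idea can be approximated outside a spreadsheet application. The sketch below swaps sheets for numbered CSV files, one per "version", which is a substitution made for the sake of a self-contained example rather than the spreadsheet approach itself. The function names and the file naming pattern are hypothetical.

```python
import csv
from pathlib import Path


def top_n(rows: list, column: str, n: int) -> list:
    """Return the n rows with the highest numeric value in the given column."""
    return sorted(rows, key=lambda row: float(row[column]), reverse=True)[:n]


def save_version(rows: list, folder: Path, dataset: str, step: int, label: str) -> Path:
    """Write one version of a dataset (a non-empty list of dicts) to its own CSV file."""
    folder.mkdir(parents=True, exist_ok=True)
    path = folder / f"{dataset}_v{step:02d}_{label}.csv"
    if path.exists():
        # Earlier versions are kept as they are, so that you can go back to them.
        raise FileExistsError(f"{path} already exists; use a new step number or label")
    with path.open("w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=list(rows[0].keys()))
        writer.writeheader()
        writer.writerows(rows)
    return path
```

With rows containing, say, a `url` and a `shares` column (made-up names), you could save `top_n(rows, "shares", 100)` as step 1 and `top_n(rows, "shares", 50)` as step 2. Both cuts then remain available side by side, much like separate sheets.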

For example, here’s a Twitter data export that has been transformed and curated in order to produce a RankFlow diagram, with different “cuts” of the dataset in different sheets:

☃️ Freezing shared lists and datasets

If you have lists and datasets which multiple people and/or groups are working with at the same time, it may be preferable either to finalise these before parallel group work begins, or to finalise them at the start of the workshop and then freeze them. This avoids having to redo or refresh other datasets and research which are based on them.
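One way to check that a frozen list or dataset has stayed frozen is to record a checksum for each file at the moment of freezing and compare against it later. This is a sketch under assumptions: the JSON manifest, the function names and the use of SHA-256 are illustrative choices, and files are keyed by name only, so it presumes no two frozen files share a name.

```python
import hashlib
import json
from pathlib import Path


def checksum(path: Path) -> str:
    """Return the SHA-256 hex digest of a file's contents."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def freeze(files: list, manifest: Path) -> dict:
    """Record one checksum per file so that later changes can be detected."""
    frozen = {Path(f).name: checksum(Path(f)) for f in files}
    manifest.write_text(json.dumps(frozen, indent=2), encoding="utf-8")
    return frozen


def changed_since_freeze(files: list, manifest: Path) -> list:
    """Return the names of files whose contents no longer match the manifest."""
    frozen = json.loads(manifest.read_text(encoding="utf-8"))
    return [Path(f).name for f in files if frozen.get(Path(f).name) != checksum(Path(f))]
```

If `changed_since_freeze` returns an empty list, everyone is still working from the same frozen material. A non-empty list is a prompt to talk to each other before continuing, since work based on the earlier version may need to be refreshed.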

✂️ Make a longer version and a shorter cut

Where did that graphic go?

When organising research across a group it can be useful to keep notes, versions and materials in order to be able to retrace your steps and also to understand what each of you has done in the context of interpretive group work.

For the purposes of presenting your work you’ll often want to focus on a much smaller subset of some of the most interesting things you’ve found (rather than all of the things that you did).

To accommodate both of these needs, for example with accounts of your work or slide decks, you can make a longer version to gather everything together, and then make a much shorter cut (a “radical edit” as Richard Rogers puts it) to boil it down to the most interesting and compelling material.

☕️ Being mindful of time

When is the next coffee break?

One of the sayings from Digital Methods Winter and Summer Schools is “take breaks seriously”.

This is to make sure that you give yourself enough rest to feel refreshed and alert over several periods of focused concentration, but also to have collective moments together as a group beyond the immediate tasks at hand.

Plans may change and things often will not unfold as you expect, so it is also important to look at how much time you have and make plans that work with, rather than against, the time available.

🐌 Curating an environment for interpretive work

Shall we split up and reconvene at 1pm?

Digital methods projects often oscillate between collective moments to make plans and reflect together, and moments of individual, pair or small group work in order to “spend time with your data”.

It can be important to make an environment for interpretation where everyone is able to focus, which may involve planning sessions of focused work, interspersed with moments to discuss and plan with others.

If you get stuck or feel as if you are going around in circles, it can make sense to try breaking into smaller groups to discuss, or to try out different approaches.

For larger groups it can also make sense to break into smaller groups, and then even sub-divide further into individual work, pairs or threes for particular tasks.

🐙 Inspiration, acknowledgments and contributors

Many of the practices, approaches and conventions in this document were originally developed in the context of data sprints with the Digital Methods Initiative. They were also developed in the context of the EMAPS project, and have subsequently been developed, supplemented and adapted by other researchers and centres associated with the Public Data Lab. Many of the points about time, scheduling and making an environment for focus and collective work come from discussions and interviews with Richard Rogers. The approach to “staying in the spreadsheet” and documenting datasets comes from teaching at DensityDesign Lab, and approaches for documenting data were further developed through collaborations with the Department of Digital Humanities, King’s College London.