List of Participants

Musicological team

  • John Rink (leader)
  • Paul Banks
  • Cliff Eisen
  • Daniel Leech-Wilkinson
  • Jim Samson

Technical team

  • Marilyn Deegan and Harold Short (leaders)
  • Tim Crawford
  • John Bradley
  • Paul Vetch
  • Julia Craig-McFeely

Timetable

25 June 2003

09.30-10.00 Coffee
10.00-11.00 Presentation by John Rink on the project, followed by discussion
11.00-11.30 Informal discussion over coffee
11.30-12.45 Presentations on recently developed web technologies for image/text display and interlinking (Harold Short, Marilyn Deegan, Julia Craig-McFeely)
13.00-13.45 Lunch (on site)
13.45-15.00 Further presentations on new web technologies (John Bradley, Tim Crawford, Marilyn Deegan)
15.00-16.45 Breakout discussions: team 1 (John Rink = rapporteur) & team 2 (Marilyn Deegan = rapporteur)
16.45-17.15 Tea
17.15-18.30 Final plenary session: reports from Marilyn Deegan and John Rink, discussion, wrap-up
19.00 Dinner

Workshop Report

Introduction and aims of the project: John Rink

John Rink explained in detail the aims of this pilot project, which is funded by the Andrew W. Mellon Foundation to explore the key issues in the creation of an online variorum edition of the works of Chopin, but which will also elucidate the problems of, and propose some solutions to, the editing of complex textual traditions of the works of other major composers. The project will be known as the OCVE project - Online Chopin Variorum Edition.

The recent work (funded by the Leverhulme Trust) done by Rink and Christophe Grabowski in the preparation of their annotated catalogue has identified 4500 distinct versions of Chopin first editions alone in some 50 libraries worldwide. The Annotated Catalogue provides a detailed description of each first edition and later impressions thereof, facilitating identification of the sources used to prepare them, providing information about the number and chronology of impressions, and paving the way for analysis of the interrelationships between multiple editions.

This vast complication came about because inadequate copyright protection between the principal European countries during the early nineteenth century compelled Chopin to employ different publishers in France, Germany, and England, thus giving rise to three "first editions" of most pieces. Each is unique, as a result of his idiosyncratic editorial methods and ongoing compositional revisions. Further differences arose from the interventions of house editors and proof-readers in successive impressions which until recently have simply been regarded as "first editions" - an error of judgment that has undermined much Chopin scholarship. In particular, German and English publishers tended to tinker incessantly and sometimes intrusively. Scores which can plausibly be described as 'first editions' despite such editorial and other changes continued to be produced until 1879 in Germany and well into the 1890s in England.

There may be over 10 different versions of any one Chopin first edition, and to trace the full publication history of individual first editions requires detailed and labour-intensive manual inspection and comparison of the relevant materials. The versions vary in quality of image as the same plates were used over and over or because of the loss routinely incurred upon lithographic transfer, and the paper was also subject to shrinkage, so that the same plate might produce images of different sizes over time. The publication history of the different versions has been thoroughly investigated within the Leverhulme-funded project, but ongoing refinement will be required as part of the OCVE research.

Rink also explained that major funding has been requested from the Arts and Humanities Research Board's Resource Enhancement programme in order to create an online resource uniting all of Chopin's first editions in an unprecedented virtual collection. The archive will be drawn from four partner libraries (Bibliothèque Nationale de France, Bodleian Library, British Library, and Chopin Society Warsaw) and eighteen other institutions, totalling 4,345 digital images of Chopin's music. The full score of each first impression will appear along with analytical commentary on particularly significant textual features. In addition, there will be excerpts from the Annotated Catalogue. Innovative methodologies for complex textual interlinking and web delivery of this material will be devised, using advanced imaging techniques allied with relevant open standards for metadata and interface design. The application to the AHRB was submitted in May 2003, and their decision will be announced in November 2003. If the application is unsuccessful, the project team (Rink, Marilyn Deegan and Harold Short) will actively pursue other potential sources of funding for this project. While success with this application would greatly enhance OCVE, each project has been conceived and would proceed independently.

There are however similarities between the potential AHRB-funded project and the Chopin Early Editions project at the University of Chicago Library (http://chopin.lib.uchicago.edu/). These are being explored by the project team. The OCVE pilot project will benefit from the Chicago digital image archive in that individual images can be borrowed and scanning definitions and other benchmark references either emulated or developed further.

Although numerous variorum projects exist in the field of textual studies, many of which exploit the latest technologies with regard to image manipulation and collation/cross-referencing across discrete filiation chains, musicology has not yet exploited the application of such technologies to complicated source networks such as those pertaining to Chopin as well as to Bach, Mozart, Beethoven, and other composers. The aim of the current pilot project is to capitalize upon emerging technical capacities for text/image comparison and new music-recognition technologies that allow unprecedented manipulation and comparison of finely-grained musical elements. An online variorum edition of Chopin will be created as a prototype for later, more ambitious research on Chopin and hopefully also on other composers. Its primary scholarly goal is to facilitate and enhance comparative analysis of three categories of source material: manuscripts (sketches, autographs, scribal copies, glosses in student copies, etc.); first impressions of the first editions; and later impressions of the first editions (i.e., those pages of the first editions containing variants, whether attributable to the composer or to others involved in the editorial process). Three types of image manipulation will be investigated by the project:

  • superimposition of printed sources within defined/discrete filiation chains (in order to reveal variants)
  • juxtaposition of excerpts of materials (both manuscript and printed) from single or multiple filiation chains, viewable against chosen "base texts" (with sources for comparison to be chosen from icons) and in isolation (decoupage/montage)
  • combination/interpolation of elements from disparate sources in purposeful collations.

It should be noted that the variorum edition need not stop at first editions: in principle it could include all the Chopin editions that have ever been produced, and it could also incorporate sound recordings for comparison. For use of any of these materials, we would need to deal with issues of permissions and copyright (as indeed for the manuscript and first-edition sources to be dealt with in the pilot project and in the first phase of the OCVE).

Methods

The pilot project will develop in particular the first two types of image manipulation referred to above. The third - combination/interpolation - will not feature in this project itself, although attention will be paid to eventual possibilities as well as potential problems.

Superimposition

It must be stressed that the project will be using images of the musical works themselves, not transcriptions, and one of the questions we will be asking is how to automate the process of comparison given that discrepant elements between different versions within a source pool are often very minor and thus difficult to discern even upon careful inspection. We will therefore investigate ways of visually superimposing printed sources within defined, discrete filiation chains in order to reveal the variants between them, which would be highlighted in colour and/or by other means. The requisite technology is already advanced in non-musical areas, although it will require modification. It might be possible to adapt Aruspix, a technique developed in Switzerland (see www.unige.ch/lettres/armus/music/devrech/aruspix/index.en.htm), the potential application of which will be investigated.
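
As a rough illustration of the superimposition idea (not of Aruspix or of any existing tool), the sketch below compares two page images pixel by pixel and marks where they differ. It assumes that the images have already been aligned and reduced to black and white; in practice paper shrinkage and plate wear would make that alignment a substantial task in its own right. The list-of-rows image representation and the names diff_mask and overlay are our own assumptions for the purpose of the example.

```python
from typing import List

# Hypothetical representation: one list per pixel row, 1 = ink, 0 = paper.
Bitmap = List[List[int]]


def diff_mask(a: Bitmap, b: Bitmap) -> Bitmap:
    """Return a bitmap marking every pixel where two aligned images differ."""
    if len(a) != len(b) or any(len(ra) != len(rb) for ra, rb in zip(a, b)):
        raise ValueError("images must have identical dimensions")
    return [[pa ^ pb for pa, pb in zip(ra, rb)] for ra, rb in zip(a, b)]


def overlay(a: Bitmap, b: Bitmap) -> List[str]:
    """Text overlay: '#' ink in both, 'A' or 'B' ink in one source only."""
    symbols = {(0, 0): ".", (1, 1): "#", (1, 0): "A", (0, 1): "B"}
    return ["".join(symbols[(pa, pb)] for pa, pb in zip(ra, rb))
            for ra, rb in zip(a, b)]


# Invented toy data standing in for two impressions of the same plate.
first_impression = [[0, 1, 1, 0],
                    [0, 1, 0, 0]]
later_impression = [[0, 1, 1, 0],
                    [0, 1, 1, 0]]

mask = diff_mask(first_impression, later_impression)
changed = sum(map(sum, mask))  # number of differing pixels
```

In a real interface the pixels marked 'A' or 'B' would be rendered in colour over the shared image rather than as text.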

Juxtaposition/comparison

For the pieces chosen for the pilot project (to include Chopin's Preludes Op. 28 Nos. 4 & 20 as well as one large-scale piece, the Barcarolle Op. 60, extracts of which were shown during this presentation), the entire source will be digitized and separated out into bars. At the first stage of the project, the bar will be the smallest piece of information (not the note) because we are looking at the physical attributes of the sources. All bars will be made available, but we will be investigating some way of highlighting those which are particularly significant, given that a user (whether scholar or performer) may want to pull up quickly the variant bars in different versions. We will also investigate automating this process, but at the first stage we will need to do this as annotations. We might also want to build up user annotations to significant bars over time; thought will be given to the logistics and the scholarly implications of this.

As implied above, there will then be two modes of juxtaposition and comparison:

  1. Choosing a base text and comparing it with all other versions. Given that the sources will be divided into bars, a user can click on one or more bars and pull up all other versions chosen from a menu listing those sources by abbreviation and/or icon.
  2. Searching for a particular bar in a number of sources and showing the results on screen. Some of the discrepancies between the versions are microscopic (as noted above), so they may need to be clearly marked for the user. The types of investigations we envisage the users making are:
    • Looking for differences on a bar-by-bar basis or by jumping from one bar where variants exist to the next such bar
    • Choosing a base text, then looking at the relevant passage from all the other sources alongside
    • Bringing up the sources in different ways: all the English first editions (or, e.g., all those from a particular English publisher); all the German; in chronological order, etc.

In order to offer some of these features, we need to decide how we represent the works via sigla, code (e.g. as in the Annotated Catalogue) and/or in terms of other codicological information.
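
The two modes of comparison described above could be sketched along the following lines. This is only an illustration under stated assumptions: the sigla, publisher labels, years and image names are invented placeholders (they are not Annotated Catalogue codes), and the Source record, select_sources and juxtapose are hypothetical names rather than part of any existing system.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Source:
    siglum: str            # placeholder abbreviation, invented for this sketch
    country: str
    publisher: str
    year: int
    bars: Dict[int, str]   # bar number -> identifier of the cropped bar image


def select_sources(sources: List[Source], country: Optional[str] = None,
                   publisher: Optional[str] = None) -> List[Source]:
    """Filter by country and/or publisher; result is in chronological order."""
    chosen = [s for s in sources
              if (country is None or s.country == country)
              and (publisher is None or s.publisher == publisher)]
    return sorted(chosen, key=lambda s: s.year)


def juxtapose(base: Source, others: List[Source],
              bar: int) -> Dict[str, Optional[str]]:
    """Pull up one bar from the base text and from each comparison source."""
    result = {base.siglum: base.bars.get(bar)}
    for s in others:
        result[s.siglum] = s.bars.get(bar)  # None if the bar is not available
    return result


# Invented example data.
sources = [
    Source("F1", "France", "PubF", 1839, {1: "F1-b1.png", 2: "F1-b2.png"}),
    Source("E2", "England", "PubE", 1850, {1: "E2-b1.png", 2: "E2-b2.png"}),
    Source("E1", "England", "PubE", 1840, {1: "E1-b1.png"}),
    Source("G1", "Germany", "PubG", 1839, {1: "G1-b1.png", 2: "G1-b2.png"}),
]

english = select_sources(sources, country="England")
view = juxtapose(sources[0], english, bar=2)
```

Here the French source serves as the base text, and bar 2 is requested from all the English sources in chronological order; a source lacking that bar simply returns nothing.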

Such an approach will allow the users to investigate and to reconstruct for themselves the creative history of a composer's work from a number of different perspectives. It was pointed out that this might mean that some users may create bad histories, but the goal of this project is not to police historical investigation or prescribe the 'good' or 'bad' qualities thereof - rather, to make possible for the first time, or at least to facilitate, such investigation.

General discussion

The participants of the workshop were excited at the possibilities offered here, and asked many questions and made a number of useful suggestions. There was debate about who the audience for this might be, and agreement that availability online would draw in new audiences. It was suggested that to give the best information for users of the resource, we need to understand a composer's practice in working with publishers. There was some discussion about the degree to which we would have annotations, commentary or an explanatory framework: if a user wants to call up all the variants, can we make a database of editorial commentary to present with the editions? That had not been part of the original proposal, but perhaps tools could be provided for these to be added later, along with the (admittedly considerable) scholarly information that would have to be compiled. It was noted that there are certain attractions to presenting the material 'neutrally', i.e. without the particular scholarly steer that OCVE researchers would inevitably impose by means of such editorial commentary; the debate launched here will need to be taken forward to weigh up the relative merits of the two approaches. Nevertheless, all agreed that we will need to create a framework within which the user can make informed choices about the sources on the basis of some sort of academic/scholarly apparatus. It was also agreed that the resource would depend greatly on having a repository of good data, and questions about quality of the originals we will work with will need to be addressed. Other issues raised during the general discussion included copyright and permissions, and access to sources (e.g. whether or not libraries and other institutions as well as private owners would allow us to use them).

One possibility noted above is that the variorum edition could be used by performers to create their own performance scores, by means of the third type of image manipulation that we envisage: i.e. combination/interpolation of elements from disparate sources in tailor-made collations. Not only must certain thorny aesthetic questions be addressed here, but there are also design implications in terms of providing a screen that can be used at the piano keyboard, all of which will need to be considered in the wake of this pilot project.

Recently developed web technologies for image/text display and interlinking

Harold Short first of all set the technical context for the work of the variorum Chopin project. He introduced the work of the Centre for Computing in the Humanities (CCH) at KCL, which is involved in projects in all discipline areas in the humanities. See http://www.kcl.ac.uk/humanities/cch/projects.html for details of these projects. His work at CCH has shown that tools developed for one area are often needed in other areas, and that some of the underlying paradigms can illuminate areas that they weren't constructed for. He suggested that one of the consequences of the accessibility of technologies is that there are now very few scholarly research projects which do not use technology. He showed the example of the Corpus Vitrearum Medii Aevi (CVMA) digitization project at the Courtauld Institute (http://maple.cc.kcl.ac.uk/ps/cvma/). CVMA was established in 1949 to identify and describe medieval stained glass, and in the past, the main output has been high quality conventional publication, with all its limitations. Now CVMA are looking at new modes of electronic publication as well as capturing high quality images of medieval stained glass, much of which derives from the Picture Archive of the National Monuments Record (NMR). CCH and CVMA have experimented with republishing existing CVMA volumes in digital form because this way they can represent more images at high quality for scholarly use. Texts are converted to XML, images are scanned and edited, and information goes into a database to link text and images. CVMA started as an art history project but the scholarly information is valuable to a much wider audience. The scholarly commentaries here could serve as a model for the ones that participants in the previous discussion asked for in OCVE; likewise the navigational possibilities demonstrated by Short with regard to the ground plan of a medieval cathedral.

Another multi-disciplinary/multi-technology scholarly editing project with some similar aims to OCVE is the William Blake Archive being produced by scholars at three institutions in North America (http://www.blakearchive.org). This is a hypermedia archive which is presenting Blake's writings with their visual content: this is key to the understanding of Blake's poetry. Tools have been added to allow manipulation of images, and it is possible to compare plates in two different editions, something it might be useful for OCVE to investigate. Additional information is available in the form of editors' notes (again, this kind of scholarly support having a potential parallel in OCVE), and the Inote image annotation tool allows the editors to put commentary on various parts of the image (e.g. in pop-up boxes of potential relevance to OCVE).

In discussion, reference was also made to such examples as the Gutenberg Bible and Göttingen Model Book projects (http://www.gutenbergdigital.de/), and online illustrations were provided.

Marilyn Deegan gave an overview of some other editing projects, musical and textual, echoing Short's point that certain paradigms have ramifications beyond their original contexts/purposes and that 'enabling spaces' are now sought in contrast to early problems of transferability.

  • The Digital Beethoven House (http://www.beethoven-haus-bonn.de) intends by 2004 to have created a digital archive of 5,000 documents and 26,000 colour scans. These will derive from Beethoven sources, as well as other materials (photographs of musical instruments and other artefacts). The OCVE team needs to determine what this project and those below will do other than create an archive.
  • The Digital Mozart Edition at the University of Trier will be digitizing La Clemenza di Tito by 2004, and will include critical texts, notes, and additional information about the opera. A database to hold all the information and a GUI (graphical user interface) for display of the edition will be developed as part of this process. By the end of 2006, all of Mozart's operas will have been digitized, including all text from the critical edition (4,000 pages plus 6,500 pages of notes) as full text plus images of manuscripts.
  • The Bach Digital Project (http://www.bachdigital.org) has digitized Bach's autograph manuscripts and also created a searchable digital library of Bach sources in a wide variety of formats, such as pictures (manuscripts), texts (background material), music extracts (sound samples), and videos (of the restoration process). IBM's Content Manager software is used to deliver the materials, and the project is a collaboration between a number of universities, libraries, publishers, and software suppliers which could serve as a model for OCVE (and also for the AHRB project referred to above). The issue of copyright is particularly pertinent here, given the high presence of manuscript material held by different institutions/individuals.
  • The Digital Atheneum Project at the University of Kentucky (http://www.digitalatheneum.org) is a partnership between humanities scholars and computer scientists to investigate new technologies for restoring severely damaged manuscripts, searching them as images, and presenting electronic editions for a widely distributed digital library of restored and edited, previously inaccessible, manuscripts. The Electronic Beowulf was the first manuscript edition to be published from this project, which is also involved in the development of open source tools for creating editions. (It was explained that software can be proprietary or 'open source', the latter using underlying source code which can be customised according to users' particular needs. Open source material is not necessarily free of charge: the better products are supported and documented and thus might well carry a cost implication. Furthermore, proprietary sources may use 'open standards', which have to do with how data is processed.) The potential of a navigator editor to take data from image files and to show it in different ways, using a range of tools, was explained - e.g. in the form of sequential access, collation access, table of contents, and thumbnail images. Examples were shown of how passages in different manuscripts could be compared, with boxes drawn around the passages, original excerpts shown against transcriptions etc. A correspondence editor controls the links between manuscript images and transcriptions in a semi-automated way with reference to pre-defined textual units. The implications for OCVE are potentially worth exploring.
  • Scholarly Digital Editions (http://www.sd-editions.com) at De Montfort University developed out of the Canterbury Tales Project. The CT project is particularly interesting in the context of OCVE as it is one of the paradigm projects for the handling of complex manuscript traditions; a free trial edition can be downloaded. Chaucer's major work survives in some 80 manuscripts and early printed editions, and the Canterbury Tales Project is aiming to make full text and page images of this available for collation, comparison, searching, editing etc. SDE has also developed the Anastasia publishing system for taking XML from complex editions and publishing it either to CD ROM or online.
  • The Digital Shikshapatri project (www.shikshapatri.org.uk) being carried out at Oxford University (Indian Institute Library and Refugee Studies Centre) is creating a manuscript study environment for the presentation of two manuscripts of the Shikshapatri (a Hindu religious text) in a complex network of textual, visual and audio-visual explanatory materials. The study environment is being designed to be as generic as possible so that it can be exploited by other projects. It is being developed by CCH and Oxford ArchDigital and some of the tools and techniques developed can therefore be investigated by OCVE (e.g. electronic image-snipping and text/commentary alignment mechanisms).
  • Related to the Digital Shikshapatri project and the manuscript study environment is the manuscript tool that has been developed by Oxford ArchDigital for presenting manuscript translations, notes, etc. in interactive form on the web. This can be seen in the Yeavering Site Archive on the New Opportunities Fund project, Past Perfect: go to http://www.pastperfect.info/sites/yeavering/archive/index.html, choose 'manuscripts', then under 'historical sources' choose 'Extract from Bede dealing with the conversion at Ad Gefrin with mouseover translation'. Then mouse over the text to see the translation appear.

It was noted that visits by the OCVE team to at least some of these projects would be useful - e.g. Beethoven, Mozart and/or Bach; alternatively, relevant experts could be invited to London for discussion in a colloquium/small conference.

Julia Craig-McFeely then discussed the Digital Image Archive of Medieval Music (DIAMM) project (http://www.diamm.ac.uk), focusing on topics of particular relevance for the OCVE project. She remarked upon the massive changes from 1996 in terms of web usage which projects can now take advantage of. The main aim of DIAMM is to obtain, archive and, where necessary, enhance digital images of European sources of medieval polyphonic music, digital encoding having been chosen because of its longevity, potential for manipulation and restoration, and convenience and potential availability. The project has been in existence since 1998, and Craig-McFeely remarked that libraries and archives had made a huge leap of faith in allowing the digital photography to happen. Some digital images are ordered from institutions which can provide them; in other cases, DIAMM staff go into the archive and take the images, subject to strictly enforced legal agreements (see below) - a procedure which might or might not be available to OCVE researchers. Where necessary, tools such as Adobe Photoshop are used for the restoration of images from damaged originals: this has been much easier and yielded images of higher quality than originally expected. The potential application to OCVE is obvious, in that many print sources as well as manuscripts are heavily stained, foxed or otherwise visually compromised and would thus benefit from digital clean-up.

DIAMM has attempted to impose its own (high) standards on archives, which is not always easy (and here again the implications for OCVE are clear in terms of quality control and specification/definition of scanning standards). Metadata is not added to the images, other than detailed technical and locational information. For content description, DIAMM relies upon existing catalogue records - though Craig-McFeely pointed out that to date some 30% of catalogue records have contained erroneous information. Scanning is done using a camera back that can deliver 144 megapixels, giving file sizes of 350 MB. This means that DIAMM is able to reproduce even large manuscript images at real size. DIAMM has huge storage needs: some of the restored images are five times as large as the original, giving sizes of c. 2 GB each. Issues of storage capacity and delivery capabilities (see below) will of course need to be addressed in OCVE.
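
To give a feel for the storage implications, the figures quoted above can be turned into a back-of-envelope estimate. The sketch below takes the quoted sizes to be megabytes and counts 1 GB as 1000 MB; the numbers of captures and restored images in the example are purely hypothetical and are not DIAMM or OCVE figures.

```python
CAPTURE_MB = 350       # file size per capture, as quoted above
RESTORED_FACTOR = 5    # restored images can be about five times as large


def storage_gb(captures: int, restored: int) -> float:
    """Estimate storage in GB for raw captures plus restored versions."""
    total_mb = captures * CAPTURE_MB + restored * CAPTURE_MB * RESTORED_FACTOR
    return total_mb / 1000


# One restored image: 350 MB x 5 = 1750 MB, i.e. roughly the "c. 2 GB" cited.
restored_size_gb = CAPTURE_MB * RESTORED_FACTOR / 1000

# Hypothetical workload: 100 captures, 10 of which also need restoration.
example_total_gb = storage_gb(captures=100, restored=10)
```

Even this modest hypothetical workload comes to some tens of gigabytes before any derivative images for web delivery are counted.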

Craig-McFeely suggested that there might be a danger in giving too many tools to users because of the possibility that they could come up with bad results if not trained in their use. She also pointed to the difficulty of deciding what is reproduction and what is editing. She suggested that many scholars might not be prepared to put their scholarship on the web as part of projects like DIAMM and OCVE, something that both projects have been considering as part of commentary, annotation, etc. DIAMM was conceived of from the beginning as an archive, NOT a delivery system: it was set up to rescue fragments that were in danger of disappearing, and indeed has unearthed other important fragments of medieval music. However, the audience for DIAMM is very keen that there should be a delivery system, and developments in web technologies since the project started have made this more feasible. DIAMM was accordingly awarded a planning grant by the Andrew W. Mellon Foundation for the investigation of a delivery system. The tools to be investigated will have some overlap with OCVE and the technical developments for both are being led by CCH.

Issues in common between OCVE and DIAMM thus include legal matters (DIAMM's development of licences & detailed legal documentation could be directly applicable), storage/delivery implications, digital imaging standards and modes of capture, implications of making 'editorial' changes to images, the creation of a Scholar's Workbench, print disabling (e.g. PDF rather than JPG), risk of image impairment through compression, the need to combat colour infidelity/inconsistency across users' different display screens (reference was made to the Bach digital thumbnails, which show colourscales and greyscales for calibration purposes), and pop-up content (e.g. page and bar nos. in OCVE).

Tim Crawford presented a wide range of relevant projects in musicological research, which are summarized on his (temporary) web page at http://www.soi.city.ac.uk/~timc/chopin/.

  • First of all, a great deal of work has been done on Optical Music Recognition, and there are a number of nascent systems for this, including MidiScan, Nightingale, and Solero. Some work is also going on in music manuscript OCR, though this is very much in its infancy. There is work being done at the University of Leeds in Optical Manuscript Analysis (http://www.leeds.ac.uk/music/omr/). The particular problem of film scores, which are often written in pencil, is one of the issues being researched at Leeds. The Lester S. Levy Collection of Sheet Music Project at Johns Hopkins University (http://dkc.mse.jhu.edu/levy2) has made available page images of sheet music with catalogue records, and is now experimenting with recognition of the music using the Gamera recognition tool, and representing the underlying structures of the music in the Guido musical representation language. The Gamera software can be trained to improve the recognition. The derived music is not for performance, but is aimed at capturing the essence of the pieces and enabling their recognition. Gamera was developed by JHU, and is more flexible than commercial systems: the intention was to capture 29,000 pieces of popular American music. It can be run on Windows.
  • Music representation: there are a number of systems for representing music. Guido Music Notation Format (http://www.salieri.org/guido/) is a formal language for score-level music representation. It is a plain-text (i.e. human-readable and platform-independent) format capable of representing all information contained in conventional musical scores. A Guido file is a text file that can be turned into music. MusicXML: Recordare (http://www.recordare.org/xml.html) has developed MusicXML technology to create an Internet-friendly method of publishing musical scores, enabling musicians and music fans to get more out of their online music. Many projects now use MusicXML, especially as an interchange format. There needs to be an automated process for creating XML - this does not have to be perfect but must allow faster processing. There also needs to be coordinate information to map markup back to the notes on the page, and Finale can be used to create publishable music from XML.
  • The Music Encoding Initiative has taken as a model the Text Encoding Initiative (TEI) which over the last 15 years has achieved a high degree of buy-in from the scholarly community for creating tools and standards for the representation of text. The MEI is primarily concerned with musical expression which is in, or can take, a written form. It is limited to Common Music Notation (CMN) and is planned to be software independent. It is not primarily an aid to musical composition just as TEI does not function as an aid in the creation of text. It is intended to provide some tools for the underlying representation of music. The MEI is based at the University of Virginia (http://dl.lib.virginia.edu/bin/dtd/mei/).
  • Plaine and Easie is a code for the representation of musical notation in ASCII characters. It is used by RISM for the representation of musical incipits in its catalogues. Cambridge University has produced a converter to turn Plaine and Easie code into musical notes, and the University of Utrecht has produced pae2xml, which can turn Plaine and Easie code into XML (http://www.cs.uu.nl/people/rtypke/pae2xml/).
  • Other projects that the Chopin project will need to investigate include thematic catalogues (RISM (http://rism.stub.uni-frankfurt.de/index1_e.htm), and RISM searching through Orpheus, which is under development and not yet publicly available), and Music Information Retrieval systems (http://give-lab.cs.uu.nl/MIR/mirsystems/). Also relevant is Aruspix, an entirely automatic system which makes it possible to superimpose two specimens of a musical score and highlight their differences in order to make them much more easily identifiable (see above).
  • Of interest too is Yo Tomita's work on Bach's Well-Tempered Clavier at http://www.music.qub.ac.uk/~tomita/wtc2.html.
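
To make the point about music representation and coordinate information more concrete, the following sketch builds a minimal MusicXML fragment for a single bar using Python's standard library, and keeps a separate table linking the bar number to a region of the page image. The notes, the part name and the pixel coordinates are invented, and the bar_boxes table is our own hypothetical device for the example, not a feature of MusicXML itself.

```python
import xml.etree.ElementTree as ET


def one_bar_score(notes, bar_number=1):
    """Build a minimal MusicXML score-partwise tree holding a single bar.

    notes is a list of (step, octave, duration, type) tuples.
    """
    score = ET.Element("score-partwise")
    part_list = ET.SubElement(score, "part-list")
    score_part = ET.SubElement(part_list, "score-part", id="P1")
    ET.SubElement(score_part, "part-name").text = "Piano"
    part = ET.SubElement(score, "part", id="P1")
    measure = ET.SubElement(part, "measure", number=str(bar_number))
    attributes = ET.SubElement(measure, "attributes")
    # One division per quarter note, so a duration of 1 is a quarter note.
    ET.SubElement(attributes, "divisions").text = "1"
    for step, octave, duration, note_type in notes:
        note = ET.SubElement(measure, "note")
        pitch = ET.SubElement(note, "pitch")
        ET.SubElement(pitch, "step").text = step
        ET.SubElement(pitch, "octave").text = str(octave)
        ET.SubElement(note, "duration").text = str(duration)
        ET.SubElement(note, "type").text = note_type
    return score


# Invented content: two quarter notes in bar 1.
score = one_bar_score([("C", 4, 1, "quarter"), ("E", 4, 1, "quarter")])
xml_text = ET.tostring(score, encoding="unicode")

# Hypothetical link from bar number to (left, top, right, bottom) in pixels
# on the page image, so that markup can be traced back to the source.
bar_boxes = {1: (120, 340, 410, 520)}
```

A complete score would need further elements (key, time signature, clef and so on); the sketch shows only the nesting of part, bar and note.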

John Bradley pulled many of the strands together that were discussed throughout the day with the presentation of a proof-of-concept for OCVE using materials available online from the Chopin Early Editions project in Chicago (specifically the Prelude in C-sharp minor Op. 45 as published by Mechetti and Schlesinger, i.e. the Austrian and French first editions respectively). He first separated the music into bars (using the Map Edit software). He then drew boxes around the bars, attached a label to each one, mapped the bar locations on the page, and created a bar location database which could be used to assemble collections of bars in different versions of the sources and to create a complex system of navigation. In doing this he found a number of problems that would need to be resolved, including the key one of overlapping hierarchies: bars, slurring, stave, etc. He remarked that we are trying to capture two different aspects of the sources: image - what's on the page? music - what is the content? He then experimented with some different ways of presenting the content in a web environment. (In this respect comparison with the Chicago project will be particularly useful, in that similar questions of content definition and mark-up had to be confronted there.) He proposed that we could attach notes to parts of the text, with discrepancies marked up with popup annotations. These could be used to help analyse the sources. He suggested that it would be possible to categorize annotations so that they could be more useful, for instance so that the user could search for different kinds of annotations (e.g. with regard to changes made by Chopin himself, as opposed to house editors). He pointed out the need for a tightly controlled vocabulary and syntax for adding annotations (which the scholarly researchers within the full-scale OCVE project would need to attend to), but suggested that there could also be some free text. 
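
The bar location database and the categorized annotations described above might be modelled roughly as follows. This is a sketch under our own assumptions, not a description of Bradley's proof-of-concept: the record layout, the category names in the controlled vocabulary, the edition labels and the pixel coordinates are all invented, and the problem of overlapping hierarchies is not addressed.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Tuple


class Category(Enum):
    """Hypothetical controlled vocabulary for annotation types."""
    COMPOSER_REVISION = "composer revision"
    HOUSE_EDITOR_CHANGE = "house editor change"
    ENGRAVING_ERROR = "engraving error"


@dataclass
class Annotation:
    category: Category
    note: str = ""   # optional free text alongside the controlled category


@dataclass
class BarRegion:
    source: str                      # label of the edition
    page: int
    bar: int
    box: Tuple[int, int, int, int]   # left, top, right, bottom in pixels
    annotations: List[Annotation] = field(default_factory=list)


def bars_with(regions: List[BarRegion], category: Category) -> List[BarRegion]:
    """Return the bar regions carrying an annotation of the given category."""
    return [r for r in regions
            if any(a.category is category for a in r.annotations)]


# Invented example data.
regions = [
    BarRegion("edition-A", 1, 1, (40, 100, 300, 220)),
    BarRegion("edition-A", 1, 2, (300, 100, 560, 220),
              [Annotation(Category.HOUSE_EDITOR_CHANGE, "placeholder note")]),
    BarRegion("edition-B", 1, 2, (280, 90, 540, 210),
              [Annotation(Category.COMPOSER_REVISION)]),
]

hits = bars_with(regions, Category.COMPOSER_REVISION)
```

Searching by category in this way corresponds to the suggestion that a user might look only for changes made by Chopin himself, as opposed to those made by house editors.
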
He then looked at some aspects of a possible Scholar's Workstation for OCVE (that could be generalized to other projects - cf. the DIAMM discussion above). This would present a database of sources to the scholar, with some editorial materials as part of the sources. The scholar would be able to add his/her own materials which could be stored locally in a personal workspace or 'published' to the edition, clearly marked as being produced by that particular scholar (perhaps going first of all into a 'quarantine' area for peer review). The whole could constitute a series of online collaborative editions with signed up contributors, editors, etc. to which access could be regulated as necessary. The Workstation could also have a tool for creating 'pathways' (cf. Perseus Project) which could be used for giving presentations, teaching, sending to other scholars for discussion points.
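
The bar location database and categorized annotations described above can be illustrated with a minimal sketch. It is not the proof-of-concept's actual implementation: the class names, field names, category labels, pixel coordinates and the sample note are all invented for illustration, and only the two publisher names come from the report.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class BarLocation:
    source: str   # source label, e.g. "Mechetti"
    page: int     # page number within the source's images
    bar: int      # bar number in the work
    box: tuple    # (left, top, right, bottom) in image pixels (invented values)


@dataclass(frozen=True)
class Annotation:
    source: str
    bar: int
    category: str  # must come from the controlled vocabulary below
    note: str      # free-text remark


# Hypothetical controlled vocabulary; the real one would be set by the editors.
CATEGORIES = {"composer_change", "house_editor_change"}


class BarDatabase:
    def __init__(self):
        self._by_bar = defaultdict(list)
        self._annotations = []

    def add_location(self, loc):
        self._by_bar[loc.bar].append(loc)

    def versions_of_bar(self, bar):
        """Return the location of one bar in every source, sorted by source."""
        return sorted(self._by_bar.get(bar, []), key=lambda loc: loc.source)

    def annotate(self, ann):
        # Reject categories outside the controlled vocabulary.
        if ann.category not in CATEGORIES:
            raise ValueError(f"unknown category: {ann.category}")
        self._annotations.append(ann)

    def annotations_in(self, category):
        return [a for a in self._annotations if a.category == category]


db = BarDatabase()
db.add_location(BarLocation("Schlesinger", 1, 5, (40, 310, 260, 420)))
db.add_location(BarLocation("Mechetti", 1, 5, (35, 300, 250, 410)))
# Invented annotation, not an observation about the actual editions.
db.annotate(Annotation("Schlesinger", 5, "house_editor_change", "example note"))

print([loc.source for loc in db.versions_of_bar(5)])
print(len(db.annotations_in("house_editor_change")))
```

Keying the locations by bar number is what makes it cheap to assemble the same bar from every source side by side. The category check is one simple way to enforce a controlled vocabulary while leaving the note itself as free text.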

Breakout group: technical

  1. The technical team expressed the need to view the project from a higher level of abstraction and to set a research agenda, starting with very broad perspectives and mapping out parallel activities as relevant. The presentations and report from the meeting could form the nucleus of this, but it is also important to look at what is going on elsewhere. As well as the projects mentioned earlier in the meeting, it was suggested that, for instance, the work of Frans Wiering (Utrecht) on the Thesaurus Musicarum Italicarum (http://pcm1671.cs.uu.nl/) and the Indiana Variations 2 project (http://www.dml.indiana.edu/) should be investigated.
  2. More detailed work on what level of granularity the project should be aiming for is also needed. This would include investigating markup, music recognition, image processing, music display, experimentation with capture and digitization. We need to look at what is already available and what tools need to be built for the project. It was stressed that for the pilot project, orders for images should be put in to libraries as soon as possible, and that we need to get something from each library soon in order to test out the working relationships between the project and the libraries. We also need to look at the rights issues very soon, and begin working out model license agreements. These will be very important for the next phase of the project and will be one of the core issues to be worked out for the next grant proposal. DIAMM have a great deal of experience in working with copyright and permissions with a whole range of archive institutions.
  3. The team discussed accessibility issues, especially for the visually impaired. The project needs to ensure 'Bobby' compliance: Bobby is a set of formal web accessibility guidelines which comply with the World Wide Web Consortium (W3C) guidelines. The Bobby Online Free Portal allows projects to test web resources against these guidelines (http://bobby.watchfire.com/bobby/html/en/index.jsp) and it creates a report on problems. To be compliant, we need to make annotations available as sound files, provide audio capability, and choose colour combinations that do not cause problems for colour-blind users.
  4. Design issues - we discussed how to represent the complex data in a way that users can navigate. We felt that perhaps we need an information engineer or experienced web architect. The complex issues to resolve in the Chopin project include:
    • How do the information objects relate to each other?
    • What do they mean as information objects rather than content objects?
    • Does the user decide how to contextualize the various pieces of information or is this preset by the project?
  5. The need to set up a project web site and to establish dissemination plans was flagged - these should be done as soon as possible.
  6. The need for some in-depth user studies was discussed - a project such as this could have many different uses for different audiences, so we need to work out how to make materials accessible to these audiences, using pathways, different interfaces etc. and addressing the need for a single product which can serve as both an electronic archive and an electronic 'edition'. Users might include researchers, students, performers, historians of publishing, librarians, archivists, and many others. A number of institutions and projects are grappling with these issues: British Library, Corpus Vitrearum Medii Aevi digitization project, DIAMM, etc. We need to follow up these projects.
  7. Other technical issues discussed were:
    • Music OCR, though the opinion was expressed that scholars will never be happy with output from this.
    • XML: do we need to work with this as an underlying representation? Should we use offset markup?
    • OMRAS, which has developed proximate matching that can be used as the basis for searching.
    • Data complexity: we need to deal with a mass of data and so we may want to use database technology to manage this. CCH are experts at this.
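
The question about offset markup connects to the overlapping hierarchies noted in the proof-of-concept presentation. The sketch below assumes that 'offset markup' means what is often called standoff annotation: each layer (bars, slurs) is recorded as a range of positions over a shared sequence of events rather than as nested tags. That reading is an assumption, and the event names, labels and ranges are invented for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Span:
    kind: str    # layer name, e.g. "bar" or "slur"
    label: str
    start: int   # index of the first event covered (inclusive)
    end: int     # index one past the last event covered (exclusive)


# Invented event sequence standing in for notes in reading order.
events = ["n0", "n1", "n2", "n3", "n4", "n5", "n6", "n7"]

# Two independent layers over the same events. The slur crosses the barline,
# so neither layer nests cleanly inside the other.
spans = [
    Span("bar", "bar 1", 0, 4),
    Span("bar", "bar 2", 4, 8),
    Span("slur", "slur A", 2, 6),
]


def covering(index, spans):
    """Labels of every span, from any layer, that covers one event."""
    return [s.label for s in spans if s.start <= index < s.end]


def overlaps(a, b):
    """True if two half-open ranges share at least one event."""
    return a.start < b.end and b.start < a.end


slur = spans[2]
print(covering(3, spans))
print([s.label for s in spans if s.kind == "bar" and overlaps(s, slur)])
```

Because each span only points at positions, a slur that begins in one bar and ends in the next needs no special handling, which a single tree of nested elements would not allow directly.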

Breakout group: content

  1. The first issue has to do with the OCVE's users: who are they and how will they use the resource? We are developing an archive that users could, in principle, create editions from; we are not however striving for a single 'edition' which imposes on its users an exclusive understanding of the sources via comprehensive, in-depth textual commentary and other extramusical apparatus. Instead, we will create a somewhat 'neutral' resource in the sense described earlier; we will also develop tools (although not necessarily at the pilot stage) for adding notes, commentaries etc. to the extent that these might be useful (e.g. to performers, whose special needs were discussed at length) without being dictatorial. A three-part model was therefore suggested:
    1. A general introduction outlining the range of problems. This would be rather like a scholarly article, exploiting the expertise of OCVE researchers and those associated with the project.
    2. Sample pieces where users are taken through the problems - this would be a kind of Critical Commentary, in the form of pop-up boxes illustrating simple and complex case-study examples alike.
    3. The general resource pool, which would be neutral while also providing tools for users to do their own work. This might include some potential for signposting or signalling, some of which could be provided by the project, while others could be put in by users using tools provided by the project. Most of this model will be developed in the wake of the pilot project, though some textual commentary will be prepared even at this stage for the sake of experimentation and the eventual development of parts 2 and 3 of the above model.
  2. What sources and what types of source should be included in the full-scale OCVE? An indexing tool is needed, ideally with the capacity to differentiate between sources directly associated with Chopin, those used by his students, those prepared by copyists or even by specific copyists (e.g. Julian Fontana), those used in production (Stichvorlagen) or prepared as presentation manuscripts, etc. Having a searchable version of the Annotated Catalogue would be extremely beneficial to users (with relevant records attached to each source); permission to use this will need to be sought from CUP in due course, and robust search engines will have to be developed. It was suggested that a piece-by-piece approach to the materials without elaborate cross-referencing capabilities would not exploit the potential fully, but the advantage of the electronic medium is that we can incorporate and exploit many kinds of approaches. In later phases of the project we may want to add in other materials such as secondary sources and transcriptions of performers' versions.
  3. There are a number of structural differences among the sources: some are sketches, and although most are editions there are many different versions. Each of the different sources for a given work may well have a unique layout or other physical idiosyncrasies. There are different ways of manipulating elements within the sources, all of which will need to be addressed in the OCVE. For instance, in dealing with sketch material (which poses particularly thorny problems), one might search for the presence of 'sketch fragment x' across the published material (which might appear, say, in bars 1, 5, 16, 93 etc.); conversely, one might search for all versions of 'bar y' across the sketch material (which might appear, say, in sketch fragments a, c, r, v etc.). One might also wish to compare parallel passages within a given piece, not to mention larger-scale variants (as in the case of Bruckner) than those common in Chopin. The relationships between such isolated materials will need to be made explicit. Ways will also have to be developed of signalling whether no material in other sources is relevant to a given fragment or bar or, conversely, whether there is a multiplicity of corresponding material across the sources. One of the key points that the content group brought out was that more scholarly input would be required than was originally envisaged, and this will have a bearing on the eventual application for funding the full-scale OCVE.
  4. Limitations of screen size for manipulating music were discussed. Extending the project to orchestral and operatic scores could therefore be problematic. It might be possible to separate the scores into different instrumental parts or families of instruments (e.g. all string parts might be compared, subject to screen size issues). Ongoing consultation with experts on different sorts of music - ranging from early to complex 20th-century works - will need to take place both within and after the pilot project, in order to cater for a range of notational possibilities. For piano pieces, we should consider defining the separate RH and LH parts as objects as well as dividing by bars.
  5. Long-term sustainability: the project should have a long shelf life if the issues are thought through from the beginning. The issues of sustainability need to be planned in the pilot and will include both scholarly and technical maintenance. Using XML will help to maintain data but what about all the complex interlinking? Where will the home for the Chopin archive be? Does it matter if this survives? Presumably Mellon will want to build a critical mass of music materials (cf. MusicSTOR) with scholarly portals?
  6. Would we want to add in historic recordings later? Would it be possible to match these up with the sources digitised within the Variorum to find out what editions were used? The pilot could perhaps have a small scoping/feasibility study for this, though more probably this will need to wait for the full-scale OCVE project (at which time the AHRB Centre for the History and Analysis of Recorded Music, which involves not only the OCVE Project Director but one member of its Advisory Panel, will be well underway). It is relatively easy to digitize sound from 78s and LPs, and the copyright situation is straightforward in that under UK law it remains in force for only 50 years from the date of issue, so everything before 1953 is in the public domain. Some issues that were discussed re sound recordings were:
    • What about commercial involvement?
    • How much post-processing would need to be done?
    • Would sound recordings be interesting to Mellon?
    • What would be the minimum amount of recorded material for extracts to be musically meaningful (e.g. '8-bar sentence')?
  7. Final issues: further funding; the need for routes into the project for a diverse range of skills; the need to have a portfolio of projects and partners for potential funders.
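
The two-way searching described in item 3 of the content group's list (from a sketch fragment to the bars where it appears, and from a bar to the sketch fragments that correspond to it) can be sketched as a pair of mirrored lookups. This is only an illustration under stated assumptions: the class and method names are hypothetical, and the links reuse the report's own placeholder labels ('x', 'a', 'c', bars 1, 5, 16, 93) rather than real source data.

```python
from collections import defaultdict


class CorrespondenceIndex:
    """Keeps fragment-to-bar and bar-to-fragment links in step."""

    def __init__(self):
        self._bars_for_fragment = defaultdict(set)
        self._fragments_for_bar = defaultdict(set)

    def link(self, fragment, bar):
        # Record the relationship in both directions at once.
        self._bars_for_fragment[fragment].add(bar)
        self._fragments_for_bar[bar].add(fragment)

    def bars_for(self, fragment):
        return sorted(self._bars_for_fragment.get(fragment, ()))

    def fragments_for(self, bar):
        return sorted(self._fragments_for_bar.get(bar, ()))

    def status(self, bar):
        """Signal whether a bar has no, one, or several corresponding fragments."""
        count = len(self._fragments_for_bar.get(bar, ()))
        if count == 0:
            return "none"
        return "single" if count == 1 else "multiple"


index = CorrespondenceIndex()
for bar in (1, 5, 16, 93):
    index.link("x", bar)      # placeholder: 'sketch fragment x'
index.link("a", 5)            # placeholder fragments sharing bar 5
index.link("c", 5)

print(index.bars_for("x"))
print(index.fragments_for(5))
print(index.status(2), index.status(1), index.status(5))
```

The `status` method corresponds to the need to signal whether a bar has no related material or a multiplicity of it; a real system would presumably track this per source rather than per work.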