Workshop registrations will close at 5pm Irish Standard Time on 8th July.

Workshop One

Date: Monday 12th July 2021

Time: Morning (Irish Standard Time: UTC + 1 Hr)

9th Workshop on

Challenges in the Management of Large Corpora (CMLC-9)

Special Topic: Design and Management of Research Software


The upcoming CMLC meeting continues the successful series of “Challenges in the management of large corpora” events, previously hosted at the LREC (since 2012) and CL (since 2015) conferences. As in previous meetings, we wish to explore common areas of interest across a range of issues in linguistic research data and tool management, corpus linguistics, natural language processing, and data science, this time with a special focus on tools.

Linguistic research software and other topics of interest

To an even greater extent than in other disciplines, linguistic research data can hardly be used without the help of appropriate research software. As frequently noted at CMLC events, this often relates to the need for client/server approaches, as language data cannot usually be downloaded and processed on the home or lab PC, for legal and logistical reasons. Additionally, due to the complexity and high dimensionality of linguistic data and the unknown nature of the variation factors, specialized tools are needed on the way from raw data to their interpretation. These tools cannot be considered part of a general technical infrastructure.

From the reconstruction or transformation of the raw data and its tokenization onwards, the linguistic assumptions and decisions embodied in research tools, as well as their errors, influence observations, and possibly research results, as much as the research data itself does, if data and tools can be treated separately at all. While approaches to the management of research data have been discussed quite broadly over the last 15 years, research tools have received at best marginal attention.


For more information, please visit the workshop homepage:

Workshop Two

Date: Monday 12th July 2021

Time: Afternoon (Irish Standard Time: UTC + 1 Hr)

#LancsBox: Large corpora, XML and automatic research reports (half-day practical workshop)

#LancsBox is a freely available cross-platform corpus analysis toolbox for researchers and educators. In this workshop, we will focus on new features of #LancsBox (5.1.2 and #LancsBox X), as well as the methodological and design decisions which guided the development of #LancsBox X. This discussion will be contextualised within the most recent debate in corpus linguistics about appropriate methodological solutions for large datasets.

#LancsBox can be used by linguists, language teachers, translators, historians, sociologists, educators and anyone interested in quantitative language analysis. Participants will be given data (corpora), interactive exercises and an early release of #LancsBox X. The workshop is open to anyone interested in innovative exploration of language using computational tools. It does not presuppose any technical or statistical knowledge.  You will learn to:

  • Search corpora efficiently using CQL, smart searches and complex searches.

  • Define subcorpora.

  • Compare multiple corpora.

  • Automatically produce research reports.

Technical requirements: a computer (desktop or laptop) with a minimum of 4GB RAM. #LancsBox runs on Windows, Mac and Linux. Participants will be provided with links to installation packages of #LancsBox X and #LancsBox 5.1.2 prior to the workshop.

Workshop Fees

If you are already registered to attend the conference: 

Both workshops are free to attend for those who register in full for the entire conference. A link will be sent to existing delegates in due course so that they can register separately for one or both workshops.

If you are not registered to attend the conference, but wish to register to attend the workshop(s):

A nominal charge of €20.00 per workshop (or €30.00 to attend both workshops) will apply to individuals who are not registered for the full conference but wish to attend one or both workshops. Please email: for a registration and payment link.