New Ways of Working: Tools for 21C Research

This workshop will provide a space for the AI community to share and discuss innovative, non-standard ways of working and custom workflows that go beyond just using distinct tools, or integrate such tools in an ingenious way. Examples include (partially) automating academic tasks such as paper writing, figure generation, composition of experimental tables, event planning, email filtering (and email alternatives), etc.

📺 You can view a summary and the recording of the workshop toward the bottom of this page. Alternatively, follow the YouTube link placed at the top of this page. 📺

TAILOR – Trustworthy AI through Integrating Learning, Optimisation and Reasoning – is a network of over 54 partners across Europe, spanning universities and companies interested in pushing the boundaries of artificial intelligence research. A part of this effort (Workpackage 9, in particular) is dedicated to identifying and disseminating technologies and workflows to streamline academic tasks. For example, we’ve been working on a platform to generate (interactive) slides, computational notebooks and online articles from a single markdown source – see our ICLR 2021 workshop paper and its accompanying online exhibit for more details. To understand the broader need for novel, technology-supported (possibly AI-enabled) tools within academic and research-oriented circles, we are organising this workshop for TAILOR members and the broader AI community to present their own bespoke workflows, signal what is needed but not yet available, or simply participate in the discussion. If you are a forward-looking AI researcher who believes technology should adapt to research practices rather than stand in the way of innovation, we want to hear from you! In particular, the workshop aims to explore demonstrations of custom approaches to:

integrating and automating various stages of paper writing;
managing experiments;
managing and integrating source code associated with academic outputs;
little-known tools (and your adaptation thereof) that make academic life easier;
bots – e.g., Slack, GitHub – to automate mundane tasks;
and much more.

If you would like to contribute to the discussion, there will be a dedicated space in the latter part of the workshop for impromptu presentations (no need for a slideshow). We welcome brief descriptions of custom workflows and outlines of envisaged workflow ideas that you would like to discuss in a broader group. We are interested to hear about every bespoke workflow or tool, however small. Also, make sure to invite your colleagues and collaborators.

Schedule

The workshop is scheduled for the 10th of September 2021 between 13.00 and 15.00 CEST.

13.00–13.15: Welcome and Opening Remarks

Speaker: Peter Flach, University of Bristol

Outline: Embracing new (AI-powered) technologies and workflows in academic research.

Bio: Peter Flach has been Professor of Artificial Intelligence at the University of Bristol since 2003. An internationally leading scholar in the evaluation and improvement of machine learning models using ROC analysis and calibration, he has also published on mining highly structured data, and on the methodology of data science. His books include Simply Logical: Intelligent Reasoning by Example (John Wiley, 1994) and Machine Learning: the Art and Science of Algorithms that Make Sense of Data (Cambridge University Press, 2012). From 2010 until 2020, Prof Flach was Editor-in-Chief of the Machine Learning journal, one of the two top journals in the field, which has been published for over 25 years by Kluwer and now Springer. He was Programme Co-Chair of the 1999 International Conference on Inductive Logic Programming, the 2001 European Conference on Machine Learning, the 2009 ACM Conference on Knowledge Discovery and Data Mining, and the 2012 European Conference on Machine Learning and Knowledge Discovery in Databases in Bristol. He is a founding board member and current President of the European Association for Data Science. He is a Fellow of the Alan Turing Institute for Data Science and Artificial Intelligence.

13.15–13.30: SWI-Prolog SWISH for Online Material and Language Extensions

Speaker: Fabrizio Riguzzi, University of Ferrara

Outline: In this talk I will discuss how to use the SWISH web application of SWI-Prolog for:

embedding source code in online material (e.g. Learn Prolog Now, Simply Logical);
supporting languages that are extensions of Prolog (e.g. CPLINT).

For more details, please see this arXiv paper.

Bio: Fabrizio Riguzzi is Full Professor of Computer Science at the Department of Mathematics and Computer Science of the University of Ferrara. He was previously Associate Professor and Assistant Professor at the same university. He received his Master's degree and PhD in Computer Engineering from the University of Bologna. Fabrizio Riguzzi is vice-president of the Italian Association for Artificial Intelligence and Editor-in-Chief of Intelligenza Artificiale, the official journal of the Association. He is the author of more than 150 peer-reviewed papers in the areas of Machine Learning, Inductive Logic Programming and Statistical Relational Learning. His aim is to develop intelligent systems by combining techniques from artificial intelligence, logic and statistics in novel ways.

13.30–13.45: Creating Taxonomies and Ontologies with SONNET

Speaker: Romy van Drie, TNO

Outline: Developing ontologies is an expensive and intensive task. The goal of SONNET (Semantic Ontology Engineering Toolset) is to assist in developing taxonomies or ontologies. SONNET contains algorithms to kickstart the ontology development using a set of relevant documents. In this talk, I give a demonstration of this toolset. I showcase an algorithm to generate key concepts, as well as one to generate a first version of an ontology.

Bio: I’m a scientist at TNO with a background in AI and linguistics. In my work, I primarily focus on Natural Language Processing.

13.45–14.00: Managing Machine Learning Experiments with OpenML

Speaker: Joaquin Vanschoren, Eindhoven University of Technology

Outline: Is massively collaborative machine learning possible? Can we share and organize our collective knowledge of machine learning to solve ever more challenging problems? In a way, yes: as a community, we are already very successful at developing high-quality open-source machine learning libraries, thanks to frictionless collaboration platforms for software development. However, code is only one aspect. The answer is much less clear when we also consider the data that goes into these algorithms and the exact models that are produced. A tremendous amount of work and experience goes into the collection, cleaning, and preprocessing of data and the design, evaluation, and finetuning of models, yet very little of this is shared and organized in a way so that others can easily build on it.

Suppose one had a global platform for sharing machine learning datasets, models, and reproducible experiments in a frictionless way so that anybody could chip in at any time to share a good model, add or improve data, or suggest an idea. OpenML is an open-source initiative to create such a platform. It allows anyone to share datasets, machine learning pipelines, and full experiments, organizes all of it online with rich metadata, and enables anyone to reuse and build on them in novel and unexpected ways. All data is open and accessible through APIs, and it is readily integrated into popular machine learning tools to allow easy sharing of models and experiments. This openness also allows a budding ecosystem of automated processes to scale up machine learning further, such as discovering similar datasets, creating systematic benchmarks, or learning from all collected results how to build the best machine learning models and even automatically doing so for any new dataset. We welcome all of you to become a part of it.

Bio: Joaquin Vanschoren is an assistant professor at the Eindhoven University of Technology (TU/e). His research focuses on the automation of machine learning (AutoML) and Meta-Learning. He co-authored and co-edited the books ‘Automated Machine Learning: Methods, Systems, Challenges’ and ‘Meta-learning: Applications to AutoML and data mining’, published over 100 articles on these topics, and received an Amazon Research Award, an Azure Research Award, the Dutch Data Prize, and an ECML PKDD demonstration award. He founded and leads OpenML.org, an open science platform for machine learning. He is a founding member of the European AI associations ELLIS and CLAIRE, chairs the Open Machine Learning Foundation, and co-chairs the W3C Machine Learning Schema Community Group. He has been a tutorial speaker at NeurIPS and AAAI, and has given more than 20 invited talks, including at ECDA, StatComp, IDEAL, and workshops at NeurIPS, ICML, and SIGMOD. He is datasets and benchmarks chair at NeurIPS 2021, was program chair of Discovery Science 2018, general chair at LION 2016, and demo chair at ECML PKDD 2013, and has co-organized the AutoML and Meta-Learning workshop series at NeurIPS and ICML from 2013 to 2021.

14.00–14.15: Adapting and Extending the Jupyter Book Ecosystem to Bespoke Presentation Needs

Speaker: Kacper Sokol, University of Bristol

Outline: In this talk I will discuss various customisations and extensions of the Jupyter Book ecosystem that allowed us to create bespoke online learning materials. In particular, we built plugins to embed non-standard interactive code boxes (SWI-Prolog, CPLINT, ProbLog) and linked exercise–solution blocks. We also explored the reveal.js open-source library to generate slides from Markdown sources intended as online articles, and experimented with the RISE Jupyter Notebook plugin, which enables the creation of interactive slide shows. All of these tools come together to create a suite of engaging online learning materials.
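To give a flavour of what generating slides from an article's Markdown source can involve, here is a minimal Python sketch. It is not the tooling presented in the talk: the function name `markdown_to_slides` and the rule of starting a new slide at every level-1 or level-2 heading are assumptions made purely for illustration. The output joins slides with a `---` line, which the reveal.js Markdown plugin uses as its default slide separator.

```python
import re

# Slide separator: a line holding only '---', surrounded by blank lines.
SLIDE_SEPARATOR = "\n\n---\n\n"


def markdown_to_slides(markdown_text: str) -> str:
    """Split a Markdown article into slides at level-1 and level-2 headings.

    Hypothetical rule for this sketch: every '# ' or '## ' heading opens a
    new slide. Lines inside fenced code blocks are never treated as headings.
    """
    slides = []
    current = []
    in_fence = False
    for line in markdown_text.splitlines():
        # Track fenced code blocks so '# comment' lines do not split slides.
        if line.startswith("```"):
            in_fence = not in_fence
        # A heading outside a fence closes the slide collected so far.
        if not in_fence and re.match(r"#{1,2} ", line) and current:
            slides.append("\n".join(current).strip())
            current = []
        current.append(line)
    if current:
        slides.append("\n".join(current).strip())
    # Discard chunks that contain only whitespace.
    slides = [slide for slide in slides if slide]
    return SLIDE_SEPARATOR.join(slides)


if __name__ == "__main__":
    article = "# Title\n\nIntro text.\n\n## Section A\n\nBody A.\n"
    print(markdown_to_slides(article))
```

The resulting text can be placed in a reveal.js Markdown section, while the original file remains usable as an online article.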

Bio: Kacper is a Senior Research Associate at the University of Bristol, working with the European Union’s Horizon 2020 TAILOR project – an ICT-48 AI Research Excellence Centre. His main research focus is transparency – interpretability and explainability – of data-driven predictive systems based on artificial intelligence and machine learning algorithms. In particular, he has done work on enhancing transparency of logical predictive models (and their ensembles) with counterfactual explanations, and building robust modular surrogate explainers. Kacper is the designer and lead developer of FAT Forensics – an open source fairness, accountability and transparency Python toolkit. He is also the lead author of a collection of online interactive training materials about machine learning explainability (to be published in December 2021) created in collaboration with the Alan Turing Institute – the UK’s national institute for data science and artificial intelligence.

14.15–15.00: Discussion and Impromptu Presentations

Moderators: Peter Flach and Kacper Sokol, University of Bristol

Outline: Developing, disseminating and adapting tools for 21st-century research.

Workshop Summary and Recording

This workshop discussed modern tools that may help researchers to do their job more effectively. It covered a range of workflows including online interactive programming language environments, taxonomy and ontology development toolkits, online collaborative machine learning platforms, and novel (scientific) publishing paradigms. In particular, the presenters discussed the following tools and resources:

Organisers

Kacper Sokol and Peter Flach
University of Bristol
Workpackage 9, TAILOR