The CONCEPTS 2026 Data Analysis Showcase (CDAS) focuses on discovery, interpretation, and hybrid analysis using tools and methods from the realm of conceptual knowledge structures.
This workshop aims to bridge the gap between theoretical results of structural conceptual analysis, real-world data, and modern machine learning methods. We provide three datasets and challenge you to demonstrate the power of your methods and tools on them. We particularly encourage hybrid approaches that pair, for example, Formal Concept Analysis (FCA) with sub-symbolic methods such as embeddings, conceptual visualizations with modern embedding models, or any other interesting combination.
The Datasets
We provide three distinct datasets representing different analytical challenges. Participants are required to provide an analysis for at least one of these datasets, though we encourage submissions that explore multiple datasets.
Dataset A: Geopolitical Evolution (Pure Binary)
- Type: Formal Context (CXT/CSV).
- Source: Extracted from Wikidata, focusing on diplomatic relations and organizational memberships.
- Goal: A “pure” FCA dataset. Uncover geopolitical clusters and identify “conceptually stable” vs. “conceptually volatile” nations.
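As a starting point for a binary dataset of this kind, the following minimal sketch enumerates all formal concepts of a small context by closing the object intents under intersection. The objects `X`, `Y`, `Z` and the attributes `org1` to `org3` are hypothetical placeholders, not values from Dataset A, and this naive approach is only practical for small contexts.

```python
# Hypothetical toy context: object -> set of attributes (not taken from Dataset A).
context = {
    "X": {"org1", "org2"},
    "Y": {"org1"},
    "Z": {"org2", "org3"},
}

def all_attributes(ctx):
    """Union of all attributes occurring in the context."""
    return set().union(*ctx.values())

def extent(ctx, attrs):
    """Objects that have every attribute in attrs."""
    return {g for g, row in ctx.items() if attrs <= row}

def intent(ctx, objs):
    """Attributes shared by every object in objs (all attributes if objs is empty)."""
    shared = all_attributes(ctx)
    for g in objs:
        shared &= ctx[g]
    return shared

def concepts(ctx):
    """Enumerate all (extent, intent) pairs.

    Every concept intent is an intersection of object intents, so we start
    from the full attribute set and intersect with one object row at a time.
    """
    intents = {frozenset(all_attributes(ctx))}
    for row in ctx.values():
        intents |= {frozenset(row & i) for i in intents}
    return sorted(
        ((frozenset(extent(ctx, i)), i) for i in intents),
        key=lambda c: (len(c[0]), sorted(c[0])),
    )

for ext, itt in concepts(context):
    print(sorted(ext), sorted(itt))
```

For the full dataset, a dedicated FCA tool or a faster algorithm such as NextClosure is likely the better choice; the sketch only illustrates what a concept is.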
Dataset B: Global Country Indicators (General/Numerical)
- Type: Multi-valued tabular data (a pre-scaled formal context is also provided).
- Source: Socio-economic and sustainability-related indicators per country, extracted from Wikidata (population, Human Development Index, life expectancy, fertility rate, democracy index, number of official languages, number of diplomatic relations, continent).
- Goal: A general dataset requiring scaling or pattern structures. Identify non-trivial dependencies between development and sustainability indicators and explore how conceptual structures may reveal new insights.
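To illustrate what scaling means here, the sketch below applies a simple ordinal scale to one numerical column, turning each value into binary attributes of the form `column>=threshold`. The objects `P`, `Q`, `R`, the column name `hdi`, the values, and the thresholds are all hypothetical; they are neither taken from Dataset B nor a description of how the provided pre-scaled context was built.

```python
def ordinal_scale(table, column, thresholds):
    """Map a numeric column to binary attributes 'column>=t' for each threshold t.

    Objects with a missing value (None or absent) receive no attributes.
    """
    scaled = {}
    for obj, row in table.items():
        value = row.get(column)
        scaled[obj] = (
            set()
            if value is None
            else {f"{column}>={t}" for t in thresholds if value >= t}
        )
    return scaled

# Hypothetical values, for illustration only.
table = {
    "P": {"hdi": 0.92},
    "Q": {"hdi": 0.71},
    "R": {"hdi": None},
}

scaled = ordinal_scale(table, "hdi", [0.55, 0.7, 0.8])
for obj, attrs in scaled.items():
    print(obj, sorted(attrs))
```

The choice of thresholds shapes the resulting lattice considerably, so it is worth reporting alongside any findings. Pattern structures avoid this step by working on the numerical values directly.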
Dataset C: Eurovision Song Contest Results
- Type: Multi-valued tabular data.
- Source: Contest results 1956–2023 (country, performer, song title, placements, and the modern televote/jury point split), derived from the public Eurovision Song Contest Dataset (Spijkervet et al.).
- Goal: Reveal underlying structures and dependencies, in particular with respect to song titles. We encourage using machine learning embeddings to derive categorical features based on semantic similarity.
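One possible way to turn embeddings into categorical features is to assign each title to its most similar prototype vector by cosine similarity. The sketch below assumes that the embeddings have already been computed with a model of your choice; the three-dimensional vectors, the keys `title_1` to `title_3`, and the labels `theme_a` and `theme_b` are hypothetical stand-ins, not data from Dataset C.

```python
import math

def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either has zero norm)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def nearest_label(vector, prototypes):
    """Label of the prototype vector most similar to the given vector."""
    return max(prototypes, key=lambda label: cosine(vector, prototypes[label]))

# Hypothetical precomputed embeddings (real ones have hundreds of dimensions).
embeddings = {
    "title_1": [0.9, 0.1, 0.0],
    "title_2": [0.8, 0.3, 0.1],
    "title_3": [0.0, 0.2, 0.9],
}

# Hypothetical prototypes, e.g. cluster centroids or embeddings of theme words.
prototypes = {
    "theme_a": [1.0, 0.0, 0.0],
    "theme_b": [0.0, 0.0, 1.0],
}

features = {title: nearest_label(vec, prototypes) for title, vec in embeddings.items()}
print(features)
```

The resulting labels can be used as a nominal attribute in a formal context, one binary attribute per theme.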
Submission Formats
We favour contemporary presentation formats. A submission must include the equivalent of at least 2 pages of results and analyses (in LNCS style) for at least one dataset. It may be supplemented by, or presented in full as, one of the following:
- Scientific Report: A 2–6 page PDF detailing methodology and findings.
- Interactive Blog Post: An online article (e.g., personal blog) featuring interactive visualizations.
- Living Repository: A public Git repository (GitHub/GitLab) containing well-documented Jupyter/Observable notebooks and source code.
Regardless of the format, the submission must clearly describe the methodology, the results, and any hybrid pairings (e.g., FCA + embeddings).
Evaluation and Awards
Submissions are evaluated on a rolling basis. Two special awards will be voted on during the workshop:
- The “Eureka” Award — for the most surprising or impactful insight.
- The “Bridge Builder” Award — for the best integration of conceptual methods with other ML techniques.
Important Dates (2026) — Rolling Process
| Date | Milestone |
|---|---|
| 10 June | Release of full datasets. |
| 17 June | Rolling submission window opens. Participants may submit their analysis at any time after this date. |
| Ongoing | Rolling notifications. Feedback and notification of acceptance within two weeks of each submission. |
| 31 July | Final submission deadline. Last opportunity to submit results for CDAS 2026. |
| 31 August | Workshop session in Montpellier. |
Datasets & Submission
The datasets are available here:
dataset-A-geopolitical-evolution
To submit, email your submission to cdas@cs.uni-kassel.de, with the tag [CDAS 2026 at CONCEPTS] and the name of your submission in the subject line. Attach the PDF, if applicable, or include the link(s) to your material. Please also provide a short abstract of about 200 words. If you have any questions, contact us at the same email address.
Organizers
- Tom Hanika, University of Hildesheim
- Giacomo Kahn, Lumière University Lyon 2
