Intelligent Systems for Complex Data Research Group

We find patterns in data and mine information from complexity

Research & Research Areas

We are a team of researchers at Masaryk University in Brno, Czech Republic, specializing in complex data analysis. As part of the DISA laboratory and in close collaboration with the CERIT-SC centre, we aim to discover patterns within vast amounts of complex data.

We tackle unique challenges ranging from exploring patterns in images to the intricate analysis of complex biological structures such as proteins. Our ambition is to redefine the boundaries of effective and efficient processing of large datasets by leveraging our proficiency in machine learning, data mining, and clustering techniques.

Join us!

We are constantly looking for new members: postdocs, students working on bachelor's and master's theses, and more!

We've introduced a Learned Metric Index – an index for complex, unstructured or high-dimensional data built as a structure of machine-learning models. 

We develop AlphaFind, our own application for searching proteins by structural similarity.

Adhering to the Open Science principles, we publish our work without restrictions, and the Learned Metric Index Framework is available online on GitHub.

Results, news and achievements

Best Student Paper at SISAP

Miriama Jánošová and David Procházka received a "Best Student Paper" award at the Similarity Search and Applications 2021 conference for their work on Organizing Similarity Spaces using Metric Hulls.

Interview with Terézia

Terézia gave an interview for the EOSC CZ initiative about her experience during her Ph.D. studies, Open Science and Open Source, and, most importantly, research reproducibility in light of our LMI reproducibility publication.

Pre-print of AlphaFind

We have published a bioRxiv pre-print of AlphaFind, a search engine for protein structure similarity that indexes the whole AlphaFold DB (214M proteins). We're currently awaiting the second round of reviews in the Nucleic Acids Research journal.

Our Collaborations and Outreach

In our team, we strive to develop cooperation with other research teams from both Czechia and abroad. Our most prominent partners are:

Information Systems and Data Mining Research Group, Kiel University (CAU), Germany. Together with Prof. Dr. Peer Kröger and his group, we investigate modern techniques for indexing, searching and (reverse) kNN data retrieval.

Biological Data Management and Analysis Core Facility of the Central European Institute of Technology. Together, we work on applying learned indexing to searching in protein structures produced by AlphaFold.

Engineers from e-INFRA CZ (specifically the CERIT-SC centre) help us fine-tune our algorithms and run our experiments, which often require non-trivial know-how and computing power.

Publications

Team Members

Vlasta is an associate professor at the Faculty of Informatics, Masaryk University, and one of the founding members of the Laboratory of Data Intensive Systems and Applications (DISA). He has a long research record in managing unstructured data for content-based retrieval and in similarity analytics. He is the co-author of about 40 research publications and a seminal book on similarity searching using metric spaces.

RNDr. Matej Antol, Ph.D.

Group co-lead

Matej's research activities span from core research topics on optimizing index structures and the querying process in metric similarity search to applied research on processing and managing research data, managing sensitive data, and working with life-science-specific data and tools, such as genomic or protein data. As executive director of the national large research infrastructure CERIT-SC, he is responsible for integrating the national e-infrastructure e-INFRA CZ. He is also one of the national leaders of Czech efforts towards the adoption of Open Science principles and the implementation of EOSC in the Czech Republic.

Outside of science, he is interested in the world of finance and investing, sings and plays sax and guitar in a small garage band, and, in winter, enjoys playing squash and skiing.

RNDr. Terézia Slanináková | Ph.D. Student

Terka's Ph.D. study is dedicated to exploring how best to apply machine learning, and more specifically learned indexing, to complex data for the purposes of fast search.

At the same time, she is a researcher in Tom Rebok's research group at CERIT-SC, where she uses machine learning to solve real-world problems and co-leads a project focused on creating a national platform for analyzing geospatial data. She is passionate about reproducible research, open source software, and leading bachelor's and master's students.

When she's not researching, she likes spending time outdoors doing virtually any sport, meeting with friends, or putting together a good dish.

Mgr. et Mgr. Jaroslav Oľha | Ph.D. Student

Jaroslav holds two master's degrees, in computer science and in biochemistry, and he has published research in the areas of high-performance computing, similarity searching, and computational chemistry. He is currently finishing his dissertation on HPC kernel autotuning, but his research focus is shifting towards the organization and analysis of complex data, particularly data generated by various life sciences.

He likes to spend his spare time with his mischievous toddler, as well as playing the guitar and the bass in a few bands, painting miniatures, playing board games and running the occasional marathon.

Mgr. David Procházka | Ph.D. Student

David focuses on producing high-quality research at the intersection of data indexing, similarity search, machine learning, and motion processing while integrating good software engineering practices. He began his academic journey with an award-winning undergraduate thesis on indexing metric spaces using metric hulls and has since made several contributions to the field of human motion classification. A long and successful cooperation with assoc. prof. Vlastislav Dohnal culminated in David's enrollment in a Ph.D. program under his supervision.

Fueled by collaboration with bright minds, David strives to elevate those around him and develop scalable solutions for searching complex data. When not pushing the boundaries of current knowledge, he enjoys good movies and has a deep appreciation for simple yet elegant design.

Mgr. Miriama Jánošová | Ph.D. Student

Miriama's Ph.D. studies are concerned with advancing analytics for unstructured data, especially finding convenient representations for metric regions and their applicability. She is also involved in mocap data research for rehabilitation therapy. In recent years, she has been responsible for designing exercises for a student Data Warehousing project.

Apart from her studies, she works as a software engineer at Ataccama. She is responsible for developing parsers of SQL-like technologies and extracting data lineage from various BI reporting tools.

In her free time, she trains her dog Brutus, enjoys Pilates, and spends time with her close friends. She is also enthusiastic about preparing exotic dishes.

Mgr. Adrián Rošnec

Adrián is a young IT professional with broad experience in the technical aspects of research infrastructures: data storage, processing, AAI, cloud computing and more. He is also a Ph.D. student focused on the management of scientific data.

Ing. Katarína Grešová

Katarína is a Genomics and Deep Learning PhD student with a background in computer science and bioinformatics, specializing in using Machine Learning to model small RNA binding rules.

Jakub Čillík

Starting his career journey in computer science, Jakub thrives on leveraging his extroverted nature to uncover nuanced narratives hidden in data, remolding it into tangible and expedient solutions.

Bc. Lucie Novotná

Lucie is finishing her master's degree in Artificial intelligence and data processing, with her thesis focused on similarity searching of protein structures. She works as a software engineer at Red Hat.

Bc. Jakub Žovák

A data scientist specializing in vector databases, Jakub is writing a thesis on their role in similarity searching and related applications.

Interested in Our Research? Join Us!

We are looking for new colleagues for various positions to work with us on exciting projects, develop unique software and solve unconventional problems. If you are interested, contact us at dohnal(at)fi.muni.cz or antol(at)muni.cz.

 
