Intent2Workflows
Overview
This component translates user-defined intents into actionable workflows. The user starts the process by describing, at a high level, the analytical task to be executed. Intent2Workflows (I2W) extracts the key features from the description and maps them to the concepts in a knowledge base. From there, it follows the dependencies defined in the ontology to produce workflows that implement the task according to the specified intent. These workflows are initially encoded in RDF, which makes them straightforward to translate into other representations, such as the DSL required by the execution engine.
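As a rough illustration of why an RDF encoding is easy to translate, the sketch below represents a linear workflow as subject-predicate-object triples, serializes them as N-Triples, and recovers the ordered step list from the graph. The `http://example.org/i2w#` namespace and the `Workflow`, `Step`, `hasStep`, `runs` and `followedBy` terms are assumptions made for this example, not the ontology used by I2W.

```python
# Hypothetical namespace: the real ontology IRIs are not given in this document.
EX = "http://example.org/i2w#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

Triple = tuple[str, str, str]


def workflow_triples(workflow_id: str, steps: list[str]) -> list[Triple]:
    """Encode a linear workflow as (subject, predicate, object) IRI triples."""
    workflow = EX + workflow_id
    triples: list[Triple] = [(workflow, RDF_TYPE, EX + "Workflow")]
    previous = None
    for index, name in enumerate(steps):
        step = f"{workflow}_step{index}"
        triples.append((step, RDF_TYPE, EX + "Step"))
        triples.append((workflow, EX + "hasStep", step))
        triples.append((step, EX + "runs", EX + name))
        if previous is not None:
            # Link consecutive steps so the order can be recovered later.
            triples.append((previous, EX + "followedBy", step))
        previous = step
    return triples


def to_ntriples(triples: list[Triple]) -> str:
    """Serialize IRI-only triples in N-Triples syntax, one statement per line."""
    return "\n".join(f"<{s}> <{p}> <{o}> ." for s, p, o in triples)


def ordered_steps(triples: list[Triple]) -> list[str]:
    """Translate the graph back into an ordered list of operation names."""
    runs = {s: o for s, p, o in triples if p == EX + "runs"}
    successor = {s: o for s, p, o in triples if p == EX + "followedBy"}
    has_predecessor = set(successor.values())
    # The first step is the only one that no other step points to.
    current = next((s for s in runs if s not in has_predecessor), None)
    ordered = []
    while current is not None:
        ordered.append(runs[current].removeprefix(EX))
        current = successor.get(current)
    return ordered


triples = workflow_triples("wf1", ["DataLoading", "Normalization", "Training"])
print(to_ntriples(triples))
print(ordered_steps(triples))
```

A translator to another representation would walk the same triples in the way `ordered_steps` does and emit the target syntax instead of a Python list.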
Architectures and Features
The core of the project is in the backend/modules folder, which contains the two modules that implement the main functionality (belonging to WP3 and WP4). In addition to the backend logic, we provide a frontend for easy and intuitive interaction with the system. The frontend communicates with the backend through a main API (backend/api) that exposes the functionality of both modules.
The intentAnticipation module is in charge of anticipating, capturing and processing the user-defined intent. The user defines this intent by specifying the required parameters (type of task, dataset to use, etc.). Then, the system (i) maps the intent to the concepts defined in our knowledge base and (ii) provides the user with recommendations, extracted from past experiments, on how to refine the intent. That is, it suggests additional constraints that are worth defining in order to optimize the workflow (e.g. the best algorithm to use). Optionally, the intent can be expressed at an even higher level in natural language, which is processed by LLMs to extract the required elements.
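The recommendation step can be pictured with a minimal sketch. The `Intent` and `PastExperiment` classes, their fields, and the mean-score heuristic below are assumptions chosen for illustration; the module itself draws its recommendations from the knowledge base, and its actual data model is not described here.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass
class Intent:
    """Hypothetical intent: required parameters plus an optional constraint."""
    task: str
    dataset: str
    algorithm: Optional[str] = None  # left unset when the user has no preference


@dataclass(frozen=True)
class PastExperiment:
    """Hypothetical record of a previously executed workflow."""
    task: str
    algorithm: str
    score: float


def recommend_algorithm(intent: Intent, history: list[PastExperiment]) -> Optional[str]:
    """Return the algorithm with the best mean score on the same type of task."""
    scores = defaultdict(list)
    for experiment in history:
        if experiment.task == intent.task:
            scores[experiment.algorithm].append(experiment.score)
    if not scores:
        return None  # no past experiment matches this task
    return max(scores, key=lambda name: sum(scores[name]) / len(scores[name]))


def complete_intent(intent: Intent, history: list[PastExperiment]) -> Intent:
    """Fill in the algorithm constraint only if the user did not set one."""
    if intent.algorithm is not None:
        return intent
    return Intent(intent.task, intent.dataset, recommend_algorithm(intent, history))


history = [
    PastExperiment("Classification", "SVM", 0.8),
    PastExperiment("Classification", "SVM", 0.9),
    PastExperiment("Classification", "DecisionTree", 0.95),
    PastExperiment("Classification", "DecisionTree", 0.6),
    PastExperiment("Regression", "LinearRegression", 0.99),
]
print(complete_intent(Intent("Classification", "titanic.csv"), history))
```

In this sketch a constraint set explicitly by the user is never overridden; the recommendation only fills in what was left open.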
Once the intent has been captured, the IntentSpecification2WorkflowGenerator module generates the corresponding workflows in a series of steps:
- Data annotation: the data is annotated with the concepts represented in the knowledge base. This allows us to understand the characteristics of the data and to select the operations that are appropriate for working with it.
- Abstract plans: the first proposed workflows are very abstract instantiations of the pipeline to encode. They consist of high-level tasks that give a general idea of how the workflow will be defined. One abstract plan is created for each algorithm that can be employed for the task. The user has to select at least one of these plans.
- Logical plans: the abstract plans are mapped to specific workflows that contain all the tasks necessary to execute the intent. These plans explore all the potential variability points regarding the needs of the intent, pruning the paths that are deemed less relevant (so as not to overwhelm the user with too many indistinguishable alternatives). They are grouped by algorithm and by specific implementation of each algorithm (e.g. neural networks -> LSTM networks, convolutional networks, etc.). The user selects at least one of these plans.
- Workflow representation: once the definitive list of workflows has been selected, the workflows can be visualized, stored in the system for later use or exported in RDF format. Alternatively, we offer the possibility of directly converting the workflows to the DSL required by the experimentation engine.
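The progression from abstract to logical plans can be sketched as follows. Everything in this example is hypothetical: the catalogue of implementations, the relevance scores, the task names and the threshold-based pruning rule are stand-ins chosen for illustration, not the generator's actual data or logic.

```python
from dataclasses import dataclass

# Hypothetical catalogue: algorithm -> {implementation: relevance score}.
IMPLEMENTATIONS = {
    "NeuralNetwork": {"LSTM": 0.9, "Convolutional": 0.8, "Perceptron": 0.3},
    "DecisionTree": {"CART": 0.7},
}


@dataclass(frozen=True)
class AbstractPlan:
    algorithm: str
    tasks: tuple[str, ...]  # high-level tasks only


@dataclass(frozen=True)
class LogicalPlan:
    algorithm: str
    implementation: str
    steps: tuple[str, ...]  # concrete steps for one implementation


def abstract_plans(algorithms) -> list[AbstractPlan]:
    """Create one abstract plan per algorithm applicable to the task."""
    return [
        AbstractPlan(name, ("LoadData", "Preprocess", f"Train{name}"))
        for name in algorithms
    ]


def logical_plans(selected, catalogue, min_relevance=0.5) -> list[LogicalPlan]:
    """Expand the selected abstract plans, pruning low-relevance variants."""
    plans = []
    for plan in selected:
        for implementation, relevance in catalogue[plan.algorithm].items():
            if relevance < min_relevance:
                continue  # pruned: deemed less relevant for this intent
            steps = plan.tasks[:-1] + (f"Train{implementation}",)
            plans.append(LogicalPlan(plan.algorithm, implementation, steps))
    return plans


candidates = abstract_plans(IMPLEMENTATIONS)
chosen = [plan for plan in candidates if plan.algorithm == "NeuralNetwork"]
for plan in logical_plans(chosen, IMPLEMENTATIONS):
    print(plan.implementation, plan.steps)
```

Here the user's selection is simulated by filtering `candidates`; in the system that choice is made through the frontend at each stage.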