Exploratory Search Demonstration

In this demo, we show how queries can be dynamically defined and extended during the user's search session. With this paradigm, the user is supported in expressing fully exploratory queries, starting by constructing a network of connected resources, each corresponding to a clearly identified real-world concept (e.g., hotel, flight, hospital, doctor) and correlated by predefined semantic links (e.g., hotels are close to restaurants, doctors cure diseases and are located at hospitals). In this setting, the system must be able to support query expansion and result tracking, giving the user the possibility of moving “forward” (adding one node) and “backward” (deleting one node) along the exploration history, and of dynamically selecting and deselecting the object instances of interest.
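The forward and backward moves described above can be sketched as operations on a small session object. This is only an illustrative sketch: the class `ExplorationSession` and its method names are hypothetical and are not taken from the Liquid Query implementation.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class ExplorationSession:
    """Hypothetical model of one exploratory search session."""

    # Exploration history: resource nodes in the order they were added.
    nodes: list[str] = field(default_factory=list)
    # Object instances of interest, stored as (node, instance id) pairs.
    selected: set[tuple[str, str]] = field(default_factory=set)

    def forward(self, node: str) -> None:
        """Move forward: expand the query with one connected resource."""
        self.nodes.append(node)

    def backward(self) -> str:
        """Move backward: delete the latest node and its selections."""
        removed = self.nodes.pop()
        self.selected = {(n, i) for (n, i) in self.selected if n != removed}
        return removed

    def toggle(self, node: str, instance_id: str) -> None:
        """Select an instance if it is unselected, deselect it otherwise."""
        key = (node, instance_id)
        if key in self.selected:
            self.selected.remove(key)
        else:
            self.selected.add(key)


session = ExplorationSession()
session.forward("Hotel")
session.forward("Restaurant")
session.toggle("Restaurant", "r42")
session.backward()  # drops "Restaurant" and the selection made on it
print(session.nodes, session.selected)
```

Dropping the selections of a removed node is a design assumption of this sketch; a real system might instead keep them so that a later forward move can restore the previous state.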

We demonstrate how interleaving appropriate result visualization with exploration steps can help users in their information-seeking tasks. In particular, we illustrate a set of result presentation options, including tabular representations, which present combinations as a whole, and atom views, which visualize instances of different objects in separate lists while also displaying the combinations they belong to and their global rank. The atom view allows users to focus at each step on the new results and is therefore most suitable for progressive exploration; moreover, the atom view paradigm is simpler and can therefore be exported to a mobile scenario. The supported search tasks go beyond the typical one-shot, memoryless interaction with a search engine and span several steps, with the possibility of suspending and resuming the work (e.g., performing the search process on different days). In our demonstration we highlight the exploration and interaction issues, together with the architectural and query execution challenges they pose to an engine that needs to actually perform the queries.

The Search Computing Architecture

The Liquid Query demo sits on top of an architecture that covers all the phases necessary for formulating and processing multi-domain search queries. Search Computing (SeCo) queries can be addressed to a constellation of Web data sources, including search engine APIs, product, event and people databases (e.g., Amazon, Eventful, LinkedIn), scientific data sources (e.g., DBLP, PubMed), and community-curated data sources (e.g., YQL Open Data Tables, DBpedia). These data sources are registered in the system using the Service Mart Repository, which contains a multi-level description of the callable search services. At the most conceptual level, sources are registered as service marts and characterized by the service name and a collection of attributes (single or multi-valued) exposed by the service.

This abstract description is refined into one or more access patterns, i.e., logical signatures that specify whether each attribute is an input or an output in the service call; output attributes are tagged as ranked if the service produces results ordered on the value of that attribute. Access patterns can be joined when parameters of one service mart match, in both type and meaning, the parameters of another mart, and these parameters are either both tagged as output (yielding a parallel join) or one is tagged as output and the other as input (yielding a pipe join). Access patterns are in turn refined into service interfaces, each including the name and the endpoint of a concrete search service. The actual service invocation is managed by the Execution Engine, which supervises the interaction with several wrappers used to access Web APIs and databases.
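The rule for joining access patterns can be illustrated with a minimal sketch. The `Attribute` class, its `concept` field (standing in for the "meaning" of a parameter), and the function `join_kind` are hypothetical names introduced here for illustration; they do not reproduce the Service Mart Repository data model.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Attribute:
    """One attribute of an access pattern (hypothetical representation)."""

    name: str
    type: str        # data type, e.g. "string"
    concept: str     # assumed label for the attribute's meaning
    direction: str   # "input" or "output"
    ranked: bool = False  # True if results are ordered on this output


def join_kind(left: Attribute, right: Attribute) -> Optional[str]:
    """Classify the join that two attributes allow, if any."""
    # The attributes must match both for type and for meaning.
    if (left.type, left.concept) != (right.type, right.concept):
        return None
    directions = {left.direction, right.direction}
    if directions == {"output"}:
        return "parallel"  # both services are called independently
    if directions == {"input", "output"}:
        return "pipe"      # the output of one service feeds the other
    return None            # two inputs cannot be joined


hotel_city = Attribute("city", "string", "City", "output")
restaurant_city_in = Attribute("location", "string", "City", "input")
restaurant_city_out = Attribute("town", "string", "City", "output")

print(join_kind(hotel_city, restaurant_city_in))   # pipe
print(join_kind(hotel_city, restaurant_city_out))  # parallel
```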

Queries enter the system through the Liquid Query user interface, a client-side component that allows users either to run predefined queries, by filling in runtime parameters through a form, or to perform exploratory search. In both cases, queries are submitted to the Query Orchestrator, a server-side component that manages queries, result caching, and user sessions. A new query consists of a conjunction of predicates over data sources, join predicates that express the connections between the sources, and a global rank criterion for sorting the result set.
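As an illustration of the three ingredients of a query, the following sketch represents them as plain data. All names (`Selection`, `Join`, `Query`) are hypothetical, and the weighted-sum rank function is only one assumed example of a global rank criterion.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Selection:
    """Predicate over one data source, e.g. Hotel.city = 'Milan'."""

    mart: str
    attribute: str
    op: str
    value: object


@dataclass(frozen=True)
class Join:
    """Join predicate connecting attributes of two sources."""

    left: str   # written as "Mart.attribute"
    right: str


@dataclass
class Query:
    selections: list[Selection]          # conjunction of predicates
    joins: list[Join]                    # connections between sources
    rank: Callable[[dict], float]        # global rank criterion


query = Query(
    selections=[Selection("Hotel", "city", "=", "Milan")],
    joins=[Join("Hotel.city", "Restaurant.city")],
    # Assumed criterion: weighted sum of the two local scores.
    rank=lambda c: 0.6 * c["hotel_score"] + 0.4 * c["restaurant_score"],
)

combinations = [
    {"hotel": "A", "restaurant": "X", "hotel_score": 0.5, "restaurant_score": 0.9},
    {"hotel": "B", "restaurant": "Y", "hotel_score": 0.9, "restaurant_score": 0.8},
]
# Sort the result set by decreasing global rank.
ordered = sorted(combinations, key=query.rank, reverse=True)
print([c["hotel"] for c in ordered])
```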

Each query undergoes an analysis and translation process, managed by the Query Analyzer, which leads to the Query Execution Plan (QEP), a graph of low-level components that specifies the activities to be executed (e.g., the service calls), their order of precedence, and the strategy for executing joins. The plan is output by the Query Optimizer, which chooses the join implementation (e.g., parallel vs. pipe join) and sets the parameters of the join execution strategy (e.g., the number of times a service is called to retrieve the top-k results of the query).

A QEP is executed by the Execution Engine, which analyzes it and recursively breaks it into subcomponents; these are in turn either QEPs or atomic service invocations. The engine accumulates the results of the service calls and progressively builds the combinations that constitute the query response. The combinations are then sent back to the Liquid Query interface for visualization and interaction.
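The recursive decomposition of a plan can be sketched as follows. This is a simplified illustration under several assumptions: the plan is a binary tree rather than a general graph, services are stubbed by in-memory functions, and joins are computed with a naive nested loop. The names `ServiceCall`, `JoinPlan` and `execute` are hypothetical and do not correspond to the actual engine components.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Callable, Union


@dataclass
class ServiceCall:
    """Atomic plan node: a single service invocation (stubbed here)."""

    fetch: Callable[[], list[dict]]


@dataclass
class JoinPlan:
    """Composite plan node: joins the results of two sub-plans."""

    left: "Plan"
    right: "Plan"
    on: str  # name of the attribute shared by both sides


Plan = Union[ServiceCall, JoinPlan]


def execute(plan: Plan) -> list[dict]:
    """Recursively execute a plan and return the resulting combinations."""
    if isinstance(plan, ServiceCall):
        return plan.fetch()
    left_rows = execute(plan.left)
    right_rows = execute(plan.right)
    # Naive nested-loop join on the shared attribute.
    return [
        {**l, **r}
        for l in left_rows
        for r in right_rows
        if l[plan.on] == r[plan.on]
    ]


hotels = ServiceCall(lambda: [
    {"hotel": "Astoria", "city": "Milan"},
    {"hotel": "Bellevue", "city": "Rome"},
])
restaurants = ServiceCall(lambda: [
    {"restaurant": "Da Mario", "city": "Milan"},
    {"restaurant": "Il Forno", "city": "Milan"},
])

result = execute(JoinPlan(hotels, restaurants, on="city"))
for combination in result:
    print(combination)
```

A real engine would additionally produce combinations progressively and respect the ranking of each service; the sketch only shows the recursive structure.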

The Liquid Query Demo

The demo implements a multi-domain search scenario over a set of heterogeneous data sources, and focuses on interaction primitives and visualization techniques for supporting complex search tasks. It illustrates the exploratory search interaction paradigm: starting from an initial state with no predefined query, the user can construct the query progressively, step by step, by exploring the available services and their connections, as supported by the Service Mart Repository.

The Liquid Query demonstration presents the simplest result set visualization first, i.e., the Tabular View. In the Tabular View, combinations are presented as rows, sorted with respect to the global rank. Service mart attributes are presented as columns, grouped by domain. Derived attributes (e.g., calculated distances, total prices) are also shown as columns. The attendees will be able to peruse the result set using commands for view control (hiding and showing columns, grouping on values, filtering, ordering). Some commands allow refining the query by interacting with the Orchestrator: e.g., users can ask for more results from one or more services or change the rank criterion. A special command is query expansion, which enables a controlled form of exploration: the attendee selects one or more combinations of interest and asks for new information on some of the included objects (e.g., having chosen a Concert, the attendee asks for the recent News associated with it).

After the attendees have become familiar with the Tabular View, the demo will explore other data visualizations, including the so-called Atom View. This view is useful for highlighting the local population and ranking of individual service marts, which are less visible in the Tabular View. It shows only each object’s name (or any suitable identifier), while more properties can be requested separately. Users can select combinations (in which case all objects forming the currently selected combination are highlighted) or objects within an Atom View (in which case all the combinations the object belongs to, and their associated objects, are highlighted).
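The two selection behaviors of the Atom View can be sketched over a small set of combinations. The data layout (one dictionary per combination, mapping a service mart to an object identifier) and the function names are assumptions made for this illustration only.

```python
from __future__ import annotations

# Combinations listed in global rank order; identifiers are made up.
combinations = [
    {"Concert": "c1", "Hotel": "h1"},
    {"Concert": "c1", "Hotel": "h2"},
    {"Concert": "c2", "Hotel": "h1"},
]


def atom_lists(combos: list[dict]) -> dict[str, list[str]]:
    """Build one list of distinct objects per service mart."""
    lists: dict[str, list[str]] = {}
    for combo in combos:
        for mart, obj in combo.items():
            bucket = lists.setdefault(mart, [])
            if obj not in bucket:
                bucket.append(obj)
    return lists


def highlight_for_object(combos: list[dict], mart: str, obj: str):
    """Return the combinations containing an object and their other objects."""
    hits = [i for i, c in enumerate(combos) if c.get(mart) == obj]
    related = {
        (m, o)
        for i in hits
        for m, o in combos[i].items()
        if (m, o) != (mart, obj)
    }
    return hits, related


print(atom_lists(combinations))
print(highlight_for_object(combinations, "Hotel", "h1"))
```

Here the objects in each list appear in order of first occurrence in the combinations; ordering them by the local rank of each service mart would require the per-service scores, which this sketch does not model.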

The types of the result data can be used to provide type-dependent visualizations of objects and their relationships, which will also be demonstrated. One of them exploits the geographic coordinates of the involved objects by placing them on a map: each object is shown as an icon, and the local ranking (e.g., the price of the Hotel or the rating of the Restaurant) is represented by the size of the icon. A combination is then represented by a set of different icons, which are highlighted when the combination is selected.
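One simple way to map a local ranking value to an icon size is a linear scale, sketched below. The function name `icon_size`, the pixel bounds, and the choice of a linear mapping are all assumptions for illustration; the demo may use a different encoding.

```python
def icon_size(score: float, lo: float, hi: float,
              min_px: float = 16.0, max_px: float = 48.0) -> float:
    """Linearly map a local ranking score in [lo, hi] to a size in pixels."""
    if hi == lo:
        # All objects share the same score: use the middle size.
        return (min_px + max_px) / 2
    t = (score - lo) / (hi - lo)
    t = max(0.0, min(1.0, t))  # clamp scores outside the range
    return min_px + t * (max_px - min_px)


# Restaurant ratings on an assumed 1-to-5 scale.
for rating in (1, 3, 5):
    print(rating, icon_size(rating, lo=1, hi=5))
```

For attributes where a lower value is better (such as price), the score would need to be inverted before applying the scale.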