Continuous Querying in Database-Centric Web Applications

Editor: xiangquyd 2003-05-12

Abstract

Web applications are becoming increasingly database-centric. Unfortunately, the support provided by most web-sites to explore such databases is rather primitive and is based on the traditional database metaphor of submitting an SQL query and packaging the response as an HTML page. Very often, the result set is empty or contains too many records. It is up to the user to refine the query by guessing how the query constraints must be tightened or relaxed and then go through another submit/response cycle. Furthermore, once results are displayed, typically no further exploration capabilities are offered.

Web applications that require interactive exploration of databases (e.g. e-commerce) need the above submit/response metaphor to be replaced with a continuous querying metaphor that seamlessly integrates querying with result browsing. In addition to queries based on predicates on attribute values, queries based on example records should also be supported. We present techniques for supporting this metaphor and discuss their implementation in a web-based database exploration engine.

1. Introduction

Web applications are becoming increasingly database-centric. In a 1997 Forrester survey [5], respondent companies indicated that nearly 40% of the content at their web sites originated from databases. This fraction was expected to rise as high as 65% by 1998 and to keep increasing. Many new web applications require that a user be able to interactively explore these databases over the internet or an internal network.

A common example of such interactive exploration is the task of finding products or services matching a user's requirements. While this is a widely performed task [9], the support provided by current web-sites for implementing this functionality is rather primitive. Typically, a server-side database is relied upon for all query processing. The user is presented with a form for specifying the desired product in terms of bounds on the values of the product attributes (e.g. a 3.3V zero delay clock buffer in 16-pin 150-mil SOIC or TSSOP package with output skew less than 250ps and device skew less than 750ps having operating range of 25-100MHz). On submission, this information is used to construct an SQL query that is in turn submitted to a server-side database. The result is returned to the browser formatted as an HTML page. Very often, the result set is empty or contains too many records. It is up to the user to refine the query by guessing how the query constraints must be tightened or relaxed and then go through another submit/response cycle. Furthermore, once results are displayed, typically no further exploration capabilities are offered. As a result, the user needs knowledge of not only the domain of interest but also the particular data set. Further aggravating this problem, the round-trip time between the browser and the database server for each submit/response cycle is often frustratingly large.
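The submit/response cycle described above can be illustrated with a minimal sketch. It is not taken from any of the sites discussed; the table `buffers`, its columns, and the helper `build_query` are hypothetical names chosen for illustration, and SQLite stands in for the server-side database. Attribute names are assumed to come from a trusted, fixed form definition, since they are interpolated into the SQL text.

```python
import sqlite3

def build_query(table, bounds):
    """Build a parameterized SELECT from {attribute: (low, high)} bounds.

    A bound of None means that side of the range is unconstrained.
    Table and attribute names are assumed to be trusted (hypothetical form schema).
    """
    clauses, params = [], []
    for attr, (low, high) in bounds.items():
        if low is not None:
            clauses.append(f'"{attr}" >= ?')
            params.append(low)
        if high is not None:
            clauses.append(f'"{attr}" <= ?')
            params.append(high)
    where = " AND ".join(clauses) if clauses else "1=1"
    return f'SELECT * FROM "{table}" WHERE {where}', params

# Hypothetical product table standing in for the server-side database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE buffers (part TEXT, pins INTEGER, output_skew_ps INTEGER)")
conn.executemany(
    "INSERT INTO buffers VALUES (?, ?, ?)",
    [("A1", 16, 200), ("B2", 16, 300), ("C3", 8, 150)],
)

# One submit/response cycle: form values in, result rows out.
sql, params = build_query("buffers", {"pins": (16, 16), "output_skew_ps": (None, 249)})
rows = conn.execute(sql, params).fetchall()

# A slightly tighter guess returns nothing, forcing another cycle.
sql2, params2 = build_query("buffers", {"pins": (16, 16), "output_skew_ps": (None, 100)})
empty = conn.execute(sql2, params2).fetchall()
```

The second query shows the failure mode the paragraph describes: the form gives no hint that tightening the skew bound would eliminate every record, so the user learns this only after a full round trip.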

The problem is that database query technology is targeted at reporting rather than user exploration. In traditional database applications, queries are rigid in that they are intended for asking very specific questions. The query results are interesting regardless of whether they contain zero records or ten thousand. In a sense, an individual query itself is the goal. In user exploration, the goal is not simply an individual query or its results, but rather locating particular records of interest. Rarely can this be achieved with a single query. As a result, users typically issue many related queries before they are finally satisfied.

This "submit a query and wait for a response" metaphor needs to be replaced by a new continuous querying metaphor. The user should be able to combine searching with result browsing, seeing the current query and the qualifying records simultaneously in a single view. As the user changes the query constraints, the user should immediately see the impact on the qualifying records in that view.
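As a rough illustration of the continuous querying metaphor, the sketch below keeps the records and the current range constraints together and recomputes the qualifying set on every constraint change. The class `ContinuousQuery` and its methods are hypothetical names, and the linear scan is an assumption made for brevity; it is not the Eureka implementation, which relies on tuned data structures.

```python
class ContinuousQuery:
    """Holds records and range constraints; every change yields the new result set."""

    def __init__(self, records):
        self.records = list(records)   # each record is a dict of attribute -> value
        self.ranges = {}               # attribute -> (low, high), both inclusive

    def set_range(self, attr, low, high):
        """Tighten or relax one constraint and return the qualifying records."""
        self.ranges[attr] = (low, high)
        return self.matches()

    def clear(self, attr):
        """Drop the constraint on one attribute and return the qualifying records."""
        self.ranges.pop(attr, None)
        return self.matches()

    def matches(self):
        # Naive full scan; a real engine would use indexes to stay interactive.
        return [
            r for r in self.records
            if all(lo <= r[a] <= hi for a, (lo, hi) in self.ranges.items())
        ]

# Hypothetical data set: each GUI control change maps to one call.
cars = [
    {"name": "a", "price": 10, "mpg": 30},
    {"name": "b", "price": 20, "mpg": 25},
    {"name": "c", "price": 30, "mpg": 40},
]
query = ContinuousQuery(cars)
after_price = [r["name"] for r in query.set_range("price", 0, 25)]
after_mpg = [r["name"] for r in query.set_range("mpg", 28, 50)]
after_clear = [r["name"] for r in query.clear("price")]
```

Because every call returns the updated result set, the view can be refreshed after each slider movement without a separate submit step.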

We present techniques for supporting this continuous querying metaphor. These techniques have been implemented in a database exploration engine we call Eureka. In Section 2, we present the user interface that facilitates database exploration using the continuous querying metaphor. For this metaphor to be successful, it is imperative that as soon as a user manipulates a GUI control, the user sees its effect instantaneously, which in turn requires well-tuned data structures. In Section 3, we present the design and implementation of the Eureka engine. We conclude in Section 4 with a summary and some possible directions for future work.

1.1 Related Work

A large number of e-commerce sites provide parametric search capabilities in which users search for desired products by providing bounds on attribute values. Stock screens at investment sites such as Charles Schwab (http://www.schwab.com/), travel package selection at travel sites such as Travelocity (http://www.travelocity.com/), and electronic component search at semiconductor sites such as Cypress (http://www.cypress.com/) are examples of this type of search. Some e-commerce tools (e.g. Net.Commerce [7]) provide support for implementing such searches. As stated earlier, these sites typically rely entirely on a server-side database for query processing. They are thus limited to the submit/response metaphor and suffer from the problems of long response times and too many or too few answers.

Some newer sites provide a subset of the interactive exploration capability described in this paper. For example, Microsoft's Carpoint (carpoint.msn.com), Cars.com (http://www.cars.com/) and Wireless Dimension (http://www.wirelessdimension.com/) combine browsing with querying: as the query is changed, the user immediately sees the effect on the results. These sites currently do not support querying based on example products. The details of their implementations are not available in the published literature. It is doubtful that Wireless Dimension's JavaScript implementation (which uses HTML for its output) is designed to scale to large product sets, and only Carpoint (using native ActiveX code) allows users to explore data sets with more than a few hundred products.

An interesting approach to the problem of too many or too few answers was taken by 64K Inc. [1]. Although still an HTML forms-based approach, the query pages generated by the 64K engine contain histogram information, showing how records are distributed over each attribute's range of values, as well as a count of the total number of records. This information is meant to provide hints to the user for modifying the query before resubmitting it. If a query results in too many records, rather than showing them to the user, the engine redisplays the query page updated with new histogram and count information. In the case of too few answers, the engine uses domain-specific distance metrics to relax the query and return nearby records. However, the 64K search metaphor is still the standard submit/response metaphor (although perhaps with fewer cycles and richer features).
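The two hints described above, per-attribute histograms and relaxation to nearby records, can be sketched as follows. The details of the 64K engine are not given here, so this is only an assumed illustration: the functions `histogram` and `relax` are hypothetical, equal-width buckets are an assumption, and a plain numeric gap to the requested range stands in for the domain-specific distance metrics.

```python
from collections import Counter

def histogram(values, low, high, buckets):
    """Count how many values fall in each of `buckets` equal-width bins over [low, high]."""
    width = (high - low) / buckets
    counts = Counter()
    for v in values:
        if low <= v <= high:
            # Clamp so that v == high lands in the last bucket.
            idx = min(int((v - low) / width), buckets - 1)
            counts[idx] += 1
    return [counts[i] for i in range(buckets)]

def relax(records, attr, low, high, k):
    """Return the k records closest to the range [low, high] on one attribute.

    Distance is the numeric gap to the range (0 for records inside it);
    this is an assumed stand-in for a domain-specific metric.
    """
    def gap(record):
        v = record[attr]
        return max(low - v, 0, v - high)
    return sorted(records, key=gap)[:k]

# Hypothetical data: a histogram hint for the query page.
bins = histogram([1, 2, 5, 9, 10], low=0, high=10, buckets=2)

# A query for prices in [22, 24] matches nothing, so return the nearest records.
products = [{"price": 10}, {"price": 20}, {"price": 30}]
nearby = [p["price"] for p in relax(products, "price", 22, 24, k=2)]
```

Both functions still fit the submit/response cycle: they change what the returned page shows, not when the user sees it.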

Another approach to handling too many answers is represented by the FOCUS application described in [10]. In this technique, the results of a query are cached and displayed in a compressed table. Further restrictions to reduce the result set are applied on the cached results. However, as the authors acknowledge, FOCUS is mainly suited for tables with up to a few hundred records and attributes. In the case of Spotfire Pro (www.spot
