|
Searching for patterns with Interactive Data Mining
A unique characteristic of the Hunch Engine™ is that the
user does not need to have any knowledge of the techniques
and operations that are used within the specific
application. By selecting "interesting" candidates through
the HE interface, the user implicitly determines which
of the potentially many search filters, and which parameter
settings, are used.
In one example, we developed a tool to help a user
identify clusters in a large data set. For instance, one
may wish to identify customer segments based on extensive
consumer data. Currently, this process requires the user
to be familiar with data mining techniques such as
dimension reduction and clustering.
The figure below shows how HE can be used to discover
segments even with no knowledge of data mining. Starting
with a high-dimensional data set, the user is presented
with several candidates, each corresponding to a
particular 2-D representation of the data (left panel).
As the user selects candidates that seem to contain
some information, the Interactive Data Mining tool begins
to reveal some structure in the data (center
panel). Eventually (right panel), the user reaches
solutions that clearly show five distinct clusters in
the data.
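The selection loop described above can be sketched in a few
lines. This is only an illustration under stated
assumptions: the text does not say how HE generates its
candidates, so the sketch assumes each candidate is a random
linear 2-D projection and that new candidates are small
random perturbations of the ones the user picked. All
function names here are hypothetical.

```python
import random

def random_projection(dim, rng):
    """A candidate view: two weight vectors mapping a dim-D point to (x, y)."""
    return [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(2)]

def project(data, proj):
    """Apply a candidate projection to every point, giving 2-D coordinates."""
    return [tuple(sum(w * v for w, v in zip(row, point)) for row in proj)
            for point in data]

def mutate(proj, rng, scale=0.2):
    """Perturb a selected candidate slightly to produce a related view."""
    return [[w + rng.gauss(0, scale) for w in row] for row in proj]

def next_generation(candidates, selected, rng, scale=0.2):
    """Keep the user's picks and refill the population with their mutations.

    `selected` holds the indices of the candidates the user marked as
    interesting. If nothing was selected, start over with fresh random views.
    """
    parents = [candidates[i] for i in selected]
    if not parents:
        dim = len(candidates[0][0])
        return [random_projection(dim, rng) for _ in candidates]
    children = [mutate(rng.choice(parents), rng, scale)
                for _ in range(len(candidates) - len(parents))]
    return parents + children
```

In an interactive session, `project` would produce the
scatter plots shown to the user, and `selected` would come
from the user's clicks; the loop repeats until a view
separates the clusters to the user's satisfaction.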
This simple demo could easily be extended to use a
large variety of data mining algorithms, without requiring
the user to have any understanding of the underlying
techniques. In addition, even expert users benefit from
this tool, because it allows them to search a vast space
of algorithms, rather than being forced to choose a class
of algorithms at the outset -- a practice that often
limits the nature and extent of the search.
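One way to picture a search over algorithms as well as
parameters is to let each candidate carry both an algorithm
label and its settings. The sketch below is a hypothetical
illustration, not HE's actual design: the algorithm labels,
parameter names, and ranges in SEARCH_SPACE are assumptions
chosen only to show the idea.

```python
import random

# Hypothetical search space: the labels and parameter ranges are
# illustrative assumptions, not the Hunch Engine's real configuration.
SEARCH_SPACE = {
    "kmeans": {"n_clusters": (2, 10)},
    "density_clustering": {"min_points": (3, 20)},
    "hierarchical": {"n_clusters": (2, 10)},
}

def random_candidate(rng):
    """Draw an algorithm label and integer parameters within their ranges."""
    name = rng.choice(sorted(SEARCH_SPACE))
    params = {key: rng.randint(lo, hi)
              for key, (lo, hi) in SEARCH_SPACE[name].items()}
    return {"algorithm": name, "params": params}

def mutate_candidate(candidate, rng, switch_prob=0.2):
    """Resample from the whole space, or nudge one parameter by one step.

    With probability `switch_prob` the candidate is redrawn, which may
    change the algorithm; otherwise one parameter moves up or down by 1
    and is clamped to its allowed range.
    """
    if rng.random() < switch_prob:
        return random_candidate(rng)
    name = candidate["algorithm"]
    params = dict(candidate["params"])
    key = rng.choice(sorted(params))
    lo, hi = SEARCH_SPACE[name][key]
    params[key] = min(hi, max(lo, params[key] + rng.choice([-1, 1])))
    return {"algorithm": name, "params": params}
```

Because mutation can switch algorithms as well as adjust
settings, the user's selections steer the search across
algorithm classes without the user ever naming one.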
|