Genus Mining Integrator for NonStop™ SQL
Genus Fast Extractor and Aggregator Engine

 

Overview

Essential to any campaign management or data mining application is the creation of a restructured data set. Restructured data is a subset of data derived from the NonStop SQL database, in a format that is ready to use by a campaign management application such as DoubleClick Ensemble or a data mining application such as SAS Enterprise Miner. The high-performance Extraction and Aggregation Engine built by Genus creates this restructured data set. Campaign management is the process of sending marketing material to customers who are likely to respond favorably to the marketing effort. Data mining, the process of analyzing large data sets to find useful, previously undiscovered patterns, is used to analyze marketing campaign results and related data to find the patterns, or factors, that differentiate people who responded favorably to a campaign from those who did not. A campaign can be based on a combination of several factors, such as marital status, disposable income, hobbies and recent product purchases. After these factors are identified, they can be used to optimize future campaigns by sending materials only to those customers who are likely to respond favorably.

The process of data extraction and aggregation

Creating a restructured data subset from the large data sets in the NonStop SQL database involves three steps: configuration, extraction and aggregation.

  • Configuration. The first step is to configure the Extraction and Aggregation Engine. The user does this by providing the selection criteria for the restructured data set, such as the parameters of interest to be captured from the database and the number of extraction and aggregation processes to run. The user stores this information in SQL tables.
  • Extraction. The extraction processes then use the selection criteria to select and combine data from the NonStop SQL database.
  • Aggregation. The extracted data is filtered and routed to the aggregator process, which restructures the incoming data into a format suitable for querying by campaign management applications before storing it in the output tables.

The data in the output tables can then be readily used by campaign management applications as well as by data mining software for further data discovery.
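
The three steps above can be pictured with a small Python sketch. It is only an illustration of the flow, not the Genus implementation: the `CONFIG` dictionary stands in for the SQL configuration tables, `SOURCE_ROWS` stands in for the NonStop SQL database, and every name, column and value is hypothetical.

```python
from collections import defaultdict

# Hypothetical selection criteria; in the product these are stored in SQL tables.
CONFIG = {
    "columns": ["customer_id", "amount"],              # parameters of interest
    "predicate": lambda row: row["region"] == "west",  # which rows to extract
    "group_by": "customer_id",                         # key of the output table
}

# Hypothetical stand-in for rows held in the operational database.
SOURCE_ROWS = [
    {"customer_id": 1, "region": "west", "amount": 40},
    {"customer_id": 1, "region": "west", "amount": 60},
    {"customer_id": 2, "region": "east", "amount": 25},
    {"customer_id": 3, "region": "west", "amount": 10},
]

def extract(rows, config):
    """Extraction: select matching rows and keep only the configured columns."""
    for row in rows:
        if config["predicate"](row):
            yield {col: row[col] for col in config["columns"]}

def aggregate(extracted, config):
    """Aggregation: restructure extracted rows into one output row per key."""
    totals = defaultdict(lambda: {"purchases": 0, "total_amount": 0})
    for row in extracted:
        out = totals[row[config["group_by"]]]
        out["purchases"] += 1
        out["total_amount"] += row["amount"]
    return dict(totals)

# Configuration -> extraction -> aggregation, ending in a query-ready table.
output_table = aggregate(extract(SOURCE_ROWS, CONFIG), CONFIG)
```

Here `output_table` holds one summary row per customer, which is the kind of query-ready structure a campaign management or data mining tool would read instead of the raw operational rows.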

 

Benefits of using Genus Extractor/Aggregator Engine

The benefits of using the extraction and aggregation engine are easy customization, a substantial improvement in performance, and the ability to leverage existing data mining, campaign management and analytics products, as explained below:

  • Easy to configure
    The Extraction and Aggregation Engine adapts easily to user needs. The user stores in SQL tables all the configuration parameters that the user wants to control. At run time, the engine reads these configuration parameters as its input.
  • Extraction performance
    A single extraction process handles data for multiple extract criteria in a single, coordinated run. Because each run over the same data consumes system resources, reducing the number of runs reduces resource consumption accordingly. The advantage grows with the number of extraction and aggregation operations to be run on large data sets. For example, if five campaigns are to be run and each takes 4 hours to perform its extraction, the total run time for the five campaigns is 20 hours. With the high-performance Extraction and Aggregation Engine, a single extraction process can satisfy all five campaigns in a little over 4 hours. The ability to amortize the cost of an extraction over multiple campaigns is a key benefit of this solution.
  • Aggregation Performance
    Aggregation performance refers to the total time required to build the output tables containing the restructured data. Amortizing the time required to build the output tables over more than one aggregation criterion substantially improves performance. The engine delivers this improvement by simultaneously aggregating data supplied by multiple extractor processes. The current version of the aggregation engine supports the MIN, MAX, SUM, COUNT and DISTINCT COUNT aggregate operations.
  • Leverage use of data mining, campaign management and analytics products
    Existing data mining, campaign management and analytics applications issue queries against the output tables generated by the extraction and aggregation engine. The output tables provide the targeted data in a structure that is optimal for these applications, eliminating the need to navigate the data stored in the NonStop SQL/MP database, which is designed to hold operational data and is not efficient for querying.
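
The idea of serving several campaigns from one coordinated run can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the engine's actual design: the function `single_pass`, the row layout and the two campaign criteria are all hypothetical. It scans the input once, routes each row to every campaign whose criterion matches, and maintains the five aggregate operations named above.

```python
def single_pass(rows, criteria):
    """Scan rows once, updating the aggregates of every matching campaign."""
    state = {
        name: {"min": None, "max": None, "sum": 0, "count": 0, "seen": set()}
        for name in criteria
    }
    rows_read = 0
    for row in rows:
        rows_read += 1  # each source row is read exactly once
        for name, matches in criteria.items():
            if not matches(row):
                continue
            s, amount = state[name], row["amount"]
            s["min"] = amount if s["min"] is None else min(s["min"], amount)
            s["max"] = amount if s["max"] is None else max(s["max"], amount)
            s["sum"] += amount
            s["count"] += 1
            s["seen"].add(row["customer_id"])  # tracked for DISTINCT COUNT
    results = {
        name: {
            "MIN": s["min"],
            "MAX": s["max"],
            "SUM": s["sum"],
            "COUNT": s["count"],
            "DISTINCT COUNT": len(s["seen"]),
        }
        for name, s in state.items()
    }
    return results, rows_read

# Hypothetical purchase rows and campaign criteria.
ROWS = [
    {"customer_id": 1, "amount": 40, "married": True},
    {"customer_id": 1, "amount": 60, "married": True},
    {"customer_id": 2, "amount": 25, "married": False},
    {"customer_id": 3, "amount": 10, "married": True},
]
CRITERIA = {
    "married": lambda row: row["married"],
    "big_spenders": lambda row: row["amount"] >= 25,
}

results, rows_read = single_pass(ROWS, CRITERIA)
```

In this sketch `rows_read` equals the number of source rows regardless of how many campaigns are configured, whereas running one extraction per campaign would read the same rows once for each campaign.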

Ordering Information

