Knowledge and Support - Adaptive Insights

Setting Up a Scripted Data Source

Scripted data sources let you solve complex, custom data extraction and transformation scenarios. You can write Kettle ETL scripts to specify information sources to extract data from and then apply customer-specific business rules to the extracted data. (Before you set up a scripted data source, you must have Adaptive Data Agent installed on an on-premises Windows server in the network. For information on setting up Adaptive Data Agent, see Using Adaptive Data Agents.)

 

Scripted data sources are for advanced ETL implementers who have experience writing Pentaho Kettle scripts and using the Spoon UI. For more information about Pentaho Kettle and Spoon, including extensive documentation and tutorials, please see the Pentaho wiki at http://wiki.pentaho.com.

To set up a scripted data source, follow these steps:

  1. Access the Data Designer by going to Integration > Data Designer.

  2. In the Data Source folder in the Component Library on the right side of the screen, click Create New Data Source.
    The Create New dialog box appears.

  3. Select Scripted Data Source as the data source type and enter a name for the data source.

  4. Click Create.
    The center area of the screen displays your new data source’s settings and other information, as shown below.

  5. Enter the data source’s information:

  • Associated Agent: Select the agent from the drop-down list.
  • Entry Point: Select an entry point from the drop-down list. The entry point is an ETL script on the system. (You can edit or change this later.)
  • Log Level: Select a log level from the drop-down list to specify the level of detail in the logging for this data source. Your options are:
    • Error: Logs only serious errors.
    • Info: Logs all basic information, such as when the data source was updated.
    • Verbose: Provides very detailed information about all phases and actions. (This level is used primarily for debugging or auditing, as it may produce more log information than is practical for typical use.)
  6. Click Save on the Actions menu.

Starting Spoon

After you have saved the basic information for the scripted data source, you can start Spoon, the ETL script editor, work with the script, and add tables to the scripted data source.

To view and edit the script, follow these steps:

  1. From the Windows Start Menu, click All Programs.

  2. Open the Adaptive Data Agent folder.

  3. Click Adaptive Script Editor.
    After a Pentaho splash screen and a startup tip, the Spoon screen appears, as shown below.


  4. Enter your Adaptive Integration login email and password and click Login Now/Refresh Tenant List.
    Spoon refreshes the tenant list.

  5. Select the tenant from the drop-down list. (There is usually only one tenant.)

  6. Select the agent you are using for this data source from the drop-down list.
    Spoon refreshes the agent list.

  7. Select the scripted data source you just created from the drop-down list in Edit Script.

Now that you’re logged in and have told Spoon which agent and data source you are working on, you can start editing the script.

  1. Click Data Integration in the upper-right corner of the Spoon screen.
    The Integration perspective of Spoon appears, as shown below.

  2. Select File > Open and browse to the script you want to view and edit, then click OK.
    The image below shows the Scripted Data Source example. (The sample scripts included with Adaptive Integration give information about script structure and database connections that may be useful to you as you develop your own scripts.)

!spoon_scripted_example.jpg
  3. Edit the script as appropriate, then save it.

  4. Click Adaptive in the upper right to return to the Adaptive Perspective.
    The Spoon screen (shown earlier) appears again.

  5. Click Publish Agent Changes. This pushes the script back up to the Adaptive server, where it is stored in the script repository.

The following data import modes are supported. (The settings can vary by table.)

  • All records are replaced each time the data is imported.
  • Remote Agent scans all source records and transmits the changed records each time data must be synchronized.
  • All records that fall within a period range are transmitted.

Adding Tables and Columns to a Scripted Data Source

With the script in place, you can add tables and columns to your data source in Adaptive Integration.

The columns defined in the data source must exactly match the columns defined in the ETL script, or the data load will fail.

To add tables to a scripted data source, follow these steps:

  1. Expand the Custom Table entry in the Data Components menu.

  2. Drag and drop the Scripted Tables item to the Tables to Import area.
    When you drop this item, the Scripted Tables dialog box appears, as shown below.

    !scripted_table_settings_xantus.png
  3. Enter the table settings:

  • Name: Enter the name for the table.
  • Data Import Mode: Select one of the import modes:
    • All records replaced each time a data import is run: This is the default. Whenever you import data, all records in the staging area are replaced.
    • Remote cache used to identify and send changed records on import: Adaptive Integration looks at the staging area and compares it with the records in the spreadsheet or database and imports only the changed records.
    • All records that fall within a period range transmitted: When you select this mode, you are prompted to enter a period range and a column Adaptive Integration applies the range to. Any records that meet this selection criterion are imported.
  • Data Import Filter: Use this to enter a SQL expression that filters or formats information being imported. The SQL filter is applied before data is extracted from the database to reduce the amount of data that needs to be imported, giving you an additional level of control for which data is brought into the staging area. Click the field to display the Edit SQL Filter dialog box, shown earlier. (The filter you enter here is applied to the actual data in the database rather than the preview data in the staging area.)
  4. Click Apply to apply the settings.
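As an illustration, a Data Import Filter for a general-ledger table might keep only posted transactions for a given fiscal year. The table and column names below (POST_STATUS, FISCAL_YEAR) are hypothetical, and the exact expression form the filter expects may vary with your source database:

```sql
-- Hypothetical WHERE-style filter: import only posted
-- general-ledger rows for fiscal year 2016.
POST_STATUS = 'POSTED' AND FISCAL_YEAR = 2016
```

Because this filter runs against the source database before extraction, it reduces the volume of data transmitted to the staging area.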

To add columns to a scripted data source, follow these steps:

  1. Expand the Custom Column entry in the Data Components menu.

  2. Drag and drop one of the column types to the Tables to Import area.
    When you drop this item, the Custom Columns dialog box appears.

  3. Enter the column settings:

  • Name: Enter the name for the column. (Column names must exactly match the column names specified in the script in Spoon, including capitalization and embedded spaces.)
  • Column Type: A display-only field that identifies the column type.
  • Convert Column Type: You cannot change this value except for SQL columns. For SQL columns, you can select any of half a dozen conversion options.
  4. Click Apply to apply the settings.

Managing Columns

If a table you are working with has a large number of columns, you can use the Manage Columns dialog box to quickly choose which columns to import. The dialog box displays the columns in a table so you can check or uncheck columns for importing.

To manage columns for importing, follow these steps:

  1. Click the arrow on the tab to the right of the table name.
    A floating menu with several options appears.

  2. Click Manage Columns on the floating menu.
    The Manage Columns dialog box appears.

    BatchEdit-ManageColumns00063.png
  3. Check the box next to a column to import it. When you check a column, the column also appears in the preview window after the popup contents are saved. Unchecking a column removes the column from the preview window.

    You cannot change the import status of system-generated fields such as isDeleted. You can also change the import status of individual columns by selecting column properties and setting the property, or by dragging and dropping the columns onto the designer.

Customizing the Tables

For each table, you can customize the way in which data is imported.

To customize the table settings, follow these steps:

  1. Click the arrow on the tab to the right of the table name.
    A floating menu with several options appears.

  2. Click Table Settings on the floating menu.
    The Table Settings dialog box appears, as shown earlier.

  3. Click Apply to apply the settings.

  4. Repeat steps 1-3 for each table.

  5. Click Save to save the settings in the data source.

Setting Column Options

While using the Tables to Import section, you can see a preview of the data as it is in the data source or as it is in the Staging area after it has been imported. You can change the options for each column by mousing over its heading and then clicking the drop-down menu next to its name. These options are:

  • Sort Ascending: Sort the entire table based on this column in ascending order.
  • Sort Descending: Sort the entire table based on this column in descending order.
  • Column Settings: Opens a dialog box for changing the properties of the column within the staging table:
    • Name: The name of the column, displayed in the header.
    • Column Type: The type of data used in the column.
    • Convert Column Type: Select the new type from this drop-down menu.
    • Exclude column from import: Select this to remove this column from the data being imported.
    • Delete custom column: Select this to delete the column. (Available for custom columns only.)

Downloading Data from a Staging Table

You can download data from a single staging table.

To download data from a single staging table, follow these steps:

  1. Click the down arrow in a table heading in the Tables to Import section.

  2. Select Download Data.

  3. At the prompt “This operation will send an email with a url to download the data,” click Send. Adaptive Integration sends an email containing a URL to the email account associated with the data designer.

  4. Open the email and click the URL. The data is downloaded in .csv format.

Using the Advanced Filter

Below the Source drop-down menu, you can click Advanced Filter to reveal options for filtering the data in the Tables to Import preview area. The Advanced Filter provides tools to view and browse a subset of the data within a staging table or from the script. (When you select Script from the Source drop-down, the preview request is sent to the agent for processing.) By changing filter properties, you can change the view of the data shown within the preview window without changing the data in the staging area.

The preview area is specifically for previewing a small portion of the data in the staging area. You can explore the data that is available from the spreadsheet and verify that you are getting the data you expect by examining the staging area. (Filters set in the preview function do not affect the actual data in the staging area.)

The options for the Advanced Filter are:

  • Distinct Rows: Check this box to hide duplicate rows.
  • Max Rows: Limit the number of rows displayed in the preview area by entering a number in the text box.
  • Columns: Check or uncheck columns in this drop-down menu to show or hide the columns displayed.

Removing a column from this list does not remove the column from the import list.

  • SQL Filter: Click the large text box to display the Edit SQL Filter dialog box (shown earlier), where you can enter a SQL filter to restrict the rows being previewed. Click Apply to automatically check the SQL syntax of the filter. (If there are any errors, the error is indicated in the expression.) For detailed SQL syntax help, you can click online help in the Notes section.
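For example, to preview only recent, high-value rows in the staging table, you might enter a filter like the one below. The column names (AMOUNT, TXN_DATE) are hypothetical; adjust them to match your own staging table:

```sql
-- Hypothetical preview filter: restrict the preview to
-- large transactions from 2016 onward.
AMOUNT > 10000 AND TXN_DATE >= '2016-01-01'
```

Remember that this filter affects only the preview; the data in the staging area is unchanged.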

Click Remove Filter (next to the Source drop-down menu) to clear all advanced filter settings.

As part of the table customization process, you can also create join tables and custom columns using SQL expressions.

Adding and Editing Parameters

You can use parameters to pass information to the underlying Kettle script. These parameters can be set with a default value. The script can then use the parameters to control the transformations and filtering in the ETL scripts.
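In Pentaho Kettle, named parameters are typically referenced with ${...} variable syntax in step properties, for example in a Table Input step's SQL (with variable substitution enabled for the step). As a sketch, a period-range parameter might be consumed like this; the parameter names START_PERIOD and END_PERIOD and the table are hypothetical:

```sql
-- Hypothetical Table Input query that reads the period-range
-- parameters supplied from the Parameters Editor.
SELECT account_code, period, amount
FROM gl_balances
WHERE period BETWEEN '${START_PERIOD}' AND '${END_PERIOD}'
```

Setting a default value on the parameter lets the script run even when no value is supplied at import time.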

To add a parameter, follow these steps:

  1. Click the Edit Parameters icon to the right of the Parameters heading.
    The Parameters Editor dialog box appears (shown below).

    !loaders_script_edit_parms.png
     
  2. Click the folder in the left pane that you want to add a parameter to. The folder name appears in the right pane.

  3. Click Add and select one of the following options from the floating menu:

  • Folder: Create a new folder.
  • Boolean: Create a boolean parameter (true/false).
  • Integer: Create an integer parameter.
  • Text: Create a text parameter.
  • Password: Create a password parameter.
  • Period Range: Create a period range parameter (from/to dates).
  • Dimension: Create a dimension.
  • Actuals Version: Create an Actuals Version.
  • Plan Version: Create a Plan Version.
  4. In the right pane, enter the information for the option you selected.

  5. Click Apply.

You can edit individual parameters or the folders by double-clicking the item you want to edit in the left pane and making changes to the information in the fields, then clicking Apply to save your changes.

You can also create folders to group parameters, then drag parameters to the folder.

Importing Data with a Scripted Data Source

When you have set up and saved your scripted data source, you can import data.

To import data with a scripted data source, do the following:

  1. Click Import data in the Actions menu.
    The selected data is filtered by the ETL scripts, as well as any SQL expressions, and loaded into the staging area. You can preview the data in the staging area.

Information remains in the staging area until you clear it or until a subsequent data load causes it to be cleared per the active Data Import Mode setting.

To clear data from the staging area, do the following:

  1. Click Clear staging tables in the Actions menu.
    The staging area for all the tables in the data source is cleared.