You use the data source to do the initial data extraction from the spreadsheet or database. Although there may be some filtering of information through table selection or SQL expressions, it’s mostly just selection of the information to bring into the staging area. At this point, the next step in the import operation is done with loaders.
Loaders are a set of business rules that identify how the raw data from a data source is to be used. Loaders take the data from the staging area, add timestamps and do other things to prepare it for loading into Discovery, and then load the data.
Loaders prepare the data by doing the following:
identifying a timestamp for the column(s) you specify for referencing the information
You can set the format and granularity of the timestamp. The timestamp lets you view data across different time periods.
converting the information into metrics and dimensions
A metric is any collection or series of numeric data that is a direct numerical measure of some piece of business data. Revenue, expenses, or gross sales by week could all be metrics. Dimensions are a way to classify metrics. For example, if the metric is gross sales by week, the numerical measure is the dollars (the gross sales amount) and a dimension is time (the week).
For a given measure, you may also want to see the values across different hierarchies within a dimension. For example, you could look at the gross sales by day, week, or month. Making the association of a measure with a specific hierarchical level within a dimension is the granularity of the metric. As another example, you could have a metric for products sold. The measure would be the count of products sold. An associated dimension called Region could have values of North, South, East, and West, showing the region in which the products were sold. A second dimension might be Warehouse, identifying the warehouse from which the product was shipped.
applying SQL calculations and filters
You can use SQL to create calculations and filter conditions, such as filtering rows based on product category.
You can run a loader manually from the Loader screen or you can run a loader as part of a task.
Although you can set up multiple data sources that use the same spreadsheet or database, best practice is to create a single data source, import the data once to the staging area, then have multiple loaders take what they require for their specific needs. (You might need to use multiple data sources for the same external source if you need to specify different and incompatible data import filters. However, it’s still generally better to import the union of all the data you need to the staging area and then use filters on the loaders to extract the specific sets of data you need to use.)