Gaurav Sinha

Expert Insights: Understanding the Einstein Analytics Timeseries Function

Salesforce Einstein Analytics recently added a SAQL function that lets users forecast future values from their existing data, so that business decisions can be informed by those predictions.

Before getting into the details of the function and the inputs it requires, let’s understand what a time series is. A time series is a sequence of data points recorded at equally spaced time intervals, for example daily, weekly, or monthly snapshots of the data. Time series analysis covers the methods used to analyze this data and extract meaningful information from it. Time series forecasting uses a model to predict future data points from previously observed values.

A time series has three main components:

  1. Seasonality – Variations that recur at a fixed, known period, such as climate effects or responses during a specific time of year.
  2. Trend – The long-term direction of the data: upward, downward, or flat.
  3. Cyclic information – Repeated rises and falls in the data that do not follow a fixed period.
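
To make the components concrete, here is a minimal Python sketch, not part of Einstein Analytics, that builds a toy monthly series by adding a straight-line trend to a 12-month seasonal wave and then averages over a full season to recover the trend. The function names and all numbers are illustrative assumptions, and the cyclic component is left out for brevity.

```python
import math


def make_series(n_points, base=100.0, slope=2.0, amplitude=10.0, period=12):
    """Toy additive series: a linear trend plus a repeating seasonal wave."""
    return [
        base + slope * t + amplitude * math.sin(2 * math.pi * t / period)
        for t in range(n_points)
    ]


def moving_average(values, window):
    """Average every run of `window` consecutive points."""
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]


series = make_series(36)  # three years of monthly points
# Averaging over one full season cancels the seasonal wave and leaves the trend.
trend_estimate = moving_average(series, 12)
```

Because the seasonal wave sums to zero over a complete period, consecutive values of `trend_estimate` rise by the slope of the underlying trend.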

 

The Timeseries() Function

The timeseries function that Salesforce provides for forecasting performs time series analysis and finds the forecasting model that best fits the data you provide. You can also select a model yourself, although the choices are limited. If you don’t specify one, the function automatically identifies the seasonality and cycles in the data and chooses the best fit. The function is available through SAQL in a step in Einstein Analytics, with the following syntax (source: Salesforce Help):

R = timeseries T generate (measure1 as fmeasure1 [, measure2 as fmeasure2...]) with (parameters);
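
As a rough illustration of how that syntax looks once it is filled in, the Python sketch below assembles a timeseries statement as a string for a single measure. The helper name, the stream name `q`, and the field names (`sum_Amount`, `fAmount`, `Date_Year`, `Date_Month`) are hypothetical, and the quoting style is an assumption that should be checked against the SAQL reference. Only the `length` and `dateCols` parameters, described later in this post, are used.

```python
# Date grouping formats listed in this post for the dateCols parameter.
ALLOWED_DATE_FORMATS = {"Y-M", "Y-Q", "Y", "Y-M-D", "Y-W"}


def build_timeseries_statement(stream, measure, forecast_alias,
                               length, date_fields, date_format):
    """Return a SAQL timeseries statement as text (hypothetical helper)."""
    if date_format not in ALLOWED_DATE_FORMATS:
        raise ValueError("unsupported date grouping: " + date_format)
    # Field names are single-quoted; the grouping format is double-quoted.
    fields = ",".join("'" + name + "'" for name in date_fields)
    date_cols = "(" + fields + ', "' + date_format + '")'
    return (
        stream + " = timeseries " + stream
        + " generate '" + measure + "' as '" + forecast_alias + "'"
        + " with (length=" + str(length) + ", dateCols=" + date_cols + ");"
    )


statement = build_timeseries_statement(
    "q", "sum_Amount", "fAmount", 12, ["Date_Year", "Date_Month"], "Y-M"
)
print(statement)
```

The resulting text asks for 12 forecast points on a series grouped by year and month. In a real step it would follow the load, group, and foreach statements that produce the measure.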

Understanding the Timeseries Function

Let’s look at what goes into the timeseries function. measure1, measure2, and so on are the measures that we want to predict. Along with these, the function takes several parameters that control how the forecast is produced. These parameters are:

  1. length – The number of points to predict. In simple words, it is the duration for which the prediction is made. The unit of each point depends on the date grouping set by the dateCols parameter.
  2. dateCols – The date columns to use for grouping, together with the date grouping format. Only specific groupings are allowed:
    • Year/Month – “Y-M”
    • Year/Quarter – “Y-Q”
    • Year – “Y”
    • Year/Month/Day – “Y-M-D”
    • Year/Week – “Y-W”
  3. ignoreLast – Whether to leave the last time period out of the model’s calculations, which is useful when that period is still incomplete. It accepts true or false. The default is false.
  4. order – The field to use for ordering the data.
  5. partition – The column to use for partitioning the data. It should be a dimension. The calculation is done separately for each partition, which in turn improves the accuracy of the predictions.
  6. predictionInterval – Shows the uncertainty of each prediction as a range within which the future value is expected to fall, with an upper and a lower bound. The allowed confidence levels are 80 and 95.
  7. model – The forecasting model to use. You can specify a model yourself, or let the function select the best one by comparing the models using the Bayesian Information Criterion (BIC). The allowed values are:
    • None – Automatic selection based on BIC.
    • Additive – Treats the trend and seasonal effects as separate components that are added together. In this case, the Holt-Winters method is used with additive components.
    • Multiplicative – Assumes that the seasonal swings grow as the level of the data increases. Here the trend and seasonal components are multiplied, and the error component is then added. A Holt-Winters method with multiplicative components is used.
  8. seasonality – The final parameter, which is used together with dateCols. The allowed values are 0, which means no seasonality, or a value from 2 to 24.
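
The difference between the additive and multiplicative models can be sketched in a few lines of Python. This uses the textbook Holt-Winters forecast equations and is not Salesforce's implementation. The function name and every number are illustrative assumptions, and the smoothing step that would normally estimate the level, trend, and seasonal values from historical data is skipped.

```python
def holt_winters_forecast(level, trend, seasonal, horizon, model="additive"):
    """Forecast `horizon` points ahead from an already-fitted state.

    `seasonal` holds one value per position in the seasonal cycle, ordered
    so that seasonal[0] applies to the first forecast point.
    """
    if model not in ("additive", "multiplicative"):
        raise ValueError("model must be 'additive' or 'multiplicative'")
    period = len(seasonal)
    forecasts = []
    for h in range(1, horizon + 1):
        baseline = level + h * trend          # trend line extended h steps
        season = seasonal[(h - 1) % period]   # seasonal value for this step
        if model == "additive":
            forecasts.append(baseline + season)   # seasonal effect is added
        else:
            forecasts.append(baseline * season)   # seasonal effect scales the level
    return forecasts


# Quarterly data (four seasonal positions), five points ahead.
additive = holt_winters_forecast(
    100.0, 2.0, [5.0, -5.0, 10.0, -10.0], horizon=5
)
multiplicative = holt_winters_forecast(
    100.0, 2.0, [1.05, 0.95, 1.10, 0.90], horizon=5, model="multiplicative"
)
```

In the additive case the seasonal swing stays the same size as the trend climbs, while in the multiplicative case it grows in proportion to the level. Here `horizon` plays the role of the length parameter, and the number of seasonal positions corresponds to the seasonality parameter.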
