The Regression Activity allows you to view correlations on a Site, view the CuSUM and set targets based on the CuSUM and Regression. This activity is not available from the main Activity Menu; it is only available by right-clicking a Site or through the Site Overview activity.
Click on the Site in the Site Overview Activity
Click Actions in the top right of the screen to allow you to navigate to other activities within Sigma including Regression
Click on Regression
Alternatively, right-click on a Site and select Regression from the menu
On entry to the Regression function, a screen will be displayed which shows all existing Regression Correlations that have been created.
From here, it is possible to view and work with these existing correlations. Alternatively, new correlations can be created either manually or by letting the system automatically discover potential correlations. More details are provided below.
To view an existing regression correlation, either double-click the regression thumbnail or right-click it and choose the relevant option from the menu.
You will now be taken to a new screen and see three tabs - Regression, CUSUM and Control Chart.
Correlations can also be set up manually by choosing the required correlation items within the setup screen.
Once the Dependent and Independent Data Types have been set, along with the start and end date, select one of the data types and press the "Refresh Chart" button. A graph showing the data for the period then appears at the bottom of the window. This is useful for checking that the data is available and complete for the whole period. In the screenshot above, HH consumption data from a Meter is being plotted against Occupancy data in an Associated Data item.
Field | Details
---|---
Name | This is the name under which the correlation will be saved. If the "Use default name" checkbox is selected, the system will automatically generate a name based on the names of the dependent and independent data types that have been selected for the Regression.
Selecting from Data of Type | When selecting the data to use in the Regression, both a Dependent Data Type and an Independent Data Type must be selected. This drop-down list allows you to select which streams of data you wish to use in the regression, based on the data sets which are available for the selected Site.
Dependent Data Type | This is the data stream which might be influenced by the independent data type. It is usually metering data relating to electricity or gas consumed, or potentially electricity generated. In terms of Sigma items, this could be a meter, periodic channel, non-periodic channel or virtual meter. This can be set by "dragging and dropping" the appropriate item from the "Available Data Type" in the list above.
Independent Data Type | This is the data which might influence the dependent data type (for example, degree days, air temperature, solar irradiance, production output etc.). In terms of Sigma items, this will usually be Associated Data and will show all the items that are available for the Site. This can be set by "dragging and dropping" the appropriate item from the "Available Data Type" in the list above.
Dates | This sets the start and end date that the regression should be set for and the Timezone that should be used. The regression that is initially created will be for data that falls between these dates. Note - the dates can be updated subsequently when working with the Regression.
Interval Period | The granularity that should be used when creating the Regression, which determines the number of data points that are used. For example, if a year-long period is selected and an interval of One Month is used, then 12 data points will be included in the regression. Conversely, if One Week was selected, then 52 points would be included. Note - this should be set according to the length of time the Regression is being created for. If looking at a longer period of time, then a longer interval period might be used.
By using this function, Sigma will automatically search the available sets of data within the Site for any correlations with an R² value of 0.9 or greater.
The R² value is called the coefficient of determination and is a statistical measure of how close the data points are to the regression line. Typically, a value of 0.9 or above represents a good correlation and indicates that the two datasets are related, i.e. about 90% of the variation in consumption is explained by the influencing dataset.
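As a rough illustration of what the coefficient of determination measures, the sketch below fits a least-squares line and computes the value from it. The function names and the sample figures are hypothetical and chosen only for illustration; this is not Sigma's own implementation.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit; returns (gradient, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    gradient = sxy / sxx
    intercept = mean_y - gradient * mean_x
    return gradient, intercept


def r_squared(xs, ys):
    """Share of the variation in ys that the fitted line explains."""
    gradient, intercept = fit_line(xs, ys)
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - (gradient * x + intercept)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


# Hypothetical sample data: degree days (independent) vs consumption (dependent).
degree_days = [10, 20, 30, 40, 50]
consumption_kwh = [120, 205, 310, 395, 510]

# True when the pair would pass a 0.9 threshold.
is_good_correlation = r_squared(degree_days, consumption_kwh) >= 0.9
```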
Correlations will be between two data sets, and those found automatically will be displayed as follows: the first item is the dependent data type and the second is the independent data type.
Upon entering a Regression Correlation, the Regression screen will be displayed in the context of the Chart Tab.
This is where the Dependent Data Type (Y axis or vertical axis) is plotted against the Independent Data Type (X axis or horizontal axis) and we can start working with the relationship between the two datasets and determine what expected performance might be. There are a number of components in this screen, which are explained in the subsequent sections below.
Note - the Regression Correlation that is created is read only. If customisation of the regression is required (e.g. to exclude specific data points or manually set the gradient or intercept), then a new Regression Line needs to be created. Please see Regression Lines directly below on how to do this.
To export the regression data, click on Export.
This will create a zip file called "Regression.zip" that contains:
The graph plots the Dependent Data against the Independent Data and draws the regression line (line of best fit) based on all the data available between the dates that have been selected.
Where the points are green they are included in the regression. It is possible to exclude data points from the regression if there is a desire to remove these from determining the expected performance. Where this is the case, these would be shown in red.
This section shows a number of details relating to the relationship that has been established between the two datasets.
The gradient or intercept can be adjusted by entering a value in the popup box that is displayed.
To remove the custom gradient or intercept click
Note - when a custom gradient is set the "R" values are not available and will be shown as "N/A".
This feature allows the dependent variable to be calculated based on the Regression Correlation that has been created.
For example, if the independent variable (x-values) represented occupancy and the dependent variable (y-values) represented electricity consumption, then entering 50 and clicking "Evaluate" would calculate what the electricity consumption should be for an occupancy of 50.
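The calculation behind an evaluation of this kind is the straight-line equation of the regression. The sketch below assumes a gradient and intercept that are invented purely for illustration, with occupancy as the input and electricity consumption as the result.

```python
def evaluate(gradient, intercept, independent_value):
    """Expected dependent value on the regression line: y = gradient * x + intercept."""
    return gradient * independent_value + intercept


# Hypothetical regression line: 300 kWh baseline plus 12 kWh per occupant.
gradient = 12.0
intercept = 300.0

# Expected electricity consumption for an occupancy of 50.
expected_kwh = evaluate(gradient, intercept, 50)
```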
Use the start date and end date to choose the start and end of the period for the regression, using either the drop-down boxes or the calendar icon.
Selecting "Refresh" will restrict the data points on the graph to only those that fall within the two dates.
This section allows new Regression Lines to be created, so that they can be manipulated, saved and then re-visited at any point in the future.
When initially creating a new Regression Correlation, the system will create:
When entering the screen, any Regression Lines and correlations that have previously been created, modified and saved will be displayed.
To create a new Regression Line, click .
This opens the new Regression Line pop up.
Here the following details can be entered or updated:
Right-clicking on a Regression Line and selecting edit will show the same pop-up outlined in "Creating New Regression Lines" above and allow the same details to be updated for the existing Regression Line.
Right-clicking on a Regression Line and selecting remove will show the following confirmation pop-up.
Clicking "Yes" will remove the Regression Line.
It is possible to exclude specific data points from the regression correlation as you require either individually or in bulk in a variety of different ways. This is useful to exclude specific outliers that may significantly impact and skew the baseline performance.
Excluding points from the graph will recalculate the Regression Line.
This is managed via the Select Points component of the screen which can be seen on the left hand side. It can be achieved using the graph, using the time filter feature or updating the tabular data.
The graph can be used by selecting individual points on the graph (holding down the CTRL key on the keyboard if wanting to select multiple) or highlighting multiple points on the graph by left clicking and dragging the mouse over the applicable data points. This will place a black border around the selected points and add each of them as a unique row in the Selected Points pane.
Clicking the "Exclude" button will then exclude the selected data points from the regression and change them to red, showing visually which points are excluded.
Clicking the "Include" button will re-enable selected data points that have previously been excluded.
Clicking "Clear Selection" will de-select any points that are highlighted and listed in the Selected Points pane.
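The effect of excluding an outlier on the recalculated Regression Line can be sketched as follows. The data points, the excluded index and the helper name are hypothetical, and a plain least-squares fit is assumed.

```python
def fit_line(points):
    """Ordinary least-squares fit over (x, y) pairs; returns (gradient, intercept)."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    gradient = sxy / sxx
    return gradient, mean_y - gradient * mean_x


# Hypothetical data; the last point is an obvious outlier.
points = [(10, 120), (20, 205), (30, 310), (40, 395), (50, 900)]
excluded = {4}  # indices of the points marked as excluded (shown in red)

included = [p for i, p in enumerate(points) if i not in excluded]

gradient_all, intercept_all = fit_line(points)      # line skewed by the outlier
gradient_inc, intercept_inc = fit_line(included)    # line after exclusion
```

Re-including a point is the reverse operation: remove its index from `excluded` and fit again.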
The Time Filter feature can be used to quickly remove data points which relate to particular time periods. It is only available where an interval of one day, one hour or half an hour is used.
The view available will be applicable to the interval that has been used when creating the regression. For example, if a daily interval is used, then the days of the week would be available for inclusion/exclusion. Alternatively, if one hour was used, then a grid of the hourly time bands in the context of each day would be included where each 'cell' could be included or excluded.
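For a daily interval, a filter of this kind amounts to dropping every data point that falls on an excluded day of the week. The following is a minimal sketch with invented readings; the variable names are assumptions, not Sigma settings.

```python
from datetime import date

# Hypothetical daily consumption readings (kWh), keyed by day.
daily_kwh = {
    date(2019, 2, 1): 410.0,  # Friday
    date(2019, 2, 2): 120.0,  # Saturday
    date(2019, 2, 3): 115.0,  # Sunday
    date(2019, 2, 4): 405.0,  # Monday
}

# date.weekday() numbers Monday as 0, so 5 and 6 are Saturday and Sunday.
excluded_weekdays = {5, 6}

# Only the remaining points would feed the regression.
filtered = {
    day: kwh
    for day, kwh in daily_kwh.items()
    if day.weekday() not in excluded_weekdays
}
```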
The list of data points can be viewed in tabular form in the "Table" tab, which is described below.
The screen defaults to show the data in a graph. This can be changed to show a list of all the data points on that graph in tabular form.
Click on the "Table" tab at the top of the screen to display a list of the data points.
For each data point the table provides:
Once the Regression Correlation has been created and the expected performance determined, the CuSUM control chart can be used to view the performance of the dependent variable over time. More details about this can be seen on the introduction page here.
This can be accessed by clicking on either of the "CUSUM" links at the top of the screen:
The graph shows the cumulative data based on the CuSUM period.
This is the cumulative sum, over time, of the differences between the actual value and the value predicted by the regression (actual minus predicted). For example, if the regression expects the consumption to be 50 based on the production output being 10, and the actual value is 75, the difference would be 25. This data can be seen in the Table view, below.
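The running total described here can be sketched in a few lines. The first pair of values reuses the figures from the example (actual 75, predicted 50); the remaining values are invented for illustration.

```python
from itertools import accumulate


def cusum(actual, predicted):
    """Running total of (actual - predicted) for each period, in order."""
    differences = [a - p for a, p in zip(actual, predicted)]
    return list(accumulate(differences))


# Hypothetical consumption per period and the regression's prediction for each.
actual = [75, 60, 48]
predicted = [50, 55, 50]

cusum_values = cusum(actual, predicted)
```

A rising sequence means actual consumption is running above the predicted values; a falling one means it is running below.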
Clicking and dragging your mouse over the graph will zoom into the data. Clicking the "Reset Zoom" button that appears will reset the graph to what it was.
This will highlight the step change in performance over time and highlight periods where there is significant performance degradation (i.e. line on the graph goes up), or improvement (i.e. line on the graph goes down). This trend over time may otherwise be hidden in the graph that was generated to create the Regression. This gives a much better view of the performance over time, so when looking at the CuSUM chart, the changes in direction of the line indicate events that have relevance to the energy consumption pattern.
In the example above, reviewing the chart allows you to quickly see that there was a step change in performance which started in February 2019, as per the annotation below, which isn't immediately obvious in the general correlation. The line goes down sharply after a prior consistent rising trend. This might trigger investigative action as to what changed at the start of February 2019 and lead to corrective action being taken to optimise the performance. Subsequently, the same method would be used to re-assess the performance over the updated period of time and validate that the performance has improved and the building is operating optimally. If corrective action was taken, and a new Regression Line was created to also include the new period of time, then you would expect to see the line start going back "up" after the changes had been made to bring performance back on track.
This technique can then be used on an ongoing basis in the continuous improvements lifecycle to help identify and react to resolve issues effectively.
This section shows a number of details relating to the CuSUM that has been established.
CuSUM and CuSUM CO2 will always be 0.00 when moving straight from Regression without changing the CuSUM period. When the CuSUM period is changed, anything above the 0 value based on the regression period will be calculated and listed both in kWh and as CO2 emissions.
This section allows the creation of visual overlays which can be used to determine whether the performance shown in the CuSUM graph is out of control. A V-Mask is an overlay shape in the form of a V on its side that is superimposed on the graph of the cumulative sums. The origin point of the V-Mask (see diagram below) is placed on top of the latest cumulative sum point and past points are examined to see if any fall above or below the sides of the V.
Tick the Show V-Mask box to add a V-Mask target onto the CuSUM.
To confirm target creation select "Yes" in the resulting popup window
By default a truncated V-Mask will be applied, adding a green layer to the graph.
The CuSUM points that now fall above the top arm or below the bottom arm of the V-Mask will be highlighted in red and shown on the graph as "exceptions".
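The way a V-Mask flags exceptions can be sketched as below. This shows a plain (non-truncated) mask anchored on the latest point, with arms that widen the further back they reach. The parameter names `half_width` and `slope` are assumptions made for this example and are not taken from Sigma's configuration.

```python
def v_mask_exceptions(cusum_values, half_width, slope):
    """Indices of past points lying outside the arms of a V-Mask.

    The mask origin sits on the latest point. At a distance of d periods
    back, the arms are half_width + slope * d above and below the origin.
    """
    last = len(cusum_values) - 1
    origin = cusum_values[last]
    exceptions = []
    for i, value in enumerate(cusum_values[:last]):
        reach = half_width + slope * (last - i)
        if value > origin + reach or value < origin - reach:
            exceptions.append(i)
    return exceptions


# Hypothetical CuSUM series with one spike at index 3.
series = [0, 5, 10, 40, 12, 11]
flagged = v_mask_exceptions(series, half_width=5, slope=2)
```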
There are 3 types of V-Mask targets available as follows
The type of V-Mask that is used can be changed after a V-Mask has been created. This is explained in the "Targets" section directly below.
Tick Show Fixed Targets to add a CuSUM fixed target. To confirm target creation select "Yes" in the resulting popup window.
Use the start date and end date to choose the start and end of the period for the CuSUM, using either the drop-down boxes or the calendar icon.
Selecting "Refresh" will restrict the data points on the graph to only those that fall within the two dates.
This section displays any targets that have been created based on the CuSUM graph.
If you edit a CuSUM V-Mask Target you will be presented with the following window. After changing any configuration, click "Recalculate" and the chart will be updated.
If you edit a CuSUM Fixed Target you will be presented with the following window. The target value can be entered as required and configured to end on a specific date. After changing any configuration, click "OK" and the chart will be updated.
Update the settings as required and click OK
The Targets will only create an event if the first point of data is outside the tolerance range by more than 5%, or if there are two consecutive points of data outside the tolerance range but within the 5% range.
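One possible reading of this rule is sketched below: a single point more than 5% outside the tolerance range creates an event on its own, whereas smaller breaches only count when two occur in a row. The function name and the input format (a percentage overshoot per point, 0 when inside the range) are assumptions, and Sigma's actual logic may differ in detail.

```python
def creates_event(overshoot_pct):
    """Apply the event rule to a list of per-point overshoot percentages.

    Each entry is how far (in percent) the point lies outside the
    tolerance range, or 0 if it is inside the range.
    """
    for i, pct in enumerate(overshoot_pct):
        if pct > 5:
            return True  # a single large breach is enough
        has_next = i + 1 < len(overshoot_pct)
        if pct > 0 and has_next and overshoot_pct[i + 1] > 0:
            return True  # two consecutive breaches
    return False


isolated_small_breaches = creates_event([0, 3, 0, 2])
single_large_breach = creates_event([0, 6, 0])
two_small_in_a_row = creates_event([0, 3, 4])
```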
Select the "Table" tab to show the data points that are plotted on the graph.
The date displayed within the table is the end date for that period
The table and charts can be saved by click
It is not possible to go directly from Regression to the Control Chart, as it is designed as a stepped approach.
Navigate from Regression to the Control Chart via CuSUM by clicking the "CUSUM" tab and then the "Control Chart" tab.
The graph will display data for the same period that has been set in the CuSUM
The graph displays:
Control Lines - accepted areas of data based on the control limit
Exceptions - areas which fall outside of the control limit
Difference line - data for the period being presented
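How these three elements relate can be sketched as follows, assuming the difference is actual minus predicted and the control lines sit symmetrically at plus and minus a single control limit. Both are assumptions made for illustration, and the figures are invented.

```python
def control_chart(actual, predicted, control_limit):
    """Return the difference line and the indices of its exceptions.

    A point is an exception when its difference lies outside the band
    from -control_limit to +control_limit.
    """
    differences = [a - p for a, p in zip(actual, predicted)]
    exceptions = [i for i, d in enumerate(differences) if abs(d) > control_limit]
    return differences, exceptions


# Hypothetical consumption against a flat prediction of 50 per period.
actual = [52, 70, 49, 30]
predicted = [50, 50, 50, 50]

difference_line, exception_indices = control_chart(actual, predicted, control_limit=15)
```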
The Overview Graph highlights the current target period by default and displays historic and future data
The Overview Graph can be minimised by clicking the Overview header bar
By default a New Control Target will be created for the user to amend as required. This target is not saved unless you click the Save button
The Targets section details the current control line parameters as well as allowing you to add and remove targets.
To amend the control lines that are applied, change the control limit.
Alternatively, you can set the desired control limits manually by clicking and dragging the control lines on the graph.
To remove the Control Limits from the graph uncheck .
The Targets will only create an event if the first point of data is outside the tolerance range by more than 5%, or if there are two consecutive points of data outside the tolerance range but within the 5% range.