# Calculating a baseline formula for Measurement and Verification projects

## Introduction

This article gives basic guidelines for calculating the mathematical formula that defines the reference consumption (baseline) of a facility. It reviews the available variables, the reference period and how to obtain the formula.

## External variables

Depending on the data that is available, the error and uncertainty of the regressions will vary. Obtaining additional data may add cost to the analysis.

Below is a matrix with some of the influential external variables that can affect energy consumption. Each project can be influenced by other variables, but these are the most common. The user must identify them based on the facility and their experience.

## Reference period

To know the minimum period of data that must be collected before defining the reference consumption (baseline), we must first establish the period of time that covers a representative cycle of activity in the facility. Some examples:

## Data preprocessing

Once the variables to be correlated with energy consumption have been established, they can be entered in the platform and prepared for use with statistical software (Excel, Minitab, Stata, MATLAB, etc.) in order to generate the formula.

There can be interactions between different variables, or a variable may only become representative when it is squared or cubed. It is recommended to prepare additional data series for the products of variables (interactions) and for their powers.
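As an illustration of preparing those additional series, here is a minimal sketch in Python with NumPy. The variable names (`temperature`, `production`) and the choice of terms (one product and one square) are assumptions made for the example, not a recommendation for any particular facility.

```python
import numpy as np

def build_design_matrix(temperature, production):
    """Return one row per observation with the columns:
    intercept, temperature, production, temperature*production, temperature**2.
    """
    t = np.asarray(temperature, dtype=float)
    p = np.asarray(production, dtype=float)
    return np.column_stack([
        np.ones_like(t),  # constant term of the formula
        t,                # first variable
        p,                # second variable
        t * p,            # interaction between the two variables
        t ** 2,           # squared term
    ])

# Hypothetical readings, only to show the shape of the result.
X = build_design_matrix([10.0, 15.0, 20.0], [100.0, 120.0, 90.0])
print(X.shape)  # (3, 5)
```

Each extra column becomes a candidate term in the regression; terms that do not improve the fit can be dropped later.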

Anomalous data, or data that does not represent the normal operation of the installation, should be removed at this point.
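The article does not prescribe how to detect anomalous readings. One common option, shown here only as a sketch, is to flag missing values and values outside an interquartile-range fence. The function name `keep_mask`, the factor `k = 1.5` and the sample readings are assumptions for the example; flagged points should be reviewed before they are discarded.

```python
import numpy as np

def keep_mask(consumption, k=1.5):
    """Return a boolean mask: True for readings to keep.

    A reading is dropped when it is missing (NaN) or lies more than
    k interquartile ranges below Q1 or above Q3.
    """
    c = np.asarray(consumption, dtype=float)
    q1, q3 = np.nanpercentile(c, [25, 75])  # quartiles, ignoring NaN
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return ~np.isnan(c) & (c >= lower) & (c <= upper)

# Hypothetical daily readings with one spike and one missing value.
readings = [100.0, 102.0, 98.0, 101.0, 99.0, 500.0, float("nan")]
mask = keep_mask(readings)
clean = np.asarray(readings)[mask]
print(clean)
```

The same mask should be applied to every variable so that consumption and external variables stay aligned row by row.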

## Obtaining the mathematical formula

The mathematical formula is the result of a linear or nonlinear statistical regression. Each software package has its own procedure for running the regression, so follow the tutorials of the package you use to obtain the formula.

Each project has its own mathematical formula; there are no "standard" formulas for a type of installation.
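For readers who prefer a script to a statistics package, a linear regression can be sketched with NumPy's least-squares solver. The data below is synthetic and was chosen to lie exactly on a line so that the result is easy to check; the variable `degree_days` and the single-variable model are assumptions for the example, and real projects will have scatter and often more variables.

```python
import numpy as np

# Synthetic monthly data (hypothetical): consumption = 1000 + 20 * degree_days
degree_days = np.array([120.0, 200.0, 310.0, 400.0, 150.0, 90.0])
consumption = np.array([3400.0, 5000.0, 7200.0, 9000.0, 4000.0, 2800.0])

# Design matrix: a column of ones for the constant term, then the variable.
X = np.column_stack([np.ones_like(degree_days), degree_days])

# Ordinary least squares: find the coefficients that minimise the squared error.
coef, _, _, _ = np.linalg.lstsq(X, consumption, rcond=None)
intercept, slope = coef

print(f"baseline = {intercept:.1f} + {slope:.2f} * degree_days")
```

The fitted coefficients are the baseline formula for this data set only, which is consistent with the point above that no formula carries over from one installation to another.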

### Statistical error or uncertainty

Every statistical process involves some error. The procedure selected to calculate the formula will give a coefficient of determination, which indicates the proportion of the variation in consumption that the formula explains.

Normally, the indicator is R^2.

For example, if the result is R^2 = 98%, the formula explains 98% of the variation in consumption, and the remaining 2% is left unexplained as error or uncertainty.
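To make the indicator concrete, R^2 can be computed from the measured consumption and the values predicted by the formula as one minus the ratio of the residual sum of squares to the total sum of squares. The sketch below uses that standard definition; the function name `r_squared` and the sample numbers are assumptions for the example.

```python
import numpy as np

def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y = np.asarray(observed, dtype=float)
    y_hat = np.asarray(predicted, dtype=float)
    ss_res = np.sum((y - y_hat) ** 2)     # variation the formula does not explain
    ss_tot = np.sum((y - y.mean()) ** 2)  # total variation around the mean
    return 1.0 - ss_res / ss_tot

# Hypothetical measured values and the values given by a fitted formula.
observed = [10.0, 20.0, 30.0, 40.0]
predicted = [12.0, 18.0, 32.0, 38.0]
r2 = r_squared(observed, predicted)
print(f"R^2 = {r2:.3f}")  # R^2 = 0.968
```

Here the residual sum of squares is 16 and the total sum of squares is 500, so the formula explains 96.8% of the variation in the observed values.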