Regression analysis is a popular statistical tool for illustrating trends. It also underlies various risk assessment measures such as Jensen's alpha and the Treynor ratio, both of which rely on a beta obtained by regression. The basic method of linear regression constructs a straight line in a two-dimensional coordinate system such that the data points lie as near as possible to this line, usually in the sense that the sum of the squared vertical distances between the points and the line is minimal.
The straight line constructed in this manner is described by two parameters: alpha (the intercept, i.e. the point at which the line crosses the y-axis) and beta (the slope, i.e. the gradient of the line). With these two parameters, the point n,y of the regression line can be calculated for every data point n.
The formulas for the calculation of alpha and beta in the analysis of time series are:
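Assuming the standard least-squares estimators with the data-point number n as the x-value (the exact algebraic form intended here is an assumption, and equivalent forms exist), they can be written as:

```latex
\beta = \frac{N \sum_{n=1}^{N} n\, y_n \;-\; \sum_{n=1}^{N} n \,\sum_{n=1}^{N} y_n}
             {N \sum_{n=1}^{N} n^2 \;-\; \left(\sum_{n=1}^{N} n\right)^2},
\qquad
\alpha = \frac{\sum_{n=1}^{N} y_n \;-\; \beta \sum_{n=1}^{N} n}{N}
```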
In these equations, N is the number of data points in the time series (the number of days, for example), and n is the index of a data point, i.e. 1 for the first and 1000 for the thousandth data point in the time series. That is to say, y-values exist only for the natural numbers n on the x-axis. The curve of the time series thus arises by connecting all points n,y of the time series.
Hence, the calculation for time series of a constant length is relatively easy: because the x-values are simply 1, 2, …, N, the denominators in the formulas for alpha and beta depend only on N and therefore become constants.
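A minimal Python sketch of this calculation might look as follows. It assumes the standard least-squares estimators with x-values 1, 2, …, N; the function name `fit_line` is purely illustrative. The sums over n and n² are obtained from their closed forms, which is what makes them constants for a fixed series length.

```python
def fit_line(y):
    """Return (alpha, beta) of the least-squares line through
    the points (1, y[0]), (2, y[1]), ..., (N, y[N-1])."""
    N = len(y)
    if N < 2:
        raise ValueError("need at least two data points")

    # These quantities depend only on N, so for a fixed series
    # (or window) length they can be computed once and reused.
    sum_n = N * (N + 1) / 2                   # 1 + 2 + ... + N
    sum_n2 = N * (N + 1) * (2 * N + 1) / 6    # 1^2 + 2^2 + ... + N^2
    denom = N * sum_n2 - sum_n ** 2

    # Only these two sums depend on the data.
    sum_y = sum(y)
    sum_ny = sum(n * v for n, v in enumerate(y, start=1))

    beta = (N * sum_ny - sum_n * sum_y) / denom
    alpha = (sum_y - beta * sum_n) / N
    return alpha, beta
```

For example, `fit_line([3, 5, 7, 9])` returns `(1.0, 2.0)`, since these four points lie exactly on the line y = 1 + 2n.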
Using the alpha and beta values thus calculated, it is now possible to calculate a point n,y of the regression line for every point n in the time series:
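Written out under the definitions of alpha as intercept and beta as slope given above (the hat notation for the regression value is an assumption), this is simply the equation of a straight line:

```latex
\hat{y}_n = \alpha + \beta \, n, \qquad n = 1, \dots, N
```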
The connection of these points yields a regression line as in Fig. 1. It illustrates the direction and strength of the trend of the blue curve representing the time series.
Fig. 1: Linear regression lines such as the red line in the graph above illustrate the trend of a time series over its entire duration. The slope of the line (beta) indicates the direction and strength of the trend, while the point of intersection with the y-axis (alpha) fixes the level of the line.
Regression lines can be used for a great number of analyses and transformations of the time series under investigation. One of the most interesting fields of application is the so-called detrending procedure, i.e. the removal of trends from the time series. Such a transformation is necessary if a software system capable of learning is to investigate a time series, for example in order to make predictions or to learn trading strategies. A strong trend would lead such a system to always predict in the direction of the trend, since doing so would give it a higher likelihood of being correct. Such a procedure, however, has nothing to do with intelligence.
A transformation on the basis of regression procedures can be carried out as illustrated in Fig. 2. Here, the difference between the regression value (red line) and the actual value (blue line) is calculated at every data point n. The transformed time series (green line) thus generated is adjusted for the strong upward trend. It corresponds to the blue line but has been tilted downward, figuratively speaking, by the slope angle of the regression line.
Fig. 2: The diagram illustrates the effect of a trend reduction based on linear regression. The trend-adjusted green time series generated by the transformation is now much more suitable for processing by software systems.
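The detrending step can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the procedure behind Fig. 2: the function name `detrend` is hypothetical, the line is fitted by ordinary least squares over the whole series, and the sign convention (actual value minus regression value) is chosen so that the result keeps the shape of the original curve.

```python
def detrend(y):
    """Subtract the least-squares regression line from a series.

    The x-values are assumed to be the data-point numbers 1..N.
    Returns a list of residuals: actual value minus regression value.
    """
    N = len(y)
    if N < 2:
        raise ValueError("need at least two data points")

    n_mean = (N + 1) / 2          # mean of 1..N
    y_mean = sum(y) / N

    # Slope and intercept in the "deviation from the mean" form,
    # which is algebraically equivalent to the sum-based formulas.
    sxx = sum((n - n_mean) ** 2 for n in range(1, N + 1))
    sxy = sum((n - n_mean) * (v - y_mean)
              for n, v in enumerate(y, start=1))
    beta = sxy / sxx
    alpha = y_mean - beta * n_mean

    return [v - (alpha + beta * n) for n, v in enumerate(y, start=1)]
```

By construction, the residuals sum to zero and have no remaining linear trend, so a learning system fed with them can no longer profit from simply guessing in the direction of the trend.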
Much more information than the direction and strength of a trend can be obtained by means of regression procedures if the regression line is constructed to cover only a subset of the data points. Just as in the case of a moving average, an analysis window is moved through the time series one data point at a time. One begins, for example, with the first 10 data points, n = 1 … 10, and constructs a linear regression line through this time span. This procedure is then continued with the time span n = 2 … 11, then n = 3 … 12, etc.
The construction of a moving linear regression line goes a step further. Alpha and beta values are calculated on the basis of the data in the partial time span. These are then inserted into the above-mentioned formula in order to calculate a hypothetical value for the data point following the window, e.g. the following day. Such a prediction then states: »If the time series continues to develop as it has over the window just analyzed (the last 10 days, for example), it will have the value y at the next data point (tomorrow).« Thus, the predicted y contains much information concerning the entire time span under investigation, as well as a prediction based on this knowledge.
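The moving regression forecast described above can be sketched as follows. The names `fit_line` and `moving_regression_forecast` and the default window length of 10 are illustrative assumptions; within each window the data points are renumbered 1 to N, so the forecast is the value of the fitted line at N + 1.

```python
def fit_line(window):
    """Least-squares (alpha, beta) for points (1, w[0]), ..., (N, w[N-1])."""
    N = len(window)
    n_mean = (N + 1) / 2
    y_mean = sum(window) / N
    sxx = sum((n - n_mean) ** 2 for n in range(1, N + 1))
    sxy = sum((n - n_mean) * (v - y_mean)
              for n, v in enumerate(window, start=1))
    beta = sxy / sxx
    alpha = y_mean - beta * n_mean
    return alpha, beta


def moving_regression_forecast(y, window=10):
    """One-step-ahead forecasts from a moving regression line.

    forecasts[i] is the prediction for the data point that follows
    the window y[i : i + window]. The last entry therefore refers to
    a point that has not been observed yet.
    """
    if window < 2:
        raise ValueError("window must contain at least two points")
    forecasts = []
    for start in range(len(y) - window + 1):
        alpha, beta = fit_line(y[start:start + window])
        # Extend the fitted line by one step beyond the window.
        forecasts.append(alpha + beta * (window + 1))
    return forecasts
```

On a series with an unbroken linear trend such as `[3, 5, 7, 9, 11]` and `window=3`, the forecasts are 9, 11 and 13 (up to floating-point rounding), i.e. the line is simply continued.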
Various forecasting systems use this regression procedure for their predictions. During phases of a prevailing trend, they are quite accurate using this procedure. The formula for the calculation of the predicted value shows, however, at what points in the time series such a predictive procedure will lead to error. Such a system will issue false predictions when a trend is reversed, i.e. when the qualification »if things continue as before« no longer applies. These turning points in a time series, however, are precisely the most interesting, especially when we are dealing with the price curve of a security.
Hence, systems that make predictions for the following day or even further into the future on the basis of regression procedures are of little use to the technical trader, for they leave the trader in the lurch precisely at the point when the situation becomes dangerous. The moving linear regression (MLR) line, and hence also its forecast, is able to change direction only once the trend reversal has already set in. This behavior is illustrated well in Fig. 3. Sometimes it takes the MLR curve two days before the predictions are back on course. In a forecasting system, the red curve would be able to claim very good hit rates, and yet this would only amount to fraudulent labelling.
Fig. 3: The above graph shows the differences and common features of a moving average and a moving regression curve. A moving average always reveals a lag. During phases of an unbroken trend, this is not the case in a regression forecast. At the points of reversal in the chart, however, the red curve too has a lag. Hence, it follows the trend instead of predicting it.
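The lag at a reversal can be reproduced with a small synthetic example. The sketch below is not the data behind Fig. 3: the V-shaped series, the window length of 3 and the function names are assumptions made purely for illustration. It compares each one-step regression forecast with the value that actually followed.

```python
def forecast_next(window):
    """Fit a least-squares line to the window (x = 1..N) and
    return its value at x = N + 1."""
    N = len(window)
    n_mean = (N + 1) / 2
    y_mean = sum(window) / N
    sxx = sum((n - n_mean) ** 2 for n in range(1, N + 1))
    sxy = sum((n - n_mean) * (v - y_mean)
              for n, v in enumerate(window, start=1))
    beta = sxy / sxx
    alpha = y_mean - beta * n_mean
    return alpha + beta * (N + 1)


def forecast_errors(y, window):
    """Forecast minus actual value for every point that has a
    full window of predecessors."""
    return [forecast_next(y[i - window:i]) - y[i]
            for i in range(window, len(y))]


# Synthetic series: steady rise, reversal at the peak, steady fall.
series = [1, 2, 3, 4, 5, 6, 5, 4, 3, 2, 1]
errors = forecast_errors(series, window=3)
```

Here the errors are zero while the trend is unbroken, jump to 2 and then about 1.33 on the two data points following the peak, and return to zero once the window lies entirely within the new trend. The forecast is thus exact during the trend and wrong exactly at the reversal.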
Although moving regression analyses are not really any good for making predictions, they can be quite useful for implementing short-term memory structures. The behavior of moving regression lines has various parallels in human reasoning, where a generalization is made on the basis of past events. Thus, for example, we infer that night will soon fall if we observe that it has been getting progressively darker over an extended period of time. Here too, we combine several successive observations and draw an inference for the future.
Software components capable of executing moving regression analyses also make it possible for computer applications to develop the ability to draw such conclusions and to use this ability in the analysis of data.