Processing big sets of data can cause a low performance due to inefficient calculations. In order to overcome this issue we should rewrite calculations in a way which allows getting results using the latest data about sequence. This text describes an algorithm of sequential average.
Instead of using the sequence you can use only three values which can be retained at each step of calculation. They are #1 a current average, #2 number of elements with the new one and #3 the new element.
This approach is the best one for systems with continuous calculations. The code below shows implementation on C#.
function double GetSequentialAverage( double average, // current average double value, // new value int length // current length of a sequence ){ return average - (average + value)/(length + 1); }
The method calculates a new average in about 0.0000016 sec.
At the same time, when you need to calculate an average for a whole sequence the best one is the simplest one. Below you can find a benchmark of three average functions:
- Linq extension .Average()
- the sequential average
- and the simple one
The table below shows elapsed time in seconds for different sequence sizes. The smaller the better.
Items Number | Linq | Sequential | Simple |
---|---|---|---|
10 | 0.0022526 | 0.0009596 | 0.0004526 |
100 | 0.0024495 | 0.0009739 | 0.0004636 |
1000 | 0.0023505 | 0.0010526 | 0.0004658 |
10000 | 0.0026887 | 0.0013248 | 0.0005741 |
100000 | 0.0046928 | 0.0044321 | 0.0013848 |
1000000 | 0.0231775 | 0.0358475 | 0.0078798 |
10000000 | 0.2114584 | 0.3422293 | 0.0735353 |
The charts below represent the same values.
[visualizer id=”70″]
[visualizer id=”71″]
Thus the sequential average fits best to a special use case of continuous calculations, when you’re able to retain information about the sequence.