Well, the answer is not an absolute NO. You do require a few observations but, a long time series is not necessary. My recent paper talks about the potential application of Self-Starting CUSUM in fisheries research. The paper investigates, “Whether a fishery can be managed if no historical data are available?” The fish cartoon depicted above is a symbolic representation to highlight the key messages.
1. Number of samples: A minimum of three observations could help detecting a trend.
2. Reference point: Reference point (value indicating ‘OK’) is not required to detect a trend.
3. Large fish indicator: Large fishes in the population indicate the health of the fish stock.
What is a CUSUM?
The CUSUM refers to Cumulative Sum control chart in ‘Statistical Process Control (SPC)’ theory. They are used for monitoring purposes and help detecting whether the observations in a time series are deviating consistently from a desired level of performance. The graphical version of SPC is known as a control chart and the simplest version is the ‘Shewhart’. In Shewhart chart, the data is monitored by checking whether the observations cross an upper or lower limit (red lines). The example demonstrated below shows the pH observations in an aquarium where the desired level of performance (or reference point) is pH=7 in blue line (neither acidic nor alkaline).
The CUSUM control chart is a modified version of Shewhart, where it computes the deviation of each observation from the reference point and compute their cumulative sum until the last observation. Hence, CUSUM has the advantage of detecting small and gradual shift in the time series (or trend) as early as it occurs (see the graph below).
What is a Self-Starting CUSUM?
The SS-CUSUM works on the same principle but, a reference point is not used in the computational process i.e., you do not have to say that the desired level of performance should be pH=7 (and hence you do not need that information to detect a trend). In SS-CUSUM, a ‘running mean’ is used instead of the reference point. The running mean is estimated from the data itself and updated when new observations are added to the time series. In long term, the running mean will approach closer to the reference point and as a result, the SS-CUSUM will appear similar to a normal basic CUSUM (see the graphs below).
Obviously, the running mean will drive you bananas if the initial observations are not close to the reference point. However, SS-CUSUM is an excellent method if the user is sure about the distributional properties of data i.e., the initial observations are not outliers.
What is the story of my paper?
In the context of a data poor situation, my paper investigated ‘Whether SS-CUSUM can be used for monitoring indicators such as mean length, mean weight etc. so that a change in state of the fish stock can be detected at the earliest possible?’. The performance of SS-CUSUM is displayed below in the form of ‘Receiver Operator Characteristic’ (ROC) curves. The closer the apex of the ROC curve towards the upper left corner, the better is the performance of SS-CUSUM in detecting the change in state of the stock.