I am using the MLTK.
I have a question about the "Detect Numeric Outliers" use case, specifically line #4 of the search below.
Why is that line important when detecting outliers? I have plotted two graphs: Graph 1 uses line #4 and Graph 2 does not.
To me, Graph 2 seems the more accurate one because it shows the forecast (future_timespan=172) from 30 Nov to 4 Dec, whereas Graph 1 drops those days and only shows data up to 30 Nov.
1. | inputlookup logins.csv
2. | predict logins as prediction algorithm=LLP future_timespan=172 holdback=36
3. | eval residual = prediction - logins
4. | where prediction!="" AND logins!=""
5. | table _time, logins prediction residual
USING: where prediction!="" AND logins!=""
![USING: ][1]
WITHOUT: where prediction!="" AND logins!=""
![WITHOUT: ][2]
[1]: /storage/temp/274918-capture.png
[2]: /storage/temp/274919-capture1.png