SQL Server - Standalone R and R in SQL Server give different results


I am working on a forecasting model for monthly data that I intend to use in SQL Server 2016 (in-database).

I created a simple TBATS model for testing:

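    library(forecast)

    # Build a multi-seasonal time series from the value column (column 3),
    # starting at the first (year, month) pair in the data.
    dataset <- msts(data = dataset[, 3],
                    start = c(as.numeric(dataset[1, 1]),
                              as.numeric(dataset[1, 2])),
                    seasonal.periods = c(1, 12))

    # Replace outliers and missing values, using a Box-Cox lambda
    # estimated by maximum likelihood.
    dataset <- tsclean(dataset,
                       replace.missing = TRUE,
                       lambda = BoxCox.lambda(dataset,
                                              method = "loglik",
                                              lower = -2,
                                              upper = 1))

    # Fit the TBATS model.
    dataset <- tbats(dataset,
                     use.arma.errors = TRUE,
                     use.parallel = TRUE,
                     num.cores = NULL)

    # Forecast 24 months ahead with 80% and 95% prediction intervals.
    dataset <- forecast(dataset,
                        level = c(80, 95),
                        h = 24)

    dataset <- as.data.frame(dataset)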

The dataset is imported from a .csv file created by a SQL query.

Later, I used the same code in SQL Server, with the input being the same query I used for the .csv file (meaning the data is the same as well).

However, when I executed the script, I noticed that I got different results. All the numbers look fine and make perfect sense, and both SQL Server and standalone R give me a forecast table, but the numbers in the two tables differ by a few percent (about 3% on average).

Is there an explanation for this? It bothers me because I need the best possible results.

Edit: Here is how the data looks, for easier understanding. It's a 3-column table: year, month, value of transactions (the numbers are randomised because the data is classified). Altogether I have data for 9 years.

2008    11  1093747561919.38
2008    12  816860005030.31
2009    1   341394536377.06
2009    2   669993867646.25
2009    3   717585597605.75
2009    4   627553319006.03
2009    5   984146176491.78
2009    6   605488762214.33
2009    7   355366795222.40
2009    8   549252969698.07
2009    9   598237364101.23

This is an example of the results. The top 2 rows are from SQL Server, the bottom 2 rows are from RStudio.

t    point            lo80            hi80
1    872379.7412      557105.271      1187654.211
2    1093817.266      778527.1078     1409107.424

1    806050.6884      517606.464      1094494.913
2    1031845.483      743387.015      1320303.95
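To put a number on how far apart the two outputs are, the sample rows can be compared column by column. The following is only a minimal sketch in base R: the data frame names `sql_server` and `rstudio` are made up for illustration, and the values are the four sample rows quoted in this post, not the full 24-month forecast.

```r
# Sample forecast rows quoted in the post (first two horizons only).
sql_server <- data.frame(
  t     = 1:2,
  point = c(872379.7412, 1093817.266),
  lo80  = c(557105.271, 778527.1078),
  hi80  = c(1187654.211, 1409107.424)
)
rstudio <- data.frame(
  t     = 1:2,
  point = c(806050.6884, 1031845.483),
  lo80  = c(517606.464, 743387.015),
  hi80  = c(1094494.913, 1320303.95)
)

# Percentage difference of the SQL Server values relative to RStudio,
# computed element by element for each forecast column.
value_cols <- c("point", "lo80", "hi80")
pct_diff <- 100 * (sql_server[value_cols] - rstudio[value_cols]) /
  rstudio[value_cols]
pct_diff <- cbind(t = sql_server$t, round(pct_diff, 2))

print(pct_diff)
```

Running the same comparison on the full forecast tables from both environments would show whether the gap is roughly constant or grows with the forecast horizon.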

Edit 2: I checked each part of the code and figured out that the difference in results happens at the TBATS model.

SQL Server returns: TBATS(0.684, {0,0}, -, {<12,5>})

RStudio returns: TBATS(0.463, {0,0}, -, {<12,5>})

This explains the difference in the forecast values, but the question remains why they differ, since these should be the same.

I'll answer this myself for anyone having the same problem in the future:
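The two model descriptors can be compared piece by piece to see which part actually changed. This is a rough sketch in base R, assuming the descriptor layout printed by the forecast package (Box-Cox lambda, ARMA orders, damping parameter, seasonal periods with their Fourier term counts). The helper name `split_descriptor` and the component labels are made up for illustration.

```r
# Model descriptors as reported by the two environments.
sql_model     <- "TBATS(0.684, {0,0}, -, {<12,5>})"
rstudio_model <- "TBATS(0.463, {0,0}, -, {<12,5>})"

# Hypothetical helper: strip the "TBATS(...)" wrapper and split the
# descriptor on ", " (the commas inside braces have no trailing space).
split_descriptor <- function(descriptor) {
  inner <- sub("^TBATS\\((.*)\\)$", "\\1", descriptor, ignore.case = TRUE)
  parts <- strsplit(inner, ", ", fixed = TRUE)[[1]]
  # Labels assume the four-part layout shown above.
  names(parts) <- c("box_cox_lambda", "arma_orders", "damping", "seasonal")
  parts
}

comparison <- data.frame(
  sql_server = split_descriptor(sql_model),
  rstudio    = split_descriptor(rstudio_model),
  stringsAsFactors = FALSE
)
comparison$differs <- comparison$sql_server != comparison$rstudio

print(comparison)
```

Under that assumption, the ARMA orders, damping and seasonal structure agree, and only the first component (the Box-Cox parameter chosen during fitting) is different.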

It seems there is a difference in how the R engine executes depending on the OS and the runtime. I tested this by running standalone R on my PC and on the server, using RStudio and Microsoft R Open, and by running R in-database on my PC and on the server. I also tested different runtimes.

If anyone wants to test this themselves, the R runtime can be changed under Tools - Global Options - General - R version (in RStudio).

All tests returned different results. This does not mean that the results are wrong (in my case at least, since I'm forecasting real business data and the results have wide intervals anyway).

This may not be an actual solution, but I hope it can prevent someone from panicking for a week like I did.
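When comparing environments like this, it helps to record what each one is actually running. The sketch below uses base R only. The function name `describe_runtime` and the choice of fields are my own assumptions about what is worth capturing, not an established checklist.

```r
# Hypothetical helper: collect basic facts about the current R runtime
# as a one-row data frame, so outputs from several machines can be stacked.
describe_runtime <- function(packages = c("forecast")) {
  # Package versions, or NA when a package is not installed.
  pkg_versions <- vapply(packages, function(p) {
    if (requireNamespace(p, quietly = TRUE)) {
      as.character(utils::packageVersion(p))
    } else {
      NA_character_
    }
  }, character(1))

  data.frame(
    r_version  = R.version.string,
    platform   = R.version$platform,
    os         = Sys.info()[["sysname"]],
    lapack     = La_version(),
    double_eps = .Machine$double.eps,
    packages   = paste(packages, pkg_versions, sep = "=", collapse = ";"),
    stringsAsFactors = FALSE
  )
}

runtime_info <- describe_runtime()
print(runtime_info)
```

Running this in each setup (standalone R, Microsoft R Open, in-database) and putting the rows side by side makes it easier to see which component changes between two runs that disagree.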

