Add support for Neuralforecast #1115

Merged on Sep 30, 2023 (27 commits)
Commits
8f49397
Set the write output column type for forecast functions
xzdandy Sep 13, 2023
043d671
Fix forecast integration test
xzdandy Sep 13, 2023
0977c1f
Move the generic utils test
xzdandy Sep 13, 2023
092c03f
Fix ludwig unittest cases and add unittestcase for normal forecasting
xzdandy Sep 13, 2023
96e40db
Add unitest cases for forecast with rename in binder.
xzdandy Sep 13, 2023
5648371
Add unittest when an expected column is passed to forecasting
xzdandy Sep 13, 2023
8692ff1
Add unittest when required columns are missing in binder
xzdandy Sep 13, 2023
0679200
Merge branch 'staging' into neuralforecast
americast Sep 13, 2023
1fd3c02
Add neuralforecast support
americast Sep 14, 2023
65ed6e1
less horizon no retrain
americast Sep 15, 2023
5fd8af7
Merge branch 'staging' into neuralforecast
americast Sep 24, 2023
be242ee
add support for exogenous variables
americast Sep 25, 2023
583e778
Fix exogenous support; add tests
americast Sep 25, 2023
52c563e
add tests
americast Sep 25, 2023
84a159e
wip: fix test
americast Sep 25, 2023
06a7db0
remove strict column check in test
americast Sep 25, 2023
32a204b
Fix GPU issue with neuralforecast; fixed auto exog veriables
americast Sep 28, 2023
fda2b40
Merge remote-tracking branch 'origin/staging' into neuralforecast
americast Sep 28, 2023
736d9e0
added auto support; updated docs
americast Sep 29, 2023
06fb001
Update forecasting notebook.
xzdandy Sep 29, 2023
a36a1f5
fixes
americast Sep 29, 2023
eee78c9
Merge branch 'neuralforecast' of github.com:georgia-tech-db/evadb int…
americast Sep 29, 2023
09bee12
Fix horizon issue for multi uniqueids
americast Sep 29, 2023
b422000
update docs
americast Sep 29, 2023
e176bd4
fix exogenous for auto; made default
americast Sep 30, 2023
68265d3
turn auto off for neuralforecast test to avoid TLE error
americast Sep 30, 2023
267443d
Update the Notebook
xzdandy Sep 30, 2023
add tests
americast committed Sep 25, 2023
commit 52c563e2d81d27e68df9956ccf084a0b9f480d6f
6 changes: 5 additions & 1 deletion docs/source/reference/ai/model-forecasting.rst
@@ -53,8 +53,12 @@ EvaDB's default forecast framework is `statsforecast <https://nixtla.github.io/s
   - The name of the column that contains the datestamp, which should be in a format expected by pandas, ideally YYYY-MM-DD for a date or YYYY-MM-DD HH:MM:SS for a timestamp. Please visit the `pandas documentation <https://pandas.pydata.org/docs/reference/api/pandas.to_datetime.html>`_ for details. If not provided, an auto-incrementing ID column will be used.
 * - ID
   - The name of the column that represents an identifier for the series. If not provided, the whole table is considered as one series of data.
+* - LIBRARY
+  - We can select one of `statsforecast` (default) or `neuralforecast`. `statsforecast` provides access to statistical forecasting methods, while `neuralforecast` gives access to deep-learning-based forecasting methods.
 * - MODEL
-  - We can select one of AutoARIMA, AutoCES, AutoETS, AutoTheta. The default is AutoARIMA. Check `Automatic Forecasting <https://nixtla.github.io/statsforecast/src/core/models_intro.html#automatic-forecasting>`_ to learn details about these models.
+  - If LIBRARY is `statsforecast`, we can select one of AutoARIMA, AutoCES, AutoETS, AutoTheta. The default is AutoARIMA. Check `Automatic Forecasting <https://nixtla.github.io/statsforecast/src/core/models_intro.html#automatic-forecasting>`_ to learn details about these models. If LIBRARY is `neuralforecast`, we can select one of NHITS or NBEATS. The default is NBEATS. Check the `NBEATS documentation <https://nixtla.github.io/neuralforecast/models.nbeats.html>`_ for details.
+* - EXOGENOUS
+  - The names of columns to be treated as exogenous variables, separated by commas. The backend considers these columns for forecasting only when LIBRARY is `neuralforecast`.
 * - Frequency
   - A string indicating the frequency of the data. The commonly used ones are D, W, M, Y, which respectively represent day-, week-, month- and year-end frequency. The default value is M. Check `pandas available frequencies <https://pandas.pydata.org/pandas-docs/stable/user_guide/timeseries.html#offset-aliases>`_ for all available frequencies.
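
To make the LIBRARY / MODEL / EXOGENOUS rules in the table above concrete, here is a minimal sketch of how the documented defaults could be resolved. The helper `resolve_forecast_options` and both dictionaries are hypothetical and only mirror the documentation text; they are not part of EvaDB.

```python
# Hypothetical helper that mirrors the documented option rules.
# None of these names exist in EvaDB; they are for illustration only.
SUPPORTED_MODELS = {
    "statsforecast": ["AutoARIMA", "AutoCES", "AutoETS", "AutoTheta"],
    "neuralforecast": ["NBEATS", "NHITS"],
}
DEFAULT_MODEL = {"statsforecast": "AutoARIMA", "neuralforecast": "NBEATS"}


def resolve_forecast_options(library="statsforecast", model=None, exogenous=None):
    """Apply the documented defaults and reject unsupported combinations."""
    if library not in SUPPORTED_MODELS:
        raise ValueError(f"Unsupported LIBRARY: {library}")
    if model is None:
        model = DEFAULT_MODEL[library]
    if model not in SUPPORTED_MODELS[library]:
        raise ValueError(f"MODEL {model} is not available for {library}")
    # Per the table, exogenous columns are only considered for neuralforecast.
    exog = []
    if exogenous and library == "neuralforecast":
        exog = [name.strip() for name in exogenous.split(",") if name.strip()]
    return {"library": library, "model": model, "exogenous": exog}


print(resolve_forecast_options())
print(resolve_forecast_options("neuralforecast", exogenous="trend, price"))
```

With no arguments the sketch falls back to `statsforecast` and AutoARIMA. An EXOGENOUS value passed with `statsforecast` is dropped, matching the "only for `neuralforecast`" wording.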

28 changes: 12 additions & 16 deletions evadb/executor/create_function_executor.py
@@ -244,7 +244,7 @@ def handle_forecasting_function(self):
         """
         Set or infer data frequency
         """

         if "frequency" not in arg_map.keys():
             arg_map["frequency"] = pd.infer_freq(data["ds"])
         frequency = arg_map["frequency"]
@@ -297,19 +297,21 @@ def handle_forecasting_function(self):
                 raise FunctionIODefinitionError(err_msg)
             model_args = {}
             if "exogenous" in arg_map.keys():
-                exogenous_args = [x.strip() for x in arg_map["exogenous"].strip().split(",")]
+                exogenous_args = [
+                    x.strip() for x in arg_map["exogenous"].strip().split(",")
+                ]
                 model_args["hist_exog_list"] = exogenous_args

             if "auto" not in arg_map["model"].lower():
-                model_args["input_size"] = 2*horizon
+                model_args["input_size"] = 2 * horizon
                 model_args["max_steps"] = 50

             model_args["h"] = horizon

             model = NeuralForecast(
-                    [model_here(**model_args)],
-                    freq=new_freq,
-                    )
+                [model_here(**model_args)],
+                freq=new_freq,
+            )

         # """
         # Statsforecast implementation
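
The argument handling in the hunk above can be exercised in isolation. The following is a minimal sketch that assumes a plain dict for `arg_map` and an integer `horizon`. The function name `build_model_args` is hypothetical; the real code builds `model_args` inline and then passes it to the selected neuralforecast model class.

```python
def build_model_args(arg_map, horizon):
    """Rebuild the model keyword arguments the way the hunk above does."""
    model_args = {}
    if "exogenous" in arg_map:
        # "trend, price " -> ["trend", "price"]
        exogenous_args = [
            x.strip() for x in arg_map["exogenous"].strip().split(",")
        ]
        model_args["hist_exog_list"] = exogenous_args

    # Only models whose name does not contain "auto" get a fixed
    # input window and training step budget.
    if "auto" not in arg_map["model"].lower():
        model_args["input_size"] = 2 * horizon
        model_args["max_steps"] = 50

    model_args["h"] = horizon
    return model_args


print(build_model_args({"model": "NBEATS", "exogenous": "trend, price "}, 12))
print(build_model_args({"model": "AutoNBEATS"}, 12))
```

Here `AutoNBEATS` is used only as an example of a name that contains "auto". For such names the sketch returns just the horizon and leaves the other hyperparameters unset.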
@@ -335,8 +337,6 @@ def handle_forecasting_function(self):
                 logger.error(err_msg)
                 raise FunctionIODefinitionError(err_msg)

-
-
         else:
             model = StatsForecast(
                 [model_here(season_length=season_length)], freq=new_freq
@@ -346,22 +346,18 @@ def handle_forecasting_function(self):

         encoding_text = data.to_string()
         if "exogenous" in arg_map.keys():
-            encoding_text += "exogenous_"+str(sorted(exogenous_args))
+            encoding_text += "exogenous_" + str(sorted(exogenous_args))

         model_dir = os.path.join(
             self.db.config.get_value("storage", "model_dir"),
             self.node.name,
             library,
             arg_map["model"],
-            str(hashlib.sha256(encoding_text.encode()).hexdigest())
+            str(hashlib.sha256(encoding_text.encode()).hexdigest()),
         )
         Path(model_dir).mkdir(parents=True, exist_ok=True)

-        model_save_name = (
-            "horizon"
-            + str(horizon)
-            + ".pkl"
-        )
+        model_save_name = "horizon" + str(horizon) + ".pkl"

         model_path = os.path.join(model_dir, model_save_name)
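
The hunk above derives the on-disk location of a trained model from a SHA-256 digest of the training data text plus the sorted exogenous column names. A minimal standalone sketch of that scheme follows. The function `model_cache_path` and its parameters are hypothetical stand-ins for the values the executor reads from its config, its plan node, and the DataFrame. Unlike the executor, it creates no directories.

```python
import hashlib
import os


def model_cache_path(root, function_name, library, model, data_text, horizon,
                     exogenous_args=None):
    """Return a content-addressed path for a saved forecasting model."""
    encoding_text = data_text
    if exogenous_args is not None:
        # Sorting makes the digest independent of the order in which
        # the exogenous columns were listed.
        encoding_text += "exogenous_" + str(sorted(exogenous_args))
    digest = hashlib.sha256(encoding_text.encode()).hexdigest()
    model_dir = os.path.join(root, function_name, library, model, digest)
    return os.path.join(model_dir, "horizon" + str(horizon) + ".pkl")


a = model_cache_path("models", "Forecast", "neuralforecast", "NBEATS",
                     "ds y\n1 2", 12, ["price", "trend"])
b = model_cache_path("models", "Forecast", "neuralforecast", "NBEATS",
                     "ds y\n1 2", 12, ["trend", "price"])
print(a == b)
```

Under this scheme the same data, library, model, and exogenous set map to the same directory, so a previously trained model can be found again. Any change to the data text produces a different digest and therefore a different path.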

1 change: 1 addition & 0 deletions setup.py
@@ -120,6 +120,7 @@ def read(path, encoding="utf-8"):

 forecasting_libs = [
     "statsforecast",  # MODEL TRAIN AND FINE TUNING
+    "neuralforecast",  # MODEL TRAIN AND FINE TUNING
 ]
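
A list of package names such as `forecasting_libs` needs a comma after every entry. Python joins adjacent string literals into one string, even across lines and trailing comments, so a missing comma silently yields a single bogus requirement. A minimal demonstration with illustrative variable names:

```python
# Adjacent string literals are concatenated at compile time,
# even when a comment and a newline sit between them.
missing_comma = [
    "statsforecast"  # no comma after this entry
    "neuralforecast"
]

with_comma = [
    "statsforecast",
    "neuralforecast",
]

print(missing_comma)  # one merged string
print(with_comma)     # two separate package names
```

The first list contains one element, `"statsforecastneuralforecast"`, which is not an installable package name. The second contains the two intended entries.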

### NEEDED FOR DEVELOPER TESTING ONLY
4 changes: 2 additions & 2 deletions test/integration_tests/long/test_model_forecasting.py
@@ -30,7 +30,6 @@ def setUpClass(cls):
         # reset the catalog manager before running each test
         cls.evadb.catalog().reset()

-
         create_table_query = """
             CREATE TABLE AirData (\
             unique_id TEXT(30),\
@@ -116,7 +115,8 @@ def test_forecast(self):
         result = execute_query_fetch_all(self.evadb, predict_query)
         self.assertEqual(len(result), 12)
         self.assertEqual(
-            result.columns, ["airpanelforecast.unique_id", "airpanelforecast.ds", "airpanelforecast.y"]
+            result.columns,
+            ["airpanelforecast.unique_id", "airpanelforecast.ds", "airpanelforecast.y"],
         )

     @forecast_skip_marker