Add LSTNet #596

Merged
merged 6 commits into from
Feb 29, 2020

Conversation

ehsanmok
Contributor

@ehsanmok ehsanmok commented Feb 4, 2020

Description of changes:

An implementation of LSTNet.

  • Initial implementations in gluonts/model/lstnet/
  • Add unit tests
  • For usage, and to verify some of the paper's results, use the gist lstnet.py
  • Cleanup and add docs

Note that I haven't replicated all of the paper's results, as the paper tuned over quite a large range of hyper-parameters and no final hyper-parameters were given!

TODO: Temporal Attention layer.

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

@codecov-io

codecov-io commented Feb 8, 2020

Codecov Report

Merging #596 into master will increase coverage by 0.17%.
The diff coverage is 99.13%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #596      +/-   ##
==========================================
+ Coverage   83.72%   83.89%   +0.17%     
==========================================
  Files         181      184       +3     
  Lines       10326    10442     +116     
==========================================
+ Hits         8645     8760     +115     
- Misses       1681     1682       +1
Impacted Files Coverage Δ
src/gluonts/model/lstnet/_estimator.py 100% <100%> (ø)
src/gluonts/model/lstnet/_network.py 100% <100%> (ø)
src/gluonts/model/lstnet/__init__.py 80% <80%> (ø)

@alexw91
Member

alexw91 commented Feb 8, 2020

Codecov Report

Merging #596 into master will increase coverage by 0.17%.
The diff coverage is n/a.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #596      +/-   ##
==========================================
+ Coverage   83.72%   83.89%   +0.17%     
==========================================
  Files         181      184       +3     
  Lines       10326    10442     +116     
==========================================
+ Hits         8645     8760     +115     
- Misses       1681     1682       +1     
Impacted Files Coverage Δ
src/gluonts/model/lstnet/_estimator.py 100.00% <0.00%> (ø)
src/gluonts/model/lstnet/_network.py 100.00% <0.00%> (ø)
src/gluonts/model/lstnet/__init__.py 80.00% <0.00%> (ø)

@codecov-io

codecov-io commented Feb 12, 2020

Codecov Report

Merging #596 into master will increase coverage by 0.17%.
The diff coverage is 97.7%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #596      +/-   ##
==========================================
+ Coverage   83.73%   83.91%   +0.17%     
==========================================
  Files         180      183       +3     
  Lines       10281    10412     +131     
==========================================
+ Hits         8609     8737     +128     
- Misses       1672     1675       +3
Impacted Files Coverage Δ
src/gluonts/model/lstnet/_estimator.py 100% <100%> (ø)
src/gluonts/model/lstnet/__init__.py 80% <80%> (ø)
src/gluonts/model/lstnet/_network.py 97.72% <97.72%> (ø)

@alexw91
Member

alexw91 commented Feb 14, 2020

Codecov Report

Merging #596 into master will increase coverage by 0.17%.
The diff coverage is n/a.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #596      +/-   ##
==========================================
+ Coverage   83.73%   83.91%   +0.17%     
==========================================
  Files         180      183       +3     
  Lines       10281    10412     +131     
==========================================
+ Hits         8609     8737     +128     
- Misses       1672     1675       +3     
Impacted Files Coverage Δ
src/gluonts/model/lstnet/_estimator.py 100.00% <0.00%> (ø)
src/gluonts/model/lstnet/__init__.py 80.00% <0.00%> (ø)
src/gluonts/model/lstnet/_network.py 97.72% <0.00%> (ø)

@codecov-io

codecov-io commented Feb 18, 2020

Codecov Report

Merging #596 into master will increase coverage by 0.18%.
The diff coverage is 97.7%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #596      +/-   ##
==========================================
+ Coverage   83.72%   83.91%   +0.18%     
==========================================
  Files         178      183       +5     
  Lines       10279    10412     +133     
==========================================
+ Hits         8606     8737     +131     
- Misses       1673     1675       +2
Impacted Files Coverage Δ
src/gluonts/model/lstnet/_estimator.py 100% <100%> (ø)
src/gluonts/model/lstnet/__init__.py 80% <80%> (ø)
src/gluonts/model/lstnet/_network.py 97.72% <97.72%> (ø)
...gemaker_sdk/entry_point_scripts/run_entry_point.py 43.75% <0%> (-3.31%) ⬇️
src/gluonts/transform/sampler.py 91.8% <0%> (-2.75%) ⬇️
...maker_sdk/entry_point_scripts/train_entry_point.py 25% <0%> (-1.54%) ⬇️
src/gluonts/nursery/sagemaker_sdk/model.py 53.33% <0%> (-1.51%) ⬇️
src/gluonts/model/forecast.py 71.92% <0%> (-0.61%) ⬇️
src/gluonts/nursery/sagemaker_sdk/estimator.py 42.98% <0%> (-0.5%) ⬇️
src/gluonts/distribution/bijection.py 81.35% <0%> (-0.47%) ⬇️
... and 10 more

@alexw91
Member

alexw91 commented Feb 18, 2020

Codecov Report

Merging #596 into master will increase coverage by 0.18%.
The diff coverage is n/a.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #596      +/-   ##
==========================================
+ Coverage   83.72%   83.90%   +0.18%     
==========================================
  Files         178      181       +3     
  Lines       10279    10420     +141     
==========================================
+ Hits         8606     8743     +137     
- Misses       1673     1677       +4     
Impacted Files Coverage Δ
src/gluonts/model/lstnet/__init__.py 80.00% <0.00%> (ø)
src/gluonts/model/lstnet/_network.py 96.90% <0.00%> (ø)
src/gluonts/model/lstnet/_estimator.py 100.00% <0.00%> (ø)

@ehsanmok ehsanmok requested a review from lostella February 18, 2020 19:19
@ehsanmok ehsanmok changed the title [WIP] Add LSTNet Add LSTNet Feb 19, 2020
@codecov-io

Codecov Report

Merging #596 into master will increase coverage by 0.18%.
The diff coverage is 97.16%.

Impacted file tree graph

@@            Coverage Diff            @@
##           master    #596      +/-   ##
=========================================
+ Coverage   83.72%   83.9%   +0.18%     
=========================================
  Files         178     181       +3     
  Lines       10279   10420     +141     
=========================================
+ Hits         8606    8743     +137     
- Misses       1673    1677       +4
Impacted Files Coverage Δ
src/gluonts/model/lstnet/_estimator.py 100% <100%> (ø)
src/gluonts/model/lstnet/__init__.py 80% <80%> (ø)
src/gluonts/model/lstnet/_network.py 96.9% <96.9%> (ø)

@alexw91
Member

alexw91 commented Feb 19, 2020

Codecov Report

Merging #596 into master will increase coverage by 0.18%.
The diff coverage is n/a.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #596      +/-   ##
==========================================
+ Coverage   83.72%   83.90%   +0.18%     
==========================================
  Files         178      181       +3     
  Lines       10279    10420     +141     
==========================================
+ Hits         8606     8743     +137     
- Misses       1673     1677       +4     
Impacted Files Coverage Δ
src/gluonts/model/lstnet/__init__.py 80.00% <0.00%> (ø)
src/gluonts/model/lstnet/_network.py 96.90% <0.00%> (ø)
src/gluonts/model/lstnet/_estimator.py 100.00% <0.00%> (ø)

Contributor

@lostella lostella left a comment

Nice, thank you! See some comments and questions inline

@codecov-io

codecov-io commented Feb 20, 2020

Codecov Report

Merging #596 into master will increase coverage by 0.17%.
The diff coverage is 97.16%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #596      +/-   ##
==========================================
+ Coverage   83.94%   84.11%   +0.17%     
==========================================
  Files         178      181       +3     
  Lines       10337    10478     +141     
==========================================
+ Hits         8677     8814     +137     
- Misses       1660     1664       +4
Impacted Files Coverage Δ
src/gluonts/model/lstnet/_estimator.py 100% <100%> (ø)
src/gluonts/model/lstnet/__init__.py 80% <80%> (ø)
src/gluonts/model/lstnet/_network.py 96.9% <96.9%> (ø)

@alexw91
Member

alexw91 commented Feb 20, 2020

Codecov Report

Merging #596 into master will increase coverage by 0.17%.
The diff coverage is n/a.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #596      +/-   ##
==========================================
+ Coverage   83.94%   84.11%   +0.17%     
==========================================
  Files         178      181       +3     
  Lines       10337    10478     +141     
==========================================
+ Hits         8677     8814     +137     
- Misses       1660     1664       +4     
Impacted Files Coverage Δ
src/gluonts/model/lstnet/_network.py 96.90% <0.00%> (ø)
src/gluonts/model/lstnet/__init__.py 80.00% <0.00%> (ø)
src/gluonts/model/lstnet/_estimator.py 100.00% <0.00%> (ø)

@ehsanmok
Contributor Author

@lostella thanks! all the comments have been addressed.

Contributor

@lostella lostella left a comment

I think the settings for this model are slightly inconsistent with the other ones: in GluonTS, prediction_length indicates the time length of the forecast that the predictor produces; in this case, the length of the forecasts is fixed to 1, and prediction_length effectively controls the lead time, i.e. how long past the conditioning range (the “past target”) the forecast starts.

I believe that including the model as it is now could create confusion, for example when evaluating its performance on standard datasets: there, all other models are really tested against a prediction interval of length > 1. We should actually implement a very generic test for all predictors that makes sure that all common concepts (such as freq or prediction_length) are uniformly adopted, in which case this model wouldn’t fit the story well.

I see two ways to address this issue, which could actually be implemented together:

  1. Adjusting the LSTNet network in such a way that forecasts for the whole prediction interval are produced. I’m thinking maybe the number of units in ar_fc could be increased from 1 to reflect the number of predicted points? Other parts of the network may need to be adjusted similarly, what do you think @ehsanmok?

  2. Introducing an explicit lead_time property in predictors, which could default to 0 for models not exposing this customization (being very careful about off-by-one errors in setting the convention here), and instead be set to the appropriate value here in the LSTNet estimator. This is something that GluonTS could use in general, regardless of LSTNet, and should be addressed in a separate PR. I summon the wise opinions of @vafl and @jaheba on this matter.
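
To make the two conventions concrete, here is a minimal sketch in plain Python. It is not GluonTS code: `forecast_offsets` and its `lead_time` argument are hypothetical, and the convention that `lead_time` counts the steps skipped before the first forecast point is only one possible choice.

```python
from typing import List


def forecast_offsets(prediction_length: int, lead_time: int = 0) -> List[int]:
    """Return the forecast time steps as offsets from the last observed step t.

    Hypothetical convention: lead_time counts the steps skipped between the
    end of the conditioning range and the first forecast point.
    """
    if prediction_length < 1:
        raise ValueError("prediction_length must be at least 1")
    if lead_time < 0:
        raise ValueError("lead_time must be non-negative")
    return [lead_time + k for k in range(1, prediction_length + 1)]


# Usual GluonTS meaning of prediction_length=3: y_{t+1}, y_{t+2}, y_{t+3}
print(forecast_offsets(prediction_length=3))  # [1, 2, 3]

# A single point three steps ahead: length 1, lead time 2 under this convention
print(forecast_offsets(prediction_length=1, lead_time=2))  # [3]
```

Under this assumed convention, the paper's horizon h would correspond to a forecast of length 1 with a lead time of h - 1, which is exactly where an off-by-one error could creep in.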

@ehsanmok
Contributor Author

ehsanmok commented Feb 23, 2020

@lostella thanks for bringing up the difference! Yes, I was thinking of adding two modes: one with horizon (as introduced in the paper), predicting y_{t+h} only, and the other with prediction_length, predicting y_t, ..., y_{t+h}, which is the repeated application of the first mode (plus some required changes) to make it consistent with the library API. Please see the changes to this PR. The main difference is that ar_fc has either 1 output or num_series outputs for the horizon and prediction_length modes, respectively.

@codecov-io

codecov-io commented Feb 25, 2020

Codecov Report

Merging #596 into master will increase coverage by 0.2%.
The diff coverage is 97.56%.

Impacted file tree graph

@@            Coverage Diff            @@
##           master     #596     +/-   ##
=========================================
+ Coverage   84.63%   84.83%   +0.2%     
=========================================
  Files         178      181      +3     
  Lines       10401    10565    +164     
=========================================
+ Hits         8803     8963    +160     
- Misses       1598     1602      +4
Impacted Files Coverage Δ
src/gluonts/model/lstnet/_estimator.py 100% <100%> (ø)
src/gluonts/model/lstnet/__init__.py 80% <80%> (ø)
src/gluonts/model/lstnet/_network.py 97.45% <97.45%> (ø)

Contributor

@lostella lostella left a comment

I have some comments about the network implementation, see inline. I think once these are addressed, everything should be fine. Maybe you could share some results with this estimator, to make sure it's learning correctly and giving meaningful predictions?

Comment on lines +95 to +103
assert (
    fct.start_date
    == pd.date_range(
        start=str(test_ds["start"]),
        periods=test_ds["target"].shape[1],  # number of test periods
        freq=freq,
        closed="right",
    )[-(horizon or prediction_length)]
)
Contributor

I don't think this is the right assertion in case horizon is set: if horizon or prediction_length are set to p, then the forecast contains the following time steps

horizon=p           =>               y_{t+p}
prediction_length=p => y_{t+1}, ..., y_{t+p}

So, when horizon is set, the fct.start_date should be p-1 time steps later, while according to the assertion it's the same in both cases.

However, at the moment there might be no easy way to get the correct fct.start_date out of the predictor in that case: I've opened issue #677 in this regard (in particular, the ForecastGenerator needs to be modified). I think it's fine for now to have the prediction_length case doing the right thing (and the test looks OK for that), and then fix the horizon case once #677 is addressed.
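
As a rough illustration of the p-1 shift described above, the following pandas sketch computes the expected first forecast timestamp in both cases. The helper `expected_start_date` is hypothetical and not part of the test suite; it assumes the last p steps of the test series are the held-out future.

```python
import pandas as pd


def expected_start_date(start, num_periods, freq, p, use_horizon):
    """Timestamp of the first forecast step when the last p steps are held out."""
    index = pd.date_range(start=start, periods=num_periods, freq=freq)
    # prediction_length=p covers y_{t+1}, ..., y_{t+p}: starts p steps from the end.
    # horizon=p covers only y_{t+p}: starts (and ends) at the last step.
    return index[-1] if use_horizon else index[-p]


first = expected_start_date("2020-01-01", 10, "D", p=2, use_horizon=False)
single = expected_start_date("2020-01-01", 10, "D", p=2, use_horizon=True)
print(first)   # 2020-01-09 00:00:00
print(single)  # 2020-01-10 00:00:00
```

With p=2 the two start dates differ by one step, that is, by p-1.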

Contributor Author

Right! It seems the correct start_date isn't available. Though in the test case, it's actually p=2 and the start_date is one before the last time step.

Ehsan M. Kermani added 5 commits February 26, 2020 18:33
Add initial test

Make smoke test pass

Fix reshape bug and parameters validation

Add more tests

Fix unregistered hybrid rnn layer

Add docs

LSTNet scaling support

Scaling and doc

Fix dtype
@alexw91
Member

alexw91 commented Feb 27, 2020

Codecov Report

Merging #596 into master will increase coverage by 0.20%.
The diff coverage is n/a.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #596      +/-   ##
==========================================
+ Coverage   84.63%   84.83%   +0.20%     
==========================================
  Files         178      181       +3     
  Lines       10401    10565     +164     
==========================================
+ Hits         8803     8963     +160     
- Misses       1598     1602       +4     
Impacted Files Coverage Δ
src/gluonts/model/lstnet/__init__.py 80.00% <0.00%> (ø)
src/gluonts/model/lstnet/_network.py 97.45% <0.00%> (ø)
src/gluonts/model/lstnet/_estimator.py 100.00% <0.00%> (ø)

@codecov-io

Codecov Report

Merging #596 into master will increase coverage by 0.19%.
The diff coverage is 97.5%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #596      +/-   ##
==========================================
+ Coverage   84.63%   84.83%   +0.19%     
==========================================
  Files         178      181       +3     
  Lines       10401    10561     +160     
==========================================
+ Hits         8803     8959     +156     
- Misses       1598     1602       +4
Impacted Files Coverage Δ
src/gluonts/model/lstnet/_estimator.py 100% <100%> (ø)
src/gluonts/model/lstnet/__init__.py 80% <80%> (ø)
src/gluonts/model/lstnet/_network.py 97.36% <97.36%> (ø)

@alexw91
Member

alexw91 commented Feb 27, 2020

Codecov Report

Merging #596 into master will increase coverage by 0.19%.
The diff coverage is n/a.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #596      +/-   ##
==========================================
+ Coverage   84.63%   84.83%   +0.19%     
==========================================
  Files         178      181       +3     
  Lines       10401    10561     +160     
==========================================
+ Hits         8803     8959     +156     
- Misses       1598     1602       +4     
Impacted Files Coverage Δ
src/gluonts/model/lstnet/__init__.py 80.00% <0.00%> (ø)
src/gluonts/model/lstnet/_estimator.py 100.00% <0.00%> (ø)
src/gluonts/model/lstnet/_network.py 97.36% <0.00%> (ø)

@ehsanmok ehsanmok requested a review from lostella February 27, 2020 17:37
@ehsanmok
Contributor Author

@lostella check out some of the results for exchange_rate data here.

Contributor

@lostella lostella left a comment

Looks good, thanks! We will probably want to revisit how the horizon case is treated once #677 is settled (see this comment), @ehsanmok I may ping you about that to review the changes

@lostella lostella merged commit 1cdfced into awslabs:master Feb 29, 2020
past_observed_values
Tensor of shape (batch_size, num_series, context_length)
future_target
Tensor of shape (batch_size, num_series, 1) if `horizon` was specified
Contributor

Is the time length 1 in general when horizon is set? In the estimator, the “future length” in the instance splitter is set to horizon in this case, so it will be > 1 if horizon is. Do I understand this right?

Contributor Author

Yes: when using horizon we're predicting a point, not a sequence, so the future length is 1 too when computing the loss; for prediction_length it's a sequence as usual.

Contributor Author

I'm expecting instance splitter to respect it, so you're saying it may not?

Contributor

I don’t think so: I would set horizon to 2 and put a breakpoint here to check the shape of future_target. I’m pretty sure the time length will be 2, not 1.
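
The shape question can be illustrated with a small NumPy sketch. This is not the GluonTS instance splitter or the LSTNet network: the array below is only a stand-in that assumes the future window has time length equal to horizon, and `last_step_target` is a hypothetical helper showing how a length-1 target could be taken from it.

```python
import numpy as np


def last_step_target(future_target: np.ndarray) -> np.ndarray:
    """Keep only the final time step, preserving the time axis with length 1."""
    return future_target[..., -1:]


batch_size, num_series, horizon = 4, 3, 2
# Stand-in for a future window whose time length equals horizon
future_target = np.zeros((batch_size, num_series, horizon))
print(future_target.shape)                    # (4, 3, 2)
print(last_step_target(future_target).shape)  # (4, 3, 1)
```

If the future window really does have length horizon, some explicit slicing of this kind would be needed before the documented shape (batch_size, num_series, 1) holds.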

4 participants