Skip to content

Conversation

@brk21
Copy link

@brk21 brk21 commented Aug 16, 2021

/databricks/python/lib/python3.8/site-packages/tslearn/barycenters/softdtw.py:103: VisibleDeprecationWarning: Creating an ndarray from ragged nested sequences (which is a list-or-tuple of lists-or-tuples-or ndarrays with different lengths or shapes) is deprecated. If you meant to do this, you must specify 'dtype=object' when creating the ndarray
X_ = numpy.array([to_time_series(d, remove_nans=True) for d in X_])

Because this method often uses numpy arrays with different shapes, I think the approach is justified here.

/databricks/python/lib/python3.8/site-packages/tslearn/barycenters/softdtw.py:103: VisibleDeprecationWarning: Creating an ndarray from ragged nested sequences (which is a list-or-tuple of lists-or-tuples-or ndarrays with different lengths or shapes) is deprecated. If you meant to do this, you must specify 'dtype=object' when creating the ndarray
  X_ = numpy.array([to_time_series(d, remove_nans=True) for d in X_])

Because this method often uses numpy arrays with different shapes, I think the approach is justified here.
@codecov-commenter
Copy link

codecov-commenter commented Aug 16, 2021

⚠️ Please install the 'codecov app svg image' to ensure uploads and comments are reliably processed by Codecov.

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 94.74%. Comparing base (42a56cc) to head (500a2fb).
Report is 69 commits behind head on main.

❗ Your organization needs to install the Codecov GitHub app to enable full functionality.

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #360      +/-   ##
==========================================
+ Coverage   94.71%   94.74%   +0.02%     
==========================================
  Files          59       59              
  Lines        4525     4525              
==========================================
+ Hits         4286     4287       +1     
+ Misses        239      238       -1     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@rtavenar
Copy link
Member

Hi @brk21

Thanks for pointing out this deprecation warning! It seems to me that it is not even necessary to cast the list into a numpy array, or is it?

Couldn't we do something like:

X_ = [to_time_series(Xi, remove_nans=True) for Xi in X_]

?

@brk21
Copy link
Author

brk21 commented Aug 17, 2021

Hi @rtavenar,

I think you are right, but I also think we will generate the same warning in the to_time_series function here:

https://github.com/tslearn-team/tslearn/blob/main/tslearn/utils/utils.py#L146

So you may end up adding the object=true parameter there to avoid the deprecation.

Please let me know if I can support in any way.

Ross

@rtavenar
Copy link
Member

I would like to avoid the object=True since our algorithms in tslearn are supposed to operate on datasets represented as 3d arrays, which is no longer the case when arrays of objects are used.
Regarding the to_time_series function, it is definitely unexpected behaviour to provide a series that does not have the same number of features for all timestamps, so maybe we should throw a ValueError in this case, what do you think?

@brk21
Copy link
Author

brk21 commented Aug 17, 2021

@rtavenar I would be strongly opposed to that because one of the main values of softdtw is that it can operate on time series of different lengths! This ValueError would effectively eliminate our ability to make use of softdtw from tslearn.

If you wanted to port to_time_series into a separate version for time series of varying lengths and just use object=true in the contexts or algorithms that are designed to operate on time series of different lengths, then that might be a good middle ground that preserves integrity for the to_time_series function but makes allowance for the algorithms that do operate on time series of different lengths.

@rtavenar
Copy link
Member

@rtavenar I would be strongly opposed to that because one of the main values of softdtw is that it can operate on time series of different lengths! This ValueError would effectively eliminate our ability to make use of softdtw from tslearn.

I meant raising a ValueError for the case of the to_time_series function only, since this function only deals with one time series at a time, not a dataset of time series (for which to_time_series_dataset exists). Your workaround or the one I suggested earlier are fine for softDTW.

@brk21
Copy link
Author

brk21 commented Aug 17, 2021

Phew @rtavenar Apologies for the misunderstanding.

Yes, I would be fine with that.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants