Releases: DiffAPF/torchlpc
v0.7.2
What's Changed
- fix: missing build dependency by @yoyolicoris in #27
No new features are added.
Full Changelog: v0.7.1...v0.7.2
v0.7.1
This release is identical to v0.7; no new features are added.
Full Changelog: v0.7...v0.7.1
v0.7
What's Changed
- feat: parallel scan extension for CPU by @yoyolicoris in #17
- feat: enhance CI workflow to support multiple OS environments by @yoyolicoris in #19
- feat: cpp extension for LPC by @yoyolicoris in #18
- feat: use original cuda scan from linear RNN by @yoyolicoris in #22
- feat: LPC CUDA kernel by @yoyolicoris in #24
- feat: return `zf` by @yoyolicoris in #25
- fix: linking openMP on mac by @yoyolicoris in #20
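To make the LPC and `zf` entries above concrete, here is a minimal pure-Python sketch of a time-varying all-pole recursion that also returns its final state. It is not the library's C++/CUDA code. The function name `lpc_reference`, the sign convention, and the most-recent-first ordering of the state are assumptions of this sketch, and it assumes `zf` means the final filter state, as in the `scipy.signal.lfilter` naming.

```python
from typing import List, Optional, Sequence, Tuple


def lpc_reference(
    x: Sequence[float],
    a: Sequence[Sequence[float]],
    zi: Optional[Sequence[float]] = None,
) -> Tuple[List[float], List[float]]:
    """Hypothetical reference: y[t] = x[t] - sum_i a[t][i] * y[t-1-i].

    `a[t]` holds the filter coefficients used at time step t.
    `zi` holds past outputs, most recent first (zi[0] = y[-1]); zeros if omitted.
    Returns the outputs and the final state in the same layout as `zi`.
    """
    order = len(a[0]) if len(a) > 0 else (len(zi) if zi is not None else 0)
    state = list(zi) if zi is not None else [0.0] * order
    y: List[float] = []
    for x_t, a_t in zip(x, a):
        # Subtract the weighted sum of the previous `order` outputs.
        y_t = x_t - sum(c * s for c, s in zip(a_t, state))
        y.append(y_t)
        # Push the newest output to the front and drop the oldest one.
        state = ([y_t] + state)[:order]
    return y, state


if __name__ == "__main__":
    x = [1.0, 0.0, 0.0]
    a = [[-0.5], [-0.5], [-0.5]]
    y, zf = lpc_reference(x, a)
    print(y, zf)
```

A final state is mainly useful for chunked processing: passing the state returned for one chunk as `zi` for the next chunk gives the same output as filtering the whole signal at once.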
No pre-built wheels
This release contains C++/CUDA implementations of time-varying LPC/scan as replacements for the existing Numba kernels, which will be dropped entirely in v1.0.
Starting with this release, no pre-built wheels are provided; only source distributions are available.
The user must ensure that the corresponding compilers (e.g., gcc, nvcc) are available on their system for building binaries during the installation process.
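As a quick pre-install check, the sketch below looks for compiler executables on `PATH` using only the Python standard library. The helper name `find_compilers` and the default list of executables are assumptions. The release notes name only gcc and nvcc as examples, and the compilers actually required depend on the platform and on whether the CUDA kernels are built.

```python
import shutil
from typing import Dict, Iterable, Optional


def find_compilers(
    names: Iterable[str] = ("gcc", "g++", "nvcc"),
) -> Dict[str, Optional[str]]:
    """Map each executable name to its resolved path, or None if it is not on PATH."""
    return {name: shutil.which(name) for name in names}


if __name__ == "__main__":
    for name, path in find_compilers().items():
        print(f"{name}: {path or 'not found'}")
```

A missing entry does not necessarily mean the build will fail (for example, a different C++ compiler may be picked up on macOS or Windows), so treat the result as a hint rather than a guarantee.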
Performance changes
Compared to v0.6, v0.7 runs 1.2 to 2 times faster on an NVIDIA GPU, as tested on an RTX 5060 Ti.
On the CPU, however, it is sometimes two to three times slower, as tested on an Ubuntu 24.10 machine with an Intel i7-7700K.
We are exploring ways to close this gap and match the previous CPU speed without the Numba compiler.
Full Changelog: v0.6...v0.7
v0.6
What's Changed
- v0.6: drop support for <2.0, support jacobian and hessian computation by @yoyololicon in #12
Summary:
- Drop support for `torch<2.0` to use the new schema for writing `autograd.Function`
- Add `vmap` for calculating `jacfwd`/`jacrev`/`hessian`
Full Changelog: v0.5...v0.6