In case you haven't seen it, I've been working on HyperLearn, a faster, optimized ML package designed to make everything at least 50% (I hope) faster.

Thanks so much for all the support, Redditors! [HyperLearn made it onto GitHub's trending list for Jupyter Notebooks!! yayy!]

Anyway, I haven't updated the code much, but that's because I've been busy testing and finding out which algorithms perform best.

Key findings for N = 5,000, P = 6,000 [more features than samples; a near-square matrix]

  1. For the pseudoinverse (used in linear regression, ridge regression, and lots of other algorithms), JIT, MKL, PinvH, Pinv2, and HyperLearn's Pinv all perform very similarly. One library's version is clearly problematic, coming in roughly 4x slower than Scipy MKL.

  2. For Eigh (used in PCA, LDA, QDA, and other algorithms): Sklearn's PCA uses the SVD. That's clearly not a good idea here, since it is much faster to compute the eigenvectors/eigenvalues of XᵀX. JIT-compiled Eigh is the clear winner at 14.5 seconds on XᵀX, while Numpy is 2x slower. Torch is likewise slower once again…

  3. So, for PCA, a roughly 3x speedup is seen when using JIT-compiled Eigh compared to Sklearn's PCA.

  4. To solve X*theta = y, Torch GELS is super unstable. Like really. If you use Torch GELS, don't forget to call theta_hat[np.isnan(theta_hat) | np.isinf(theta_hat)] = 0, or else the results are problematic. All other algorithms have very similar MSEs, and HyperLearn's Regularized Cholesky Solve takes a mere 0.635 seconds. Compared with Sklearn's next-fastest Ridge solve (via cholesky), and after including matrix-multiplication time, HyperLearn comes in at 2.89s vs Sklearn's 4.53s, roughly 36% faster.
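As a minimal sketch of what these pseudoinverse variants look like in SciPy (note: `pinv2` existed in older SciPy releases but has since been removed, so this uses `pinv` and `pinvh`; the MKL speedup only applies if your SciPy is linked against MKL):

```python
import numpy as np
from scipy.linalg import pinv, pinvh

rng = np.random.default_rng(0)
# A small stand-in for the post's 5,000 x 6,000 matrix
X = rng.standard_normal((300, 400))

# General Moore-Penrose pseudoinverse (SVD-based)
P = pinv(X)

# pinvh exploits symmetry, so it only applies to a symmetric matrix
# such as the Gram matrix X @ X.T
G = X @ X.T
G_pinv = pinvh(G)

# Sanity check: the Moore-Penrose identity X @ pinv(X) @ X == X
print(np.allclose(X @ P @ X, X))
```

On a symmetric input, `pinvh` can use a symmetric eigendecomposition instead of a full SVD, which is why the variants above can differ in speed while agreeing numerically.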
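The eigh-on-XᵀX trick in finding 2 can be sketched as follows. This is a hand-rolled PCA for illustration, not HyperLearn's actual code; it cross-checks the eigendecomposition route against an SVD (the two agree up to column signs):

```python
import numpy as np
from scipy.linalg import eigh

rng = np.random.default_rng(0)
n, p, k = 300, 400, 5
X = rng.standard_normal((n, p))
Xc = X - X.mean(axis=0)          # centre the data, as PCA requires

# Eigendecomposition of the p x p Gram matrix X^T X.
# This avoids a full SVD of X, at the cost of squaring the
# condition number of the problem.
G = Xc.T @ Xc
w, V = eigh(G)                   # eigenvalues in ascending order
V = V[:, ::-1][:, :k]            # top-k principal directions
scores = Xc @ V                  # projected data (PCA scores)

# Cross-check against SVD-based PCA: the top-k right singular
# vectors should match the top-k eigenvectors up to sign.
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
for j in range(k):
    assert np.allclose(np.abs(V[:, j]), np.abs(Vt[j]), atol=1e-6)
```

The caveat about squared condition numbers is why SVD-based PCA is the conservative default; the eigh route trades a little numerical robustness for speed.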
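A regularized Cholesky solve along the lines finding 4 describes can be sketched like this (a sketch of the general technique, not HyperLearn's exact implementation); the NaN/Inf guard is the same one recommended above for unstable solvers:

```python
import numpy as np
from scipy.linalg import cho_factor, cho_solve

rng = np.random.default_rng(1)
n, p = 1000, 50
X = rng.standard_normal((n, p))
theta_true = rng.standard_normal(p)
y = X @ theta_true + 0.01 * rng.standard_normal(n)

lam = 1e-6                       # small ridge term makes X^T X + lam*I
A = X.T @ X + lam * np.eye(p)    # symmetric positive definite
b = X.T @ y

c, low = cho_factor(A)           # Cholesky factorisation, ~p^3/3 flops
theta_hat = cho_solve((c, low), b)

# Guard against NaN/Inf, as the post recommends after unstable solvers
theta_hat[np.isnan(theta_hat) | np.isinf(theta_hat)] = 0
```

Cholesky is roughly twice as fast as an LU-based solve on the normal equations, and the tiny ridge term guarantees the factorisation succeeds even when XᵀX is rank-deficient.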

So to conclude:

  1. HyperLearn's Pseudoinverse shows no improvement.

  2. HyperLearn's PCA gives roughly a 3x speed boost (about a 200% improvement).

  3. HyperLearn's Linear Solvers are roughly 2x faster (about a 100% improvement).

Help make HyperLearn better! All contributors are welcome, as this is truly an overwhelming project…

(In the benchmark charts, lower time is better.)
