Modular • 3mo ago
DobyDabaDu

Mojmelo: Machine Learning algorithms

I'd like to introduce Mojmelo: implementations of machine learning algorithms from scratch in pure Mojo. Here is the list of algorithms:
- Linear Regression
- Polynomial Regression
- Logistic Regression
- KNN
- KMeans
- SVM: Primal, Dual
- Perceptron (single layer: Binary Classification)
- Naive Bayes: GaussianNB, MultinomialNB
- Decision Tree (both Regression/Classification)
- Random Forest (both Regression/Classification)
- GBDT (both Regression/Classification)
- PCA
- LDA
- Adaboost
https://github.com/yetalit/mojmelo
9 Replies
DobyDabaDu (OP) • 3mo ago
The plan is to implement more ML algorithms, including neural networks, and, importantly, to improve the existing ones. So any suggestions related to the Mojo language are really appreciated!
Darin Simmons • 3mo ago
I didn't get the name until I got to the repo, NOW I get the name 🙂 Very nice
DobyDabaDu (OP) • 3mo ago
😄😄 Yes, played a bit with the words. Thank you for your interest!
Caroline • 3mo ago
very cool! I love the little marshmallow guy 💖
DobyDabaDu (OP) • 3mo ago
🙂 Thank you! I want to talk a bit about the latest commit and the future plans. The GBDT algorithm has been added for both classification and regression. Mojmelo now supports the following preprocessing operations:
- train_test_split
- normalize with inversion
- standardScalar with inversion
- MinMaxScalar with inversion
For better prototyping, I added support for initializing a Matrix in numpy style:
X = Matrix('[[1, 2], [3, 4]]')
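For readers wondering how a numpy-style string like this can be turned into rows and columns, here is a minimal Python sketch of the idea. It is not Mojmelo's actual parser: `parse_matrix` is a hypothetical helper, and it assumes the input is a plain nested-list literal that Python's `ast.literal_eval` can read.

```python
import ast


def parse_matrix(text):
    """Parse a literal such as '[[1, 2], [3, 4]]' into a list of float rows.

    Hypothetical helper for illustration only; not part of Mojmelo.
    """
    # literal_eval only accepts Python literals, so it never runs arbitrary code.
    rows = ast.literal_eval(text)
    if not isinstance(rows, list) or not rows or not all(isinstance(r, list) for r in rows):
        raise ValueError("expected a non-empty list of rows")
    # A matrix must be rectangular: every row has the same number of columns.
    width = len(rows[0])
    if any(len(r) != width for r in rows):
        raise ValueError("all rows must have the same length")
    return [[float(v) for v in r] for r in rows]


X = parse_matrix('[[1, 2], [3, 4]]')
print(X)  # [[1.0, 2.0], [3.0, 4.0]]
```

A real Matrix type would likely store the values in one contiguous buffer rather than nested lists; the nested list here just keeps the sketch short.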
----------------
The implemented algorithms are the fundamental ones, and there will not be many new algorithms in the near future. Instead, I will focus on increasing the accuracy of the current algorithms. I'm also keeping an eye on Mojo numpy-replacement projects like Endia and NuMojo. So, after dealing with the accuracy of the models, I'll try to increase the speed by optimizing the algorithms, and I may replace my Matrix type with a better numpy equivalent. I hope progress will be made over time, even if it's slow.
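On the "with inversion" part of the preprocessing operations mentioned in this message: the idea is that a scaler keeps the statistics it used, so scaled values can be mapped back to the original units. Below is a minimal Python sketch for min-max scaling on a single column. The function names are hypothetical and are not Mojmelo's API.

```python
def min_max_fit(column):
    """Return the (lo, hi) statistics needed to scale and later invert."""
    return min(column), max(column)


def min_max_scale(column, lo, hi):
    """Map values from [lo, hi] onto [0, 1]."""
    span = hi - lo
    if span == 0:
        # A constant column carries no spread; map everything to 0.0.
        return [0.0 for _ in column]
    return [(v - lo) / span for v in column]


def min_max_invert(scaled, lo, hi):
    """Map values from [0, 1] back to the original [lo, hi] range."""
    return [s * (hi - lo) + lo for s in scaled]


values = [10.0, 20.0, 30.0, 50.0]
lo, hi = min_max_fit(values)
scaled = min_max_scale(values, lo, hi)
restored = min_max_invert(scaled, lo, hi)
print(scaled)    # [0.0, 0.25, 0.5, 1.0]
print(restored)  # [10.0, 20.0, 30.0, 50.0]
```

Standardization follows the same pattern with the mean and standard deviation stored in place of the minimum and maximum.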
esportsmodelling • 3w ago
Hey 🙂 This looks awesome! What is the inference latency like compared to a Python implementation?
DobyDabaDu (OP) • 3w ago
Hi, thank you! If you mean pure Python 🙂 Mojmelo will be much faster. Numpy is the key to performance in Python, and how much it helps depends on the algorithm: if the algorithm performs large matrix operations, numpy can make a noticeable difference. If we use a data structure as fast as numpy, or the algorithm performs smaller matrix operations, I think Mojmelo will have equal or better performance (for example, Mojo is better than Python at parallel computing). Btw, I'm currently improving the matrix data type, and I hope we will get some good results in terms of speed.
esportsmodelling • 3w ago
Ah cool! Basically, I run some Monte Carlo simulations where the decision points are GBDT models. Currently I compile them, but I was wondering if swapping to Mojo would make my simulations run faster.
DobyDabaDu (OP) • 3w ago
Well, I'm not fully aware of your process. If the GBDT parts are the bottleneck and you are using a CPU-based library (like sklearn), they could run faster once Mojo GPU support arrives. But my current GBDT is not optimized and may not fully support your desired parameters; it's a WIP. If you're using something like XGBoost, that is already very well optimized.