Mojmelo: Machine Learning algorithms
I'd like to introduce Mojmelo: implementations of machine learning algorithms written from scratch in pure Mojo. Here is the list of algorithms:
Linear Regression
Polynomial Regression
Logistic Regression
KNN
KMeans
SVM: Primal, Dual
Perceptron (single layer: Binary Classification)
Naive Bayes: GaussianNB, MultinomialNB
Decision Tree (both Regression/Classification)
Random Forest (both Regression/Classification)
GBDT (both Regression/Classification)
PCA
LDA
AdaBoost
https://github.com/yetalit/mojmelo
The plan is to implement more ML algorithms, including neural networks, and, importantly, to improve the existing ones. So any suggestions related to the Mojo language are really appreciated!
I didn't get the name until I got to the repo. NOW I get the name.
Very nice
Yes, I played with the words a bit. Thank you for your interest!
Very cool! I love the little marshmallow guy.
Thank you!
I want to talk a bit about the latest commit and the future plan.
The GBDT algorithm has been added for both classification and regression.
Mojmelo now supports the following preprocessing operations:
- train_test_split
- normalize with inversion
- standardScalar with inversion
- MinMaxScalar with inversion
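To illustrate what "with inversion" means for the scalers above, here is a minimal sketch in plain Python. It is not Mojmelo's API: the function names (`min_max_scale`, `min_max_invert`, `standardize`, `standardize_invert`) are hypothetical, and it assumes a single feature column stored as a list of floats. The idea is simply that each scaling step returns the parameters needed to map scaled values back to the original units.

```python
from statistics import mean, pstdev


def min_max_scale(values):
    """Scale values to [0, 1]; also return (min, max) so the step can be inverted."""
    lo, hi = min(values), max(values)
    span = hi - lo
    if span == 0:
        # Constant column: map everything to 0.0 to avoid dividing by zero.
        return [0.0 for _ in values], (lo, hi)
    return [(v - lo) / span for v in values], (lo, hi)


def min_max_invert(scaled, params):
    """Map min-max scaled values back to the original range."""
    lo, hi = params
    return [s * (hi - lo) + lo for s in scaled]


def standardize(values):
    """Shift to zero mean and unit (population) standard deviation."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        # Constant column: every standardized value is 0.0.
        return [0.0 for _ in values], (mu, sigma)
    return [(v - mu) / sigma for v in values], (mu, sigma)


def standardize_invert(z_scores, params):
    """Map standardized values back to the original units."""
    mu, sigma = params
    return [z * sigma + mu for z in z_scores]


column = [2.0, 4.0, 6.0, 10.0]
scaled, mm_params = min_max_scale(column)
restored = min_max_invert(scaled, mm_params)
z_scores, std_params = standardize(column)
restored_from_z = standardize_invert(z_scores, std_params)
```

Inversion is mostly useful for regression targets: a model trained on scaled targets produces scaled predictions, and the stored parameters convert them back to the original units.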
To make prototyping easier, I added support for initializing a Matrix in NumPy style:
----------------
The implemented algorithms are the fundamental ones, and there will not be many new algorithms in the near future. Instead, I will focus on improving the accuracy of the current algorithms. I'm also keeping an eye on Mojo NumPy-replacement projects like Endia and NuMojo. So, once model accuracy is dealt with, I'll try to increase speed by optimizing the algorithms, and I may replace my Matrix type with a better NumPy equivalent.
Hope progress will be made over time, even if it's slow.
Hey! This looks awesome! What is the inference latency like compared to a Python implementation?
Hi, thank you! If you mean pure Python, Mojmelo will be much faster. NumPy is the key to performance in Python, and how much it helps depends on the algorithm: if the algorithm performs large matrix operations, NumPy can make a noticeable difference. If we use a data structure as fast as NumPy, or the algorithm performs smaller matrix operations, I think Mojmelo will have equal or better performance. (For example, Mojo is better than Python at parallel computing.)
Btw, I'm currently improving the matrix data type, and I hope we will get some good results in terms of speed.
Ah, cool! Basically, I run Monte Carlo simulations where the decision points are GBDT models. Currently I compile them, but I was wondering whether swapping to Mojo would make my simulations run faster.
Well, I'm not fully familiar with your process. If the GBDT parts are the bottleneck and you are using a CPU-based library (like sklearn), they could run faster once Mojo GPU support arrives. But my current GBDT is not optimized and may not fully support the parameters you need. It's a WIP.
If you're using something like XGBoost, that is already very well optimized.