Canarias Machine Learning

PRModel

Technology

PRModel is a machine learning framework written in C with Python support.
It is designed first to save memory and second to greatly speed up perceptron models with a large number of features. It is not fast on small models, because its weight-update optimization is aimed at large ones.
For big networks with hundreds of thousands or millions of features, it can be 1000 times faster or more. The key is an algorithm that dynamically cuts calculations during the learning process.

Introduction

This framework was written out of necessity. In one of our projects, an intelligent caching system, the model was so big and slow that it was very expensive in hardware terms.
During development, a radical idea arose: to design the backpropagation algorithm in a very different and unusual way. A proof of concept was built and... it worked better than expected.

Research

The design was made entirely in C to guarantee the maximum possible speed. It behaves like a regular perceptron, but internally it optimizes calculations by not updating all weights in each learning cycle. We also developed Python libraries so that the PRModel API can be used from that language.
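To make the idea of skipping weight updates concrete, here is a minimal sketch in Python. It is not PRModel's algorithm, which is not described here. It assumes one simple skipping rule: do nothing when the prediction is already correct, and otherwise touch only the weights whose input is non-zero. The names `train_sparse_perceptron` and `predict` are hypothetical and exist only for this illustration.

```python
def train_sparse_perceptron(samples, n_features, epochs=10, lr=0.1):
    """Train a binary perceptron, writing only weights of active features.

    `samples` is a list of (active_indices, label) pairs with label in {0, 1}.
    Returns (weights, bias, updates); `updates` counts individual weight writes.
    Assumption: this skipping rule is illustrative, not PRModel's actual rule.
    """
    weights = [0.0] * n_features
    bias = 0.0
    updates = 0
    for _ in range(epochs):
        for active, label in samples:
            # Forward pass: only non-zero inputs contribute to the score.
            score = bias + sum(weights[i] for i in active)
            predicted = 1 if score > 0 else 0
            error = label - predicted
            if error == 0:
                continue  # correct prediction: skip the update entirely
            # Update only the weights whose input was non-zero.
            for i in active:
                weights[i] += lr * error
                updates += 1
            bias += lr * error
    return weights, bias, updates


def predict(weights, bias, active):
    """Classify one sample given the indices of its non-zero inputs."""
    return 1 if bias + sum(weights[i] for i in active) > 0 else 0


# Toy data: 100,000 possible features, but each sample activates only three.
samples = [([0, 5, 9], 1), ([1, 5, 7], 0), ([0, 7, 8], 1), ([1, 8, 9], 0)]
weights, bias, updates = train_sparse_perceptron(samples, n_features=100_000)
print(updates, [predict(weights, bias, a) for a, _ in samples])
```

In this toy run the number of weight writes stays far below the 100,000 writes that a single dense update would need, which is the kind of saving a bag-of-words input makes possible.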

Final Result

The results are spectacular in both speed and memory usage. We compared PRModel against Keras running on TensorFlow, Google's ML framework, on a standard classification problem. The task is to determine in which country a hotel is located, given a database of addresses and hotel data. Because the input is a bag of words built from a large corpus, the X vector has 79,810 features. Training for 15 epochs took:

Keras	PRModel (from python)
5 days and 17 hours	4 hours and 45 minutes

The loss, precision and recall of the two models were similar on prediction. PRModel learned about 29 times faster than Keras on the same computer, without using a GPU.
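The speedup can be checked from the rounded training times in the table above. The short sketch below only redoes that arithmetic; the variable names are ours, and the result may differ slightly from a figure computed on unrounded timings.

```python
from datetime import timedelta

# Training times as reported in the table above (rounded).
keras_time = timedelta(days=5, hours=17)
prmodel_time = timedelta(hours=4, minutes=45)

# Dividing two timedeltas yields a plain float ratio.
speedup = keras_time / prmodel_time
print(f"PRModel trained about {speedup:.1f} times faster")
```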

Keras is indeed faster when the X input has fewer than 200 features, but as the number of parameters increases, PRModel has stellar performance. For ML problems with millions of input parameters, PRModel is 1000 or more times faster.

About memory usage:

Epoch	Keras	PRModel (from python)
1	516,857,856 bytes	426,455,040 bytes
2	618,995,712 bytes	426,455,040 bytes
3	719,631,424 bytes	426,455,040 bytes
4	834,805,760 bytes	426,455,040 bytes
5	926,666,752 bytes	426,455,040 bytes
...	...	...
15	1,326,414,208 bytes	426,455,040 bytes

The Keras or TensorFlow versions we tested appear to have some kind of memory leak. In any case, PRModel's memory usage stays constant across epochs and is much lower.
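The figures in the table can be summarized with a little arithmetic. The sketch below uses only the readings listed above (epochs 6 to 14 are elided in the table and are not filled in here) and computes the average Keras growth per epoch over the first five epochs and the memory ratio at epoch 15.

```python
# Memory readings in bytes, copied from the table above.
keras_bytes = {
    1: 516_857_856,
    2: 618_995_712,
    3: 719_631_424,
    4: 834_805_760,
    5: 926_666_752,
    15: 1_326_414_208,
}
prmodel_bytes = 426_455_040  # constant at every listed epoch

MIB = 1024 * 1024

# Average growth per epoch between epoch 1 and epoch 5 (four steps).
avg_growth = (keras_bytes[5] - keras_bytes[1]) / 4

# How much more memory Keras used than PRModel at the last epoch.
ratio_at_15 = keras_bytes[15] / prmodel_bytes

print(f"Keras grew by about {avg_growth / MIB:.1f} MiB per epoch (epochs 1-5)")
print(f"At epoch 15 Keras used {ratio_at_15:.2f} times the memory of PRModel")
```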

Future

This technology allows us to address problems that are otherwise beyond the reach of machine learning due to their size, and it gives us a very important competitive advantage. PRModel can be expanded to support other types of neural networks, such as LSTM or convolutional networks. PRModel will be even faster when GPU support is ready. Because it internally manages weights as a B-tree, it can span models of practically unlimited size regardless of available physical memory, and it can be connected automatically to a database such as MySQL.
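As a rough illustration of keeping weights in a B-tree-backed store rather than in a dense in-memory array, the sketch below uses SQLite from the Python standard library as a stand-in (SQLite stores its tables as B-trees). This is an assumption made for the example only: PRModel's internal B-tree and its MySQL connection are not shown here, and the `DiskWeights` class is hypothetical.

```python
import sqlite3


class DiskWeights:
    """Hypothetical weight store; weights never written default to 0.0."""

    def __init__(self, path=":memory:"):
        # Pass a file path instead of ":memory:" to keep weights on disk.
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS weights ("
            "idx INTEGER PRIMARY KEY, value REAL NOT NULL)"
        )

    def get(self, idx):
        row = self.db.execute(
            "SELECT value FROM weights WHERE idx = ?", (idx,)
        ).fetchone()
        return row[0] if row else 0.0

    def add(self, idx, delta):
        # Read-modify-write: only weights that were touched occupy storage.
        self.db.execute(
            "INSERT OR REPLACE INTO weights (idx, value) VALUES (?, ?)",
            (idx, self.get(idx) + delta),
        )

    def count(self):
        return self.db.execute("SELECT COUNT(*) FROM weights").fetchone()[0]


store = DiskWeights()
store.add(7_000_000_000, 0.5)   # an index far beyond any in-memory array
store.add(7_000_000_000, 0.25)
store.add(3, -1.0)
print(store.get(7_000_000_000), store.get(42), store.count())
```

Only the weights that have actually been updated take up space, so the index range can be far larger than what would fit in memory as a dense array.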
