Releases · KxSystems/automl

Refactored coding and commenting style to match current coding standards.
AutoML now requires ML Toolkit >=3.0. This change was necessary because the function signatures have changed significantly.

Complete backend change to AutoML.
- The framework now defines the code base using the directed acyclic graph (DAG) and pipelining structure provided by the ML Toolkit's latest release. This dramatically improves code cleanliness and makes modifying and expanding the code base significantly easier.
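To illustrate the idea of a workflow defined as a directed acyclic graph, here is a minimal Python sketch. It is not the ML Toolkit's graphing API: the `run_pipeline` helper, the node names, and the `edges` layout are all hypothetical, and the sketch only shows how nodes can be executed once their upstream nodes have produced output.

```python
from graphlib import TopologicalSorter

def run_pipeline(nodes, edges, inputs):
    """Run each node once all of its upstream nodes have produced output.

    nodes:  name -> function taking a dict of upstream results
    edges:  name -> set of upstream node names (hypothetical layout)
    inputs: values available before any node runs
    """
    results = dict(inputs)
    # static_order() raises graphlib.CycleError if the graph has a cycle
    for name in TopologicalSorter(edges).static_order():
        if name in results:  # supplied as an input, nothing to run
            continue
        upstream = {dep: results[dep] for dep in edges.get(name, ())}
        results[name] = nodes[name](upstream)
    return results

# Hypothetical three-step workflow: raw -> clean -> features -> model
nodes = {
    "clean": lambda up: [x for x in up["raw"] if x is not None],
    "features": lambda up: [x * x for x in up["clean"]],
    "model": lambda up: sum(up["features"]) / len(up["features"]),
}
edges = {"clean": {"raw"}, "features": {"clean"}, "model": {"features"}}
out = run_pipeline(nodes, edges, {"raw": [1, None, 2, 3]})
```

Because each step only declares what it depends on, adding or replacing a step means editing one node and its edges rather than the surrounding control flow.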
Added a command-line interface for AutoML, which allows the session configuration to be updated, or a complete run to be executed before exiting.
Fitting a model and predicting now use the .automl.fit function, which returns a dictionary containing the predict function. This replaces the .automl.new functionality, which required users to retrieve fit models from disk on each invocation.
To retrieve models from disk, the .automl.getModel function is provided. It returns a dictionary containing the predict function as one of its keys.
- This retrieval finds the prevailing model, so to find the latest model you can pass in the current date/time in the appropriate format. Named models can also be retrieved this way.
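The "prevailing model" lookup can be pictured with a short Python sketch. This is not the implementation of .automl.getModel: it assumes that "prevailing" means the most recent model saved at or before the requested date/time, and the `prevailing_model` helper and the example paths are hypothetical.

```python
from bisect import bisect_right
from datetime import datetime

def prevailing_model(saved, as_of):
    """Return the most recent entry saved at or before `as_of`, else None.

    saved: list of (timestamp, model_path) tuples sorted by timestamp
    """
    times = [t for t, _ in saved]
    idx = bisect_right(times, as_of)  # count of entries with t <= as_of
    return saved[idx - 1] if idx else None

# Hypothetical saved runs; the paths are illustrative only
saved = [
    (datetime(2020, 11, 1, 9, 0), "run_a"),
    (datetime(2020, 11, 10, 14, 30), "run_b"),
    (datetime(2020, 12, 2, 8, 15), "run_c"),
]

mid_november = prevailing_model(saved, datetime(2020, 11, 15))
latest = prevailing_model(saved, datetime(2021, 1, 1))
too_early = prevailing_model(saved, datetime(2020, 1, 1))
```

Passing a timestamp later than every saved run returns the latest model, which matches the note above about passing in the current date/time.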
Models can now be named rather than just dated and timed.
The .automl.deleteModels function allows individual models, or all models whose string representation matches a regex, to be deleted.
Support added for Theano models
The stdout printing of AutoML can now be logged to the outputs folder associated with a run or redirected to a user defined location.
Some warnings/errors are now configurable. For example, data with more than 10,000 targets previously disabled the fitting of neural networks; this check can now be ignored, or the limit on the number of targets modified.
- Three warning levels are provided: 0 = ignore everything; 1 = report the action being taken and continue anyway; 2 = raise an appropriate error.
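The three levels can be sketched as a small dispatcher. This is an illustration in Python rather than AutoML's own code: the `report` function and its `log` parameter are hypothetical, and only the meaning of levels 0, 1, and 2 is taken from the notes above.

```python
IGNORE, WARN, ERROR = 0, 1, 2  # levels as described in the release notes

def report(level, message, log=print):
    """Handle a warning according to the configured level."""
    if level == IGNORE:
        return               # say nothing and continue
    if level == WARN:
        log(message)         # report the action being taken, then continue
        return
    if level == ERROR:
        raise RuntimeError(message)  # stop with an appropriate error
    raise ValueError(f"unknown warning level: {level}")

# Collect messages in a list instead of printing, for demonstration
messages = []
report(IGNORE, "neural networks will not be fitted", log=messages.append)
report(WARN, "neural networks will not be fitted", log=messages.append)
```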
All configuration that a user may be required to change is now defined using JSON. This includes:
- Models which are to be applied
- Hyperparameter sets for applied models
- Scoring functions supported and the expected ordering of these
- Any updates to default parameters that a user wants to persist for the entire process, apply when running from the command line, or supply through a custom configuration.
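As a rough picture of JSON-defined configuration with user overrides, consider the Python sketch below. Every key and value in it (`models`, `hyperparameters`, `scoring`, `defaults`, and their contents) is a made-up placeholder and not AutoML's actual schema; the `merge` helper is likewise hypothetical and only shows how a custom file could override selected defaults.

```python
import json

# Hypothetical default configuration; keys are placeholders, not AutoML's schema
DEFAULTS_JSON = """
{
  "models": ["modelA", "modelB"],
  "hyperparameters": {"modelA": {"depth": [2, 4, 8]}},
  "scoring": {"accuracy": "descending", "mse": "ascending"},
  "defaults": {"seed": 42, "testingSize": 0.2}
}
"""

def merge(base, override):
    """Recursively overlay `override` on `base` without mutating either."""
    merged = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge(merged[key], value)
        else:
            merged[key] = value
    return merged

defaults = json.loads(DEFAULTS_JSON)
custom = json.loads('{"defaults": {"seed": 7}, "models": ["modelA"]}')
config = merge(defaults, custom)
```

Only the keys present in the custom file change; everything else keeps its default value.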

There are a number of other changes; the above is only a brief overview. More in-depth explanations of functionality will be provided in the documentation.

What's New:
Natural Language Processing:

Feature engineering techniques to transform text data into appropriate numerical representations for the application of the machine learning algorithms provided. Techniques include:
- Named entity recognition
- Sentiment analysis
- word2vec embedding
- Stop word/part of speech/numerical decomposition

Grid Search:

Provides the ability to change the hyperparameter search methodology from exhaustive grid search to random or Sobol (quasi-random) searching.
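The difference between the search styles can be sketched in plain Python. This is not AutoML's implementation, and all function names are hypothetical. To keep the example dependency-free, the quasi-random variant uses a Halton sequence as a stand-in for Sobol: both are low-discrepancy sequences that cover the space more evenly than independent random draws, but they are not the same sequence.

```python
import itertools
import random

def exhaustive(space):
    """Every combination of the listed values (classic grid search)."""
    keys = list(space)
    return [dict(zip(keys, combo)) for combo in itertools.product(*space.values())]

def random_search(bounds, n, seed=0):
    """n independent uniform draws within (low, high) bounds."""
    rng = random.Random(seed)
    return [{k: rng.uniform(lo, hi) for k, (lo, hi) in bounds.items()}
            for _ in range(n)]

def radical_inverse(i, base):
    """Mirror the base-`base` digits of i about the radix point."""
    result, frac = 0.0, 1.0 / base
    while i:
        i, digit = divmod(i, base)
        result += digit * frac
        frac /= base
    return result

def halton_search(bounds, n):
    """n low-discrepancy points (Halton, used here in place of Sobol)."""
    primes = [2, 3, 5, 7, 11, 13]
    if len(bounds) > len(primes):
        raise ValueError("too many dimensions for this sketch")
    return [{k: lo + (hi - lo) * radical_inverse(i, p)
             for (k, (lo, hi)), p in zip(bounds.items(), primes)}
            for i in range(1, n + 1)]

grid = exhaustive({"depth": [2, 4], "trees": [10, 50, 100]})
bounds = {"learning_rate": (0.01, 0.3), "subsample": (0.5, 1.0)}
rand_points = random_search(bounds, 8)
quasi_points = halton_search(bounds, 8)
```

The grid grows multiplicatively with each added hyperparameter, whereas the random and quasi-random variants let the number of evaluations be fixed in advance.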

Report Generation:

Default report generation now produces LaTeX reports rather than reportlab reports. These reports are generated using PyLaTeX and rely on the user having the appropriate LaTeX compilers installed; if they are not available, report generation falls through to reportlab.
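The fall-through behaviour follows a common try-then-fallback pattern, sketched here in Python. The `generate_report` helper and the two stand-in builder functions are hypothetical; no real PyLaTeX or reportlab calls are made, and the actual AutoML logic may differ.

```python
def generate_report(primary, fallback, log=print):
    """Try the primary report builder; on any failure use the fallback."""
    try:
        return primary()
    except Exception as err:
        log(f"primary report generation failed ({err}); using fallback")
        return fallback()

# Stand-ins for the two report builders (illustrative only)
def latex_report_missing_compiler():
    raise RuntimeError("no LaTeX compiler found")

def latex_report_ok():
    return "report.pdf (LaTeX)"

def reportlab_report():
    return "report.pdf (reportlab)"

notes = []
with_compiler = generate_report(latex_report_ok, reportlab_report, log=notes.append)
without_compiler = generate_report(latex_report_missing_compiler, reportlab_report,
                                   log=notes.append)
```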

PyTorch:

Support allowing users to test their own PyTorch models against the models provided by default.

Kx AutoML builds upon our existing machine learning libraries (particularly the ML Toolkit and FRESH libraries) to provide a full end-to-end ML workflow for users.

This automates the entire task of applying machine learning solutions to real-world problems, from the raw dataset to the deployment of an optimized model.

Features include:

- Data preprocessing and encoding
- Feature generation (including time-series features via FRESH)
- Feature significance testing
- Selection of models applicable to the data available and task at hand
- Training and validation of models
- Selection of optimal models, based on statistical scoring metrics
- Hyperparameter tuning based on grid search methods
- Model and final report generation
- Options to modify and extend workflows, including bring-your-own algorithms and scoring functions


Patch release for drafting of updated Docker image

0.4.0

Refactor, API update and functional additions

Initial release of v0.2-beta

Initial Release