v0.16
New Features
- Added the AzureSearchWriter for integrating Spark with Azure Search
- Added the Smart Adaptive Recommender (SAR) for better recommendations in SparkML
- Added Named Entity Recognition Cognitive Service on Spark
- Several new LightGBM features (Multiclass Classification, Windows Support, Class Balancing, Custom Boosting, etc.)
- Added Ranking Train Validation Splitter for easy ranking experiments
- All Computer Vision Services can now send binary data or URLs to Cognitive Services
New Examples
- Learn how to use the Azure Search writer to create a visual search system for The Metropolitan Museum of Art with: AzureSearchIndex - Met Artworks.ipynb
Updates and Improvements
General
- MMLSpark Image Schema now unified with Spark Core
- Now supports Query pushdown and Deep Learning Pipelines
- Bugfixes for Text Analytics services
- PageSplitter now propagates nulls
- HTTP on Spark now supports socket and read timeouts
- HyperparamBuilder Python wrappers now return idiomatic Python objects
LightGBM on Spark
- Added multiclass classification
- Added multiple types of boosting (Gradient Boosting Decision Tree, Random Forest, Dropout meet Multiple Additive Regression Trees, Gradient-based One-Side Sampling)
- Added Windows OS support/bugfix
- LightGBM version bumped to 2.2.200
- Added native support for categorical columns, either through Spark's StringIndexer, MMLSpark's ValueIndexer or list of indexes/slot names parameter
- Added isUnbalance parameter for unbalanced datasets
- Added boost from average parameter
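
As a rough illustration of what the last two options mean conceptually, the sketch below computes a "boost from average" starting score and a count-ratio class weighting in plain Python. It is not MMLSpark's or LightGBM's actual code: the function names `initial_score` and `unbalanced_class_weights` are hypothetical, and the formulas (label mean for regression, log-odds of the positive rate for binary classification, minority class upweighted by the class-count ratio) are assumptions about the general idea, not a statement of the library's exact behavior.

```python
import math


def initial_score(labels, objective="regression"):
    """Hypothetical helper: starting prediction when boosting from the average."""
    mean = sum(labels) / len(labels)
    if objective == "binary":
        # Assumed form: log-odds of the observed positive rate.
        if not 0.0 < mean < 1.0:
            raise ValueError("binary labels must contain both classes")
        return math.log(mean / (1.0 - mean))
    # Regression: start from the label mean instead of zero.
    return mean


def unbalanced_class_weights(labels):
    """Hypothetical helper: (negative_weight, positive_weight) for 0/1 labels.

    The minority class is upweighted by the ratio of class counts.
    """
    pos = sum(1 for y in labels if y == 1)
    neg = len(labels) - pos
    if pos == 0 or neg == 0:
        raise ValueError("labels must contain both classes")
    if pos > neg:
        return (pos / neg, 1.0)
    return (1.0, neg / pos)


labels = [0, 0, 0, 1]
print(initial_score(labels))             # mean of the labels
print(initial_score(labels, "binary"))   # log-odds of the positive rate
print(unbalanced_class_weights(labels))  # minority (positive) class upweighted
```

Starting from the average usually lets the first trees fit residuals rather than the overall offset, and the class weights keep a rare positive class from being drowned out by the majority.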
Acknowledgements
We would like to acknowledge the developers and contributors, both internal and external, who helped create this version of MMLSpark.
- Ilya Matiach, Casey Hong, Daniel Ciborowski, Karthik Rajendran, Dalitso Banda, Manon Knoertzer, Sudarshan Raghunathan, Anand Raman, Markus Cozowicz, The Microsoft AI Development Acceleration Program, Cognitive Search Team, Azure Search Team