
Surprised! Pleasantly surprised, that is, in this sixth installment of my autoML series. While I used IBM’s Data Science cloud platform a few years ago, I have not had any exposure to Watson Studio before, nor their AutoAI offering. I wasn’t sure what I was going to find, but I was pleased with what I saw. For someone who wants (needs) constant feedback on progress during training runs, I loved the wealth of information provided. The accuracy isn’t as tuned as DataRobot or H2O Driverless AI, but that is to be expected given the extreme difference in price (tens of thousands of dollars a year).
Why IBM Watson Studio AutoAI?
Big Blue isn’t ‘just’ mainframes and SPSS. I used the IBM data science cloud notebook environment a few years ago. Watson has a storied history at IBM. While you aren’t running ‘on’ Watson, the reputation brings authority.
The Cost
I was able to run this experiment for FREE. The pricing tiers are very reasonable for the individual data scientist.

The Setup
"These capabilities are available as part of a fully managed starter set of Cloud Pak for Data services on the IBM Cloud. Provision the integrated Lite versions of Watson Studio and Watson Machine Learning for free today as part of Cloud Pak for Data as a Service."
The link above will take you to a Try AutoAI on Watson Studio button. AutoAI is a hosted cloud solution, so the process of setting up a new project is pretty straightforward.
Once you are in Watson Studio, you add an AutoAI asset.

From there, you can set up a new experiment.

The Data
To keep parity across the tools in this series, I will stick to the Kaggle training file from Contradictory, My Dear Watson: detecting contradiction and entailment in multilingual text using TPUs. In this Getting Started competition, we’re classifying pairs of sentences (consisting of a premise and a hypothesis) into three categories – entailment, contradiction, or neutral.
6 columns × 13k+ rows – Stanford NLP documentation
- id
- premise
- hypothesis
- lang_abv
- language
- label
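For a sense of the shape AutoAI ingests, here is a minimal pandas sketch of that schema. The rows are hypothetical stand-ins, not real data from the file, and the label encoding shown (0 = entailment, 1 = neutral, 2 = contradiction) is the one this Kaggle competition uses:

```python
import pandas as pd

# Hypothetical rows mirroring the competition's train.csv schema
df = pd.DataFrame({
    "id": ["a1", "b2", "c3"],
    "premise": [
        "The cat sat on the mat.",
        "He left the office at noon.",
        "The report was finished on time.",
    ],
    "hypothesis": [
        "A cat is on the mat.",
        "He stayed at the office all day.",
        "A report exists.",
    ],
    "lang_abv": ["en", "en", "en"],
    "language": ["English", "English", "English"],
    "label": [0, 2, 0],
})

# Competition label encoding: 0 = entailment, 1 = neutral, 2 = contradiction
LABELS = {0: "entailment", 1: "neutral", 2: "contradiction"}
df["label_name"] = df["label"].map(LABELS)
print(df[["premise", "hypothesis", "label_name"]])
```

The `label` column is what you will point AutoAI at as the prediction target later on.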
Loading the data
It couldn’t be simpler.

Training your model
The interface to configure and run your experiment is very minimal. There are some other options under Experiment Settings, but I wanted to keep the run as basic as possible. Pick your label and hit Run Experiment.

This is where it gets fun! There is an interactive visualization that lets you see where you are in the experiment. I love this! Leaderboards are also available for you to review during training.

Evaluate Training Results
There is a small indicator that the experiment has completed. I would have expected something more eye-catching, but I am happy it finished in a reasonable amount of time: 22 minutes.

The leaderboards provide information on accuracy and other success metrics, as well as the model type and the enhancements made (such as feature engineering). I didn’t see a wide variety of models attempted, hence the short training time.


For Pipeline 3, I looked at the engineered features. You have to look into the details because ‘NewFeature_2’ isn’t very descriptive.

Conclusions
Of interest, after the training I got a popup introducing feature engineering on multiple datasets. Definitely something to investigate! If they can identify relationships between datasets and create new features, that would be amazing.
Overall, I enjoyed the Watson AutoAI experience itself. The process ran FAST (22 minutes versus H2O Driverless AI’s 4+ hours), but at the expense of model variety and out-of-the-box accuracy. Additional experiment setup would be needed. But at roughly $78k less than DataRobot and H2O Driverless AI, that might be an acceptable trade-off.
I would encourage you to consider trying AutoAI. This IBM offering is a reasonably priced autoML tool.
If you missed one of the articles in the series, here are the links.
Is AWS Sagemaker Studio Autopilot ready for prime-time?
Experience Google autoML Tables for Free