Monthly Archives: March 2017

Ontology for Machine Learning Complexity

Every so often a really good article lands on my plate at just the right time.   This happened recently when I came across Ideas on Interpreting Machine Learning.

A question on my mind these days is this: how do we quantify Machine Learning complexity? I am not referring to the complexity of the problem ML is helping to solve, but rather to the complexity of the ML solution itself.

Software development teams are racing to adopt Machine Learning tools and practices. It remains unclear, however, how to manage and quantify ML complexity the way IT pros have long done for conventional software.

Unnecessarily complex solutions create an array of problems in the software development life cycle: they demotivate teams, increase costs, and lower the quality of the outputs produced. In the case of ML, complexity also matters because pending regulation will increasingly demand that ML models be understandable and transparent:

The law will also effectively create a “right to explanation,” whereby a user can ask for an explanation of an algorithmic decision that was made about them. We argue that while this law will pose large challenges for industry, it highlights opportunities for computer scientists to take the lead in designing algorithms and evaluation frameworks which avoid discrimination and enable explanation.

The current hype surrounding AI and ML is only widening the gap between ML deployments and stakeholders' understanding of the deployed models.

In Ideas on Interpreting Machine Learning, the authors propose a number of tools to improve the “interpretability” of ML algorithms and the models they produce, including an ontology that describes a model’s complexity as follows:

  • Linear, monotonic functions – describe ML models created by linear regression algorithms; probably the most interpretable class of models.  For a change in any given independent variable, the response function changes at a defined rate, in only one direction, and at a magnitude represented by a readily available coefficient.
  • Nonlinear, monotonic functions – There is no single coefficient that represents the change in the response function induced by a change in a single independent variable.  Nonlinear, monotonic functions do always change in one direction as a single input variable changes.
  • Nonlinear, non-monotonic functions – Most machine learning algorithms create nonlinear, non-monotonic response functions. This class of functions is the most difficult to interpret, as they can change in both a positive and a negative direction, and at a varying rate, for any change in an independent variable.
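
To make the three classes above a little more concrete, here is a minimal Python sketch that labels a one-variable response function by sampling it on a grid and looking at the slope between neighbouring points. It is only an illustration and not something taken from the article: the function name classify_response, the tolerance tol, and the example grid are all hypothetical, and a check on a finite grid can suggest a class but cannot prove it.

```python
import math
from typing import Callable, Sequence


def classify_response(f: Callable[[float], float],
                      xs: Sequence[float],
                      tol: float = 1e-6) -> str:
    """Label f, sampled on the strictly increasing grid xs, with one of
    the three classes from the ontology. Hypothetical helper for
    illustration; the result only reflects the sampled points."""
    ys = [f(x) for x in xs]
    # Slope of each segment between neighbouring grid points.
    slopes = [(ys[i + 1] - ys[i]) / (xs[i + 1] - xs[i])
              for i in range(len(xs) - 1)]

    # Linear: the slope is (numerically) the same everywhere, so a single
    # coefficient describes the effect of the input.
    if max(slopes) - min(slopes) <= tol:
        return "linear, monotonic"

    # Monotonic: the response never reverses direction on the grid.
    never_decreases = all(s >= -tol for s in slopes)
    never_increases = all(s <= tol for s in slopes)
    if never_decreases or never_increases:
        return "nonlinear, monotonic"

    return "nonlinear, non-monotonic"


# Example grid from -3.0 to 3.0 in steps of 0.1 (an arbitrary choice).
GRID = [i / 10 for i in range(-30, 31)]

examples = {
    "2x + 1": lambda x: 2 * x + 1,                    # constant slope
    "logistic": lambda x: 1 / (1 + math.exp(-x)),     # always rising, varying rate
    "sin": math.sin,                                  # rises and falls
}

for name, fn in examples.items():
    print(name, "->", classify_response(fn, GRID))
```

On this grid, the straight line comes out as linear and monotonic, the logistic curve as nonlinear but monotonic, and the sine wave as nonlinear and non-monotonic. A real model has many inputs, so in practice one would have to repeat this kind of check per input variable while holding the others fixed, which is itself a simplifying assumption.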

Traditional techniques to measure IT complexity do not apply in Machine Learning. This is because Machine Learning flips traditional software engineering on its head, putting more focus on engineering the right input data instead of code.

This new ontology provides one way to describe ML model complexity, which is good news for maturing the role of ML in the software development life cycle.