How to choose between a data, model or software solution
Most people don’t know when to use a machine learning model over a traditional software solution.
This 8-step guide will help you pick the right solution for your problem.
1/ Can you code a program with explicit rules?
Software engineers write precise rules in their code, which a computer can execute. Data scientists instead collect input data, specify the desired target value, and then select a model that learns the parameter values which bring its predictions closest to that target.
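To make the contrast concrete, here is a minimal sketch in Python. The transaction amounts, the labels and the function names are all made up for illustration. The first function is a rule written by hand; the second picks its own cut-off from labelled examples.

```python
# Hypothetical task: decide whether a transaction amount counts as "large".

def is_large_rule(amount):
    # Software approach: the engineer writes the rule explicitly.
    return amount > 1000

def learn_threshold(amounts, labels):
    # ML-style approach: try each observed amount as a cut-off and keep
    # the one that reproduces the most labelled examples.
    best_threshold, best_correct = None, -1
    for candidate in sorted(amounts):
        correct = sum((a >= candidate) == y for a, y in zip(amounts, labels))
        if correct > best_correct:
            best_threshold, best_correct = candidate, correct
    return best_threshold

# Made-up input data and desired target values.
amounts = [120, 300, 950, 1500, 2200, 4000]
labels = [False, False, False, True, True, True]

threshold = learn_threshold(amounts, labels)
print(threshold)  # 1500
```

If you can confidently write the first function yourself, you probably don't need the second.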
2/ Does the software have to adapt to regular changes in the environment?
ML algos work by finding patterns in the data. If the data changes, so will the accuracy of the model, so we expect to update it continuously. Software engineering, by contrast, usually sticks to fixed rules to deliver a solution.
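A rough sketch of what "the data changes, so the accuracy changes" can look like in practice. The cut-off of 1500, both batches of data and the 0.9 tolerance are assumptions chosen for illustration, not recommended values.

```python
def accuracy(threshold, values, labels):
    # Share of examples where "value >= threshold" matches the label.
    hits = sum((v >= threshold) == y for v, y in zip(values, labels))
    return hits / len(values)

def needs_retraining(threshold, values, labels, min_accuracy=0.9):
    # min_accuracy is an assumed tolerance; pick one that suits your problem.
    return accuracy(threshold, values, labels) < min_accuracy

threshold = 1500  # assumed to have been learned on older data

old_values = [120, 300, 950, 1500, 2200, 4000]
old_labels = [False, False, False, True, True, True]

# Hypothetical shift: what counts as "large" has moved upward.
new_values = [900, 1600, 1800, 2500, 5000, 8000]
new_labels = [False, False, False, False, True, True]

print(accuracy(threshold, old_values, old_labels))          # 1.0
print(accuracy(threshold, new_values, new_labels))          # 0.5
print(needs_retraining(threshold, new_values, new_labels))  # True
```

A hand-written rule would go stale in the same way, but nobody would notice until it was checked; with a model, this kind of monitoring is part of the job.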
3/ Does it need to be personalised?
Sometimes the patterns in the data are too complicated for humans to find and turn into a personalised solution by hand, so we use machine learning.
4/ Do we have the time to experiment in case the solution does not work right away?
ML is a more iterative and experimental process than software engineering. Sometimes we cannot build an ML solution that works, so we need to be comfortable knowing that it may fail. If that’s not acceptable, then ML is not the way to go.
5/ Do you have enough input data, and can you get more?
You need a large data set of typical cases in order to learn rules from data. If we don’t have enough, or the quality isn’t there (for example, it’s not labelled with the desired outcome), then we have to get more if possible.
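A quick check like the one below can answer this question before any modelling starts. It is only a sketch: the `outcome` field name and the `min_rows` and `min_labelled` cut-offs are hypothetical and depend entirely on your problem.

```python
def data_readiness(rows, target_key="outcome", min_rows=1000, min_labelled=0.8):
    # Count the rows that actually carry the desired outcome.
    labelled = [r for r in rows if r.get(target_key) is not None]
    share = len(labelled) / len(rows) if rows else 0.0
    return {
        "rows": len(rows),
        "labelled_share": share,
        "enough_rows": len(rows) >= min_rows,
        "enough_labels": share >= min_labelled,
    }

# Made-up records: two of the four have no usable label.
rows = [
    {"amount": 120, "outcome": "ok"},
    {"amount": 950, "outcome": None},
    {"amount": 1500, "outcome": "fraud"},
    {"amount": 2200},
]

report = data_readiness(rows, min_rows=3)
print(report)
```

Here the row count passes the (tiny) assumed minimum, but only half the rows are labelled, so the next job would be to get more labels.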
6/ Do we know the desired target value?
As mentioned in step 1, the target value must be specified so that each prediction can be compared against it. During training, the model adjusts its parameters to bring its predictions closer to the target value.
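Here is a minimal sketch of that loop, assuming the simplest possible model (prediction = w * x) and plain gradient descent on the squared error. The data, learning rate and step count are illustrative only.

```python
def fit_weight(xs, targets, learning_rate=0.01, steps=500):
    w = 0.0  # the single parameter the model is allowed to adapt
    n = len(xs)
    for _ in range(steps):
        # Gradient of the mean squared error between prediction (w * x)
        # and target, with respect to w.
        grad = sum(2 * (w * x - t) * x for x, t in zip(xs, targets)) / n
        # Nudge the parameter so predictions move closer to the targets.
        w -= learning_rate * grad
    return w

# Made-up data where the target is always twice the input.
xs = [1.0, 2.0, 3.0, 4.0]
targets = [2.0, 4.0, 6.0, 8.0]

w = fit_weight(xs, targets)
print(round(w, 3))  # 2.0
```

Without the `targets` list there is nothing to compare the prediction against, which is why this step matters.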
7/ Are there lots of typical cases in the data that we can use to train the model?
If the data contains plenty of typical cases, the model should be able to learn the underlying relationship and use it to make accurate predictions.
8/ Are there patterns in the data?
When we don’t know the target value, we can still use clustering algos to group the data into meaningful clusters, provided there are patterns in the data.
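As one example of such an algo, below is a bare-bones k-means sketch on one-dimensional data. It is a simplified illustration with a naive, evenly spaced starting point and made-up values; it assumes k is at least 2. A real project would more likely use a library implementation.

```python
def kmeans_1d(values, k=2, steps=10):
    # Naive deterministic start: spread the centres evenly between min and max.
    lo, hi = min(values), max(values)
    centres = [lo + (hi - lo) * i / (k - 1) for i in range(k)]
    for _ in range(steps):
        # Assign every value to its nearest centre.
        groups = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda i: abs(v - centres[i]))
            groups[nearest].append(v)
        # Move each centre to the mean of its group (keep it if the group is empty).
        centres = [sum(g) / len(g) if g else c for g, c in zip(groups, centres)]
    return centres, groups

# Made-up, unlabelled values with two obvious clumps.
values = [1.0, 1.2, 0.8, 9.0, 9.5, 10.1]

centres, groups = kmeans_1d(values, k=2)
print(groups)  # [[1.0, 1.2, 0.8], [9.0, 9.5, 10.1]]
```

No target value was supplied anywhere; the groups come purely from the pattern in the data.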