Posted: 2024-01-12
This post is about practical challenges with data science.
At Process IQ we like Python and friends. It is a great language: it is expressive, easy to read and comes with a fantastic set of libraries. However, two things are thorny with Python: deployment and the lack of type safety.
This is our opinion: we are a small and agile tech company and we like the efficiency that comes with type safety. Nonetheless, large companies do run Python applications at scale, e.g. Instagram, Google, YouTube, Dropbox and PayPal.
One activity where we use Python in production is machine learning, or rather machine “teaching”. Our data science team likes to use a Python environment to experiment with models and train them. However, for production use we require models to be executed directly through the .NET runtime for security and stability reasons.
We also like Gradient Boosting Models for their expressiveness and their efficiency with respect to the number of data samples required. So here is a practical problem: how can you transfer a model from a Python environment into the .NET runtime?
It turns out that Microsoft has been successfully pushing its ML.NET libraries, which also support the ONNX model format. The onnxmltools package can translate industry-standard scikit-learn-style models (a LightGBM model in the example below) so that .NET can execute them directly. Here’s how:
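import onnxmltools
from onnxmltools.convert.common.data_types import FloatTensorType

# A trained LightGBM model (scikit-learn API); training is omitted here.
sci_kit_model = ...
# Declare one input named 'input': a float tensor with 1 row and 2 features.
onnx_model = onnxmltools.convert_lightgbm(sci_kit_model, initial_types=[('input', FloatTensorType([1, 2]))])
onnxmltools.utils.save_model(onnx_model, 'onnx.model')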
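For intuition about why such a model can be moved between runtimes at all: at inference time a gradient boosting model is essentially a list of decision trees whose leaf scores are summed, so it can be described as plain data and evaluated anywhere. The sketch below is a simplified illustration in plain Python, not the actual ONNX layout. The Node class, the toy trees with their made-up numbers, and the sigmoid with a 0.5 threshold as the binary decision rule are all assumptions made for this example.

```python
import math
from dataclasses import dataclass
from typing import List, Optional, Sequence


@dataclass
class Node:
    """One node of a decision tree; a leaf has `value` set and no children."""
    feature: int = 0               # index into the input vector (split nodes)
    threshold: float = 0.0         # go left when x[feature] <= threshold
    left: Optional["Node"] = None
    right: Optional["Node"] = None
    value: Optional[float] = None  # raw score contributed by a leaf


def tree_score(node: Node, x: Sequence[float]) -> float:
    """Walk from the root to a leaf and return that leaf's raw score."""
    while node.value is None:
        node = node.left if x[node.feature] <= node.threshold else node.right
    return node.value


def predict_label(trees: List[Node], x: Sequence[float]) -> int:
    """Sum the raw scores of all trees, apply a sigmoid, threshold at 0.5."""
    raw = sum(tree_score(t, x) for t in trees)
    probability = 1.0 / (1.0 + math.exp(-raw))
    return int(probability > 0.5)


# Two hand-written toy trees over a 2-feature input (hypothetical numbers).
toy_trees = [
    Node(feature=0, threshold=0.5, left=Node(value=-1.0), right=Node(value=1.0)),
    Node(feature=1, threshold=0.5, left=Node(value=-0.5), right=Node(value=0.5)),
]

print(predict_label(toy_trees, [0.0, 0.0]))  # raw score -1.5, prints 0
print(predict_label(toy_trees, [1.0, 1.0]))  # raw score +1.5, prints 1
```

Nothing in this evaluation depends on Python itself: the trees are just feature indices, thresholds and leaf scores, which is the kind of information a portable model file has to carry.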
The following code fragment shows how to use the ONNX model in C#.
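// ApplyOnnxModel comes from the Microsoft.ML.OnnxTransformer NuGet package.
using System;
using Microsoft.ML;
using Microsoft.ML.Data;

class Program
{
    // Input schema: the field name matches the ONNX input name ('input').
    private class TestData
    {
        [VectorType(2)]
        public float[] input;
    }

    // Output schema: maps the ONNX output 'label' onto a .NET property.
    public class TestOutput
    {
        [ColumnName("label")]
        [VectorType(1)]
        public long[] Label { get; set; }
    }

    static void Main(string[] args)
    {
        var mlContext = new MLContext();
        // An empty data set is enough here: Fit only needs the input schema.
        var data = mlContext.Data.LoadFromEnumerable(new TestData[] { });
        var onnxPredictionPipeline = mlContext.Transforms
            .ApplyOnnxModel("onnx.model");
        var transformer = onnxPredictionPipeline.Fit(data);
        var onnxPredictionEngine = mlContext.Model.CreatePredictionEngine<TestData, TestOutput>(transformer);
        var prediction = onnxPredictionEngine.Predict(new TestData { input = new float[] { 0f, 0f } });
        Console.WriteLine($"label: {string.Join(",", prediction.Label)}");
    }
}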
Pretty easy 😎. Our corporate cat is shocked!
Photo by Ariana Suárez on Unsplash