Machine learning operations (MLOps) means operating ML models at scale.
Data science teams moving from research to production need performance, continuous integration, scalability, and feedback.
Scalable: The Skymind Intelligence Layer (SKIL) enables fault-tolerant batch processing, load-balanced real-time inference, and large-scale usage across multiple GPUs or any number of CPU cores.
Transparent: SKIL enables monitoring so that AI models, their performance, and their operations are observable. Model accuracy, throughput, and latency metrics are visible to the user.
Portable: SKIL deploys as an embeddable package on any infrastructure, with support for bare metal, Kubernetes, Linux, and Windows.
Flexible: SKIL can run any machine learning workload and supports any machine learning model.
API clients are available in eight languages, with all of the functions you need to interact with a SKIL cluster.
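    import base64

    import skil_client

    # `client`, `deployment_id`, and `x_in` are assumed to be defined earlier;
    # DeployModel, INDArray, and Prediction are assumed to come from the client package.
    uploads = client.upload("tensorflow_rnn.pb")
    new_model = DeployModel(name="recommender_rnn", scale=30, file_location=uploads.path)
    model = client.deploy_model(deployment_id, new_model)

    # Wrap the base64-encoded input array and request a prediction from the deployed model.
    ndarray = INDArray(array=base64.b64encode(x_in))
    input = Prediction(id=1234, prediction=ndarray, needsPreProcessing=False)
    result = client.predict(input, "production", "recommender_rnn")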
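The snippet above passes a base64-encoded array to the client. As a rough illustration of what that encoding step involves, the sketch below packs a list of floats into bytes and base64-encodes them using only the Python standard library. The helper names `encode_floats` and `decode_floats` are hypothetical, and the little-endian 32-bit layout is an assumption made for the example; the exact byte layout that SKIL expects is not specified here.

```python
import base64
import struct


def encode_floats(values):
    """Pack floats as little-endian 32-bit values and base64-encode the bytes.

    The byte layout is an assumption for illustration, not SKIL's documented format.
    """
    raw = struct.pack("<%df" % len(values), *values)
    return base64.b64encode(raw).decode("ascii")


def decode_floats(encoded):
    """Reverse encode_floats: base64-decode, then unpack 32-bit floats."""
    raw = base64.b64decode(encoded)
    count = len(raw) // 4  # each 32-bit float occupies 4 bytes
    return list(struct.unpack("<%df" % count, raw))


# Values chosen to be exactly representable as 32-bit floats.
x_in = [0.5, 1.0, -2.25]
payload = encode_floats(x_in)
restored = decode_floats(payload)
```

Base64 turns arbitrary bytes into ASCII text, which is why it is a common choice for carrying binary array data inside a JSON request body.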
Integrated with Hadoop and Spark, SKIL is designed to be used in business environments on distributed GPUs and CPUs on-prem, in the cloud, or hybrid.
Script your workflow just like a continuous integration server. Set up versioning of trained models, periodically retrain on feedback data, and roll back when new versions underwhelm.
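The versioning and rollback workflow can be sketched in a few lines of plain Python. This is a minimal in-memory illustration, not SKIL's API: `ModelVersion`, `ModelRegistry`, `deploy_with_guard`, the file paths, and the accuracy figures are all hypothetical, and a single accuracy number stands in for whatever evaluation a real pipeline would run.

```python
from dataclasses import dataclass


@dataclass
class ModelVersion:
    version: int
    path: str
    accuracy: float


class ModelRegistry:
    """Hypothetical in-memory registry of deployed model versions."""

    def __init__(self):
        self.history = []       # versions that are or were active, oldest first
        self._next_version = 1

    @property
    def active(self):
        """The most recently deployed version that has not been rolled back."""
        return self.history[-1] if self.history else None

    def deploy(self, path, accuracy):
        """Record a new version and make it the active one."""
        candidate = ModelVersion(self._next_version, path, accuracy)
        self._next_version += 1
        self.history.append(candidate)
        return candidate

    def rollback(self):
        """Discard the active version and reactivate the one before it."""
        if len(self.history) < 2:
            raise RuntimeError("no earlier version to roll back to")
        self.history.pop()
        return self.active

    def deploy_with_guard(self, path, accuracy):
        """Deploy, then roll back if the new version scores below the previous one."""
        previous = self.active
        candidate = self.deploy(path, accuracy)
        if previous is not None and candidate.accuracy < previous.accuracy:
            return self.rollback()
        return candidate


registry = ModelRegistry()
registry.deploy_with_guard("models/rnn_v1.pb", 0.91)
registry.deploy_with_guard("models/rnn_v2.pb", 0.88)  # scores lower, so it is rolled back
after_bad_release = registry.active.version
registry.deploy_with_guard("models/rnn_v3.pb", 0.93)  # scores higher, so it stays active
after_good_release = registry.active.version
```

Keeping the full history rather than only the active version is what makes rollback cheap: reverting is a matter of reactivating an earlier entry instead of retraining.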
The cluster architecture follows best practices for protecting system state from node failure. It integrates with best-in-class tooling such as Apache ZooKeeper to support leader election and high availability.
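To make the leader-election idea concrete, here is a toy in-memory simulation of the well-known ZooKeeper recipe in which each node registers under an increasing sequence number and the lowest surviving number is the leader. It does not talk to ZooKeeper and is not SKIL's implementation; whether SKIL uses this particular recipe is an assumption, and `ElectionGroup` and the node names are hypothetical.

```python
import itertools


class ElectionGroup:
    """In-memory stand-in for an election path (not a real ZooKeeper client)."""

    def __init__(self):
        self._counter = itertools.count()  # monotonically increasing sequence numbers
        self._members = {}                 # sequence number -> node name

    def join(self, node_name):
        """Register a node and return its sequence number."""
        seq = next(self._counter)
        self._members[seq] = node_name
        return seq

    def leave(self, seq):
        """Remove a node, as would happen when its session ends after a failure."""
        self._members.pop(seq, None)

    def leader(self):
        """The member holding the lowest sequence number, or None if the group is empty."""
        if not self._members:
            return None
        return self._members[min(self._members)]


group = ElectionGroup()
a = group.join("node-a")
b = group.join("node-b")
c = group.join("node-c")
first_leader = group.leader()   # node-a joined first, so it leads

group.leave(a)                  # simulate node-a failing
second_leader = group.leader()  # leadership passes to the next-lowest member
```

Because sequence numbers only ever increase, a node that fails and rejoins goes to the back of the line, so leadership moves in a predictable order without any negotiation between the survivors.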
| Deeplearning4j | Deep Learning for the JVM on Hadoop & Spark | ✔ | ✔ |
| TensorFlow | Importing Pre-Trained Models from TensorFlow | ✔ | ✔ |
| Keras | Importing Pre-Trained Models from Keras | ✔ | ✔ |
| DataVec | Data ETL, Normalization, and Vectorization | ✔ | ✔ |
| ND4J | High-Performance Linear Algebra CPU and GPU Library on the JVM | ✔ | ✔ |
| RL4J | Reinforcement Learning Algorithms | ✔ | ✔ |
| Model Server | Integrated Model Hosting, Management, and Version Control | LIMITED | ✔ |
| Model Import | Importing Pre-Trained Models from TensorFlow and Keras | ✔ | ✔ |
| Workspaces | Notebook System for Model Construction and Collaboration | LIMITED | ✔ |
| Hardware Acceleration | Managed CUDA for GPU and MKL for CPU | ✔ | ✔ |
| Integration Tooling | Native Integration with CDH and HDP | ✔ | ✔ |
| Somatic | Sensor Vision and Control Integration for Robotics | ✖ | ✔ |
| Online Community | Access to Community Forum, Videos, and Documentation | ✔ | ✔ |
| Development Support | General Feature Engineering and Model Tuning Advice | ✖ | ✔ |
| SLA | Guaranteed Uptime and Response Times | ✖ | ✔ |