AI Infrastructure for Enterprise

Machine learning operations (MLOps) means operating ML models at scale.

A Scalable AI Model Server

Data science teams moving from research to production need performance, continuous integration, scalability, and feedback.

Scalable: The Skymind Intelligence Layer (SKIL) enables fault-tolerant batch processing, load-balanced real-time inference, and large-scale usage across multiple GPUs or any number of cores.

Transparent: SKIL enables monitoring so that AI models, their performance, and their operations are observable. Model accuracy, throughput, and latency metrics are visible to the user.

Portable: SKIL deploys an embeddable package on any infrastructure, with support for bare metal, Kubernetes, Linux, and Windows.

Flexible: SKIL can run any machine learning workload and supports any machine learning model.

Major Libraries Supported

Upload, Deploy, and Query in fewer than 10 lines of code

API clients are available in 8 languages, with all of the functions you need to interact with a SKIL cluster.

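          import base64
          import skil_client
          from skil_client import DeployModel, INDArray, Prediction

          # Assumes `client` is an authenticated SKIL API client, `deployment_id`
          # identifies an existing deployment, and `x_in` holds the raw input bytes.
          uploads = client.upload("tensorflow_rnn.pb")
          new_model = DeployModel(name="recommender_rnn", scale=30, file_location=uploads[0].path)
          model = client.deploy_model(deployment_id, new_model)
          ndarray = INDArray(array=base64.b64encode(x_in))
          request = Prediction(id=1234, prediction=ndarray, needsPreProcessing=False)
          result = client.predict(request, "production", "recommender_rnn")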
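
The snippet above base64-encodes `x_in` without showing where those bytes come from. As a minimal sketch, assuming the input is a flat list of floats packed as big-endian 32-bit values (the byte layout SKIL actually expects is not stated here, and `encode_features` and `decode_features` are hypothetical helpers, not part of the SKIL client), the encoding step could look like this:

```python
import base64
import struct


def encode_features(values):
    """Pack floats as big-endian float32 bytes, then base64-encode them."""
    raw = struct.pack(f">{len(values)}f", *values)
    return base64.b64encode(raw)


def decode_features(encoded):
    """Reverse encode_features: base64-decode, then unpack float32 values."""
    raw = base64.b64decode(encoded)
    count = len(raw) // 4  # each float32 occupies 4 bytes
    return list(struct.unpack(f">{count}f", raw))


# Illustrative input only; a real feature vector would come from your pipeline.
x_in_encoded = encode_features([0.5, 1.0, -2.0])
```

Round-tripping the payload through `decode_features` is a cheap way to check the encoding before sending anything to a server.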

Integrated with Hadoop and Spark, SKIL is designed to be used in business environments on distributed GPUs and CPUs on-prem, in the cloud, or hybrid.

Management UI

1-Click Deployment

  • Faster time to market
  • Track model progress and share experiments, then deploy the best to production
  • AI model server with load balancer, zero-downtime updates, and timed updates

Critical integrations

  • API clients in 8 languages, including Python, C#, and Java
  • Apache ZooKeeper for fault tolerance and state
  • Query using HTTP or Thrift RPC

AWS-like ML platform

  • Bare-metal or cloud deployment
  • Managed Spark/GPU cluster
  • Distributed, hybrid CPU/GPU resource management
  • Multi-region self-healing fault tolerance

Continuous Deployment and Versioning

Script your workflow just like a continuous integration server. Set up versioning of trained models, periodically train on feedback data, and roll back when new versions underperform.

  • Versioning of deployed models
  • Store evaluation results
  • Roll back when performance degrades
  • "Cron jobs" for online learning/batch training
  • Integrate feedback data for retraining
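
The versioning and rollback workflow listed above can be illustrated with a small in-memory sketch. This is not SKIL's implementation or API: `ModelVersion`, `ModelRegistry`, the file paths, and the accuracy figures are all hypothetical, chosen only to show how stored evaluation results can gate a deployment.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ModelVersion:
    number: int
    path: str
    accuracy: float  # evaluation result stored alongside the version


@dataclass
class ModelRegistry:
    versions: List[ModelVersion] = field(default_factory=list)
    active: Optional[ModelVersion] = None

    def register(self, path: str, accuracy: float) -> ModelVersion:
        """Record a newly trained model together with its evaluation result."""
        version = ModelVersion(len(self.versions) + 1, path, accuracy)
        self.versions.append(version)
        return version

    def deploy(self, candidate: ModelVersion) -> ModelVersion:
        """Activate the candidate unless it scores below the active version."""
        if self.active is not None and candidate.accuracy < self.active.accuracy:
            return self.active  # keep serving the previous version
        self.active = candidate
        return self.active


registry = ModelRegistry()
v1 = registry.register("models/rnn_v1.pb", accuracy=0.91)
registry.deploy(v1)
v2 = registry.register("models/rnn_v2.pb", accuracy=0.87)  # retrained on feedback data
served = registry.deploy(v2)  # v2 underperforms, so v1 stays active
```

A scheduled job could call `register` and `deploy` after each retraining run. Because every version and its score are kept, the history needed for a later rollback is always available.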

Scalability and Fault Tolerance

The cluster architecture follows best practices for protecting system state from node failure. It integrates with best-in-class tooling such as Apache ZooKeeper to support leader election and high availability.

  • Recover from node failures
  • Automatic load balancing across instances
  • Leader election and automatic standby
  • Any JDBC-compatible integration for backup
  • Continuous heartbeat and process checking
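
To make the heartbeat and leader-election ideas above concrete, here is a simplified, single-process sketch. It does not use ZooKeeper and is not how SKIL implements failover. The `Cluster` class, the node names, and the 5-second timeout are assumptions for illustration, and the rule "the live node with the lowest name leads" is just one simple deterministic choice.

```python
from typing import Dict, List, Optional


class Cluster:
    """Tracks node heartbeats and picks a leader among nodes seen recently."""

    def __init__(self, timeout: float = 5.0):
        self.timeout = timeout  # seconds without a heartbeat before a node is considered down
        self.last_seen: Dict[str, float] = {}

    def heartbeat(self, node: str, now: float) -> None:
        """Record that a node reported in at time `now`."""
        self.last_seen[node] = now

    def live_nodes(self, now: float) -> List[str]:
        """Return nodes whose last heartbeat is within the timeout, sorted by name."""
        return sorted(n for n, t in self.last_seen.items() if now - t <= self.timeout)

    def leader(self, now: float) -> Optional[str]:
        """The first live node leads; the rest act as standbys."""
        live = self.live_nodes(now)
        return live[0] if live else None


cluster = Cluster(timeout=5.0)
for node in ("node-a", "node-b", "node-c"):
    cluster.heartbeat(node, now=0.0)
leader_before = cluster.leader(now=1.0)

# node-a stops reporting; the remaining nodes keep sending heartbeats.
cluster.heartbeat("node-b", now=6.0)
cluster.heartbeat("node-c", now=6.0)
leader_after = cluster.leader(now=7.0)  # node-a has timed out, so a standby takes over
```

Passing `now` explicitly keeps the sketch deterministic. A production system would rely on a coordination service for this, since local clocks and a single process cannot provide the same guarantees.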



Feature | Description | Community | Enterprise
Supported Libraries
Deeplearning4j | Deep Learning for the JVM on Hadoop & Spark | |
TensorFlow | Importing Pre-Trained Models from TensorFlow | |
Keras | Importing Pre-Trained Models from Keras | |
DataVec | Data ETL Normalization and Vectorization | |
ND4J | High Performance Linear Algebra CPU and GPU Library on the JVM | |
RL4J | Reinforcement Learning Algorithms | |
SKIL Platform
Model Server | Integrated Model Hosting, Management, and Version Control | LIMITED |
Model Import | Importing Pre-Trained Models from TensorFlow and Keras | |
Workspaces | Notebook System for Model Construction and Collaboration | LIMITED |
Hardware Acceleration | Managed CUDA for GPU and MKL for CPU | |
Integration Tooling | Native Integration with CDH and HDP | |
Somatic | Sensor Vision and Control Integration for Robotics | |
Online Community | Access to Community Forum, Videos, and Documentation | |
Development Support | General Feature Engineering and Model Tuning Advice | |
SLA | Guaranteed Uptime and Response Times | |
Cost | | Free | Contact Us