Using this Guide

You can explore this user guide based on your knowledge level and needs:

  • See Product Overview for a summary of the features and capabilities offered by the Intel® Nervana™ Cloud software.
  • See Getting Started Now to begin using the Intel® Nervana™ Cloud right away. That section covers the Command Line Interface, the REST API, Jupyter* notebooks, and the Web UI.
  • See Basic Concepts for an overview of the concepts relating to deep learning systems and the Cloud software.
  • See Intel® Nervana™ Cloud for more information about this release.

Product Overview

Intel® Nervana™ Cloud contains a comprehensive software suite that enables data scientists to develop custom enterprise-grade deep learning solutions quickly and cost-effectively. The software supports the following features and capabilities:

  • Robust dataset management. Upload labeled data; partition, manage, and track datasets.
  • Dataset storage integration with AWS (cloud). Supports a wide range of data types.
  • Extensive library of state-of-the-art models and datasets for neon, to jump-start your solution.
  • Prototype with Jupyter* interactive notebooks in a Docker* environment. The notebook functions as a Python interpreter: enter and execute code immediately. Access notebooks within your browser for experimentation, debugging, and visualization.
  • Multi-user, multi-tenancy support. Multiple simultaneous users can share platform resources in containerized sandboxes. Resources can be provisioned, managed, and metered per-user and per-tenant.
  • Fast training on multiple GPUs or CPUs, with multi-node distribution across multiple GPUs.
  • Batch training support. Depending on resources, you can run multiple training jobs on different models and datasets. Queued jobs are run automatically as resources become available.
  • Analytics and visualization support with bokeh and matplotlib.
  • Launch multiple hyperparameter experiments, see snapshots of their progress, and view past experiments.
  • Support for the hosted deployment of trained models for inference, in both streaming and batch modes.
  • Export trained models by downloading weight files for offline inference deployments.

Following is a diagram of the major components and their relationships.


Getting Started Now

Getting Access

Before you can use the Nervana Cloud, you need access. Access is currently restricted to valid account holders.

To inquire about login credentials, please email the Nervana Cloud team. After you receive credentials, you can interact with the software using several methods:

  • Using the Command Line Interface: ncloud is a command-line client that helps you use and manage the Nervana Cloud. The CLI supports a complete, rich set of commands for quickly training models; viewing the status and results of your training jobs; importing trained models; deploying a trained model so it can be used to generate predictions (in stream or batch mode); attaching new data for both training and streaming jobs; and, depending on your role, managing tenants, other users, and user jobs. See Getting Started with the CLI for more information.
  • Using the REST API: REST API endpoints expose all objects of the Cloud software, allowing you to develop applications that make use of the Cloud (a minimal request sketch appears after this list). See REST API for more information.
  • Using Jupyter Interactive Notebook: Interactive mode provides a containerized Jupyter notebook with access to a CPU or GPUs, and the same environment used for ncloud model training. This is an effective way to develop new models or customize neon, receive quick feedback on syntax and algorithm correctness, visualize results with matplotlib or bokeh, and accelerate debugging. See interact Commands for more information.
  • Using the Web User Interface: Nearly all of what can be done at the CLI can also be done using the Web UI. The UI provides an intuitive, layered GUI with forms and fields to import, train, and deploy models, and also view visuals of your model training results. Admins and Super Admins can also perform their tasks using the Web UI. See Getting Started with the Web UI for more information.

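As a hedged illustration of the REST approach, the following sketch lists models using Python's requests library. The base URL, endpoint path, and authentication header are assumptions for illustration only; the actual values come from the REST API reference and the credentials issued with your account.

    import requests

    # All values below are illustrative assumptions: the real base URL,
    # endpoint paths, and authentication scheme come from the REST API
    # reference and the credentials issued with your account.
    BASE_URL = "https://cloud.example.com/api/v1"           # hypothetical host
    HEADERS = {"Authorization": "Bearer <your-api-token>"}  # hypothetical auth

    # List models (hypothetical endpoint name) and print each record.
    resp = requests.get(BASE_URL + "/models", headers=HEADERS)
    resp.raise_for_status()
    for model in resp.json():
        print(model)
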
Basic Concepts

Models

Nervana Cloud exists for the purpose of developing, training, and deploying deep learning models. These models generally represent artificial deep neural networks, consisting of an input layer, a cascade of multiple “hidden” layers of processing units for feature extraction and transformation, and an output layer. Each successive layer uses the output from the previous layer as input, and each processing unit in a hidden layer receives input from every processing unit in the previous layer. Inputs to each processing unit include individual tunable parameters that are adjusted based on input training data (including target labels to support supervised learning), a specified cost or error value, and an update rule. The following figure is a conceptual diagram of a model, with one input layer, two hidden layers, and one output layer.


neon is Intel Nervana's highly optimized deep learning framework. Each model in neon is specified in a .yaml or .py file. Many example network implementations are available and can be used as a starting point for doing work in the cloud. As of release 1.8, Intel Nervana Cloud also provides full support for TensorFlow* for model training and inference.
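
To make this concrete, here is a minimal sketch of a neon .py model definition that mirrors the conceptual diagram above (two hidden layers plus an output layer). The layer sizes, backend choice, and optimizer settings are illustrative placeholders, not values prescribed by the Cloud:

    # Minimal neon model sketch: two hidden layers plus a softmax output.
    # Sizes and hyperparameters are illustrative only.
    from neon.backends import gen_backend
    from neon.initializers import Gaussian
    from neon.layers import Affine, GeneralizedCost
    from neon.models import Model
    from neon.optimizers import GradientDescentMomentum
    from neon.transforms import Rectlin, Softmax, CrossEntropyMulti

    be = gen_backend(backend='cpu', batch_size=128)  # 'gpu' on a GPU node

    init = Gaussian(scale=0.01)
    layers = [
        Affine(nout=100, init=init, activation=Rectlin()),  # hidden layer 1
        Affine(nout=100, init=init, activation=Rectlin()),  # hidden layer 2
        Affine(nout=10, init=init, activation=Softmax()),   # output layer
    ]
    mlp = Model(layers=layers)
    cost = GeneralizedCost(costfunc=CrossEntropyMulti())
    opt = GradientDescentMomentum(learning_rate=0.1, momentum_coef=0.9)

    # Training requires a data iterator (for example, one built from an
    # aeon dataset):
    # mlp.fit(train_set, optimizer=opt, num_epochs=10, cost=cost,
    #         callbacks=callbacks)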

Using calls to API endpoints or ncloud commands, you can start and stop training new models, resume training existing models, and list details about their current status. You can deploy trained models and then generate predictions from them using new input data. When complete, you can also “undeploy” a deployed model.

Users and Tenants

An individual person accessing the Cloud is a user. Users are uniquely identified by their access credentials. Users can upload data, initiate calls to train models and generate predictions, and run interactive sessions using notebooks, among other tasks.

Each user is assigned to one (or more) tenants. A tenant is a grouping mechanism that allows multiple users to share compute resources with each other.

Resources

Resources are the compute power (CPUs, GPUs) and memory available for you to carry out deep learning model training and inference tasks. Resources are allocated at a tenant level. Resources are pre-allocated and dedicated to their assigned tenant.

Datasets

Datasets are collections of data exemplars and target labels that are fed into the model in order to train it. The type and format of a dataset vary depending on the use case and network architecture, but common datasets are collections of items such as JPEG image files, short audio snippets, video files, and plain text.

Datasets are handled by aeon, our open-source tool for efficient data loading and transformation; see the aeon documentation for details. The basis for aeon is a CSV file called a manifest. The command ncloud dataset upload <path to manifest file> ingests the manifest's contents into the Nervana platform. (Note that aeon supports the neon framework only.)
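
The exact manifest schema depends on your aeon version and data type, so treat the following as a sketch only: a Python snippet that writes a manifest pairing hypothetical image files with label files, which you would then ingest with the ncloud dataset upload command shown above.

    import csv

    # Sketch: pair each image file with its label file. The column layout
    # and any header conventions depend on your aeon version and data
    # type; consult the aeon documentation. Paths are hypothetical.
    records = [
        ("/data/images/cat_0001.jpg", "/data/labels/cat_0001.txt"),
        ("/data/images/dog_0001.jpg", "/data/labels/dog_0001.txt"),
    ]
    with open("train_manifest.csv", "w", newline="") as f:
        csv.writer(f).writerows(records)

    # Then ingest it:
    #   ncloud dataset upload train_manifest.csv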

To ensure that datasets can be processed by the Cloud, they should be formatted as described in Custom Datasets. If your data is already available in a publicly accessible network location (for example, in an Amazon S3 bucket), or if Nervana has been given appropriate credentials to access privately held data, then you can link these datasets for use in the Cloud. This is ideal for large datasets. Alternatively, you can upload new datasets for use in the Cloud.

At model training time, you can specify an uploaded or linked dataset to be referenced for use in training. There may be some delay the first time a dataset is requested, but the dataset is then cached on the Cloud worker node for subsequent accesses.

Note: dataset commands are not supported in this release. Use volume commands to manage data.

Volumes

Often, you may have additional data that you want to include but that does not fit the manifest file format. Examples include vocabulary files for language-processing tasks, lookup tables, and pre-processing metadata. Volumes are the appropriate resource type for these kinds of data.

Volumes are mounted to the /data directory with read and write access. In future releases, multiple volumes and configurable mount paths will be supported.
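
Because a volume is mounted read/write at /data, jobs can use ordinary file I/O to reach it. A minimal sketch, assuming a hypothetical vocabulary file placed on the volume:

    # Read an auxiliary vocabulary file from the attached volume.
    # /data is the documented mount point; the filename is hypothetical.
    with open("/data/vocab.txt") as f:
        vocab = {line.strip(): idx for idx, line in enumerate(f)}
    print("Loaded", len(vocab), "vocabulary entries")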

Inference

After a model has been trained and deployed, you can pass in new (unlabeled) data exemplars, and have predicted labels and other details returned. This process is called inference.

In general, generating predictions involves pre-processing the input data, running it through the model, and then collecting the results from the last layer of the network.

Batch inference and streaming inference are supported. Batch inference takes a file of input data and returns a file of inference results. With streaming inference, you deploy the model on the cloud, and it processes data as the data arrives.

By default, a model’s generated predictions are assumed to be probabilities, and only the top N predictions are returned (N is user-specifiable), along with their indices and labels (if label information was present in the training dataset).
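
To make the top-N convention concrete, the following sketch selects the k largest probabilities from a model's output vector, together with their indices and labels. The probabilities and label names are invented for illustration:

    import numpy as np

    # Illustrative: pick the top-k entries from a probability vector,
    # mimicking the shape of the results a deployed model returns.
    probs = np.array([0.05, 0.70, 0.10, 0.15])  # softmax output (invented)
    labels = ["cat", "dog", "ship", "truck"]    # from the training dataset
    k = 2

    top_idx = np.argsort(probs)[::-1][:k]  # indices of the k largest values
    for i in top_idx:
        print(i, labels[i], round(float(probs[i]), 2))
    # 1 dog 0.7
    # 3 truck 0.15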