Access to the Nervana cloud is currently restricted to valid account holders. To acquire login credentials, please email email@example.com.
Once credentials are received, users can interact with the cloud either through calls to our REST API or through our command line interface, called ncloud.
Models are the primary units of interest in the Nervana cloud. These correspond to individual deep neural networks, which are themselves a layered collection of tunable parameters that are adjusted based on input training data (plus target labels), a specified cost function, and an update rule.
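As a rough illustration of how parameters, training data, a cost function, and an update rule fit together, the sketch below fits a single-layer model with plain gradient descent. It is not neon code; the function names, the mean squared error cost, and the learning rate are all assumptions chosen for the example.

```python
# Illustrative sketch only (not neon): one "layer" with two tunable
# parameters, a mean squared error cost, and a gradient descent update rule.

def predict(w, b, x):
    """Forward pass of a one-parameter linear layer."""
    return w * x + b


def mse_cost(w, b, inputs, targets):
    """Cost function: mean squared error over the training data."""
    n = len(inputs)
    return sum((predict(w, b, x) - t) ** 2 for x, t in zip(inputs, targets)) / n


def sgd_step(w, b, inputs, targets, learning_rate):
    """Update rule: move each parameter against its cost gradient."""
    n = len(inputs)
    grad_w = sum(2 * (predict(w, b, x) - t) * x for x, t in zip(inputs, targets)) / n
    grad_b = sum(2 * (predict(w, b, x) - t) for x, t in zip(inputs, targets)) / n
    return w - learning_rate * grad_w, b - learning_rate * grad_b


def train(inputs, targets, epochs=1000, learning_rate=0.05):
    """Repeatedly apply the update rule, starting from zero parameters."""
    w, b = 0.0, 0.0
    for _ in range(epochs):
        w, b = sgd_step(w, b, inputs, targets, learning_rate)
    return w, b


# Training data (inputs) and target labels generated from y = 2x + 1.
inputs = [0.0, 1.0, 2.0, 3.0]
targets = [1.0, 3.0, 5.0, 7.0]
w, b = train(inputs, targets)
print("learned parameters:", w, b)
print("final cost:", mse_cost(w, b, inputs, targets))
```

A real deep network stacks many such layers with nonlinearities between them, but the training loop has the same shape.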
neon, a highly optimized deep learning framework, powers the Nervana cloud. Each neon model is specified in a .yaml or .py file. There are many example network implementations that can be used as a starting point for doing work in the cloud.
This release of Nervana cloud also supports the TensorFlow* framework.
Using calls to API endpoints or ncloud commands, one can start and stop training new models, resume training existing models, and list details about their current status.
Once trained, one can deploy models and then generate predictions from them using new input data. When complete, the user can also undeploy a deployed model.
Users and Tenants
An individual person accessing the cloud is referred to as a user. They are uniquely identified by their access credentials. Users initiate calls to train models and generate predictions.
Each user will be assigned to one (or more) tenants. A tenant is a grouping mechanism that allows multiple users to share compute resources and trained models with each other.
Resources are the compute power (CPUs, GPUs, Nervana Engines) and memory available to you to carry out deep learning model training and inference tasks. Resources are allocated at a tenant level. Currently they are pre-allocated and dedicated to their assigned tenant.
Datasets are collections of data exemplars and target labels that are fed into the model in order to train it. The type and format of the data will vary depending on the use case and network architecture, but common inputs include JPEG image files, short audio snippets, video files, and plain text.
Datasets encapsulate aeon (https://github.com/NervanaSystems/aeon), our open source framework for efficient data loading and transforming. Documentation for aeon is located here: http://aeon.nervanasys.com/index.html/. The basis for aeon is a CSV file called the manifest. The command ncloud dataset upload <path to manifest file> will ingest the contents into the Nervana platform.
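To make the idea of a manifest concrete, here is a minimal sketch that writes a small CSV file pairing each data exemplar with its target. The two-column layout, the file names, and the write_manifest helper are assumptions made for illustration; the authoritative manifest layout is the one described in the aeon documentation and in Custom Datasets.

```python
import csv
import os
import tempfile


def write_manifest(records, manifest_path):
    """Write one CSV row per exemplar: (data file path, target file path).

    The two-column layout is an assumption for this sketch; check the
    aeon documentation for the format your aeon version expects.
    """
    with open(manifest_path, "w", newline="") as f:
        writer = csv.writer(f)
        for data_path, target_path in records:
            writer.writerow([data_path, target_path])
    return manifest_path


# Hypothetical exemplars and targets, invented for the example.
records = [
    ("images/cat_001.jpg", "labels/cat_001.txt"),
    ("images/dog_001.jpg", "labels/dog_001.txt"),
]

manifest_path = os.path.join(tempfile.mkdtemp(), "train_manifest.csv")
write_manifest(records, manifest_path)
print("manifest written to", manifest_path)

# The resulting file is what would be passed to:
#   ncloud dataset upload <path to manifest file>
```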
To ensure datasets can be processed correctly by the cloud, they should be formatted as described in Custom Datasets. If a user already has their data available in a publicly accessible network location (for example, in an Amazon S3 bucket), or Nervana has been given appropriate access credentials to privately held data, existing datasets can be linked for use in the cloud. This is ideal for large datasets. Alternatively, the user can upload new datasets for subsequent use in the cloud.
When training a model, the user can specify an uploaded or linked dataset to be used for training. There may be some delay the first time a dataset is requested, but it is then cached on the cloud worker node, so subsequent accesses are faster.
Often you will have additional data that you want to include but that does not fit the manifest file format. Some examples are vocabulary files for language processing tasks, lookup tables, and pre-processing metadata. Volumes are the appropriate resource type to use for these types of data.
Volumes are mounted to the /data directory with read and write access. In future releases, multiple volumes and configurable mount paths will be supported.
It is not recommended to use volumes as the primary data for training neural networks; datasets are highly optimized for this use case.
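As an example of the kind of auxiliary data a volume might hold, the sketch below loads a vocabulary file from the volume mount point. Only the /data mount path comes from the text above; the vocab.txt file name, its one-token-per-line layout, and the load_vocabulary helper are assumptions. A temporary directory stands in for /data so the example runs outside the cloud.

```python
import os
import tempfile


def load_vocabulary(volume_root="/data", filename="vocab.txt"):
    """Read one token per line and map each token to an integer index.

    volume_root defaults to the /data mount point; filename and the
    one-token-per-line layout are assumptions for this sketch.
    """
    path = os.path.join(volume_root, filename)
    with open(path, encoding="utf-8") as f:
        tokens = [line.strip() for line in f if line.strip()]
    return {token: index for index, token in enumerate(tokens)}


# Stand-in for the mounted volume so the sketch runs locally.
fake_volume = tempfile.mkdtemp()
with open(os.path.join(fake_volume, "vocab.txt"), "w", encoding="utf-8") as f:
    f.write("the\ncat\nsat\n")

vocab = load_vocabulary(volume_root=fake_volume)
print(vocab)
```

Inside a training job, the same helper would be called with its default volume_root.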
Once a model has been suitably trained and deployed, a user can then pass in new (unlabelled) data exemplars, and have predicted labels and other details returned.
In general this involves pre-processing the input data, running it through the model, then collecting the outputs from the last layer of the network.
By default, these network outputs are assumed to be probabilities, and only the largest few are returned (the number is user-specifiable), each along with its index and label (if label information was present in the training dataset).
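The selection step described above can be sketched as follows. This is an illustration of picking the largest outputs, not the cloud's actual implementation; the top_k_predictions name, the result field names, and the sample values are assumptions.

```python
def top_k_predictions(probabilities, k=5, labels=None):
    """Return the k largest outputs with their index and optional label."""
    # Rank output indices from the highest probability to the lowest.
    ranked = sorted(
        range(len(probabilities)),
        key=lambda i: probabilities[i],
        reverse=True,
    )
    results = []
    for i in ranked[:k]:
        results.append({
            "index": i,
            "probability": probabilities[i],
            # Labels are only available if the training dataset had them.
            "label": labels[i] if labels is not None else None,
        })
    return results


# Hypothetical last-layer outputs for a four-class model.
outputs = [0.05, 0.70, 0.10, 0.15]
labels = ["cat", "dog", "bird", "fish"]

top = top_k_predictions(outputs, k=2, labels=labels)
for entry in top:
    print(entry)
```

When no label information is available, passing labels=None returns the same indices and probabilities with the label field left empty.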