Installation (AEN 4.1.2)#
Overview¶
This installation procedure covers the steps needed to install a basic Anaconda Enterprise Notebooks (AEN) system consisting of a front-end Server, one or more Gateways, and one or more Compute Nodes.
If you have any questions about installation instructions, please contact your sales representative or Priority Support team.
Components¶
The AEN platform consists of three main service groups: AEN Server, AEN Gateway, and AEN Compute. These services can either be distributed across multiple servers (recommended) or run on a single machine.
Server¶
The Server component is the administrative front-end to the system. This is where users log in, where user accounts are stored, where admins manage the system, and where the system interfaces with the database.
The Server is the main entry point for all users. It handles setting up projects and ensuring users are sent to the correct Data Center for a given Project.
Anaconda Enterprise Notebooks uses MongoDB to store internal data. This is typically run on the same host as the Server but can also be deployed on a separate host.
The Server uses NGINX to handle the user-facing web interface.
NGINX acts as a request proxy. The actual Server web process runs on a high-numbered port listening only on localhost, and NGINX forwards requests there. The NGINX server is also responsible for static content.
Gateway¶
The Gateway is a reverse proxy that authenticates users and automatically directs them to the proper AEN Compute machine for their project.
The Gateway provides a single access point to a set of Compute Nodes, and acts as a proxy service to manage authorization and mapping of URLs and ports to services that are running on Compute Nodes, thus providing a consistent uniform interface for the user.
For firewall reasons, you generally need one Gateway for each physical location in your organization that uses AEN.
Users will not notice the Gateway as it automatically routes requests to the proper Compute Node.
Compute Nodes¶
Compute Nodes are where Apps (such as Jupyter Notebook and Workbench) actually run. These are also the hosts that a user sees in a terminal session or when using SSH to access the node. All user-visible programs run here. Each Project is associated with one or more Compute Nodes, and these in turn are part of a single Data Center. Compute Nodes need only be reachable by the AEN Gateway, so they can be completely isolated by a firewall.
Component organization¶
Organizationally, each Anaconda Enterprise Notebooks installation has exactly one Server instance. One or more Gateway instances can be configured, and each Compute Node can connect to only one Gateway. The collection of Compute Nodes served by a single Gateway is referred to as a Data Center. New Data Centers can be added to the AEN installation at any time.
For example, an Anaconda Enterprise Notebooks deployment with two Data Centers, where the first Gateway has a cluster of 20 physical computers and the second Gateway has 30 virtual machines, would have the following complement of services installed and running:
1 AEN Server instance
2 AEN Gateway instances
50 AEN Compute instances (20 + 30)
Anaconda Enterprise Notebooks users interact with the system predominantly through Projects, a set of conda environments, Jupyter Notebooks, and other Apps that can be accessed by a Team of users.
Projects are associated with a single Data Center within the AEN environment. The team of users includes one Owner, who is the user that created the Project.
Since Anaconda Enterprise Notebooks is web-based, it uses configurable HTTP ports on the Server.
Installers¶
The Anaconda Enterprise Notebooks installers are available to paid customers only. If you are interested in a demonstration of Anaconda Enterprise Notebooks, please contact us.
Distributed install¶
In a distributed install the Server and Gateway run on separate hosts.
Single box install¶
In a single-box install the Server and the Gateway run on the same host. Because they are independent services, each needs its own external port.
Installation requirements¶
Ensure you have the proper hardware and software resources before installing AEN.
Hardware requirements¶
See System Requirements for all Anaconda Enterprise hardware requirements.
NOTE: We recommend putting /opt/wakari and /projects on the same filesystem. If the project and conda env directories are on separate filesystems, more disk space will be required on Compute Nodes and performance will be worse.
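As a rough way to confirm the recommendation above, the following sketch compares the device numbers of two paths. The function name same_filesystem is hypothetical, and it assumes GNU coreutils stat, which Red Hat and CentOS provide.

```shell
# Hypothetical helper, not part of AEN.
# Returns 0 if both paths live on the same filesystem, 1 if they do not,
# and 2 if either path cannot be inspected.
same_filesystem() {
    local dev_a dev_b
    # stat -c %d prints the device number of the filesystem holding the path.
    dev_a=$(stat -c %d "$1") || return 2
    dev_b=$(stat -c %d "$2") || return 2
    [ "$dev_a" = "$dev_b" ]
}

# Example use once both directories exist:
#   same_filesystem /opt/wakari /projects && echo "same filesystem"
```

A non-zero result only signals that the directories are on different filesystems (or are missing); it does not by itself mean the installation will fail.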
Software requirements¶
- Red Hat/CentOS versions 6.5 to 7.2 on all nodes (Other Linux distros are supported, but this installation document assumes Red Hat or CentOS.)
- Linux home directories are required since Jupyter looks in $HOME for profiles and extensions.
- /opt/wakari: Ability to install here and at least 10 GB of storage.
- /projects: Size depends on number and size of projects. At least 20 GB of storage.
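The storage figures above can be checked before installing. This is a minimal sketch that interprets "storage" as currently available space reported by df; the function name has_free_gb is hypothetical.

```shell
# Hypothetical helper, not part of AEN.
# Usage: has_free_gb PATH MIN_GB
# Returns 0 if PATH's filesystem has at least MIN_GB gigabytes available,
# 1 if it has less, and 2 if df could not report on PATH.
has_free_gb() {
    local path=$1 min_gb=$2 avail_kb
    # df -P -k prints one POSIX-format line per filesystem in 1024-byte blocks;
    # column 4 of the second line is the available space.
    avail_kb=$(df -P -k "$path" | awk 'NR==2 {print $4}') || return 2
    [ -n "$avail_kb" ] || return 2
    [ "$avail_kb" -ge $((min_gb * 1024 * 1024)) ]
}

# Example use with the minimums listed above:
#   has_free_gb /opt/wakari 10 || echo "/opt/wakari needs at least 10 GB"
#   has_free_gb /projects 20   || echo "/projects needs at least 20 GB"
```

If the directories do not exist yet, pointing the check at their parent mount point (for example /opt) gives the same answer for the filesystem that will hold them.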
Linux system accounts required¶
Some Linux system accounts (UIDs) are added to the system during installation. If your organization requires special actions, here is the list of UIDs:
- mongod (Red Hat) or mongodb (Ubuntu/Debian): Created by the RPM or deb package
- elasticsearch: Created by RPM or deb package
- nginx: Created by RPM or deb package
- AEN_SRVC_ACCT: Created during installation of Anaconda Enterprise Notebooks, and defaults to “wakari”
- ANON_USER: An account such as public or anonymous on the Compute Node. If this user is not found, AEN_SRVC_ACCT will try to create it, and if this fails, projects will fail to start.
- ACL: These directories need the filesystem mounted with POSIX ACL (Access Control List) support (POSIX.1e). Check with mount and tune2fs -l /path/to/filesystem | grep options
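The output of the mount and tune2fs checks above is a list of options, so a small filter can make the result explicit. The sketch below uses a hypothetical function, options_include_acl, that looks for an exact acl entry in an options string. Note that some filesystems enable ACLs without listing acl in the mount output, which is why the default mount options reported by tune2fs are worth checking as well.

```shell
# Hypothetical helper, not part of AEN.
# Usage: options_include_acl "rw,relatime,acl"
# Returns 0 if the comma- or space-separated option list contains "acl".
options_include_acl() {
    # Put each option on its own line, then require a whole-line match
    # so that "noacl" is not mistaken for "acl".
    printf '%s\n' "$1" | tr ', ' '\n\n' | grep -qx 'acl'
}

# Example use with the tune2fs check (run as root, placeholder path kept as-is):
#   opts=$(tune2fs -l /path/to/filesystem | sed -n 's/^Default mount options: *//p')
#   options_include_acl "$opts" && echo "ACL enabled by default"
```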
Additional software requirements¶
AEN Server¶
- MongoDB version: >= 2.6.8 and < 3.0
- NGINX version: >= 1.6.2
- Elasticsearch version: >= 1.7.2
- Oracle JRE 7 or 8
- bzip2
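Version ranges such as the MongoDB requirement above are easy to misread when compared as plain strings (2.6.12 is newer than 2.6.8). The following sketch compares versions with GNU sort -V; the function names version_ge and mongo_version_ok are hypothetical, and how you obtain the installed version string is left to you.

```shell
# Hypothetical helpers, not part of AEN. Assumes GNU sort with -V support.

# version_ge A B: returns 0 when version A >= version B.
version_ge() {
    # sort -V orders version strings numerically per component;
    # if B sorts first (or ties), then A >= B.
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# mongo_version_ok V: returns 0 when 2.6.8 <= V < 3.0.
mongo_version_ok() {
    version_ge "$1" 2.6.8 && ! version_ge "$1" 3.0
}

# Example use:
#   mongo_version_ok 2.6.12 && echo "MongoDB version is supported"
#   version_ge 1.6.2 1.6.2   && echo "NGINX version is supported"
```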
AEN Gateway¶
No additional software prerequisites.
AEN Compute Node¶
- git
- bzip2
- bash (Red Hat default) or zsh
- X Window System
Note: If you don’t want to install the whole X Window System, you still need to install the following packages for R plotting support:
sudo yum install libXrender libXext libXdmcp libSM libICE libXt \
dejavu-sans-fonts dejavu-serif-fonts dejavu-fonts-common \
fontpackages-filesystem
Security requirements¶
- Root or sudo access
- SELinux in Permissive or Disabled mode
One way to change SELinux to permissive or disabled mode is to edit the following file, using either root or sudo access, and set the SELINUX parameter to either permissive or disabled:
/etc/sysconfig/selinux
Change the SELINUX line shown below from enforcing to permissive or disabled, then reboot for the change to take effect:
# This file controls the state of SELinux on the system.
# SELINUX= can take one of these three values:
# enforcing - SELinux security policy is enforced.
# permissive - SELinux prints warnings instead of enforcing.
# disabled - No SELinux policy is loaded.
SELINUX=enforcing
# SELINUXTYPE= can take one of these two values:
# targeted - Targeted processes are protected,
# mls - Multi Level Security protection.
SELINUXTYPE=targeted
Verify changes with getenforce.
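The edit described above can also be scripted. This is a minimal sketch, assuming the file has the layout shown; the function name set_selinux_mode is hypothetical. It writes the result back with cat rather than replacing the file, so that a symbolic link at /etc/sysconfig/selinux is preserved.

```shell
# Hypothetical helper, not part of AEN. Run as root.
# Usage: set_selinux_mode permissive|disabled [CONFIG_FILE]
set_selinux_mode() {
    local mode=$1 file=${2:-/etc/sysconfig/selinux} tmp rc
    case $mode in
        permissive|disabled) ;;
        *) echo "mode must be permissive or disabled" >&2; return 1 ;;
    esac
    tmp=$(mktemp) || return 1
    # "^SELINUX=" matches only the SELINUX line, not SELINUXTYPE= or comments.
    sed "s/^SELINUX=.*/SELINUX=${mode}/" "$file" > "$tmp" && cat "$tmp" > "$file"
    rc=$?
    rm -f "$tmp"
    return $rc
}

# Example use:
#   set_selinux_mode permissive
#   reboot            # then confirm with: getenforce
```

Trying the function on a copy of the file first (by passing the copy as the second argument) is a cheap way to confirm that it changes only the intended line.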
Network/TCP requirements¶
Note that all port numbers are configurable, but defaults are shown below.
Direction | Type | Default Port | Protocol | Optional | Configurable | Comments |
---|---|---|---|---|---|---|
Inbound | TCP | 80 | HTTP or HTTPS | No | Yes | Server |
Inbound | TCP | 8089 | HTTP or HTTPS | No | Yes | Gateway |
Inbound | TCP | 5002 | HTTP | No | Yes | Compute |
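Once the services are running, the default ports in the table can be probed from another host. The sketch below is one possible check, not an AEN tool: the function names aen_default_port and port_open are hypothetical, port_open relies on bash's /dev/tcp redirection and the coreutils timeout command, and the host name in the example is a placeholder.

```shell
# Hypothetical helpers, not part of AEN.

# aen_default_port COMPONENT: prints the default port from the table above.
aen_default_port() {
    case $1 in
        server)  echo 80 ;;
        gateway) echo 8089 ;;
        compute) echo 5002 ;;
        *) echo "unknown component: $1" >&2; return 1 ;;
    esac
}

# port_open HOST PORT: returns 0 if a TCP connection succeeds within 3 seconds.
port_open() {
    # /dev/tcp/HOST/PORT is handled by bash itself, so the probe runs in bash.
    timeout 3 bash -c 'exec 3<>"/dev/tcp/$1/$2"' _ "$1" "$2" 2>/dev/null
}

# Example use (placeholder host name):
#   port_open gateway.example.com "$(aen_default_port gateway)" \
#       && echo "Gateway port reachable"
```

If you changed any port during configuration, pass that port to port_open directly instead of using the default.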
Other requirements¶
Assuming the above requirements are met, there are no additional dependencies necessary for AEN.
Note: While not a requirement for running the software, these instructions use curl or wget to download packages used in the install process. You may use other appropriate means to put the needed files into the installation directory.
Install Steps#
Carry out the procedures linked from the table below to perform a complete install of all Anaconda Enterprise Notebooks components.
The following optional install procedures may need to be performed, depending on how you set up your Data Center:
Additional post-install information: