Multi-Node Cluster [v5]
A multi-node cluster consists of a set of connected systems (nodes) that work together and in many ways can be viewed as a single system. The nodes of a cluster are usually connected through local area networks, with each node running its own instance of the same operating system.
A Scuba multi-node cluster consists of the following nodes that can each be deployed on a separate device, or grouped (as stacked services) on two or more devices:
Config Node: The node from which you administer the cluster. The MySQL database is installed only on this node and stores Scuba metadata. Configure this node first.
API Node: Serves the Scuba application, merges query results from the data and string nodes, and then presents those results to the user. Nginx is installed only on the API node.
Import Node: Connects to data repositories (S3, Azure, local file system), downloads new files, processes the data, and then sends it to the data and string tiers, as appropriate.
Data Node: Data storage that must have enough space to accommodate all events and to stream the results of simultaneous queries.
String Node: String storage for the active strings in the dataset, stored in a compressed format. Requires sufficient memory to hold the working set of strings accessed during queries.
Listener Node: Streams live data from the web or cloud. Also known as streaming ingest. This node is optional during installation.
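The role list above can be sketched as a small planning check. This is a minimal illustration only, not part of the Scuba installer: the Host class, the validate_layout and extra_software helpers, and the lowercase role names are all hypothetical. It encodes only what the list states: five roles are needed, the listener role is optional, MySQL belongs to the config node, and Nginx belongs to the API node.

```python
from dataclasses import dataclass, field

# Role names mirror the node list above; the identifiers are hypothetical.
REQUIRED_ROLES = {"config", "api", "import", "data", "string"}
OPTIONAL_ROLES = {"listener"}  # the listener node is optional during installation


@dataclass
class Host:
    """A device in the planned cluster and the node roles stacked on it."""
    name: str
    roles: set = field(default_factory=set)


def validate_layout(hosts):
    """Return a list of problems in a proposed layout (empty if none found)."""
    problems = []
    known = REQUIRED_ROLES | OPTIONAL_ROLES
    assigned = set()
    for host in hosts:
        unknown = host.roles - known
        if unknown:
            problems.append(f"{host.name}: unknown roles {sorted(unknown)}")
        assigned |= host.roles & known
    missing = REQUIRED_ROLES - assigned
    if missing:
        problems.append(f"missing required roles: {sorted(missing)}")
    return problems


def extra_software(host):
    """Role-specific software named in the list above for a given host."""
    software = []
    if "config" in host.roles:
        software.append("mysql")  # MySQL is installed only on the config node
    if "api" in host.roles:
        software.append("nginx")  # Nginx is installed only on the API node
    return software
```

For example, a two-device plan that stacks config, api, and import on one host and data and string on the other passes validate_layout with no problems, and only the first host is reported as needing MySQL and Nginx.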