A multi-node cluster consists of a set of connected systems (nodes) that work together and, in many ways, can be viewed as a single system. The nodes of a cluster are usually connected through a local area network, with each node running its own instance of the same operating system.

A Scuba multi-node cluster consists of the following nodes that can each be deployed on a separate device, or grouped (as stacked services) on two or more devices:

  • Config Node: The node from which you administer the cluster. The MySQL database that stores Scuba metadata is installed only on this node. Configure this node first.

  • API Node: Serves the Scuba application, merges query results from the data and string nodes, and then presents those results to the user. Nginx is installed only on this node.

  • Import Node: Connects to data repositories (S3, Azure, local file system), downloads new files, processes the data, and then sends it to the data and string tiers, as appropriate.

  • Data Node: Data storage that must have enough space to accommodate all events and stream simultaneous query results.

  • String Node: String storage for the active strings in the dataset, stored in a compressed format. Requires sufficient memory to hold the working set of strings accessed during queries.

  • Listener Node: Streams live data from the web or cloud. Also known as streaming ingest. This node is optional during installation.
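The roles above, and the option to stack several of them on one device, can be sketched as a small layout check. This is only an illustration and not part of Scuba: the names `Role`, `REQUIRED_ROLES`, and `check_layout`, and the device names, are hypothetical, and the rule that every role except the listener must be present is an assumption drawn from the list above.

```python
from collections import Counter
from enum import Enum


class Role(Enum):
    """Node roles named in the list above (the enum itself is hypothetical)."""
    CONFIG = "config"
    API = "api"
    IMPORT = "import"
    DATA = "data"
    STRING = "string"
    LISTENER = "listener"  # optional during installation


# Assumption: every role not described as optional must appear at least once.
REQUIRED_ROLES = frozenset(Role) - {Role.LISTENER}


def check_layout(layout):
    """Return a list of problems for a {device name: set of roles} mapping."""
    counts = Counter(role for roles in layout.values() for role in roles)
    # Iterate over Role so that problems are reported in a stable order.
    return [
        f"missing {role.value} node"
        for role in Role
        if role in REQUIRED_ROLES and counts[role] == 0
    ]


# A stacked deployment: five required roles grouped on two devices.
stacked = {
    "device-1": {Role.CONFIG, Role.API, Role.IMPORT},
    "device-2": {Role.DATA, Role.STRING},
}
print(check_layout(stacked))  # []
```

A layout that omits a required role, such as one with no string node, would return a message like "missing string node", while leaving out the listener role returns no problems.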
