Optimum cluster performance depends on maintaining a sufficient margin of available disk space. For example, once disk usage on the string tier exceeds the total amount of available memory, imports across the cluster stop altogether, because new events can no longer be added to the string tier. Performance also suffers when the data tier becomes overloaded.
The Usage page in the Interana user interface (UI) provides a quick way to monitor cluster disk space and data usage. This article shows how to access the Usage page and explains each of its features and how to use them.
Introducing the Usage page
The Usage page shows information about your datasets, including data usage and event counts. The Usage page provides the following capabilities:
- See how much space is available on your instance and use that information to make decisions about whether you can import more tables, whether you need to roll off data, or whether you need to increase the cluster size.
- Figure out which columns are using the most space so you can decide how to manage them; for example, whether to omit them or use a different transformation method on them.
- See which strings have especially high cardinality.
- Check the total event counts for your data.
Accessing the Usage page
To access the Usage page, edit the URL of your instance and add /?resourceusage after the name of your Interana instance.
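The URL edit described above can be sketched in a few lines of Python. This is only an illustration: the function name usage_page_url and the host name mycluster.example.com are hypothetical, and the only part taken from this article is the /?resourceusage suffix.

```python
from urllib.parse import urlsplit, urlunsplit


def usage_page_url(instance_url: str) -> str:
    """Build the Usage page URL from any URL on an Interana instance.

    Keeps only the scheme and host of the given URL, then appends
    the /?resourceusage suffix described in this article.
    """
    parts = urlsplit(instance_url)
    # Path is reset to "/" and the query string to "resourceusage".
    return urlunsplit((parts.scheme, parts.netloc, "/", "resourceusage", ""))


# Hypothetical instance name, used only for illustration.
example_url = usage_page_url("https://mycluster.example.com")
```

Passing a deeper link on the same instance (for example, a URL that already has a path or query string) yields the same result, because only the scheme and host are kept.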
Working with the Usage page
The Usage page displays detailed information about your datasets and data usage. At the top of the page, a timestamp shows when the usage statistics were cached.
The Event Count shows the sampled event count for each dataset over the full time range of that dataset.
In the next section, Data Usage, you can show usage information for All Servers or Most Utilized Server.
- All Servers: Displays the total storage across all of your data and string servers.
- Most Utilized Server: Displays information about the server that is the most heavily used by your Interana instance. Of all of the servers that you are using, this is the server that is closest to running out of space.
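The difference between the two views can be illustrated with a small Python sketch. The ServerUsage type, the server names, and the numbers are all hypothetical, and the sketch assumes that "closest to running out of space" means the highest fraction of capacity in use; the Usage page may compute this differently.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class ServerUsage:
    """Hypothetical per-server disk figures, in gigabytes."""
    name: str
    used_gb: float
    capacity_gb: float

    @property
    def utilization(self) -> float:
        # Fraction of this server's capacity that is in use.
        return self.used_gb / self.capacity_gb


def all_servers_total(servers: List[ServerUsage]) -> Tuple[float, float]:
    """'All Servers' view: total used and total capacity across servers."""
    used = sum(s.used_gb for s in servers)
    capacity = sum(s.capacity_gb for s in servers)
    return used, capacity


def most_utilized_server(servers: List[ServerUsage]) -> ServerUsage:
    """'Most Utilized Server' view: the server with the highest utilization.

    Assumption: utilization is measured as used / capacity.
    """
    return max(servers, key=lambda s: s.utilization)


# Made-up figures for illustration only.
servers = [
    ServerUsage("data-1", used_gb=400, capacity_gb=1000),
    ServerUsage("data-2", used_gb=900, capacity_gb=1000),
    ServerUsage("string-1", used_gb=100, capacity_gb=500),
]
total_used, total_capacity = all_servers_total(servers)
hottest = most_utilized_server(servers)
```

In this made-up example the cluster as a whole looks comfortable, yet one server is at 90 percent, which is why the Most Utilized Server view is worth checking before you import more data.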
The Cache value shows the disk space used by the aggregation cache. If necessary, some of this space can be reclaimed for data storage.
The table at the bottom of the page provides detailed information about the columns in your datasets. You can choose to show information about String or Data columns.
Columns of types Int and Time exist only on the data server. String columns exist on both the string and data servers: the literal strings are stored on the string server, and the int lookup value for each string is stored on the data server. If you toggle between the Strings and Data buttons when viewing the columns, you will see more columns in the Data view, but the string columns (along with the string samples) show up in both views.
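The split between literal strings and int lookup values resembles dictionary encoding. The sketch below is a conceptual illustration of that idea only, not Interana's actual storage format; the class and attribute names are hypothetical.

```python
from typing import Dict, List


class DictionaryEncodedColumn:
    """Conceptual model of a string column split across two tiers."""

    def __init__(self) -> None:
        # "String server" side: each distinct literal string, stored once.
        self.id_to_string: List[str] = []
        self.string_to_id: Dict[str, int] = {}
        # "Data server" side: one int lookup value per event.
        self.encoded: List[int] = []

    def append(self, value: str) -> None:
        """Store one event's value, adding the string to the dictionary if new."""
        if value not in self.string_to_id:
            self.string_to_id[value] = len(self.id_to_string)
            self.id_to_string.append(value)
        self.encoded.append(self.string_to_id[value])

    def decode(self, row: int) -> str:
        """Look up the literal string for the event at the given row."""
        return self.id_to_string[self.encoded[row]]


column = DictionaryEncodedColumn()
for browser in ["chrome", "safari", "chrome"]:
    column.append(browser)
```

Here the data side holds three small integers while the string side holds only two literals, which suggests why a string column appears in both views and why high-cardinality strings put pressure on the string tier.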
These columns apply only to string data:
- Unique Values: the number of unique values of this column
- Samples: lists up to three distinct examples of the data from the column
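As a rough illustration of what these two fields represent, the following Python sketch computes them for an in-memory list of values. The function name and the choice of the first three distinct values as samples are assumptions; the Usage page may pick its samples differently.

```python
from typing import Dict, List, Union


def string_column_stats(
    values: List[str], max_samples: int = 3
) -> Dict[str, Union[int, List[str]]]:
    """Return the unique-value count and up to max_samples distinct examples."""
    # dict.fromkeys removes duplicates while preserving first-seen order.
    distinct = list(dict.fromkeys(values))
    return {
        "unique_values": len(distinct),
        "samples": distinct[:max_samples],
    }


stats = string_column_stats(["a", "b", "a", "c", "d"])
```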
Show servers: Displays the name of the server on which the data resides. For example, if you have a large column and want to see how its storage is distributed across servers, select this option to view that information.
Show table copies: For each shard key that you have defined, Interana creates a dedicated copy of your data. Use this option to display the shard key associated with each column; you can then use the column filter options to organize the table by the shard keys.