This document is a step-by-step walkthrough of how to use the Ingest Wizard to ingest a JSON data file into Interana, and covers the following topics:
- Access the Ingest Wizard
- Name the new table
- Choose a file to ingest
- Choose the Time and Shard Key (actor) columns
- Verify the configuration
- Navigate to the Explorer
- If your data isn't loaded
For a high-level view of the tasks you can perform with the Ingest Wizard, see Ingest Wizard ~ how it works.
Step 1: Access the Ingest Wizard
You must have Interana admin role permissions to be able to ingest data into your Interana cluster.
To bring up the Ingest Wizard, do the following:
- Open a browser window.
- Navigate to https://<cluster_location>/?import.
- Log in with your admin credentials, or the Interana default login credentials:
  - Username: root@localhost
  - Password: root
Step 2: Name the new table
Give your table a name. Note that table names must be unique across the system. Click the Next button.
Step 3: Choose a file to ingest
On the "Select Sample File" page, accept the default options of "Single File" and "Local File" and click Browse to choose a file from your local machine. For this exercise, choose a newline-separated JSON file that is less than 100 MB in size (although you can choose any of our supported file types). Then click Next.
You cannot use the Ingest Wizard to configure a batch import from a local filesystem. To configure a batch import, use the Interana CLI ingest instead.
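Before uploading, it can help to confirm that the sample file really is newline-separated JSON and is under the 100 MB suggested for this exercise. The following is a minimal sketch of such a check; the function name `check_ndjson` is hypothetical, it is not part of any Interana tooling, and it assumes each event is a JSON object on its own line.

```python
import json
import os

MAX_BYTES = 100 * 1024 * 1024  # 100 MB, the size suggested for this exercise


def check_ndjson(path):
    """Return (event_count, problems) for a newline-separated JSON file.

    Hypothetical helper for a local pre-upload check; not an Interana API.
    """
    problems = []
    if os.path.getsize(path) >= MAX_BYTES:
        problems.append("file is 100 MB or larger")
    count = 0
    with open(path, encoding="utf-8") as f:
        for lineno, line in enumerate(f, start=1):
            if not line.strip():
                continue  # ignore blank lines
            try:
                obj = json.loads(line)
            except json.JSONDecodeError as exc:
                problems.append(f"line {lineno}: invalid JSON ({exc.msg})")
                continue
            if not isinstance(obj, dict):
                # Assumption: each event is a JSON object, one per line.
                problems.append(f"line {lineno}: expected a JSON object")
                continue
            count += 1
    return count, problems
```

Calling `check_ndjson("events.json")` (the file name is only an example) returns the number of well-formed events and a list of problems to fix before you click Browse.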
Step 4: Choose the Time and Shard Key (actor) columns
After clicking Next on the previous page, you'll see a busy spinner while Interana: 1) uploads your file to the cluster, and 2) analyzes the file. Once these two steps are finished, you'll arrive at the "Analyze and Transform" page of the wizard where you'll see a tabular preview of your data. Your next step is to choose your Time and Shard Key (aka Actor) columns and then click Next.
Choosing a Time column
Your Time column should indicate the time when the event happened. In the Time column dropdown, Interana will show you a list of columns from your data that Interana was able to interpret as time. These could be Unix epoch time (in seconds, milliseconds or microseconds) or formatted time strings (including ISO-8601 format).
Sometimes Interana cannot detect any time columns in your data, in which case you cannot proceed to the next step of the Ingest Wizard. If your data doesn't have any timestamps because it is not event data, you'll need to choose a different file. Alternatively, if your data encodes time in a format that Interana did not detect automatically, you can choose a format manually. See Handling Time Formats During Ingest for more details.
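To see how the time encodings mentioned above relate to one another, the sketch below converts an epoch value (seconds, milliseconds or microseconds) or an ISO-8601 string to a UTC datetime. The function name `to_utc` and the magnitude thresholds are assumptions made for illustration; this is not Interana's detection logic, only a way to sanity-check a column locally.

```python
from datetime import datetime, timezone


def to_utc(value):
    """Convert an epoch number or an ISO-8601 string to a UTC datetime.

    The magnitude thresholds are an illustrative heuristic, not Interana's
    detection logic; they misclassify epoch values from before 1973.
    """
    if isinstance(value, (int, float)):
        if value >= 1e14:      # treat as microseconds
            seconds = value / 1_000_000
        elif value >= 1e11:    # treat as milliseconds
            seconds = value / 1_000
        else:                  # treat as seconds
            seconds = value
        return datetime.fromtimestamp(seconds, tz=timezone.utc)
    # ISO-8601 string; fromisoformat rejects a trailing "Z" before Python 3.11.
    dt = datetime.fromisoformat(value.replace("Z", "+00:00"))
    if dt.tzinfo is None:
        dt = dt.replace(tzinfo=timezone.utc)  # assumption: naive strings are UTC
    return dt.astimezone(timezone.utc)


print(to_utc(1500000000))              # 2017-07-14 02:40:00+00:00
print(to_utc(1500000000000))           # same instant, given in milliseconds
print(to_utc("2017-07-14T02:40:00Z"))  # same instant, given as ISO-8601
```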
Choosing Shard Key (actor) columns
Your Shard Key columns should represent one or more actors whose behavior you'd like to analyze using Interana. These columns act like indexes for behavioral queries, and you can only use Interana's behavioral functions (like Cohorts, Funnels and Sessions) on these columns.
While you can choose as many Shard Keys as you'd like, Interana stores a full copy of your data for each one, so keep in mind the implications for storage on your system. If you later discover that you're not running any queries against a particular Shard Key, you can do a targeted delete of data for just that shard key to free up space.
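As a rough worked example of the storage implication, the sketch below multiplies the size of one copy of the data by the number of Shard Keys. The function name is hypothetical, and the estimate ignores compression, replication and any other overhead, so treat it as an upper-level approximation rather than a sizing formula.

```python
def estimated_storage_gb(single_copy_gb, shard_key_count):
    """Estimate total storage when one full copy is kept per Shard Key.

    Hypothetical helper; ignores compression, replication and other overhead.
    """
    if shard_key_count < 1:
        raise ValueError("a table needs at least one Shard Key")
    return single_copy_gb * shard_key_count


# A 50 GB data set with three Shard Keys needs roughly 150 GB.
print(estimated_storage_gb(50, 3))  # 150
```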
Step 5: Verify the configuration
On the final page of the wizard, you'll see a summary of your selections. One point of interest here is the Applied Transformers section; even if you did not manually apply any transformers, Interana is still using them under the covers to process your file and you can view the configuration here. Click the Start Import button to go ahead and ingest your data file.
Step 6: Navigate to the Explorer
When the Ingest Wizard completes, Interana ingests your data in the background, and it may take a little time to become available. You'll initially find yourself on the Settings page looking at a list of all tables in the system, including the new one you just created. Click the Explore icon in the top left corner of the navigation bar (it looks like a compass) to navigate to the Explorer, where you can query your data.
Step 7: If your data isn't loaded
You might arrive at the Explorer and run a Count Events query only to notice that you don't seem to have any data in your new table and your Time Scrubber has a loading spinner.
Adjust the Time controls
The first thing to try is to adjust the Time control on the left side so that you're querying the time range covered by the data you just ingested. The control defaults to "7 days ago to now", which might not match the timestamps in your data file.
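If you aren't sure which range your file covers, you can compute it locally before adjusting the Time control. The sketch below scans a newline-separated JSON file and reports the earliest and latest event times. The function name `time_range` is hypothetical, and it assumes the chosen Time column holds Unix epoch seconds; adapt the conversion if your file uses milliseconds or formatted strings.

```python
import json
from datetime import datetime, timezone


def time_range(path, time_column):
    """Return (earliest, latest) UTC datetimes found in time_column.

    Hypothetical helper; assumes the column holds Unix epoch seconds.
    Returns None if no line contains the column.
    """
    earliest = latest = None
    with open(path, encoding="utf-8") as f:
        for line in f:
            if not line.strip():
                continue  # ignore blank lines
            value = json.loads(line).get(time_column)
            if value is None:
                continue  # event without a timestamp
            if earliest is None or value < earliest:
                earliest = value
            if latest is None or value > latest:
                latest = value
    if earliest is None:
        return None
    return (
        datetime.fromtimestamp(earliest, tz=timezone.utc),
        datetime.fromtimestamp(latest, tz=timezone.utc),
    )
```

For example, `time_range("events.json", "ts")` (both names are placeholders) gives the start and end you can enter in the Time control.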
Change to an Unsampled query
If your data file was small (not very many events), a "sampled" query might match no data points, or fewer than you'd expect. Click on Chart Controls and uncheck the Sampled Query option. This will run the query across all data in the system.
The Time Scrubber will eventually catch up
The Time Scrubber at the bottom runs a query over the full time range of your data set in order to show you a summary of where the data lives. Because of this, it sometimes takes a few minutes to catch up after you import new data. If you wait a few minutes and then reload (F5) the Interana Explorer, the Time Scrubber usually populates. Needless to say, this makes it a LOT easier to know which time range to query to see your data.