After you load your data set, verify that the data was ingested correctly. We recommend that you run a few simple queries to confirm that the results make sense. You don’t have to be an expert on the data to do this.
Use the following steps to sanity-check your ingested data.
Total record count and date range
Make sure that the total record count matches the input files.
Then make sure that data is available for the full time period in which you expect to see it.
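To have something to compare against, you can compute the expected record count and date range directly from the input files. The following is a minimal sketch, assuming the input is CSV with a header row and an ISO 8601 timestamp column. The field name `event_time`, the sample rows, and the helper `summarize_source` are hypothetical. Compare the result with what the Interana query reports.

```python
import csv
import io
from datetime import datetime


def summarize_source(csv_text, time_field):
    """Return (record_count, earliest, latest) for CSV text with a header row."""
    count = 0
    earliest = latest = None
    for row in csv.DictReader(io.StringIO(csv_text)):
        count += 1
        ts = datetime.fromisoformat(row[time_field])
        if earliest is None or ts < earliest:
            earliest = ts
        if latest is None or ts > latest:
            latest = ts
    return count, earliest, latest


# Hypothetical sample standing in for the contents of an input file.
SAMPLE = """event_time,user,purchasePrice
2024-01-01T00:00:00,a,1.99
2024-01-15T12:30:00,b,4.99
2024-01-31T23:59:59,a,0.99
"""

count, earliest, latest = summarize_source(SAMPLE, "event_time")
print(count, earliest.date(), latest.date())
```

For real files, read each file's text and add up the counts. If the count or the date range differs from the query result, investigate before you go on.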
Check field types against the data dictionary
Go to Settings > Manage Data > Edit. Make sure that every variable in the data dictionary appears in the list. Then make sure that the groupable and aggregable settings for numerical fields and text fields make sense to you.
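If the data dictionary is long, comparing the two lists by eye is error-prone. The following sketch assumes you have copied the field names from the data dictionary and from the settings page into two lists by hand. The helper `compare_fields` and the field names are hypothetical.

```python
def compare_fields(dictionary_fields, ingested_fields):
    """Return (missing, unexpected) field names, each sorted alphabetically."""
    expected = set(dictionary_fields)
    actual = set(ingested_fields)
    missing = sorted(expected - actual)      # in the dictionary, not ingested
    unexpected = sorted(actual - expected)   # ingested, not in the dictionary
    return missing, unexpected


# Hypothetical field lists. Note the case difference in purchasePrice.
missing, unexpected = compare_fields(
    ["event_time", "user", "purchasePrice", "credit_score"],
    ["event_time", "user", "purchaseprice"],
)
print("Missing:", missing)
print("Unexpected:", unexpected)
```

The comparison is case-sensitive on purpose, so a field that was renamed or lowercased during ingest shows up in both lists.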
Check sample view against the data dictionary
Run a sample view query on Interana. Check every field in the data and compare what you see to what the data dictionary says you should see.
- Is the data type correct?
- Are there unexpected missing values? (Or are all values missing?)
- Do text fields have extra quotes, escaped characters, or other artifacts?
- For timestamp fields, do the values fall within the time range that you expect?
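The questions above can also be applied mechanically to a handful of sample rows. The following is a rough sketch, assuming the sample rows have been copied out of the sample view into Python dictionaries. The helper `check_sample`, the field names, and the expected types are all hypothetical.

```python
from datetime import datetime


def check_sample(rows, expected_types, time_field, start, end):
    """Return a list of human-readable problems found in the sample rows."""
    problems = []
    for field, expected in expected_types.items():
        values = [row.get(field) for row in rows]
        present = [v for v in values if v not in (None, "")]
        if not present:
            problems.append(f"{field}: all values missing")
            continue
        if len(present) < len(values):
            problems.append(f"{field}: {len(values) - len(present)} missing value(s)")
        for v in present:
            if not isinstance(v, expected):
                problems.append(f"{field}: {v!r} is not {expected.__name__}")
            elif isinstance(v, str) and (v.startswith('"') or "\\" in v):
                problems.append(f"{field}: {v!r} has quote or escape artifacts")
    for row in rows:
        ts = row.get(time_field)
        if isinstance(ts, datetime) and not (start <= ts <= end):
            problems.append(f"{time_field}: {ts.isoformat()} outside expected range")
    return problems


# Hypothetical sample rows with a few deliberate problems.
rows = [
    {"event_time": datetime(2024, 1, 5), "user": "a", "purchasePrice": 1.99},
    {"event_time": datetime(2023, 12, 31), "user": '"b"', "purchasePrice": "4.99"},
    {"event_time": datetime(2024, 1, 9), "user": "", "purchasePrice": 0.99},
]
expected_types = {"event_time": datetime, "user": str, "purchasePrice": float}
problems = check_sample(
    rows, expected_types, "event_time", datetime(2024, 1, 1), datetime(2024, 1, 31)
)
for problem in problems:
    print(problem)
```

A script like this only flags candidates. You still need to read the sample rows yourself, because the data dictionary may describe constraints that a type check cannot capture.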
Group and compute
Using the data dictionary as a reference, group by each text field to see its most frequent values, then compute the min, max, average, and median for each numerical field. Make sure that the values computed with Interana are consistent with what you expect.
Although a data dictionary can provide useful information (like “the average value for credit_score is 680”), you will often have to rely on the field definition and use your common sense. For example, there might be a field called purchasePrice. If you compute its average for app purchase data and find that the average is $2,798,023.29, that’s a good indicator that something is wrong and requires further investigation.
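To get reference values for this comparison, you can run the same group and compute steps on a slice of the source data. The following is a minimal sketch using the Python standard library. The helpers `top_values` and `numeric_summary`, the `platform` field, the sample rows, and the `PLAUSIBLE_MAX` threshold are hypothetical.

```python
from collections import Counter
from statistics import mean, median


def top_values(rows, field, n=3):
    """Return the n most frequent values of a text field with their counts."""
    return Counter(row[field] for row in rows).most_common(n)


def numeric_summary(rows, field):
    """Return min, max, mean, and median of a numerical field."""
    values = [row[field] for row in rows]
    return {
        "min": min(values),
        "max": max(values),
        "mean": mean(values),
        "median": median(values),
    }


# Hypothetical app purchase rows. The last price is an obvious outlier.
rows = [
    {"platform": "ios", "purchasePrice": 0.99},
    {"platform": "android", "purchasePrice": 1.99},
    {"platform": "ios", "purchasePrice": 4.99},
    {"platform": "ios", "purchasePrice": 2798023.29},
]

PLAUSIBLE_MAX = 1000.0  # assumed upper bound for an app purchase

print(top_values(rows, "platform"))
summary = numeric_summary(rows, "purchasePrice")
print(summary)
if summary["max"] > PLAUSIBLE_MAX:
    print("purchasePrice has implausibly large values; investigate")
```

Comparing the median with the mean is a quick way to spot this kind of problem: a single bad value moves the mean far more than the median.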