If an ingest is in progress and you find a problem with data that has already been imported, you can pause the ingest, delete the interval of problem data, re-import only that interval, and then resume the ingest job. This document describes how to perform this task.
Before you begin, gather the following information:
- Time range of the data to delete, and whether it is expressed in PST or UTC
- Table from which to delete the data
Deleting and reingesting an interval of data
You can use Explorer (the time view in the UI) to view the range of data you want to delete. Once the data is deleted, the event count for that time frame is zero.
Times are specified as epoch time in milliseconds. For more information, see the Interana CLI ingest Quick Start.
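Because the commands in this procedure take epoch milliseconds, it can help to convert a human-readable timestamp first. The following is a minimal sketch, not part of the Interana CLI: it assumes GNU date (the -d option; BSD and macOS date use different flags), and the function name to_epoch_ms and the example dates are hypothetical.

```shell
#!/usr/bin/env bash
# Convert a human-readable timestamp to epoch milliseconds.
# Assumption: GNU date is available (the -d option is GNU-specific).
to_epoch_ms() {
  local timestamp="$1"
  local zone="${2:-UTC}"   # optional IANA zone name; defaults to UTC
  local seconds
  # Interpret the timestamp in the given zone and get epoch seconds.
  seconds=$(TZ="$zone" date -d "$timestamp" +%s) || return 1
  # Epoch seconds to epoch milliseconds.
  echo $(( seconds * 1000 ))
}

# Hypothetical one-day interval, expressed in UTC.
start_time=$(to_epoch_ms "2021-03-01 00:00:00")
end_time=$(to_epoch_ms "2021-03-02 00:00:00")
echo "start_time=$start_time end_time=$end_time"
# prints: start_time=1614556800000 end_time=1614643200000
```

If your time range is in Pacific time, pass a zone name as the second argument, for example to_epoch_ms "2021-03-01 00:00:00" America/Los_Angeles. This relies on the system time zone database being installed.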
To delete and reingest data, do the following:
- Log in to the push node of the Interana cluster.
- Find the ID of the job, or jobs, to be paused and make a note of them. These are the jobs that specify the table from which data is to be deleted.
ia job list --unsafe -s running | grep "<table_name>"
- Pause the job, or jobs, using the IDs from step 2. Enter multiple job IDs in a space-separated list, then confirm the jobs are paused.
ia job pause <job_id1 job_id2 job_id3> --unsafe
ia job list --unsafe -s running | grep "<table_name>"
- Confirm that the time range you want to delete is correct, then delete the interval of data.
ia table delete-time-range <table_name> <start_time> <end_time> --unsafe
ia table delete-time-range <table_name> <start_time> <end_time> --unsafe --run
- Verify in the UI that the data was deleted. The event count should be zero.
- Create a new one-time job to re-ingest the data, using a copy_id override. Specifying a copy_id indicates that you want to process the files again even though they have already been imported. Optionally, set overrides for max_concurrent_batches and concat_file_size to throttle backfill jobs on clusters with a heavy import load.
ia job create <job_id4_pipeline> onetime <start_time> <end_time> -p -o copy_id <copy_id> -o max_concurrent_batches 1 -o concat_file_size 200000000 --unsafe
- Resume the newly created one-time job.
ia job resume <job_id4> --unsafe
- Resume the job, or jobs, that you paused in step 3. Enter multiple job IDs in a space-separated list.
ia job resume <job_id1 job_id2 job_id3> --unsafe
- Verify the status of the resumed jobs.
ia job stats <job_id1>
ia job stats <job_id2>
ia job stats <job_id3>
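The pause, delete, and resume steps above can be wrapped in a small script with a dry-run guard, so that you can review the exact commands before anything is executed. This is only a sketch under stated assumptions: the table name, job IDs, and time range are hypothetical placeholders, the helper functions (run, confirm, pause_and_delete, resume_and_check) are illustrative and not part of the ia CLI, and the delete command without --run is assumed to act as a check before deletion, as the order of the commands in the procedure suggests. Creating and resuming the one-time re-ingest job is left as a manual step, because the new job ID is known only after the job is created.

```shell
#!/usr/bin/env bash
set -euo pipefail

# DRY_RUN=1 (the default here) only prints each command; set DRY_RUN=0 to execute.
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical example values; replace them with your own.
TABLE="my_table"
START_MS=1614556800000
END_MS=1614643200000
PAUSED_JOBS=(101 102)

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Ask for confirmation when executing for real; skipped in dry-run mode.
confirm() {
  [ "$DRY_RUN" = "1" ] && return 0
  local reply
  read -r -p "$1 [y/N] " reply
  [ "$reply" = "y" ]
}

pause_and_delete() {
  run ia job pause "${PAUSED_JOBS[@]}" --unsafe
  # Assumption: without --run, this only lets you check the range.
  run ia table delete-time-range "$TABLE" "$START_MS" "$END_MS" --unsafe
  confirm "Delete this time range from $TABLE?" || return 1
  run ia table delete-time-range "$TABLE" "$START_MS" "$END_MS" --unsafe --run
}

resume_and_check() {
  run ia job resume "${PAUSED_JOBS[@]}" --unsafe
  local job_id
  for job_id in "${PAUSED_JOBS[@]}"; do
    run ia job stats "$job_id"
  done
}

pause_and_delete
# Create and resume the one-time re-ingest job by hand before continuing.
confirm "Has the one-time re-ingest job been created and resumed?" || exit 1
resume_and_check
```

Run the script once with the default DRY_RUN=1 and compare the printed commands against the procedure, then rerun it with DRY_RUN=0 on the push node when the values are correct.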