PDA

View Full Version : Issues related to stalling



tannebil
April 14th, 2012, 05:00 PM
My workflow is to take TiVo and WTV files, apply the output profile "MPEG2 Program Stream (MPG)", run Ad Scan to create a Drax chapter file, and use VAPpost process to recode the output file using Handbrake, update the metadata using AP, and add the chapter stops using Drax.

I've been doing some testing trying to understand why I'm having a couple of odd problems and it appears that stall detection/retry is the source of the problems. In particular, if the Output Folder disk is busy (for example, I'm copying a bunch of large video files into Monitor Folder), stall detection gets fooled into thinking that VRD has stalled when, in fact, it is just running very slowly. If I'm not doing a copy, VRD takes about 40 seconds and 25% CPU to QSF a one hour show. However, if the disk is busy, the same QSF takes 20 minutes and VRD CPU is in single digits. Sometimes (but not always), this results in stall detection being triggered which shouldn't necessarily be a problem except that the retry doesn't seem to work properly.

As far as I can tell, retry:

1. Moves the stalled input file to stalled_files
2. Sets a timer equal to StallRetryIntervalMinutes
3. Moves to the next active file in Monitor Folder
4. After the timer expires, VAP moves the input file is moved back to Monitor Folder and processes it as a new file.

The problems I'm encountering are:

1. The partially completed output file sometimes still exists in Output Folder and a (2) version of the output file is created. VAP doesn't realize that VRD created the (2) version and the .apo and .drx files have the original, rather than the (2) name. VAPpostprocess fails to update the metadata and insert the chapter stops as both steps depend on the file names matching except in the extension.

2. When Drax fails and is left open with a "file not found" error, after a couple of minutes, VAP concludes that VAPpostprocess is complete and moves on to the next active file. They seem to process normally but don't get chapter stops and no error appears on the screen. Maybe something related to the way I call it? Or something in the way Drax works? Here's the way I call it.

"cmd /c ""%DraXpath%" /file:%outputFileM4V% /import:%inputFileDRX%"

3. Sometimes VAP appears to get confused and puts the output file in Temporary (QSF) Folder rather than Output Folder which causes VAPpostprocess to fail since it expects the output file to be in Output Folder. This seems to happen when I "Stop Processing" on a file in VAP (aborting VRD), delete the partially complete output file from Output Folder, and "Start Processing". It doesn't happen if I do a reset on the file before restarting processing.

4. Sometimes the retry doesn't happen. For example, VAP starts processing "The Bishop Revival". It stalls and gets moved to stalled_files. VAP moves to "Brown Betty" which also stalls and gets moved to stalled_files. "The Bishop Revival" gets moved back and processes successfully. However, it's now 1:49 and VAP hasn't done anything with "Brown Betty". Maybe the logic doesn't exist to handle more than one stalled file at a time?

4/14/2012 12:15:09 PM Started QSF, input: Fringe - ''The Bishop Revival'' (Recorded Jan 28, 2010, KCPQDT).TiVo
4/14/2012 12:15:10 PM Using filter dimensions from MediaInfo: 1280 X 720
4/14/2012 12:15:10 PM Using profile: MPEG2 Program Stream
4/14/2012 12:15:13 PM VideoReDo said: INFO: VideoReDo version 4.20.7.635 - Mar 13 2012
4/14/2012 12:15:17 PM VideoReDo said: INFO: Applying QSF filter: 1280X720
4/14/2012 12:22:17 PM Clean up error for file: Fringe - ''Brown Betty'' (Recorded Apr 29, 2010, KCPQDT).TiVo Unable to delete: .vprj file (may not be present)
4/14/2012 12:22:17 PM Repeating metadata generation for: Fringe - ''Brown Betty'' (Recorded Apr 29, 2010, KCPQDT).TiVo
4/14/2012 12:22:18 PM Read 6 input filename metadata parsing template(s)
4/14/2012 12:22:18 PM Read 15 title translation data lines
4/14/2012 12:22:22 PM theTVDB metaData search succeeded for: Fringe - ''Brown Betty'' (Recorded Apr 29, 2010, KCPQDT).TiVo
4/14/2012 12:35:11 PM STALL ERROR: VideoReDo process stalled -- killing it
4/14/2012 12:35:12 PM Moving input file: Fringe - ''The Bishop Revival'' (Recorded Jan 28, 2010, KCPQDT).TiVo
4/14/2012 12:35:12 PM Started moving stalled file: "Fringe - ''The Bishop Revival'' (Recorded Jan 28, 2010, KCPQDT).TiVo"
4/14/2012 12:35:12 PM Getting input file encoding for: Fringe - ''Brown Betty'' (Recorded Apr 29, 2010, KCPQDT).TiVo
4/14/2012 12:35:28 PM MPEG2 is encoding for: Fringe - ''Brown Betty'' (Recorded Apr 29, 2010, KCPQDT).TiVo
4/14/2012 12:35:28 PM Started QSF, input: Fringe - ''Brown Betty'' (Recorded Apr 29, 2010, KCPQDT).TiVo
4/14/2012 12:35:29 PM Using filter dimensions from MediaInfo: 1280 X 720
4/14/2012 12:35:29 PM Using profile: MPEG2 Program Stream
4/14/2012 12:35:29 PM Finished moving "Fringe - ''The Bishop Revival'' (Recorded Jan 28, 2010, KCPQDT).TiVo" to stalled folder
4/14/2012 12:35:33 PM VideoReDo said: INFO: VideoReDo version 4.20.7.635 - Mar 13 2012
4/14/2012 12:35:35 PM VideoReDo said: INFO: Applying QSF filter: 1280X720
4/14/2012 12:45:30 PM STALL ERROR: VideoReDo process stalled -- killing it
4/14/2012 12:45:30 PM Moving input file: Fringe - ''Brown Betty'' (Recorded Apr 29, 2010, KCPQDT).TiVo
4/14/2012 12:45:30 PM Started moving stalled file: "Fringe - ''Brown Betty'' (Recorded Apr 29, 2010, KCPQDT).TiVo"
4/14/2012 12:45:30 PM Started moving stalled file: "Fringe - ''The Bishop Revival'' (Recorded Jan 28, 2010, KCPQDT).TiVo" back for retry
4/14/2012 12:45:33 PM Finished moving "Fringe - ''Brown Betty'' (Recorded Apr 29, 2010, KCPQDT).TiVo" to stalled folder
4/14/2012 12:45:36 PM Finished moving "Fringe - ''The Bishop Revival'' (Recorded Jan 28, 2010, KCPQDT).TiVo" back to monitor folder
4/14/2012 12:47:12 PM theTVDB metaData search succeeded for: Fringe - ''The Bishop Revival'' (Recorded Jan 28, 2010, KCPQDT).TiVo
4/14/2012 12:47:12 PM MediaInfo failed on: "Fringe - ''The Bishop Revival'' (Recorded Jan 28, 2010, KCPQDT).TiVo"
4/14/2012 12:47:12 PM Getting input file encoding for: Fringe - ''The Bishop Revival'' (Recorded Jan 28, 2010, KCPQDT).TiVo
4/14/2012 12:47:31 PM MPEG2 is encoding for: Fringe - ''The Bishop Revival'' (Recorded Jan 28, 2010, KCPQDT).TiVo
4/14/2012 12:47:31 PM Started QSF, input: Fringe - ''The Bishop Revival'' (Recorded Jan 28, 2010, KCPQDT).TiVo
4/14/2012 12:47:32 PM MediaInfo failed to provide filter dimensions for this video. No filter will be applied
4/14/2012 12:47:32 PM No filter dimensions apply
4/14/2012 12:47:32 PM Using profile: MPEG2 Program Stream
4/14/2012 12:47:36 PM VideoReDo said: INFO: VideoReDo version 4.20.7.635 - Mar 13 2012
4/14/2012 12:47:39 PM VideoReDo said: INFO: No QSF filter to apply
4/14/2012 12:56:38 PM VideoReDo completed QSF on input: D:\Recorded TV\Fringe - ''The Bishop Revival'' (Recorded Jan 28, 2010, KCPQDT).TiVo
4/14/2012 12:56:38 PM Started Ad Scan/chapter marks, input: Fringe - ''The Bishop Revival'' (Recorded Jan 28, 2010, KCPQDT).TiVo
4/14/2012 12:56:44 PM VideoReDo said: INFO: VideoReDo version 4.20.7.635 - Mar 13 2012
4/14/2012 12:56:47 PM VideoReDo said: INFO: Edit mode set to cut
4/14/2012 1:02:54 PM VideoReDo completed CHAPMARKS on input: O:\Recorded TV - VAP\Fringe - ''The Bishop Revival'' (Recorded Jan 28, 2010, KCPQDT).mpg
4/14/2012 1:02:54 PM Created Drax Chapter File
4/14/2012 1:02:55 PM Using thumbnail image: Fringe_thumbnail.jpg
4/14/2012 1:02:55 PM Wrote Atomic Parsley Options File
4/14/2012 1:02:55 PM User post process batch file started
4/14/2012 1:23:15 PM User post process batch file completed
4/14/2012 1:23:16 PM BATCH MSG: "Started HB run at 13:02:55.16 for Fringe-S02E14-The Bishop Revival-20100128"
4/14/2012 1:23:16 PM BATCH MSG: "Ended HB run at 13:20:07.84 for Fringe-S02E14-The Bishop Revival-20100128"
4/14/2012 1:23:16 PM BATCH MSG: "Started AP run at 13:20:07.85 for Fringe-S02E14-The Bishop Revival-20100128"
4/14/2012 1:23:16 PM BATCH MSG: "Ended AP run at 13:20:08.50 for Fringe-S02E14-The Bishop Revival-20100128"

I'll disable stall detection for the moment by reducing the stall CPU value to .01 in the XML file but it looks like stall detection and processing needs some adjustments. I tried increasing the processing priority for VAP using the command line argument but it doesn't appear to make any difference. Maybe my experience explains some of the other intermittent stalling issues being reported.

tannebil
April 14th, 2012, 07:34 PM
Tweaking the stall values didn't do any good.

<MaxStalledFileRetries>1</MaxStalledFileRetries>
<StallCheckIntervalSeconds>2400</StallCheckIntervalSeconds>
<StallRetryIntervalMinutes>10</StallRetryIntervalMinutes>
<StallCpuUseThreshold>0.01</StallCpuUseThreshold>

I guess I'll just have to make sure I don't do any file copying while VAP is processing.

dlflannery
April 14th, 2012, 08:33 PM
You can set StallCpuUseThreshold to 0.0 and that should disable stall detection altogether. I'm amazed at how much disk I/O is slowing things down. I suppose it's because VAP/VRD and your transfers are all competing for the same disk I/O channel, i.e., it isn't CPU usage but just processes waiting for disk operations. I don't know how exact your figures were but a QSF that takes 40 secs with 25&#37; CPU usage will be averaging only 0.83% CPU usage if it takes 20 minutes. Thus a CPU threshold of 0.01 (1.0 %) would actually not be low enough to prevent stall detection.

I'll look into the other issues you mentioned.

dlflannery
April 14th, 2012, 09:56 PM
1. The partially completed output file sometimes still exists in Output Folder and a (2) version of the output file is created. VAP doesn't realize that VRD created the (2) version and the .apo and .drx files have the original, rather than the (2) name. VAPpostprocess fails to update the metadata and insert the chapter stops as both steps depend on the file names matching except in the extension.
Try the VAP executable (not an installer) contained in the attached VAP102T2.zip. I've disabled adding the (##) to the output file when Drax files are being produced. And if a partial output file exists, VAP tries to delete it before generating the new one. If deletion fails, a log entry is made.

2. When Drax fails and is left open with a "file not found" error, after a couple of minutes, VAP concludes that VAPpostprocess is complete and moves on to the next active file. They seem to process normally but don't get chapter stops and no error appears on the screen. Maybe something related to the way I call it? Or something in the way Drax works? Here's the way I call it.

"cmd /c ""%DraXpath%" /file:%outputFileM4V% /import:%inputFileDRX%"
I'm not sure what's going on here. I would dispense with the "cmd /c" unless you've found it absolutely necessary for proper Drax operation. VAP will wait forever for a postProcess .cmd file to complete.

3. Sometimes VAP appears to get confused and puts the output file in Temporary (QSF) Folder rather than Output Folder which causes VAPpostprocess to fail since it expects the output file to be in Output Folder. This seems to happen when I "Stop Processing" on a file in VAP (aborting VRD), delete the partially complete output file from Output Folder, and "Start Processing". It doesn't happen if I do a reset on the file before restarting processing.
Doing a reset is a good idea anyway. ;)

4. Sometimes the retry doesn't happen. For example, VAP starts processing "The Bishop Revival". It stalls and gets moved to stalled_files. VAP moves to "Brown Betty" which also stalls and gets moved to stalled_files. "The Bishop Revival" gets moved back and processes successfully. However, it's now 1:49 and VAP hasn't done anything with "Brown Betty". Maybe the logic doesn't exist to handle more than one stalled file at a time?

There should be no problem with multiple retries pending. The code is designed to handle that.

Let's see where you are with VAP102T2 and setting the stalled file CPU threshold to 0.0, then we'll take it from there.

tannebil
April 16th, 2012, 01:53 PM
I'm using the new executable but am working my way through my backlog of files that are missing chapter before I test the stall/retry code.

tannebil
April 16th, 2012, 11:18 PM
In 1.01, when VAp retries the file, why are the apo and drx files being created using the original rather than the retry (xx) file name? Doesn't that create a problem for anyone using Atomic Parsley as well as Drax in VAPpostprocess?

dlflannery
April 16th, 2012, 11:52 PM
That's a valid issue for the to-do list. If you are generating .drx files, VAP102T2 should provide an acceptable workaround for now.

tannebil
April 16th, 2012, 11:53 PM
Retry didn't work with 1.02T. The stall was detected and the input file moved to stalled_files but after 22 minutes, VAP still had not retried it.

<MaxStalledFileRetries>1</MaxStalledFileRetries>
<StallCheckIntervalSeconds>120</StallCheckIntervalSeconds>
<StallRetryIntervalMinutes>10</StallRetryIntervalMinutes>
<StallCpuUseThreshold>0.1</StallCpuUseThreshold>


4/16/2012 9:22:49 PM Getting input file encoding for: Alcatraz - ''Cal Sweeney'' (Recorded Jan 30, 2012, KCPQDT).TiVo
4/16/2012 9:22:59 PM MPEG2 is encoding for: Alcatraz - ''Cal Sweeney'' (Recorded Jan 30, 2012, KCPQDT).TiVo
4/16/2012 9:22:59 PM Started QSF, input: Alcatraz - ''Cal Sweeney'' (Recorded Jan 30, 2012, KCPQDT).TiVo
4/16/2012 9:23:00 PM Using filter dimensions from MediaInfo: 1280 X 720
4/16/2012 9:23:00 PM Using profile: MPEG2 Program Stream
4/16/2012 9:23:05 PM VideoReDo said: INFO: VideoReDo version 4.20.7.635 - Mar 13 2012
4/16/2012 9:23:08 PM VideoReDo said: INFO: Applying QSF filter: 1280X720
4/16/2012 9:29:02 PM STALL ERROR: VideoReDo process stalled -- killing it
4/16/2012 9:29:02 PM Moving input file: Alcatraz - ''Cal Sweeney'' (Recorded Jan 30, 2012, KCPQDT).TiVo
4/16/2012 9:29:02 PM Started moving stalled file: "Alcatraz - ''Cal Sweeney'' (Recorded Jan 30, 2012, KCPQDT).TiVo"
4/16/2012 9:29:05 PM Finished moving "Alcatraz - ''Cal Sweeney'' (Recorded Jan 30, 2012, KCPQDT).TiVo" to stalled folder

dlflannery
April 17th, 2012, 12:04 AM
I can't imagine what the problem would be. I tested with the same process flow you are using and with two .tivo files simultaneously. I termporarily modified the code to force stall detections artificially.

VAP should have started moving the stalled file back into the monitor folder 10 minutes after it moved it into the stall folder, i.e., about 9:39 pm. I could imagine the timing being slowed if you were simultaneously transferring a lot of new files into the monitor folder, especially if VAP is doing theTVDB.com lookups for all of them. Even after the stalled file is moved back into the monitored folder, VAP may first process other files that were just moved into that folder if their time stamps are older than the stalled file.

tannebil
April 17th, 2012, 01:21 AM
The system was idle as I cancelled the file copy as soon as the stall was detected and the file moved. Workflow was simply to start a long file copy, do a reset/uncheck block on a file, and wait for it to stall. I'll try it again.

dlflannery
April 17th, 2012, 06:39 AM
Pretty sure I found a bug. I was able to duplicate your retry-failures. They are triggered by certain sequences of past history for a given file name with regard to stalls, retries and successful processing. These sequences would tend to occur if you're shuffling the same input files through more than one time. I think I've got it fixed in the executable (not an installer) contained in attached VAP102T4.zip.
.

tannebil
April 17th, 2012, 09:14 PM
Second test worked fine. I'll try a longer test with multiple retries and and multiple files tonight.

tannebil
April 18th, 2012, 01:53 AM
Longer test with two stalled files also worked fine.

dlflannery
April 18th, 2012, 11:54 PM
In 1.01, when VAp retries the file, why are the apo and drx files being created using the original rather than the retry (xx) file name? Doesn't that create a problem for anyone using Atomic Parsley as well as Drax in VAPpostprocess?
That problem is corrected in the VAP executable (not an installer) in the attached VAP102T6.zip. (This supercedes the work-around solution provided in an earlier 102T version.)

tannebil
April 19th, 2012, 11:00 PM
Worked great, thanks.