What are the best settings for speed vs quality in HEVC?

Dartman

Member
I had my combo very dialed in, but my buddy just built me a new box with a Ryzen 1900x and I added another 16 gig of DDR4 3200 memory, so it has 32 gig right now. It also has an old GTX 1070 and a decent 1 TB M.2 SSD boot drive. My encode times are all over the place and are generally around 300 fps doing a WTV to HEVC MKV recode. I've seen as high as 800, but that must have been a lower resolution encode, as lately I'm doing 1080p captures.
I should have saved all my settings from the old Windows 10 install, as now I'm back to square one and not sure which settings would help speed it up or whether these numbers are typical.
I had an i7 4960X before with 32 gig of DDR3 and mostly the same everything else except for the new M.2 SSD boot drive, and I could have sworn it was doing 400 to 800 fps on similar videos. Anyway, any ideas as to settings I've missed that will help, or is this as good as it's going to get? Thanks.
 

Dan203

Senior Developer
Staff member
Are you just using the Intelligent profile? Or are you manually setting the bitrate?

The only setting that can have a major impact on encode performance besides bitrate is the "Preset" option in the HEVC section of Advanced. Changing it trades quality against speed directly (slower preset = higher quality, and vice versa).
 

Dartman

Member
I'm mostly just running the presets. So far it looks pretty good at the settings it has, but I'm wondering if I can push it any faster without losing quality. It's making MKV files that started out at about 2 to 4 gig and come out at about 900 meg, so it might be working really hard to get them that small, which would slow it down. I should have made a record of the settings and various tweaks I had, between suggestions from here and things I tried to see what happens.
 

Danr

Administrator
Staff member
This is an extremely complex question about which many academic papers and industry articles have been written. Here's the real short answer :)

For software encoding:

1. Best quality, 2 pass encoding.
2. Best speed, single pass, at one of the higher speed presets.
3. Best compromise: CRF encoding around 20-22.

For best overall speed, use hardware encoding if available.
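As a rough illustration of the three software strategies above, here is a minimal sketch that builds the matching command lines for ffmpeg with the libx265 encoder. This is an assumption for illustration only: VRD's own settings are chosen in its Advanced dialog and are not shown here, and the function name, file names, bitrate, and preset choices are hypothetical. The sketch only builds argument lists; it does not run anything.

```python
import os


def x265_commands(src, dst, bitrate="2000k", crf=21):
    """Build ffmpeg/libx265 argument lists for the three strategies.

    All values are illustrative assumptions, not VRD settings.
    """
    # The first pass of a 2-pass encode discards its video output.
    null_sink = "NUL" if os.name == "nt" else "/dev/null"
    base = ["ffmpeg", "-y", "-i", src, "-c:v", "libx265"]
    return {
        # 1. Best quality: two passes at a target bitrate, slow preset.
        "two_pass": [
            base + ["-preset", "slow", "-b:v", bitrate,
                    "-x265-params", "pass=1", "-an", "-f", "null", null_sink],
            base + ["-preset", "slow", "-b:v", bitrate,
                    "-x265-params", "pass=2", "-c:a", "copy", dst],
        ],
        # 2. Best speed: one pass at a fast preset.
        "fast_single_pass": [
            base + ["-preset", "veryfast", "-b:v", bitrate, "-c:a", "copy", dst],
        ],
        # 3. Compromise: constant rate factor in the 20-22 range.
        "crf": [
            base + ["-preset", "medium", "-crf", str(crf), "-c:a", "copy", dst],
        ],
    }


if __name__ == "__main__":
    for name, cmds in x265_commands("in.wtv", "out.mkv").items():
        for cmd in cmds:
            print(name, ":", " ".join(cmd))
```

With CRF the encoder picks the bitrate needed to hold a quality level, so file size varies with the content; the bitrate-based modes fix the size and let quality vary.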
 

Dan203

Senior Developer
Staff member
Unfortunately your current card doesn't support NVEnc with HEVC encoding. If you can get a newer card, even just a 2000 series, NVEnc will be way faster and provide nearly identical quality.
 

Dartman

Member
It works with HEVC and worked great with the 4960x and the 980 before that. Just trying to get the best compromise. I started with an AMD quad core and it would take HOURS to do a recode, so I usually went to bed and hoped for the best if I had many to do.
Can't really afford to spend stupid money on a newer card right now so will wait out a deal like I did with this upgrade.
 

Danr

Administrator
Staff member
I would think that the 1070 is faster than the 980 for HEVC encoding. Likewise, the 1070 is faster than the Intel Quick Sync on the 4960x as well. Unless you step up to a later NVIDIA generation, the encoding speed should be the same for all chips in the same family. The 1900x itself is about 10%-20% faster than the 4960x, which will not be at all noticeable if your CPU isn't running at 100% during the output transcode.
 

Dartman

Member
Well, it's 12 core native so you'd think it'd be a lot faster, but I guess the 1070 does most of the work. Is there a way to get it to use all 12 cores more aggressively? I don't see a setting for that in the advanced stuff for HEVC. It definitely uses all of them, but it doesn't seem to use them all equally while I'm running a recode.
My first somewhat recent higher end box was an old socket 1366 Intel i7 980, then the i7 4960x, now the Ryzen 3900x. I typed in the wrong CPU number as it's all new to me and I'm still getting it dialed in. All of those boxes used the GTX 1070 card I picked up, and each one was faster than the last, but I wasn't using HEVC till the upgrade to 6, so I was doing H264, which is faster but not as compact when done.
 

svcdmaker

Member
Could you post an image of your core utilization during encoding? I have been wondering how well VRD uses multiple cores in a high core count environment. I am still in the land of quad core.
 

Danr

Administrator
Staff member
Big difference between a 3900 and a 1900 :). Personally, I run a 3950x, love it. If your WTV source is MPEG-2, you're limited to about 8 cores for decoding. Adding more cores simply adds overhead. If you're resizing or deinterlacing, that adds more overhead.

I would try running 2 transcoding instances at the same time and see what happens.
 

Dartman

Member
So how do I do that? I've never heard of it, but I noticed some weird setting in the first advanced mode that might do it. If I set it to something easy to transcode it gets about 7k frames a sec. But go back to playing with HEVC and it's 250 to almost 400, with the CPU load you see. Pretty sure the 6 core i7 CPUs were using everything they had, as they would ramp up the fans and load till done, but that was only 6 physical cores.
 

Otter

Member
The CPU and video card are only a part of the issue.
Each generation of AMD chips, BIOS and Win10 OS has brought advances that improve multi-threading on the AMD CPUs.
How the system divides tasks into multiple threads is determined by your CPU, chipset, BIOS and Windows drivers working together.

The Gen 1 Threadripper and X399 was a good system for its time, but it can't keep up with the new generation Ryzen 9s.
The 1900X is now dirt cheap, but the MB is still expensive - for less money you'd be better off getting an X5xx MB and a 5900X.
An old 1900X running on an X399 board will have very basic thread scheduling compared to a 3900X or 5900X running on an X570 with the latest AMD Combo BIOS.
The latest AMD Chipset drivers and the Windows 10 kernel itself have also included advances to thread scheduling and multi-threading.

My 5900X running on a X570 board with latest AMD drivers and Win10 2021-H2 runs all 24 threads equally at about 65%-75% when doing a SW 2-pass encoding.
Switching to H.264 or HEVC NVENC encoding at CRF18 using my 1650 Super (Turing) video card still uses all 24 threads at about the same load, but finishes in 20% of the time of all-SW encoding.
I think the quality of the SW 2-pass is better, but NVENC CRF18 is good for most programs.

Since you have already gone with the 1900X, make sure you have your Windows 10 upgraded to get what advances in thread scheduling the 1900X can use.
Also install the latest AMD Direct chipset drivers - MB vendors are usually slow to issue updates and I've never had an issue with the AMD website drivers.
 

Dartman

Member
I made a mistake about which chip I have: it's a 3900x on an ASUS Strix X570-E Gaming, so it's a pretty top end combo. If you know of better BIOS settings to make everything better, I'm willing to give them a try. My Windows 10 64-bit install has all the normal updates, and I'm pretty sure all the drivers for the board are up to date, but it's probably mostly running the safe default settings. You can see what load my CPU is running on a typical HEVC recode from the attachment I posted a few replies up from here.
 

jmc

Well-known member
My 3950x 16 core runs at 40-50% at my low bitrate of 1.4Mbps (DVD .MPGs to H264 .MP4s).
If I process a 20-30Mbps Blu-ray, CPU use jumps to 70-80%+. So glad I find DVDs "good enough".

I believe Dan203 said that HEVC is for high bitrate video compression and H.264 is best for my level (1.4Mbps) of compression.
And in my tests my eyes agree with that. At my level, HEVC does not deliver "half the bitrate at equal quality".
Was a relief to let go of HEVC and (newer) concerns.

While testing the H.264 Advanced Profile Options I found that some combinations made a really big improvement.
Perhaps HEVC will be the same?

Some settings must go together. A default H264.MP4 profile did not get the big boost that certain settings gave with a "loaded" Profile that had the slowest (highest quality) Preset and a lot of other settings turned on.

You just have to go through and do a lot of trial and error testing for the codec you want to use.

Good luck!
jmc
 

Dartman

Member
Yeah, I had my hex-core Intel boxes dialed in perfectly for quality and speed, and now I'm figuring it out yet again. I guess close to 400fps isn't too bad with decent quality and small size, but I always want the best combination I can get. I seem to remember a discussion here where somebody pointed out some settings and tweaks I hadn't thought of. You could really tell the difference between going too far, with tiny, fast, but horrible looking video, and the setup that about halved the size quickly and still looked great.
 

Otter

Member
Glad your mistake was in the chip name, not in buying 4-year-old hardware.

It's not so much user settings in BIOS as the BIOS revision itself. AMD is constantly updating and improving their reference BIOS code to improve speed and reliability.
This includes bug fixes, reliability fixes and speed improvements - notably in the code that controls the memory and thread scheduling subsystems, which identify tasks that can be run concurrently, distribute them to the cores and bring them back together in sequence.

AMD refers to their recent versions as "AGESA Combo", where AGESA stands for "AMD Generic Encapsulated Software Architecture". That reference code is turned over to motherboard vendors, who customize it for particular boards and sooner or later release it to users. Some vendors are better at keeping current than others - ASUS seems pretty good.

I'm using an ASUS TUF Gaming x570 myself. The current ASUS release for my TUF Gaming is v4010, based on the AMD reference BIOS "AGESA Combo V2 PI 1.2.0.3 Patch B". The same level BIOS is available for your ROG Strix via the ASUS Download Center.

Not sure what level BIOS you are on, but going to the latest can't hurt. Last spring, I was running the same x570 TUF with a 3900X and the then-current v3602 BIOS. Just swapping to the 5900X gave me a 20%+ drop in encoding time, reflecting the changes AMD made to the internal systems of the Zen 3 CPUs. Since then, BIOS updates seem to have given me another 5% and higher CPU usage per core/thread during encoding than 6 months ago.

ASUS download webpages for both the TUF x570 and ROG x570 are still offering chipset driver v2.11.26.106 from back in 2021/02/02.
The latest AMD Direct chipset driver is v2.17.25.506 - 2021-06-02. Find it at: "https://www.amd.com/en/support/chipsets/amd-socket-am4/x570"

There were also some problems with 2019 versions of Windows 10 kernel not interfacing fully with the Ryzen internal schedulers, but you probably are running 2021-H1 by now.

Post if any of this makes an improvement.
 

Danr

Administrator
Staff member
Are you resizing or deinterlacing? Those could be a bottleneck.

To run 2 transcodes at the same time, start 2 instances of VRD. Load and recode 2 different files at the same time. Or, run one transcode in batch and the other interactive. Same effect as having 2 interactive sessions.
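For anyone who would rather script the "two jobs at once" experiment than click through two windows, here is a minimal sketch of a generic parallel launcher. It assumes each transcode can be started as a command line; the launcher itself is hypothetical and not part of VRD, and the placeholder jobs below just sleep for a second so the sketch runs anywhere.

```python
import subprocess
import sys


def run_in_parallel(commands):
    """Start every command at once, wait for all, and return exit codes."""
    procs = [subprocess.Popen(cmd) for cmd in commands]
    return [proc.wait() for proc in procs]


if __name__ == "__main__":
    # Placeholder jobs (assumption): swap in your real transcode commands.
    placeholder = [sys.executable, "-c", "import time; time.sleep(1)"]
    print(run_in_parallel([placeholder, placeholder]))
```

Compare the combined frames per second of the two jobs against a single job; if the total is not clearly higher, the second instance is only adding heat.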
 

Otter

Member
True, but if you need to do either to get the finished file you want, then you have to take the hit. There is no magic to avoid the bottleneck other than faster hardware.

Seeing "100%" on all 12 cores is a thrill, but running 2 simultaneous encodes will generate some serious heat in the Ryzen 3900X. Without some overclocking tweaks and serious cooling the Ryzen will initiate thermal throttle-down somewhere around 76C. To keep from overheating, the CPU will drop clock speeds until the heat load balances the cooling capacity of whatever CPU cooler you have.

When I run 1 SW encode on my 5900X, the 12 cores hit varying clocks from ~4500-4950MHz and temp about 68-72C with the AIO pump and fans at 85% speed (and noise). Running 2 encodes gets me 99-100% on the cores, but the clocks drop to ~4175MHz on most cores with temps rising to 78-80C even with the fans at 100%.

Recoding a 55 min 1080p raw recording down to 720p using NVENC CRF18 only takes 110 secs, so the effort to set up a second parallel encode is not worth it.
Besides, I'd rather just stack files up in the Batch queue for 2-pass SW encoding, shut my office door and go to bed.
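To compare that figure with the fps numbers quoted earlier in the thread, the arithmetic can be sketched as below. The 29.97 fps source frame rate is an assumption (the post does not state it), and the function name is made up for illustration.

```python
def encode_stats(duration_min, encode_secs, source_fps=29.97):
    """Return (encode speed in frames/sec, multiple of real time).

    source_fps is an assumed frame rate for the recording.
    """
    duration_secs = duration_min * 60
    frames = duration_secs * source_fps
    return frames / encode_secs, duration_secs / encode_secs


fps, realtime = encode_stats(55, 110)
print(f"{fps:.0f} fps, {realtime:.0f}x real time")  # 899 fps, 30x real time
```

Under that assumption, the 110-second NVENC run works out to roughly 900 fps, about three times the 300 fps reported at the top of the thread.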
 