How We Tested
Repeatability is one of the most important components of any useful benchmark methodology. All tests have some degree of uncertainty, but we're looking for a minimal and consistent amount of variability. Results plagued by wild swings in performance from one run to the next aren't usable as accurate benchmarks.
As an example, we've yet to develop any reliable multi-tasking benchmarks. In response to reader requests, we have worked diligently to create a series of tests that measure gaming performance with background applications like Web browsers, email clients, media players, Discord, and Skype open. Windows' prioritization appears to be based on fickle and unexplained factors. The operating system suspends various background processes unpredictably during one scripted sequence, then leaves them fully active during the next (even when the test environment hasn't changed). This unpredictability becomes more, well, unpredictable, as the number of open applications increases. Switching Windows into Game Mode only complicates matters further. So far, we have no solution. Our multi-tasking experiments yield deltas from 5 to 15 FPS between successive runs, which means they land nowhere near our expectations for a reliable benchmark.
Luckily, game streaming is much easier to control. Encoding is a CPU-intensive task that chews up plenty of cycles, so Windows doesn't suspend or otherwise interfere with it. This allows us to create repeatable benchmarks without extreme outliers.
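As a rough illustration of what "repeatable" means here, the sketch below compares the average frame rate of successive runs and flags a test whose run-to-run delta exceeds a chosen tolerance. The function names, the sample numbers, and the 2 FPS tolerance are all hypothetical; this is not the exact script we use.

```python
def successive_deltas(run_fps):
    """Absolute FPS difference between each run and the run before it."""
    return [abs(later - earlier) for earlier, later in zip(run_fps, run_fps[1:])]


def is_repeatable(run_fps, max_delta_fps):
    """True if no pair of successive runs differs by more than max_delta_fps."""
    if len(run_fps) < 2:
        raise ValueError("need at least two runs to judge repeatability")
    return max(successive_deltas(run_fps)) <= max_delta_fps


# Hypothetical average FPS from three successive runs of each test.
multitasking_runs = [92.0, 104.0, 97.0]   # swings like our multi-tasking attempts
streaming_runs = [98.5, 99.0, 98.0]       # tight grouping

print(successive_deltas(multitasking_runs))
print(is_repeatable(multitasking_runs, max_delta_fps=2.0))
print(is_repeatable(streaming_runs, max_delta_fps=2.0))
```

A test that fails a check like this would need more controls (or more runs) before its numbers mean much.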
What We're Measuring
We evaluate game streaming performance along two axes: game quality and stream quality. For game quality, we measure average, minimum, and 99th percentile frame rates with and without streaming in the background. We also include our usual frame time and variance results, which become more important once we start streaming.
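For readers who want to reproduce these figures from their own frame time logs, here is a minimal sketch of how the three frame rate numbers can be derived. It makes two assumptions that may not match every tool: the minimum is taken from the single slowest frame, and the 99th percentile uses the nearest-rank method. The function name and the sample data are hypothetical.

```python
def frame_rate_summary(frame_times_ms):
    """Summarize a run from per-frame render times given in milliseconds."""
    if not frame_times_ms:
        raise ValueError("need at least one frame time")

    count = len(frame_times_ms)
    total_seconds = sum(frame_times_ms) / 1000.0

    # Average FPS: frames rendered divided by elapsed time.
    average_fps = count / total_seconds

    # Minimum FPS (assumption): the rate implied by the slowest single frame.
    minimum_fps = 1000.0 / max(frame_times_ms)

    # 99th percentile frame time by nearest rank: ceil(0.99 * count),
    # computed with integers to avoid floating-point rounding surprises.
    ordered = sorted(frame_times_ms)
    rank = -(-99 * count // 100)
    p99_fps = 1000.0 / ordered[rank - 1]

    return {"average_fps": average_fps,
            "minimum_fps": minimum_fps,
            "p99_fps": p99_fps}


# Hypothetical run: 99 frames at 10 ms and a single 50 ms hitch.
summary = frame_rate_summary([10.0] * 99 + [50.0])
print(summary)
```

In this made-up run, the single hitch drags the minimum far below the average, while the 99th percentile figure ignores it. That is why we report all three.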
We also need to account for stream quality. That means recording the percentage of frames encoded. Each processor pushes different frame rates, so each run correspondingly generates a different number of frames. As such, we measure the percentage of frames successfully encoded as "% of Frames Delivered." In the test below, a Threadripper 1950X CPU encoded 98.9% of the frames generated by our gaming session, meaning it skipped 1.1% of the frames due to encoding lag.
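The arithmetic behind "% of Frames Delivered" is simple enough to sketch. The frame counts below are hypothetical, chosen only so the result matches the 98.9% figure quoted above, and the function name is our own.

```python
def percent_frames_delivered(frames_generated, frames_skipped):
    """Share of generated frames that the encoder actually delivered."""
    if frames_generated <= 0:
        raise ValueError("frames_generated must be positive")
    if not 0 <= frames_skipped <= frames_generated:
        raise ValueError("frames_skipped must be between 0 and frames_generated")
    frames_encoded = frames_generated - frames_skipped
    return 100.0 * frames_encoded / frames_generated


# Hypothetical session: 10,000 frames generated, 110 skipped due to encoding lag.
delivered = percent_frames_delivered(10_000, 110)
print(f"{delivered:.1f}% delivered, {100.0 - delivered:.1f}% skipped")
```

Because the result is a percentage, it stays comparable across processors that generate different numbers of frames during the same test sequence.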
We're streaming at 60 FPS, so we also measure stream quality by listing the percentage of frames encoded within the desirable 16.667ms (60 FPS) threshold. We also include the percentage of frames that land above and below the 60 FPS threshold, which helps quantify the hitching and stuttering a viewer would see on the stream. Subjective visual measurements are still important, so we'll call out tests that generate a bad-looking stream.
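The 16.667ms figure is simply 1000ms divided by 60 frames. A minimal sketch of the bucketing follows, assuming that a frame counts as on time when it arrives at or under that budget. The function name and sample values are hypothetical, and our charts may slice the data differently.

```python
FRAME_BUDGET_MS = 1000.0 / 60.0  # about 16.667 ms per frame at 60 FPS


def stream_pacing(frame_times_ms, budget_ms=FRAME_BUDGET_MS):
    """Percentage of stream frames at or under the budget, and over it."""
    total = len(frame_times_ms)
    if total == 0:
        raise ValueError("need at least one frame time")
    on_time = sum(1 for t in frame_times_ms if t <= budget_ms)
    return {
        "within_budget_pct": 100.0 * on_time / total,
        "over_budget_pct": 100.0 * (total - on_time) / total,
    }


# Hypothetical stream frame times in milliseconds.
print(stream_pacing([12.0, 15.5, 16.0, 20.0]))
```

Frames in the over-budget bucket are the ones a viewer is likely to perceive as hitching or stuttering.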
Open Broadcaster Software
There are several software encoding applications, but we chose Open Broadcaster Software (OBS) due to its flexible tuning options, detailed output logs, and broad compatibility with streaming services. We're using the x264 software encoder, along with YouTube Gaming for our streaming service. Any run that reports frames dropped due to networking interference is discarded.
Our ultimate goal is to develop a test that measures CPU performance, so we select parameters that remove the most obvious bottlenecks. Gaming at 1920x1080 with an EVGA GeForce GTX 1080 FE side-steps a GPU limitation (as much as possible). Encoding overhead isn't as high with lesser video cards that generate fewer frames per second. We also test with a 10 Mb/s upload rate, though you can stream at 6 Mb/s or less. Our Internet connection would accommodate up to 35 Mb/s uploads. To vary game selection, we chose Grand Theft Auto V, Middle-earth: Shadow of War, and Battlefield 1 for our tests.
There are several other scenarios we could have added to increase the complexity of our testing, such as a simultaneous video stream from a webcam, recording the game to the host system, or streaming to multiple services at once. We went with just one service to reduce the number of variables...at least for now.
Finding the best streaming options requires some tuning for every game and hardware configuration. There is a delicate balance between game performance on the host system and stream quality for the remote viewer, so fine-tuning is needed to yield the best mix. We picked somewhat general settings that offered a good range of performance by our subjective measure. We also stuck with options that'd establish a level playing field for a wide range of test systems. Just be aware that there are plenty of knobs to turn, some of which could offer better performance than the ones we use (lowering the stream to 30 FPS, for instance, cuts encoding overhead significantly).
Tuning the encoding presets is one of the most direct ways to adjust streaming performance and quality for your system's capabilities. Slower encoding increases compression efficiency, which provides better output quality and reduces compression artifacts. OBS exposes the x264 encoder's 10 presets, ranging from "ultrafast" (the lowest-quality setting with the least computational overhead) to "placebo" (offering the best streaming quality and consuming the most host processing resources). The placebo setting is aptly named; the return on stream quality diminishes rapidly after passing the "slower" preset (two ticks before placebo). More strenuous settings can quickly cripple even powerful processors, particularly if you are streaming from a single host system. Placebo with care.
We split our test groups into three classes. After evaluating a few Core i3- and Ryzen 3-class processors and determining that they can't stream effectively at our settings, we chose Ryzen 5 and Core i5 models for our entry-level systems and paired them with the "veryfast" encoding preset. Higher-end processors offer more performance, so we use the "faster" preset for our Ryzen 7/Core i7 chips and the "fast" preset for our Threadripper/Core i9 chips.
Because we're testing with different encoding presets, you cannot compare test results for the different classes directly.
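The preset assignments can be written down as a small lookup, sketched here. The preset list is x264's standard ordering from fastest to slowest; the dictionary keys and the helper name are our own labels for illustration, not anything OBS defines.

```python
# x264 presets, ordered from least to most computational overhead.
X264_PRESETS = [
    "ultrafast", "superfast", "veryfast", "faster", "fast",
    "medium", "slow", "slower", "veryslow", "placebo",
]

# Preset used for each CPU class in our tests (class labels are illustrative).
PRESET_BY_CLASS = {
    "Ryzen 5 / Core i5": "veryfast",
    "Ryzen 7 / Core i7": "faster",
    "Threadripper / Core i9": "fast",
}


def directly_comparable(class_a, class_b):
    """Results are only comparable when both classes used the same preset."""
    return PRESET_BY_CLASS[class_a] == PRESET_BY_CLASS[class_b]


for cpu_class, preset in PRESET_BY_CLASS.items():
    tick = X264_PRESETS.index(preset) + 1
    print(f"{cpu_class}: {preset} (tick {tick} of {len(X264_PRESETS)})")

print(directly_comparable("Ryzen 5 / Core i5", "Ryzen 7 / Core i7"))
```

Each class sits one tick slower than the class below it, which is why a Core i7 result and a Core i5 result don't represent the same encoding workload.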
Test System & Configuration

|Hardware|Components|
|---|---|
|Intel LGA 1151 (Z370)|Intel Core i5-8600K, Core i7-8700K; MSI Z370 Gaming Pro Carbon AC; 4x 8GB G.Skill Ripjaws V DDR4-3200 @ 2666 and 3200 MT/s|
|AMD Socket AM4|AMD Ryzen 5 1600X, Ryzen 7 1800X; MSI X370 Xpower Gaming Titanium; 2x 8GB G.Skill Ripjaws V DDR4-3200 @ 2667 and 3200 MT/s|
|Intel LGA 1151 (Z270)|Intel Core i5-7600K, Core i7-7700K; MSI Z270 Gaming M7; 2x 8GB G.Skill Ripjaws V DDR4-3200 @ 2666 and 3200 MT/s|
|AMD Socket SP3 (TR4)|AMD Ryzen Threadripper 1950X; Asus X399 ROG Zenith Extreme; 4x 8GB G.Skill Ripjaws V DDR4-3200 @ 2666 and 3200 MT/s|
|Intel LGA 2066|Intel Core i9-7900X, Core i9-7980XE; MSI X299 Gaming Pro Carbon AC; 4x 8GB G.Skill Ripjaws V DDR4-3200 @ 2666 and 3200 MT/s|
|All systems|EVGA GeForce GTX 1080 FE; 1TB Samsung PM863; SilverStone ST1500-TI, 1500W; Windows 10 Creators Update Version 1703|