This directory contains a set of webdriver-driven benchmarks for Cobalt.
Each file should contain a set of tests in Python “unittest” format.
All tests included within performance.py will be run on the build system. Results can be recorded in the build results database.
In most cases, you will want to run all performance tests; you can do so by executing the script performance.py. Run python performance.py --help to see a list of command-line parameters. For example, to run tests against the raspi-2 QA build, you should run the following command:

python performance.py -p raspi-2 -c qa -d $RASPI_ADDR

where RASPI_ADDR is set to the IP address of the target Raspberry Pi device.
To run an individual test, simply execute its script directly. For all tests, the platform configuration will be inferred from the environment if set; otherwise, it must be specified via command-line parameters.
If appropriate, create a new file borrowing the boilerplate from an existing simple file, such as browse_horizontal.py.
Add the file name to the tests added within performance.py so that it runs whenever performance.py is run.
If this file contains internal names or details, consider adding it to the “EXCLUDE.FILES” list.
Use the record_test_result* methods in tv_testcase_util where appropriate.
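New test files follow the standard Python unittest shape. The skeleton below is illustrative only: the class and method names are made up, and a real benchmark would derive from the shared test-case base class in this directory and report through tv_testcase_util's record_test_result* helpers rather than printing directly.

```python
import unittest


class BrowseExampleTest(unittest.TestCase):
    """Illustrative skeleton only; not an actual benchmark in this directory."""

    def test_browse_example(self):
        # A real benchmark would drive Cobalt via webdriver here, collecting
        # one sample per repetition of the scenario under test.
        samples = [3000.0, 3100.0, 3050.0]
        mean = sum(samples) / len(samples)
        # Stand-in for a record_test_result* call (hypothetical call site);
        # printing in the TEST_RESULT output format for illustration.
        print('webdriver_benchmark TEST_RESULT: wbExampleDurUsMean %s' % mean)
        self.assertGreater(mean, 0)
```

A file shaped like this runs standalone via unittest and can also be picked up by performance.py once it is added to that script's test list.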
Results must be added to the build results database schema. See the internal README-Updating-Result-Schema.md file.
To run the benchmarks against any desired loader, a --url command-line parameter can be provided; this is the URL that the tests will run against. For example:
python performance.py -p raspi-2 -c qa -d $RASPI_ADDR --url https://www.youtube.com/tv?loader=nllive
The results will be printed to stdout; redirect output to a file if you would like to store them. Each line of the benchmark output prefixed with webdriver_benchmark TEST_RESULT: provides the result of one measurement. Those lines have the following format:

webdriver_benchmark TEST_RESULT: result_name result_value

where result_name is the name of the result and result_value is a number providing the measured value for that metric. For example,
webdriver_benchmark TEST_RESULT: wbBrowseHorizontalDurLayoutBoxTreeUpdateUsedSizesUsPct50 3061.5
gives the 50th percentile of the duration Cobalt took to update the box tree's used sizes, on a horizontal scroll event, in microseconds.
Note that most time-based measurements are in microseconds.
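Because each result is a single line with a fixed prefix, post-processing scripts can parse the output mechanically. A minimal sketch (the metric name is taken from the example above):

```python
import re

# Matches lines of the form:
#   webdriver_benchmark TEST_RESULT: result_name result_value
RESULT_RE = re.compile(r'^webdriver_benchmark TEST_RESULT: (\S+) (\S+)$')


def parse_results(output):
    """Returns a dict mapping result_name -> float(result_value)."""
    results = {}
    for line in output.splitlines():
        match = RESULT_RE.match(line)
        if match:
            results[match.group(1)] = float(match.group(2))
    return results


output = ('webdriver_benchmark TEST_RESULT: '
          'wbBrowseHorizontalDurLayoutBoxTreeUpdateUsedSizesUsPct50 3061.5')
print(parse_results(output))
# -> {'wbBrowseHorizontalDurLayoutBoxTreeUpdateUsedSizesUsPct50': 3061.5}
```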
Some particularly interesting timing-related benchmark results are:
*   wbStartupDurLaunchToBrowseUs*: Measures the time it takes to launch Cobalt and load the desired URL. The measurement ends when all images finish loading and the final render tree is produced. This is only run once, so it will be noisier than other values, but it provides the most accurate measurement of the full startup time.
*   wbStartupDurLaunchToUsableUs*: Measures the time it takes to launch Cobalt and fully load the desired URL, including loading all scripts. The measurement ends when the Player JavaScript code finishes loading on the Browse page, which is the point when Cobalt is fully usable. This is only run once, so it will be noisier than other values, but it provides the most accurate measurement of the time until Cobalt is usable.
*   wbStartupDurNavigateToBrowseUs*: Measures the time it takes to navigate to the desired URL when Cobalt is already loaded. The measurement ends when all images finish loading and the final render tree is produced. This is run multiple times, so it will be less noisy than wbStartupDurLaunchToBrowseUs*, but it does not include Cobalt initialization, so it is not as accurate a measurement.
*   wbStartupDurNavigateToUsableUs*: Measures the time it takes to navigate to the desired URL when Cobalt is already loaded, including loading all scripts. The measurement ends when the Player JavaScript code finishes loading on the Browse page, which is the point when Cobalt is fully usable. This is run multiple times, so it will be less noisy than wbStartupDurLaunchToUsableUs*, but it does not include Cobalt initialization, so it is not as accurate a measurement.
*   wbBrowseHorizontalDurTotalUs*: Measures the latency (i.e. JavaScript execution time + layout time) during horizontal scroll events, from the keypress until the render tree is submitted to the rasterize thread. It does not include the time taken to rasterize the render tree, so it will be smaller than the observed latency.
*   wbBrowseHorizontalDurAnimationsStartDelayUs*: Measures the input latency during horizontal scroll events, from the keypress until the animation starts. This is the most accurate measure of input latency.
*   wbBrowseHorizontalDurAnimationsEndDelayUs*: Measures the latency during horizontal scroll events, from the keypress until the animation ends.
*   wbBrowseHorizontalDurFinalRenderTreeDelayUs*: Measures the latency during horizontal scroll events, from the keypress until the final render tree is rendered and processing stops.
*   wbBrowseHorizontalDurRasterizeAnimationsUs*: Measures the time it takes to render each frame of the animation triggered by a horizontal scroll event. The inverse of this number is the framerate.
*   wbBrowseVerticalDurTotalUs*: Measures the latency (i.e. JavaScript execution time + layout time) during vertical scroll events, from the keypress until the render tree is submitted to the rasterize thread. It does not include the time taken to rasterize the render tree, so it will be smaller than the observed latency.
*   wbBrowseVerticalDurAnimationsStartDelayUs*: Measures the input latency during vertical scroll events, from the keypress until the animation starts. This is the most accurate measure of input latency.
*   wbBrowseVerticalDurAnimationsEndDelayUs*: Measures the latency during vertical scroll events, from the keypress until the animation ends.
*   wbBrowseVerticalDurFinalRenderTreeDelayUs*: Measures the latency during vertical scroll events, from the keypress until the final render tree is rendered and processing stops.
*   wbBrowseVerticalDurRasterizeAnimationsUs*: Measures the time it takes to render each frame of the animation triggered by a vertical scroll event. The inverse of this number is the framerate.
*   wbBrowseToWatchDurVideoStartDelay*: Measures the browse-to-watch time.

In each case above, the * symbol can be one of Mean, Pct25, Pct50, Pct75, or Pct95. For example, wbStartupDurLaunchToBrowseUsMean and wbStartupDurLaunchToBrowseUsPct95 are both valid measurements. The webdriver benchmarks run each test many times in order to obtain multiple samples, so you can drill into the data by exploring either the mean or the various percentiles.
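The aggregate suffixes are ordinary sample statistics. As a sketch of what they mean, using nearest-rank percentiles (the harness's exact interpolation method may differ, and the sample values are made up):

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile of a list of samples."""
    ordered = sorted(samples)
    rank = int(math.ceil(pct / 100.0 * len(ordered)))
    return ordered[max(rank - 1, 0)]


# Hypothetical per-run samples of one timing metric, in microseconds.
samples = [2900.0, 3000.0, 3050.0, 3100.0, 3400.0]

mean = sum(samples) / len(samples)  # the *Mean suffix
pct50 = percentile(samples, 50)     # the *Pct50 suffix
pct95 = percentile(samples, 95)     # the *Pct95 suffix
print(mean, pct50, pct95)
```

The higher percentiles (Pct75, Pct95) are useful for spotting long-tail slowness that a mean can hide.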
Some particularly interesting count-related benchmark results are:
*   wbStartupCntDomHtmlElements*: Lists the number of HTML elements in existence after startup completes. This includes HTML elements that are no longer in the DOM but have not yet been garbage collected.
*   wbStartupCntDocumentDomHtmlElements*: Lists the number of HTML elements within the DOM tree after startup completes.
*   wbStartupCntLayoutBoxes*: Lists the number of layout boxes within the layout tree after startup completes.
*   wbStartupCntRenderTrees*: Lists the number of render trees that were generated during startup.
*   wbStartupCntRequestedImages*: Lists the number of images that were requested during startup.
*   wbBrowseHorizontalCntDomHtmlElements*: Lists the number of HTML elements in existence after the event. This includes HTML elements that are no longer in the DOM but have not yet been garbage collected.
*   wbBrowseHorizontalCntDocumentDomHtmlElements*: Lists the number of HTML elements within the DOM tree after the event.
*   wbBrowseHorizontalCntLayoutBoxes*: Lists the number of layout boxes within the layout tree after the event.
*   wbBrowseHorizontalCntLayoutBoxesCreated*: Lists the number of new layout boxes that were created during the event.
*   wbBrowseHorizontalCntRenderTrees*: Lists the number of render trees that were generated by the event.
*   wbBrowseHorizontalCntRequestedImages*: Lists the number of images that were requested as a result of the event.
*   wbBrowseVerticalCntDomHtmlElements*: Lists the number of HTML elements in existence after the event. This includes HTML elements that are no longer in the DOM but have not yet been garbage collected.
*   wbBrowseVerticalCntDocumentDomHtmlElements*: Lists the number of HTML elements within the DOM tree after the event.
*   wbBrowseVerticalCntLayoutBoxes*: Lists the number of layout boxes within the layout tree after the event.
*   wbBrowseVerticalCntLayoutBoxesCreated*: Lists the number of new layout boxes that were created during the event.
*   wbBrowseVerticalCntRenderTrees*: Lists the number of render trees that were generated by the event.
*   wbBrowseVerticalCntRequestedImages*: Lists the number of images that were requested as a result of the event.

In each case above, the * symbol can be one of Max, Median, or Mean. For example, wbBrowseVerticalCntDomHtmlElementsMax and wbBrowseVerticalCntDomHtmlElementsMedian are both valid measurements. The webdriver benchmarks run each test many times in order to obtain multiple samples, so you can drill into the data by exploring the max, median, or mean.
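The count aggregates can be reproduced the same way with the standard library (the sample values below are made up):

```python
import statistics

# Hypothetical per-run samples of one count metric, e.g.
# wbBrowseVerticalCntDomHtmlElements.
counts = [1180, 1190, 1200, 1250]

summary = {
    'Max': max(counts),                   # wb...Max
    'Median': statistics.median(counts),  # wb...Median
    'Mean': statistics.mean(counts),      # wb...Mean
}
print(summary)
```

Max is typically the most interesting of the three for counts, since it bounds peak memory-related pressure during the run.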
The webdriver benchmarks output many metrics, but you may only be interested in a few; the output is not filtered for you, so you will have to select the metrics you care about yourself. You can do so with grep, for example:
python performance.py -p raspi-2 -c qa -d $RASPI_ADDR > results.txt
echo "" > filtered_results.txt
printf "=================================TIMING-RELATED=================================\n" >> filtered_results.txt
printf "\nSTARTUP\n" >> filtered_results.txt
grep -o "wbStartupDurLaunchToBrowseUs.*$" results.txt >> filtered_results.txt
grep -o "wbStartupDurLaunchToUsableUs.*$" results.txt >> filtered_results.txt
printf "\n" >> filtered_results.txt
grep -o "wbStartupDurNavigateToBrowseUs.*$" results.txt >> filtered_results.txt
printf "\n" >> filtered_results.txt
grep -o "wbStartupDurNavigateToUsableUs.*$" results.txt >> filtered_results.txt
printf "\n" >> filtered_results.txt
printf "\nBROWSE HORIZONTAL SCROLL EVENTS\n" >> filtered_results.txt
grep -o "wbBrowseHorizontalDurTotalUs.*$" results.txt >> filtered_results.txt
printf "\n" >> filtered_results.txt
grep -o "wbBrowseHorizontalDurAnimationsStartDelayUs.*$" results.txt >> filtered_results.txt
printf "\n" >> filtered_results.txt
grep -o "wbBrowseHorizontalDurAnimationsEndDelayUs.*$" results.txt >> filtered_results.txt
printf "\n" >> filtered_results.txt
grep -o "wbBrowseHorizontalDurFinalRenderTreeDelayUs.*$" results.txt >> filtered_results.txt
printf "\n" >> filtered_results.txt
grep -o "wbBrowseHorizontalDurRasterizeAnimationsUs.*$" results.txt >> filtered_results.txt
printf "\n" >> filtered_results.txt
printf "\nBROWSE VERTICAL SCROLL EVENTS\n" >> filtered_results.txt
grep -o "wbBrowseVerticalDurTotalUs.*$" results.txt >> filtered_results.txt
printf "\n" >> filtered_results.txt
grep -o "wbBrowseVerticalDurAnimationsStartDelayUs.*$" results.txt >> filtered_results.txt
printf "\n" >> filtered_results.txt
grep -o "wbBrowseVerticalDurAnimationsEndDelayUs.*$" results.txt >> filtered_results.txt
printf "\n" >> filtered_results.txt
grep -o "wbBrowseVerticalDurFinalRenderTreeDelayUs.*$" results.txt >> filtered_results.txt
printf "\n" >> filtered_results.txt
grep -o "wbBrowseVerticalDurRasterizeAnimationsUs.*$" results.txt >> filtered_results.txt
printf "\n" >> filtered_results.txt
printf "\nBROWSE TO WATCH\n" >> filtered_results.txt
grep -o "wbBrowseToWatchDurVideoStartDelay.*$" results.txt >> filtered_results.txt
printf "\n\n=================================COUNT-RELATED==================================\n" >> filtered_results.txt
printf "\nSTARTUP\n" >> filtered_results.txt
grep -o "wbStartupCntDomHtmlElements.*$" results.txt >> filtered_results.txt
printf "\n" >> filtered_results.txt
grep -o "wbStartupCntDomDocumentHtmlElements.*$" results.txt >> filtered_results.txt
printf "\n" >> filtered_results.txt
grep -o "wbStartupCntLayoutBoxes.*$" results.txt >> filtered_results.txt
printf "\n" >> filtered_results.txt
grep -o "wbStartupCntRenderTrees.*$" results.txt >> filtered_results.txt
printf "\n" >> filtered_results.txt
grep -o "wbStartupCntRequestedImages.*$" results.txt >> filtered_results.txt
printf "\n" >> filtered_results.txt
printf "\nBROWSE HORIZONTAL SCROLL EVENTS\n" >> filtered_results.txt
grep -o "wbBrowseHorizontalCntDomHtmlElementsM.*$" results.txt >> filtered_results.txt
printf "\n" >> filtered_results.txt
grep -o "wbBrowseHorizontalCntDomDocumentHtmlElements.*$" results.txt >> filtered_results.txt
printf "\n" >> filtered_results.txt
grep -o "wbBrowseHorizontalCntLayoutBoxesM.*$" results.txt >> filtered_results.txt
printf "\n" >> filtered_results.txt
grep -o "wbBrowseHorizontalCntLayoutBoxesCreated.*$" results.txt >> filtered_results.txt
printf "\n" >> filtered_results.txt
grep -o "wbBrowseHorizontalCntRenderTrees.*$" results.txt >> filtered_results.txt
printf "\n" >> filtered_results.txt
grep -o "wbBrowseHorizontalCntRequestedImages.*$" results.txt >> filtered_results.txt
printf "\n" >> filtered_results.txt
printf "\nBROWSE VERTICAL SCROLL EVENTS\n" >> filtered_results.txt
grep -o "wbBrowseVerticalCntDomHtmlElementsM.*$" results.txt >> filtered_results.txt
printf "\n" >> filtered_results.txt
grep -o "wbBrowseVerticalCntDomDocumentHtmlElements.*$" results.txt >> filtered_results.txt
printf "\n" >> filtered_results.txt
grep -o "wbBrowseVerticalCntLayoutBoxesM.*$" results.txt >> filtered_results.txt
printf "\n" >> filtered_results.txt
grep -o "wbBrowseVerticalCntLayoutBoxesCreated.*$" results.txt >> filtered_results.txt
printf "\n" >> filtered_results.txt
grep -o "wbBrowseVerticalCntRenderTrees.*$" results.txt >> filtered_results.txt
printf "\n" >> filtered_results.txt
grep -o "wbBrowseVerticalCntRequestedImages.*$" results.txt >> filtered_results.txt
printf "\n" >> filtered_results.txt
cat filtered_results.txt
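If you would rather keep the filtering inline with other post-processing, the same prefix selection can be sketched in Python (the prefixes below are just examples; substitute the metrics you care about):

```python
# Keep only TEST_RESULT payloads whose metric name starts with a wanted prefix.
PREFIXES = (
    'wbStartupDurLaunchToBrowseUs',
    'wbBrowseHorizontalDurTotalUs',
)
MARKER = 'webdriver_benchmark TEST_RESULT: '


def filter_results(lines, prefixes=PREFIXES):
    kept = []
    for line in lines:
        if MARKER in line:
            payload = line.split(MARKER, 1)[1].strip()
            if payload.startswith(prefixes):
                kept.append(payload)
    return kept


lines = [
    'webdriver_benchmark TEST_RESULT: wbStartupDurLaunchToBrowseUsMean 4100000',
    'webdriver_benchmark TEST_RESULT: wbBrowseVerticalDurTotalUsPct50 90000',
]
print(filter_results(lines))
# -> ['wbStartupDurLaunchToBrowseUsMean 4100000']
```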