| ## Key Aspects of a Well Designed Test |
| |
| A badly written test can lead to false passes or false failures, as |
| well as inaccurate interpretations of the specs. Therefore it is |
| important that the tests all be of a high standard. All tests must |
| follow the [test format guidelines][test-format] and well designed |
| tests should meet the following criteria: |
| |
| * **The test passes when it's supposed to pass** |
| * **The test fails when it's supposed to fail** |
| * **It's testing what it claims to be testing** |
| |
| ## Self-Describing Tests |
| |
| As the tests are likely to be used by many other people, making them |
| easy to understand is very important. Ideally, tests are |
| self-describing: the test page itself describes what the page |
| should look like when the test has passed. A human examining the |
| test page can then determine from the description whether the test |
| has passed or failed. |
| |
| _Note: The terms "the test has passed" and "the test has failed" |
| refer to whether the user agent has passed or failed a |
| particular test — a test can pass in one web browser and fail in |
| another. In general, the language "the test has passed" is used |
| when it is clear from context that a particular user agent is |
| being tested, and the term "this-or-that-user-agent has passed |
| the test" is used when multiple user agents are being compared._ |
| |
| Self-describing tests have some advantages: |
| |
| * They can be run easily on any layout engine. |
| * They can test areas of the spec that are not precise enough to be |
| comparable to a reference rendering. (For example, underlining |
| cannot be compared to a reference because the position and |
| thickness of the underline are UA-dependent.) |
| * Failures can (should) be easily determined by a human viewing the |
| test without needing special tools. |
| |
| ### Manual Tests |
| |
| While it is highly encouraged to write automatable tests either as |
| [reftests][reftests] or [script tests][scripttests], in rare cases a |
| test can only be executed manually. All manual tests must be |
| self-describing tests. Additionally, manual tests should be: |
| |
| * Easy and quick to evaluate |
| * Self-explanatory, not requiring an understanding of the |
| specification to determine the result |
| * Short (a paragraph or so), and certainly not requiring scrolling |
| on even the most modest of screens, unless the test is |
| specifically for scrolling or paginating behaviour. |
| |
| ### Reftests |
| |
| [Reftests][reftests] should be self-describing tests wherever |
| possible. This means the descriptive statement included in the |
| test file must also appear in the reference file so their renderings |
| may be automatically compared. |
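| |
| As a rough sketch of that pairing, the test file below links to its |
| reference with `rel="match"` and carries the descriptive sentence |
| that the reference must repeat. The file names and the choice of |
| `margin-left` as the feature under test are illustrative |
| assumptions, not taken from an existing test. |
| |
| ``` html |
| <!DOCTYPE html> |
| <!-- Test file (hypothetical name: margin-left-001.html) --> |
| <title>margin-left shifts a block box by the given length</title> |
| <!-- Hypothetical reference file name --> |
| <link rel="match" href="green-square-ref.html"> |
| <style> |
|   .test { margin-left: 50px; width: 100px; height: 100px; |
|           background: green; } |
| </style> |
| <p>Test passes if there is a green square and no red.</p> |
| <div class="test"></div> |
| ``` |
| |
| The reference file would contain the same `<p>` sentence and draw |
| the square with a different feature, for example `position: |
| relative; left: 50px`, so that the two renderings match only when |
| the feature under test works. |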
| |
| ### Script Tests |
| |
| [Script tests][scripttests] may also be self-describing, but rather |
| than including a supplemental statement on the page, this is |
| generally done in the test results output from `testharness.js`. |
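| |
| A minimal sketch of such a script test follows. It assumes the |
| harness scripts are served from `/resources/`, which depends on how |
| the test server is set up; the `#target` element and the checked |
| property are made up for illustration. The string passed as the |
| second argument to `test()` is what appears in the results output. |
| |
| ``` html |
| <!DOCTYPE html> |
| <title>Script test sketch: computed color</title> |
| <!-- Assumed harness location; adjust to the server layout --> |
| <script src="/resources/testharness.js"></script> |
| <script src="/resources/testharnessreport.js"></script> |
| <style> |
|   #target { color: green; } |
| </style> |
| <div id="target">Target element</div> |
| <script> |
|   // The second argument names the test in the results output. |
|   test(function () { |
|     var target = document.getElementById("target"); |
|     assert_equals(getComputedStyle(target).color, "rgb(0, 128, 0)", |
|                   "color should compute to green"); |
|   }, "The #target rule sets the computed color to green"); |
| </script> |
| ``` |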
| |
| ### Self-Describing Test Examples |
| |
| The following are some examples of self-describing tests, using some |
| common [techniques](#techniques) to identify passes: |
| |
| * [Identical Renderings][identical-renderings] |
| * [Green Background][green-background] |
| * [No Red 1][no-red-1] |
| * [No Red 2][no-red-2] |
| * [Described Alignment][described-alignment] |
| * [Overlapping][overlapping] |
| * [Imprecise Description 1][imprecise-1] |
| * [Imprecise Description 2][imprecise-2] |
| |
| ## Techniques |
| |
| In addition to the [self-describing](#self-describing) statement |
| visible in the test, there are many techniques commonly used to add |
| clarity and robustness to tests. Particularly for reftests, which |
| rely wholly on how the page is rendered, the following should be |
| considered and used when designing new tests. |
| |
| ### Indicating success |
| |
| #### The green paragraph |
| |
| This is the simplest form of test, and is most often used when |
| testing the things that are independent of the rendering, like |
| the CSS cascade or selectors. Such tests consist of a single line of |
| text describing the pass condition, which will be one of the |
| following: |
| |
| <span style="color: green">This line should be green.</span> |
| |
| <span style="border: 5px solid green">This line should have a green |
| border.</span> |
| |
| <span style="background: green; color: white">This line should have |
| a green background.</span> |
| |
| #### The green page |
| |
| This is a variant on the green paragraph test. Certain parts of CSS |
| affect the entire page, and this category of test may be used when |
| testing them. Care has to be taken when writing tests like this that |
| the test will not result in a single green paragraph if it fails. |
| This is usually done by forcing the short descriptive paragraph to |
| have a neutral color (e.g. white). |
| |
| This [example][green-page] is poorly designed, because it does not |
| look red when it has failed. |
| |
| #### The green square |
| |
| This is the best type of test for cases where a particular rendering |
| rule is being tested. The test usually consists of two boxes of some |
| kind that are (through the use of positioning, negative margins, |
| zero line height, transforms, or other mechanisms) carefully placed |
| over each other. The bottom box is colored red, and the top box is |
| colored green. Should the top box be misplaced by a faulty user |
| agent, it will cause the red to be shown. (These tests sometimes |
| come in pairs, one checking that the first box is no bigger than the |
| second, and the other checking the reverse.) These tests frequently |
| look like: |
| |
| <p>Test passes if there is a green square and no red.</p> |
| <div style="width: 100px; height: 100px; background: green"></div> |
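| |
| The snippet above shows only the passing appearance. A fuller |
| sketch of the two-box construction might look like the following, |
| assuming that negative margins are the feature under test; the |
| class names are arbitrary. |
| |
| ``` html |
| <!DOCTYPE html> |
| <title>Green square sketch</title> |
| <style> |
|   .red   { width: 100px; height: 100px; background: red; } |
|   /* The negative top margin pulls the green box up over the red one. */ |
|   .green { width: 100px; height: 100px; margin-top: -100px; |
|            background: green; } |
| </style> |
| <p>Test passes if there is a green square and no red.</p> |
| <div class="red"></div> |
| <div class="green"></div> |
| ``` |
| |
| If the user agent ignores or miscalculates the negative margin, the |
| green box no longer covers the red one and some red becomes visible. |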
| |
| #### The green paragraph and the blank page |
| |
| These tests appear to be identical to the green paragraph tests |
| mentioned above. In reality, however, they actually have more in |
| common with the green square tests, but with the green square |
| colored white instead. This type of test is used when the |
| displacement that could be expected in the case of failure is |
| likely to be very small, and so any red must be made as obvious as |
| possible. Because of this, the test would appear totally blank when the |
| test has passed. This is a problem because a blank page is the |
| symptom of a badly handled network error. For this reason, a single |
| line of green text is added to the top of the test, reading |
| something like: |
| |
| <p style="color: green">This line should be green and there should |
| be no red on this page.</p> |
| [Example][green-paragraph] |
| |
| #### The two identical renderings |
| |
| It is often hard to make a test that is purely green when the test |
| passes and visibly red when the test fails. For these cases, it may |
| be easier to make a particular pattern using the feature that is |
| being tested, and then have a reference rendering next to the test |
| showing exactly what the test should look like. |
| |
| The reference rendering could be either an image, in the case where |
| the rendering should be identical, to the pixel, on any machine, or |
| the same pattern made using different features. (Doing the second |
| has the advantage of making the test a test of both the feature |
| under test and the features used to make the reference rendering.) |
| |
| [Visual Example 1][identical-visual-1] |
| |
| [Visual Example 2][identical-visual-2] |
| |
| [Text-only Example][identical-text] |
| |
| ### Indicating failure |
| |
| In addition to having clearly defined characteristics when |
| they pass, well designed tests should have some clear signs when |
| they fail. It can sometimes be hard to make a test do something only |
| when the test fails, because it is very hard to predict how user |
| agents will fail! Furthermore, in a rather ironic twist, the best |
| tests are those that catch the most unpredictable failures! |
| |
| Having said that, here are the best ways to indicate failures: |
| |
| #### Red |
| |
| Using the color red is probably the best way of highlighting |
| failures. Tests should be designed so that if the rendering is a few |
| pixels off, some red is uncovered or otherwise rendered on the page. |
| |
| [Visual Example][red-visual] |
| |
| [Text-only Example][red-text] |
| |
| _View the pages' source to see the usage of the color |
| red to denote failure._ |
| |
| #### Overlapped text |
| |
| Tests of the `line-height`, `font-size` and similar properties can |
| sometimes be devised in such a way that a failure will result in the |
| text overlapping. |
| |
| #### The word "FAIL" |
| |
| Some properties lend themselves well to this kind of test, for |
| example `quotes` and `content`. The idea is that if the word "FAIL" |
| appears anywhere, something must have gone wrong. |
| |
| [Example][fail-example] |
| |
| _View the page's source to see the usage of the word FAIL._ |
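| |
| One possible shape for such a test is sketched below using the |
| `content` property. It assumes that selector specificity is the |
| behaviour being checked; the class and ID names are hypothetical. |
| |
| ``` html |
| <!DOCTYPE html> |
| <title>"FAIL" sketch for generated content</title> |
| <style> |
|   /* If the more specific rule below is not applied, FAIL is rendered. */ |
|   p.test::before { content: "FAIL"; } |
|   p#pass.test::before { content: "PASS"; } |
| </style> |
| <p>Test passes if the word "PASS" appears below and the word |
| "FAIL" does not.</p> |
| <p id="pass" class="test"></p> |
| ``` |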
| |
| ### Special Fonts |
| |
| #### Ahem |
| Todd Fahrner has developed a font called [Ahem][ahem-readme], which |
| consists of some very well defined glyphs of precise sizes and |
| shapes. This font is especially useful for testing font and text |
| properties. Without this font it would be very hard to use the |
| overlapping technique with text. |
| |
| The font's em-square is exactly square. Its ascent and descent |
| together are exactly the size of the em square. This means that the |
| font's extent is exactly the same as its line-height, meaning that |
| it can be exactly aligned with padding, borders, margins, and so |
| forth. |
| |
| The font's alphabetic baseline is 0.2em above its bottom, and 0.8em |
| below its top. |
| |
| The font has four glyphs: |
| |
| * X U+0058 A square exactly 1em in height and width. |
| * p U+0070 A rectangle exactly 0.2em high, 1em wide, and aligned so |
| that its top is flush with the baseline. |
| * É U+00C9 A rectangle exactly 0.8em high, 1em wide, and aligned so |
| that its bottom is flush with the baseline. |
| * (space) U+0020 A transparent space exactly 1em high and wide. |
| |
| Most other US-ASCII characters in the font have the same glyph as X. |
| |
| #### Ahem Usage |
| __If the test uses the Ahem font, make sure its computed font-size |
| is a multiple of 5px__, otherwise baseline alignment may be rendered |
| inconsistently (due to rounding errors introduced by certain |
| platforms' font APIs). We suggest using a minimum computed |
| font-size of 20px. |
| |
| E.g. Bad: |
| |
| ``` css |
| {font: 1in/1em Ahem;} /* Computed font-size is 96px */ |
| {font: 1in Ahem;} /* Computed font-size is 96px */ |
| {font: 1em/1em Ahem;} /* with computed 1em font-size being 16px */ |
| {font: 1em Ahem;} /* with computed 1em font-size being 16px */ |
| ``` |
| |
| E.g. Good: |
| |
| ``` css |
| {font: 100px/1 Ahem;} |
| {font: 1.25em/1 Ahem;} /* with computed 1.25em font-size being 20px */ |
| ``` |
| |
| __If the test uses the Ahem font, make sure the line-height on block |
| elements is specified; avoid `line-height: normal`__. Also, for |
| absolute reliability, the difference between computed line-height |
| and computed font-size should be divisible by 2. |
| |
| E.g. Bad: |
| |
| ``` css |
| {font: 1.25em Ahem;} /* computed line-height value is 'normal' */ |
| {font: 20px Ahem;} /* computed line-height value is 'normal' */ |
| {font-size: 25px; line-height: 50px;} /* the difference between |
| computed line-height and computed font-size is not divisible by 2. */ |
| ``` |
| |
| E.g. Good: |
| |
| ``` css |
| {font-size: 25px; line-height: 51px;} /* the difference between |
| computed line-height and computed font-size is divisible by 2. */ |
| ``` |
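| |
| Putting both rules together, a sketch of an Ahem-based test that |
| overlaps two lines of text could look like this. It assumes Ahem is |
| installed on the test machine; the class names are arbitrary. |
| |
| ``` html |
| <!DOCTYPE html> |
| <title>Ahem sketch</title> |
| <style> |
|   /* 20px is a multiple of 5px. A line-height of 1 gives 20px lines, |
|      so line-height minus font-size is 0, which is divisible by 2. */ |
|   .ahem  { font: 20px/1 Ahem; width: 5em; } |
|   .red   { color: red; } |
|   /* Pull the green line up by one line so it covers the red line. */ |
|   .green { color: green; margin-top: -1em; } |
| </style> |
| <p>Test passes if there is a green rectangle and no red.</p> |
| <div class="ahem red">XXXXX</div> |
| <div class="ahem green">XXXXX</div> |
| ``` |
| |
| Because every `X` glyph is a 1em square, the five green glyphs |
| cover the five red ones exactly when the layout is correct. |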
| |
| [Example test using Ahem][ahem-example] |
| |
| _View the page's source to see how the Ahem font is used._ |
| |
| |
| ##### Installing Ahem |
| |
| 1. Download the [TrueType version of Ahem][download-ahem]. |
| 2. Open the folder where you downloaded the font file. |
| 3. Right-click the downloaded font file and select "Install". |
| |
| ### Explanatory Text |
| |
| For tests that must be long (e.g. scrolling tests), it is important |
| to make it clear that the filler text is not relevant, otherwise the |
| tester may think they are missing something and therefore waste time |
| reading the filler text. Good text for use in these situations is, |
| quite simply, "This is filler text. This is filler text. This is |
| filler text.". If it looks boring, it's working! |
| |
| ### Color |
| |
| In general, using colors in a consistent manner is recommended. |
| Specifically, the following convention has been developed: |
| |
| #### Red |
| Any red indicates failure. |
| |
| #### Green |
| In the absence of any red, green indicates success. |
| |
| #### Blue |
| Tests that do not use red or green to indicate success or failure |
| should use blue to indicate that the tester should read the text |
| carefully to determine the pass conditions. |
| |
| #### Black |
| Descriptive text is usually black. |
| |
| #### Fuchsia, Yellow, Teal, Orange |
| These are useful colors when making complicated patterns for tests |
| of the two identical renderings type. |
| |
| #### Dark Gray |
| Descriptive lines, such as borders around nested boxes, are usually |
| dark gray. These lines are useful when reducing the test to a |
| minimal case for engineers. |
| |
| #### Silver / Light Gray |
| |
| Sometimes used for filler text to indicate that it is irrelevant. |
| |
| ### Methodical testing |
| |
| Some web features can be tested quite thoroughly with a very |
| methodical approach. For example, testing that all the length units |
| work for each property taking lengths is relatively easy, and can be |
| done methodically simply by creating a test for each property/unit |
| combination. |
| |
| In practice, the important thing to decide is when to be methodical |
| and when to simply test, in an ad hoc fashion, a cross section of |
| the possibilities. |
| |
| This is an [example][methodical-test] of a methodical test of the |
| `:not()` pseudo-class with each attribute selector in turn, first |
| for long values and then for short values. |
| |
| ### Overlapping |
| |
| This technique should not be cast aside as a curiosity -- it is in |
| fact one of the most useful techniques for testing CSS, especially |
| for areas like positioning and the table model. |
| |
| The basic idea is that a red box is first placed using one set of |
| properties, e.g. the block box model's margin, height and width |
| properties, and then a second box, green, is placed on top of the |
| red one using a different set of properties, e.g. using absolute |
| positioning. |
| |
| This idea can be extended to any kind of overlapping, for example |
| overlapping two lines of identical text of different colors. |
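| |
| The box case described above can be sketched as follows: the red |
| box is placed with block box model properties and the green box |
| with absolute positioning. The offsets and class names are |
| illustrative assumptions. Only a left margin is used so that margin |
| collapsing does not move the red box relative to its container. |
| |
| ``` html |
| <!DOCTYPE html> |
| <title>Overlapping sketch</title> |
| <style> |
|   .container { position: relative; } |
|   /* Red box placed with margin, width and height. */ |
|   .red   { margin-left: 40px; width: 100px; height: 100px; |
|            background: red; } |
|   /* Green box placed over it with absolute positioning. */ |
|   .green { position: absolute; top: 0; left: 40px; |
|            width: 100px; height: 100px; background: green; } |
| </style> |
| <p>Test passes if there is a green square and no red.</p> |
| <div class="container"> |
|   <div class="red"></div> |
|   <div class="green"></div> |
| </div> |
| ``` |
| |
| A fault in either placement mechanism leaves part of the red box |
| uncovered. |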
| |
| ## Tests to avoid |
| |
| ### The long test |
| |
| Any manual test that is so long that it needs to be scrolled to be |
| completed is too long. The reason for this becomes obvious when you |
| consider how manual tests will be run. Typically, the tester will be |
| running a program (such as "Loaderman") which cycles through a list |
| of several hundred tests. Whenever a failure is detected, the tester |
| will do something (such as hit a key) that takes a note of the test |
| case name. Each test will be on the screen for about two or three |
| seconds. If the tester has to scroll the page, that means they have |
| to stop the test to do so. |
| |
| Of course, there are exceptions -- the most obvious one being any |
| tests that examine the scrolling mechanism! However, these tests are |
| considered tests of user interaction and are not run with the |
| majority of the tests. |
| |
| Any test that is so long that it needs scrolling can usually be |
| split into several smaller tests, so in practice this isn't much of |
| a problem. |
| |
| This is an [example][long-test] of a test that is too long. |
| |
| ### The counterintuitive "this should be red" test |
| |
| As mentioned many times in this document, red indicates a bug, so |
| nothing should ever be red in a test. |
| |
| There is one important exception to this rule... the test for the |
| `red` value for the color properties! |
| |
| ### Unobvious tests |
| |
| A test that has half a sentence of normal text with the second half |
| bold if the test has passed is not very obvious, even if the |
| sentence in question explains what should happen. |
| |
| There are various ways to avoid this kind of test, but no general |
| rule can be given since the affected tests are so varied. |
| |
| The last [subtest on this page][unobvious-test] shows this problem. |
| |
| [test-format]: ./test-format-guidelines.html |
| [reftests]: ./reftests.html |
| [scripttests]: ./testharness-documentation.html |
| [identical-renderings]: http://test.csswg.org/source/css21/syntax/escapes-000.xht |
| [green-background]: http://test.csswg.org/source/css21/syntax/escapes-002.xht |
| [no-red-1]: http://test.csswg.org/source/css21/positioning/abspos-containing-block-003.xht |
| [no-red-2]: http://test.csswg.org/source/css21/tables/border-conflict-w-079.xht |
| [described-alignment]: http://test.csswg.org/source/css21/margin-padding-clear/margin-collapse-clear-007.xht |
| [overlapping]: http://test.csswg.org/source/css21/tables/table-anonymous-objects-021.xht |
| [imprecise-1]: http://test.csswg.org/source/css21/tables/border-style-inset-001.xht |
| [imprecise-2]: http://test.csswg.org/source/css21/text/text-decoration-001.xht |
| [green-page]: http://www.hixie.ch/tests/adhoc/css/background/18.xml |
| [green-paragraph]: http://www.hixie.ch/tests/adhoc/css/fonts/size/002.xml |
| [identical-visual-1]: http://test.csswg.org/source/css21/floats-clear/margin-collapse-123.xht |
| [identical-visual-2]: http://test.csswg.org/source/css21/normal-flow/inlines-016.xht |
| [identical-text]: http://test.csswg.org/source/css21/fonts/shand-font-000.xht |
| [red-visual]: http://test.csswg.org/source/css21/positioning/absolute-replaced-height-018.xht |
| [red-text]: http://test.csswg.org/source/css21/syntax/comments-003.xht |
| [fail-example]: http://test.csswg.org/source/css21/positioning/abspos-overflow-005.xht |
| [ahem-example]: http://test.csswg.org/source/css21/positioning/absolute-non-replaced-width-001.xht |
| [ahem-readme]: http://www.w3.org/Style/CSS/Test/Fonts/Ahem/README |
| [download-ahem]: http://www.w3.org/Style/CSS/Test/Fonts/Ahem/AHEM____.TTF |
| [long-test]: http://www.hixie.ch/tests/evil/mixed/lineheight3.html |
| [unobvious-test]: http://www.w3.org/Style/CSS/Test/CSS1/current/sec525.htm |
| [methodical-test]: http://www.hixie.ch/tests/adhoc/css/selectors/not/010.xml |