User:Jwilder~mediawikiwiki/Jeff's Test Gen Script - SMAA

Introduction

This test suite generation methodology is designed to give a high degree of flexibility with minimal effort. People typically write their own Perl scripts to generate a suite of tests covering the parameters under test. While my flow also relies on a Perl script written by the user, a Perl library is provided to do much of the heavy lifting of performing cross products and basic stat rollup.

The methodology consists of specifying a base configuration file and then a number of series of parameters to sweep. The script performs the cross-product of the parameters and creates new config files consisting of the base cfg with the new params tacked onto the end.

You also specify the stats you want rolled up, using regular expressions. If there are multiple matches, the stat results are averaged automatically (e.g. throughput across multiple PCIe buses). Stats with a value of 0 are automatically excluded from the average. Callback provisions allow you to create virtual stats: you can modify existing stats or create new stats that are mathematical variants of other stats or parameters. An example of this is calculating memory throughput from a CAS count stat and the simulation time.

A single script can easily manage multiple similar studies that involve different base configs. For example, if you want to do the same type of throughput-latency curve study on a 2P and a 4P system, you only need to specify an additional base config. The series specifications and stats don't need to be repeated, which simplifies modification and maintenance.

Setup

  1. Make sure your $PERL5LIB variable includes the path where TestGen.pm resides. This is: /p/map/arch/ds0/lib (an alternative to editing $PERL5LIB is sketched just after this list).
  2. Create an appropriate test suite generation script. Some examples are in /nfs/pdx/disks/ppa.map.6/lib/sample_test_gen_scripts
  3. You need to have your own indigo_run.pl script to execute the test. This script launches the indigo executable with the appropriate cfg file argument, and then strips out key stats from the result file to make rollup faster. You can copy mine at: /nfs/site/home/jwilder/bin/indigo_run.pl. If you want to run Coho instead of Indigo stand-alone, make the appropriate changes in this file.
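
If you would rather not modify $PERL5LIB, a standard Perl alternative is to point the generation script at the library directory from within the script itself; a minimal sketch using the path above:

use lib '/p/map/arch/ds0/lib';   # makes TestGen.pm findable without touching $PERL5LIB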

Flow

  1. Create a test generation script, e.g. gen_study.pl.
  2. Execute gen_study.pl create. This creates the sub-directory specified in the script and creates all the .cfg files for the study. It also creates a launch.bat file that can be sent to NB.
  3. If your indigo_run.pl script only has a local path to the indigo or coho executable, like mine does, then you need to copy the executable to the new directory. I prefer this method over an absolute path to the build directory so that I can reliably reproduce a particular test later, with more debug info, etc.
  4. Send the launch.bat file to NB. I use an alias for the NB params, so that I can just say mynbm29 launch.bat.
  5. When all the tests are completed, run gen_study.pl tally to roll up all the specified stats for each test into a single results.txt file.

Script Creation

Start with an example. Keep the top part unchanged:

use strict;
use vars        qw($outdir $base_cfg $base_outname $pre_override $post_override $os $home $sep);
use TestGen     qw(&ProcessArgs &AddSeries &GetValue &GetStatVal &SetStatVal &SetStatName &ClearSeries &ClearAllSeries &ClearAllStats &Process &AppendOverride *RESULT);

# Need to specify "create" or "tally" as an argument
ProcessArgs();

Set $base_cfg to a base configuration file. It's OK for this file to have parameters that you are going to change. The changes will be tacked on later in the file and thus will override previous settings.

Optionally set $base_outname. This will be a prefix to the automatically generated test name, which may be useful if you have multiple studies rolled up into the same spreadsheet.

Set $outdir to the name of the directory you want to be created for this study.

Optionally set $pre_override and $post_override. These tack the specified text onto the beginning or end of the config file. $pre_override was useful in Plato, but I can't think of a reason to use it in Indigo. $post_override is useful for specifying a parameter that should be applied to every test in a particular study but isn't included in the base config. For example, you can have one study with -dram_force_all_pg_hits 0, and another with -dram_force_all_pg_hits 1.
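
For example, a study setup might start like the following sketch (the file names, prefix, and appended knob are illustrative placeholders, not values from a real study):

$base_cfg      = "base_2p.cfg";                   # existing base configuration to build on
$base_outname  = "tputlat";                       # optional prefix for generated test names
$outdir        = "tput_lat_2p";                   # directory created to hold this study
$post_override = "-dram_force_all_pg_hits 1\n";   # appended to the end of every generated cfg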

ClearAllSeries

Call ClearAllSeries(); at the beginning of each study to clear out the data structure that manages the series data.

AddSeries

Call AddSeries(tag, param name|function, value, ...) to specify a series of param values to be swept and participate in the cross product.

The filename of the test is made up of the tag and current value of each of the unique series. Only the first occurrence of each tag is used in the filename.

tag is just a unique label for this series and will be used to construct the test name, so keep it brief. If you have multiple lines with the same tag then the vertical slice will be taken together and put in the config file. Only unique tags participate in the cross-product. Note the following example. We don't want the full package string to be in the filename, so we make the first line contain a more readable description of the parameter; 2, 4, 6, or 8 cores. The second line of the "cores" tag specifies the actual parameter to go in the config file.

AddSeries ("cores", "",                             "2",              "4",                "6",                "8");
AddSeries ("cores", "-package_config",   "JDIDDDTDDIDR",   "JDIIDDTDIIDR", "JDIIIDDDTDDIIIDR", "JDIIIIDDTDIIIIDR");

Param name / function is the 2nd argument of the function and typically is the name of the knob that goes in the config file. If you leave this string empty (""), nothing gets added to the config file. This is common just to provide a readable filename, as in the example above. If this argument does not start with '-', it is assumed to be a function name that will be called to do more complicated stuff. More on that later.

The next arguments are simply a variable number of parameter values to sweep across. However, one additional feature is that you can specify an exclusion tag, of the form 'e<n>:'. These only need to be specified for the first occurrence of a particular tag. If the test generation algorithm comes across 2 or more values from different series that have the same exclusion tag, then this test combination will be completely excluded from generation. For example, if a concurrent test is sweeping core BW throughput and IO throughput, you don't want a test that has both of those values 0. So you might specify something like:

AddSeries ("creq", "set_cia",   "e1:0", "2", "4", "6", "8", "10");
AddSeries ("ireq", "set_iia",   "e1:0", "1", "2", "3", "4", "5");

The combination of creq0 and ireq0 will be excluded. The number associated with the exclusion tag needs to be a single digit, so 10 separate exclusions can be specified.

ClearAllStats

Call ClearAllStats(); at the beginning of each study to clear out the data structure that manages the stat data.

SetStatName

Call SetStatName(tag, stat spec); to register a stat to be rolled up.

The tag will be used for reference and also printed as a column header for each stat.

The stat spec is a regular expression for matching the stat name in a testname.stat file. The .stat file is created by the indigo_run.pl test-running script. Multiple matches are averaged together automatically, and results of 0 are automatically excluded from the average.
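
For example (the tags here match the ones used in the TallyCallback example below, but the stat names and regular expressions are hypothetical; use whatever names appear in your .stat files):

ClearAllStats();
SetStatName ("core_rd_bw", "core[0-9]+_read_bw");        # averaged across all matching cores
SetStatName ("core_wr_bw", "core[0-9]+_write_bw");
SetStatName ("pcie_tput",  "pcie_bus[0-9]+_throughput"); # e.g. averaged across multiple PCIe buses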

Process

Finally, call Process(); to cause the script to generate the suite of tests.

After the Process() line, you can create an additional suite by making small modifications without having to specify the series all over again. For example, you can just specify a new $outdir and $base_cfg. This is useful if you want to run the same suite of tests on a 2P vs. 4P system, NUMA vs. UMA, etc. ... anything with enough parameter differences that it warrants a separate base config file rather than just specifying them on a series.
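
For example, a follow-on 4P study reusing everything from the 2P study above might look like this sketch (file and directory names are illustrative):

Process();                      # generate the first study (e.g. the 2P config)

$base_cfg = "base_4p.cfg";      # only the base config and output directory change
$outdir   = "tput_lat_4p";
Process();                      # same series and stats, new base config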

PrintHeader Callback

This callback is called just before the stats are output to the results.txt file. It prints the header for each column of stats in the result table, and populates the @stat_key_list structure used to output the stats. The most common things to do here are adjust the order in which the stats are listed and add any virtual stats you want. A 'virtual stat' is one that was not explicitly named with a SetStatName call, but is calculated in TallyCallback.
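
A rough sketch of the idea is below. How the callback is registered with TestGen, how @stat_key_list is shared with the library, and whether the header is written to the RESULT handle exported by TestGen.pm are assumptions here, not details documented on this page; treat the body as illustrative only:

sub PrintHeader {
    # Column order for results.txt; 'total_core_tput' is a virtual stat that is
    # computed later in TallyCallback rather than registered via SetStatName.
    @stat_key_list = ('core_rd_bw', 'core_wr_bw', 'total_core_tput');
    print RESULT join("\t", 'test_name', @stat_key_list), "\n";   # assumed output handle and format
}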

TallyCallback

This callback is called for each stat in the @stat_key_list structure and gives you the opportunity to make adjustments to the value. You can do calculations utilizing previously processed stat values via GetStatVal(tag). Then the new value can be set with SetStatVal(tag, val). For example:

        # Extract the core count from the generated test name (e.g. "cores4_..."),
        # which comes from the "cores" series tag used when the tests were created.
        if ($test_name =~ /cores([0-9]+)_/) {
            $numcores = $1;
        }
        # Virtual stat: total core throughput, derived from the already-rolled-up
        # per-core read and write bandwidth stats and scaled by the core count.
        if ($key eq "total_core_tput") {
            SetStatVal($key, (GetStatVal('core_rd_bw') + GetStatVal('core_wr_bw')) * $numcores);
        }