6.7.13

Customizing Apache BigTop smoke tests

Apache BigTop smoke tests can confirm that all of your ecosystem components are working... but you never really want to test EVERYTHING.

0) When you start, make sure you have logging configured to TRACE for easy debugging:

Logging of BigTop tests is largely done by the Shell.groovy class, which uses the Apache Commons Logging library to log the output of system calls to MapReduce jobs and ecosystem tool invocations. You can edit the logging level in the top-level pom file for test-execution:


bigtop-tests/test-execution/common/pom.xml
Otherwise, be careful what argument you send to your smoke execution task: it determines whether or not your client's standard out gets saved. See the log4j.level argument in the mvn invocation in step (3) below.
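
For example, rather than editing the pom at all, one simple option (assuming the property is wired through to the test logging, which the invocation in step (3) suggests) is to override the same log4j level on the command line, bumping it from INFO to TRACE for a single run:

mvn -fae clean verify -Dorg.apache.bigtop.itest.log4j.level=TRACE -f smokes/pom.xml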

1) Start by excluding the ecosystem components which you don't use. Surely, there is no point in running Flume integration tests if you're only running MapReduce and Hive, right?

You can do this easily by editing the "modules" tag in the pom file. Below is what the modules will look like when only running the basic Hadoop smoke tests.
smokes/pom.xml:
<modules>
  <!--
    List of modules which should be executed as a part of stack testing run
    <module>pig</module>
    <module>hive</module>
  -->

  <module>hadoop</module>

  <!--
    <module>oozie</module>
    <module>hbase</module>
    <module>mahout</module>
    <module>giraph</module>
    <module>hue</module>
    <module>crunch</module>
    <module>flume</module>
    <module>sqoop</module>
    <module>hcatalog</module>
  -->
</modules>
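
(As an aside: if you'd rather not touch the <modules> list at all, Maven's -pl flag should also let you restrict a run to particular submodules, something like:

mvn -fae clean verify -Dorg.apache.bigtop.itest.log4j.level=INFO -f smokes/pom.xml -pl hadoop

...but editing the modules tag as above keeps the selection sticky across runs.)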

2) Now, dive further into the remaining submodules (in this case, hadoop) and refine which specific tests in that module you actually want to run (for example, you can exclude testing of Hadoop's snappy features if you're not using snappy compression, etc.).

All the BigTop tests are in named classes under
./bigtop-tests/test-artifacts/<ecosystem>/src/main/groovy/org/apache/bigtop/itest/<ecosystem>/<testclassname>
where <ecosystem> is the ecosystem tool, for example "mahout", and <testclassname> is the name of the iTest implementation in BigTop under that package (e.g. "TestMahoutExamples").
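
If you want to see exactly which test classes exist for a given component, one quick way (run from the top of the BigTop source checkout) is to just list the Groovy files under the corresponding test-artifacts module:

find bigtop-tests/test-artifacts/hadoop -name 'Test*.groovy'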
 
To see which tests you don't care about, or why tests are failing, you can just grep through the target/failsafe-reports for FAILUREs.
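
For example, something like this (the exact report path will depend on which submodule ran the tests) pulls out the failing reports:

grep -Rl FAILURE smokes/hadoop/target/failsafe-reports/
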
Speaking of failures... a quick side note on debugging tests: thankfully, iTest logs the exact hadoop command invocations, so you can manually re-run a failed command and watch hadoop crash and burn directly, rather than having to re-run the whole BigTop test suite just to debug a single failed command/job.

For example (from the failsafe logs):
13/07/06 22:46:53 TRACE shell.Shell: /bin/bash << __EOT__
hdfs dfsadmin -safemode get
__EOT__
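
To re-run that invocation by hand, just paste it straight back into a terminal:

/bin/bash << __EOT__
hdfs dfsadmin -safemode get
__EOT__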

So... here is how you remove unwanted sub-ecosystem components:

Now it's time to open up the ecosystem subcomponent pom file and micro-customize the individual tests you want to run. Note that the Maven Failsafe plugin will, by default, include no tests in BigTop (because BigTop tests, at the time of this writing, do not use the IT* naming convention which failsafe expects).

Thus, you will have to be specific in your inclusion filters, and you shouldn't need to use exclusion filters (except to filter out some of your includes).

So, there are specific poms for each of: hadoop, sqoop, flume, hbase, giraph, etc.

In this case, let's fire up vi and edit the hadoop/pom.xml tests to only run the TestCli* and TestFs* smokes, and nothing else:

smokes/hadoop/pom.xml:
     <plugin>
        <groupId>org.apache.maven.plugins</groupId>
        <artifactId>maven-failsafe-plugin</artifactId>
        <version>2.11</version>
        <configuration>
          <forkMode>always</forkMode>
          <systemPropertyVariables>
            <test.cache.data>
              ${project.build.directory}/clitest_data
            </test.cache.data>
          </systemPropertyVariables>
          <includes>
            <include>**/Test*</include> <!-- DELETE THIS -->
          </includes>


... And add a more precise include tag:
smokes/hadoop/pom.xml:
          <includes>
            <include>**/TestCli*</include>
            <include>**/TestFs*</include>
          </includes>
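
(For a quick one-off run, you can presumably also skip the pom edit entirely and use failsafe's it.test property to pick the tests from the command line, e.g.:

mvn -fae clean verify -Dorg.apache.bigtop.itest.log4j.level=INFO -f smokes/pom.xml -Dit.test='TestCli*'

...but the pom edit above is the repeatable way to do it.)
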
3) Now, re-run your tests as normal:

mvn -fae clean verify -Dorg.apache.bigtop.itest.log4j.level=INFO -f smokes/pom.xml
That was easy :) 

What happened? The smokes/pom.xml simply invokes each submodule, and in turn each submodule uses the include pattern filtering to include/exclude certain iTest classes. So, for example:

- smokes/pom.xml invokes the hadoop submodule
- the hadoop submodule queries for tests matching the <includes> patterns
- the hadoop submodule runs the matching tests
- control returns back to the smokes/pom.xml super module.

Thanks to the magic of Maven's Failsafe plugin and Maven modules, you can uber-customize Apache BigTop smoke tests and test your whole ecosystem without having to write a single line of code.

4) (Advanced) You can go one step further and customize the iTests themselves; to do that, you will need to edit the .groovy files.

This is not entirely trivial: I found that Maven would grab SNAPSHOT-named builds from remote repositories during the build lifecycle until I set
<offline>true</offline>
in my settings.xml file. After setting offline=true, however, the
mvn clean install -DskipTests -DskipITs -DperformRelease -f bigtop-tests/test-artifacts/pom.xml
command will build and install your code into your local Maven repository so that you can run your own customized BigTop iTests. To speed up the build, I comment out all the submodules which I'm not changing (i.e. the ones I'm not interested in customizing), e.g.:
 
  <modules>
    <module>hadoop</module>
    <module>mahout</module>
    <!-- ...

Example above: modules commented out in bigtop-tests/test-artifacts/pom.xml so that you don't have to build all the smoke tests when you are customizing the Groovy scripts for a particular test.


* There are a few other components of the source tree which you may want to play with, like the test-framework. These jars are built separately:
mvn clean install -DskipTests -DskipITs -DperformRelease -f bigtop-tests/test-framework/pom.xml
