This is the positive test case, representing the best-case
scenario, so you expect this to be the cheapest path in terms of
both time and memory. For the most part, there's nothing
surprising here.
I'm mostly curious why there's this much difference between
test-unit and minitest.
For some reason, test-unit's internal time is much worse than rspec's, but its real time (the external or wall time) is much better. At the end of the day, I care about how long my tests actually took to finish rather than what they say they took to run internally.
This shouldn't be much different from the positive internal time, but it is. It should just be the time it takes to load the libraries, start up, and report. In some cases, it appears that reporting is costly.
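To make the internal-vs-real distinction concrete, here is a minimal standalone sketch (not part of the benchmark harness) that measures both CPU time, roughly what a framework reports internally, and wall-clock time for the same block:

```ruby
# Measure CPU ("internal") time and wall-clock ("real") time for a block.
def cpu_and_wall
  wall0 = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  cpu0  = Process.times
  yield
  cpu1  = Process.times
  wall1 = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  cpu = (cpu1.utime - cpu0.utime) + (cpu1.stime - cpu0.stime)
  [cpu, wall1 - wall0]
end

# Sleeping burns almost no CPU but plenty of wall time -- the same
# shape as a process blocked on startup I/O or terminal reporting.
cpu, wall = cpu_and_wall { sleep 0.1 }
puts format("cpu %.3fs wall %.3fs", cpu, wall)
```

A framework can look fast by its own clock while the process as a whole is slow; that gap is exactly load, startup, and reporting overhead.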
Memory is memory, right? Sorta. You can see that even in the
null test case, generating a bunch of methods comes with a
cost. If your test framework generates methods for tests, then
you'd expect it to be fairly similar. If, on the other hand, your
test framework reinvents the method, then you should expect
it to cost more, even in the best-case scenario.
And whether you choose to hold onto passing results also makes a
big difference.
But when it comes to failing tests, your test framework has some choices to make. Hold onto the failing test? Hold onto something else (hopefully cheaper)? Print the failure immediately and throw everything away? (Nothing seems to do this.) This is when what you choose to hold onto starts to really make a difference.
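These choices can be sketched as three hypothetical reporter classes (illustrative only, not any framework's real API): keep the whole exception, keep a cheap summary string, or print and forget.

```ruby
class KeepException           # heaviest: exception + backtrace stay live
  attr_reader :failures
  def initialize; @failures = []; end
  def record(error); @failures << error; end
end

class KeepSummary             # cheaper: only a short formatted string survives
  attr_reader :failures
  def initialize; @failures = []; end
  def record(error); @failures << "#{error.class}: #{error.message}"; end
end

class DropImmediately         # cheapest: report and forget, O(1) memory
  attr_reader :count
  def initialize; @count = 0; end
  def record(error); warn error.message; @count += 1; end
end

reporters = [KeepException.new, KeepSummary.new, DropImmediately.new]
10.times do
  begin
    raise "no"
  rescue => e
    reporters.each { |r| r.record(e) }
  end
end
```

With 100k failures, the difference between these strategies is the difference between a flat memory profile and the multi-hundred-megabyte numbers further down.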
Library | files | blank | comment | code | difference |
---|---|---|---|---|---|
minitest | 15 | 829 | 1369 | 2093 | |
testunit | 51 | 1023 | 2232 | 7676 | 3.6x |
rspec | 199 | 4301 | 8867 | 18841 | 9.0x |
Test | size | time [s] | real [s] | MRSS [meg] |
---|---|---|---|---|
pos | 10 | 0.00 | 0.030 | 12.829 |
pos | 100 | 0.00 | 0.030 | 12.845 |
pos | 1000 | 0.00 | 0.030 | 13.304 |
pos | 10000 | 0.00 | 0.040 | 17.646 |
pos | 100000 | 0.00 | 0.150 | 79.036 |
neg | 10 | 0.00 | 0.030 | 12.943 |
neg | 100 | 0.00 | 0.030 | 12.861 |
neg | 1000 | 0.00 | 0.030 | 13.468 |
neg | 10000 | 0.00 | 0.040 | 22.561 |
neg | 100000 | 0.00 | 0.200 | 79.970 |
```ruby
class NullSpeed
  ITERS.times.map { |n|
    case TYPE
    when "pos" then
      define_method "test_#{n}" do
        raise "no" unless 1 == 1
      end
    when "neg" then
      define_method "test_#{n}" do
        raise "no" unless 1 == 2
      end
    end
  }
end

ITERS.times.each do |n|
  NullSpeed.new.send("test_#{n}") rescue nil  # call each generated method by name
end
```
This is a “null” test, which just simulates N methods that either call one simple thing or “fail” by calling a simple thing and then raising. It has basically the same structure as a test suite in pretty much any library and does the same rough amount of work. It is here to compare against the raw speed of ruby.
Test | size | time [s] | real [s] | MRSS [meg] |
---|---|---|---|---|
pos | 10 | 0.00 | 0.050 | 16.171 |
pos | 100 | 0.00 | 0.050 | 16.073 |
pos | 1000 | 0.01 | 0.060 | 16.990 |
pos | 10000 | 0.08 | 0.140 | 26.608 |
pos | 100000 | 0.86 | 1.080 | 109.085 |
neg | 10 | 0.00 | 0.050 | 16.581 |
neg | 100 | 0.00 | 0.050 | 16.925 |
neg | 1000 | 0.02 | 0.080 | 22.626 |
neg | 10000 | 0.14 | 0.410 | 83.313 |
neg | 100000 | 1.66 | 4.490 | 702.480 |
```ruby
describe "minitest" do
  ITERS.times do |n|
    case TYPE
    when "pos" then
      it "test_pos_#{n}" do
        expect(1).must_equal 1
      end
    when "neg" then
      it "test_neg_#{n}" do
        expect(1).must_equal 2
      end
    end
  end
end
```
This is minitest/spec style, to be more of an apples-to-apples
comparison against rspec. It uses `describe`, `it`, and
“expectation-style” rather than `class`, `def`, and
“assertion-style”. The difference in speed is negligible.
Notice that compared to the null test, times and memory are roughly proportional until you get to 100k. At that point minitest is definitely holding onto memory (failing test objects) that plain ruby code doesn’t need to have.
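A rough illustration of why that retention shows up (assumes MRI; `ObjectSpace.memsize_of_all` is CRuby-specific): an exception that is still referenced keeps its message and backtrace reachable, so GC can reclaim none of it.

```ruby
require "objspace"

retained = []
1_000.times do
  begin
    raise "no"
  rescue => e
    retained << e   # what a framework does when it keeps every failure
  end
end

GC.start
# Every retained RuntimeError (plus its strings) still counts toward RSS.
puts ObjectSpace.memsize_of_all(RuntimeError)
```

Scale that loop to 100k and the cost of "hold onto the failing test" becomes the dominant term in the memory column.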
Test | size | time [s] | real [s] | MRSS [meg] |
---|---|---|---|---|
pos | 10 | 0.00 | 0.050 | 16.810 |
pos | 100 | 0.00 | 0.050 | 16.089 |
pos | 1000 | 0.01 | 0.060 | 16.941 |
pos | 10000 | 0.05 | 0.110 | 24.527 |
pos | 100000 | 0.56 | 0.790 | 107.512 |
neg | 10 | 0.00 | 0.050 | 16.531 |
neg | 100 | 0.00 | 0.050 | 16.728 |
neg | 1000 | 0.01 | 0.070 | 23.118 |
neg | 10000 | 0.10 | 0.330 | 83.558 |
neg | 100000 | 1.18 | 3.660 | 711.098 |
This is exactly the same code as the above, but it was run with the `--quiet`
option to not print out the dots while running. The only difference is
a slightly faster internal and real time.
Test | size | time [s] | real [s] | MRSS [meg] |
---|---|---|---|---|
pos | 10 | 0.00 | 0.060 | 18.235 |
pos | 100 | 0.00 | 0.060 | 17.449 |
pos | 1000 | 0.01 | 0.070 | 18.874 |
pos | 10000 | 0.13 | 0.250 | 33.079 |
pos | 100000 | 1.40 | 2.560 | 193.266 |
neg | 10 | 0.01 | 0.060 | 18.022 |
neg | 100 | 0.05 | 0.100 | 18.858 |
neg | 1000 | 0.46 | 0.530 | 21.447 |
neg | 10000 | 4.68 | 4.810 | 44.728 |
neg | 100000 | 82.47 | 84.210 | 288.77 |
```ruby
class TestUnitSpeed < Test::Unit::TestCase
  ITERS.times do |n|
    case TYPE
    when "pos" then
      define_method "test_pos_#{n}" do
        assert_equal 1, 1
      end
    when "neg" then
      define_method "test_neg_#{n}" do
        assert_equal 1, 2
      end
    end
  end
end
```
Not as fast as minitest, but still pretty fast. It only gets weird-looking on the last test, when the internal and real time both explode 20x and the memory goes up 6x. But all in all, pretty solid.
To me, the library still feels pretty crufty.
Test | size | time [s] | real [s] | MRSS [meg] |
---|---|---|---|---|
pos | 10 | 0.00 | 0.070 | 19.169 |
pos | 100 | 0.01 | 0.070 | 19.579 |
pos | 1000 | 0.04 | 0.130 | 22.446 |
pos | 10000 | 0.39 | 0.690 | 58.360 |
pos | 100000 | 4.07 | 6.640 | 398.541 |
neg | 10 | 0.01 | 0.080 | 21.725 |
neg | 100 | 0.01 | 0.130 | 24.052 |
neg | 1000 | 0.06 | 0.560 | 73.777 |
neg | 10000 | 0.60 | 5.280 | 809.894 |
neg | 100000 | 6.64 | 1354.81 | 2777.61 |
```ruby
describe "rspec" do
  ITERS.times do |n|
    case TYPE
    when "pos" then
      it "pos #{n}" do
        a, b = 1, 1
        expect(a).to eq(b)
      end
    when "neg" then
      it "neg #{n}" do
        a, b = 1, 2
        expect(a).to eq(b)
      end
    end
  end
end
```
Internal time seems to be consistent, but real time just explodes 270x for the worst case scenario!
With the exception of test-unit's internal time for negative 100k, there's literally no datapoint from another test framework that's slower or fatter than rspec's. Compared to minitest, memory consumption is often 4x higher in both positive and negative test cases. Sometimes it is 10x.
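As a quick sanity check on those multipliers, here are the ratios computed from the MRSS columns of the minitest and rspec tables above (values copied from this post):

```ruby
# MRSS in megs, keyed by [type, size], taken from the tables above.
MINITEST = { [:pos, 100_000] => 109.085,
             [:neg,  10_000] =>  83.313,
             [:neg, 100_000] => 702.480 }
RSPEC    = { [:pos, 100_000] => 398.541,
             [:neg,  10_000] => 809.894,
             [:neg, 100_000] => 2777.61 }

MINITEST.each_key do |key|
  ratio = RSPEC[key] / MINITEST[key]
  puts format("%s %6d: rspec uses %.1fx the memory of minitest", *key, ratio)
end
```

That works out to roughly 3.7x and 4.0x for the 100k cases, and 9.7x for negative 10k, which is where the "often 4x, sometimes 10x" comes from.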