2 variants:
- MH Darcy Flow
- LMH Darcy Flow

3 variants of the changes:
- *_orig ...  compute gradients, quadrature order 3
- *_no_grad ... turn off gradients, quadrature order 3
- *_q2 ... turn off gradients, quadrature order 2


Turning off gradient updates brings:
[*_no_grad]
1.84x speedup in DarcyFlowMHy::assembly_mh_matrix
1.81x speedup in DarcyLMH::assembly_steady_mh_matrix
(measured with profiler on master a41e096, release build, average of 10 measurements)
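
For illustration, a minimal sketch of what "turning off gradients" means in a flag-driven FE value update. Everything here is hypothetical: the flag names, `ShapeData` and `reinit` are made up for this example and are not the project's API, and P1 shape functions on the reference triangle stand in for the real basis. It only shows that a quantity whose flag is not set is never recomputed, which is worth doing when the assembly reads values only.

```cpp
#include <array>
#include <cstddef>

// Hypothetical update flags (illustrative names, not the project's actual API).
enum UpdateFlags : unsigned {
    update_values    = 1u << 0,
    update_gradients = 1u << 1
};

// Data for three scalar shape functions at one quadrature point.
struct ShapeData {
    std::array<double, 3> values{};
    std::array<std::array<double, 2>, 3> gradients{};
    std::size_t gradient_updates = 0;  // how many times the gradient branch ran
};

// Recompute only the quantities requested in `flags` at reference point (x, y).
// Uses the P1 shape functions 1 - x - y, x, y on the reference triangle.
inline void reinit(ShapeData& d, unsigned flags, double x, double y) {
    if (flags & update_values) {
        d.values = {1.0 - x - y, x, y};
    }
    if (flags & update_gradients) {
        // Skipped entirely when the caller does not request gradients.
        d.gradients = {{ {-1.0, -1.0}, {1.0, 0.0}, {0.0, 1.0} }};
        ++d.gradient_updates;
    }
}
```

Calling `reinit(d, update_values, x, y)` corresponds to the *_no_grad variants; passing `update_values | update_gradients` corresponds to *_orig.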



I also tried decreasing the quadrature order to 2, which should still be accurate for the products of RT0 basis functions (v_i, v_j),
with a further speedup:
[*_q2]
1.72x speedup in DarcyFlowMHy::assembly_mh_matrix
1.32x speedup in DarcyLMH::assembly_steady_mh_matrix
However, it breaks the tests with bin output (03-37).
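
The accuracy claim for order 2 can be checked on a single element. The sketch below is standalone and does not use the project's code. It assumes that "quadrature order" means the polynomial degree integrated exactly, that the element is the reference triangle, and that the coefficient is constant. RT0 basis functions are linear in x, so v_i . v_j is a quadratic polynomial and a degree-2 rule should reproduce the mass matrix up to rounding. All names (`rt0_basis`, `rule_degree2`, ...) are made up for this example.

```cpp
#include <array>
#include <cmath>
#include <cstddef>

// Point / vector on the reference triangle K with vertices (0,0), (1,0), (0,1).
struct Vec2 { double x, y; };

// Quadrature point with weight; weights of each rule sum to |K| = 1/2.
struct QPoint { Vec2 p; double w; };

using Mat3 = std::array<std::array<double, 3>, 3>;

static const std::array<Vec2, 3> kVertices = {{ {0.0, 0.0}, {1.0, 0.0}, {0.0, 1.0} }};

// RT0 basis function v_i(x) = (x - P_i) / (2|K|), P_i being the vertex
// opposite to side i. On the reference triangle 2|K| = 1.
inline Vec2 rt0_basis(std::size_t i, Vec2 x) {
    return { x.x - kVertices[i].x, x.y - kVertices[i].y };
}

// M_ij = integral over K of v_i . v_j, approximated by the given rule.
template <std::size_t N>
Mat3 rt0_mass_matrix(const std::array<QPoint, N>& rule) {
    Mat3 m{};
    for (const QPoint& q : rule) {
        for (std::size_t i = 0; i < 3; ++i) {
            for (std::size_t j = 0; j < 3; ++j) {
                const Vec2 vi = rt0_basis(i, q.p);
                const Vec2 vj = rt0_basis(j, q.p);
                m[i][j] += q.w * (vi.x * vj.x + vi.y * vj.y);
            }
        }
    }
    return m;
}

// Degree-2 rule: the three edge midpoints, each with weight |K|/3.
inline std::array<QPoint, 3> rule_degree2() {
    return {{ {{0.5, 0.0}, 1.0 / 6.0},
              {{0.5, 0.5}, 1.0 / 6.0},
              {{0.0, 0.5}, 1.0 / 6.0} }};
}

// Degree-3 rule: the classical 4-point rule (centroid with a negative weight).
inline std::array<QPoint, 4> rule_degree3() {
    return {{ {{1.0 / 3.0, 1.0 / 3.0}, -27.0 / 96.0},
              {{0.6, 0.2}, 25.0 / 96.0},
              {{0.2, 0.6}, 25.0 / 96.0},
              {{0.2, 0.2}, 25.0 / 96.0} }};
}

// Largest entry-wise difference between two 3x3 matrices.
inline double max_abs_diff(const Mat3& a, const Mat3& b) {
    double d = 0.0;
    for (std::size_t i = 0; i < 3; ++i)
        for (std::size_t j = 0; j < 3; ++j)
            d = std::fmax(d, std::fabs(a[i][j] - b[i][j]));
    return d;
}
```

Comparing `rt0_mass_matrix(rule_degree2())` with `rt0_mass_matrix(rule_degree3())` via `max_abs_diff` gives a difference at rounding level. This covers only the constant-coefficient case on one affine element.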