Commits
Jozsef Bakosi authored 4d5ad5ec532
Add LohCG regression tests exercising damp2/4, rk, migration
While adding the test with migration, the results were wrong and
appeared non-deterministic.
Fixed a bug with ngradcomp, which should be 12 for damp4.
This was not the cause of the migration problem, but it calls for
revisiting the damp4 LohCG convergence rate, since the previous code
operated on some garbage derivatives.
Migration was fixed by forcing communication with empty data, instead of
skipping communication in parallel when gradients are not needed (for
damp2):
- if (d->NodeCommMap().empty() or !m_grad.nprop()) {
+ if (d->NodeCommMap().empty()) {
This forces the correct asynchronous logic in parallel, which is
required even for damp2, which does not need gradients. Skipping the
communication apparently interfered with migration previously; results
now appear reproducible, and this fixes the migration test with LohCG.
As a consequence, comgrad() now operates on empty containers on the
receive side, so operators += and -= in ContainerUtil were updated to
simply return dst on empty src data instead of throwing exceptions.
Also listed everything in LohCG::pup, just to be safe.
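
For illustration only, the empty-src behavior described above for the
ContainerUtil operators could look roughly like the sketch below. This is
not the actual ContainerUtil code: the signatures on std::vector and the
choice to grow dst to the size of src are assumptions made for the example.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical sketch, not the ContainerUtil source.
// Add src to dst element-wise; an empty src is a no-op, not an error.
template <class T>
std::vector<T>& operator+=(std::vector<T>& dst, const std::vector<T>& src) {
  if (src.empty()) return dst;  // empty message received: return dst as is
  // Assumption: grow dst (new entries value-initialized) so all of src fits.
  if (dst.size() < src.size()) dst.resize(src.size());
  for (std::size_t i = 0; i < src.size(); ++i) dst[i] += src[i];
  return dst;
}

// Subtract src from dst element-wise, with the same empty-src rule.
template <class T>
std::vector<T>& operator-=(std::vector<T>& dst, const std::vector<T>& src) {
  if (src.empty()) return dst;  // empty message received: return dst as is
  if (dst.size() < src.size()) dst.resize(src.size());
  for (std::size_t i = 0; i < src.size(); ++i) dst[i] -= src[i];
  return dst;
}
```

With operators like these, a receiver that always gets a message, even an
empty one, can accumulate contributions unconditionally, which matches the
always-communicate logic introduced by the guard change above.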