Commit 8c52609

feat: size_of of live DistArray storage in a World
Adds a ground-truth tile-data accounting facility that finds every live DistArray of a given type by walking a World's WorldObject registry, rather than summing size_of over a set of handles. Because each array's tile storage is a single DistributedStorage WorldObject, an array referenced by N shallow-copy handles is counted exactly once. Handle summation double-counts shared storage, which makes it unsuitable as a cross-check.

API (TiledArray namespace, dist_array.h):
- size_of<S>(const detail::DistributedStorage<T>&)
    Tile-data bytes of one storage object (sum of size_of<S>(tile) over locally-owned, set tiles).
- size_of_live_distarray_storage<DistArrayT, S>(World&)
    Walks world.get_object_ids(), recovers each registered pointer as the common polymorphic base madness::WorldObjectBase, dynamic_casts to the DistributedStorage matching DistArrayT's tile type (others skipped), and sums the above.
- size_of_live_distarrays_storage<S, DistArrayTs...>(worlds)
    [world][type] matrix of the above.

IMPORTANT: these report the DistributedStorage (tile-data) footprint ONLY. They exclude the DistArray-level TiledRange, Shape, and Pmap -- those live in the owning ArrayImpl/TensorImpl, not the storage, and are not reachable from the registered WorldObject. Under SparsePolicy the Shape (per-tile Frobenius-norm table) can be sizeable, so the result is NOT comparable term-for-term with a sum of size_of(const DistArray&) over handles (which includes the shape). Names say "storage" to make this explicit.

DistributedStorage gains for_each_local_tile(op): applies op to each locally-owned, set tile -- the same tile set size_of(DistArray) iterates. The size_of<S>(tile) summation is done by the size_of(storage) overload in dist_array.h, where the tile-type overloads are visible (they need not be at the point this low-level header is parsed).

Type-safety rests on WorldObjectBase sitting at offset 0 of every registered WorldObject; verified that across MADNESS, TiledArray, and MPQC no WorldObject-derived class has WorldObject as a non-primary base (the single-inheritance "class X : public WorldObject<X>" idiom). The recovered base pointer is then dynamic_cast, so a wrong type yields nullptr, not UB. Counts only locally-owned set tiles; excludes remote-tile caches. Call at a quiescent point (after a fence).

Test (array_suite/live_storage_size_in_world): builds two distinct arrays plus a shallow copy of one, then checks that
- the storage walk equals the two-array (deduplicated) tile-data total, not the three-handle sum;
- a ToT-typed walk does not pick up regular-tensor arrays;
- the variadic matrix form agrees with the per-type calls.
Passes at np=1 and np=2 (CI).
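The difference between handle summation and a registry walk can be sketched with a toy model. This is only an illustration in standard C++: `ObjectBase`, `Storage`, `Handle`, `sum_over_handles`, `sum_over_registry`, and `run_demo` are hypothetical stand-ins, not MADNESS or TiledArray types, and a plain vector of base pointers plays the role of the World's registry.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical stand-in for the common polymorphic base
// (the role madness::WorldObjectBase plays in the commit).
struct ObjectBase {
  virtual ~ObjectBase() = default;  // polymorphic, so dynamic_cast works
};

// Hypothetical stand-in for DistributedStorage<Tile>.
template <typename Tile>
struct Storage : ObjectBase {
  std::vector<Tile> tiles;
  std::size_t bytes() const { return tiles.size() * sizeof(Tile); }
};

// A shallow-copy "array handle": copies share one Storage.
template <typename Tile>
using Handle = std::shared_ptr<Storage<Tile>>;

// Handle summation: a storage is counted once per handle referring to it.
template <typename Tile>
std::size_t sum_over_handles(const std::vector<Handle<Tile>>& handles) {
  std::size_t total = 0;
  for (const auto& h : handles) {
    assert(h != nullptr);
    total += h->bytes();
  }
  return total;
}

// Registry walk: each storage is registered once, so it is counted once.
// Objects of any other type make dynamic_cast return nullptr and are skipped.
template <typename Tile>
std::size_t sum_over_registry(const std::vector<ObjectBase*>& registry) {
  std::size_t total = 0;
  for (ObjectBase* obj : registry) {
    if (auto* s = dynamic_cast<Storage<Tile>*>(obj)) total += s->bytes();
  }
  return total;
}

struct Demo {
  std::size_t by_handles;   // a1 + a2 + a1_copy
  std::size_t by_registry;  // a1 + a2
  std::size_t other_type;   // walk for a tile type nobody registered
};

inline Demo run_demo() {
  auto a1 = std::make_shared<Storage<double>>();
  a1->tiles.assign(10, 1.0);
  auto a2 = std::make_shared<Storage<double>>();
  a2->tiles.assign(4, 2.0);
  auto a3 = std::make_shared<Storage<float>>();
  a3->tiles.assign(8, 3.0f);
  Handle<double> a1_copy = a1;  // shallow copy, shares a1's storage

  std::vector<ObjectBase*> registry{a1.get(), a2.get(), a3.get()};
  return {sum_over_handles<double>({a1, a2, a1_copy}),
          sum_over_registry<double>(registry),
          sum_over_registry<int>(registry)};
}
```

In this sketch `run_demo()` reports 24 doubles' worth of bytes by handles but only 14 by the registry walk, and the walk for an unregistered tile type returns 0. The toy registry stores `ObjectBase*` directly, so it does not exercise the offset-0 assumption the commit message describes.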
1 parent 600c4ad commit 8c52609

3 files changed

Lines changed: 172 additions & 0 deletions

src/TiledArray/dist_array.h

Lines changed: 87 additions & 0 deletions
@@ -2015,6 +2015,93 @@ std::size_t size_of(const DistArray<Tile, Policy>& da) {
   return result;
 }
 
+/// \return the number of bytes the locally-owned tiles of \p storage occupy
+/// in memory space `S`.
+///
+/// This is the *tile-data* footprint of a `DistArray`'s storage object only.
+/// It deliberately does **not** include the `DistArray`-level metadata --
+/// `TiledRange`, `Shape`, and `Pmap` -- because those live in the owning
+/// `ArrayImpl`/`TensorImpl`, not in the `DistributedStorage`. For
+/// `SparsePolicy` the `Shape` (a per-tile Frobenius-norm table) can be
+/// sizeable, so this undercounts the full per-array footprint that
+/// `size_of(const DistArray&)` reports. Counts only tiles whose futures are
+/// set; pending and remote-cached tiles are skipped.
+/// \tparam S the memory space to report
+template <MemorySpace S, typename T>
+std::size_t size_of(const detail::DistributedStorage<T>& storage) {
+  std::size_t result = 0;
+  storage.for_each_local_tile(
+      [&result](const auto& tile) { result += size_of<S>(tile); });
+  return result;
+}
+
+/// \return the per-rank tile-data bytes (in memory space `S`) of the
+/// `DistributedStorage` of *all* live `DistArray<Tile,Policy>` of the
+/// requested type currently registered in \p world, discovered by walking
+/// the World's `WorldObject` registry.
+///
+/// Each array's tile storage is a single `detail::DistributedStorage`
+/// `WorldObject`, so an array referenced by N shallow-copy handles is counted
+/// exactly once -- unlike summing `size_of` over a set of handles, which
+/// double-counts shared storage. This makes the result suitable as ground
+/// truth for validating handle-based tile-data accounting.
+///
+/// Discovery is type-safe: each registered pointer is recovered as the common
+/// polymorphic base `madness::WorldObjectBase` and `dynamic_cast` to the
+/// `DistributedStorage` matching `DistArrayT`'s tile type; non-matching
+/// objects (other tile types, MADNESS containers) are skipped. Assumes the
+/// registered `WorldObject`s place `WorldObjectBase` at offset 0 (true for
+/// the single-inheritance `class X : public WorldObject<X>` idiom TA uses).
+///
+/// \warning This reports the `DistributedStorage` (tile-data) footprint only.
+/// It excludes the `DistArray`-level `TiledRange`, `Shape`, and `Pmap`; the
+/// `Shape` can be large under `SparsePolicy`. It is therefore **not**
+/// comparable term-for-term with a sum of `size_of(const DistArray&)` over
+/// handles (which includes the shape). Use it for tile-data accounting, not
+/// total-DistArray-footprint accounting.
+/// \note Counts only locally-owned tiles whose futures are set. Excludes
+/// remote-tile caches. Call at a quiescent point (after a fence).
+/// \tparam DistArrayT the `DistArray` specialization to look for
+/// \tparam S the memory space to report (default `Host`)
+template <typename DistArrayT, MemorySpace S = MemorySpace::Host>
+std::size_t size_of_live_distarray_storage(World& world) {
+  using tile_type = typename DistArrayT::value_type;
+  using storage_type = detail::DistributedStorage<tile_type>;
+  std::size_t result = 0;
+  for (const auto& id : world.get_object_ids()) {
+    auto base_opt = world.template ptr_from_id<madness::WorldObjectBase>(id);
+    if (!base_opt || !*base_opt) continue;
+    if (auto* storage = dynamic_cast<storage_type*>(*base_opt)) {
+      result += size_of<S>(*storage);
+    }
+  }
+  return result;
+}
+
+/// \return a matrix of per-rank live-storage tile-data byte totals indexed
+/// `[world_index][type_index]`: for each `World` in \p worlds (rows) and each
+/// `DistArray` type in the pack `DistArrayTs` (columns), the value of
+/// `size_of_live_distarray_storage<DistArrayT, S>(world)`. Lets a caller
+/// inventory which array types hold how much tile data in which world at a
+/// checkpoint, deduplicated across shallow-copy handles.
+///
+/// \warning Tile-data only; see `size_of_live_distarray_storage` for the
+/// excluded-metadata caveat (no `TiledRange`/`Shape`/`Pmap`).
+/// \note `S` is the leading template argument (it has a default but precedes
+/// the type pack), so callers must spell it out:
+/// `size_of_live_distarrays_storage<MemorySpace::Host, ArrayA,
+/// ArrayB>(worlds)`.
+template <MemorySpace S = MemorySpace::Host, typename... DistArrayTs>
+std::vector<std::array<std::size_t, sizeof...(DistArrayTs)>>
+size_of_live_distarrays_storage(const std::vector<World*>& worlds) {
+  std::vector<std::array<std::size_t, sizeof...(DistArrayTs)>> result;
+  result.reserve(worlds.size());
+  for (World* w : worlds) {
+    result.push_back({size_of_live_distarray_storage<DistArrayTs, S>(*w)...});
+  }
+  return result;
+}
+
 #ifndef TILEDARRAY_HEADER_ONLY
 
 extern template class DistArray<Tensor<double>, DensePolicy>;
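The `[world][type]` matrix form in the hunk above relies on expanding a type pack into one `std::array` row per world, with the non-type parameter `S` placed before the pack. A minimal sketch of that pattern follows. It assumes hypothetical names (`Space`, `ToyWorld`, `live_bytes`, `live_bytes_matrix`) and a made-up per-type byte count; it does not use the real `MemorySpace`, `World`, or `DistArray`.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <type_traits>
#include <vector>

// Hypothetical stand-in for a memory-space tag.
enum class Space { Host, Device };

// Hypothetical stand-in for a World: it just remembers how many
// elements of each tile type are "live" in it.
struct ToyWorld {
  std::size_t n_double;
  std::size_t n_float;
};

// Per-type query; the type comes first and the space has a default,
// mirroring the parameter order of the single-type function.
template <typename T, Space S = Space::Host>
std::size_t live_bytes(const ToyWorld& w) {
  if constexpr (S != Space::Host) {
    return 0;  // toy model: nothing lives off-host
  } else if constexpr (std::is_same_v<T, double>) {
    return w.n_double * sizeof(double);
  } else if constexpr (std::is_same_v<T, float>) {
    return w.n_float * sizeof(float);
  } else {
    return 0;  // unknown tile type: nothing matches
  }
}

// Matrix form: one row per world, one column per type in the pack.
// S must precede the pack, so callers have to name it explicitly.
template <Space S = Space::Host, typename... Ts>
std::vector<std::array<std::size_t, sizeof...(Ts)>> live_bytes_matrix(
    const std::vector<const ToyWorld*>& worlds) {
  std::vector<std::array<std::size_t, sizeof...(Ts)>> result;
  result.reserve(worlds.size());
  for (const ToyWorld* w : worlds) {
    assert(w != nullptr);
    result.push_back({live_bytes<Ts, S>(*w)...});  // expands to one row
  }
  return result;
}
```

A call such as `live_bytes_matrix<Space::Host, double, float>(worlds)` yields one two-column row per world. Writing `live_bytes_matrix<double, float>(worlds)` would not compile, because `double` would be matched against the non-type parameter `S`; that is the reason for the doc comment's note about spelling out the memory space.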

src/TiledArray/distributed_storage.h

Lines changed: 20 additions & 0 deletions
@@ -20,6 +20,7 @@
 #ifndef TILEDARRAY_DISTRIBUTED_STORAGE_H__INCLUDED
 #define TILEDARRAY_DISTRIBUTED_STORAGE_H__INCLUDED
 
+#include <TiledArray/platform.h>
 #include <TiledArray/pmap/pmap.h>
 
 namespace TiledArray {
@@ -360,6 +361,25 @@ class DistributedStorage : public madness::WorldObject<DistributedStorage<T>> {
   /// \throw nothing
   size_type size() const { return data_.size(); }
 
+  /// Apply \p op to each locally-owned tile whose future is already set.
+
+  /// Pending (unset) and remote-cached elements are skipped. No
+  /// communication; intended to be called at a quiescent point (e.g. after a
+  /// fence). This is the per-rank local tile set, the same one
+  /// `size_of(DistArray)` iterates. Any summation it enables (e.g. of
+  /// `size_of<S>(tile)`) is left to the caller, which sees the tile-type
+  /// overloads -- those need not be visible where this low-level header is
+  /// parsed.
+  /// \tparam Op a callable invocable as `op(const value_type&)`
+  /// \param op the callable to apply to each set local tile
+  template <typename Op>
+  void for_each_local_tile(Op&& op) const {
+    for (auto it = data_.begin(); it != data_.end(); ++it) {
+      const future& f = it->second;
+      if (f.probe()) op(f.get());
+    }
+  }
+
   /// Max size accessor
 
   /// The maximum size is the total number of elements that can be held by
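The split between a low-level "visit every set tile" primitive and a caller-side summation can be illustrated with a small stand-alone model. This sketch swaps in `std::optional` for the tile futures (an empty optional plays the role of a pending future, `has_value()` the role of `probe()`); `ToyStorage`, `ToyTile`, `for_each_set_tile`, and `toy_size_of` are hypothetical names, not TiledArray API.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <optional>
#include <utility>
#include <vector>

// Hypothetical stand-in for a storage object holding index -> tile slots.
// An empty optional models a tile whose future has not been set yet.
template <typename Tile>
class ToyStorage {
 public:
  // Reserve a slot whose tile is still pending.
  void add_pending(std::size_t index) {
    assert(data_.count(index) == 0);
    data_[index] = std::nullopt;
  }

  // Store (or overwrite) a tile that is ready.
  void set(std::size_t index, Tile tile) { data_[index] = std::move(tile); }

  // Visit only the tiles that are already set; pending slots are skipped.
  // The storage knows nothing about sizes: that is left to the caller.
  template <typename Op>
  void for_each_set_tile(Op&& op) const {
    for (const auto& entry : data_) {
      if (entry.second.has_value()) op(*entry.second);
    }
  }

 private:
  std::map<std::size_t, std::optional<Tile>> data_;
};

// Tile-level size overload, declared where the tile type is known.
using ToyTile = std::vector<double>;
inline std::size_t toy_size_of(const ToyTile& tile) {
  return tile.size() * sizeof(double);
}

// Storage-level size: sums the tile-level overload over the set tiles.
template <typename Tile>
std::size_t toy_size_of(const ToyStorage<Tile>& storage) {
  std::size_t result = 0;
  storage.for_each_set_tile(
      [&result](const auto& tile) { result += toy_size_of(tile); });
  return result;
}
```

With tiles of 6 and 4 doubles set and one slot pending, `toy_size_of(storage)` in this model counts 10 doubles' worth of bytes. Keeping the summation outside the storage class means the class never has to see the tile-level overload, which is the layering the doc comment describes.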

tests/dist_array.cpp

Lines changed: 65 additions & 0 deletions
@@ -1020,6 +1020,71 @@ BOOST_AUTO_TEST_CASE(size_of) {
   BOOST_REQUIRE(sz0 == sz0_expected);
 }
 
+BOOST_AUTO_TEST_CASE(live_storage_size_in_world) {
+  using T = Tensor<double>;
+  using ToT = Tensor<T>;
+  using Policy = SparsePolicy;
+  using ArrayT = DistArray<T, Policy>;
+  using ArrayToT = DistArray<ToT, Policy>;
+
+  auto& world = get_default_world();
+  world.gop.fence();
+
+  // arrays from earlier test cases may still be registered (destruction is
+  // deferred to the next fence), so measure a baseline and compare deltas
+  auto const base_T = TiledArray::size_of_live_distarray_storage<ArrayT>(world);
+  auto const base_ToT =
+      TiledArray::size_of_live_distarray_storage<ArrayToT>(world);
+
+  TiledRange const trange({{0, 2, 5, 7}, {0, 5, 7, 10, 12}});
+
+  // two distinct regular arrays
+  auto a1 = make_array<ArrayT>(world, trange, [](T& tile, Range const& rng) {
+    tile = T(rng, 1.0);
+    return tile.norm();
+  });
+  auto a2 = make_array<ArrayT>(world, trange, [](T& tile, Range const& rng) {
+    tile = T(rng, 2.0);
+    return tile.norm();
+  });
+  // shallow copy: shares a1's storage WorldObject, must NOT be double-counted
+  ArrayT a1_copy = a1;
+  BOOST_REQUIRE(a1_copy.trange() == a1.trange());  // keep a1_copy alive & used
+
+  world.gop.fence();
+
+  // per-array local tile-data bytes = size_of(array) - size_of(shape); the
+  // storage walk reports tile data only, so subtract the shape from the
+  // handle-based full-array size_of
+  auto tiles_only = [](ArrayT const& a) {
+    return TiledArray::size_of<MemorySpace::Host>(a) -
+           TiledArray::size_of<MemorySpace::Host>(a.shape());
+  };
+
+  // the storage walk counts each distinct DistributedStorage once: a1 + a2,
+  // NOT a1 + a2 + a1_copy
+  auto const expected_T = tiles_only(a1) + tiles_only(a2);
+  auto const got_T =
+      TiledArray::size_of_live_distarray_storage<ArrayT>(world) - base_T;
+  BOOST_CHECK_EQUAL(got_T, expected_T);
+
+  // the ToT-typed walk must not pick up the regular (T) arrays
+  auto const got_ToT_delta =
+      TiledArray::size_of_live_distarray_storage<ArrayToT>(world) - base_ToT;
+  BOOST_CHECK_EQUAL(got_ToT_delta, 0u);
+
+  // variadic matrix: one world (one row), two types (two columns)
+  auto const mat = TiledArray::size_of_live_distarrays_storage<
+      MemorySpace::Host, ArrayT, ArrayToT>(std::vector<World*>{&world});
+  BOOST_REQUIRE_EQUAL(mat.size(), 1u);
+  BOOST_CHECK_EQUAL(mat[0][0],
+                    TiledArray::size_of_live_distarray_storage<ArrayT>(world));
+  BOOST_CHECK_EQUAL(
+      mat[0][1], TiledArray::size_of_live_distarray_storage<ArrayToT>(world));
+
+  world.gop.fence();
+}
+
 BOOST_FIXTURE_TEST_CASE(fill_zero_sparse, ArrayFixture) {
   // construct a sparse array with some non-zero tiles and fill it
   SpArrayN as(world, tr, TiledArray::SparseShape<float>(shape_tensor, tr));
