Interaction transformation method and Operon. #39

@greenleaf641

Description

I am trying to see whether it is possible to have Operon work not with a tree structure but with the "Interaction-Transformation" representation. Briefly, what this method hopes to do is simplify the resulting functions by representing the population not as trees but as a weighted sum of unary functions, each applied to a product of the variables raised to some power. Something like:

weight1 * function1(x1^poly11 * x2^poly12 * ...) + weight2 * function2(x1^poly21 * x2^poly22 * ...) + ...
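
For concreteness, this is how such an expression evaluates, as a minimal standalone sketch using plain std types (the names are mine, not Operon's API):

```cpp
#include <cmath>
#include <functional>
#include <vector>

// Minimal sketch (my own names, not Operon's API): evaluate an
// interaction-transformation expression
//   y = sum_k weights[k] * funcs[k]( prod_j x[j]^exponents[k][j] )
double EvaluateIT(std::vector<double> const& x,
                  std::vector<double> const& weights,
                  std::vector<std::function<double(double)>> const& funcs,
                  std::vector<std::vector<double>> const& exponents)
{
    double result = 0.0;
    for (std::size_t k = 0; k < funcs.size(); ++k) {
        double interaction = 1.0; // prod_j x[j]^exponents[k][j]
        for (std::size_t j = 0; j < x.size(); ++j) {
            interaction *= std::pow(x[j], exponents[k][j]);
        }
        result += weights[k] * funcs[k](interaction);
    }
    return result;
}
```

For example, `2 * id(x1^2 * x2) + 0.5 * cos(x1)` is `EvaluateIT({x1, x2}, {2, 0.5}, {id, cos}, {{2, 1}, {1, 0}})`.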

I made a new generator that only selects functions with arity 1 and, instead of a tree, generates a collection of genotypes of the form described above:

struct Individual {
        // number of nodes (functions to be applied)
        uint16_t Length = 0; // 0-65535

        Operon::Vector<Operon::Scalar> Fitness;

        // vector of free parameters of the affine combination of the `Functions` in `Individual`.
        Operon::Vector<Operon::Scalar> Coefficients;

        // vector of transformation Functions by index.
        Operon::Vector<Operon::Function> Functions;

        // vector of interaction exponents (one per variable) for each transformation.
        Operon::Vector<Operon::Vector<Operon::Scalar>> Polynomials;

        Individual() = default;

        ...
};
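
As a concrete example of how these fields map to the formula above, a two-term model such as `1.5 * exp(x1 * x2^2) + 2.0 * id(x1^3)` could be populated like this (the `using` aliases are my own stand-ins for the Operon types, which aren't shown here):

```cpp
#include <cmath>
#include <cstdint>
#include <functional>
#include <vector>

// Hypothetical stand-ins for the Operon aliases used in the struct above.
using Scalar = double;
template <typename T> using Vector = std::vector<T>;
using Function = std::function<Scalar(Scalar)>;

struct Individual {
    std::uint16_t Length = 0;
    Vector<Scalar> Fitness;
    Vector<Scalar> Coefficients;        // weight_k
    Vector<Function> Functions;         // f_k, all arity 1
    Vector<Vector<Scalar>> Polynomials; // exponents e_kj, one inner vector per term
};

// model: 1.5 * exp(x1 * x2^2) + 2.0 * id(x1^3)
auto MakeExample() -> Individual {
    Individual ind;
    ind.Length = 2;
    ind.Coefficients = { 1.5, 2.0 };
    ind.Functions = { [](Scalar v) { return std::exp(v); },
                      [](Scalar v) { return v; } };
    ind.Polynomials = { { 1.0, 2.0 }, { 3.0, 0.0 } };
    return ind;
}
```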

I altered the Interpreter as well so it can evaluate the genotypes of this form; I added this function to the interpreter class:

        auto ComputePolynomials(Operon::Range range, int i) const {
            auto const& individual{ individual_.get() };
            auto const& terms = individual.Polynomials[i];
            const auto start = range.Start();
            const auto size = range.Size();

            constexpr int64_t S{ BatchSize };
            auto* h = primal_.data_handle();

            // reset this term's batch slot to 1 before multiplying in each variable
            std::fill_n(h + i * S, S, T{1});

            for (int64_t j = 0; j < static_cast<int64_t>(terms.size()); ++j) {
                auto const* variableValues = dataset_.get().GetValues(j).subspan(start, size).data();
                auto* primalPtr = h + i * S;
                auto term = terms[j];
                std::transform(variableValues, variableValues + size, primalPtr, primalPtr, [term](auto value, auto current) {
                    T result = std::pow(value, term);
                    return current * result;
                });
            }
        }
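
To make the intent explicit, this is the same computation written as a standalone function over plain vectors (my own simplified stand-in, not the Operon buffers): for term i, the batch slot starts at 1 and each variable column multiplies in x_j^e_ij.

```cpp
#include <algorithm>
#include <cmath>
#include <vector>

// Simplified stand-in for ComputePolynomials: given the data columns and the
// exponent vector of one term, compute prod_j x_j^{e_j} for every row.
auto ComputeInteraction(std::vector<std::vector<double>> const& columns, // columns[j][row]
                        std::vector<double> const& exponents)            // e_j per variable
    -> std::vector<double>
{
    std::vector<double> out(columns.front().size(), 1.0); // slot starts at 1
    for (std::size_t j = 0; j < exponents.size(); ++j) {
        auto const e = exponents[j];
        std::transform(columns[j].begin(), columns[j].end(), out.begin(), out.begin(),
                       [e](double value, double current) { return current * std::pow(value, e); });
    }
    return out;
}
```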

So then the new ForwardPass can evaluate them like this:

        inline auto ForwardPass(Operon::Range range, int row, bool trace = false) const -> void {
            auto const start { static_cast<int64_t>(range.Start()) };
            auto const len   { static_cast<int64_t>(range.Size()) };
            auto const& individual = individual_.get();
            auto const nn = std::ssize(individual.Functions);
            constexpr int64_t S{ BatchSize };

            auto rem = std::min(S, len - row);
            Operon::Range rg(start + row, start + row + rem);

            // forward pass - compute primal and trace
            for (auto i = 0L; i < nn; ++i) {
                auto const& [coefficient, nodeFunction, nodeDerivative, polynomial] = context_[i];

                auto* ptr = primal_.data_handle() + i * S;

                if (nodeFunction) {
                    ComputePolynomials(rg, i);
                    std::invoke(*nodeFunction, individual.Functions, primal_, i, rg);

                    // first compute the partials
                    if (trace && nodeDerivative) {
                        std::invoke(*nodeDerivative, individual.Functions, primal_, trace_, i, i);
                    }

                    // apply weight after partials are computed
                    if (coefficient != T{1}) {
                        std::ranges::transform(std::span(ptr, rem), ptr, [coefficient](auto x) { return x * coefficient; });
                    }
                }
            }
        }

but I'm not sure what to do when it comes to ForwardTrace and ReverseTrace, which I assume will be necessary for optimizing the coefficients later on.

Why exactly are we multiplying the trace with dot, or the trace with the primal? And do you have any idea how these two functions should be altered to support the interaction-transformation representation instead?
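
For reference, the partials I would need follow from the chain rule: with y = Σ_k w_k · f_k(p_k) and p_k = Π_j x_j^e_kj, we get ∂y/∂w_k = f_k(p_k) and ∂y/∂e_kj = w_k · f'_k(p_k) · p_k · ln(x_j) (assuming x_j > 0). Here is a standalone sketch of those gradients (my own names, not Operon's trace API):

```cpp
#include <cmath>
#include <functional>
#include <vector>

// Gradients of y = sum_k w_k * f_k(p_k), p_k = prod_j x_j^{e_kj},
// with respect to the weights w_k and the exponents e_kj.
// fprime[k] must be the analytic derivative of f[k]; x_j > 0 is assumed
// so that std::log(x_j) is defined.
struct ITGradient {
    std::vector<double> dWeights;           // dy/dw_k = f_k(p_k)
    std::vector<std::vector<double>> dExps; // dy/de_kj = w_k * f'_k(p_k) * p_k * ln(x_j)
};

auto Gradient(std::vector<double> const& x,
              std::vector<double> const& w,
              std::vector<std::function<double(double)>> const& f,
              std::vector<std::function<double(double)>> const& fprime,
              std::vector<std::vector<double>> const& e) -> ITGradient
{
    ITGradient g;
    for (std::size_t k = 0; k < w.size(); ++k) {
        double p = 1.0; // interaction term p_k
        for (std::size_t j = 0; j < x.size(); ++j) { p *= std::pow(x[j], e[k][j]); }
        g.dWeights.push_back(f[k](p));
        std::vector<double> row(x.size());
        for (std::size_t j = 0; j < x.size(); ++j) {
            row[j] = w[k] * fprime[k](p) * p * std::log(x[j]);
        }
        g.dExps.push_back(std::move(row));
    }
    return g;
}
```

My guess is the trace buffer holds the local partials f'_k(p_k), which would explain multiplying it with the primal/dot values during the backward sweep, but I'd appreciate confirmation.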
