Speed up lambertw single diode functions with a custom LambertW function #2720

cwhanse · 2026-03-13T22:16:22Z

cwhanse
Mar 13, 2026
Maintainer

I've been working on functions for mismatch (series and parallel). I've observed that the single diode calculations are the big time sink, and the scipy lambertw function is a substantial (20%) portion of that. That's surprising to me.

Here's an alternative calculation of lambertW(x) that appears to be 4x faster, with similar accuracy. scipy accepts complex numbers, which pvlib doesn't need; maybe that's contributing to the slower speed. I couldn't locate the scipy C source code to compare.

import numpy as np
from scipy.special import lambertw
import time

# test custom implementation of lambertw for accuracy and speed
# compare with scipy.special.lambertw

def lambertw_pvlib(x):
    r'''Compute lambertw in log space and use newton iteration.
    Does not work for x <= 1, due to log(log(x)). Switch to scipy for x<=1.

    Parameters
    ----------
    x : numeric

    Returns
    -------
    numeric

    '''
    small = x <= 1

    w0 = np.log(x)
    w = w0.copy()
    # four Newton iterations on w + log(w) = log(x), starting from w = log(x)
    for _ in range(4):
        w = w * (1. - np.log(w) + w0) / (1. + w)

    if small.any():
        # w will contain nan for these numbers due to log(w) = log(log(x))
        w[small] = lambertw(x[small]).real
    return w


def check(w, x):
    r''' relative difference between w*exp(w) and x
    '''
    return (x - w*np.exp(w)) / x


test_exp = np.arange(-10., 300, step=1)
test_x = 10.**test_exp

test_sci = lambertw(test_x).real
test_pvlib = lambertw_pvlib(test_x)

# evaluate accuracy by checking x = w*exp(w)
test_sci_check = check(test_sci, test_x)
test_pvlib_check = check(test_pvlib, test_x)

print("scipy: " + str(np.abs(test_sci_check).max()))
print("pvlib: " + str(np.abs(test_pvlib_check).max()))

# evaluate speed
time_x = np.tile(test_exp, 100)

start_time = time.time()
_ = lambertw(time_x)
end_time = time.time()
print("scipy: " + str(end_time - start_time))


start_time = time.time()
_ = lambertw_pvlib(time_x)
end_time = time.time()
print("pvlib: " + str(end_time - start_time))

Relative accuracy
scipy: 5.6216403309862655e-14
pvlib: 9.739431680310303e-14
Calculation time
scipy: 0.004271030426025391
pvlib: 0.0008788108825683594
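
For readers following the loop in lambertw_pvlib, the update appears to be Newton's method applied to the logarithm of the defining equation w*exp(w) = x. The derivation below is a reconstruction from the code, not something stated in the post; the symbols f and w_n are introduced here only for illustration.

```latex
\begin{align}
w e^{w} = x
  \;&\Longleftrightarrow\;
  f(w) = w + \ln w - \ln x = 0,
  \qquad f'(w) = 1 + \frac{1}{w} \\
w_{n+1}
  &= w_n - \frac{f(w_n)}{f'(w_n)}
   = w_n - \frac{w_n \left( w_n + \ln w_n - \ln x \right)}{1 + w_n}
   = \frac{w_n \left( 1 - \ln w_n + \ln x \right)}{1 + w_n},
  \qquad w_0 = \ln x
\end{align}
```

The last expression matches the loop body with w0 = log(x). It also shows where the restriction in the docstring comes from: the first step evaluates ln(w_0) = ln(ln x), which is undefined for x <= 1.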

echedey-ls · 2026-03-14T00:05:20Z

echedey-ls
Mar 14, 2026
Collaborator

I couldn't locate the scipy source C code to compare

Stub Python implementation (just holds the docstring): https://github.com/scipy/scipy/blob/8c75ae75176236f233824e9a0483c26a69e6dfec/scipy/special/_lambertw.py#L6-L149

They've got a TODO in there:

TODO: special expert should inspect this

C implementation is in another scipy repo: https://github.com/scipy/xsf/blob/main/include/xsf/lambertw.h

I won't be looking into ways of improving it due to time constraints. I suggest confirming and addressing the TODO, plus encouraging the scipy project to improve its performance; alternatively, we could consider a sister package for the needs of pvlib.

1 reply

cwhanse Mar 20, 2026
Maintainer Author

Thanks for that link to the underlying algorithm in C.

The TODO is about how the scipy function communicates the branch to the C routine. There's not much I can contribute there. It's also not relevant to speeding up calculation of real values, which is what I'm after.

adriesse · 2026-03-14T10:28:17Z

adriesse
Mar 14, 2026
Collaborator

For pvlib purposes you could perhaps reduce the number of iterations, or make that a parameter to choose between speed and accuracy.
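
A minimal sketch of what an iteration-count parameter might look like, assuming a scalar argument comfortably above 1 and using only the standard library. The names lambertw_newton, n_iter and log_residual are hypothetical, and the residual is measured in log space (w + log(w) - log(x)) rather than with the check function from the original post.

```python
import math


def lambertw_newton(x, n_iter=4):
    """Approximate the principal branch W(x) for a real scalar x > 1.

    Hypothetical helper: same update as the loop in the original post,
    with the number of Newton steps exposed as a parameter.
    """
    w0 = math.log(x)
    w = w0
    for _ in range(n_iter):
        w = w * (1.0 - math.log(w) + w0) / (1.0 + w)
    return w


def log_residual(w, x):
    """Absolute residual of w + log(w) = log(x)."""
    return abs(w + math.log(w) - math.log(x))


x = 1.0e3
residuals = [log_residual(lambertw_newton(x, n), x) for n in range(1, 5)]
for n, r in enumerate(residuals, start=1):
    print(f"n_iter={n}: residual={r:.3e}")
```

Printing the residual per step makes the speed/accuracy trade-off visible for a given argument before settling on a default.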

1 reply

cwhanse Mar 20, 2026
Maintainer Author

I looked at that. In this custom function, the 4th trip adds about 2% to the run time and improves precision by roughly 4 orders of magnitude (10^-10 to 10^-14).

I've updated this proposed lambertw function to replace the current calculations in pvlib (calls to scipy.special.lambertw) and also the current 3 trips through Newton that pvlib does to converge in log(x) space where x, the argument of lambertw, overflows. The result is as precise as scipy (10^-14) and runs 5-6 times faster.
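
To illustrate the point about overflow, here is a small sketch (not the code from the linked change) that takes log(x) as its input, so x itself is never formed. It assumes log(x) is well above 0; the name lambertw_from_log is hypothetical.

```python
import math


def lambertw_from_log(logx, n_iter=4):
    """Approximate W(x) given logx = ln(x) > 0, without computing x.

    Hypothetical helper: Newton steps on w + log(w) = logx.
    """
    w = logx
    for _ in range(n_iter):
        w = w * (1.0 - math.log(w) + logx) / (1.0 + w)
    return w


logx = 1000.0  # x = exp(1000) is far beyond the float64 range

try:
    math.exp(logx)
    overflowed = False
except OverflowError:
    overflowed = True

w = lambertw_from_log(logx)
print(overflowed, w, w + math.log(w) - logx)
```

Because the iteration only ever touches log(x) and w, it stays in range even when exp(log(x)) cannot be represented.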

The faster speed makes sense, because scipy is doing complex arithmetic: a complex product a*b takes 4 real multiplications (plus 2 additions), versus the single multiplication in a.real * b.real.
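
The operation count can be seen by writing out a textbook complex product with real arithmetic only. This is just an illustration of the arithmetic (the helper complex_mul is hypothetical); it says nothing about how scipy's C routine is actually organised or whether this accounts for the whole speed gap.

```python
def complex_mul(a_re, a_im, b_re, b_im):
    """Multiply (a_re + i*a_im) by (b_re + i*b_im) using real operations."""
    re = a_re * b_re - a_im * b_im  # 2 multiplications, 1 subtraction
    im = a_re * b_im + a_im * b_re  # 2 multiplications, 1 addition
    return re, im


a = complex(1.5, -2.0)
b = complex(0.25, 4.0)
re, im = complex_mul(a.real, a.imag, b.real, b.imag)
print(complex(re, im), a * b)
```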

echedey-ls · 2026-03-23T23:29:12Z

echedey-ls
Mar 23, 2026
Collaborator

Damn, you nailed it @cwhanse! I was hesitant and thought the result might come either from not testing with a big enough vector or from what I think is a typo at line 55:

- time_x = np.tile(test_exp, 100)
+ time_x = np.tile(test_x, 100)

Modified benchmark

import numpy as np
from scipy.special import lambertw
import time

# test custom implementation of lambertw for accuracy and speed
# compare with scipy.special.lambertw


def lambertw_pvlib(x):
    r"""Compute lambertw in log space and use newton iteration.
    Does not work for x <= 1, due to log(log(x)). Switch to scipy for x<=1.

    Parameters
    ----------
    x : numeric

    Returns
    -------
    numeric

    """
    small = x <= 1

    w0 = np.log(x)
    w = w0.copy()
    # four Newton iterations on w + log(w) = log(x), starting from w = log(x)
    for _ in range(4):
        w = w * (1.0 - np.log(w) + w0) / (1.0 + w)

    if small.any():
        # w will contain nan for these numbers due to log(w) = log(log(x))
        w[small] = lambertw(x[small]).real
    return w


def check(w, x):
    r"""relative difference between w*exp(w) and x"""
    return (x - w * np.exp(w)) / x


test_exp = np.arange(-10.0, 300, step=1)
test_x = 10.0**test_exp

test_sci = lambertw(test_x).real
test_pvlib = lambertw_pvlib(test_x)

# evaluate accuracy by checking x = w*exp(w)
test_sci_check = check(test_sci, test_x)
test_pvlib_check = check(test_pvlib, test_x)

print("scipy: " + str(np.abs(test_sci_check).max()))
print("pvlib: " + str(np.abs(test_pvlib_check).max()))

# evaluate speed
time_x = np.tile(test_x, 100)

start_time = time.time()
_ = lambertw(time_x)
end_time = time.time()
print("scipy: " + str(end_time - start_time))


start_time = time.time()
_ = lambertw_pvlib(time_x)
end_time = time.time()
print("pvlib: " + str(end_time - start_time))


# %%
from timeit import timeit
import matplotlib.pyplot as plt

NUMBER_OF_RUNS_FOR_AVG = 5

repetitions_vector = np.arange(1, 10_000, 500, dtype=np.uint)
results_scipy = np.zeros_like(repetitions_vector, dtype=float)
results_pvlib = np.zeros_like(repetitions_vector, dtype=float)
input_sizes = np.zeros_like(repetitions_vector)
for idx, reps in enumerate(repetitions_vector):
    time_x = np.tile(test_x, reps)
    input_sizes[idx] = len(time_x)
    context = {
        "lambertw_pvlib": lambertw_pvlib,
        "lambertw_scipy": lambertw,
        "x": time_x,
    }
    scipy_1_result = timeit(
        stmt="lambertw_scipy(x).real", globals=context, number=NUMBER_OF_RUNS_FOR_AVG
    )
    pvlib_1_result = timeit(
        stmt="lambertw_pvlib(x)", globals=context, number=NUMBER_OF_RUNS_FOR_AVG
    )

    results_scipy[idx] = scipy_1_result
    results_pvlib[idx] = pvlib_1_result

plt.plot(input_sizes, results_pvlib, label="pvlib")
plt.plot(input_sizes, results_scipy, label="scipy")
plt.xlabel("Input vector length [-]")
plt.ylabel("CPU time [s]")
plt.legend()
plt.show()
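
One note on reading the plot: timeit returns the total time for `number` executions, measured with a wall-clock timer by default, so the values plotted above are sums over NUMBER_OF_RUNS_FOR_AVG calls. A per-call figure could be obtained along these lines; the workload function is a hypothetical stand-in and uses only the standard library.

```python
import math
from timeit import repeat


def workload():
    # stand-in for lambertw_pvlib(x); any callable works here
    return [math.log(v) for v in range(1, 2000)]


NUMBER = 5  # calls per timing, like NUMBER_OF_RUNS_FOR_AVG

# three independent totals, each covering NUMBER calls
totals = repeat(workload, number=NUMBER, repeat=3)

# the minimum is usually the least disturbed by other processes
per_call = min(totals) / NUMBER
print(f"best per-call time: {per_call:.2e} s")
```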

My humble laptop for reference

OS: CachyOS x86_64
Host: HP Laptop 15-dw2xxx
Kernel: Linux 6.12.69-3-cachyos-lts
Uptime: 14 hours, 21 mins
CPU: Intel(R) Core(TM) i7-1065G7 (8) @ 3.90 GHz
GPU 1: NVIDIA GeForce MX330 [Discrete]
GPU 2: Intel Iris Plus Graphics G7 @ 1.10 GHz [Integrated]
Memory: 3.39 GiB / 7.54 GiB (45%)
Swap: 1.26 GiB / 7.54 GiB (17%)

I'm running it interactively and I've gotten these warnings:

<ipython-input-19-784d9cac9f66>:28: RuntimeWarning: divide by zero encountered in log
  w = w * (1.0 - np.log(w) + w0) / (1.0 + w)
<ipython-input-19-784d9cac9f66>:28: RuntimeWarning: invalid value encountered in log
  w = w * (1.0 - np.log(w) + w0) / (1.0 + w)
<ipython-input-19-784d9cac9f66>:28: RuntimeWarning: invalid value encountered in multiply
  w = w * (1.0 - np.log(w) + w0) / (1.0 + w)

I suggest checking whether they matter and, if not, suppressing them.
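
As far as I can tell, the warnings come from the elements with x <= 1: x = 1 gives w0 = 0 and hence log(0), and x < 1 gives a negative w0 and hence the log of a negative number. Those elements are overwritten by the scipy fallback afterwards, so the final result should not be affected. One way to avoid the warnings altogether is to run the iteration only where x > 1. The sketch below does that, under the assumption that the caller fills in the remaining elements some other way; newton_lambertw_masked is a hypothetical name, and this only addresses the warnings, not convergence for arguments just above 1.

```python
import warnings

import numpy as np


def newton_lambertw_masked(x, n_iter=4):
    """Newton iteration for W(x) applied only where x > 1.

    Returns (w, needs_fallback). Entries with x <= 1 are left as nan
    and flagged so the caller can fill them in another way.
    """
    x = np.asarray(x, dtype=float)
    w = np.full(x.shape, np.nan)
    large = x > 1

    w0 = np.log(x[large])  # strictly positive, so log(w) below is defined
    wl = w0.copy()
    for _ in range(n_iter):
        wl = wl * (1.0 - np.log(wl) + w0) / (1.0 + wl)

    w[large] = wl
    return w, ~large


x = np.array([0.0, 0.5, 1.0, 10.0, 1.0e3, 1.0e100])
with warnings.catch_warnings():
    warnings.simplefilter("error")  # any RuntimeWarning would raise here
    w, needs_fallback = newton_lambertw_masked(x)
print(w, needs_fallback)
```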

1 reply

cwhanse Mar 24, 2026
Maintainer Author

The function in #2723 fixes some issues with the above, regarding convergence for small and overflowing arguments. I think it's just as accurate as scipy and as fast as the function you have tested.

kandersolar · 2026-03-31T18:14:54Z

kandersolar
Mar 31, 2026
Maintainer

I've written some code that adds a real-only calculation path to scipy.special.lambertw. On my computer, it competes well for speed with this numpy implementation, with no accuracy penalty wrt the existing code. Blue is current scipy, orange is #2723, and green is scipy/xsf#116.

Not sure when it will become available in scipy (assuming it does get merged), but whenever it does, I believe it will not require any change on our side; any pvlib calls should dispatch to the new path automatically.

1 reply

echedey-ls Mar 31, 2026
Collaborator

Whenever your improvement gets into a stable release, I think it would be worth announcing it in the whatsnew, in the mailing-list release highlights, or even as an admonition in the relevant docs.

Great improvement!
