Failures while assigning public IP via enableStaticNat #10313

@akrasnov-drv

problem

enableStaticNat starts failing after only a few calls.
The IP does get assigned, but after 3-4 successful calls the request fails with a client-side read timeout.
For my test I created a number of VMs and then tried to assign a public IP to each of them sequentially:

for i in `cloudstack listVirtualMachines | jq ".virtualmachine[].id"`
> do
> ip=`cloudstack associateIpAddress networkId=_my-network-id_ | jq ".ipaddress.id"`
> time cloudstack enableStaticNat ipaddressid=${ip} virtualmachineid=${i}
> done
{
  "success": true
}

real	0m9.272s
user	0m0.259s
sys	0m0.052s
{
  "success": true
}

real	0m10.097s
user	0m0.274s
sys	0m0.033s
{
  "success": true
}

real	0m10.301s
user	0m0.260s
sys	0m0.040s
Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/urllib3/connectionpool.py", line 446, in _make_request
    six.raise_from(e, None)
  File "<string>", line 3, in raise_from
  File "/usr/lib/python3/dist-packages/urllib3/connectionpool.py", line 441, in _make_request
    httplib_response = conn.getresponse()
  File "/usr/lib/python3.10/http/client.py", line 1375, in getresponse
    response.begin()
  File "/usr/lib/python3.10/http/client.py", line 318, in begin
    version, status, reason = self._read_status()
  File "/usr/lib/python3.10/http/client.py", line 279, in _read_status
    line = str(self.fp.readline(_MAXLINE + 1), "iso-8859-1")
  File "/usr/lib/python3.10/socket.py", line 705, in readinto
    return self._sock.recv_into(b)
TimeoutError: timed out

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/requests/adapters.py", line 439, in send
    resp = conn.urlopen(
  File "/usr/lib/python3/dist-packages/urllib3/connectionpool.py", line 756, in urlopen
    retries = retries.increment(
  File "/usr/lib/python3/dist-packages/urllib3/util/retry.py", line 534, in increment
    raise six.reraise(type(error), error, _stacktrace)
  File "/usr/lib/python3/dist-packages/six.py", line 719, in reraise
    raise value
  File "/usr/lib/python3/dist-packages/urllib3/connectionpool.py", line 700, in urlopen
    httplib_response = self._make_request(
  File "/usr/lib/python3/dist-packages/urllib3/connectionpool.py", line 448, in _make_request
    self._raise_timeout(err=e, url=url, timeout_value=read_timeout)
  File "/usr/lib/python3/dist-packages/urllib3/connectionpool.py", line 337, in _raise_timeout
    raise ReadTimeoutError(
urllib3.exceptions.ReadTimeoutError: HTTPConnectionPool(host='localhost', port=8080): Read timed out. (read timeout=10)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/bin/cloudstack", line 11, in <module>
    load_entry_point('cs==2.7.1', 'console_scripts', 'cs')()
  File "/usr/lib/python3/dist-packages/cs/__init__.py", line 104, in main
    response = getattr(cs, command)(fetch_result=fetch_result,
  File "/usr/lib/python3/dist-packages/cs/client.py", line 213, in handler
    return self._request(command, **kwargs)
  File "/usr/lib/python3/dist-packages/cs/client.py", line 273, in _request
    response = session.send(prepped,
  File "/usr/lib/python3/dist-packages/requests/sessions.py", line 657, in send
    r = adapter.send(request, **kwargs)
  File "/usr/lib/python3/dist-packages/requests/adapters.py", line 529, in send
    raise ReadTimeout(e, request=request)
requests.exceptions.ReadTimeout: HTTPConnectionPool(host='localhost', port=8080): Read timed out. (read timeout=10)

real	0m10.357s
user	0m0.281s
sys	0m0.068s
Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/urllib3/connectionpool.py", line 446, in _make_request
    six.raise_from(e, None)
  File "<string>", line 3, in raise_from
  File "/usr/lib/python3/dist-packages/urllib3/connectionpool.py", line 441, in _make_request
    httplib_response = conn.getresponse()
...

I tried different network and VR configurations, and it always fails; in the best case after 6-7 successful assignments.
A large VR with 4 CPUs and several GB of memory did not help either.

For every failing call, time reports real 0m10.350s or slightly more, i.e. just past the 10-second read timeout shown in the traceback.
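
Since the failing calls all end right at the client's `read timeout=10`, one way to tell a slow call from a rejected one is to rerun it with a longer client timeout. The following is only a minimal sketch: it assumes the cs client reads its timeout (in seconds) from the `CLOUDSTACK_TIMEOUT` environment variable, which should be verified against the installed cs version, and `with_cs_timeout` is a hypothetical helper name, not part of cs or CloudStack.

```shell
# Hypothetical helper: run any command with a longer cs client read timeout.
# Assumption: the cs client honors CLOUDSTACK_TIMEOUT (value in seconds).
with_cs_timeout() {
    local seconds=$1
    shift
    # The variable is set only for the environment of this one command.
    CLOUDSTACK_TIMEOUT="$seconds" "$@"
}

# Example (IP_ID and VM_ID are placeholders for real IDs):
# with_cs_timeout 60 cloudstack enableStaticNat ipaddressid="$IP_ID" virtualmachineid="$VM_ID"
```

If the same call then returns success after slightly more than 10 seconds, the failure is the client giving up on a slow response rather than CloudStack rejecting the request.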

Started from #10184

versions

CloudStack 4.20.0.0 with https://github.com/apache/cloudstack/pull/10254/files applied (PR does not help with this)
Ubuntu 22.04.5 LTS
libvirt 8.0.0-1ubuntu7.10
isolated network over VLAN
about 1000 public IPs in /20

The steps to reproduce the bug

  1. Create a number of VMs (I used 100, but just a few should be enough).
    Repeat steps 2-3 for each VM from step 1 until it starts failing (after about 3-4 cycles in my case).
  2. Use associateIpAddress to get a public IP ID.
  3. Use enableStaticNat with the VM ID and the IP ID.
    Initially enableStaticNat takes 9 seconds, then about 10, and then it starts failing with a timeout.

Here is a loop that does the above:

# Set NETWORK_ID to the ID of your network first.
for i in $(cloudstack listVirtualMachines | jq -r ".virtualmachine[].id"); do
    ip=$(cloudstack associateIpAddress networkId="${NETWORK_ID}" | jq -r ".ipaddress.id")
    time cloudstack enableStaticNat ipaddressid="${ip}" virtualmachineid="${i}"
done
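
To make the drift from 9 seconds toward the timeout easier to see, the loop can log one line per call with the exit status and elapsed time instead of the full `time` output and traceback. This is a sketch only; `timed_call` is a hypothetical helper, and `NETWORK_ID` in the usage comment is a placeholder.

```shell
# Hypothetical helper: run a command, discard its output, and print
# "<label> <exit-status> <elapsed-seconds>" on a single line.
timed_call() {
    local label=$1
    shift
    local start status=0
    start=$(date +%s)
    # Record a non-zero exit status instead of aborting the loop.
    "$@" >/dev/null 2>&1 || status=$?
    echo "$label $status $(( $(date +%s) - start ))"
}

# Usage sketch (NETWORK_ID is a placeholder for your network's ID):
# for vm in $(cloudstack listVirtualMachines | jq -r '.virtualmachine[].id'); do
#     ip=$(cloudstack associateIpAddress networkId="$NETWORK_ID" | jq -r '.ipaddress.id')
#     timed_call "$vm" cloudstack enableStaticNat ipaddressid="$ip" virtualmachineid="$vm"
# done
```

Elapsed time is measured in whole seconds, which is coarse but enough to show at which iteration the calls reach the 10-second limit.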

What to do about it?

It looks like the call takes too long to return and should be optimized.
