Eliminate snapshot restoration cost by using a MAP_PRIVATE / copy-on-write mapping. To prevent object destruction from faulting most of our mapping, we skip snapshotted objects in zend_objects_store_call_destructors(), unless they have destructors. Credit for these ideas belongs to Bob. We also increase the refcount of snapshotted global variables and objects to prevent freeing them, which would also generate a lot of page faults. Co-authored-by: Bob Weinand <bobwei9@hotmail.com>
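The copy-on-write idea in this commit message can be illustrated with plain POSIX calls. The following is a minimal sketch, not the code from this branch: it assumes the snapshot is stored in a temporary file, and the names `snapshot_to_file`, `restore_cow` and `cow_demo` are hypothetical. Restoring maps the file over the original address with `MAP_PRIVATE | MAP_FIXED`, so no bytes are copied up front and only pages written after the restore get duplicated.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

#define REGION_SIZE (2u * 1024 * 1024) /* one 2 MiB chunk, as in the description */

/* Hypothetical helper: write `len` bytes at `region` to an unlinked temp file.
 * Returns a file descriptor holding the snapshot, or -1 on error. */
static int snapshot_to_file(const void *region, size_t len)
{
    FILE *tmp = tmpfile();
    if (!tmp) return -1;
    int fd = dup(fileno(tmp)); /* keep the file alive after fclose() */
    fclose(tmp);
    if (fd < 0) return -1;

    const char *p = region;
    size_t left = len;
    while (left > 0) {
        ssize_t n = write(fd, p, left);
        if (n <= 0) { close(fd); return -1; }
        p += n;
        left -= (size_t)n;
    }
    return fd;
}

/* Hypothetical helper: replace the pages at `region` with a private,
 * copy-on-write view of the snapshot file. Nothing is copied here;
 * pages are faulted in lazily and duplicated only when written. */
static int restore_cow(void *region, size_t len, int fd)
{
    void *p = mmap(region, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_FIXED, fd, 0);
    return p == region ? 0 : -1;
}

/* Returns 0 when the restore brings back the snapshotted bytes and later
 * writes stay private to the mapping. */
int cow_demo(void)
{
    char *region = mmap(NULL, REGION_SIZE, PROT_READ | PROT_WRITE,
                        MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    assert(region != MAP_FAILED);

    strcpy(region, "initialized state");
    int fd = snapshot_to_file(region, REGION_SIZE);
    if (fd < 0) return -1;

    strcpy(region, "request-local changes"); /* state mutated by a request */
    if (restore_cow(region, REGION_SIZE, fd) != 0) return -1;
    int ok = strcmp(region, "initialized state") == 0;

    region[0] = 'X'; /* triggers a private copy of one page */
    char first = 0;
    if (pread(fd, &first, 1, 0) != 1) return -1;
    ok = ok && first == 'i'; /* the snapshot file is unchanged */

    close(fd);
    munmap(region, REGION_SIZE);
    return ok ? 0 : 1;
}
```

Calling `cow_demo()` from a `main` function should return 0 on a POSIX system. The page-fault concern in the commit message follows from this design: every page touched after the restore costs a fault and a copy, which is why avoiding writes such as refcount decrements on snapshotted objects matters.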
arnaud-lb pushed a commit that referenced this pull request Mar 31, 2025
```
ext/gd/libgd/gd.c:2275:14: runtime error: signed integer overflow: 2147483647 + 1 cannot be represented in type 'int'
#0 0x5d6a2103e1db in php_gd_gdImageCopy /home/dcarlier/Contribs/php-src/ext/gd/libgd/gd.c:2275
#1 0x5d6a210a2b63 in gdImageCrop /home/dcarlier/Contribs/php-src/ext/gd/libgd/gd_crop.c:57
#2 0x5d6a21018ca4 in zif_imagecrop /home/dcarlier/Contribs/php-src/ext/gd/gd.c:3575
#3 0x5d6a21e46e7a in ZEND_DO_ICALL_SPEC_RETVAL_USED_HANDLER /home/dcarlier/Contribs/php-src/Zend/zend_vm_execute.h:1337
#4 0x5d6a221188da in execute_ex /home/dcarlier/Contribs/php-src/Zend/zend_vm_execute.h:57246
#5 0x5d6a221366bd in zend_execute /home/dcarlier/Contribs/php-src/Zend/zend_vm_execute.h:61634
#6 0x5d6a21d107a6 in zend_execute_scripts /home/dcarlier/Contribs/php-src/Zend/zend.c:1895
#7 0x5d6a21a63409 in php_execute_script /home/dcarlier/Contribs/php-src/main/main.c:2529
#8 0x5d6a22516d5e in do_cli /home/dcarlier/Contribs/php-src/sapi/cli/php_cli.c:966
#9 0x5d6a2251981d in main /home/dcarlier/Contribs/php-src/sapi/cli/php_cli.c:1341
#10 0x7f10d002a3b7 in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58
#11 0x7f10d002a47a in __libc_start_main_impl ../csu/libc-start.c:360
#12 0x5d6a20a06da4 in _start (/home/dcarlier/Contribs/php-src/sapi/cli/php+0x2806da4) (BuildId: d9a79c7e0e4872311439d7313cb3a81fe04190a2)
```
close phpGH-18006
This implements snapshotting of the process's state. The snapshot can be restored in a subsequent request, to save initialization time.
This adds two functions:
Example usage:
What is restored
Internal objects
Internal objects don't need special support as long as they use only the zend heap (e.g. no native allocations, kernel resources). However they need to be marked as such. The current branch marks a few internal classes as safe because these are used in the symfony-demo benchmark.
snapshot_state() will throw if an unsupported object is found.
Snapshots are private to a php instance / process, so it may be possible to support objects with native allocations and kernel resources. In some cases it may be acceptable that the state of the native allocation / kernel resource changes after the snapshot (e.g. for streams or database connections), but in others it might not (e.g. DOM objects), in which case we may need to separate/clone the resource.
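The safety check described above could look roughly like the toy model below. This is only a sketch under stated assumptions: the types `toy_class` and `toy_object`, the flag `TOY_CLASS_SNAPSHOT_SAFE` and the function `toy_find_unsupported` are all hypothetical and do not correspond to the engine's real structures or to the flag used in this branch. It also assumes that user-defined classes are always acceptable because their objects live entirely on the zend heap.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical flag: set on internal classes that only use the zend heap. */
#define TOY_CLASS_SNAPSHOT_SAFE (1u << 0)

typedef struct {
    const char *name;
    int is_internal;  /* 1 for classes implemented in C, 0 for user classes */
    unsigned flags;
} toy_class;

typedef struct {
    const toy_class *ce;
} toy_object;

/* Scan all live objects. Return the first object whose class is internal
 * and not marked safe, or NULL when everything can be snapshotted. A real
 * implementation would throw an exception naming the offending class. */
static const toy_object *toy_find_unsupported(const toy_object *objs, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        const toy_class *ce = objs[i].ce;
        if (ce->is_internal && !(ce->flags & TOY_CLASS_SNAPSHOT_SAFE)) {
            return &objs[i];
        }
    }
    return NULL;
}

/* Returns 0 when the check accepts safe objects and rejects unsafe ones. */
int toy_check_demo(void)
{
    static const toy_class user_class = { "UserClass", 0, 0 };
    static const toy_class safe_internal = { "SafeInternal", 1, TOY_CLASS_SNAPSHOT_SAFE };
    static const toy_class unsafe_internal = { "UnsafeInternal", 1, 0 };

    toy_object objs[3] = { { &user_class }, { &safe_internal }, { &unsafe_internal } };

    int ok = toy_find_unsupported(objs, 2) == NULL;  /* only supported objects */
    ok = ok && toy_find_unsupported(objs, 3) == &objs[2];
    return ok ? 0 : 1;
}
```

An opt-in flag keeps the default conservative: an internal class that has not been reviewed is rejected rather than silently snapshotted with dangling native state.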
Design
The basic idea is that we copy the heap to a separate buffer. Later, we restore the state by copying back the buffer to the old location. We don't have to copy/relocate individual zvals and other state: We just copy heap chunks. We then update the global symbol table.
Snapshotting
We make a copy of every heap chunk (2MiB each), and prevent the heap from releasing these chunks later (to reserve the address space).
We also copy symbol tables (variables, classes, functions, constants). Symbols are allocated either in the heap or in opcache SHM, so we only copy the hashtables, not the symbols themselves.
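The shallow copy of symbol tables can be pictured with a toy table of pointers. The sketch below is an illustration only: `toy_table`, `toy_bucket` and the snapshot/restore helpers are hypothetical and much simpler than the engine's HashTable. It assumes the live table never shrinks below the snapshotted size and that the capacity is non-zero.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef struct {
    const char *name;
    void *symbol;      /* points into the heap or SHM; never copied */
} toy_bucket;

typedef struct {
    toy_bucket *buckets;
    size_t used;
    size_t capacity;
} toy_table;

/* Copy only the bucket array. The symbols the buckets point to stay
 * where they are. */
static int toy_table_snapshot(toy_table *snap, const toy_table *live)
{
    snap->buckets = malloc(live->capacity * sizeof *snap->buckets);
    if (!snap->buckets) return -1;
    memcpy(snap->buckets, live->buckets, live->used * sizeof *snap->buckets);
    snap->used = live->used;
    snap->capacity = live->capacity;
    return 0;
}

/* Put the saved buckets back; entries added after the snapshot are dropped. */
static void toy_table_restore(toy_table *live, const toy_table *snap)
{
    memcpy(live->buckets, snap->buckets, snap->used * sizeof *live->buckets);
    live->used = snap->used;
}

/* Returns 0 when a symbol added after the snapshot is gone after restore
 * and the original entries still point to the same symbols. */
int toy_table_demo(void)
{
    int func_a = 1, func_b = 2, func_c = 3; /* stand-ins for symbols */
    toy_bucket storage[4] = { { "a", &func_a }, { "b", &func_b } };
    toy_table live = { storage, 2, 4 };
    toy_table snap;

    if (toy_table_snapshot(&snap, &live) != 0) return -1;

    live.buckets[live.used++] = (toy_bucket){ "c", &func_c }; /* defined during a request */
    live.buckets[0].symbol = NULL;                            /* entry clobbered */

    toy_table_restore(&live, &snap);
    int ok = live.used == 2
          && live.buckets[0].symbol == &func_a
          && live.buckets[1].symbol == &func_b;

    free(snap.buckets);
    return ok ? 0 : 1;
}
```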
Restoring
When restoring, we just restore chunk copies to their old location, and add the old chunks back to the heap. We then restore the symbol tables.
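The snapshot and restore steps can be sketched with a single chunk. This is a minimal illustration, not the branch's implementation: the `chunk_snapshot` type and its helper functions are hypothetical, and an anonymous `mmap` region stands in for a real heap chunk. The point it shows is that pointers stored inside the chunk remain valid after a restore because the bytes return to the same address.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>

#define CHUNK_SIZE (2u * 1024 * 1024) /* 2 MiB, as stated in the description */

typedef struct {
    void *addr;  /* original location of the chunk */
    void *copy;  /* byte-for-byte copy held outside the heap */
} chunk_snapshot;

/* Copy the whole chunk to a separate buffer. */
static int chunk_snapshot_take(chunk_snapshot *s, void *chunk)
{
    s->addr = chunk;
    s->copy = malloc(CHUNK_SIZE);
    if (!s->copy) return -1;
    memcpy(s->copy, chunk, CHUNK_SIZE);
    return 0;
}

/* Copy the saved bytes back to the chunk's original address. */
static void chunk_snapshot_restore(const chunk_snapshot *s)
{
    memcpy(s->addr, s->copy, CHUNK_SIZE);
}

/* Returns 0 when both a value and an interior pointer survive the restore. */
int chunk_demo(void)
{
    char *chunk = mmap(NULL, CHUNK_SIZE, PROT_READ | PROT_WRITE,
                       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    assert(chunk != MAP_FAILED);

    /* A value and a pointer to it, both stored inside the chunk. */
    int *value = (int *)chunk;
    int **ptr = (int **)(chunk + 64);
    *value = 42;
    *ptr = value;

    chunk_snapshot snap;
    if (chunk_snapshot_take(&snap, chunk) != 0) return -1;

    *value = 7;   /* a request mutates the heap */
    *ptr = NULL;

    chunk_snapshot_restore(&snap);
    int ok = (*value == 42) && (*ptr == value); /* no relocation needed */

    free(snap.copy);
    munmap(chunk, CHUNK_SIZE);
    return ok ? 0 : 1;
}
```

Because nothing is relocated, the cost of a restore in this model is one `memcpy` per chunk, which scales with heap size rather than with the number of objects.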
Thoughts
Snapshots are private to each php instance, as the location of the heap is specific to each instance (this is important because when we restore a snapshot, we want to do so at the original memory location).
Sharing snapshots is possible if a fixed memory region is reserved in the parent process, like the opcache SHM. But this wouldn't work for ZTS, as each php instance needs separate regions anyway. Also, instance-private snapshots are convenient to support snapshotting of internal objects (see above).
One issue with raw-copying heap chunks is that some slots will be reported as leaks after a restore, in debug builds, but this is fixable.
A less brute-force approach would copy/relocate every individual zval/object/etc. to a new heap separately, in a similar fashion to zend_persist.c. CoW objects such as arrays and strings could be moved to the native heap, to reduce restoring costs further. This approach would result in a smaller heap, which also reduces restoring cost. However, relocating would increase maintenance, as every internal object needs to know how to clone itself to a new location.
Performance
Benchmark
/en/blog/ on the Symfony Demo app
nprocs*2 threads/processes
Results
PHP-FPM with snapshotting is slower than FrankenPHP. How much slower depends on the snapshot size and on when the snapshot is taken. Best results were obtained when taking a snapshot just after the first request:
Restoring the state takes about 14% of the time in this benchmark. This could possibly be reduced by a few percent.
Taking the snapshot just after booting the kernel yields a smaller improvement over baseline:
Initializing all container services before snapshot is slower than baseline due to the cost of restoring state.
More benchmarking / analysis is needed.