Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix GH-17715: Handle preloaded internal function runtime cache #17835

Merged
merged 5 commits into from
Feb 24, 2025

Conversation

bwoebi
Copy link
Member

@bwoebi bwoebi commented Feb 16, 2025

This solely affects the builtin enum functions currently.

Given that these are stored in SHM, we cannot simply hardwire a pointer into the internal function runtime cache on NTS too, but have to use a MAP_PTR (like on ZTS). Now, by design, the runtime cache of internal functions no longer is reset between requests, hence we need to store them explicitly as static runtime cache.

On NTS builds we cannot trivially move the pointers into CG(internal_run_time_cache) as they're directly stored on the individual functions (on ZTS we could simply iterate the static map_ptrs). Hence, we have the choice between having opcache managing the internal run_time_cache for its preloaded functions itself or realloc CG(internal_run_time_cache) and iterate through all functions to assign the new address. We choose the latter for simplicity and initial speed.

This changes the accel globals structs, but that should be fine given that they're not exported.
map_ptr_static_last has been added as last element to zend_accel_shared_globals, so that accesses to it are compatible. We do not have to care about the ABI of creating new zend_accel_shared_globals structs. That's opcaches prerogative.

@dstogov
Copy link
Member

dstogov commented Feb 17, 2025

I didn't understand the problem from the PR name and comments.
Only the issue clarifies that the problem is related to observer.

Given that these are stored in SHM, we cannot simply hardwire a pointer into the internal function runtime cache on NTS too, but have to use a MAP_PTR (like on ZTS).

I can't understand this.

Now, by design, the runtime cache of internal functions no longer is reset between requests, hence we need to store them explicitly as static runtime cache.

I don't know about this design.
Run time cache for internal functions was introduced by you exactly for observer support.
Initially, I thought it's a good idea, but now observer related code infected into core parts of the engine and opcache...
This made observer one of the most things that I dislike.
Sorry, I don't understand the patch and if all of its details are safe.

@iluuu1994 this is related to PHP Enums. Please also take a look.

@@ -61,7 +63,7 @@ PHP NEWS
. Fix memory leak on overflow in _php_stream_scandir(). (nielsdos)

- Windows:
. Fixed phpize for Windows 11 (24H2). (bwoebi)
. Fixed phpize for Windows 11 (24H2). (Bob)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Noted. :)

@bwoebi bwoebi requested a review from iluuu1994 February 17, 2025 14:26
@bwoebi
Copy link
Member Author

bwoebi commented Feb 17, 2025

I can't understand this.

op_arrays generated by preloaded scripts are stored in shm. Thus we need to use MAP_PTRs for NTS too. That's it.

Copy link
Member

@nielsdos nielsdos left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is hard to understand (especially after a day of work at my job).
It's more complexity over the original PR that introduced the issue. Is this worth the cost to just make observers faster? I suppose the original PR can't be reverted at this point due to ABI constraints?

@bwoebi bwoebi force-pushed the fix-internal-enum-rt-cache branch from 4f8cf6e to 7ce04d3 Compare February 17, 2025 21:38
@bwoebi
Copy link
Member Author

bwoebi commented Feb 18, 2025

This was quite worth the cost, yes :-(
I'm not happy about the extra complexity and may have to think about streamlining this better. But that's not for now.

I did have that assumption that all persistent internal functions were loaded before post_startup. Evidently preloaded enums violate this assumption...

Copy link
Member

@nielsdos nielsdos left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I understand the concept better now, but I still find it too complex to understand how this will interact with all of the details and possible implicit assumptions made in other parts of PHP.

I ran your test under Valgrind and I got the following Valgrind errors:

==226054== Invalid write of size 8
==226054==    at 0x9EF417: zend_observer_fcall_install (zend_observer.c:126)
==226054==    by 0x9EFAF4: zend_observer_fcall_begin_prechecked (zend_observer.c:262)
==226054==    by 0x8E3E8B: zend_observer_fcall_begin_specialized (zend_observer.h:116)
==226054==    by 0x8F7BC0: ZEND_DO_FCALL_SPEC_OBSERVER_HANDLER (zend_vm_execute.h:2141)
==226054==    by 0x976552: execute_ex (zend_vm_execute.h:58859)
==226054==    by 0x97AED8: zend_execute (zend_vm_execute.h:64236)
==226054==    by 0xA1B97E: zend_execute_script (zend.c:1934)
==226054==    by 0x7CA094: php_execute_script_ex (main.c:2575)
==226054==    by 0x7CA212: php_execute_script (main.c:2615)
==226054==    by 0xA1DE38: do_cli (php_cli.c:935)
==226054==    by 0xA1EB86: main (php_cli.c:1310)
==226054==  Address 0x7b0f5d8 is 0 bytes after a block of size 8 alloc'd
==226054==    at 0x48457A8: malloc (vg_replace_malloc.c:446)
==226054==    by 0x8779E9: __zend_malloc (zend_alloc.c:3280)
==226054==    by 0x787C911: preload_load (ZendAccelerator.c:4388)
==226054==    by 0x787D3E3: accel_preload (ZendAccelerator.c:4596)
==226054==    by 0x787D94F: accel_finish_startup_preload (ZendAccelerator.c:4717)
==226054==    by 0x787DDFC: accel_finish_startup (ZendAccelerator.c:4862)
==226054==    by 0x7878771: accel_post_startup (ZendAccelerator.c:3373)
==226054==    by 0xA19647: zend_post_startup (zend.c:1105)
==226054==    by 0x7C99DE: php_module_startup (main.c:2324)
==226054==    by 0xA1CD4E: php_cli_startup (php_cli.c:397)
==226054==    by 0xA1EAEA: main (php_cli.c:1277)
==226054== 
==226054== Invalid write of size 8
==226054==    at 0x9EF48C: zend_observer_fcall_install (zend_observer.c:138)
==226054==    by 0x9EFAF4: zend_observer_fcall_begin_prechecked (zend_observer.c:262)
==226054==    by 0x8E3E8B: zend_observer_fcall_begin_specialized (zend_observer.h:116)
==226054==    by 0x8F7BC0: ZEND_DO_FCALL_SPEC_OBSERVER_HANDLER (zend_vm_execute.h:2141)
==226054==    by 0x976552: execute_ex (zend_vm_execute.h:58859)
==226054==    by 0x97AED8: zend_execute (zend_vm_execute.h:64236)
==226054==    by 0xA1B97E: zend_execute_script (zend.c:1934)
==226054==    by 0x7CA094: php_execute_script_ex (main.c:2575)
==226054==    by 0x7CA212: php_execute_script (main.c:2615)
==226054==    by 0xA1DE38: do_cli (php_cli.c:935)
==226054==    by 0xA1EB86: main (php_cli.c:1310)
==226054==  Address 0x7b0f5d8 is 0 bytes after a block of size 8 alloc'd
==226054==    at 0x48457A8: malloc (vg_replace_malloc.c:446)
==226054==    by 0x8779E9: __zend_malloc (zend_alloc.c:3280)
==226054==    by 0x787C911: preload_load (ZendAccelerator.c:4388)
==226054==    by 0x787D3E3: accel_preload (ZendAccelerator.c:4596)
==226054==    by 0x787D94F: accel_finish_startup_preload (ZendAccelerator.c:4717)
==226054==    by 0x787DDFC: accel_finish_startup (ZendAccelerator.c:4862)
==226054==    by 0x7878771: accel_post_startup (ZendAccelerator.c:3373)
==226054==    by 0xA19647: zend_post_startup (zend.c:1105)
==226054==    by 0x7C99DE: php_module_startup (main.c:2324)
==226054==    by 0xA1CD4E: php_cli_startup (php_cli.c:397)
==226054==    by 0xA1EAEA: main (php_cli.c:1277)
==226054== 
==226054== Invalid read of size 8
==226054==    at 0x9EFB11: zend_observer_fcall_begin_prechecked (zend_observer.c:269)
==226054==    by 0x8E3E8B: zend_observer_fcall_begin_specialized (zend_observer.h:116)
==226054==    by 0x8F7BC0: ZEND_DO_FCALL_SPEC_OBSERVER_HANDLER (zend_vm_execute.h:2141)
==226054==    by 0x976552: execute_ex (zend_vm_execute.h:58859)
==226054==    by 0x97AED8: zend_execute (zend_vm_execute.h:64236)
==226054==    by 0xA1B97E: zend_execute_script (zend.c:1934)
==226054==    by 0x7CA094: php_execute_script_ex (main.c:2575)
==226054==    by 0x7CA212: php_execute_script (main.c:2615)
==226054==    by 0xA1DE38: do_cli (php_cli.c:935)
==226054==    by 0xA1EB86: main (php_cli.c:1310)
==226054==  Address 0x7b0f5d8 is 0 bytes after a block of size 8 alloc'd
==226054==    at 0x48457A8: malloc (vg_replace_malloc.c:446)
==226054==    by 0x8779E9: __zend_malloc (zend_alloc.c:3280)
==226054==    by 0x787C911: preload_load (ZendAccelerator.c:4388)
==226054==    by 0x787D3E3: accel_preload (ZendAccelerator.c:4596)
==226054==    by 0x787D94F: accel_finish_startup_preload (ZendAccelerator.c:4717)
==226054==    by 0x787DDFC: accel_finish_startup (ZendAccelerator.c:4862)
==226054==    by 0x7878771: accel_post_startup (ZendAccelerator.c:3373)
==226054==    by 0xA19647: zend_post_startup (zend.c:1105)
==226054==    by 0x7C99DE: php_module_startup (main.c:2324)
==226054==    by 0xA1CD4E: php_cli_startup (php_cli.c:397)
==226054==    by 0xA1EAEA: main (php_cli.c:1277)
==226054== 
==226054== Invalid read of size 8
==226054==    at 0x9EFC94: call_end_observers (zend_observer.c:303)
==226054==    by 0x9EFD1C: zend_observer_fcall_end_prechecked (zend_observer.c:315)
==226054==    by 0x8E3ED2: zend_observer_fcall_end (zend_observer.h:126)
==226054==    by 0x8F7D44: ZEND_DO_FCALL_SPEC_OBSERVER_HANDLER (zend_vm_execute.h:2161)
==226054==    by 0x976552: execute_ex (zend_vm_execute.h:58859)
==226054==    by 0x97AED8: zend_execute (zend_vm_execute.h:64236)
==226054==    by 0xA1B97E: zend_execute_script (zend.c:1934)
==226054==    by 0x7CA094: php_execute_script_ex (main.c:2575)
==226054==    by 0x7CA212: php_execute_script (main.c:2615)
==226054==    by 0xA1DE38: do_cli (php_cli.c:935)
==226054==    by 0xA1EB86: main (php_cli.c:1310)
==226054==  Address 0x7b0f5d8 is 0 bytes after a block of size 8 alloc'd
==226054==    at 0x48457A8: malloc (vg_replace_malloc.c:446)
==226054==    by 0x8779E9: __zend_malloc (zend_alloc.c:3280)
==226054==    by 0x787C911: preload_load (ZendAccelerator.c:4388)
==226054==    by 0x787D3E3: accel_preload (ZendAccelerator.c:4596)
==226054==    by 0x787D94F: accel_finish_startup_preload (ZendAccelerator.c:4717)
==226054==    by 0x787DDFC: accel_finish_startup (ZendAccelerator.c:4862)
==226054==    by 0x7878771: accel_post_startup (ZendAccelerator.c:3373)
==226054==    by 0xA19647: zend_post_startup (zend.c:1105)
==226054==    by 0x7C99DE: php_module_startup (main.c:2324)
==226054==    by 0xA1CD4E: php_cli_startup (php_cli.c:397)
==226054==    by 0xA1EAEA: main (php_cli.c:1277)
==226054== 
==226054== Invalid read of size 8
==226054==    at 0x9EFCA0: call_end_observers (zend_observer.c:303)
==226054==    by 0x9EFD1C: zend_observer_fcall_end_prechecked (zend_observer.c:315)
==226054==    by 0x8E3ED2: zend_observer_fcall_end (zend_observer.h:126)
==226054==    by 0x8F7D44: ZEND_DO_FCALL_SPEC_OBSERVER_HANDLER (zend_vm_execute.h:2161)
==226054==    by 0x976552: execute_ex (zend_vm_execute.h:58859)
==226054==    by 0x97AED8: zend_execute (zend_vm_execute.h:64236)
==226054==    by 0xA1B97E: zend_execute_script (zend.c:1934)
==226054==    by 0x7CA094: php_execute_script_ex (main.c:2575)
==226054==    by 0x7CA212: php_execute_script (main.c:2615)
==226054==    by 0xA1DE38: do_cli (php_cli.c:935)
==226054==    by 0xA1EB86: main (php_cli.c:1310)
==226054==  Address 0x7b0f5d8 is 0 bytes after a block of size 8 alloc'd
==226054==    at 0x48457A8: malloc (vg_replace_malloc.c:446)
==226054==    by 0x8779E9: __zend_malloc (zend_alloc.c:3280)
==226054==    by 0x787C911: preload_load (ZendAccelerator.c:4388)
==226054==    by 0x787D3E3: accel_preload (ZendAccelerator.c:4596)
==226054==    by 0x787D94F: accel_finish_startup_preload (ZendAccelerator.c:4717)
==226054==    by 0x787DDFC: accel_finish_startup (ZendAccelerator.c:4862)
==226054==    by 0x7878771: accel_post_startup (ZendAccelerator.c:3373)
==226054==    by 0xA19647: zend_post_startup (zend.c:1105)
==226054==    by 0x7C99DE: php_module_startup (main.c:2324)
==226054==    by 0xA1CD4E: php_cli_startup (php_cli.c:397)
==226054==    by 0xA1EAEA: main (php_cli.c:1277)
==226054== 
==226054== Invalid read of size 8
==226054==    at 0x9EFCC7: call_end_observers (zend_observer.c:309)
==226054==    by 0x9EFD1C: zend_observer_fcall_end_prechecked (zend_observer.c:315)
==226054==    by 0x8E3ED2: zend_observer_fcall_end (zend_observer.h:126)
==226054==    by 0x8F7D44: ZEND_DO_FCALL_SPEC_OBSERVER_HANDLER (zend_vm_execute.h:2161)
==226054==    by 0x976552: execute_ex (zend_vm_execute.h:58859)
==226054==    by 0x97AED8: zend_execute (zend_vm_execute.h:64236)
==226054==    by 0xA1B97E: zend_execute_script (zend.c:1934)
==226054==    by 0x7CA094: php_execute_script_ex (main.c:2575)
==226054==    by 0x7CA212: php_execute_script (main.c:2615)
==226054==    by 0xA1DE38: do_cli (php_cli.c:935)
==226054==    by 0xA1EB86: main (php_cli.c:1310)
==226054==  Address 0x7b0f5d8 is 0 bytes after a block of size 8 alloc'd
==226054==    at 0x48457A8: malloc (vg_replace_malloc.c:446)
==226054==    by 0x8779E9: __zend_malloc (zend_alloc.c:3280)
==226054==    by 0x787C911: preload_load (ZendAccelerator.c:4388)
==226054==    by 0x787D3E3: accel_preload (ZendAccelerator.c:4596)
==226054==    by 0x787D94F: accel_finish_startup_preload (ZendAccelerator.c:4717)
==226054==    by 0x787DDFC: accel_finish_startup (ZendAccelerator.c:4862)
==226054==    by 0x7878771: accel_post_startup (ZendAccelerator.c:3373)
==226054==    by 0xA19647: zend_post_startup (zend.c:1105)
==226054==    by 0x7C99DE: php_module_startup (main.c:2324)
==226054==    by 0xA1CD4E: php_cli_startup (php_cli.c:397)
==226054==    by 0xA1EAEA: main (php_cli.c:1277)

This was quite worth the cost, yes :-(
I'm not happy about the extra complexity and may have to think about streamlining this better. But that's not for now.

Is it still worth the cost? I'm not convinced tbh. I rather revert the original PR. The public ZEND_API fields introduced by the original PR can be hardcoded to zeros.

@bwoebi
Copy link
Member Author

bwoebi commented Feb 18, 2025

@nielsdos You're absolutely right. It passed under asan, because asan is 16 byte aligned. Accidentally multiplied by sizeof(void *) instead of cache_size.

If I recall correctly this change had a minor (0.1~0.2%) benefit to performance without observers and 1.5% benefit with observers enabled with symfony demo. I'd be honestly disappointed if rolling it back were the way to go.
What I'd rather do is an improvement of the current situation in master (i.e. have this patch in 8.4, then improve for 8.5) and think about how to streamline this in master.
I was also anticipating that this work could be beneficial for mutable class data and (ironically) for preloaded functions to give yet another small percentage of benefit.
The estimated performance gain was to be assumed to be bigger for preloading (as most calls would ultimately be going to already allocated functions).

Copy link
Member

@nielsdos nielsdos left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought some time about this and went over it again and I think I fully understand it. I just have some small nits/questions. I hope there won't be any sneaky problems anymore.
I also tested the preloading in the test suite and Symfony-demo in Valgrind and there were no issue reports.
I hope this can be further cleaned up in master. I particularly don't like how some map_ptr details are spreading inside the accelerator itself. In particular the reallocation logic is very similar/repeated.

CG(map_ptr_real_base) = new_base;
zend_map_ptr_static_size = new_static_size;
} else {
CG(map_ptr_real_base) = perealloc(CG(map_ptr_real_base), CG(map_ptr_size) * sizeof(void *), 1);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this using CG(map_ptr_size) for the number of elements instead of new_static_size + CG(map_ptr_size) ?

@@ -70,6 +70,9 @@ typedef struct _zend_string zend_string;
} while (0)
# define ZEND_MAP_PTR_BIASED_BASE(real_base) \
((void*)(((uintptr_t)(real_base)) + zend_map_ptr_static_size * sizeof(void *) - 1))
// Note: chunked like: [8192..12287][4096..8191][0..4095]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: Normally we use C-style comments

@@ -278,6 +281,8 @@ typedef struct _zend_accel_shared_globals {
void *jit_traces;
const void **jit_exit_groups;

size_t map_ptr_static_last;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I hope no one is doing anything funky ABI-wise, they shouldn't of course...
Anyway, since ABI shouldn't be a concern, can we please put this next to the map_ptr_last field? A bit more cache locality.

This solely affects the builtin enum functions currently.

Given that these are stored in SHM, we cannot simply hardwire a pointer into the internal function runtime cache on NTS too, but have to use a MAP_PTR (like on ZTS).
Now, by design, the runtime cache of internal functions no longer is reset between requests, hence we need to store them explicitly as static runtime cache.

On NTS builds we cannot trivially move the pointers into CG(internal_run_time_cache) as they're directly stored on the individual functions (on ZTS we could simply iterate the static map_ptrs).
Hence, we have the choice between having opcache managing the internal run_time_cache for its preloaded functions itself or realloc CG(internal_run_time_cache) and iterate through all functions to assign the new address. We choose the latter for simplicity and initial speed.

Note: map_ptr_static_last has been added as last element to zend_accel_shared_globals, so that accesses to it are compatible. We do not have to care about the ABI of creating new zend_accel_shared_globals structs. That's opcaches prerogative.
Signed-off-by: Bob Weinand <[email protected]>
Signed-off-by: Bob Weinand <[email protected]>
Signed-off-by: Bob Weinand <[email protected]>
@bwoebi bwoebi force-pushed the fix-internal-enum-rt-cache branch from 2a30bcc to 2bc5e14 Compare February 23, 2025 13:16
@bwoebi
Copy link
Member Author

bwoebi commented Feb 23, 2025

@nielsdos yeah, master should get a proper extend-API, to avoid doing this stuff in opcache, I fully agree.
Thanks for the thorough reviewing!

Copy link
Member

@nielsdos nielsdos left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see problems anymore. I hope all details are safe and interact properly.

@bwoebi bwoebi merged commit 53fa98e into php:PHP-8.4 Feb 24, 2025
10 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants