6419 Commits

Author SHA1 Message Date
Yang Liu
926e4b2da8
[DYNAREC] Added ranged Dynablock dump (#2570) 2025-04-24 10:37:24 +02:00
Yang Liu
4903177bab
[ARM64_DYNAREC] Minor optim to MOVNTDQA (#2568) 2025-04-24 09:17:52 +02:00
Yang Liu
d8a6fa0395
Added some missing newlines (#2567) 2025-04-24 09:16:49 +02:00
ptitSeb
69127efae9 [ARM64_DYNAREC] Small fixes and improvements to (V)MOVMSKP[S/D] opcodes 2025-04-23 18:43:13 +02:00
ptitSeb
6f0db360a4 [ARM64_DYNAREC] Few fixes and small cosmetic changes to some partial (V)MOV opcodes 2025-04-23 18:21:15 +02:00
ptitSeb
223de50ec9 [INTERP] Few fixes and small cosmetic changes to some partial (V)MOV opcodes 2025-04-23 18:20:38 +02:00
ptitSeb
5cfad22165 [ARM64_DYNAREC] Made REP MOVSB optimisation flagless 2025-04-23 12:54:04 +02:00
ptitSeb
815836d285 [ARM64_DYNAREC] Optimized REP STOSB 2025-04-23 12:47:56 +02:00
ptitSeb
468a3c2165 [PERFMAP] Added x86 address of code when function name cannot be found, instead of ??? 2025-04-23 11:48:37 +02:00
ptitSeb
3afe87bcce [ARM64_DYNAREC] Various improvements to various SSE/AVX 128bits/256bits mov opcodes 2025-04-23 10:57:07 +02:00
ptitSeb
d79d6bd6c2 [INTERP] RaZ upper 128bits on vmov* Ex, Gx if Ex is a register (unused?) 2025-04-23 10:55:17 +02:00
rajdakin
cc6500b7dd
[RBTREE] Fixed an edge case (#2562) 2025-04-22 13:52:01 +02:00
Yang Liu
ad494480ce
[DYNAREC] Added a x87pc test and some cosmetic changes too (#2561) 2025-04-22 13:31:04 +02:00
phorcys
854f6675db
[LA64_DYNAREC] Add SSSE3's mmx ops. (#2559)
0f.38.00 PSHUFB
      01 PHADDW
      02 PHADDD
      03 PHADDSW
      04 PMADDUBSW
      05 PHSUBW
      06 PHSUBD
      07 PHSUBSW
      08 PSIGNB
      09 PSIGNW
      0a PSIGND
      0b PMULHRSW
      1c PABSB
      1d PABSW
      1e PABSD
2025-04-22 13:26:51 +02:00
ptitSeb
fc15743ff9 [ARM64_DYNAREC] Improved (V)[MIN/MAX][S/P][S/D] opcodes 2025-04-22 12:20:23 +02:00
ptitSeb
91ead3b12a [INTERP] Improved (V)[MIN/MAX][S/P][S/D] opcodes 2025-04-22 12:20:03 +02:00
Chi-Kuan Chiu
75aaf7aa64
[RBTREE] Cache boundary nodes and remove add_range() (#2557)
* Cache leftmost and rightmost node

Add two fields to `rbtree`: `lefter` and `righter`, to cache the
leftmost and rightmost nodes respectively. This eliminates the need for
O(log N) traversals in `rb_get_lefter()` and `rb_get_righter()`.

Motivated by the Linux kernel's use of cached pointers for `rb_first()`
and `rb_last()`, this change improves efficiency of boundary queries by
replacing repeated tree walks with direct pointer dereference.

Experiment: running `chess.exe` with Box64 + Wine (#2511)
- ~3,500 insertions into the tree
- 607 lightweight cache updates (single assignment)
- 397 full tree traversals avoided

This results in reduced runtime overhead for boundary checks, at a
memory cost of two extra pointers per tree. Expected benefits increase
in larger or more dynamic workloads.

Ref: https://docs.kernel.org/core-api/rbtree.html

* Remove redundant add_range() wrapper

The function `add_range()` was only called when `tree->root == NULL`.
In such cases, the while-loop inside `add_range()` never runs,
resulting in a call to `add_range_next_to()` with `prev == NULL`.

Replaced it with direct calls to `add_range_next_to(tree, NULL, ...)`.
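
The boundary caching described in the first item of this commit can be sketched as follows. This is a minimal illustration, not the Box64 implementation: the `toy_*` types and functions are hypothetical, and insertion is a plain binary-search-tree insert with no red-black rebalancing, so that only the `lefter`/`righter` bookkeeping is shown.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical node and tree types; not the actual Box64 definitions. */
typedef struct toy_node {
    uintptr_t start;
    struct toy_node *left, *right;
} toy_node;

typedef struct toy_tree {
    toy_node *root;
    toy_node *lefter;   /* cached leftmost node */
    toy_node *righter;  /* cached rightmost node */
} toy_tree;

/* Insert a key (no rebalancing, for brevity) and refresh both caches. */
static toy_node *toy_insert(toy_tree *tree, uintptr_t start)
{
    assert(tree);
    toy_node *node = calloc(1, sizeof(*node));
    if (!node)
        return NULL;
    node->start = start;

    /* Ordinary BST descent; equal keys go to the right. */
    toy_node **link = &tree->root;
    while (*link)
        link = (start < (*link)->start) ? &(*link)->left : &(*link)->right;
    *link = node;

    /* One comparison and one assignment per boundary replaces a later walk. */
    if (!tree->lefter || start < tree->lefter->start)
        tree->lefter = node;
    if (!tree->righter || start >= tree->righter->start)
        tree->righter = node;
    return node;
}

/* O(1) boundary queries: a pointer dereference instead of a tree traversal. */
static uintptr_t toy_get_lefter(const toy_tree *tree)
{
    assert(tree);
    return tree->lefter ? tree->lefter->start : 0;
}

static uintptr_t toy_get_righter(const toy_tree *tree)
{
    assert(tree);
    return tree->righter ? tree->righter->start : 0;
}
```

A real red-black tree would also have to refresh the cached pointers on deletion (for example by moving to the successor or predecessor of the removed node); that part is omitted here.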
2025-04-22 11:51:57 +02:00
Yang Liu
39a66e25ee
[RV64_DYNAREC] Better handling of x87double=2 (#2560) 2025-04-22 11:07:34 +02:00
ptitSeb
574d6f9dab [ARM64_DYNAREC] Small improvements to (V)MASKMOVDQU opcode 2025-04-21 17:41:04 +02:00
ptitSeb
c39869b770 [ARM64_DYNAREC] Better handling of x87double=2 2025-04-21 16:07:05 +02:00
ptitSeb
302b6493e2 [ARM64_DYNAREC] Fixed potential issue with (V)LDDQU opcode 2025-04-21 14:13:56 +02:00
Yang Liu
20ea2987a8
[DYNAREC] More handling of low precision x87 flag change (#2556) 2025-04-21 14:09:26 +02:00
ptitSeb
eee547d50a [INTERP] More fixes to INSERTQ/EXTRQ opcodes 2025-04-21 14:01:56 +02:00
Yang Liu
a19f4b9eca
[RV64_DYNAREC][TRACE][COSIM] Improve x87 reliability in dynarec trace and cosim scenarios (#2555) 2025-04-21 13:18:40 +02:00
ptitSeb
6f3f3e0e85 [ARM64_DYNAREC] Add/Improved (V)H[ADD/SUB]P[S/D] opcodes 2025-04-21 12:22:06 +02:00
Yang Liu
2384462f61
[ENV][COSIM] Enable x87double only if it's off (#2554) 2025-04-21 11:54:42 +02:00
ptitSeb
e6e6c3ac65 [ARM64_DYNAREC] Small change to 66 0F 3A 17 opcode 2025-04-21 11:45:09 +02:00
ptitSeb
102e6a1f61 [INTERP] Fixed EXTRQ opcode 2025-04-21 11:44:11 +02:00
ptitSeb
460e6c517d [ARM64_DYNAREC] Minor cosmetic changes 2025-04-21 11:27:20 +02:00
ptitSeb
965105c08a [INTERP] Better NAN handling for (V)DIV[P/S][S/D] opcodes 2025-04-21 11:27:01 +02:00
ptitSeb
3fef880a7d [INTERP] VDPPD has no 256bits version 2025-04-21 11:26:01 +02:00
Yang Liu
e7b4c79d12
[RV64_DYNAREC] Added X87DOUBLE=2 support (#2553) 2025-04-21 10:19:44 +02:00
ptitSeb
7cdfa187c5 [ARM64_DYNAREC] Another potential fix for X87DOUBLE=2 2025-04-21 09:25:24 +02:00
ptitSeb
d0da57547e [ARM64_DYNAREC] Fixed some potential issues with BOX64_DYNAREC_DOUBLE=2 2025-04-21 08:58:08 +02:00
ptitSeb
e60fd72672 [TRACE] Fixed an issue with a trace on dynablock exiting execution 2025-04-21 08:57:33 +02:00
ptitSeb
c6ae7d36dd [TRACE] Better trace, using mapfile name if available, and better write on a dynablock memory log 2025-04-20 16:23:30 +02:00
ptitSeb
3836625017 Fixed an issue with custom memory when a map is created for a blockstree node, avoiding re-entrance on blockstree handling and tree corruption 2025-04-20 16:22:32 +02:00
ptitSeb
319f67d577 [DEBUG] Exposed a debug function to print an rbtree 2025-04-20 16:19:36 +02:00
Chi-Kuan Chiu
28d12f6785
Merge mmapmem into mapallmem (#2550)
This commit removes the separate `mmapmem` red-black tree and
merges it into `mapallmem`, reducing overall memory usage.

To distinguish memory regions that were previously tracked
separately in `mmapmem`,
we now use a new `mem_flag_t` bitmask enum in the data field
of `mapallmem`:

    MEM_ALLOCATED = 1  // allocated
    MEM_RESERVED  = 2  // reserved
    MEM_MMAP      = 3  // allocated via mmap()

Resolves #2546
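
As a rough illustration of the merge, the sketch below tags a single range record with one of the three values listed above, so a separate tree for mmap regions is no longer needed. Only the enum names and values come from the commit message; `toy_range`, `toy_mark`, `toy_is_mmap`, and `toy_is_tracked` are hypothetical, and treating a `data` value of 0 as "not tracked" is an assumption.

```c
#include <assert.h>
#include <stdint.h>

/* Values as listed in the commit message. */
typedef enum {
    MEM_ALLOCATED = 1, /* allocated */
    MEM_RESERVED  = 2, /* reserved */
    MEM_MMAP      = 3  /* allocated via mmap() */
} mem_flag_t;

/* Hypothetical stand-in for one range stored in the merged tree. */
typedef struct {
    uintptr_t start, end;
    uint32_t data;      /* holds a mem_flag_t value; 0 assumed to mean untracked */
} toy_range;

/* Record what kind of region this range is. */
static void toy_mark(toy_range *r, mem_flag_t flag)
{
    assert(r);
    assert(flag >= MEM_ALLOCATED && flag <= MEM_MMAP);
    r->data = (uint32_t)flag;
}

/* Query that previously would have needed a lookup in a second tree. */
static int toy_is_mmap(const toy_range *r)
{
    assert(r);
    return r->data == MEM_MMAP;
}

/* Any non-zero value means the range is known to the tree. */
static int toy_is_tracked(const toy_range *r)
{
    assert(r);
    return r->data != 0;
}
```

The sketch compares by equality rather than testing individual bits, since `MEM_MMAP` (3) shares bits with the other two values.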
2025-04-20 14:24:22 +02:00
ptitSeb
e7113b228f [BOX32][WRAPPER] Removed a debug leftover 2025-04-19 22:44:49 +02:00
ptitSeb
bc3cfe82cf [BOX32][WRAPPER] Added a workaround on XF86VidModeGetAllModeLines for 12 labors of Hercules 6 2025-04-19 21:04:02 +02:00
ptitSeb
6ccac8ab5a [BOX32][WRAPPER] Added 32bits wrapped ftime function 2025-04-19 20:33:07 +02:00
ptitSeb
c9d756eeea [RCFILE] Better profile for Cyberpunk 2077 2025-04-19 20:32:45 +02:00
ptitSeb
549b57ee30 [INTERP] Added F2 0F BA opcode (for 2547) 2025-04-19 15:44:37 +02:00
ptitSeb
660dbfc3ad Another fix for non-dynarec build 2025-04-18 18:33:00 +02:00
ptitSeb
d89ad5f1b9 This should fix non-dynarec build 2025-04-18 18:17:40 +02:00
ptitSeb
00a7d321a7 [DYNAREC] Better dynablock memory handling, and fixed a regression introduced when improving dynmem rbtree 2025-04-18 17:57:39 +02:00
ptitSeb
4c9765cd5e Also preserve flags stuff on signal handling when needed 2025-04-18 17:57:39 +02:00
ptitSeb
fe307dbc96 [DYNAREC][TRACE] Slightly better trace message on creating dynablock 2025-04-18 17:57:39 +02:00
phorcys
518c860e5d
[LA64_DYNAREC] Add/opt more SSE/MMX ops (#2543)
* Add SSE2 op MASKMOVDQU.
* Opt PSADBW.
* Add SSE3 HSUBPD op.
* Add mmx PALIGNR op.
* Fix PSRLDQ.
* Fix PSRAW Gx,Ex and PSRAW Gm,Em.
    mmx/sse get COUNT from Em/Ex as a 64bit unsigned...
    testsuite with shift 0x4,0x4,0x4,0x4, result COUNT as 0x04040404.
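
The PSRAW note above refers to the x86 rule that the shift count is read as one unsigned 64-bit value from the source operand, not per element, and that any count above 15 fills each word with its sign bit. A minimal C sketch of the MMX form follows; `toy_psraw_mmx` is a hypothetical reference helper, not code from the LA64 backend.

```c
#include <assert.h>
#include <stdint.h>

/* Reference model of MMX PSRAW: four signed 16-bit lanes, one 64-bit count. */
static void toy_psraw_mmx(int16_t dst[4], uint64_t count)
{
    assert(dst);
    /* The whole 64-bit operand is the count; values above 15 saturate to 15. */
    unsigned shift = (count > 15) ? 15u : (unsigned)count;

    for (int i = 0; i < 4; ++i) {
        int32_t v = dst[i];
        /* Portable arithmetic shift: shift the complement for negative values. */
        v = (v < 0) ? ~(~v >> shift) : (v >> shift);
        dst[i] = (int16_t)v;
    }
}
```

With a count operand whose bytes are all 0x04, the count is read as 0x04040404 rather than 4, so every lane collapses to 0 or -1 depending on its sign.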
2025-04-18 17:09:34 +02:00