perf(P3): enable GMSH OpenMP multithreading — 4.4x faster tessellation

GMSH defaults to single-threaded meshing. Setting General.NumThreads,
Mesh.MaxNumThreads1D and Mesh.MaxNumThreads2D to min(cpu_count, 16) enables
parallel Frontal-Delaunay surface meshing across all available cores.

Benchmark on 121-face assembly (32-core host, capped at 16 threads):
  Before: 12.7s total (9.8s in gmsh.model.mesh.generate)
  After:   2.8s total (1.1s in gmsh.model.mesh.generate)

Cap at 16 threads — benchmark showed 16 threads (1.1s) matches or beats auto
(1.6s), likely due to NUMA/coordination overhead above that threshold.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
This commit is contained in:
2026-03-11 19:34:40 +01:00
parent af320bcdc8
commit 9703aec497
@@ -307,8 +307,13 @@ def _tessellate_with_gmsh(shape, linear_deflection: float, angular_deflection: f
try:
BRepTools.Write_s(shape, brep_path)
import os as _os
n_threads = min(_os.cpu_count() or 1, 16) # cap at 16 — sweet spot on benchmark
gmsh.initialize()
gmsh.option.setNumber("General.Terminal", 0) # suppress console output
gmsh.option.setNumber("General.NumThreads", n_threads) # enable OpenMP parallelism
gmsh.option.setNumber("Mesh.MaxNumThreads1D", n_threads) # parallel edge meshing
gmsh.option.setNumber("Mesh.MaxNumThreads2D", n_threads) # parallel surface meshing
gmsh.option.setNumber("Mesh.Algorithm", 6) # Frontal-Delaunay 2D
gmsh.option.setNumber("Mesh.RecombineAll", 0) # keep triangles (no quads)
# CharacteristicLength controls edge length target in mm
@@ -424,6 +429,7 @@ def _tessellate_with_gmsh(shape, linear_deflection: float, angular_deflection: f
f"GMSH tessellation: {n_faces_gmsh} faces meshed, "
f"{n_faces_fallback} BRepMesh fallback, "
f"{n_triangles_total} triangles total"
f" (threads={n_threads})"
)