Maths & Observations : einsteinathome.org Work units GPU+SiMD QE 2021
Speed testing for the SSE > AVX FstatMethod? (like the Linux RAID driver)
How can we make this coherent? FFT?
FFT Examples : https://is.gd/ProcessorLasso in the SiMD Folder...
Advanced FFT & 3D Audio functions for CPU & GPU https://gpuopen.com/true-audio-next/
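On the speed-testing question above: the Linux RAID code times its candidate routines at start-up and keeps the fastest one. A minimal Python sketch of that select-by-benchmark idea follows. The two kernels and the names `kernel_scalar`, `kernel_builtin` and `pick_fastest` are hypothetical stand-ins, not Einstein@Home code; real SSE and AVX variants of the Fstat step would take their place.

```python
import math
import time

def kernel_scalar(samples):
    # Hypothetical stand-in for a baseline (e.g. SSE) kernel: explicit loop.
    total = 0.0
    for x in samples:
        total += x * x
    return total

def kernel_builtin(samples):
    # Hypothetical stand-in for a wider (e.g. AVX) kernel: built-in reduction.
    return sum(x * x for x in samples)

CANDIDATES = {"scalar": kernel_scalar, "builtin": kernel_builtin}

def pick_fastest(candidates, samples, repeats=5):
    """Check that all candidates agree, time each one, return (winner, timings)."""
    reference = None
    timings = {}
    for name, fn in candidates.items():
        result = fn(samples)
        if reference is None:
            reference = result
        elif not math.isclose(result, reference, rel_tol=1e-9):
            raise ValueError(f"{name} disagrees with the reference result")
        best = float("inf")
        for _ in range(repeats):
            start = time.perf_counter()
            fn(samples)
            best = min(best, time.perf_counter() - start)
        timings[name] = best  # best-of-N reduces scheduling noise
    winner = min(timings, key=timings.get)
    return winner, timings

samples = [i * 0.5 for i in range(20000)]
winner, timings = pick_fastest(CANDIDATES, samples)
print("fastest:", winner, timings)
```

The winner depends on the machine, which is the point of measuring at start-up rather than hard-coding a choice.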
LATeah, 1 row: suggest threading to take advantage of SIMD (FP16, x2FP16: precision)
Work unit size: 160 FPU threads, 4MB per group; align the work unit before compute.
Dataset size 400MB..? Optimise for 512MB, 1GB or 2GB chunks
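On the FP16 precision point: a small Python sketch of how much accuracy a value loses when rounded to half precision. It uses the standard `struct` format code `e` (IEEE 754 binary16). Reading "x2FP16" as two FP16 values packed into one 32-bit lane is an assumption, and `to_fp16` and `relative_error` are hypothetical helper names.

```python
import struct

def to_fp16(x):
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]

def relative_error(x):
    """Relative rounding error introduced by storing x as FP16."""
    return abs(to_fp16(x) - x) / abs(x)

# Assumed meaning of "x2FP16": two half-precision values in 4 bytes.
packed_pair = struct.pack("<2e", 0.1, 4097.0)

for value in (0.1, 1.0, 4097.0):
    print(value, "->", to_fp16(value), "rel. error", relative_error(value))
```

For normal-range values the relative error stays at or below 2^-11 (about 0.05%), which is the figure to weigh against the throughput gained from packing two values per lane.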
<core_client_version>7.16.11</core_client_version>
12:50:57 (2636): [normal]: This Einstein@home App was built at: May 8 2019 13:29:27
OpenCL device has FP64 support
12:50:57 (2636): [debug]: Flags: X64 SSE SSE2 GNUC X86 GNUX86
12:50:57 (2636): [debug]: Set up communication with graphics process.
Win:
Peak working set size (MB):588.14
Peak swap size (MB):1023.74
H1 Spotlight, 8 rows: work unit size 160 FPU threads, 4MB per group; align the work unit before compute.
Dataset size 400MB..? Optimise for 512MB, 1GB or 2GB Chunks
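The sizing note above can be sketched as follows: round the dataset up to a whole number of 4MB groups, then pick the smallest candidate chunk that holds it. This is a rough Python sketch, treating MB as MiB (an assumption); `align_up` and `plan_chunk` are hypothetical names, and 400MB is the estimate from the note, not a measured value.

```python
MIB = 1024 * 1024
GROUP_BYTES = 4 * MIB  # "4MB per group", read here as 4 MiB
CHUNK_SIZES = (512 * MIB, 1024 * MIB, 2048 * MIB)  # 512MB, 1GB, 2GB candidates

def align_up(n_bytes, alignment):
    """Round n_bytes up to the next multiple of alignment."""
    return -(-n_bytes // alignment) * alignment

def plan_chunk(dataset_bytes):
    """Pick the smallest candidate chunk that holds the group-aligned dataset."""
    aligned = align_up(dataset_bytes, GROUP_BYTES)
    for chunk in CHUNK_SIZES:
        if aligned <= chunk:
            return {
                "chunk_mib": chunk // MIB,
                "groups_used": aligned // GROUP_BYTES,
                "groups_total": chunk // GROUP_BYTES,
                "padding_mib": (chunk - aligned) // MIB,
            }
    raise ValueError("dataset larger than the biggest candidate chunk")

plan = plan_chunk(400 * MIB)
print(plan)
```

With the 400MB estimate this selects the 512MB chunk: 100 of 128 groups are used and 112MB is padding.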
2021-04-12 21:55:14.2649 (16972) [debug]: Flags: LAL_DEBUG, OPTIMIZE, HS_OPTIMIZATION, GC_SSE2_OPT, X64, SSE, SSE2, GNUC X86 GNUX86
2021-04-12 22:00:07.1870 (16972) [normal]: Search FstatMethod used: 'ResampOpenCL'
2021-04-12 22:00:07.1890 (16972) [normal]: Recalc FstatMethod used: 'DemodSSE'
2021-04-12 22:00:07.1960 (16972) [normal]: OpenCL version is used for the semi-coherent step!
Fail:
Peak working set size (MB):492.02
Peak swap size (MB):2898.58 (RAM overload possible)
Win:
Peak working set size (MB):400.71
Peak swap size (MB):1827.34
Peak disk usage (MB):4.46
Some visual samples of the test dataset https://is.gd/EinsteinE_MC_2