[SYCL] Data Parallel C++'s Table of Contents

jhson989 2022. 2. 16. 00:42

[Data Parallel C++: Mastering DPC++ for Programming of Heterogeneous Systems using C++ and SYCL] by James Reinders, Ben Ashbaugh, James Brodman, Michael Kinsner, John Pennycook, Xinmin Tian (Apress, 2020).

Chapter 1 : Introduction
- Advice for building parallel programs
Chapter 2 : Where Code Executes
- SYCL: a parallel programming framework for heterogeneous processors (CPU, GPU, and FPGA)
Chapter 3 : Data Management
- Buffer
- Unified shared memory
Chapter 4 : Expressing Parallelism
Chapter 5 : Error Handling
Chapter 6 : Unified Shared Memory
Chapter 7 : Buffers
Chapter 8 : Scheduling Kernels and Data Movement
Chapter 9 : Communication and Synchronization
- Communication within a work-group : Barriers, local memory
- Sub-groups : Warp (NVIDIA), Wavefront (AMD)
- Sub-group collective functions : Broadcast, Votes, Shuffles, Loads and Stores
Chapter 10 : Defining Kernels
- Lambda expression
- Function object (functor)
- Interoperability with Other APIs : OpenCL
Chapter 11 : Vectors
Chapter 12 : Device Information
- Device query API
Chapter 13 : Practical Tips
Chapter 14 : Common Parallel Patterns
- Map : No data dependences and high scalability
- Stencil : Data dependences and high data reuse
- Reduction : Data dependences
- Scan / Pack / Unpack : Limited scalability
- DPC++ built-in Libraries
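
The patterns listed for Chapter 14 can be illustrated with a minimal host-side sketch. This is plain sequential C++17 using the standard library, not SYCL kernel code and not the book's own examples; the function names (`map_square`, `stencil_sum3`, `reduce_sum`, `scan_inclusive`) are hypothetical and chosen only for illustration. It shows what each pattern computes, so the data dependences mentioned above are visible.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Map: every output element depends only on the matching input element,
// so all elements could be computed independently.
std::vector<int> map_square(const std::vector<int>& in) {
    std::vector<int> out(in.size());
    std::transform(in.begin(), in.end(), out.begin(),
                   [](int x) { return x * x; });
    return out;
}

// Stencil: every output element reads a fixed neighborhood of the input.
// Here: a 3-point sum; the two boundary elements are copied unchanged.
// Neighboring outputs share inputs, which is where data reuse comes from.
std::vector<int> stencil_sum3(const std::vector<int>& in) {
    std::vector<int> out(in);
    for (std::size_t i = 1; i + 1 < in.size(); ++i) {
        out[i] = in[i - 1] + in[i] + in[i + 1];
    }
    return out;
}

// Reduction: all inputs are combined into a single value.
int reduce_sum(const std::vector<int>& in) {
    return std::accumulate(in.begin(), in.end(), 0);
}

// Scan (inclusive prefix sum): out[i] = in[0] + ... + in[i].
// Each output depends on all earlier inputs.
std::vector<int> scan_inclusive(const std::vector<int>& in) {
    std::vector<int> out(in.size());
    std::partial_sum(in.begin(), in.end(), out.begin());
    // The last element of an inclusive scan equals the reduction.
    assert(out.empty() || out.back() == reduce_sum(in));
    return out;
}
```

For example, `stencil_sum3({1, 2, 3, 4})` yields `{1, 6, 9, 4}` and `scan_inclusive({1, 2, 3, 4})` yields `{1, 3, 6, 10}`.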
Chapter 15 : Programming for GPUs
- GPU hardware: Simple cores, Easy switching
- GPU kernel execution model: SIMD, SPMD, Distributed memory
- Optimization techniques for memory-bound problems: Coalesced global memory access, Local memory bank conflicts, etc.
Chapter 16 : Programming for CPUs
- CPU hardware: cc-NUMA system, SIMD execution
- CPU parallelism levels: Instruction-level (Out-of-order, SIMD), Thread-level (multi-processing)
- CPU kernel optimization techniques: Thread affinity, First touch, Vectorization
Chapter 17 : Programming for FPGAs
- Basic concept of FPGA parallelization: Pipelining (task parallelism) > Data parallelism
Chapter 18 : Libraries
Chapter 19 : Memory Model and Atomics
- Race conditions & memory consistency model
- Barriers & memory fences
- Atomic operations
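
As a rough illustration of the Chapter 19 topics, the sketch below uses standard C++ `std::atomic` and `std::thread` on the host rather than the SYCL atomic classes, so it is an analogy and not code from the book. The function name `count_with_atomics` is hypothetical. Several threads increment one shared counter; with a plain `long` this would be a race condition, while an atomic read-modify-write makes every increment visible in the final total.

```cpp
#include <atomic>
#include <cstddef>
#include <thread>
#include <vector>

// Each of num_threads threads adds 1 to a shared counter
// increments_per_thread times. fetch_add is an atomic read-modify-write,
// so no increment is lost even though the threads run concurrently.
long count_with_atomics(int num_threads, int increments_per_thread) {
    std::atomic<long> counter{0};
    std::vector<std::thread> workers;
    workers.reserve(static_cast<std::size_t>(num_threads));

    for (int t = 0; t < num_threads; ++t) {
        workers.emplace_back([&counter, increments_per_thread] {
            for (int i = 0; i < increments_per_thread; ++i) {
                // Relaxed ordering is enough here: only the counter's own
                // value matters, not the ordering of other memory accesses.
                counter.fetch_add(1, std::memory_order_relaxed);
            }
        });
    }

    // join() synchronizes with each thread's completion, so the load below
    // observes all increments.
    for (auto& worker : workers) {
        worker.join();
    }
    return counter.load();
}
```

Calling `count_with_atomics(4, 10000)` returns 40000. Relaxed ordering was chosen because the counter is the only shared data; code that publishes other data through a flag would need stronger ordering or a fence.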
