# HV-MAPS Tracking Telescope: Fast Data Transfer with Direct Memory Access

> Dorothea vom Bruch for the Mu3e Collaboration

> BTTB Workshop, Feb 3, 2016







## Introduction



See the talk by Lennart Huth: HV-MAPS tracking telescope



## Motivation





4 x 1.25 Gbit/s LVDS links

- Max. 30 Mhits / s / plane
- More data than we can write to disk
- Need GPU online reconstruction for selection

## Readout Scheme








## FPGA Board



Commercially available from Altera

- Data stream from four planes
- Timestamps of 32 ns
- Hit sorter:
  - Sort according to timestamps
  - Up to 15 hits / timestamp
- Receive trigger signals
- Send off data, either by:
  - Polling, initiated by the CPU, or
  - Direct Memory Access (DMA), initiated by the FPGA
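The hit sorter can be illustrated in software. The following is a minimal Python sketch, not the FPGA firmware: the `Hit` fields, the function name `sort_hits`, and the policy of dropping any hit beyond the 15th in a timestamp are assumptions made for illustration only.

```python
from collections import defaultdict
from typing import NamedTuple

MAX_HITS_PER_TIMESTAMP = 15  # limit quoted in the list above


class Hit(NamedTuple):
    timestamp: int  # hypothetical: value of the timestamp counter
    plane: int      # telescope plane index, 0..3
    col: int
    row: int


def sort_hits(hits):
    """Group hits by timestamp and return (blocks, n_dropped).

    blocks is a list of (timestamp, [hits]) in ascending timestamp order.
    Hits beyond MAX_HITS_PER_TIMESTAMP in one timestamp are counted as
    dropped (assumed overflow policy, not taken from the firmware).
    """
    by_timestamp = defaultdict(list)
    n_dropped = 0
    for hit in hits:
        bucket = by_timestamp[hit.timestamp]
        if len(bucket) < MAX_HITS_PER_TIMESTAMP:
            bucket.append(hit)
        else:
            n_dropped += 1
    blocks = [(ts, by_timestamp[ts]) for ts in sorted(by_timestamp)]
    return blocks, n_dropped
```

Calling `sort_hits` on an unordered list of hits from all four planes yields one block per timestamp, which is the form a downstream consumer such as the track reconstruction can process block by block.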

## Readout Software










## Polling Data





## Direct Memory Access (DMA)





## Polling Data vs. Direct Memory Access



| Polling Data                   | Direct Memory Access                                               |
|--------------------------------|--------------------------------------------------------------------|
| Read request from computer     |                                                                    |
| Write from FPGA to main memory | Write from FPGA to main memory                                     |
| Computer controls copying      | FPGA controls copying                                              |
|                                | CPU is informed about the process via interrupt messages           |
| Limited to $\sim$ 30 MB/s      | Theoretically limited by PCIe bandwidth (4 GB/s)<sup>1</sup>       |

<sup>1</sup>8 lanes of PCIe 2.0
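The 4 GB/s figure can be checked from the PCIe 2.0 lane parameters: each lane signals at 5 GT/s and uses 8b/10b encoding, so 8 of every 10 transferred bits carry payload. Per direction, and ignoring packet and protocol overhead:

$$
8\ \text{lanes} \times 5\ \frac{\text{Gbit}}{\text{s}} \times \frac{8}{10} \times \frac{1\ \text{B}}{8\ \text{bit}} = 4\ \text{GB/s}
$$

This is more than a factor of 100 above the $\sim$ 30 MB/s reached with polling.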


## Rate Tests



- On FPGA: 256 kB memory buffer for data to be sent off
- Send in chunks of 4 kB
- After every 256 kB, the PC is notified via an interrupt message where to read next
- For speed test: copy data to memory of graphics processing unit (GPU)
- Check for errors
- No telescope readout involved: this tests the data transfer only
- At 1.5 GB/s: Measured bit error rate  $\leq 4 \times 10^{-16}$
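The buffer scheme above can be mimicked in software to make the numbers concrete. This is a minimal Python simulation, not the actual driver or firmware: the names `DmaRingSimulator`, `fpga_write_chunk`, `read_block`, `make_chunk` and `count_errors` are hypothetical, the host-side target is assumed to be a ring of four 256 kB blocks, the interrupt is modelled as a simple queue of block offsets, and the test pattern (a 32-bit counter) is an assumption.

```python
CHUNK_SIZE = 4 * 1024                        # bytes per DMA write (from the text)
BLOCK_SIZE = 256 * 1024                      # bytes between interrupts (from the text)
CHUNKS_PER_BLOCK = BLOCK_SIZE // CHUNK_SIZE  # 64 chunks per interrupt
N_BLOCKS = 4                                 # assumption: host ring holds 4 blocks


class DmaRingSimulator:
    """Toy model: the FPGA writes 4 kB chunks, the PC reads 256 kB blocks."""

    def __init__(self):
        self.ring = bytearray(N_BLOCKS * BLOCK_SIZE)
        self.write_offset = 0
        self.chunks_in_block = 0
        self.pending = []  # stands in for interrupt messages: block offsets

    def fpga_write_chunk(self, chunk: bytes) -> None:
        if len(chunk) != CHUNK_SIZE:
            raise ValueError("chunk must be exactly CHUNK_SIZE bytes")
        end = self.write_offset + CHUNK_SIZE
        self.ring[self.write_offset:end] = chunk
        self.write_offset = end % len(self.ring)
        self.chunks_in_block += 1
        if self.chunks_in_block == CHUNKS_PER_BLOCK:
            # Block complete: tell the PC where to read next.
            self.chunks_in_block = 0
            start = (self.write_offset - BLOCK_SIZE) % len(self.ring)
            self.pending.append(start)

    def read_block(self):
        """Return the next completed 256 kB block, or None if none is ready."""
        if not self.pending:
            return None
        start = self.pending.pop(0)
        return bytes(self.ring[start:start + BLOCK_SIZE])


def make_chunk(index: int) -> bytes:
    """Test pattern: consecutive 32-bit little-endian counter words."""
    words = CHUNK_SIZE // 4
    first = index * words
    return b"".join((first + k).to_bytes(4, "little") for k in range(words))


def count_errors(block: bytes, first_word: int) -> int:
    """Count 32-bit words that deviate from the expected counter pattern."""
    errors = 0
    for k in range(len(block) // 4):
        word = int.from_bytes(block[4 * k:4 * k + 4], "little")
        if word != first_word + k:
            errors += 1
    return errors
```

Writing `CHUNKS_PER_BLOCK` chunks produced by `make_chunk` and then calling `read_block` returns one 256 kB block whose content can be verified with `count_errors`, mirroring the "check for errors" step of the rate test.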

## DMA with Telescope





- Test with data generator on FPGA
- Produces hits from four planes
- Tested at 300 MB/s

- Tested at DESY in October 2015
- Run for two days continuously
- Errors occurring with a probability of 10<sup>-4</sup>

## Online Track Reconstruction on GPUs





Image source: nvidia.com

- Sort hits according to planes
- Prepare memory for coalesced access on GPU
- Copy data to GPU memory
- Fit straight tracks
- Calculate efficiency
- Track rate of  $\sim$  700 kHz
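The memory preparation and fitting steps listed above can be sketched on the CPU. This is a minimal Python illustration, not the GPU code used for the telescope: the function names, the plane positions and the hit coordinates in any usage are hypothetical. It regroups hits into a structure-of-arrays layout (one contiguous array per plane and coordinate, so that neighbouring GPU threads would read neighbouring addresses) and fits a straight line $u = a + b\,z$ through one hit per plane with an unweighted least-squares fit.

```python
def to_structure_of_arrays(candidates):
    """Convert [[(x, y) per plane] per candidate] into per-plane arrays.

    Returns (xs, ys) with xs[plane][candidate] and ys[plane][candidate].
    """
    n_planes = len(candidates[0])
    xs = [[cand[p][0] for cand in candidates] for p in range(n_planes)]
    ys = [[cand[p][1] for cand in candidates] for p in range(n_planes)]
    return xs, ys


def fit_line(z, u):
    """Unweighted least-squares fit of u = a + b*z.

    Returns (a, b, ssr), where ssr is the sum of squared residuals.
    """
    n = len(z)
    mean_z = sum(z) / n
    mean_u = sum(u) / n
    s_zz = sum((zi - mean_z) ** 2 for zi in z)
    s_zu = sum((zi - mean_z) * (ui - mean_u) for zi, ui in zip(z, u))
    b = s_zu / s_zz
    a = mean_u - b * mean_z
    ssr = sum((ui - (a + b * zi)) ** 2 for zi, ui in zip(z, u))
    return a, b, ssr


def fit_all(plane_z, xs, ys):
    """Fit every candidate in x-z and y-z independently.

    On a GPU, each iteration of this loop would map to one thread.
    """
    n_planes = len(plane_z)
    results = []
    for i in range(len(xs[0])):
        x_i = [xs[p][i] for p in range(n_planes)]
        y_i = [ys[p][i] for p in range(n_planes)]
        results.append((fit_line(plane_z, x_i), fit_line(plane_z, y_i)))
    return results
```

Fitting the two projections separately keeps the per-track work to a handful of sums, which suits a one-thread-per-track mapping; a weighted fit or multiple-scattering treatment would need additional inputs not given here.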

## GPU Reconstruction: Results from DESY (10/2015)





## DMA and GPU Reconstruction - Current Status





## Combine DMA and GPU Reconstruction





Master's thesis in progress by Carsten Grzesik

## Summary





- Data transmission via DMA @ 1.5 GB/s
- Simulated telescope data @ 300 MB/s
- Tested telescope readout via DMA @ DESY
- Tested online track reconstruction on GPU @ DESY

## Outlook



- Transfer sorted hits directly from FPGA to GPU memory via main memory
- Online track reconstruction and efficiency calculation, independent of the file writing for offline analysis
- Goal:
  - Telescope with large chips
  - Capable of high rates ($\sim$ 20 Mhits / plane / s)
  - Fast online track reconstruction
  - Iterative alignment procedure
  - Online efficiency calculation

