Simultaneous Localization and Mapping (SLAM) is one of the fundamental problems in autonomous robotics, and SLAM algorithms have received considerable attention. Although numerous approaches have been presented for determining 6D poses in 3D environments, one of the main remaining challenges is combining real-time processing with high energy efficiency. In this paper, we tackle this problem by combining CPU and FPGA processing on a reconfigurable SoC. We present a complete solution for embedded LiDAR-based SLAM that uses a global Truncated Signed Distance Function (TSDF) as its map representation. A hardware-in-the-loop environment with ROS integration enables efficient evaluation of new algorithm variants and implementations. Based on benchmark data sets and real-world environments, we show that our approach compares well to established SLAM algorithms. Compared to a software implementation on a state-of-the-art PC, the proposed implementation achieves a 7-fold speed-up and requires 18 times less energy when using a Xilinx UltraScale+ XCZU15EG.