Data Processing Units (DPUs) are PCIe network cards (SmartNICs) equipped with specialized hardware accelerators for data processing. DPUs offer the opportunity to process data near the hardware network stack (near-network). By enabling near-network computation, DPUs reduce CPU load and improve end-to-end performance, an approach made increasingly attractive by trends such as compute-storage disaggregation and real-time data ingestion. However, existing research on DPU-based processing often overlooks hardware acceleration or relies on static offloading to the ARM subsystem, leaving open questions about how best to split work with (or co-process alongside) the host CPU. In this paper, we analyze near-network hardware acceleration with co-processing on DPUs, revealing that DPU performance varies significantly depending on input data types as well as task- and query-imposed configurations. Through micro-benchmark experiments, we explore partial offloading and co-processing strategies, demonstrating the trade-off between higher throughput and reconfiguration overhead on DPUs. Our findings offer practical insights for data systems practitioners seeking to leverage near-network accelerators in data processing pipelines.