Abstract: Image-text multimodal disease diagnostic models have the potential to provide more precise diagnosis results compared to conventional image-only diagnostic models. Existing image-text ...
Industrial scene text detection is challenging due to cluttered backgrounds, rust occlusions, and arbitrary orientations. We introduce ISTD-DLA, a dynamic local-aware aggregation network for ...