Table 1. Findings of the literature review.
| Paper | Proposed method | Results |
|---|---|---|
| Paper (9) | A convolutional neural network (CNN) based pipeline that extracts high-level visual features to improve the effectiveness of text detection and recognition | Recall: 0.71; Precision: 0.74 (these metrics are sketched after the table) |
| Paper (10) | A novel arbitrary-shaped text detection method, ReLaText, which formulates text detection as a visual relationship detection problem | The authors conducted extensive tests on five scene text detection benchmark datasets; to keep their results comparable with other techniques, they followed the established evaluation protocols of each benchmark |
| Paper (11) | An improved YOLO v3 algorithm for scene text detection | YOLOv3-Darknet19 and YOLOv3-Darknet53 are compared; the Darknet19 network's loss decreases quickly, fluctuates less, and converges to a lower final value |
| Paper (12) | TextField, a novel text detector that detects irregular scene texts | The experimental findings show that the proposed method outperforms state-of-the-art techniques on two curved-text datasets by 28% and 8%, respectively |
| Paper (13) | A scene text detection technique based on an adaptive text region representation | Tests on five benchmarks (TotalText, ICDAR2013, ICDAR2015, CTW1500, and MSRA-TD500) show state-of-the-art scene text detection performance |
| Paper (14) | A method that directly learns a cross-modal similarity between a query text and each text instance in natural images; specifically, the authors built a single end-to-end trainable network by jointly optimizing cross-modal similarity learning and scene text detection | The method is evaluated on three standard datasets to demonstrate its effectiveness |
| Paper (15) | The Pixel Aggregation Network (PAN), an accurate and efficient arbitrary-shaped text detector built from a computationally low-cost segmentation head and learnable post-processing | F-measure of 79.9% on CTW1500 at 84.2 FPS |
| Paper (16) | A summary of the metrics and tools used for OCR evaluation, together with two sophisticated applications of OCR output | An OCR evaluation experiment conducted on two separate datasets using a variety of evaluation tools and criteria |
| Paper (17) | A system to test the security of Hindi CAPTCHAs | Two-color schemes were broken at a 90% rate, multi-color schemes at 93% |
| Paper (18) | A technique to evaluate the security of Devanagari-script CAPTCHAs; five distinct monochrome and five distinct greyscale CAPTCHAs were selected for security testing | Segmentation rates of 88.13% to 97.6%; breaking rates of 73% to 93% for greyscale schemes and 66% to 85% for monochrome schemes |
| Paper (19) | A new module named Multi-Domain Character Distance Perception (MDCDP) that builds a position embedding that is both semantically and visually informed | The resulting CDistNet was compared with nineteen other techniques published between 2017 and 2022 |
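Several of the results above are reported as precision, recall, and F-measure over matched text regions. The following is a minimal illustrative sketch of how these metrics follow from the numbers of true positives, false positives, and false negatives; the helper function and the counts are hypothetical and are not taken from any of the cited papers.

```python
def detection_metrics(true_positives: int, false_positives: int, false_negatives: int):
    """Compute precision, recall, and F-measure for a text detector.

    A detection counts as a true positive when it matches a ground-truth
    region under the benchmark's matching rule (e.g. IoU >= 0.5).
    """
    precision = true_positives / (true_positives + false_positives)
    recall = true_positives / (true_positives + false_negatives)
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure


# Hypothetical counts, chosen only to illustrate the formulas.
p, r, f = detection_metrics(true_positives=71, false_positives=25, false_negatives=29)
print(f"Precision: {p:.2f}  Recall: {r:.2f}  F-measure: {f:.2f}")
```

With these hypothetical counts the sketch prints Precision: 0.74, Recall: 0.71, F-measure: 0.72, on the same scale as the values reported for Paper (9).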