ICDAR-2015 ANDAR Text Lines

Competition on Text Line Detection in Historical Documents

SUMMARY OF COMPETITION


Introduction

Document layout analysis is an active and important area of research, yet it remains a largely unsolved problem. Previous segmentation and layout-related competitions have demonstrated successful methods for particular classes of problems. However, these demonstrations fall short with regard to generality, robustness, or applicability to document types not present in the evaluation data set. We wish to take a slightly different tack from previous competitions on segmentation and layout analysis for handwritten documents and instead address a fairly simple layout analysis scenario in which the emphasis is on achieving some level of generality or context-independence.


The aim of this competition is to evaluate the performance of algorithms for detecting lines of handwritten text in paragraph form drawn from historical documents. In particular, we wish to investigate and compare general methods that can reliably and robustly identify the origin point for text lines in the presence of various noise conditions, interfering annotations, and the artifacts common to historical documents. Specifically, we consider only the task of finding the baseline of the first character of the left-most word of each line of text. The goal of this competition is to understand how this generality is achieved, in the hope that it might open new avenues for consideration in other (more challenging) areas of layout analysis.


Text-Line Detection

For this competition, we confine ourselves to documents consisting largely of handwritten paragraph text like the example shown in Figure 1.


Figure 1. An example of paragraph text.


In Figure 2, the individual Text-Lines from Figure 1 are highlighted in yellow.



Figure 2. An example of paragraph text with individual Text-Lines highlighted in yellow.

On the Protocol page we provide more details and definitions on Text-Line segments, but the general aim of this competition is to identify the yellow Text-Lines in Figure 2 by finding the location of the starting character of the first word of each Text-Line that is part of a paragraph. This point is called the “origin point” and is abbreviated in this documentation as OP. The objective for a participant in this competition is to develop an algorithm for finding the origin point for Text-Lines that are in paragraph form.


Origin Points

We define the "origin point" for a text line as the (x, y) coordinate located at the intersection of the baseline of the first character of the first word in the line and the left-most edge of that character. We refer to origin points with the abbreviation OP. An example of an OP is shown in the following figure:


Figure 3. Example of an Origin Point, which is the red square at the intersection of the two red lines.

The OPs for four Text-Lines in a paragraph are shown with the red symbol in the following figure:


Figure 4. Example showing interference in the form of a graphic to the left of the text lines.
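
The origin point definition above can be illustrated with a minimal sketch. It assumes that the first character of a line has already been isolated as a small binary mask and that the row of its baseline is known; both are the hard parts of the task and are not addressed here. The function name `origin_point`, its parameters, and the sample mask are hypothetical and are not part of the competition materials.

```python
from typing import List, Tuple


def origin_point(
    char_mask: List[List[int]],
    baseline_row: int,
    offset: Tuple[int, int] = (0, 0),
) -> Tuple[int, int]:
    """Return the (x, y) origin point for one text line.

    char_mask    -- rows of 0/1 values covering the first character of the
                    first word in the line (1 = ink); hypothetical input
    baseline_row -- row index, within the mask, of the character's baseline
    offset       -- (x, y) position of the mask's top-left corner in the page
    """
    # Left-most edge of the character: the smallest column index holding ink.
    ink_columns = [x for row in char_mask for x, pixel in enumerate(row) if pixel]
    if not ink_columns:
        raise ValueError("character mask contains no ink")
    off_x, off_y = offset
    # The OP is where that left edge meets the baseline, in page coordinates.
    return (off_x + min(ink_columns), off_y + baseline_row)


# Made-up 3x4 mask whose top-left corner sits at page position (100, 40).
mask = [
    [0, 0, 1, 1],
    [0, 1, 1, 0],
    [0, 1, 0, 0],
]
op = origin_point(mask, baseline_row=2, offset=(100, 40))
print(op)
```

The sketch only restates the geometric definition: the x coordinate comes from the left-most edge of the character and the y coordinate from its baseline.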

Types of Text Segments

As stated above, the focus of this competition is on detecting and locating Text-Lines arranged in paragraph form. Informally, we refer to this type of text segment as Paragraph Text-Lines. Documents composed predominantly of Paragraph Text-Lines, such as those making up the database for this competition, also contain other types of text segments. In the following diagram we list five common types of text segments, and the subsequent diagrams show examples of these types in greater context. In the next section we describe a further type of text segment, which we call a Text-Block. It is of particular importance, since this competition requires the ability to distinguish it from the other types of text segments.

Text Segment Type    Image Snippet with Segment Highlighted
Text-Line            (image)
Title                (image)
Header               (image)
Table                (image)
Signature            (image)

Figure 5. Types of text segments.

The following diagrams show examples of these text types.


Figure 6. Examples of different types of text segments.


Figure 7. Examples of different types of text segments.


Figure 8. Examples of different types of text segments.


Figure 9. Examples of different types of text segments.


Figure 10. Examples of different types of text segments.

In the following section we describe the importance of distinguishing between Text-Lines in paragraph form and Text-Blocks.

Text-Blocks

The five types of text segments described in the previous section can appear very similar in different contexts, and in some cases can only be disambiguated after transcription. To narrow the scope of this competition so that it fits within the time frame set by the conference schedule, we will not distinguish between these types of text segments; we refer to them generically as Text-Lines. However, there is an additional type of text segment, which we call a Text-Block, that must be detected as different from Paragraph Text-Lines. In many document collections containing paragraph text, the left section of the document holds Text-Blocks with metadata for the Paragraph Text-Lines in the right section. An example of a piece of a document containing a Text-Block and Paragraph Text-Lines is shown in the following figure.


Figure 11. Example of a Text-Block (red) and several Text-Lines (yellow).

In the example in Figure 14, it appears that the text is all just paragraph text with uneven indentation. The blue box shows this by (incorrectly) segmenting part of the first line as a Text-Line. Although it may not be immediately obvious, the text in the (red) Text-Block must be treated separately from the text in the (yellow) Paragraph Text-Lines. To properly transcribe the handwritten ink in Paragraph Text-Lines, the correct location of each Text-Line must be determined. The correct origin points for this image are shown in the following figure:


Figure 12. The example from Figure 14 with the correct (red) origin points.
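
The distinction described in this section can be sketched as a small data structure. The segment type labels are taken from Figure 5 and from the Text-Block definition above; everything else (the names `category` and `split_origin_points`, the tuple layout, and the sample coordinates) is a hypothetical illustration. The sketch only separates the two groups and makes no assumption about how either group is scored.

```python
from typing import Dict, Iterable, List, Tuple

Point = Tuple[int, int]

# The five types from Figure 5 are not distinguished in this competition;
# all of them are treated generically as Text-Lines.
GENERIC_TEXT_LINE_TYPES = frozenset(
    {"Text-Line", "Title", "Header", "Table", "Signature"}
)
TEXT_BLOCK = "Text-Block"


def category(segment_type: str) -> str:
    """Map a segment type label to 'Text-Line' or 'Text-Block'."""
    if segment_type == TEXT_BLOCK:
        return TEXT_BLOCK
    if segment_type in GENERIC_TEXT_LINE_TYPES:
        return "Text-Line"
    raise ValueError(f"unknown segment type: {segment_type!r}")


def split_origin_points(
    segments: Iterable[Tuple[str, Point]],
) -> Dict[str, List[Point]]:
    """Group origin points so Text-Blocks stay separate from Text-Lines."""
    groups: Dict[str, List[Point]] = {"Text-Line": [], TEXT_BLOCK: []}
    for segment_type, op in segments:
        groups[category(segment_type)].append(op)
    return groups


# Made-up segments: one Text-Block on the left, text lines on the right.
segments = [
    ("Text-Block", (40, 120)),
    ("Text-Line", (310, 118)),
    ("Text-Line", (305, 170)),
    ("Header", (300, 60)),
]
groups = split_origin_points(segments)
print(groups)
```

Keeping Text-Blocks in their own group mirrors the requirement that they be treated separately from Paragraph Text-Lines, even when the two sit side by side on the page.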

On the Evaluation tab we provide additional information on how we suggest Paragraph Text-Lines might be distinguished from Text-Blocks and how the evaluation scoring mechanism will account for the different types of text segments.