...
- Use real headings (e.g. using styles in Word, InDesign, etc.), real lists, add alt text to images, designate table headers in data tables, and other accessibility techniques.
- Convert scanned documents to real text with OCR in Acrobat Pro. (View > Tools > Text Recognition > In This File, then run “Find All Suspects.”)
Save as Tagged PDF Document
...
- Add a Title: File > Properties > Description > Title.
- Ensure Title is read: File > Properties > Initial View > Show > Document Title.
- Specify the language: File > Properties > Advanced > Language.
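The three settings above are stored inside the PDF as the /Title entry of the document information dictionary, the /DisplayDocTitle viewer preference, and the /Lang entry of the document catalog. As a rough sketch only, the hypothetical helper below (check_pdf_settings is not part of Acrobat or any library) scans the raw bytes of a file for those keys. It assumes the dictionaries are stored uncompressed; many PDFs compress them, and bookmark entries also use /Title, so treat the result as a hint and confirm in File > Properties.

```python
import re


def check_pdf_settings(pdf_bytes: bytes) -> dict:
    """Heuristically look for the document-level settings listed above.

    Assumption: the relevant dictionaries are stored uncompressed, so their
    keys are visible in the raw bytes. If they are compressed, every check
    reports False even when the setting is present.
    """
    return {
        # /Title followed by a string, e.g. /Title (Annual Report)
        "title": re.search(rb"/Title\s*[(<]", pdf_bytes) is not None,
        # /DisplayDocTitle true, set by Initial View > Show > Document Title
        "title_displayed": re.search(rb"/DisplayDocTitle\s+true", pdf_bytes) is not None,
        # /Lang followed by a string, e.g. /Lang (en-US)
        "language": re.search(rb"/Lang\s*[(<]", pdf_bytes) is not None,
    }


# Hand-written fragment standing in for a real file's contents.
sample = (
    b"<< /Title (Annual Report) >>\n"
    b"<< /Type /Catalog /Lang (en-US)\n"
    b"   /ViewerPreferences << /DisplayDocTitle true >> >>\n"
)
print(check_pdf_settings(sample))
```

For a real document, read the file with open(path, "rb").read() and pass the bytes in.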
Fine Tune Tags
Caution: There is no undo feature when modifying tags!
- Activate the “Tags” Pane: View > Show/Hide > Navigation Panes > Tags.
- If the Tags pane says “No tags available,” add tags to the document. (Right-click “No tags available” > Add Tags to Document.)
- Turn on “Highlight Content” to see where the tagged items are in the document. (Right-click any tag > Highlight Content.)
- Correct any incorrect tags: Click on the tag to select it, click it again to make it editable, then change the value (e.g. change <P> to <H1> or <TD> to <TH>).
- Headings should be marked with <H1>, <H2>, <H3>, etc.
- Add tags to any untagged elements (unless they are decorative or unimportant):
  - Activate the “Accessibility” sidebar (Tools Tab > Accessibility).
  - Open the “Touch Up Reading Order” dialog box (in the “Accessibility” sidebar).
  - Select the item (e.g. an image or a paragraph) by dragging a box around it.
  - Assign a tag to the item by selecting the appropriate type of tag in the “Touch Up Reading Order” dialog (Text, Figure, Heading 1, Table, etc.).
- Delete empty or unnecessary tags: Right-click on the tag > Delete Tag.
Caution: There is no undo (but you can add tags back in using the instructions above).
- Images should be marked as <Figure> and should have alt text. (Right-click the tag > Properties > Tag > Alternate Text.)
- Data tables should have header cells <TH>, similar to HTML tables.
- Multi-dimensional tables require giving IDs to each header cell and associating each data cell with the corresponding header cells:
  - Activate the “Accessibility” sidebar (Tools Tab > Accessibility).
  - Open the “Touch Up Reading Order” dialog box (in the “Accessibility” sidebar).
  - Click on the number in the upper left corner of the data table in the document.
  - Select “Table Editor” in the “Touch Up Reading Order” dialog box.
  - Assign IDs to the header cells: For each table header cell, right click on the cell, select “Table cell properties” and type an ID (make it easy to remember).
  - Associate the header cells with the data cells: For each data cell, right click on the cell, select “Table cell properties,” click on the “+” button, and select the corresponding header cell IDs for that data cell.
- Rearrange the tags if necessary: drag the tag to the desired location in the Tag pane.
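The header-ID association for multi-dimensional tables can be easier to follow as a small data model. The sketch below is illustrative only: HeaderCell, DataCell, and headers_for are hypothetical names, not anything Acrobat exposes. It shows what the "Table cell properties" steps set up: each header cell has an ID, and each data cell lists the IDs of the headers that describe it.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class HeaderCell:
    cell_id: str  # the ID typed into "Table cell properties"
    text: str     # the visible header text


@dataclass
class DataCell:
    text: str
    # IDs of the associated header cells, added with the "+" button
    header_ids: List[str] = field(default_factory=list)


def headers_for(cell: DataCell, headers: List[HeaderCell]) -> List[str]:
    """Return the header texts associated with a data cell, in listed order."""
    by_id = {h.cell_id: h for h in headers}
    missing = [i for i in cell.header_ids if i not in by_id]
    if missing:
        # An ID that matches no header cell means the association is broken.
        raise KeyError(f"No header cell with ID(s): {missing}")
    return [by_id[i].text for i in cell.header_ids]


# Made-up example: one column header and one row header describe a single cell.
headers = [HeaderCell("q1", "Quarter 1"), HeaderCell("sales", "Sales")]
cell = DataCell("120", header_ids=["q1", "sales"])
print(headers_for(cell, headers))
```

The KeyError branch reflects the one thing worth double-checking by hand: every ID a data cell points to must match a header cell's ID exactly.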
...
PDF Tag | Description | Equivalent HTML Tag |
Art | Article element. A self-contained body of text considered to be a single narrative. | <article> |
Annot | Annotation. A comment, note, or other annotation added by a document editor after the document was published. | none |
BibEntry | Bibliography Entry element. A description of where some cited information may be found. | none |
BlockQuote | Block Quote element. One or more paragraphs of text attributed to someone other than the author of the immediate surrounding text. | <blockquote> |
Caption | Caption element. A brief portion of text that describes a table or a figure. | <caption> (for tables), <figcaption> (for images and other objects in <figure>) |
Code | Code element. Computer program text embedded within a document. | <code> |
Div | Division element. A generic block-level element or group of block-level elements. | <div> |
Document | Document element. The root element of a document’s tag tree. | <html> |
Figure | Figure element. A graphic or graphic representation associated with text. | <img> (for regular images), <figure> (for images and other objects, usually paired with <figcaption>) |
Form | Form element. A PDF form annotation that can be or has been filled out. | <form> |
Formula | Formula element. A mathematical formula. | <math> (in MathML) |
H | Heading. A generic heading that inherits its level based on the nested structure of the document. | none |
H1 | Heading Level 1 | <h1> |
H2 | Heading Level 2 | <h2> |
H3 | Heading Level 3 | <h3> |
H4 | Heading Level 4 | <h4> |
H5 | Heading Level 5 | <h5> |
H6 | Heading Level 6 | <h6> |
Index | Index element. A sequence of entries that contain identifying text and reference elements that point out the occurrence of the text in the main body of the document. | none |
Lbl | Label element. A bullet, name, or number that identifies and distinguishes an element from others in the same list. | none |
Link | Link element. A hyperlink that is embedded within a document. The target can be in the same document, in another PDF document, or on a website. | <a> |
L | List element. Any sequence of items of similar meaning or other relevance; immediate child elements should be list item elements. | <ul> for unordered lists or <ol> for ordered lists or <dl> for definition lists |
LI | List Item element. Any one member of a list; may have a label element (optional) and a list body element (required) as a child. | <li> |
LBody | List Item Body element. The descriptive content of a list item. | none |
Note | Note element. Explanatory text or documentation, such as a footnote or endnote, that is referred to in the main body of text. | none |
P | Paragraph | <p> |
Part | Part element. A large division of a document; may group smaller units of content together, such as division elements, article elements, or section elements. | <section> |
Quote | Quote element. An inline portion of text that is attributed to someone other than the author of the text surrounding it; different from a block quote, which is a whole paragraph or multiple paragraphs, as opposed to inline text. | <q> |
Reference | Reference element. A citation to text or data that is found elsewhere in the document. | href attribute of the <a> element |
Sect | Section element. A general container element type, comparable to Division (div class="sect") in HTML, which is usually a component of a part element or an article element. | <section> |
Span | Span element. Any inline segment of text; commonly used to delimit text that is associated with a set of styling properties. | <span> |
Table | Table element. A two-dimensional arrangement of data or text cells that contains table row elements as child elements and may have a caption element as its first or last child element. | <table> |
TD | Table Data Cell element. A table cell that contains nonheader data. | <td> |
TH | Table Header Cell element. A table cell that contains header text or data describing one or more rows or columns of a table. | <th> |
TOC | Table of Contents element. An element that contains a structured list of items and labels identifying those items; has its own discrete hierarchy. | none |
TOCI | Table of Contents Item element. An item contained in a list associated with a table of contents element. | none |
TR | Table Row element. One row of headings or data in a table; may contain table header cell elements and table data cell elements. | <tr> |
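For anyone scripting against this table (for example, when converting tagged content to HTML), the mapping can be held in a simple lookup. The sketch below copies only a subset of the rows above; PDF_TO_HTML and html_equivalents are hypothetical names chosen for illustration, and an empty tuple stands for "none" in the table.

```python
# Subset of the table above: PDF tag -> equivalent HTML element names.
PDF_TO_HTML = {
    "Art": ("article",),
    "BlockQuote": ("blockquote",),
    "Caption": ("caption", "figcaption"),
    "Figure": ("img", "figure"),
    "L": ("ul", "ol", "dl"),
    "LI": ("li",),
    "Lbl": (),    # "none" in the table
    "LBody": (),  # "none" in the table
    "Link": ("a",),
    "P": ("p",),
    "Table": ("table",),
    "TR": ("tr",),
    "TH": ("th",),
    "TD": ("td",),
}
# H1 through H6 map directly to <h1> through <h6>.
PDF_TO_HTML.update({f"H{n}": (f"h{n}",) for n in range(1, 7)})


def html_equivalents(pdf_tag: str) -> tuple:
    """Return the HTML element names for a PDF tag in this subset.

    Raises KeyError for tags that were not copied into PDF_TO_HTML.
    """
    return PDF_TO_HTML[pdf_tag]


print(html_equivalents("Caption"))
```

PDF tag names are case-sensitive here, so "TH" is found but "th" is not.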
Created 05/01/18 by Edward Pritchard (edward.pritchard@domail.maricopa.edu) - Information sourced from Deque Systems, Inc and Deque University