Anchor

From UNL Wiki
(Difference between revisions)
(Created page with "In the scope of the project LACE, an anchor is an element that can be used to align texts. ===List of all HTML elements=== {| class="wikitable" |- ! Tag !! Description...")
 
(Observations)
 
(19 intermediate revisions by one user not shown)
Line 1: Line 1:
In the scope of the project [[LACE]], an anchor is an element that can be used to align texts.
+
In the scope of the project [[LACE]], an '''anchor''' is an element that may facilitate word alignment at the document level.
  
 +
== HTML elements ==
 +
The following HTML elements are used to define the set of anchors in the project [[LACEhpc]]. They are said to involve smaller texts and, therefore, are more likely to provide lexical mappings.
  
 
 
===List of all HTML elements===
 
 
{| class="wikitable"
 
{| class="wikitable"
 
|-
 
|-
 
! Tag !! Description
 
! Tag !! Description
|-
 
| <!--...--> || Defines a comment
 
|-
 
| <!DOCTYPE> || Defines the document type
 
 
|-
 
|-
 
| <a> || Defines a hyperlink
 
| <a> || Defines a hyperlink
|-
 
| <abbr> || Defines an abbreviation
 
|-
 
| <acronym> || Not supported in HTML5. Defines an acronym
 
|-
 
| <address> || Defines contact information for the author/owner of a document
 
|-
 
| <applet> || Not supported in HTML5. Deprecated in HTML 4.01. Defines an embedded applet
 
|-
 
| <area> || Defines an area inside an image-map
 
|-
 
| <article> || Defines an article
 
|-
 
| <aside> || Defines content aside from the page content
 
|-
 
| <audio> || Defines sound content
 
 
|-
 
|-
 
| <b> || Defines bold text
 
| <b> || Defines bold text
|-
 
| <base> || Specifies the base URL/target for all relative URLs in a document
 
|-
 
| <basefont> || Not supported in HTML5. Deprecated in HTML 4.01. Specifies a default colour, size, and font for all text in a document
 
|-
 
| <bdi> || Isolates a part of text that might be formatted in a different direction from other text outside it
 
|-
 
| <bdo> || Overrides the current text direction
 
|-
 
| <big> || Not supported in HTML5. Defines big text
 
|-
 
| <blockquote> || Defines a section that is quoted from another source
 
|-
 
| <body> || Defines the document's body
 
|-
 
| <br> || Defines a single line break
 
|-
 
| <canvas> || Used to draw graphics, on the fly, via scripting (usually JavaScript)
 
 
|-
 
|-
 
| <caption> || Defines a table caption
 
| <caption> || Defines a table caption
|-
 
| <center> || Not supported in HTML5. Deprecated in HTML 4.01. Defines centred text
 
|-
 
| <cite> || Defines the title of a work
 
|-
 
| <code> || Defines a piece of computer code
 
|-
 
| <col> || Specifies column properties for each column within a <colgroup> element 
 
|-
 
| <colgroup> || Specifies a group of one or more columns in a table for formatting
 
|-
 
| <command> || Defines a command button that a user can invoke
 
|-
 
| <datalist> || Specifies a list of pre-defined options for input controls
 
|-
 
| <dd> || Defines a description of an item in a definition list
 
|-
 
| <del> || Defines text that has been deleted from a document
 
|-
 
|  <details> || Defines additional details that the user can view or hide
 
|-
 
|  <dfn> || Defines a definition term
 
|-
 
| <dialog> || Defines a dialog box or window
 
|-
 
| <dir> || Not supported in HTML5. Deprecated in HTML 4.01. Defines a directory list
 
|-
 
| <div> || Defines a section in a document
 
|-
 
| <dl> || Defines a definition list
 
 
|-
 
|-
 
| <dt> || Defines a term (an item) in a definition list
 
| <dt> || Defines a term (an item) in a definition list
 
|-
 
|-
 
| <em> || Defines emphasized text 
 
| <em> || Defines emphasized text 
|-
 
| <embed> || Defines a container for an external (non-HTML) application
 
|-
 
| <fieldset> || Groups related elements in a form
 
 
|-
 
|-
 
| <figcaption> || Defines a caption for a <figure> element
 
| <figcaption> || Defines a caption for a <figure> element
|-
 
|  <figure> || Specifies self-contained content
 
|-
 
| <font> || Not supported in HTML5. Deprecated in HTML 4.01. Defines font, colour, and size for text
 
|-
 
| <footer> || Defines a footer for a document or section
 
|-
 
| <form> || Defines an HTML form for user input
 
|-
 
| <frame> || Not supported in HTML5. Defines a window (a frame) in a frameset
 
|-
 
| <frameset> || Not supported in HTML5. Defines a set of frames
 
 
|-
 
|-
 
| <h1> to <h6> || Defines HTML headings
 
| <h1> to <h6> || Defines HTML headings
|-
 
| <head> || Defines information about the document
 
|-
 
|  <header> || Defines a header for a document or section
 
|-
 
| <hgroup> || Groups heading (<h1> to <h6>) elements
 
|-
 
|  <hr> || Defines a thematic change in the content
 
|-
 
| <html> ||  Defines the root of an HTML document
 
 
|-
 
|-
 
| <i> || Defines a part of text in an alternate voice or mood
 
| <i> || Defines a part of text in an alternate voice or mood
|-
 
| <iframe> || Defines an inline frame
 
|-
 
| <img> || Defines an image
 
|-
 
| <input> || Defines an input control
 
|-
 
| <ins> || Defines a text that has been inserted into a document
 
|-
 
| <kbd> || Defines keyboard input
 
|-
 
| <keygen> || Defines a key-pair generator field (for forms)
 
|-
 
| <label> || Defines a label for an <input> element
 
 
|-
 
|-
 
| <legend> || Defines a caption for a <fieldset>, < figure>, or <details> element
 
| <legend> || Defines a caption for a <fieldset>, < figure>, or <details> element
 
|-
 
|-
 
| <li> || Defines a list item
 
| <li> || Defines a list item
|-
 
| <link> || Defines the relationship between a document and an external resource (most used to link to style sheets)
 
|-
 
| <map> || Defines a client-side image-map
 
 
|-
 
|-
 
| <mark> || Defines marked/highlighted text
 
| <mark> || Defines marked/highlighted text
|-
 
| <menu> || Defines a list/menu of commands
 
|-
 
| <meta> || Defines metadata about an HTML document
 
|-
 
| <meter> || Defines a scalar measurement within a known range (a gauge)
 
 
|-
 
|-
 
| <nav> || Defines navigation links
 
| <nav> || Defines navigation links
|-
 
| <noframes> || Not supported in HTML5. Defines an alternate content for users that do not support frames
 
|-
 
| <noscript> || Defines an alternate content for users that do not support client-side scripts
 
|-
 
| <object> || Defines an embedded object
 
|-
 
| <ol> || Defines an ordered list
 
|-
 
| <optgroup> || Defines a group of related options in a drop-down list
 
|-
 
| <option> || Defines an option in a drop-down list
 
|-
 
| <output> || Defines the result of a calculation
 
|-
 
| <p> || Defines a paragraph
 
|-
 
| <param> || Defines a parameter for an object
 
|-
 
| <pre> || Defines pre-formatted text
 
|-
 
| <progress> || Represents the progress of a task
 
 
|-
 
|-
 
| <q> || Defines a short quotation
 
| <q> || Defines a short quotation
|-
 
| <rp> || Defines what to show in browsers that do not support ruby annotations
 
|-
 
| <rt> || Defines an explanation/pronunciation of characters (for East Asian typography)
 
|-
 
| <ruby> || Defines a ruby annotation (for East Asian typography)
 
|-
 
| <s> || Defines text that is no longer correct
 
|-
 
| <samp> || Defines sample output from a computer program
 
|-
 
| <script> || Defines a client-side script
 
|-
 
| <section> || Defines a section in a document
 
|-
 
| <select> || Defines a drop-down list
 
 
|-
 
|-
 
| <small> || Defines smaller text
 
| <small> || Defines smaller text
|-
 
| <source> || Defines multiple media resources for media elements (<video> and <audio>)
 
|-
 
| <span> || Defines a section in a document
 
 
|-
 
|-
 
| <strike> || Not supported in HTML5. Deprecated in HTML 4.01. Defines strike-through text
 
| <strike> || Not supported in HTML5. Deprecated in HTML 4.01. Defines strike-through text
 
|-
 
|-
 
| <strong> || Defines important text
 
| <strong> || Defines important text
|-
 
| <style> || Defines style information for a document
 
 
|-
 
|-
 
| <sub> || Defines subscripted text
 
| <sub> || Defines subscripted text
|-
 
| <summary> || Defines a visible heading for a <details> element
 
 
|-
 
|-
 
| <sup> || Defines superscripted text
 
| <sup> || Defines superscripted text
|-
 
| <table> || Defines a table
 
|-
 
| <tbody> || Groups the body content in a table
 
 
|-
 
|-
 
| <td> || Defines a cell in a table
 
| <td> || Defines a cell in a table
|-
 
| <textarea> || Defines a multi-line input control (text area)
 
|-
 
| <tfoot> || Groups the footer content in a table
 
 
|-
 
|-
 
| <th> || Defines a header cell in a table
 
| <th> || Defines a header cell in a table
 +
|}
 +
 +
== Observations ==
 +
;Nesting
 +
:Anchors ignore nesting. For instance, the sequence <nowiki><b><i>ABC</i>DEF</b></nowiki> was considered to have two anchors: <nowiki><i>ABC</i></nowiki> and <nowiki><b>ABCDEF</b></nowiki>.
 +
;Attributes and events
 +
:HTML attributes and events are ignored. For instance: given <nowiki><a href="http://www.unlweb.net" target="_blank" title="UNL">UNLweb</a></nowiki>, the anchor is simply <nowiki><a>UNLweb</a></nowiki>.
 +
 +
== Example ==
 +
 +
{| class="wikitable"
 +
!width="50%"|Original
 +
!width="50%"|Anchors
 
|-
 
|-
| &lt;thead&gt; || Groups the header content in a table
+
|<nowiki>
|-
+
<HEAD>
| &lt;time&gt; || Defines a date/time
+
<TITLE>Basic HTML Sample Page</TITLE>
|-
+
</HEAD>
| &lt;title&gt; || Defines a title for the document
+
<BODY BGCOLOR="WHITE">
|-
+
<CENTER>
| &lt;tr&gt; || Defines a row in a table
+
<H1>A Simple Sample Web Page</H1>
|-
+
Extracted from <a href="http://sheldonbrown.com/web_sample1.html">Sheldon Brown</a>.
| &lt;track&gt; || Defines text tracks for media elements (&lt;video&gt; and &lt;audio&gt;)
+
<IMG SRC="scb_eagle_contact.jpeg">
|-
+
<H2>Demonstrating a few HTML features</H2>
| &lt;tt&gt; || Not supported in HTML5. Defines Teletype text
+
</CENTER>
|-
+
<b>HTML</b> is really a very simple language. It consists of ordinary text, with commands that are enclosed by "<" and ">" characters, or bewteen an "&" and a ";". <P>
| &lt;u&gt; || Defines text that should be stylistically different from normal text
+
You don't really need to know much HTML to create a page, because you can copy bits of HTML from other pages that do what you want, then change the text!<P>
|-
+
<H3>Line Breaks</H3>
| &lt;ul&gt; || Defines an unordered list
+
HTML doesn't normally use line breaks for ordinary text. A white space of any size is treated as a single space. This is because the author of the page has no way of knowing the size of the reader's screen, or what size type they will have their browser set for.<P>
|-
+
If you want to put a line break at a particular place, you can use the "<BR>" command, or, for a paragraph break, the "<P>" command, which will insert a blank line. The heading command ("<4></4>") puts a blank line above and below the heading text.
| &lt;var&gt; || Defines a variable
+
<H4>Starting and Stopping Commands</H4>
|-
+
Most HTML commands come in pairs: for example, "<H4>" marks the beginning of a size 4 heading, and "</H4>" marks the end of it. The closing command is always the same as the opening command, except for the addition of the "/".<P>
| &lt;video&gt; || Defines a video or movie
+
Modifiers are sometimes included along with the basic command, inside the opening command's < >. The modifier does not need to be repeated in the closing command.
|-
+
<H1>This is a size "1" heading</H1>
| &lt;wbr&gt; || Defines a possible line-break
+
<H2>This is a size "2" heading</H2>
 +
<H3>This is a size "3" heading</H3>
 +
<H4>This is a size "4" heading</H4>
 +
<H5>This is a size "5" heading</H5>
 +
<H6>This is a size "6" heading</H6>
 +
<center>
 +
</body>
 +
</nowiki>
 +
|
 +
<pre>
 +
<H1>A Simple Sample Web Page</H1>
 +
<a>Sheldon Brown</a>
 +
<H2>Demonstrating a few HTML features</H2>
 +
<b>HTML</b>
 +
<H3>Line Breaks</H3>
 +
<H1>This is a size "1" heading</H1>
 +
<H2>This is a size "2" heading</H2>
 +
<H3>This is a size "3" heading</H3>
 +
<H4>This is a size "4" heading</H4>
 +
<H5>This is a size "5" heading</H5>
 +
<H6>This is a size "6" heading</H6>
 +
</pre>
 
|}
 
|}

Latest revision as of 15:23, 22 July 2013

In the scope of the project LACE, an anchor is an element that may facilitate word alignment at the document level.

HTML elements

The following HTML elements define the set of anchors in the project LACEhpc. They usually enclose short stretches of text and are therefore more likely to provide lexical mappings.

Tag Description
<a> Defines a hyperlink
<b> Defines bold text
<caption> Defines a table caption
<dt> Defines a term (an item) in a definition list
<em> Defines emphasized text 
<figcaption> Defines a caption for a <figure> element
<h1> to <h6> Defines HTML headings
<i> Defines a part of text in an alternate voice or mood
<legend> Defines a caption for a <fieldset>, <figure>, or <details> element
<li> Defines a list item
<mark> Defines marked/highlighted text
<nav> Defines navigation links
<q> Defines a short quotation
<small> Defines smaller text
<strike> Not supported in HTML5. Deprecated in HTML 4.01. Defines strike-through text
<strong> Defines important text
<sub> Defines subscripted text
<sup> Defines superscripted text
<td> Defines a cell in a table
<th> Defines a header cell in a table

Observations

Nesting
Anchors ignore nesting: an outer anchor keeps the text of any element nested inside it but drops the nested tags. For instance, the sequence <b><i>ABC</i>DEF</b> is considered to contain two anchors: <i>ABC</i> and <b>ABCDEF</b>.
Attributes and events
HTML attributes and events are ignored. For instance, given <a href="http://www.unlweb.net" target="_blank" title="UNL">UNLweb</a>, the anchor is simply <a>UNLweb</a>.
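
The two observations above can be illustrated with a short Python sketch. It is not the LACE implementation: the names ANCHOR_TAGS, AnchorExtractor and extract_anchors are hypothetical, and the sketch assumes that an anchor is emitted when its closing tag is reached, that whitespace inside an anchor is collapsed, and that tag names are written in lowercase. Only the standard library module html.parser is used.

```python
from html.parser import HTMLParser

# Tags from the table above; "<h1> to <h6>" is expanded explicitly.
ANCHOR_TAGS = {
    "a", "b", "caption", "dt", "em", "figcaption",
    "h1", "h2", "h3", "h4", "h5", "h6",
    "i", "legend", "li", "mark", "nav", "q", "small",
    "strike", "strong", "sub", "sup", "td", "th",
}


class AnchorExtractor(HTMLParser):
    """Collects anchors as '<tag>text</tag>' strings (hypothetical helper)."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.open_anchors = []  # stack of [tag, text_parts] for open anchor tags
        self.anchors = []       # finished anchors, in the order they are closed

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag names; attributes and events are dropped.
        if tag in ANCHOR_TAGS:
            self.open_anchors.append([tag, []])

    def handle_data(self, data):
        # Text belongs to every open anchor, so nested markup is flattened.
        for _, parts in self.open_anchors:
            parts.append(data)

    def handle_endtag(self, tag):
        # Close the innermost open anchor with this tag name, if there is one.
        for index in range(len(self.open_anchors) - 1, -1, -1):
            if self.open_anchors[index][0] == tag:
                _, parts = self.open_anchors.pop(index)
                text = " ".join("".join(parts).split())
                if text:
                    self.anchors.append(f"<{tag}>{text}</{tag}>")
                break


def extract_anchors(html):
    """Return the list of anchors found in an HTML string."""
    parser = AnchorExtractor()
    parser.feed(html)
    parser.close()
    return parser.anchors


print(extract_anchors("<b><i>ABC</i>DEF</b>"))
# ['<i>ABC</i>', '<b>ABCDEF</b>']
print(extract_anchors('<a href="http://www.unlweb.net" target="_blank" title="UNL">UNLweb</a>'))
# ['<a>UNLweb</a>']
```

Elements whose closing tag is missing (for example an unclosed <li>) are not emitted by this sketch; handling them would require further assumptions about where such an anchor ends.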

Example

Original
<HEAD>
<TITLE>Basic HTML Sample Page</TITLE>
</HEAD>
<BODY BGCOLOR="WHITE">
<CENTER>
<H1>A Simple Sample Web Page</H1>
Extracted from <a href="http://sheldonbrown.com/web_sample1.html">Sheldon Brown</a>.
<IMG SRC="scb_eagle_contact.jpeg">
<H2>Demonstrating a few HTML features</H2>
</CENTER>
<b>HTML</b> is really a very simple language. It consists of ordinary text, with commands that are enclosed by "<" and ">" characters, or bewteen an "&" and a ";". <P>
You don't really need to know much HTML to create a page, because you can copy bits of HTML from other pages that do what you want, then change the text!<P>
<H3>Line Breaks</H3>
HTML doesn't normally use line breaks for ordinary text. A white space of any size is treated as a single space. This is because the author of the page has no way of knowing the size of the reader's screen, or what size type they will have their browser set for.<P>
If you want to put a line break at a particular place, you can use the "<BR>" command, or, for a paragraph break, the "<P>" command, which will insert a blank line. The heading command ("<4></4>") puts a blank line above and below the heading text.
<H4>Starting and Stopping Commands</H4>
Most HTML commands come in pairs: for example, "<H4>" marks the beginning of a size 4 heading, and "</H4>" marks the end of it. The closing command is always the same as the opening command, except for the addition of the "/".<P>
Modifiers are sometimes included along with the basic command, inside the opening command's < >. The modifier does not need to be repeated in the closing command.
<H1>This is a size "1" heading</H1>
<H2>This is a size "2" heading</H2>
<H3>This is a size "3" heading</H3>
<H4>This is a size "4" heading</H4>
<H5>This is a size "5" heading</H5>
<H6>This is a size "6" heading</H6>
<center>
</body>

Anchors
<H1>A Simple Sample Web Page</H1>
<a>Sheldon Brown</a>
<H2>Demonstrating a few HTML features</H2>
<b>HTML</b>
<H3>Line Breaks</H3>
<H1>This is a size "1" heading</H1>
<H2>This is a size "2" heading</H2>
<H3>This is a size "3" heading</H3>
<H4>This is a size "4" heading</H4>
<H5>This is a size "5" heading</H5>
<H6>This is a size "6" heading</H6>