SuikaWiki Markup Language (SWML)

SuikaWiki Project, 15 November 2016

Latest version
https://suikawiki.github.io/spec-swml/spec/
Version history
https://github.com/suikawiki/spec-swml/commits/gh-pages

Abstract

This document defines the SWML syntax and the SWML vocabulary.

Table of contents

  1 Introduction
    1.1 History
  2 Terminology
    2.1 Namespaces
    2.2 Definitions
  3 The SWML text serialization
    3.1 Document structure and header
    3.2 Body part blocks
    3.3 Inline contents
    3.4 Images
    3.5 Lexical structures
  4 Parsing documents in the SWML text serialization
    4.1 Tokenization of lines
      4.1.1 The "initial" mode
      4.1.2 The "body" mode
      4.1.3 The "preformatted" mode
      4.1.4 The "preformatted block" mode
      4.1.5 The "image data" mode
    4.2 Tokenization of a table row
    4.3 Tokenization of a text
    4.4 Parsing a magic line
    4.5 Tree construction
      4.5.1 The "in section" insertion mode
      4.5.2 The "in table row" insertion mode
      4.5.3 The "in paragraph" insertion mode
  5 Serializing SWML text serialization documents
  6 Element definitions for the SWML text serialization
  7 The text/x-suikawiki and text/x.suikawiki.image Internet Media Types
  8 The SWML XML serialization
    8.1 ... xml media type
  9 Semantics of Elements and Attributes
    9.1 Document structures
      9.1.1 The document element in the SuikaWiki/0.9 namespace
      9.1.2 The Name attribute in the SuikaWiki/0.9 namespace
      9.1.3 The Version attribute in the SuikaWiki/0.9 namespace
      9.1.4 The parameter element in the SuikaWiki/0.9 namespace
      9.1.5 The value element in the SuikaWiki/0.9 namespace
      9.1.6 The class attribute
      9.1.7 The id attribute
    9.2 Blocks
      9.2.1 The dr element in the SuikaWiki/0.9 namespace
      9.2.2 The comment-p element in the SuikaWiki/0.10 namespace
      9.2.3 The history element in the SuikaWiki/0.9 namespace
      9.2.4 The example element in the SuikaWiki/0.9 namespace
      9.2.5 The preamble element in the SuikaWiki/0.9 namespace
      9.2.6 The postamble element in the SuikaWiki/0.9 namespace
    9.3 Dialogues
      9.3.1 The dialogue element in the SuikaWiki/0.9 namespace
      9.3.2 The talk element in the SuikaWiki/0.9 namespace
      9.3.3 The speaker element in the SuikaWiki/0.9 namespace
    9.4 Hyperlinks
      9.4.1 The anchor element in the SuikaWiki/0.9 namespace
      9.4.2 The anchor-internal element in the SuikaWiki/0.9 namespace
      9.4.3 The anchor-end element in the SuikaWiki/0.9 namespace
      9.4.4 The anchor attribute in the SuikaWiki/0.9 namespace
      9.4.5 The anchor-external element in the SuikaWiki/0.9 namespace
      9.4.6 The resScheme attribute in the SuikaWiki/0.9 namespace
      9.4.7 The resParameter attribute in the SuikaWiki/0.9 namespace
    9.5 Embedded objects
      9.5.1 The aa element in the AA namespace
      9.5.2 The form element in the SuikaWiki/0.9 namespace
      9.5.3 The image element in the SuikaWiki/0.9 namespace
      9.5.4 The replace element in the SuikaWiki/0.9 namespace
      9.5.5 The text element in the SuikaWiki/0.9 namespace
    9.6 Citations
      9.6.1 The csection element in the SuikaWiki/0.10 namespace
      9.6.2 The src element in the SuikaWiki/0.10 namespace
      9.6.3 The refs element in the SuikaWiki/0.9 namespace
    9.7 Editorial annotations
      9.7.1 The insert element in the SuikaWiki/0.9 namespace
      9.7.2 The delete element in the SuikaWiki/0.9 namespace
      9.7.3 The ed element in the SuikaWiki/0.10 namespace
    9.8 Inline annotations
      9.8.1 The rubyb element in the SuikaWiki/0.9 namespace
      9.8.2 The weak element in the SuikaWiki/0.9 namespace
      9.8.3 The title element in the SuikaWiki/0.10 namespace
    9.9 Values
      9.9.1 The f element in the SuikaWiki/0.9 namespace
      9.9.2 The key element in the SuikaWiki/0.10 namespace
      9.9.3 The n element in the SuikaWiki/0.9 namespace
      9.9.4 The lat element in the SuikaWiki/0.9 namespace
      9.9.5 The lon element in the SuikaWiki/0.9 namespace
    9.10 Conformance keywords
      9.10.1 The MUST element in the SuikaWiki/0.9 namespace
      9.10.2 The SHOULD element in the SuikaWiki/0.9 namespace
      9.10.3 The MAY element in the SuikaWiki/0.9 namespace
    9.11 Qualified names
      9.11.1 The qn element in the SuikaWiki/0.10 namespace
      9.11.2 The qname element in the SuikaWiki/0.10 namespace
      9.11.3 The nsuri element in the SuikaWiki/0.10 namespace
    9.12 Fallback elements
      9.12.1 The attrvalue element in the SuikaWiki/0.10 namespace
      9.12.2 Uppercase elements in the SuikaWiki/0.10 namespace
  References
    Normative references
  Tests and implementation
  Author

1 Introduction

This section is non‐normative.

This specification defines the SuikaWiki Markup Language (SWML). SWML is the markup language developed and implemented for the SuikaWiki hypertext system.

1.1 History

SuikaWiki's Wiki syntax (now known as SWML text serialization) derived from WalWiki, which derived from YukiWiki, in 2002.

The first specification of the extended language, SuikaWiki/0.9 Document Markup Format: Syntax Specification, was published in and was frequently updated until .

Then several updates to the language, known as SuikaWiki/0.10, were (incompletely) defined by the following documents:

These "two" versions of the language were merged and rewritten as the SWML specification in 2008.

Previous versions of the SWML specification were published at https://suika.suikawiki.org/www/markup/suikawiki/spec/swml-work.

There is an obsolete list of possible new features that might have been introduced in a revision of this specification.

Revisions of the SWML specification are now available in the GitHub repository.

2 Terminology

All diagrams, examples, and notes in this specification are non-normative, as are all sections explicitly marked non-normative. Everything else in this specification is normative.

The key words “MUST”, “MUST NOT”, “SHOULD”, and “MAY” in the normative parts of this document are to be interpreted as described in RFC 2119 [RFC2119].

Requirements phrased in the imperative as part of algorithms (such as “strip any leading space characters” or “return false and abort these steps”) are to be interpreted with the meaning of the key word (e.g. “MUST”) used in introducing the algorithm.

Conformance requirements phrased as algorithms or specific steps MAY be implemented in any manner, so long as the end result is equivalent. (In particular, the algorithms defined in this specification are intended to be easy to follow, and not intended to be performant.)

When it is stated that some element or attribute is ignored, or treated as some other value, or handled as if it was something else, this refers only to the processing of the node after it is in the DOM. A user agent MUST NOT mutate the DOM in such situations.

2.1 Namespaces

For historical reasons, elements and attributes defined or used in this specification belong to various namespaces.

The AA namespace is http://pc5.2ch.net/test/read.cgi/hp/1096723178/aavocab#. The preferred prefix is aa.

The HTML namespace is http://www.w3.org/1999/xhtml. The preferred prefix is html. The following elements are defined in the HTML Standard:

The HTML3 namespace is urn:x-suika-fam-cx:markup:ietf:html:3:draft:00:. The following element is defined in the HTML3 namespace @@ ref:

The MathML namespace is http://www.w3.org/1998/Math/MathML. The preferred prefix is math. The following elements are defined in the MathML specification:

The SuikaWiki/0.9 namespace is urn:x-suika-fam-cx:markup:suikawiki:0:9:. The preferred prefix is sw.

The SuikaWiki/0.10 namespace is urn:x-suika-fam-cx:markup:suikawiki:0:10:. The preferred prefix is sw10.

The XHTML2 namespace is http://www.w3.org/2002/06/xhtml2/.

The XML namespace is http://www.w3.org/XML/1998/namespace. The preferred prefix is xml.

2.2 Definitions

Terms node tree, element, element's local name, element's namespace, element's namespace prefix, element's children, attribute, attribute's local name, attribute's namespace, and attribute's namespace prefix are defined by the DOM Standard.

Terms content attribute, IDL attribute, valid integer, rules for parsing integers, represents, inter-element whitespace, text, flow content, phrasing content, script-supporting elements, and nothing are defined by the HTML Standard.

White space characters are U+0009 CHARACTER TABULATION and U+0020 SPACE.

Digits are characters in the range U+0030 DIGIT ZERO .. U+0039 DIGIT NINE.

Uppercase letters are characters in the range U+0041 LATIN CAPITAL LETTER A .. U+005A LATIN CAPITAL LETTER Z.

Lowercase letters are characters in the range U+0061 LATIN SMALL LETTER A .. U+007A LATIN SMALL LETTER Z.

Language tag characters are digits, uppercase letters, lowercase letters, and U+002D HYPHEN-MINUS.

Scheme characters are digits, uppercase letters, lowercase letters, U+0025 PERCENT SIGN, U+002B PLUS SIGN, U+002D HYPHEN-MINUS, U+002E FULL STOP, and U+005F LOW LINE.

A language specification is a string consisting of a @ character followed by zero or more language tag characters. The body of a language specification is the substring of the language specification after the first @ character. It might be the empty string.

Semantically, the body of a language specification represents a language tag, similar to the xml:lang attribute @@ ref.
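The definition of a language specification can be illustrated with a short Python sketch. The names LANGUAGE_SPEC and language_spec_body are hypothetical and are not part of this specification; the sketch only mirrors the definitions above.

```python
import re

# Language tag characters: digits, uppercase letters, lowercase letters,
# and U+002D HYPHEN-MINUS.
LANGUAGE_TAG_CHAR = r"[0-9A-Za-z-]"

# A "@" character followed by zero or more language tag characters.
LANGUAGE_SPEC = re.compile(r"@(" + LANGUAGE_TAG_CHAR + r"*)")

def language_spec_body(s):
    """Return the body of a language specification, or None if s is not one."""
    m = LANGUAGE_SPEC.fullmatch(s)
    return m.group(1) if m else None
```

Under these assumptions, language_spec_body("@ja") yields "ja", and a lone "@" yields the empty string.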

The text content of an element is the value that would be returned by the getter of the textContent IDL attribute of the element.

3 The SWML text serialization

This section is non‐normative.

Obviously, this section is incomplete; some prose definitions are not yet available, and some cross-references do not work yet. It should be specified why this section is non-normative. The ABNF definitions and charset considerations need to be addressed.

Both prose and ABNF descriptions are non-normative. The conformance of a SWML text serialization document is defined in terms of the parser and its output.

Conformance checking steps

3.1 Document structure and header

A document in the SWML text serialization consists of three parts: header part, body part, and optional image.

Several constructs in a document refer to pages. A page is a unit of data in a hypertext database. The name of a page is sometimes referred to as a WikiName. A page sometimes represents or is associated with an image. How to implement these concepts, including how to resolve WikiNames, is not defined in this specification.

document
= header-part body-part [obs-image]

The header part has to be empty. In previous versions of SWML, a magic line could be contained, and in fact was required in some versions, in the header part of a document.

A magic line has to contain the string #?, followed by the format name, followed by a / character, followed by the format version. Together they identify the version of the markup language in which the document is written. Historically, only the two combinations of format name and format version shown in the table below were defined, used, and implemented:

Format name Format version Description
SuikaWiki 0.9 The SuikaWiki/0.9 markup language.
SuikaWikiImage 0.9 The SuikaWikiImage/0.9 markup language.

A magic line can contain zero or more parameters after the format version. A parameter consists of one or more white space characters, followed by the name, followed by a = character, followed by a quoted string whose value represents zero or more values separated by a , character. A parameter value consists of zero or more characters except for the separator character ,. Historically, the following combinations of parameter names and values were defined and used:

Name Values Description
default-name Zero or more characters except for , The value represents the default user name for WikiForm input fields. Exactly one value can be specified. The default when this parameter is not specified is implementation dependent.
import Zero or more characters except for , A value represents the WikiName by which definitions for entity references are imported. When this parameter is not specified, no definition is imported.
interactive yes or no Value yes means that the document contains interactive content such as a WikiForm. Value no, the default value used when the parameter is not specified, means the document does not contain such content. It was intended to be used for the convenience of cache control mechanisms.
obsolete yes or no Value yes means the content of the document is obsolete, and value no, the default value used when the parameter is not specified, means the content is not obsolete.

The parameter name obsolete was defined in the SuikaWiki/0.9 specification, but the parameter name that had been actually implemented in SuikaWiki2 and used was the parameter name obsoleted.

obsoleted
page-icon Zero or more characters except for , The value represents the WikiName by which the page icon is imported. The page icon can be used as favicon @@ [ref], for example. Exactly one value can be specified. The default when this parameter is not specified is implementation dependent.
image-alt Zero or more characters except for , The value represents the alternative text for the image embedded in the document. Exactly one value can be specified. The default when this parameter is not specified is the empty string.
image-type An Internet Media Type with no parameter, white spaces, comments The value represents the type of the image embedded in the document. Exactly one value can be specified. This parameter has to be specified when the document contains an image.

The order in which parameters are specified is not significant. The parameter name of a parameter has to be different from the parameter name of any other parameter.

A magic line has to be terminated by zero or more white space characters followed by a newline.

header-part
= [obs-magic-line]
obs-magic-line
= "#?" format-name "/" format-version *(1*white-space parameter) *white-space newline
format-name
= identifier
format-version
= identifier
parameter
= parameter-name "=" quoted-string
parameter-name
= identifier
parameter-value-list
= [parameter-value *("," parameter-value)]
parameter-value
= *(char − ",")
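The magic line syntax above can be sketched in Python as follows. This is an illustration of the non-normative ABNF only, not the parsing algorithm of section 4.4. The function name parse_magic_line and the shape of its return value are hypothetical. As an assumption of this sketch, the format version additionally admits a "." character, since the historical value 0.9 would not otherwise match the identifier production. Uniqueness of parameter names is not checked.

```python
import re

IDENT = r"(?:[0-9A-Za-z-]|[^\x00-\x7f])+"
VERSION = r"(?:[0-9A-Za-z.-]|[^\x00-\x7f])+"  # assumption: also admits "."
# A quoted string: ordinary characters or quoted pairs between " characters.
QUOTED = r'"(?:[^"\\\x00-\x1f\x7f]|\\[^\x00-\x1f\x7f])*"'

PARAM = re.compile(r"[\t ]+(" + IDENT + r")=(" + QUOTED + r")")
MAGIC = re.compile(
    r"#\?(" + IDENT + r")/(" + VERSION + r")"
    r"((?:[\t ]+" + IDENT + r"=" + QUOTED + r")*)[\t ]*"
)

def parse_magic_line(line):
    """Parse a magic line given without its terminating newline."""
    m = MAGIC.fullmatch(line)
    if m is None:
        return None
    params = {}
    for name, quoted in PARAM.findall(m.group(3)):
        # Strip the enclosing quotes, then unescape quoted pairs.
        value = re.sub(r"\\(.)", r"\1", quoted[1:-1])
        # The value is a list of parameter values separated by ",".
        params[name] = value.split(",") if value else []
    return {"name": m.group(1), "version": m.group(2), "params": params}
```

Here an empty quoted string is treated as zero values, which is one possible reading of the parameter-value-list production.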

3.2 Body part blocks

The body part of a document consists of zero or more blocks.

There are several kinds of blocks: paragraphs, headings, lists, labeled lists, quotations, preformatted paragraphs, edited sections, tables, editorial notes, comment paragraphs, hrs, and empty blocks. In addition, forms and entity references can also be used as blocks.

An empty block, which is represented by an empty line, can be inserted between any two blocks. It is sometimes necessary to prevent a block from being interpreted as a part of the previous block.

For example, consider the following fragment:

- List item.
This line is part of the list item.

The second line is part of the list, by definition. If this is not desired, an empty block can be inserted between the two lines:

- List item.

This line is not part of the list item.

... such that the third line represents a paragraph.

body-part
= *block
block
= paragraph / heading / list / labeled-list / quotation / preformatted-paragraph / section-block / table / editorial-note / comment-paragraph / hr / empty-block / form / obs-entity-reference
empty-block
= newline

A paragraph represents a unit of text, similar to HTML's p element. It consists of an optional destination anchor number, followed by line contents, followed by a newline, followed by zero or more block children.

A paragraph cannot begin with a form or entity reference, since it is treated as a block when it appears at the beginning of a line. A paragraph cannot begin with a white space character, since it is treated as a preformatted paragraph then.

A block child is one of: an optional destination anchor number followed by line contents followed by a newline; a list; a labeled list; a preformatted paragraph; a section block; a table; an editorial note; a comment paragraph; or an hr.

An editorial note represents an editorial note. It is represented by a string @@, followed by zero or more white space characters, followed by zero or more block children.

A comment paragraph represents a note. It is represented by a string ;;, followed by zero or more white space characters, followed by zero or more block children.

An hr represents a break in the run of blocks in which it occurs, similar to the HTML hr element. It is represented by the string -*-*-, followed by an optional class specification, followed by zero or more white space characters, finally followed by a newline.

paragraph
= [destination-anchor-number] line-contents newline *block-child
comment-paragraph
= ";;" *white-space [destination-anchor-number] [line-contents] newline *block-child
editorial-note
= "@@" *white-space [destination-anchor-number] [line-contents] newline *block-child
hr
= "-*-*-" [class-specification] *white-space newline
block-child
= [destination-anchor-number] line-contents newline / list / labeled-list / preformatted-paragraph / section-block / table / editorial-note / comment-paragraph / hr

A heading introduces a section. It is represented by one or more * characters, followed by zero or more white space characters, optionally followed by a destination anchor number, optionally followed by line contents, followed by a newline. The number of * characters represents the depth of the section. A heading with only one * character begins a larger section than a heading with more than one * character. The line contents represent the name or caption of the section.

heading
= 1*"*" *white-space [destination-anchor-number] [line-contents] newline
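As an illustration of the heading production, the following Python sketch splits a heading line into its depth, its optional destination anchor number, and the remaining line contents. The name parse_heading is hypothetical, the line is assumed to be given without its newline, and the line contents are returned unparsed.

```python
import re

# 1*"*" *white-space [destination-anchor-number] [line-contents]
HEADING = re.compile(r"(\*+)[\t ]*(?:\[([0-9]+)\])?(.*)")

def parse_heading(line):
    """Return (depth, anchor_number_or_None, contents), or None if not a heading."""
    m = HEADING.fullmatch(line)
    if m is None:
        return None
    depth = len(m.group(1))  # the number of "*" characters
    anchor = int(m.group(2)) if m.group(2) else None
    return depth, anchor, m.group(3)
```

For instance, the line "** [3]Syntax" would be read as a depth-2 heading with destination anchor number 3 and contents "Syntax".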

There are three kinds of lists: ordered lists, unordered lists, and labeled lists. Ordered lists and unordered lists are called lists in this specification.

A list consists of zero or more items. An item in the list is represented by one or more - or = characters, followed by zero or more white space characters, optionally followed by a destination anchor number, optionally followed by line contents, followed by a newline, followed by zero or more block children. The number of - or = characters at the beginning of the item represents the depth of the item. In a list, all items have to have the same depth. If there is another list in the block children, its items' depth has to be greater than the depth of the parent item. The last character that represents the depth of an item indicates the type of the list: - indicates an unordered list while = indicates an ordered list. In a list, all items have to be of the same type.

A labeled list consists of one or more labeled list items. A labeled list item is represented by a : character, followed by zero or more white space characters, optionally followed by a destination anchor number, optionally followed by line contents, followed by zero or more white space characters, followed by a : character, optionally followed by a destination anchor number, followed by zero or more white space characters, optionally followed by line contents, followed by a newline, followed by zero or more block children. The former line contents, if any, represent the label. Block children cannot contain a labeled list.

list
= 1*list-item
list-item
= 1*("-" / "=") *white-space [destination-anchor-number] [line-contents] newline *block-child
labeled-list
= 1*labeled-list-item
labeled-list-item
= ":" *white-space [destination-anchor-number] [line-contents] *white-space ":" [destination-anchor-number] *white-space [line-contents] newline *block-child
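The rules for the depth and type of a list item can be sketched in Python as below. The name parse_list_item is hypothetical. The sketch looks at a single line in isolation, so it does not distinguish a list item from other constructs that begin with the same characters (such as an hr), and it does not check that sibling items share one depth and type.

```python
import re

LIST_ITEM = re.compile(r"([-=]+)[\t ]*(?:\[([0-9]+)\])?(.*)")

def parse_list_item(line):
    """Return (depth, kind, anchor_number_or_None, contents), or None."""
    m = LIST_ITEM.fullmatch(line)
    if m is None:
        return None
    markers = m.group(1)
    # The last marker character decides the type of the list.
    kind = "unordered" if markers[-1] == "-" else "ordered"
    anchor = int(m.group(2)) if m.group(2) else None
    return len(markers), kind, anchor, m.group(3)
```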

The following example contains no quotation:

>>1 This is a reference, not a quote.
quotation
= 1*quoted-block
quoted-block
= 1*">" *white-space (paragraph / editorial-note / comment-paragraph / newline)
preformatted-paragraph
= preformatted-paragraph-block / obs-preformatted-paragraph
preformatted-paragraph-block
= '[PRE[' [class-specification] "[" *white-space newline *([destination-anchor-number] [line-contents] newline) ']PRE]' *white-space
obs-preformatted-paragraph
= white-space [line-contents] newline *([destination-anchor-number] [line-contents] newline)

A section block is a marked section of zero or more blocks, preceded by a section block start tag and followed by a section block end tag.

A section block start tag is a [ character, followed by a section block tag name, followed by an optional class specification, followed by a [ character, followed by zero or more white space characters, optionally followed by line contents, followed by a newline.

Whether the line contents component is allowed or not and its semantics depends on the section block tag name.

For example, the line contents component of a FIG block represents a caption (i.e. a short form of a FIGCAPTION child).

A section block end tag is a ] character, followed by a section block tag name, followed by a ] character, followed by zero or more white space characters, followed by a newline.

A section block tag name represents the type of the section block. The section block tag names in the start tag and the end tag of a section block have to be the same value. Their semantics are described by the Block Element Table.

section-block
= '[' tag-name [class-specification] "[" *white-space [destination-anchor-number] [line-contents] newline body-part ']' tag-name ']' *white-space newline

A table represents two-dimensional tabular data. It is similar to the HTML table element, but what can be represented is even narrower than the HTML table model. A table consists of one or more table rows. A table row consists of one or more table cells. Syntactically, a table row is followed by a newline.

There are three kinds of table cells: data cells, header cells, and colspan cells. The first cell in a row has to be a data cell or a header cell. Syntactically a cell is preceded by a , character followed by zero or more white space characters, and is followed by zero or more white space characters.

A data cell represents a cell that contains data, like HTML td element. Likewise, a header cell represents a cell that contains data, like HTML th element. The data of a header cell has to be preceded by a * character. The cell consists of an optional destination anchor number, optionally followed by line contents. Syntactically, the cell can be provided as a quoted string, in which case its value is interpreted as an optional destination anchor number, optionally followed by line contents.

A colspan cell represents that the cell that would be placed there forms an integrated part of the cell just before that cell. The cell just before that cell might also be a colspan cell.

table
= 1*table-row
table-row
= "," data-cell *("," cell) newline
cell
= data-cell / header-cell / colspan-cell
data-cell
= *white-space ([cstartchar *cchar] / quoted-string) *white-space
header-cell
= *white-space "*" *white-space ([cstartchar *cchar] / quoted-string) *white-space
cstartchar
= char − ("," / %x22 / white-space)
cchar
= char − ","
colspan-cell
= "=="
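The cell syntax can be illustrated by a small Python sketch that splits one table row into cells. It follows the non-normative description above rather than the tokenization algorithm of section 4.2. The name split_table_row and the tuple representation of cells are hypothetical, and the row is assumed to be given without its newline. The sketch is lenient: a cell that starts with an unterminated " character is kept as plain data.

```python
import re

# "," then optional "*", then either a quoted string or plain characters.
CELL = re.compile(r',[\t ]*(\*?)[\t ]*(?:"((?:[^"\\]|\\.)*)"[\t ]*|([^,]*))')

def split_table_row(line):
    """Return a list of (kind, text) pairs, or None if line is not a table row."""
    cells, pos = [], 0
    while pos < len(line):
        m = CELL.match(line, pos)
        if m is None:
            return None
        star, quoted, plain = m.groups()
        if quoted is not None:
            text = re.sub(r"\\(.)", r"\1", quoted)  # unescape quoted pairs
        else:
            text = plain.rstrip("\t ")
        if not star and quoted is None and text == "==":
            cells.append(("colspan", None))
        else:
            cells.append(("header" if star else "data", text))
        pos = m.end()
    return cells or None
```

A quoted cell is the only way to carry a , character inside cell data in this sketch.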

3.3 Inline contents

Need prose definitions...

line-contents
= 1*(text / anchor-internal / anchor-external / anchor / tagged-inline-element / form / strong / emphasis / obs-entity-reference)
text
= 1*char
External reference scheme Syntax of external reference parameter Semantics
IW (identifier / quoted-string) ":" (identifier / quoted-string) InterWiki reference (An InterWikiName followed by a parameter)
MAIL RFC 2822 addr-spec but not RFC 2822 obs-addr-spec; no leading or trailing RFC 2822 FWS; no control characters (%x00-1f / %x7f) E-mail address
URI RFC 3986 URI reference URL
URL RFC 3986 URI reference URL

Maybe these schemes should reference Web Applications 1.0's URL and mail address syntax.

InterWiki is a hyperlinking mechanism in which the combination of an InterWikiName and a parameter identifies the destination of the link. The interpretation of an InterWiki link is implementation dependent.

External reference schemes URI and URL ought not to be used.

destination-anchor-number
= "[" 1*DIGIT "]"
anchor-internal
= ">>" 1*DIGIT
anchor-external
= "<" external-reference ">"
external-reference
= URL / external-reference-scheme ":" external-reference-parameter
URL
= 1*uschar ":" external-reference-parameter
external-reference-scheme
= 1*xschar
external-reference-parameter
= *(char − ("<" / ">" / %x22) / quoted-string)
uschar
= char − (":" / UALPHA)
xschar
= char − (":" / LALPHA)
anchor
= "[[" [line-contents] [inline-middle-tag [line-contents]] inline-end-tag
Tag name Number of middle tags Internal reference source anchor External reference source anchor Semantics
AA 0 Not allowed Not allowed Character art (so-called ASCII-art, aa element)
ABBR 0 or 1 Not allowed Not allowed Abbreviation (HTML abbr element)
CITE 0 Not allowed Not allowed Title of a work (HTML cite element)
CODE 0 or 1 Not allowed Not allowed Code (HTML code element)
CSECTION 0 Not allowed Not allowed Title of a section in a work (csection element)
DEL 0 Allowed Allowed Removal (HTML del element)
DFN 0 or 1 Not allowed Not allowed Defined term (HTML dfn element)
F 0 Not allowed Not allowed Field name (f element)
FRAC 1 Not allowed Not allowed Fraction (mfrac element)
INS 0 Allowed Allowed Insertion (HTML ins element)
KBD 0 Not allowed Not allowed User input (HTML kbd element)
KEY 0 Not allowed Not allowed Keyboard's key (key element)
LAT 0 or 1 Not allowed Not allowed Latitude (lat element)
LON 0 or 1 Not allowed Not allowed Longitude (lon element)
MAY 0 Not allowed Not allowed RFC 2119 keyword "MAY" (MAY element)
MUST 0 Not allowed Not allowed RFC 2119 keyword "MUST" (MUST element)
N 0 or 1 Not allowed Not allowed Number (n element)
Q 0 Allowed Allowed Quotation (HTML q element)
QN 0 or 1 Not allowed Not allowed Qualified name (qn element)
RUBY 1 or 2 Not allowed Not allowed Ruby annotation (HTML ruby element)
RUBYB 1 Not allowed Not allowed Secondary ruby annotation (rubyb element)
SAMP 0 Not allowed Not allowed Sample (HTML samp element)
SHOULD 0 Not allowed Not allowed RFC 2119 keyword "SHOULD" (SHOULD element)
SPAN 0 or 1 Not allowed Not allowed Span of text (HTML span element)
SRC 0 Not allowed Not allowed Short annotation for citation (src element)
SUP 0 Not allowed Not allowed Superscript (HTML sup element)
SUB 0 Not allowed Not allowed Subscript (HTML sub element)
TIME 0 or 1 Not allowed Not allowed Date or time (HTML time element)
TZ 0 or 1 Not allowed Not allowed Time zone offset (tz element)
VAR 0 Not allowed Not allowed Variable (HTML var element)
WEAK 0 Not allowed Not allowed Small print (weak element)

A future revision to this specification might define more tag names.

An inline start tag whose tag name is INS or DEL might not be placed at the beginning of a line contents construct, since it could be interpreted as a block start tag.

A class specification represents class names unless otherwise specified. The class specification syntactically consists of a ( character followed by the body of the class specification followed by a ) character. The body of a class specification consists of zero or more characters excluding (, ), and \. The body of the class specification has semantics similar to, and is processed similarly to, the HTML class attribute.

tagged-inline-element
= inline-start-tag [line-contents] *(inline-middle-tag [line-contents]) inline-end-tag
inline-start-tag
= "[" tag-name [class-specification] [language-specification] "["
tag-name
= 1*UALPHA
class-specification
= "(" *clchar ")"
clchar
= char − ("(" / ")" / "\")
language-specification
= "@" *ltchar
ltchar
= ALPHA / DIGIT / "-"
inline-middle-tag
= "]" *white-space [language-specification] "["
inline-end-tag
= "]" [anchor-internal / anchor-external] "]"
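The inline start tag syntax can be sketched in Python as follows. The name parse_inline_start_tag is hypothetical. As assumptions of this sketch, tag names are taken to be uppercase letters, as in the tag name table above, and the body of the class specification is split on white space in the manner of the HTML class attribute.

```python
import re

# "[" tag-name [class-specification] [language-specification] "["
INLINE_START = re.compile(
    r"\[([A-Z]+)(?:\(([^()\\]*)\))?(?:@([0-9A-Za-z-]*))?\["
)

def parse_inline_start_tag(s):
    """Return (tag, classes, lang, end_index), or None if s has no start tag at 0."""
    m = INLINE_START.match(s)
    if m is None:
        return None
    tag, classes, lang = m.groups()
    return tag, (classes.split() if classes else []), lang, m.end()
```

The returned end_index is the position just after the second [ character, where the line contents of the element begin.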

The form name specification, if any, defines the name of the form. It has to be different from any other form name defined in the document. A form name specification is syntactically a class specification, and its body is the form name. A form name cannot contain white space characters.

Specific form name Syntax of specific form parameters Semantics
comment Empty Comment input form.
embed ['IMG:'] identifier Embedding another page. The parameter specifies the WikiName of the page embedded. If the parameter begins with a string IMG:, the page is embedded as an image and the string does not form the part of the WikiName.
form N/A Reserved.
rcomment Empty Comment input form; a new comment is inserted after the form.
searched identifier Insert a search result for the parameter.

The form is an extension mechanism for the SWML text serialization. ...

The generic form can be used to embed a WikiForm specification. WikiForm provides a generic framework for describing user input forms and templates used for processing form inputs.

The three form fields in a generic form represent the input template, the output template, and options. Interpretation and processing of these fields are implementation dependent.

The name form cannot be used.

Names embed, rcomment, and searched are obsolete and cannot be used.

form
= generic-form / specific-form
generic-form
= "[[#" 'form' [form-name-specification] ":" form-field ":" form-field [":" form-field] "]]"
form-name-specification
= class-specification
form-field
= "'" *(char − ("'" / "\") / quoted-pair) "'"
specific-form
= "[[#" specific-form-name [":" specific-form-parameters] "]]"
specific-form-name
= 1*(LALPHA / "-")
specific-form-parameters
= identifier *(":" identifier)
strong
= "'''" [line-contents] "'''"
emphasis
= "''" [line-contents] "''"

3.4 Images

A document can contain an image by including, at the end of the document, the string __IMAGE__ followed by a newline followed by Base64-encoded (RFC 2045) image data. Parameters image-type and image-alt provide metadata for the image.

obs-image
= '__IMAGE__' *char
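A minimal Python sketch of separating the image data from the rest of a document is shown below. The name split_image is hypothetical. The sketch assumes that the __IMAGE__ marker starts a line, and it does not implement the "image data" tokenization mode of section 4.1.5.

```python
import base64
import re

# The marker at the start of the document or of a line, then a newline.
IMAGE_MARKER = re.compile(r"(?:\A|(?<=[\r\n]))__IMAGE__(?:\r\n|\r|\n)")

def split_image(document):
    """Return (text_before_marker, image_bytes_or_None)."""
    m = IMAGE_MARKER.search(document)
    if m is None:
        return document, None
    # b64decode skips characters outside the Base64 alphabet by default,
    # so line breaks inside the encoded data are tolerated.
    return document[:m.start()], base64.b64decode(document[m.end():])
```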

3.5 Lexical structures

An entity reference is a part of a document that is expected to be replaced by a fragment imported from another document. It is no longer supported.

obs-entity-reference
= "__&&" 1*char "&&__"

A character is a character from the coded character set used to encode the document. Unless otherwise specified, for the purpose of this specification, control characters (characters in the range U+0000 .. U+001F and U+007F) are not characters.

A newline can be represented in any of three common conventions: CR (U+000D), LF (U+000A), or CR followed by LF.

A quoted string is zero or more characters enclosed by " characters. In a quoted string, character \ can only be used as part of quoted pair. A quoted pair is \ followed by a character. The value of a quoted string is the string obtained by removing " characters enclosing the quoted string and removing \ characters at the beginning of the quoted pairs.

identifier
= 1*(ALPHA / DIGIT / "-" / non-ascii)
non-ascii
= char − %x00-7f
char
= <Any character> − (%x00-1f / %x7f)
quoted-string
= %x22 *(char − ("\" / %x22) / quoted-pair) %x22
quoted-pair
= "\" char
newline
= %x0d %x0a / %x0d / %x0a
white-space
= %x09 / %x20
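The value of a quoted string and the three newline conventions can be illustrated in Python. The names quoted_string_value and split_lines are hypothetical helpers for this sketch only.

```python
import re

# %x22 *(char - ("\" / %x22) / quoted-pair) %x22, where char excludes controls.
QUOTED_STRING = re.compile(
    r'"((?:[^"\\\x00-\x1f\x7f]|\\[^\x00-\x1f\x7f])*)"'
)

def quoted_string_value(s):
    """Return the value of a quoted string, or None if s is not one."""
    m = QUOTED_STRING.fullmatch(s)
    if m is None:
        return None
    # Remove the "\" character at the beginning of each quoted pair.
    return re.sub(r"\\(.)", r"\1", m.group(1))

def split_lines(text):
    """Split text on CR LF, CR, or LF; CR LF counts as a single newline."""
    return re.split(r"\r\n|\r|\n", text)
```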

4 Parsing documents in the SWML text serialization

This section specifies how to convert a string of characters into a node tree, assuming the string is written in the SWML text serialization. This process is referred to as parsing, and an implementation that performs this process is referred to as a parser.

How to convert a string of bytes into a string of characters is outside of the scope of this specification.

The parsing process is defined in terms of DOM and relies on HTML5 ... and manakai's extensions to DOM .... However, a conforming parser does not have to implement them, as long as the end result is equivalent.

The parsing process is divided into two stages: tokenization and tree construction. The tokenization stage emits a sequence of tokens, which are used as inputs for the tree construction stage. The tree construction stage constructs a node tree. Some steps invoked in the tokenization stage might also construct a part of the node tree. During the parsing, mutation events MUST NOT be invoked.

Before the actual parsing starts, a new Document object MUST be created. It represents the node tree constructed as a result of the parsing. The innerHTML IDL attribute of the Document object MUST be initially set to <html xmlns="http://www.w3.org/1999/xhtml"><head></head><body></body></html>. The document element is what the documentElement IDL attribute of the Document returns. The head element is what the firstChild IDL attribute of the document element returns at the time immediately after the innerHTML is set. The body element is what the lastChild IDL attribute of the document element returns at the time immediately after the innerHTML is set. The image element is initially null.
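
The initial state can be approximated as follows. This is only a sketch: it uses Python's xml.dom.minidom as a stand-in for the DOM implementation and parses the markup as XML instead of setting innerHTML; the function name new_document is hypothetical.

```python
from xml.dom.minidom import parseString

INITIAL = ('<html xmlns="http://www.w3.org/1999/xhtml">'
           '<head></head><body></body></html>')


def new_document():
    """Create the initial document and return the nodes named in the text."""
    doc = parseString(INITIAL)
    document_element = doc.documentElement
    head_element = document_element.firstChild   # first child at creation time
    body_element = document_element.lastChild    # last child at creation time
    image_element = None                         # initially null
    return doc, document_element, head_element, body_element, image_element
```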


When the parser appends a character char to node node, the manakai_append_text method ... MUST be invoked on node with the argument char.

When an element is created, its namespace prefix MUST be set to null.

When an attribute is created, its namespace and namespace prefix MUST both be set to null, unless an attribute in a namespace is created, in which case its namespace MUST be set to that namespace's URL and its namespace prefix MUST be set to that namespace's preferred prefix.


A class specification is a string consisting of a ( character, followed by zero or more characters that are not (, ), or white space characters, and finally followed by a ) character. The body of a class specification is the substring of the class specification between the parentheses (exclusive). It might be the empty string.
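
A class specification maps naturally onto a regular expression. The following is a non-normative sketch; CLASS_SPEC and class_spec_body are hypothetical names, and white space is taken to be U+0009 and U+0020 as in the white-space production.

```python
import re

# "(" followed by characters other than "(", ")", space and tab, then ")".
CLASS_SPEC = re.compile(r"\(([^() \t]*)\)")


def class_spec_body(s):
    """Return the body of the class specification s, or None if s is not one."""
    m = CLASS_SPEC.fullmatch(s)
    return m.group(1) if m else None
```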

4.1 Tokenization of lines

When a string of characters is tokenized, the string s MUST be processed as follows:

  1. Let pos be zero (0). It represents the index in s. The index of the first character in s is zero (0).
  2. If pos is greater than or equal to the length of s, then emit an end-of-file token and abort these steps.
  3. Let line be the empty string.
  4. If the posth character of s is U+000D CARRIAGE RETURN, process line. Set line to the empty string. If the (pos + 1)th character of s is U+000A LINE FEED, increment pos by one (1).
  5. Otherwise, if the posth character of s is U+000A LINE FEED, process line. Set line to the empty string.
  6. Otherwise, append the posth character of s to line.
  7. Increase pos by one (1).
  8. Go back to the fourth step of these steps.
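
The line-splitting loop above can be sketched as below. The sketch is non-normative and makes two assumptions that the steps do not state: the end-of-input check is repeated on every iteration, and a final line that is not terminated by a newline is still handed over for processing. Instead of processing each line by mode, the hypothetical function split_lines simply collects the lines.

```python
def split_lines(s):
    """Split s at CR, LF or CR LF and return the list of lines."""
    lines = []
    line = ""
    pos = 0
    while pos < len(s):          # assumption: checked on every iteration
        ch = s[pos]
        if ch == "\r":
            lines.append(line)   # "process line"
            line = ""
            if pos + 1 < len(s) and s[pos + 1] == "\n":
                pos += 1         # treat CR LF as a single newline
        elif ch == "\n":
            lines.append(line)
            line = ""
        else:
            line += ch
        pos += 1
    if line:
        lines.append(line)       # assumption: flush an unterminated last line
    return lines
```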

The steps above emit a sequence of one or more tokens, which are inputs to the tree construction stage. A token can have zero or more properties, depending on the kind of the token. There are several kinds of tokens and properties as follows:

Block start tag token
Classes and tag name properties.
Block end tag token
Tag name property.
Character token
Data property.
Comment paragraph start token
No property.
Editorial note start token
No property.
Element token
Local name, namespace, anchor attribute, by attribute, resScheme attribute, resParameter attribute, and content attribute. The default for these properties is null.
Emphasis token
No property.
Empty line token
No property.
End-of-file token
No property.
Form token
Name, id, and parameters properties.
Heading start token
Depth property.
Heading end token
No property.
Inline start tag token
Tag name, classes, and language properties. Default for these properties is null.
Inline middle tag token
language property, whose default is null.
Inline end tag token
Anchor attribute, resScheme attribute, and resParameter attribute properties. Default for these properties is null.
Labeled list start token
No property.
Labeled list middle token
No property.
List start token
Depth property.
Preformatted start token
No property.
Preformatted end token
No property.
Quotation start token
Depth property.
Strong token
No property.
Table row start token
No property.
Table row end token
No property.
Table cell start token
Header property.
Table cell end token
No property.
Table colspan cell token
No property.
Block element token
classes property, whose default is null.
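
One possible in-memory representation of these tokens is a kind plus a dictionary of properties. This is an illustrative sketch only; the Token class and the two constructor helpers are hypothetical and are not part of this specification.

```python
from dataclasses import dataclass, field
from typing import Any, Dict


@dataclass
class Token:
    kind: str
    props: Dict[str, Any] = field(default_factory=dict)


def inline_start_tag(tag_name=None, classes=None, language=None):
    # The default for all three properties is null (None here).
    return Token("inline start tag",
                 {"tag name": tag_name, "classes": classes, "language": language})


def heading_start(depth):
    return Token("heading start", {"depth": depth})
```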

Mode is a state of the tokenizer and is one of "initial" (the initial value used when the tokenization starts), "body", "preformatted", "preformatted block", and "image data".

Continuous line flag is another flag of the tokenizer, representing whether a new line character should be appended to the data, and takes either true or false. This flag is mainly used in the "body" mode.

When a line is processed, the rules specified in the following subsections are used according to the appropriate mode. Rules below sometimes require that the line be reprocessed. In such cases, the rules for the appropriate mode MUST be followed with the same line.

4.1.1 The "initial" mode

In the "initial" mode, line MUST be processed as follows:

If line starts with #?
Parse a magic line line.
Otherwise
  1. Set the continuous line flag to false.
  2. Switch to the "body" mode and reprocess line.

4.1.2 The "body" mode

In the "body" mode, line MUST be processed as follows:

If line is empty
  1. Set the continuous line flag to false.
  2. Emit an empty line token.
If line starts with a white space character
  1. Emit a preformatted start token.
  2. Run the algorithm to tokenize a text with line.
  3. Switch to the "preformatted" mode.
If line starts with *
  1. Let data be line.
  2. Let depth be zero (0).
  3. While the first character of data, if any, is *, run the following substeps:
    1. Increase depth by one (1).
    2. Remove the first character of data. (The removed character will be *.)
  4. Remove white space characters at the beginning of data, if any.
  5. Emit a heading start token whose depth is set to depth.
  6. Run the algorithm to tokenize a text with data.
  7. Emit a heading end token.
  8. Finally, set the continuous line flag to false.
If line is a string consisting of -*-*-, optionally followed by a class specification, followed by zero or more white space characters
  1. Let classes be the body of the class specification in line, if any, or null, otherwise.
  2. Emit a block element token whose classes is set to classes.
  3. Set the continuous line flag to false.
If line starts with - or =
  1. Let data be line.
  2. Let depth be the empty string.
  3. While the first character of data, if any, is - or =, run the following substeps:
    1. Append the first character of data to depth.
    2. Remove the first character of data.
  4. Remove white space characters at the beginning of data, if any.
  5. Emit a list start token whose depth is set to depth.
  6. Run the algorithm to tokenize a text with data.
  7. Finally, set the continuous line flag to true.
If line starts with :
  1. Let name be the empty string.
  2. Let data be line.
  3. Remove the first character of data. (The removed character will be :.)
  4. While data is not empty and the first character of data is not :, run the following substeps:
    1. Append the first character of data to name.
    2. Remove the first character of data.
  5. If name is the empty string, run the following substeps:
    1. Emit a character token whose data is a : character.
    2. Run the algorithm to tokenize a text with name.

    In this case, line does not represent a description list.

  6. Otherwise, run the following substeps:
    1. Remove white space characters at the beginning of name, if any.
    2. Remove white space characters at the end of name, if any.
    3. Emit a labeled list start token.
    4. Run the algorithm to tokenize a text with name.
    5. Remove the first character of data. (The removed character will be :.)
    6. Remove white space characters at the beginning of data, if any.
    7. Emit a labeled list middle token.
    8. Run the algorithm to tokenize a text with data.
  7. Finally, set the continuous line flag to true.
If line starts with >
  1. Let data be line.
  2. Let depth be zero (0).
  3. While the first character of data, if any, is >, run the following substeps:
    1. Increase depth by one (1).
    2. Remove the first character of data. (The removed character will be >.)
  4. If depth is two (2), data is not empty, and the first character of data is one of digits, run the following substeps:
    1. Prepend two > characters to data.
    2. If the continuous line flag is true, prepend a U+000A LINE FEED character to data.
    3. Run the algorithm to tokenize a text with data.
    4. Set the continuous line flag to true.
  5. Otherwise, run the following substeps:
    1. Emit a quotation start token whose depth is set to depth.
    2. Remove white space characters at the beginning of data, if any.
    3. If the length of data is greater than one (1) and the first two characters of data are @@, run the following substeps:
      1. Remove the first two characters of data. (The removed characters will be @@.)
      2. Emit an editorial note start token.
      3. Remove white space characters at the beginning of data, if any.
      4. Set the continuous line flag to true.
    4. If the length of data is greater than one (1) and the first two characters of data are ;;, run the following substeps:
      1. Remove the first two characters of data. (The removed characters will be ;;.)
      2. Emit a comment paragraph start token.
      3. Remove white space characters at the beginning of data, if any.
      4. Set the continuous line flag to true.
    5. Otherwise, if data is not empty, set the continuous line flag to true.
    6. Otherwise, set the continuous line flag to false.
    7. In any case, run the algorithm to tokenize a text with data.
If line is a string consisting of a [ character, followed by a section block tag name, optionally followed by a class specification, followed by a [ character, followed by zero or more white space characters, followed by zero or more characters
  1. Emit a block start tag token whose tag name is the section block tag name, and classes is the body of the class specification, if any, or null otherwise.
  2. Remove the substring of line, from the beginning of the string, to the [ character after the section block tag name and class specification (if any), from line.
  3. Remove white space characters at the beginning of line, if any.
  4. If line is not the empty string:
    1. Set tag name to FIGCAPTION.
    2. If the section block tag name is TALK, set tag name to SPEAKER.
    3. Emit a block start tag token whose tag name is tag name.
    4. Run the algorithm to tokenize a text with line.
    5. Emit a block end tag token whose tag name is tag name.
  5. Set the continuous line flag to false.
If line is a string consisting of [PRE, optionally followed by a class specification, followed by a [ character, followed by zero or more white space characters
  1. Emit a block start tag token whose tag name is PRE and classes is the body of the class specification, if any, or null otherwise.
  2. Set the continuous line flag to false.
  3. Switch to the "preformatted block" mode.
If line starts with @@
  1. Let data be line.
  2. Remove the first two characters of data. (The removed characters will be @@.)
  3. Remove white space characters at the beginning of data, if any.
  4. Emit an editorial note start token.
  5. Run the algorithm to tokenize a text with data.
  6. Set the continuous line flag to true.
If line starts with ;;
  1. Let data be line.
  2. Remove the first two characters of data. (The removed characters will be ;;.)
  3. Remove white space characters at the beginning of data, if any.
  4. Emit a comment paragraph start token.
  5. Run the algorithm to tokenize a text with data.
  6. Set the continuous line flag to true.
If line is a string consisting of a ] character, followed by a section block tag name, followed by a ] character, followed by zero or more white space characters
  1. Emit a block end tag token whose tag name is the section block tag name.
  2. Set the continuous line flag to false.
If line starts with ,
  1. Run the algorithm to tokenize a table row with line.
  2. Set the continuous line flag to false.
If line is __IMAGE__
Switch to the "image data" mode.
Otherwise
  1. If the continuous line flag is true, emit a character token whose data is a U+000A LINE FEED character.
  2. Run the algorithm to tokenize a text with line.
  3. Set the continuous line flag to true.
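
To illustrate how the leading markers of a line are consumed, the following non-normative sketch covers only four of the rules above: headings, the -*-*- block element, list items, and quotations (including the exception for >> followed by a digit). Empty lines, preformatted lines, editorial notes inside quotations, and all other rules are deliberately omitted, and text tokenization is not performed. The function name classify_body_line and the returned tuples are hypothetical.

```python
import re

WS = " \t"
DIGITS = "0123456789"
# -*-*- optionally followed by a class specification, then white space.
BLOCK_ELEMENT = re.compile(r"-\*-\*-(?:\(([^() \t]*)\))?[ \t]*")


def classify_body_line(line):
    """Return (kind, depth_or_classes, remaining_text) for a body-mode line."""
    if line.startswith("*"):
        rest = line.lstrip("*")
        return ("heading", len(line) - len(rest), rest.lstrip(WS))
    m = BLOCK_ELEMENT.fullmatch(line)
    if m:
        return ("block-element", m.group(1), "")
    if line[:1] in ("-", "="):
        rest = line.lstrip("-=")
        # For lists, depth is the string of marker characters itself.
        return ("list", line[: len(line) - len(rest)], rest.lstrip(WS))
    if line.startswith(">"):
        rest = line.lstrip(">")
        depth = len(line) - len(rest)
        if depth == 2 and rest and rest[0] in DIGITS:
            # ">>1" style anchors are ordinary text, not quotations.
            return ("text", None, line)
        return ("quotation", depth, rest.lstrip(WS))
    return ("text", None, line)
```

The -*-*- check is made before the list check, mirroring the order of the rules, because such a line also starts with -.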

4.1.3 The "preformatted" mode

In the "preformatted" mode, line MUST be processed as follows:

If line is the empty string
  1. Emit a preformatted end token.
  2. Switch to the "body" mode and reprocess line.
If line is a string consisting of a ] character, followed by a section block tag name, followed by a ] character, followed by zero or more white space characters
  1. Emit a preformatted end token.
  2. Emit a block end tag token whose tag name is the section block tag name.
  3. Set the continuous line flag to false.
  4. Switch to the "body" mode.
Otherwise
  1. Emit a character token whose data is a U+000A LINE FEED character.
  2. Run the algorithm to tokenize a text with line.

4.1.4 The "preformatted block" mode

In the "preformatted block" mode, line MUST be processed as follows:

If line is a string consisting of ]PRE] followed by zero or more white space characters
  1. Emit a block end tag token whose tag name is PRE.
  2. Set the continuous line flag to false.
  3. Switch to the "body" mode.
Otherwise
  1. If the continuous line flag is true, emit a character token whose data is a U+000A LINE FEED character.
  2. Run the algorithm to tokenize a text with line.
  3. Set the continuous line flag to true.

4.1.5 The "image data" mode

In the "image data" mode, line MUST be processed as follows:

  1. If the image element is null, then create an image element in the SuikaWiki/0.9 namespace and set the image element to that element. Append the image element to the document element.
  2. Otherwise, append a character U+000A LINE FEED to the image element.
  3. Then, append each character in line in the same order to the image element.
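
The accumulation of image data can be sketched as follows. This is a non-normative illustration in which the image element is modelled as a plain string rather than a DOM element; the class name ImageCollector is hypothetical.

```python
class ImageCollector:
    """Collects lines seen in the "image data" mode."""

    def __init__(self):
        self.image = None  # the image element is initially null

    def process_line(self, line):
        if self.image is None:
            self.image = ""      # stands in for creating the image element
        else:
            self.image += "\n"   # separate this line from the previous one
        self.image += line
```

The first line is appended without a leading line feed; every later line is preceded by one.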

4.2 Tokenization of a table row

The algorithm to tokenize a table row data is as follows:

  1. Let pos be zero (0). It represents the index in data. The index of the first character in data is zero (0).
  2. Emit a table row start token.
  3. LOOP: If pos is greater than or equal to the length of data, emit a table row end token and abort this algorithm.
  4. Increase pos by one (1).
  5. Let cell be the empty string.
  6. Let cell quoted be null.
  7. If pos is greater than or equal to the length of data, emit a table row end token and abort this algorithm.
  8. If the posth character in data is a white space character, increase pos by one (1) and go back to the previous step.
  9. If the posth character in data is a * character, set the header cell flag and increase pos by one (1).
  10. If the posth character in data is ", set cell quoted to the empty string and follow the substeps below:
    1. Increase pos by one (1).
    2. If pos is greater than or equal to the length of data, abort these substeps.
    3. Otherwise, if the posth character in data is ", abort these substeps.
    4. Otherwise, if the posth character in data is \, follow the substeps below:
      1. Increase pos by one (1).
      2. If pos is greater than or equal to the length of data, abort these substeps.
      3. Otherwise, append the posth character in data to cell quoted.
    5. Otherwise, append the posth character in data to cell quoted.
    6. Go back to the first substep in these substeps.
  11. While pos is less than the length of data, run the following substeps:
    1. If the posth character in data is ,, abort these substeps.
    2. Append the posth character in data to cell.
    3. Increase pos by one (1).
  12. Remove white space characters at the end of data, if any.
  13. If header cell flag is not set, cell quoted is null, and cell is equal to ==, then emit a table colspan cell token and go back to the step labeled LOOP.
  14. Emit a table cell start token whose header is whether header cell flag is set or not.
  15. If cell quoted is not null, run the algorithm to tokenize a text with cell quoted.
  16. Run the algorithm to tokenize a text with cell.
  17. Emit a table cell end token.
  18. Go back to the step labeled LOOP.
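
The following is a simplified, non-normative sketch of table row tokenization. It rests on several assumptions that are interpretations rather than statements of this specification: the header cell flag is reset for every cell, the closing " of a quoted cell is skipped rather than added to cell, and trailing white space is removed from cell. Text tokenization is replaced by placeholder ("text", ...) tuples. The function name tokenize_table_row is hypothetical.

```python
def tokenize_table_row(data):
    tokens = [("row-start",)]
    pos, n = 0, len(data)
    while pos < n:
        pos += 1                                  # skip the "," that opens the cell
        while pos < n and data[pos] in " \t":
            pos += 1                              # skip leading white space
        if pos >= n:
            break
        header = False                            # assumption: reset per cell
        if data[pos] == "*":
            header = True
            pos += 1
        quoted = None
        if pos < n and data[pos] == '"':
            quoted = ""
            pos += 1
            while pos < n and data[pos] != '"':
                if data[pos] == "\\":
                    pos += 1                      # take the next character literally
                    if pos >= n:
                        break
                quoted += data[pos]
                pos += 1
            pos += 1                              # assumption: skip the closing quote
        start = pos
        while pos < n and data[pos] != ",":
            pos += 1
        cell = data[start:pos].rstrip(" \t")      # assumption: trim the cell
        if not header and quoted is None and cell == "==":
            tokens.append(("colspan",))
            continue
        tokens.append(("cell-start", header))
        if quoted is not None:
            tokens.append(("text", quoted))
        if cell:
            tokens.append(("text", cell))
        tokens.append(("cell-end",))
    tokens.append(("row-end",))
    return tokens
```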

4.3 Tokenization of a text

The algorithm to tokenize a text data is as follows:

  1. Let nest level be zero (0).
  2. If data begins with [ followed by one or more digits followed by ], run the following steps:
    1. Let number be the digits in the matched substring.
    2. Remove the matched substring from data.
    3. Emit an element token whose local name is anchor-end, namespace is the SuikaWiki/0.9 namespace, anchor attribute is number, and content is [ followed by number followed by ].
  3. While the length of data is not zero (0), run the appropriate steps:
    If data begins with [[#, followed by one or more lowercase letters or U+002D HYPHEN-MINUS
    1. Let name be the lowercase letters and U+002D HYPHEN-MINUS in the matched substring.
    2. Remove the matched substring from data.
    3. Let id be null.
    4. Let parameters be an empty list.
    5. If data begins with a class specification, run the following substeps:
      1. Set the id to the body of the class specification.
      2. Remove the class specification from data.
    6. While the first character of data is :, run the following substeps:
      1. Remove the first character of data.
      2. If the length of data is greater than one (1) and the first two characters of data are ]], abort these substeps.
      3. Let parameter be the empty string.
      4. If data is empty, append parameter to parameters and abort these substeps.
      5. If the first character of data is ', run the following steps:
        1. Remove the first character of data.
        2. If data is empty, abort these substeps.
        3. If the first character of data is ', abort these substeps.
        4. If the first character of data is \, run the following substeps:
          1. Remove the first character of data.
          2. If data is empty, abort these substeps.
          3. Append the first character of data to parameter.
        5. Otherwise, append the first character of data to parameter.
        6. Go back to the first substep in these substeps.
      6. Otherwise, run the following steps:
        1. If data is empty, or if the first character of data is :, abort these substeps.
        2. Append the first character of data to parameter.
        3. Remove the first character of data.
        4. Go back to the first substep of these substeps.
      7. Append parameter to parameters.
    7. If the length of data is greater than one (1) and the first two characters of data are ]], remove these characters from data.
    8. Emit a form token whose name is name, id is id, and parameters is parameters.
    Otherwise, if the data begins with [[
    1. Remove the matched substring from data.
    2. Emit an inline start tag token.
    3. Increase nest level by one (1).
    If data begins with [, followed by one or more uppercase letters, optionally followed by a class specification, optionally followed by a language specification, followed by [
    1. Let tag name be the uppercase letters in the matched substring of data.
    2. Let classes be the body of the class specification in the matched substring of data, if any, or null, otherwise.
    3. Let language be the body of the language specification in the matched substring of data, if any, or null, otherwise.
    4. Remove the matched substring from data.
    5. Emit an inline start tag token whose tag name is tag name, classes is classes, and language is language.
    6. Increase nest level by one (1).
    If data begins with ]]
    1. Remove the matched substring from data.
    2. Emit an inline end tag token.
    3. If nest level is greater than zero (0), decrease nest level by one (1).
    If data begins with ]<, followed by one or more scheme characters, followed by :
    1. Remove the matched substring from data and then act as if the first two characters of the original data before the removal were < instead of ]<, except that the emitted token is an inline end tag token instead of an element token. The resScheme attribute of the token MUST be the resScheme attribute of the token that would be emitted if the first two characters were <. The resParameter attribute of the token MUST be the resParameter attribute of the token that would be emitted if the first two characters were <.
    2. If data begins with ], remove the character from data.
    3. If nest level is greater than zero (0), decrease nest level by one (1).
    If data begins with ]>> followed by one or more digits, followed by ]
    1. Let number be the digits in the matched substring.
    2. Remove the matched substring from data.
    3. Emit an inline end tag token whose anchor is number.
    4. If nest level is greater than zero (0), decrease nest level by one (1).
    If nest level is greater than zero (0) and data begins with ] followed by zero or more white space characters followed by [
    If nest level is greater than zero (0) and data begins with ] followed by zero or more white space characters followed by a language specification followed by [
    1. Let lang be the body of the language specification in the matched substring of data, if any, or null, otherwise.
    2. Remove the matched substring from data.
    3. Emit an inline middle tag token whose language is lang.
    If data begins with <, followed by one or more scheme characters, followed by :
    1. Let scheme be the scheme characters part of the matched substring.
    2. Remove the matched substring from data.
    3. Let value be the empty string.
    4. Run the following steps:
      1. If data is empty, abort these steps.
      2. If the first character of data is >, remove the first character of data and abort these steps.
      3. If the first character of data is ", append " to value and run the following substeps:
        1. Remove the first character of data.
        2. If data is empty, abort these steps.
        3. If the first character of data is ", append " to value, remove the first character of data, and abort these substeps.
        4. If the first character of data is \, run the following substeps:
          1. Append \ to value.
          2. Remove the first character of data.
          3. If data is empty, abort these steps.
          4. Append the first character of data to value.
        5. Otherwise, append the first character of data to value.
        6. Return back to the first substep of these substeps.
      4. Otherwise, run the following substeps:
        1. Append the first character of data to value.
        2. Remove the first character of data.
      5. Go back to the first of these steps.
    5. Let content be scheme followed by : followed by value.
    6. If scheme does not contain any uppercase letter, set value to content and set scheme to URI.
    7. Emit an element token whose local name is anchor-external, namespace is the SuikaWiki/0.9 namespace, resScheme attribute is scheme, resParameter attribute is value, and content is content.
    If data begins with '''
    1. Remove the matched substring from data.
    2. Emit a strong token.
    Otherwise, if data begins with ''
    1. Remove the matched substring from data.
    2. Emit an emphasis token.
    If data begins with >> followed by one or more digits
    1. Emit an element token whose local name is anchor-internal, namespace is the SuikaWiki/0.9 namespace, anchor attribute is the digits part of the matched substring, and content is the matched substring.
    2. Remove the matched substring from data.
    If data begins with __&&
    1. Remove the matched substring from data.
    2. If data begins with &&__, or if data does not contain &&__ as a substring, emit four character tokens whose data are _, _, &, and & respectively and remove the first four characters of data and abort these steps.
    3. Let name be the substring of data between the beginning of the string and the first occurrence of &&__ (exclusive).
    4. Remove the first occurrence of &&__ and any character before it from data.
    5. Emit an element token whose local name is replace, namespace is the SuikaWiki/0.9 namespace, by attribute is name.
    Otherwise
    1. Emit a character token whose data is set to the first character of data.
    2. Remove the first character of data.
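
The general shape of the loop above, which repeatedly matches a prefix of data, emits a token, and removes the matched prefix, can be sketched for a small subset of the rules. This non-normative sketch handles only ''' (strong), '' (emphasis), >> followed by digits (anchor-internal), and the fallback character rule; all bracket, form, and link rules are omitted. The function name tokenize_simple_text and the token tuples are hypothetical.

```python
import re

ANCHOR = re.compile(r">>([0-9]+)")


def tokenize_simple_text(data):
    tokens = []
    while data:
        m = ANCHOR.match(data)
        if data.startswith("'''"):        # checked before '' so that ''' wins
            tokens.append(("strong",))
            data = data[3:]
        elif data.startswith("''"):
            tokens.append(("emphasis",))
            data = data[2:]
        elif m:
            # (kind, anchor attribute, content)
            tokens.append(("anchor-internal", m.group(1), m.group(0)))
            data = data[m.end():]
        else:
            tokens.append(("char", data[0]))
            data = data[1:]
    return tokens
```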

4.4 Parsing a magic line

To parse a magic line data, the following steps MUST be used:

  1. Remove the first two characters of data. (They will be #?.)
  2. If there are one or more characters that are not white space characters at the beginning of data, run the following substeps:
    1. Let name be those characters.
    2. Let version be null.
    3. Remove those characters from data.
    4. If name contains /, set the substring after the first occurrence of the character to version. Note that version might become the empty string. Remove the / character and the substring after the character from name.
    5. Set the Name content attribute of the document element in the SuikaWiki/0.9 namespace to name.
    6. If version is not null, set the Version content attribute of the document element in the SuikaWiki/0.9 namespace to version.
  3. Run the following substeps:
    1. If data is empty, abort these substeps.
    2. If the first character of data is a white space character, remove the character from data and go back to the first substep of these substeps.
    3. Let name be the empty string.
    4. If data begins with one or more characters that are not =, set name to those characters and remove those characters from data.
    5. Let parameter be a newly created parameter element in the SuikaWiki/0.9 namespace and set the name content attribute of parameter to name.
    6. Remove the first character of data. (It will be =.)
    7. If the first character of data, if any, is ", remove that character from data.
    8. Run the following substeps:
      1. Let value be the empty string.
      2. If data is empty, or if the first character of data is ", create a value element in the SuikaWiki/0.9 namespace, set the textContent IDL attribute of the node to value, and append the node to parameter.
      3. Otherwise, if the first character of data, if any, is \, run the following substeps:
        1. Remove the first character of data. (The removed character will be \.)
        2. If the first character of data, if any, is ,, abort these substeps.
        3. Otherwise, append the first character of data, if any, to value.
      4. In any case, if the first character of data is ,, create a value element in the SuikaWiki/0.9 namespace, set the textContent IDL attribute of the node to value, append the node to parameter, and go back to the first substep of these substeps.
      5. Otherwise, append the first character of data to value.
      6. Go back to the second substep of these substeps.
    9. If the first character of data, if any, is ", remove that character from data.
    10. Append parameter to the head element.
    11. Go back to the first substep of these substeps.
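
The first two steps of this algorithm, which extract the format name and version, can be sketched as follows. The sketch is non-normative, returns plain values instead of setting attributes on the document element, and leaves the parameter list unparsed. The function name parse_magic_name and the sample lines in the usage note are hypothetical.

```python
def parse_magic_name(line):
    """Return (name, version, rest) for a magic line starting with "#?"."""
    data = line[2:]                      # remove the leading "#?"
    i = 0
    while i < len(data) and data[i] not in " \t":
        i += 1
    if i == 0:
        return None, None, data          # no name at the beginning of data
    name, rest = data[:i], data[i:]
    version = None
    if "/" in name:
        # Split at the first "/"; version may be the empty string.
        name, version = name.split("/", 1)
    return name, version, rest
```

For example, parse_magic_name('#?Example/1.0 foo="bar"') yields the name Example, the version 1.0, and the unparsed remainder ' foo="bar"'.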

4.5 Tree construction

The tree construction stage constructs a node tree from a series of tokens emitted by the tokenization stage. The tree construction stage has two state variables: insertion mode and stack of open elements.

The insertion mode is one of "in section", "in table row", or "in paragraph". The default that MUST be used when the tree construction begins is the "in section" insertion mode. The rules for these insertion modes are described in the subsections below.

When the algorithm below says that the parser is to do something “using the rules for the m insertion mode”, the parser MUST use the rules described under the m insertion mode's section, but MUST leave the insertion mode unchanged.

The stack of open elements contains tuples of (element node, section depth, quotation depth, list depth). This stack grows downwards; the topmost entry on the stack is the first one added to the stack, and the bottommost entry of the stack is the most recently added entry in the stack. It initially contains only one tuple: (the body element, 0, 0, 0). When an entry is pushed to the stack of open elements, the items of the new tuple are set to the same values as those of the bottommost tuple unless otherwise specified.

The current element is the element node of the bottommost entry in the stack of open elements.
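
The stack of open elements and its inheritance rule can be sketched as below. This is a non-normative illustration in which element nodes are represented by strings; the names Entry and OpenElements are hypothetical.

```python
from typing import NamedTuple


class Entry(NamedTuple):
    element: str
    section_depth: int
    quotation_depth: int
    list_depth: int


class OpenElements:
    def __init__(self, body="body"):
        # Initially the stack contains only (the body element, 0, 0, 0).
        self.entries = [Entry(body, 0, 0, 0)]

    @property
    def current(self):
        # The current element is the element of the bottommost entry,
        # that is, the most recently pushed one.
        return self.entries[-1].element

    def push(self, element, **overrides):
        # Depths are copied from the bottommost entry unless overridden.
        self.entries.append(
            self.entries[-1]._replace(element=element, **overrides))

    def pop(self):
        return self.entries.pop()
```

Pushing a section with section_depth=1 and then an h1 without overrides leaves the h1 entry with section depth 1, because it inherits the values of the entry below it.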

4.5.1 The "in section" insertion mode

In the "in section" insertion mode, a token MUST be processed as follows:

A heading start token
  1. If the local name of the current element is not one of body, section, and block elements, then pop the element off the stack of open elements and follow this substep again.
  2. Let current depth be the section depth of the bottommost entry in the stack of open elements.
  3. If depth of the token is less than or equal to the current depth, pop the element off the stack of open elements and go back to the first substep of these substeps.
  4. Otherwise, if depth of the token is greater than current depth + 1, create a section element in the HTML namespace, append the element created to the current element, push the element created to the stack of open elements with section depth set to current depth + 1, quotation depth set to zero (0), and list depth set to zero (0), and go back to the first substep of these substeps.
  5. Create a section element in the HTML namespace.
  6. Append the element created to the current element.
  7. Push the element created to the stack of open elements with section depth set to depth, quotation depth set to zero (0), and list depth set to zero (0).
  8. Create an h1 element in the HTML namespace.
  9. Append the element created to the current element.
  10. Push the element created to the stack of open elements.
  11. Switch to the "in paragraph" insertion mode.
A block start tag token whose tag name is not PRE
  1. If the token's tag name is TALK:
    1. If the current element's local name is not dialogue:
      1. Let element be a dialogue element in the SuikaWiki/0.9 namespace.
      2. Append element to the current element.
      3. Push element to the stack of open elements.
    2. Otherwise:
      1. If the current element's local name is dialogue, pop the current element off the stack of open elements.
      2. Let row be the table row of the Block Element Table whose tag name is the token's tag name.
      3. Create an element whose namespace is row's namespace and whose local name is row's local name.
      4. Append the element created to the current element.
      5. Push the element created to the stack of open elements with section depth set to zero (0), quotation depth set to zero (0), and list depth set to zero (0).
      6. If the token's classes is not null, set the class content attribute of the element created to classes.
      A block end tag token whose tag name is not PRE
      1. Let row be the table row of the Block Element Table whose tag name is the token's tag name.
      2. Let local name be row's local name.
      3. If the stack of open elements contains an element whose local name is local name, pop the current element off the stack of open elements until an element whose local name is local name has been popped from the stack of open elements.
      4. Set the continuous line flag to false.
      A block element token
      1. Create an hr element in the HTML namespace.
      2. Append the element created to the current element.
      3. If the token's classes is not null, set the class content attribute of the element created to classes.
      A quotation start token
      1. If the local name of the current element is not one of blockquote, body, section, and block elements, then pop the element off the stack of open elements and follow this substep again.
      2. Let current depth be the quotation depth of the bottommost entry in the stack of open elements.
      3. If depth of the token is less than the current depth, pop the element off the stack of open elements and go back to the first substep of these substeps.
      4. Otherwise, if depth of the token is greater than current depth, create a blockquote element in the HTML namespace, append the element created to the current element, push the element created to the stack of open elements with section depth set to zero (0), quotation depth set to current depth + 1, and list depth set to zero (0), and go back to the first substep of these substeps.
      A list start token
      1. If the current element's local name is dialogue, pop the current element off the stack of open elements.
      2. Let current depth be the list depth of the current element.
      3. Let inserted depth be the length of depth of the token.
      4. Let local name be ul, if the last character in depth is -, or ol, otherwise.
      5. If current depth is greater than inserted depth, pop the current element off the stack of open elements and go back to the first substep of these substeps.
      6. If the list depth of the current element is equal to inserted depth and the local name of the current element is not local name, pop the current element off the stack of open elements and go back to the first substep of these substeps.
      7. If current depth is less than inserted depth, run the following substeps:
        1. Let type be the character at the index equal to current depth in depth of the token, where the index of the first character in depth is zero (0).
        2. If type is -, create a ul element in the HTML namespace.
        3. Otherwise, create an ol element in the HTML namespace.
        4. Append the element created to the current element.
        5. Push the element created to the stack of open elements, with list depth set to current depth + 1.
        6. If current depth + 1 is less than inserted depth, run the following substeps:
          1. Create a li element in the HTML namespace.
          2. Append the element created to the current element.
          3. Push the element created to the stack of open elements.
        7. Go back to the first substep for the list start token.
      8. Create a li element in the HTML namespace.
      9. Append the element created to the current element.
      10. Push the element created to the stack of open elements.
      11. Switch to the "in paragraph" insertion mode.
      A labeled list start token
      1. If the current element's local name is dialogue, pop the current element off the stack of open elements.
      2. If the local name of the current element is dd, pop the element off the stack of open elements.
      3. If the local name of the current element is not dl, create a dl element in the HTML namespace, append the element created to the current element, and push the element created to the stack of open elements.
      4. Create a dt element in the HTML namespace.
      5. Append the element created to the current element.
      6. Push the element created to the stack of open elements.
      7. Switch to the "in paragraph" insertion mode.
      A table row start token
      1. If the current element's local name is dialogue, pop the current element off the stack of open elements.
      2. Create a table element in the HTML namespace.
      3. Append the element created to the current element.
      4. Push the element created to the stack of open elements.
      5. Create a tbody element in the HTML namespace.
      6. Append the element created to the current element.
      7. Push the element created to the stack of open elements.
      8. Create a tr element in the HTML namespace.
      9. Append the element created to the current element.
      10. Push the element created to the stack of open elements.
      11. Switch to the "in table row" insertion mode.
      A block start tag token whose tag name is PRE
      1. If the current element's local name is dialogue, pop the current element off the stack of open elements.
      2. Create a pre element in the HTML namespace.
      3. Append the element created to the current element.
      4. Push the element created to the stack of open elements.
      5. If the token's classes is not null, set the class content attribute of the element created to classes.
      6. Switch to the "in paragraph" insertion mode.
      A preformatted start token
      1. If the current element's local name is dialogue, pop the current element off the stack of open elements.
      2. Create a pre element in the HTML namespace.
      3. Append the element created to the current element.
      4. Push the element created to the stack of open elements.
      5. Switch to the "in paragraph" insertion mode.
      A comment paragraph start token
      1. If the current element's local name is dialogue, pop the current element off the stack of open elements.
      2. Create a comment-p element in the SuikaWiki/0.10 namespace.
      3. Append the element created to the current element.
      4. Push the element created to the stack of open elements.
      5. Switch to the "in paragraph" insertion mode.
      An editorial note start token
      1. If the current element's local name is dialogue, pop the current element off the stack of open elements.
      2. Create an ed element in the SuikaWiki/0.10 namespace.
      3. Append the element created to the current element.
      4. Push the element created to the stack of open elements.
      5. Switch to the "in paragraph" insertion mode.
      An empty line token
      1. If the current element's local name is not one of body, section, dialogue, and block elements, then pop the element off the stack of open elements and follow this substep again.
      A form token
      An element token whose local name is replace
      Process the token using the rules for the "in paragraph" insertion mode.
      An end-of-file token
      Now the Document has been constructed. Abort the parser.
      Any other block start tag token
      A labeled list middle token, heading end token, preformatted end token, table row end token, table cell start token, table cell end token, or table colspan cell token
      Ignore the token.
      Anything else
      1. If the current element's local name is dialogue, pop the current element off the stack of open elements.
      2. If the current element's local name is not one of p, li, dd, figcaption, comment-p, or ed, or if the current element's local name is figcaption or speaker and the current element's children is not empty, run the following substeps:
        1. Create a p element in the HTML namespace.
        2. Append the element created to the current element.
        3. Push the element created to the stack of open elements.
      3. Switch to the "in paragraph" insertion mode and reprocess the token.
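
The handling of a list start token above can be sketched as follows. This is a minimal, non-normative illustration in Python, not a reference implementation. The names Node, Entry, push_child, process_list_start, and outline are hypothetical. The sketch assumes that an element pushed without an explicit list depth inherits the list depth of the previous current element, and that depth is a non-empty string in which any character other than - denotes an ordered list (the = used below is only an example). The final switch to the "in paragraph" insertion mode is left to the caller.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    local_name: str
    children: list = field(default_factory=list)


@dataclass
class Entry:
    """An entry in the stack of open elements, with its list depth."""
    node: Node
    list_depth: int = 0


def push_child(stack, local_name, list_depth=None):
    """Create an element, append it to the current element, and push it.

    Assumption: when no list depth is given, the new entry inherits the
    list depth of the previous current element.
    """
    parent = stack[-1]
    node = Node(local_name)
    parent.node.children.append(node)
    depth = parent.list_depth if list_depth is None else list_depth
    stack.append(Entry(node, depth))
    return node


def process_list_start(stack, depth):
    """Apply the list start token steps for a token whose depth is `depth`."""
    inserted_depth = len(depth)
    wanted = "ul" if depth[-1] == "-" else "ol"
    while True:
        if stack[-1].node.local_name == "dialogue":
            stack.pop()
        current = stack[-1]
        if current.list_depth > inserted_depth:
            stack.pop()
            continue
        if current.list_depth == inserted_depth and current.node.local_name != wanted:
            stack.pop()
            continue
        if current.list_depth < inserted_depth:
            kind = "ul" if depth[current.list_depth] == "-" else "ol"
            push_child(stack, kind, current.list_depth + 1)
            if current.list_depth + 1 < inserted_depth:
                # An intermediate item is needed to hold the nested list.
                push_child(stack, "li")
            continue
        break
    push_child(stack, "li")


def outline(node):
    """Serialize a tree as nested local names, for inspection only."""
    if not node.children:
        return node.local_name
    return node.local_name + "(" + ",".join(outline(c) for c in node.children) + ")"
```

Feeding tokens with depths -, --, -, and = in that order to a stack that contains only a body element yields one ul with two items (the first of which holds a nested ul), followed by a separate ol.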

4.5.2 The "in table row" insertion mode

In the "in table row" insertion mode, a token MUST be processed as follows:

A table cell start token
  1. Let local name be th if the header of the token is true, or td otherwise.
  2. Create a local name element in the HTML namespace.
  3. Append the element created to the current element.
  4. Push the element created to the stack of open elements.
  5. Switch to the "in paragraph" insertion mode.
A table colspan cell token
  1. If the local name of the node returned by the lastChild IDL attribute of the current element, if any, is td or th, increase the value of colspan IDL attribute of the node by one (1) and abort these substeps.
  2. Create a td element in the HTML namespace.
  3. Append the element created to the current element.
A table row end token
If the local name of the current element is tr, pop the element off the stack of open elements.
A table row start token
  1. Create a tr element in the HTML namespace.
  2. Append the element created to the current element.
  3. Push the element created to the stack of open elements.
Anything else
Switch to the "in section" insertion mode and reprocess the token.
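
The rules of this insertion mode can be sketched as follows. This is a minimal, non-normative illustration in Python. The token representation (a dict with a type key whose values are cell-start, colspan-cell, row-end, and row-start), the Node class, and the function names are assumptions made for this sketch only. The function returns the name of the insertion mode to use next; when it returns "in section", the caller is expected to reprocess the token in that mode.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    local_name: str
    children: list = field(default_factory=list)
    colspan: int = 1  # mirrors the default value of the colspan IDL attribute


def append_child(stack, local_name, push=True):
    """Create an element, append it to the current element, optionally push it."""
    node = Node(local_name)
    stack[-1].children.append(node)
    if push:
        stack.append(node)
    return node


def process_in_table_row(stack, token):
    """Process one token in the "in table row" mode; return the next mode."""
    kind = token["type"]
    if kind == "cell-start":
        append_child(stack, "th" if token.get("header") else "td")
        return "in paragraph"
    if kind == "colspan-cell":
        current = stack[-1]
        last = current.children[-1] if current.children else None
        if last is not None and last.local_name in ("td", "th"):
            last.colspan += 1  # widen the preceding cell
        else:
            append_child(stack, "td", push=False)  # appended but not pushed
        return "in table row"
    if kind == "row-end":
        if stack[-1].local_name == "tr":
            stack.pop()
        return "in table row"
    if kind == "row-start":
        append_child(stack, "tr")
        return "in table row"
    return "in section"  # anything else: the caller reprocesses the token
```

Note that a colspan cell token at the start of a row creates an empty td element that is not pushed to the stack, so later colspan cell tokens in the same row widen that cell.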

4.5.3 The "in paragraph" insertion mode

In the "in paragraph" insertion mode, a token MUST be processed as follows:

A character token
Append the character in data of the token to the current element.
An inline start tag token whose tag name is null
  1. Create an anchor element in the SuikaWiki/0.9 namespace.
  2. Append the element created to the current element.
  3. Push the element created to the stack of open elements.
Any other inline start tag token
  1. Create an element. The namespace and local name of the element are determined according to the tag name of the inline start tag token as shown in the following table:
    Tag name       Namespace                     Local name
    AA             The AA namespace              aa
    ABBR           The HTML namespace            abbr
    CITE           The HTML namespace            cite
    CODE           The HTML namespace            code
    CSECTION       The SuikaWiki/0.10 namespace  csection
    DEL            The HTML namespace            del
    DFN            The HTML namespace            dfn
    F              The SuikaWiki/0.9 namespace   f
    FRAC           The MathML namespace          mfrac
    INS            The HTML namespace            ins
    KBD            The HTML namespace            kbd
    KEY            The SuikaWiki/0.10 namespace  key
    LAT            The SuikaWiki/0.9 namespace   lat
    LON            The SuikaWiki/0.9 namespace   lon
    MAY            The SuikaWiki/0.9 namespace   MAY
    MUST           The SuikaWiki/0.9 namespace   MUST
    N              The SuikaWiki/0.9 namespace   n
    Q              The HTML namespace            q
    QN             The SuikaWiki/0.10 namespace  qn
    RUBY           The HTML namespace            ruby
    RUBYB          The SuikaWiki/0.9 namespace   rubyb
    SAMP           The HTML namespace            samp
    SHOULD         The SuikaWiki/0.9 namespace   SHOULD
    SPAN           The HTML namespace            span
    SRC            The SuikaWiki/0.10 namespace  src
    SUB            The HTML namespace            sub
    SUP            The HTML namespace            sup
    TIME           The HTML namespace            time
    TZ             The SuikaWiki/0.9 namespace   tz
    VAR            The HTML namespace            var
    WEAK           The SuikaWiki/0.9 namespace   weak
    Anything else  The SuikaWiki/0.10 namespace  Same as tag name
  2. If the token's classes is not null, set the class content attribute of the element created to classes.
  3. If the token's language is not null, set the lang content attribute in the XML namespace of the element created to language.
  4. Append the element created to the current element.
  5. Push the element created to the stack of open elements.
  6. If token's tag name is FRAC:
    1. Create an mi element in the MathML namespace.
    2. Append the element created to the current element.
    3. Push the element created to the stack of open elements.
An inline middle tag token
  1. Let local name be title.
  2. Let namespace be the SuikaWiki/0.10 namespace.
  3. If the local name of the current element is rt, set local name to rt, set namespace to the HTML namespace, and pop the current element off the stack of open elements.
  4. Otherwise, if the local name of the current element is title, nsuri, tz, n, lat, or lon, set local name to attrvalue and pop the current element off the stack of open elements.
  5. Otherwise, if the local name of the current element is qn, set local name to nsuri.
  6. Otherwise, if the local name of the current element is ruby or rubyb, set local name to rt and set namespace to the HTML namespace.
  7. Otherwise, if the local name of the current element is mi, set local name to mi.
  8. Create an element whose local name is local name and whose namespace is namespace.
  9. If the token's language is not null, set the lang content attribute in the XML namespace of the element created to language.
  10. Append the element created to the current element.
  11. Push the element created to the stack of open elements.
An inline end tag token
  1. If the local name of the current element is one of rt, title, nsuri, mi, or attrvalue, pop the element off the stack of open elements.
  2. If the current element is one of structural elements, or if the local name of the current element is strong or em, run the following substeps:
    1. If both resScheme attribute and anchor attribute of the token are null, append characters ]] to the current element, push the current element to the stack of open elements, and abort these substeps.

      As a result, the bottommost and second bottommost entries become equal, but one of them is popped from the stack of open elements soon.

    2. If resScheme attribute of the token is not null, create an anchor-external element in the SuikaWiki/0.9 namespace.
    3. Otherwise, create an anchor-internal element in the SuikaWiki/0.9 namespace.
    4. Append the element created to the current element.
    5. Set the textContent IDL attribute of the element created to ]].
    6. Push the element created to the stack of open elements.
  3. If anchor attribute of the token is not null, set the anchor content attribute in the SuikaWiki/0.9 namespace of the current element to anchor attribute of the token.
  4. If resScheme attribute of the token is not null, set the resScheme content attribute in the SuikaWiki/0.9 namespace of the current element to resScheme attribute of the token.
  5. If resParameter attribute of the token is not null, set the resParameter content attribute in the SuikaWiki/0.9 namespace of the current element to resParameter attribute of the token.
  6. Pop the current element off the stack of open elements.
A strong token
  1. If the local name of the current element is strong, pop the element off the stack of open elements and abort these substeps.
  2. Create a strong element in the HTML namespace.
  3. Append the element created to the current element.
  4. Push the element created to the stack of open elements.
An emphasis token
  1. If the local name of the current element is em, pop the element off the stack of open elements and abort these substeps.
  2. Create an em element in the HTML namespace.
  3. Append the element created to the current element.
  4. Push the element created to the stack of open elements.
A form token whose name is form
  1. Create a form element in the SuikaWiki/0.9 namespace.
  2. If id of the form token is not null, set the id content attribute of the element created to id of the form token.
  3. Set the input content attribute of the element created to the first item in parameters of the form token, if any, or the empty string otherwise.
  4. Set the template content attribute of the element created to the second item in parameters of the form token, if any, or the empty string otherwise.
  5. Set the option content attribute of the element created to the third item in parameters of the form token, if any, or the empty string otherwise.
  6. If the parameters contains four or more items, set the parameter content attribute of the element created to the concatenation of items in parameters, separated by a : character, in the same order.
  7. Append the element created to the current element.
Any other form token
  1. Create a form element in the SuikaWiki/0.9 namespace.
  2. Set the ref content attribute of the element created to name of the form token.
  3. If id of the form token is not null, set the id content attribute of the element created to id of the form token.
  4. If parameters of form token is not empty, set the parameter content attribute of the element created to the concatenation of items in parameters, separated by a : character, in the same order. The result value might be the empty string.
  5. Append the element created to the current element.
An element token
  1. Create an element whose local name is local name of the element token and namespace is namespace of the element token.
  2. If anchor attribute of the element token is not null, set the anchor content attribute in the SuikaWiki/0.9 namespace of the element created to anchor attribute of the element token.
  3. If by attribute of the element token is not null, set the by content attribute of the element created to by attribute of the element token.
  4. If resScheme attribute of the element token is not null, set the resScheme content attribute in the SuikaWiki/0.9 namespace of the element created to resScheme attribute of the element token.
  5. If resParameter attribute of the element token is not null, set the resParameter content attribute in the SuikaWiki/0.9 namespace of the element created to resParameter attribute of the element token.
  6. If content of the element token is not null, set the textContent IDL attribute of the element created to content of the element token.
  7. Append the element created to the current element.
A labeled list middle token
  1. If the current element is not one of structural elements, pop the element off the stack of open elements and follow this substep again.
  2. If the local name of the current element is dt, pop the element off the stack of open elements.
  3. Create a dd element in the HTML namespace.
  4. Append the element created to the current element.
  5. Push the element created to the stack of open elements.
A heading end token
  1. If the current element is not one of structural elements, pop the element off the stack of open elements and follow this substep again.
  2. If the local name of the current element is h1, pop the element off the stack of open elements.
  3. Switch to the "in section" insertion mode.
A table cell end token
  1. If the current element is not one of structural elements, pop the element off the stack of open elements and follow this substep again.
  2. If the local name of the current element is td or th, pop the element off the stack of open elements.
  3. Switch to the "in table row" insertion mode.
A block end tag token whose tag name is PRE
A preformatted end token
  1. If the current element is not one of structural elements, pop the element off the stack of open elements and follow this substep again.
  2. If the local name of the current element is pre, pop the element off the stack of open elements.
  3. Switch to the "in section" insertion mode.
Anything else
  1. If the current element is not one of structural elements, pop the element off the stack of open elements and follow this substep again.
  2. Switch to the "in section" insertion mode and reprocess the token.
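
The strong token and emphasis token rules above act as toggles on the current element. A minimal, non-normative sketch in Python follows. The Node class and the function names are hypothetical and exist only for this illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    local_name: str
    children: list = field(default_factory=list)


def toggle_inline(stack, local_name):
    """Close the current element if it has `local_name`; otherwise open one."""
    if stack[-1].local_name == local_name:
        stack.pop()
        return
    node = Node(local_name)
    stack[-1].children.append(node)
    stack.append(node)


def process_strong(stack):
    toggle_inline(stack, "strong")


def process_emphasis(stack):
    toggle_inline(stack, "em")
```

Because only the current element is inspected, a strong token that arrives while an em element is the current element opens a new strong element inside the em element, even if another strong element is open further up the stack.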

5 Serializing SWML text serialization documents

...

6 Element definitions for the SWML text serialization

The following Block Element Table is referenced from the parser:

Tag name    Local name  Namespace                Semantics (non-normative)          Semantics of start tag's line contents (non-normative)
DEL         delete      SuikaWiki/0.9 namespace  Removal.                           Not allowed.
EG          example     SuikaWiki/0.9 namespace  Example.                           Not allowed.
FIG         figure      HTML namespace           Figure.                            Figure caption.
FIGCAPTION  figcaption  HTML namespace           Figure caption.                    Not allowed.
HISTORY     history     SuikaWiki/0.9 namespace  Historical notes.                  Not allowed.
INS         insert      SuikaWiki/0.9 namespace  Insertion.                         Not allowed.
NOTE        note        HTML3 namespace          Note.                              Not allowed.
POSTAMBLE   postamble   SuikaWiki/0.9 namespace  Postamble.                         Not allowed.
PREAMBLE    preamble    SuikaWiki/0.9 namespace  Preamble.                          Not allowed.
REFS        refs        SuikaWiki/0.9 namespace  References and quotations.         Not allowed.
SPEAKER     speaker     SuikaWiki/0.9 namespace  Speaker name of a talk.            Not allowed.
TALK        talk        SuikaWiki/0.9 namespace  A single talk part in a dialogue.  Not allowed.

A future revision to this specification might define more tag names.

The semantics of an element is formally defined in terms of corresponding DOM elements.

Block elements are elements whose local name is one of local names in the Block Element Table.

Structural elements are block elements and elements whose local name is one of body, section, blockquote, h1, ul, ol, dl, li, dt, dd, table, tbody, th, tr, td, p, comment-p, ed, and pre.

These definitions are referenced from the parser. The parser does not have to check elements' namespaces.
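
Several parser steps pop elements off the stack of open elements until the current element is one of the structural elements. The definitions above can be sketched as follows. This is a non-normative illustration in Python in which the stack holds local names only, since the parser does not have to check namespaces. The constant and function names are hypothetical.

```python
# Local names taken from the Block Element Table.
BLOCK_LOCAL_NAMES = frozenset({
    "delete", "example", "figure", "figcaption", "history", "insert",
    "note", "postamble", "preamble", "refs", "speaker", "talk",
})

# Block elements plus the additional local names listed in the definition
# of structural elements.
STRUCTURAL_LOCAL_NAMES = BLOCK_LOCAL_NAMES | frozenset({
    "body", "section", "blockquote", "h1", "ul", "ol", "dl", "li", "dt",
    "dd", "table", "tbody", "th", "tr", "td", "p", "comment-p", "ed", "pre",
})


def is_structural(local_name):
    """Return True if an element with this local name is structural."""
    return local_name in STRUCTURAL_LOCAL_NAMES


def pop_to_structural(stack):
    """Pop entries until the current element is a structural element."""
    while stack and not is_structural(stack[-1]):
        stack.pop()
    return stack
```

For example, applying pop_to_structural to a stack holding body, p, strong, and em leaves body and p on the stack.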

7 The text/x-suikawiki and text/x.suikawiki.image Internet Media Types

The SWML text serialization can be identified by Internet Media Type text/x-suikawiki.

An entity labeled as text/x-suikawiki MUST be an SWML text serialization and MUST be processed as an SWML text serialization.

Additionally, for historical reasons, an entity labeled as text/x.suikawiki.image MUST be processed as an SWML text serialization. This Internet Media Type MUST NOT be used for a new entity.

It was originally intended that a document with format name equal to SuikaWiki is labeled as text/x-suikawiki while a document with format name equal to SuikaWikiImage is labeled as text/x.suikawiki.image.

The charset parameter of these Internet Media Types represents the character encoding used for the entity. It has the same requirements as the charset parameter for the text/html Internet Media Type @@ todo: ref.

The version parameter MAY have the value 0.9 or 0.10 but SHOULD NOT be used. The parameter MUST be ignored.

This parameter was originally used to encode format version in favor of magic line.
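
The processing requirements above can be sketched as follows. This is a minimal, non-normative illustration in Python. The function name parse_swml_content_type and the shape of its return value are assumptions made for this sketch. Parameter parsing is deliberately simplified: it splits on ; and does not handle quoted strings that contain that character.

```python
SWML_TYPES = {"text/x-suikawiki", "text/x.suikawiki.image"}


def parse_swml_content_type(value):
    """Return processing hints for an SWML media type label, or None.

    The version parameter is ignored; only charset is extracted.
    """
    essence, *params = [part.strip() for part in value.split(";")]
    essence = essence.lower()
    if essence not in SWML_TYPES:
        return None
    charset = None
    for param in params:
        name, _, raw = param.partition("=")
        if name.strip().lower() == "charset":
            charset = raw.strip().strip('"')
        # Any other parameter, including version, is ignored.
    return {
        "legacy": essence == "text/x.suikawiki.image",
        "charset": charset,
    }
```

Both media types lead to the same processing; the legacy flag only records that the label is one that is not to be used for a new entity.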

... IMT template; fragment identifier

8 The SWML XML serialization

...

8.1 ... xml media type

9 Semantics of Elements and Attributes

This specification is the specification for the SuikaWiki/0.9 namespace and the SuikaWiki/0.10 namespace. Anything belonging to those namespaces is defined in this specification.

Elements and attributes in the SuikaWiki/0.9 namespace and in the SuikaWiki/0.10 namespace, as well as attributes in no namespace for elements in the SuikaWiki/0.9 namespace and in the SuikaWiki/0.10 namespace, MUST NOT be used in context where they are not allowed explicitly.

A namespaced attribute allowed in another specification can be used on elements in the SuikaWiki/0.9 namespace and in the SuikaWiki/0.10 namespace. For example, a lang attribute in the XML namespace is allowed to be specified for an XML element, as defined in the XML specification [XML]. Note that the allowed attributes entry in the following subsections lists only attributes defined in this specification.

Elements in the SuikaWiki/0.9 namespace and in the SuikaWiki/0.10 namespace defined in this specification MUST conform to their content model.

Inter-element whitespace, comment nodes, and processing instruction nodes MUST be ignored when establishing whether an element matches its content model or not.

Elements in the SuikaWiki/0.9 namespace and in the SuikaWiki/0.10 namespace MAY be orphan nodes (i.e. without a parent node).

In the following subsections, attributes listed in the allowed attributes entry MAY be specified to an element described in that subsection.

Some elements belong to categories such as flow content and phrasing content.

An attribute is said to be specified to an element if the hasAttributeNS method invoked on the element with appropriate arguments would return true.

That is, the term specified is unrelated to the specified IDL attribute.

9.1 Document structures

9.1.1 The document element in the SuikaWiki/0.9 namespace

Category
None.
Content model
A head element in the XHTML2 namespace, followed by a body element in the XHTML2 namespace, optionally followed by an image element in the SuikaWiki/0.9 namespace.
Allowed attributes
None.

This element MUST NOT be used.

...

9.1.2 The Name attribute in the SuikaWiki/0.9 namespace

This attribute MUST NOT be used.

...

9.1.3 The Version attribute in the SuikaWiki/0.9 namespace

This attribute MUST NOT be used.

...

9.1.4 The parameter element in the SuikaWiki/0.9 namespace

Category
None.
Content model
Zero or more value elements in the SuikaWiki/0.9 namespace.
Allowed attributes
name

This element MUST NOT be used.

... name

9.1.5 The value element in the SuikaWiki/0.9 namespace

Category
None.
Content model
Text.
Allowed attributes
None.

This element MUST NOT be used.

...

9.1.6 The class attribute

All elements in the HTML namespace have class attribute ....

The class attribute of an element in the SuikaWiki/0.9 namespace and SuikaWiki/0.10 namespace has the same semantics and requirements as the HTML class attribute.

The class attribute of an element in the AA namespace SHOULD be considered as having the same semantics and requirements as the HTML class attribute.

9.1.7 The id attribute

The id attribute of an element in the SuikaWiki/0.9 namespace and SuikaWiki/0.10 namespace has the same semantics and requirements as the id attribute of HTML5 ....

9.2 Blocks

9.2.1 The dr element in the SuikaWiki/0.9 namespace

Category
None.
Content model
A dt element in the XHTML2 namespace, followed by a dd element in the XHTML2 namespace.
Allowed attributes
None.

This element MUST NOT be used.

...

9.2.2 The comment-p element in the SuikaWiki/0.10 namespace

Category
Flow content.
Content model
Flow content.
Allowed attributes
None.

The comment-p element represents a note.

Historically, the p suffix in the element name implied that it represented a paragraph. As its content is any flow content, it now can contain any number of paragraphs.

9.2.3 The history element in the SuikaWiki/0.9 namespace

Category
Flow content.
Content model
Flow content.
Allowed attributes
class

The history element represents a description of history or an out-of-date content.

9.2.4 The example element in the SuikaWiki/0.9 namespace

Category
Flow content.
Content model
Flow content.
Allowed attributes
class

The example element represents an example.

9.2.5 The preamble element in the SuikaWiki/0.9 namespace

Category
Flow content.
Content model
Flow content.
Allowed attributes
class

The preamble element represents a preamble or preface.

9.2.6 The postamble element in the SuikaWiki/0.9 namespace

Category
Flow content.
Content model
Flow content.
Allowed attributes
class

The postamble element represents a postamble.

9.3 Dialogues

9.3.1 The dialogue element in the SuikaWiki/0.9 namespace

Category
Flow content.
Content model
Zero or more talk or script-supporting elements.
Allowed attributes
None.

The dialogue element represents a conversation between one or more persons.

Each piece of the conversation is represented by child talk elements.

A dialogue element SHOULD have at least one talk element child.

9.3.2 The talk element in the SuikaWiki/0.9 namespace

Category
None.
Content model
One speaker element followed by flow content.
Allowed attributes
class

The talk element represents a group of sentences by a person (or a specific group of persons) in the dialogue.

The speaker of a talk element is the first speaker child element of the element, if any, or null. If the speaker is not null, it describes the speaker(s) of the talk. Otherwise, the speaker is not explicitly described.

Interviewer's questions are often identified by lack of explicit speaker name.

The class attribute can be used to style talks in a dialogue based on their speakers (e.g. use different colors for different speakers).

9.3.3 The speaker element in the SuikaWiki/0.9 namespace

Category
None.
Content model
Phrasing content.
Allowed attributes
class

The speaker element represents a short string used to credit the person (or group of persons) who spoke the piece of the conversation.

It can also contain other metadata than person name, such as affiliation of the person or the timestamp of the talk, if desired.

9.4 Hyperlinks

Some of the elements defined by this specification or used in SWML documents are considered as implicit link elements. Elements abbr, cite, code, and kbd in the HTML namespace are implicit link elements.

9.4.1 The anchor element in the SuikaWiki/0.9 namespace

Category
Phrasing content.
Flow content.
Content model
Phrasing content.
Allowed attributes
anchor in the SuikaWiki/0.9 namespace

...

9.4.2 The anchor-internal element in the SuikaWiki/0.9 namespace

Category
Phrasing content.
Flow content.
Content model
Phrasing content.
Allowed attributes
anchor in the SuikaWiki/0.9 namespace

...

9.4.3 The anchor-end element in the SuikaWiki/0.9 namespace

Category
Phrasing content.
Flow content.
Content model
Phrasing content.
Allowed attributes
anchor in the SuikaWiki/0.9 namespace

...

9.4.4 The anchor attribute in the SuikaWiki/0.9 namespace

The anchor attribute in the SuikaWiki/0.9 namespace, when specified to an anchor-end element, defines an anchor number for the parent element of the anchor-end element, if any.

The attribute MUST be specified and its value MUST be a valid integer. The integer MUST have a different value from any other anchor attribute in the SuikaWiki/0.9 namespace specified in an anchor-end element in the SuikaWiki/0.9 namespace that belongs to the same tree as the first attribute.


The anchor attribute in the SuikaWiki/0.9 namespace MAY be specified to q elements in the HTML namespace and in the XHTML2 namespace, as well as ins and del elements in the HTML namespace. The attribute can also be specified to anchor and anchor-internal elements in the SuikaWiki/0.9 namespace.

In these cases, the attribute represents the anchor number of the element referenced. If the element on which the attribute is specified is an anchor element, the element referenced might be found in the document referenced by the element. Otherwise, the element is in the tree the element belongs to.

If the element on which the attribute is specified is not an anchor or anchor-internal element, the attribute has similar semantics to that of the cite attribute on the element. In such cases, the anchor attribute in the SuikaWiki/0.9 namespace MUST NOT be specified when the cite attribute is specified. A user agent MUST ignore the anchor attribute in the SuikaWiki/0.9 namespace if the cite attribute is specified.

The attribute value MUST be a valid integer. Unless the element is anchor, the integer MUST be equal to one of the integers represented by the anchor attribute in the SuikaWiki/0.9 namespace specified to an anchor-internal element in the SuikaWiki/0.9 namespace that belongs to the same tree.
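
The requirements on anchor attribute values of anchor-end elements (valid integer, unique within the tree) can be checked along the following lines. This is a minimal, non-normative sketch in Python. The function name and the error messages are hypothetical, and the sketch assumes that a valid integer is an optional - character followed by one or more ASCII digits, as in HTML. The caller is assumed to have collected the attribute values of all anchor-end elements in one tree, in tree order.

```python
import re

# Assumption: the HTML definition of a valid integer.
VALID_INTEGER = re.compile(r"-?[0-9]+")


def check_anchor_end_values(values):
    """Return a list of error messages for the given attribute values."""
    errors = []
    seen = set()
    for raw in values:
        if VALID_INTEGER.fullmatch(raw) is None:
            errors.append(f"not a valid integer: {raw!r}")
            continue
        number = int(raw)
        if number in seen:
            errors.append(f"duplicate anchor number: {number}")
        seen.add(number)
    return errors
```

Uniqueness is checked on the integer represented, not on the attribute string, so the values 1 and 01 are reported as duplicates.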

9.4.5 The anchor-external element in the SuikaWiki/0.9 namespace

Category
Phrasing content.
Flow content.
Content model
Phrasing content.
Allowed attributes
resParameter in the SuikaWiki/0.9 namespace
resScheme in the SuikaWiki/0.9 namespace

...

9.4.6 The resScheme attribute in the SuikaWiki/0.9 namespace

The resScheme attribute in the SuikaWiki/0.9 namespace MAY be specified to q elements in the HTML namespace and in the XHTML2 namespace, as well as ins and del elements in the HTML namespace. The attribute can also be specified to an anchor-external element in the SuikaWiki/0.9 namespace.

...

9.4.7 The resParameter attribute in the SuikaWiki/0.9 namespace

The resParameter attribute in the SuikaWiki/0.9 namespace MAY be specified to q elements in the HTML namespace and in the XHTML2 namespace, as well as ins and del elements in the HTML namespace. The attribute can also be specified to an anchor-external element in the SuikaWiki/0.9 namespace.

...

9.5 Embedded objects

9.5.1 The aa element in the AA namespace

The aa element in the AA namespace ... falls into the phrasing content category for the purpose of the content models in this specification.

The content model of this element SHOULD be considered as phrasing content.

9.5.2 The form element in the SuikaWiki/0.9 namespace

Category
Phrasing content.
Flow content.
Content model
Nothing.
Allowed attributes
id
input
option
parameter
ref
template

... ref, parameter.


... input, template, option

9.5.3 The image element in the SuikaWiki/0.9 namespace

Category
None.
Content model
Text.
Allowed attributes
None.

This element MUST NOT be used.

...

9.5.4 The replace element in the SuikaWiki/0.9 namespace

Category
None.
Content model
Nothing.
Allowed attributes
by

This element MUST NOT be used.

... by

9.5.5 The text element in the SuikaWiki/0.9 namespace

Category
None.
Content model
Text.
Allowed attributes
None.

This element MUST NOT be used.

...

9.6 Citations

9.6.1 The csection element in the SuikaWiki/0.10 namespace

Category
Phrasing content.
Flow content.
Content model
Phrasing content.
Allowed attributes
class

...

9.6.2 The src element in the SuikaWiki/0.10 namespace

Category
Phrasing content.
Flow content.
Content model
Phrasing content.
Allowed attributes
class

...

9.6.3 The refs element in the SuikaWiki/0.9 namespace

Category
Flow content.
Content model
Flow content.
Allowed attributes
class

The refs element represents a list of referenced documents.

9.7 Editorial annotations

9.7.1 The insert element in the SuikaWiki/0.9 namespace

Category
Flow content.
Content model
Flow content.
Allowed attributes
class

The insert element represents an insertion into the document.

9.7.2 The delete element in the SuikaWiki/0.9 namespace

Category
Flow content.
Content model
Flow content.
Allowed attributes
class

The delete element represents a removal from the document.

9.7.3 The ed element in the SuikaWiki/0.10 namespace

Category
Flow content.
Content model
Flow content.
Allowed attributes
None.

...

9.8 Inline annotations

9.8.1 The rubyb element in the SuikaWiki/0.9 namespace

Category
Phrasing content.
Flow content.
Content model
Phrasing content, followed by a rt element in the HTML namespace.
Allowed attributes
class

...

9.8.2 The weak element in the SuikaWiki/0.9 namespace

Category
Phrasing content.
Flow content.
Content model
Phrasing content.
Allowed attributes
class

...

9.8.3 The title element in the SuikaWiki/0.10 namespace

Category
None.
Content model
Phrasing content.
Allowed attributes
None.

This element MAY be inserted as the last child of an abbr, dfn, span, or time element in the HTML namespace when the title attribute of that element is not specified.

If the parent element of the element has a title attribute specified, or the element is not the last child, the element MUST be ignored.

Inter-element whitespace, comments, and processing instructions can be inserted after this element.
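
The rule for ignoring a title element can be sketched in Python as follows. This is a minimal illustration, not part of the specification: the namespace URI constant is a placeholder (substitute the URI defined in section 2.1), the helper name honored_title_element is hypothetical, and "inter-element whitespace" is assumed to mean space, tab, LF, FF, and CR. The sketch relies on the default ElementTree parser, which discards comments and processing instructions, so only trailing text has to be checked.

```python
import xml.etree.ElementTree as ET

HTML_NS = "http://www.w3.org/1999/xhtml"
# Placeholder, NOT the real URI: substitute the SuikaWiki/0.10 namespace URI.
SW010_NS = "urn:example:suikawiki-0.10"
TITLE_TAG = f"{{{SW010_NS}}}title"
WHITESPACE = " \t\n\f\r"  # assumed set of inter-element whitespace characters


def honored_title_element(parent):
    """Return the title child of parent if it is not ignored, else None."""
    if "title" in parent.attrib:
        return None  # the parent has a title attribute: the element is ignored
    children = list(parent)  # comments and PIs were dropped by the parser
    if not children or children[-1].tag != TITLE_TAG:
        return None  # there is no title element in the last-child position
    tail = children[-1].tail or ""
    if any(ch not in WHITESPACE for ch in tail):
        return None  # text follows the element, so it is not the last child
    return children[-1]


def parse(inner, attrs=""):
    """Build an HTML span element around the given markup (for the demo only)."""
    return ET.fromstring(
        f'<span xmlns="{HTML_NS}" xmlns:sw="{SW010_NS}"{attrs}>{inner}</span>'
    )


ok = parse("WWW<sw:title>World Wide Web</sw:title>\n")
has_attr = parse("WWW<sw:title>World Wide Web</sw:title>", ' title="x"')
not_last = parse("WWW<sw:title>World Wide Web</sw:title> and more")

print(honored_title_element(ok) is not None)        # True
print(honored_title_element(has_attr) is not None)  # False
print(honored_title_element(not_last) is not None)  # False
```

The sketch does not check that the parent is one of abbr, dfn, span, or time; a conformance checker would add that test separately.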

...

9.9 Values

Some elements are defined as elements with value. For an element with value, referred to below as element, the element value is the value returned by the following steps:

  1. If element has a child attrvalue element:
    1. Let value element be the first attrvalue element child of element.
    2. Return the text content of value element.
  2. Otherwise, return the text content of element.
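
The steps above can be sketched in Python with the standard ElementTree module. This is an illustration under stated assumptions: the two namespace URI constants are placeholders (substitute the URIs defined in section 2.1), the function names are hypothetical, and step 1.2 is read as returning the text content of the value element selected in step 1.1.

```python
import xml.etree.ElementTree as ET

# Placeholders, NOT the real URIs: substitute the values from section 2.1.
SW09_NS = "urn:example:suikawiki-0.9"
SW010_NS = "urn:example:suikawiki-0.10"


def text_content(node):
    """Concatenate all descendant text of node, like DOM textContent."""
    return "".join(node.itertext())


def element_value(element):
    """Compute the element value of an element with value."""
    # Step 1: find() returns the first attrvalue child, or None if absent.
    value_element = element.find(f"{{{SW010_NS}}}attrvalue")
    if value_element is not None:
        return text_content(value_element)
    # Step 2: no attrvalue child, so use the element's own text content.
    return text_content(element)


with_attrvalue = ET.fromstring(
    f'<n xmlns="{SW09_NS}" xmlns:sw10="{SW010_NS}">'
    "twelve<sw10:attrvalue>12</sw10:attrvalue></n>"
)
without_attrvalue = ET.fromstring(f'<n xmlns="{SW09_NS}">42</n>')

print(element_value(with_attrvalue))     # 12
print(element_value(without_attrvalue))  # 42
```

In the first document the displayed text is "twelve" while the element value is "12", which is the case the attrvalue child exists to cover.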

9.9.1 The f element in the SuikaWiki/0.9 namespace

Category
Phrasing content.
Flow content.
Implicit link elements.
Content model
Phrasing content.
Allowed attributes
class

The f element represents a field name or key of some structure, such as a field name of a C data structure, a key of a Perl hash, or a property name of an XML information item.

9.9.2 The key element in the SuikaWiki/0.10 namespace

Category
Phrasing content.
Flow content.
Implicit link elements.
Content model
Phrasing content.
Allowed attributes
class

...

9.9.3 The n element in the SuikaWiki/0.9 namespace

Category
Phrasing content.
Flow content.
Implicit link elements.
Elements with value.
Content model
Phrasing content, optionally followed by an attrvalue element.
Allowed attributes
class

The n element represents the number given as the element value.

9.9.4 The lat element in the SuikaWiki/0.9 namespace

Category
Phrasing content.
Flow content.
Implicit link elements.
Elements with value.
Content model
Phrasing content, optionally followed by an attrvalue element.
Allowed attributes
class

The lat element represents a latitude given as the element value.

9.9.5 The lon element in the SuikaWiki/0.9 namespace

Category
Phrasing content.
Flow content.
Implicit link elements.
Elements with value.
Content model
Phrasing content, optionally followed by an attrvalue element.
Allowed attributes
class

The lon element represents a longitude given as the element value.

9.10 Conformance keywords

9.10.1 The MUST element in the SuikaWiki/0.9 namespace

Category
Phrasing content.
Flow content.
Implicit link elements.
Content model
Phrasing content.
Allowed attributes
class

The MUST element represents an RFC 2119 keyword "MUST".

9.10.2 The SHOULD element in the SuikaWiki/0.9 namespace

Category
Phrasing content.
Flow content.
Implicit link elements.
Content model
Phrasing content.
Allowed attributes
class

The SHOULD element represents an RFC 2119 keyword "SHOULD".

9.10.3 The MAY element in the SuikaWiki/0.9 namespace

Category
Phrasing content.
Flow content.
Implicit link elements.
Content model
Phrasing content.
Allowed attributes
class

The MAY element represents an RFC 2119 keyword "MAY".

9.11 Qualified names

9.11.1 The qn element in the SuikaWiki/0.10 namespace

Category
Phrasing content.
Flow content.
Content model
Phrasing content, optionally followed by a nsuri element in the SuikaWiki/0.10 namespace.
Allowed attributes
class

...

9.11.2 The qname element in the SuikaWiki/0.10 namespace

Category
None.
Content model
Phrasing content.
Allowed attributes
None.

This element MUST NOT be used.

...

9.11.3 The nsuri element in the SuikaWiki/0.10 namespace

Category
None.
Content model
Phrasing content.
Allowed attributes
None.

...

9.12 Fallback elements

9.12.1 The attrvalue element in the SuikaWiki/0.10 namespace

Category
None.
Content model
Phrasing content.
Allowed attributes
None.

Unless otherwise specified, this element MUST NOT be used.

...

9.12.2 Uppercase elements in the SuikaWiki/0.10 namespace

Category
None.
Content model
Phrasing content.
Allowed attributes
class

Uppercase elements are elements in the SuikaWiki/0.10 namespace whose local name consists of one or more uppercase letters.

These elements MUST NOT be used.

These elements might be inserted into a node tree by a parser when an inline start tag with an unknown tag name is found.
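
A conformance checker could flag uppercase elements with a test like the following Python sketch. It is illustrative only: the namespace URI constants are placeholders (substitute the URIs defined in section 2.1), the function name is hypothetical, "uppercase letters" is assumed to mean ASCII A to Z, and tags are written in ElementTree's "{namespace}local-name" notation.

```python
import re

# Placeholders, NOT the real URIs: substitute the values from section 2.1.
SW09_NS = "urn:example:suikawiki-0.9"
SW010_NS = "urn:example:suikawiki-0.10"

# One or more ASCII uppercase letters (assumed meaning of "uppercase letters").
UPPERCASE_NAME = re.compile(r"[A-Z]+")


def is_uppercase_element(tag):
    """Return True if tag names an uppercase element in SuikaWiki/0.10."""
    namespace, _, local_name = tag.rpartition("}")
    if namespace != "{" + SW010_NS:
        return False  # wrong namespace, or no namespace at all
    return UPPERCASE_NAME.fullmatch(local_name) is not None


print(is_uppercase_element(f"{{{SW010_NS}}}FOO"))   # True
print(is_uppercase_element(f"{{{SW010_NS}}}qn"))    # False
print(is_uppercase_element(f"{{{SW09_NS}}}MUST"))   # False
```

The last call shows that the MUST, SHOULD, and MAY elements are not uppercase elements in this sense, because they belong to the SuikaWiki/0.9 namespace.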

References

Normative references

AAVOCAB
...
MANAKAI
manakai's DOM extensions.
RFC2119
Key words for use in RFCs to Indicate Requirement Levels, Scott Bradner, IETF BCP 14, March 1997.
XHTML2
...
XML
...

Tests and implementation

There are test data.

There is a Perl implementation.

Author

This document is written by Wakaba <wakaba@suikawiki.org>.

This document is developed as part of the SuikaWiki project.

Per CC0, to the extent possible under law, the author has waived all copyright and related or neighboring rights to this work.