3.8 Marked Sections

XML provides a mechanism to indicate that particular pieces of the document should be processed in a special way. These are termed “marked sections”.

Example 3-14. Structure of A Marked Section

<![KEYWORD[
  Contents of marked section
]]>

As you would expect, being an XML construct, a marked section starts with <!.

The first square bracket begins to delimit the marked section.

KEYWORD describes how this marked section should be processed by the parser.

The second square bracket indicates that the content of the marked section starts here.

The marked section is finished by closing the two square brackets, and then returning to the document context from the XML context with >.

3.8.1 Marked Section Keywords

3.8.1.1 CDATA

These keywords denote the marked section's content model, and allow you to change it from the default.

When an XML parser is processing a document it keeps track of what is called the “content model”.

Briefly, the content model describes what sort of content the parser is expecting to see, and what it will do with it when it finds it.

The content model you will probably find most useful is CDATA.

CDATA stands for “Character Data”. If the parser is in this content model then it is expecting to see characters, and characters only. In this model the < and & symbols lose their special status, and will be treated as ordinary characters.
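The effect can be observed with any XML parser. The following is a minimal sketch, assuming Python's standard xml.etree.ElementTree module is available; the sample strings and the helper name element_text are made up for illustration. It parses the same character data written once with entity references and once as a CDATA marked section.

```python
import xml.etree.ElementTree as ET

# The same character data written twice: once with entity references,
# once inside a CDATA marked section. Both strings are illustrative.
ESCAPED = "<programlisting>&lt;p&gt;a &amp; b&lt;/p&gt;</programlisting>"
CDATA = "<programlisting><![CDATA[<p>a & b</p>]]></programlisting>"


def element_text(xml_source: str) -> str:
    """Parse a one-element document and return its character data."""
    return ET.fromstring(xml_source).text


if __name__ == "__main__":
    print(element_text(ESCAPED))
    print(element_text(CDATA))
    # The <p> inside the CDATA section is text, not a child element.
    print(len(ET.fromstring(CDATA)))
```

Both forms should yield the same text, and the element parsed from the CDATA form should have no child elements, because the parser never treated <p> as markup.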

Note: When you use CDATA in examples of text marked up in XML, keep in mind that the content of CDATA is not validated. You have to check the included XML text using other means. You could, for example, write the example in another document, validate the example code, and then paste it to your CDATA content.

Example 3-15. Using a CDATA Marked Section

<para>Here is an example of how you would include some text
  that contained many <literal>&lt;</literal>
  and <literal>&amp;</literal> symbols.  The sample
  text is a fragment of XHTML.  The surrounding elements (&lt;para&gt; and
  &lt;programlisting&gt;) are from DocBook.</para>

<programlisting>
  <![CDATA[
    <p>This is a sample that shows you some of the elements within
      XHTML.  Since the angle brackets are used so many times, it is
      simpler to say the whole example is a CDATA marked section
      than to use the entity names for the left and right angle
      brackets throughout.</p>

    <ul>
      <li>This is a listitem</li>
      <li>This is a second listitem</li>
      <li>This is a third listitem</li>
    </ul>

    <p>This is the end of the example.</p>
  ]]>
</programlisting>

If you look at the source for this document you will see this technique used throughout.
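Because the parser does not check the content of a CDATA marked section, it can help to check a sample separately before pasting it in. The following is a rough sketch of such a check, assuming Python's standard xml.etree.ElementTree module; the function name is_well_formed and the wrapper element are hypothetical. It only tests well-formedness. It does not validate the sample against a DTD, which needs a validating parser.

```python
import xml.etree.ElementTree as ET


def is_well_formed(fragment: str) -> bool:
    """Return True if the fragment parses as well-formed XML.

    The fragment is wrapped in a throwaway root element so that a sample
    with several top-level elements can be checked in one go.
    """
    try:
        ET.fromstring("<wrapper>" + fragment + "</wrapper>")
    except ET.ParseError:
        return False
    return True


if __name__ == "__main__":
    sample = "<p>Intro</p><ul><li>This is a listitem</li></ul>"
    print(is_well_formed(sample))
    # An unclosed <li> is reported as not well-formed.
    print(is_well_formed("<ul><li>This is a listitem</ul>"))
```

A sample that passes this check can still be invalid XHTML, so treat the result as a first filter only.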

3.8.1.2 INCLUDE and IGNORE

If the keyword is INCLUDE then the contents of the marked section will be processed. If the keyword is IGNORE then the marked section is ignored and will not be processed. It will not appear in the output.

Example 3-16. Using INCLUDE and IGNORE in Marked Sections

<![INCLUDE[
  This text will be processed and included.
]]>

<![IGNORE[
  This text will not be processed or included.
]]>

By itself, this is not too useful. If you wanted to remove text from your document you could cut it out, or wrap it in comments.

It becomes more useful when you realize you can use parameter entities to control this. In XML, however, INCLUDE and IGNORE marked sections can only be used in entity files, not in the body of the document.

For example, suppose that you produced a hard-copy version of some documentation and an electronic version. In the electronic version you wanted to include some extra content that was not to appear in the hard-copy.

Create an entity file that defines a general entity for each chapter, and guard each of these definitions with a parameter entity that can be set to either INCLUDE or IGNORE to control whether the entity is defined. After the conditional definitions, add a second definition of each general entity that sets it to an empty value. This technique relies on the fact that an entity definition cannot be overridden: the first definition always takes effect. The inclusion of a chapter is therefore controlled by the corresponding parameter entity. If it is set to INCLUDE, the first general entity definition is read and the second one is ignored. If it is set to IGNORE, the first definition is skipped and the second one takes effect.

Example 3-17. Using A Parameter Entity to Control a Marked Section


<!ENTITY % electronic.copy "INCLUDE">

<![%electronic.copy;[
<!ENTITY chap.preface SYSTEM "preface.xml">
]]>

<!ENTITY chap.preface "">

When producing the hard-copy version, change the parameter entity's definition to:

<!ENTITY % electronic.copy "IGNORE">
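To see which definition wins, the entity file can be fed to a parser that processes marked sections. The following is a sketch only, assuming Python's standard xml.parsers.expat module built with DTD support; the function name effective_entities, the dummy book document, and the entities.ent system identifier are made up for illustration. The entity file is passed in as a string instead of being read from disk.

```python
import xml.parsers.expat as expat

# The entity file from the example above, with the keyword left open.
ENTITY_FILE = """\
<!ENTITY % electronic.copy "{keyword}">

<![%electronic.copy;[
<!ENTITY chap.preface SYSTEM "preface.xml">
]]>

<!ENTITY chap.preface "">
"""


def effective_entities(entity_file: str) -> dict:
    """Return the general entity definitions that take effect.

    Each name maps to ("external", system_id) or ("internal", value).
    """
    found = {}
    parser = expat.ParserCreate()
    # Ask expat to read the external subset so marked sections are processed.
    parser.SetParamEntityParsing(expat.XML_PARAM_ENTITY_PARSING_ALWAYS)

    def on_entity_decl(name, is_parameter, value, base,
                       system_id, public_id, notation):
        if is_parameter:
            return  # only general entities are of interest here
        kind = ("external", system_id) if value is None else ("internal", value)
        # Keep the first definition only, as an XML parser does.
        found.setdefault(name, kind)

    def on_external_ref(context, base, system_id, public_id):
        # Feed the string in place of fetching the file named by system_id.
        child = parser.ExternalEntityParserCreate(context)
        child.Parse(entity_file, True)
        return 1

    parser.EntityDeclHandler = on_entity_decl
    parser.ExternalEntityRefHandler = on_external_ref
    # A dummy document whose DOCTYPE points at the (hypothetical) entity file.
    parser.Parse('<!DOCTYPE book SYSTEM "entities.ent"><book/>', True)
    return found


if __name__ == "__main__":
    for keyword in ("INCLUDE", "IGNORE"):
        print(keyword, effective_entities(ENTITY_FILE.format(keyword=keyword)))
```

With INCLUDE the sketch should report chap.preface as an external entity that points to preface.xml, and with IGNORE as an internal entity with an empty value. This matches the behavior described above.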

3.8.2 For You to Do…

  1. Modify the entities.ent file to contain the following:

    <!ENTITY version "1.1">
    <!ENTITY % conditional.text "IGNORE">
    
    <![%conditional.text;[
    <!ENTITY para1 SYSTEM "para1.xml">
    ]]>
    
    <!ENTITY para1 "">
    
    <!ENTITY para2 SYSTEM "para2.xml">
    <!ENTITY para3 SYSTEM "para3.xml">
    
  2. Normalize the example.xml file and notice that the conditional text is not present in the output document. If you now set the parameter entity guard to INCLUDE and regenerate the normalized document, the text will appear again. This method is most useful when several conditional chunks depend on the same condition, for example, whether you are generating printed or online text.