How HTML differs from XHTML. Glossary of terms

05/20/16 3.4K

Both HTML and XHTML are languages for creating web pages. HTML is built on SGML, and XHTML is based on XML. They are like two sides of the same coin. XHTML was derived from HTML in order to conform to XML standards. Consequently, XHTML is stricter than HTML and does not let you deviate from its coding rules.

XHTML was developed in response to inconsistent handling of tags: pages written in HTML were rendered differently in different browsers.

Comparison table

Definition (from Wikipedia)
  HTML: HTML, or HyperText Markup Language, is the primary markup language for creating web pages and other documents that can be viewed in a browser.
  XHTML: XHTML (Extensible HyperText Markup Language) is a family of XML markup languages that mirror or extend versions of the Hypertext Markup Language (HTML) in which web pages are written.
File extensions
  HTML: .html, .htm
  XHTML: .xhtml, .xht, .xml, .html, .htm
Internet media type
  HTML: text/html
  XHTML: application/xhtml+xml
Designed by
  HTML: W3C and WHATWG
  XHTML: World Wide Web Consortium
Format type
  HTML: Document file format
  XHTML: Markup language
Extended from
  HTML: SGML
  XHTML: XML, HTML
Stands for
  HTML: Hypertext Markup Language
  XHTML: Extensible Hypertext Markup Language
Application of
  HTML: Standard Generalized Markup Language (SGML)
  XHTML: XML
Functions
  HTML: Web pages are written in HTML.
  XHTML: An extended version of HTML, more rigorous, based on XML.
Behavior
  HTML: Flexible; parsers tolerate deviations from the syntax.
  XHTML: Restrictive; the XML rules must be followed.
Origin
  HTML: Proposed by Tim Berners-Lee in 1989 as part of his hypertext proposal at CERN; first publicly described in 1991.
  XHTML: World Wide Web Consortium Recommendation, 2000.
Versions
  HTML: HTML 2, HTML 3.2, HTML 4.0, HTML 5
  XHTML: XHTML 1, XHTML 1.1, XHTML 2, XHTML 5

Overview of HTML and XHTML

HTML is the primary markup language for web pages. It creates structured documents by marking up elements such as headings, lists, links, and quotes. It lets you embed images and objects and create interactive forms. HTML is written using tags in angle brackets, for example <html>. Its code may also contain scripts written in JavaScript.

XHTML is a family of XML languages that mirror or extend versions of HTML. They do not allow any tags to be omitted or attributes to be minimized. XHTML requires that each opening tag has a matching closing tag in the correct order. For example, where HTML allows the single tag <br>, in XHTML you need to write the tag as <br />. This is the difference.
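To make the closing-tag rule concrete, here is a minimal sketch that feeds two fragments to Python's standard XML parser. It assumes that XML well-formedness is a fair stand-in for what an XHTML processor enforces; the helper name is_well_formed is invented for this illustration.

```python
import xml.etree.ElementTree as ET


def is_well_formed(fragment: str) -> bool:
    """Return True if the fragment parses as XML (hypothetical helper)."""
    try:
        ET.fromstring(fragment)
        return True
    except ET.ParseError:
        return False


html_style = "<p>first line<br>second line</p>"     # <br> is never closed
xhtml_style = "<p>first line<br />second line</p>"  # empty-element syntax

print(is_well_formed(html_style))   # False
print(is_well_formed(xhtml_style))  # True
```

The check says nothing about whether the markup is valid against an XHTML DTD; it only shows that the unclosed tag already fails at the XML level.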

Functions of HTML and XHTML documents

HTML syntax consists of the following components: opening and closing tags, element attributes (given inside the tags), and text and graphic content. An HTML element is everything between the tags, including the tags themselves.

An XHTML document contains exactly one root element. All element and attribute names must be written in lowercase, attribute values must be quoted, and elements must be closed and properly nested. In XHTML these are mandatory requirements, unlike in HTML. The XHTML DOCTYPE declaration defines the rules that the document must follow.

Basic HTML syntax accepts many abbreviations that are not allowed in XHTML, for example elements that do not require both an opening and a closing tag. XHTML requires all elements to have both an opening and a closing tag. At the same time, XHTML introduces a new abbreviation: an empty XHTML element can be opened and closed in a single tag with a forward slash (<br/>).

Introducing a syntax that is not covered by the SGML declaration for HTML 4.01 could confuse early applications. To avoid this problem, put a space before the closing slash: <br />.

XHTML and HTML specification

HTML and XHTML can be described side by side. Both HTML 4.01 and XHTML 1.0 come in three variants: strict, transitional, and frameset. One difference between HTML and XHTML documents lies in the document type declaration; the other differences are syntactic. HTML allows some closing tags to be omitted and lets empty elements go without a closing tag. XHTML is very strict about opening and closing tags. It uses XML's built-in language attribute (xml:lang) to declare the language of content. All XML syntax requirements apply to an XHTML document.

But these differences only show up when the XHTML document is served as an XML application, that is, with the MIME type application/xhtml+xml, application/xml, or text/xml. An XHTML document served with the text/html MIME type must be interpreted as HTML, so in that case the HTML rules apply. CSS written for XHTML served as text/html may not work correctly in a document served as application/xhtml+xml. For additional information on MIME types, see the related documentation.

This can be important when you serve XHTML documents as text/html. If you are unaware of these differences, you might create CSS that does not work as expected when the document is served as XHTML.

Where the terms "XHTML" and "XHTML document" appear in the remainder of this section, they refer to XHTML markup served with an XML MIME type. XHTML markup served as text/html is an HTML document.
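The role of the MIME type can be sketched as a small lookup. The function below is only an illustration of the rule described above, not how any particular browser is implemented; the names XML_MIME_TYPES and parsing_mode are assumptions made for this example.

```python
# MIME types under which XHTML markup is treated as an XML application.
XML_MIME_TYPES = {"application/xhtml+xml", "application/xml", "text/xml"}


def parsing_mode(content_type: str) -> str:
    """Map a Content-Type value to 'xml', 'html' or 'other' (illustrative only)."""
    # Drop parameters such as "; charset=utf-8" and normalise the case.
    mime = content_type.split(";", 1)[0].strip().lower()
    if mime in XML_MIME_TYPES:
        return "xml"
    if mime == "text/html":
        return "html"
    return "other"


print(parsing_mode("text/html; charset=utf-8"))  # html
print(parsing_mode("application/xhtml+xml"))     # xml
```

The same XHTML markup therefore ends up under HTML rules or XML rules depending only on the header it is served with.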

How to switch from HTML to XHTML

  • Include both the xml:lang and lang attributes on elements that set the language;
  • Use empty-element syntax for elements specified in HTML as empty;
  • Use an extra space in empty-element tags: <br />;
  • Use explicit closing tags for elements that may contain content but are empty: <div></div>;
  • Do not include the XML declaration.

XHTML stands for Extensible Hypertext Markup Language. Note: extensible, not extended. This means that the language can still be extended today.

So what is XHTML? The main difference between XHTML and HTML is the way a document (web page) is processed. One more definition is worth introducing here. A parser is a program, or part of a program, that performs parsing (syntactic analysis). Put more simply, this program analyzes the entire structure of the page, that is, all of its code. In HTML, when the parser found an error, it corrected it during analysis, which took additional time: the browser had to work out what the author (developer) meant to write. For example, if a tag contained an error, it was simply displayed along with the rest of the text.
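The difference in processing can be sketched with two parsers from the Python standard library. This is an analogy under stated assumptions: html.parser is only a tokenizer and does not reproduce the error recovery of a real browser, and the class name TagCollector is invented for the example.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser


class TagCollector(HTMLParser):
    """Record every start tag the tolerant HTML parser reports."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)


broken = "<p>An <b>unclosed bold tag</p>"

# The HTML parser keeps going despite the missing </b>.
collector = TagCollector()
collector.feed(broken)
collector.close()
print(collector.tags)  # ['p', 'b']

# The XML parser stops at the first well-formedness error.
try:
    ET.fromstring(broken)
    xml_ok = True
except ET.ParseError:
    xml_ok = False
print(xml_ok)  # False
```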

Another difference is that all elements must be closed, and single tags must have a / before the closing angle bracket, for example: <br />. I will write a separate, longer article about tags, and I will also cover each tag individually. XHTML documents are encoded in UTF-8 by default (currently the most common encoding), while HTML historically used ISO 8859-1.

What is XHTML Modularization?

XHTML modularization is the decomposition of XHTML 1.0, and by reference HTML 4, into a collection of abstract modules that provide specific types of functionality. These abstract modules are implemented in this specification using the XML Document Type Definition (DTD) language, but an implementation using XML Schema is expected.
Rules for defining abstract modules and implementing them using DTDs are also defined in this document.


These modules can be combined with each other and with other modules to create a subset and extension of XHTML document types that qualify as members of the XHTML document type family.

What is XHTML Modularization for?

Formatting model

Previous versions of HTML tried to define the parts of such a model that a user agent was required to use when formatting a document. With the advent of HTML 4, the W3C began the process of separating presentation from structure. XHTML 1.0 maintains this separation, and this document continues to move HTML and its descendants in that direction. Accordingly, this document places no requirements on the formatting model associated with the presentation of documents marked up using XHTML Family document types.


Rather, this document recommends that content authors rely on styling mechanisms such as CSS when defining a formatting model for their content.
If user agents support style mechanisms, then documents will be formatted as expected.
If user agents do not support styling mechanisms, then documents will be formatted as the user agent specifies. This allows XHTML Family user agents to support rich formatting models on devices where that is appropriate, and lean formatting models on devices where that is appropriate.

Extensible Hypertext Markup Language (XHTML) is a shorthand name for several language recommendations that are widely used on Internet-enabled devices for web browsing. Although it is named after its predecessor, Hypertext Markup Language (HTML), it is in fact based on Extensible Markup Language (XML), which is a restricted subset of the Standard Generalized Markup Language (SGML).
In fact, they are all descendants of SGML. While HTML is a direct application of SGML, XHTML is what is called a namespace, a set of definitions for an XML document that helps eliminate ambiguity when more than one XML vocabulary is used in a given situation.

The language grew out of several limitations of HTML and the various ways in which HTML was implemented. Around the time HTML was updated to version 4, it was being handled loosely and inconsistently by many HTML interpreters, the computer programs that parse HTML documents into a formatted, viewable web page. As mobile devices and other platforms for web browsing emerged, a better solution was needed. XML is a much stricter application of SGML than HTML, and different XML namespaces can be used in the same document. Therefore, around 2000, the World Wide Web Consortium (W3C) developed XHTML and made it one of its recommendations to address some of these emerging problems.

For all intents and purposes, XHTML mimics HTML in most cases, but since XHTML uses an XML namespace, it can be parsed by any XML interpreter, whereas HTML is limited to HTML interpreters. XHTML is in effect HTML reformulated under XML, the more restrictive subset of SGML. Thus, the newer language could immediately be interpreted by existing web browsers and also became available to other platforms. It is also important to remember the extensible aspect of XHTML. Not only can it be read by more programs and platforms, but it can also be extended so that other XML namespaces are used in its documents.

Because XHTML can include other XML namespaces in a document, it can be extended in several ways to represent more than just page formatting. For example, Mathematical Markup Language (MathML) can be included in these documents to display mathematical formulas and notation. Images can also be embedded in this document type using the Scalable Vector Graphics (SVG) namespace. Likewise, XHTML itself can be included in another XML document.
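As a rough illustration of mixing namespaces, the sketch below parses a small XHTML document with an embedded SVG image and lists the namespaces it finds. The sample document is invented for this example; the two namespace URIs are the standard ones for XHTML and SVG.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"
SVG_NS = "http://www.w3.org/2000/svg"

# Hypothetical sample document: XHTML with an embedded SVG circle.
document = f"""
<html xmlns="{XHTML_NS}">
  <head><title>Mixed namespaces</title></head>
  <body>
    <p>A circle:</p>
    <svg xmlns="{SVG_NS}" width="40" height="40">
      <circle cx="20" cy="20" r="15" />
    </svg>
  </body>
</html>
"""

root = ET.fromstring(document.strip())

# ElementTree reports element names as "{namespace}localname".
namespaces = {element.tag.split("}")[0].lstrip("{") for element in root.iter()}
print(sorted(namespaces))
# ['http://www.w3.org/1999/xhtml', 'http://www.w3.org/2000/svg']
```

Any generic XML tool can tell the two vocabularies apart this way, which is what makes the combination unambiguous.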

Since XHTML is really just HTML refined according to XML rules, it offers three document type definitions (DTDs) that duplicate those in HTML version 4. A DTD is a detailed description of the elements of a markup language, including when, where, and how they can be used, and any associated attributes. Later versions of XHTML introduced XML Schemas, another, more robust way of describing an XML document, which extended XHTML even further. In turn, various stripped-down versions of XHTML have been developed for specific purposes, many of which revolve around mobile computing platforms.

When choosing a DOCTYPE, you need to decide clearly which of the two standards to use: HTML or XHTML. To make that choice easier, I decided to break down the difference between HTML and XHTML.

The main difference between HTML and XHTML is that XHTML is based on XML syntax. Consequently, it is stricter, and it does not permit the liberties that HTML allows.

Now let us go through the specifics of XHTML syntax point by point:

1. Each tag must be closed

Paired tags must be closed in HTML too, but we all know that HTML has many single tags (e.g. <img>), which we could safely write without closing them.

However, in XHTML all tags must be closed, even single ones, and they are closed by ending the tag with />, as in <img />.

The only difference is the slash before the closing angle bracket.

2. All special characters should be replaced with entities

That is, you cannot write a bare "&"; you need to write this character only as an entity, that is, "&amp;". In HTML there is no such rule.
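A minimal sketch of the entity rule, using the escape helper from Python's standard xml.sax.saxutils module. The sample string is invented; the point is only that escaped text is accepted by an XML parser and reads back unchanged.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

raw_text = "Tom & Jerry <live>"
safe_text = escape(raw_text)  # replaces &, < and > with entities
print(safe_text)  # Tom &amp; Jerry &lt;live&gt;

# The escaped text parses as XML and comes back as the original string.
paragraph = ET.fromstring(f"<p>{safe_text}</p>")
print(paragraph.text == raw_text)  # True
```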

3. All attribute values must be in quotation marks.

We all know that in HTML an attribute can be written without quotes, for example width=100.

That is, the value of the "width" attribute appears without quotes. In XHTML this is unacceptable, and you must write width="100" instead.

4. All tags and attributes must be written in lower case.

To be honest, I never understood why people write tags in uppercase. In my opinion, it disfigures the code and gives the impression that it was written without ever releasing "CAPS LOCK". But while in HTML this is a matter of taste, in XHTML it is a rule: write only in lowercase.

As you can see, the differences are in the syntax. There are other minor differences as well, but we will not cover them. In other words, the only benefit of XHTML is that the document is easier to parse. XHTML also suits those who like "clean" code. There are no other advantages. All browsers display both HTML and XHTML correctly, and browsers often treat XHTML as HTML, so you will definitely not see serious differences between HTML and XHTML in how pages look.

I chose XHTML for myself, because I really like it when the code is "clean" and can easily be broken into its component parts (parsed). Besides, I am used to the strict syntax of other languages, for example Java, so I will keep writing code that is as valid as possible. What you choose is up to you, but now you know the differences between HTML and XHTML.

XHTML is written using the same syntax as HTML. That said, the difference between HTML and XHTML lies in the set of some mandatory rules.

The XHTML rules are as follows.

  1. All tags and their attributes must be typed in lowercase.
  2. The values of all attributes must be enclosed in quotation marks.
  3. All tags must be closed, even those that do not have an associated end tag.
  4. Tags must be nested correctly.
  5. You cannot use shorthand (minimized) tag attributes.
  6. Use id instead of the name attribute.
  7. The DTD (document type definition) should be declared using the <!DOCTYPE> element.

Tags must be typed in lowercase

This rule came about because XHTML is case-sensitive, so the tags <BR /> and <br /> differ. To avoid confusion, the syntax requires all tags, as well as their attributes, to be lowercase. Example 3.1 shows the wrong use of tags.

Example 3.1. Wrong spelling of tags

XHTML

Lorem ipsum dolor sit amet ...

In this example, the tags are typed in uppercase characters, which is an error. Example 3.2 shows the correct code.

Example 3.2. Correct spelling of tags

XHTML

Lorem ipsum dolor sit amet ...
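The lowercase rule can be checked mechanically. The sketch below is a hypothetical lint step, not part of any XHTML tool: the function name uppercase_names and the sample fragments are assumptions made for illustration, and the check relies on the fact that an XML parser preserves the case of names exactly as written.

```python
import xml.etree.ElementTree as ET


def uppercase_names(fragment: str) -> list:
    """List tag and attribute names containing uppercase letters (hypothetical check)."""
    offenders = []
    for element in ET.fromstring(fragment).iter():
        if element.tag != element.tag.lower():
            offenders.append(element.tag)
        for name in element.attrib:
            if name != name.lower():
                offenders.append(name)
    return offenders


print(uppercase_names('<BODY><P CLASS="intro">Lorem ipsum</P></BODY>'))
# ['BODY', 'P', 'CLASS']
print(uppercase_names('<body><p class="intro">Lorem ipsum</p></body>'))
# []
```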

Values of any attributes must be enclosed in quotation marks

Although HTML also recommends enclosing values in quotes, in many cases their absence does not affect the correctness of the code. So we can say that in HTML the use of quotation marks is only a guideline. In XHTML the use of quotes is elevated to a rule, and all attribute values must be specified in them (example 3.3).

Example 3.3. Using quotes

XHTML

Cheburashka Shapoklyak
1 5
4 13

In this example, all the attribute values of the tags are specified in quotes.

All tags are required to be closed

In HTML, tags are divided into two categories: paired tags, also called containers, and single tags. Paired tags consist of a start tag and an end tag, and in some cases the end tag can be omitted. In XHTML, the end tag is always required. Example 3.4 shows code with an error caused by the missing end tag </p>.

Example 3.4. No end tag

XHTML

and climbs, stealthily, into the plane,

and puts a bomb in his belly,

Some developers omit end tags such as </p> and </li>, but XHTML considers their absence to be an error. Example 3.5 shows the correct use of lists.

Example 3.5. Adding a list

XHTML

  • East
  • West
  • South
  • North

In this example, each start tag has its own end tag.

The <!DOCTYPE> element is not part of the XHTML document, so it does not require an end tag.

Single tags must end with a slash before the closing angle bracket, as shown in Example 3.6.

Example 3.6. Adding an image

XHTML

In this example, notice the space that precedes the /> construct.

Table 3.1 shows some HTML tags and how they are written in an XHTML document.

Proper nesting of tags must be respected

XHTML is critical of errors of the following types: incorrect nesting of one tag within another, and the location of the tag in an inappropriate container.

Correct nesting of tags

Each tag must be located inside another tag, while their "intersection" is not allowed, as shown in example 3.7.

Example 3.7. Tag position error

XHTML

Lorem ipsum dolor sit amet ...

In this example, the end tag of the outer element precedes the end tag of the inner element, although it should be the other way around, which leads to the error. If you swap the two end tags, the code becomes correct.

Although the code validator throws an error if the tags are positioned incorrectly, browsers render the web page correctly.
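While browsers reading the page as HTML recover from overlapping tags, an XML parser does not. The sketch below uses Python's standard XML parser as a stand-in for an XHTML-aware processor; the two fragments are invented for the example.

```python
import xml.etree.ElementTree as ET

overlapping = "<p><b><i>Lorem ipsum</b></i></p>"  # </b> closes before </i>
nested = "<p><b><i>Lorem ipsum</i></b></p>"

try:
    ET.fromstring(overlapping)
    error_position = None
except ET.ParseError as error:
    # ParseError.position is a (line, column) tuple.
    error_position = error.position

print(error_position is not None)        # True: the overlap is rejected
print(ET.fromstring(nested)[0][0].text)  # Lorem ipsum
```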

Hierarchy of tags

All tags form a strict hierarchy in the sense that each tag must be inside another tag and nothing else. The root element <html> sits at the notional top, and all other tags can contain further tags inside themselves, which are called child tags. Accordingly, child tags are located in the parent element.

You need to know and follow the tag subordination system when writing XHTML code. Example 3.8 shows the basic structure of a document.

Example 3.8. Document structure

new document

In this example, the <html> tag is given first, inside which the <head> and <body> tags are located. Inside the <head> section, the title of the document (<title>) and the page encoding (<meta>) are stored.

Can't use shorthand tag attributes

An attribute with no assigned value is called abbreviated. Example 3.9 shows a form using these attributes.

Example 3.9. Error using attributes

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<title>XHTML</title>
</head>
...

"Empty value", as this error is also called, is easily corrected by assigning a value to the attribute that matches the name. Table 3.2 shows some of the attributes and how they are written in HTML and XHTML.

Table 3.2. Mapping Attributes in HTML and XHTML

HTML        XHTML
checked     checked="checked"
compact     compact="compact"
disabled    disabled="disabled"
ismap       ismap="ismap"
multiple    multiple="multiple"
nohref      nohref="nohref"
noresize    noresize="noresize"
noshade     noshade="noshade"
nowrap      nowrap="nowrap"
readonly    readonly="readonly"
selected    selected="selected"

Example 3.10 shows the correct use of the above form.

Example 3.10. Correct use of attributes

XHTML
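The mapping in Table 3.2 can also be applied automatically. The following is a rough sketch, assuming the input is a single empty element such as <input>; the class name AttributeExpander is invented, and the code does not re-escape special characters inside attribute values.

```python
from html.parser import HTMLParser


class AttributeExpander(HTMLParser):
    """Rewrite empty-element start tags so minimized attributes get name="name"."""

    def __init__(self):
        super().__init__()
        self.output = []

    def handle_starttag(self, tag, attrs):
        parts = [tag]
        for name, value in attrs:
            # html.parser reports a minimized attribute with the value None.
            parts.append(f'{name}="{name if value is None else value}"')
        # Sketch limitation: every tag is emitted in empty-element form.
        self.output.append("<" + " ".join(parts) + " />")


expander = AttributeExpander()
expander.feed('<input type="checkbox" name="agree" checked disabled>')
print(expander.output[0])
# <input type="checkbox" name="agree" checked="checked" disabled="disabled" />
```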

Instead of the name attribute, you must specify id

Name attribute is defined in HTML for tags , ,