How to set up smartphones and PCs. Informational portal

Uniform Resource Identifier URI. Internet resource addressing schemes

A URI (Uniform Resource Identifier) is a character string that identifies any resource: a document, image, file, service, mailbox, and so on. This chiefly concerns resources of the Internet and the World Wide Web. A URI provides a simple and extensible way to identify resources. Extensibility means that several identification schemes already exist within the URI framework, and more will be created in the future.

Relationship between URI, URL and URN

Venn diagram showing subsets of the URI scheme: URL and URN.

A URI is a URL, a URN, or both.

  • A URL is a URI that, in addition to identifying a resource, also provides information about the location of that resource.
  • A URN is a URI that only identifies a resource in a specific namespace (and therefore in a specific context) but does not indicate its location. For instance, the URN urn:isbn:0-395-36341-1 is a URI that points to the resource (a book) 0-395-36341-1 in the ISBN namespace. Unlike a URL, the URN does not indicate where this resource is located: it does not say in which store the book can be bought or from which website it can be downloaded.

Since a URI, unlike a URL, does not always indicate how to obtain a resource but only identifies it, RDF (Resource Description Framework) can use URIs to describe resources that cannot be obtained via the Internet (for example, a person, a car, or a city).

History

In 1990, at the European Organization for Nuclear Research (CERN) in Geneva, Switzerland, the British scientist Tim Berners-Lee invented the URL, the resource locator. Since URLs are the most commonly used subset of URIs, 1990 is considered the year the URI was born. Strictly speaking, however, the concept of the URI was documented only in June 1994, in RFC 1630.

A new version of the URI was defined in 1998 in RFC 2396; at the same time, the word Universal in the name was changed to Uniform.

Flaws

The URL was a fundamental innovation for the Internet, so the principles of the URI were documented to ensure full compatibility with URLs. This is the source of a major disadvantage that URIs inherit from URLs: a URI, like a URL, can use only a limited set of Latin characters and punctuation marks (even fewer than ASCII offers). If we want to use Cyrillic characters, ideographic characters, or, say, the accented letters of French in a URI, we have to encode the URI, the same way Wikipedia encodes URLs that contain Unicode characters. For example, a string like this:

https://ru.wikipedia.org/wiki/Кириллица

is encoded in a URL as:

https://ru.wikipedia.org/wiki/%D0%9A%D0%B8%D1%80%D0%B8%D0%BB%D0%BB%D0%B8%D1%86%D0%B0
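
The encoded form above is the percent-encoding of the UTF-8 bytes of each non-ASCII character. As a minimal sketch, assuming Python's standard urllib.parse module (the article itself names no programming language), the transformation can be reproduced and reversed like this:

```python
from urllib.parse import quote, unquote

# The article title whose encoded form is shown above.
title = "Кириллица"

# quote() converts the text to UTF-8 and writes every byte outside
# the unreserved ASCII set as %XX.
encoded = quote(title)
url = "https://ru.wikipedia.org/wiki/" + encoded
print(url)

# unquote() reverses the transformation.
decoded = unquote(encoded)
print(decoded)
```

Each Cyrillic letter occupies two bytes in UTF-8, which is why every letter turns into two %XX groups.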

Since the letters of every alphabet except the Latin letters used in English undergo this transformation, URIs containing words in other languages (even European ones) are no longer readable by people. This grossly contradicts the principle of internationalization proclaimed by all the leading Internet organizations, including the W3C and ISOC. The IRI standard (Internationalized Resource Identifier) is intended to solve this problem: these are international resource identifiers in which Unicode characters can be used freely and which do not put other languages at a disadvantage. In addition, the creator of the URI, Tim Berners-Lee, has said that the domain name system underlying URLs was a bad decision, because it imposes on resources a hierarchical architecture that does not suit the hypertext web.

URI structure

URI = [ scheme ":" ] hierarchical-part [ "?" query ] [ "#" fragment ]

In this notation:

Scheme

the scheme for accessing the resource (it often names a network protocol), for example http, ftp, file, ldap, mailto, urn.

Hierarchical-part

contains data, usually organized hierarchically, which, together with the data in the non-hierarchical query component, identifies the resource within the scope of the URI scheme. The hierarchical part usually contains the path to the resource (possibly preceded by the address of the server that hosts it) or the resource identifier (in the case of a URN).

Query

an optional component that holds the non-hierarchical data mentioned above.

Fragment

(also an optional component)

Allows you to indirectly identify a secondary resource by referencing the primary resource and specifying additional information. The secondary resource can be some part or subset of the primary resource, some representation of it, or another resource defined or described by the primary resource.
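
The components described above can be inspected with a standard library parser. The following is a minimal sketch, assuming Python's urllib.parse.urlsplit; the address example.org and the path are placeholders, not taken from the article. Note that urlsplit reports the server address (netloc) and the path separately, although both belong to the hierarchical part.

```python
from urllib.parse import urlsplit

# A placeholder URI that contains every component.
parts = urlsplit("http://example.org/path/to/resource.txt?key=value#frag01")
print(parts.scheme)    # the scheme
print(parts.netloc)    # server address (start of the hierarchical part)
print(parts.path)      # path (rest of the hierarchical part)
print(parts.query)     # the optional query
print(parts.fragment)  # the optional fragment

# A URN has no server address: everything after the scheme is the path.
urn = urlsplit("urn:isbn:0-395-36341-1")
print(urn.scheme, urn.path)
```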

Parsing the structure of a URI. To parse a URI, that is, to decompose it into its component parts and identify them, it is most convenient to use regular expressions, which are available in almost all modern programming languages. RFC 3986 gives the following pattern for parsing URIs:

^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?
 12            3  4          5       6  7        8 9

This pattern contains 9 groups, numbered in the line beneath it (for more about patterns and groups, see Regular Expressions), which break a typical URI into its parts, where:

  • group 2 - scheme,
  • group 4 - authority (the source of the resource, usually the server address),
  • group 5 - path,
  • group 7 - query,
  • group 9 - fragment.

Thus, if we use this pattern to parse, for example, a typical URI like this:

http://www.ics.uci.edu/pub/ietf/uri/#Related

then the 9 groups of the pattern give the following results, respectively:

  1. http:
  2. http
  3. //www.ics.uci.edu
  4. www.ics.uci.edu
  5. /pub/ietf/uri/
  6. no result
  7. no result
  8. #Related
  9. Related
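
The group numbering can be checked by running the pattern. Below is a minimal sketch using Python's re module with the pattern from Appendix B of RFC 3986; the helper name split_uri is made up for this illustration.

```python
import re

# Pattern from RFC 3986, Appendix B.
URI_PATTERN = re.compile(
    r"^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?"
)


def split_uri(uri):
    """Return the five meaningful groups of the pattern as a dict."""
    # Every part of the pattern is optional, so it matches any string.
    match = URI_PATTERN.match(uri)
    return {
        "scheme": match.group(2),
        "authority": match.group(4),
        "path": match.group(5),
        "query": match.group(7),      # None when there is no "?"
        "fragment": match.group(9),   # None when there is no "#"
    }


all_groups = URI_PATTERN.match(
    "http://www.ics.uci.edu/pub/ietf/uri/#Related"
).groups()
print(all_groups)
print(split_uri("http://www.ics.uci.edu/pub/ietf/uri/#Related"))
```

Groups that did not take part in the match ("no result" in the list above) are reported by Python as None.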

Examples of URIs:

1) Absolute URIs

  • https://ru.wikipedia.org/wiki/URI
  • ftp://ftp.is.co.za/rfc/rfc1808.txt
  • file://C:\UserName.HostName\Projects\Wikipedia_Articles\URI.xml
  • file:///C:/file.wsdl
  • file:///Users/John/Documents/Projects/Web/MyWebsite/about.html
  • ldap:///c=GB?objectClass?one
  • mailto:[email protected]
  • sip:[email protected]
  • news:comp.infosystems.www.servers.unix
  • data:text/plain;charset=iso-8859-7,%be%be%be
  • tel:+1-816-555-1212
  • telnet://192.0.2.16:80/
  • urn:oasis:names:specification:docbook:dtd:xml:4.1.2

2) Relative URIs

  • /relative/URI/with/absolute/path/to/resource.txt
  • //example.org/scheme-relative/URI/with/absolute/path/to/resource.txt
  • relative/path/to/resource.txt
  • ../../../resource.txt
  • resource.txt
  • /resource.txt#frag01
  • #frag01

[empty string] - parsing it yields an empty result for every component, so the reference leads to the default object in the default scheme.
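
A relative URI only becomes usable once it is resolved against a base address. The following is a minimal sketch, assuming Python's urllib.parse.urljoin and a made-up base address http://example.org/a/b/c/d.html; it shows how several of the relative forms listed above are resolved.

```python
from urllib.parse import urljoin

# Hypothetical base: the address of the document that contains the links.
BASE = "http://example.org/a/b/c/d.html"

references = [
    "/resource.txt#frag01",          # absolute path on the same server
    "//example.org/other/file.txt",  # scheme-relative reference
    "relative/path/to/resource.txt",
    "../../../resource.txt",
    "resource.txt",
    "#frag01",                       # fragment within the base document
    "",                              # empty reference
]

resolved = {ref: urljoin(BASE, ref) for ref in references}
for ref, target in resolved.items():
    print(repr(ref), "->", target)
```

In this sketch the empty reference resolves to the base document itself.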

DNS service

DNS stands for Domain Name System. DNS domain names are synonyms for IP addresses, just as the names in your phone's address book are synonyms for phone numbers. They are symbolic rather than numeric, they are easier to remember and navigate by, and they carry meaning. For example:

www.irnet.ru → DNS tables → 193.232.70.36

Domain names are also unique: no two identical domain names exist in the world. Unlike IP addresses, domain names are optional and are purchased separately.

Fig. 2. Hierarchy in the DNS.

The addresses written on envelopes for delivery by regular mail are unique as well. No two countries in the world have the same name. City names are sometimes repeated, but combined with larger administrative units such as districts and regions they become unique, and street names should not be repeated within one city. An address based on geographical and administrative names therefore uniquely identifies the destination. Domains have a similar hierarchy. The parts of a domain name are separated by periods: lingvo.yandex.ru, krkime.com.

DNS has the following characteristics:

  • Distributed administration. Different people or organizations are responsible for different parts of the hierarchy.
  • Distributed storage of information. Each node of the network must store only the data that falls within its area of responsibility and (possibly) the addresses of the root DNS servers.
  • Caching of information. A node may store some data from outside its area of responsibility to reduce the load on the network.
  • Hierarchical structure, in which all nodes are combined into a tree, and each node can either manage the lower-level nodes itself or delegate (transfer) them to other nodes.
  • Redundancy. Each node (zone) is usually stored and served by several servers, separated both physically and logically, which keeps the data safe and the service running even if one of the servers fails.
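
The hierarchical structure and caching described in this list can be illustrated with a toy model. This is only a sketch under stated assumptions: the nested dictionary stands in for the real tree of DNS servers, the class ToyResolver is invented for the illustration, and the only record is the www.irnet.ru example from the text. It does not query any real DNS server.

```python
# Toy name tree: top-level domain -> second level -> host -> IP address.
TREE = {
    "ru": {
        "irnet": {
            "www": "193.232.70.36",
        },
    },
}


class ToyResolver:
    """Walks the tree from the top-level domain down and caches answers."""

    def __init__(self, tree):
        self.tree = tree
        self.cache = {}
        self.walks = 0  # how many times the tree was actually walked

    def resolve(self, name):
        key = name.rstrip(".").lower()  # domain names are case-insensitive
        if key in self.cache:
            return self.cache[key]      # answered from the cache
        self.walks += 1
        node = self.tree
        # Labels are read right to left: "ru", then "irnet", then "www".
        for label in reversed(key.split(".")):
            node = node[label]          # KeyError if the name is unknown
        if not isinstance(node, str):
            raise KeyError(name)        # the name is a zone, not a host
        self.cache[key] = node
        return node


resolver = ToyResolver(TREE)
print(resolver.resolve("www.irnet.ru"))
print(resolver.resolve("WWW.IRNET.RU"))  # second lookup is served from cache
print(resolver.walks)
```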

Domain levels. There are three levels of domains.

First-level, or top-level, domains are divided into two groups:

1) Domains with territorial affiliation, for example .ru, .by, .ua, .de, .us, and so on. These domains are assigned to a particular country, so they can be used, for example, to determine which country a site belongs to.

2) Domains with a specific purpose, for example .com for commercial organizations and .info for informational sites. These domains can be used to determine the focus of a site, although lately they are increasingly used for anything and often do not match their stated purpose. (The .tv domain, popular with television companies, is formally the country domain of Tuvalu.)

First-level domains cannot themselves be used as the address of your site. They serve as the basis for second-level domains: under any first-level domain you can register a second-level domain. A second-level domain has the form site_name.first_level_domain, for example webmastermix.ru; the site address is usually written with the conventional www prefix, as in www.webmastermix.ru. Second-level domain names are recommended for site addresses: they are the easiest for people to read and remember, and search engines handle them well. This is why most sites have domain names at this level.

There are also third-level domains, which are created under second-level domains. A third-level domain looks like this: forum.webmastermix.ru (or, with the www prefix, www.forum.webmastermix.ru). Having registered a second-level domain, you can create as many third-level domains under it as you like. You can register a domain name for your site through special registration services.
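
The levels of a domain name can be read off by splitting the name at the periods and counting from the right. A minimal sketch follows; the function name domain_levels is invented for this illustration.

```python
def domain_levels(name):
    """Return the first-, second-, third-level (and so on) domains of a name."""
    labels = name.rstrip(".").lower().split(".")
    # Level 1 is the rightmost label; each further level adds one label
    # on the left.
    return [".".join(labels[-level:]) for level in range(1, len(labels) + 1)]


for level, domain in enumerate(domain_levels("forum.webmastermix.ru"), start=1):
    print(level, domain)
```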

WEB TECHNOLOGIES: HTML, JAVASCRIPT

The first part of the didactic block of the above topic was devoted to Internet technologies. Now we are starting to study the technologies used in the World Wide Web, or web technologies.

First, you need to understand the basic concepts of web technologies: the website and the web page. A web page is the minimum logical unit of the World Wide Web: a document that is identified by a unique URL. A website is a collection of thematically related web pages located on the same server and belonging to the same owner. In the simplest case, a website can consist of a single web page. The World Wide Web is the collection of all websites.

The basis of the entire World Wide Web is the hypertext markup language HTML, Hyper Text Markup Language (Fig. 3). It serves for the logical (semantic) markup of a document (web page). It is sometimes misused to control how the content of web pages is displayed on a screen or printed, which fundamentally contradicts the ideology adopted on the World Wide Web.

Fig. 3. Web technologies

Cascading Style Sheets (CSS) are intended to control how the content of web pages is displayed. CSS is similar in many ways to the styles used in the popular word processor Word.

Scripting languages are used to make web pages dynamic (drop-down menus, animation). The standard scripting language on the World Wide Web is JavaScript. The core of the JavaScript language is ECMAScript.

HTML, CSS, and JavaScript are the languages with which a website of any complexity can be created. But they are only the linguistic support: inside a browser, a document is represented as a collection of objects, which together form the browser object model (BOM). The browser object model is specific to each browser, so problems arise when building cross-browser applications. For this reason the World Wide Web Consortium proposed the Document Object Model (DOM), a standard way of representing web pages as a set of objects.

The syntax of modern HTML is described using the Extensible Markup Language, XML. XML lets you create your own markup languages similar to HTML, defined in the form of DTDs. There are many such languages: for representing mathematical and chemical formulas, knowledge, and so on.

As you can see from the above, all web technologies are closely interconnected. Understanding this fact will make it easier to understand the purpose of a particular mechanism used to create web applications.

EMAIL

Electronic mail (email, e-mail) is a technology, together with the services it provides, for sending and receiving electronic messages (called "letters" or "emails") over a distributed computer network. Its main differences from other messaging systems are the possibility of delayed delivery and a well-developed system of interaction between independent mail servers.

E-mail makes it possible to send and receive messages, reply to correspondents automatically using their addresses, send copies of a letter to several recipients at once, forward a received letter to another address, use logical names instead of addresses (numeric or domain names), create several subsections of a mailbox for different kinds of correspondence, include text files in letters, use mail reflectors to hold discussions with a group of correspondents, and so on. To send a message by e-mail, you must specify the address of the recipient's mailbox. An e-mail subscriber's mailbox is an area on the hard disk of a mail server reserved for that user.

The development of Internet technology has led to modern messaging protocols that offer rich capabilities for processing letters, a variety of services, and ease of use. For example, the SMTP protocol, which works on the client-server principle, is designed to send messages from a computer to the addressee. Historically, access to SMTP servers was often not password protected, so any known server on the network could be used to send emails. Unlike servers for sending letters, servers that store messages are password protected, so you must use the server or service on which you have an account. These servers use the POP and IMAP protocols, which differ in the way they store messages.

Under the POP3 protocol, messages arriving at a specific address are stored on the server until they are downloaded to the computer during the next session. After downloading the messages, you can disconnect from the network and read the mail offline. In this respect POP3 mail is the quickest and most convenient to use.

The IMAP protocol is convenient for people who have a permanent connection to the network. Messages arriving at the address are also stored on the server, but, unlike POP3, only the message headers are downloaded when the mail is checked. The letter itself can be read after selecting its header (it is then downloaded from the server). With a dial-up connection, working with mail over this protocol clearly wastes time.

There are several protocols for receiving and transferring mail between multi-user systems.

A short description of some of them:

1) SMTP (Simple Mail Transfer Protocol) is a network protocol designed for transmitting e-mail in TCP/IP networks; the transmission must be initiated by the sending system itself.

The MTA (Mail Transfer Agent) is the main component of the Internet mail transfer system; it represents a given networked computer to the e-mail system. Users typically do not work with the MTA directly but with an MUA (Mail User Agent), that is, an email client. The principle of interaction is shown schematically in the figure.

2) POP, POP2, POP3 (Post Office Protocol) - three fairly simple, non-interchangeable protocols developed for delivering mail to a user from a central mail server, deleting it from that server, and identifying the user by name and password. POP is used together with SMTP, which transfers mail from the user. Mail messages can be received as headers only, without retrieving the entire message.

After the connection is established, a POP3 session goes through three consecutive states:

      1. Authorization: the client goes through the authentication procedure.
      2. Transaction: the client receives information about the state of the mailbox, retrieves mail, and marks mail for deletion.
      3. Update: the server deletes the marked emails and closes the connection.
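
The three states can be modeled as a small state machine. The sketch below is a toy model only: the class ToyPop3Session, its method names, and the rule that any non-empty name and password are accepted are all assumptions made for the illustration, and no network connection is involved.

```python
from enum import Enum


class Pop3State(Enum):
    AUTHORIZATION = 1
    TRANSACTION = 2
    UPDATE = 3


class ToyPop3Session:
    """Toy model of the three POP3 states; it does no networking."""

    def __init__(self, mailbox):
        self.state = Pop3State.AUTHORIZATION
        # Messages are numbered from 1 within a session.
        self.mailbox = dict(enumerate(mailbox, start=1))
        self.marked = set()

    def login(self, user, password):
        # Hypothetical check: any non-empty name/password pair is accepted.
        if self.state is not Pop3State.AUTHORIZATION:
            raise RuntimeError("login is valid only in the authorization state")
        if not (user and password):
            return False
        self.state = Pop3State.TRANSACTION
        return True

    def _require_transaction(self):
        if self.state is not Pop3State.TRANSACTION:
            raise RuntimeError("command is valid only in the transaction state")

    def stat(self):
        """Return (message count, total size), ignoring marked messages."""
        self._require_transaction()
        live = [m for n, m in self.mailbox.items() if n not in self.marked]
        return len(live), sum(len(m) for m in live)

    def retrieve(self, number):
        self._require_transaction()
        return self.mailbox[number]

    def delete(self, number):
        """Only mark the message; it is removed in the update state."""
        self._require_transaction()
        if number not in self.mailbox:
            raise KeyError(number)
        self.marked.add(number)

    def quit(self):
        """Enter the update state and remove the marked messages."""
        self._require_transaction()
        self.state = Pop3State.UPDATE
        for number in self.marked:
            del self.mailbox[number]
        self.marked.clear()
        return sorted(self.mailbox)


session = ToyPop3Session(["first letter", "second letter", "third letter"])
session.login("user", "secret")
print(session.stat())
session.delete(2)
print(session.quit())
```

Deletion is deliberately split in two steps here, mirroring the description above: messages are only marked during the transaction state and actually removed in the update state.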

3) IMAP2, IMAP2bis, IMAP3, IMAP4, IMAP4rev1 (Internet Message Access Protocol) - gives the user rich capabilities for working with mailboxes located on a central server.

o IMAP stores mail on the server in file directories and also lets the client search for strings in mail messages on the server itself.

o IMAP2 - used in rare cases.

o IMAP3 - an incompatible solution, not used.

o IMAP2bis - an extension of IMAP2 that allows servers to parse messages into their MIME (Multipurpose Internet Mail Extensions) structure; still in use.

o IMAP4 - a reworked and enhanced IMAP2bis that can be used anywhere.

o IMAP4rev1 - extends IMAP with a wide range of features, including those used by DMSP (Distributed Mail System for Personal Computers).

4) ACAP (Application Configuration Access Protocol) - a protocol developed to work with IMAP4; it adds search and subscription capabilities for bulletin boards and mailboxes and is used to look up address books.

5) DMSP (or PCMAIL) is a protocol for receiving and sending mail whose distinguishing feature is that a user can work from more than one workstation. The workstation holds status information about the mail and the directory through which the exchange takes place; when the workstation connects to the mail server, this information is updated to the current state on the server.

6) MIME is a standard that defines mechanisms for sending all kinds of information by e-mail, including text in languages other than English, for which character encodings other than ASCII are used, as well as 8-bit binary content such as pictures, music, films, and programs.
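
To illustrate what MIME adds, the following minimal sketch builds a message that contains non-English text and binary content. It assumes Python's standard email package; the addresses, the subject, the file name picture.bin, and the attachment bytes are placeholders invented for the example, and nothing is actually sent.

```python
from email.message import EmailMessage

msg = EmailMessage()
msg["From"] = "sender@example.org"       # placeholder address
msg["To"] = "recipient@example.org"      # placeholder address
msg["Subject"] = "Привет"                # non-ASCII header text

# Non-English body text: stored as text/plain with a UTF-8 charset.
msg.set_content("Текст письма на русском языке.")

# Arbitrary binary content: attached as a separate MIME part.
msg.add_attachment(
    b"\x89PNG\r\n",
    maintype="application",
    subtype="octet-stream",
    filename="picture.bin",
)

text_part, binary_part = msg.iter_parts()
print(msg.get_content_type())
print(text_part.get_content_type(), text_part.get_content_charset())
print(binary_part.get_content_type(), binary_part.get_filename())
```

Adding the attachment turns the message into a multipart container whose parts each carry their own content type, which is how a single letter can hold both text and binary data.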

9.1. Independent work:

Complete the example given in the text (handout). Save it to your own folder on the desktop.

9.2. Working with a teacher:

If you run into difficulties or make mistakes, contact your teacher to correct them.

By the end of the lesson, show the teacher a report on the work performed and get credit for it.

9.3. Control of the initial and final level of knowledge:

Testing on a computer.




First of all, this concerns the resources of the Internet and the World Wide Web. A URI provides a simple and extensible way to identify resources. Extensibility means that several identification schemes already exist within the URI framework, and more will be created in the future.
For more details, see "URI structure" below.

The best-known examples of URIs are URLs and URNs. A URL is a URI that, in addition to identifying a resource, also provides information about the location of that resource. A URN is a URI that identifies a resource in a specific namespace (and thus in a specific context). For example, the URN urn:ISBN:0-395-36341-1 is a URI that points to the resource (a book) 0-395-36341-1 in the ISBN namespace, but unlike a URL, the URN does not point to the location of that resource. Recently, however, there has been a tendency to call any identifier string simply a URI, without further qualification, so the terms URL and URN may soon become a thing of the past.

History

A new version of the URI was defined in 1998 in RFC 2396; at the same time, the word Universal in the name was changed to Uniform. In December 1999, RFC 2732 made minor changes to the URI specification to ensure compatibility with IPv6. In August 2002, RFC 3305 announced the deprecation of the term URL in favor of URI. The current structure and syntax of URIs are governed by RFC 3986, released in January 2005. Many of the newest Semantic Web technologies (for example, RDF) are based on the URI standard. The leading role in the development of the URI now belongs to the World Wide Web Consortium.

Flaws

The URL was a fundamental innovation for the Internet, so the principles of the URI were documented to ensure full compatibility with URLs. This is the source of a major disadvantage that URIs inherit from URLs: a URI, like a URL, can use only a limited set of Latin characters and punctuation marks (even fewer than ASCII offers). If we want to use Cyrillic characters, ideographic characters, or, say, the accented letters of French, we have to encode the URI the same way Wikipedia encodes URLs that contain Unicode characters. For example, a string like this:

http://ru.wikipedia.org/wiki/Микрокредит

is encoded in a URL as:

http://ru.wikipedia.org/wiki/%D0%9C%D0%B8%D0%BA%D1%80%D0%BE%D0%BA%D1%80%D0%B5%D0%B4%D0%B8%D1%82

Since the letters of every alphabet except the Latin alphabet used in English undergo this transformation, URIs containing words in other languages (even European ones) are no longer readable by people. This grossly contradicts the principle of internationalization proclaimed by all the leading Internet organizations, including the W3C. The IRI (Internationalized Resource Identifier) is intended to address this: these are international resource identifiers in which Unicode characters can be used freely and which do not put other languages at a disadvantage. It is hard to say in advance, though, whether such identifiers will ever achieve this. This format seeks to create identifiers that are completely independent of context, that is, that depend neither on the protocol, nor on the domain, nor on the path, nor on the application, nor on the platform.

Likewise, the creator of the URI, Tim Berners-Lee, has said that the domain name system underlying URLs was a bad decision, because it imposes on resources a hierarchical architecture that does not suit the hypertext web.

URI structure

Parsing the URI structure

To parse a URI, that is, to decompose it into its component parts and identify them, it is most convenient to use regular expressions, which are available in almost all modern programming languages. The following pattern is recommended for parsing URIs:

^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?
 12            3  4          5       6  7        8 9

This pattern contains 9 groups, numbered in the line beneath it (for more on patterns and groups, see Regular Expressions), which break a typical URI into its parts, where:

  • group 2 - scheme,
  • group 4 - authority (the source of the resource, usually the server address),
  • group 5 - path,
  • group 7 - query,
  • group 9 - fragment.

Thus, if we use this pattern to parse, for example, a typical URI like this:

http://www.ics.uci.edu/pub/ietf/uri/#Related

then the 9 groups of the pattern give the following results, respectively:

  1. http:
  2. http
  3. //www.ics.uci.edu
  4. www.ics.uci.edu
  5. /pub/ietf/uri/
  6. no result
  7. no result
  8. #Related
  9. Related

Distinguishing a URI from a URL

The URI does not always indicate how to get the resource, unlike the URL, but only identifies it. This makes it possible to describe using RDF (Resource Description Framework) resources that cannot be obtained via the Internet (for example, person, car, city, etc.).

URI examples

Absolute URIs

  • http://ru.wikipedia.org/wiki/URI
  • ftp://ftp.is.co.za/rfc/rfc1808.txt
  • file://C:\UserName.HostName\Projects\Wikipedia_Articles\URI.xml
  • ldap:///c=GB?objectClass?one
  • mailto:[email protected]
  • sip:[email protected]
  • news:comp.infosystems.www.servers.unix
  • data:text/plain;charset=iso-8859-7,%be%fg%be
  • tel:+1-816-555-1212
  • telnet://192.0.2.16:80/
  • urn:oasis:names:specification:docbook:dtd:xml:4.1.2

URI references

  • /relative/URI/with/absolute/path/to/resource.txt
  • relative/path/to/resource.txt
  • ../../../resource.txt
  • resource.txt
  • /resource.txt#frag01
  • #frag01
  • [empty string]

Links

  • RFC 3986 / STD 66 (2005)
  • RFC 2396 (1998) - Obsolete syntax

Wikimedia Foundation. 2010.

The Android platform is extremely fragmented, because Google leaves it to device manufacturers to port the OS themselves while maintaining backward compatibility and supporting many devices. As a consequence, long if-else chains are often needed to make sure the best method is used in each context.

The situation is exactly the same with direct links in Android. Over time, a myriad of technical requirements have emerged that must be met depending on the circumstances and the user's context. Branch's solution brings all of these implementations together: it is a linking framework that works in all edge cases. Branch links let you sidestep the complexity and use a standard solution, so you do not have to worry about compatibility. We strongly recommend using our solutions rather than recreating similar functionality from scratch, as we provide them for free.

This series of publications describes the various direct link mechanisms we use and explains how they are implemented.

Android URI scheme and intent filter

Android 1.0 introduced a direct-linking mechanism based on URI schemes. With it, a developer can register the application for a URI (uniform resource identifier) scheme in the operating system of a specific device once the app is installed. Any text string without special characters can serve as the scheme, such as http, pinterest, fb, or myapp. After registration, a link made by appending "://" to the scheme (for example, pinterest://) opens the Pinterest app when clicked. If the Pinterest app is not installed, a "Page not found" error appears.

Requirements for using URI schemes in Android

  • Register an Activity to respond to the URI scheme by declaring an intent filter in the manifest.
  • The app must be installed for the link to work. If the application is not installed, an error message appears.

Setting up a URI scheme in Android

Configuring your application for a URI scheme is easy. First, choose the Activity in your application that should open when the URI scheme is triggered, and register an intent filter for it. The intent filter is added inside the tag in the manifest that corresponds to the Activity to be opened.

You can change your_uri_scheme to the URI scheme you want. Ideally the scheme should be unique. If it matches the URI scheme of another application, the user will see the Android app chooser dialog when clicking the link. You will often see this dialog if you have several web browsers installed on your device, since they are all registered for the HTTP scheme.

Handling direct links in an Android app

You will then need to parse the string to read the values appended to the URI scheme.
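
The parsing step itself does not depend on Android. As a language-neutral illustration, here is a minimal sketch in Python (not Android code): the function parse_direct_link and the link myapp://product/42?ref=email&lang=en are invented for the example, with only the scheme name myapp taken from the text above.

```python
from urllib.parse import urlsplit, parse_qs


def parse_direct_link(link):
    """Split a custom-scheme link into the pieces an app would act on."""
    parts = urlsplit(link)
    return {
        "scheme": parts.scheme,           # which app the link is meant for
        "host": parts.netloc,             # first segment after "://"
        "path": parts.path,               # remaining path
        "params": parse_qs(parts.query),  # values appended after "?"
    }


info = parse_direct_link("myapp://product/42?ref=email&lang=en")
print(info)
```

Each parameter value is returned as a list, because the same key may appear more than once in a query string.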

Using URI Schemes in Android in Practice

URI schemes have significant limitations as a way of handling direct links. We do not recommend using them without substantial additions, because if the application is not on the device, an error message is simply displayed. To use a URI scheme effectively, you need extra tooling to handle edge cases, such as the application not being installed.

Therefore, to provide an acceptable user experience when the application is not installed, you need to wrap the URI scheme in client-side JavaScript that runs in a browser. This JS code is hosted on your server, and you send users a link to it.

Such code tries to open the app by setting the source of an iFrame to the URI scheme and then falls back to the Google Play store if the app fails to load.

Conclusion

Stay tuned for further posts on Android direct links.

Direct links in Android are very complex, and edge cases turn up at every step. You may think that everything works perfectly until some user complains that links from Facebook do not open on Android 4.4.4. That is why it is worth using tools like Branch: you can forget all these difficulties like a bad dream and get used to links that simply always work.

A URI (Uniform Resource Identifier) is a compact string of characters that identifies an abstract or physical resource. A resource is understood to be any object that belongs to a certain space. The developers of the WWW recognized the need for URIs from the very inception of the system, since it was meant to unite, in a single information environment, tools that identify information resources in different ways. A specification was developed that covered access to FTP, Gopher, WAIS, Usenet, e-mail, Prospero, Telnet, X.500, and of course HTTP (WWW). The result was a universal specification that allows the list of addressable resources to grow as new schemes emerge.

URIs are used in hypertext links, which are written in tags and ... Embedded graphics are also addressed by URI, in tags and ... The implementation of a URI for the WWW is called a URL (Uniform Resource Locator). More precisely, a URL is an implementation of a URI scheme mapped to an algorithm for accessing resources over network protocols. There is also the URN (Uniform Resource Name), which maps a URI to a namespace on the network.

The emergence of URNs stems from the desire to address MIME parts of a mail message.

Principles of constructing a WWW address. The URI was based on the following principles:

· Extensibility - New addressing schemes should easily fit into existing URI syntax.

· Completeness - whenever possible, any of the existing schemes should be described using a URI.

· Readability - the address had to be easily readable by the user, which is generally typical for WWW technology - documents, along with links, can be developed in a regular text editor.

Before looking at the various address representation schemes, here's an example of a simple URI:

http://polyn.net.kiae.su/polyn/index.html

The address scheme identifier, "http", comes before the colon. The colon separates this name from the remainder of the URI, which is called the path. Here the path consists of the domain address of the machine running the HTTP server and the path from the root of the server tree to the file "index.html". Besides this full form of a URI there is a simplified form. It assumes that by the time it is used, many parameters of the resource address (protocol, network address of the machine, some path elements) have already been defined. Under that assumption, the author of hypertext pages can give only the relative address of the resource, that is, an address relative to some base resource.
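The split into scheme and path, and the resolution of a relative address against a base, can be sketched with Python's standard urllib.parse module. This is only an illustration: the relative references "manifest.html" and "../altai/index.html" are chosen for the example and are assumptions, not part of the text above.

```python
from urllib.parse import urlsplit, urljoin

base = "http://polyn.net.kiae.su/polyn/index.html"

# Split the full URI into its scheme and the parts of the path.
parts = urlsplit(base)
scheme = parts.scheme      # the identifier before the colon
machine = parts.netloc     # domain address of the machine
file_path = parts.path     # path from the root of the server tree


def resolve(relative: str, base_uri: str = base) -> str:
    """Resolve a relative address against an already known base URI."""
    return urljoin(base_uri, relative)


print(scheme, machine, file_path)
print(resolve("manifest.html"))        # same directory as the base document
print(resolve("../altai/index.html"))  # one directory level up
```

The relative form carries only what differs from the base; the scheme and machine address are inherited from it.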

A URL (Uniform Resource Locator) is a subset of URI schemes that identifies a resource by how it is accessed (for example, its "location on the web") rather than identifying it by name or other attributes of that resource. The URL explicitly describes how to get to the object.

Syntax: <scheme>:<scheme-specific-part>, where:

scheme = "http" | "ftp" | "gopher" | "mailto" | "news" | "telnet" | "file" | "man" | "info" | "whatis" | "ldap" | "wais" | ... - the scheme name;

scheme-specific-part - depends on the scheme. Within the scheme-specific-part, characters can be written as hexadecimal values in the form %5f. Non-printable octets (00-1F, 7F, 80-FF) must be encoded this way.
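The hexadecimal encoding can be tried out with quote and unquote from Python's standard urllib.parse module. This is a minimal sketch; the file names used here are made up for illustration.

```python
from urllib.parse import quote, unquote

# Decoding: "%5f" is the hexadecimal code of the underscore character.
decoded = unquote("index%5fpage.html")

# Encoding: a space and a control character (octet 1F) are not allowed as-is.
encoded_space = quote("volume 4.html")
encoded_control = quote("\x1f")

# Octets in the range 80-FF: a non-ASCII character is first converted to
# UTF-8 bytes, and each byte is then written as %XX.
encoded_non_ascii = quote("é")

print(decoded)            # index_page.html
print(encoded_space)      # volume%204.html
print(encoded_control)    # %1F
print(encoded_non_ascii)  # %C3%A9
```

Hexadecimal digits in the encoding are case-insensitive, so "%5f" and "%5F" denote the same octet.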

Examples of URLs:

http://www.ipm.kstu.ru/index.php

ftp://www.ipm.kstu.ru/

A URN (Uniform Resource Name) is a URI in the special "urn:" scheme with a "namespace" component. The name must remain unique and unchanged even when the resource no longer exists or is inaccessible.

It is assumed that, for example, the browser knows where to look for this resource.

Syntax: urn:namespace:data1.data2,more-data, where namespace defines how the data after the second ":" is used.

URN example:

urn:ISBN:0-395-36341-6

ISBN - the namespace of International Standard Book Numbers used by publishers,

0-395-36341-6 - the number of a specific book within that namespace.

On receiving the URN, the client program would consult a directory service for the ISBN namespace and obtain the description of the book registered under the number "0-395-36341-6". URNs were adopted relatively recently, are not supported in current versions of HTML, and the directory services are not yet mature, so URNs are not as widespread as URLs.
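Splitting a URN into its namespace and the namespace-specific data takes only a few lines. The helper below, split_urn, is a hypothetical name introduced for this sketch; it only separates the parts and does not contact any directory service.

```python
from typing import Tuple


def split_urn(urn: str) -> Tuple[str, str]:
    """Return (namespace, data) for a string of the form urn:namespace:data."""
    pieces = urn.split(":", 2)  # split on the first two colons only
    if len(pieces) != 3 or pieces[0].lower() != "urn":
        raise ValueError(f"not a URN: {urn!r}")
    _, namespace, data = pieces
    return namespace, data


namespace, data = split_urn("urn:ISBN:0-395-36341-6")
print(namespace)  # ISBN
print(data)       # 0-395-36341-6
```

Everything after the second colon is left untouched, since its interpretation belongs to the namespace.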

Internet resource addressing schemes

Three schemes for addressing Internet resources are considered here. A scheme specifies its identifier, the machine address, the TCP port, the path in the server directory, variables and their values, and a label.

HTTP scheme. This is the basic scheme for the WWW. It contains the scheme identifier, machine address, TCP port, path in the server directory, search criterion and label.

Syntax: http://[<user>[:<password>]@]<host>[:<port>][/<url-path>][?<query>]

http - scheme name

user - user name

password - user password

host - host name

port - port number

url-path - path to the file, including the file name

query (<field-name>=<value>{&<field-name>=<value>}) - query string

By default, port = 80.

Here are some examples of URIs for the HTTP scheme:

http://polyn.net.kiae.su/polyn/manifest.html

This is the most common type of URI used in WWW documents. The scheme name (http) is followed by a path consisting of the domain address of the machine and the full path of the HTML document in the HTTP server tree.

The IP address can also be used as the machine address:

http://144.206.160.40/risk/risk.html

If the HTTP server runs on a TCP port other than 80, this is reflected in the address:

http://144.206.130.137:8080/altai/index.html

A label inside the document is given after the "#" character:

http://polyn.net.kiae.su/altai/volume4.html#first
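The components of an HTTP address, including the default port, can be extracted with Python's standard urllib.parse module. In this sketch, describe_http_url is a hypothetical helper name, and the query string "?name=altai&page=2" is an invented example rather than an address from the text.

```python
from urllib.parse import urlsplit, parse_qs

DEFAULT_HTTP_PORT = 80  # used when the address gives no port


def describe_http_url(url: str) -> dict:
    """Break an http URL into host, port, path, query fields and label."""
    parts = urlsplit(url)
    if parts.scheme != "http":
        raise ValueError(f"not an http URL: {url!r}")
    return {
        "host": parts.hostname,
        "port": parts.port if parts.port is not None else DEFAULT_HTTP_PORT,
        "path": parts.path,
        "query": parse_qs(parts.query),  # field name -> list of values
        "label": parts.fragment,         # the part after "#"
    }


print(describe_http_url("http://144.206.130.137:8080/altai/index.html"))
print(describe_http_url("http://polyn.net.kiae.su/altai/volume4.html#first"))
print(describe_http_url("http://polyn.net.kiae.su/search?name=altai&page=2"))
```

The label is handled by the client only: it tells the browser which place in the received document to show.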

FTP scheme. This scheme allows FTP file archives to be addressed from World Wide Web client programs; the program must support the FTP protocol. Besides the scheme name and the address of the FTP archive, the scheme can also specify the user ID and even the password.

Syntax: ftp://[<user>[:<password>]@]<host>[:<port>]/<url-path>

ftp - scheme name

user - user name

password - user password

host - host name

port - port number

url-path - path to the file, including the file name

By default, port = 21, user = anonymous, password = email address.

Most often, this scheme is used to access public FTP archives:

ftp://polyn.net.kiae.su/pub/0index.txt

This is a link to the archive "polyn.net.kiae.su" with the user ID "anonymous" or "ftp" (anonymous access). If the user ID and password need to be specified, they are placed before the machine address:

ftp://nobody:[email protected]/users/local/pub

In this case, these parameters are separated from the machine address by the @ symbol, and from each other by a colon.
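How a client might apply the FTP defaults can be sketched as follows. The helper name ftp_login_params, the host "ftp.example.org", the password "secret" and the e-mail address "guest@example.org" are all assumptions made for this illustration.

```python
from urllib.parse import urlsplit


def ftp_login_params(url: str, email: str = "guest@example.org") -> dict:
    """Extract FTP connection parameters, filling in the scheme defaults."""
    parts = urlsplit(url)
    if parts.scheme != "ftp":
        raise ValueError(f"not an ftp URL: {url!r}")
    return {
        "user": parts.username or "anonymous",  # default user
        "password": parts.password or email,    # default: e-mail address
        "host": parts.hostname,
        "port": parts.port if parts.port is not None else 21,
        "path": parts.path,
    }


# Anonymous access to a public archive: only host and path are given.
print(ftp_login_params("ftp://polyn.net.kiae.su/pub/0index.txt"))

# Explicit user ID and password, placed before the machine address.
print(ftp_login_params("ftp://nobody:secret@ftp.example.org/users/local/pub"))
```

A password written into an address is visible to anyone who sees the link, so this form suits only credentials that are public anyway.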

TELNET scheme. This scheme is used to access a resource in remote terminal mode. The client usually launches an additional program to work over the telnet protocol. When using this scheme, a user ID must be specified; a password is optional.

Syntax: telnet://[<user>[:<password>]@]<host>[:<port>]/

telnet - scheme name

user - user name

password - user password

host - host name

port - port number

By default, port = 23.

Example: telnet://name:[email protected]

In practice, this scheme is used to access public resources whose user ID and password are generally known; they can be found, for example, in the Hytelnet databases.

telnet://guest:[email protected]

As the examples above show, the URI specification for resource addresses is fairly general and can identify virtually any resource on the Internet. Moreover, the range of addressable resources can be extended by creating new schemes.
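One way to picture this extensibility is a table of schemes and their default ports that a client consults. The sketch below uses the three defaults given above; the function name effective_port, the added gopher entry (port 70) and the host names are assumptions introduced for illustration.

```python
from urllib.parse import urlsplit

# Default TCP ports of the three schemes described above.
DEFAULT_PORTS = {"http": 80, "ftp": 21, "telnet": 23}


def effective_port(url: str, defaults: dict = DEFAULT_PORTS) -> int:
    """Return the port a client would connect to for the given address."""
    parts = urlsplit(url)
    if parts.scheme not in defaults:
        raise ValueError(f"unknown scheme: {parts.scheme!r}")
    if parts.port is not None:
        return parts.port          # a port written in the address wins
    return defaults[parts.scheme]  # otherwise use the scheme default


print(effective_port("http://polyn.net.kiae.su/polyn/manifest.html"))
print(effective_port("telnet://guest@host.example.org/"))

# Supporting a new scheme amounts to adding an entry to the table.
extended = dict(DEFAULT_PORTS, gopher=70)
print(effective_port("gopher://gopher.example.org/", extended))
```

The address syntax stays the same for every scheme; only the table grows.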

WWW service

The WWW (World Wide Web) service is intended for the exchange of hypertext information and is built on the client-server model. The browser (Internet Explorer, Opera ...) is a multi-protocol client and an HTML interpreter. Like a typical interpreter, the client performs different functions depending on the commands (tags). These functions include not only placing text on the screen but also exchanging information with the server as the received HTML text is parsed. This is most visible when graphic images embedded in the text are displayed.

The HTTP server (Apache, IIS ...) handles client requests for files. Initially, the WWW service was based on three standards:

· HTML (HyperText Markup Language) - a language for hypertext markup of documents;

· URL (Uniform Resource Locator) - a universal way of addressing resources on the network;

· HTTP (HyperText Transfer Protocol) - a protocol for the exchange of hypertext information.

WWW server operation scheme

A WWW server is the part of a global or internal network that gives network users access to the hypertext documents stored on it. To interact with a WWW server, the network user needs specialized software: a browser, that is, a viewer.

Let's take a closer look at the WWW-server operation scheme:

1. The network user launches a browser, whose functions include:

· establishing a connection with the server;

· obtaining the required document;

· displaying the received document;

· responding to user actions, such as a request for a new document.

Once started, the browser connects to the specified WWW server, either at the user's command or automatically, and sends it a request for the specified document.

2. The WWW server searches for the requested document and returns the results to the browser.

3. The browser, having received the document, displays it to the user and waits for the user's response. Possible options:

· entering the address of a new document;

· printing, searching or other operations on the current document;

· activating (clicking) special areas of the received document, called links, each associated with the address of a new document.

In the first and third cases, a new document is requested.
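The request and response cycle described above can be reproduced on one machine with Python's standard library. This is a rough sketch rather than a real browser or server: the document table DOCUMENTS, the handler class and the fetch function are hypothetical names, and the server listens only on the local address 127.0.0.1.

```python
import threading
from http.client import HTTPConnection
from http.server import BaseHTTPRequestHandler, HTTPServer

# The documents "stored" on the server (invented for this example).
DOCUMENTS = {
    "/index.html": b"<html><body><a href='/next.html'>next</a></body></html>",
    "/next.html": b"<html><body>second document</body></html>",
}


class DocumentHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Step 2: the server searches for the requested document.
        body = DOCUMENTS.get(self.path)
        if body is None:
            self.send_error(404, "Document not found")
            return
        self.send_response(200)
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the example quiet


def fetch(path: str):
    """Play the client: connect, request a document, return (status, body)."""
    server = HTTPServer(("127.0.0.1", 0), DocumentHandler)  # port 0: any free port
    port = server.server_address[1]
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        connection = HTTPConnection("127.0.0.1", port, timeout=5)  # step 1
        connection.request("GET", path)
        response = connection.getresponse()
        result = (response.status, response.read())
        connection.close()
        return result
    finally:
        server.shutdown()
        server.server_close()


status, document = fetch("/index.html")
print(status, document.decode())  # step 3: a browser would render the document
```

Following the link in the received document simply repeats the cycle with the path "/next.html".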
