
A variant of the logical data model. Physical and logical database models

LECTURE

Logical data models.

Hierarchical, network, and relational data models.

Construction principles.

Advantages and disadvantages

In the course of the development of database systems theory, the term "data model" has carried different meanings. For a deeper understanding of the individual concepts, let us consider some features of how this notion has been used as databases evolved.

11.1. About the concept of "data model"

Initially, the concept of a data model was used as a synonym for the data structure of a particular database. This structural interpretation fully matched the mathematical definition of a model as a set with relations defined on it. Note, however, that the object being modeled in this case is not data in general but a specific database. New architectural approaches based on the idea of a multi-tier DBMS architecture showed that it is no longer enough to consider mappings between views of a specific database. A solution was required at the meta-level, one that would allow operating with the set of all admissible database representations within a given DBMS or, equivalently, with the tools used to specify them. Hence the need for a term that denotes a tool rather than the result of modeling and thus corresponds to a whole class of databases. That is, a database modeling tool should include not only data structuring facilities but also data manipulation facilities. A data model in the instrumental sense therefore came to be understood as an algebraic system: a set of all possible admissible data types together with the relations and operations defined on them. Later this concept also came to include the integrity constraints that can be imposed on the data. As a result, the problem of mapping data in multi-level DBMSs and distributed database systems came to be seen as a problem of mapping between data models.

It is important to emphasize that for developers and users of a DBMS, what actually defines the data model implemented in it is the language facilities for defining and manipulating data. It is therefore inappropriate to identify such a language with a database schema (the result of modeling), that is, with a specific specification written in that language.

Since the mid-1970s, under the influence of the concept of abstract data types proposed at that time, the very notion of a data type in programming languages began to change so that it covered not only structural properties but also elements of behavior (data changes). Later this served as the basis for the concept of the object, on which modern object models rest.

In this regard, a new approach was proposed in which the data model is treated as a type system. This approach provided natural opportunities for integrating databases and programming languages and contributed to the emergence of the field of database programming systems. Treating a data model as a type system fits not only the existing widely used models but also the increasingly influential object models.

Thus, a data model is a model of the logical level of database design. It can be seen as a combination of three components (slide 2):

1. Structural component, i.e. a set of rules by which a database can be built.

2. A control component that determines the types of allowed operations with data (this includes operations for updating and retrieving data, as well as operations for changing the database structure).

3. Support for a set of (optional) data integrity constraints to ensure that the data used is correct.

From the point of view of the structural component, record-based models are distinguished. In a record-based model, the data structure is a set of several types of fixed-format records. Each record type defines a fixed number of fields, each of which has a fixed length.
There are three main types of record-based logical data models (slide 3):
- the relational data model;
- the network data model;
- the hierarchical data model.
The hierarchical and network data models were created almost ten years before the relational data model, so their relationship to the concepts of traditional file handling is more obvious.

11.2. Relational data model

The relational data model is based on the concept of a mathematical relation. In the relational model, data and relationships are represented as tables, each with several uniquely named columns. The slide (slide 4) shows an example of a relational schema containing information about the departments of a university and its staff. For example, from the table "Personnel" it can be seen that the employee Ivanov I.I. works as head of department 22, which, according to the table "Structure", is located in building A, room 322. It is important to note that between the relations "Personnel" and "Structure" there is the following connection: an employee works at a department. However, there is no explicit link between these two relations: its existence can be noticed only if one knows that the attribute Caf in the relation "Personnel" corresponds to the attribute Caf in the relation "Structure".
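Since the link between "Personnel" and "Structure" exists only through the equality of their Caf values, the connection can be recovered by a join. A minimal sketch in Python, with illustrative column names and sample rows that are assumptions rather than the lecture's schema:

```python
# A minimal sketch: the "Personnel" and "Structure" relations as lists of
# rows, joined on the shared Caf attribute.
personnel = [
    {"name": "Ivanov I.I.", "position": "head of department", "caf": 22},
]
structure = [
    {"caf": 22, "building": "A", "room": 322},
]

def join_on_caf(personnel, structure):
    """Relational join: pair every employee with the department row
    whose Caf value matches the employee's Caf value."""
    for emp in personnel:
        for dept in structure:
            if emp["caf"] == dept["caf"]:
                yield {**emp, **dept}

for row in join_on_caf(personnel, structure):
    print(row)   # Ivanov I.I., department 22, building A, room 322
```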

It should be noted that in the relational data model the only requirement is that the database look like a collection of tables from the user's point of view. This perception, however, applies only to the logical structure of the database, i.e. to the external and conceptual levels of the ANSI/SPARC architecture. It does not apply to the physical structure of the database, which can be implemented using a variety of storage structures.

The slides (slides 5, 6) present the relational data model for the SbA "employees-projects-parts-suppliers".

11.3. Network data model

In the network model, data is represented as collections of records, and links as sets. In contrast to the relational model, links here are modeled explicitly by sets, which are implemented using pointers (slide 5). The network model can be represented as a graph with records as the nodes of the graph and sets as its edges. The slide shows an example of a network schema for the same data shown in the relational model.
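For contrast, here is a rough sketch of how a network-model set can be represented with explicit pointers; the record types and field names are invented for illustration:

```python
# Illustrative sketch of a network-model set: an owner record keeps
# explicit pointers (references) to its member records.
class Record:
    def __init__(self, rec_type, **fields):
        self.rec_type = rec_type
        self.fields = fields
        self.members = []          # pointers forming the "set" (owner -> members)

dept = Record("department", caf=22, building="A")
emp = Record("employee", name="Ivanov I.I.", position="head of department")
dept.members.append(emp)           # the link is stored explicitly, not derived

# Navigational access: follow pointers from the owner to its members.
for member in dept.members:
    print(member.fields["name"], "works in department", dept.fields["caf"])
```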

The most popular network DBMS is the IDMS/R system from Computer Associates.

The slides (slides 8, 9) present variants of the network data model for the SbA "employees-projects-parts-suppliers".

11.4. Hierarchical data model

The hierarchical model is a restricted subtype of the network model. It also represents data as collections of records and links as sets. However, in a hierarchical model a node can have only one parent. The hierarchical model can be represented as a tree with records as nodes (also called segments) and sets as edges (slide 6). The slide shows an example of a hierarchical schema for the same data shown in the previous models.
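A minimal sketch of the single-parent constraint of the hierarchical model, using hypothetical segment names:

```python
# Illustrative sketch of a hierarchical (tree) schema: each segment has at
# most one parent, so the links form a tree rather than a general graph.
class Segment:
    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent       # exactly one parent (or None for the root)
        self.children = []
        if parent is not None:
            parent.children.append(self)

root = Segment("department")
emp = Segment("employee", parent=root)
proj = Segment("project", parent=root)

def path_to_root(seg):
    """Navigation in a hierarchy always follows the single parent chain."""
    return [seg.name] if seg.parent is None else path_to_root(seg.parent) + [seg.name]

print(path_to_root(emp))   # ['department', 'employee']
```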

The most common hierarchical DBMS is IBM's IMS system, although it also has some non-hierarchical features.

The slides (slides 11, 12) present variants of the hierarchical data model for the SbA "employees-projects-parts-suppliers".

11.5. Advantages and disadvantages of models

Record-based (logical) data models are used to define the overall structure of a database and to give high-level descriptions of its implementation. Their main disadvantage is that they do not provide adequate means for explicitly stating constraints on the data. Object data models, by contrast, lack means for specifying the logical structure of the data, but by giving the user the ability to specify constraints on the data they allow the semantics of the stored information to be represented better.

Most modern commercial systems are based on the relational model, whereas the earliest database systems were based on the network or hierarchical model. With the latter two models, the user must know the physical organization of the database being accessed. The relational model provides data independence to a much greater extent. Thus, while relational systems adopt a declarative approach to processing information in the database (the user specifies what data should be retrieved), network and hierarchical systems adopt a navigational approach (the user specifies how the data should be retrieved).

Network and hierarchical structures are mainly oriented toward storing the links between data together with the data itself. Such unification was achieved, for example, by data aggregation (the construction of complex conceptual data structures) or by introducing a reference mechanism that fixes semantic links directly in the data records.

The tabular form of information presentation is the most common and understandable. In addition, such semantically more complex forms as trees and networks, by introducing some redundancy, can be reduced to tabular ones. At the same time, links between data will also be presented in the form of two-dimensional tables.

The relational approach, which is based on the principle of separating data and relationships, provides, on the one hand, data independence, and on the other, simpler ways to implement storage and updating.

Multidimensional models, whose commercial implementations appeared in the early 1990s to support OLAP technologies, represent an extension of the universal-relation model with new operational capabilities that provide, in particular, the data aggregation functions required for OLAP. Thus, multidimensional models are a special kind of relational model.

11.6. Documenting systems and model integration

The provisions above were developed, and are indeed widely used, for databases of well-structured information. Today, however, one of the most important problems is the integration of heterogeneous information resources, and in particular of semi-structured data. The need to solve it stems from the desire to fully integrate database systems into the Web technology environment, where it is no longer enough simply to provide access to a database in the traditional way, behind HTML forms. Integration at the model level is needed, and in this case the problem of semantic interoperability of information resources reduces to developing tools and technologies that provide an explicit specification of metadata for semi-structured data resources, based on traditional modeling technologies from the database field.

It is toward this goal that the intensive development by the W3C consortium of the XML language and its infrastructure (in effect, a new data model for this environment), of the Document Object Model, and of other tools is directed; these can be expected to become the backbone of information resource management technologies in the near future. This direction is connected with another global problem: the organization of distributed heterogeneous information systems based on metadata repositories (a concept that in classical works on database design corresponds to the data dictionary), which provide the possibility of semantic identification of resources and thus of their purposeful reuse.

To represent mathematical knowledge, mathematical logic uses logical formalisms: the propositional calculus and the predicate calculus. These formalisms have clear formal semantics, and inference mechanisms have been developed for them. The predicate calculus was therefore the first logical language used to formally describe subject areas associated with solving applied problems.

Logical models of knowledge representation are realized by means of predicate logic.

A predicate is a function that takes one of two values (true or false) and is intended to express properties of objects or relationships between them. An expression that asserts or denies the presence of some property of an object is called a statement. Constants serve to name objects of the subject area. Logical sentences, or statements, form atomic formulas. An interpretation of a predicate is the set of all admissible bindings of its variables to constants; a binding is the substitution of constants for variables. A predicate is considered valid if it is true in all possible interpretations. A statement is said to follow logically from given premises if it is true whenever the premises are true.

Descriptions of subject areas made in logical languages ​​are called logical models .

GIVE (MICHAEL, VLADIMIR, BOOK);

(∃x)(ELEMENT(x, EVENT-GIVE) ∧ SOURCE(x, MICHAEL) ∧ ADDRESSEE(x, VLADIMIR) ∧ OBJECT(x, BOOK)).

Here are two ways of recording one fact: "Mikhail gave the book to Vladimir."

Inference is carried out using a syllogism (if B follows from A, and C follows from B, then C follows from A).

In the general case, logical models are based on the concept of a formal theory, defined by the quadruple

S = ⟨B, F, A, R⟩,

where B is a countable set of basic symbols (the alphabet) of the theory S;

F is a subset of the expressions of the theory S, called the formulas of the theory (by expressions we mean finite sequences of basic symbols of S);

A is a distinguished set of formulas called the axioms of the theory S, that is, a set of a priori true formulas;

R is a finite set of relations (r1, ..., rn) between formulas, called inference rules.

The advantage of logical models of knowledge representation lies in the possibility of directly programming a mechanism for deriving syntactically correct statements. An example of such a mechanism is, in particular, the inference procedure based on the resolution method.

Let us describe the resolution method.

The method uses several concepts and theorems.

The concept of a tautology: a logical formula whose value is "true" for any values of the atoms occurring in it. It is denoted by ⊨ and read as "universally valid" or "always true".

Theorem 1. A ⊢ B if and only if ⊨ A → B.

Theorem 2. A1, A2, ..., An ⊢ B if and only if ⊨ (A1 ∧ A2 ∧ A3 ∧ ... ∧ An) → B.

The symbol ⊢ is read as "it is true that" or "it can be inferred that".

The method is based on the proof of the tautology

⊨ (X ∨ A) ∧ (Y ∨ ¬A) → (X ∨ Y).

Theorems 1 and 2 allow us to write this rule in the following form:

(X ∨ A), (Y ∨ ¬A) ⊢ (X ∨ Y),

which gives grounds to assert that from the premises (X ∨ A) and (Y ∨ ¬A) the clause (X ∨ Y) can be deduced.

In the process of inference using the resolution rule, the following steps are performed.

1. The operations of equivalence and implication are eliminated (for example, A → B is replaced by ¬A ∨ B).

2. The negation operation is moved inside the formulas using De Morgan's laws: ¬(A ∧ B) ≡ ¬A ∨ ¬B and ¬(A ∨ B) ≡ ¬A ∧ ¬B.

3. The logical formulas are reduced to disjunctive (clause) form.

The resolution rule contains a conjunction of clauses on its left-hand side; therefore, reducing the premises used in a proof to a conjunction of clauses is a necessary step in practically any algorithm implementing logical inference by the resolution method. The resolution method is easy to program, which is one of its most important advantages.
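As a small illustration of how easily the rule can be programmed, the following sketch applies one resolution step to clauses represented as sets of literals (the representation is an assumption, not the lecture's algorithm):

```python
# Minimal illustration of the resolution rule on propositional clauses.
# A clause is a frozenset of literals; "~p" denotes the negation of "p".

def negate(lit):
    return lit[1:] if lit.startswith("~") else "~" + lit

def resolve(c1, c2):
    """Return all resolvents of two clauses (possibly none)."""
    resolvents = []
    for lit in c1:
        if negate(lit) in c2:
            resolvents.append((c1 - {lit}) | (c2 - {negate(lit)}))
    return resolvents

# (X v A) and (Y v ~A) resolve to (X v Y)
print(resolve(frozenset({"X", "A"}), frozenset({"Y", "~A"})))
```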

Suppose we need to prove that if certain premises are true, then a certain formula can be derived from them. The following steps are performed.

1. Reduction of the premises to disjunctive form.

2. Construction of the negation of the conclusion to be derived. The resulting conjunction is valid when the premises and the negated conclusion are simultaneously true.

3. Application of the resolution rule, which yields a contradiction (the "empty clause").

Thus, the assumption that the conclusion is false leads to a contradiction; therefore the conclusion is true, i.e. it is deducible from the initial premises.

It was the resolution rule that served as the basis for the logic programming language PROLOG. In fact, the PROLOG interpreter itself performs inference similar to the one described above, forming the answer to a user's question addressed to the knowledge base.

In predicate logic, applying the resolution rule requires a more complex unification of logical formulas in order to reduce them to a system of clauses. This is due to the presence of additional syntactic elements, mainly quantifiers, variables, predicates, and functions.

The algorithm for the unification of predicate logic formulas includes several steps.

After all the steps of the unification algorithm have been completed, the resolution rule can be applied. Inference is usually carried out by refutation, and the algorithm can be briefly described as follows: to deduce a conclusion R from the axioms of a theory Th, the negation ¬R is constructed and added to Th, giving a new theory Th1. After the axioms of the theory are reduced to a system of clauses, the conjunction of ¬R with the axioms of Th can be built, and from the initial clauses it is possible to derive new clauses as consequences. If R is deducible from the axioms of Th, then in the course of the deduction one can obtain some clause Q consisting of a single literal together with the opposite clause ¬Q. This contradiction shows that R is deducible from the axioms of Th. Generally speaking, there are many proof strategies; we have considered only one of the possible ones, the top-down strategy.

Example: let us represent the following text by means of predicate logic:

"If a student can program well, then he can become a specialist in the field of applied computer science."

"If a student has passed the information systems examination with a good mark, then he or she can program well."

Let us represent this text by means of first-order predicate logic. We introduce the notation: X is a variable denoting a student; good is a constant corresponding to the level of skill; P(X) is a predicate expressing the possibility that subject X becomes a specialist in applied informatics; Q(X, good) is a predicate stating that subject X can program with the mark "good"; R(X, good) is a predicate describing student X's examination mark in information systems.

Now let us build the set of well-formed formulas:

Q(X, good) → P(X).

R(X, good) → Q(X, good).

Let us supplement the obtained theory with the specific fact
R(ivanov, good).

Let us perform inference using the resolution rule to determine whether the formula P(ivanov) is a consequence of the above theory. In other words, can we deduce from this theory the fact that student Ivanov will become a specialist in applied computer science, given that he has passed the information systems examination with a good mark?

Proof

1. Let us transform the initial formulas of the theory to bring them to disjunctive form:

¬Q(X, good) ∨ P(X);

¬R(X, good) ∨ Q(X, good);

R(ivanov, good).

2. Let us add to the existing axioms the negation of the conclusion to be deduced:

¬P(ivanov).

3. We construct a conjunction of clauses:

(¬Q(X, good) ∨ P(X)) ∧ ¬P(ivanov) ⊢ ¬Q(ivanov, good), replacing the variable X by the constant ivanov.

The result of applying the resolution rule is called the resolvent. In this case the resolvent is ¬Q(ivanov, good).

4. We construct a conjunction of clauses using the resolvent obtained in step 3:

(¬R(X, good) ∨ Q(X, good)) ∧ ¬Q(ivanov, good) ⊢ ¬R(ivanov, good).

5. Let us write the conjunction of the resulting resolvent with the last clause of the theory:

R(ivanov, good) ∧ ¬R(ivanov, good) (a contradiction).

Therefore, the fact P(ivanov) is deducible from the axioms of this theory.
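The refutation above can also be replayed mechanically. The sketch below grounds the reconstructed clauses with X = ivanov and saturates the clause set with resolvents until the empty clause appears; it is illustrative only:

```python
# Illustrative replay of the refutation: the clauses are already grounded
# with X = ivanov; literals are strings, "~" marks negation.
from itertools import combinations

def negate(lit):
    return lit[1:] if lit.startswith("~") else "~" + lit

def resolve(c1, c2):
    """All resolvents of two clauses (clauses are frozensets of literals)."""
    out = []
    for lit in c1:
        if negate(lit) in c2:
            out.append(frozenset((c1 - {lit}) | (c2 - {negate(lit)})))
    return out

clauses = {
    frozenset({"~Q(ivanov,good)", "P(ivanov)"}),       # Q(X,good) -> P(X)
    frozenset({"~R(ivanov,good)", "Q(ivanov,good)"}),  # R(X,good) -> Q(X,good)
    frozenset({"R(ivanov,good)"}),                     # the fact
    frozenset({"~P(ivanov)"}),                         # negation of the conclusion
}

def refute(clauses):
    """Saturate the clause set with resolvents; True if the empty clause appears."""
    clauses = set(clauses)
    while True:
        new = set()
        for c1, c2 in combinations(clauses, 2):
            for r in resolve(c1, c2):
                if not r:
                    return True               # empty clause: contradiction
                new.add(r)
        if new <= clauses:
            return False                      # saturated, no contradiction
        clauses |= new

print(refute(clauses))   # True: P(ivanov) is deducible from the axioms
```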

To determine the order in which the axioms are applied in the inference process, the following heuristic rules exist:

  1. In the first step of the inference, the negation of the conclusion is used.
  2. In each subsequent step of the derivation, the resolvent obtained at the previous step participates.

However, with the help of the rules that define the syntax of a language it is impossible to establish the truth or falsity of a particular statement, and this applies to all languages: a statement may be syntactically correct yet turn out to be completely meaningless. The high degree of uniformity also entails another disadvantage of logical models: the difficulty of using, in proofs, heuristics that reflect the specifics of the particular subject problem. Other shortcomings of formal systems include their monotonicity, the lack of means for structuring the elements used, and the inadmissibility of contradictions. The further development of knowledge bases proceeded through work on inductive logics, "common sense" logics, logics of belief, and other logical schemes that have little in common with classical mathematical logic.

Logical models

Logical models use the language of the predicate calculus. The predicate name corresponds to the name of a relationship, and its arguments (terms) to objects. All Boolean expressions used in predicate logic take the value true or false.

Example: consider the expression "John is an information technology specialist". It can be represented as follows: is(John, information technology specialist). Let X be an object (John) who is an information technology specialist; then the notation is(X, information technology specialist) is used.

The expression "Smith works for IBM as a specialist" can be represented as a predicate with three arguments: works(Smith, IBM, specialist).
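A throwaway sketch of how such ground facts might be held and queried in code; the tuple representation is an assumption:

```python
# Ground facts as (predicate, arguments) tuples; the argument order is fixed
# by the agreed interpretation of each predicate.
facts = {
    ("is", ("John", "information technology specialist")),
    ("works", ("Smith", "IBM", "specialist")),
}

def holds(predicate, *args):
    """A fact is true exactly when it is present in the fact base."""
    return (predicate, args) in facts

print(holds("works", "Smith", "IBM", "specialist"))   # True
print(holds("works", "Smith", "IBM", "manager"))      # False
```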

When working with logical models, the following rules must be observed:

1. The order of the arguments must always follow the interpretation of the predicate accepted in the given subject area. The programmer decides on a fixed order of arguments and keeps to it from start to finish.

2. A predicate can have an arbitrary number of arguments.

3. Individual statements, consisting of a predicate and its arguments, can be combined into complex statements using the logical connectives AND (∧), OR (∨), NOT (¬, ~), and implication (→), which is used to formulate rules of the form IF ..., THEN ...

Let's look at a few examples:

1) Predicate name: is.

is(Smith, IT specialist) ∧ reads(Smith, literature).

Smith is an IT professional and reads literature.

2) Predicate name: reports.

reports(Smith, John) → leads(John, Smith).

If Smith reports to John, then John leads Smith.

3) Predicate name: wrote.

wrote(Smith, program) ∧ NOT works(program) → debug(Smith, program, evening) ∨ transfer(program, programmer, next day).

IF Smith wrote the program AND it does not work, THEN Smith should debug the program this evening OR hand the program over to the programmer the next day.

Variables can also be used as arguments in statements. In this case, the concept of a quantifier is introduced for working with variables.

There are two types of quantifiers:

1. The universal quantifier, ∀.

2. The existential quantifier, ∃.

(∀x) means that the statement must be true for all values of the variable x from a certain domain.

(∃x) means that the statement is true only for some values of x.

Quantifiers can be nested inside one another.

Examples of:

1. (∀x)(IT_specialist(x) → programmer(x)).

All IT professionals are programmers.

2. (∃x)(IT_specialist(x) → good_programmer(x)).

Some IT professionals are good programmers.

3. (∀x)(∃y)(employee(x) → manager(y, x)).

Every employee has a manager.

4. (∃y)(∀x)(employee(x) → manager(y, x)).

There is a certain person who is in charge of all.
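Over a finite domain the two quantifiers reduce to Python's all() and any(); the staff data below is invented, and the existential example uses a conjunction for a non-vacuous check:

```python
# Universal and existential quantification over a finite, invented domain.
staff = [
    {"name": "Ann",  "it_specialist": True,  "programmer": True},
    {"name": "Bob",  "it_specialist": True,  "programmer": True},
    {"name": "Carl", "it_specialist": False, "programmer": False},
]

# (forall x)(IT_specialist(x) -> programmer(x)): all IT specialists are programmers
all_specialists_program = all(
    (not p["it_specialist"]) or p["programmer"] for p in staff
)

# (exists x)(IT_specialist(x) and programmer(x)): some IT specialist is a programmer
some_specialist_programs = any(
    p["it_specialist"] and p["programmer"] for p in staff
)

print(all_specialists_program, some_specialist_programs)   # True True
```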

Questions:

1. What is artificial intelligence?

2. What is an expert system?

3. What are the development stages of artificial intelligence systems?

4. What is the competence of an expert system, comparing human intelligence with an AI system?

5. What is the difference between logical and heuristic models?

Lecture 11.

Knowledge representation.

Semantic network models. These models are based on the concepts of a network, nodes (vertices), and arcs. Networks can be simple or hierarchical; the vertices are concepts, entities, objects, events, processes, or phenomena, and the relationships between these entities are expressed by arcs. The concepts are usually abstract or concrete objects, and the relations are relations such as "is a", "has part", "belongs to", "loves". Simple networks have no internal structure, whereas in hierarchical networks some vertices do have an internal structure.

A characteristic feature of semantic networks is the obligatory presence of three types of relations:

1. class - element of the class;

2. property - value;

3. example of a class element.

In hierarchical semantic networks, the network is divided into subnetworks (spaces), and relationships are established not only between nodes but also between spaces.

Space tree.

From space P6, all vertices lying in its ancestor spaces P4, P2, P0 are visible, and the rest are invisible. The "visibility" relation makes it possible to group spaces into an ordered set of "perspectives".

Consider the rules, or conventions, for the graphical representation of hierarchical networks:

1. vertices and arcs lying in the same space are bounded by a line or a polygon;

2. an arc belongs to the space in which its name is located;

3. a space Pi depicted inside a space Pj is considered its descendant (an inner level), i.e. Pj is "visible" from Pi; Pi can be viewed as a "super vertex" lying in Pj.

The problem of finding a solution in a knowledge base of the semantic network type reduces to the problem of finding a fragment of the network that matches a given query subnetwork.

The main advantage of semantic network models is their correspondence to modern ideas about the organization of human long-term memory.

The disadvantage of these models is the difficulty of performing inference over the semantic network.
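As a rough sketch, a semantic network can be held as subject-relation-object triples and queried for a matching fragment; the facts below are illustrative:

```python
# A semantic network as a set of (subject, relation, object) triples.
network = {
    ("student", "is_a", "person"),
    ("student", "is_a", "child"),
    ("child", "loves", "sweets"),
    ("person", "has_part", "head"),
}

def match(pattern):
    """Find all triples matching a pattern; None acts as a wildcard."""
    s, r, o = pattern
    return [t for t in network
            if (s is None or t[0] == s)
            and (r is None or t[1] == r)
            and (o is None or t[2] == o)]

print(match(("student", "is_a", None)))   # what is a student?
print(match((None, "loves", "sweets")))   # who loves sweets?
```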

Frame Models.

The desire to develop representations that combine the merits of different models led to the emergence of frame representations.

A frame (from the English "frame": a skeleton or framework) is a knowledge structure intended to represent some standard situation or abstract image.

The following information is associated with each frame:

1. how to use the frame;

2. what results are expected from executing the frame;

3. what to do if the expectations are not met.

The upper levels of the frame are fixed and represent entities that are always true in the situation described by the frame. The lower levels are represented by slots, which are filled with information when the frame is invoked. Slots are unfilled values of certain attributes.

A frame can also be called a formalized model for representing an image or a situation.

The frame structure can be represented as follows:

FRAME NAME:

(1st slot name: 1st slot value),

(2nd slot name: 2nd slot value),

…………………………………………



(Nth slot name: Nth slot value).

Frame systems are usually organized as a network used for information search; it comes into play when a proposed frame cannot be brought into correspondence with a given situation, i.e. when its slots cannot be assigned values that satisfy the conditions associated with those slots.

In such situations the network is used to find and suggest another frame.

The most important property of frame theory is borrowed from the theory of semantic networks: in both frames and semantic networks, inheritance occurs via the A-Kind-Of (AKO) relation. The AKO slot points to a frame at a higher level of the hierarchy, from which the values of similar slots are implicitly inherited.

Frame network.

Here the concept of "student" inherits the property of the frames "child" and "person", which are at a higher level. Then to the question: "Do the students love sweets?" you should answer “Yes” (since children have this property). Property inheritance can be partial, as the age for students is not inherited from the child frame since it is explicitly specified in its own frame.

The main advantage of frames is their ability to reflect the conceptual basis of the organization of human memory, as well as their flexibility and clarity.

Production models.

In traditional programming, if the i-th command is not a branch command, then it is followed by the (i+1)-th command. This programming style is convenient when the processing sequence depends little on the knowledge being processed.

Otherwise, the program is better viewed as a collection of independent, pattern-driven modules. At each step such a program analyzes the patterns to determine which module is suitable for handling the current situation. A pattern-driven module consists of a mechanism for examining and modifying one or more data structures. Each such module implements a certain production rule, and the control functions are performed by an interpreter. In terms of knowledge representation, the approach based on pattern-driven modules is characterized by the following features:

1. separation of permanent knowledge, stored in the knowledge base, from temporary knowledge in working memory;

2. structural independence of the modules;

3. separation of the control scheme from the modules carrying knowledge about the problem area.

This makes it possible to consider and implement various control schemes and facilitates the modification of the system and of the knowledge.
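A compact sketch of the pattern-driven (production) scheme described here: a working memory of facts, a set of IF-THEN rules, and an interpreter that keeps firing applicable rules; the facts and rules are invented:

```python
# Working memory (temporary knowledge) and production rules (permanent knowledge).
working_memory = {"wrote(Smith, program)", "not_works(program)"}

rules = [
    # (condition facts, facts added when the rule fires)
    ({"wrote(Smith, program)", "not_works(program)"}, {"debug(Smith, program, evening)"}),
    ({"debug(Smith, program, evening)"},              {"report(Smith, next_day)"}),
]

def run(memory, rules):
    """Interpreter: keep firing applicable rules until nothing new is added."""
    changed = True
    while changed:
        changed = False
        for condition, action in rules:
            if condition <= memory and not action <= memory:
                memory |= action
                changed = True
    return memory

print(run(set(working_memory), rules))
```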

Main components of ES.

The main components of the information technology used in an expert system are (Fig. 1): the user interface, the knowledge base, the interpreter, and the system creation module.

Fig. 1. The main components of expert system information technology.

User interface.

The manager (specialist) uses the interface to enter information and commands into the expert system and to receive output information from it. Commands include the parameters that guide the processing of knowledge. Information is usually given as values assigned to particular variables. The manager can use four methods of entering information: menus, commands, natural language, and a custom interface. Expert system technology makes it possible to receive, as output, not only the solution but also the necessary explanations. Two types of explanation are distinguished:

Ø explanations issued on request: the user can at any time demand that the expert system explain its actions;

Ø explanations of the obtained solution: after receiving a solution, the user may request an explanation of how it was obtained; the system must explain each step of the reasoning leading to the solution of the problem.

Although the technology of working with expert systems is not simple, the user interface of these systems is friendly and usually causes no difficulties in the dialogue.

Knowledge base.

The knowledge base contains facts describing the problem area, as well as the logical relationships between these facts. The central place in the knowledge base belongs to rules. A rule defines what should be done in a specific situation and consists of two parts: a condition, which may or may not be satisfied, and an action to be performed when the condition is satisfied. All the rules used in an expert system form a system of rules, which even for a relatively simple system can contain several thousand rules. All kinds of knowledge, depending on the specifics of the subject area and the qualifications of the designer (knowledge engineer), can be represented, with varying degrees of adequacy, using one or several semantic models. The most common models are logical, production, frame, and semantic network models.

Interpreter.

The interpreter is the part of the expert system that processes the knowledge in the knowledge base in a certain order. Its operation reduces to a sequential consideration of the set of rules (rule by rule). If the condition contained in a rule is satisfied, the corresponding action is performed, and the user is offered a variant of the solution of his problem.

In addition, many expert systems introduce additional blocks: a database, a calculation block, and a data entry and correction block. The calculation block is needed in situations connected with making managerial decisions. In this case an important role is played by the database, which contains planned, physical, calculated, reporting, and other constant or operational indicators. The data entry and correction block is used to reflect current changes in the database promptly and in a timely manner.

System creation module.

The system creation module serves to create a set (hierarchy) of rules. Two approaches can be used as its basis: algorithmic programming languages and expert system shells. The languages Lisp and Prolog were specially designed for representing knowledge bases, although any known algorithmic language can be used.

An expert system shell is a ready-made software environment that can be tailored to a specific problem by creating an appropriate knowledge base. In most cases, using a shell makes it possible to create expert systems faster and more easily than programming them.

Questions:

1. What is a salient feature of semantic networks?

2. What is the common feature of frame models?

3. What is the common feature of production models?

4. List the main components of an expert system.

5. What is the difference between a knowledge base and a database?

Lecture 12.

Local and global computer networks, telecommunications.

Computer networks. When two or more computers are physically connected, a computer network is formed. In general, creating computer networks requires special hardware (network hardware) and special software (network software).

The purpose of all types of computer networks is determined by two functions:

Ø providing shared use of the hardware and software resources of the network;

Ø providing shared access to data resources.

To transmit data, computers use a wide variety of physical channels, which are usually called the transmission medium.

If the network includes a special computer dedicated to shared use by the network's participants, it is called a file server.

Groups of employees working on one project within a local network are called workgroups. Several workgroups can operate within one local network. Group members can have different rights of access to the shared resources of the network. The set of techniques for dividing and limiting the rights of the participants of a computer network is called a network policy. Managing network policy is called network administration. The person managing the organization of the work of the participants of a local computer network is called the system administrator.

Basic characteristics and classification of computer networks.

By territorial extent, networks can be local, regional, or global.

Ø Local network (LAN - Local Area Network) - a network within an enterprise, institution, one organization.

Ø Regional network (MAN - Metropolitan Area Network) - a network within a city or region.

Ø The global network (WAN - Wide Area Network) - a network on the territory of a state or a group of states.

By the speed of information transfer, computer networks are divided into:

Ø low-speed networks - up to 10 Mbps;

Ø medium-speed networks - up to 100 Mbps;

Ø high-speed networks - over 100 Mbps.

By the type of transmission medium, networks are divided into:

Ø wired (on coaxial cable, twisted pair, fiber optic);

Ø wireless with information transmission via radio channels or in the infrared range.

By the way the interaction of computers is organized, networks are divided into peer-to-peer networks and networks with a dedicated server (hierarchical networks).

Peer-to-peer network. All computers are equal; anyone on the network can access the data stored on any computer.

Advantage: ease of installation and operation.

Disadvantage: information security issues are difficult to solve.

This method of organization is used for networks with a small number of computers, where the issue of data protection is not critical.

Hierarchical network. During installation, one or more servers are allocated in advance: computers that control the exchange of data and the allocation of network resources. A server is a permanent repository of shared resources. Any computer that uses the server's services is called a network client or a workstation. A server can itself be a client of a server at a higher level of the hierarchy. Servers are usually high-performance computers, possibly with several processors working in parallel, with large-capacity hard drives and a high-speed network card.

Advantage: allows the most stable network structure to be created, resources to be allocated more efficiently, and a higher level of data protection to be provided.

Disadvantages:

Ø the need for an additional OS for the server;

Ø the higher complexity of installing and upgrading the network;

Ø the need to allocate a separate computer as the server.

By server technology, networks with a file server architecture and networks with a client-server architecture are distinguished.

File server. Most programs and data are stored on the server. At the user's request, the required program and data are sent to the workstation, where the information is processed.

Client-server. Data storage and processing are performed on the server, which also controls access to resources and data; the workstation receives only the results of the query.
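A toy sketch of the client-server exchange: the server holds the data and does the processing, and only the result of the query travels back to the client (the address, port, and data are illustrative):

```python
# Toy client-server exchange over TCP: the server holds the data and does
# the processing; the client receives only the result of its query.
import socket, threading

DATA = {"department 22": "building A, room 322"}   # illustrative server-side data

srv = socket.socket()
srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
srv.bind(("127.0.0.1", 50007))
srv.listen(1)                        # listening before the client connects

def serve_one():
    conn, _ = srv.accept()
    query = conn.recv(1024).decode()
    conn.sendall(DATA.get(query, "not found").encode())  # only the result travels
    conn.close()

threading.Thread(target=serve_one, daemon=True).start()

cli = socket.socket()
cli.connect(("127.0.0.1", 50007))
cli.sendall(b"department 22")
print(cli.recv(1024).decode())       # building A, room 322
cli.close()
srv.close()
```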

Main characteristics of networks.

The transmission rate over a communication channel is measured by the number of bits of information transmitted per unit of time (one second). The unit of measurement is bits per second.

A commonly used unit of speed is the baud. A baud is the number of changes of state of the transmission medium per second. Since each state change can correspond to several bits of data, the real rate in bits per second can exceed the baud rate.

Communication channel throughput. The unit of measurement of a communication channel's throughput is characters per second.

The reliability of information transfer is estimated as the ratio of the number of erroneously transmitted characters to the total number of transmitted characters. The unit of measure is errors per character; this indicator should be in the range of 10^-6 to 10^-7 errors per character, i.e. one error per million to ten million transmitted characters is allowed.

The dependability of the communication channels of a communication system is determined either by the fraction of uptime in the total operating time or by the mean time between failures. The unit of measurement is the hour; it should be at least several thousand hours.

Network response time is the time spent by the software and the network devices preparing to transfer information over a given channel. Network response time is measured in milliseconds.

The amount of information transmitted over the network is called traffic .

Network topology.

Physical LAN media. The physical medium provides the transfer of information between the subscribers of the computer network.

The physical transmission medium of a LAN is represented by three types of cable: twisted pair, coaxial cable, and fiber-optic cable.

Twisted pair consists of two insulated wires twisted together. Twisting the wires reduces the effect of external electromagnetic fields on the transmitted signals. The simplest twisted pair option is a telephone cable.

The advantage of twisted pair is its low cost. Its disadvantages are poor noise immunity and a low information transfer rate of 0.25-1 Mbps.

Coaxial cable has higher mechanical strength and noise immunity and provides information transfer rates of up to 10-50 Mbps. For industrial use, two types of coaxial cable are produced: thick (about 10 mm) and thin (about 4 mm). Thick cable is more durable and transmits signals of the required amplitude over a greater distance than thin cable; at the same time, thin cable is considerably cheaper.

Fiber-optic cable is an ideal transmission medium. It is not affected by electromagnetic fields and has practically no emission of its own. The latter property makes it possible to use it in networks requiring increased secrecy of information.

The information transmission rate over fiber-optic cable is more than 50 Mbps. Compared with the previous types of transmission medium, it is more expensive and less convenient to work with.

Basic LAN Topologies.

The computers that form a LAN can be located quite arbitrarily on the territory where the network is being created.

The LAN topology is the averaged geometric scheme of the connections between the network nodes. Several specialized terms are used in network topology:

Ø node - any device directly connected to the transmission medium of the network;

Ø network branch - a path connecting two adjacent nodes;

Ø end node - a node located at the end of only one branch;

Ø intermediate node - a node located at the ends of more than one branch;

Ø adjacent nodes - nodes connected by at least one path that contains no other nodes.

Computer network topologies can be very different, but only three are typical for LANs: ring, bus, and star.

A ring topology connects the network nodes with a closed curve, the transmission medium cable. The output of one node is connected to the input of another. Information is passed around the ring from node to node, and each intermediate node between the transmitter and the receiver relays the message being sent. The receiving node recognizes and accepts only the messages addressed to it.

A ring topology is ideal for networks occupying a relatively small space. It has no central hub, which increases the reliability of the network. Any type of cable can be used as the transmission medium. However, the sequential discipline of servicing the nodes of such a network reduces its performance, and the failure of a single node violates its integrity.

The bus topology is one of the simplest. It is associated with the use of coaxial cable as the transmission medium. Data from a transmitting network node propagates along the bus in both directions. Intermediate nodes do not relay incoming messages. Information reaches all nodes, but only the node to which it is addressed accepts the message. The service discipline is parallel.

A bus LAN has high speed. The network is easy to grow and configure and to adapt to various systems; it is resistant to possible failures of individual nodes. However, it is limited in length and does not allow different types of cable to be used within the same network. Special devices, terminators, are installed at the ends of the network.

A star topology is based on the concept of a central node, called a hub, to which the peripheral nodes are connected. Each peripheral node has its own separate communication line to the central node. All information is transmitted through the central hub, which relays, switches, and routes the information flows in the network.

A star topology greatly simplifies the interaction of LAN nodes with one another and allows simpler network adapters to be used. On the other hand, the performance of a LAN with a star topology depends entirely on the central node.

In real computer networks, more complex topologies may be used, which in some cases are combinations of those considered. The choice of a particular topology is determined by the application area of the network, the geographic location of its nodes, and the size of the network as a whole. For example:

Mesh topology. It is characterized by a node connection scheme in which physical communication lines are installed from each computer to all the nearby ones:

In a network with a mesh topology, only those computers between which there is intensive data exchange are connected directly; data exchange between computers not connected by direct links takes place via transit transmissions through intermediate nodes. A mesh topology allows a large number of computers to be connected and is typical, as a rule, of global networks. The advantages of this topology are its resistance to failures and overloads, since there are several ways to bypass individual nodes.

Mixed topology. In such networks, individual subnetworks with a typical topology (a star, a ring, or a common bus) can be distinguished; in large networks these subnetworks are connected arbitrarily.

Network architectures.

The transmission medium is a shared resource for all the nodes of the network. To be able to access this resource from a node, special mechanisms are needed: access methods. A media access method is a method that ensures the fulfilment of a set of rules by which network nodes gain access to the resource.

Token access. A subscriber computer receives from the central computer of the network a token, a signal granting the right to transmit for a certain time, after which the token is passed to another subscriber.

With the contention access method, a subscriber starts transmitting data if it finds the line idle.

Ethernet. The data transmission scheme is contention-based; network elements can be connected in a bus or star topology using twisted pair, coaxial, or fiber-optic cable. The main advantage is a transfer rate of 10 to 100 Mbps.

Token Ring. A token access scheme. Physically the network is built as a star but behaves like a ring: data is transmitted sequentially from station to station but always passes through the central hub. Twisted pair and fiber-optic cables are used. The transfer rate is 4 or 16 Mbps.

ARCnet. A token access scheme that can work with both bus and star topologies. Compatible with twisted pair, coaxial, and fiber-optic cables. The transfer rate is 2.5 Mbps.

Open Systems Interconnection Model.

The main task solved when creating computer networks is to ensure the compatibility of equipment in terms of electrical and mechanical characteristics, and the compatibility of information support (programs and data) in terms of the coding system and data format. The solution of this problem belongs to the field of standardization and is based on the so-called OSI model (Open Systems Interconnection model). The OSI model was created on the basis of technical proposals of the International Organization for Standardization (ISO).

According to the OSI model, the architecture of computer networks should be considered at different levels (the total number of levels is up to seven). The topmost level is the application level, at which the user interacts with the computing system. The lowest level is the physical level, which provides the exchange of signals between devices. Data exchange in communication systems takes place by moving the data from the upper level to the lower one, then transporting it and, finally, reproducing it on the client's computer by moving it from the lower level back to the upper one.

To ensure the necessary compatibility at each of the seven levels of computer network architecture, special standards called protocols operate. They define the nature of the hardware interaction of network components (hardware protocols) and the nature of the interaction of programs and data (software protocols). Physically, the protocol support functions are performed by hardware devices (interfaces) and by software (protocol support programs). The programs that support protocols are also called protocols.

OSI model layers

Application: with the help of special applications the user creates a document (a message, a picture, etc.).
Presentation: the computer's operating system records where the created data is located (in RAM, in a file on the hard disk, etc.) and converts it from the computer's internal format into the transfer format.
Session: interacts with the local or global network. The protocols of this layer check the user's access rights.
Transport: the document is converted into the form in which data is transmitted in the network being used. For example, it may be cut into small packets of a standard size.
Network: determines the route of the data through the network. For example, if the data was "sliced" into packets at the transport layer, then at the network layer each packet must receive the address to which it must be delivered independently of the other packets.
Data link: modulates the signals circulating at the physical layer in accordance with the data received from the network layer, provides data flow control in the form of frames, detects transmission errors, and implements an algorithm for recovering the information.
Physical: the actual data transfer. Here there are no documents, no packets, not even bytes - only bits, the elementary units of data representation. The physical layer facilities lie outside the computer: in local area networks this is the network equipment itself; for remote communication over telephone modems it is the telephone lines, the switching equipment of telephone exchanges, etc.
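A small sketch of what the transport and network layers do in this description: cutting a message into fixed-size packets and stamping each one with a destination address; the packet format is invented for illustration:

```python
# Toy illustration of transport-layer segmentation and network-layer addressing.
def to_packets(message: bytes, destination: str, size: int = 8):
    """Cut a message into fixed-size chunks and give each one an address header."""
    return [
        {"dst": destination, "seq": i, "payload": message[i:i + size]}
        for i in range(0, len(message), size)
    ]

def reassemble(packets):
    """Receiver side: put the payloads back together in sequence order."""
    return b"".join(p["payload"] for p in sorted(packets, key=lambda p: p["seq"]))

packets = to_packets(b"hello, network layer!", destination="192.168.0.7")
print(packets[0])            # the first packet, carrying its destination address
print(reassemble(packets))   # b'hello, network layer!'
```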

Network hardware.

1. Network cards (adapters) are controllers installed in the expansion slots of the computer's motherboard, designed to transmit signals to the network and to receive signals from it.

2. Terminators are 50-ohm resistors that attenuate the signal at the ends of a network segment.

3. Hubs (concentrators) are the central devices of a cable system or of a network with a physical star topology; when a hub receives a packet on one of its ports, it forwards it to all the others. Active and passive hubs are distinguished: active hubs amplify the received signals before transmitting them, while passive hubs pass the signal through without amplifying or restoring it.

4. Repeaters are network devices that amplify and re-shape the incoming analog signal of the network for transmission over another segment. A repeater operates at the electrical level to connect two segments. Repeaters do not recognize network addresses and therefore cannot be used to reduce traffic.

5. Switches are software-controlled central devices of the cable system that reduce network traffic: an incoming packet is analyzed to find out the address of its recipient and is accordingly forwarded only to that recipient.

6. Routers are standard network devices operating at the network level that allow packets to be forwarded and routed from one network to another and broadcast messages to be filtered.

7. Bridges are network devices that connect two separate network segments, limited by their physical length, and transfer traffic between them. Bridges also amplify and convert signals for other cable types. This allows the maximum size of the network to be extended without violating the restrictions on the maximum cable length, the number of connected devices, or the number of repeaters per network segment. A bridge can connect networks of different topologies, but they must run under the same type of network operating system.

8. Gateways are software and hardware systems that connect heterogeneous networks or network devices. Gateways make it possible to solve the problems of differing protocols or addressing systems. They operate at the session, presentation, and application layers of the OSI model.

9. Multiplexers are central office devices that support several hundred digital subscriber lines. Multiplexers send and receive subscriber data over telephone lines, concentrating all traffic on one high-speed channel for transmission to the Internet or to a company network.

10. Firewalls are software and/or hardware barriers between two networks that allow only authorized connections to be established. Most of them are based on access control, according to which a subject (a user, program, process, or network packet) is allowed access to an object (a file or a network node) upon presentation of some unique element inherent only to that subject. In most cases this element is a password; in other cases it is a microprocessor card, the biometric characteristics of the user, etc. For a network packet, such elements are the addresses or flags in the packet header, as well as some other parameters.
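A minimal sketch of the address-based filtering described above; the addresses and the rule set are invented:

```python
# Toy packet filter: admit a packet only if its source address is authorized.
ALLOWED_SOURCES = {"192.168.0.7", "192.168.0.8"}    # illustrative rule set

def admit(packet: dict) -> bool:
    """Return True if the packet's source address is on the allowed list."""
    return packet.get("src") in ALLOWED_SOURCES

print(admit({"src": "192.168.0.7", "dst": "10.0.0.1", "payload": b"ok"}))   # True
print(admit({"src": "203.0.113.5", "dst": "10.0.0.1", "payload": b"??"}))   # False
```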

Telecommunication technology.

In the course of the evolution of computing systems, the following types of computer network architecture have formed:

Ø peer-to-peer architecture;

Ø classical architecture "client-server";

Ø Client-server architecture based on Web technology.

With a peer-to-peer architecture (Fig. 1), all resources of the computing system, including information, are concentrated in a central computer, also called a mainframe (the central computer unit). The main means of access to information resources were identical alphanumeric terminals connected to the central computer by cable; no special actions were required from the user to set up or configure software.

Fig. 1. Peer-to-peer architecture of computer networks.

The obvious shortcomings of this architecture and the development of tools led to the emergence of computing systems with a client-server architecture. The peculiarity of this class of systems is the decentralization of the architecture of autonomous computing systems and their integration into global computer networks. The creation of this class of systems is associated with the emergence of personal computers, which took over some of the functions of central computers. As a result, it became possible to create global and local computer networks that unite personal computers (clients, or workstations), which use resources, and computers (servers), which provide certain resources for general use. Fig. 2 shows a typical client-server architecture; however, there are several models that differ in the distribution of software components between the computers of the network.

Fig. 2. Typical client-server architecture.

Any software application can be represented as a structure of three components:

Ø a presentation component, which implements the interface with the user;

Ø an application component, which ensures the implementation of the application functions;

Ø a component for access to information resources, or a resource manager, which accumulates information and manages the data.

Depending on the distribution of the listed components between a workstation and a network server, the following client-server architecture models are distinguished:

Ø model of access to remote data;

Ø data management server model;

Ø complex server model;

Ø three-tier client-server architecture.

The remote data access model (Fig. 3), in which only the data is located on the server, has the following features:

Fig. 3. Remote data access model.

Ø low performance, since all information is processed on the workstations;

Ø a reduction of the overall exchange rate when large amounts of information have to be transferred from the server to the workstations for processing.

When the data management server model (Fig. 4) is used, in addition to the information itself, the server hosts an information resource manager (for example, a database management system). The presentation component and the application component are combined and run on the client computer, which supports both data entry and display functions and purely application functions. Access to information resources is provided either by statements of a special language (for example, SQL in the case of a database) or by calls to functions of specialized software libraries. Requests for information resources are sent over the network to the resource manager (for example, a database server), which processes them and returns blocks of data to the client. The most significant features of this model are:

Fig. 4. Data management server model.

Ø a reduction in the amount of information transmitted over the network, since the selection of the necessary information items is carried out on the server rather than on the workstations;

Ø unification and a wide choice of tools for creating applications;

Ø the lack of a clear distinction between the presentation component and the application component, which makes it difficult to improve the computing system.

The data management server model is advisable for processing moderate volumes of information that do not grow over time, provided that the complexity of the application component is low.

Fig. 5. Complex server model.

The complex server model (Fig. 5) is built on the assumption that the process running on the client computer is limited to presentation functions, while the actual application functions and data access functions are performed by the server.

Advantages of the complex server model:

Ø high performance;

Ø centralized administration;

Ø saving network resources.

The complex server model is optimal for large networks focused on processing large and increasing volumes of information.

A client-server architecture in which the application component resides on a workstation together with the presentation component (the remote data access model and the data management server model) or on a server together with the resource manager and the data (the complex server model) is called a two-tier architecture.

When the application component becomes significantly more complex and resource-intensive, a separate server, called an application server, can be allocated for it. In this case one speaks of a three-tier client-server architecture (Fig. 6). The first tier is the client computer, the second is the application server, and the third is the data management server. Within the application server, several application functions can be implemented, each designed as a separate service that provides certain facilities to all programs. There can be several application servers, each focused on providing a particular set of services.

Fig. 6. Three-tier client-server architecture.

Modern trends in telecommunication technologies have manifested themselves most vividly on the Internet. The Web-based client-server architecture is shown in Fig. 7.

Fig. 7. Web-based client-server architecture.

In accordance with Web technology, the server hosts so-called Web documents, which are rendered and interpreted by a navigation program (a Web browser) running on a workstation. Logically, a Web document is a hypermedia document that links various Web pages together. Unlike a paper page, it can be linked to computer programs and contain links to other objects. Web technology is built around a system of hyperlinks through which a document can refer to other documents, programs, and arbitrary network objects.

The transfer of documents and other objects from the server to a workstation, at the request of the navigator, is handled by a program running on the server called a Web server. When a Web browser needs to retrieve a document or other object from the Web server, it sends a request to the server. If the access rights are sufficient, a logical connection is established between the server and the navigator. The server then processes the request, sends the results back to the Web browser, and closes the established connection. Thus, the Web server acts as an information hub that gathers information from different sources and presents it to the user in a uniform form.

The Internet is a thriving collection of computer networks that span the globe, linking government, military, educational and commercial institutions, as well as individual citizens.

Like many other great ideas, the "network of networks" arose from a project intended for completely different purposes: the ARPAnet, developed and created in 1969 for the Advanced Research Projects Agency (ARPA) of the US Department of Defense. The ARPAnet was a network linking educational institutions, the military, and defense contractors; it was created to help researchers exchange information and, which was one of its main purposes, to study the problem of maintaining communication in the event of a nuclear attack.

In the ARPAnet model, communication always takes place between a source computer and a destination computer. The network itself is considered unreliable; any segment of it can disappear at any moment (after a bombing or as a result of a cable fault). The network was built so that it placed minimal demands on the client computers. To send a message over the network, a computer simply had to put the data into an envelope called an Internet Protocol (IP) packet and "address" those packets correctly. The communicating computers, and not just the network itself, were also responsible for ensuring that the data was transferred. The underlying principle was that every computer on the network could communicate as a peer node with any other computer, offering a wide range of computer services, resources, and information. This set of networking conventions and publicly available "network of networks" tools is designed to create one big network in which interconnected computers interact across many different software and hardware platforms.

Currently, the direction of the Internet's development is largely determined by the Internet Society (ISOC), a volunteer organization dedicated to promoting global information exchange through the Internet. It appoints the Internet Architecture Board (IAB), which is responsible for the technical direction and orientation of the Internet (mainly Internet standardization and addressing). Internet users express their opinions at meetings of the Internet Engineering Task Force (IETF), another public body that meets regularly to discuss current technical and organizational issues of the Internet.

The financial basis of the Internet is that everyone pays for their own part. Representatives of the individual networks get together and decide how to connect and how to fund those interconnections. An educational institution or a business pays for its connection to a regional network, which in turn pays a national provider for Internet access. Thus, every connection to the Internet is paid for by someone.

Questions:

1. List the functions of all types of computer networks.

2. List the characteristics and classification of computer networks.

3. List the types of physical transmission media.

4. List LAN topologies.

5. List the types of network equipment.

6. Describe the architectures and models of telecommunication technologies.

DB and DBMS concepts.

A database is a collection of structured data stored in the memory of a computing system and displaying the state of objects and their interrelationships in the subject area under consideration.

The logical structure of the data stored in the database is called the data presentation model. The main models of data presentation (data models) include hierarchical, network, relational.

A database management system (DBMS) is a complex of language and software tools designed for the creation, maintenance and shared use of databases by many users. A DBMS is usually classified by the data model it uses; thus, DBMSs based on the relational data model are called relational DBMSs.

A data dictionary is a database subsystem designed for centralized storage of information about data structures, relationships of database files with each other, data types and formats of their presentation, data belonging to users, security and access control codes, etc.

Information systems based on databases usually operate in a client-server architecture: the database resides on a server computer and is shared.

The server of a specific resource in a computer network is a computer (program) that manages this resource; a client is a computer (program) that uses this resource. Databases, files, print services, and mail services, for example, can act as resources of a computer network.

The advantage of organizing an information system on a client-server architecture is a successful combination of centralized storage, maintenance and collective access to general corporate information with individual user work.

According to the basic principle of the client-server architecture, the data is processed only on the server. A user or an application generates queries that are sent to the database server in the form of SQL statements. The database server searches for and retrieves the required data, which is then transferred to the user's computer. The advantage of this approach compared with the previous ones is the noticeably smaller amount of transmitted data.
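A hedged example of such server-side processing: the server scans, groups, and sums the data and returns only one summary row per city rather than the raw records (the table and column names are invented for illustration).

-- The Orders table stays on the server; only the aggregated
-- result set crosses the network to the user's computer.
SELECT city,
       COUNT(*)    AS orders_count,
       SUM(amount) AS total_amount
FROM   Orders
GROUP  BY city
HAVING SUM(amount) > 10000;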



The following types of DBMS are distinguished:

* full-featured DBMS;

* database servers;

* tools for developing programs for working with a database.

By the nature of their use, DBMSs are divided into multiuser (industrial) and local (personal).

Industrial DBMSs provide the software basis for developing automated management systems for large economic entities. An industrial DBMS must meet the following requirements:

* the ability to organize concurrent work by many users;

* scalability;

* portability to various hardware and software platforms;

* resilience to failures of various kinds, including a multi-level backup system for the stored information;

* security of the stored data and a well-developed, structured system of access to it.

A personal DBMS is software aimed at solving the problems of a local user or a small group of users and intended for use on a personal computer; hence their second name, desktop DBMSs. The defining characteristics of desktop systems are:

* relative ease of use, allowing you to create workable user applications on their basis;

* relatively limited hardware resource requirements.

According to the data model used, DBMSs are divided into hierarchical, network, relational, object-oriented, etc. Some DBMS can simultaneously support several data models.

To work with data stored in a database, the following types of languages are used:

* data description language - a high-level, non-procedural language of a declarative type, designed to describe the logical structure of the data;

* data manipulation language - a set of constructions that provide basic operations for working with data: input, modification and retrieval of data on request.

These languages may differ between DBMSs. Two standardized languages are the most widespread: QBE (Query By Example) and SQL (Structured Query Language). QBE mainly has the properties of a data manipulation language, while SQL combines the properties of both types of languages.
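A minimal sketch of how SQL combines both kinds of language, assuming a hypothetical Products table:

-- Data description (DDL): define the logical structure.
CREATE TABLE Products (
    product_id INTEGER       NOT NULL PRIMARY KEY,
    title      VARCHAR(100)  NOT NULL,
    price      DECIMAL(10, 2)
);

-- Data manipulation (DML): input, modification and retrieval of data.
INSERT INTO Products (product_id, title, price) VALUES (1, 'Monitor', 250.00);
UPDATE Products SET price = 240.00 WHERE product_id = 1;
SELECT title, price FROM Products WHERE price < 300;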

A DBMS implements the following main low-level functions:

* data management in external memory;

* management of RAM buffers;

* transaction management;

* keeping a log of changes in the database;

* ensuring the integrity and security of the database.

The data management function for external memory organizes the use of storage resources within the OS file system.

The need for data buffering stems from the fact that the amount of RAM is smaller than the amount of external memory. Buffers are areas of RAM designed to speed up the exchange between external memory and RAM. The buffers temporarily hold fragments of the database whose data is expected to be used when the DBMS is accessed or is scheduled to be written to the database after processing.

The transaction mechanism is used in a DBMS to maintain the integrity of the data in the database. A transaction is an indivisible sequence of operations on the database data that is tracked by the DBMS from start to finish. If for any reason (a hardware fault or failure, an error in the software, including the application) the transaction remains incomplete, it is cancelled (rolled back).

There are three main properties inherent in transactions:

* atomicity (all operations included in the transaction are executed or none);

* serializability (there is no mutual influence of transactions executed at the same time);

* durability (even a system crash does not lead to the loss of the results of the committed transaction).

An example of a transaction is the operation of transferring money from one account to another in a banking system. First, money is withdrawn from one account, then it is credited to another account. If either action fails, the result of the operation is incorrect and the accounts become unbalanced.
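A sketch of such a money-transfer transaction in SQL might look as follows; the Accounts table, the column names and the account numbers are hypothetical, and the exact transaction-control syntax differs slightly between DBMSs (Oracle, for example, starts a transaction implicitly).

-- Both updates must succeed or neither must take effect (atomicity).
BEGIN TRANSACTION;

UPDATE Accounts SET balance = balance - 100 WHERE account_no = '40817001'; -- withdraw
UPDATE Accounts SET balance = balance + 100 WHERE account_no = '40817002'; -- credit

-- If both statements succeeded, make the changes permanent (durability).
COMMIT;
-- On any error the application (or the DBMS itself) issues ROLLBACK instead,
-- cancelling the incomplete transaction.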

Change logging is performed by the DBMS to ensure the reliability of data storage in the database in the presence of hardware and software failures.

Ensuring the integrity of the database is a necessary condition for its successful functioning, especially when it is used on a network. Integrity is the property of a database meaning that it contains complete, consistent information that adequately reflects the subject area. The integral state of the database is described by integrity constraints: conditions that the data stored in the database must satisfy.

Security is achieved in a DBMS by data encryption, password protection, and support for levels of access to the database and its individual elements (tables, forms, reports, etc.).
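As a hedged illustration of the integrity constraints and access control described above, declarative constraints and access rights might be expressed like this; the table, column, and role names are invented for illustration.

-- Integrity constraints: conditions the stored data must satisfy.
CREATE TABLE Employees (
    emp_id    INTEGER     PRIMARY KEY,
    full_name VARCHAR(80) NOT NULL,                         -- must always be filled in
    age       INTEGER     CHECK (age BETWEEN 16 AND 60)     -- example domain constraint
);

-- Access control: different users (roles) receive different access levels.
GRANT SELECT                 ON Employees TO clerk_role;
GRANT SELECT, INSERT, UPDATE ON Employees TO hr_role;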

Stages of creating a database.

Designing the databases of information systems is a rather labor-intensive task. It is carried out by formalizing the structure and processes of the subject area whose information is to be stored in the database. A distinction is made between conceptual design and schema (structural) design.

Conceptual design of an IS DB is largely a heuristic process. The adequacy of the infological model of the subject area built within its framework is verified empirically, in the process of IS functioning.

Let's list the stages of conceptual design:

1. Study of the subject area to form a general understanding of it;

2. Allocation and analysis of functions and tasks of the developed IS;

3. Determination of the main entity objects of the subject area and the relationships between them;

4. Formalized presentation of the subject area.

When designing a relational database schema, the following procedures can be distinguished (a DDL sketch illustrating some of these steps follows the list):

1. Determination of the list of tables and relationships between them;

2. Determination of the list of fields, types of fields, key fields of each table (table schema), establishing links between tables through foreign keys;

3. Establishing indexing for fields in tables;

4. Development of lists (dictionaries) for fields with enumerated data;

5. Setting integrity constraints for tables and relationships;

6. Normalization of tables, correction of the list of tables and links.
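As a hedged illustration of steps 2-5 from the list above (all table, column, and index names are invented):

-- Step 4: a dictionary (lookup) table for a field with enumerated values.
CREATE TABLE Payment_Types (
    type_code CHAR(3)     PRIMARY KEY,
    type_name VARCHAR(40) NOT NULL UNIQUE
);

-- Step 2: fields, field types, a primary key and a foreign key to the dictionary.
CREATE TABLE Payments (
    payment_id INTEGER        PRIMARY KEY,
    type_code  CHAR(3)        NOT NULL REFERENCES Payment_Types (type_code),
    pay_date   DATE           NOT NULL,
    amount     DECIMAL(12, 2)
);

-- Step 3: an index on a field that is frequently used in searches.
CREATE INDEX idx_payments_date ON Payments (pay_date);

-- Step 5: an additional integrity constraint on the stored values.
ALTER TABLE Payments ADD CONSTRAINT chk_amount CHECK (amount > 0);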

Relational databases.

A relational database is a set of interrelated tables, each of which contains information about objects of a particular type. Each row of a table holds data about one object (for example, a car, a computer, or a customer), and the columns of the table hold the various characteristics of these objects, their attributes (for example, engine number, processor brand, telephone numbers of companies or customers).

The rows of a table are called records. All records in a table have the same structure: they consist of fields (data elements) that store the attributes of the object (Fig. 1). Each field of a record contains one characteristic of the object and has a specific data type (for example, a text string, a number, a date). Records are identified by a primary key: a set of table fields whose combination of values uniquely identifies each record in the table.

Primary key

Each database table can have a primary key: a field or a set of fields that uniquely identifies a record. The primary key must be minimally sufficient: it should not contain fields whose removal from the key would not affect its uniqueness.

In the data of the "Teacher" table, only the "Tab. No." (personnel number) field can serve as the primary key; the values of all other fields may repeat within the table.
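A possible declaration of such a table in SQL, with hypothetical column names, where the personnel number (Tab. No.) acts as the primary key:

CREATE TABLE Teacher (
    tab_no     INTEGER     NOT NULL PRIMARY KEY,  -- "Tab. No.": unique for every record
    full_name  VARCHAR(80) NOT NULL,              -- may repeat (namesakes are possible)
    position   VARCHAR(40),                       -- may repeat
    birth_date DATE                               -- may repeat
);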

Foreign (secondary) key

Foreign (secondary) keys are the main mechanism for organizing relationships between tables and for maintaining the integrity and consistency of information in a database.

A foreign key is a table field that can contain only values that occur in the key field of the other table referenced by that key. The foreign key thus links two tables.

Subordination relationships can exist between two or more database tables. Such relationships mean that each record of the master table (also called the parent) can correspond to one or more records of the subordinate table (the detail, also called the child).
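A hedged sketch of such a master-detail pair linked by a foreign key (the table and column names are illustrative only):

-- Master (parent) table.
CREATE TABLE Faculties (
    faculty_id   INTEGER     PRIMARY KEY,
    faculty_name VARCHAR(60) NOT NULL
);

-- Detail (child) table: faculty_id is a foreign key and may contain
-- only values present in Faculties.faculty_id.
CREATE TABLE Students (
    student_id INTEGER     PRIMARY KEY,
    full_name  VARCHAR(80) NOT NULL,
    faculty_id INTEGER     NOT NULL REFERENCES Faculties (faculty_id)
);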

There are three types of relationships between database tables:

- "one-to-many"

- "one to one"

- "many-to-many"

A one-to-one relationship occurs when one record in the parent table matches one record in the child table.

A many-to-many relationship occurs when:

a) records in the parent table can correspond to more than one record in the child table;

b) records in the child table can correspond to more than one record in the parent table.

A one-to-many relationship occurs when multiple records in the child table can correspond to the same record in the parent table.
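In relational terms a one-to-many relationship is carried directly by a foreign key, as in the previous sketch, while a many-to-many relationship is usually decomposed into two one-to-many relationships through an auxiliary link table. The sketch below is only an illustration and reuses the hypothetical Students table from the foreign-key example above.

-- Many-to-many: a student may attend many courses,
-- and a course is attended by many students.
CREATE TABLE Courses (
    course_id INTEGER     PRIMARY KEY,
    title     VARCHAR(80) NOT NULL
);

-- Link (junction) table: one row per (student, course) pair.
CREATE TABLE Student_Courses (
    student_id INTEGER REFERENCES Students (student_id),
    course_id  INTEGER REFERENCES Courses (course_id),
    PRIMARY KEY (student_id, course_id)
);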

Physical and logical database models

Logical data model. At the next, lower level is the logical data model of the domain. The logical model describes the concepts of the subject area, their relationships, and the constraints on the data imposed by the subject area. Examples of concepts are "employee", "department", "project", "salary". Examples of relationships between concepts are "an employee is assigned to exactly one department", "an employee can carry out several projects", "several employees can work on one project". An example of a constraint is "the employee's age is not less than 16 and not more than 60".

The logical data model is the initial prototype of the future database. The logical model is built in terms of information units, but without reference to a specific DBMS. Moreover, the logical data model does not have to be expressed in terms of the relational data model. The main tool for developing a logical data model at present is the various flavors of ER diagrams (Entity-Relationship diagrams). The same ER model can be transformed into a relational data model, into a data model for a hierarchical or network DBMS, or into a post-relational data model. However, since we are considering relational DBMSs, we can assume that for our purposes the logical data model is formulated in terms of the relational data model.

The decisions made at the previous level, when developing the domain model, define boundaries within which the logical data model can be developed; within these boundaries, however, different decisions can be taken. For example, the domain model of warehouse accounting contains the concepts "warehouse", "invoice" and "goods". When developing the corresponding relational model these terms must be used, but there are many ways to implement them: one can create a single relation in which "warehouse", "invoice" and "goods" appear as attributes, or one can create three separate relations, one for each concept.
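The two alternatives could be sketched as follows; the columns are invented purely for illustration.

-- Alternative 1: a single relation with "warehouse", "invoice"
-- and "goods" as attributes of one wide table.
CREATE TABLE Stock_Movements (
    movement_id INTEGER     PRIMARY KEY,
    warehouse   VARCHAR(60),
    invoice_no  VARCHAR(20),
    goods       VARCHAR(80),
    quantity    INTEGER
);

-- Alternative 2: three separate relations, one per concept,
-- linked by foreign keys.
CREATE TABLE Warehouses (
    warehouse_id INTEGER     PRIMARY KEY,
    name         VARCHAR(60) NOT NULL
);

CREATE TABLE Goods (
    goods_id INTEGER     PRIMARY KEY,
    name     VARCHAR(80) NOT NULL
);

CREATE TABLE Invoices (
    invoice_no   VARCHAR(20) PRIMARY KEY,
    warehouse_id INTEGER     REFERENCES Warehouses (warehouse_id),
    goods_id     INTEGER     REFERENCES Goods (goods_id),
    quantity     INTEGER
);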

When developing a logical data model, questions arise: are the relations well designed? Do they correctly reflect the domain model, and therefore the domain itself?

Physical data model. At an even lower level is the physical data model, which describes the data by means of a specific DBMS. We will assume that the physical data model is implemented by means of a relational DBMS, although, as mentioned above, this is not obligatory. The relations developed at the stage of forming the logical data model are converted into tables, attributes become table columns, unique indexes are created for the key attributes, and domains are transformed into the data types accepted in the specific DBMS.

The constraints of the logical data model are implemented by various DBMS facilities, for example indexes, declarative integrity constraints, triggers, and stored procedures. Here, again, the decisions made at the level of logical modelling define boundaries within which the physical data model can be developed, and within those boundaries various decisions can be made. For example, the relations contained in the logical data model must be converted into tables, but for each table one can additionally declare various indexes that speed up data access. Much depends on the specific DBMS.
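For instance, turning a relation into a table and then adding an index that exists only at the physical level might look like this; the names are illustrative, and the exact data types depend on the DBMS.

-- The relation "Project" becomes a table; the key attribute gets
-- a unique index (often created implicitly for the primary key).
CREATE TABLE Projects (
    project_id INTEGER      PRIMARY KEY,
    title      VARCHAR(100) NOT NULL,
    start_date DATE
);

-- A purely physical decision: an extra index that speeds up searches
-- by title but changes nothing in the logical data model.
CREATE INDEX idx_projects_title ON Projects (title);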

When designing the physical data model, questions arise: are the tables well designed? Are the indexes chosen correctly? How much program code in the form of triggers and stored procedures needs to be written to maintain data integrity?

In the process of creating a database, several stages can be distinguished, at each of which the structure of the designed database is concretized and refined.

1) Creation of a conceptual model of the database.

The process of creating a conceptual model is more related to the design of the entire information system, one of the parts of which is the database. At this stage, the analysis of tasks solved in a specific subject area takes place, objects of the subject area and the relationship of these objects are described. The conceptual model can also contain a description of the processes that occur with the objects of the subject area, which makes it possible to fully take into account all the nuances of the functioning of the information system being developed. When designing a conceptual model, the features of the implementation of certain parts of the information system are not taken into account and the issues of increasing the efficiency of information processing are not considered.

2) Creation of a logical database model.

The logical model of the database is the result of the transformation of the conceptual model, in which information objects become the main objects. The latter are entities - objects or events, information about which must be stored in the database. Entities are characterized by a set of certain properties called attributes. The logical model reflects the logical relationships between entities, regardless of how the data will be stored. The logical model of the database is universal, since it is in no way connected with a specific implementation of the database. The names of entities and attributes in the logical model can be the same as the names used in real life.

An entity-relationship diagram (ER diagram) is used to describe the database schema at the logical design level. There are various kinds of entity-relationship diagrams; the ways of depicting their elements are called notations, and the same elements are drawn differently in different notations. Well-known examples are Martin's notation, the IDEF1X notation, and others. In addition, different software tools implementing the same notation may differ in their capabilities. All variants of entity-relationship diagrams are based on one idea - a drawing is always clearer than a textual description - and all of them use a graphical representation of the entities of the subject area, their properties (attributes), and the relationships between entities.

3) Creation of a physical model of the database.

The physical model is a mapping of the logical model onto a specific DBMS. Several different physical models can correspond to the same logical database model, reflecting the implementation features of specific DBMSs. The physical model must describe all information about the physical objects of the database: tables, columns, indexes, procedures, etc.

Modern design tools for the physical database model make it possible to generate, from the created model, the necessary statements (commands, queries) for the selected database management system. From these statements the DBMS builds the physical structure of the database in which the real information will be stored.

Consider the process of designing the database of an information system intended to store and maintain information about the premises of a university: classrooms, laboratories, and auxiliary rooms.

1) Creation of a conceptual database model

Any room is characterized by the following parameters: information about the building in which the room is located; room number; the floor on which it is located; a brief description of the location of the premises in the building; dimensions of the room (width, length and height of the ceiling in meters).

It should be borne in mind that for classrooms such parameters as the number of seats for students and the number of boards are important; laboratories are characterized by the number of laboratory benches and the maximum power consumption of the laboratory's electrical equipment; for auxiliary rooms a description of the room's purpose should be stored.

In addition, all premises can be characterized by additional optional details, for example, the surname and initials of the person in charge of fire safety; contact phone number of the person responsible for fire safety; the frequency of scheduled inspections and checks of the technical condition of the premises; type of ventilation and air conditioning system, etc.

The set of optional room details is not fixed and may grow during the operation of the information system.

It is permissible for each room to be used by several university departments. At the same time, the information system should take into account that the structure of university departments is hierarchical: some departments are part of others.

2) Creating a logical database model

Let us select the main information objects whose data will be stored in the database, the attributes of these objects, and the relationships between them. The logical model is represented as an entity-relationship diagram in the IDEF1X notation (Fig. 1).

Fig. 1. Logical database model

3) Creating a physical database model

Having selected the Oracle DBMS as the target, we transform the logical model into a physical one (Fig. 2).

Fig. 2. Physical model of the database
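Since the figures themselves are not reproduced here, the following Oracle-style DDL is only a hypothetical sketch of what the physical model of the premises database might contain; the table and column names are assumptions for illustration, not the actual model shown in Fig. 2.

-- Hypothetical fragment of the physical model in Oracle syntax.
CREATE TABLE Buildings (
    building_id NUMBER(10)    PRIMARY KEY,
    address     VARCHAR2(200) NOT NULL
);

CREATE TABLE Rooms (
    room_id     NUMBER(10)   PRIMARY KEY,
    building_id NUMBER(10)   NOT NULL REFERENCES Buildings (building_id),
    room_no     VARCHAR2(10) NOT NULL,
    floor_no    NUMBER(3),
    width_m     NUMBER(6, 2),
    length_m    NUMBER(6, 2),
    height_m    NUMBER(6, 2)
);

CREATE TABLE Departments (
    dept_id        NUMBER(10)    PRIMARY KEY,
    dept_name      VARCHAR2(100) NOT NULL,
    parent_dept_id NUMBER(10)    REFERENCES Departments (dept_id)  -- hierarchy of departments
);

-- A room may be used by several departments: a link table.
CREATE TABLE Room_Usage (
    room_id NUMBER(10) REFERENCES Rooms (room_id),
    dept_id NUMBER(10) REFERENCES Departments (dept_id),
    PRIMARY KEY (room_id, dept_id)
);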
