Introduction

1. The concept of an information system.

2. The concept of a database.

3. Evolution of database concepts

4. Requirements that the database organization must satisfy.

4.1. Establishing multilateral connections

4.2. Performance

4.3. Minimum costs

4.4. Minimal redundancy

4.5. Search options

4.6. Integrity

4.7. Security and privacy

4.8. Connection with the past

4.9. Connection with the future

4.10. Ease of use

5. Data representation models

5.1. Hierarchical data model

5.2. Network Data Model

5.3. Relational data model

5.3.1. Tables

5.3.2. Key fields

5.3.4. Ancestor/descendant relationships

5.3.5. Foreign keys

5.3.6. Relational algebra

5.3.7. Database Normalization

5.3.7.1. First normal form

5.3.7.2. Second normal form

5.3.7.3. Third normal form

5.3.7.4. Fourth normal form

5.3.7.5. Fifth normal form

6. SQL language as a standard database language.

6.1. SQL language

6.2. Advantages of SQL

6.2.1. Independence from specific DBMS

6.2.2. Portability from one computing system to others

6.2.3. SQL Language Standards

6.2.4. IBM SQL Endorsement (DB2)

6.2.5. ODBC Protocol and Microsoft

6.2.6. Relational basis

6.2.7. High-level structure reminiscent of English

6.2.8. Interactive queries

6.2.9. Programmatic database access

6.2.10. Different data views

6.2.11. A complete language for working with databases

6.2.12. Dynamic Data Definition

6.2.13. Client/server architecture

7. Database architectures

7.1. Local databases and file-server architecture

7.2. Remote Databases and Client-Server Architecture

8. Delphi environment as a tool for developing a DBMS

8.1. High-performance compiler to machine code

8.2. Powerful object-oriented language

8.3. Object-oriented model of software components

8.4. Library of visual components

8.5. Forms, modules and development method “Two-Way Tools”

8.6. Scalable tools for building databases

8.7. Customizable developer environment

8.8. Low hardware requirements

9. Database design

9.1. Infological data model

9.2. Infological data model "entity-relationship"

9.3. Datalogical data model

9.4. Transition from the ER model to the relational one.

9.5. Physical data model

9.6. Database design steps

10. Practical part

10.1. Subject area and tasks assigned to the database

10.2. Defining Database Objects

10.3. Infological and datalogical database models

10.4. Physical description of the model

10.5. Software implementation

Conclusion

Bibliography


INTRODUCTION


The experience of using computers to build application data processing systems shows that the most effective tool here is the database management system (DBMS, DataBase Management System).

The flows of information circulating in the world that surrounds us are enormous. They tend to increase over time. Therefore, in any organization, both large and small, the problem arises of organizing data management in a way that would ensure the most effective work. Some organizations use filing cabinets for this, but most prefer computerized methods - databases that allow you to effectively store, structure and systematize large volumes of data. And today it is impossible to imagine the work of most financial, industrial, trade and other organizations without databases. Without databases, they would simply drown in an information avalanche.

There are many good reasons for moving existing information onto a computer. Storing information in computer files now costs less than storing it on paper. Databases allow information to be stored, structured, and retrieved in a way that is optimal for the user. Client/server technologies save significant money and, most importantly, the time needed to obtain information; they also simplify access and maintenance, since they rest on comprehensive data processing and centralized storage. In addition, a computer can store data in any format: text, drawings, handwritten data, photographs, voice recordings, etc.

To use such huge volumes of stored information, in addition to advances in system devices, data transmission facilities, and memory, tools are needed for human-computer dialogue that allow the user to enter queries, read files, modify stored data, add new data, and make decisions based on stored data. To provide these functions, specialized tools were created: database management systems (DBMS). Modern DBMSs are multi-user systems that manage a body of information for one or many concurrent users.

Economic and statistical information grows daily, even by the second. Whereas earlier, owing to the limited computerization of the economy, very little information existed in electronic form, today that is commonplace. This raises a new problem: searching for and selecting the necessary information in the ocean of data we see today on the Internet and in local corporate networks. The correct organization of accumulating economic and statistical information for rapid retrieval and effective use is therefore a highly relevant topic today.

The purpose of this thesis is to evaluate new technologies for organizing the accumulation, storage, quick search, selection, and retrieval of information, technologies based on the relational concept of data models, and to show the advantages of one of them using a specific example.

The implementation of this task is carried out in the Delphi 5.0 programming system, which has extensive capabilities for creating database applications: the necessary set of drivers for accessing the best-known database formats; convenient, well-developed tools for accessing information located both on a local disk and on a remote server; and a large collection of visual components for building the windows displayed on the screen, which is needed to create a convenient interface between the user and the executable code.


Information system concept


For centuries, humanity has been accumulating knowledge, work skills, and information about the surrounding world - in other words, collecting information. At first, information was passed from generation to generation in the form of legends and oral stories. The emergence and development of book publishing made it possible to transmit and store information in a more reliable written form. Discoveries in the field of electricity led to the telegraph, telephone, radio, and television - means that made it possible to transmit and accumulate information quickly. Progress led to a sharp growth of information, so the question of preserving and processing it became more acute year by year. With the advent of computer technology, the methods of storing and, most importantly, processing information became significantly simpler. The development of microprocessor-based computing drives the improvement of computers and software, and programs have appeared that can process large flows of information. Such programs are used to create information systems. The purpose of any information system is to process data about objects and phenomena of the real world and to provide a person with the necessary information about them.

If we consider a collection of some objects, we can identify objects that have the same properties. Such objects are classified into separate classes. Within a selected class, objects can be arranged both according to general classification rules, for example, by alphabet, and by some specific general characteristics, for example, by color or material. Grouping objects according to certain characteristics greatly facilitates the search and selection of information. All this information is accumulated in a collection of files called a database, and special programs are created to manage these files - database management systems (DBMS).

Information systems (IS) can be divided into factual and documentary.

In factual ISs, facts are recorded: specific values of data (attributes) about objects in the real world. The main idea of such systems is that all information about objects (people's surnames and names of things, numbers, dates) is supplied to the computer in some predetermined format (for example, a date as the combination DD.MM.YYYY). The information handled by a factual IS has a clear structure that allows the machine to distinguish one piece of data from another: a person's surname from his position, a date of birth from a height, and so on. A factual system is therefore able to give unambiguous answers to the questions posed.
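This reliance on predetermined formats maps naturally onto typed table columns in SQL. A minimal sketch (the table and column names are illustrative, not taken from the source):

    CREATE TABLE person (
        person_id  INTEGER     PRIMARY KEY,  -- unique identifier of the record
        last_name  VARCHAR(40) NOT NULL,     -- surname, kept separate from position
        position   VARCHAR(40),              -- job title
        birth_date DATE,                     -- a date, entered e.g. as DD.MM.YYYY
        height_cm  SMALLINT                  -- height as a number, not confusable with a date
    );

Because every attribute is typed and named, the system can answer a question unambiguously, e.g. SELECT last_name FROM person WHERE birth_date < DATE '1980-01-01'.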

Documentary ISs serve a fundamentally different class of tasks, which do not require a single definite answer to the question posed. The database of such a system is formed by a set of unstructured text documents (articles, books, abstracts, etc.) and graphic objects, equipped with some formalized search apparatus. The purpose of the system is, as a rule, to return in response to a user's request a list of documents or objects that to some degree satisfy the conditions formulated in the request.

This classification of information systems is to a certain extent outdated, since modern factual systems often work with unstructured blocks of information (texts, graphics, sound, video) equipped with structured descriptors. Given certain factors, a factual system can turn into a documentary one (and vice versa).

Factual information systems, which are used in literally all spheres of human activity, are the more suitable choice for processing economic and statistical information.

The concept of a database.

There is a well-known, but difficult to implement in practice, concept of a database as a large storage facility in which an organization places all the data it needs and from which various users can obtain this data. The memory devices that store all the data may be located in one or more places; in the latter case they must be connected by means of data transmission. Programs must have access to the data.

Indeed, most databases available today are designed for a limited number of applications. Often several databases are created on one computer. Over time, databases designed to implement separate related functions can be combined if such a combination will increase the efficiency and intensity of use of the entire system.

A database can be defined as a collection of interrelated data stored together with such minimal redundancy that it can be used optimally for one or more applications; the data is stored so as to be independent of the programs that use it; and a single controlled method is used to add new data and to modify and search existing data.

A system is said to contain a collection of databases if these databases are structurally completely independent. In systems with simple data organization, each application creates its own set of records. The purpose of a database is to ensure that the same set of data can be used for as many applications as possible. Based on this, a database is often developed as a repository for such information, the need for which arises in the process of performing certain functions in a factory, government agency or some other organization. Such a database must provide the ability not only to obtain information, but also to constantly modify it as necessary for the management processes of the organization; it may be necessary to search the database to obtain information for planning purposes or to answer questions. The data set can be used by several departments, regardless of whether there are departmental barriers between them.

The database can be designed for batch processing, real-time processing, or online processing (in this case, processing of each request is completed at a certain point in time, but the processing time is not subject to the strict restrictions that exist in real-time systems). Many databases provide a combination of these processing methods, and many database systems provide real-time terminal service alongside batch processing.

Most disk or tape libraries that existed before database management tools came into use contained a large amount of repeated information. When storing many data elements, redundancy was tolerated, since the same data was recorded on storage media for different purposes and, in addition, different modifications of the same data were stored. A database makes it possible to eliminate such redundancy to a large degree. A database is sometimes defined as a non-redundant collection of data elements. In reality, however, many databases contain some redundancy, introduced to reduce data access time or to simplify addressing. Some records are duplicated so that data can be recovered if it is accidentally lost. For a database to be non-redundant and still satisfy the other requirements, compromises must be made. We then speak of managed, or minimal, redundancy, or say that a well-designed database is free of unnecessary redundancy.

Unmanaged redundancy has several disadvantages. First, storing multiple copies of data entails additional costs. Second, when redundant copies exist, updating a value requires multiple update operations. Redundancy is therefore considerably more expensive when large amounts of information are updated during file processing or, worse, when new elements are frequently added and old ones destroyed. Third, because different copies of the data may be at different stages of updating, the information produced by the system may be inconsistent.
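The update problem is easy to make concrete. In the hedged sketch below (an illustrative schema, not from the source), a redundant design repeats the customer's address in every order row, so one address change means many updates and any missed row leaves the data inconsistent; the controlled design stores the address exactly once:

    -- Redundant design: the customer's address is copied into every order row
    CREATE TABLE orders (
        order_id         INTEGER PRIMARY KEY,
        customer_name    VARCHAR(60),
        customer_address VARCHAR(80)        -- same value repeated in many rows
    );
    -- One address change must touch every order of that customer;
    -- a missed row leaves the database inconsistent
    UPDATE orders SET customer_address = '12 New Street'
    WHERE  customer_name = 'Ivanov';

    -- Controlled design: the address is stored exactly once
    CREATE TABLE customers (
        customer_id INTEGER PRIMARY KEY,
        name        VARCHAR(60),
        address     VARCHAR(80)
    );
    UPDATE customers SET address = '12 New Street'
    WHERE  customer_id = 42;                -- a single row is updated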

Without databases, processing large amounts of information creates so much redundant data that it becomes virtually impossible to keep it all at the same update level. Users then discover obvious contradictions in the data and stop trusting the information received from the computer. The inability to keep redundant data at the same update level is a major obstacle to computer-assisted data processing.

One of the most important characteristics of most databases is their constant change and expansion. As new data types are added or new applications appear, it must be possible to quickly change the database structure. Database reorganization should be carried out without rewriting application programs whenever possible and generally involve a minimum number of transformations. The ease of changing a database can have a big impact on the development of database applications in production management.

Data independence is often called one of the fundamental properties of a database. It means the mutual independence of the data and the application programs that use it, in the sense that changing one does not require changing the other. In particular, the application programmer is insulated from changes in the data and its organization, and from changes in the characteristics of the physical devices on which it is stored. In reality, completely independent data is about as rare as completely non-redundant data. As we will see below, data independence is defined from different perspectives, and the information a programmer must have to access data varies from database to database. Nevertheless, data independence is one of the main reasons for using database management systems.

When one set of data elements is used by many applications, many different relationships needed by the corresponding application programs are established between the elements of that set. The organization of a database depends heavily on how the interrelations between data items and records are implemented, and on how and where the data is stored. In a database used by many applications, numerous intermediate relationships between elements must be established, and it then becomes harder to control the accuracy of the data and to ensure its protection and secrecy than with simple, unrelated files. Data secrecy and recovery after failures are therefore very important issues in database design.

Some database management tools allow users to use data in ways not anticipated by the system's designers. Administrators or employees may approach the computing system with questions that were never provided for in advance. Supporting this capability means organizing the data so that it can be accessed along different paths and the same data can be used to answer different questions. All essential information about objects is stored at once and in full, not just the part needed for one application.

Currently there are DBMSs that implement these capabilities both at the level of local databases located on a single disk (Paradox, dBase) and at the level of industrial databases (Access, Oracle, FoxPro).

Evolution of Database Concepts

The concept of a database appeared in the late 1960s. Before then, the computing field spoke of data files and data sets.

Before the advent of third-generation computers (the first of which were installed in 1965), data processing software performed primarily input-output operations. Data organization had to be handled when writing the application programs, and it was done in a rudimentary way: data was usually organized as simple sequential files on magnetic tape. There was no data independence. If the organization of the data or the storage devices changed, the application programmer had to modify the programs accordingly, recompile them, and then debug them. To update a file, a new one had to be written; the old file was kept and called the original file, and the previous version, and often earlier versions as well, were kept too. Many files were used for a single application. Other applications often used the same data, but usually in a different form, with different fields, so different files had to be created from the same data. As a result, the level of redundancy in the system was very high, and different files contained the same data elements.

Random access files were sometimes used, which allowed the user to directly access any record in the file instead of having to view the entire file sequentially. The means for addressing records were provided by the application programmer when writing the program. If storage devices were changed, major changes had to be made to the application program. In practice, changing storage devices is inevitable. New technology has led to a significant reduction in the cost of storing one bit of information, and file sizes today often exceed the size of previously used storage devices.

Stage 2 (late 60s) is characterized by a change compared to stage 1 in both the nature of the files and the devices on which they were stored. An attempt is made to protect the application programmer from the influence of changes in hardware. The software allows the physical arrangement of data to be changed without changing its logical representation, provided that the content of the records or the underlying file structure does not change.

Files corresponding to this stage of development of data processing tools, like Stage 1 files, are intended for a single application or for closely related applications.

As commercial data processing tools evolved, it became clear that it was desirable to make application programs independent not only from changes in file storage hardware and from increasing file sizes, but also from the addition of new fields and new relationships to the stored data.

It is known that a database is a constantly evolving object that is used by an increasing number of applications. New records are added to the database, and new data elements are included in existing records. The structure of the database will change in order to improve the efficiency of its functioning and when new types of queries are added. Users will change requirements and modify the types of data requests.

The database structure is less static than the file structure. The elements of data stored and how they are stored are constantly changing. If the computer system's data organization is constrained by the requirement for a consistent file structure, then when it changes, programmers spend a lot of time modifying existing programs instead of developing new applications.

In one case, the program might communicate only the name of the data element or record it wants to retrieve; in another (with different software), it would supply the identifier of the data element and the name of the set containing it. Adding new data elements to records without changing the application programs is possible as long as the software communicates with the data at the data-element (field) level rather than at the record level. This often leads to complex data structures, but good database software relieves the application programmer of the difficulties associated with structural complexity. However the data is actually organized, the application programmer can think of the file as a relatively simple structure designed to suit his requirements.

Stage 3 database software (early 1970s) provided means of mapping the application programmer's file structure into the physical data structure stored on the physical medium, and vice versa.

Depending on the level of the software, the application programmer might also need to know how the data file was organized, in which case he might have to specify the machine address of the data. Without data independence, the application programmer needs to know the exact physical format of the record. The worst case is when the programmer must act as a "navigator".

The process of converting an application programmer's access to a logical record or to elements of a logical record into machine access to a physical record and its elements is called binding. A binding is a connection between a physical representation of data and a program that uses that data. Once the binding process is completed, the program will no longer be independent of the physical data.

So, for the 3rd stage:

Different logical files could be derived from the same physical data.

The same data was accessed by different applications in different ways to suit the requirements of those applications.

The software contained features to reduce data redundancy.

Data elements were common across applications.

The physical structure of the data is independent of the application programs. It could be changed to improve the efficiency of the database without causing modifications to application programs.

Data is addressed at the field or group level.


As experience was gained with early database management systems, it quickly became apparent that an additional level of data independence was needed. The overall logical structure of data is typically complex and inevitably changes as the database grows. It is therefore important to be able to change the overall logical structure without changing the many application programs involved. In some systems, change in the overall logical structure of the data is its normal mode of existence: the structure is in constant development. Two levels of data independence are therefore required, called logical and physical data independence.

Logical data independence means that the overall logical structure of the data can be changed without changing the application programs (the change, of course, should not consist of removing from the database those elements that are used by the application programs).

Physical data independence means that the physical layout and organization of data can change without causing changes to either the overall logical structure of the data or the application programs.
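In relational systems, one common mechanism that provides a degree of logical data independence is the view. A hedged sketch with an illustrative schema: programs read the view, so the stored table can be extended or reorganized underneath it without touching them.

    CREATE TABLE employee (
        employee_id INTEGER PRIMARY KEY,
        last_name   VARCHAR(40) NOT NULL,
        department  VARCHAR(40)
    );

    -- Application programs query the view, not the stored table
    CREATE VIEW employee_public AS
        SELECT employee_id, last_name, department
        FROM   employee;

    -- The base table may later gain columns (or be restructured)
    -- without any change to programs that read employee_public
    ALTER TABLE employee ADD COLUMN salary DECIMAL(10, 2);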

Stage 4 is characterized by the ideas of logical and physical independence of data; the logical structure of data can be very different from the physical structure of the data and from its representations in specific application programs. The database software actually transforms the application programmer's view of the data into a common logical view and then maps the logical view onto the physical representation of the data.

Such a structure gives maximum freedom to change data structures without redoing the work already invested in building and using the database.

The database can be developed without high maintenance costs.

The tools provided for the data administrator allow him to perform the functions of a controller and ensure the safety of data.

Effective management procedures are in place to protect the privacy, integrity and security of data.

Some systems use inverted files to allow quick searching of data in the database.

Databases are designed to provide answers to information requests that are not planned in advance.

A means of moving data is provided.

Requirements that a database organization must satisfy.

This issue has long been studied by various groups in institutions that use computers, in government commissions, and in public computing centers. The CODASYL Committee has published reports on the topic (CODASYL is the organization that developed the COBOL language). The IBM user organizations SHARE and GUIDE formulated requirements for database management systems in their report. The ACM (Association for Computing Machinery) also studied the issue.

Listed below are the basic requirements for organizing a database.

Establishing multilateral connections

Different programmers require different logical files, and these files are derived from the same set of data. Various relationships may exist between the elements of the stored data, and some databases will contain complex webs of relationships. The method of organizing data should make it possible to represent these relationships conveniently and to coordinate changes to them quickly. The database management system must be able to derive the required logical files from the existing data and the relationships among it. The application program's view of a logical file may bear little resemblance to the way the data is physically stored.

Performance

Databases designed for use by terminal operators must provide response times satisfactory for human-terminal dialogue. In addition, the database system must provide adequate throughput. In systems designed for a small flow of requests, throughput imposes only minor restrictions on the database structure; in high-traffic systems, such as airline reservation systems, it has a decisive influence on the choice of physical data storage organization.

In batch-only systems, response time is not as important and the physical organization method can be chosen to ensure efficient batch processing.

Minimum costs

To reduce the cost of creating and operating a database, organization methods are selected that minimize external memory requirements. With these techniques, the physical representation of data in memory can be very different from the representation used by the application programmer. The conversion from one representation to another is carried out by software or, if possible, hardware or firmware. In such cases, you have to choose between the cost of the conversion algorithm and saving memory.

Minimal redundancy

In processing systems that existed before the use of database management systems, information collections had a very high level of redundancy. Most tape libraries contained large amounts of redundant data. Even with databases, as more information is collected into integrated databases, the potential for redundant data gradually increases. Redundant data is expensive in the sense that it takes up more memory than necessary and requires more than one update operation. The goal of database organization should be to eliminate redundant data where it is beneficial and to control the inconsistencies that are caused by the presence of redundant data.

Search options

A database user may put a variety of questions about the stored data to it. In most modern commercial applications the types of queries are predefined, and the physical organization of the data is designed to process them at the required speed. Increasingly, however, systems are expected to process queries and produce responses that were not planned in advance.

Integrity

If the database contains data shared by many users, it is very important that the data elements and the relationships between them are not destroyed. It is necessary to take into account the possibility of errors and various types of random failures. Data storage, updating, and data inclusion procedures must be such that the system, in the event of failures, can restore data without loss. It is necessary that the computing system guarantees the integrity of the data stored in it.

Security and privacy

Data in database systems must be kept confidential and secure. The stored information is sometimes extremely important to the institution using it, and it must not be lost or stolen. To increase the viability of the information in a database, it is important to protect it from hardware and software failures, from catastrophic and criminal situations, and from incompetent or malicious use.

Data security means protecting the data from accidental or deliberate access by persons not authorized to have it, and from unauthorized modification or destruction of the data.

Privacy is defined as the right of individuals or organizations to determine when, how, and to what extent information about them may be communicated to other individuals or organizations.

Connection with the past

Organizations that have been operating data processing systems for some time have spent significant sums on writing programs and procedures and on organizing data storage. When a company begins to use new database management software on a computing installation, it is very important that the new software work with the programs already existing there and that the processed data can be converted appropriately. This condition requires software and information compatibility, and its absence can become the main limiting factor in the transition to new database management systems. It is important, however, that the problem of communication with the past not hold back the development of database management tools.

Connection with the future

The connection with the future is especially important. In the future, data and its storage environment will change in many ways, and any commercial organization changes over time. Such changes are especially expensive for the users of data processing systems: enormous sums go into data conversion and into rewriting and debugging the application programs affected by each change, which greatly hinders the development of these systems. Over time, the number of application programs in an organization grows, and the prospect of rewriting all of them becomes unrealistic. One of the most important tasks in database design is therefore to plan the database so that changes can be made to it without modifying the application programs.

Ease of use

The means used to represent the overall logical description of data should be simple and elegant.

The software interface should be oriented toward the end user and allow for the possibility that the user has no background in database theory.


Data representation models

With the growing popularity of DBMSs in the 1970s and 1980s, many different data models emerged. Each had its own advantages and disadvantages, and this experience played a key role in the development of the relational data model, which arose largely out of the desire to simplify and streamline the earlier models.

Modern databases are built on data models that make it possible to describe the objects of a subject area and the relationships between them. There are three main data models, and combinations of them, on which databases are based: the relational data model (RDM), the network data model (NDM), and the hierarchical data model (HDM).

The main difference between these data models lies in the way they describe the interactions between objects and attributes. A relationship expresses an association between sets of data.

One-to-one, one-to-many, and many-to-many relationships are used.
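In relational terms these relationships are expressed with primary and foreign keys. A hedged sketch with illustrative tables: a one-to-many link is a plain foreign key, a many-to-many link goes through a junction table, and a one-to-one link is a foreign key constrained to be unique.

    CREATE TABLE department (
        dept_id INTEGER PRIMARY KEY,
        name    VARCHAR(60) NOT NULL
    );

    CREATE TABLE staff (
        staff_id INTEGER PRIMARY KEY,
        name     VARCHAR(60) NOT NULL,
        dept_id  INTEGER NOT NULL REFERENCES department(dept_id)  -- one-to-many
    );

    CREATE TABLE project (
        proj_id INTEGER PRIMARY KEY,
        title   VARCHAR(80) NOT NULL
    );

    -- Many-to-many: one row per (staff member, project) pair
    CREATE TABLE assignment (
        staff_id INTEGER REFERENCES staff(staff_id),
        proj_id  INTEGER REFERENCES project(proj_id),
        PRIMARY KEY (staff_id, proj_id)
    );
    -- A one-to-one link would be a foreign key additionally declared UNIQUE.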


Concept of databases used in AIWS

Section 2

Control questions

1. What is data, information, knowledge?

2. Define a database (DB).

3. What is the purpose of a database?

4. Define the concepts “file”, “record”, “attribute”, “domain”, “field”, “key”, “superkey”, “architecture”, “data schema”, “data model”, “tuple”, “data dictionary”.

5. Define the concepts of “subject area”, “application”, “program”, data description language (DDL), data manipulation language (DML).

6. Give the classification of DBMSs and databases.

7. Describe the composition of a DBMS.

8. Show the relationship between the DBMS and the DBA.

9. List the database operating procedures.

10. Name the components of database theory.

11. List the main elements of the database structure from the standpoint of its implementation.

12. What is the purpose of OLTP and OLAP? What is the relationship between their properties?

13. Describe the composition of OLAP.

14. Name the types of multidimensional models.

A concept, in the general sense, is a system of views on a process or phenomenon. The components of a concept are a set of principles and a methodology, where a methodology is understood as a set of methods for solving a problem.

A principle is a rule to be followed in an activity. Principles are often formulated as restrictions and requirements - in particular, requirements for databases.

From a modern perspective, the requirements for transactional (operational) databases and data warehouses should be considered separately.

First, let us list the basic requirements that apply to operational databases, and, consequently, to the DBMS on which they are built.

1. Ease of updating data. Updating means adding, deleting, and changing data.

2. High performance (short response time to a request). The response time is the interval between the moment a request is issued to the database and the moment the data is actually received. A related term is access time: the interval between issuing a write (read) command and the actual receipt of the data. Access here means the operation of searching for, reading, or writing data.

3. Data independence.

4. Sharing data among many users.

5. Data security - protecting data from intentional or inadvertent breach of confidentiality, from distortion, and from destruction.

6. Standardization of the construction and operation of databases (in effect, of DBMSs).

8. Friendly user interface.

The first two requirements are the most important and they contradict each other: improving performance requires simplifying the database structure, which in turn complicates updating the data and increases its redundancy.

Data independence is the ability to change the logical and physical structures of the database without changing users' views of the data. It presupposes invariance to the nature of data storage, to the software, and to the hardware, and it ensures minimal changes to the database structure when the data access strategy or the structure of the source data changes. This is achieved, as will be shown below, by "shifting" all changes to the conceptual and logical design stages, with minimal changes at the physical design stage.

Data security includes data integrity and data protection. Data integrity is the resistance of stored data to damage and destruction caused by hardware faults, system errors, and erroneous user actions.

It assumes:

The absence of inaccurately entered data and of two identical records about the same fact;

Protection against errors when updating the database;

The impossibility of deleting related data in different tables separately (cascading deletes);

No corruption of data when working in multi-user mode and with distributed databases;

Data safety in case of equipment failures (data recovery).

Integrity is ensured by integrity triggers - special application programs that fire under specified conditions. In some DBMSs (for example, Access, Paradox) triggers are built in.
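Much of this integrity checking can also be declared in SQL rather than programmed. A hedged sketch (illustrative tables) covering duplicate rejection, protection against erroneous updates, and cascading deletion of related rows:

    CREATE TABLE customer (
        customer_id INTEGER PRIMARY KEY,
        tax_number  CHAR(12) UNIQUE NOT NULL           -- two records for one fact are rejected
    );

    CREATE TABLE contract (
        contract_id INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL
            REFERENCES customer(customer_id)
            ON DELETE CASCADE,                         -- related rows cannot be left orphaned
        amount      DECIMAL(12, 2) CHECK (amount >= 0) -- guards against erroneous updates
    );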

Protecting data from unauthorized access involves restricting access to confidential data and can be achieved by:

Introduction of a password system;

Obtaining permissions from the database administrator (DBA);

A DBA-imposed ban on access to data;

Formation of views - tables derived from the original ones and intended for specific users.

The last three procedures are easily carried out in the structured query language SQL (Structured Query Language), often in its SQL2 version.
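A hedged sketch of those procedures in standard SQL, reusing the illustrative contract table from the sketch above and assuming the roles clerk and auditor exist:

    -- The DBA grants, and can later withdraw, access rights
    GRANT SELECT, UPDATE ON contract TO clerk;
    REVOKE UPDATE ON contract FROM clerk;

    -- A view exposes only what a particular group of users may see
    CREATE VIEW contract_amounts AS
        SELECT contract_id, amount
        FROM   contract;
    GRANT SELECT ON contract_amounts TO auditor;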

Standardization ensures continuity across generations of DBMSs and simplifies the interaction of databases of the same DBMS generation with the same or different data models. Standardization (ANSI/SPARC) has been carried out to a large extent for the DBMS user interface and for the SQL language. This has made it possible to successfully solve the problem of interaction between different relational DBMSs both by means of SQL and by means of Open Database Connectivity (ODBC). In this case both local and remote access to data is possible (client-server technology or a network configuration).

Let's move on to the requirements for data warehouses, which are structurally a continuation of operational databases.

Suppose a database contains data on the performance of third-year students, the current semesters being the fifth and sixth. The data for the first four semesters has been placed (transferred) into a data warehouse (DW), in effect an additional, specialized database. We need to request from the warehouse the names of the students who earned only excellent marks in the first four semesters.

In other words, data from the operational database is periodically transferred to an electronic archive (in the example considered, data for the first four semesters), and then can be processed in accordance with the user’s request.
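Against an assumed warehouse schema (the source gives none), the request might read:

    -- Students whose lowest grade in semesters 1-4 is 5 ("excellent")
    SELECT s.student_id, s.student_name
    FROM   students s
    JOIN   archived_grades g ON g.student_id = s.student_id
    WHERE  g.semester BETWEEN 1 AND 4
    GROUP  BY s.student_id, s.student_name
    HAVING MIN(g.grade) = 5;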

Since the data in the warehouse practically never changes and is only added to, the requirement of easy updating becomes irrelevant. Because of the significant volume of data in the warehouse, the requirement of high performance comes to the fore.

The following additional requirements apply to data warehouses:

High performance of loading data from operational databases;

The ability to filter, reformat, and index the source data, check its integrity, and update the metadata;

Increased requirements for the quality of the source data in terms of consistency, since the data may come from different sources;

High query performance;

Support for high dimensionality;

Simultaneous access to the data warehouse;

Availability of administration tools;

Support for data analysis using appropriate methods (tools).

Based on his experience, E. F. Codd formulated the following requirements for an OLAP system.

1. Multidimensional conceptual representation of data.

2. Transparency of the technology and of the data sources.

3. Accessibility of data sources that use various data models.

4. Consistent reporting performance as the data volume, the number of dimensions, and the amount of data summarization grow.

5. Use of a flexible, adaptive, scalable client-server architecture.

6. Generic dimensionality (formulas and reporting tools should not be tied to specific types of dimensions).

7. Dynamic handling of sparse matrices (empty NULL values must be stored efficiently).

8. Multi-user support.

9. Unrestricted cross-dimensional operations.

10. Intuitive data manipulation.

11. Flexible report generation tools.

12. Unlimited number of dimensions and levels of aggregation.

The listed requirements are different from the requirements for operational databases, which led to the emergence of specialized databases - data warehouses.
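Though far short of a full OLAP system, the flavor of a multidimensional summary can be sketched with the grouping extensions of standard SQL (ROLLUP has been part of the standard since SQL:1999; the sales table is illustrative):

    -- Subtotals by region, by region and year, plus a grand total, in one query
    SELECT region, sale_year, SUM(amount) AS total_sales
    FROM   sales
    GROUP  BY ROLLUP (region, sale_year);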

Annotation: The lecture discusses the general meaning of the concepts of database (DB) and database management system (DBMS). Basic concepts related to the database are given, such as algorithm, tuple, object, entity. Basic requirements for a data bank. Definitions of DB and DBMS.

Purpose of the lecture: Understand the difference between a database and a database management system. Familiarize yourself with the basic requirements for a data bank and the basic definitions related to databases and DBMSs.

Let's consider the general meaning of the concepts of database (DB) and database management system (DBMS).

From the very beginning of the development of computer technology, two main directions of its use emerged.

The first direction was the use of computer technology to perform numerical calculations that take too long or are impossible to do manually. The formation of this direction was furthered by the intensive development of numerical methods for solving complex mathematical problems, by the emergence of a class of programming languages designed for the convenient expression of numerical algorithms, and by feedback to the developers of new computer architectures.

The second direction was the use of computer technology in automatic or automated information systems. In the broadest sense, an information system is a software complex whose functions are to reliably store information in computer memory, to perform the application-specific transformations of the information and/or computations, and to provide users with a convenient, easy-to-learn interface. Typically the volumes of information such systems deal with are quite large, and the information itself has a rather complex structure. Classic examples of information systems are banking systems and systems for reserving airline or train tickets, hotel rooms, and the like.

In fact, the second direction arose somewhat later than the first, because in the early days of computing, computers had limited memory capabilities. Reliable, long-term storage of information is possible only on storage devices that retain information after the power is turned off, and RAM does not usually have this property. At first, two kinds of external memory devices were used: magnetic tapes and drums. Tapes had a rather large capacity but, by their physical nature, provided only sequential access to data. Magnetic drums (closest to modern fixed-head magnetic disks) allowed random access to data but were of limited size.

It is easy to see that these restrictions are not very significant for purely numerical calculations. Even if a program must process (or produce) a large amount of information, when programming, you can think about the location of this information in external memory so that the program runs as quickly as possible.

On the other hand, for information systems in which the need for particular data is determined by the user, having only magnetic tapes and drums is unsatisfactory. Imagine a ticket buyer standing at the ticket office who must wait until a magnetic tape is completely rewound. One of the natural requirements for such systems is therefore an acceptable average speed of executing operations.

It was the demands placed on computing technology by non-numerical applications that brought about removable magnetic disks with movable heads, a revolution in the history of computing. These external memory devices had significantly larger capacity than magnetic drums, provided satisfactory access speed in random access mode, and, because disk packs could be swapped on a device, allowed a practically unlimited archive of data.

With the advent of magnetic disks, the history of external-memory data management systems began. Previously, every application program that needed to store data in external memory decided for itself where each piece of data went on magnetic tape or drum, and performed the exchanges between RAM and external memory using low-level software and hardware (machine instructions or calls to the corresponding operating system routines). This mode of operation makes it impossible, or very difficult, to maintain several archives of long-term stored information on one external medium. In addition, each application program had to solve for itself the problems of naming parts of the data and structuring data in external memory.

A historic step was the transition to file management systems. From an application program's point of view, a file is a named area of external memory into which data can be written and from which data can be read. The rules for naming files, the way the data stored in a file is accessed, and the structure of that data depend on the specific file management system and possibly on the file type. The file management system takes care of allocating external memory, mapping file names to the appropriate external memory addresses, and providing access to the data.

Any information processing and decision-making task can be represented in the form of a diagram shown in Fig. 1.1.


Fig. 1.1. Scheme of an information processing and decision-making task: input and output information and the rules for its transformation.

Definition of Key Terms

Let us define the main terms. The components identified in the diagram are information (input and output) and the rules for its transformation.

The rules may take the form of algorithms, procedures, or heuristic sequences.

An algorithm is a sequence of rules for passing from initial data to a result. The rules may be executed by a computer or by a human.
Data is a set of objective information.
Information is information previously unknown to its recipient, adding to his knowledge and confirming or refuting his positions and beliefs. Information is subjective in nature: it is determined by the subject's level of knowledge and degree of perception, and it is extracted by the subject from the corresponding data.
Knowledge is a set of facts, patterns, and heuristic rules with whose help a task is solved.

A sequence of data processing operations is called an information technology (IT). Given the significant amounts of information in modern tasks, the information must be ordered. There are two approaches to ordering it.

  1. Data is tied to a specific task (array technology) - ordering by use. But algorithms are more mobile (change more often) than data, which forces the data to be reordered and, moreover, the same data may be repeated in different tasks.
  2. For this reason another, widely used technology was proposed - database technology - which is ordering by storage.

A database (DB) is understood as a collection of data stored together with such minimal redundancy that it can be used optimally for one or more applications. The purpose of creating databases, as one variety of information technology and one form of data storage, is to build a data system that does not depend on the algorithms (software) adopted, on the technical means used, or on the physical location of the data in the computer, and that provides consistent, complete information for unregulated queries. A database presupposes multi-purpose use: several users, many forms of documents, and many queries from a single user.

A knowledge base (KB) is a set of databases together with the applicable rules obtained from decision makers (DMs).

Along with the concept of “database,” there is the term “data bank,” which has two interpretations.

  1. Data is now processed in a decentralized manner (at individual workplaces) using personal computers (PCs). Initially, centralized processing on large computers was used; because of that centralization, the database was called a data bank, and for this reason databases and data banks are often not distinguished.
  2. A data bank is a database together with its database management system (DBMS). A DBMS (for example, FoxPro) is an application for creating databases as collections of two-dimensional tables.
A data bank (BnD) is a system of specially organized data, software, language, organizational, and technical tools designed for the centralized accumulation and collective multi-purpose use of data.
A database (DB) is a named collection of data that reflects the state of objects and their relationships in the subject area under consideration. A characteristic feature of databases is persistence: data is constantly accumulated and used; the composition and structure of the data needed to solve particular applied problems are usually constant and stable over time; individual or even all data elements may change, but this too is a manifestation of constancy - constant relevance.
A database management system (DBMS) is a set of language and software tools designed for creating, maintaining, and sharing a database among many users.

Archives are sometimes included in a data bank. The justification is a special mode of data use in which only part of the data is under the operational control of the DBMS; the remaining data usually resides on media not operationally managed by the DBMS. The same data may at different times belong to the database or to the archive. A data bank may have no archive, but if it has one, it may also include an archive management system.

Effective management of external memory is the main function of a DBMS. These usually highly specialized mechanisms are so important for efficiency that without them the system simply could not perform certain tasks - they would take too long. Yet none of these specialized functions is visible to the user: they provide the independence between the logical and physical levels of the system, so that the application programmer does not have to write indexing routines, allocate disk memory, and so on.

Basic requirements for data banks

The development of the theory and practice of creating information systems based on the concept of databases, the creation of unified methods and means of organizing and retrieving data make it possible to store and process information about more and more complex objects and their relationships, providing multidimensional information needs of different users. The basic requirements for data banks can be formulated as follows:

  • Reuse of data: users must be able to use the data in a variety of ways.
  • Simplicity: users must be able to easily find out and understand what data is available to them.
  • Ease of use: users should be able to access data in a (procedurally) simple manner, with all the complexities of data access hidden inside the database management system itself.
  • Flexibility of use: data should be accessible or searchable through various access methods.
  • Fast processing of data requests: queries for data must be processed by a high-level query language, not only by application programs written for the purpose of processing specific queries.
  • The language in which end users interact with the system must give them the ability to obtain data without using application programs.

The database is the basis for the future growth of application programs: databases should enable the rapid and cheap development of new applications.

  • Saving intellectual labor: existing programs and logical data structures should not have to be altered when changes are made to the database.
  • Availability of an application programming interface: application programs must be able to perform data queries simply and efficiently; programs must be isolated from file locations and data addressing methods.
  • Distributed data processing: the system must operate in computer-network environments and provide users with efficient access to any data of the distributed database, wherever it is located in the network.
  • Adaptability and extensibility: the database must be configurable, and configuration must not require rewriting application programs. In addition, the set of predefined data types supplied with the DBMS must be extensible: the system must have tools for defining new types, and there should be no difference in how system-defined and user-defined types are used.
  • Data integrity control: the system must detect errors in the data and check the mutual logical consistency of the data.
  • Data recovery after failures: automatic recovery without loss of transaction data. In the event of hardware or software failures, the system must return to some consistent state of the data.
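The consistent-state requirement is exactly what transactions provide. A minimal sketch in standard SQL, assuming an illustrative account table:

    START TRANSACTION;                                          -- begin a unit of work
    UPDATE account SET balance = balance - 100 WHERE id = 1;    -- debit one account
    UPDATE account SET balance = balance + 100 WHERE id = 2;    -- credit another
    COMMIT;                                                     -- both changes become durable together
    -- After a failure (or an explicit ROLLBACK) the system returns
    -- to the consistent state that existed before the transaction began.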
A data bank, being a system of specially organized data and tools, includes the following components:

  • Linguistic means;
  • Software;
  • Technical means;
  • Organizational and administrative subsystems and normative and methodological support.

Organizational and methodological means are a set of instructions, methodological and regulatory materials, and descriptions of the structure and procedures by which the user works with the DBMS and the database.

Database and DBMS users

Users of databases and DBMSs can be divided into two main categories: end users and database administrators.

The database administrator (DBA) deserves special mention. Naturally, the database is built for the end user. However, it was initially assumed that end users would not be able to work without a specialist programmer, who was called the database administrator. With the advent of modern DBMSs, the systems themselves took over a significant part of the DBA's functions, especially for databases with small amounts of data; for large centralized and distributed databases, however, the need for a DBA remains. In a broad sense, the term DBA covers systems analysts, designers of data structures and information support, designers of processing technology, system and application programmers, operators, and specialists in the subject area and in software maintenance. In other words, for large databases this may be a team of specialists. The responsibilities of the DBA include:

  1. analysis of the subject area, of the state of its information, and of the users;
  2. design of the data structure and modification of the data;
  3. definition and enforcement of data integrity;
  4. data protection;
  5. ensuring database recovery;
  6. collection and statistical processing of accesses to the database, and analysis of the database's efficiency;
  7. work with users.

Brief summary

A database (DB) is a named collection of data that reflects the state of objects and their relationships in the subject area under consideration.

A database management system (DBMS) is a set of language and software tools designed for creating, maintaining and sharing a database with many users.

The main requirements for data banks are: data reuse, simplicity, ease of use, flexibility of use, fast processing of data requests, and an end-user interaction language.

DBMS users can be divided into two main categories: end users and database administrators.

Self-test questions

  • Define a database.
  • Define a data bank.
  • Name two interpretations of the data bank.
  • What is a database management system?
  • Name the basic requirements for a data bank.
  • What is data, information, knowledge?
  • Who are the users of a DBMS and a database?
  • Name the basic functions of a database administrator.
  • What makes it possible to quickly and cheaply develop new applications?

Database Design is the process of creating a database project designed to support the functioning of an economic entity and contribute to the achievement of its goals.

It is a labor-intensive process that requires the joint efforts of analysts, designers and users.

When designing a database, it is necessary to take into account the fact that the database must satisfy a set of requirements. These requirements are as follows:

1) database integrity – the requirement for completeness and consistency of data;

2) data reuse;

3) quick search and retrieval of information in response to user queries;

4) ease of data updating;

5) minimizing data redundancy;

6) protection of data from unauthorized access, distortion and destruction.

Stages of the database life cycle

The life cycle is the process of designing, implementing and maintaining a database. It consists of the following stages: preliminary planning; feasibility check; requirements definition; conceptual design; logical design; physical design; performance evaluation and database support.

Preliminary planning: information is collected about the programs in use or under development and the files associated with them; this information is documented in the form of a generalized conceptual data model.

Feasibility check: reports are prepared on technological feasibility (the hardware and software required), operational feasibility (the availability of personnel and experts), and economic efficiency (whether the planned database will pay off).

Requirements definition: the goals of the database, the information needs of departments and their managers, and the requirements for software and hardware are formulated.

Conceptual design: detailed models of user views of the domain data are created and integrated into a conceptual model.

Logical design: the conceptual model is transformed into a logical model; the model type must be selected first.

Physical design: a physical model of the database is created by extending the logical model with such characteristics as storage device types, the required amount of memory, and database access methods.

Performance evaluation and database support: performance is assessed by surveying users; support includes user training, data modification, development of new application programs, and related tasks.

The “entity-relationship” model. Entity, attribute, entity instance, relationship, relationship degree, cardinality, entity membership class. ER diagrams

The tool for modeling a subject area at the conceptual design stage is the entity-relationship model, often called the ER model. Modeling of the data structure of the subject area is based on graphical tools, ER diagrams. An ER diagram is a graphical representation of the relationships between entities.

The basic concepts of an ER diagram are entity, attribute, and relationship.

An entity is an object of the real world that can exist independently. In an ER diagram, an entity is represented by a rectangle containing its name. An entity has instances, which differ in the values of their attributes and can be unambiguously identified. An entity instance is a specific object characterized by a set of values of the entity's attributes.

An attribute is a named characteristic of an entity. An attribute that uniquely identifies the instances of an entity is called a key. A key may be simple or composite, i.e. a combination of several attributes (see the sketch below).
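
A minimal SQL sketch of simple and composite keys (the tables are hypothetical):

    -- Simple key: one attribute identifies an instance.
    CREATE TABLE student (
        student_id INTEGER PRIMARY KEY,
        full_name  VARCHAR(80) NOT NULL
    );

    -- Composite key: only the combination of attributes is unique.
    CREATE TABLE exam_result (
        student_id INTEGER NOT NULL,
        course_id  INTEGER NOT NULL,
        grade      INTEGER,
        PRIMARY KEY (student_id, course_id)
    );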

A relationship is an interaction between entities. It is characterized by its degree, which shows how many entities participate in the relationship. A relationship between two entities is called binary. In an ER diagram, a relationship is depicted as a diamond.

An important characteristic of a relationship is its type (cardinality).

The many-to-many (M:N) relationship type: each instance of entity A can be associated with several instances of entity B, and each instance of entity B with several instances of entity A.

The one-to-one (1:1) relationship type: each instance of entity A can be associated with at most one instance of entity B, and vice versa.

The one-to-many (1:M) relationship type: each instance of entity A can be associated with several instances of entity B, while each instance of entity B can be associated with at most one instance of entity A. A sketch of how these types map to tables is given below.

The membership class of an entity is mandatory if every instance of entity A must be associated with some instance of entity B, and optional if not every instance of entity A is associated with an instance of entity B.
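
In a relational implementation these notions map onto foreign keys. A sketch with illustrative customer, order and product tables: 1:M is a plain foreign key, the membership class is controlled by NOT NULL, and M:N is expressed through an intermediate table:

    -- 1:M: one customer, many orders; NOT NULL makes the order's
    -- membership in the relationship mandatory.
    CREATE TABLE customer (
        customer_id INTEGER PRIMARY KEY,
        name        VARCHAR(60) NOT NULL
    );
    CREATE TABLE purchase_order (
        order_id    INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customer (customer_id)
    );

    -- M:N between orders and products via an intermediate table.
    CREATE TABLE product (
        product_id INTEGER PRIMARY KEY,
        title      VARCHAR(60) NOT NULL
    );
    CREATE TABLE order_line (
        order_id   INTEGER REFERENCES purchase_order (order_id),
        product_id INTEGER REFERENCES product (product_id),
        PRIMARY KEY (order_id, product_id)
    );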

A conceptual model consists of an ER model and a set of entity attributes.

Data independence is the ability to change the logical and physical structure of the database without changing users' perceptions of the data.

Data independence presupposes invariance to the nature of data storage, software and hardware. It ensures minimal changes to the database structure when the data access strategy and the structure of the source data themselves change. This is achieved, as will be shown below, by “shifting” all changes to the conceptual and logical design stages with minimal changes at the physical design stage.

Data security includes data integrity and protection.

Data integrity is the resistance of stored data to corruption and destruction caused by technical malfunctions, system errors, and erroneous user actions.

It assumes:

    1) the absence of inaccurately entered data or two identical entries about the same fact;

    2) protection against errors when updating the database;

    3) the impossibility of deleting related data in different tables independently of one another (or the cascading deletion of such data);

    4) non-distortion of data when working in multi-user mode and in distributed databases;

    5) data safety in case of equipment failures (data recovery).

Integrity is ensured by integrity triggers: special application programs that fire under specified conditions. Protection of data from unauthorized access involves restricting access to confidential data and can be achieved by:

    1) introduction of a password system;

    2) obtaining permissions from the database administrator (DBA);

    4) the creation of views: tables derived from the original ones and intended for specific users.

The last three procedures are easily performed within the Structured Query Language (SQL), often referred to as SQL2. The sketch below illustrates the view and permission mechanisms.
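
A minimal sketch of those mechanisms; the employee table, its salary column and the clerk role are assumptions for the example:

    -- A view hides the confidential column from ordinary users.
    CREATE VIEW employee_public AS
        SELECT emp_id, full_name, dept_id
        FROM employee;                     -- salary is not exposed

    -- Permissions are granted on the view, not on the base table.
    GRANT SELECT ON employee_public TO clerk;
    REVOKE ALL PRIVILEGES ON employee FROM clerk;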

Standardization ensures the continuity of DBMS generations and simplifies the interaction of databases of the same DBMS generation with the same and different data models. Standardization (ANSI/SPARC) has been carried out to a large extent in the DBMS user interface and in the SQL language. This has made it possible to solve the problem of interaction between various relational DBMSs both through the SQL language and through the Open Database Connectivity (ODBC) interface. In this case, both local and remote access to data can be provided (client/server technology or the network option).

Database construction concept

The evolution of the database concept is of interest.

Originally (in the early 1960s) file storage was used. To solve primarily engineering problems, characterized by small amounts of data and a significant amount of computation, data were stored directly in the program. Data were organized sequentially; there was high redundancy, the logical and physical structures were identical, and the data were completely dependent. With the advent of economic and managerial tasks (management information systems, MIS), characterized by large volumes of data and a small proportion of computation, this organization of data proved ineffective. Ordering of the data was required, which, as it turned out, could be done according to two criteria: use (information arrays) or storage (databases). Initially information arrays were used, but the superiority of databases soon became clear. Using files to store only data (Fig. 2.1, a) was proposed by McGee in 1959. Access methods (including random access) to such files were developed; the physical and logical structures now differed, and the physical location of the data could be changed without changing its logical representation.

In 1963, C. Bachman built IDS, the first industrial database with a network data model, which was still characterized by data redundancy and by use in only one application. Data were accessed through appropriate software. In 1969 a group was formed that created the CODASYL set of standards for the network data model.

The modern database architecture then came into use (Fig. 2.1, b). Architecture is understood as a type (generalization) of structure in which any element can be replaced by another element whose input and output characteristics are identical to those of the first. A significant leap in the development of database technology was brought about by the paradigm of the relational data model proposed by E. Codd in 1970. A paradigm is understood as a scientific theory embodied in a system of concepts reflecting the essential features of reality. Different logical structures could now be obtained from the same physical data, i.e. the same physical data could be accessed by different applications along different paths. It became possible to ensure data integrity and independence.

At the end of the 1970s, modern DBMSs appeared, providing physical and logical independence and data security, and possessing developed database languages. The last decade has been characterized by the emergence of distributed and object-oriented databases, whose characteristics are determined by the applications of computer-aided design tools and database intellectualization.

Before considering procedures for working with a database, we give a set of database characteristics (Fig. 2.2) and explanations of it.

There are two approaches to building a database, based on two approaches to creating an automated control system (ACS).

The first, widely used in the 1980s and therefore called classical (traditional), is associated with the automation of document flow (the set of documents moving during the operation of an enterprise). Both the input and the output were documents, as can be seen from Example 2.1.

Example 2.1. The task is stated as follows. There is a system of manual documents, the form of one of which is shown in Table 2.1.

Table 2.1.

Delivery data

It is necessary, using the database, to obtain, according to regulations or upon request, information in the form of another system of documents, the form of one of which is given in Table 2.2.

Table 2.2.

Delivery report for the quarter

The following thesis was used: data are less mobile than algorithms, so a universal database should be created that can then be used for any algorithm. However, it soon became clear that creating a universal database is problematic. The concept of data integration, dominant until recently, proved untenable in the face of a sharp increase in data volume. Moreover, applications began to appear (for example, text and graphics editors) based on widely used standard algorithms. Standard algorithms have also emerged in management (business), as Example 2.2 shows.

Example 2.2. Consider the standard procedure for using a bank credit card. A customer selects goods in a supermarket and, approaching the checkout, presents a credit card for payment. The card is inserted into a special reader, and the data from it are read and transmitted to the supermarket's computer. This computer communicates with the computer of the bank where the client's money is kept, and the client's data are transferred from the bank's computer to the supermarket's computer. If the client's bank account holds more funds than the cost of the selected goods, the supermarket's computer authorizes the sale. At the same time, it recalculates the funds in the client's account, making changes to the supermarket's financial documents, the client's bank account, and the credit card. The credit card with the updated data is returned to the client. If the client does not have enough funds, the credit card is returned and the client is not served. A transaction sketch of this procedure is given below.
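
In SQL terms the payment is a single transaction: either all the mutual settlements are applied, or none. A minimal sketch under an assumed schema (the bank_account and store_account tables, identifiers and amounts are illustrative):

    -- Debit the client and credit the store atomically.
    START TRANSACTION;
    UPDATE bank_account
       SET balance = balance - 25.00
     WHERE client_id = 42 AND balance >= 25.00;
    UPDATE store_account
       SET balance = balance + 25.00
     WHERE store_id = 7;
    COMMIT;                    -- both changes take effect, or neither
    -- (A real implementation would check the affected row count and
    -- issue ROLLBACK if the debit failed for lack of funds.)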

By the 90s, a second, modern approach related to control automation had emerged. It involves the initial identification of standard application algorithms (business algorithms in foreign terminology), under which the data, and therefore the database, are defined. Object-oriented programming has only increased the importance of this approach. The composition of the database for various approaches is presented in Fig. 2.3.

The database can operate in single-user and multi-user modes (several users connect to one computer through different ports).

Both bottom-up and top-down database design are used. The former is used in distributed databases when integrating designed local databases, which may be implemented with different data models. Top-down design is more typical of centralized databases.

In the following sections, the classical approach for centralized databases will first be considered, and then the modern one. Part III of this work is devoted to distributed databases.

Working with databases can be represented by the diagram shown in Fig. 2.4. It shows that the methodology for creating a database and the methodology for using it should be distinguished: the methodology is fixed during the design procedure, but it also manifests itself in the use procedure.

Database Design Methodology

There are many design methodologies within the classical approach, but most often the ANSI/SPARC methodology is followed; its diagram is presented in Fig. 2.5.

Fig. 2.5 shows a set of procedures for designing a centralized database, which can be combined into four stages.

At the stage of formulating and analyzing requirements, the goals of the organization are established and the requirements for the database are determined. They consist of general requirements, defined in section 2.1, and specific requirements. To formulate specific requirements, the technique of interviewing personnel at various management levels is usually used. All requirements are documented in a form accessible to the end user and the database designer.

The conceptual design stage consists of describing and synthesizing users' information requirements into an initial database design. The source data may be a set of user documents (Fig. 2.4) in the classical approach, or application algorithms (business algorithms) in the modern approach. The result of this stage is a high-level representation (in the form of a system of database tables) of users' information requirements, based on one of these approaches.

First, the database model is selected. Then the database structure is created using the data definition language (DDL) and filled with data by means of DML commands, menu systems, screen forms, or the table view mode (a minimal sketch follows). Here the protection and integrity (including referential integrity) of the data are ensured by the DBMS or by constructing triggers.
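
A minimal sketch of the two steps, with an illustrative supplier table:

    -- DDL: define the structure.
    CREATE TABLE supplier (
        supplier_id INTEGER PRIMARY KEY,
        name        VARCHAR(60) NOT NULL
    );

    -- DML: fill it with data.
    INSERT INTO supplier (supplier_id, name) VALUES (1, 'Alpha Ltd');
    INSERT INTO supplier (supplier_id, name) VALUES (2, 'Beta & Co');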

During logical design, the high-level data representation is converted into the structure of the DBMS used. The main goal of this stage is to eliminate data redundancy using special normalization rules (Fig. 2.4).
The purpose of normalization is to minimize the repetition of data and possible structural changes to the database during update procedures. This is achieved by splitting (decomposing) one table into two or more and then using navigation operations in queries; note that navigation search reduces database performance, i.e. increases the response time to queries (see the sketch below). The resulting logical structure of the database can be evaluated quantitatively by various characteristics: the number of accesses to logical records, the volume of data in each application, and the total volume of data. Based on these assessments, the logical structure can be improved to achieve greater efficiency.
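
A decomposition sketch, assuming a redundant shipment table in which the vendor's city was repeated in every row:

    -- Before: shipment(shipment_id, vendor_name, vendor_city, goods, quantity)
    -- After decomposition the city is stored once per vendor:
    CREATE TABLE vendor (
        vendor_id INTEGER PRIMARY KEY,
        name      VARCHAR(60) NOT NULL,
        city      VARCHAR(60)
    );
    CREATE TABLE shipment (
        shipment_id INTEGER PRIMARY KEY,
        vendor_id   INTEGER NOT NULL REFERENCES vendor (vendor_id),
        goods       VARCHAR(60),
        quantity    INTEGER
    );

    -- Queries now pay the navigation cost noted above: a join.
    SELECT s.goods, v.city
    FROM shipment s JOIN vendor v ON v.vendor_id = s.vendor_id;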

The database management procedure deserves special discussion. It is simplest in single-user mode. In multi-user mode and in distributed databases the procedure becomes much more complicated: if several users access the data simultaneously without special measures being taken, integrity may be violated. To prevent this, a system of transactions and a locking mode for tables or individual records are used.

A transaction is the process of changing a file, record, or database caused by the transmission of a single input message. The particulars of locking and the locking options will be discussed separately below; a minimal locking sketch follows.
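
A record-locking sketch, using syntax supported by most SQL DBMSs; the account table is an assumption for the example:

    -- Lock one record for the duration of the transaction so that
    -- concurrent users cannot modify it and violate integrity.
    START TRANSACTION;
    SELECT balance FROM account WHERE account_id = 1 FOR UPDATE;
    UPDATE account SET balance = balance - 50 WHERE account_id = 1;
    COMMIT;                        -- the lock is released here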

At the physical design stage, issues related to system performance are resolved, data storage structures and access methods are determined.

The interaction between the design phases and the dictionary system must be considered separately. Design procedures can be used independently in the absence of a dictionary system. The dictionary system itself can be considered as an element of design automation.

Design tools and evaluation criteria are used at all stages of development. Currently, uncertainty in the selection of criteria is the weakest point in database design. This is due to the difficulty of describing and identifying a large number of alternative solutions.

The situation is simpler when working with quantitative criteria, which include response time to a query, modification cost, memory cost, creation time, and reorganization cost. Difficulty may arise because the criteria contradict each other.

At the same time, there are many optimality criteria that are immeasurable properties that are difficult to express quantitatively or as an objective function.

Qualitative criteria may include flexibility, adaptability, accessibility for new users, compatibility with other systems, convertibility to another computing environment, recoverability, distributability, and extensibility.

The design process is long and labor-intensive and usually lasts several months. The main resources of the database designer are his own intuition and experience, so the quality of the solution in many cases may be low.

The main reasons for the low efficiency of the designed databases may be:

    insufficiently deep analysis of requirements (initial stages of design), including their semantics and data relationships;

    the long duration of the structuring process, making this process tedious and difficult to perform manually.

Under these conditions, the automation of development becomes an important issue.

Methodology for using databases

Databases are usually not used independently but are a component of various information systems: data banks, information retrieval and expert systems, computer-aided design systems, automated workstations, and automated control systems.

The database has three levels of data representation (Fig. 2.4): conceptual, logical, and physical.

In the use procedure one most often deals with the logical model and, much less often, with the conceptual and physical models.

A data dictionary is in essence an internal database containing centralized information about all data types, their names and structure, as well as information about their use. The advantage of a data dictionary is the efficient storage and management of the information resources of the subject area. Its use makes it possible to reduce the redundancy and inconsistency of data on entry, to manage data modification simply and effectively, to simplify the database design procedure by centralizing data management, and to establish connections with other users. Thus, the data dictionary contains a generalized representation of all three levels: conceptual, logical, and physical.