Relational Databases and Non-Relational Databases

Introduction

Database management systems (DBMS) are computer programs that allow users to interact with a database. A DBMS allows users to control access to a database, write data, run queries, and perform any other tasks related to database management.

In order to perform any of these tasks, though, the DBMS must have some kind of underlying model that defines how the data are organized. The relational model is one approach for organizing data that has found wide use in database software since it was first devised in the late 1960s, so much so that, as of this writing, four of the top five most popular DBMSs are relational.

This conceptual article outlines the history of the relational model, how relational databases organize data, and how they’re used today.

History of the Relational Model

Databases are logically modelled clusters of information, or data. Any collection of data is a database, regardless of how or where it is stored. Even a file cabinet containing payroll information is a database, as is a stack of hospital patient forms, or a company’s collection of customer information spread across multiple locations. Before storing and managing data with computers was common practice, physical databases like these were the only ones available to government and business organizations that needed to store information.

Around the middle of the 20th century, developments in computer science led to machines with more processing power, as well as greater local and external storage capacity. These advancements led computer scientists to start recognizing the potential these machines had for storing and managing ever larger amounts of data.

However, there weren’t any theories for how computers could organize data in meaningful, logical ways. It’s one thing to store unsorted data on a machine, but it’s much more complicated to design systems that allow you to add, retrieve, sort, and otherwise manage that data in consistent, practical ways. The need for a logical framework for storing and organizing data led to a number of proposals for how to harness computers for data management.

One early database model was the hierarchical model, in which data are organized in a tree-like structure, similar to modern-day filesystems. The following example shows how the layout of part of a hierarchical database used to categorize animals might look:

The hierarchical model was widely implemented in early database management systems, but it also proved to be somewhat inflexible. In this model, even though individual records can have multiple “children,” each record can only have one “parent” in the hierarchy. Because of this, these earlier hierarchical databases were limited to representing only “one-to-one” and “one-to-many” relationships. This lack of “many-to-many” relationships could lead to problems when you’re working with data points that you’d like to associate with more than one parent.
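
The single-parent restriction can be sketched in a few lines of Python. This is only an illustration, not how any particular hierarchical DBMS stored its records; the category names and the helper functions are invented for the example.

```python
# Hypothetical animal taxonomy: every record stores exactly one parent.
parent_of = {
    "Animals": None,
    "Mammals": "Animals",
    "Birds": "Animals",
    "Bats": "Mammals",
    "Eagles": "Birds",
}


def children_of(name):
    """Return the records whose single parent is `name`, sorted by name."""
    return sorted(child for child, parent in parent_of.items() if parent == name)


def path_to_root(name):
    """Follow the single-parent links from a record up to the root."""
    path = [name]
    while parent_of[path[-1]] is not None:
        path.append(parent_of[path[-1]])
    return path


# One-to-many works: a parent may have several children.
print(children_of("Animals"))

# Each record has only one slot for a parent, so there is exactly one path
# upward. Filing "Bats" under a second category as well (say, a hypothetical
# "Flying animals" record) would mean overwriting its link to "Mammals".
print(path_to_root("Bats"))
```

Because `parent_of` maps each record to a single value, there is no way to express that a record belongs to two parents at once, which is the many-to-many gap described above.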

In the late 1960s, Edgar F. Codd, a computer scientist working at IBM, devised the relational model of database management. Codd’s relational model allowed individual records to be associated with more than one table, thereby enabling “many-to-many” relationships between data points in addition to “one-to-many” relationships. This provided more flexibility than other existing models when it came to designing database structures, and meant that relational database management systems (RDBMSs) could meet a much wider range of business needs.
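
As a rough sketch of what a many-to-many relationship looks like in practice, the following Python example uses the standard library's sqlite3 module with an in-memory database. The table and column names (animals, categories, animal_categories) are assumptions made up for this illustration, and linking the two tables through a third table is a common modern convention rather than something taken from Codd's papers.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE animals (animal_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE categories (category_id INTEGER PRIMARY KEY, name TEXT);

-- Linking table: one row per (animal, category) pairing.
CREATE TABLE animal_categories (
    animal_id   INTEGER REFERENCES animals (animal_id),
    category_id INTEGER REFERENCES categories (category_id),
    PRIMARY KEY (animal_id, category_id)
);

INSERT INTO animals VALUES (1, 'Bat'), (2, 'Eagle');
INSERT INTO categories VALUES (1, 'Mammals'), (2, 'Flying animals'), (3, 'Birds');
INSERT INTO animal_categories VALUES (1, 1), (1, 2), (2, 2), (2, 3);
""")

# One animal can belong to many categories...
bat_categories = [row[0] for row in conn.execute("""
    SELECT c.name
    FROM categories AS c
    JOIN animal_categories AS ac ON ac.category_id = c.category_id
    JOIN animals AS a ON a.animal_id = ac.animal_id
    WHERE a.name = 'Bat'
    ORDER BY c.name
""")]

# ...and one category can contain many animals.
flying_animals = [row[0] for row in conn.execute("""
    SELECT a.name
    FROM animals AS a
    JOIN animal_categories AS ac ON ac.animal_id = a.animal_id
    JOIN categories AS c ON c.category_id = ac.category_id
    WHERE c.name = 'Flying animals'
    ORDER BY a.name
""")]

print(bat_categories)
print(flying_animals)
```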

Codd proposed a language for managing relational data, known as Alpha, which influenced the development of later database languages. Two of Codd’s colleagues at IBM, Donald Chamberlin and Raymond Boyce, created one such language inspired by Alpha. They called their language SEQUEL, short for Structured English Query Language, but because of an existing trademark they shortened the name of their language to SQL (referred to more formally as Structured Query Language).

Due to hardware constraints, early relational databases were still prohibitively slow, and it took some time before the technology became widespread. But by the mid-1980s, Codd’s relational model had been implemented in a number of commercial database management products from both IBM and its competitors. These vendors also followed IBM’s lead by developing and implementing their own dialects of SQL. By 1987, both the American National Standards Institute and the International Organization for Standardization had ratified and published standards for SQL, solidifying its status as the accepted language for managing RDBMSs.

The relational model’s wide use across multiple industries led to it becoming recognized as the standard model for data management. Even with the rise of various NoSQL databases in more recent years, relational databases remain the dominant tools for storing and organizing data.

How Relational Databases Organize Data

Now that you have a general understanding of the relational model’s history, let’s take a closer look at how the model organizes data.

The most fundamental elements in the relational model are relations, which users and modern RDBMSs recognize as tables. A relation is a set of tuples, or rows in a table, with each tuple sharing a set of attributes, or columns:

Columns are the smallest organizational structure of a relational database, and they represent the various facets that define the records in the table, hence their more formal name, attributes. You can think of each tuple as a unique instance of whatever type of people, objects, events, or associations the table holds. These instances might be things like employees at a company, sales from an online business, or lab test results. For example, in a table that holds employee records of teachers at a school, the tuples might have attributes like name, subjects, start_date, and so on.
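
To make the vocabulary concrete, here is a minimal sketch using Python's built-in sqlite3 module and an in-memory database. The teachers table mirrors the attributes mentioned above; the two sample rows are invented placeholders.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# The relation (table) and its attributes (columns).
conn.execute("CREATE TABLE teachers (name TEXT, subjects TEXT, start_date TEXT)")

# Each inserted row is one tuple of the relation (placeholder data).
conn.executemany(
    "INSERT INTO teachers VALUES (?, ?, ?)",
    [
        ("Ada Example", "Mathematics", "2015-09-01"),
        ("Ben Sample", "History", "2019-01-15"),
    ],
)

cursor = conn.execute("SELECT * FROM teachers")
attributes = [column[0] for column in cursor.description]
tuples = cursor.fetchall()

print(attributes)  # the column names shared by every tuple
for row in tuples:
    print(row)     # one tuple per teacher
```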

When creating columns, you specify a data type that dictates what kind of entries are allowed in that column. RDBMSs often implement their own unique data types, which may not be directly interchangeable with similar data types in other systems. Some common data types include dates, strings, integers, and Booleans.
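
How strictly a declared type is enforced is one of the places where systems differ. The sketch below uses SQLite (through Python's sqlite3 module) purely as an example: by default SQLite treats a column's declared type as a preference, converting values where it can and storing them unchanged where it cannot, whereas stricter systems typically reject a mismatched value. The readings table is a made-up name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE readings (id INTEGER PRIMARY KEY, amount INTEGER)")

# Text that looks like a number is converted to an integer on the way in.
conn.execute("INSERT INTO readings (amount) VALUES ('42')")

# Text that cannot be converted is stored as text; SQLite does not raise an
# error here unless the table is declared with stricter rules.
conn.execute("INSERT INTO readings (amount) VALUES ('forty-two')")

stored = conn.execute(
    "SELECT amount, typeof(amount) FROM readings ORDER BY id"
).fetchall()
print(stored)
```

The same statements may behave differently on another RDBMS, which is why data types are not always directly interchangeable between systems.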

In the relational model, each table contains at least one column that can be used to uniquely identify each row, called a primary key. This is important, because it means that users don’t need to know where their data is physically stored on a machine; instead, their DBMS can keep track of each record and return it on an ad hoc basis. In turn, this means that records have no defined logical order, and users can return their data in whatever order or through whatever filters they wish.
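
A short sketch of both ideas, again assuming SQLite via Python's sqlite3 module and a hypothetical employees table: the primary key rejects a second row with the same identifier, and the order of the results is whatever the query asks for rather than something inherent to the table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employees (employee_id INTEGER PRIMARY KEY, name TEXT NOT NULL)"
)
conn.executemany(
    "INSERT INTO employees VALUES (?, ?)",
    [(1, "Cleo"), (2, "Amir"), (3, "Bea")],
)

# A second row with an existing key violates the primary key constraint.
try:
    conn.execute("INSERT INTO employees VALUES (2, 'Duplicate')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

# Look up a single record by its key, without knowing where it is stored.
by_key = conn.execute(
    "SELECT name FROM employees WHERE employee_id = ?", (3,)
).fetchone()

# Ask for the same records in a different order.
by_name = [row[0] for row in conn.execute("SELECT name FROM employees ORDER BY name")]

print(duplicate_rejected, by_key, by_name)
```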

If you have two tables that you’d like to associate with one another, one way you can do so is with a foreign key. A foreign key is essentially a copy of one table’s (the “parent” table) primary key inserted into a column in another table (the “child”). The following example highlights the relationship between two tables, one used to record information about employees at a company and another used to track the company’s sales. In this example, the primary key of the EMPLOYEES table is used as the foreign key of the SALES table:

If you try to add a record to the child table and the value entered into the foreign key column doesn’t exist in the parent table’s primary key, the insertion statement will be invalid. This helps to maintain relationship-level integrity, as the rows in both tables will always be related correctly.
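
The parent/child relationship between an employees table and a sales table can be sketched as follows. The column names and sample values are assumptions for illustration. Note that SQLite, used here through Python's sqlite3 module, only enforces foreign keys after PRAGMA foreign_keys = ON has been issued on the connection; other systems enforce them by default.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite-specific: enable enforcement

conn.execute(
    "CREATE TABLE employees (employee_id INTEGER PRIMARY KEY, name TEXT NOT NULL)"
)
conn.execute("""
    CREATE TABLE sales (
        sale_id     INTEGER PRIMARY KEY,
        employee_id INTEGER NOT NULL REFERENCES employees (employee_id),
        amount      INTEGER NOT NULL
    )
""")

conn.execute("INSERT INTO employees VALUES (1, 'Cleo')")

# Valid: employee 1 exists in the parent table.
conn.execute("INSERT INTO sales VALUES (100, 1, 250)")

# Invalid: there is no employee 99, so the child row is rejected.
try:
    conn.execute("INSERT INTO sales VALUES (101, 99, 75)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

sale_count = conn.execute("SELECT COUNT(*) FROM sales").fetchone()[0]
print(orphan_rejected, sale_count)
```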

The relational model’s structural elements help to keep data stored in an organized way, but storing data is only useful if you can retrieve it. To retrieve information from an RDBMS, you can issue a query, or a structured request for a set of information. As mentioned previously, most relational databases use SQL to manage and query data. SQL allows you to filter and manipulate query results with a variety of clauses, predicates, and expressions, giving you fine control over what data will appear in the result set.
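
As a small illustration of filtering and shaping a result set, the following sketch runs two queries against a made-up sales table using Python's sqlite3 module. The first uses a WHERE predicate and an ORDER BY clause; the second aggregates with GROUP BY.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales (sale_id INTEGER PRIMARY KEY, region TEXT, amount INTEGER)"
)
conn.executemany(
    "INSERT INTO sales (region, amount) VALUES (?, ?)",
    [("north", 120), ("south", 80), ("north", 300), ("south", 40)],
)

# Filter with a predicate, then sort the rows that remain.
large_sales = conn.execute(
    "SELECT region, amount FROM sales WHERE amount >= ? ORDER BY amount DESC",
    (100,),
).fetchall()

# Aggregate: one result row per region.
totals = conn.execute(
    "SELECT region, SUM(amount) FROM sales GROUP BY region ORDER BY region"
).fetchall()

print(large_sales)
print(totals)
```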

Advantages and Limitations of Relational Databases

With the underlying organizational structure of relational databases in mind, let’s consider some of their advantages and disadvantages.

Today, both SQL and the databases that implement it deviate from Codd’s relational model in several ways. For instance, Codd’s model dictates that each row in a table should be unique while, for reasons of practicality, most modern relational databases do allow for duplicate rows. Some people don’t consider SQL databases to be true relational databases if they fail to adhere to each of Codd’s specifications for the relational model. In practical terms, though, any DBMS that uses SQL and at least somewhat adheres to the relational model is likely to be referred to as a relational database management system.
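
The duplicate-row deviation is easy to observe. In this sketch (SQLite via Python's sqlite3 module, with an invented visits table that has no primary key), the same row is stored twice, and it takes an explicit DISTINCT to get set-like results back.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# No primary key or uniqueness constraint, so nothing prevents duplicates.
conn.execute("CREATE TABLE visits (visitor TEXT, day TEXT)")
conn.executemany(
    "INSERT INTO visits VALUES (?, ?)",
    [("kim", "2024-01-01"), ("kim", "2024-01-01")],
)

all_rows = conn.execute("SELECT visitor, day FROM visits").fetchall()
distinct_rows = conn.execute("SELECT DISTINCT visitor, day FROM visits").fetchall()

print(len(all_rows), len(distinct_rows))
```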

Although relational databases quickly grew in popularity, a few of the relational model’s shortcomings started to become apparent as data became more valuable and businesses began storing more of it. For one thing, it can be difficult to scale a relational database horizontally. Horizontal scaling, or scaling out, is the practice of adding more machines to an existing stack in order to spread out the load and allow for more traffic and faster processing. This is often contrasted with vertical scaling, which involves upgrading the hardware of an existing server, usually by adding more RAM or CPU.

The reason it’s difficult to scale a relational database horizontally has to do with the fact that the relational model is designed to ensure consistency, meaning clients querying the same database will always retrieve the same data. If you scale a relational database horizontally across multiple machines, it becomes difficult to ensure consistency since clients may write data to one node but not the others. There would likely be a delay between the initial write and the time when the other nodes are updated to reflect the changes, resulting in inconsistencies between them.
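
The delay described here can be modeled with a toy simulation. The classes below are hypothetical and deliberately simplistic; they do not correspond to any real database's replication protocol. A write lands on one node immediately and reaches the other only when replication runs, so a read from the second node in between returns stale data.

```python
class Node:
    """A single machine holding its own copy of the data."""

    def __init__(self):
        self.data = {}


class ReplicatedStore:
    """Toy model: writes go to the primary and are copied to the replica later."""

    def __init__(self):
        self.primary = Node()
        self.replica = Node()
        self.pending = []  # changes not yet applied to the replica

    def write(self, key, value):
        self.primary.data[key] = value
        self.pending.append((key, value))

    def replicate(self):
        """Apply all pending changes to the replica, in order."""
        for key, value in self.pending:
            self.replica.data[key] = value
        self.pending.clear()


store = ReplicatedStore()
store.write("balance", 100)

# Before replication, the two nodes disagree about the same key.
stale_read = store.replica.data.get("balance")
store.replicate()
fresh_read = store.replica.data.get("balance")

print(stale_read, fresh_read)
```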

Another limitation presented by RDBMSs is that the relational model was designed to manage structured data, or data that aligns with a predefined data type or is at least organized in some predetermined way, making it easily sortable and searchable. With the spread of personal computing and the rise of the internet in the early 1990s, however, unstructured data — such as email messages, photos, videos, etc. — became more common.

None of this is to say that relational databases aren’t useful. Quite the contrary, the relational model is still the dominant framework for data management after over 40 years. The prevalence and longevity of relational databases mean that they are a mature technology, which is itself one of their major advantages. There are many applications designed to work with the relational model, as well as many career database administrators who are experts when it comes to relational databases. There’s also a wide array of resources available in print and online for those looking to get started with relational databases.

Another advantage of relational databases is that almost every RDBMS supports transactions. A transaction consists of one or more individual SQL statements performed in sequence as a single unit of work. Transactions present an all-or-nothing approach, meaning that every SQL statement in the transaction must be valid; otherwise, the entire transaction will fail. This is very helpful for ensuring data integrity when making changes to multiple rows or tables.
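
The all-or-nothing behavior can be sketched with Python's sqlite3 module, where using the connection as a context manager commits the transaction on success and rolls it back if an exception is raised. The accounts table, the balance rule, and the transfer helper are all invented for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE accounts (
        name    TEXT PRIMARY KEY,
        balance INTEGER NOT NULL CHECK (balance >= 0)
    )
""")
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 50)])
conn.commit()


def transfer(conn, source, target, amount):
    """Move `amount` between two accounts as one unit of work."""
    try:
        with conn:  # commits on success, rolls back on an exception
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, target),
            )
            # Fails the CHECK constraint if the source would go negative.
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, source),
            )
        return True
    except sqlite3.IntegrityError:
        return False


def balances(conn):
    return dict(conn.execute("SELECT name, balance FROM accounts"))


failed = transfer(conn, "alice", "bob", 500)  # second statement is invalid
after_failure = balances(conn)                # the first statement was undone too

succeeded = transfer(conn, "alice", "bob", 30)
after_success = balances(conn)

print(failed, after_failure)
print(succeeded, after_success)
```

In the failed transfer, the credit to the target account had already been executed when the debit was rejected, yet neither change is visible afterward because the whole transaction was rolled back.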

Lastly, relational databases are extremely flexible. They’ve been used to build a wide variety of different applications, and continue working efficiently even with very large amounts of data. SQL is also extremely powerful, allowing you to add and change data on the fly, as well as alter the structure of database schemas and tables without impacting existing data.
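
As one hedged example of changing a table's structure while keeping its data, the sketch below adds a column to a populated table with ALTER TABLE, using SQLite through Python's sqlite3 module. The teachers table and the 'unassigned' default are assumptions for illustration, and the range of ALTER TABLE operations supported varies from one RDBMS to another.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE teachers (name TEXT, start_date TEXT)")
conn.execute("INSERT INTO teachers VALUES ('Ada Example', '2015-09-01')")

# Add a new attribute to the existing table; the stored row is kept and
# picks up the default value for the new column.
conn.execute("ALTER TABLE teachers ADD COLUMN subjects TEXT DEFAULT 'unassigned'")

rows = conn.execute("SELECT name, start_date, subjects FROM teachers").fetchall()
print(rows)
```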

Conclusion

Thanks to their flexibility and design for data integrity, relational databases are still the primary way data are managed and stored more than fifty years after they were first conceived of. Even with the rise of various NoSQL databases in recent years, understanding the relational model and how to work with RDBMSs is key for anyone who wants to build applications that harness the power of data.

To learn more about a few popular open-source RDBMSs, we encourage you to check out our comparison of various open-source relational SQL databases. If you’re interested in learning more about databases generally, we encourage you to check out our complete library of database-related content.

Translated from: https://www.digitalocean.com/community/tutorials/understanding-relational-databases