Database Management Systems
Course Description
Introduction to database management systems, including data modeling, query languages and processing, database design, data integrity and security. Issues related to distributed database systems, object-oriented database systems, and legacy database systems are also discussed.
Purpose, Goals & Objectives
At the conclusion of this course a student will:
- Understand the difference between the file-oriented method of building applications and the database method.
- Learn and practice data modeling using the E-R model.
- Learn and practice designing normalized databases using the E-R model.
- Learn the fundamentals of the relational data model and understand its role in database implementation.
- Learn and practice SQL and Relational Algebra.
- Design and implement a relational database and application in C# using SQL Server.
- Understand the implications and problems of enterprise database systems and study solutions to some of the problems.
- Understand the role of the "people" involved in a database system.
- Understand the need for data warehousing and mining applications.
- Understand the uses of spatio-temporal and constraint databases.
Resources
Database Management Systems Child Pages
This class is about Database Management Systems. As such, we talk about creating, managing, and programming SQL databases. When time permits, we also cover spatial and spatio-temporal data and database systems. This site contains content developed by Dr. Anderson for his classes.
Accessing Database Examples using SQLite Browser
- Download the SQLite Browser for your platform
- Download a SQLite DB file provided on eclass.
- Open it in your SQLite Browser ... and enjoy the simplicity and cross platform-ness!
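If you would rather poke at a SQLite file from a script, the same file can be opened with Python's built-in sqlite3 module. The sketch below is only an illustration: it builds a tiny in-memory database (the two tables are made up) so it runs without the eclass download, and `list_tables` is a hypothetical helper name.

```python
import sqlite3

def list_tables(conn):
    """Return the names of the user tables in a SQLite database, sorted."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    # Skip SQLite's internal bookkeeping tables (their names start with sqlite_).
    return [name for (name,) in rows if not name.startswith("sqlite_")]

# Assumption: an in-memory database stands in for the downloaded file.
# For a real file, pass its path to sqlite3.connect() instead of ":memory:".
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE instructor (ID TEXT PRIMARY KEY, name TEXT, dept_name TEXT)")
conn.execute("CREATE TABLE department (dept_name TEXT PRIMARY KEY, building TEXT)")

tables = list_tables(conn)
print(tables)
```

This is the scripted counterpart of browsing the table list in SQLite Browser.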
Accessing the Database using Visual Studio Code
First off, we will be using SQLite Browser, so you won't need to access SQL Server from Visual Studio Code right away (I'll let you know when)! Everyone can run SQL Server through Docker. I recommend this approach because it gives you cross-platform access.
Install SQL Server using Docker
Microsoft provides excellent directions here. Get one of these up and running before you continue. Also, if you have SQL Server installed (of any variety, including localdb), remove it from your system; you will no longer need it.
Access your SQL Server from VSCode
If you haven't done so, install VSCode with the mssql extension.
- Click the extensions icon on the left tool bar.
- Search for and install mssql extension.
To use the Visual Studio Code mssql extension
- Press CTRL+N. Visual Studio Code opens a new 'Plain Text' file by default.
- Press CTRL+K,M and change the language mode to SQL.
- Alternatively, open an existing file with .sql file extension. The language mode is automatically SQL for files that have the .sql extension.
- Press CTRL+SHIFT+P (or F1) to open the Command Palette.
- Type sqlcon and press ENTER. (Select the MS SQL: Connect command.)
- Select Create Connection Profile. This creates a connection profile for your SQL Server instance.
- Follow the prompts to specify the connection properties for the new connection profile. After specifying each value, press ENTER to continue.
-- Press Ctrl+Shift+E to execute a query.
-- To execute a specific query, highlight the query before you execute it.

-- To see a list of databases, try:
SELECT *
FROM sys.databases;

-- To select a database to use, try:
USE UniversitySmall;

-- To show the tables in a database, try:
SELECT *
FROM information_schema.tables;
For additional information see https://docs.microsoft.com/en-us/sql/linux/sql-server-linux-develop-use-vscode?view=sql-server-2017
Programming Examples
See ProgrammingLinks for more examples.
Database diagrams
We are now using Vertabelo for database diagrams.
Programming & Design Resources
Learning to program something more complex than a one-off programming assignment can be difficult for students. Where do the experts learn to program large database applications, and what tools and methods do they use? This section is not an answer, but a suggestion on where to look for those answers. Someday, I hope to have a class that explores this in depth.
Project Ideas
See: Project Ideas
Chapter Notes and Hints
These notes and hints help you review for exams and work through your homework. Occasionally you will find information here that is not in the book. You should READ THIS BEFORE YOU DO YOUR HOMEWORK!
Chapter 3 and 4 Hints
If you are having problems joining tables in the FROM and WHERE clauses, you are in the majority. There are three hints that I will give you to help you obtain the right perspective on joining tables:
Rule 1: Just because two attributes have the same name doesn't mean that you should join on them. STOP and THINK about the semantics (the meaning) of each attribute. Do they both refer to the same thing? If the answer is yes, then you should join on them. NATURAL JOIN matches attribute names, not semantics, so be careful. Also, NATURAL JOIN "works," but not on most systems, and even on Oracle it is not encouraged, so DON'T USE IT ON YOUR HOMEWORK. If you do, I'll run it on Oracle. If it doesn't work, you will be docked points accordingly.
Rule 2: When you are looking at the attributes and relations, an attribute is a foreign key only if it is the primary key in a different relation. It is still a foreign key even if it is part (but not all) of the primary key of the relation it is in.
Rule 3: The book doesn't always give you all the attributes that make up the primary key. Look carefully at the values and make sure, when you create tables in Chapter 4, that the attributes you select as the primary key are sufficient to uniquely identify each row.
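To see Rule 1 bite, here is a small sketch run against SQLite from Python. The three tables are a cut-down, made-up version of the university schema (only the columns needed for the point), and the single student and two courses are invented data. Note also that takes.ID is both part of the primary key of takes and a foreign key, as described in Rule 2.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE student (ID TEXT PRIMARY KEY, name TEXT, dept_name TEXT);
    CREATE TABLE course  (course_id TEXT PRIMARY KEY, title TEXT, dept_name TEXT);
    -- takes.ID is part of the primary key AND a foreign key to student.
    CREATE TABLE takes (
        ID TEXT REFERENCES student(ID),
        course_id TEXT REFERENCES course(course_id),
        PRIMARY KEY (ID, course_id)
    );
    INSERT INTO student VALUES ('S1', 'Shankar', 'Comp. Sci.');
    INSERT INTO course VALUES ('CS-101', 'Intro to CS', 'Comp. Sci.'),
                              ('BIO-101', 'Intro to Biology', 'Biology');
    INSERT INTO takes VALUES ('S1', 'CS-101'), ('S1', 'BIO-101');
""")

# NATURAL JOIN also matches student.dept_name with course.dept_name,
# which silently drops the course taken outside the student's department.
natural = conn.execute("""
    SELECT name, title
    FROM student NATURAL JOIN takes NATURAL JOIN course
    ORDER BY title
""").fetchall()

# Joining only on the attributes that mean the same thing keeps both courses.
explicit = conn.execute("""
    SELECT s.name, c.title
    FROM student AS s
    JOIN takes AS t ON t.ID = s.ID
    JOIN course AS c ON c.course_id = t.course_id
    ORDER BY c.title
""").fetchall()

print(natural)
print(explicit)
```

The student's dept_name is the department the student belongs to, while the course's dept_name is the department that offers the course, so they share a name but not a meaning.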
Review of the different types of queries we learned
-- Basic (but complete) query
SELECT dept_name, COUNT(*) AS CNT -- List of attributes in the output
FROM instructor -- List of relations that I am getting data from
WHERE dept_name LIKE '%i%' -- Filters rows before the grouping (keeps only departments that have an "i" in their name)
GROUP BY dept_name -- Must include every non-aggregated attribute from SELECT (and maybe others)
HAVING COUNT(*) > 3 -- Filters after the grouping, so it is only really valid on aggregate values. SQL Server does not accept the alias CNT here.
ORDER BY CNT DESC; -- Orders the results.

-- Subqueries:
SELECT -- Can have a subquery if the subquery returns only a single value.
FROM -- Most common place for a subquery
WHERE -- Use a subquery for a membership check like we did in Retake #1.5
GROUP BY -- NEVER USES a subquery
HAVING -- Same as WHERE, but rarely used.
ORDER BY -- NEVER USES a subquery

-- WITH queries
WITH name AS (any kind of query), [name AS (any kind of query)]*
QUERY -- that uses the temporary tables created in the WITH.

JOIN vs LEFT JOIN -- JOIN is an inner join, similar to joining in a WHERE clause
-- LEFT JOIN takes all the rows from the left and only the matching rows from the right

-- SET OPERATIONS:
A INTERSECT B (only rows in common)
A UNION B (all rows from both A and B, without repeated rows)
A EXCEPT B (all rows in A that do not appear in B, where A and B are queries)
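The review above can be tried end to end on a toy database. The following is a minimal sketch using SQLite from Python; the instructor and teaches tables and their four rows are invented for illustration, and the HAVING threshold is lowered to 1 so the tiny data set still returns a row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE instructor (ID TEXT PRIMARY KEY, name TEXT, dept_name TEXT);
    CREATE TABLE teaches (ID TEXT, course_id TEXT);
    INSERT INTO instructor VALUES
        ('1', 'Ann', 'Physics'), ('2', 'Bob', 'Physics'),
        ('3', 'Cy', 'Music'), ('4', 'Di', 'Math');
    INSERT INTO teaches VALUES ('1', 'PHY-101'), ('3', 'MU-199');
""")

# WHERE filters rows before grouping; HAVING filters groups afterwards.
grouped = conn.execute("""
    SELECT dept_name, COUNT(*) AS cnt
    FROM instructor
    WHERE dept_name LIKE '%i%'
    GROUP BY dept_name
    HAVING COUNT(*) > 1
    ORDER BY cnt DESC
""").fetchall()

# Inner join: only instructors that have a matching teaches row.
inner = conn.execute("""
    SELECT i.name, t.course_id
    FROM instructor AS i JOIN teaches AS t ON t.ID = i.ID
    ORDER BY i.name
""").fetchall()

# Left join: every instructor, with None (NULL) where nothing matches.
left = conn.execute("""
    SELECT i.name, t.course_id
    FROM instructor AS i LEFT JOIN teaches AS t ON t.ID = i.ID
    ORDER BY i.name
""").fetchall()

# WITH plus a set operation: IDs of instructors who teach nothing.
not_teaching = conn.execute("""
    WITH teaching AS (SELECT ID FROM teaches)
    SELECT ID FROM instructor
    EXCEPT
    SELECT ID FROM teaching
    ORDER BY ID
""").fetchall()

print(grouped, inner, left, not_teaching, sep="\n")
```

Comparing the inner and left results side by side is a quick way to check which kind of join a homework question is really asking for.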
Chapter 5: Advanced SQL
Chapter 6.1: Relational Algebra
Rule 1: You don't need to qualify the results of a natural join. The results of a natural join ALWAYS have only one copy of the overlapping attributes.
Rule 2: Predicates, attribute lists and functions in $$\sigma, \Pi, \mathcal{G}$$ are always subscripts.
Rule 3: Relations, whether expressions or names, NEVER appear as a superscript or subscript.
Rule 4: Input relations, whether expressions or names, always have parentheses around them. E.g.
$$$\Large~_{sect\_id,course\_id,year,semester}\mathcal{G}_{count(ID)}(student \bowtie takes)$$$
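As a second, smaller illustration of Rules 2 through 4, here is a sketch that assumes the textbook's instructor relation with name, dept_name and salary attributes (the salary threshold is made up):

$$\Pi_{name}(\sigma_{dept\_name = \text{"Physics"} \wedge salary > 90000}(instructor))$$

The predicate and the attribute list are subscripts, no relation appears in a subscript, and each input, including the nested selection, sits inside parentheses.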
Chapter 7
Answers provided by the book to 7.2 seem a bit rough to me. You will find an image that expands on the book's answer here: Ch7.16 Annotated Drawings.pdf
Chapter 8
Normal Forms
The following definition of a relation is adapted from Database Processing by Kroenke. A relation has the following characteristics:
- Rows contain data about an entity
- Columns contain data about attributes of the entity
- All entries in a column are of the same kind
- Each column has a unique name
- Cells of the table hold a single (atomic) value
- The order of the columns is unimportant
- The order of the rows is unimportant
- No two rows may be identical
The following definitions are found in "Database System Concepts" Sixth Edition by Silberschatz, Korth and Sudarshan
First Normal Form p. 328 A relation $$R$$ is in first normal form if the domains of all attributes of $$R$$ are atomic.
Second Normal Form p. 372 A relation $$R$$ is in second normal form if each attribute $$A$$ in $$R$$ meets one of the following criteria:
- It appears in a candidate key
- It is not partially dependent on a candidate key.
Partial Dependency p. 372 A functional dependency $$\alpha \rightarrow \beta$$ is called a partial dependency if there is a proper subset $$\gamma$$ of $$\alpha$$ such that $$\gamma \rightarrow \beta$$. We say that $$\beta$$ is partially dependent on $$\alpha$$.
Third Normal Form p. 336 A relation schema $$R$$ is in third normal form with respect to a set $$F$$ of functional dependencies if, for all functional dependencies in $$F^+$$ of the form $$\alpha \rightarrow \beta$$, where $$\alpha \subseteq R$$ and $$\beta \subseteq R$$, at least one of the following holds:
- $$\alpha \rightarrow \beta$$ is a trivial functional dependency (that is, $$\beta \subseteq \alpha$$).
- $$\alpha$$ is a superkey for schema $$R$$.
- Each attribute $$A$$ in $$\beta - \alpha$$ is contained in a candidate key for $$R$$.
Boyce Codd Normal Form (BCNF) p. 333 A relation schema $$R$$ is in BCNF with respect to a set $$F$$ of functional dependencies if, for all functional dependencies in $$F^+$$ of the form $$\alpha \rightarrow \beta$$, where $$\alpha \subseteq R$$ and $$\beta \subseteq R$$, at least one of the following holds:
- $$\alpha \rightarrow \beta$$ is a trivial functional dependency (that is, $$\beta \subseteq \alpha$$).
- $$\alpha$$ is a superkey for schema $$R$$.
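The BCNF test above can be sketched in a few lines of Python. This is an illustration rather than the book's algorithm: `closure` and `bcnf_violations` are hypothetical helper names, the schema and dependencies are made up, and the check looks only at the given set $$F$$, which is enough for testing the original schema but not for testing relations produced by a decomposition.

```python
def closure(attrs, fds):
    """Return the closure of the attribute set `attrs` under the FDs in `fds`.

    `fds` is a list of (lhs, rhs) pairs; each side is a set of attribute names.
    """
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # Once all of lhs is known, everything in rhs follows.
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def bcnf_violations(schema, fds):
    """Return the FDs that are neither trivial nor have a superkey on the left."""
    violations = []
    for lhs, rhs in fds:
        trivial = rhs <= lhs
        superkey = closure(lhs, fds) >= schema
        if not trivial and not superkey:
            violations.append((lhs, rhs))
    return violations

# Made-up example: each instructor is in one department,
# and each department is in one building.
schema = {"ID", "name", "dept_name", "building"}
fds = [
    ({"ID"}, {"name", "dept_name"}),
    ({"dept_name"}, {"building"}),
]

id_closure = closure({"ID"}, fds)
violations = bcnf_violations(schema, fds)
print(id_closure == schema)  # ID determines every attribute, so it is a superkey
print(violations)            # dept_name -> building has a non-superkey left side
```

Here dept_name determines building but not the whole schema, so the relation is not in BCNF.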
Fourth Normal Form pp. 358-359 A relation schema $$r(R)$$ is in fourth normal form with respect to a set $$D$$ of functional and multivalued dependencies if, for all multivalued dependencies in $$D^+$$ of the form $$\alpha \rightarrow \rightarrow \beta$$, where $$\alpha \subseteq R$$ and $$\beta \subseteq R$$, at least one of the following holds:
- $$\alpha \rightarrow \rightarrow \beta$$ is a trivial multivalued dependency.
- $$\alpha$$ is a superkey for schema $$R$$.
Chapter 20
Using this data: Aircraft Data.xlsx
Import the data into a database and try the following query:
USE Aircraft
GO
SELECT [Eng.HP], [Description],
       AVG(Cruise) AS AvgCruise,
       AVG(TopSpd) AS AvgTopSpeed,
       AVG([Gross Wt.]) AS AvgWt,
       AVG(UsefulLoad) AS avgUsefulLoad,
       AVG(GPH) AS AvgFuelFlow
FROM AirCraftData
GROUP BY [Eng.HP], [Description] WITH CUBE
ORDER BY AvgFuelFlow;
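If it is not obvious what WITH CUBE adds, the sketch below spells it out. In SQL Server, CUBE returns the ordinary groups plus subtotal rows for every subset of the grouping columns, with NULL in the columns that were rolled up. SQLite has no CUBE, so this illustration emulates it with UNION ALL; the planes table, its three columns (standing in for Eng.HP, Description and GPH) and its four rows are all made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE planes (hp INTEGER, kind TEXT, gph REAL);
    INSERT INTO planes VALUES
        (100, 'Single', 6.0), (100, 'Single', 8.0),
        (200, 'Single', 10.0), (200, 'Twin', 20.0);
""")

# One SELECT per subset of the grouping columns {hp, kind}.
rows = conn.execute("""
    SELECT hp, kind, AVG(gph) FROM planes GROUP BY hp, kind  -- both columns
    UNION ALL
    SELECT hp, NULL, AVG(gph) FROM planes GROUP BY hp        -- hp only
    UNION ALL
    SELECT NULL, kind, AVG(gph) FROM planes GROUP BY kind    -- kind only
    UNION ALL
    SELECT NULL, NULL, AVG(gph) FROM planes                  -- grand total
""").fetchall()

for row in rows:
    print(row)
```

With two grouping columns there are four grouping combinations, which is why the row count grows quickly as more columns are added to a CUBE.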