Differences between revisions 9 and 91 (spanning 82 versions)
Revision 9 as of 2010-03-26 12:58:31
Size: 1998
Editor: 24-183-238-75
Comment:
Revision 91 as of 2024-01-17 14:52:36
Size: 12866
Editor: scot
Comment:
Deletions are marked like this. Additions are marked like this.
Line 2: Line 2:

This class is about Database Management Systems. As such we talk about creating, managing and programming SQL databases. When time permits, we also cover spatial/spatial temporal data and database systems. This site contains content developed by Dr. Anderson for his classes.
<<TableOfContents>>

== Course Description ==

Introduction to database management systems, including data modeling, query languages and processing, database design, data integrity and security. Issues related to distributed database systems, object-oriented database systems, and legacy database systems are also discussed.

== Purpose, Goals & Objectives ==

Introduction
to database management systems, including data modeling, query languages and processing, database design, data integrity and security. Issues related to distributed database systems, object-oriented database systems, and legacy database systems are also discussed. At the conclusion of this course a student will:

 1. Understand the difference between the file-oriented method of building applications and the database method.
 1. Learn and practice data modeling using the E-R model.
 1. Learn and practice designing normalized databases using the E-R.
 1. Learn the fundamentals of the relational data model and understand its role in database implementation.
 1. Learn and practice SQL and Relational Algebra.
 1. Design and implement a relational database and application in C# using SQL server.
 1. Understand the implications and problems of enterprise database systems and study solutions to some of the problems.
 1. Understand the role of the "people" involved in a database system.
 1. Understand the need for data warehousing and mining applications.
 1. Understand the uses of spatial temporal and constraint databases.

= Resources =

 * [[/Schedule|Schedule]]

<<ChildPages(on=not:edit|AttachFile|LocalSiteMap, more_link=More..., max_pages=25, title=<h3>%s Child Pages</h3>)>>

This class is about Database Management Systems. As such we talk about creating, managing and programming SQL databases. When time permits, we also cover spatial/spatial temporal data and database systems. This site contains content developed by Dr. Anderson for his classes.

== Accessing Database Examples using SQLite Browser ==

 1. Download the SQLite Browser for your platform
 1. Download a SQLite DB file provided on eclass.
 1. Open it in your SQLite Browser ... and enjoy the simplicity and cross platform-ness!

== Accessing the Database using Visual Studio Code ==

First off, we will be using SQLite Browser, so you won't need Visual Studio Code accessing SQL Server very soon (I'll let you know when)! Everyone can run a SQL Server through Docker. I recommend using this as it will provide cross platform access for you.

=== Install SQL Server using Docker ===

Microsoft provide excellent directions [[https://docs.microsoft.com/en-us/sql/linux/quickstart-install-connect-docker?view=sql-server-ver15&pivots=cs1-bash|here]]. Do get one of these up and running first before you continue. Also if you have SQL Server (of any variety include localdb) remove it from your system, you will no longer need it.

=== Access your SQL Server from VSCode ===

If you haven't done so, install VSCode with the mssql extension.

 1. Click the extensions icon on the left tool bar.
 1. Search for and install mssql extension.

To use Visual Studio code mssql plugin

 1. Press CTRL+N. Visual Studio Code opens a new 'Plain Text' file by default.
 1. Press CTRL+K,M and change the language mode to SQL.
 1. Alternatively, open an existing file with .sql file extension. The language mode is automatically SQL for files that have the .sql extension.
 1. press CTRL+SHIFT+P (or F1) to open the Command Palette.
 1. Type sqlcon and press ENTER. (Select the MS SQL: Connect command.)
 1. Select Create Connection Profile. This creates a connection profile for your SQL Server instance.
 1. Follow the prompts to specify the connection properties for the new connection profile. After specifying each value, press ENTER to continue.


{{{#!highlight sql
--Press: Ctrl+Shift+E to execute a query
--To execute a specific query, hightlight the query before you execute it.

-- To see a list of databases try:
select *
from sys.databases;

-- To select a database to use try:
Use UniversitySmall;

-- To Show the tables in a database try:
SELECT *
FROM information_schema.tables;
}}}

For additional information see https://docs.microsoft.com/en-us/sql/linux/sql-server-linux-develop-use-vscode?view=sql-server-2017

[[/ExploreDbms|How to Explore the DBMS]]
Line 6: Line 85:

 * Lecture 1: It really easy that easy to do small things
   * Commandline [[DatabaseManagementSystems/DbConnectedExample|Connected Example]]
   * Forms [[DatabaseManagementSystems/DbDisconnectedExample|Disconnected Example]]
 * Lecture 2: Look at the Objects we used in the Forms from Lecture 1.
   * [[http://msdn.microsoft.com/en-us/library/system.data.dataset.aspx?PHPSESSID=tn8k5p1s508cop8gr43e1f34d2|DataSet Class]]
   * Filled by [[http://msdn.microsoft.com/en-us/library/bz9tthwx%28VS.80%29.aspx|TableAdapter]] contained in the [[http://msdn.microsoft.com/en-us/library/bb384426.aspx|TableAdapterManager]]
   * [[http://msdn.microsoft.com/en-us/library/system.windows.forms.bindingsource.aspx|BindingSource as a service provider]]
      * How are GUI elements bound then?
      * How do the GUI elements change?
   * [[http://msdn.microsoft.com/en-us/library/system.windows.forms.bindingnavigator.aspx|BindingNavigator]]

== Programming Resources ==
   * See ProgrammingLinks for more examples.
 * [[SqlBackupAll|Scripting a complete Backup]]

== Database diagrams ==

 * We are now using [[vertabelo.com|Vertebelo]] for database diagrams.

== Programming & Design Resources ==
Line 22: Line 97:
 * Domain Driven Design has an entire website to look at: http://domaindrivendesign.org/
 * NHibernate: http://nhibernate.org
 * Naked Entites/Objects (http://www.nakedobjects.com) is a for pay framework
 * CSLA.NET
 * SpecExpress http://specexpress.codeplex.com

And many more to follow I'm sure.


 
 * [[http://domaindrivendesign.org/|Domain Driven Design website]]
 * [[http://msdn.microsoft.com/en-us/practices/default.aspx|MSDN Patterns and Practices]]
 * [[http://nhibernate.org|NHibernate]]
 * [[http://specexpress.codeplex.com|SpecExpress]]
 * [[http://www.jetbrains.com/teamcity/download/|TeamCity]]

== Project Ideas ==

See: [[DatabaseManagementSystems/Cptr319ProjectIdeas|Project Ideas]]

== Chapter Notes and Hints ==

These notes and hints serve to review you for exams and to help you with your homework. Occasionally you will find information here that is not in the book. You should READ THIS BEFORE YOU DO YOUR HOMEWORK!

=== Chapter 3 and 4 Hints ===

If you are having problems joining tables in the FROM and WHERE clauses, you are in the majority. There are two hints that I will give you to help you obtain the right perspective on joining tables:

'''Rule 1:''' Just because two attributes have the same name doesn't mean that you should join them on these attributes. STOP, THINK about the semantics (the meaning) of the attribute. Do they both refer to the same thing? If the answer is yes, then you should join on them. Using NATURAL JOIN will match strings, not semantics, so be careful. Also NATURAL JOIN "works" but not on most systems and even on Oracle it is not encouraged... so DON'T USE IT ON YOUR HOMEWORK. If you do I'll run it on Oracle. If it doesn't work, you will be docked points accordingly.

'''Rule 2:''' When you are looking at the attributes and relations, something is a foreign key only if it is '''the''' primary key in a '''different''' relation. Even if the foreign key is part (but not all) of the key in the relation it is in, it is a foreign key.

'''Rule 3:''' The book doesn't always give you all the attributes that make up the primary key. Look carefully at the values and make sure when you create tables in Chapter 4 that the attributes you select to be the primary key are sufficient to uniquely identify a each row.

'''Review of the different types of queries we learned'''

{{{#!highlight sql
--Basic (But complete) query
SELECT dept_name, COUNT(*) AS CNT -- List of attributes in the output
FROM instructor -- List of Relations that I am getting data from
WHERE dept_name LIKE '%i%' -- Filter data before the grouping (this is only for departments that have and "i" in them
GROUP BY dept_name -- Must include every non-aggregated attribute from select (and maybe others)
HAVING CNT > 3 -- Only really valid on aggregate value because it filters after the grouping.
ORDER BY CNT DESC; -- Orders the results.

-- Subquries:
SELECT -- can have a subquery if the subquery returns only a single value.
FROM -- Most common place for a subquery
WHERE -- Use a subquery for membership check like we did in Retake #1.5
GROUP BY -- NEVER USES a subquery
HAVING -- Same as where, but rarely used.
ORDER BY -- NEVER USES a subquery

-- With Queries
WITH name AS (any kind of query), [name AS (any kind of query)]*
QUERY --that uses the temporary tables created in the with.

JOIN vs LEFT JOIN -- First is a inner join similar to joining in a where clause
                  -- Left join takes all the rows from the left and only matching rows from the right
      
-- SET OPERATIONS:
A INTERSECT B (only rows in commong)
A UNION B (All rows from both A and B - without repeated rows)
A EXCEPT B (all rows in A that do not appear in B, where A and B are queries)
}}}

=== Chapter 5: Advanced SQL ===

 * /ViewExample
 * /ConcatenateStringsExample

=== Chapter 6.1: Relational Algebra ===

'''Rule 1:''' You don't need to qualify the results of a natural join. The results of a natural join ALWAYS only have one copy of overlapping attributes.

'''Rule 2:''' Predicates, attribute lists and functions in $$\sigma, \Pi, \mathcal{G}$$ are always subscript.

'''Rule 3:''' Relations, whether expressions or names, NEVER appear as a superscript or subscript.

'''Rule 4:''' Input relations, whether expressions or name, always have parenthesis around them. E.g.

$$$\Large~_{sect\_id,course\_id,year,semester}\mathcal{G}_{count(ID)}(student \bowtie takes)$$$

=== Chapter 7 ===

Answers provided by the book to 7.2 seem a bit rough to me. You will find an image that expands on the book's answer here: [[attachment:Ch7.16 Annotated Drawings.pdf]]

=== Chapter 8 ===

==== Normal Forms ====


The following definition of a relation is adapted from Database Processing by Kronke. A relation has the following characteristics

 * Rows contain data about an entity
 * Columns contain data about attributes the entity
 * All entities entries in a column are the same kind
 * Each column has a unique name
 * Cells of the table hold a single (atomic) value
 * The order of the columns is unimportant
 * The order of the rows is unimportant
 * Now two rows may be identical

The following definitions are found in "Database System Concepts" Sixth Edition by Silberschatz, Korth and Sudarshan

'''First Normal Form p. 328'''
A relation $$R$$ is in first normal form if the domains of all attributes of $$R$$ are atomic.

'''Second Normal Form p. 372'''
A relation $$R$$ is in second normal form if each attribute $$A$$ in $$R$$ meets one of the following criteria:
 * It appears in a candidate key
 * It is not partially dependent on a candidate key.

'''Partial Dependency p. 372'''
A functional dependency $$\alpha \rightarrow \beta$$ is called a partial dependency if there is a proper subset $$\gamma$$ of $$\alpha$$ such that $$\gamma \rightarrow \beta$$. We say that $$\beta$$ is partially dependent on $$\alpha$$.
\end{definition}

'''Third Normal Form p. 336'''
A relation schema $$R$$ is in third normal form with respect to a set $$F$$ of functional dependencies if, for all functional dependencies in $$F^+$$ of the form $$\alpha \rightarrow \beta$$, where $$\alpha \subseteq R$$ and $$\beta \subseteq R$$, at least one of the following holds:
 * $$\alpha \rightarrow \beta$$ is a trivial functional dependency (that is, $$\beta \subseteq \alpha$$).
 * $$\alpha$$ is a superkey for schema $$R$$.
 * Each attribute $$A$$ in $$\beta - \alpha$$ is contained in a candidate key for $$R$$.

'''Boyce Codd Normal Form (BCNF) p. 333'''
A relation schema $$R$$ is in BCNF with respect to a set $$F$$ of functional dependencies if, for all functional dependencies in $$F^+$$ of the form $$\alpha \rightarrow \beta$$, where $$\alpha \subseteq R$$ and $$\beta \subseteq R$$, at least one of the following holds:
 * $$\alpha \rightarrow \beta$$ is a trivial functional dependency (that is, $$\beta \subseteq \alpha$$).
 * $$\alpha$$ is a superkey for schema $$R$$.

'''Fourth Normal Form pp. 358-359'''
A relation schema $$r(R)$$ is in fourth normal form with respect to a set $$D$$ of functional and multivalued dependencies if, for all multivalued dependencies in $$D^+$$ of the form $$\alpha \rightarrow \rightarrow \beta$$, where $$\alpha \subseteq R$$ and $$\beta \subseteq R$$, at least one of the following holds:
 * $$\alpha \rightarrow \rightarrow \beta$$ is a trivial multivalued dependency.
 * $$\alpha$$ is a superkey for schema $$R$$.

=== Chapter 20 ===

Using this data: [[attachment:Aircraft Data.xlsx]]

Import the data into a database and try the following query.

Try the following query:

{{{
USE Aircraft

GO

SELECT [Eng.HP], [Description], AVG(Cruise) AS AvgCruise,
    AVG(TopSpd) AS AvgTopSpeed,
    AVG([Gross Wt.]) AS AvgWt,
    AVG(UsefulLoad) AS avgUsefulLoad,
    AVG(GPH) AS AvgFuelFlow
FROM AirCraftData
GROUP BY [Eng.HP], [Description]
WITH CUBE
ORDER BY AvgFuelFlow;
}}}

Database Management Systems

Course Description

Introduction to database management systems, including data modeling, query languages and processing, database design, data integrity and security. Issues related to distributed database systems, object-oriented database systems, and legacy database systems are also discussed.

Purpose, Goals & Objectives

Introduction to database management systems, including data modeling, query languages and processing, database design, data integrity and security. Issues related to distributed database systems, object-oriented database systems, and legacy database systems are also discussed. At the conclusion of this course a student will:

  1. Understand the difference between the file-oriented method of building applications and the database method.
  2. Learn and practice data modeling using the E-R model.
  3. Learn and practice designing normalized databases using the E-R.
  4. Learn the fundamentals of the relational data model and understand its role in database implementation.
  5. Learn and practice SQL and Relational Algebra.
  6. Design and implement a relational database and application in C# using SQL server.
  7. Understand the implications and problems of enterprise database systems and study solutions to some of the problems.
  8. Understand the role of the "people" involved in a database system.
  9. Understand the need for data warehousing and mining applications.
  10. Understand the uses of spatial temporal and constraint databases.

Resources

Database Management Systems Child Pages

This class is about Database Management Systems. As such we talk about creating, managing and programming SQL databases. When time permits, we also cover spatial/spatial temporal data and database systems. This site contains content developed by Dr. Anderson for his classes.

Accessing Database Examples using SQLite Browser

  1. Download the SQLite Browser for your platform
  2. Download a SQLite DB file provided on eclass.
  3. Open it in your SQLite Browser ... and enjoy the simplicity and cross platform-ness!

Accessing the Database using Visual Studio Code

First off, we will be using SQLite Browser, so you won't need Visual Studio Code accessing SQL Server very soon (I'll let you know when)! Everyone can run a SQL Server through Docker. I recommend using this as it will provide cross platform access for you.

Install SQL Server using Docker

Microsoft provide excellent directions here. Do get one of these up and running first before you continue. Also if you have SQL Server (of any variety include localdb) remove it from your system, you will no longer need it.

Access your SQL Server from VSCode

If you haven't done so, install VSCode with the mssql extension.

  1. Click the extensions icon on the left tool bar.
  2. Search for and install mssql extension.

To use Visual Studio code mssql plugin

  1. Press CTRL+N. Visual Studio Code opens a new 'Plain Text' file by default.
  2. Press CTRL+K,M and change the language mode to SQL.
  3. Alternatively, open an existing file with .sql file extension. The language mode is automatically SQL for files that have the .sql extension.
  4. press CTRL+SHIFT+P (or F1) to open the Command Palette.
  5. Type sqlcon and press ENTER. (Select the MS SQL: Connect command.)
  6. Select Create Connection Profile. This creates a connection profile for your SQL Server instance.
  7. Follow the prompts to specify the connection properties for the new connection profile. After specifying each value, press ENTER to continue.

   1 --Press: Ctrl+Shift+E to execute a query
   2 --To execute a specific query, hightlight the query before you execute it.
   3 
   4 -- To see a list of databases try:
   5 select * 
   6 from sys.databases;
   7 
   8 -- To select a database to use try:
   9 Use UniversitySmall;
  10 
  11 -- To Show the tables in a database try: 
  12 SELECT *
  13 FROM information_schema.tables;

For additional information see https://docs.microsoft.com/en-us/sql/linux/sql-server-linux-develop-use-vscode?view=sql-server-2017

How to Explore the DBMS

Programming Examples

Database diagrams

  • We are now using Vertebelo for database diagrams.

Programming & Design Resources

Learning to program something more complex than a one off programming assignments for students can be difficult. Where do the experts learn to program large database applications and what tools and methods do they use? This section is not an answer, but a suggestion on where to look for those answers. Someday, I hope to have a class that explores this in depth.

Project Ideas

See: Project Ideas

Chapter Notes and Hints

These notes and hints serve to review you for exams and to help you with your homework. Occasionally you will find information here that is not in the book. You should READ THIS BEFORE YOU DO YOUR HOMEWORK!

Chapter 3 and 4 Hints

If you are having problems joining tables in the FROM and WHERE clauses, you are in the majority. There are two hints that I will give you to help you obtain the right perspective on joining tables:

Rule 1: Just because two attributes have the same name doesn't mean that you should join them on these attributes. STOP, THINK about the semantics (the meaning) of the attribute. Do they both refer to the same thing? If the answer is yes, then you should join on them. Using NATURAL JOIN will match strings, not semantics, so be careful. Also NATURAL JOIN "works" but not on most systems and even on Oracle it is not encouraged... so DON'T USE IT ON YOUR HOMEWORK. If you do I'll run it on Oracle. If it doesn't work, you will be docked points accordingly.

Rule 2: When you are looking at the attributes and relations, something is a foreign key only if it is the primary key in a different relation. Even if the foreign key is part (but not all) of the key in the relation it is in, it is a foreign key.

Rule 3: The book doesn't always give you all the attributes that make up the primary key. Look carefully at the values and make sure when you create tables in Chapter 4 that the attributes you select to be the primary key are sufficient to uniquely identify a each row.

Review of the different types of queries we learned

   1 --Basic (But complete) query
   2 SELECT dept_name, COUNT(*) AS CNT  -- List of attributes in the output
   3 FROM instructor                    -- List of Relations that I am getting data from 
   4 WHERE dept_name LIKE '%i%'         -- Filter data before the grouping (this is only for departments that have and "i" in them
   5 GROUP BY dept_name                 -- Must include every non-aggregated attribute from select (and maybe others)
   6 HAVING CNT > 3                     -- Only really valid on aggregate value because it filters after the grouping.
   7 ORDER BY CNT DESC;                 -- Orders the results.
   8 
   9 -- Subquries:
  10 SELECT -- can have a subquery if the subquery returns only a single value.
  11 FROM   -- Most common place for a subquery
  12 WHERE  -- Use a subquery for membership check like we did in Retake #1.5
  13 GROUP BY  -- NEVER USES a subquery
  14 HAVING -- Same as where, but rarely used.
  15 ORDER BY  -- NEVER USES a subquery
  16 
  17 -- With Queries
  18 WITH name AS (any kind of query), [name AS (any kind of query)]*
  19 QUERY --that uses the temporary tables created in the with. 
  20 
  21 JOIN vs LEFT JOIN -- First is a inner join similar to joining in a where clause
  22                   -- Left join takes all the rows from the left and only matching rows from the right 
  23                                   
  24 -- SET OPERATIONS:
  25 A INTERSECT B (only rows in commong)
  26 A UNION B (All rows from both A and B - without repeated rows)
  27 A EXCEPT B (all rows in A that do not appear in B, where A and B are queries)

Chapter 5: Advanced SQL

Chapter 6.1: Relational Algebra

Rule 1: You don't need to qualify the results of a natural join. The results of a natural join ALWAYS only have one copy of overlapping attributes.

Rule 2: Predicates, attribute lists and functions in $$\sigma, \Pi, \mathcal{G}$$ are always subscript.

Rule 3: Relations, whether expressions or names, NEVER appear as a superscript or subscript.

Rule 4: Input relations, whether expressions or name, always have parenthesis around them. E.g.

$$$\Large~_{sect\_id,course\_id,year,semester}\mathcal{G}_{count(ID)}(student \bowtie takes)$$$

Chapter 7

Answers provided by the book to 7.2 seem a bit rough to me. You will find an image that expands on the book's answer here: Ch7.16 Annotated Drawings.pdf

Chapter 8

Normal Forms

The following definition of a relation is adapted from Database Processing by Kronke. A relation has the following characteristics

  • Rows contain data about an entity
  • Columns contain data about attributes the entity
  • All entities entries in a column are the same kind
  • Each column has a unique name
  • Cells of the table hold a single (atomic) value
  • The order of the columns is unimportant
  • The order of the rows is unimportant
  • Now two rows may be identical

The following definitions are found in "Database System Concepts" Sixth Edition by Silberschatz, Korth and Sudarshan

First Normal Form p. 328 A relation $$R$$ is in first normal form if the domains of all attributes of $$R$$ are atomic.

Second Normal Form p. 372 A relation $$R$$ is in second normal form if each attribute $$A$$ in $$R$$ meets one of the following criteria:

  • It appears in a candidate key
  • It is not partially dependent on a candidate key.

Partial Dependency p. 372 A functional dependency $$\alpha \rightarrow \beta$$ is called a partial dependency if there is a proper subset $$\gamma$$ of $$\alpha$$ such that $$\gamma \rightarrow \beta$$. We say that $$\beta$$ is partially dependent on $$\alpha$$. \end{definition}

Third Normal Form p. 336 A relation schema $$R$$ is in third normal form with respect to a set $$F$$ of functional dependencies if, for all functional dependencies in $$F^+$$ of the form $$\alpha \rightarrow \beta$$, where $$\alpha \subseteq R$$ and $$\beta \subseteq R$$, at least one of the following holds:

  • $$\alpha \rightarrow \beta$$ is a trivial functional dependency (that is, $$\beta \subseteq \alpha$$).
  • $$\alpha$$ is a superkey for schema $$R$$.
  • Each attribute $$A$$ in $$\beta - \alpha$$ is contained in a candidate key for $$R$$.

Boyce Codd Normal Form (BCNF) p. 333 A relation schema $$R$$ is in BCNF with respect to a set $$F$$ of functional dependencies if, for all functional dependencies in $$F^+$$ of the form $$\alpha \rightarrow \beta$$, where $$\alpha \subseteq R$$ and $$\beta \subseteq R$$, at least one of the following holds:

  • $$\alpha \rightarrow \beta$$ is a trivial functional dependency (that is, $$\beta \subseteq \alpha$$).
  • $$\alpha$$ is a superkey for schema $$R$$.

Fourth Normal Form pp. 358-359 A relation schema $$r(R)$$ is in fourth normal form with respect to a set $$D$$ of functional and multivalued dependencies if, for all multivalued dependencies in $$D^+$$ of the form $$\alpha \rightarrow \rightarrow \beta$$, where $$\alpha \subseteq R$$ and $$\beta \subseteq R$$, at least one of the following holds:

  • $$\alpha \rightarrow \rightarrow \beta$$ is a trivial multivalued dependency.
  • $$\alpha$$ is a superkey for schema $$R$$.

Chapter 20

Using this data: Aircraft Data.xlsx

Import the data into a database and try the following query.

Try the following query:

USE Aircraft

GO

SELECT [Eng.HP], [Description], AVG(Cruise) AS AvgCruise, 
                                AVG(TopSpd) AS AvgTopSpeed, 
                                AVG([Gross Wt.]) AS AvgWt,
                                AVG(UsefulLoad) AS avgUsefulLoad,
                                AVG(GPH) AS AvgFuelFlow
FROM AirCraftData
GROUP BY [Eng.HP], [Description]
WITH CUBE 
ORDER BY AvgFuelFlow;

DatabaseManagementSystems (last edited 2024-02-25 16:33:21 by scot)