Netside where knowledge is shared, ideas are spread.

[Notes] CSCI 585 DB Structured Query Language (SQL)

Credit to: Prof. Saty Raghavachary, CSCI 585, Spring 2020

outline

  • The basic commands and functions of SQL
  • How to use SQL for data administration (to create tables and indexes)
  • How to use SQL for data manipulation (to add, modify, delete, and retrieve data)
  • How to use SQL to query a database for useful information

Structured Query Language (SQL)

  • Categories of SQL function
    • Data definition language (DDL)
    • Data manipulation language (DML)
  • Nonprocedural language with basic command vocabulary set of less than 100 words
  • Differences in SQL dialects are minor

Table 7.1 - SQL Data Definition Command

Table 7.2 - SQL Data Manipulation Commands

Figure 7.1 - The Database Model

Common SQL Data Types

  • Numeric
    • NUMBER(L,D) or NUMERIC(L,D)
  • Character
    • CHAR(L)
    • VARCHAR(L) or VARCHAR2(L)
  • Date
    • DATE

char vs VARCHAR (reserved for future use by SQL - so don’t use!) vs VARCHAR2 [versus NVARCHAR2 (for Unicode)]: http://www.orafaq.com/faq/what_is_the_difference_between_varchar_varchar2_and_char_data_types

Creating Table Structures

  • Use one line per column (attribute) definition
  • Use spaces to line up attribute characteristics and constraints
  • Table and attribute names are capitalized
  • Features of table creating command sequence
    • NOT NULL specification
    • UNIQUE specification
  • Syntax to create table
    • CREATE TABLE tablename();

Primary Key and Foreign Key

  • Primary key attributes contain both a NOT NULL and a UNIQUE specification
  • RDBMS will automatically enforce referential integrity for foreign keys
  • Command sequence ends with semicolon
  • ANSI SQL allows use of following clauses to cover CASCADE, SET NULL, or SET DEFAULT
    • ON DELETE and ON UPDATE
CREATE TABLE PRODUCT(
    P_CODE      VARCHAR(10)     NOT NULL    UNIQUE,
    P_DESCRIPT  VARCHAR(35)     NOT NULL,
    P_INDATE    DATE            NOT NULL,
    P_QOH       SMALLINT        NOT NULL,
    P_PRICE     NUMBER(8,2)     NOT NULL,
    P_DISCOUNT  NUMBER(5,2)     NOT NULL,
    V_CODE      INTEGER,
    PRIMARY KEY(P_CODE),
    FOREIGN KEY(V_CODE) REFERENCES VENDOR ON UPDATE CASCADE 
);

SQL Constraints

  • NOT NULL
    • Ensures that column does not accept nulls
  • UNIQUE
    • Ensures that all values in column are unique
  • DEFAULT
    • Assigns value to attribute when a new row is added to table
  • CHECK
    • Validates data when attribute value is entered

Plus: ON UPDATE CASCADE and ON DELETE CASCADE - both affect [change] a secondary table (that has an FK), when a change is made in the primary table (with the corresponding PK). ON UPDATE will update the values in the secondary table when corresp. values in the primary table are changed; ON DELETE will delete rows in the secondary table, when linked rows are deleted in the primary table.dates data when attribute values is entered

Data Manipulation Commands

  • INSERT: Command to insert data into table
    • Syntax - INSERT INTO tablename VALUES();
    • Used to add table rows with NULL and NOT NULL attributes
INSERT INTO VENDOR VALUES(2155,'Bryson,Inc.','Smithson','615','223-3234','TN','y');

INSERT INTO PRODUCT(P_CODE,P_DESCRIPT) VALUES('BRT-345','Titanium dril bit');
  • COMMIT: Command to save changes
    • Syntax - COMMIT [WORK];
    • Ensures database update integrity
  • SELECT: Command to list the contents
    • Syntax - SELECT columnlist FROM tablename;
    • Wildcard character(*): Substitute for other characters/command
  • UPDATE: Command to modify data
  • Syntax - UPDATE tablename SET columnname = expression [, columnname = expression] [WHERE conditionlist];
  • WHERE condition
    • Specifies the rows to be selected
  • ROLLBACK: Command to restore the database
    • Syntax - ROLLBACK;
    • Undoes the changes since last COMMIT command
  • DELETE: Command to delete
    • Syntax - DELETE FROM tablename [WHERE conditionlist];

SELECT operates on 1 or more columns, and 0 or more rows from 1 or more tables - like many SQL commands, it is set-oriented and non procedural.

UPDATE modifies one or more columns of a table (on all rows, or on specific rows based on a condition specified by WHERE).

SELECT * FROM PRODUCT;

SELECT P_CODE,P_DESCRIPT FROM PRODUCT;

UPDATE PRODUCT SET P_INDATE='18-JAN-2014' WHERE P_CODE='13-Q2/P2';

DELETE FROM PRODUCT WHERE P_CODE='BRT-345';

Vaguely similar to parameter passing/matching during a function call [where arguments need to match in position, type, count].

Inserting Table Rows with a SELECT Subquery

  • Syntax
    • INSERT INTO tablename SELECT columnlist FROM tablename;
  • Used to add multiple rows using another table as source
  • SELECT command - Acts as a subquery and is executed first
    • Subquery: Query embedded/nested inside another query

Selecting Rows Using Conditional Restrictions

  • Following syntax enables to specify which rows to select
    • SELECT columnlist FROM tablelist [WHERE conditionlist];
  • Used to select partial table contents by placing restrictions on the rows
  • Optional WHERE clause
    • Adds conditional restrictions to the SELECT statement

Comparison Operators

  • Add conditional restrictions on selected table contents
  • Used on:
    • Character attributes
    • Dates
SYMBOL MEANING
= equal to
< less than
<= less than or equal to
> greater than
<> or != not equal to
Computed Columns and Column Aliases
  • SQL accepts any valid expressions/formulas in the computed columns
  • Alias: Alternate name given to a column or table in any SQL statement to improve the readability
  • Computed column, an alias, and date arithmetic can be used in a single query
SELECT P_DESCRIPT,P_QOH,P_PRICE,P_QOH*P_PRICE AS TOTVALUE  FROM PRODUCT;

rename the calculated colum to “TOTVALUE”

Arithmetic operators

  • The Rule of Precedence: Establish the order in which computations are completed
  • Perform:
    • Operations within parentheses
    • Power operations
    • Multiplications and divisions
    • Additions and subtractions
OPERATOR DESCRIPTION
+ add
- subtract
* multiply
/ divide
^ raise to the power of (some application use ** instead of ^)

Special Operators

  • AND
  • OR
  • BETWEEN
    • Checks whether attribute value is within a range
  • IS NULL
    • Checks whether attribute value is null
  • LIKE
    • Checks whether attribute value matches given string pattern
    • % means any and all following or preceding characters are eligible
    • _ means any one character may be substituted for the underscore
  • IN
    • Checks whether attribute value matches any value within a value list
  • EXISTS
    • Checks if subquery returns any rows

You can loosely think of EXISTS as “ONLY WHEN”. We use it to ‘defensively’ update (insert, modify, delete) parts of a table (after we determine it is updatable).

You can also think of ‘WHERE EXISTS’ as “such that there exists”. Eg. in the first example below (SELECT * FROM VENDOR..), it reads as “Find all vendors such that there exists records for them in the PRODUCT TABLE (via V_CODE) where P_QOH<=P_MIN” [in other words, we are looking for vendors we need to re-order from].

SELECT * FROM PRODUCT WHERE V_CODE IS NULL;

SELECT * FROM VENDOR WHERE V_CONTACT LIKE `Smith%`;

SELECT * FORM PRODUCT WHERE V_CODE IN (21344,24288);

SELECT V_CODE,V_NAME FORM VENDOR WHERE V_CODE IN (SELECT V_CODE FORM PRODUCT);

The EXISTS clause is used with a query, and returns TRUE if the subquery results in any output (non-zero # of rows being returned), or FALSE if the subquery results in no data. The rest of the query (the ‘main’ query) will (or will not) run, based on EXIST’s output - if EXISTS returns false, the main query will get skipped.

You can loosely think of EXISTS as “ONLY WHEN”. We use it to ‘defensively’ update (insert, modify, delete) parts of a table (after we determine it is updatable).

You can also think of ‘WHERE EXISTS’ as “such that there exists”. Eg. in the first example below (SELECT * FROM VENDOR..), it reads as “Find all vendors such that there exists records for them in the PRODUCT TABLE (via V_CODE) where P_QOH<=P_MIN” [in other words, we are looking for vendors we need to re-order from].

In the second example (INSERT INTO CONTACTS..), read it as “get the supplier ID and name for all suppliers such that there exists order IDs for them, then insert them into the contacts table”.

SELECT * FROM VENDOR WHERE EXISTS(SELECT * FROM PRODUCT WHERE P_QOH <= P_MIN);

UPDATE suppliers
SET supplier_name = (SELECT customers.name
                     FROM customers
                     WHERE customers.customer_id = suppliers.supplier_id)
WHERE EXISTS (SELECT customer.name
              FROM customers
              WHERE customers.customer_id = suppliers.supplier_id);

Advanced Data Definition Commands

  • ALTER TABLE command: To make changes in the table structure
  • Keywords use with the command
    • ADD - Adds a column
    • MODIFY - Changes column characteristics. To prevent destroy data, you need to copy back data
    • DROP - Deletes a column
  • Used to:
    • Add table constraints
    • Remove table constraints
Changing Column’s Data Type
  • ALTER can be used to change data type
  • Some RDBMSs do not permit changes to data types unless column is empty
  • Syntax – ALTER TABLE tablename MODIFY (columnname(datatype));
Changing Column’s Data Characteristics
  • Use ALTER to change data characteristics
  • Changes in column’s characteristics are permitted if changes do not alter the existing data type
  • Syntax - ALTER TABLE tablename MODIFY (columnname(characterstic));
Adding Column, Dropping Column
  • Adding a column
    • Use ALTER and ADD
    • Do not include the NOT NULL clause for new column
  • Dropping a column
    • Use ALTER and DROP
    • Some RDBMSs impose restrictions on the deletion of an attribute
  • Advanced Data Updates
  • UPDATE command updates only data in existing rows
  • If a relationship is established between entries and existing columns, the relationship can assign values to appropriate slots
  • Arithmetic operators are useful in data updates
  • In Oracle, ROLLBACK command undoes changes made by last two UPDATE statements
ALTER TABLE PRODUCT MODIFY(V_CODE CHAR(5));

ALTER TABLE PRODUCT MODIFY(P_PRICE DECIMAL(9,2));

ALTER TABLE PRODUCT ADD(P_SALECODE CHAR(1));

ALTER TABLE VENDOR DROP COLUMN V_ORDER;

UPDATE PRODUCT SET P_PRICE = P_PRICE * 1.10 WHERE P_PRICE< 50.00;

Copying Parts of Tables

Sometimes we can update a table, by filling it with output from a query. The table to be filled in has to exist first (so we need to create it if necessary), and have type-compatible columns that can receive values from our data-fetching query.

The table creation and updating can happen in separate steps, or even be combined for compactness of expression.

  • SQL permits copying contents of selected table columns
    • Data need not be reentered manually into newly created table(s)
  • Table structure is created
  • Rows are added to new table using rows from another table
CREATE TABLE PART(
  PART_CODE     CHAR(8),
  PART_DESXRIPT CHAR(35),
  PART_PRICE    DECIMAL(8,2),
  V_CODE        INTEGER,
  PRIMARY KEY (PART_CODE)
);

INSERT INTO PART    (PART_CODE, PART_DESCRIPT, PART_PRICE, V_CODE)
SELECT              P_CODE, P_DESCRIPT, P_PRICE, V_CODE FORM PRODUCT;

----

CREATE TABLE PART AS
SELECT          P_CODE AS PART_CODE, P_DESCRIPT AS PART_DESCRIPT, P_PRICE AS PART_PRICE, V_CODE
FROM            PRODUCT;

Adding Primary and Foreign Key Designations

  • ALTER TABLE command
    • Followed by a keyword that produces the specific change one wants to make
    • Options include ADD, MODIFY, and DROP
  • Syntax to add or modify columns
    • ALTER TABLE tablename
      • {ADD | MODIFY} ( columnname datatype [ {ADD | MODIFY} columnname datatype] ) ;
    • ALTER TABLE tablename
      • ADD constraint [ ADD constraint ] ;
ALTER TABLE PART ADD PRIMARY KEY (PART_CODE);

ALTER TABLE PART ADD FOREIGN KEY (V_CODE) REFERENCE VENDOR;

Deleting a Table from the Database

  • DROP TABLE: Deletes table from database
  • Syntax - DROP TABLE tablename;
  • Can drop a table only if it is not the one side of any relationship
  • RDBMS generates a foreign key integrity violation error message if the table is dropped

Additional SELECT Query Keywords

  • Logical operators work well in the query environment
  • SQL provides useful functions that:
    • Counts
    • Find minimum and maximum values
    • Calculate averages
  • SQL allows user to limit queries to entries:
    • Having no duplicates
    • Whose duplicates may be grouped
Ordering a Listing
  • ORDER BY clause is useful when listing order is important
  • Syntax - SELECT columnlis FROM tablelist [WHERE conditionlist][ORDER BY columnlist [ASC | DESC]];
  • Cascading order sequence: Multilevel ordered sequence
  • Created by listing several attributes after the ORDER BY clause
Listing Unique Values
  • DISTINCT clause: Produces list of values that are unique
  • Syntax - SELECT DISTINCT columnlist FROM tablelist;
  • Access places nulls at the top of the list
    • Oracle places it at the bottom
    • Placement of nulls does not affect list contents

AGGREGATE functions

COUNT, MIN, MAX, SUM, AVG

Grouping Data

  • Frequency distributions created by GROUP BY clause within SELECT statement. It will work with some aggregate functions

You can think of ‘GROUP BY’ to mean “CATEGORIZE BY” or “ITEMIZE BY”: rather than just a single MIN, MAX, SUM, COUNT or AVG, we’re asking for a value PER OCCURRENCE of a GROUP BY value, eg. minimum price PER VENDOR, max GPA PER DEPARTMENT, average earnings PER MAJOR, etc. Specifically, when used with SUM(), GROUP BY is used to request subtotals.

  • Syntax
    SELECT columnlist
    FROM tablelist
    [WHERE conditionlist]
    [GROUP BY columnlist]
    [HAVING conditionlist]
    [ORDER BY	columnlist [ASC | DESC]];
    
HAVING Clause

The HAVING clause is used to filter rows in a GROUP BY specification (only those rows HAVING met the specified condition will appear in the result).

Note that HAVING can only occur in a query that has a GROUP BY, which in turn can only occur when there is an aggregate function (MIN, MAX, SUM, COUNT or AVG).

  • Extension of GROUP BY feature
  • Applied to output of GROUP BY operation
  • Used in conjunction with GROUP BY clause in second SQL command set
  • Similar to WHERE clause in SELECT statement

Joining Database Tables

  • Performed when data are retrieved from more than one table at a time
    • Equality comparison between foreign key and primary key of related tables
  • Tables are joined by listing tables in FROM clause of SELECT statement
    • DBMS creates Cartesian product of every table in the FROM clause

Joining Tables With an Alias

  • Alias identifies the source table from which data are taken
  • Any legal table name can be used as alias
  • Add alias after table name in FROM clause
    • FROM tablename alias eg.’FROM PRODUCT P’

Recursive Joins

Need different aliases for the table being queried so that we can use such aliases as namespaces for attributes.

  • Recursive query: Table is joined to itself using alias
  • Use aliases to differentiate the table from itself
SELECT      E.EMP_NUM
FROM        EMP E, EMP M
WHERE       E.EMP_MGR=M.EMP_NUM
ORDER BY    E.EMP_MGR

SQL is declarative, not imperative

Now that you’re familiar with SQL syntax, you will appreciate knowing how it compares with a regular programming language such as C++. Shown below is such a comparison, using C# which lets us write code in an imperative (command-oriented) manner as well a declarative (result-oriented) one [this is from a book on functional programming]:

SQL is declarative, not imperative