Latest Post:
Loading...

Chapter 1 Database Management System complete note

Database Management System

Grade 12



Data:

Data is the raw fact about anything which is unprocessed, un-interpreted or unorganized, but able to organize into useful information. For Example: Hari, Manager, Pokhara, 5 etc.

Information:

When raw data is processed and changed it into meaning data, then the meaningful data is called information. Information is processed, interpreted or organized data.

For Example: Ram, Student of MMC, Manager of Mega Bank, etc.

Database:

A database is a large collection of interrelated data of certain place, person or things in such a way that it can be easily access by an application program like DBMS. For Example: candidates of SLC exam, users of Facebook, users of NTC/NCELL etc.

Data Processing

The process of getting useful information from data by processing data with the help of database software is called data processing


Data storage hierarchy or components of a file


u BIT: smallest unit of memory. It can either 0 or 1

u Character: collection of 8 bits. It represents alphabet, number and anyone sign or symbol.

u Field: (data item/data element) holds single facts or attribute to an entity. It is as an attribute

u Record: a collection of related fields which represents a complete unit of information

u File: collection of related records.

u Table

u objects

Problems with file processing

u In a file processing system, data needed for each user application was stored in independent data files. File processing consisted of using separate computer program s that independent data files. Some problems are

u Data duplication:

u Data Dependence

u Lack of data integration

u Record: group of values of the attributes in predetermined order of an instance of an entity set. For eg: Anil Poudel 12 XII is a record of an entity set students.

u File is a collection or identical records in an entity set and is labeled to identify the entity set

u Field: A database field is a place for a piece of information in a record or file

u A database object in a relational database is a data structure used to either store or reference data. The most common object that people interact with is the table. Other objects are indexes, stored procedures, sequences, views and many more.

Database

u Database is simply organised collection of related data arranged in a specific order so that it can be easily access, managed or updated

u Some of the examples are;

u Telephone directory

u Mark ledger

u Result sheet

u Flight schedule

Advantage of database

u It allows sharing large volume of data

u It allows searching of data quickly

u It reduces data duplication

u It allows advance filtering of data

DBMS (Database Management System)

u DBMS is a software which is used to store the data, process them and provide the useful information to the user.

u It involves in creating, modifying, deleting and adding data in database.

Advantage of DBMS.

u Reduce data redundancy

u Control data consistency: (any change made in one file automatically done in all field)

u Facilitates sharing of data

u Ensure data security

u Maintain data integrity: (make sure that database accept only the valid data)

Disadvantage of DBMS

u Required H/W and S/W are expensive.

u Costly and time-consuming procedures

u High operating cost

u Qualified persons are required to operate

u Difficult to recover backup

Database Models



Hierarchical Database model



u hierarchical database model is a data model in which the data are organized into a tree-like structure

u Data are stored as records which are connected to one another through links.

u  Each node is related to others in parent-child relationship.

u  conventional database model.



Network Model



u modified model of hierarchical model

u  each node may have several parents

u  higher level of flexibility than hierarchical

u  Easier to restructure

u  becomes complicated according to the increment in network

u  less user friendly than  hierarchical



 

Relational Database Model

u Data are arranged in two dimensional table which are easy for user to develop and understand.

u the basic data structure of relational data base is the table. data is presented as relationship or table of two dimension formed by rows and columns

u rows are also called records or tuples.

u the relationship identifies a set of entities(tuples) of the same type.






 

 u Relational Model (RM) represents the database as a collection of relations.

u A relation is nothing but a table of values.

u Every row in the table represents a collection of related data values.

u These rows in the table denote a real-world entity or relationship.



Object oriented database model

u An object-oriented database is a database that subscribes to a model with information represented by objects.

u Object is any real things that are available in the world which has own data and method so, is independent.

u Therefore objects contain both executable code and data.

u  There are other characteristics of objects such as whether methods or data can be accessed from outside the object.







Centralized database VS Distributed database

Centralized database

It is a simple type which works on client server basis. In this type clients or user are directly connected to the centrally totally located server. This server hosts the data of its client or user and helps them to store and retrieve data as requirement. This type of system is used in small scale industries which doesn't have to deal with large volume of data and user. Centralized database runs on single computer which may have single or multiple users. Since database is centralized, security is not a crucial part here. The maintenance of database is easier because of data are centrally stored. This type of system denotes allow unauthorized person to access data.

Uploading: 384120 of 384120 bytes uploaded.


v A centralized database (sometimes abbreviated CDB) is a database that is located, stored, and maintained in a single location. This location is most often a central computer or database system, for example a desktop or server CPU, or a mainframe computer. 

v In most cases, a centralized database would be used by an organization (e.g. a business company) or an institution (e.g. a university.) Users access a centralized database through a computer network which is able to give them access to the central CPU, which in turn maintains to the database itself.

v  All of the information stored on the CBS is accessible from a large number of different points, which in turn creates a significant amount of both advantages and disadvantages.



Distributed database

Ø  A distributed database is a database in which storage devices are not all attached to a common processing unit such as the CPU and which is controlled by a distributed database management system (together sometimes called a distributed database system).



Ø   It may be stored in multiple computers, located in the same physical location; or may be dispersed over a network of interconnected computers. Unlike parallel systems, in which the processors are tightly coupled and constitute a single database system, a distributed database system consists of loosely coupled sites that share no physical components.

Ø  System administrators can distribute collections of data (e.g. in a database) across multiple physical locations. A distributed database can reside on network servers on the Internet, on corporate intranets or extranets, or on other company networks. Because they store data across multiple computers, distributed databases can improve performance at end-user worksites by allowing transactions to be processed on many machines, instead of being limited to one.

This type of database system is complex in structure, instead of storing and retrieving data from centrally located server, it uses several numbers of database and server randomly located at different place. It is the collection of multiple logically interrelated database which are distributed in many geographical locations. Since server are located at different locations user can experience a good speed of bandwidth. Similarly, backup and recovery process are lot easier there, which makes data more secured. This type of system is used by large organization who has to deal with large volume of data and user all around the world. Since it is distributed in nature there may arises security issue and are costly to maintain and operate. Simply, distributed database system is the collection of several number of centralized database system in different locations.

Advantages

  • Backup and recovery of data is easier.
  •  It can handle large volume of data and user all over the word.
  • User can experience high speed bandwidth.
  •  Failure of a site doesn’t affect the whole computer system.
  • It is more reliable as compared to centralized database system.
  • The performance and service are better than centralized system.
  • It is suitable for big organization with large amount of data, database users and computers.
  •  Due to multiple servers, it can support large numbers of users and computers at a same time.

Disadvantages

  • Ø  Very expensive to operate and maintain.
  • Ø  Data security may be real issue.
  • Ø  Complex to implement and manage
  • Ø  Risk of data theft, hacking etc.

Differences between centralized and distributed database system.

Centralized

Distributed

Simple type

Complex type

Located on particular location

Many geographical location

Only one server

Multiple server in many location

Suitable for small scale industries

Suitable for large scale industries

Maintenance is easy

Maintenance is difficult

Security is high

Security is low

Low speed

High Speed

Cheap

Expensive

Failure of server affect whole network

Doesn’t affect whole network

High chance of data loss

Less chance of data loss

 E-R Diagram

Ø  It is a graphical representation of database. E-R Diagram is a graphical representation of different entities and their relationship.

Ø  E-R model is based on the perception of a real world that consists of a collection of basic objects (entities) and relationship among these objects.

Ø  The overall logical structure of a database can be express graphically by  

o   Rectangle: It represents of entity. It contains objects used in relational database.

o   Oval or Ellipse: it represents attributes (properties) of entities.

o   Diamond: It represents relationship among entities.

o   Line: It is used to link attributes to entity sets and entity set to relationship and vice-versa.

An E-R diagram for the relation writer writes book


E-R diagram shows the relationship between car purchaser and dealer 


Different key attributes:

u PRIMARY KEY

In DBMS is a column or group of columns in a table that uniquely identify every row in that table. The Primary Key can't be a duplicate meaning the same value can't appear more than once in the table. A table cannot have more than one primary key.

Rules for defining Primary key:

Ø  

Two rows can't have the same primary key value

Ø  It must for every row to have a primary key value.

Ø  The primary key field cannot be null.

Ø  The value in a primary key column can never be modified or updated if any foreign key refers to that primary key.


What is the Alternate key?

Ø  ALTERNATE KEYS is a column or group of columns in a table that uniquely identify every row in that table.

Ø  A table can have multiple choices for a primary key but only one can be set as the primary key.

Ø  All the keys which are not primary key are called an Alternate Key.

What is a Candidate Key?

Ø  CANDIDATE KEY in SQL is a set of attributes that uniquely identify tuples in a table. Candidate Key is a super key with no repeated attributes.

Ø  The Primary key should be selected from the candidate keys.

Ø  Every table must have at least a single candidate key.

Ø  A table can have multiple candidate keys but only a single primary key.



What is the foreign key?

FOREIGN KEY is a column that creates a relationship between two tables. The purpose of Foreign keys is to maintain data integrity and allow navigation between two different instances of an entity. It acts as a cross-reference between two tables as it references the primary key of another table.


In this key in dbms example, we have two table, teach and department in a school. However, there is no way to see which search work in which department.

In this table, adding the foreign key in Deptcode to the Teacher name, we can create a relationship between the two tables.

What is the Compound key?

Ø  COMPOUND KEY has two or more attributes that allow you to uniquely recognize a specific record.

It is possible that each column may not be unique by itself within the database.

Ø  However, when combined with the other column or columns the combination of composite keys become unique.

Ø  The purpose of the compound key in database is to uniquely identify each record in the table.

OrderNo

ProductID

Product Name

Quantity

B005

JAP102459

Mouse

5

B005

DKT321573

USB

10

B005

OMG446789

LCD Monitor

20

B004

DKT321573

USB

15

B002

OMG446789

Laser Printer

3

Ø  In this example, OrderNo and ProductID can't be a primary key as it does not uniquely identify a record. However, a compound key of Order ID and Product ID could be used as it uniquely identified each record

What is the Composite key?

Ø  COMPOSITE KEY is a combination of two or more columns that uniquely identify rows in a table. The combination of columns guarantees uniqueness, though individually uniqueness is not guaranteed. Hence, they are combined to uniquely identify records in a table.

Ø  The difference between compound and the composite key is that any part of the compound key can be a foreign key, but the composite key may or maybe not a part of the foreign key.


Ø  Column cust_Id alone cannot become a key as a same customer can place multiple orders, thus the same customer can have multiple entires.

Ø  Column order_Id alone cannot be a primary key as a same order can contain the order of multiple products, thus same order_Id can be present multiple times.

Ø  Column product_code cannot be a primary key as more than one customers can place order for the same product.

Ø  Column product_count alone cannot be a primary key because two orders can be placed for the same product count.

Ø  Based on this, it is safe to assume that the key should be having more than one attributes:
Key in above table: {cust_id, product_code}

Ø  This is a composite key as it is made up of more than one attributes.

Normalization (v. imp)

The process of decomposing large and complex table into the simple and normal table is called normalization. It is a process to present the database in a normal form to avoid undesirable things such as repetition of information, inability of represents information, loss of information etc. it improves performance of database by reducing data redundancy in large extent.

Concept of normalization

Data normalization is the process to present database in a normal form to avoid undesirable things such as repetition of information, inability to represent information, loss of informatin etc. it improves performance by reducing data redundancy to a large extent.

EF Codd has introduced few rules for normalizing the database in 1970 and these rules are known as normal forms. This process minimizes and controls the duplication of data in a database and also provides a rapid search for data from database.

Basic rule for normalization

  •   The attribute values in a relational table should be functionally dependent on the primary key value:
  •   No repeating groups allowed in relational tables.
  •  A relational table should not have attributes involved in a transitive dependency relation with the primary key

Normalization is used to avoid

Duplication of data

Insert anomaly: A record about an entity cannot be inserted into the table without first insertion of information about another entity, we cannot enter a customer with a sales order.

Delete anomaly: A record cannot be deleted without deleting a record about a related entity. Eg: we cannot delete a sales order without deleting all of the customers' information.

Update anomaly: cannot update information without changing information in many places. Eg: to update a customer information, it must be updated for each sales order the customer has placed.

Advantages of normalization

  • It reduces data redundancy (duplication of data)
  • It improves faster sorting and indexing.
  •  It simplifies the structure of the database table.
  • It improves the performance of a system.
  • It avoids loss of information.

Types of Normalization

Let us consider a following unnormalized table.

Name

Roll

Class

Sub_name

Sub_marks

Sub_name

Sub_marks

Ram

1

11

Computer

95

Account

78

Sita

1

12

Computer

98

Account

80

Hari

2

11

Computer

80

Account

82

Shyam

2

12

Computer

92

Account

83

First Normal Form (1NF)

A relation or table is said to be in 1NF if its attributes are atomic. That is, there should not be repeating groups of an attribute. The purpose of 1NF is to eliminate repeating groups of attribute in an entity.

 1NF (First Normal Form) Rules

  • A relation  will be in 1 NF if it contains an atomic value.

        Each table cell should contain a single value.

        Each record needs to be unique.

Name

Roll

Class

Sub_name

Sub_marks

Ram

1

11

Computer

95

Ram

1

11

Account

78

Sita

1

12

Computer

98

Sita

1

12

Account

80

Hari

2

11

Computer

80

Hari

2

11

Account

82

Shyam

2

12

Computer

92

Shyam

2

13

Account

83


B) 2NF (Second Normal form)

  • Rule 1- Be in 1NF
  •  Rule 2- Single Column Primary Key that does not functionally dependent on any subset of candidate key relation
  •  Eliminated partial key dependencies.
  • Functional dependency: the value of one attribute in a table is determined entirely by value of another
  • Partial dependency: A type of functional dependency where an attribute is functional dependent on only part of the primary key (primary key must be a composite key)

Hence, above table can be decomposed as:

  • Name à depends on roll no and class.
  • Subject à only depends on class.
  • Subject marks depend on name and subject_name


3NF (Third Normal Form)

  • Rule 1- Be in 2NF
  • Rule 2- Has no transitive functional dependencies
  • Transitive dependencies: A transitive dependencies is one in which, among three attributes, A, B, and C, if AàB, BàC, then CàA.


Lets see another Example of Normalization


Second Normal Form



DBA (Database Administrator)

DBA is the most responsible person in an organization with sound knowledge of DBMS. He/she is the overall administrator of the program. He/she has the maximum amount of privileges for accessing database and defining the role of the employee which use the system. The main goal of DBA is to keep the database server up to date, secure and provide information to the user on demand.

Qualities of good DBA

  • He/she should have sound and complete knowledge about DBMS and its operation.
  • He/she should be familiar with several DBMS packages such as MS Access, MY SQL, Oracle etc
  • He/she should have depth knowledge about the OS in which database server is running.
  • He/she should have good understanding of network architecture.
  •  He/she should hove good database designing skill.

Responsibilities

  • DBA has responsibility to install, monitor, and upgrade database server.
  • He/she should has responsibility to maintain database security by creating backup for recovery.
  •  He/she has responsibility to conduct training on the uses of database.
  •   DBA defines user privilege, relationships and manages form, reports in database.


Structure query Language (SQL)

Ø  SQL stands for Structured Query language, pronounced as “S-Q-L” or sometimes as “See-Quel”… Relational databases like MySQL Database, Oracle, MS SQL Server, Sybase, etc. use ANSI SQL.

Ø  SQL is the standard language for dealing with Relational Databases. SQL can be used to insert, search, update, and delete database records. SQL can do lots of other operations, including optimizing and maintenance of database.

SQL is made of three sub languages: DDL, DML and DCL

  1. DDL (Data Definition Language): DDL is used by the database designer and programmers to specify the content and the structure of table. It is used to define the physical characteristics of record. It includes commands that manipulate the structure of object such as tables:

Data Definition Language helps you to define the database structure or schema. Let’s learn about DDL commands with syntax.

a)     CREATE statement: CREATE statements is used to define the database structure schema:

Syntax:

CREATE TABLE TABLE_NAME (COLUMN_NAME DATATYPES[,....]);

For example:

a.      Create database university;

b.     Create table students;

c.      Create view for_students;

 For eg, to create table

Syntax:

CREATE TABLE table_name (field_name1 data_type1  field_name2 data_type2 ………);

CREATE TABLE Student (SN Number Fname text);

b)     Drop statement: Drops commands remove tables and databases from RDBMS

c)




2) DML (Data Manipulation Language): DML is related with manipulation of records such as retrieval, sorting, display and deletion of records or data. It helps user to use query and display report of the table. It provide technique for processing the database. It includes commands like insert, delete, select, and update to manipulate the information stored in the database.




3) DCL (Data Control Language): All provides additional feature for security of table and database. It includes commands for controlling data and access to the database. Some of the example of this command are grant, commit, etc


Data Dictionary

  • A data dictionary contains metadata i.e data about the database. 
  • The data dictionary is very important as it contains information such as what is in the database, who is allowed to access it, where is the database physically stored etc. 
  • It can be defined as collection of information of all data elements and contents of database such as data types, text descriptions of system.

The data dictionary contains information about the following −

  • Names of all the database tables and their schemas.
  • Details about all the tables in the database, such as their owners, their security constraints, when they were created etc.
  • Physical information about the tables such as where they are stored and how.
  • Table constraints such as primary key attributes, foreign key information etc.
  • Information about the database views that are visible.

Data Security

  • Data security refers to protective digital privacy measures that are applied to prevent unauthorized access to computer, database and websites
  • It helps to ensure privacy. 
  • It helps in protecting personal data. 
  • It ensure the data theft, data privacy and data corruption. 

Data security measures:

  • Physical Security: Securing physical access to data storage locations, servers, and devices that store sensitive information prevents unauthorized access or theft.
  • Data Backup and Recovery: Regularly backing up data and having a secure backup storage system ensures that data can be restored in case of data loss due to breaches, accidents, or natural disasters.
  • Firewalls and Network Security: Implementing firewalls, intrusion detection/prevention systems, and secure network protocols helps defend against unauthorized access and cyberattacks.
  • Application security(SQL injection web attack that add malicious code in sql query and hack sensitive information and destroy them
  • Database auditing: Check log file who access the internet, how long time stay there and what they did in database.
  • Regular Security Audits and Monitoring: Conducting regular audits and real-time monitoring of systems and networks help detect vulnerabilities or unauthorized access attempts promptly.
  • Encryption: Encrypting data both at rest (stored data) and in transit (data being transmitted) ensures that even if intercepted, the data cannot be read without the encryption key.
  • Access Control: Limiting access to data based on the principle of least privilege ensures that only authorized individuals or systems can access specific data. This involves using authentication mechanisms (like passwords, biometrics, or two-factor authentication) and authorization protocols (like role-based access control).
  • Employee Training and Awareness: Educating employees about data security best practices, phishing attacks, and the importance of safeguarding sensitive information reduces the risk of human error leading to security breaches.

Database Administrator (DBA)

A Database Administrator (DBA) is a specialist responsible for overall control of the database system at a technical level. They make strategic policy decisions regarding the organization's data and provide the necessary technical support to implement those decisions. 

The responsibilities of a DBA typically include:

  • Installing and upgrading the database server or application tools.
  • Planning for and allocating the physical requirements of the database system.
  • Monitoring and providing technical support for the database system and related applications.
  • Ensuring compliance with the database vendor's license agreement, including managing the number of installations and handling license renewals.
  • Creating a backup and recovery strategy for the database.

Data integrity 

Data integrity refers to the overall accuracy, completeness, and reliability of data. It is characterized by the consistency and absence of errors or discrepancies between different instances or consecutive updates of a record, ensuring that the information remains error-free. Data integrity also encompasses security measures to ensure compliance with regulatory requirements. It guarantees that data transferred from one stage to another is accurate and error-free.

Types of data integrity

Domain Integrity

A domain refers to the range of acceptable values within a particular column in a database. It specifies the set of values that can be stored in that column based on its data type. Common data types include integer, text, date, etc. Any entry made for a column must fall within the domain defined by its data type.

Entity Integrity

ach row representing an entity in a table must be uniquely identified. This means that every record saved in the database should be distinguishable from others. The primary key is crucial for achieving this uniqueness.

The primary key constraint specifies that the value of the primary key cannot be NULL. This constraint is essential because if the primary key value were NULL, it would not effectively distinguish one row from another, especially when all other fields in those rows are identical.


Therefore, with the help of a primary key, each record in a database table can be uniquely identified. This uniqueness ensures efficient data management and retrieval, as each record can be referenced and updated accurately.

Referential Integrity

Referential Integrity is used to maintain data consistency between two tables. Rules are established in the database structure governing how foreign keys should be utilized to ensure that changes, additions, and deletions in the database maintain data integrity. The referential integrity constraints dictate that if a foreign key in the first table refers to the primary key of the second table, then every value of the foreign key in the first table must either be null or exist in the second table.

Frequently Asked Question

1) Define DBMS. List out objectives of DBMS.

2) What is relational database model? List out the advantage of this model.

3) Differentiate between DBMS and RDBMS with example

4) Describe the centralized and distributed database model with figure.

5) What is hierarchical database model? list out its advantage and disadvantage.

6) What is normalization? Explain the normalization process with example.

7) Define DBA. what are the roles/ responsibilities and criteria of a good DBA.

8) Differentiate between database and DBMS with example.

9) write short notes on

ERD

SQL -> DDL, DML, DCL

Data Integrity

Data security

Data Dictionary






2 Comments