BSc. Solutions and Notes: Unit-6 : Database Management System (Notes of TU BSc. 4th Year Computer Science)

Database Management System

Syllabus: What is Database? What is Database Management System? Advantages of using Database Approach; Database applications; Introduction to Database Models; Introduction to data warehousing, Data mining, and Data Mart; Computational Nano Science; Space Data; Computational Biology

A database is an organized collection of logically related data that contains information relevant to an enterprise. For example, a university database maintains information about students, courses, and grades.

Database Management System (DBMS)

A database management system (DBMS) is a set of programs used to store, retrieve, and manipulate data in a convenient and efficient way. The main goal of a DBMS is to hide the complexities of data management from users and provide them with an easy interface. Examples of DBMS products are Oracle, Sybase, Microsoft SQL Server, dBase, etc.

A database management system that is based on the relational model, storing data in tables (relations) and maintaining relationships between them, is called a Relational Database Management System (RDBMS).
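As a minimal sketch of these ideas, the following uses Python's built-in sqlite3 module (SQLite is a small relational DBMS) to store the university example from above in related tables and retrieve one student's grades. The table names, column names, and sample rows are assumptions made for illustration only.

```python
import sqlite3

# Hypothetical university data; all names and rows are illustrative only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE student (sid INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE course  (cid INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE grade (
        sid   INTEGER REFERENCES student(sid),
        cid   INTEGER REFERENCES course(cid),
        grade TEXT
    );
    INSERT INTO student VALUES (1, 'Asha'), (2, 'Bikash');
    INSERT INTO course  VALUES (10, 'DBMS'), (20, 'Operating System');
    INSERT INTO grade   VALUES (1, 10, 'A'), (2, 10, 'B'), (1, 20, 'B');
""")

def grades_for(student_name):
    """Return (course title, grade) pairs for one student, sorted by title."""
    rows = conn.execute(
        """
        SELECT course.title, grade.grade
        FROM grade
        JOIN student ON student.sid = grade.sid
        JOIN course  ON course.cid  = grade.cid
        WHERE student.name = ?
        ORDER BY course.title
        """,
        (student_name,),
    )
    return rows.fetchall()

print(grades_for("Asha"))  # [('DBMS', 'A'), ('Operating System', 'B')]
```

The program only states what data it wants; the DBMS decides how to locate and combine the rows, which is the "easy interface" described above.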

Advantages of using Database Approach

Drawbacks of the flat-file systems are solved by DBMS. Some of the advantages of DBMS are:

Data Redundancy - Data redundancy means duplication of the same data or data files in different places. Flat-file systems suffer from high data redundancy, which leads to higher storage and access costs. A DBMS can greatly reduce data redundancy.

Data Inconsistency - Data inconsistency is a side effect of data redundancy. Data is said to be inconsistent if various copies of the same data no longer agree. Data inconsistency occurs if changed data is reflected in data files in one place but not elsewhere in the system. For example, if the library data file contains the cell number of a student as 98******55 but the examination data file stores 98******53 as the cell number of the same student, then the data is inconsistent.
Data Isolation - Isolation is the act of separating something. Because data are scattered in various files and the files may be in different formats, writing new application programs to retrieve the appropriate data is difficult in a flat-file system. For example, one data file may contain data separated by commas and another file may contain data separated by white space.
Database management systems provide shared access to centrally stored data; therefore, it is easy for application programs to retrieve the required data from a centralized database.
Difficulty in Accessing Data - File processing systems do not allow the required data to be retrieved in an efficient and convenient way. For example, assume we already have a program to generate the list of books on the basis of the subject. Now, if we need to generate the list of books on the basis of the author's name, either we need to extract the data from the book data files manually or we must ask a programmer to write a program to retrieve the required data from the book data file. Neither alternative is satisfactory, but in a database system it is very easy to write general programs to generate different lists on the basis of different criteria.
Integrity Problems - Integrity means the correctness of data before and after the execution of a transaction. Integrity constraints are conditions imposed on the data to maintain its correctness.
Database management systems allow us to specify integrity constraints on data. Therefore it is easy to maintain the correctness of data.
Atomicity Problems - Execution of a transaction must be atomic. This means a transaction must execute in its entirety or not at all. If the execution of a transaction is not atomic, it leaves the database in an incorrect state.
Database management systems guarantee the atomic execution of transactions.
Concurrent Access Anomalies - Concurrent updates to the same data by different transactions at the same time may result in inconsistent data. For example, consider bank account A containing Rs. 50,000. If two customers withdraw funds, say Rs. 15,000 and Rs. 20,000 respectively, from account A at about the same time, the concurrent execution may leave the account in an inconsistent state. The correct final balance is Rs. 15,000, but if both withdrawals read the balance of Rs. 50,000 before either writes, the account may end up with Rs. 35,000 or Rs. 30,000.
Database systems support concurrent execution of transactions on the same data without resulting in inconsistent data.
Security Problems - In a database system, we may create different user accounts and provide different authorization to different users. Thus we are able to hide certain information from some users.
For example, in a banking system, payroll personnel need to see only that part of the database that has information about various bank employees.
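The integrity and atomicity points above can be sketched with Python's built-in sqlite3 module. The account table, the CHECK constraint, and the transfer function below are assumptions made for illustration; they are not part of any particular banking system.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account ("
    " name TEXT PRIMARY KEY,"
    " balance INTEGER NOT NULL CHECK (balance >= 0))"  # integrity constraint
)
conn.execute("INSERT INTO account VALUES ('A', 50000), ('B', 1000)")
conn.commit()

def transfer(src, dst, amount):
    """Move amount from src to dst as one transaction; return True on success."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        # The CHECK constraint rejected a negative balance; the credit to
        # dst made earlier in the same transaction has been rolled back.
        return False

def balances():
    """Current balance of every account as a dict."""
    return dict(conn.execute("SELECT name, balance FROM account"))

print(transfer("A", "B", 15000), balances())  # succeeds
print(transfer("A", "B", 60000), balances())  # fails, balances unchanged
```

In the second call the credit to B is executed first and then undone, so the database never shows money that was added to one account without being removed from the other.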

Database Applications

DBMS is widely used in various areas because of its numerous advantages. Some of the most common database applications are listed here.

Airlines and Railways - They use an online database for reservations and for displaying schedule information.
Banking - Banks use databases for customer inquiry, accounts, loans, and other transactions.
Education - Schools and colleges use databases for course registration, results, and other information.
Telecommunications - Telecommunication providers use databases to store information about the telecommunication network, telephone numbers, records of calls for generating monthly bills, etc.
Credit Card Transaction - Databases are used for keeping track of purchases on credit cards in order to generate monthly statements.
E-commerce - Databases integrate heterogeneous information sources for business activities such as online shopping, booking of holiday packages, etc.
Healthcare Information Systems and Electronic Patient Record - Databases are used for maintaining patient health care details.
Digital Libraries and Digital Publishing - Databases are used for the management and delivery of large bodies of textual and multimedia data.
Finance - Databases are used to store product, customer, and transaction details.
Human Resource - Organizations use a database for storing information about their employees, salaries, benefits, taxes, and generating salary cheques.

Database Models

A database model is an abstract model that describes how the data is represented and used.

It consists of a set of data structures and conceptual tools that are used to describe the structure (data, types, relationships and constraints) of a database. Traditionally, there are different database models that are used to design and develop the database of the organization.

Hierarchical model
Network model
Entity-Relationship model
Relational model
Object-oriented model
Object-relational model

Hierarchical Model (Tree-like): The oldest type of data model, developed by IBM in 1968. It is a record-based representational (or implementation) data model.
In this model, different records are interrelated through a hierarchical or tree-like structure.
For example, a parent record can have several children, but a child can have only one parent. It therefore represents only one-to-one and one-to-many relationships.
Network Data Model: It is an extension of the hierarchical database structure. It is also a record-based representational (or implementation) data model.
It is more flexible than the hierarchical data model.
It describes data and relations between data by using a graph rather than a tree-like structure.
In a hierarchical data model, a child cannot have more than one parent, but this is allowed in the network data model.
Entity-Relationship Model: It is based on a perception of the real world that consists of a collection of basic objects called entities and relationships among these objects.
In this model, a database can be modeled as a collection of entities and relationships among entities.
It is one of the conceptual data models and describes the information used by an organization in a way that is independent of any implementation level issues and details.
The database can be expressed graphically by the E.R. diagram.
Relational Model: It is also a representational (or implementation) data model. In this data model, unlike the hierarchical and network models, there are no physical links.
All the data is maintained in the form of tables (generally known as relations) consisting of rows and columns.
Thus, the relational model is more programmer-friendly and has become the dominant and most popular model in both industry and academia.
Object-Oriented Data Model: Based on the object-oriented programming paradigm, a core object-oriented data model consists of the following basic object-oriented concepts:
a) object and object identifier
b) attributes and methods
c) class
d) class hierarchy and inheritance
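The difference between the hierarchical and network models described above can be sketched with plain Python dictionaries. The record names (Department, Course, Student) are hypothetical and chosen only to show the one-parent rule; real hierarchical and network DBMSs use their own storage structures.

```python
# Hierarchical model: each child record stores exactly one parent (a tree).
hierarchical_parent = {
    "Department": None,          # root record, no parent
    "Course": "Department",
    "Student": "Department",
}

# Network model: a child record may be linked to several parents (a graph).
network_parents = {
    "Department": set(),
    "Course": {"Department"},
    "Student": {"Department", "Course"},  # two parents, not allowed in a tree
}

def children_of(parent, parent_map):
    """List records whose single parent is `parent` (hierarchical lookup)."""
    return sorted(c for c, p in parent_map.items() if p == parent)

def is_hierarchical(parents_map):
    """True if no record has more than one parent (the one-parent rule)."""
    return all(len(parents) <= 1 for parents in parents_map.values())

print(children_of("Department", hierarchical_parent))  # ['Course', 'Student']
print(is_hierarchical(network_parents))                 # False
```

The one-to-many relationship shows up as one parent with several children, while the second structure needs the network model because Student has two parents.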

Data warehousing

It is the process of constructing and using a data warehouse. A data warehouse is a repository of information constructed by integrating data from multiple heterogeneous sources. It supports analytical reporting, structured queries, and decision making.

It involves data integration, data cleaning, and data consolidation.

Data Integration: Process of standardizing the data definition and data structures of multiple data sources.
Data Cleaning: Process of detecting and correcting incorrect, irrelevant, out-of-date, or corrupt data.
Data Consolidation: Refers to the collection and integration of data from multiple sources into a single destination.
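The three steps can be sketched in a few lines of Python. The two source systems, their field names, and the cleaning rules below are assumptions made for illustration; a real warehouse would use dedicated tooling and far richer rules.

```python
# Two hypothetical source systems that describe customers differently.
sales_rows = [
    {"cust_id": 1, "cust_name": " Asha ", "city": "Kathmandu"},
    {"cust_id": 2, "cust_name": "Bikash", "city": ""},
]
support_rows = [
    {"id": 2, "name": "Bikash", "town": "Pokhara"},
    {"id": 3, "name": None, "town": "Lalitpur"},   # unusable: no name
]

def integrate(row, mapping):
    """Data integration: rename source fields to one standard definition."""
    return {standard: row[source] for standard, source in mapping.items()}

def clean(row):
    """Data cleaning: trim text, turn blanks into None, reject nameless rows."""
    cleaned = {k: (v.strip() or None) if isinstance(v, str) else v
               for k, v in row.items()}
    return cleaned if cleaned["name"] else None

def consolidate(rows):
    """Data consolidation: merge rows into one record per customer id."""
    merged = {}
    for row in rows:
        target = merged.setdefault(row["id"], {})
        for key, value in row.items():
            if value is not None:
                target[key] = value   # keep the latest non-missing value
    return merged

sales_map = {"id": "cust_id", "name": "cust_name", "city": "city"}
support_map = {"id": "id", "name": "name", "city": "town"}
standardized = (
    [integrate(r, sales_map) for r in sales_rows]
    + [integrate(r, support_map) for r in support_rows]
)
cleaned_rows = [c for c in map(clean, standardized) if c is not None]
warehouse = consolidate(cleaned_rows)
print(warehouse)
```

Customer 2 appears in both sources; after consolidation there is a single record whose missing city has been filled in from the second source, and the nameless record is dropped during cleaning.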

(Repository: A place that stores a large amount of data)

Data Mining

Data mining is the extraction of information from huge sets of data. It is the process of mining knowledge from data.

Knowledge extracted from data can be used for applications such as market analysis, fraud detection, customer retention, production control, science exploration, etc.

Applications of data mining

Financial Data Analysis: Banks and financial institutions use data mining for loan payment prediction, customer credit policy analysis, etc.
Retail Industry: Data mining in the retail industry helps in identifying customer buying patterns and trends that lead to improved quality of customer service.
Telecom Industry: In the telecommunication industry, data mining helps in identifying telecommunication patterns.
Biological Data Analysis: Data mining is a very important part of bioinformatics.
Other Scientific Applications
Intrusion Detection
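As a small illustration of finding customer buying patterns, the sketch below counts which pairs of items are bought together most often. The baskets are made-up sample data, and pair counting is only one simple technique chosen here for illustration; real data mining tools use more scalable algorithms.

```python
from collections import Counter
from itertools import combinations

# Hypothetical shopping baskets; a real data set would be far larger.
baskets = [
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"milk", "tea"},
    {"bread", "butter", "tea"},
]

def frequent_pairs(transactions, min_count):
    """Return item pairs bought together in at least min_count baskets."""
    counts = Counter()
    for basket in transactions:
        # sorted() gives each pair one canonical order, e.g. (bread, butter)
        counts.update(combinations(sorted(basket), 2))
    return {pair: n for pair, n in counts.items() if n >= min_count}

print(frequent_pairs(baskets, min_count=2))  # {('bread', 'butter'): 3}
```

A retailer could use such a result to place bread and butter near each other or to offer them together.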

Data Mart

A data mart is a subject-oriented archive that stores data and uses the retrieved information to support the requirements of a particular business function or department. Data marts exist within a single organizational data warehouse repository.

A data mart is basically a condensed and more focused version of a data warehouse that reflects the regulations and process specifications of each business unit within an organization. Each data mart is dedicated to a specific business function or region. This subset of data may span many or all of an enterprise’s functional subject areas.
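The idea of a data mart as a focused subset of the warehouse can be sketched with a database view in Python's built-in sqlite3 module. The warehouse_sales table, the Retail department, and the sample figures are hypothetical, and using a view is a simplification: data marts are often kept as separate physical stores.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical warehouse table covering every department.
    CREATE TABLE warehouse_sales (
        department TEXT, region TEXT, product TEXT, amount INTEGER
    );
    INSERT INTO warehouse_sales VALUES
        ('Retail',    'East', 'Laptop', 900),
        ('Retail',    'West', 'Phone',  400),
        ('Wholesale', 'East', 'Laptop', 5000);

    -- Data mart: only the Retail department's slice of the warehouse.
    CREATE VIEW retail_mart AS
        SELECT region, product, amount
        FROM warehouse_sales
        WHERE department = 'Retail';
""")

retail_rows = conn.execute("SELECT COUNT(*) FROM retail_mart").fetchone()[0]
retail_total = conn.execute("SELECT SUM(amount) FROM retail_mart").fetchone()[0]
print(retail_rows, retail_total)  # 2 1300
```

Users of the Retail department query only retail_mart and never see the Wholesale rows.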

Computational Nanoscience

It is the field concerned with modeling and large-scale computer simulation in order to understand new nanoscale phenomena and regimes. In nanotechnology, the numbers are especially important because things are to be built and require very high accuracy. Nanosystems present new kinds of multi-scale modeling challenges as well as algorithmic time and storage challenges.

Hardware and software tools to solve some of the nanoscience modeling problems already exist. Software tools and application-oriented computer programming languages that already exist must be assembled and investigated for their suitability in solving nanosystem problems.

Molecular Workbench (MW) is one tool used in computational nanoscience.

Space Data

Data collected from space with the help of satellites is called space data. It may include data about weather conditions, data about other planets, and data about different areas such as forests and oceans.

Space Data Routers (SDR) will allow space agencies, academic institutes, and research centers to share space data generated by single or multiple missions.

Computational Biology

It is the science of using biological data to develop algorithms or models in order to understand biological systems and the relationships among them. It involves the development and application of data-analytical and theoretical methods, mathematical modeling, and computational simulation techniques to the study of biological, behavioral, and social systems. It spans a wide range of subfields such as computational pharmacology, computational genetics, computational bio-modeling, computational neuroscience, etc. The main goal of computational biology is to discover new biology and knowledge about living systems.
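As a very small illustration of an algorithm applied to biological data, the sketch below computes the GC content of a DNA sequence, that is, the fraction of its bases that are G or C. The function name and the sample sequences are chosen here for illustration and do not come from any particular library.

```python
def gc_content(sequence):
    """Fraction of bases in a DNA sequence that are G or C."""
    seq = sequence.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    # True counts as 1, so the sum is the number of G and C bases.
    return sum(base in "GC" for base in seq) / len(seq)

print(gc_content("ATGC"))  # 0.5
```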

Some Important Terms and Terminology

Database Schema and Instance: The overall structure of the database is called the database schema. For example, employee information in a company database may be stored in a relation with the following schema.

Employee (Eid: string, Ename: string, Address: string, Salary: integer, Age: integer)

Once created, the database schema is not expected to change frequently. The database administrator (DBA) is responsible for creating, deleting, and modifying the database schema.

The collection of information stored in the database at a particular moment is called an instance of a database. It is the actual content of the database at a particular point in time.

A database instance changes frequently, with every insertion, deletion, and update operation performed on the data stored in the database.
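The distinction can be sketched with Python's built-in sqlite3 module, using the Employee schema given above. SQLite's TEXT type stands in for "string", and the sample employee row is made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Schema: the structure, defined once. SQLite's TEXT stands in for "string".
conn.execute(
    "CREATE TABLE Employee ("
    " Eid TEXT PRIMARY KEY, Ename TEXT, Address TEXT,"
    " Salary INTEGER, Age INTEGER)"
)

def schema():
    """Column names and declared types, read from SQLite's table metadata."""
    return [(row[1], row[2])
            for row in conn.execute("PRAGMA table_info(Employee)")]

def instance():
    """The rows stored right now, i.e. the current database instance."""
    return conn.execute("SELECT * FROM Employee ORDER BY Eid").fetchall()

schema_before = schema()
conn.execute(
    "INSERT INTO Employee VALUES ('E1', 'Asha', 'Kathmandu', 40000, 30)"
)
instance_1 = instance()
conn.execute("UPDATE Employee SET Salary = 45000 WHERE Eid = 'E1'")
instance_2 = instance()

print(schema_before == schema())   # True: the schema did not change
print(instance_1 != instance_2)    # True: the instance did
```

The insert and the update each produce a new instance, while the schema read before and after them is identical.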
