2.2+Databases

=Databases=

//Before you begin//: Lessons are based on community-based, continually updated online sources such as [|Wikipedia]. Relevant terms for this lesson are listed under Topics and presented in a narrative format in the Read about sections. Click on each of the linked items and visit the Wikipedia article to get the most out of the lesson, and then hit the Back button on your browser to return to the lesson.

toc

=Goals=
 * Basic**
 * Understand the different types of databases (e.g. flat file versus relational)

=Topics= [|database], [|database model], [|relational database] (DB), [|flat file DB] , [|hierarchical DB] , [|object-oriented DB], [|data model] , [|data field] , [|data record] , [|data repository] , [|data warehouse] , [|data mining] , [|OLTP] (online transaction processing), [|OLAP] (online analytical processing), [|database management system] (DBMS), [|query language], [|SQL] (Structured Query Language), [|expert system]

=Read about=

Data and databases
Pathology laboratories generate an enormous amount of data, which must be organized and stored for accessibility and security purposes. Databases perform this function, and form the backbone of laboratory information systems. A simple database may be thought of as a table consisting of columns (aka fields, attributes) and rows (aka instances, records, tuples). A cell in the table containing a value is the data element (aka data, value). This model of a database is also known as a [|flat file database].

An example of a flat file database includes a library card catalog. Books are organized alphabetically and/or by topic on paper cards inside drawers. Data attributes recorded on the cards are predictable and do not vary much from card to card (i.e. title, author, publisher, year of publication, etc.).
 * ~ Last name (author) ||~ First name (author) ||~ Title ||~ Year published ||~ Publisher ||
 * Calvino || Italo || Invisible Cities || 1972 || Harcourt Brace Jovanovich ||
 * Jacobs || Jane || The Death and Life of Great American Cities || 1961 || Modern Library (Random House) ||

The disadvantages of a flat file database is that certain elements may be redundant and the database may be difficult to update if all the elements are not known in advance. In a pathology database, a single patient may be associated with multiple tests and specimens on multiple days, times, and locations. Organizing rows by unique patient will make it very difficult to determine in advance how many columns to allocate for different tests and specimens. Organizing rows by unique specimen number (i.e. accession number) may make certain fields, such as the patient's name, patient's medical record number, birthdate, address, primary care provider, etc, redundant and waste space. In addition, a specimen may have multiple parts, for example, several chemistry tests being performed from one blood tube or a surgical specimen arriving in three separate pieces, bringing us back to the first problem.

A [|relational database] contains multiple tables that are linked or related to one another by unique identifiers (i.e. [|primary key], [|foreign key] ). For example, patient-related data (e.g. medical record number, birthdate, home address, etc.) can be stored in one table and linked to a second table that has specimen information. The specimen table may be still linked to a third table that holds data on specific studies performed on the specimen, for example, chemistry and coagulation tests. Similarly, there may be yet another table holding reference ranges, quality control information, list of providers, etc. Distributing information in multiple tables and linking them allows for greater flexibility and efficiency in storing data.

Database management system (DBMS)
A [|database management system] (DBMS) is a set of computer programs (i.e. [|middleware] ) that controls access to and modification of a database and enforces data integrity.

Without a DBMS, access and security issues may be dictated by physical means, such as location of a library card catalog, drawer locks, etc. This simple card catalog database is updated by a single entity but can be read by everyone ("read-only"). A second user interface to the databases may be through a librarian.

With a DBMS, the functions include accessing the database on the storage device, creating file elements in the database, updating or adding or deleting information, looking up information, interfacing with users, providing backup and security functions. Examples of DBMS's include [|MySQL], [|Microsoft SQL Server], [|Oracle database].

Database language
A [|query language] is used to extract information from databases and information systems. [|SQL] is an example of a well-known computer language for managing data in relational databases. One of the earlier "database languages", although technically a [|programming language], is [|MUMPS]. [|MUMPS], also known as M, is a [|programming language] that is optimized for database application development. It is the basis of the [|Veteran Health Administration's] [|Elecronic Health Record System], [|VistA], one of the earliest [|EHR] implementations.

=Activities= Design a simple database for a collection of slides or patient records that you have.

=Online Resources=
 * Sinard, JH. [|Databases and Data Storage powerpoint]. APIII Conference: October 2008.

=Questions=

=Advanced courses=
 * Nitin Patel. Course materials for 15.062 [|Data Mining], Spring 2003. MIT OpenCourseWare, Massachusetts Institute of Technology.
 * Thomas Malone. Course materials for 15.561 [|Information Technology Essentials, Spring 2005] . MIT OpenCourseWare, Massachusetts Institute of Technology.

=Expert corner= Help with Wikipedia article(s):

flat This page was last modified by user:{$revisioneditor} on {$revisiondate}.