Introduction to Databases
Understanding the fundamentals of data storage and management
What is a Database?
A Database is an organized collection of structured information or data, typically stored electronically in a computer system. It's a way to store data in a format that is easily accessible, manageable, and updatable.
Real-world analogy: Think of a library where books are organized by topic, author, and title – that's essentially what a database does with data, but with much more powerful search and retrieval capabilities.
Why Do We Need Databases?
- To store, manage, and retrieve large amounts of data efficiently
- To prevent data duplication and maintain data integrity
- To allow multiple users to access and manipulate data simultaneously
- To enforce data security and access control
- To maintain consistency across complex data relationships
What is SQL?
SQL (Structured Query Language) is the standard programming language used to communicate with and manipulate databases. It provides a standardized way to interact with relational databases regardless of the specific database system being used.
SQL was developed at IBM in the 1970s and has become the standard language for relational database management systems (RDBMS).
Common Operations (CRUD)
CRUD represents the four basic operations you can perform on data:
| Operation | SQL Command | Description |
|---|---|---|
| Create | INSERT |
Add new records to a table |
| Read | SELECT |
Retrieve data from one or more tables |
| Update | UPDATE |
Modify existing data in a table |
| Delete | DELETE |
Remove records from a table |
Comparison with Excel
While Excel is great for small datasets and quick analysis, databases offer significant advantages for larger applications:
- In Excel, a sheet is like a table in a database
- Each row in the sheet is similar to a record (or entry) in a database table
- Each column in Excel corresponds to a field (or attribute) in a table
- Excel stores all data in one file, whereas a database can contain multiple related tables
- In databases, you can define strict data types and rules, which Excel doesn't enforce
- Unlike Excel, databases allow complex querying, relationships between tables, and secure multi-user access
A database is a more powerful, structured, and scalable version of Excel for data management.
Relational vs Non-relational Databases
Understanding the difference between SQL (Relational) and NoSQL (Non-relational) databases is crucial for choosing the right tool for your project.
| Feature | Relational (SQL) | Non-relational (NoSQL) |
|---|---|---|
| Structure | Tables (rows & columns) | Documents, Key-Value, Graphs |
| Language | SQL | Varies (Mongo Query, etc.) |
| Schema | Fixed schema | Flexible schema |
| Examples | MySQL, PostgreSQL, Oracle | MongoDB, Firebase, Redis |
| Best For | Complex queries, ACID compliance | Big data, real-time applications |
Getting Started with MySQL
Setting up your database environment
What is DBMS?
A Database Management System (DBMS) is software that interacts with users, applications, and the database itself to capture and analyze data.
- Examples: MySQL, PostgreSQL, Oracle Database, SQLite, Microsoft SQL Server
- Functions: Data storage, retrieval, security, backup, and recovery
- Interfaces: Command line, GUI tools, and API connections
What is MySQL?
MySQL is an open-source relational database management system (RDBMS) that uses SQL. It's one of the most popular database systems in the world.
Key Features
- Widely used in web development (LAMP stack: Linux, Apache, MySQL, PHP/Python/Perl)
- High performance and reliability
- Powers platforms like WordPress, Facebook (early architecture), and YouTube
- Strong community support and extensive documentation
Real-World Use Cases
- E-commerce: Storing customer orders, product listings, and inventory
- Banking: Handling transactions securely with ACID compliance
- Social Networks: Managing user data, messages, and relationships
- Content Management: WordPress, Drupal, and Joomla all use MySQL
Creating a Database
Let's start by creating your first database. Open your MySQL client and run the following commands:
-- Create a new database
CREATE DATABASE database_name;
-- Example: Create a student database
CREATE DATABASE student_db;
- Database names should be unique within the server
- Avoid using spaces or special characters
- Use lowercase and underscores for better readability (e.g.,
employee_records) - Be descriptive but concise
Viewing All Databases
-- List all databases
SHOW DATABASES;
Switching to a Database
Before working with tables, you must select the database:
USE student_db;
Dropping a Database
This action is irreversible! This will permanently delete all data and tables in the database.
DROP DATABASE database_name;
-- Example
DROP DATABASE student_db;
Creating Tables
Tables are the foundation of relational databases. Each table consists of columns (fields) and rows (records).
Basic Syntax
CREATE TABLE table_name (
column1 datatype constraints,
column2 datatype constraints,
...
);
Example: Creating a Students Table
CREATE TABLE students (
id INT AUTO_INCREMENT PRIMARY KEY,
name VARCHAR(100) NOT NULL DEFAULT 'No Name',
age INT,
email VARCHAR(100) UNIQUE,
admission_date DATE
);
Explanation:
id INT AUTO_INCREMENT PRIMARY KEY– A unique identifier for each student that auto-incrementsname VARCHAR(100) NOT NULL– Name must be provided (cannot be NULL)age INT– Stores numeric values for ageemail VARCHAR(100) UNIQUE– Each email must be different across all recordsadmission_date DATE– Stores the date of admission
Commonly Used Data Types
| Data Type | Description | Example Usage |
|---|---|---|
INT |
Whole numbers | age, quantity, id |
VARCHAR(n) |
Variable-length string (max n characters) | names, emails, addresses |
TEXT |
Long text strings | descriptions, comments, articles |
DATE |
Date values (YYYY-MM-DD) | birth dates, appointment dates |
DATETIME |
Date and time values | timestamps, event logs |
BOOLEAN |
TRUE or FALSE | flags, status indicators |
DECIMAL(p,s) |
Exact precision numbers | prices, financial data |
SQL Constraints
Constraints are rules applied to table columns to enforce data integrity, consistency, and validity.
Why Use Constraints?
- Ensure data quality and reliability
- Prevent invalid, duplicate, or null data
- Maintain business rules directly in the database layer
- Reduce application-side validation complexity
| Constraint | Description | Example |
|---|---|---|
NOT NULL |
Disallow null values | name VARCHAR(100) NOT NULL |
UNIQUE |
Disallow duplicate values | email VARCHAR(100) UNIQUE |
DEFAULT |
Set default value if none given | status VARCHAR(20) DEFAULT 'active' |
CHECK |
Enforce value conditions | age INT CHECK (age >= 0) |
PRIMARY KEY |
Uniquely identify each record | id INT PRIMARY KEY |
FOREIGN KEY |
Establish relationships between tables | FOREIGN KEY (dept_id) REFERENCES departments(id) |
Data Operations
Mastering CRUD operations and data manipulation
Modifying Tables
As your application evolves, you'll need to modify existing table structures. Use the
ALTER TABLE command.
Renaming a Table
RENAME TABLE old_table_name TO new_table_name;
Adding a Column
ALTER TABLE table_name
ADD COLUMN column_name datatype constraints;
-- Example
ALTER TABLE students
ADD COLUMN gender VARCHAR(10);
Always review changes on production databases carefully. Use DESCRIBE table_name
to verify structure before and after modifications.
Inserting Data
Insert One Row
INSERT INTO student (id, name, age, grade, date_of_birth)
VALUES (1, 'Ayesha Khan', 16, '10th', '2007-05-15');
Insert Multiple Rows at Once
INSERT INTO student (id, name, age, grade) VALUES
(15, 'Ayesha Khan', 16, '10th'),
(25, 'Ravi Sharma', 17, '11th'),
(35, 'Meena Joshi', 15, '9th'),
(45, 'Arjun Verma', 18, '12th'),
(55, 'Sara Ali', 16, '10th');
Inserting multiple rows at once is more efficient than inserting them one by one, as it reduces network round trips.
Selecting Data
Basic SELECT Statement
-- Select all columns
SELECT * FROM student;
-- Select specific columns
SELECT name, grade FROM student;
Using WHERE Clause
SELECT * FROM student
WHERE grade = '10th';
SELECT * FROM student
WHERE age > 16;
Comparison Operators
| Operator | Description | Example |
|---|---|---|
= |
Equals | WHERE age = 16 |
!= or <> |
Not equal to | WHERE grade != '12th' |
> |
Greater than | WHERE age > 16 |
< |
Less than | WHERE age < 17 |
>= |
Greater than or equal | WHERE age >= 16 |
<= |
Less than or equal | WHERE age <= 18 |
BETWEEN |
Within a range (inclusive) | WHERE age BETWEEN 15 AND 17 |
IN |
Matches any in a list | WHERE grade IN ('10th', '12th') |
LIKE |
Pattern matching | WHERE name LIKE 'A%' |
WHERE grade = NULL will NOT work! Use IS NULL or
IS NOT NULL instead.
Sorting Results
-- Ascending order (default)
SELECT * FROM student
ORDER BY age ASC;
-- Descending order
SELECT * FROM student
ORDER BY name DESC;
Updating Data
-- Update a single row
UPDATE student
SET grade = '12th'
WHERE id = 2;
-- Update multiple columns
UPDATE student
SET age = 17, grade = '10th'
WHERE id = 3;
If you omit the WHERE clause, all rows in the table will be updated! Always double-check your WHERE conditions.
Deleting Data
-- Delete specific rows
DELETE FROM student
WHERE id = 2;
-- Delete based on conditions
DELETE FROM student
WHERE grade = '9th';
If you omit the WHERE clause, all rows in the table will be deleted permanently!
DELETE vs DROP vs TRUNCATE
| Command | What it does | Can rollback? |
|---|---|---|
DELETE |
Removes specific rows (or all if no WHERE) | Yes (if in transaction) |
TRUNCATE |
Removes all rows, resets auto-increment | No |
DROP TABLE |
Removes table structure and data | No |
Transactions & Advanced Topics
Ensuring data integrity and complex operations
Transactions: ACID Properties
A transaction is a sequence of one or more SQL statements executed as a single unit. Transactions ensure data integrity through ACID properties:
- Atomicity: All operations complete successfully or none do (all-or-nothing)
- Consistency: Database remains in a valid state before and after transaction
- Isolation: Concurrent transactions do not interfere with each other
- Durability: Committed changes persist even after system failure
Using Transactions
START TRANSACTION;
-- Transfer $100 from account 1 to account 2
UPDATE accounts SET balance = balance - 100 WHERE id = 1;
UPDATE accounts SET balance = balance + 100 WHERE id = 2;
-- If both succeed, commit the changes
COMMIT;
-- If something goes wrong, rollback
-- ROLLBACK;
- Always use transactions when performing multiple related operations
- Keep transactions as short as possible to avoid locking issues
- Test your transactions thoroughly before running on production data
- Use appropriate isolation levels based on your consistency requirements
MySQL Joins
Joins are used to combine data from multiple tables based on a related column.
| Join Type | Description | Use When |
|---|---|---|
INNER JOIN |
Returns only matching rows from both tables | You only want data that exists in both tables |
LEFT JOIN |
Returns all rows from left table, matched rows from right | You want all records from main table, even without matches |
RIGHT JOIN |
Returns all rows from right table, matched rows from left | Rarely used; usually just switch table order |
CROSS JOIN |
Returns all combinations of rows from both tables | You need a Cartesian product |
-- INNER JOIN example
SELECT students.name, marks.subject, marks.score
FROM students
INNER JOIN marks ON students.id = marks.student_id;
-- LEFT JOIN example
SELECT students.name, marks.subject, marks.score
FROM students
LEFT JOIN marks ON students.id = marks.student_id;
GROUP BY & Aggregation
The GROUP BY clause groups rows that have the same values in specified columns, typically used with aggregate functions.
-- Count employees per department
SELECT department, COUNT(*) AS total_employees
FROM employees
GROUP BY department;
-- HAVING filters groups (like WHERE filters rows)
SELECT department, AVG(salary) AS avg_salary
FROM employees
GROUP BY department
HAVING AVG(salary) > 50000;
MySQL Indexes
An index is a data structure that improves the speed of data retrieval operations on a database table.
When to Use Indexes:
- Columns frequently used in WHERE, JOIN, or ORDER BY clauses
- Searching by unique fields like email or username
- Large tables where scan performance matters
When NOT to Use Indexes:
- Small tables (MySQL can scan quickly anyway)
- Columns that are rarely queried
- Tables with frequent INSERT/UPDATE operations (indexes slow down writes)
-- Create an index
CREATE INDEX idx_email ON users(email);
-- Create a composite index
CREATE INDEX idx_name_city ON users(name, city);
-- View indexes on a table
SHOW INDEX FROM users;
Overusing indexes or indexing the wrong columns can hurt performance. Use them wisely based on query patterns.
Ready to Practice?
Try running these queries in your local MySQL environment or use an online SQL playground like DB Fiddle or SQLFiddle.
Open Online SQL Editor