Hotel reservation system
Questions
What
- What is the scale of the system? (5,000 hotels and 1 million rooms)
- What are the payment terms? (Customers pay in full when they make reservations)
- What are the supported reservation options? (Hotel website and app)
- What features are within the scope?
- Show the hotel-related page
- Show the hotel room-related detail page
- Reserve a room
- Admin panel to add/remove/update hotel or room info
- Support overbooking
- What are the pricing dynamics? (Prices change dynamically based on expected hotel occupancy)
How
- How do customers book rooms? (Through the hotel’s website or app)
- How is overbooking handled? (Allow 10% overbooking)
Who
- Who can cancel reservations? (Customers)
Overview
Components
-
User Interface
- Mobile App
- Web Application
-
Admin Interface
- Internal Software: Used by authorized hotel staff.
-
CDN (Content Delivery Network)
- Caches static assets (JavaScript, images, videos, HTML) for faster load times.
-
Public API Gateway
- Manages incoming requests, supports rate limiting and authentication.
- Routes requests to specific services based on the endpoint.
-
Internal APIs
- Accessible only by authorized hotel staff, typically protected by a VPN.
- Used for administrative operations.
-
Services
- Hotel Service: Provides detailed information about hotels and rooms.
- Rate Service: Provides room rates based on future dates and expected occupancy.
- Reservation Service: Manages reservation requests and tracks room inventory.
- Payment Service: Handles payment transactions and updates reservation statuses.
- Hotel Management Service: Admin-exclusive service for managing reservations and room information.
Interaction Flow
-
User Booking a Room:
- User accesses the Mobile App or Web Application.
- Static assets are loaded quickly via the CDN.
- User searches for hotels and rooms:
- Request goes through the Public API Gateway to the Hotel Service.
- Hotel Service retrieves and caches hotel and room information.
- User checks room rates for specific dates:
- Request is routed through the Public API Gateway to the Rate Service.
- Rate Service returns dynamic pricing based on expected occupancy.
- User books a room:
- Request is directed to the Reservation Service.
- Reservation Service checks availability and reserves the room.
- User proceeds to payment:
- Payment details are sent to the Payment Service.
- Payment Service processes the payment and updates the reservation status.
-
Admin Operations:
- Admin uses Internal Software (secured by VPN) to access Internal APIs.
- Admin performs operations such as viewing upcoming reservations, reserving rooms, canceling reservations, and updating room information:
- Requests are routed through the Public API Gateway to the appropriate service.
- Hotel Management Service handles operations specific to hotel staff.
- Reservation Service manages reservation modifications.
- Payment Service handles refunds and payment-related operations.
Data model
(skip)
Concurrency problem
To avoid double booking, it’s crucial to handle two scenarios: the same user clicking the “book” button multiple times and multiple users trying to book the same room type simultaneously.
Approach 1: Client-Side Implementation
Client-Side Measures:
- Disable or gray out the “submit” button once a request is sent to prevent multiple clicks.
Limitations:
- Users can disable JavaScript or circumvent client-side checks, making this approach unreliable.
Approach 2: Idempotent APIs
Idempotency Key:
- Implement an idempotency key (e.g.,
reservation_id) in the reservation API request. - Ensures that multiple identical requests result in only one reservation.
Process:
-
Generate Reservation Order:
- After the user inputs reservation details and clicks “continue,” a reservation order is generated by the reservation service.
- A unique
reservation_idis created and returned in the API response.
-
Review Reservation:
- The user reviews the reservation details, including the
reservation_id.
- The user reviews the reservation details, including the
-
Submit Reservation:
- The
reservation_idis included in the request. - If the user clicks “Complete my booking” multiple times, the unique constraint on
reservation_idin the database ensures no duplicate reservations.
- The
Handling Concurrent Bookings
Problem:
- Multiple users attempt to book the same room type simultaneously when only one room is left.
Solution:
-
Pessimistic Locking:
- Lock the inventory row during the reservation process to prevent concurrent access.
-
Optimistic Locking:
- Use versioning or timestamps to detect conflicts during the reservation process.
-
Serializable Isolation Level:
- Set the database isolation level to
serializableto ensure transactions are executed sequentially.
- Set the database isolation level to
Option 1: Pessimistic Locking
Description: Pessimistic locking, or pessimistic concurrency control, places a lock on a record as soon as a user starts to update it. Other users attempting to update the same record must wait until the lock is released (i.e., the changes are committed).
Implementation:
- Use the
SELECT ... FOR UPDATEstatement in MySQL to lock the rows returned by a query.
Example:
BEGIN;
-- Lock the row for update
SELECT total_reserved, total_inventory
FROM room_type_inventory
WHERE hotel_id = ${hotelId} AND room_type_id = ${roomTypeId} AND date BETWEEN ${startDate} AND ${endDate}
FOR UPDATE;
-- Check and update inventory
IF (total_reserved + ${numberOfRoomsToReserve}) <= total_inventory THEN
UPDATE room_type_inventory
SET total_reserved = total_reserved + ${numberOfRoomsToReserve}
WHERE hotel_id = ${hotelId} AND room_type_id = ${roomTypeId} AND date BETWEEN ${startDate} AND ${endDate};
END IF;
COMMIT;Pros:
- Prevents updates on data that are being modified.
- Simple to implement and avoids conflicts by serializing updates.
Cons:
- Deadlocks can occur, complicating application code.
- Not scalable; long-lived transactions impact performance as other transactions must wait.
Option 2: Optimistic Locking
Description: Optimistic locking allows multiple concurrent users to attempt to update the same resource without immediate locking. A version number or timestamp is used to detect conflicts, with the version number being the preferred method due to clock inaccuracies.
Implementation:
- Add a
versioncolumn to the database table. - Increment the version number with each update.
- Use a validation check to ensure the version number is as expected before committing changes.
Example:
-- Add version column to table
ALTER TABLE room_type_inventory ADD COLUMN version INT DEFAULT 0;
-- Transaction with optimistic locking
BEGIN;
-- Fetch current state and version
SELECT total_reserved, total_inventory, version
FROM room_type_inventory
WHERE hotel_id = ${hotelId} AND room_type_id = ${roomTypeId} AND date BETWEEN ${startDate} AND ${endDate};
-- Check and update with version check
IF (total_reserved + ${numberOfRoomsToReserve}) <= total_inventory THEN
UPDATE room_type_inventory
SET total_reserved = total_reserved + ${numberOfRoomsToReserve}, version = version + 1
WHERE hotel_id = ${hotelId} AND room_type_id = ${roomTypeId} AND date BETWEEN ${startDate} AND ${endDate}
AND version = ${currentVersion};
END IF;
COMMIT;Pros:
- Prevents editing stale data.
- No actual locking at the database level; handled by the application logic.
- Efficient when data contention is low.
Cons:
- Performance degrades with high data contention due to repeated retries.
- Users may experience a high volume of failures and retries, leading to a poor user experience.
Option 3: Database Constraints
Description: Database constraints can be used to enforce rules at the database level, such as ensuring the total number of reserved rooms does not exceed the total inventory.
Implementation:
- Add a constraint to the
room_type_inventorytable to ensure the total number of reserved rooms does not exceed the total inventory.
Example:
ALTER TABLE room_type_inventory
ADD CONSTRAINT check_room_count CHECK (total_inventory - total_reserved >= 0);Pros:
- Easy to implement.
- Effective when data contention is minimal.
Cons:
- Similar to optimistic locking, high data contention leads to frequent transaction failures.
- Users may see available rooms but encounter errors when booking.
- Not all databases support constraints, and migration between databases can be problematic.
Scalability
1. Database Sharding
Concept:
- Sharding involves splitting the data across multiple databases, with each database holding a subset of the data.
Sharding Strategy:
- Shard data by
hotel_idsince most queries filter by this field. SQL Example:
-- Pseudo-code for determining shard
shard_id = hash(hotel_id) % number_of_servers
connect to shard[shard_id]
execute reservation query2. Caching
Concept:
- Use a caching layer to handle most read operations, significantly reducing the load on the database.
Implementation:
- Utilize Redis for its TTL (time-to-live) and LRU (Least Recently Used) cache eviction policies.
- Cache key:
hotelID_roomTypeID_{date} - Cache value: Number of available rooms for the given
hotel_id,room_type_id, and date.
Architecture:
- Reservation Service:
- Manages API calls for querying availability, reserving rooms, and updating inventory.
- Inventory Cache (Redis):
- Handles read operations for room availability.
- Pre-populated with inventory data.
- Inventory DB:
- Acts as the source of truth for inventory data.
Flow:
- Query room inventory in the cache.
- If available, update the inventory in the database.
- Asynchronously update the cache to reflect changes.
Example:
-- Cache structure in Redis
SET hotelID_roomTypeID_20210701 10
-- Query cache
GET hotelID_roomTypeID_20210701
-- Update database
UPDATE room_type_inventory
SET total_reserved = total_reserved + 1
WHERE hotel_id = ${hotelId} AND room_type_id = ${roomTypeId} AND date = ${date};
-- Asynchronously update cache
SET hotelID_roomTypeID_20210701 9Handling Inconsistencies
Concept:
- Data consistency between the cache and database can be challenging but can be managed by ensuring the database is always the final authority.
Consistency Approach:
- Query Cache:
- If the cache indicates availability, proceed to book.
- Update Database:
- Perform the final availability check and update.
- If the database indicates no availability, return an error.
- Update Cache:
- Reflect the database changes asynchronously to the cache.
Example Scenario:
- Cache indicates 1 room available, but the database has 0 rooms.
- User attempts to book:
- Cache query shows availability.
- Database update fails due to no rooms left.
- User receives an error indicating the room is no longer available.
Pros and Cons of Caching
Pros:
- Reduced Database Load: Most read queries are handled by the cache, significantly reducing the load on the database.
- High Performance: Read operations are faster as they are served from memory.
Cons:
- Data Consistency: Maintaining consistency between the database and the cache is complex and requires careful consideration of user experience impacts.
Monolithic vs Microservice Architecture
Monolithic Architecture
In a monolithic architecture, a shared relational database ensures data consistency by wrapping operations in a single transaction to maintain ACID properties (Atomicity, Consistency, Isolation, Durability).
Microservice Architecture
In a microservice architecture, each service has its own database, and a single logical operation may span multiple services, making it difficult to maintain ACID properties across services.
Challenges:
- Data consistency issues arise when operations on different services must be coordinated.
- If an update in the reservation service fails, the corresponding update in the inventory service must be rolled back to maintain consistency.
Industry-Proven Techniques for Data Consistency
-
Two-Phase Commit (2PC):
- Description: 2PC is a distributed database protocol that ensures all nodes in a transaction either commit or rollback.
- Process:
- Prepare Phase: The coordinator asks all nodes to prepare to commit.
- Commit Phase: If all nodes agree, the coordinator instructs them to commit. If any node disagrees, the coordinator instructs all nodes to rollback.
- Pros: Guarantees atomicity.
- Cons: Blocking protocol; a single node failure can block the entire process, leading to poor performance.
-
Saga Pattern:
- Description: A saga is a sequence of local transactions, each updating a service and publishing a message to trigger the next transaction. If a transaction fails, compensating transactions are executed to undo previous changes.
- Process:
- Service A performs a local transaction and publishes a message.
- Service B receives the message, performs its local transaction, and publishes another message.
- If any step fails, compensating transactions are executed to revert the changes.
- Pros: Non-blocking, relies on eventual consistency.
- Cons: Increased complexity due to the need for compensating transactions and message handling.
Hybrid Approach
Given the complexity of maintaining data consistency in a pure microservice architecture, a pragmatic hybrid approach can be considered. In this approach, the Reservation Service handles both reservation and inventory operations, storing all related data in the same relational database.
Pros:
- Simpler implementation.
- Leverages the ACID properties of relational databases to handle concurrency and ensure consistency.
Cons:
- Deviation from pure microservice principles.
- Potentially less scalable if not designed carefully.