MongoDB Interview Questions
MongoDB Interview Questions
Comprehensive guide to MongoDB interview topics for backend developer positions.
Core Concepts
Document Model & Architecture
Q: What is MongoDB and how does it differ from relational databases?
- MongoDB is a NoSQL document database that stores data in flexible, JSON-like documents called BSON
- No fixed schema - documents can have different fields and structures
- Horizontal scaling through sharding
- Embedding vs referencing for relationships
- Uses collections (like tables) and documents (like rows)
Q: Explain the document-based model in MongoDB and its advantages/disadvantages over relational databases.
Advantages:
- Flexible schema - easy to evolve data structure
- Fast development - no complex joins or migrations
- Better performance for read-heavy workloads with proper indexing
- Horizontal scalability through sharding
- Native JSON support - maps well to application objects
- Good for unstructured or semi-structured data
Disadvantages:
- No ACID transactions across documents (before 4.0)
- Limited complex querying compared to SQL
- No JOIN operations - must use application-level joins
- Data duplication can occur
- Less mature tooling ecosystem
- Memory usage can be high for large documents
Q: What is BSON and how does it differ from JSON?
- BSON (Binary JSON) is the binary-encoded serialization format MongoDB uses
- More efficient than JSON - faster parsing, smaller size
- Supports additional data types: Date, ObjectId, Binary, Decimal128, etc.
- Type information is preserved
Q: When would you choose MongoDB over a relational database?
- Rapid prototyping and schema evolution
- Content management systems
- Real-time analytics and IoT data
- Mobile applications with flexible schemas
- Social networking features (users, posts, comments)
- Catalog data with varying attributes
- Logging and time-series data
- When horizontal scaling is a priority
Data Modeling
Q: How do you decide between embedding and referencing documents?
Embedding (one-to-one, one-to-few):
- Use when related data is frequently accessed together
- Better read performance
- Atomic updates possible
- Documents stay under 16MB limit
- Examples: User address, comments on a post (few), tags
Referencing (one-to-many, many-to-many):
- Use when related data is large or accessed independently
- Avoids duplication
- Allows for growth without document size limits
- More flexible for relationships that change frequently
- Examples: User posts (many), products in orders
Q: Explain the difference between normalized and denormalized data in MongoDB.
- Normalized: Data split across collections using references (like relational DB)
- Denormalized: Related data embedded in same document
- MongoDB favors denormalization for read performance
- Trade-off: Faster reads vs. more complex updates and potential duplication
Q: How would you model a one-to-many relationship in MongoDB?
Option 1: Embedding (small, bounded many)
{
_id: "post123",
title: "My Post",
comments: [
{user: "user1", text: "Great!"},
{user: "user2", text: "Thanks"}
]
}
Option 2: Referencing (large or unbounded many)
// Posts collection
{_id: "post123", title: "My Post"}
// Comments collection
{_id: "comment1", postId: "post123", user: "user1", text: "Great!"}
{_id: "comment2", postId: "post123", user: "user2", text: "Thanks"}
Queries & Operations
Q: Explain the difference between find(), findOne(), and findOneAndUpdate().
find(): Returns a cursor with all matching documentsfindOne(): Returns a single document (first match) or nullfindOneAndUpdate(): Atomically finds and updates a document, returns the original or updated document
Q: How do you perform complex queries in MongoDB?
- Comparison operators:
$gt,$gte,$lt,$lte,$ne,$in,$nin - Logical operators:
$and,$or,$nor,$not - Element operators:
$exists,$type - Array operators:
$all,$elemMatch,$size - Regex for pattern matching
- Text search with
$text - Geospatial queries
Q: What is an aggregation pipeline and when would you use it?
- Pipeline of stages that process documents sequentially
- Stages:
$match,$group,$project,$sort,$limit,$lookup, etc. - Use for complex data transformations, analytics, reporting
- More powerful than simple queries for data processing
Example:
db.orders.aggregate([
{ $match: { status: "completed" } },
{ $group: { _id: "$customerId", total: { $sum: "$amount" } } },
{ $sort: { total: -1 } },
{ $limit: 10 },
]);
Q: Explain $lookup in aggregation pipelines.
- Performs left outer join between two collections
- Similar to SQL LEFT JOIN
- Can join with other collection to get related data
- Available from MongoDB 3.2+
Q: How do you handle transactions in MongoDB?
- Multi-document transactions available from MongoDB 4.0+
- ACID guarantees within a replica set or sharded cluster
- Use
startSession()andwithTransaction() - Important for financial operations, inventory management
- Performance impact - should be used judiciously
Indexing
Q: What are indexes in MongoDB and why are they important?
- Indexes improve query performance by creating sorted data structures
- Without indexes, MongoDB performs collection scans (slow)
- Default index on
_idfield - Trade-off: Faster queries vs. storage space and slower writes
Q: Explain different types of indexes in MongoDB.
-
Single Field Index: On one field
db.users.createIndex({ email: 1 }); -
Compound Index: On multiple fields
db.users.createIndex({ name: 1, age: -1 }); -
Multikey Index: For array fields
db.posts.createIndex({ tags: 1 }); -
Text Index: For full-text search
db.articles.createIndex({ title: "text", content: "text" }); -
Geospatial Index: For location data
db.places.createIndex({ location: "2dsphere" }); -
TTL Index: Automatically delete documents after expiration
db.sessions.createIndex({ createdAt: 1 }, { expireAfterSeconds: 3600 }); -
Partial Index: Only indexes documents matching filter
db.users.createIndex( { email: 1 }, { partialFilterExpression: { active: true } } );
Q: How do compound indexes work and what is index prefix?
- Compound index on multiple fields in specific order
- Can be used for queries on: first field, first two fields, etc. (prefix)
- Order matters:
{a: 1, b: 1, c: 1}can be used for queries ona,a+b,a+b+c, but notborcalone - Follow ESR rule: Equality, Sort, Range fields
Q: What is the ESR (Equality, Sort, Range) rule for index design?
- Order fields in compound index: Equality → Sort → Range
- Improves index effectiveness
- Example: Query with
status: "active"(equality),sort by createdAt(sort),age > 25(range) - Index:
{status: 1, createdAt: 1, age: 1}
Q: How do you identify slow queries in MongoDB?
- Enable profiling:
db.setProfilingLevel(2) - Check profiler collection:
db.system.profile.find() - Use
db.currentOp()for active operations - MongoDB Atlas Performance Advisor
- Explain plans:
db.collection.find().explain("executionStats")
Performance Optimization
Q: What are common MongoDB performance optimization techniques?
- Indexing: Create proper indexes for frequently queried fields
- Query Optimization: Use projections, limit results, avoid
$where - Read Preferences: Read from secondaries for analytics workloads
- Connection Pooling: Reuse connections, limit pool size
- Sharding: Distribute data across multiple servers
- Denormalization: Embed frequently accessed data
- Document Size: Keep documents under 16MB, optimize schema
- Write Concerns: Balance durability vs. performance
- Covered Queries: Use indexes that include all requested fields
- Batch Operations: Use
bulkWrite()for multiple operations
Q: Explain query execution plans and how to use them.
- Use
explain()to see how MongoDB executes a query - Shows: execution stats, index usage, documents examined
- Helps identify missing indexes or inefficient queries
- Types:
queryPlanner,executionStats,allPlansExecution
Q: What is a covered query?
- Query that can be satisfied entirely using an index
- No documents need to be examined
- Very fast - only index is accessed
- Requires all projected fields to be in the index
Replication
Q: What is replication in MongoDB and why is it used?
- Replication maintains copies of data across multiple servers (replica set)
- Provides high availability - if primary fails, secondary takes over
- Data redundancy and disaster recovery
- Read scalability - can read from secondaries
Q: Explain replica set architecture.
- Primary: Handles all write operations
- Secondaries: Maintain copies of primary's data, can handle reads
- Arbiter: Votes in elections but doesn't store data
- Minimum: 1 primary + 2 secondaries (or 1 primary + 1 secondary + 1 arbiter)
- Automatic failover through elections
Q: What is eventual consistency in MongoDB replica sets?
- Secondaries may have slight lag behind primary
- Reads from secondaries may return slightly stale data
- Data will eventually become consistent across all replicas
- Trade-off for read scalability and availability
Q: How do you achieve strong consistency in MongoDB?
- Read from primary (
readPreference: "primary") - Use write concern
{w: "majority"}- waits for majority of nodes to acknowledge - Read concern
"majority"- only read data confirmed by majority - Slower but guarantees strong consistency
Q: Explain read preferences and write concerns.
Read Preferences:
primary: Read only from primary (default for writes)primaryPreferred: Prefer primary, fallback to secondarysecondary: Read only from secondariessecondaryPreferred: Prefer secondary, fallback to primarynearest: Read from lowest latency member
Write Concerns:
{w: 1}: Acknowledge write after primary confirms (default){w: "majority"}: Wait for majority of nodes{w: 2}: Wait for 2 nodes{j: true}: Wait for journal commit (durability)
Sharding
Q: What is sharding in MongoDB and when would you use it?
- Sharding distributes data across multiple machines (shards)
- Used when data exceeds single server capacity
- Horizontal scaling for large datasets and high throughput
- Each shard is a replica set
Q: Explain sharding architecture components.
- Shards: Store data (replica sets)
- Config Servers: Store cluster metadata and chunk mappings
- Mongos (Router): Routes queries to appropriate shards
- Shard Key: Field(s) used to distribute data across shards
Q: How do you choose a shard key?
- Should have high cardinality (many unique values)
- Good write distribution - avoid hotspots
- Supports common query patterns
- Avoid monotonically increasing keys (can cause hot shards)
- Compound shard keys can help balance load
Q: What are the consequences of a bad shard key?
- Hot Shards: Uneven distribution, some shards overloaded
- Jumbo Chunks: Chunks that can't be split (all documents have same shard key value)
- Poor Query Performance: Queries may hit all shards instead of targeted ones
- Manual Rebalancing: May require manual intervention
Q: Explain chunk splitting and balancing in sharding.
- Data is divided into chunks (default 64MB)
- Chunks are distributed across shards
- When chunk grows too large, MongoDB splits it
- Balancer moves chunks between shards to maintain balance
- Configurable chunk size affects balance vs. migration overhead
Transactions & ACID
Q: Does MongoDB support ACID transactions?
- Yes, multi-document transactions from MongoDB 4.0+
- ACID guarantees within a replica set
- ACID transactions in sharded clusters from MongoDB 4.2+
- Previously only single-document atomicity
Q: Explain write atomicity in MongoDB.
- Single-document writes are always atomic
- Multi-document transactions require explicit session
- Operations either fully succeed or fully fail (no partial writes)
- Isolation levels: snapshot isolation for transactions
Q: When should you use transactions vs. embedded documents?
- Transactions: Cross-document updates that must be atomic (e.g., transfer money between accounts)
- Embedded Documents: Related data that's frequently accessed together (e.g., user profile)
- Transactions have performance overhead, use sparingly
- Prefer embedding for better performance when possible
Security
Q: How do you secure MongoDB?
- Authentication: Enable authentication, use strong passwords
- Authorization: Role-based access control (RBAC)
- Network Security: Bind to specific IPs, use VPN/firewalls
- Encryption: Encrypt data at rest and in transit (TLS/SSL)
- Auditing: Enable auditing for compliance
- Regular Updates: Keep MongoDB updated
Q: Explain MongoDB authentication mechanisms.
- SCRAM-SHA-1, SCRAM-SHA-256 (default)
- LDAP authentication
- x.509 certificate authentication
- Kerberos authentication
- MongoDB Atlas uses AWS IAM for authentication
Q: What are MongoDB roles and privileges?
- Built-in roles:
read,readWrite,dbAdmin,userAdmin, etc. - Custom roles: Define specific privileges
- Privileges: actions like
find,insert,update,remove - Resource: database and collection scope
Monitoring & Operations
Q: How do you monitor MongoDB performance?
- MongoDB Atlas monitoring dashboard
mongostatandmongotopcommand-line tools- Profiler for slow queries
db.serverStatus()for server metrics- Third-party tools: Datadog, New Relic, Prometheus
- Key metrics: OPS, connections, memory, CPU, replication lag
Q: What are important MongoDB metrics to monitor?
- Operations: Read/write operations per second
- Connections: Active connections, connection pool usage
- Memory: Working set, resident memory
- CPU: Usage percentage
- Replication Lag: Delay between primary and secondaries
- Disk I/O: Read/write throughput
- Index Hit Ratio: Percentage of queries using indexes
- Cache Hit Ratio: WiredTiger cache efficiency
Q: How do you backup and restore MongoDB?
- mongodump: Creates binary export of database
- mongorestore: Restores from mongodump backup
- mongoexport: Exports data in JSON/CSV format
- mongoimport: Imports from JSON/CSV format
- Point-in-time recovery using oplog
- MongoDB Atlas automated backups
Q: Explain the oplog in MongoDB.
- Operations log - a capped collection that records all write operations
- Used for replication - secondaries read from primary's oplog
- Used for point-in-time recovery
- Limited size - oldest entries removed when full
- Only on replica set primary
Comparison Questions
Q: Compare MongoDB with PostgreSQL/MySQL.
| Aspect | MongoDB | PostgreSQL |
|---|---|---|
| Data Model | Document (NoSQL) | Relational (SQL) |
| Schema | Flexible | Fixed |
| Joins | Application-level | Native SQL joins |
| Transactions | Multi-doc (4.0+) | Full ACID support |
| Scaling | Horizontal (sharding) | Vertical + read replicas |
| Query Language | JavaScript-like | SQL |
| ACID | Per-document (always), Multi-doc (4.0+) | Full ACID |
Q: When would you choose MongoDB vs. other NoSQL databases?
- MongoDB: General-purpose document store, good balance of features
- Cassandra: Write-heavy, high availability, eventual consistency
- Redis: In-memory caching, real-time applications
- Elasticsearch: Full-text search, log analytics
- DynamoDB: AWS-native, serverless, predictable performance
- CouchDB: Offline-first, sync capabilities
Practical Questions
Q: Design a schema for a social media application using MongoDB.
- Users collection: profile, authentication info
- Posts collection: content, author reference, timestamps, embedded comments (if few)
- Comments collection: separate if many per post
- Follows collection: user relationships
- Indexes: user email (unique), post author, timestamp sorting
- Embedding: User profile info in posts for denormalized reads
- References: User IDs for relationships
Q: How would you optimize a slow query in MongoDB?
- Use
explain()to analyze query plan - Check if index is being used
- Create appropriate index if missing
- Optimize query - use projections, add filters
- Consider denormalization if frequently accessed together
- Review document structure
- Check for collection scans
- Monitor query performance after changes
Q: Explain how you would handle a MongoDB migration.
- Use migration scripts (Mongoose migrations, custom scripts)
- Version control schema changes
- Blue-green deployment for zero downtime
- Data transformation scripts for schema updates
- Testing on staging first
- Rollback plan
- Monitor during and after migration
Q: How do you handle schema changes in MongoDB?
- MongoDB doesn't enforce schema, but application does
- Backward compatibility: Add new fields as optional
- Migration scripts for data transformation
- Gradually migrate documents
- Update application code to handle both old and new formats
- Remove old fields after migration complete
Advanced Topics
Q: What is change streams in MongoDB?
- Real-time notifications of data changes
- Available from MongoDB 3.6+
- Can watch specific collections, databases, or entire deployment
- Uses aggregation pipeline for filtering
- Useful for: cache invalidation, triggers, real-time analytics
Q: Explain MongoDB Atlas and its benefits.
- Managed MongoDB service by MongoDB Inc.
- Automatic backups, monitoring, scaling
- Multi-cloud support (AWS, Azure, GCP)
- Global clusters, data residency
- Security features built-in
- No database administration overhead
Q: What is MongoDB Compass?
- GUI tool for MongoDB
- Query interface, schema analysis
- Explain plans visualization
- Index management
- Data import/export
- Performance monitoring
Best Practices
Q: What are MongoDB best practices?
- Schema Design: Embed frequently accessed data, reference for large/independent data
- Indexing: Index frequently queried fields, follow ESR rule
- Queries: Use projections, limit results, avoid full collection scans
- Write Concerns: Balance durability with performance based on use case
- Connection Pooling: Reuse connections, don't create new for each request
- Document Size: Keep under 16MB, avoid large arrays
- Naming: Use descriptive collection and field names
- Monitoring: Set up alerts for key metrics
- Backups: Regular backups, test restore procedures
- Security: Enable authentication, use encrypted connections
Q: What common mistakes should be avoided in MongoDB?
- Creating too many indexes (slows writes)
- Not creating indexes for frequently queried fields
- Storing large arrays that grow unbounded
- Not using connection pooling
- Reading from secondary when consistency is required
- Using
$wherewith JavaScript (slow) - Ignoring write concerns for critical data
- Not monitoring performance
- Poor shard key selection
- Not planning for growth
Coding Questions
Q: Write a query to find all users with age > 25 and salary < 100000.
db.users.find({
age: { $gt: 25 },
salary: { $lt: 100000 },
});
Q: Write an aggregation pipeline to get average salary by department.
db.users.aggregate([
{
$group: {
_id: "$department",
avgSalary: { $avg: "$salary" },
count: { $sum: 1 },
},
},
{ $sort: { avgSalary: -1 } },
]);
Q: Write a query to update user salary and return the updated document.
db.users.findOneAndUpdate(
{ email: "user@example.com" },
{ $inc: { salary: 5000 } },
{ returnDocument: "after" }
);
Q: How would you find users who have not logged in for 30 days?
const thirtyDaysAgo = new Date();
thirtyDaysAgo.setDate(thirtyDaysAgo.getDate() - 30);
db.users.find({
lastLogin: { $lt: thirtyDaysAgo },
});