Mongodb Cheetsheet

General commands

show dbs
show collections
use <dbname>
db.dropDatabase()
db.users.drop(): drop a collection

CRUD Operations

Insert

db.insertOne({name:"yilmaz"})
db.users.insertMany([{name:"asd"},{name:"anan"}])

Practice Queries with sample-users.csv

The enhanced CSV includes fields for practicing various MongoDB operations:

Fields available:

name, email, city, occupation, department - Strings for text search and filtering
age, salary, rating, score - Numbers for comparison, aggregation, sorting
isActive, isVerified - Booleans for boolean queries
createdAt, lastLogin - Dates (imported as strings; convert with new Date() after import)
13 records with varied values for grouping and aggregation

Query Types You Can Practice:

Basic Queries:
- db.users.find({age: {$gt: 30}})
- db.users.find({isActive: true})
- db.users.find({salary: {$lte: 90000}})
- db.users.find({address.street : ""})
Text Search:
- db.users.find({name: /John/})
- db.users.find({city: "New York"})
- db.users.find({name: {$regex: "Lisa"}})
Comparison Operators:
- $gt, $gte, $lt, $lte, $ne, $in
- db.users.find({rating: {$gte: 4.5}})
- db.users.find({department: {$in: ["Design","Sales"]}}).skip(1).limit(10)
- db.users.find({city: {$ne: "Los Angeles"}})
- db.users.find({salary: {$gt:10000, $lt: 100000}})
- db.users.find({salary: {$gt:10000, $lt: 100000}, age: {$gte: 20}})
Logical Operators:
- $and, $or, $nor, $not
- db.users.find({$and: [{age: {$gt: 25}}, {salary: {$lt: 100000}}]})
- db.users.find({age: {$not: {$gte: 10}}})
Aggregation:
- $group, $sum, $avg, $min, $max, $count
- db.users.aggregate([{$group: {_id: "$department", avgSalary: {$avg: "$salary"}}}])
Sorting:
- db.users.find().sort({salary: -1})
- db.users.find().sort({name: 1, age: -1})
Projection:
- db.users.find({}, {name: 1, email: 1, salary: 1})
- db.users.find({},{name: 0})
Limit and Skip:
- db.users.find().limit(5).skip(2)
Update Operations:
- db.users.updateOne({name: "John Doe"}, {$set: {salary: 100000}})
- db.users.updateMany({department: "Engineering"}, {$inc: {salary: 5000}})
- db.users.updateMany({city: "Los Angeles"},{$set: {city: "istanbul"}})
- db.users.updateMany({},{$inc: {age: 10}})
- db.users.updateMany({},{$rename: {city: "memleket"}})
  , $pull, $push, $unset
- db.users.updateOne({_id: 52},{$push: {tags: 'hasd'}})
- db.users.update({_id: 52},{$pull: {tags: 'tag123'}})
Date Queries (after converting date strings):
- Convert dates: db.users.updateMany({}, [{$set: {createdAt: {$dateFromString: {dateString: "$createdAt"}}}}])
- Then: db.users.find({createdAt: {$gte: ISODate("2023-01-01")}})
Distinct Values:
- db.users.distinct("department")
- db.users.distinct("city")
Count:
- db.users.countDocuments({isActive: true})
- db.users.countDocuments({salary: {$gt: 100000}})
- db.users.countDocuments({name: {$regex: "Lisa"}})
Complext Queries
- db.users.find({$expr: {$gte: ["$age","$score"]}})
Insert

  products = [
  |   { "vw": { "brand": "vw", "ceo": "gd" } },
  |   { "bmw": { "brand": "bmw", "ceo": "asd" } }
  | ]
  |
  [
    { vw: { brand: 'vw', ceo: 'gd' } },
    { bmw: { brand: 'bmw', ceo: 'asd' } }
  ]
  db.products.insertMany(products)

Importing CSV Files

brew install mongodb-database-tools

Verify installation:

mongoimport --version

MongoDB provides mongoimport tool to import CSV files directly into collections:

mongoimport --db <database_name> --collection <collection_name> --type csv --headerline --file <path/to/file.csv>

Options:

--db: Database name
--collection: Collection name
--type csv: Specify CSV format
--headerline: Use first line as field names
--file: Path to CSV file
--drop: Drop collection before importing (optional)
--ignoreBlanks: Ignore blank fields (optional)

Example:

mongoimport --db mydb --collection users --type csv --headerline --file ./users.csv

If CSV doesn't have headers, specify fields:

mongoimport --db mydb --collection users --type csv --fields name,email,age --file ./users.csv

Performance Practice

Generating Large Datasets

For practicing indexing and query optimization, generate large datasets. Use a Node.js script or import public datasets.

Option 1: Data Generation Script

cd mongodb
npm install
npm run generate          # Generate 1M documents
TOTAL_DOCS=100000 npm run generate  # Custom amount

Option 2: Public Datasets

Kaggle Datasets
Faker.js for synthetic data
Import with mongoimport

Creating Indexes

✅ Best Practice: Generate data first, then create indexes

Faster data loading without index maintenance overhead during inserts.

// 1. Generate/import data first
// npm run generate

// 2. Then create indexes on existing data
use performance_db;
db.users.createIndex({ city: 1 });
db.users.createIndex({ age: 1 });
db.users.createIndex({ department: 1, salary: -1 });

// Verify indexes
db.users.getIndexes();

Index Types:

// Single field
db.users.createIndex({ email: 1 });

// Compound (follow ESR rule: Equality → Sort → Range)
db.users.createIndex({ department: 1, salary: -1, age: 1 });

// Multikey (array fields)
db.users.createIndex({ tags: 1 });

// Unique
db.users.createIndex({ email: 1 }, { unique: true });

// Partial (only indexes matching documents)
db.users.createIndex(
  { email: 1 },
  { partialFilterExpression: { isActive: true } }
);

Performance Testing

1. Query without index (baseline):

db.users.find({ city: "New York" }).explain("executionStats");
// Note: executionTimeMillis, totalDocsExamined

2. Create index and compare:

db.users.createIndex({ city: 1 });
db.users.find({ city: "New York" }).explain("executionStats");
// Compare: execution time should be much faster

3. Compound index practice:

db.users.createIndex({ department: 1, salary: -1, age: 1 });
db.users
  .find({ department: "Engineering", salary: { $gt: 100000 } })
  .sort({ age: 1 })
  .explain("executionStats");

4. Aggregation performance:

db.users.aggregate([
  { $match: { isActive: true } },
  {
    $group: {
      _id: "$department",
      avgSalary: { $avg: "$salary" },
      count: { $sum: 1 },
    },
  },
  { $sort: { avgSalary: -1 } },
  { $limit: 5 },
]);

Key Metrics to Track:

executionTimeMillis - Query execution time
totalDocsExamined - Should be close to nReturned with index
executionStages - Prefer IXSCAN (index scan) over COLLSCAN (collection scan)

Common Pitfalls:

Too many indexes slow writes
Wrong compound index order (follow ESR rule)
Not using .explain() before optimizing
Ignoring index size (monitor with db.users.stats().indexSizes)

Remove unused indexes:

db.users.dropIndex("city_1");
db.users.dropIndexes(); // Removes all except _id