BDSL456B Lab Manual
Laboratory Manual
on
MONGODB (BDSL456B)
BANGALORE
2023-2024
COMPILED BY
1. Apply the Artificial Intelligence and Data Analytics skills in the areas of Disaster
Management, Health Care, Education, Agriculture, Transportation, Environment,
Society and in other multi-disciplinary areas.
2. Analyze and demonstrate the knowledge of Human cognition, Artificial
Intelligence, Machine Learning and Data engineering in terms of real-world
problems to meet the challenges of the future.
3. Develop computational knowledge and project management skills using
innovative tools and techniques to solve problems in the areas related to Deep
Learning, Machine learning, Artificial Intelligence which will lead to lifelong
learning.
MongoDB
Course Code                      BDSL456B    CIE Marks     50
Teaching Hours/Week (L:T:P:S)    0:1:2:0     SEE Marks     50
Total Hours of Pedagogy          24          Total Marks   100
Credits                          01          Exam Hours    03
Course Learning Objectives:
1. The lecture method (L) need not be only the traditional lecture method; different types of
teaching methods may be adopted to achieve the outcomes.
2. Show video/animation films to explain the functioning of various concepts.
3. Encourage collaborative (group) learning in the class.
4. Ask at least three HOTS (Higher Order Thinking) questions in the class to promote critical
thinking.
5. Adopt Problem Based Learning (PBL), which fosters students' analytical skills and develops thinking
skills such as the ability to evaluate, generalize, and analyze information rather than simply recall it.
6. Introduce topics in multiple representations.
7. Show the different ways to solve the same problem and encourage the students to come up with their
own creative ways to solve them.
8. Discuss how every concept can be applied to the real world; where that is possible, it helps
improve the students' understanding.
SL # Experiments
1 a. Illustration of Where Clause, AND, OR operations in MongoDB.
b. Execute the Commands of MongoDB and operations in MongoDB: Insert, Query, Update,
Delete and Projection. (Note: use any collection)
[Refer: Book 1 chapter 4].
2 a. Develop a MongoDB query to select certain fields and ignore some fields of the documents
from any collection.
b. Develop a MongoDB query to display the first 5 documents from the results obtained in (a).
[use of limit and find]
[Refer: Book 1 Chapter 4, Book 2: Chapter 5]
3 a. Execute query selectors (comparison selectors, logical selectors) and list out the results on
any collection
b. Execute query selectors (Geospatial selectors, Bitwise selectors) and list out the results on
any collection
[Refer: Book 3 Chapter 13]
4 Create and demonstrate how projection operators ($, $elemMatch and $slice) would be used in
MongoDB.
[Refer: Book 3 Chapter 14]
5 Execute Aggregation operations ($avg, $min, $max, $push, $addToSet etc.). (Students are encouraged to
execute several queries to demonstrate various aggregation operators.)
[Refer: Book 3 Chapter 15]
6 Execute Aggregation Pipeline and its operations (pipeline must contain $match, $group, $sort,
$project, $skip etc.). (Students are encouraged to execute several queries to demonstrate various
aggregation operators.)
[Refer: Book 2: Chapter 6]
7 a. Find all listings with listing_url, name, address, host_picture_url in the listingsAndReviews
collection that have a host with a picture URL.
b. Using an E-commerce collection, write a query to display a reviews summary.
[Refer: Book 2: Chapter 6]
8 a. Demonstrate creation of different types of indexes on collection (unique, sparse, compound
and multikey indexes)
b. Demonstrate optimization of queries using indexes.
[Refer: Book 2: Chapter 8 and Book 3: Chapter 12]
9 a. Develop a query to demonstrate Text search using catalog data collection for a given word
b. Develop queries to illustrate excluding documents with certain words and phrases
[Refer: Book 2: Chapter 9]
10 Develop an aggregation pipeline to illustrate Text search on Catalog data collection.
[Refer: Book 2: Chapter 9]
NoSQL databases differ from relational databases such as MS SQL Server. In a relational
database you must create the table, define the schema and set the data types of the fields
before you can insert any data. In a NoSQL database such as MongoDB no schema has to be
declared up front, so you can insert and update data on the fly.
One advantage of NoSQL databases is that they are comparatively easy to scale out, and they
can be faster for many common database operations. There are situations where you would
prefer a relational database over NoSQL; however, when you are dealing with huge amounts
of data, a NoSQL database is often the better choice.
1. In a relational database we need to define the structure and schema of the data first, and
only then can we process the data.
2. Relational database systems provide consistency and integrity of data by
enforcing ACID properties (Atomicity, Consistency, Isolation and Durability).
There are scenarios where this is essential, such as a banking system. In many
other cases, however, these guarantees add performance overhead and can slow
down database responses.
3. Many applications exchange their data in JSON format, and an RDBMS does not provide
a natural way of performing operations such as create, insert, update and delete
on this data. Document databases such as MongoDB, on the other hand, store data as
JSON-like documents, which suits most present-day applications.
There are several advantages of working with NoSQL databases such as MongoDB and
Cassandra. The main advantages are high scalability and high availability.
High scalability: A NoSQL database such as MongoDB uses sharding for horizontal scaling.
Sharding is the partitioning of data across multiple machines; with range-based sharding
the order of the data by shard key is preserved. Vertical scaling means adding more
resources to the existing machine, while horizontal scaling means adding more machines to
handle the data. Vertical scaling is limited by the capacity of a single machine, whereas
horizontal scaling lets a cluster grow by adding machines. Examples of horizontally
scalable databases: MongoDB, Cassandra etc.
Because of this feature NoSQL databases can handle huge amounts of data: as the data grows,
the cluster can be scaled out to handle it efficiently.
NoSQL databases are commonly grouped into key-value, document, column-family and graph
databases. MongoDB falls in the category of document-based NoSQL databases.
RDBMS Vs NoSQL
RDBMS: Handles structured data; provides more functionality but gives less
performance.
NoSQL: Handles structured or semi-structured data; provides less functionality but higher performance.
Supporting this extra functionality can hinder the scalability of a database, so while using a NoSQL
database like MongoDB, you can implement such functionality at the application level.
When to go for NoSQL
What is MongoDB ?
RDBMS                                          NoSQL
Not good for hierarchical data storage         Good for hierarchical data storage
Vertically scalable: the user can increase     Horizontally scalable: the user can add
RAM                                            more servers
Emphasizes ACID properties (Atomicity,         Emphasizes the CAP theorem (Consistency,
Consistency, Isolation and Durability)         Availability and Partition tolerance)
In MongoDB, querying on this model is easy since the schema is de-normalized and no joins
are required, so we plan to use a MongoDB-based solution.
Introduction to MongoDB
What is a document?
If you come from a relational database background, you can think of documents as rows in
an RDBMS. The mapping between relational database terms (rows, tables, columns) and
MongoDB is covered later in this manual. A document is a JSON-like
structure in which data is stored in the form of key and value pairs.
{
name: "Chaitanya",
age: 30,
website: "beginnersbook.com"
}
History of MongoDB
MongoDB was created by Eliot Horowitz and Dwight Merriman (previously at DoubleClick) in 2007, when they
faced scalability issues while working with relational databases. The organization that
developed MongoDB was originally known as 10gen.
In February 2009, they changed their business model and released MongoDB as an open-source
project. The organization changed its name in 2013 and is now known as MongoDB
Inc.
Features of MongoDB
1. MongoDB provides high performance. Many operations in MongoDB can be
faster than the equivalent operations in relational databases.
2. MongoDB provides automatic replication, which allows you to quickly recover data
in case of a failure.
3. Horizontal scaling is possible in MongoDB because of sharding. Sharding is the
partitioning of data across multiple machines; with range-based sharding the
order of the data by shard key is preserved.
Horizontal scaling vs vertical scaling:
Vertical scaling means adding more resources to the existing machine, while
horizontal scaling means adding more machines to handle the data. Vertical
scaling is limited by the capacity of a single machine, whereas horizontal scaling
lets a cluster grow by adding machines. Examples of horizontally scalable
databases: MongoDB, Cassandra etc.
If you are coming from a relational database background, it might be difficult for you
to relate RDBMS terms to MongoDB. In this guide, we will see the mapping between
relational databases and MongoDB.
Collections in MongoDB are equivalent to tables in an RDBMS.
Fields (key and value pairs) are stored in documents, documents are stored in collections,
and collections are stored in a database.
A document in MongoDB is similar to a row in an RDBMS table: columns
are represented as key-value pairs (JSON format) and rows are represented as documents.
Unless you supply one, MongoDB automatically inserts a unique _id field (a 12-byte ObjectId)
in every document; this serves as the primary key for each document.
Another useful feature of MongoDB is that it supports a dynamic schema, which means one
document of a collection can have 4 fields while another document has only 3 fields. This
is not possible in a relational table, where every row shares the same columns.
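A minimal sketch of the dynamic schema described above. The collection name people and the two documents are hypothetical and only serve as an illustration. The field counts are computed in plain JavaScript; the insert runs only in mongosh, where db is defined.

```javascript
// Hypothetical documents: same collection, different sets of fields.
const dynamicDocs = [
  { name: "Asha", age: 30, city: "Bangalore", email: "asha@example.com" }, // 4 fields
  { name: "Ravi", age: 25, city: "Mysore" }                                // 3 fields
];

// Count the fields of each document in plain JavaScript.
const fieldCounts = dynamicDocs.map((doc) => Object.keys(doc).length);
console.log(fieldCounts); // [ 4, 3 ]

// In mongosh, where `db` is defined, one collection accepts both documents.
if (typeof db !== "undefined") {
  db.people.insertMany(dynamicDocs);
  printjson(db.people.find({}, { _id: 0 }).toArray());
}
```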
Experiment No: 01
BASIC OPERATIONS IN MONGODB
Aim:
a. Illustration of Where Clause, AND, OR operations in MongoDB.
b. Execute the Commands of MongoDB and operations in MongoDB: Insert, Query,
Update, Delete and Projection. (Note: use any collection)
MongoDB Commands:
Insert:
To insert documents into a collection, you use the insertOne() or insertMany() method.
Syntax: db.collection.insertOne({ field1: value1, field2: value2 });
Query:
To retrieve documents from a collection, you use the find() method.
Syntax: db.collection.find({ field: value });
Update:
To update documents in a collection, you use the updateOne() or updateMany() method.
Syntax: db.collection.updateOne({ field: value }, { $set: { fieldToUpdate: newValue } });
Delete:
To delete documents from a collection, you use the deleteOne() or deleteMany()
method.
Syntax: db.collection.deleteOne({ field: value });
Projection:
Projection allows you to specify which fields to include or exclude in the query results.
Syntax: db.collection.find({ field: value }, { fieldToInclude: 1, _id: 0 });
Example:
Let's put all these operations together in an example: Suppose we have a collection
called users with documents representing users:
[
{"id": 1, "name": "Alice", "age": 30, "city": "New York"},
{"id": 2, "name": "Bob", "age": 25, "city": "Los Angeles"},
{"id": 3, "name": "Charlie", "age": 35, "city": "Chicago"}
]
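The manual lists the documents but not the command that loads them. One possible way, sketched here with the same three documents, is insertMany(). The guard on db lets the snippet run outside mongosh without touching a database.

```javascript
// The three sample documents from the manual, written as a JavaScript array.
const sampleUsers = [
  { id: 1, name: "Alice", age: 30, city: "New York" },
  { id: 2, name: "Bob", age: 25, city: "Los Angeles" },
  { id: 3, name: "Charlie", age: 35, city: "Chicago" }
];
console.log(sampleUsers.length); // 3

// In mongosh, where `db` is defined, load them into the users collection.
if (typeof db !== "undefined") {
  db.users.insertMany(sampleUsers);        // MongoDB adds an _id to each document
  printjson(db.users.countDocuments({}));  // 3 if the collection was empty before
}
```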
Query:
Find users who are older than 25 and live in either New York or Chicago:
db.users.find({ $and: [ { age: { $gt: 25 } }, { $or: [ { city: "New York" }, { city: "Chicago" } ]
} ] });
This query will return documents for Alice and Charlie.
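The same condition can usually be written more compactly: conditions listed side by side in one filter document are combined with AND, and $in can stand in for an $or over a single field. The sketch below shows both forms and mirrors the condition in plain JavaScript on the sample users. The mirror is only an illustration, not how the server evaluates the filter.

```javascript
// Filter from the manual, with explicit $and and $or.
const explicitFilter = {
  $and: [
    { age: { $gt: 25 } },
    { $or: [{ city: "New York" }, { city: "Chicago" }] }
  ]
};
// Shorter equivalent: implicit AND between fields, $in instead of $or.
const compactFilter = { age: { $gt: 25 }, city: { $in: ["New York", "Chicago"] } };

// Plain-JavaScript mirror of the condition on the manual's sample users.
const usersToFilter = [
  { name: "Alice", age: 30, city: "New York" },
  { name: "Bob", age: 25, city: "Los Angeles" },
  { name: "Charlie", age: 35, city: "Chicago" }
];
const matchedNames = usersToFilter
  .filter((u) => u.age > 25 && compactFilter.city.$in.includes(u.city))
  .map((u) => u.name);
console.log(matchedNames); // [ 'Alice', 'Charlie' ]

// In mongosh both filters can be passed to find().
if (typeof db !== "undefined") {
  printjson(db.users.find(explicitFilter).toArray());
  printjson(db.users.find(compactFilter).toArray());
}
```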
Update:
Update Bob's age to 27:
db.users.updateOne({ name: "Bob" }, { $set: { age: 27 } });
Delete:
Delete users who are younger than 30:
db.users.deleteMany({ age: { $lt: 30 } });
Projection:
Get only the names of users:
db.users.find({}, { name: 1, _id: 0 });
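Run against the original three documents, this will return:
[
{ "name": "Alice" },
{ "name": "Bob" },
{ "name": "Charlie" }
]
If the update and delete above have already been run, Bob (age 27 after the update) has been
deleted, so only Alice and Charlie are returned.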
Experiment No: 02
Basic Operations
Aim:
a. Develop a MongoDB query to select certain fields and ignore some fields of the
documents from any collection.
b. Develop a MongoDB query to display the first 5 documents from the results
obtained in a. [use of limit and find]
a.
To select certain fields, pass a projection document as the second argument of find().
Set the value to 1 for the fields you want to include, and set _id to 0 if you want to exclude the
default _id field.
For example, if you have a collection named users and you want to select only the name
and email fields while excluding the _id field, you can use the following query:
db.users.find({}, { name: 1, email: 1, _id: 0 })
This query will return all documents from the users collection with only the name and
email fields included.
b.
Use the limit() method in MongoDB to limit the number of documents returned by a query.
Continuing with the previous example of selecting the name and email fields from the users
collection:
db.users.find({}, { name: 1, email: 1, _id: 0 }).limit(5)
This query will return the first 5 documents from the users collection, including only the
name and email fields, and excluding the _id field.
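Without a sort, "first 5" means the first five documents in whatever order the server returns them, which is typically insertion order but is not guaranteed. A hedged sketch that makes the order explicit by sorting on name before limiting follows. The sample names are hypothetical, and the plain-JavaScript lines only preview what sort and limit do.

```javascript
// Projection used in the manual: keep name and email, drop _id.
const nameEmailProjection = { name: 1, email: 1, _id: 0 };

// Plain-JavaScript preview of sort({ name: 1 }).limit(5) on hypothetical names.
const sampleNames = ["Gita", "Anil", "Farah", "Bala", "Esha", "Chitra", "Dev"];
const firstFive = [...sampleNames].sort().slice(0, 5);
console.log(firstFive); // [ 'Anil', 'Bala', 'Chitra', 'Dev', 'Esha' ]

// In mongosh: sort first, then limit, so the same five documents come back every time.
if (typeof db !== "undefined") {
  printjson(db.users.find({}, nameEmailProjection).sort({ name: 1 }).limit(5).toArray());
}
```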
Experiment No: 03
Selectors
Aim:
a. Execute query selectors (comparison selectors, logical selectors) and list out the
results on any collection
b. Execute query selectors (Geospatial selectors, Bitwise selectors) and list out the
results on any collection
Comparison Selectors:
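A minimal sketch of the comparison selectors, assuming the same products collection with price and quantity fields that the logical-selector examples use. The sample products and the labels are hypothetical. Each entry is a filter document that can be passed to find().

```javascript
// One filter document per comparison selector.
const comparisonFilters = {
  equalTo:     { price: { $eq: 50 } },
  notEqualTo:  { price: { $ne: 50 } },
  greaterThan: { price: { $gt: 50 } },
  atMost:      { quantity: { $lte: 10 } },
  inList:      { price: { $in: [25, 50, 100] } },
  notInList:   { price: { $nin: [25, 50, 100] } }
};

// Hypothetical products and a plain-JavaScript mirror of the $gt filter.
const sampleProducts = [
  { name: "Pen", price: 25, quantity: 200 },
  { name: "Bag", price: 50, quantity: 40 },
  { name: "Lamp", price: 120, quantity: 8 }
];
const pricierThan50 = sampleProducts.filter((p) => p.price > 50).map((p) => p.name);
console.log(pricierThan50); // [ 'Lamp' ]

// In mongosh, run every filter against the products collection.
if (typeof db !== "undefined") {
  for (const [label, filter] of Object.entries(comparisonFilters)) {
    print(label);
    printjson(db.products.find(filter).toArray());
  }
}
```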
Logical Selectors:
// Find products with price less than $50 or quantity greater than 100
db.products.find({ $or: [ { price: { $lt: 50 } }, { quantity: { $gt: 100 } } ] })
// Find products with price greater than or equal to $100 and quantity less than or equal to 10
db.products.find({ $and: [ { price: { $gte: 100 } }, { quantity: { $lte: 10 } } ] })
b.
Geospatial Selectors:
Suppose we have a collection named locations where each document represents a
location with coordinates.
// Create a 2dsphere index on the 'location' field for geospatial queries
db.locations.createIndex({ location: "2dsphere" })
// Find locations near a specific point (longitude, latitude) within a specified distance (in meters)
db.locations.find({
location: {
$near: {
$geometry: {
type: "Point",
coordinates: [longitude, latitude]
},
$maxDistance: distanceInMeters
}
}
})
Replace longitude, latitude, and distanceInMeters with the coordinates and distance you
want to search around.
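A concrete version of the query above, as a sketch. The place names and coordinates are approximate, hypothetical sample values, and the 2000 m radius is arbitrary. GeoJSON points list longitude first, which the small validity check makes explicit.

```javascript
// Hypothetical GeoJSON documents; coordinates are [longitude, latitude].
const sampleLocations = [
  { name: "Park", location: { type: "Point", coordinates: [77.5933, 12.9763] } },
  { name: "Garden", location: { type: "Point", coordinates: [77.5848, 12.9507] } }
];

// $near filter with concrete values in place of the placeholders.
const nearFilter = {
  location: {
    $near: {
      $geometry: { type: "Point", coordinates: [77.5946, 12.9716] },
      $maxDistance: 2000 // meters
    }
  }
};

// Longitude must lie in [-180, 180] and latitude in [-90, 90].
const isValidPoint = ([lng, lat]) => lng >= -180 && lng <= 180 && lat >= -90 && lat <= 90;
console.log(sampleLocations.every((d) => isValidPoint(d.location.coordinates))); // true

// In mongosh: insert, index, then query (nearest documents are returned first).
if (typeof db !== "undefined") {
  db.locations.insertMany(sampleLocations);
  db.locations.createIndex({ location: "2dsphere" });
  printjson(db.locations.find(nearFilter, { _id: 0, name: 1 }).toArray());
}
```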
Bitwise Selectors:
Suppose we have a collection named permissions where each document represents a
user's permissions stored as bit flags.
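A minimal sketch of the bitwise selectors, assuming a hypothetical numeric field named flags in which bit 0 means read, bit 1 means write and bit 2 means execute. The field name, the bit meanings and the sample users are illustrative assumptions. The plain-JavaScript lines mirror what $bitsAllSet and $bitsAllClear test.

```javascript
// Hypothetical permission bits.
const READ = 1, WRITE = 2, EXECUTE = 4;
const samplePermissions = [
  { user: "alice", flags: READ | WRITE },           // 3
  { user: "bob",   flags: READ },                   // 1
  { user: "carol", flags: READ | WRITE | EXECUTE }  // 7
];
const bitmask = READ | WRITE; // 3, binary 011

// Mirror of $bitsAllSet: every bit of the mask is 1 in flags.
const allSet = samplePermissions.filter((p) => (p.flags & bitmask) === bitmask).map((p) => p.user);
// Mirror of $bitsAllClear: every bit of the mask is 0 in flags.
const executeClear = samplePermissions.filter((p) => (p.flags & EXECUTE) === 0).map((p) => p.user);
console.log(allSet, executeClear); // [ 'alice', 'carol' ] [ 'alice', 'bob' ]

// In mongosh the four bitwise selectors take the mask directly.
if (typeof db !== "undefined") {
  db.permissions.insertMany(samplePermissions);
  printjson(db.permissions.find({ flags: { $bitsAllSet: bitmask } }).toArray());
  printjson(db.permissions.find({ flags: { $bitsAnySet: bitmask } }).toArray());
  printjson(db.permissions.find({ flags: { $bitsAllClear: EXECUTE } }).toArray());
  printjson(db.permissions.find({ flags: { $bitsAnyClear: bitmask } }).toArray());
}
```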
Replace <bitmask> with the bitmask for the permissions you want to query for; it can be given
as a number, as binary data (BinData), or as an array of bit positions.
Output
Experiment No: 04
Projection Operators
Aim:
Create and demonstrate how projection operators ($, $elemMatch and $slice) would be
used in MongoDB.
Let's first create a sample collection named orders with some documents representing
orders, each containing an array of items:
db.orders.insertMany([
{
order_id: 1,
customer_name: "Alice",
items: [
{ name: "iPhone", quantity: 2 },
{ name: "Laptop", quantity: 1 },
{ name: "Headphones", quantity: 3 }
]
},
{
order_id: 2,
customer_name: "Bob",
items: [
{ name: "Tablet", quantity: 1 },
{ name: "Smartwatch", quantity: 2 }
]
}
])
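With the orders collection in place, the three projection operators might be exercised as sketched below. The conditions (quantity of at least 2, the item named "Laptop", the first two items) are arbitrary choices for illustration. The plain-JavaScript lines preview what each operator keeps from Alice's items array.

```javascript
// First order from the sample data, used for a plain-JavaScript preview.
const aliceOrder = {
  order_id: 1,
  customer_name: "Alice",
  items: [
    { name: "iPhone", quantity: 2 },
    { name: "Laptop", quantity: 1 },
    { name: "Headphones", quantity: 3 }
  ]
};
// $ keeps the first element that satisfies the query condition.
const positionalPreview = aliceOrder.items.find((i) => i.quantity >= 2);
// $elemMatch keeps the first element that satisfies the projection condition.
const elemMatchPreview = aliceOrder.items.find((i) => i.name === "Laptop");
// $slice: 2 keeps the first two elements.
const slicePreview = aliceOrder.items.slice(0, 2);
console.log(positionalPreview.name, elemMatchPreview.name, slicePreview.map((i) => i.name));

if (typeof db !== "undefined") {
  // $ : the query on items decides which single element is returned.
  printjson(db.orders.find({ "items.quantity": { $gte: 2 } },
    { _id: 0, customer_name: 1, "items.$": 1 }).toArray());
  // $elemMatch: the condition lives in the projection itself.
  printjson(db.orders.find({},
    { _id: 0, customer_name: 1, items: { $elemMatch: { name: "Laptop" } } }).toArray());
  // $slice: only the first two items of every order.
  printjson(db.orders.find({},
    { _id: 0, customer_name: 1, items: { $slice: 2 } }).toArray());
}
```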
Output
Experiment No: 05
Aggregation Operations
Aim:
Execute Aggregation operations ($avg, $min, $max, $push, $addToSet etc.). (Students are
encouraged to execute several queries to demonstrate various aggregation operators.)
Let's say we have a collection named "students" with documents structured like this:
{
"_id": 1,
"name": "John",
"age": 20,
"grade": "A"
}
Aggregation operations:
$avg: Calculates the average age of all students.
db.students.aggregate([
{
$group: {
_id: null,
averageAge: { $avg: "$age" }
}
}
])
$min: Finds the minimum age among all students.
db.students.aggregate([
{
$group: {
_id: null,
minAge: { $min: "$age" }
}
}
])
$max: Finds the maximum age among all students.
db.students.aggregate([
{
$group: {
_id: null,
maxAge: { $max: "$age" }
}
}
])
$push: Groups students by their grades and pushes their names into an array for
each grade.
db.students.aggregate([
{
$group: {
_id: "$grade",
students: { $push: "$name" }
}
}
])
$addToSet: Similar to $push, but ensures unique values in the resulting array.
db.students.aggregate([
{
$group: {
_id: "$grade",
uniqueStudents: { $addToSet: "$name" }
}
}
])
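The accumulators above can also be combined in a single $group stage. The sketch below does this per grade on a few hypothetical students, loaded into a scratch collection named studentsDemo (an assumed name, chosen so the manual's students collection stays untouched). The plain-JavaScript summary mirrors the result for grade A only. The order of elements produced by $addToSet is not specified.

```javascript
// Hypothetical sample data; note that the name "John" appears twice in grade A.
const demoStudents = [
  { _id: 1, name: "John", age: 20, grade: "A" },
  { _id: 2, name: "Meera", age: 22, grade: "B" },
  { _id: 3, name: "Arun", age: 19, grade: "A" },
  { _id: 4, name: "John", age: 21, grade: "A" }
];
// All five accumulators in one $group stage.
const perGradePipeline = [
  { $group: {
      _id: "$grade",
      averageAge: { $avg: "$age" },
      minAge: { $min: "$age" },
      maxAge: { $max: "$age" },
      allNames: { $push: "$name" },
      distinctNames: { $addToSet: "$name" }
  } }
];
// Plain-JavaScript mirror for grade A.
const gradeA = demoStudents.filter((s) => s.grade === "A");
const agesA = gradeA.map((s) => s.age);
const summaryA = {
  averageAge: agesA.reduce((a, b) => a + b, 0) / agesA.length,
  minAge: Math.min(...agesA),
  maxAge: Math.max(...agesA),
  allNames: gradeA.map((s) => s.name),
  distinctNames: [...new Set(gradeA.map((s) => s.name))]
};
console.log(summaryA);

if (typeof db !== "undefined") {
  db.studentsDemo.drop();
  db.studentsDemo.insertMany(demoStudents);
  printjson(db.studentsDemo.aggregate(perGradePipeline).toArray());
}
```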
Output
Experiment No: 06
MongoDB Aggregation Pipeline
Aim:
Execute Aggregation Pipeline and its operations (pipeline must contain $match, $group,
$sort, $project, $skip etc.). (Students are encouraged to execute several queries to demonstrate
various aggregation operators.)
db.students.aggregate([
{
$match: { age: 20 }
},
{
$group: {
_id: "$grade",
count: { $sum: 1 },
averageAge: { $avg: "$age" }
}
},
{
$sort: { count: -1 }
},
{
$project: {
_id: 0,
grade: "$_id",
count: 1,
averageAge: 1
}
},
{
$skip: 1
}
])
Explanation:
$match: Keeps only the documents where age is 20.
$group: Groups documents by grade, counting the number of students in each grade and
calculating the average age.
$sort: Sorts the groups by count in descending order.
$project: Reshapes the output documents to include only the grade, count, and average
age.
$skip: Skips the first result.
This pipeline will give you aggregated data about students aged 20, grouped by grade,
sorted by the count of students in each grade in descending order, and skipping the first
result.
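Because $match keeps only students aged exactly 20, the average age of every group in that pipeline is 20. A variant sketch with a range match, a tie-breaker in $sort and an added $limit stage may show the effect of each stage more clearly. The sample students are hypothetical, and the plain-JavaScript code only mirrors the stages for illustration.

```javascript
const rangePipeline = [
  { $match: { age: { $gte: 18, $lte: 25 } } },
  { $group: { _id: "$grade", count: { $sum: 1 }, averageAge: { $avg: "$age" } } },
  { $sort: { count: -1, _id: 1 } },
  { $project: { _id: 0, grade: "$_id", count: 1, averageAge: 1 } },
  { $skip: 1 },
  { $limit: 2 }
];

// Hypothetical students for a plain-JavaScript mirror of the stages.
const pipelineStudents = [
  { age: 20, grade: "A" }, { age: 19, grade: "A" }, { age: 21, grade: "A" },
  { age: 22, grade: "B" }, { age: 24, grade: "B" },
  { age: 30, grade: "C" }, { age: 18, grade: "C" }
];
const groups = {};
for (const s of pipelineStudents) {
  if (s.age < 18 || s.age > 25) continue;                        // $match
  const g = groups[s.grade] || (groups[s.grade] = { grade: s.grade, count: 0, total: 0 });
  g.count += 1;                                                  // $group: $sum
  g.total += s.age;                                              // $group: input to $avg
}
const mirrored = Object.values(groups)
  .map((g) => ({ grade: g.grade, count: g.count, averageAge: g.total / g.count })) // $project
  .sort((a, b) => b.count - a.count || a.grade.localeCompare(b.grade))             // $sort
  .slice(1, 3);                                                                    // $skip 1, $limit 2
console.log(mirrored);

if (typeof db !== "undefined") {
  printjson(db.students.aggregate(rangePipeline).toArray());
}
```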
Output
Experiment No: 07
MongoDB Aggregation Pipeline
Aim:
a. Find all listings with listing_url, name, address, host_picture_url in the
listingsAndReviews collection that have a host with a picture URL.
b. Using E-commerce collection write a query to display reviews summary.
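For part (a), a plain find() may be enough if the data follows the layout of MongoDB's sample_airbnb dataset, where each listingsAndReviews document embeds its host as a sub-document. That layout, the field path host.host_picture_url and the sample documents below are assumptions; verify them against your own collection. The filter also excludes empty strings.

```javascript
// Filter: the embedded host has a non-empty picture URL.
const hostPictureFilter = { "host.host_picture_url": { $exists: true, $nin: [null, ""] } };
// Projection: only the requested fields.
const listingProjection = {
  _id: 0, listing_url: 1, name: 1, address: 1, "host.host_picture_url": 1
};

// Hypothetical documents and a plain-JavaScript mirror of the filter.
const sampleListings = [
  { listing_url: "https://example.com/rooms/1", name: "Garden flat",
    address: { country: "India" }, host: { host_picture_url: "https://example.com/h1.jpg" } },
  { listing_url: "https://example.com/rooms/2", name: "Studio",
    address: { country: "India" }, host: { host_picture_url: "" } }
];
const withPicture = sampleListings
  .filter((l) => l.host && l.host.host_picture_url)
  .map((l) => l.name);
console.log(withPicture); // [ 'Garden flat' ]

// In mongosh, against the listingsAndReviews collection.
if (typeof db !== "undefined") {
  printjson(db.listingsAndReviews.find(hostPictureFilter, listingProjection).limit(5).toArray());
}
```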
To find all listings with listing_url, name, address and host_picture_url that have a host
with a picture URL, you can use the MongoDB aggregation pipeline. The pipeline below assumes
that listings and hosts are stored in two separate collections, "listings" and "hosts", and
joins them with the $lookup stage.
db.listings.aggregate([
{
$lookup: {
from: "hosts",
localField: "host_id",
foreignField: "_id",
as: "host"
}
},
{
$match: {
"host.host_picture_url": { $exists: true, $ne: null }
}
},
{
$project: {
_id: 0,
listing_url: 1,
name: 1,
address: 1,
host_picture_url: "$host.host_picture_url"
}
}
])
Explanation:
- $lookup: This stage performs a left outer join to the "hosts" collection based on the
host_id field in the "listings" collection. The matching host documents are placed in the
host array.
- $match: Keeps listings for which a joined host has a host_picture_url that exists and is not null.
- $project: Reshapes the output documents to include only the required fields from the
"listings" collection and the host_picture_url from the "hosts" collection. Because host is
an array, host_picture_url is returned as an array of URLs.
To display reviews summary using an e-commerce collection, you can use MongoDB
aggregation functions to calculate various summary statistics like average rating, total
reviews count, etc. Here's an example query:
db.products.aggregate([
{
$project: {
_id: 0,
name: 1,
average_rating: { $avg: "$reviews.rating" },
total_reviews: { $size: "$reviews" },
five_star_reviews: {
$size: {
$filter: {
input: "$reviews",
cond: { $eq: ["$$this.rating", 5] }
}
}
},
one_star_reviews: {
$size: {
$filter: {
input: "$reviews",
cond: { $eq: ["$$this.rating", 1] }
}
}
}
}
}
])
Explanation:
$project: Projects the required fields and performs calculations on the reviews array.
average_rating: Calculates the average rating using $avg.
total_reviews: Counts the number of reviews using $size.
five_star_reviews: Counts the number of reviews with a rating of 5 using $filter and $size.
one_star_reviews: Counts the number of reviews with a rating of 1 using $filter and
$size.
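$size raises an error when its argument is not an array, so a product without a reviews field would stop the pipeline above. One possible safeguard, sketched here, wraps the array in $ifNull. The product document is hypothetical and the plain-JavaScript summary mirrors the result for that one document only.

```javascript
// Fall back to an empty array when reviews is missing or null.
const safeReviews = { $ifNull: ["$reviews", []] };
const reviewSummaryPipeline = [
  { $project: {
      _id: 0,
      name: 1,
      average_rating: { $avg: "$reviews.rating" },
      total_reviews: { $size: safeReviews },
      five_star_reviews: {
        $size: { $filter: { input: safeReviews, as: "review", cond: { $eq: ["$$review.rating", 5] } } }
      }
  } }
];

// Hypothetical product and a plain-JavaScript mirror of the three values.
const sampleProduct = {
  name: "Kettle",
  reviews: [{ rating: 5 }, { rating: 4 }, { rating: 5 }, { rating: 1 }]
};
const ratings = (sampleProduct.reviews || []).map((r) => r.rating);
const mirroredSummary = {
  average_rating: ratings.reduce((a, b) => a + b, 0) / ratings.length,
  total_reviews: ratings.length,
  five_star_reviews: ratings.filter((r) => r === 5).length
};
console.log(mirroredSummary); // { average_rating: 3.75, total_reviews: 4, five_star_reviews: 2 }

if (typeof db !== "undefined") {
  printjson(db.products.aggregate(reviewSummaryPipeline).toArray());
}
```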
Output
Experiment No: 08
Indexes in MongoDB
Aim:
a. Demonstrate creation of different types of indexes on collection (unique, sparse,
compound and multikey indexes)
b. Demonstrate optimization of queries using indexes.
Unique Index: Ensures that the indexed fields have unique values across the collection.
db.collection.createIndex({ "email": 1 }, { unique: true })
Sparse Index: Indexes only documents that contain the indexed field, omitting
documents that lack the indexed field.
db.collection.createIndex({ "age": 1 }, { sparse: true })
Compound Index: Indexes multiple fields together. Useful for queries that involve
multiple fields.
db.collection.createIndex({ "name": 1, "age": -1 })
Multikey Index: Indexes arrays, allowing queries to efficiently match elements within
arrays.
db.collection.createIndex({ "tags": 1 })
Optimization of Queries Using Indexes:
Optimizing queries involves identifying which fields are commonly used in queries and
creating indexes on those fields. Let's consider a scenario where we have a collection of
books and we frequently query based on the author's name and the publication year.
// Create indexes
db.books.createIndex({ "author": 1 })
db.books.createIndex({ "publication_year": 1 })
// Query optimization
// Example 1: Query by author
db.books.find({ "author": "J.K. Rowling" })
In the above examples, we create indexes on the fields "author" and "publication_year".
These indexes will significantly speed up queries that involve filtering by these fields.
When executing queries, MongoDB's query optimizer will automatically select the most
efficient index to use based on the query predicates. By creating appropriate indexes, we
can ensure that queries execute efficiently, improving overall database performance.
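Whether a query really uses an index can be checked with explain("executionStats"). The sketch below walks the winning plan and lists its stages: IXSCAN indicates an index scan, COLLSCAN a full collection scan. The helper planStages and the two sample plans are hypothetical; the exact layout of the explain output varies between server versions, so treat the field paths as assumptions to verify.

```javascript
// Collect the stage names of a plan by following its inputStage links.
function planStages(plan) {
  const stages = [];
  for (let node = plan; node; node = node.inputStage) {
    stages.push(node.stage);
  }
  return stages;
}

// Hypothetical plan shapes, of the kind reported with and without an index.
const indexedPlan = { stage: "FETCH", inputStage: { stage: "IXSCAN", indexName: "author_1" } };
const unindexedPlan = { stage: "COLLSCAN" };
console.log(planStages(indexedPlan));   // [ 'FETCH', 'IXSCAN' ]
console.log(planStages(unindexedPlan)); // [ 'COLLSCAN' ]

// In mongosh: compare keys examined and documents examined for the query.
if (typeof db !== "undefined") {
  const report = db.books.find({ author: "J.K. Rowling" }).explain("executionStats");
  const winning = report.queryPlanner.winningPlan;
  printjson({
    stages: planStages(winning.queryPlan || winning), // some versions nest the plan
    keysExamined: report.executionStats.totalKeysExamined,
    docsExamined: report.executionStats.totalDocsExamined
  });
}
```

When the index is used, the number of documents examined should be close to the number of documents returned rather than the size of the collection.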
Output
Experiment No: 09
Text Search
Aim:
a. Develop a query to demonstrate Text search using catalog data collection for a
given word
b. Develop queries to illustrate excluding documents with certain words and phrases
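A minimal sketch for part (a), assuming a catalog collection whose documents have a description field. The two sample documents are hypothetical. The regular expression only approximates a whole-word match for the preview; the server's text search also applies stemming and stop-word rules.

```javascript
// Hypothetical catalog documents.
const sampleCatalog = [
  { name: "Smartphone X", description: "A phone with a large display" },
  { name: "Desk lamp", description: "LED lamp for reading" }
];
// Text search filter for the word "phone".
const textFilter = { $text: { $search: "phone" } };

// Rough plain-JavaScript preview: descriptions containing the word "phone".
const previewMatches = sampleCatalog
  .filter((d) => /\bphone\b/i.test(d.description))
  .map((d) => d.name);
console.log(previewMatches); // [ 'Smartphone X' ]

// In mongosh: create the text index first, then search.
if (typeof db !== "undefined") {
  db.catalog.insertMany(sampleCatalog);
  db.catalog.createIndex({ description: "text" });
  printjson(db.catalog.find(textFilter).toArray());
}
```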
Explanation:
• We create a text index on the "description" field to enable text search.
• We perform a text search using the $text operator and the $search operator. This
query will return all documents where the "description" field contains the word
"phone".
Let's say we want to exclude documents that contain certain words or phrases from our
search results. We can achieve this using the $not operator combined with regular
expressions.
// Find documents not containing the phrase "out of stock" in the description
db.catalog.find({ description: { $not: /out of stock/i } })
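Explanation:
• The $not operator with a case-insensitive regular expression matches documents whose
"description" field does not contain the phrase "out of stock". Documents that have no
"description" field also match.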
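Within a $text search, a word can also be excluded by prefixing it with a hyphen-minus. The sketch below combines that with the regular-expression exclusion of a phrase. The sample documents and the excluded word "refurbished" are hypothetical, and the plain-JavaScript filter only approximates the combined condition.

```javascript
// Hypothetical catalog documents.
const stockedCatalog = [
  { name: "Phone A", description: "Budget phone with dual SIM" },
  { name: "Phone B", description: "Refurbished phone, like new" },
  { name: "Phone C", description: "Flagship phone, currently out of stock" }
];

// Search for "phone", exclude the word "refurbished" and the phrase "out of stock".
const exclusionFilter = {
  $text: { $search: "phone -refurbished" },
  description: { $not: /out of stock/i }
};

// Rough plain-JavaScript preview of the combined condition.
const keptNames = stockedCatalog
  .filter((d) => /\bphone\b/i.test(d.description))
  .filter((d) => !/\brefurbished\b/i.test(d.description))
  .filter((d) => !/out of stock/i.test(d.description))
  .map((d) => d.name);
console.log(keptNames); // [ 'Phone A' ]

// In mongosh: $text needs a text index on the searched field.
if (typeof db !== "undefined") {
  db.catalog.createIndex({ description: "text" });
  printjson(db.catalog.find(exclusionFilter).toArray());
}
```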
Output
Experiment No: 10
Text Search
Aim:
Develop an aggregation pipeline to illustrate Text search on Catalog data collection.
To perform text search using the aggregation pipeline in MongoDB, we can use the
$match stage along with the $text operator. Here's how you can develop an aggregation
pipeline to illustrate text search on the "catalog" data collection:
db.catalog.aggregate([
{
$match: {
$text: {
$search: "phone"
}
}
}
])
Explanation:
• We use the $match stage to filter documents based on the text search criteria.
• Within the $match stage, we use the $text operator to perform the text search.
• The $search parameter specifies the word or phrase to search for. In this case,
we're searching for the word "phone".
This aggregation pipeline will return all documents from the "catalog" collection in which
a text-indexed field (here "description") contains the word "phone". Make sure you have
created a text index on the "description" field in the "catalog" collection before executing
this aggregation pipeline; $text requires a text index and fails without one.
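A possible extension of this pipeline ranks the matches by relevance. The sketch below adds the text score with $meta, sorts on it and trims the output. The projected field names (name, description) and the limit of 5 are assumptions about the catalog documents. A $match stage that contains $text has to be the first stage of the pipeline.

```javascript
const textSearchPipeline = [
  { $match: { $text: { $search: "phone" } } },          // must be the first stage
  { $addFields: { score: { $meta: "textScore" } } },    // relevance score of each match
  { $sort: { score: -1 } },                             // most relevant first
  { $project: { _id: 0, name: 1, description: 1, score: 1 } },
  { $limit: 5 }
];

// Plain-JavaScript check that the pipeline starts with a $text match.
const firstStage = textSearchPipeline[0];
const startsWithTextMatch = "$match" in firstStage && "$text" in firstStage.$match;
console.log(startsWithTextMatch); // true

// In mongosh, against the catalog collection with its text index in place.
if (typeof db !== "undefined") {
  printjson(db.catalog.aggregate(textSearchPipeline).toArray());
}
```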
Output