MongoDB Manual PDF
MongoDB Manual PDF
MongoDB Manual PDF
Release 2.4.6
Contents
Install MongoDB
1.1 Installation Guides . . . .
1.2 Upgrade MongoDB . . .
1.3 Release Notes . . . . . .
1.4 First Steps with MongoDB
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
3
3
23
26
26
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
35
35
37
66
91
.
.
.
.
.
.
.
.
Data Models
117
3.1 Background . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 117
3.2 Data Modeling Patterns . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 124
Administration
135
4.1 Administration Concepts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 135
4.2 Administration Tutorials . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 163
4.3 Administration Reference . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 217
Security
5.1 Security Introduction
5.2 Security Concepts .
5.3 Security Tutorials . .
5.4 Security Reference .
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
235
235
237
242
264
Aggregation
6.1 Aggregation Introduction
6.2 Aggregation Concepts . .
6.3 Aggregation Examples . .
6.4 Aggregation Reference . .
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
275
275
279
290
306
Indexes
7.1 Index Introduction
7.2 Index Concepts . .
7.3 Indexing Tutorials
7.4 Indexing Reference
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
313
313
318
338
374
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
377
377
381
419
467
Sharding
9.1 Sharding Introduction . .
9.2 Sharding Concepts . . . .
9.3 Sharded Cluster Tutorials
9.4 Sharding Reference . . .
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
493
493
498
521
563
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
581
581
584
594
596
600
605
609
613
616
Replication
8.1 Replication Introduction
8.2 Replication Concepts . .
8.3 Replica Set Tutorials . .
8.4 Replication Reference .
11 Reference
621
11.1 MongoDB Interface . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 621
11.2 Architecture and Components . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 924
11.3 General Reference . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1009
12 Release Notes
12.1 Current Stable Release . . . . .
12.2 Previous Stable Releases . . . .
12.3 Current Development Series . .
12.4 Other MongoDB Release Notes
12.5 MongoDB Version Numbers . .
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
1029
1029
1050
1075
1089
1090
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
1091
1091
1091
1092
1092
1092
Index
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
1107
See About MongoDB Documentation (page 1091) for more information about the MongoDB Documentation project,
this Manual and additional editions of this text.
Contents
Contents
CHAPTER 1
Install MongoDB
Package Options
The MongoDB downloads repository contains two packages:
mongo-10gen-server
This package contains the mongod (page 925) and mongos (page 938) daemons from the latest stable release
and associated configuration and init scripts. Additionally, you can use this package to install daemons from a
previous release (page 4) of MongoDB.
mongo-10gen
This package contains all MongoDB tools from the latest stable release. Additionally, you can use this package
to install tools from a previous release (page 4) of MongoDB. Install this package on all production MongoDB
hosts and optionally on other systems from which you may need to administer MongoDB systems.
Install MongoDB
Configure Package Management System (YUM)
Create a /etc/yum.repos.d/mongodb.repo file to hold the following configuration information for the MongoDB repository:
If you are running a 64-bit system, which is recommended, use the following configuration:
[mongodb]
name=MongoDB Repository
baseurl=http://downloads-distro.mongodb.org/repo/redhat/os/x86_64/
gpgcheck=0
enabled=1
If you are running a 32-bit system, which is not recommended for production deployments, use the following configuration:
[mongodb]
name=MongoDB Repository
baseurl=http://downloads-distro.mongodb.org/repo/redhat/os/i686/
gpgcheck=0
enabled=1
Install Packages
Issue the following command (as root or with sudo) to install the latest stable version of MongoDB and the associated tools:
yum install mongo-10gen mongo-10gen-server
You can use the mongo-10gen and mongo-10gen-server packages to install previous releases of MongoDB.
To install a specific release, append the version number, as in the following example:
yum install mongo-10gen-2.2.3 mongo-10gen-server-2.2.3
This installs the mongo-10gen and mongo-10gen-server packages with the 2.2.3 release. You can specify
any available version of MongoDB; however yum will upgrade the mongo-10gen and mongo-10gen-server
packages when a newer version becomes available. Use the following pinning procedure to prevent unintended upgrades.
4
Configure MongoDB
These packages configure MongoDB using the /etc/mongod.conf file in conjunction with the control script. You
can find the init script at /etc/rc.d/init.d/mongod.
This MongoDB instance will store its data files in the /var/lib/mongo and its log files in /var/log/mongo,
and run using the mongod user account.
Note: If you change the user that runs the MongoDB process, you will need to modify the access control rights to the
/var/lib/mongo and /var/log/mongo directories.
Control MongoDB
Warning: With the introduction of systemd in Fedora 15, the control scripts included in the packages available
in the MongoDB downloads repository are not compatible with Fedora systems. A correction is forthcoming,
see SERVER-7285a for more information, and in the mean time use your own control scripts or install using the
procedure outlined in Install MongoDB on Linux (page 11).
a https://jira.mongodb.org/browse/SERVER-7285
Start MongoDB
Start the mongod (page 925) process by issuing the following command (as root, or with sudo):
service mongod start
You can verify that the mongod (page 925) process has started successfully by checking the contents of the log file at
/var/log/mongo/mongod.log.
You may optionally, ensure that MongoDB will start following a system reboot, by issuing the following command
(with root privileges:)
chkconfig mongod on
Stop MongoDB
Stop the mongod (page 925) process by issuing the following command (as root, or with sudo):
service mongod stop
Restart MongoDB
You can restart the mongod (page 925) process by issuing the following command (as root, or with sudo):
service mongod restart
Follow the state of this process by watching the output in the /var/log/mongo/mongod.log file to watch for
errors or important messages from the server.
Control mongos
As of the current release, there are no control scripts for mongos (page 938). mongos (page 938) is only used in
sharding deployments and typically do not run on the same systems where mongod (page 925) runs. You can use the
mongodb script referenced above to derive your own mongos (page 938) control script.
SELinux Considerations
You must SELinux to allow MongoDB to start on Fedora systems. Administrators have two options:
enable access to the relevant ports (e.g. 27017) for SELinux. See Configuration Options (page 239) for more
information on MongoDBs default ports.
disable SELinux entirely. This requires a system reboot and may have larger implications for your deployment.
Using MongoDB
Among the tools included in the mongo-10gen package, is the mongo (page 942) shell. You can connect to your
MongoDB instance by issuing the following command at the system prompt:
mongo
This will connect to the database running on the localhost interface by default. At the mongo (page 942) prompt, issue
the following two commands to insert a record in the test collection of the (default) test database and then retrieve
that document.
db.test.save( { a: 1 } )
db.test.find()
See also:
mongo (page 942) and mongo Shell Methods (page 806)
Package Options
The MongoDB downloads repository provides the mongodb-10gen package, which contains the latest stable release. Additionally you can install previous releases (page 7) of MongoDB.
You cannot install this package concurrently with the mongodb, mongodb-server, or mongodb-clients packages provided by Ubuntu.
Install MongoDB
Configure Package Management System (APT)
The Ubuntu package management tool (i.e. dpkg and apt) ensure package consistency and authenticity by requiring
that distributors sign packages with GPG keys. Issue the following command to import the MongoDB public GPG
Key1 :
sudo apt-key adv --keyserver hkp://keyserver.ubuntu.com:80 --recv 7F0CEB10
Install Packages
Issue the following command to install the latest stable version of MongoDB:
sudo apt-get install mongodb-10gen
When this command completes, you have successfully installed MongoDB! Continue for configuration and start-up
suggestions.
Manage Installed Versions
You can use the mongodb-10gen package to install previous versions of MongoDB. To install a specific release,
append the version number to the package name, as in the following example:
1 http://docs.mongodb.org/10gen-gpg-key.asc
This will install the 2.2.3 release of MongoDB. You can specify any available version of MongoDB; however
apt-get will upgrade the mongodb-10gen package when a newer version becomes available. Use the following
pinning procedure to prevent unintended upgrades.
To pin a package, issue the following command at the system prompt to pin the version of MongoDB at the currently
installed version:
echo "mongodb-10gen hold" | sudo dpkg --set-selections
Configure MongoDB
These packages configure MongoDB using the /etc/mongodb.conf file in conjunction with the control script.
You will find the control script is at /etc/init.d/mongodb.
This MongoDB instance will store its data files in the /var/lib/mongodb and its log files in
/var/log/mongodb, and run using the mongodb user account.
Note: If you change the user that runs the MongoDB process, you will need to modify the access control rights to the
/var/lib/mongodb and /var/log/mongodb directories.
Controlling MongoDB
Starting MongoDB
You can start the mongod (page 925) process by issuing the following command:
sudo service mongodb start
You can verify that mongod (page 925) has started successfully by checking the contents of the log file at
/var/log/mongodb/mongodb.log.
Stopping MongoDB
As needed, you may stop the mongod (page 925) process by issuing the following command:
sudo service mongodb stop
Restarting MongoDB
You may restart the mongod (page 925) process by issuing the following command:
sudo service mongodb restart
Controlling mongos
As of the current release, there are no control scripts for mongos (page 938). mongos (page 938) is only used in
sharding deployments and typically do not run on the same systems where mongod (page 925) runs. You can use the
mongodb script referenced above to derive your own mongos (page 938) control script.
Using MongoDB
Among the tools included with the MongoDB package, is the mongo (page 942) shell. You can connect to your
MongoDB instance by issuing the following command at the system prompt:
mongo
This will connect to the database running on the localhost interface by default. At the mongo (page 942) prompt, issue
the following two commands to insert a record in the test collection of the (default) test database.
db.test.save( { a: 1 } )
db.test.find()
See also:
mongo (page 942) and mongo Shell Methods (page 806)
Package Options
The downloads repository provides the mongodb-10gen package, which contains the latest stable release. Additionally you can install previous releases (page 10) of MongoDB.
You cannot install this package concurrently with the mongodb, mongodb-server, or mongodb-clients packages that your release of Debian may include.
Install MongoDB
Configure Package Management System (APT)
The Debian package management tool (i.e. dpkg and apt) ensure package consistency and authenticity by requiring
that distributors sign packages with GPG keys. Issue the following command to import the MongoDB public GPG
Key2 :
sudo apt-key adv --keyserver keyserver.ubuntu.com --recv 7F0CEB10
Install Packages
Issue the following command to install the latest stable version of MongoDB:
sudo apt-get install mongodb-10gen
You can use the mongodb-10gen package to install previous versions of MongoDB. To install a specific release,
append the version number to the package name, as in the following example:
apt-get install mongodb-10gen=2.2.3
This will install the 2.2.3 release of MongoDB. You can specify any available version of MongoDB; however
apt-get will upgrade the mongodb-10gen package when a newer version becomes available. Use the following
pinning procedure to prevent unintended upgrades.
To pin a package, issue the following command at the system prompt to pin the version of MongoDB at the currently
installed version:
echo "mongodb-10gen hold" | sudo dpkg --set-selections
Configure MongoDB
These packages configure MongoDB using the /etc/mongodb.conf file in conjunction with the control script.
You can find the control script at /etc/init.d/mongodb.
This MongoDB instance will store its data files in the /var/lib/mongodb and its log files in
/var/log/mongodb, and run using the mongodb user account.
Note: If you change the user that runs the MongoDB process, you will need to modify the access control rights to the
/var/lib/mongodb and /var/log/mongodb directories.
2 http://docs.mongodb.org/10gen-gpg-key.asc
10
Controlling MongoDB
Starting MongoDB
You can verify that mongod (page 925) has started successfully by checking the contents of the log file at
/var/log/mongodb/mongodb.log.
Stopping MongoDB
Restarting MongoDB
Controlling mongos
As of the current release, there are no control scripts for mongos (page 938). mongos (page 938) is only used in
sharding deployments and typically do not run on the same systems where mongod (page 925) runs. You can use the
mongodb script referenced above to derive your own mongos (page 938) control script.
Using MongoDB
Among the tools included with the MongoDB package, is the mongo (page 942) shell. You can connect to your
MongoDB instance by issuing the following command at the system prompt:
mongo
This will connect to the database running on the localhost interface by default. At the mongo (page 942) prompt, issue
the following two commands to insert a record in the test collection of the (default) test database.
db.test.save( { a: 1 } )
db.test.find()
See also:
mongo (page 942) and mongo Shell Methods (page 806)
11
See
Additional installation tutorials:
Install MongoDB on Red Hat Enterprise, CentOS, or Fedora Linux (page 3)
Install MongoDB on Ubuntu (page 6)
Install MongoDB on Debian (page 9)
Install MongoDB on OS X (page 13)
Install MongoDB on Windows (page 16)
Download MongoDB
Note: You should place the MongoDB binaries in a central location on the file system that is easy to access and
control. Consider http://docs.mongodb.org/manualopt or /usr/local/bin.
In a terminal session, begin by downloading the latest release. In most cases you will want to download the 64-bit
version of MongoDB.
curl http://downloads.mongodb.org/linux/mongodb-linux-x86_64-2.4.6.tgz > mongodb.tgz
If you need to run the 32-bit version, use the following command.
curl http://downloads.mongodb.org/linux/mongodb-linux-i686-2.4.6.tgz > mongodb.tgz
Once youve downloaded the release, issue the following command to extract the files from the archive:
tar -zxvf mongodb.tgz
Optional
You may use the following command to copy the extracted folder into a more generic location.
cp -R -n
mongodb-linux-????-??-??/ mongodb
You can find the mongod (page 925) binary, and the binaries all of the associated MongoDB utilities, in the bin/
directory within the extracted directory.
Using MongoDB
Before you start mongod (page 925) for the first time, you will need to create the data directory. By default, mongod
(page 925) writes data to the /data/db/ directory. To create this directory, use the following command:
mkdir -p /data/db
Note: Ensure that the system account that will run the mongod (page 925) process has read and write permissions to
this directory. If mongod (page 925) runs under the mongodb user account, issue the following command to change
the owner of this folder:
chown mongodb /data/db
12
If you use an alternate location for your data directory, ensure that this user can write to your chosen data path.
You can specify, and create, an alternate path using the --dbpath option to mongod (page 925) and the above
command.
The official builds of MongoDB contain no control scripts or method to control the mongod (page 925) process. You
may wish to create control scripts, modify your path, and/or create symbolic links to the MongoDB programs in your
/usr/local/bin or /usr/bin directory for easier use.
For testing purposes, you can start a mongod (page 925) directly in the terminal without creating a control script:
mongod --config /etc/mongod.conf
Note: This command assumes that the mongod (page 925) binary is accessible via your systems search path. You
may use modified form to invoke any mongod (page 925) binary. Furthermore, the command assumes that you
have created a default configuration file located at /etc/mongod.conf. See Run-time Database Configuration
(page 145) for more information on the format of configuration files.
You must mongod (page 925) with a user account that has read and write permissions to the dbpath (page 993).
Among the tools included with this MongoDB distribution, is the mongo (page 942) shell. You can use this shell to
connect to your MongoDB instance by issuing the following command at the system prompt:
./bin/mongo
Note: The ./bin/mongo command assumes that the mongo (page 942) binary is in the bin/ sub-directory of the
current directory. This is the directory into which you extracted the .tgz file.
This will connect to the database running on the localhost interface by default. At the mongo (page 942) prompt, issue
the following two commands to insert a record in the test collection of the (default) test database and then retrieve
that record:
db.test.save( { a: 1 } )
db.test.find()
See also:
mongo (page 942) and mongo Shell Methods (page 806)
Synopsis
This tutorial outlines the basic installation process for deploying MongoDB on Macintosh OS X systems. This tutorial
provides two main methods of installing the MongoDB server (i.e. mongod (page 925)) and associated tools: first
using the official MongoDB builds, and second using community package management tools.
See
13
In a terminal session, begin by downloading the latest release. Use the following command at the system prompt:
curl http://downloads.mongodb.org/osx/mongodb-osx-x86_64-2.4.6.tgz > mongodb.tgz
Note: The mongod (page 925) process will not run on older Macintosh computers with PowerPC (i.e. non-Intel)
processors.
Once youve downloaded the release, issue the following command to extract the files from the archive:
tar -zxvf mongodb.tgz
Optional
You may use the following command to move the extracted folder into a more generic location.
mv -n mongodb-osx-[platform]-[version]/ /path/to/new/location/
Replace [platform] with i386 or x86_64 depending on your system and the version you downloaded, and
[version] with 2.4 or the version of MongoDB that you are installing.
You can find the mongod (page 925) binary, and the binaries all of the associated MongoDB utilities, in the bin/
directory within the archive.
Use MongoDB from Official Builds
Before you start mongod (page 925) for the first time, you will need to create the data directory. By default, mongod
(page 925) writes data to the /data/db/ directory. To create this directory, and set the appropriate permissions use
the following commands:
sudo mkdir -p /data/db
sudo chown `id -u` /data/db
You can specify an alternate path for data files using the --dbpath option to mongod (page 925).
The official MongoDB builds contain no control scripts or method to control the mongod (page 925) process. You
may wish to create control scripts, modify your path, and/or create symbolic links to the MongoDB programs in your
/usr/local/bin directory for easier use.
For testing purposes, you can start a mongod (page 925) directly in the terminal without creating a control script:
14
Note: This command assumes that the mongod (page 925) binary is accessible via your systems search path. You
may use modified form to invoke any mongod (page 925) binary. Furthermore, the command assumes that you have
created a default configuration file located at /etc/mongod.conf.
See Run-time Database Configuration (page 145) for more information on the format of configuration files.
Among the tools included with this MongoDB distribution, is the mongo (page 942) shell. You can use this shell
to connect to your MongoDB instance by issuing the following command at the system prompt from inside of the
directory where you extracted mongo (page 942):
./bin/mongo
Note: The ./bin/mongo command assumes that the mongo (page 942) binary is in the bin/ sub-directory of the
current directory. This is the directory into which you extracted the .tgz file.
This will connect to the database running on the localhost interface by default. At the mongo (page 942) prompt, issue
the following two commands to insert a record in the test collection of the (default) test database and then retrieve
that record:
db.test.save( { a: 1 } )
db.test.find()
See also:
mongo (page 942) and mongo Shell Methods (page 806)
Install with Package Management
Both community package management tools: Homebrew3 and MacPorts4 require some initial setup and configuration.
This configuration is beyond the scope of this document. You only need to use one of these tools.
If you want to use package management, and do not already have a system installed, Homebrew is typically easier and
simpler to use.
Homebrew
Homebrew installs binary packages based on published formula. Issue the following command at the system shell
to update the brew package manager:
brew update
Use the following command to install the MongoDB package into your Homebrew system.
brew install mongodb
Later, if you need to upgrade MongoDB, you can issue the following sequence of commands to update the MongoDB
installation on your system:
brew update
brew upgrade mongodb
3 http://mxcl.github.com/homebrew/
4 http://www.macports.org/
15
MacPorts
MacPorts distributes build scripts that allow you to easily build packages and their dependencies on your own system.
The compilation process can take significant period of time depending on your systems capabilities and existing
dependencies. Issue the following command in the system shell:
port install mongodb
The packages installed with Homebrew and MacPorts contain no control scripts or interaction with the systems
process manager.
If you have configured Homebrew and MacPorts correctly, including setting your PATH, the MongoDB applications
and utilities will be accessible from the system shell. Start the mongod (page 925) process in a terminal (for testing
or development) or using a process management tool.
mongod
Then open the mongo (page 942) shell by issuing the following command at the system prompt:
mongo
This will connect to the database running on the localhost interface by default. At the mongo (page 942) prompt, issue
the following two commands to insert a record in the test collection of the (default) test database and then retrieve
that record.
> db.test.save( { a: 1 } )
> db.test.find()
See also:
mongo (page 942) and mongo Shell Methods (page 806)
16
Download the latest production release of MongoDB from the MongoDB downloads page6 .
There are three builds of MongoDB for Windows:
MongoDB for Windows Server 2008 R2 edition (i.e. 2008R2) only runs on Windows Server 2008 R2, Windows
7 64-bit, and newer versions of Windows. This build takes advantage of recent enhancements to the Windows
Platform and cannot operate on older versions of Windows.
MongoDB for Windows 64-bit runs on any 64-bit version of Windows newer than Windows XP, including
Windows Server 2008 R2 and Windows 7 64-bit.
MongoDB for Windows 32-bit runs on any 32-bit version of Windows newer than Windows XP. 32-bit versions
of MongoDB are only intended for older systems and for use in testing and development systems.
Changed in version 2.2: MongoDB does not support Windows XP. Please use a more recent version of Windows to
use more recent releases of MongoDB.
Note: Always download the correct version of MongoDB for your Windows system. The 64-bit versions of MongoDB will not work with 32-bit Windows.
32-bit versions of MongoDB are suitable only for testing and evaluation purposes and only support databases smaller
than 2GB.
You can find the architecture of your version of Windows platform using the following command in the Command
Prompt:
wmic os get osarchitecture
In Windows Explorer, find the MongoDB download file, typically in the default Downloads directory. Extract the
archive to C:\ by right clicking on the archive and selecting Extract All and browsing to C:\.
Note: The folder name will be either:
C:\mongodb-win32-i386-[version]
Or:
C:\mongodb-win32-x86_64-[version]
Start the Command Prompt by selecting the Start Menu, then All Programs, then Accessories, then right click Command Prompt, and select Run as Administrator from the popup menu. In the Command Prompt, issue the following
commands:
cd \
move C:\mongodb-win32-* C:\mongodb
Note: MongoDB is self-contained and does not have any other system dependencies. You can run MongoDB from
any folder you choose. You may install MongoDB in any directory (e.g. D:\test\mongodb)
6 http://www.mongodb.org/downloads
17
MongoDB requires a data folder to store its files. The default location for the MongoDB data directory is
C:\data\db. Create this folder using the Command Prompt. Issue the following command sequence:
md data
md data\db
Note: You may specify an alternate path for \data\db with the dbpath (page 993) setting for mongod.exe
(page 948), as in the following example:
C:\mongodb\bin\mongod.exe --dbpath d:\test\mongodb\data
If your path includes spaces, enclose the entire path in double quotations, for example:
C:\mongodb\bin\mongod.exe --dbpath "d:\test\mongo db data"
Start MongoDB
This will start the main MongoDB database process. The waiting for connections message in the console
output indicates that the mongod.exe process is running successfully.
Note: Depending on the security level of your system, Windows will issue a Security Alert dialog box about blocking
some features of C:\\mongodb\bin\mongod.exe from communicating on networks. All users should select
Private Networks, such as my home or work network and click Allow access. For additional
information on security and MongoDB, please read the Security Concepts (page 237) page.
Warning: Do not allow mongod.exe (page 948) to be accessible to public networks without running in Secure
Mode (i.e. auth (page 993).) MongoDB is designed to be run in trusted environments and the database does
not enable authentication or Secure Mode by default.
Connect to MongoDB using the mongo.exe (page 942) shell. Open another Command Prompt and issue the following command:
C:\mongodb\bin\mongo.exe
Note: Executing the command start C:\mongodb\bin\mongo.exe will automatically start the mongo.exe
shell in a separate Command Prompt window.
The mongo.exe (page 942) shell will connect to mongod.exe (page 948) running on the localhost interface and
port 27017 by default. At the mongo.exe (page 942) prompt, issue the following two commands to insert a record
in the test collection of the default test database and then retrieve that record:
db.test.save( { a: 1 } )
db.test.find()
See also:
mongo (page 942) and mongo Shell Methods (page 806). If you want to develop applications using .NET, see the
documentation of C# and MongoDB7 for more information.
7 http://docs.mongodb.org/ecosystem/drivers/csharp
18
You should specify two options when running MongoDB as a Windows Service: a path for the log output (i.e.
logpath (page 992)) and a configuration file (page 990).
1. Create a specific directory for MongoDB log files:
md C:\mongodb\log
2. Create a configuration file for the logpath (page 992) option for MongoDB in the Command Prompt by
issuing this command:
echo logpath=C:\mongodb\log\mongo.log > C:\mongodb\mongod.cfg
While these optional steps are optional, creating a specific location for log files and using the configuration file are
good practice.
Note: Consider setting the logappend (page 992) option. If you do not, mongod.exe (page 948) will delete the
contents of the existing log file when starting.
Changed in version 2.2: The default logpath (page 992) and logappend (page 992) behavior changed in the 2.2
release.
Run all of the following commands in Command Prompt with Administrative Privileges:
1. To install the MongoDB service:
C:\mongodb\bin\mongod.exe --config C:\mongodb\mongod.cfg --install
Modify the path to the mongod.cfg file as needed. For the --install option to succeed, you must specify
a logpath (page 992) setting or the --logpath run-time option.
2. To run the MongoDB service:
net start MongoDB
Note:
If you wish to use an alternate path for your dbpath (page 993) specify it in the config file (e.g.
C:\mongodb\mongod.cfg) on that you specified in the --install operation. You may also specify
--dbpath on the command line; however, always prefer the configuration file.
If the dbpath (page 993) directory does not exist, mongod.exe (page 948) will not be able to start. The default
value for dbpath (page 993) is \data\db.
19
sudo apt-get install libssl0.9.8 snmp snmpd cyrus-sasl2-dbg cyrus-sasl2-mit-dbg libsasl2-2 libsa
CentOS and Red Hat Enterprise Linux 6.x and 5.x, as well as Amazon Linux AMI require:
net-snmp, net-snmp-libs, openssl, net-snmp-utils, cyrus-sasl, cyrus-sasl-lib,
cyrus-sasl-devel, and cyrus-sasl-gssapi. Issue a command such as the following to install these
packages:
sudo yum install openssl net-snmp net-snmp-libs net-snmp-utils cyrus-sasl cyrus-sasl-lib cyrus-s
Before version 2.4.4, the 2.4 versions of MongoDB Enterprise use libgsasl9 . The required packages for the different
distributions are as follows:
Ubuntu 12.04 requires libssl0.9.8, libgsasl, snmp, and snmpd. Issue a command such as the following to install these packages:
8 http://www.mongodb.com/products/mongodb-enterprise
9 http://www.gnu.org/software/gsasl/
20
Red Hat Enterprise Linux 6.x series and Amazon Linux AMI require openssl, libgsasl7, net-snmp,
net-snmp-libs, and net-snmp-utils. To download libgsasl you must enable the EPEL repository
by issuing the following sequence of commands to add and update the system repositories:
sudo rpm -ivh http://download.fedoraproject.org/pub/epel/6/x86_64/epel-release-6-8.noarch.rpm
sudo yum update -y
When you have installed and updated the EPEL repositories, issue the following install these packages:
sudo yum install openssl net-snmp net-snmp-libs net-snmp-utils libgsasl
Note: Before 2.4.4, MongoDB Enterprise 2.4 for SUSE requires libgsasl10 which is not available in the default
repositories for SUSE.
Use the sequence of commands below to download and extract MongoDB Enterprise packages appropriate for your
distribution:
Ubuntu 12.04
21
Note: Ensure that the system account that will run the mongod (page 925) process has read and write permissions to
this directory. If mongod (page 925) runs under the mongodb user account, issue the following command to change
the owner of this folder:
chown mongodb /data/db
If you use an alternate location for your data directory, ensure that this user can write to your chosen data path.
You can specify, and create, an alternate path using the --dbpath option to mongod (page 925) and the above
command.
The official builds of MongoDB contain no control scripts or method to control the mongod (page 925) process. You
may wish to create control scripts, modify your path, and/or create symbolic links to the MongoDB programs in your
/usr/local/bin or /usr/bin directory for easier use.
For testing purposes, you can start a mongod (page 925) directly in the terminal without creating a control script:
mongod --config /etc/mongod.conf
Note: This command assumes that the mongod (page 925) binary is accessible via your systems search path. You
may use modified form to invoke any mongod (page 925) binary. Furthermore, the command assumes that you
have created a default configuration file located at /etc/mongod.conf. See Run-time Database Configuration
(page 145) for more information on the format of configuration files.
You must mongod (page 925) with a user account that has read and write permissions to the dbpath (page 993).
Among the tools included with this MongoDB distribution, is the mongo (page 942) shell. You can use this shell to
connect to your MongoDB instance by issuing the following command at the system prompt:
./bin/mongo
Note: The ./bin/mongo command assumes that the mongo (page 942) binary is in the bin/ sub-directory of the
current directory. This is the directory into which you extracted the .tgz file.
22
This will connect to the database running on the localhost interface by default. At the mongo (page 942) prompt, issue
the following two commands to insert a record in the test collection of the (default) test database and then retrieve
that record:
db.test.save( { a: 1 } )
db.test.find()
See also:
mongo (page 942) and mongo Shell Methods (page 806)
Further Reading
As you begin to use MongoDB, consider the Getting Started with MongoDB (page 26) and MongoDB Tutorials
(page 177) resources. To read about features only available in MongoDB Enterprise, consider: Monitor MongoDB
with SNMP (page 171) and Deploy MongoDB with Kerberos Authentication (page 259).
23
24
(b) cfg1.example.net
(c) cfg0.example.net
4. Upgrade each shard.
If a shard is a replica set, upgrade the shard using the procedure below titled Upgrade Replica Sets
(page 25).
If a shard is a standalone instance, upgrade the shard using the procedure below titled Upgrade a MongoDB
Instance (page 24).
5. Re-enable the balancer, as described in Enable the Balancer (page 553).
Upgrade Replica Sets
To upgrade a replica set, upgrade each member individually, starting with the secondaries and finishing with the
primary. Plan the upgrade during a predefined maintenance window.
Upgrade Secondaries
1. Step down the primary to initiate the normal failover (page 396) procedure. Using one of the following:
The rs.stepDown() (page 899) helper in the mongo (page 942) shell.
The replSetStepDown (page 730) database command.
During failover, the set cannot accept writes. Typically this takes 10-20 seconds. Plan the upgrade during a
predefined maintenance window.
Note: Stepping down the primary is preferable to directly shutting down the primary. Stepping down expedites
the failover procedure.
2. Once the primary has stepped down, call the rs.status() (page 898) method from the mongo (page 942)
shell until you see that another member has assumed the PRIMARY state.
3. Shut down the original primary and upgrade its instance by following the instructions below in Upgrade a
MongoDB Instance (page 24).
To upgrade to a new revision of a MongoDB major release, see Upgrade to the Latest Revision of MongoDB (page 23)
25
26
Connect to a Database
In this section, you connect to the database server, which runs as mongod (page 925), and begin using the mongo
(page 942) shell to select a logical database within the database instance and access the help text in the mongo
(page 942) shell.
Connect to a mongod
From a system prompt, start mongo (page 942) by issuing the mongo (page 942) command, as follows:
mongo
By default, mongo (page 942) looks for a database server listening on port 27017 on the localhost interface. To
connect to a server on a different port or interface, use the --port and --host options.
Select a Database
After starting the mongo (page 942) shell your session will use the test database by default. At any time, issue the
following operation at the mongo (page 942) to report the name of the current database:
db
1. From the mongo (page 942) shell, display the list of databases, with the following operation:
show dbs
3. Confirm that your session has the mydb database as context, by checking the value of the db object, which
returns the name of the current database, as follows:
db
At this point, if you issue the show dbs operation again, it will not include the mydb database. MongoDB
will not permanently create a database until you insert data into that database. The Create a Collection and
Insert Documents (page 27) section describes the process for inserting data.
New in version 2.4: show databases also returns a list of databases.
Display mongo Help
At any point, you can access help for the mongo (page 942) shell using the following operation:
help
Furthermore, you can append the .help() method to some JavaScript methods, any cursor object, as well as the db
and db.collection objects to return additional help information.
Create a Collection and Insert Documents
In this section, you insert documents into a new collection named testData within the new database named mydb.
27
MongoDB will create a collection implicitly upon its first use. You do not need to create a collection before inserting
data. Furthermore, because MongoDB uses dynamic schemas (page 582), you also need not specify the structure of
your documents before inserting them into the collection.
1. From the mongo (page 942) shell, confirm you are in the mydb database by issuing the following:
db
2. If mongo (page 942) does not return mydb for the previous operation, set the context to the mydb database,
with the following operation:
use mydb
3. Create two documents named j and k by using the following sequence of JavaScript operations:
j = { name : "mongo" }
k = { x : 3 }
4. Insert the j and k documents into the testData collection with the following sequence of operations:
db.testData.insert( j )
db.testData.insert( k )
When you insert the first document, the mongod (page 925) will create both the mydb database and the
testData collection.
5. Confirm that the testData collection exists. Issue the following operation:
show collections
The mongo (page 942) shell will return the list of the collections in the current (i.e. mydb) database. At this
point, the only collection is testData. All mongod (page 925) databases also have a system.indexes
(page 229) collection.
6. Confirm that the documents exist in the testData collection by issuing a query on the collection using the
find() (page 816) method:
db.testData.find()
This operation returns the following results. The ObjectId (page 103) values will be unique:
{ "_id" : ObjectId("4c2209f9f3924d31102bd84a"), "name" : "mongo" }
{ "_id" : ObjectId("4c2209fef3924d31102bd84b"), "x" : 3 }
All MongoDB documents must have an _id field with a unique value. These operations do not explicitly
specify a value for the _id field, so mongo (page 942) creates a unique ObjectId (page 103) value for the field
before inserting it into the collection.
Insert Documents using a For Loops or JavaScript Function
To perform the remaining procedures in this tutorial, first add more documents to your database using one or both of
the procedures described in Generate Test Data (page 31).
Working with the Cursor
When you query a collection, MongoDB returns a cursor object that contains the results of the query. The mongo
(page 942) shell then iterates over the cursor to display the results. Rather than returning all results at once, the shell
iterates over the cursor 20 times to display the first 20 results and then waits for a request to iterate over the remaining
results. In the shell, use enter it to iterate over the next set of results.
28
The procedures in this section show other ways to work with a cursor. For comprehensive documentation on cursors,
see Iterate the Returned Cursor (page 820).
Iterate over the Cursor with a Loop
Before using this procedure, make sure to add at least 25 documents to a collection using one of the procedures in
Generate Test Data (page 31). You can name your database and collections anything you choose, but this procedure
will assume the database named test and a collection named testData.
1. In the MongoDB JavaScript shell, query the testData collection and assign the resulting cursor object to the
c variable:
var c = db.testData.find()
2. Print the full result set by using a while loop to iterate over the c variable:
while ( c.hasNext() ) printjson( c.next() )
The hasNext() function returns true if the cursor has documents. The next() method returns the next
document. The printjson() method renders the document in a JSON-like format.
The operation displays 20 documents. For example, if the documents have a single field named x, the operation
displays the field as well as each documents ObjectId:
{
{
{
{
{
{
{
{
{
{
{
{
{
{
{
{
{
{
{
{
"_id"
"_id"
"_id"
"_id"
"_id"
"_id"
"_id"
"_id"
"_id"
"_id"
"_id"
"_id"
"_id"
"_id"
"_id"
"_id"
"_id"
"_id"
"_id"
"_id"
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
ObjectId("51a7dc7b2cacf40b79990be6"),
ObjectId("51a7dc7b2cacf40b79990be7"),
ObjectId("51a7dc7b2cacf40b79990be8"),
ObjectId("51a7dc7b2cacf40b79990be9"),
ObjectId("51a7dc7b2cacf40b79990bea"),
ObjectId("51a7dc7b2cacf40b79990beb"),
ObjectId("51a7dc7b2cacf40b79990bec"),
ObjectId("51a7dc7b2cacf40b79990bed"),
ObjectId("51a7dc7b2cacf40b79990bee"),
ObjectId("51a7dc7b2cacf40b79990bef"),
ObjectId("51a7dc7b2cacf40b79990bf0"),
ObjectId("51a7dc7b2cacf40b79990bf1"),
ObjectId("51a7dc7b2cacf40b79990bf2"),
ObjectId("51a7dc7b2cacf40b79990bf3"),
ObjectId("51a7dc7b2cacf40b79990bf4"),
ObjectId("51a7dc7b2cacf40b79990bf5"),
ObjectId("51a7dc7b2cacf40b79990bf6"),
ObjectId("51a7dc7b2cacf40b79990bf7"),
ObjectId("51a7dc7b2cacf40b79990bf8"),
ObjectId("51a7dc7b2cacf40b79990bf9"),
"x"
"x"
"x"
"x"
"x"
"x"
"x"
"x"
"x"
"x"
"x"
"x"
"x"
"x"
"x"
"x"
"x"
"x"
"x"
"x"
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
1 }
2 }
3 }
4 }
5 }
6 }
7 }
8 }
9 }
10 }
11 }
12 }
13 }
14 }
15 }
16 }
17 }
18 }
19 }
20 }
The following procedure lets you manipulate a cursor object as if it were an array:
1. In the mongo (page 942) shell, query the testData collection and assign the resulting cursor object to the c
variable:
var c = db.testData.find()
2. To find the document at the array index 4, use the following operation:
29
printjson( c [ 4 ] )
When you access documents in a cursor using the array index notation, mongo (page 942) first calls the
cursor.toArray() method and loads into RAM all documents returned by the cursor. The index is then
applied to the resulting array. This operation iterates the cursor completely and exhausts the cursor.
For very large result sets, mongo (page 942) may run out of available memory.
For more information on the cursor, see Iterate the Returned Cursor (page 820).
Query for Specific Documents
MongoDB has a rich query system that allows you to select and filter the documents in a collection along specific
fields and values. See Query Documents (page 68) and Read Operations (page 39) for a full account of queries in
MongoDB.
In this procedure, you query for specific documents in the testData collection by passing a query document as a
parameter to the find() (page 816) method. A query document specifies the criteria the query must match to return
a document.
In the mongo (page 942) shell, query for all documents where the x field has a value of 18 by passing the { x :
18 } query document as a parameter to the find() (page 816) method:
db.testData.find( { x : 18 } )
With the findOne() (page 824) method you can return a single document from a MongoDB collection. The
findOne() (page 824) method takes the same parameters as find() (page 816), but returns a document rather
than a cursor.
To retrieve one document from the testData collection, issue the following command:
db.testData.findOne()
For more information on querying for documents, see the Query Documents (page 68) and Read Operations (page 39)
documentation.
Limit the Number of Documents in the Result Set
To increase performance, you can constrain the size of the result by limiting the amount of data your application must
receive over the network.
To specify the maximum number of documents in the result set, call the limit() (page 867) method on a cursor, as
in the following command:
db.testData.find().limit(3)
30
MongoDB will return the following result, with different ObjectId (page 103) values:
{ "_id" : ObjectId("51a7dc7b2cacf40b79990be6"), "x" : 1 }
{ "_id" : ObjectId("51a7dc7b2cacf40b79990be7"), "x" : 2 }
{ "_id" : ObjectId("51a7dc7b2cacf40b79990be8"), "x" : 3 }
The mongo (page 942) shell displays the first 20 documents in the collection. Your ObjectId (page 103) values
will be different:
{
{
{
{
{
{
{
{
{
{
{
{
{
{
{
{
{
{
"_id"
"_id"
"_id"
"_id"
"_id"
"_id"
"_id"
"_id"
"_id"
"_id"
"_id"
"_id"
"_id"
"_id"
"_id"
"_id"
"_id"
"_id"
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
ObjectId("51a7dc7b2cacf40b79990be6"),
ObjectId("51a7dc7b2cacf40b79990be7"),
ObjectId("51a7dc7b2cacf40b79990be8"),
ObjectId("51a7dc7b2cacf40b79990be9"),
ObjectId("51a7dc7b2cacf40b79990bea"),
ObjectId("51a7dc7b2cacf40b79990beb"),
ObjectId("51a7dc7b2cacf40b79990bec"),
ObjectId("51a7dc7b2cacf40b79990bed"),
ObjectId("51a7dc7b2cacf40b79990bee"),
ObjectId("51a7dc7b2cacf40b79990bef"),
ObjectId("51a7dc7b2cacf40b79990bf0"),
ObjectId("51a7dc7b2cacf40b79990bf1"),
ObjectId("51a7dc7b2cacf40b79990bf2"),
ObjectId("51a7dc7b2cacf40b79990bf3"),
ObjectId("51a7dc7b2cacf40b79990bf4"),
ObjectId("51a7dc7b2cacf40b79990bf5"),
ObjectId("51a7dc7b2cacf40b79990bf6"),
ObjectId("51a7dc7b2cacf40b79990bf7"),
"x"
"x"
"x"
"x"
"x"
"x"
"x"
"x"
"x"
"x"
"x"
"x"
"x"
"x"
"x"
"x"
"x"
"x"
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
1 }
2 }
3 }
4 }
5 }
6 }
7 }
8 }
9 }
10 }
11 }
12 }
13 }
14 }
15 }
16 }
17 }
18 }
31
1. The find() (page 816) returns a cursor. To iterate the cursor and return more documents use the it operation
in the mongo (page 942) shell. The mongo (page 942) shell will exhaust the cursor, and return the following
documents:
{
{
{
{
{
"_id"
"_id"
"_id"
"_id"
"_id"
:
:
:
:
:
ObjectId("51a7dce92cacf40b79990bfc"),
ObjectId("51a7dce92cacf40b79990bfd"),
ObjectId("51a7dce92cacf40b79990bfe"),
ObjectId("51a7dce92cacf40b79990bff"),
ObjectId("51a7dce92cacf40b79990c00"),
"x"
"x"
"x"
"x"
"x"
:
:
:
:
:
21
22
23
24
25
}
}
}
}
}
The insertData() function takes three parameters: a database, a new or existing collection, and the number of
documents to create. The function creates documents with an x field that is set to an incremented integer, as in the
following example documents:
{ "_id" : ObjectId("51a4da9b292904caffcff6eb"), "x" : 0 }
{ "_id" : ObjectId("51a4da9b292904caffcff6ec"), "x" : 1 }
{ "_id" : ObjectId("51a4da9b292904caffcff6ed"), "x" : 2 }
Store the function in your .mongorc.js (page 946) file. The mongo (page 942) shell loads the function for you every
time you start a session.
Example
Specify database name, collection name, and the number of documents to insert as arguments to insertData().
insertData("test", "testData", 400)
This operation inserts 400 documents into the testData collection in the test database. If the collection and
database do not exist, MongoDB creates them implicitly before inserting documents.
Getting Started with MongoDB (page 26)
Generate Test Data (page 31)
MongoDB CRUD Concepts (page 37)
Data Models (page 117)
32
33
34
CHAPTER 2
MongoDB provides rich semantics for reading and manipulating data. CRUD stands for create, read, update, and
delete. These terms are the foundation for all interactions with the database.
MongoDB CRUD Introduction (page 35) An introduction to the MongoDB data model as well as queries and data
manipulations.
MongoDB CRUD Concepts (page 37) The core documentation of query and data manipulation.
MongoDB CRUD Tutorials (page 66) Examples of basic query and data modification operations.
MongoDB CRUD Reference (page 91) Reference material for the query and data manipulation interfaces.
35
36
Figure 2.3: The stages of a MongoDB query with a query criteria and a sort modifier.
Read Preference
For replica sets and sharded clusters with replica set components, applications specify read preferences (page 405). A
read preference determines how the client direct read operations to the set.
Write Concern
Applications can also control the behavior of write operations using write concern (page 55). Particularly useful
for deployments with replica sets, the write concern semantics allow clients to specify the assurance that MongoDB
provides when reporting on the success of a write operation.
Aggregation
In addition to the basic queries, MongoDB provides several data aggregation features. For example, MongoDB can
return counts of the number of documents that match a query, or return the number of distinct values for a field, or
process a collection of documents using a versatile stage-based data processing pipeline or map-reduce operations.
37
38
Distributed Queries (page 46) Describes how sharded clusters and replica sets affect the performance of read
operations.
Write Operations (page 50) Introduces data create and modify operations, their behavior, and performances.
Write Concern (page 55) Describes the kind of guarantee MongoDB provides when reporting on the success
of a write operation.
Distributed Write Operations (page 59) Describes how MongoDB directs write operations on sharded clusters
and replica sets and the performance characteristics of these operations.
Example
39
This query selects the documents in the users collection that match the condition age is greater than 18. To
specify the greater than condition, query criteria uses the greater than (i.e. $gt (page 622)) query selection operator
(page 621). The query returns at most 5 matching documents (or more precisely, a cursor to those documents). The
matching documents will return with only the _id, name and address fields. See Projections (page 40) for details.
See
SQL to MongoDB Mapping Chart (page 98) for additional examples of MongoDB queries and the corresponding SQL
statements.
Query Behavior
MongoDB queries exhibit the following behavior:
All queries in MongoDB address a single collection.
You can modify the query to impose limits (page 867), skips (page 871), and sort orders (page 872).
The order of documents returned by a query is not defined and is not necessarily consistent unless you specify a
sort() (page 872).
Operations that modify existing documents (page 75) (i.e. updates) use the same query syntax as queries to select
documents to update.
In aggregation (page 279) pipeline, the $match (page 666) pipeline stage provides access to MongoDB queries.
MongoDB provides a db.collection.findOne() (page 824) method as a special case of find() (page 816)
that returns a single document.
Query Statements
Consider the following diagram of the query process that specifies a query criteria and a sort modifier:
In the diagram, the query selects documents from the users collection. Using a query selection operator (page 621)
to define the conditions for matching documents, the query selects documents that have age greater than (i.e. $gt
(page 622)) 18. Then the sort() (page 872) modifier sorts the results by age in ascending order.
For additional examples of queries, see Query Documents (page 68).
Projections
Queries in MongoDB return all fields in all matching documents by default. To limit the amount of data that MongoDB
sends to applications, include a projection in the queries. By projecting results with a subset of fields, applications
reduce their network overhead and processing requirements.
Projections, which are the the second argument to the find() (page 816) method, may either specify a list of fields
to return or list fields to exclude in the result documents.
Important:
projections.
Except for excluding the _id field in inclusive projections, you cannot mix exclusive and inclusive
Consider the following diagram of the query process that specifies a query criteria and a projection:
40
Figure 2.7: The stages of a MongoDB query with a query criteria and a sort modifier.
Figure 2.8: The stages of a MongoDB query with a query criteria and projection. MongoDB only transmits the
projected data to the clients.
41
In the diagram, the query selects from the users collection. The criteria matches the documents that have age equal
to 18. Then the projection specifies that only the name field should return in the matching documents.
Projection Examples
This query selects a number of documents in the records collection that match the query { "user_id":
$lt: 42} }, but excludes the history field.
This query selects a number of documents in the records collection that match the query { "user_id": {
$lt: 42} }, but returns documents that have the _id field (implicitly included) as well as the name and email
fields.
Return Two Fields and Exclude _id
db.records.find( { "user_id": { $lt: 42} }, { "_id": 0, "name": 1 , "email": 1 } )
This query selects a number of documents in the records collection that match the query { "user_id":
$lt: 42} }, but only returns the name and email fields.
See
Limit Fields to Return from a Query (page 72) for more examples of queries with projection statements.
Projection Behavior
42
Cursors
In the mongo (page 942) shell, the primary method for the read operation is the db.collection.find()
(page 816) method. This method queries a collection and returns a cursor to the returning documents.
To access the documents, you need to iterate the cursor. However, in the mongo (page 942) shell, if the returned
cursor is not assigned to a variable using the var keyword, then the cursor is automatically iterated up to 20 times 1
to print up to the first 20 documents in the results.
For example, in the mongo (page 942) shell, the following read operation queries the inventory collection for
documents that have type equal to food and automatically print up to the first 20 matching documents:
db.inventory.find( { type: 'food' } );
To manually iterate the cursor to access the documents, see Iterate a Cursor in the mongo Shell (page 73).
Cursor Behaviors
Closure of Inactive Cursors By default, the server will automatically close the cursor after 10 minutes of inactivity
or if client has exhausted the cursor. To override this behavior, you can specify the noTimeout wire protocol flag2
in your query; however, you should either close the cursor manually or exhaust the cursor. In the mongo (page 942)
shell, you can set the noTimeout flag:
var myCursor = db.inventory.find().addOption(DBQuery.Option.noTimeout);
See your driver (page 95) documentation for information on setting the noTimeout flag. For the mongo (page 942)
shell, see cursor.addOption() (page 858) for a complete list of available cursor flags.
Cursor Isolation Because the cursor is not isolated during its lifetime, intervening write operations on a document
may result in a cursor that returns a document more than once if that document has changed. To handle this situation,
see the information on snapshot mode (page 592).
Cursor Batches The MongoDB server returns the query results in batches. Batch size will not exceed the maximum
BSON document size (page 1015). For most queries, the first batch returns 101 documents or just enough documents to exceed 1 megabyte. Subsequent batch size is 4 megabytes. To override the default size of the batch, see
batchSize() (page 859) and limit() (page 867).
For queries that include a sort operation without an index, the server must load all the documents in memory to perform
the sort and will return all documents in the first batch.
As you iterate through the cursor and reach the end of the returned batch, if there are more results, cursor.next()
(page 870) will perform a getmore operation (page 881) to retrieve the next batch. To see how many documents
remain in the batch as you iterate the cursor, you can use the objsLeftInBatch() (page 871) method, as in the
following example:
var myCursor = db.inventory.find();
var myFirstDocument = myCursor.hasNext() ? myCursor.next() : null;
myCursor.objsLeftInBatch();
1 You can use the DBQuery.shellBatchSize to change the number of iteration from the default value 20. See Executing Queries
(page 205) for more information.
2 http://docs.mongodb.org/meta-driver/latest/legacy/mongodb-wire-protocol
43
Cursor Information You can use the command cursorInfo (page 769) to retrieve the following information on
cursors:
total number of open cursors
size of the client cursors in current use
number of timed out cursors since the last server restart
Consider the following example:
db.runCommand( { cursorInfo: 1 } )
Query Optimization
Indexes improve the efficiency of read operations by reducing the amount of data that query operations need to process.
This simplifies the work associated with fulfilling queries within MongoDB.
Create an Index to Support Read Operations If your application queries a collection on a particular field or fields,
then an index on the queried field or fields can prevent the query from scanning the whole collection to find and
return the query results. For more information about indexes, see the complete documentation of indexes in MongoDB
(page 318).
Example
An application queries the inventory collection on the type field. The value of the type field is user-driven.
var typeValue = <someUserInput>;
db.inventory.find( { type: typeValue } );
To improve the performance of this query, add an ascending, or a descending, index to the inventory
3
collection on the type field.
In the mongo (page 942) shell, you can create indexes using the
db.collection.ensureIndex() (page 814) method:
db.inventory.ensureIndex( { type: 1 } )
This index can prevent the above query on type from scanning the whole collection to return the results.
To analyze the performance of the query with an index, see Analyze Query Performance (page 74).
In addition to optimizing read operations, indexes can support sort operations and allow for a more efficient storage utilization. See db.collection.ensureIndex() (page 814) and Indexing Tutorials (page 338) for more
information about index creation.
3 For single-field indexes, the selection between ascending and descending order is immaterial. For compound indexes, the selection is important.
See indexing order (page 323) for more details.
44
Query Selectivity Some query operations are not selective. These operations cannot use indexes effectively or
cannot use indexes at all.
The inequality operators $nin (page 624) and $ne (page 624) are not very selective, as they often match a large
portion of the index. As a result, in most cases, a $nin (page 624) or $ne (page 624) query with an index may
perform no better than a $nin (page 624) or $ne (page 624) query that must scan all documents in a collection.
Queries that specify regular expressions, with inline JavaScript regular expressions or $regex (page 633) operator
expressions, cannot use an index with one exception. Queries that specify regular expression with anchors at the
beginning of a string can use an index.
Covering a Query An index covers (page 368) a query, a covered query, when:
all the fields in the query (page 68) are part of that index, and
all the fields returned in the documents that match the query are in the same index.
For these queries, MongoDB does not need to inspect documents outside of the index. This is often more efficient
than inspecting entire documents.
Example
Given a collection inventory with the following index on the type and item fields:
{ type: 1, item: 1 }
This index will cover the following query on the type and item fields, which returns only the item field:
db.inventory.find( { type: "food", item:/^c/ },
{ item: 1, _id: 0 } )
However, the index will not cover the following query, which returns the item field and the _id field:
db.inventory.find( { type: "food", item:/^c/ },
{ item: 1 } )
See Create Indexes that Support Covered Queries (page 368) for more information on the behavior and use of covered
queries.
Query Plans
The MongoDB query optimizer processes queries and chooses the most efficient query plan for a query given the available indexes. The query system then uses this query plan each time the query runs. The query optimizer occasionally
reevaluates query plans as the content of the collection changes to ensure optimal query plans.
You can use the explain() (page 861) method to view statistics about the query plan for a given query. This
information can help as you develop indexing strategies (page 367).
Query Optimization To create a new query plan, the query optimizer:
1. runs the query against several candidate indexes in parallel.
2. records the matches in a common results buffer or buffers.
If the candidate plans include only ordered query plans, there is a single common results buffer.
If the candidate plans include only unordered query plans, there is a single common results buffer.
45
If the candidate plans include both ordered query plans and unordered query plans, there are two common
results buffers, one for the ordered plans and the other for the unordered plans.
If an index returns a result already returned by another index, the optimizer skips the duplicate match. In the
case of the two buffers, both buffers are de-duped.
3. stops the testing of candidate plans and selects an index when one of the following events occur:
An unordered query plan has returned all the matching results; or
An ordered query plan has returned all the matching results; or
An ordered query plan has returned a threshold number of matching results:
Version 2.0: Threshold is the query batch size. The default batch size is 101.
Version 2.2: Threshold is 101.
The selected index becomes the index specified in the query plan; future iterations of this query or queries with the
same query pattern will use this index. Query pattern refers to query select conditions that differ only in the values, as
in the following two queries with the same query pattern:
db.inventory.find( { type: 'food' } )
db.inventory.find( { type: 'utensil' } )
Query Plan Revision As collections change over time, the query optimizer deletes the query plan and re-evaluates
after any of the following events:
The collection receives 1,000 write operations.
The reIndex (page 756) rebuilds the index.
You add or drop an index.
The mongod (page 925) process restarts.
Distributed Queries
Read Operations to Sharded Clusters Sharded clusters allow you to partition a data set among a cluster of
mongod (page 925) instances in a way that is nearly transparent to the application. For an overview of sharded
clusters, see the Sharding (page 493) section of this manual.
For a sharded cluster, applications issue operations to one of the mongos (page 938) instances associated with the
cluster.
Read operations on sharded clusters are most efficient when directed to a specific shard. Queries to sharded collections
should include the collections shard key (page 506). When a query includes a shard key, the mongos (page 938) can
use cluster metadata from the config database (page 502) to route the queries to shards.
If a query does not include the shard key, the mongos (page 938) must direct the query to all shards in the cluster.
These scatter gather queries can be inefficient. On larger clusters, scatter gather queries are unfeasible for routine
operations.
For more information on read operations in sharded clusters, see the Sharded Cluster Query Routing (page 510) and
Shard Keys (page 506) sections.
Read Operations to Replica Sets Replica sets use read preferences to determine where and how to route read
operations to members of the replica set. By default, MongoDB always reads data from a replica sets primary. You
can modify that behavior by changing the read preference mode (page 489).
46
47
Figure 2.10: Read operations to a sharded cluster. Query criteria includes the shard key. The query router mongos
can target the query to the appropriate shard or shards.
48
Figure 2.11: Read operations to a sharded cluster. Query criteria does not include the shard key. The query router
mongos must broadcast query to all shards for the collection.
49
You can configure the read preference mode (page 489) on a per-connection or per-operation basis to allow reads from
secondaries to:
reduce latency in multi-data-center deployments,
improve read throughput by distributing high read-volumes (relative to write volume),
for backup operations, and/or
to allow reads during failover (page 396) situations.
Figure 2.12: Read operations to a replica set. Default read preference routes the read to the primary. Read preference
of nearest routes the read to the nearest member.
Read operations from secondary members of replica sets are not guaranteed to reflect the current state of the primary,
and the state of secondaries will trail the primary by some amount of time. Often, applications dont rely on this kind
of strict consistency, but application developers should always consider the needs of their application before setting
read preference.
For more information on read preference or on the read preference modes, see Read Preference (page 405) and Read
Preference Modes (page 489).
After issuing these modification operations, MongoDB allows applications to determine the level of acknowledgment
returned from the database. See Write Concern (page 55).
Create
Create operations add new documents to a collection. In MongoDB, the db.collection.insert() (page 832)
method perform create operations.
The following diagram highlights the components of a MongoDB insert operation:
Example
db.users.insert(
{
name: "sue",
age: 26,
status: "A"
}
)
This operation inserts a new documents into the users collection. The new document has four fields name, age,
and status, and an _id field. MongoDB always adds the _id field to the new document if the field does not exist.
For more information, see db.collection.insert() (page 832) and Insert Documents (page 67).
51
An upsert is an operation that performs either an update of an existing document or an insert of a new document if the document to modify does not exist. With an upsert, applications do not need to make two separate calls to the database in order to decide between performing an update or an insert operation. Both the
db.collection.update() (page 849) method and the db.collection.save() (page 846) method can
perform an upsert. See db.collection.update() (page 849) and db.collection.save() (page 846)
for details on performing an upsert with these methods.
See
SQL to MongoDB Mapping Chart (page 98) for additional examples of MongoDB write operations and the corresponding SQL statements.
Insert Behavior
If you add a new document without the _id field, the client library or the mongod (page 925) instance adds an _id
field and populates the field with a unique ObjectId.
If you specify the _id field, the value must be unique within the collection. For operations with write concern
(page 55), if you try to create a document with a duplicate _id value, mongod (page 925) returns a duplicate key
exception.
Update
Update operations modify existing documents in a collection. In MongoDB, db.collection.update()
(page 849) and the db.collection.save() (page 846) methods perform update operations.
The
db.collection.update() (page 849) method can accept a query criteria to determine which documents to
update as well as an option to update multiple rows. The method can also accept options that affect its behavior such
as the multi option to update multiple documents.
The following diagram highlights the components of a MongoDB update operation:
52
By default, the db.collection.update() (page 849) method updates a single document. However, with the
multi option, update() (page 849) can update all documents in a collection that match a query.
The db.collection.update() (page 849) method either updates specific fields in the existing document or
replaces the document. See db.collection.update() (page 849) for details.
When performing update operations that increase the document size beyond the allocated space for that document, the
update operation relocates the document on disk and may reorder the document fields depending on the type of update.
The db.collection.save() (page 846) method replaces a document and can only update a single document.
See db.collection.save() (page 846) and Insert Documents (page 67) for more information
Delete
Delete operations remove documents from a collection. In MongoDB, db.collection.remove() (page 844)
method performs delete operations. The db.collection.remove() (page 844) method can accept a query
criteria to determine which documents to remove.
The following diagram highlights the components of a MongoDB remove operation:
53
db.users.remove(
{ status: "D" }
)
This delete operation on the users collection removes all documents that match the criteria of status equal to D.
For more information, see db.collection.remove() (page 844) method and Remove Documents (page 76).
Remove Behavior
By default, db.collection.remove() (page 844) method removes all documents that match its query. However,
the method can accept a flag to limit the delete operation to a single document.
Isolation of Write Operations
The modification of a single document is always atomic, even if the write operation modifies multiple sub-documents
within that document. For write operations that modify multiple documents, the operation as a whole is not atomic,
and other operations may interleave.
No other operations are atomic. You can, however, attempt to isolate a write operation that affects multiple documents
using the isolation operator (page 663).
To isolate a sequence of write operations from other read and write operations, see Perform Two Phase Commits
(page 77).
Related Concepts
The following documents further describe write operations:
Write Concern (page 55) Describes the kind of guarantee MongoDB provides when reporting on the success of a
write operation.
Distributed Write Operations (page 59) Describes how MongoDB directs write operations on sharded clusters and
replica sets and the performance characteristics of these operations.
Write Operation Performance (page 60) Introduces the performance constraints and factors for writing data to MongoDB deployments.
Bulk Inserts in MongoDB (page 64) Describe behaviors associated with inserting an array of documents.
Record Padding (page 65) When storing documents on disk, MongoDB reserves space to allow documents to grow
efficiently during subsequent updates.
54
Write Concern
Write concern describes the guarantee that MongoDB provides when reporting on the success of a write operation.
The strength of the write concerns determine the level of guarantee. When inserts, updates and deletes have a weak
write concern, write operations return quickly. In some failure cases, write operations issued with weak write concerns
may not persist. With stronger write concerns, clients wait after sending a write operation for MongoDB to confirm
the write operations.
MongoDB provides different levels of write concern to better address the specific needs of applications. Clients
may adjust write concern to ensure that the most important operations persist successfully to an entire MongoDB
deployment. For other less critical operations, clients can adjust the write concern to ensure faster performance rather
than ensure persistence to the entire deployment.
See also:
Write Concern Reference (page 96) for a reference of specific write concern configuration. Also consider Write
Operations (page 50) for a general overview of write operations with MongoDB and Write Concern for Replica Sets
(page 402) for considerations specific to replica sets.
Note: The driver write concern (page 1089) change created a new connection class in all of the MongoDB drivers.
The new class, called MongoClient change the default write concern. See the release notes (page 1089) for this
change and the release notes for your driver.
Write Concern Levels Clients issue write operations with some level of write concern. MongoDB has the following
levels of conceptual write concern, listed from weakest to strongest:
Errors Ignored With an errors ignored write concern, MongoDB does not acknowledge write operations. With this
level of write concern, the client cannot detect failed write operations. These errors include connection errors and
mongod (page 925) exceptions such as duplicate key exceptions for unique indexes (page 334). Although the errors
ignored write concern provides fast performance, this performance gain comes at the cost of significant risks for data
persistence and durability.
To set errors ignored write concern, specify w values of -1 to your driver.
Warning: Do not use errors ignored write concern in normal operation.
Unacknowledged With an unacknowledged write concern, MongoDB does not acknowledge the receipt of write
operation. Unacknowledged is similar to errors ignored; however, drivers attempt receive and handle network errors
when possible. The drivers ability to detect network errors depends on the systems networking configuration.
To set unacknowledged write concern, specify w values of 0 to your driver.
Before the releases outlined in Default Write Concern Change (page 1089), this was the default write concern.
Acknowledged With a receipt acknowledged write concern, the mongod (page 925) confirms the receipt of the
write operation. Acknowledged write concern allows clients to catch network, duplicate key, and other errors.
To set acknowledged write concern, specify w values of 1 to your driver.
MongoDB uses acknowledged write concern by default, after the releases outlined in Default Write Concern Change
(page 1089).
Internally, the default write concern calls getLastError (page 720) with no arguments. For replica sets,
you can define the default write concern settings in the getLastErrorDefaults (page 483). When
2.2. MongoDB CRUD Concepts
55
Figure 2.19: Write operation to a mongod instance with write concern of unacknowledged. The client does not
wait for any acknowledgment.
Figure 2.20: Write operation to a mongod instance with write concern of acknowledged. The client waits for
acknowledgment of success or exception.
56
getLastErrorDefaults (page 483) does not define a default write concern setting, getLastError (page 720)
defaults to basic receipt acknowledgment.
Journaled With a journaled write concern, the mongod (page 925) confirms the write operation only after committing to the journal. A confirmed journal commit ensures durability, which guarantees that a write operation will survive
a mongod (page 925) shutdown. However, there is a window between journal commits when the write operation is
not fully durable. See journalCommitInterval (page 995) for more information on this window.
To set journaled write concern, specify w values of 1 and set the journal or j option to true to your driver.
Figure 2.21: Write operation to a mongod instance with write concern of journaled. The mongod sends acknowledgment after it commits the write operation to the journal.
Receipt acknowledged without journaled provides the basis for write concern. Require journaled as part of the write
concern to provide this durability guarantee. The mongod (page 925) must have journaling enabled for the journaled
write concern to have effect.
Note: Requiring journaled write concern in a replica set only requires a journal commit of the write operation to the
primary of the set regardless of the level of replica acknowledged write concern.
Replica Acknowledged Replica sets add several considerations for write concern. Basic write concerns affect write
operations on only one mongod (page 925) instance. The w argument to getLastError (page 720) provides replica
acknowledged write concerns. With replica acknowledged you can guarantee that the write operation propagates to
the members of a replica set. See Write Concern Reference (page 96) document for the values for w and Write Concern
for Replica Sets (page 402) for more information.
To set replica acknowledged write concern, specify w values greater than 1 to your driver.
Note: Requiring journaled write concern in a replica set only requires a journal commit of the write operation to the
primary of the set regardless of the level of replica acknowledged write concern.
57
Figure 2.22: Write operation to a replica set with write concern level of w:2 or write to the primary and at least one
secondary.
58
Write Operations on Sharded Clusters For sharded collections in a sharded cluster, the mongos (page 938)
directs write operations from applications to the shards that are responsible for the specific portion of the data set. The
mongos (page 938) uses the cluster metadata from the config database (page 502) to route the write operation to the
appropriate shards.
59
Figure 2.24: Diagram of the shard key value space segmented into smaller ranges or chunks.
Write Operations on Replica Sets In replica sets, all write operations go to the sets primary, which applies the
write operation then records the operations on the primarys operation log or oplog. The oplog is a reproducible
sequence of operations to the data set. Secondary members of the set are continuously replicating the oplog and
applying the operations to themselves in an asynchronous process.
Large volumes of write operations, particularly bulk operations, may create situations where the secondary members
have difficulty applying the replicating operations from the primary at a sufficient rate: this can cause the secondarys
state to fall behind that of the primary. Secondaries that are significantly behind the primary present problems for
normal operation of the replica set, particularly failover (page 396) in the form of rollbacks (page 401) as well as
general read consistency (page 402).
To help avoid this issue, you can customize the write concern (page 55) to return confirmation of the write operation
to another member 4 of the replica set every 100 or 1,000 operations. This provides an opportunity for secondaries
to catch up with the primary. Write concern can slow the overall progress of write operations but ensure that the
secondaries can maintain a largely current state with respect to the primary.
For more information on replica sets and write operations, see Replica Acknowledged (page 57), Oplog Size (page 411),
and Change the Size of the Oplog (page 446).
Write Operation Performance
Indexes After every insert, update, or delete operation, MongoDB must update every index associated with the
collection in addition to the data itself. Therefore, every index on a collection adds some amount of overhead for the
performance of write operations. 5
4 Calling getLastError (page 720) intermittently with a w value of 2 or majority will slow the throughput of write traffic; however, this
practice will allow the secondaries to remain current with the state of the primary.
5 For inserts and updates to un-indexed fields, the overhead for sparse indexes (page 335) is less than for non-sparse indexes. Also for non-sparse
indexes, updates that do not change the record size have less indexing overhead.
60
Figure 2.25: Diagram of default routing of reads and writes to the primary.
61
Figure 2.26: Write operation to a replica set with write concern level of w:2 or write to the primary and at least one
secondary.
62
In general, the performance gains that indexes provide for read operations are worth the insertion penalty. However,
in order to optimize write performance when possible, be careful when creating new indexes and evaluate the existing
indexes to ensure that your queries actually use these indexes.
For indexes and queries, see Query Optimization (page 44). For more information on indexes, see Indexes (page 313)
and Indexing Strategies (page 367).
Document Growth If an update operation causes a document to exceeds the currently allocated record size, MongoDB relocates the document on disk with enough contiguous space to hold the document. These relocations take
longer than in-place updates, particularly if the collection has indexes. If a collection has indexes, MongoDB must
update all index entries. Thus, for a collection with many indexes, the move will impact the write throughput.
Some update operations, such as the $inc (page 651) operation, do not cause an increase in document size. For these
update operations, MongoDB can apply the updates in-place. Other update operations, such as the $push (page 659)
operation, change the size of the document.
In-place-updates are significantly more efficient than updates that cause document growth. When possible, use data
models (page 117) that minimize the need for document growth.
See Record Padding (page 65) for more information.
Storage Performance
Hardware The capability of the storage system creates some important physical limits for the performance of MongoDBs write operations. Many unique factors related to the storage system of the drive affect write performance,
including random access patterns, disk caches, disk readahead and RAID configurations.
Solid state drives (SSDs) can outperform spinning hard disks (HDDs) by 100 times or more for random workloads.
See
Production Notes (page 218) for recommendations regarding additional hardware and configuration options.
Journaling MongoDB uses write ahead logging to an on-disk journal to guarantee write operation (page 50) durability and to provide crash resiliency. Before applying a change to the data files, MongoDB writes the change operation
to the journal.
While the durability assurance provided by the journal typically outweigh the performance costs of the additional write
operations, consider the following interactions between the journal and performance:
if the journal and the data file reside on the same block device, the data files and the journal may have to contend
for a finite number of available write operations. Moving the journal to a separate device may increase the
capacity for write operations.
if applications specify write concern (page 55) that includes journaled (page 57), mongod (page 925) will
decrease the duration between journal commits, which can increases the overall write load.
the duration between journal commits is configurable using the journalCommitInterval (page 995) runtime option. Decreasing the period between journal commits will increase the number of write operations,
which can limit MongoDBs capacity for write operations. Increasing the amount of time between commits
may decrease the total number of write operation, but also increases the chance that the journal will not record
a write operation in the event of a failure.
For additional information on journaling, see Journaling Mechanics (page 232).
63
64
Reverse the binary bits of the shard key. This preserves the information and avoids correlating insertion order
with increasing sequence of values.
Swap the first and last 16-bit words to shuffle the inserts.
Example
The following example, in C++, swaps the leading and trailing 16-bit word of BSON ObjectIds generated so that they
are no longer monotonically increasing.
using namespace mongo;
OID make_an_id() {
OID x = OID::gen();
const unsigned char *p = x.getData();
swap( (unsigned short&) p[0], (unsigned short&) p[10] );
return x;
}
void foo() {
// create an object
BSONObj o = BSON( "_id" << make_an_id() << "x" << 3 << "name" << "jane" );
// now we may insert o into a sharded collection
}
See also:
Shard Keys (page 506) for information on choosing a sharded key. Also see Shard Key Internals (page 506) (in
particular, Choosing a Shard Key (page 526)).
Record Padding
Update operations can increase the size of the document 6 . If a document outgrows its current allocated record space,
MongoDB must allocate a new space and move the document to this new location.
To reduce the number of moves, MongoDB includes a small amount of extra space, or padding, when allocating the
record space. This padding reduces the likelihood that a slight increase in document size will cause the document to
exceed its allocated record size.
See also:
Write Operation Performance (page 60).
Padding Factor To minimize document movements and their impact, MongoDB employs padding. MongoDB
adaptively adjusts the size of record allocations in a collection by adding a paddingFactor (page 764) so that the
documents have room to grow. The paddingFactor (page 764) indicates the padding for new inserts and moves.
To check the current paddingFactor (page 764) on a collection, you can run the db.collection.stats()
(page 848) operation in the mongo (page 942) shell, as in the following example:
db.myCollection.stats()
Since MongoDB writes each document at a different point in time, the padding for each document will not be the
same. You can calculate the padding size by subtracting 1 from the paddingFactor (page 764), for example:
padding size = (paddingFactor - 1) * <document size>.
6
Documents in MongoDB can grow up to the full maximum BSON document size (page 1015).
65
For example, a paddingFactor (page 764) of 1.0 specifies no padding whereas a paddingFactor of 1.5 specifies
a padding size of 0.5 or 50 percent (50%) of the document size.
Because the paddingFactor (page 764) is relative to the size of each document, you cannot calculate the exact
amount of padding for a collection based on the average document size and padding factor.
If an update operation causes the document to decrease in size, for instance if you perform an $unset (page 655) or
a $pop (page 657) update, the document remains in place and effectively has more padding. If the document remains
this size, the space is not reclaimed until you perform a compact (page 752) or a repairDatabase (page 757)
operation.
Operations That Remove Padding The following operations remove padding: compact (page 752),
repairDatabase (page 757), and initial replica sync operations. However, with the compact (page 752)
command, you can run the command with a paddingFactor or a paddingBytes parameter. See compact
(page 752) command for details.
Padding is also removed if you use mongoexport (page 969) a collection. If you use mongoimport (page 965)
into a new collection, mongoimport (page 965) will not add padding. If you use mongoimport (page 965) with
an existing collection with padding, mongoimport (page 965) will not affect the existing padding.
When a database operation removes padding from a collection, subsequent updates to the collection that increase the
record size will have reduced throughput until the collections padding factor grows. However, the collection will
require less storage.
Record Allocation Strategies New in version 2.2: collMod (page 755) and usePowerOf2Sizes (page 755).
To more efficiently reuse the space freed as a result of deletions or document relocations, you can specify that
MongoDB allocates record sizes in powers of 2. To do so, use the collMod (page 755) command with the
usePowerOf2Sizes (page 755) flag. See collMod (page 755) command for more details. As with all padding,
power of 2 size allocations minimizes, but does not eliminate, document movements.
See also Can I manually pad documents to prevent moves during updates? (page 593)
66
Isolate Sequence of Operations (page 84) Use the <isolation> isolated operator to isolate a single write
operation that affects multiple documents, preventing other operations from interrupting the sequence of write
operations.
Create an Auto-Incrementing Sequence Field (page 86) Describes how to create an incrementing sequence number
for the _id field using a Counters Collection or an Optimistic Loop.
Limit Number of Elements in an Array after an Update (page 89) Use $push with various modifiers to sort and
maintain an array of fixed size after update
In the example, the document has a user-specified _id field value of 10. The value must be unique within the
inventory collection.
For more examples, see insert() (page 832).
Insert a Document with update() Method
Call the update() (page 849) method with the upsert flag to create a new document if no document matches the
updates query criteria. 7
The following example creates a new document if no document in the inventory collection contains { type:
"books", item : "journal" }:
db.inventory.update(
{ type: "book", item : "journal" },
{ $set : { qty: 10 } },
{ upsert : true }
)
MongoDB adds the _id field and assigns as its value a unique ObjectId. The new document includes the item and
type fields from the <query> criteria and the qty field from the <update> parameter.
{ "_id" : ObjectId("51e8636953dbe31d5f34a38a"), "item" : "journal", "qty" : 10, "type" : "book" }
67
MongoDB adds the _id field and assigns as its value a unique ObjectId.
{ "_id" : ObjectId("51e866e48737f72b32ae4fbc"), "type" : "book", "item" : "notebook", "qty" : 40 }
The
This tutorial provides examples of read operations using the db.collection.find() (page 816) method in the
mongo (page 942) shell. In these examples, the retrieved documents contain all their fields. To restrict the fields to
return in the retrieved documents, see Limit Fields to Return from a Query (page 72).
Select All Documents in a Collection
An empty query document ({}) selects all documents in the collection:
db.inventory.find( {} )
Not specifying a query document to the find() (page 816) is equivalent to specifying an empty query document.
Therefore the following operation is equivalent to the previous operation:
db.inventory.find()
The following example retrieves from the inventory collection all documents where the type field has the value
snacks:
db.inventory.find( { type: "snacks" } )
68
Although you can express this query using the $or (page 625) operator, use the $in (page 623) operator rather than
the $or (page 625) operator when performing equality checks on the same field.
Refer to the Operators (page 621) document for the complete list of query operators.
Specify AND Conditions
A compound query can specify conditions for more than one field in the collections documents. Implicitly, a logical
AND conjunction connects the clauses of a compound query so that the query selects the documents in the collection
that match all the conditions.
In the following example, the query document specifies an equality match on the field food and a less than ($lt
(page 623)) comparison match on the field price:
db.inventory.find( { type: 'food', price: { $lt: 9.95 } } )
This query selects all documents where the type field has the value food and the value of the price field is less
than 9.95. See comparison operators (page 621) for other comparison operators.
Specify OR Conditions
Using the $or (page 625) operator, you can specify a compound query that joins each clause with a logical OR
conjunction so that the query selects the documents in the collection that match at least one condition.
In the following example, the query document selects all documents in the collection where the field qty has a value
greater than ($gt (page 622)) 100 or the value of the price field is less than ($lt (page 623)) 9.95:
db.inventory.find(
{ $or: [
{ qty: { $gt: 100 } },
{ price: { $lt: 9.95 } }
]
}
)
Subdocuments
When the field holds an embedded document (i.e. subdocument), you can either specify the entire subdocument as
the value of a field, or reach into the subdocument using dot notation, to specify values for individual fields in the
subdocument:
69
To specify an equality match on the whole subdocument, use the query document { <field>: <value> }
where <value> is the subdocument to match. Equality matches on a subdocument require that the subdocument
field match exactly the specified <value>, including the field order.
In the following example, the query matches all documents where the value of the field producer is a subdocument that contains only the field company with the value ABC123 and the field address with the value 123
Street, in the exact order:
db.inventory.find(
{
producer: {
company: 'ABC123',
address: '123 Street'
}
}
)
Equality matches for specific fields within subdocuments select the documents in the collection when the field in the
subdocument contains a field that matches the specified value.
In the following example, the query uses the dot notation to match all documents where the value of the field
producer is a subdocument that contains a field company with the value ABC123 and may contain other fields:
db.inventory.find( { 'producer.company': 'ABC123' } )
Arrays
When the field holds an array, you can query for an exact array match or for specific values in the array. If the array
holds sub-documents, you can query for specific fields within the sub-documents using dot notation:
Exact Match on an Array
To specify equality match on an array, use the query document { <field>: <value> } where <value> is
the array to match. Equality matches on the array require that the array field match exactly the specified <value>,
including the element order.
In the following example, the query matches all documents where the value of the field tags is an array that holds
exactly three elements, fruit, food, and citrus, in this order:
db.inventory.find( { tags: [ 'fruit', 'food', 'citrus' ] } )
Equality matches can specify a single element in the array to match. These specifications match if the array contains
at least one element with the specified value.
In the following example, the query matches all documents where the value of the field tags is an array that contains
fruit as one of its elements:
70
Equality matches can specify equality matches for an element at a particular index or position of the array.
In the following example, the query uses the dot notation to match all documents where the value of the tags field is
an array whose first element equals fruit:
db.inventory.find( { 'tags.0' : 'fruit' } )
Array of Subdocuments
Match a Field in the Subdocument Using the Array Index If you know the array index of the subdocument, you
can specify the document using the subdocuments position.
The following example selects all documents where the memos contains an array whose first element (i.e. index is 0)
is a subdocument with the field by with the value shipping:
db.inventory.find( { 'memos.0.by': 'shipping' } )
Match a Field Without Specifying Array Index If you do not know the index position of the subdocument, concatenate the name of the field that contains the array, with a dot (.) and the name of the field in the subdocument.
The following example selects all documents where the memos field contains an array that contains at least one
subdocument with the field by with the value shipping:
db.inventory.find( { 'memos.by': 'shipping' } )
Match Multiple Fields To match by multiple fields in the subdocument, you can use either dot notation or the
$elemMatch (page 645) operator:
The following example uses dot notation to query for documents where the value of the memos field is an array that has
at least one subdocument that contains the field memo equal to on time and the field by equal to shipping:
db.inventory.find(
{
'memos.memo': 'on time',
'memos.by': 'shipping'
}
)
The following example uses $elemMatch (page 645) to query for documents where the value of the memos field is
an array that has at least one subdocument that contains the field memo equal to on time and the field by equal
to shipping:
db.inventory.find( {
memos: {
$elemMatch: {
memo : 'on time',
by: 'shipping'
}
}
}
)
71
This operation will return all documents in the inventory collection where the value of the type field is food.
The returned documents contain all its fields.
Return the Specified Fields and the _id Field Only
A projection can explicitly include several fields. In the following operation, find() (page 816) method returns all
documents that match the query. In the result set, only the item and qty fields and, by default, the _id field return
in the matching documents.
db.inventory.find( { type: 'food' }, { item: 1, qty: 1 } )
This operation returns all documents that match the query. In the result set, only the item and qty fields return in
the matching documents.
Return All But the Excluded Field
To exclude a single field or group of fields you can use a projection in the following form:
db.inventory.find( { type: 'food' }, { type:0 } )
This operation returns all documents where the value of the type field is food. In the result set, the type field does
not return in the matching documents.
With the exception of the _id field you cannot combine inclusion and exclusion statements in projection documents.
72
You can also use the cursor method next() (page 870) to access the documents, as in the following example:
var myCursor = db.inventory.find( { type: 'food' } );
var myDocument = myCursor.hasNext() ? myCursor.next() : null;
if (myDocument) {
var myItem = myDocument.item;
print(tojson(myItem));
}
As an alternative print operation, consider the printjson() helper method to replace print(tojson()):
if (myDocument) {
var myItem = myDocument.item;
printjson(myItem);
}
You can use the cursor method forEach() (page 866) to iterate the cursor and access the documents, as in the
following example:
var myCursor =
myCursor.forEach(printjson);
See JavaScript cursor methods (page 858) and your driver (page 95) documentation for more information on cursor
methods.
9 You can use the DBQuery.shellBatchSize to change the number of iteration from the default value 20. See Executing Queries
(page 205) for more information.
73
Iterator Index
In the mongo (page 942) shell, you can use the toArray() (page 874) method to iterate the cursor and return the
documents in an array, as in the following:
var myCursor = db.inventory.find( { type: 'food' } );
var documentArray = myCursor.toArray();
var myDocument = documentArray[3];
The toArray() (page 874) method loads into RAM all documents returned by the cursor; the toArray()
(page 874) method exhausts the cursor.
Additionally, some drivers (page 95) provide access to the documents by using an index on the cursor (i.e.
cursor[index]). This is a shortcut for first calling the toArray() (page 874) method and then using an index on the resulting array.
Consider the following example:
var myCursor = db.inventory.find( { type: 'food' } );
var myDocument = myCursor[3];
74
"indexBounds" : { "type" : [
[ "food",
"food" ]
] },
"server" : "mongodbo0.example.net:27017" }
The BtreeCursor value of the cursor (page 863) field indicates that the query used an index.
This query returned 5 documents, as indicated by the n (page 864) field.
To return these 5 documents, the query scanned 5 documents from the index, as indicated by the nscanned (page 864)
field, and then read 5 full documents from the collection, as indicated by the nscannedObjects (page 864) field.
Without the index, the query would have scanned the whole collection to return the 5 documents.
See Explain Results (page 861) method for full details on the output.
These return the statistics regarding the execution of the query using the respective index.
Note: If you run explain() (page 861) without including hint() (page 866), the query optimizer reevaluates
the query and runs against multiple indexes before returning the query statistics.
For more detail on the explain output, see Explain Results (page 861).
This shows the syntax for MongoDB 2.2 and later. For syntax for versions prior to 2.2, see update() (page 849).
75
The following example finds all documents with type equal to "book" and modifies their qty field by -1. The
example uses $inc (page 651), which is one of the:ref:update operators <update-operators> available.
db.inventory.update(
{ type : "book" },
{ $inc : { qty : -1 } },
{ multi: true }
)
To remove all documents from a collection, it may be more efficient to use the drop() (page 812) method to drop the entire collection,
including the indexes, and then recreate the collection and rebuild the indexes.
76
Note: For large deletion operations, it may be more efficient to copy the documents that you want to keep to a new
collection and then use drop() (page 812) on the original collection.
77
Pattern
Overview
The most common example of transaction is to transfer funds from account A to B in a reliable way, and this pattern
uses this operation as an example. In a relational database system, this operation would encapsulate subtracting funds
from the source (A) account and adding them to the destination (B) within a single atomic transaction. For MongoDB,
you can use a two-phase commit in these situations to achieve a compatible response.
All of the examples in this document use the mongo (page 942) shell to interact with the database, and assume that
you have two collections: First, a collection named accounts that will store data about accounts with one account
per document, and a collection named transactions which will store the transactions themselves.
Begin by creating two accounts named A and B, with the following command:
db.accounts.save({name: "A", balance: 1000, pendingTransactions: []})
db.accounts.save({name: "B", balance: 1000, pendingTransactions: []})
mongo (page 942) will return two documents that resemble the following:
Transaction Description
Set Transaction State to Initial Create the transaction collection by inserting the following document. The
transaction document holds the source and destination, which refer to the name fields of the accounts
collection, as well as the value field that represents the amount of data change to the balance field. Finally, the
state field reflects the current state of the transaction.
db.transactions.save({source: "A", destination: "B", value: 100, state: "initial"})
Switch Transaction State to Pending Before modifying either records in the accounts collection, set the transaction state to pending from initial.
Set the local variable t in your shell session, to the transaction document using findOne() (page 824):
t = db.transactions.findOne({state: "initial"})
After assigning this variable t, the shell will return the value of t, you will see the following output:
{
"_id" : ObjectId("4d7bc7a8b8a04f5126961522"),
"source" : "A",
"destination" : "B",
"value" : 100,
78
"state" : "initial"
}
The find() (page 816) operation will return the contents of the transactions collection, which should resemble
the following:
Apply Transaction to Both Accounts Continue by applying the transaction to both accounts. The update()
(page 849) query will prevent you from applying the transaction if the transaction is not already pending. Use the
following update() (page 849) operation:
The find() (page 816) operation will return the contents of the accounts collection, which should now resemble
the following:
Set Transaction State to Committed Use the following update() (page 849) operation to set the transactions
state to committed:
db.transactions.update({_id: t._id}, {$set: {state: "committed"}})
db.transactions.find()
The find() (page 816) operation will return the contents of the transactions collection, which should now
resemble the following:
Remove Pending Transaction Use the following update() (page 849) operation to set remove the pending transaction from the documents in the accounts collection:
db.accounts.update({name: t.source}, {$pull: {pendingTransactions: t._id}})
db.accounts.update({name: t.destination}, {$pull: {pendingTransactions: t._id}})
db.accounts.find()
The find() (page 816) operation will return the contents of the accounts collection, which should now resemble
the following:
Set Transaction State to Done Complete the transaction by setting the state of the transaction document to done:
db.transactions.update({_id: t._id}, {$set: {state: "done"}})
db.transactions.find()
79
The find() (page 816) operation will return the contents of the transactions collection, which should now
resemble the following:
The most important part of the transaction procedure is not, the prototypical example above, but rather the possibility
for recovering from the various failure scenarios when transactions do not complete as intended. This section will
provide an overview of possible failures and provide methods to recover from these kinds of events.
There are two classes of failures:
all failures that occur after the first step (i.e. setting the transaction set to initial (page 78)) but before the third
step (i.e. applying the transaction to both accounts (page 79).)
To recover, applications should get a list of transactions in the pending state and resume from the second step
(i.e. switching the transaction state to pending (page 78).)
all failures that occur after the third step (i.e. applying the transaction to both accounts (page 79)) but before
the fifth step (i.e. setting the transaction state to done (page 79).)
To recover, application should get a list of transactions in the committed state and resume from the fourth
step (i.e. remove the pending transaction (page 79).)
Thus, the application will always be able to resume the transaction and eventually arrive at a consistent state. Run
the following recovery operations every time the application starts to catch any unfinished transactions. You may also
wish run the recovery operation at regular intervals to ensure that your data remains consistent.
The time required to reach a consistent state depends, on how long the application needs to recover each transaction.
Rollback In some cases you may need to rollback or undo a transaction when the application needs to cancel
the transaction, or because it can never recover as in cases where one of the accounts doesnt exist, or stops existing
during the transaction.
There are two possible rollback operations:
1. After you apply the transaction (page 79) (i.e. the third step,) you have fully committed the transaction and you
should not roll back the transaction. Instead, create a new transaction and switch the values in the source and
destination fields.
2. After you create the transaction (page 78) (i.e. the first step,) but before you apply the transaction (page 79) (i.e
the third step,) use the following process:
Set Transaction State to Canceling
update() (page 849) operation:
Use the following sequence of operations to undo the transaction operation from both ac-
80
The find() (page 816) operation will return the contents of the accounts collection, which should resemble the
following:
Set Transaction State to Canceled Finally, use the following update() (page 849) operation to set the transactions state to canceled:
Step 3: set the transactions state to canceled:
db.transactions.update({_id: t._id}, {$set: {state: "canceled"}})
Multiple Applications Transactions exist, in part, so that several applications can create and run operations concurrently without causing data inconsistency or conflicts. As a result, it is crucial that only one 1 application can handle
a given transaction at any point in time.
Consider the following example, with a single transaction (i.e. T1) and two applications (i.e. A1 and A2). If both
applications begin processing the transaction which is still in the initial state (i.e. step 1 (page 78)), then:
A1 can apply the entire whole transaction before A2 starts.
A2 will then apply T1 for the second time, because the transaction does not appear as pending in the accounts
documents.
To handle multiple applications, create a marker in the transaction document itself to identify the application that is
handling the transaction. Use findAndModify() (page 821) method to modify the transaction:
t = db.transactions.findAndModify({query: {state: "initial", application: {$exists: 0}},
update: {$set: {state: "pending", application: "A1"}},
new: true})
When you modify and reassign the local shell variable t, the mongo (page 942) shell will return the t object, which
should resemble the following:
{
"_id" : ObjectId("4d7be8af2c10315c0847fc85"),
"application" : "A1",
"destination" : "B",
"source" : "A",
"state" : "pending",
"value" : 150
}
Amend the transaction operations to ensure that only applications that match the identifier in the value of the
application field before applying the transaction.
If the application A1 fails during transaction execution, you can use the recovery procedures (page 80), but applications
should ensure that they owns the transaction before applying the transaction. For example to resume pending jobs,
use a query that resembles the following:
db.transactions.find({application: "A1", state: "pending"})
This will (or may) return a document from the transactions document that resembles the following:
81
82
the cursor returns the document at the end of the collection and then the application deletes those document.
A dead cursor has an id of 0.
See your driver documentation (page 95) for the driver-specific method to specify the tailable cursor. For more
information on the details of specifying a tailable cursor, see MongoDB wire protocol12 documentation.
C++ Example
The tail function uses a tailable cursor to output the results from a query to a capped collection:
The function handles the case of the dead cursor by having the query be inside a loop.
To periodically check for new data, the cursor->more() statement is also inside a loop.
#include "client/dbclient.h"
using namespace mongo;
/*
* Example of a tailable cursor.
* The function "tails" the capped collection (ns) and output elements as they are added.
* The function also handles the possibility of a dead cursor by tracking the field 'insertDate'.
* New documents are added with increasing values of 'insertDate'.
*/
void tail(DBClientBase& conn, const char *ns) {
BSONElement lastValue = minKey.firstElement();
Query query = Query().hint( BSON( "$natural" << 1 ) );
while ( 1 ) {
auto_ptr<DBClientCursor> c =
conn.query(ns, query, 0, 0, 0,
QueryOption_CursorTailable | QueryOption_AwaitData );
while ( 1 ) {
if ( !c->more() ) {
if ( c->isDead() ) {
break;
}
continue;
}
BSONObj o = c->next();
lastValue = o["insertDate"];
cout << o.toString() << endl;
}
query = QUERY( "insertDate" << GT << lastValue ).hint( BSON( "$natural" << 1 ) );
}
}
83
Initialize the lastValue variable, which tracks the last accessed value. The function will use the lastValue
if the cursor becomes invalid and tail needs to restart the query. Use hint() (page 866) to ensure that the
query uses the $natural (page 693) order.
In an outer while(1) loop,
Query the capped collection and return a tailable cursor that blocks for several seconds waiting for new
documents
auto_ptr<DBClientCursor> c =
conn.query(ns, query, 0, 0, 0,
QueryOption_CursorTailable | QueryOption_AwaitData );
* Loop through the outer while (1) loop to re-query with the new query condition and repeat.
See also:
Detailed blog post on tailable cursor13
84
This document describes one method of updating documents only if the local copy of the document reflects the current
state of the document in the database. In addition the following methods provide a way to manage isolated sequences
of operations:
the findAndModify() (page 821) provides an isolated query and modify operation.
Perform Two Phase Commits (page 77)
Create a unique index (page 334), to ensure that a key doesnt exist when you insert it.
Update if Current
In this pattern, you will:
query for a document,
modify the fields in that document
and update the fields of a document only if the fields have not changed in the collection since the query.
Consider the following example in JavaScript which attempts to update the qty field of a document in the products
collection:
1
2
3
4
if (myDocument) {
6
7
8
9
10
11
12
13
14
15
myCollection.update(
{
_id: myDocument._id,
qty: oldQty
},
{
$set: { qty: myDocument.qty }
}
)
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
Your application may require some modifications of this pattern, such as:
85
Use the entire document as the query in lines 18 and 19, to generalize the operation and guarantee that the
original document was not modified, rather than ensuring that as single field was not changed.
Add a version variable to the document that applications increment upon each update operation to the documents.
Use this version variable in the query expression. You must be able to ensure that all clients that connect to your
database obey this constraint.
Use $set (page 655) in the update expression to modify only your fields and prevent overriding other fields.
Use one of the methods described in Create an Auto-Incrementing Sequence Field (page 86).
A Counters Collection
Use a separate counters collection to track the last number sequence used. The _id field contains the sequence
name and the seq field contains the last value of the sequence.
1. Insert into the counters collection, the initial value for the userid:
db.counters.insert(
{
_id: "userid",
seq: 0
}
)
2. Create a getNextSequence function that accepts a name of the sequence. The function uses the
findAndModify() (page 821) method to atomically increment the seq value and return this new value:
function getNextSequence(name) {
var ret = db.counters.findAndModify(
{
query: { _id: name },
update: { $inc: { seq: 1 } },
new: true
}
);
return ret.seq;
}
86
Note: When findAndModify() (page 821) includes the upsert: true option and the query field(s) is not
uniquely indexed, the method could insert a document multiple times in certain circumstances. For instance, if multiple
clients each invoke the method with the same query condition and these methods complete the find phase before any
of methods perform the modify phase, these methods could insert the same document.
In the counters collection example, the query field is the _id field, which always has a unique index. Consider that
the findAndModify() (page 821) includes the upsert: true option, as in the following modified example:
function getNextSequence(name) {
var ret = db.counters.findAndModify(
{
query: { _id: name },
update: { $inc: { seq: 1 } },
new: true,
upsert: true
}
);
return ret.seq;
}
If multiple clients were to invoke the getNextSequence() method with the same name parameter, then the
methods would observe one of the following behaviors:
Exactly one findAndModify() (page 821) would successfully insert a new document.
Zero or more findAndModify() (page 821) methods would update the newly inserted document.
Zero or more findAndModify() (page 821) methods would fail when they attempted to insert a duplicate.
87
If the method fails due to a unique index constraint violation, retry the method. Absent a delete of the document, the
retry should not fail.
Optimistic Loop
In this pattern, an Optimistic Loop calculates the incremented _id value and attempts to insert a document with the
calculated _id value. If the insert is successful, the loop ends. Otherwise, the loop will iterate through possible _id
values until the insert is successful.
1. Create a function named insertDocument that performs the insert if not present loop. The function wraps
the insert() (page 832) method and takes a doc and a targetCollection arguments.
function insertDocument(doc, targetCollection) {
while (1) {
var cursor = targetCollection.find( {}, { _id: 1 } ).sort( { _id: -1 } ).limit(1);
var seq = cursor.hasNext() ? cursor.next()._id + 1 : 1;
doc._id = seq;
targetCollection.insert(doc);
var err = db.getLastErrorObj();
if( err && err.code ) {
if( err.code == 11000 /* dup key */ )
continue;
else
print( "unexpected error inserting data: " + tojson( err ) );
}
break;
}
}
88
The while loop may iterate many times in collections with larger insert volumes.
Pattern
Consider the following document in the collection students:
{
_id: 1,
scores: [
{ attempt: 1, score: 10 },
{ attempt: 2 , score:8 }
]
}
89
The following update uses the $push (page 659) operator with:
the $each (page 660) modifier to append to the array 2 new elements,
the $sort (page 661) modifier to order the elements by ascending (1) score, and
the $slice (page 661) modifier to keep the last 3 elements of the ordered array.
db.students.update(
{ _id: 1 },
{ $push: { scores: { $each : [
{ attempt: 3, score: 7 },
{ attempt: 4, score: 4 }
],
$sort: { score: 1 },
$slice: -3
}
}
}
)
Note: When using the $sort (page 661) modifier on the array element, access the field in the subdocument element
directly instead of using the dot notation on the array field.
After the operation, the document contains the only the top 3 scores in the scores array:
{
"_id" : 1,
"scores" : [
{ "attempt" : 3, "score" : 7 },
{ "attempt" : 2, "score" : 8 },
{ "attempt" : 1, "score" : 10 }
]
}
See also:
$push (page 659) operator,
$each (page 660) modifier,
$sort (page 661) modifier, and
$slice (page 661) modifier.
90
Description
Returns a count of the documents in a cursor.
Reports on the query execution plan, including index use, for a cursor.
Forces MongoDB to use a specific index for a query.
Constrains the size of a cursors result set.
Returns the next document in a cursor.
Returns a cursor that begins returning results only after passing or skipping a
number of documents.
Returns results ordered according to a sort specification.
Returns an array that contains all documents returned by the cursor.
Description
Wraps count (page 695) to return a count of the number of documents in a
collection or matching a query.
Returns an array of documents that have distinct values for the specified field.
Performs a query on a collection and returns a cursor object.
Performs a query and returns a single document.
Creates a new document in a collection.
Deletes documents from a collection.
Provides a wrapper around an insert() (page 832) and update()
(page 849) to insert new documents.
Modifies a document in a collection.
91
Write Concern Reference (page 96) Configuration options associated with the guarantee MongoDB provides when
reporting on the success of a write operation.
SQL to MongoDB Mapping Chart (page 98) An overview of common database operations showing both the MongoDB operations and SQL statements.
ObjectId (page 103) A 12-byte BSON type that MongoDB uses as the default value for its documents _id field if
the _id field is not specified.
BSON Types (page 105) Outlines the unique BSON types used by MongoDB. See BSONspec.org14 for the complete
BSON specification.
MongoDB Extended JSON (page 108) Describes the JSON super set used to express BSON documents in the
mongo (page 942) shell and other MongoDB tools.
GridFS Reference (page 110) Convention for storing large files in a MongoDB Database.
The bios Example Collection (page 111) Sample data for experimenting with MongoDB. insert() (page 832),
update() (page 849) and find() (page 816) pages use the data for some of their examples.
Documents
MongoDB stores all data in documents, which are JSON-style data structures composed of field-and-value pairs:
{ "item": "pencil", "qty": 500, "type": "no.2" }
MongoDB stores documents on disk in the BSON serialization format. BSON is a binary representation of JSON
documents, though contains more data types than does JSON. For the BSON spec, see bsonspec.org15 . See also BSON
Types (page 105).
The mongo (page 942) JavaScript shell and the MongoDB language drivers (page 95) translate between BSON and
the language-specific document representation.
Document Structure
MongoDB documents are composed of field-and-value pairs and have the following structure:
{
field1: value1,
field2: value2,
field3: value3,
14 http://bsonspec.org/
15 http://bsonspec.org/
92
...
fieldN: valueN
}
The value of a field can be any of the BSON data types (page 105), including other documents, arrays, and arrays of
documents. The following document contains values of varying types:
var mydoc = {
_id: ObjectId("5099803df3f4948bd2f98391"),
name: { first: "Alan", last: "Turing" },
birth: new Date('Jun 23, 1912'),
death: new Date('Jun 07, 1954'),
contribs: [ "Turing machine", "Turing test", "Turingery" ],
views : NumberLong(1250000)
}
Field names are strings. Field names cannot contain null characters, dots (.) or dollar signs ($). See Dollar Sign
Operator Escaping (page 589) for an alternate approach.
BSON documents may have more than one field with the same name. Most MongoDB interfaces (page 95), however,
represent MongoDB with a structure (e.g. a hash table) that does not support duplicate field names. If you need to
manipulate documents that have more than one field with the same name, see the driver documentation (page 95) for
your driver.
Some documents created by internal MongoDB processes may have duplicate fields, but no MongoDB process will
ever add duplicate fields to an existing user document.
Document Limitations
93
MongoDB does not make guarantees regarding the order of fields in a BSON document. Drivers and MongoDB
will reorder the fields of a documents upon insertion and following updates.
Most programming languages represent BSON documents with some form of mapping type. Comparisons
between mapping type objects typically, depend on order. As a result, the only way to ensure that two documents
have the same set of field and value pairs is to compare each field and value individually.
The _id Field
Dot Notation
MongoDB uses the dot notation to access the elements of an array and to access the fields of a subdocument.
To access an element of an array by the zero-based index position, concatenate the array name with the dot (.) and
zero-based index position, and enclose in quotes:
'<array>.<index>'
To access a field of a subdocument with dot-notation, concatenate the subdocument name with the dot (.) and the
field name, and enclose in quotes:
16 http://api.mongodb.org/
94
'<subdocument>.<field>'
See also:
Subdocuments (page 69) for dot notation examples with subdocuments.
Arrays (page 70) for dot notation examples with arrays.
MongoDB Drivers and Client Libraries
An application communicates with MongoDB by way of a client library, called a driver17 , that handles all interaction
with the database in a language appropriate to the application.
Drivers
See the following pages for more information about the MongoDB drivers18 :
JavaScript (Language Center19 , docs20 )
Python (Language Center21 , docs22 )
Ruby (Language Center23 , docs24 )
PHP (Language Center25 , docs26 )
Perl (Language Center27 , docs28 )
Java (Language Center29 , docs30 )
Scala (Language Center31 , docs32 )
C# (Language Center33 , docs34 )
C (Language Center35 , docs36 )
C++ (Language Center37 , docs38 )
Haskell (Language Center39 , docs40 )
17 http://docs.mongodb.org/ecosystem/drivers
18 http://docs.mongodb.org/ecosystem/drivers
19 http://docs.mongodb.org/ecosystem/drivers/javascript
20 http://api.mongodb.org/js/current
21 http://docs.mongodb.org/ecosystem/drivers/python
22 http://api.mongodb.org/python/current
23 http://docs.mongodb.org/ecosystem/drivers/ruby
24 http://api.mongodb.org/ruby/current
25 http://docs.mongodb.org/ecosystem/drivers/php
26 http://php.net/mongo/
27 http://docs.mongodb.org/ecosystem/drivers/perl
28 http://api.mongodb.org/perl/current/
29 http://docs.mongodb.org/ecosystem/drivers/java
30 http://api.mongodb.org/java/current
31 http://docs.mongodb.org/ecosystem/drivers/scala
32 http://api.mongodb.org/scala/casbah/current/
33 http://docs.mongodb.org/ecosystem/drivers/csharp
34 http://api.mongodb.org/csharp/current/
35 http://docs.mongodb.org/ecosystem/drivers/c
36 http://api.mongodb.org/c/current/
37 http://docs.mongodb.org/ecosystem/drivers/cpp
38 http://api.mongodb.org/cplusplus/current/
39 http://hackage.haskell.org/package/mongoDB
40 http://api.mongodb.org/haskell/mongodb
95
Driver version numbers use semantic versioning43 or major.minor.patch versioning system. The first number is the
major version, the second the minor version, and the third indicates a patch.
Example
Driver version numbers.
If your driver has a version number of 2.9.1, 2 is the major version, 9 is minor, and 1 is the patch.
The numbering scheme for drivers differs from the scheme for the MongoDB server. For more information on server
versioning, see MongoDB Version Numbers (page 1090).
Write Concern Reference
Overview
Write concern describes the guarantee that MongoDB provides when reporting on the success of a write operation.
The strength of the write concerns determine the level of guarantee. When inserts, updates and deletes have a weak
write concern, write operations return quickly. In some failure cases, write operations issued with weak write concerns
may not persist. With stronger write concerns, clients wait after sending a write operation for MongoDB to confirm
the write operations.
MongoDB provides different levels of write concern to better address the specific needs of applications. Clients
may adjust write concern to ensure that the most important operations persist successfully to an entire MongoDB
deployment. For other less critical operations, clients can adjust the write concern to ensure faster performance rather
than ensure persistence to the entire deployment.
See also:
Write Concern (page 55) for an introduction to write concern in MongoDB.
Available Write Concern
To provide write concern, drivers (page 95) issue the getLastError (page 720) command after a write operation
and receive a document with information about the last operation. This documents err field contains either:
null, which indicates the write operations have completed successfully, or
a description of the last error encountered.
The definition of a successful write depends on the arguments specified to getLastError (page 720), or in replica
sets, the configuration of getLastErrorDefaults (page 483). When deciding the level of write concern for your
application, see the introduction to Write Concern (page 55).
The getLastError (page 720) command has the following options to configure write concern requirements:
41 http://docs.mongodb.org/ecosystem/drivers/erlang
42 http://api.mongodb.org/erlang/mongodb
43 http://semver.org/
96
j or journal option
This option confirms that the mongod (page 925) instance has written the data to the on-disk journal and ensures
data is not lost if the mongod (page 925) instance shuts down unexpectedly. Set to true to enable, as shown
in the following example:
db.runCommand( { getLastError: 1, j: "true" } )
If you set journal (page 995) to true, and the mongod (page 925) does not have journaling enabled, as with
nojournal (page 996), then getLastError (page 720) will provide basic receipt acknowledgment, and
will include a jnote field in its return document.
w option
This option provides the ability to disable write concern entirely as well as specifies the write concern operations
for replica sets. See Write Concern Considerations (page 55) for an introduction to the fundamental concepts
of write concern. By default, the w option is set to 1, which provides basic receipt acknowledgment on a single
mongod (page 925) instance or on the primary in a replica set.
The w option takes the following values:
-1:
Disables all acknowledgment of write operations, and suppresses all errors, including network and socket
errors.
0:
Disables basic acknowledgment of write operations, but returns information about socket exceptions and
networking errors to the application.
Note: If you disable basic write operation acknowledgment but require journal commit acknowledgment,
the journal commit prevails, and the driver will require that mongod (page 925) will acknowledge the
write operation.
1:
Provides acknowledgment of write operations on a standalone mongod (page 925) or the primary in a
replica set.
A number greater than 1:
Guarantees that write operations have propagated successfully to the specified number of replica set members including the primary. If you set w to a number that is greater than the number of set members that
hold data, MongoDB waits for the non-existent members to become available, which means MongoDB
blocks indefinitely.
majority:
Confirms that write operations have propagated to the majority of configured replica set: a majority of the
sets configured members must acknowledge the write operation before it succeeds. This ensures that write
operation will never be subject to a rollback in the course of normal operation, and furthermore allows you
to avoid hard coding assumptions about the size of your replica set into your application.
A tag set:
By specifying a tag set (page 451) you can have fine-grained control over which replica set members must
acknowledge a write operation to satisfy the required level of write concern.
getLastError (page 720) also supports a wtimeout setting which allows clients to specify a timeout for the write
concern: if you dont specify wtimeout, or if you give it a value of 0, and the mongod (page 925) cannot fulfill the
write concern the getLastError (page 720) will block, potentially forever.
97
For more information on write concern and replica sets, see Write Concern for Replica Sets (page 57) for more
information.
In sharded clusters, mongos (page 938) instances will pass write concern on to the shard mongod (page 925) instances.
SQL to MongoDB Mapping Chart
In addition to the charts that follow, you might want to consider the Frequently Asked Questions (page 581) section for
a selection of common questions about MongoDB.
Terminology and Concepts
The following table presents the various SQL terminology and concepts and the corresponding MongoDB terminology
and concepts.
SQL Terms/Concepts
database
table
row
column
index
table joins
primary key
Specify any unique column or column combination as
primary key.
aggregation (e.g. group by)
MongoDB Terms/Concepts
database
collection
document or BSON document
field
index
embedded documents and linking
primary key
In MongoDB, the primary key is automatically set to
the _id field.
aggregation pipeline
See the SQL to Aggregation Mapping Chart
(page 309).
Executables
The following table presents the MySQL/Oracle executables and the corresponding MongoDB executables.
Database Server
Database Client
MySQL/Oracle
mysqld/oracle
mysql/sqlplus
MongoDB
mongod (page 925)
mongo (page 942)
Examples
The following table presents the various SQL statements and the corresponding MongoDB statements. The examples
in the table assume the following conditions:
The SQL examples assume a table named users.
The MongoDB examples assume a collection named users that contain documents of the following prototype:
{
_id: ObjectID("509a8fb2f3f4948bd2f983a0"),
user_id: "abc123",
age: 55,
status: 'A'
}
98
Create and Alter The following table presents the various SQL statements related to table-level actions and the
corresponding MongoDB statements.
99
"" } },
100
db.users.drop()
Insert The following table presents the various SQL statements related to inserting records into tables and the corresponding MongoDB statements.
SQL INSERT Statements
Reference
See insert() (page 832) for more
information.
Select The following table presents the various SQL statements related to reading records from tables and the corresponding MongoDB statements.
101
SELECT *
FROM users
db.users.find()
Reference
See find() (page 816) for more information.
See find() (page 816) for more inSELECT id, user_id, status db.users.find(
formation.
FROM users
{ },
{ user_id: 1, status: 1 }
)
SELECT user_id, status
FROM users
SELECT *
FROM users
WHERE status = "A"
db.users.find(
{ status: "A" }
)
SELECT *
FROM users
WHERE status != "A"
SELECT *
FROM users
WHERE status = "A"
AND age = 50
db.users.find(
{ status: "A",
age: 50 }
)
SELECT *
FROM users
WHERE status = "A"
OR age = 50
SELECT *
FROM users
WHERE age > 25
db.users.find(
{ age: { $gt: 25 } }
)
SELECT *
FROM users
WHERE age < 25
db.users.find(
{ age: { $lt: 25 } }
)
SELECT *
FROM users
WHERE age > 25
AND
age <= 50
102
SELECT *
FROM users
WHERE user_id like "%bc%"
Update Records The following table presents the various SQL statements related to updating existing records in
tables and the corresponding MongoDB statements.
SQL Update Statements
UPDATE users
SET status = "C"
WHERE age > 25
UPDATE users
SET age = age + 3
WHERE status = "A"
Reference
See update() (page 849), $gt
db.users.update(
(page 622), and $set (page 655) for
{ age: { $gt: 25 } },
more information.
{ $set: { status: "C" } },
{ multi: true }
)
db.users.update(
{ status: "A" } ,
{ $inc: { age: 3 } },
{ multi: true }
)
Delete Records The following table presents the various SQL statements related to deleting records from tables and
the corresponding MongoDB statements.
SQL Delete Statements
DELETE FROM users
WHERE status = "D"
DELETE FROM users
Reference
See remove() (page 844) for more
db.users.remove( { status: "D" } )
information.
db.users.remove( )
ObjectId
Overview
103
single second; ObjectId values do not represent a strict insertion order. Clock skew between clients can also
result in non-strict ordering even for values, because client drivers generate ObjectId values, not the mongod
(page 925) process.
Also consider the Documents (page 92) section for related information on MongoDBs document orientation.
ObjectId()
The mongo (page 942) shell provides the ObjectId() wrapper class to generate a new ObjectId, and to provide the
following helper attribute and methods:
str
The hexadecimal string value of the ObjectId() object.
getTimestamp() (page 916)
Returns the timestamp portion of the ObjectId() object as a Date.
toString() (page 916)
Returns the string representation of the ObjectId() object. The returned string literal has the
format ObjectId(...).
Changed in version 2.2: In previous versions toString() (page 916) returns the value of the
ObjectId as a hexadecimal string.
valueOf() (page 917)
Returns the value of the ObjectId() object as a hexadecimal string. The returned string is the
str attribute.
Changed in version 2.2: In previous versions valueOf() (page 917) returns the ObjectId()
object.
Examples
Consider the following uses ObjectId() class in the mongo (page 942) shell:
Generate a new ObjectId To generate a new ObjectId, use the ObjectId() constructor with no argument:
x = ObjectId()
To generate a new ObjectId using the ObjectId() constructor with a unique hexadecimal string:
y = ObjectId("507f191e810c19729de860ea")
To return the timestamp of an ObjectId() object, use the getTimestamp() (page 916) method as follows:
104
Convert an ObjectId into a Timestamp To return the timestamp of an ObjectId() object, use the
getTimestamp() (page 916) method as follows:
ObjectId("507f191e810c19729de860ea").getTimestamp()
Convert ObjectIds into Strings Access the str attribute of an ObjectId() object, as follows:
ObjectId("507f191e810c19729de860ea").str
To return the value of an ObjectId() object as a hexadecimal string, use the valueOf() (page 917) method as
follows:
ObjectId("507f191e810c19729de860ea").valueOf()
To return the string representation of an ObjectId() object, use the toString() (page 916) method as follows:
ObjectId("507f191e810c19729de860ea").toString()
BSON Types
BSON is a binary serialization format used to store documents and make remote procedure calls in MongoDB. The
BSON specification is located at bsonspec.org44 .
BSON supports the following data types as values in documents. Each data type has a corresponding number that can
be used with the $type (page 630) operator to query documents by BSON type.
44 http://bsonspec.org/
105
Type
Double
String
Object
Array
Binary data
Object id
Boolean
Date
Null
Regular Expression
JavaScript
Symbol
JavaScript (with scope)
32-bit integer
Timestamp
64-bit integer
Min key
Max key
Number
1
2
3
4
5
7
8
9
10
11
13
14
15
16
17
18
255
127
To determine a fields type, see Check Types in the mongo Shell (page 201).
If you convert BSON to JSON, see Data Type Fidelity (page 150) for more information.
The next sections describe special considerations for particular BSON types.
ObjectId
ObjectIds are: small, likely unique, fast to generate, and ordered. These values consists of 12-bytes, where the first
four bytes are a timestamp that reflect the ObjectIds creation. Refer to the ObjectId (page 103) documentation for
more information.
String
BSON strings are UTF-8. In general, drivers for each programming language convert from the languages string format
to UTF-8 when serializing and deserializing BSON. This makes it possible to store most international characters in
BSON strings with ease. 45 In addition, MongoDB $regex (page 633) queries support UTF-8 in the regex string.
Timestamps
BSON has a special timestamp type for internal MongoDB use and is not associated with the regular Date (page 107)
type. Timestamp values are a 64 bit value where:
the first 32 bits are a time_t value (seconds since the Unix epoch)
the second 32 bits are an incrementing ordinal for operations within a given second.
Within a single mongod (page 925) instance, timestamp values are always unique.
In replication, the oplog has a ts field. The values in this field reflect the operation time, which uses a BSON
timestamp value.
45 Given strings using UTF-8 character sets, using sort() (page 872) on strings will be reasonably correct. However, because internally
sort() (page 872) uses the C++ strcmp api, the sort order may handle some characters incorrectly.
106
Note: The BSON Timestamp type is for internal MongoDB use. For most cases, in application development, you
will want to use the BSON date type. See Date (page 107) for more information.
If you create a BSON Timestamp using the empty constructor (e.g. new Timestamp()), MongoDB will only
generate a timestamp if you use the constructor in the first field of the document. 46 Otherwise, MongoDB will
generate an empty timestamp value (i.e. Timestamp(0, 0).)
Changed in version 2.1: mongo (page 942) shell displays the Timestamp value with the wrapper:
Timestamp(<time_t>, <ordinal>)
Prior to version 2.1, the mongo (page 942) shell display the Timestamp value as a document:
{ t : <time_t>, i : <ordinal> }
Date
BSON Date is a 64-bit integer that represents the number of milliseconds since the Unix epoch (Jan 1, 1970). The
official BSON specification47 refers to the BSON Date type as the UTC datetime.
Changed in version 2.0: BSON Date type is signed.
48
Example
Construct a Date using the new Date() constructor in the mongo (page 942) shell:
var mydate1 = new Date()
Example
Construct a Date using the ISODate() constructor in the mongo (page 942) shell:
var mydate2 = ISODate()
Example
Return the Date value as string:
mydate1.toString()
Example
Return the month portion of the Date value; months are zero-indexed, so that January is month 0:
mydate1.getMonth()
46
If the first field in the document is _id, then you can generate a timestamp in the second field of a document.
47 http://bsonspec.org/#/specification
48 Prior to version 2.0, Date values were incorrectly interpreted as unsigned integers, which affected sorts, range queries, and indexes on Date
fields. Because indexes are not recreated when upgrading, please re-index if you created an index on Date values with an earlier version, and dates
before 1970 are relevant to your application.
107
108
Strict Mode
data_binary
data_date
data_timestamp
JavaScript
(via JSONP)
Mode
Notes
<bindata> is the
{
BinData ( <t>, <bindata> )
base64 representation
"$binary": "<bindata>",
"$binary": "<bindata>",
of a binary string.
"$type": "<t>"
"$type": "<t>"
<t> is the hexadeci}
}
mal representation of
a single byte that indicates the data type.
<date> is the JSON
{
new Date( <date> n
)ew Date ( <date> )
representation of a
"$date": <date>
64-bit signed integer
}
for milliseconds since
epoch UTC (unsigned
before version 1.9.1).
<t> is the JSON rep{
{
Timestamp( <t>, <i> )
resentation of a 32-bit
"$timestamp":
"$timestamp":
unsigned integer for
{
{
seconds since epoch.
"t": <t>,
"t": <t>,
<i> is a 32-bit un"i": <i>
"i": <i>
signed integer for the
}
}
increment.
}
}
<sRegex> is a string
/<jRegex>/<jOptions>
/<jRegex>/<jOptions>
of valid JSON charac"$regex": "<sRegex>",
ters.
"$options": "<sOptions>"
<jRegex>
is
a
}
string that may contain
valid
JSON
characters and unescaped double quote
(") characters, but
may not contain unescaped forward slash
(http://docs.mongodb.o
characters.
<sOptions> is a
string containing the
regex options represented by the letters
of the alphabet.
<jOptions>
is
a string that may
contain only the
characters g, i,
m and s (added
in v1.9).
Because
the
JavaScript
and mongo Shell
representations support a limited range
of options, any nonconforming options
will be dropped when
converting to this
representation.
2.4. MongoDB CRUD Reference
109
<id> is a 24{
{
ObjectId( "<id>" )
data_oid
character hexadecimal
"$oid": "<id>"
"$oid": "<id>"
string.
}
}
data_regex
GridFS Reference
GridFS stores files in two collections:
chunks stores the binary chunks. For details, see The chunks Collection (page 110).
files stores the files metadata. For details, see The files Collection (page 110).
GridFS places the collections in a common bucket by prefixing each with the bucket name. By default, GridFS uses
two collections with names prefixed by fs bucket:
fs.files
fs.chunks
You can choose a different bucket name than fs, and create multiple buckets in a single database.
See also:
GridFS (page 154) for more information about GridFS.
The chunks Collection
Each document in the chunks collection represents a distinct chunk of a file as represented in the GridFS store. The
following is a prototype document from the chunks collection.:
{
"_id" : <string>,
"files_id" : <string>,
"n" : <num>,
"data" : <binary>
}
Each document in the files collection represents a file in the GridFS store. Consider the following prototype of a
document in the files collection:
{
"_id" : <ObjectID>,
"length" : <num>,
"chunkSize" : <num>
"uploadDate" : <timestamp>
"md5" : <hash>
110
"filename" : <string>,
"contentType" : <string>,
"aliases" : <string array>,
"metadata" : <dataObject>,
}
Documents in the files collection contain some or all of the following fields. Applications may create additional
arbitrary fields:
files._id
The unique ID for this document. The _id is of the data type you chose for the original document. The default
type for MongoDB documents is BSON ObjectID.
files.length
The size of the document in bytes.
files.chunkSize
The size of each chunk. GridFS divides the document into chunks of the size specified here. The default size is
256 kilobytes.
files.uploadDate
The date the document was first stored by GridFS. This value has the Date type.
files.md5
An MD5 hash returned from the filemd5 API. This value has the String type.
files.filename
Optional. A human-readable name for the document.
files.contentType
Optional. A valid MIME type for the document.
files.aliases
Optional. An array of alias strings.
files.metadata
Optional. Any additional information you want to store.
The bios Example Collection
The bios collection provides example data for experimenting with MongoDB. Many of this guides examples on
insert (page 832), update (page 849) and read (page 816) operations create or query data from the bios
collection.
The following documents comprise the bios collection. In the examples, the data might be different, as the examples
themselves make changes to the data.
{
"_id" : 1,
"name" : {
"first" : "John",
"last" : "Backus"
},
"birth" : ISODate("1924-12-03T05:00:00Z"),
"death" : ISODate("2007-03-17T04:00:00Z"),
"contribs" : [
"Fortran",
"ALGOL",
"Backus-Naur Form",
111
"FP"
],
"awards" : [
{
"award" : "W.W. McDowell Award",
"year" : 1967,
"by" : "IEEE Computer Society"
},
{
"award" : "National Medal of Science",
"year" : 1975,
"by" : "National Science Foundation"
},
{
"award" : "Turing Award",
"year" : 1977,
"by" : "ACM"
},
{
"award" : "Draper Prize",
"year" : 1993,
"by" : "National Academy of Engineering"
}
]
}
{
"_id" : ObjectId("51df07b094c6acd67e492f41"),
"name" : {
"first" : "John",
"last" : "McCarthy"
},
"birth" : ISODate("1927-09-04T04:00:00Z"),
"death" : ISODate("2011-12-24T05:00:00Z"),
"contribs" : [
"Lisp",
"Artificial Intelligence",
"ALGOL"
],
"awards" : [
{
"award" : "Turing Award",
"year" : 1971,
"by" : "ACM"
},
{
"award" : "Kyoto Prize",
"year" : 1988,
"by" : "Inamori Foundation"
},
{
"award" : "National Medal of Science",
"year" : 1990,
"by" : "National Science Foundation"
}
]
}
112
{
"_id" : 3,
"name" : {
"first" : "Grace",
"last" : "Hopper"
},
"title" : "Rear Admiral",
"birth" : ISODate("1906-12-09T05:00:00Z"),
"death" : ISODate("1992-01-01T05:00:00Z"),
"contribs" : [
"UNIVAC",
"compiler",
"FLOW-MATIC",
"COBOL"
],
"awards" : [
{
"award" : "Computer Sciences Man of the Year",
"year" : 1969,
"by" : "Data Processing Management Association"
},
{
"award" : "Distinguished Fellow",
"year" : 1973,
"by" : " British Computer Society"
},
{
"award" : "W. W. McDowell Award",
"year" : 1976,
"by" : "IEEE Computer Society"
},
{
"award" : "National Medal of Technology",
"year" : 1991,
"by" : "United States"
}
]
}
{
"_id" : 4,
"name" : {
"first" : "Kristen",
"last" : "Nygaard"
},
"birth" : ISODate("1926-08-27T04:00:00Z"),
"death" : ISODate("2002-08-10T04:00:00Z"),
"contribs" : [
"OOP",
"Simula"
],
"awards" : [
{
"award" : "Rosing Prize",
"year" : 1999,
"by" : "Norwegian Data Association"
},
{
113
114
},
{
"award" : "NLUUG Award",
"year" : 2003,
"by" : "NLUUG"
}
]
}
{
"_id" : ObjectId("51e062189c6ae665454e301d"),
"name" : {
"first" : "Dennis",
"last" : "Ritchie"
},
"birth" : ISODate("1941-09-09T04:00:00Z"),
"death" : ISODate("2011-10-12T04:00:00Z"),
"contribs" : [
"UNIX",
"C"
],
"awards" : [
{
"award" : "Turing Award",
"year" : 1983,
"by" : "ACM"
},
{
"award" : "National Medal of Technology",
"year" : 1998,
"by" : "United States"
},
{
"award" : "Japan Prize",
"year" : 2011,
"by" : "The Japan Prize Foundation"
}
]
}
{
"_id" : 8,
"name" : {
"first" : "Yukihiro",
"aka" : "Matz",
"last" : "Matsumoto"
},
"birth" : ISODate("1965-04-14T04:00:00Z"),
"contribs" : [
"Ruby"
],
"awards" : [
{
"award" : "Award for the Advancement of Free Software",
"year" : "2011",
"by" : "Free Software Foundation"
}
]
115
}
{
"_id" : 9,
"name" : {
"first" : "James",
"last" : "Gosling"
},
"birth" : ISODate("1955-05-19T04:00:00Z"),
"contribs" : [
"Java"
],
"awards" : [
{
"award" : "The Economist Innovation Award",
"year" : 2002,
"by" : "The Economist"
},
{
"award" : "Officer of the Order of Canada",
"year" : 2007,
"by" : "Canada"
}
]
}
{
"_id" : 10,
"name" : {
"first" : "Martin",
"last" : "Odersky"
},
"contribs" : [
"Scala"
],
}
116
CHAPTER 3
Data Models
Data in MongoDB has a flexible schema. Collections do not enforce document structure. Although you may be
able to use different structures for a single data set in MongoDB, different data models may have significant impacts
on MongoDB and application performance. Consider Data Modeling Considerations for MongoDB Applications
(page 117) for a conceptual overview of data modeling problems in MongoDB, and the Data Modeling Patterns
(page 124) documents for examples of different approaches to data models.
See also:
MongoDB Use Case Studies1 for overviews of application design, including data models, with MongoDB.
3.1 Background
3.1.1 Data Modeling Considerations for MongoDB Applications
Overview
Data in MongoDB has a flexible schema. Collections do not enforce document structure. This means that:
documents in the same collection do not need to have the same set of fields or structure, and
common fields in a collections documents may hold different types of data.
Each document only needs to contain relevant fields to the entity or object that the document represents. In practice,
most documents in a collection share a similar structure. Schema flexibility means that you can model your documents
in MongoDB so that they can closely resemble and reflect application-level objects.
As in all data modeling, when developing data models (i.e. schema designs) for MongoDB, you must consider the
inherent properties and requirements of the application objects and the relationships between application objects.
MongoDB data models must also reflect:
how data will grow and change over time, and
the kinds of queries your application will perform.
These considerations and requirements force developers to make a number of multi-factored decisions when modeling
data, including:
normalization and de-normalization.
1 http://docs.mongodb.org/ecosystem/use-cases
117
These decisions reflect the degree to which the data model should store related pieces of data in a single document. Fully normalized data models describe relationships using references (page 121) between documents,
while de-normalized models may store redundant information across related models.
indexing strategy (page 338).
representation of data in arrays in BSON.
Although a number of data models may be functionally equivalent for a given application, different data models may
have significant impacts on MongoDB and applications performance.
This document provides a high level overview of these data modeling decisions and factors. In addition, consider the
Data Modeling Patterns and Examples (page 121) section which provides more concrete examples of all the discussed
patterns.
Data Modeling Decisions
Data modeling decisions involve determining how to structure the documents to model the data effectively. The
primary decision is whether to embed (page 118) or to use references (page 118).
Embedding
To normalize data, store references (page 121) between two documents to indicate a relationship between the data
represented in each document.
In general, use normalized data models:
118
when embedding would result in duplication of data but would not provide sufficient read performance advantages to outweigh the implications of the duplication.
to represent more complex many-to-many relationships.
to model large hierarchical data sets. See data-modeling-trees.
Referencing provides more flexibility than embedding; however, to resolve the references, client-side applications
must issue follow-up queries. In other words, using references requires more roundtrips to the server.
See Model Referenced One-to-Many Relationships Between Documents (page 126) for an example of referencing.
Atomicity
MongoDB only provides atomic operations on the level of a single document. 2 As a result needs for atomic operations
influence decisions to use embedded or referenced relationships when modeling data for MongoDB.
Embed fields that need to be modified together atomically in the same document. See Model Data for Atomic Operations (page 128) for an example of atomic updates within a single document.
Operational Considerations
In addition to normalization and normalization concerns, a number of other operational factors help shape data modeling decisions in MongoDB. These factors include:
data lifecycle management,
number of collections and
indexing requirements,
sharding, and
managing document growth.
These factors implications for database and application performance as well as future maintenance and development
costs.
Data Lifecycle Management
Data modeling decisions should also take data lifecycle management into consideration.
The Time to Live or TTL feature (page 158) of collections expires documents after a period of time. Consider using
the TTL feature if your application requires some data to persist in the database for a limited period of time.
Additionally, if your application only uses recently inserted documents consider Capped Collections (page 156).
Capped collections provide first-in-first-out (FIFO) management of inserted documents and optimized to support operations that insert and read documents based on insertion order.
Large Number of Collections
In certain situations, you might choose to store information in several collections rather than in a single collection.
Consider a sample collection logs that stores log documents for various environment and applications. The logs
collection contains documents of the following form:
2 Document-level atomic operations include all operations within a single MongoDB document record: operations that affect multiple subdocuments within that single record are still atomic.
3.1. Background
119
If the total number of documents is low you may group documents into collection by type. For logs, consider maintaining distinct log collections, such as logs.dev and logs.debug. The logs.dev collection would contain
only the documents related to the dev environment.
Generally, having large number of collections has no significant performance penalty and results in very good performance. Distinct collections are very important for high-throughput batch processing.
When using models that have a large number of collections, consider the following behaviors:
Each collection has a certain minimum overhead of a few kilobytes.
Each index, including the index on _id, requires at least 8KB of data space.
A single <database>.ns file stores all meta-data for each database. Each index and collection has its own entry
in the namespace file, MongoDB places limits on the size of namespace files (page 1016).
Because of limits on namespaces (page 1015), you may wish to know the current number of namespaces in
order to determine how many additional namespaces the database can support, as in the following example:
db.system.namespaces.count()
The <database>.ns file defaults to 16 MB. To change the size of the <database>.ns file, pass a new size to
--nssize option <new size MB> on server start.
The --nssize sets the size for new <database>.ns files. For existing databases, after starting up the server with
--nssize, run the db.repairDatabase() (page 892) command from the mongo (page 942) shell.
Indexes
Create indexes to support common queries. Generally, indexes and index use in MongoDB correspond to indexes and
index use in relational database: build indexes on fields that appear often in queries and for all operations that return
sorted results. MongoDB automatically creates a unique index on the _id field.
As you create indexes, consider the following behaviors of indexes:
Each index requires at least 8KB of data space.
Adding an index has some negative performance impact for write operations. For collections with high writeto-read ratio, indexes are expensive as each insert must add keys to each index.
Collections with high proportion of read operations to write operations often benefit from additional indexes.
Indexes do not affect un-indexed read operations.
See Indexing Tutorials (page 338) for more information on determining indexes. Additionally, the MongoDB database
profiler (page 167) may help identify inefficient queries.
Sharding
Sharding allows users to partition a collection within a database to distribute the collections documents across a
number of mongod (page 925) instances or shards.
The shard key determines how MongoDB distributes data among shards in a sharded collection. Selecting the proper
shard key (page 506) has significant implications for performance.
See Sharding Introduction (page 493) and Shard Keys (page 506) for more information.
120
Document Growth
Certain updates to documents can increase the document size, such as pushing elements to an array and adding new
fields. If the document size exceeds the allocated space for that document, MongoDB relocates the document on disk.
This internal relocation can be both time and resource consuming.
Although MongoDB automatically provides padding to minimize the occurrence of relocations, you may still need
to manually handle document growth. Refer to the Pre-Aggregated Reports Use Case Study3 for an example of the
Pre-allocation approach to handle document growth.
Data Modeling Patterns and Examples
The following documents provide overviews of various data modeling patterns and common schema design considerations:
Model Embedded One-to-One Relationships Between Documents (page 124)
Model Embedded One-to-Many Relationships Between Documents (page 125)
Model Referenced One-to-Many Relationships Between Documents (page 126)
Model Data for Atomic Operations (page 128)
Model Tree Structures with Parent References (page 129)
Model Tree Structures with Child References (page 129)
Model Tree Structures with Materialized Paths (page 131)
Model Tree Structures with Nested Sets (page 132)
For more information and examples of real-world data modeling, consider the following external resources:
Schema Design by Example4
Walkthrough MongoDB Data Modeling5
Document Design for MongoDB6
Dynamic Schema Blog Post7
MongoDB Data Modeling and Rails8
Ruby Example of Materialized Paths9
Sean Cribs Blog Post10 which was the source for much of the data-modeling-trees content.
3.1. Background
121
1. Manual references (page 122) where you save the _id field of one document in another document as a reference.
Then your application can run a second query to return the embedded data. These references are simple and
sufficient for most use cases.
2. DBRefs (page 123) are references from one document to another using the value of the first documents _id
field collection, and optional database name. To resolve DBRefs, your application must perform additional
queries to return the referenced documents. Many drivers (page 95) have helper methods that form the query
for the DBRef automatically. The drivers 11 do not automatically resolve DBRefs into documents.
Use a DBRef when you need to embed documents from multiple collections in documents from one collection.
DBRefs also provide a common format and type to represent these relationships among documents. The DBRef
format provides common semantics for representing links between documents if your database must interact
with multiple frameworks and tools.
Unless you have a compelling reason for using a DBRef, use manual references.
Manual References
Background
Manual references refers to the practice of including one documents _id field in another document. The application
can then issue a second query to resolve the referenced fields as needed.
Process
Consider the following operation to insert two documents, using the _id field of the first document as a reference in
the second document:
original_id = ObjectId()
db.places.insert({
"_id": original_id,
"name": "Broadway Center",
"url": "bc.example.net"
})
db.people.insert({
"name": "Erin",
"places_id": original_id,
"url": "bc.example.net/Erin"
})
Then, when a query returns the document from the people collection you can, if needed, make a second query for
the document referenced by the places_id field in the places collection.
Use
For nearly every case where you want to store a relationship between two documents, use manual references
(page 122). The references are simple to create and your application can resolve references as needed.
The only limitation of manual linking is that these references do not convey the database and collection name. If you
have documents in a single collection that relate to documents in more than one collection, you may need to consider
using DBRefs (page 123).
11
Some community supported drivers may have alternate behavior and may resolve a DBRef into a document automatically.
122
DBRefs
Background
DBRefs are a convention for representing a document, rather than a specific reference type. They include the name
of the collection, and in some cases the database, in addition to the value from the _id field.
Format
The DBRef in this example, points to a document in the creators collection of the users database that has
ObjectId("5126bc054aed4daf9e2ab772") in its _id field.
Note: The order of fields in the DBRef matters, and you must use the above sequence when using a DBRef.
Support
C++ The C++ driver contains no support for DBRefs. You can transverse references manually.
C# The C# driver provides access to DBRef objects with the MongoDBRef Class12 and supplies the FetchDBRef
Method13 for accessing these objects.
12 http://api.mongodb.org/csharp/current/html/46c356d3-ed06-a6f8-42fa-e0909ab64ce2.htm
13 http://api.mongodb.org/csharp/current/html/1b0b8f48-ba98-1367-0a7d-6e01c8df436f.htm
3.1. Background
123
Java The DBRef14 class provides supports for DBRefs from Java.
JavaScript The mongo (page 942) shells JavaScript (page 806) interface provides a DBRef.
Perl The Perl driver contains no support for DBRefs. You can transverse references manually or use the MongoDBx::AutoDeref15 CPAN module.
PHP The PHP driver does support DBRefs, including the optional $db reference, through The MongoDBRef class16 .
Python The Python driver provides the DBRef class17 , and the dereference method18 for interacting with DBRefs.
Ruby The Ruby Driver supports DBRefs using the DBRef class19 and the deference method20 .
Use
In most cases you should use the manual reference (page 122) method for connecting two or more related documents.
However, if you need to reference documents from multiple collections, consider a DBRef.
124
state: "MA"
zip: 12345
}
If the address data is frequently retrieved with the name information, then with referencing, your application needs
to issue multiple queries to resolve the reference. The better data model would be to embed the address data in the
patron data, as in the following document:
{
_id: "joe",
name: "Joe Bookreader",
address: {
street: "123 Fake Street",
city: "Faketon",
state: "MA"
zip: 12345
}
}
With the embedded data model, your application can retrieve the complete patron information with one query.
125
city: "Boston",
state: "MA",
zip: 12345
}
If your application frequently retrieves the address data with the name information, then your application needs
to issue multiple queries to resolve the references. A more optimal schema would be to embed the address data
entities in the patron data, as in the following document:
{
_id: "joe",
name: "Joe Bookreader",
addresses: [
{
street: "123 Fake Street",
city: "Faketon",
state: "MA",
zip: 12345
},
{
street: "1 Some Other Street",
city: "Boston",
state: "MA",
zip: 12345
}
]
}
With the embedded data model, your application can retrieve the complete patron information with one query.
126
To avoid repetition of the publisher data, use references and keep the publisher information in a separate collection
from the book collection.
When using references, the growth of the relationships determine where to store the reference. If the number of books
per publisher is small with limited growth, storing the book reference inside the publisher document may sometimes
be useful. Otherwise, if the number of books per publisher is unbounded, this data model would lead to mutable,
growing arrays, as in the following example:
{
name: "O'Reilly Media",
founded: 1980,
location: "CA",
books: [12346789, 234567890, ...]
}
{
_id: 123456789,
title: "MongoDB: The Definitive Guide",
author: [ "Kristina Chodorow", "Mike Dirolf" ],
published_date: ISODate("2010-09-24"),
pages: 216,
language: "English"
}
{
_id: 234567890,
title: "50 Tips and Tricks for MongoDB Developer",
author: "Kristina Chodorow",
published_date: ISODate("2011-05-06"),
pages: 68,
language: "English"
}
To avoid mutable, growing arrays, store the publisher reference inside the book document:
{
_id: "oreilly",
name: "O'Reilly Media",
founded: 1980,
location: "CA"
127
}
{
_id: 123456789,
title: "MongoDB: The Definitive Guide",
author: [ "Kristina Chodorow", "Mike Dirolf" ],
published_date: ISODate("2010-09-24"),
pages: 216,
language: "English",
publisher_id: "oreilly"
}
{
_id: 234567890,
title: "50 Tips and Tricks for MongoDB Developer",
author: "Kristina Chodorow",
published_date: ISODate("2011-05-06"),
pages: 68,
language: "English",
publisher_id: "oreilly"
}
You can use the db.collection.findAndModify() (page 821) method to atomically determine if a book is
available for checkout and update with the new checkout information. Embedding the available field and the
checkout field within the same document ensures that the updates to these fields are in sync:
db.books.findAndModify ( {
query: {
_id: 123456789,
available: { $gt: 0 }
},
update: {
$inc: { available: -1 },
$push: { checkout: { by: "abc", date: new Date() } }
128
}
} )
{
{
{
{
{
{
_id:
_id:
_id:
_id:
_id:
_id:
You can create an index on the field parent to enable fast search by the parent node:
db.categories.ensureIndex( { parent: 1 } )
You can query by the parent field to find its immediate children nodes:
db.categories.find( { parent: "Databases" } )
The Parent Links pattern provides a simple solution to tree storage, but requires multiple queries to retrieve subtrees.
129
Pattern
The Child References pattern stores each tree node in a document; in addition to the tree node, document stores in an
array the id(s) of the nodes children.
Consider the following example that models a tree of categories using Child References:
db.categories.insert(
db.categories.insert(
db.categories.insert(
db.categories.insert(
db.categories.insert(
db.categories.insert(
{
{
{
{
{
{
_id:
_id:
_id:
_id:
_id:
_id:
"MongoDB", children: [] } )
"Postgres", children: [] } )
"Databases", children: [ "MongoDB", "Postgres" ] } )
"Languages", children: [] } )
"Programming", children: [ "Databases", "Languages" ] } )
"Books", children: [ "Programming" ] } )
The query to retrieve the immediate children of a node is fast and straightforward:
db.categories.findOne( { _id: "Databases" } ).children
You can create an index on the field children to enable fast search by the child nodes:
db.categories.ensureIndex( { children: 1 } )
You can query for a node in the children field to find its parent node as well as its siblings:
db.categories.find( { children: "MongoDB" } )
The Child References pattern provides a suitable solution to tree storage as long as no operations on subtrees are
necessary. This pattern may also provide a suitable solution for storing graphs where a node may have multiple
parents.
{
{
{
{
{
{
_id:
_id:
_id:
_id:
_id:
_id:
The query to retrieve the ancestors or path of a node is fast and straightforward:
130
You can create an index on the field ancestors to enable fast search by the ancestors nodes:
db.categories.ensureIndex( { ancestors: 1 } )
The Array of Ancestors pattern provides a fast and efficient solution to find the descendants and the ancestors of a node
by creating an index on the elements of the ancestors field. This makes Array of Ancestors a good choice for working
with subtrees.
The Array of Ancestors pattern is slightly slower than the Materialized Paths pattern but is more straightforward to
use.
{
{
{
{
{
{
_id:
_id:
_id:
_id:
_id:
_id:
You can query to retrieve the whole tree, sorting by the path:
db.categories.find().sort( { path: 1 } )
You can use regular expressions on the path field to find the descendants of Programming:
db.categories.find( { path: /,Programming,/ } )
You can also retrieve the descendants of Books where the Books is also at the topmost level of the hierarchy:
db.categories.find( { path: /^,Books,/ } )
131
{
{
{
{
{
{
_id:
_id:
_id:
_id:
_id:
_id:
The Nested Sets pattern provides a fast and efficient solution for finding subtrees but is inefficient for modifying the
tree structure. As such, this pattern is best for static trees that do not change.
132
In 2.4, MongoDB provides a text search feature. See Text Indexes (page 332) for more information.
If your application needs to perform queries on the content of a field that holds text you can perform exact matches on
the text or use $regex (page 633) to use regular expression pattern matches. However, for many operations on text,
these methods do not satisfy application requirements.
This pattern describes one method for supporting keyword search using MongoDB to support application search
functionality, that uses keywords stored in an array in the same document as the text field. Combined with a multi-key
index (page 324), this pattern can support applications keyword search operations.
Pattern
To add structures to your document to support keyword-based queries, create an array field in your documents and add
the keywords as strings in the array. You can then create a multi-key index (page 324) on the array and create queries
that select values from the array.
Example
Given a collection of library volumes that you want to provide topic-based search. For each volume, you add the array
topics, and you add as many keywords as needed for a given volume.
For the Moby-Dick volume you might have the following document:
{ title : "Moby-Dick" ,
author : "Herman Melville" ,
published : 1851 ,
ISBN : 0451526996 ,
topics : [ "whaling" , "allegory" , "revenge" , "American" ,
"novel" , "nautical" , "voyage" , "Cape Cod" ]
}
The multi-key index creates separate index entries for each keyword in the topics array. For example the index
contains one entry for whaling and another for allegory.
You then query based on the keywords. For example:
db.volumes.findOne( { topics : "voyage" }, { title: 1 } )
Note: An array with a large number of elements, such as one with several hundreds or thousands of keywords will
incur greater indexing costs on insertion.
133
Asynchronous Indexing. MongoDB builds indexes synchronously, which means that the indexes used for keyword indexes are always current and can operate in real-time. However, asynchronous bulk indexes may be
more efficient for some kinds of content and workloads.
134
CHAPTER 4
Administration
The administration documentation addresses the ongoing operation and maintenance of MongoDB instances and deployments. This documentation includes both high level overviews of these concerns as well as tutorials that cover
specific procedures and processes for operating MongoDB.
Administration Concepts (page 135) Core conceptual documentation of operational practices for managing MongoDB deployments and systems.
Backup Strategies for MongoDB Systems (page 136) Describes approaches and considerations for backing up
a MongoDB database.
Data Center Awareness (page 153) Presents the MongoDB features that allow application developers and
database administrators to configure their deployments to be more data center aware or allow operational
and location-based separation.
Monitoring for MongoDB (page 138) An overview of monitoring tools, diagnostic strategies, and approaches
to monitoring replica sets and sharded clusters.
Administration Tutorials (page 163) Tutorials that describe common administrative procedures and practices for operations for MongoDB instances and deployments.
Configuration, Maintenance, and Analysis (page 163) Describes routine management operations, including
configuration and performance analysis.
Backup and Recovery (page 181) Outlines procedures for data backup and restoration with mongod
(page 925) instances and deployments.
Administration Reference (page 217) Reference and documentation of internal mechanics of administrative features,
systems and functions and operations.
See also:
The MongoDB Manual contains administrative documentation and tutorials though out several sections. See Replica
Set Tutorials (page 419) and Sharded Cluster Tutorials (page 521) for additional tutorials and information.
Monitoring for MongoDB (page 138) An overview of monitoring tools, diagnostic strategies, and approaches
to monitoring replica sets and sharded clusters.
Run-time Database Configuration (page 145) Outlines common MongoDB configurations and examples of
best-practice configurations for common use cases.
Data Management (page 153) Core documentation that addresses issues in data management, organization, maintenance, and lifestyle management.
Data Center Awareness (page 153) Presents the MongoDB features that allow application developers and
database administrators to configure their deployments to be more data center aware or allow operational
and location-based separation.
Expire Data from Collections by Setting TTL (page 158) TTL collections make it possible to automatically
remove data from a collection based on the value of a timestamp and are useful for managing data like
machine generated event data that are only useful for a limited period of time.
Capped Collections (page 156) Capped collections provide a special type of size-constrained collections that
preserve insertion order and can support high volume inserts.
GridFS (page 154) GridFS is a specification for storing documents that exceeds the BSON-document size limit
of 16MB.
Optimization Strategies for MongoDB (page 160) Techniques for optimizing application performance with MongoDB.
136
Chapter 4. Administration
Backup Considerations
As you develop a backup strategy for your MongoDB deployment consider the following factors:
Geography. Ensure that you move some backups away from your primary database infrastructure.
System errors. Ensure that your backups can survive situations where hardware failures or disk errors impact
the integrity or availability of your backups.
Production constraints. Backup operations themselves sometimes require substantial system resources. It is
important to consider the time of the backup schedule relative to peak usage and maintenance windows.
System capabilities. Some of the block-level snapshot tools require special support on the operating-system or
infrastructure level.
Database configuration. Replication and sharding can affect the process and impact of the backup implementation. See Sharded Cluster Backup Considerations (page 137) and Replica Set Backup Considerations
(page 138).
Actual requirements. You may be able to save time, effort, and space by including only crucial data in the most
frequent backups and backing up less crucial data less frequently.
Warning: In order to use filesystem snapshots for backups, your mongod (page 925) instance must have
journal (page 995) enabled, which is the default for 64 bit versions of MongoDB since 2.0. If the journal
sits on a different filesystem than your data files then you must also disable writes while the snapshot completes.
Backup Approaches
There are two main methodologies for backing up MongoDB instances. Creating binary dumps of the database
using mongodump (page 951) or creating filesystem level snapshots. Both methodologies have advantages and disadvantages:
binary database dumps are comparatively small, because they dont include index content or pre-allocated free
space, and record padding (page 65). However, its impossible to capture a copy of a running system that reflects
a single moment in time using a binary dump.
filesystem snapshots, sometimes called block level backups, produce larger backup sizes, but complete quickly
and can reflect a single moment in time on a running system. However, snapshot systems require filesystem and
operating system support and tools.
The best option depends on the requirements of your deployment and disaster recovery needs. Typically, filesystem
snapshots are because of their accuracy and simplicity; however, mongodump (page 951) is a viable option used often
to generate backups of MongoDB systems.
In some cases, taking backups is difficult or impossible because of large data volumes, distributed architectures, and
data transmission speeds. In these situations, increase the number of members in your replica set or sets.
Backup and Recovery Procedures
For tutorials on the backup approaches, see Backup and Recovery (page 181).
Backup Strategies for MongoDB Deployments
137
Sharded clusters complicate backup operations, as distributed systems. True point-in-time backups are only possible
when stopping all write activity from the application. To create a precise moment-in-time snapshot of a cluster, stop
all application write activity to the database, capture a backup, and allow only write operations to the database after
the backup is complete.
However, you can capture a backup of a cluster that approximates a point-in-time backup by capturing a backup from
a secondary member of the replica sets that provide the shards in the cluster at roughly the same moment. If you
decide to use an approximate-point-in-time backup method, ensure that your application can operate using a copy of
the data that does not reflect a single moment in time.
For backup procedures for sharded clusters, see Backup and Restore Sharded Clusters (page 188).
Replica Set Backup Considerations In most cases, backing up data stored in a replica set is similar to backing up
data stored in a single instance. Options include:
Create a file system snapshot of a single secondary, as described in Backup and Restore with MongoDB Tools
(page 181). You may choose to maintain a dedicated hidden member for backup purposes.
As an alternative you can create a backup with the mongodump (page 951) program and the --oplog option.
To restore this backup use the mongorestore (page 956) program and the --oplogReplay option.
If you have a sharded cluster where each shard is itself a replica set, you can use one of these methods to create a
backup of the entire cluster without disrupting the operation of the node. In these situations you should still turn off
the balancer when you create backups.
For any cluster, using a non-primary node to create backups is particularly advantageous in that the backup operation
does not affect the performance of the primary. Replication itself provides some measure of redundancy. Nevertheless,
keeping point-in time backups of your cluster to provide for disaster recovery and as an additional layer of protection
is crucial.
Monitoring for MongoDB
Monitoring is a critical component of all database administration. A firm grasp of MongoDBs reporting will allow you
to assess the state of your database and maintain your deployment without crisis. Additionally, a sense of MongoDBs
normal operational parameters will allow you to diagnose before they escalate to failures.
This document presents an overview of the available monitoring utilities and the reporting statistics available in MongoDB. It also introduces diagnostic strategies and suggestions for monitoring replica sets and sharded clusters.
Note: MongoDB Management Service (MMS)3 is a hosted monitoring service which collects and aggregates data
to provide insight into the performance and operation of MongoDB deployments. See the MMS documentation4 for
more information.
Monitoring Tools
There are two primary strategies for collecting data about the state of a running MongoDB instance:
First, there is a set of utilities distributed with MongoDB that provides real-time reporting of database activities.
Second, database commands (page 694) return statistics regarding the current database state with greater fidelity.
3 http://mms.mongodb.com
4 http://mms.mongodb.com/help/
138
Chapter 4. Administration
Each strategy can help answer different questions and is useful in different contexts. The two methods are complementary.
This section provides an overview of these methods and the statistics they provide. It also offers examples of the kinds
of questions that each method is best suited to help you address.
Utilities The MongoDB distribution includes a number of utilities that quickly return statistics about instances
performance and activity. Typically, these are most useful for diagnosing issues and assessing normal operation.
mongostat mongostat (page 974) captures and returns the counts of database operations by type (e.g. insert,
query, update, delete, etc.). These counts report on the load distribution on the server.
Use mongostat (page 974) to understand the distribution of operation types and to inform capacity planning. See
the mongostat manual (page 974) for details.
mongotop mongotop (page 979) tracks and reports the current read and write activity of a MongoDB instance,
and reports these statistics on a per collection basis.
Use mongotop (page 979) to check if your database activity and use match your expectations. See the mongotop
manual (page 979) for details.
REST Interface MongoDB provides a simple REST interface that can be useful for configuring monitoring and
alert scripts, and for other administrative tasks.
To enable, configure mongod (page 925) to use REST, either by starting mongod (page 925) with the --rest
option, or by setting the rest (page 997) setting to true in a configuration file (page 990).
For more information on using the REST Interface see, the Simple REST Interface5 documentation.
HTTP Console MongoDB provides a web interface that exposes diagnostic and monitoring information in a simple
web page. The web interface is accessible at localhost:<port>, where the <port> number is 1000 more than
the mongod (page 925) port .
For example, if a locally running mongod (page 925) is using the default port 27017, access the HTTP console at
http://localhost:28017.
Commands MongoDB includes a number of commands that report on the state of the database.
These data may provide a finer level of granularity than the utilities discussed above. Consider using their output
in scripts and programs to develop custom alerts, or to modify the behavior of your application in response to the
activity of your instance. The db.currentOp (page 879) method is another useful tool for identifying the database
instances in-progress operations.
serverStatus The serverStatus (page 782) command, or db.serverStatus() (page 893) from the
shell, returns a general overview of the status of the database, detailing disk usage, memory use, connection, journaling,
and index access. The command returns quickly and does not impact MongoDB performance.
serverStatus (page 782) outputs an account of the state of a MongoDB instance. This command is rarely run
directly. In most cases, the data is more meaningful when aggregated, as one would see with monitoring tools such
as the MMS Monitoring Service6 . Nevertheless, all administrators should be familiar with the data provided by
serverStatus (page 782).
5 http://docs.mongodb.org/ecosystem/tools/http-interfaces
6 http://mms.mongodb.com
139
dbStats The dbStats (page 767) command, or db.stats() (page 894) from the shell, returns a document
that addresses storage use and data volumes. The dbStats (page 767) reflect the amount of storage used, the quantity
of data contained in the database, and object, collection, and index counters.
Use this data to monitor the state and storage capacity of a specific database. This output also allows you to compare
use between databases and to determine the average document size in a database.
collStats The collStats (page 763) provides statistics that resemble dbStats (page 767) on the collection
level, including a count of the objects in the collection, the size of the collection, the amount of disk space used by the
collection, and information about its indexes.
replSetGetStatus The replSetGetStatus (page 726) command (rs.status() (page 898) from the
shell) returns an overview of your replica sets status. The replSetGetStatus (page 726) document details the state and
configuration of the replica set and statistics about its members.
Use this data to ensure that replication is properly configured, and to check the connections between the current host
and the other members of the replica set.
Third Party Tools A number of third party monitoring tools have support for MongoDB, either directly, or through
their own plugins.
Self Hosted Monitoring Tools These are monitoring tools that you must install, configure and maintain on your
own servers. Most are open source.
Tool
Ganglia7
Plugin
mongodb-ganglia8
Ganglia
gmond_python_modules9
Motop10
None
mtop11
Munin12
Munin
None
mongo-munin13
mongomon14
Munin
Nagios16
Zabbix18
nagios-plugin-mongodb17
mikoomi-mongodb19
Description
Python script to report operations per second,
memory usage, btree statistics, master/slave status
and current connections.
Parses output from the serverStatus
(page 782) and replSetGetStatus (page 726)
commands.
Realtime monitoring tool for MongoDB servers.
Shows current operations ordered by durations
every second.
A top like tool.
Retrieves server statistics.
Retrieves collection statistics (sizes, index sizes,
and each (configured) collection count for one
DB).
Some additional munin plugins not in the main
distribution.
A simple Nagios check script, written in Python.
Monitors availability, resource utilization, health,
performance and other important metrics.
7 http://sourceforge.net/apps/trac/ganglia/wiki
8 https://github.com/quiiver/mongodb-ganglia
9 https://github.com/ganglia/gmond_python_modules
10 https://github.com/tart/motop
11 https://github.com/beaufour/mtop
12 http://munin-monitoring.org/
13 https://github.com/erh/mongo-munin
14 https://github.com/pcdummy/mongomon
15 https://launchpad.net/
140
chris-lea/+archive/munin-plugins
Chapter 4. Administration
Also consider dex20 , an index and query analyzing tool for MongoDB that compares MongoDB log files and indexes
to make indexing recommendations.
As part of MongoDB Enterprise21 , you can run MMS On-Prem22 , which offers the features of MMS in a package that
runs within your infrastructure.
Hosted (SaaS) Monitoring Tools These are monitoring tools provided as a hosted service, usually through a paid
subscription.
Name
MongoDB Management Service23
Scout25
Server Density29
Notes
MMS24 is a cloud-based suite of services for managing MongoDB
deployments. MMS provides monitoring and backup functionality.
Several plugins, including MongoDB Monitoring26 , MongoDB
Slow Queries27 , and MongoDB Replica Set Monitoring28 .
Dashboard for MongoDB30 , MongoDB specific alerts, replication
failover timeline and iPhone, iPad and Android mobile apps.
Process Logging
During normal operation, mongod (page 925) and mongos (page 938) instances report a live account of all server
activity and operations to either standard output or a log file. The following runtime settings control these options.
quiet (page 998). Limits the amount of information written to the log or output.
verbose (page 991). Increases the amount of information written to the log or output.
You can also specify this as v (as in -v). For higher levels of verbosity, set multiple v, as in vvvv = True.
You can also change the verbosity of a running mongod (page 925) or mongos (page 938) instance with the
setParameter (page 756) command.
logpath (page 992). Enables logging to a file, rather than the standard output. You must specify the full path
to the log file when adjusting this setting.
logappend (page 992). Adds information to a log file instead of overwriting the file.
Note: You can specify these configuration operations as the command line arguments to mongod (page 925) or
mongos (page 937)
For example:
mongod -v --logpath /var/log/mongodb/server1.log --logappend
Starts a mongod (page 925) instance in verbose (page 991) mode, appending data to the log file at
/var/log/mongodb/server1.log/.
16 http://www.nagios.org/
17 https://github.com/mzupan/nagios-plugin-mongodb
18 http://www.zabbix.com/
19 https://code.google.com/p/mikoomi/wiki/03
20 https://github.com/mongolab/dex
21 http://www.mongodb.com/products/mongodb-enterprise
22 http://mms.mongodb.com
23 https://mms.mongodb.com/?pk_campaign=mongodb-org&pk_kwd=monitoring
24 https://mms.mongodb.com/?pk_campaign=mongodb-org&pk_kwd=monitoring
25 http://scoutapp.com
26 https://scoutapp.com/plugin_urls/391-mongodb-monitoring
27 http://scoutapp.com/plugin_urls/291-mongodb-slow-queries
28 http://scoutapp.com/plugin_urls/2251-mongodb-replica-set-monitoring
29 http://www.serverdensity.com
30 http://www.serverdensity.com/mongodb-monitoring/
141
Degraded performance in MongoDB is typically a function of the relationship between the quantity of data stored
in the database, the amount of system RAM, the number of connections to the database, and the amount of time the
database spends in a locked state.
In some cases performance issues may be transient and related to traffic load, data access patterns, or the availability
of hardware on the host system for virtualized environments. Some users also experience performance limitations as a
result of inadequate or inappropriate indexing strategies, or as a consequence of poor schema design patterns. In other
situations, performance issues may indicate that the database may be operating at capacity and that it is time to add
additional capacity to the database.
The following are some causes of degraded performance in MongoDB.
Locks MongoDB uses a locking system to ensure consistency. However, if certain operations are long-running, or
a queue forms, performance will slow as requests and operations wait for the lock. Lock-related slowdowns can be
intermittent. To see if the lock has been affecting your performance, look to the data in the globalLock (page 785)
section of the serverStatus (page 782) output. If globalLock.currentQueue.total (page 786) is consistently high, then there is a chance that a large number of requests are waiting for a lock. This indicates a possible
concurrency issue that may be affecting performance.
If globalLock.totalTime (page 785) is high relative to uptime (page 783), the database has existed in a
lock state for a significant amount of time. If globalLock.ratio (page 786) is also high, MongoDB has likely
been processing a large number of long running queries. Long queries are often the result of a number of factors:
ineffective use of indexes, non-optimal schema design, poor query structure, system architecture issues, or insufficient
RAM resulting in page faults (page 142) and disk reads.
Memory Usage MongoDB uses memory mapped files to store data. Given a data set of sufficient size, the MongoDB
process will allocate all available memory on the system for its use. While this is part of the design, and affords
MongoDB superior performance, the memory mapped files make it difficult to determine if the amount of RAM is
sufficient for the data set.
The memory usage statuses (page 786) metrics of the serverStatus (page 782) output can provide insight into
MongoDBs memory use. Check the resident memory use (i.e. mem.resident (page 787)): if this exceeds the
amount of system memory and there is a significant amount of data on disk that isnt in RAM, you may have exceeded
the capacity of your system.
You should also check the amount of mapped memory (i.e. mem.mapped (page 787).) If this value is greater than
the amount of system memory, some operations will require disk access page faults to read data from virtual memory
and negatively affect performance.
Page Faults A page fault occurs when MongoDB requires data not located in physical memory, and must read
from virtual memory. To check for page faults, see the extra_info.page_faults (page 788) value in the
serverStatus (page 782) output. This data is only available on Linux systems.
A single page fault completes quickly and is not problematic. However, in aggregate, large volumes of page faults
typically indicate that MongoDB is reading too much data from disk. In many situations, MongoDBs read locks will
142
Chapter 4. Administration
yield after a page fault to allow other processes to read and avoid blocking while waiting for the next page to read
into memory. This approach improves concurrency, and also improves overall throughput in high volume systems.
Increasing the amount of RAM accessible to MongoDB may help reduce the number of page faults. If this is not
possible, you may want to consider deploying a sharded cluster and/or adding shards to your deployment to distribute
load among mongod (page 925) instances.
Number of Connections In some cases, the number of connections between the application layer (i.e. clients) and
the database can overwhelm the ability of the server to handle requests. This can produce performance irregularities.
The following fields in the serverStatus (page 782) document can provide insight:
globalLock.activeClients (page 786) contains a counter of the total number of clients with active
operations in progress or queued.
connections (page 787) is a container for the following two fields:
current (page 787) the total number of current clients that connect to the database instance.
available (page 787) the total number of unused collections available for new clients.
Note: Unless constrained by system-wide limits MongoDB has a hard connection limit of 20,000 connections. You
can modify system limits using the ulimit command, or by editing your systems /etc/sysctl file.
If requests are high because there are numerous concurrent application requests, the database may have trouble keeping
up with demand. If this is the case, then you will need to increase the capacity of your deployment. For read-heavy
applications increase the size of your replica set and distribute read operations to secondary members. For write heavy
applications, deploy sharding and add one or more shards to a sharded cluster to distribute load among mongod
(page 925) instances.
Spikes in the number of connections can also be the result of application or driver errors. All of the officially supported
MongoDB drivers implement connection pooling, which allows clients to use and reuse connections more efficiently.
Extremely high numbers of connections, particularly without corresponding workload is often indicative of a driver or
other configuration error.
Database Profiling MongoDBs Profiler is a database profiling system that can help identify inefficient queries
and operations.
The following profiling levels are available:
Level
0
1
2
Setting
Off. No profiling
On. Only includes slow operations
On. Includes all operations
Enable the profiler by setting the profile (page 770) value using the following command in the mongo (page 942)
shell:
db.setProfilingLevel(1)
The slowms (page 997) setting defines what constitutes a slow operation. To set the threshold above which the
profiler considers operations slow (and thus, included in the level 1 profiling data), you can configure slowms
(page 997) at runtime as an argument to the db.setProfilingLevel() (page 894) operation.
See
The documentation of db.setProfilingLevel() (page 894) for more information about this command.
143
By default, mongod (page 925) records all slow queries to its log (page 992), as defined by slowms (page 997).
Unlike log data, the data in system.profile does not persist between mongod (page 925) restarts.
Note: Because the database profiler can negatively impact performance, only enable profiling for strategic intervals
and as minimally as possible on production systems.
You may enable profiling on a per-mongod (page 925) basis. This setting will not propagate across a replica set or
sharded cluster.
You can view the output of the profiler in the system.profile collection of your database by issuing the show
profile command in the mongo (page 942) shell, or with the following operation:
db.system.profile.find( { millis : { $gt : 100 } } )
This returns all operations that lasted longer than 100 milliseconds. Ensure that the value specified here (100, in this
example) is above the slowms (page 997) threshold.
See also:
Optimization Strategies for MongoDB (page 160) addresses strategies that may improve the performance of your
database queries and operations.
Replication and Monitoring
Beyond the basic monitoring requirements for any MongoDB instance, for replica sets, administrators must monitor
replication lag. Replication lag refers to the amount of time that it takes to copy (i.e. replicate) a write operation
on the primary to a secondary. Some small delay period may be acceptable, but two significant problems emerge as
replication lag grows:
First, operations that occurred during the period of lag are not replicated to one or more secondaries. If youre
using replication to ensure data persistence, exceptionally long delays may impact the integrity of your data set.
Second, if the replication lag exceeds the length of the operation log (oplog) then MongoDB will have to perform
an initial sync on the secondary, copying all data from the primary and rebuilding all indexes. This is uncommon
under normal circumstances, but if you configure the oplog to be smaller than the default, the issue can arise.
Note: The size of the oplog is only configurable during the first run using the --oplogSize argument to
the mongod (page 925) command, or preferably, the oplogSize (page 1000) in the MongoDB configuration
file. If you do not specify this on the command line before running with the --replSet option, mongod
(page 925) will create a default sized oplog.
By default, the oplog is 5 percent of total available disk space on 64-bit systems. For more information about
changing the oplog size, see the Change the Size of the Oplog (page 446)
For causes of replication lag, see Replication Lag (page 463).
Replication issues are most often the result of network connectivity issues between members, or the result of a primary
that does not have the resources to support application and replication traffic. To check the status of a replica, use the
replSetGetStatus (page 726) or the following helper in the shell:
rs.status()
The replSetGetStatus (page 726) document provides a more in-depth overview view of this output. In general, watch
the value of optimeDate (page 727), and pay particular attention to the time difference between the primary and
the secondary members.
144
Chapter 4. Administration
In most cases, the components of sharded clusters benefit from the same monitoring and analysis as all other MongoDB
instances. In addition, clusters require further monitoring to ensure that data is effectively distributed among nodes
and that sharding operations are functioning appropriately.
See also:
See the Sharding Concepts (page 498) documentation for more information.
Config Servers The config database maintains a map identifying which documents are on which shards. The cluster
updates this map as chunks move between shards. When a configuration server becomes inaccessible, certain sharding
operations become unavailable, such as moving chunks and starting mongos (page 938) instances. However, clusters
remain accessible from already-running mongos (page 938) instances.
Because inaccessible configuration servers can seriously impact the availability of a sharded cluster, you should monitor your configuration servers to ensure that the cluster remains well balanced and that mongos (page 938) instances
can restart.
MMS Monitoring31 monitors config servers and can create notifications if a config server becomes inaccessible.
Balancing and Chunk Distribution The most effective sharded cluster deployments evenly balance chunks among
the shards. To facilitate this, MongoDB has a background balancer process that distributes data to ensure that chunks
are always optimally distributed among the shards.
Issue the db.printShardingStatus() (page 892) or sh.status() (page 910) command to the mongos
(page 938) by way of the mongo (page 942) shell. This returns an overview of the entire cluster including the
database name, and a list of the chunks.
Stale Locks In nearly every case, all locks used by the balancer are automatically released when they become stale.
However, because any long lasting lock can block future balancing, its important to insure that all locks are legitimate.
To check the lock status of the database, connect to a mongos (page 938) instance using the mongo (page 942) shell.
Issue the following command sequence to switch to the config database and display all outstanding locks on the
shard database:
use config
db.locks.find()
For active deployments, the above query can provide insights. The balancing process, which originates on a randomly
selected mongos (page 938), takes a special balancer lock that prevents other balancing activity from transpiring.
Use the following command, also to the config database, to check the status of the balancer lock.
db.locks.find( { _id : "balancer" } )
If this lock exists, make sure that the balancer process is actively using this lock.
Run-time Database Configuration
The command line (page 925) and configuration file (page 990) interfaces provide MongoDB administrators with a
large number of options and settings for controlling the operation of the database system. This document provides an
overview of common configurations and examples of best-practice configurations for common use cases.
While both interfaces provide access to the same collection of options and settings, this document primarily uses the
configuration file interface. If you run MongoDB using a control script or installed from a package for your operating
31 http://mms.mongodb.com
145
system, you likely already have a configuration file located at /etc/mongodb.conf. Confirm this by checking the
contents of the /etc/init.d/mongod or /etc/rc.d/mongod script to insure that the control scripts start the
mongod (page 925) with the appropriate configuration file (see below.)
To start a MongoDB instance using this configuration issue a command in the following form:
mongod --config /etc/mongodb.conf
mongod -f /etc/mongodb.conf
Modify the values in the /etc/mongodb.conf file on your system to control the configuration of your database
instance.
Configure the Database
For most standalone servers, this is a sufficient base configuration. It makes several assumptions, but consider the
following explanation:
fork (page 993) is true, which enables a daemon mode for mongod (page 925), which detaches (i.e. forks)
the MongoDB from the current session and allows you to run the database as a conventional server.
bind_ip (page 991) is 127.0.0.1, which forces the server to only listen for requests on the localhost IP.
Only bind to secure interfaces that the application-level systems can access with access control provided by
system network filtering (i.e. firewall).
port (page 991) is 27017, which is the default MongoDB port for database instances. MongoDB can bind to
any port. You can also filter access based on port using network filtering tools.
Note: UNIX-like systems require superuser privileges to attach processes to ports lower than 1024.
quiet (page 998) is true. This disables all but the most critical entries in output/log file. In normal operation
this is the preferable operation to avoid log noise. In diagnostic or testing situations, set this value to false.
Use setParameter (page 756) to modify this setting during run time.
dbpath (page 993) is /srv/mongodb, which specifies where MongoDB will store its data files.
/srv/mongodb and /var/lib/mongodb are popular locations. The user account that mongod (page 925)
runs under will need read and write access to this directory.
logpath (page 992) is /var/log/mongodb/mongod.log which is where mongod (page 925) will write
its output. If you do not set this value, mongod (page 925) writes all output to standard output (e.g. stdout.)
logappend (page 992) is true, which ensures that mongod (page 925) does not overwrite an existing log
file following the server start operation.
journal (page 995) is true, which enables journaling. Journaling ensures single instance write-durability.
64-bit builds of mongod (page 925) enable journaling by default. Thus, this setting may be redundant.
Given the default configuration, some of these values may be redundant. However, in many situations explicitly stating
the configuration increases overall system intelligibility.
146
Chapter 4. Administration
Security Considerations
The following collection of configuration options are useful for limiting access to a mongod (page 925) instance.
Consider the following:
bind_ip = 127.0.0.1,10.8.0.10,192.168.4.24
nounixsocket = true
auth = true
Replication Configuration Replica set configuration is straightforward, and only requires that the replSet
(page 1000) have a value that is consistent among all members of the set. Consider the following:
replSet = set0
Use descriptive names for sets. Once configured use the mongo (page 942) shell to add hosts to the replica set.
See also:
Replica set reconfiguration (page 483).
To enable authentication for the replica set, add the following option:
keyFile = /srv/mongodb/keyfile
New in version 1.8: for replica sets, and 1.9.1 for sharded replica sets.
Setting keyFile (page 993) enables authentication and specifies a key file for the replica set member use to when
authenticating to each other. The content of the key file is arbitrary, but must be the same on all members of the
replica set and mongos (page 938) instances that connect to the set. The keyfile must be less than one kilobyte in size
and may only contain characters in the base64 set and the file must not have group or world permissions on UNIX
systems.
See also:
The Replica set Reconfiguration (page 483) section for information regarding the process for changing replica set
during operation.
147
Additionally, consider the Replica Set Security (page 238) section for information on configuring authentication with
replica sets.
Finally, see the Replication (page 377) document for more information on replication in MongoDB and replica set
configuration in general.
Sharding Configuration Sharding requires a number of mongod (page 925) instances with different configurations.
The config servers store the clusters metadata, while the cluster distributes data among one or more shard servers.
Note: Config servers are not replica sets.
To set up one or three config server instances as normal (page 146) mongod (page 925) instances, and then add the
following configuration option:
configsvr = true
bind_ip = 10.8.0.12
port = 27001
This creates a config server running on the private IP address 10.8.0.12 on port 27001. Make sure that there are no
port conflicts, and that your config server is accessible from all of your mongos (page 938) and mongod (page 925)
instances.
To set up shards, configure two or more mongod (page 925) instance using your base configuration (page 146), adding
the shardsvr (page 1001) setting:
shardsvr = true
Finally, to establish the cluster, configure at least one mongos (page 938) process with the following settings:
configdb = 10.8.0.12:27001
chunkSize = 64
You can specify multiple configdb (page 1001) instances by specifying hostnames and ports in the form of a comma
separated list. In general, avoid modifying the chunkSize (page 1002) from the default value of 64, 32 and should
ensure this setting is consistent among all mongos (page 938) instances.
See also:
The Sharding (page 493) section of the manual for more information on sharding and cluster configuration.
Run Multiple Database Instances on the Same System
In many cases running multiple instances of mongod (page 925) on a single system is not recommended. On some
types of deployments 33 and for testing purposes you may need to run more than one mongod (page 925) on a single
system.
In these cases, use a base configuration (page 146) for each instance, but consider the following configuration values:
dbpath = /srv/mongodb/db0/
pidfilepath = /srv/mongodb/db0.pid
32 Chunk size is 64 megabytes by default, which provides the ideal balance between the most even distribution of data, for which smaller chunk
sizes are best, and minimizing chunk migration, for which larger chunk sizes are optimal.
33 Single-tenant systems with SSD or other high performance disks may provide acceptable performance levels for multiple mongod (page 925)
instances. Additionally, you may find that multiple databases with small working sets may function acceptably on a single system.
148
Chapter 4. Administration
The dbpath (page 993) value controls the location of the mongod (page 925) instances data directory. Ensure that
each database has a distinct and well labeled data directory. The pidfilepath (page 993) controls where mongod
(page 925) process places its process id file. As this tracks the specific mongod (page 925) file, it is crucial that file
be unique and well labeled to make it easy to start and stop these processes.
Create additional control scripts and/or adjust your existing MongoDB configuration and control script as needed to
control these processes.
Diagnostic Configurations
The following configuration options control various mongod (page 925) behaviors for diagnostic purposes. The
following settings have default values that tuned for general production purposes:
slowms = 50
profile = 3
verbose = true
diaglog = 3
objcheck = true
cpu = true
Use the base configuration (page 146) and add these options if you are experiencing some unknown issue or performance problem as needed:
slowms (page 997) configures the threshold for the database profiler to consider a query slow. The default value is 100 milliseconds. Set a lower value if the database profiler does not return useful results. See
Optimization Strategies for MongoDB (page 160) for more information on optimizing operations in MongoDB.
profile (page 997) sets the database profiler level. The profiler is not active by default because of the possible
impact on the profiler itself on performance. Unless this setting has a value, queries are not profiled.
verbose (page 991) enables a verbose logging mode that modifies mongod (page 925) output and increases
logging to include a greater number of events. Only use this option if you are experiencing an issue that is not
reflected in the normal logging level. If you require additional verbosity, consider the following options:
v = true
vv = true
vvv = true
vvvv = true
vvvvv = true
Each additional level v adds additional verbosity to the logging. The verbose option is equal to v = true.
diaglog (page 994) enables diagnostic logging. Level 3 logs all read and write options.
objcheck (page 992) forces mongod (page 925) to validate all requests from clients upon receipt. Use this
option to ensure that invalid requests are not causing errors, particularly when running a database with untrusted
clients. This option may affect database performance.
cpu (page 993) forces mongod (page 925) to report the percentage of the last interval spent in write lock. The
interval is typically 4 seconds, and each output line in the log includes both the actual interval since the last
report and the percentage of time spent in write lock.
Import and Export MongoDB Data
This document provides an overview of the import and export programs included in the MongoDB distribution. These
tools are useful when you want to backup or export a portion of your data without capturing the state of the entire
database, or for simple data ingestion cases. For more complex data migration tasks, you may want to write your own
149
import and export scripts using a client driver to interact with the database itself. For disaster recovery protection and
routine database backup operation, use full database instance backups (page 136).
Warning: Because these tools primarily operate by interacting with a running mongod (page 925) instance, they
can impact the performance of your running database.
Not only do these processes create traffic for a running database instance, they also force the database to read all
data through memory. When MongoDB reads infrequently used data, it can supplant more frequently accessed
data, causing a deterioration in performance for the databases regular workload.
mongoimport (page 965) and mongoexport (page 969) do not reliably preserve all rich BSON data types,
because BSON is a superset of JSON. Thus, mongoimport (page 965) and mongoexport (page 969) cannot
represent BSON data accurately in JSON. As a result data exported or imported with these tools may lose some
measure of fidelity. See MongoDB Extended JSON (page 108) for more information about MongoDB Extended
JSON.
See also:
See the Backup Strategies for MongoDB Systems (page 136) document for more information on backing up MongoDB
instances. Additionally, consider the following references for commands addressed in this document:
mongoexport (page 969)
mongorestore (page 956)
mongodump (page 951)
If you want to transform and process data once youve imported it in MongoDB consider the documents in the Aggregation (page 275) section, including:
Map-Reduce (page 282) and
Aggregation Concepts (page 279).
Data Type Fidelity
JSON does not have the following data types that exist in BSON documents: data_binary, data_date,
data_timestamp, data_regex, data_oid and data_ref. As a result using any tool that decodes BSON
documents into JSON will suffer some loss of fidelity.
If maintaining type fidelity is important, consider writing a data import and export system that does not force BSON
documents into JSON form as part of the process. The following list of types contain examples for how MongoDB
will represent how BSON documents render in JSON.
data_binary
{ "$binary" : "<bindata>", "$type" : "<t>" }
<bindata> is the base64 representation of a binary string. <t> is the hexadecimal representation of a single
byte indicating the data type.
data_date
Date( <date> )
<date> is the JSON representation of a 64-bit signed integer for milliseconds since epoch.
data_timestamp
Timestamp( <t>, <i> )
150
Chapter 4. Administration
<t> is the JSON representation of a 32-bit unsigned integer for milliseconds since epoch. <i> is a 32-bit
unsigned integer for the increment.
data_regex
/<jRegex>/<jOptions>
<jRegex> is a string that may contain valid JSON characters and unescaped double quote (i.e. ") characters,
but may not contain unescaped forward slash (i.e. http://docs.mongodb.org/manual) characters.
<jOptions> is a string that may contain only the characters g, i, m, and s.
data_oid
ObjectId( "<id>" )
<id> is a 24 character hexadecimal string. These representations require that data_oid values have an
associated field named _id.
data_ref
DBRef( "<name>", "<id>" )
For resilient and non-disruptive backups, use a file system or block-level disk snapshot function, such as the methods
described in the Backup Strategies for MongoDB Systems (page 136) document. The tools and operations discussed
provide functionality thats useful in the context of providing some kinds of backups.
By contrast, use import and export tools to backup a small subset of your data or to move data to or from a 3rd party
system. These backups may capture a small crucial set of data or a frequently modified section of data, for extra
insurance, or for ease of access. No matter how you decide to import or export your data, consider the following
guidelines:
Label files so that you can identify what point in time the export or backup reflects.
Labeling should describe the contents of the backup, and reflect the subset of the data corpus, captured in the
backup or export.
Do not create or apply exports if the backup process itself will have an adverse effect on a production system.
Make sure that they reflect a consistent data state. Export or backup processes can impact data integrity (i.e.
type fidelity) and consistency if updates continue during the backup process.
Test backups and exports by restoring and importing to ensure that the backups are useful.
Human Intelligible Import/Export Formats
This section describes a process to import/export your database, or a portion thereof, to a file in a JSON or CSV format.
See also:
The mongoimport (page 965) and mongoexport (page 969) documents contain complete documentation of these tools.
If you have questions about the function and parameters of these tools not covered here, please refer to these documents.
151
If you want to simply copy a database or collection from one instance to another, consider using the copydb
(page 745), clone (page 748), or cloneCollection (page 748) commands, which may be more suited to this
task. The mongo (page 942) shell provides the db.copyDatabase() (page 878) method.
These tools may also be useful for importing data into a MongoDB database from third party applications.
Collection Export with mongoexport With the mongoexport (page 969) utility you can create a backup file.
In the most simple invocation, the command takes the following form:
mongoexport --collection collection --out collection.json
This will export all documents in the collection named collection into the file collection.json. Without
the output specification (i.e. --out collection.json,) mongoexport (page 969) writes output to standard
output (i.e. stdout.) You can further narrow the results by supplying a query filter using the --query and limit
results to a single database using the --db option. For instance:
mongoexport --db sales --collection contacts --query '{"field": 1}'
This command returns all documents in the sales databases contacts collection, with a field named field with
a value of 1. Enclose the query in single quotes (e.g. ) to ensure that it does not interact with your shell environment.
The resulting documents will return on standard output.
By default, mongoexport (page 969) returns one JSON document per MongoDB document. Specify the
--jsonArray argument to return the export as a single JSON array. Use the --csv file to return the result in
CSV (comma separated values) format.
If your mongod (page 925) instance is not running, you can use the --dbpath option to specify the location to
your MongoDB instances database files. See the following example:
mongoexport --db sales --collection contacts --dbpath /srv/MongoDB/
This reads the data files directly. This locks the data directory to prevent conflicting writes. The mongod (page 925)
process must not be running or attached to these data files when you run mongoexport (page 969) in this configuration.
The --host and --port options allow you to specify a non-local host to connect to capture the export. Consider
the following example:
mongoexport --host mongodb1.example.net --port 37017 --username user --password pass --collection con
On any mongoexport (page 969) command you may, as above specify username and password credentials as above.
Collection Import with mongoimport To restore a backup taken with mongoexport (page 969). Most of the
arguments to mongoexport (page 969) also exist for mongoimport (page 965). Consider the following command:
mongoimport --collection collection --file collection.json
This imports the contents of the file collection.json into the collection named collection. If you do not
specify a file with the --file option, mongoimport (page 965) accepts input over standard input (e.g. stdin.)
If you specify the --upsert option, all of mongoimport (page 965) operations will attempt to update existing
documents in the database and insert other documents. This option will cause some performance impact depending on
your configuration.
You can specify the database option --db to import these documents to a particular database. If your MongoDB
instance is not running, use the --dbpath option to specify the location of your MongoDB instances database
files. Consider using the --journal option to ensure that mongoimport (page 965) records its operations in
the journal. The mongod process must not be running or attached to these data files when you run mongoimport
(page 965) in this configuration.
152
Chapter 4. Administration
Use the --ignoreBlanks option to ignore blank fields. For CSV and TSV imports, this option provides the
desired functionality in most cases: it avoids inserting blank fields in MongoDB documents.
Operational Overview MongoDB includes a number of features that allow database administrators and developers
to segregate application operations to MongoDB deployments by functional or geographical groupings.
This capability provides data center awareness, which allows applications to target MongoDB deployments with
consideration of the physical location of the mongod (page 925) instances. MongoDB supports segmentation of
operations across different dimensions, which may include multiple data centers and geographical regions in multidata center deployments, racks, networks, or power circuits in single data center deployments.
MongoDB also supports segregation of database operations based on functional or operational parameters, to ensure
that certain mongod (page 925) instances are only used for reporting workloads or that certain high-frequency portions
of a sharded collection only exist on specific shards.
Specifically, with MongoDB, you can:
4.1. Administration Concepts
153
ensure write operations propagate to specific members of a replica set, or to specific members of replica sets.
ensure that specific members of a replica set respond to queries.
ensure that specific ranges of your shard key balance onto and reside on specific shards.
combine the above features in a single distributed deployment, on a per-operation (for read and write operations)
and collection (for chunk distribution in sharded clusters distribution) basis.
For full documentation of these features, see the following documentation in the MongoDB Manual:
Read Preferences (page 405), which controls how drivers help applications target read operations to members
of a replica set.
Write Concerns (page 55), which controls how MongoDB ensures that write operations propagate to members
of a replica set.
Replica Set Tags (page 451), which control how applications create and interact with custom groupings of replica
set members to create custom application-specific read preferences and write concerns.
Tag Aware Sharding (page 557), which allows MongoDB administrators to define an application-specific balancing policy, to control how documents belonging to specific ranges of a shard key distribute to shards in the
sharded cluster.
See also:
Before adding operational segregation features to your application and MongoDB deployment, become familiar with
all documentation of replication (page 377), :and doc:sharding </sharding>.
Further Reading
The Write Concern (page 55) and Read Preference (page 405) documents, which address capabilities related to
data center awareness.
Deploy a Geographically Distributed Replica Set (page 425).
GridFS
GridFS is a specification for storing and retrieving files that exceed the BSON-document size limit (page 1015) of
16MB.
Instead of storing a file in a single document, GridFS divides a file into parts, or chunks, 34 and stores each of those
chunks as a separate document. By default GridFS limits chunk size to 256k. GridFS uses two collections to store
files. One collection stores the file chunks, and the other stores file metadata.
When you query a GridFS store for a file, the driver or client will reassemble the chunks as needed. You can perform
range queries on files stored through GridFS. You also can access information from arbitrary sections of files, which
allows you to skip into the middle of a video or audio file.
GridFS is useful not only for storing files that exceed 16MB but also for storing any files for which you want access
without having to load the entire file into memory. For more information on the indications of GridFS, see When
should I use GridFS? (page 588).
Implement GridFS
To store and retrieve files using GridFS, use either of the following:
34
The use of the term chunks in the context of GridFS is not related to the use of the term chunks in the context of sharding.
154
Chapter 4. Administration
A MongoDB driver. See the drivers (page 95) documentation for information on using GridFS with your driver.
The mongofiles (page 986) command-line tool in the mongo (page 942) shell. See mongofiles (page 986).
GridFS Collections
GridFS uses a unique, compound index on the chunks collection for the files_id and n fields. The files_id
field contains the _id of the chunks parent document. The n field contains the sequence number of the chunk.
GridFS numbers all chunks, starting with 0. For descriptions of the documents and fields in the chunks collection,
see GridFS Reference (page 110).
The GridFS index allows efficient retrieval of chunks using the files_id and n values, as shown in the following
example:
cursor = db.fs.chunks.find({files_id: myFileID}).sort({n:1});
See the relevant driver (page 95) documentation for the specific behavior of your GridFS application. If your driver
does not create this index, issue the following operation using the mongo (page 942) shell:
db.fs.chunks.ensureIndex( { files_id: 1, n: 1 }, { unique: true } );
Example Interface
The following is an example of the GridFS interface in Java. The example is for demonstration purposes only. For
API specifics, see the relevant driver (page 95) documentation.
By default, the interface must support the default GridFS bucket, named fs, as in the following:
// returns default GridFS bucket (i.e. "fs" collection)
GridFS myFS = new GridFS(myDatabase);
// saves the file to "fs" GridFS bucket
myFS.createFile(new File("/tmp/largething.mpg"));
Optionally, interfaces may support other additional GridFS buckets as in the following example:
155
Capped Collections
Capped collections are fixed-size collections that support high-throughput operations that insert, retrieve, and delete
documents based on insertion order. Capped collections work in a way similar to circular buffers: once a collection
fills its allocated space, it makes room for new documents by overwriting the oldest documents in the collection.
See createCollection() (page 878) or createCollection for more information on creating capped collections.
Capped collections have the following behaviors:
Capped collections guarantee preservation of the insertion order. As a result, queries do not need an index to
return documents in insertion order. Without this indexing overhead, they can support higher insertion throughput.
Capped collections guarantee that insertion order is identical to the order on disk (natural order) and do so
by prohibiting updates that increase document size. Capped collections only allow updates that fit the original
document size, which ensures a document does not change its location on disk.
Capped collections automatically remove the oldest documents in the collection without requiring scripts or
explicit remove operations.
For example, the oplog.rs collection that stores a log of the operations in a replica set uses a capped collection.
Consider the following potential uses cases for capped collections:
Store log information generated by high-volume systems. Inserting documents in a capped collection without
an index is close to the speed of writing log information directly to a file system. Furthermore, the built-in
first-in-first-out property maintains the order of events, while managing storage use.
Cache small amounts of data in a capped collections. Since caches are read rather than write heavy, you would
either need to ensure that this collection always remains in the working set (i.e. in RAM) or accept some write
penalty for the required index or indexes.
Recommendations and Restrictions
You can update documents in a collection after inserting them. However, these updates cannot cause the documents to grow. If the update operation causes the document to grow beyond their original size, the update
operation will fail.
If you plan to update documents in a capped collection, create an index so that these update operations do not
require a table scan.
You cannot delete documents from a capped collection. To remove all records from a capped collection, use the
emptycapped command. To remove the collection entirely, use the drop() (page 812) method.
You cannot shard a capped collection.
156
Chapter 4. Administration
Capped collections created after 2.2 have an _id field and an index on the _id field by default. Capped
collections created before 2.2 do not have an index on the _id field by default. If you are using capped
collections with replication prior to 2.2, you should explicitly create an index on the _id field.
Warning: If you have a capped collection in a replica set outside of the local database, before 2.2,
you should create a unique index on _id. Ensure uniqueness using the unique: true option to the
ensureIndex() (page 814) method or by using an ObjectId for the _id field. Alternately, you can use
the autoIndexId option to create (page 747) when creating the capped collection, as in the Query a
Capped Collection (page 157) procedure.
Use natural ordering to retrieve the most recently inserted elements from the collection efficiently. This is
(somewhat) analogous to tail on a log file.
Procedures
Create a Capped Collection You must create capped collections explicitly using the createCollection()
(page 878) method, which is a helper in the mongo (page 942) shell for the create (page 747) command. When
creating a capped collection you must specify the maximum size of the collection in bytes, which MongoDB will preallocate for the collection. The size of the capped collection includes a small amount of space for internal overhead.
db.createCollection( "log", { capped: true, size: 100000 } )
Additionally, you may also specify a maximum number of documents for the collection using the max field as in the
following document:
db.createCollection("log", { capped : true, size : 5242880, max : 5000 } )
Important: The size argument is always required, even when you specify max number of documents. MongoDB
will remove older documents if a collection reaches the maximum size limit before it reaches the maximum document
count.
See
createCollection() (page 878) and create (page 747).
Query a Capped Collection If you perform a find() (page 816) on a capped collection with no ordering specified,
MongoDB guarantees that the ordering of results is the same as the insertion order.
To retrieve documents in reverse insertion order, issue find() (page 816) along with the sort() (page 872) method
with the $natural (page 693) parameter set to -1, as shown in the following example:
db.cappedCollection.find().sort( { $natural: -1 } )
Check if a Collection is Capped Use the isCapped() (page 837) method to determine if a collection is capped,
as follows:
db.collection.isCapped()
Convert a Collection to Capped You can convert a non-capped collection to a capped collection with the
convertToCapped (page 749) command:
157
The size parameter specifies the size of the capped collection in bytes.
Changed in version 2.2: Before 2.2, capped collections did not have an index on _id unless you specified
autoIndexId to the create (page 747), after 2.2 this became the default.
Automatically Remove Data After a Specified Period of Time For additional flexibility when expiring data, consider MongoDBs TTL indexes, as described in Expire Data from Collections by Setting TTL (page 158). These indexes
allow you to expire and remove data from normal collections using a special type, based on the value of a date-typed
field and a TTL value for the index.
TTL Collections (page 158) are not compatible with capped collections.
Tailable Cursor You can use a tailable cursor with capped collections. Similar to the Unix tail -f command,
the tailable cursor tails the end of a capped collection. As new documents are inserted into the capped collection,
you can use the tailable cursor to continue retrieving documents.
See Create Tailable Cursor (page 82) for information on creating a tailable cursor.
Expire Data from Collections by Setting TTL
To enable TTL for a collection, use the ensureIndex() (page 814) method to create a TTL index, as shown in the
examples below. MongoDB begins removing expired documents as soon as the index finishes building.
Note: When the TTL thread is active, you will see a delete (page 50) operations in the output of db.currentOp()
(page 879) or in the data collected by the database profiler (page 167).
Note: When enabling TTL on replica sets, the TTL background thread runs only on primary members. Secondary
members replicate deletion operations from the primary.
158
Chapter 4. Administration
Warning: The TTL index does not guarantee that expired data will be deleted immediately. There may be a delay
between the time a document expires and the time that MongoDB removes the document from the database.
The background task that removes expired documents runs every 60 seconds. As a result, documents may remain
in a collection after they expire but before the background task runs or completes.
The duration of the removal operation depends on the workload of your mongod (page 925) instance. Therefore,
expired data may exist for some time beyond the 60 second period between runs of the background task.
With the exception of the background thread, A TTL index supports queries in the same way normal indexes do. You
can use TTL indexes to expire documents in one of two ways, either:
remove documents a certain number of seconds after creation. The index will support queries for the creation
time of the documents. Alternately,
specify an explicit expiration time. The index will support queries for the expiration-time of the document.
Expire after a Certain Number of Seconds Begin by creating a TTL index and specify an
expireAfterSeconds value of 3600. This sets the an expiration time of 1 hour after the time specified
by the value of the indexed field. The following example, creates an index on the log.events collections status
field:
db.log.events.ensureIndex( { "status": 1 }, { expireAfterSeconds: 3600 } )
To expire documents a certain number of seconds after creation, give the date field a value corresponding to
the insertion time of the documents.For example, given the index on the log.events collection with the
expireAfterSeconds value of 0, and a current date of July 22, 2013: 13:00:00, consider the document in the following insert() (page 832) operation:
db.log.events.insert( {
"status": new Date('July 22, 2013: 13:00:00'),
"logEvent": 2,
"logMessage": "Success!",
} )
The status field must hold values of BSON date type or an array of BSON date-typed objects.
MongoDB will automatically delete documents from the log.events collection when at least one of the values of
a documents status field is a time older than the number of seconds specified in expireAfterSeconds.
Expire at a Certain Clock Time Begin by creating a TTL index and specify an expireAfterSeconds value of
0. The following example, creates an index on the log.events collections status field:
db.log.events.ensureIndex( { "status": 1 }, { expireAfterSeconds: 0 } )
To expire documents at a certain clock time, give the date field a value corresponding to the time a document should
expire, For example, given the index on the log.events collection with the expireAfterSeconds value of
0, and a current date of July 22, 2013: 13:00:00, consider the document in the following insert()
(page 832) operation:
db.log.events.insert( {
"status": new Date('July 22, 2013: 14:00:00'),
"logEvent": 2,
"logMessage": "Success!",
} )
The status field must hold values of BSON date type or an array of BSON date-typed objects.
159
MongoDB will automatically delete documents from the log.events collection when at least one of the values of
a documents status field is a time older than the number of seconds specified in expireAfterSeconds.
Constraints
MongoDB provides a database profiler that shows performance characteristics of each operation against the database.
Use the profiler to locate any queries or write operations that are running slow. You can use this information, for
example, to determine what indexes to create.
For more information, see Database Profiling (page 143).
160
Chapter 4. Administration
The db.currentOp() (page 879) method reports on current operations running on a mongod (page 925) instance.
Use $explain to Evaluate Query Performance
The explain() (page 861) method returns statistics on a query, and reports the index MongoDB selected to fulfill
the query, as well as information about the internal operation of the query.
Example
To use explain() (page 861) on a query for documents matching the expression { a: 1 }, in the collection
named records, use an operation that resembles the following in the mongo (page 942) shell:
db.records.find( { a: 1 } ).explain()
Capped Collections (page 156) are circular, fixed-size collections that keep documents well-ordered, even without the
use of an index. This means that capped collections can receive very high-speed writes and sequential reads.
These collections are particularly useful for keeping log files but are not limited to that purpose. Use capped collections
where appropriate.
Use Natural Order for Fast Reads
To return documents in the order they exist on disk, return sorted operations using the $natural (page 693) operator.
On a capped collection, this also returns the documents in the order in which they were written.
Natural order does not use indexes but can be fast for operations when you want to select the first or last items on disk.
See also:
sort() (page 872) and limit() (page 867).
Optimize Query Performance
Create Indexes to Support Queries
For commonly issued queries, create indexes (page 313). If a query searches multiple fields, create a compound index
(page 322). Scanning an index is much faster than scanning a collection. The indexes structures are smaller than the
documents reference, and store references in order.
Example
If you have a posts collection containing blog posts, and if you regularly issue a query that sorts on the
author_name field, then you can optimize the query by creating an index on the author_name field:
db.posts.ensureIndex( { author_name : 1 } )
161
Indexes also improve efficiency on queries that routinely sort on a given field.
Example
If you regularly issue a query that sorts on the timestamp field, then you can optimize the query by creating an
index on the timestamp field:
Creating this index:
db.posts.ensureIndex( { timestamp : 1 } )
Because MongoDB can read indexes in both ascending and descending order, the direction of a single-key index does
not matter.
Indexes support queries, update operations, and some phases of the aggregation pipeline (page 281).
Index keys that are of the BinData type are more efficiently stored in the index if:
the binary subtype value is in the range of 0-7 or 128-135, and
the length of the byte array is: 0, 1, 2, 3, 4, 5, 6, 7, 8, 10, 12, 14, 16, 20, 24, or 32.
Limit the Number of Query Results to Reduce Network Demand
MongoDB cursors return results in groups of multiple documents. If you know the number of results you want, you
can reduce the demand on network resources by issuing the limit() (page 867) method.
This is typically used in conjunction with sort operations. For example, if you need only 10 results from your query to
the posts collection, you would issue the following command:
db.posts.find().sort( { timestamp : -1 } ).limit(10)
When you need only a subset of fields from documents, you can achieve better performance by returning only the
fields you need:
For example, if in your query to the posts collection, you need only the timestamp, title, author, and
abstract fields, you would issue the following command:
db.posts.find( {}, { timestamp : 1 , title : 1 , author : 1 , abstract : 1} ).sort( { timestamp : -1
For more information on using projections, see Limit Fields to Return from a Query (page 72).
Use $hint to Select a Particular Index
In most cases the query optimizer (page 45) selects the optimal index for a specific operation; however, you can
force MongoDB to use a specific index using the hint() (page 866) method. Use hint() (page 866) to support
performance testing, or on some queries where you must select a field or field included in several indexes.
162
Chapter 4. Administration
Use MongoDBs $inc (page 651) operator to increment or decrement values in documents. The operator increments
the value of the field on the server side, as an alternative to selecting a document, making simple modifications in
the client and then writing the entire document to the server. The $inc (page 651) operator can also help avoid race
conditions, which would result when two application instances queried for a document, manually incremented a field,
and saved the entire document back at the same time.
163
Store a JavaScript Function on the Server (page 177) Describes how to store JavaScript functions on a MongoDB
server.
MongoDB Tutorials (page 177) A complete list of tutorials in the MongoDB Manual that address MongoDB operation and use.
Use Database Commands
The MongoDB command interface provides access to all non CRUD database operations. Fetching server stats,
initializing a replica set, and running a map-reduce job are all accomplished with commands.
See Database Commands (page 694) for list of all commands sorted by function, and Database Commands (page 694)
for a list of all commands sorted alphabetically.
Database Command Form
You specify a command first by constructing a standard BSON document whose first key is the name of the command.
For example, specify the isMaster (page 732) command using the following BSON document:
{ isMaster: 1 }
Issue Commands
The mongo (page 942) shell provides a helper method for running commands called db.runCommand()
(page 893). The following operation in mongo (page 942) runs the above command:
db.runCommand( { isMaster: 1 } )
Many drivers (page 95) provide an equivalent for the db.runCommand() (page 893) method. Internally, running
commands with db.runCommand() (page 893) is equivalent to a special query against the $cmd collection.
Many common commands have their own shell helpers or wrappers in the mongo (page 942) shell and drivers, such
as the db.isMaster() (page 890) method in the mongo (page 942) JavaScript shell.
admin Database Commands
You must run some commands on the admin database. Normally, these operations resemble the followings:
use admin
db.runCommand( {buildInfo: 1} )
However, theres also a command helper that automatically runs the command in the context of the admin database:
db._adminCommand( {buildInfo: 1} )
Command Responses
All commands return, at minimum, a document with an ok field indicating whether the command has succeeded:
{ 'ok': 1 }
164
Chapter 4. Administration
By default, MongoDB stores data in the /data/db directory. On Windows, MongoDB stores data in C:\data\db.
On all platforms, MongoDB listens for connections from clients on port 27017.
To start MongoDB using all defaults, issue the following command at the system shell:
mongod
Specify a Data Directory If you want mongod (page 925) to store data files at a path other than /data/db you
can specify a dbpath (page 993). The dbpath (page 993) must exist before you start mongod (page 925). If it does
not exist, create the directory and the permissions so that mongod (page 925) can read and write data to this path. For
more information on permissions, see the security operations documentation (page 236).
To specify a dbpath (page 993) for mongod (page 925) to use as a data directory, use the --dbpath option. The
following invocation will start a mongod (page 925) instance and store data in the /srv/mongodb path
mongod --dbpath /srv/mongodb/
Specify a TCP Port Only a single process can listen for connections on a network interface at a time. If you run
multiple mongod (page 925) processes on a single machine, or have other processes that must use this port, you must
assign each a different port to listen on for client connections.
To specify a port to mongod (page 925), use the --port option on the command line. The following command starts
mongod (page 925) listening on port 12345:
mongod --port 12345
mongod
(page
925)
as
daemon
and
records
log
output
to
165
Additional Configuration Options For an overview of common configurations and common configuration deployments. configurations for common use cases, see Run-time Database Configuration (page 145).
Stop mongod
To stop a mongod (page 925) instance not running as a daemon, press Control+C. MongoDB stops when all
ongoing operations are complete and does a clean exit, flushing and closing all data files.
To stop a mongod (page 925) instance running in the background or foreground, issue the db.shutdownServer()
(page 894) helper in the mongo (page 942) shell. Use the following sequence:
1. To open the mongo (page 942) shell for a mongod (page 925) instance running on the default port of 27017,
issue the following command:
mongo
2. To switch to the admin database and shutdown the mongod (page 925) instance, issue the following commands:
use admin
db.shutdownServer()
You may only use db.shutdownServer() (page 894) when connected to the mongod (page 925) when authenticated to the admin database or on systems without authentication connected via the localhost interface.
Alternately, you can shut down the mongod (page 925) instance from a driver using the shutdown (page 759)
command. For details, see the drivers documentation (page 95) for your driver.
mongod Shutdown and Replica Sets If the mongod (page 925) is the primary in a replica set, the shutdown
process for these mongod (page 925) instances has the following steps:
1. Check how up-to-date the secondaries are.
2. If no secondary is within 10 seconds of the primary, mongod (page 925) will return a message that it will
not shut down. You can pass the shutdown (page 759) command a timeoutSecs argument to wait for a
secondary to catch up.
3. If there is a secondary within 10 seconds of the primary, the primary will step down and wait for the secondary
to catch up.
4. After 60 seconds or once the secondary has caught up, the primary will shut down.
If there is no up-to-date secondary and you want the primary to shut down, issue the shutdown (page 759) command
with the force argument, as in the following mongo (page 942) shell operation:
db.adminCommand({shutdown : 1, force : true})
To keep checking the secondaries for a specified number of seconds if none are immediately up-to-date, issue
shutdown (page 759) with the timeoutSecs argument. MongoDB will keep checking the secondaries for the
specified number of seconds if none are immediately up-to-date. If any of the secondaries catch up within the allotted
time, the primary will shut down. If no secondaries catch up, it will not shut down.
The following command issues shutdown (page 759) with timeoutSecs set to 5:
db.adminCommand({shutdown : 1, timeoutSecs : 5})
166
Chapter 4. Administration
Alternately you can use the timeoutSecs argument with the db.shutdownServer() (page 894) method:
db.shutdownServer({timeoutSecs : 5})
You can cleanly stop mongod (page 925) using a SIGINT or SIGTERM signal on UNIX-like systems. Either ^C for
a non-daemon mongod (page 925) instance, kill -2 <pid>, or kill -15 <pid> will cleanly terminate the
mongod (page 925) instance.
Terminating a mongod (page 925) instance that is not running with journaling with kill -9 <pid> (i.e.
SIGKILL) will probably cause data corruption.
To recover data in situations where mongod (page 925) instances have not terminated cleanly without journaling see
Recover Data after an Unexpected Shutdown (page 195).
Analyze Performance of Database Operations
The database profiler collects fine grained data about MongoDB write operations, cursors, database commands on a
running mongod (page 925) instance. You can enable profiling on a per-database or per-instance basis. The profiling
level (page 167) is also configurable when enabling profiling.
The database profiler writes all the data it collects to the system.profile (page 229) collection, which is a capped
collection (page 156). See Database Profiler Output (page 229) for overview of the data in the system.profile
(page 229) documents created by the profiler.
This document outlines a number of key administration options for the database profiler. For additional related information, consider the following resources:
Database Profiler Output (page 229)
Profile Command (page 770)
db.currentOp() (page 879)
Profiling Levels
You can enable database profiling from the mongo (page 942) shell or through a driver using the profile (page 770)
command. This section will describe how to do so from the mongo (page 942) shell. See your driver documentation
(page 95) if you want to control the profiler from within your application.
167
When you enable profiling, you also set the profiling level (page 167). The profiler records data in the
system.profile (page 229) collection. MongoDB creates the system.profile (page 229) collection in a
database after you enable profiling for that database.
To enable profiling and set the profiling level, issue use the db.setProfilingLevel() (page 894) helper in the
mongo (page 942) shell, passing the profiling level as a parameter. For example, to enable profiling for all database
operations, consider the following operation in the mongo (page 942) shell:
db.setProfilingLevel(2)
The shell returns a document showing the previous level of profiling. The "ok" :
operation succeeded:
To verify the new setting, see the Check Profiling Level (page 168) section.
Specify the Threshold for Slow Operations The threshold for slow operations applies to the entire mongod
(page 925) instance. When you change the threshold, you change it for all databases on the instance.
Important: Changing the slow operation threshold for the database profiler also affects the profiling subsystems
slow operation threshold for the entire mongod (page 925) instance. Always set the threshold to the highest useful
value.
By default the slow operation threshold is 100 milliseconds. Databases with a profiling level of 1 will log operations
slower than 100 milliseconds.
To change the threshold, pass two parameters to the db.setProfilingLevel() (page 894) helper in the mongo
(page 942) shell. The first parameter sets the profiling level for the current database, and the second sets the default
slow operation threshold for the entire mongod (page 925) instance.
For example, the following command sets the profiling level for the current database to 0, which disables profiling,
and sets the slow-operation threshold for the mongod (page 925) instance to 20 milliseconds. Any database on the
instance with a profiling level of 1 will use this threshold:
db.setProfilingLevel(0,20)
To view the profiling level (page 167), issue the following from the mongo (page 942) shell:
db.getProfilingStatus()
168
Chapter 4. Administration
Disable Profiling To disable profiling, use the following helper in the mongo (page 942) shell:
db.setProfilingLevel(0)
Enable Profiling for an Entire mongod Instance For development purposes in testing environments, you can
enable database profiling for an entire mongod (page 925) instance. The profiling level applies to all databases
provided by the mongod (page 925) instance.
To enable profiling for a mongod (page 925) instance, pass the following parameters to mongod (page 925) at startup
or within the configuration file (page 990):
mongod --profile=1 --slowms=15
This sets the profiling level to 1, which collects profiling data for slow operations only, and defines slow operations as
those that last longer than 15 milliseconds.
See also:
profile (page 997) and slowms (page 997).
Database Profiling and Sharding You cannot enable profiling on a mongos (page 938) instance. To enable profiling in a shard cluster, you must enable profiling for each mongod (page 925) instance in the cluster.
View Profiler Data
The database profiler logs information about database operations in the system.profile (page 229) collection.
To view profiling information, query the system.profile (page 229) collection. To view example queries, see
Profiler Overhead (page 170)
For an explanation of the output data, see Database Profiler Output (page 229).
Example Profiler Data Queries This section displays example queries to the system.profile (page 229) collection. For an explanation of the query output, see Database Profiler Output (page 229).
To return the most recent 10 log entries in the system.profile (page 229) collection, run a query similar to the
following:
db.system.profile.find().limit(10).sort( { ts : -1 } ).pretty()
To return all operations except command operations ($cmd), run a query similar to the following:
db.system.profile.find( { op: { $ne : 'command' } } ).pretty()
To return operations for a particular collection, run a query similar to the following. This example returns operations
in the mydb databases test collection:
db.system.profile.find( { ns : 'mydb.test' } ).pretty()
To return operations slower than 5 milliseconds, run a query similar to the following:
db.system.profile.find( { millis : { $gt : 5 } } ).pretty()
To return information from a certain time range, run a query similar to the following:
169
db.system.profile.find(
{
ts : {
$gt : new ISODate("2012-12-09T03:00:00Z") ,
$lt : new ISODate("2012-12-09T03:40:00Z")
}
}
).pretty()
The following example looks at the time range, suppresses the user field from the output to make it easier to read,
and sorts the results by how long each operation took to run:
db.system.profile.find(
{
ts : {
$gt : new ISODate("2011-07-12T03:00:00Z") ,
$lt : new ISODate("2011-07-12T03:40:00Z")
}
},
{ user : 0 }
).sort( { millis : -1 } )
Show the Five Most Recent Events On a database that has profiling enabled, the show profile helper in the
mongo (page 942) shell displays the 5 most recent operations that took at least 1 millisecond to execute. Issue show
profile from the mongo (page 942) shell, as follows:
show profile
Profiler Overhead
When enabled, profiling has a minor effect on performance. The system.profile (page 229) collection is a
capped collection with a default size of 1 megabyte. A collection of this size can typically store several thousand
profile documents, but some application may use more or less profiling data per operation.
To change the size of the system.profile (page 229) collection, you must:
1. Disable profiling.
2. Drop the system.profile (page 229) collection.
3. Create a new system.profile (page 229) collection.
4. Re-enable profiling.
For example, to create a new system.profile (page 229) collections thats 4000000 bytes, use the following
sequence of operations in the mongo (page 942) shell:
db.setProfilingLevel(0)
db.system.profile.drop()
db.createCollection( "system.profile", { capped: true, size:4000000 } )
db.setProfilingLevel(1)
170
Chapter 4. Administration
Red Hat Enterprise Linux 6.x series and Amazon Linux AMI require libssl, net-snmp,
net-snmp-libs, and net-snmp-utils. Issue a command such as the following to install these packages:
sudo yum install libssl net-snmp net-snmp-libs net-snmp-utils
Configure SNMP
171
Start Up You can control MongoDB Enterprise using default or custom control scripts, just as with any other
mongod:
Use the following command to view all SNMP options available in your MongoDB:
mongod --help | grep snmp
--logpath /var/log/mongodb/1.log
Optionally, you can set these options in a configuration file (page 990).
To check if mongod is running with SNMP support, issue the following command:
ps -ef | grep 'mongod --snmp'
The command should return output that includes the following line. This indicates that the proper mongod instance is
running:
systemuser 31415 10260
0 Jul13 pts/16
Test SNMP Check for the snmp agent process listening on port 1161 with the following command:
sudo lsof -i :1161
FD
10u
172
Chapter 4. Administration
udp
0 127.0.0.1:1161
0.0.0.0:*
9238/<path>/mongod
Run snmpwalk Locally snmpwalk provides tools for retrieving and parsing the SNMP data according to the
MIB. If you installed all of the required packages above, your system will have snmpwalk.
Issue the following command to collect data from mongod using SNMP:
snmpwalk -m MONGO-MIB -v 2c -c mongodb 127.0.0.1:1161 1.3.6.1.4.1.37601
You may also choose to specify the path to the MIB file:
snmpwalk -m /usr/share/snmp/mibs/MONGO-MIB -v 2c -c mongodb 127.0.0.1:1161 1.3.6.1.4.1.37601
Use this command only to ensure that you can retrieve and validate SNMP data from MongoDB.
Troubleshooting
Always check the logs for errors if something does not run as expected; see the log at /var/log/mongodb/1.log.
The presence of the following line indicates that the mongod cannot read the /etc/snmp/mongod.conf file:
[SNMPAgent] warning: error starting SNMPAgent as master err:1
Overview
Log rotation using MongoDBs standard approach archives the current log file and starts a new one. To do this,
the mongod (page 925) or mongos (page 938) instance renames the current log file by appending a UTC (GMT)
timestamp to the filename, in ISODate format. It then opens a new log file, closes the old log file, and sends all new
log entries to the new log file.
MongoDBs standard approach to log rotation only rotates logs in response to the logRotate (page 760) command,
or when the mongod (page 925) or mongos (page 938) process receives a SIGUSR1 signal from the operating
system.
Alternately, you may configure mongod to send log data to syslog. In this case, you can take advantage of alternate
logrotation tools.
See also:
For information on logging, see the Process Logging (page 141) section.
Log Rotation With MongoDB
173
This is the only available method to rotate log files on Windows systems.
From the UNIX shell, rotate logs for a single process by issuing the following command:
kill -SIGUSR1 <mongod process id>
From the UNIX shell, rotate logs for all mongod (page 925) processes on a machine by issuing the
following command:
killall -SIGUSR1 mongod
For results you get something similar to the following. The timestamps will be different.
server1.log
server1.log.2011-11-24T23-30-00
The example results indicate a log rotation performed at exactly 11:30 pm on November 24th, 2011
UTC, which is the local time offset by the local time zone. The original log file is the one with the timestamp.
The new log is server1.log file.
If you issue a second logRotate (page 760) command an hour later, then an additional file would appear
when listing matching files, as in the following example:
server1.log
server1.log.2011-11-24T23-30-00
server1.log.2011-11-25T00-30-00
This operation does not modify the server1.log.2011-11-24T23-30-00 file created earlier, while
server1.log.2011-11-25T00-30-00 is the previous server1.log file, renamed. server1.log
is a new, empty file that receives all new log output.
Syslog Log Rotation
174
Chapter 4. Administration
Manage Journaling
MongoDB uses write ahead logging to an on-disk journal to guarantee write operation (page 50) durability and to
provide crash resiliency. Before applying a change to the data files, MongoDB writes the change operation to the
journal. If MongoDB should terminate or encounter an error before it can write the changes from the journal to the
data files, MongoDB can re-apply the write operation and maintain a consistent state.
Without a journal, if mongod (page 925) exits unexpectedly, you must assume your data is in an inconsistent state,
and you must run either repair (page 195) or, preferably, resync (page 450) from a clean member of the replica set.
With journaling enabled, if mongod (page 925) stops unexpectedly, the program can recover everything written to the
journal, and the data remains in a consistent state. By default, the greatest extent of lost writes, i.e., those not made
to the journal, are those made in the last 100 milliseconds. See journalCommitInterval (page 995) for more
information on the default.
With journaling, if you want a data set to reside entirely in RAM, you need enough RAM to hold the data set plus
the write working set. The write working set is the amount of unique data you expect to see written between
re-mappings of the private view. For information on views, see Storage Views used in Journaling (page 233).
Important: Changed in version 2.0: For 64-bit builds of mongod (page 925), journaling is enabled by default. For
other platforms, see journal (page 995).
Procedures
Enable Journaling
default.
Changed in version 2.0: For 64-bit builds of mongod (page 925), journaling is enabled by
To enable journaling, start mongod (page 925) with the --journal command line option.
If no journal files exist, when mongod (page 925) starts, it must preallocate new journal files. During this operation,
the mongod (page 925) is not listening for connections until preallocation completes: for some systems this may take
a several minutes. During this period your applications and the mongo (page 942) shell are not available.
Disable Journaling
Warning: Do not disable journaling on production systems. If your mongod (page 925) instance stops witho
shutting down cleanly unexpectedly for any reason, (e.g. power failure) and you are not running with journalin
then you must recover from an unaffected replica set member or backup, as described in repair (page 195).
To disable journaling, start mongod (page 925) with the --nojournal command line option.
Get Commit Acknowledgment You can get commit acknowledgment with the getLastError (page 720) command and the j option. For details, see Write Concern Reference (page 96).
Avoid Preallocation Lag To avoid preallocation lag (page 232), you can preallocate files in the journal directory by
copying them from another instance of mongod (page 925).
Preallocated files do not contain data. It is safe to later remove them. But if you restart mongod (page 925) with
journaling, mongod (page 925) will create them again.
Example
The following sequence preallocates journal files for an instance of mongod (page 925) running on port 27017 with
a database path of /data/db.
For demonstration purposes, the sequence starts by creating a set of journal files in the usual way.
4.2. Administration Tutorials
175
2. Create a set of journal files by staring a mongod (page 925) instance that uses the temporary directory:
mongod --port 10000 --dbpath ~/tmpDbpath --journal
3. When you see the following log output, indicating mongod (page 925) has the files, press CONTROL+C to
stop the mongod (page 925) instance:
web admin interface listening on port 11000
4. Preallocate journal files for the new instance of mongod (page 925) by moving the journal files from the data
directory of the existing instance to the data directory of the new instance:
mv ~/tmpDbpath/journal /data/db/
Monitor Journal Status Use the following commands and methods to monitor journal status:
serverStatus (page 782)
The serverStatus (page 782) command returns database status information that is useful for assessing
performance.
journalLatencyTest (page 805)
Use journalLatencyTest (page 805) to measure how long it takes on your volume to write to the disk in
an append-only fashion. You can run this command on an idle system to get a baseline sync time for journaling.
You can also run this command on a busy system to see the sync time on a busy system, which may be higher if
the journal directory is on the same volume as the data files.
The journalLatencyTest (page 805) command also provides a way to check if your disk drive is buffering
writes in its local cache. If the number is very low (i.e., less than 2 milliseconds) and the drive is non-SSD, the
drive is probably buffering writes. In that case, enable cache write-through for the device in your operating
system, unless you have a disk controller card with battery backed RAM.
Change the Group Commit Interval Changed in version 2.0.
You can set the group commit interval using the --journalCommitInterval command line option. The allowed
range is 2 to 300 milliseconds.
Lower values increase the durability of the journal at the expense of disk performance.
Recover Data After Unexpected Shutdown On a restart after a crash, MongoDB replays all journal files in the
journal directory before the server becomes available. If MongoDB must replay journal files, mongod (page 925)
notes these events in the log output.
There is no reason to run repairDatabase (page 757) in these situations.
176
Chapter 4. Administration
The _id field holds the name of the function and is unique per database.
The value field holds the function definition
Once you save a function in the system.js collection, you can use the function from any JavaScript context (e.g.
eval (page 722) command or the mongo (page 942) shell method db.eval() (page 884), $where (page 634) operator, mapReduce (page 701) or mongo (page 942) shell method db.collection.mapReduce() (page 837)).
Consider the following example from the mongo (page 942) shell that first saves a function named echoFunction
to the system.js collection and calls the function using db.eval() (page 884) method:
db.system.js.save(
{ _id: "echoFunction",
value : function(x) { return x; }
}
)
db.eval( "echoFunction( 'test' )" )
MongoDB Tutorials
This page lists the tutorials available as part of the MongoDB Manual (page 1). In addition to these documents, you
can refer to the introductory MongoDB Tutorial (page 26). If there is a process or pattern that you would like to see
included here, please open a Jira Case36 .
Getting Started
177
Replica Sets
Deploy a Replica Set (page 420)
Convert a Standalone to a Replica Set (page 432)
Add Members to a Replica Set (page 433)
Remove Members from Replica Set (page 436)
Replace a Replica Set Member (page 437)
Adjust Priority for Replica Set Member (page 438)
Resync a Member of a Replica Set (page 450)
Deploy a Geographically Distributed Replica Set (page 425)
Change the Size of the Oplog (page 446)
Force a Member to Become Primary (page 448)
Change Hostnames in a Replica Set (page 458)
Add an Arbiter to Replica Set (page 432)
Convert a Secondary to an Arbiter (page 443)
Configure a Secondarys Sync Target (page 462)
Configure a Delayed Replica Set Member (page 441)
Configure a Hidden Replica Set Member (page 440)
Configure Non-Voting Replica Set Member (page 442)
Prevent Secondary from Becoming Primary (page 439)
Configure Replica Set Tag Sets (page 451)
Manage Chained Replication (page 457)
Reconfigure a Replica Set with Unavailable Members (page 455)
Recover Data after an Unexpected Shutdown (page 195)
Troubleshoot Replica Sets (page 462)
178
Chapter 4. Administration
Sharding
Deploy a Sharded Cluster (page 522)
Convert a Replica Set to a Replicated Sharded Cluster (page 530)
Add Shards to a Cluster (page 529)
Remove Shards from an Existing Sharded Cluster (page 553)
Deploy Three Config Servers for Production Deployments (page 539)
Migrate Config Servers with the Same Hostname (page 540)
Migrate Config Servers with Different Hostnames (page 540)
Replace a Config Server (page 541)
Migrate a Sharded Cluster to Different Hardware (page 542)
Backup Cluster Metadata (page 545)
Backup a Small Sharded Cluster with mongodump (page 188)
Backup a Sharded Cluster with Filesystem Snapshots (page 189)
Backup a Sharded Cluster with Database Dumps (page 190)
Restore a Single Shard (page 192)
Restore a Sharded Cluster (page 192)
Schedule Backup Window for Sharded Clusters (page 191)
Manage Shard Tags (page 536)
Basic Operations
Use Database Commands (page 164)
Recover Data after an Unexpected Shutdown (page 195)
Copy Databases Between Instances (page 193)
Expire Data from Collections by Setting TTL (page 158)
Analyze Performance of Database Operations (page 167)
Rotate Log Files (page 173)
Build Old Style Indexes (page 345)
Manage mongod Processes (page 165)
Backup and Restore with MongoDB Tools (page 181)
Backup and Restore with Filesystem Snapshots (page 184)
Security
Configure Linux iptables Firewall for MongoDB (page 242)
Configure Windows netsh Firewall for MongoDB (page 246)
Enable Authentication (page 255)
Create a User Administrator (page 255)
Add a User to a Database (page 257)
4.2. Administration Tutorials
179
180
Chapter 4. Administration
Important: mongodump (page 951) does not create output for the local database.
Basic mongodump Operations The mongodump (page 951) utility can back up data by either:
connecting to a running mongod (page 925) or mongos (page 938) instance, or
accessing data files without an active instance.
The utility can create a backup for an entire server, database or collection, or can use a query to backup just part of a
collection.
When you run mongodump (page 951) without any arguments, the command connects to the MongoDB instance on
the local system (e.g. 127.0.0.1 or localhost) on port 27017 and creates a database backup named dump/ in
the current directory.
To backup data from a mongod (page 925) or mongos (page 938) instance running on the same machine and on the
default port of 27017 use the following command:
mongodump
181
Note: The format of data created by mongodump (page 951) tool from the 2.2 distribution or later is different and
incompatible with earlier versions of mongod (page 925).
To limit the amount of data included in the database dump, you can specify --db and --collection as options to
the mongodump (page 951) command. For example:
mongodump --dbpath /data/db/ --out /data/backup/
mongodump --host mongodb.example.net --port 27017
mongodump (page 951) will write BSON files that hold a copy of data accessible via the mongod (page 925) listening
on port 27017 of the mongodb.example.net host.
mongodump --collection collection --db test
This command creates a dump of the collection named collection from the database test in a dump/ subdirectory of the current working directory.
Point in Time Operation Using Oplogs Use the --oplog option with mongodump (page 951) to collect the oplog
entries to build a point-in-time snapshot of a database within a replica set. With --oplog, mongodump (page 951)
copies all the data from the source database as well as all of the oplog entries from the beginning of the backup
procedure to until the backup procedure completes. This backup procedure, in conjunction with mongorestore
--oplogReplay, allows you to restore a backup that reflects a consistent and specific moment in time.
Create Backups Without a Running mongod Instance If your MongoDB instance is not running, you can use
the --dbpath option to specify the location to your MongoDB instances database files. mongodump (page 951)
reads from the data files directly with this operation. This locks the data directory to prevent conflicting writes. The
mongod (page 925) process must not be running or attached to these data files when you run mongodump (page 951)
in this configuration. Consider the following example:
Example
Backup a MongoDB Instance Without a Running mongod
Given a MongoDB instance that contains the customers, products, and suppliers databases, the following
mongodump (page 951) operation backs up the databases using the --dbpath option, which specifies the location
of the database files on the host:
mongodump --dbpath /data -o dataout
The --out option allows you to specify the directory where mongodump (page 951) will save the
backup. mongodump (page 951) creates a separate backup directory for each of the backed up databases:
dataout/customers, dataout/products, and dataout/suppliers.
Create Backups from Non-Local mongod Instances The --host and --port options for mongodump
(page 951) allow you to connect to and backup from a remote host. Consider the following example:
mongodump --host mongodb1.example.net --port 3017 --username user --password pass --out /opt/backup/m
On any mongodump (page 951) command you may, as above, specify username and password credentials to specify
database authentication.
182
Chapter 4. Administration
The mongorestore (page 956) utility restores a binary backup created by mongodump (page 951). By default,
mongorestore (page 956) looks for a database backup in the dump/ directory.
The mongorestore (page 956) utility can restore data either by:
connecting to a running mongod (page 925) or mongos (page 938) directly, or
writing to a set of MongoDB data files without use of a running mongod (page 925).
mongorestore (page 956) can restore either an entire database backup or a subset of the backup.
To use mongorestore (page 956) to connect to an active mongod (page 925) or mongos (page 938), use a
command with the following prototype form:
mongorestore --port <port number> <path to the backup>
To use mongorestore (page 956) to write to data files without using a running mongod (page 925), use a command
with the following prototype form:
mongorestore --dbpath <database path> <path to the backup>
Here, mongorestore (page 956) imports the database backup in the dump-2012-10-25 directory to the mongod
(page 925) instance running on the localhost interface.
Restore Point in Time Oplog Backup If you created your database dump using the --oplog option to ensure
a point-in-time snapshot, call mongorestore (page 956) with the --oplogReplay option, as in the following
example:
mongorestore --oplogReplay
You may also consider using the mongorestore --objcheck option to check the integrity of objects while
inserting them into the database, or you may consider the mongorestore --drop option to drop each collection
from the database before restoring from backups.
Restore a Subset of data from a Binary Database Dump mongorestore (page 956) also includes the ability to
a filter to all input before inserting it into the new database. Consider the following example:
mongorestore --filter '{"field": 1}'
Here, mongorestore (page 956) only adds documents to the database from the dump located in the dump/ folder
if the documents have a field name field that holds a value of 1. Enclose the filter in single quotes (e.g. ) to
prevent the filter from interacting with your shell environment.
Restore Without a Running mongod mongorestore (page 956) can write data to MongoDB data files without
needing to connect to a mongod (page 925) directly.
Example
Restore a Database Without a Running mongod
Given a set of backed up databases in the dataout directory:
dataout/customers,
4.2. Administration Tutorials
183
dataout/products, and
dataout/suppliers
The following mongorestore (page 956) command restores the products database. The command uses the
--dbpath option to specify which database to restore:
mongorestore --host localhost --port 27107 --dbpath /dataout/products --journal
The mongorestore (page 956) imports the database backup in the /dataout/products directory to the
mongod (page 925) instance that runs on the localhost interface. The mongorestore (page 956) operation imports the backup even if the mongod (page 925) is not running.
The --journal option ensures that mongorestore (page 956) records all operation in the durability journal. The
journal prevents data file corruption if anything (e.g. power failure, disk failure, etc.) interrupts the restore operation.
See also:
mongodump (page 951) and mongorestore (page 956).
Restore Backups to Non-Local mongod Instances By default, mongorestore (page 956) connects to a MongoDB instance running on the localhost interface (e.g. 127.0.0.1) and on the default port (27017). If you want to
restore to a different host or port, use the --host and --port options.
Consider the following example:
mongorestore --host mongodb1.example.net --port 3017 --username user --password pass /opt/backup/mong
As above, you may specify username and password connections if your mongod (page 925) requires authentication.
Backup and Restore with Filesystem Snapshots
This document describes a procedure for creating backups of MongoDB systems using system-level tools, such as
LVM or storage appliance, as well as the corresponding restoration strategies.
These filesystem snapshots, or block-level backup methods use system level tools to create copies of the device that
holds MongoDBs data files. These methods complete quickly and work reliably, but require more system configuration outside of MongoDB.
See also:
Backup Strategies for MongoDB Systems (page 136) and Backup and Restore with MongoDB Tools (page 181).
Snapshots Overview
Snapshots work by creating pointers between the live data and a special snapshot volume. These pointers are theoretically equivalent to hard links. As the working data diverges from the snapshot, the snapshot process uses a
copy-on-write strategy. As a result the snapshot only stores modified data.
After making the snapshot, you mount the snapshot image on your file system and copy data from the snapshot. The
resulting backup contains a full copy of all data.
Snapshots have the following limitations:
The database must be in a consistent or recoverable state when the snapshot takes place. This means that all
writes accepted by the database need to be fully written to disk: either to the journal or to data files.
If all writes are not on disk when the backup occurs, the backup will not reflect these changes. If writes are in
progress when the backup occurs, the data files will reflect an inconsistent state. With journaling all data-file
184
Chapter 4. Administration
states resulting from in-progress writes are recoverable; without journaling you must flush all pending writes
to disk before running the backup operation and must ensure that no writes occur during the entire backup
procedure.
If you do use journaling, the journal must reside on the same volume as the data.
Snapshots create an image of an entire disk image. Unless you need to back up your entire system, consider
isolating your MongoDB data files, journal (if applicable), and configuration on one logical disk that doesnt
contain any other data.
Alternately, store all MongoDB data files on a dedicated device so that you can make backups without duplicating extraneous data.
Ensure that you copy data from snapshots and onto other systems to ensure that data is safe from site failures.
Although different snapshots methods provide different capability, the LVM method outlined below does not
provide any capacity for capturing incremental backups.
Snapshots With Journaling If your mongod (page 925) instance has journaling enabled, then you can use any kind
of file system or volume/block level snapshot tool to create backups.
If you manage your own infrastructure on a Linux-based system, configure your system with LVM to provide your disk
packages and provide snapshot capability. You can also use LVM-based setups within a cloud/virtualized environment.
Note: Running LVM provides additional flexibility and enables the possibility of using snapshots to back up MongoDB.
Snapshots with Amazon EBS in a RAID 10 Configuration If your deployment depends on Amazons Elastic
Block Storage (EBS) with RAID configured within your instance, it is impossible to get a consistent state across all
disks using the platforms snapshot tool. As an alternative, you can do one of the following:
Flush all writes to disk and create a write lock to ensure consistent state during the backup process.
If you choose this option see Create Backups on Instances that do not have Journaling Enabled (page 187).
Configure LVM to run and hold your MongoDB data files on top of the RAID within your system.
If you choose this option, perform the LVM backup operation described in Create a Snapshot (page 185).
Backup and Restore Using LVM on a Linux System
This section provides an overview of a simple backup process using LVM on a Linux system. While the tools, commands, and paths may be (slightly) different on your system the following steps provide a high level overview of the
backup operation.
Note: Only use the following procedure as a guideline for a backup system and infrastructure. Production backup
systems must consider a number of application specific requirements and factors unique to specific environments.
Create a Snapshot To create a snapshot with LVM, issue a command as root in the following format:
lvcreate --size 100M --snapshot --name mdb-snap01 /dev/vg0/mongodb
This command creates an LVM snapshot (with the --snapshot option) named mdb-snap01 of the mongodb
volume in the vg0 volume group.
185
Restore a Snapshot
mands:
To restore a snapshot created with the above method, issue the following sequence of com-
186
Chapter 4. Administration
Mounts the mdb-new disk image to the /srv/mongodb directory. Modify the mount point to correspond to
your MongoDB data file location, or other location as needed.
Note: The restored snapshot will have a stale mongod.lock file. If you do not remove this file from the snapshot, and MongoDB may assume that the stale lock file indicates an unclean shutdown. If youre running with
journal (page 995) enabled, and you do not use db.fsyncLock() (page 885), you do not need to remove
the mongod.lock file. If you use db.fsyncLock() (page 885) you will need to remove the lock.
umount /dev/vg0/mdb-snap01
lvcreate --size 1G --name mdb-new vg0
dd if=/dev/vg0/mdb-snap01 of=/dev/vg0/mdb-new
mount /dev/vg0/mdb-new /srv/mongodb
You can implement off-system backups using the combined process (page 187) and SSH.
This sequence is identical to procedures explained above, except that it archives and compresses the backup on a
remote system using SSH.
Consider the following procedure:
umount /dev/vg0/mdb-snap01
dd if=/dev/vg0/mdb-snap01 | ssh [email protected] gzip > /opt/backup/mdb-snap01.gz
lvcreate --size 1G --name mdb-new vg0
ssh [email protected] gzip -d -c /opt/backup/mdb-snap01.gz | dd of=/dev/vg0/mdb-new
mount /dev/vg0/mdb-new /srv/mongodb
If your mongod (page 925) instance does not run with journaling enabled, or if your journal is on a separate volume,
obtaining a functional backup of a consistent state is more complicated. As described in this section, you must flush all
writes to disk and lock the database to prevent writes during the backup process. If you have a replica set configuration,
then for your backup use a secondary which is not receiving reads (i.e. hidden member).
1. To flush writes to disk and to lock the database (to prevent further writes), issue the db.fsyncLock()
(page 885) method in the mongo (page 942) shell:
db.fsyncLock();
Note:
Changed in version 2.0: MongoDB 2.0 added db.fsyncLock() (page 885) and
db.fsyncUnlock() (page 886) helpers to the mongo (page 942) shell. Prior to this version, use the fsync
(page 751) command with the lock option, as follows:
db.runCommand( { fsync: 1, lock: true } );
db.runCommand( { fsync: 1, lock: false } );
187
Note: The database cannot be locked with db.fsyncLock() (page 885) while profiling is enabled. You
must disable profiling before locking the database with db.fsyncLock() (page 885). Disable profiling using
db.setProfilingLevel() (page 894) as follows in the mongo (page 942) shell:
db.setProfilingLevel(0)
Warning:
Changed in version 2.2: When used in combination with fsync (page 751) or
db.fsyncLock() (page 885), mongod (page 925) may block some reads, including those from
mongodump (page 951), when queued write operation waits behind the fsync (page 751) lock.
Overview If your sharded cluster holds a small data set, you can connect to a mongos (page 938) using
mongodump (page 951). You can create backups of your MongoDB cluster, if your backup infrastructure can capture the entire backup in a reasonable amount of time and if you have a storage system that can hold the complete
MongoDB data set.
Read Sharded Cluster Backup Considerations (page 137) for a high-level overview of important considerations as well
as a list of alternate backup tutorials.
Important: By default mongodump (page 951) issue its queries to the non-primary nodes.
Procedure
Capture Data
Note: If you use mongodump (page 951) without specifying a database or collection, mongodump (page 951) will
capture collection data and the cluster meta-data from the config servers (page 502).
188
Chapter 4. Administration
You cannot use the --oplog option for mongodump (page 951) when capturing data from mongos (page 938).
This option is only available when running directly against a replica set member.
You can perform a backup of a sharded cluster by connecting mongodump (page 951) to a mongos (page 938). Use
the following operation at your systems prompt:
mongodump --host mongos3.example.net --port 27017
mongodump (page 951) will write BSON files that hold a copy of data stored in the sharded cluster accessible via the
mongos (page 938) listening on port 27017 of the mongos3.example.net host.
Restore Data Backups created with mongodump (page 951) do not reflect the chunks or the distribution of data
in the sharded collection or collections. Like all mongodump (page 951) output, these backups contain separate
directories for each database and BSON files for each collection in that database.
You can restore mongodump (page 951) output to any MongoDB instance, including a standalone, a replica set, or a
new sharded cluster. When restoring data to sharded cluster, you must deploy and configure sharding before restoring
data from the backup. See Deploy a Sharded Cluster (page 522) for more information.
Backup a Sharded Cluster with Filesystem Snapshots
Overview This document describes a procedure for taking a backup of all components of a sharded cluster. This
procedure uses file system snapshots to capture a copy of the mongod (page 925) instance. An alternate procedure
that uses mongodump (page 951) to create binary database dumps when file-system snapshots are not available. See
Backup a Sharded Cluster with Database Dumps (page 190) for the alternate procedure.
See Sharded Cluster Backup Considerations (page 137) for a full higher level overview backing up a sharded cluster
as well as links to other tutorials that provide alternate procedures.
Important: To capture a point-in-time backup from a sharded cluster you must stop all writes to the cluster. On a
running production system, you can only capture an approximation of point-in-time snapshot.
Procedure In this procedure, you will stop the cluster balancer and take a backup up of the config database, and
then take backups of each shard in the cluster using a file-system snapshot tool. If you need an exact moment-in-time
snapshot of the system, you will need to stop all application writes before taking the filesystem snapshots; otherwise
the snapshot will only approximate a moment in time.
For approximate point-in-time snapshots, you can improve the quality of the backup while minimizing impact on the
cluster by taking the backup from a secondary member of the replica set that provides each shard.
1. Disable the balancer process that equalizes the distribution of data among the shards. To disable the balancer,
use the sh.stopBalancer() (page 912) method in the mongo (page 942) shell, and see the Disable the
Balancer (page 552) procedure.
Warning: It is essential that you stop the balancer before creating backups. If the balancer remains active,
your resulting backups could have duplicate data or miss some data, as chunks may migrate while recording
backups.
2. Lock one member of each replica set in each shard so that your backups reflect the state of your database at
the nearest possible approximation of a single moment in time. Lock these mongod (page 925) instances in as
short of an interval as possible.
To lock or freeze a sharded cluster, you must:
189
use the db.fsyncLock() (page 885) method in the mongo (page 942) shell connected to a single
secondary member of the replica set that provides shard mongod (page 925) instance.
Shutdown one of the config servers (page 502), to prevent all metadata changes during the backup process.
3. Use mongodump (page 951) to backup one of the config servers (page 502). This backs up the clusters
metadata. You only need to back up one config server, as they all hold the same data.
Issue this command against one of the config mongod (page 925) instances or via the mongos (page 938):
mongodump --db config
4. Back up the replica set members of the shards that you locked. You may back up the shards in parallel. For each
shard, create a snapshot. Use the procedures in Backup and Restore with Filesystem Snapshots (page 184).
5. Unlock all locked replica set members of each shard using the db.fsyncUnlock() (page 886) method in
the mongo (page 942) shell.
6. Re-enable the balancer with the sh.setBalancerState() (page 908) method.
Use the following command sequence when connected to the mongos (page 938) with the mongo (page 942)
shell:
use config
sh.setBalancerState(true)
Overview This document describes a procedure for taking a backup of all components of a sharded cluster. This
procedure uses mongodump (page 951) to create dumps of the mongod (page 925) instance. An alternate procedure
uses file system snapshots to capture the backup data, and may be more efficient in some situations if your system
configuration allows file system backups. See Backup a Sharded Cluster with Filesystem Snapshots (page 189).
See Sharded Cluster Backup Considerations (page 137) for a full higher level overview of backing up a sharded cluster
as well as links to other tutorials that provide alternate procedures.
Important: To capture a point-in-time backup from a sharded cluster you must stop all writes to the cluster. On a
running production system, you can only capture an approximation of point-in-time snapshot.
Procedure In this procedure, you will stop the cluster balancer and take a backup up of the config database, and then
take backups of each shard in the cluster using mongodump (page 951) to capture the backup data. If you need an
exact moment-in-time snapshot of the system, you will need to stop all application writes before taking the filesystem
snapshots; otherwise the snapshot will only approximate a moment of time.
For approximate point-in-time snapshots, you can improve the quality of the backup while minimizing impact on the
cluster by taking the backup from a secondary member of the replica set that provides each shard.
1. Disable the balancer process that equalizes the distribution of data among the shards. To disable the balancer,
use the sh.stopBalancer() (page 912) method in the mongo (page 942) shell, and see the Disable the
Balancer (page 552) procedure.
Warning: It is essential that you stop the balancer before creating backups. If the balancer remains active,
your resulting backups could have duplicate data or miss some data, as chunks migrate while recording
backups.
190
Chapter 4. Administration
2. Lock one member of each replica set in each shard so that your backups reflect the state of your database at
the nearest possible approximation of a single moment in time. Lock these mongod (page 925) instances in as
short of an interval as possible.
To lock or freeze a sharded cluster, you must:
Shutdown one member of each replica set.
Ensure that the oplog has sufficient capacity to allow these secondaries to catch up to the state of the
primaries after finishing the backup procedure. See Oplog Size (page 411) for more information.
Shutdown one of the config servers (page 502), to prevent all metadata changes during the backup process.
3. Use mongodump (page 951) to backup one of the config servers (page 502). This backs up the clusters
metadata. You only need to back up one config server, as they all hold the same data.
Issue this command against one of the config mongod (page 925) instances or via the mongos (page 938):
mongodump --journal --db config
4. Back up the replica set members of the shards that shut down using mongodump (page 951) and specifying the
--dbpath option. You may back up the shards in parallel. Consider the following invocation:
mongodump --journal --dbpath /data/db/ --out /data/backup/
You must run this command on the system where the mongod (page 925) ran. This operation will use journaling and create a dump of the entire mongod (page 925) instance with data files stored in /data/db/.
mongodump (page 951) will write the output of this dump to the /data/backup/ directory.
5. Restart all stopped replica set members of each shard as normal and allow them to catch up with the state of the
primary.
6. Re-enable the balancer with the sh.setBalancerState() (page 908) method.
Use the following command sequence when connected to the mongos (page 938) with the mongo (page 942)
shell:
use config
sh.setBalancerState(true)
Overview In a sharded cluster, the balancer process is responsible for distributing sharded data around the cluster,
so that each shard has roughly the same amount of data.
However, when creating backups from a sharded cluster it is important that you disable the balancer while taking
backups to ensure that no chunk migrations affect the content of the backup captured by the backup procedure. Using
the procedure outlined in the section Disable the Balancer (page 552) you can manually stop the balancer process
temporarily. As an alternative you can use this procedure to define a balancing window so that the balancer is always
disabled during your automated backup operation.
Procedure If you have an automated backup schedule, you can disable all balancing operations for a period of time.
For instance, consider the following command:
use config
db.settings.update( { _id : "balancer" }, { $set : { activeWindow : { start : "6:00", stop : "23:00"
This operation configures the balancer to run between 6:00am and 11:00pm, server time. Schedule your backup
operation to run and complete outside of this time. Ensure that the backup can complete outside the window when
191
the balancer is running and that the balancer can effectively balance the collection among the shards in the window
allotted to each.
Restore a Single Shard
Overview Restoring a single shard from backup with other unaffected shards requires a number of special considerations and practices. This document outlines the additional tasks you must perform when restoring a single shard.
Consider the following resources on backups in general as well as backup and restoration of sharded clusters specifically:
Sharded Cluster Backup Considerations (page 137)
Restore a Sharded Cluster (page 192)
Backup Strategies for MongoDB Systems (page 136)
Procedure Always restore sharded clusters as a whole. When you restore a single shard, keep in mind that the
balancer process might have moved chunks to or from this shard since the last backup. If thats the case, you must
manually move those chunks, as described in this procedure.
1. Restore the shard as you would any other mongod (page 925) instance. See Backup Strategies for MongoDB
Systems (page 136) for overviews of these procedures.
2. For all chunks that migrate away from this shard, you do not need to do anything at this time. You do not
need to delete these documents from the shard because the chunks are automatically filtered out from queries by
mongos (page 938). You can remove these documents from the shard, if you like, at your leisure.
3. For chunks that migrate to this shard after the most recent backup, you must manually recover the chunks using
backups of other shards, or some other source. To determine what chunks have moved, view the changelog
collection in the Config Database (page 564).
Restore a Sharded Cluster
Overview The procedure outlined in this document addresses how to restore an entire sharded cluster. For information on related backup procedures consider the following tutorials which describe backup procedures in greater
detail:
Backup a Sharded Cluster with Filesystem Snapshots (page 189)
Backup a Sharded Cluster with Database Dumps (page 190)
The exact procedure used to restore a database depends on the method used to capture the backup. See the Backup
Strategies for MongoDB Systems (page 136) document for an overview of backups with MongoDB, as well as Sharded
Cluster Backup Considerations (page 137) which provides an overview of the high level concepts important for backing up sharded clusters.
Procedure
1. Stop all mongod (page 925) and mongos (page 938) processes.
2. If shard hostnames have changed, you must manually update the shards collection in the Config Database
(page 564) to use the new hostnames. Do the following:
(a) Start the three config servers (page 502) by issuing commands similar to the following, using values
appropriate to your configuration:
192
Chapter 4. Administration
(b) Restore the Config Database (page 564) on each config server.
(c) Start one mongos (page 938) instance.
(d) Update the Config Database (page 564) collection named shards to reflect the new hostnames.
3. Restore the following:
Data files for each server in each shard. Because replica sets provide each production shard, restore all the
members of the replica set or use the other standard approaches for restoring a replica set from backup.
See the Restore a Snapshot (page 186) and Restore a Database with mongorestore (page 183) sections for
details on these procedures.
Data files for each config server (page 502), if you have not already done so in the previous step.
4. Restart all the mongos (page 938) instances.
5. Restart all the mongod (page 925) instances.
6. Connect to a mongos (page 938) instance from a mongo (page 942) shell and use the
db.printShardingStatus() (page 892) method to ensure that the cluster is operational, as follows:
db.printShardingStatus()
show collections
MongoDB provides the copydb (page 745) and clone (page 748) database commands to support migrations of
entire logical databases between mongod (page 925) instances. With these commands you can copy data between instances with a simple interface without the need for an intermediate stage. The db.cloneDatabase() (page 877)
and db.copyDatabase() (page 878) provide helpers for these operations in the mongo (page 942) shell. These
migration helpers run the commands on the destination server and pull data from the source server..
Data migrations that require an intermediate stage or that involve more than one database instance are beyond the
scope of this tutorial. copydb (page 745) and clone (page 748) are more ideal for use cases that resemble the
following use cases:
data migrations,
data warehousing, and
seeding test environments.
Also consider the Backup Strategies for MongoDB Systems (page 136) and Import and Export MongoDB Data
(page 149) documentation for more related information.
Note: copydb (page 745) and clone (page 748) do not produce point-in-time snapshots of the source database.
Write traffic to the source or destination database during the copy process will result divergent data sets.
Considerations
You must run copydb (page 745) or clone (page 748) on the destination server.
193
You cannot use copydb (page 745) or clone (page 748) with databases that have a sharded collection in a
sharded cluster, or any database via a mongos (page 938).
You can use copydb (page 745) or clone (page 748) with databases that do not have sharded collections in a
cluster when youre connected directly to the mongod (page 925) instance.
You can run copydb (page 745) or clone (page 748) commands on a secondary member of a replica set, with
properly configured read preference.
Each destination mongod (page 925) instance must have enough free disk space on the destination server for
the database you are copying. Use the db.stats() (page 894) operation to check the size of the database on
the source mongod (page 925) instance. For more information, see db.stats() (page 894).
Processes
Copy and Rename a Database To copy a database from one MongoDB instance to another and rename the database
in the process, use the copydb (page 745) command, or the db.copyDatabase() (page 878) helper in the mongo
(page 942) shell.
Use the following procedure to copy the database named test on server db0.example.net to the server named
db1.example.net and rename it to records in the process:
Verify that the database, test exists on the source mongod (page 925) instance running on the
db0.example.net host.
Connect to the destination server, running on the db1.example.net host, using the mongo (page 942) shell.
Model your operation on the following command:
db.copyDatabase( "test", "records", "db0.example.net" )
Rename a Database You can also use copydb (page 745) or the db.copyDatabase() (page 878) helper to:
rename a database within a single MongoDB instance or
create a duplicate database for testing purposes.
Use the following procedure to rename the test database records on a single mongod (page 925) instance:
Connect to the mongod (page 925) using the mongo (page 942) shell.
Model your operation on the following command:
db.copyDatabase( "test", "records" )
Copy a Database with Authentication To copy a database from a source MongoDB instance that has authentication enabled, you can specify authentication credentials to the copydb (page 745) command or the
db.copyDatabase() (page 878) helper in the mongo (page 942) shell.
In the following operation, you will copy the test database from the mongod (page 925) running on
db0.example.net to the records database on the local instance (e.g. db1.example.net.) Because the
mongod (page 925) instance running on db0.example.net requires authentication for all connections, you will
need to pass db.copyDatabase() (page 878) authentication credentials, as in the following procedure:
Connect to the destination mongod (page 925) instance running on the db1.example.net host using the
mongo (page 942) shell.
Issue the following command:
194
Chapter 4. Administration
Use the following operation to initiate the clone (page 748) operation:
db.cloneDatabase( "db0.example.net" )
Indications When you are aware of a mongod (page 925) instance running without journaling that stops unexpectedly and youre not running with replication, you should always run the repair operation before starting MongoDB
again. If youre using replication, then restore from a backup and allow replication to perform an initial sync (page 410)
to restore data.
37 To ensure a clean shut down, use the db.shutdownServer() (page 894) from the mongo (page 942) shell, your control script, the mongod
--shutdown option on Linux systems, Control-C when running mongod (page 925) in interactive mode, or kill $(pidof mongod) or
kill -2 $(pidof mongod).
38 You can also use the db.collection.validate() (page 856) method to test the integrity of a single collection. However, this process
is time consuming, and without journaling you can safely assume that the data is in an invalid state and you should either run the repair operation
or resync from an intact member of the replica set.
195
If the mongod.lock file in the data directory specified by dbpath (page 993), /data/db by default, is not a
zero-byte file, then mongod (page 925) will refuse to start, and you will find a message that contains the following
line in your MongoDB log our output:
Unclean shutdown detected.
This indicates that you need to run mongod (page 925) with the --repair option. If you run repair when the
mongodb.lock file exists in your dbpath (page 993), or the optional --repairpath, you will see a message
that contains the following line:
old lock file: /data/db/mongod.lock. probably means unclean shutdown
If you see this message, as a last resort you may remove the lockfile and run the repair operation before starting the
database normally, as in the following procedure:
Overview
There are two processes to repair data files that result from an unexpected shutdown:
1. Use the --repair option in conjunction with the --repairpath option. mongod (page 925) will read the
existing data files, and write the existing data to new data files. This does not modify or alter the existing data
files.
You do not need to remove the mongod.lock file before using this procedure.
2. Use the --repair option. mongod (page 925) will read the existing data files, write the existing data to new
files and replace the existing, possibly corrupt, files with new files.
You must remove the mongod.lock file before using this procedure.
Note: --repair functionality is also available in the shell with the db.repairDatabase() (page 892) helper
for the repairDatabase (page 757) command.
Procedures To repair your data files using the --repairpath option to preserve the original data files unmodified:
1. Start mongod (page 925) using --repair to read the existing data files.
mongod --dbpath /data/db --repair --repairpath /data/db0
When this completes, the new repaired data files will be in the /data/db0 directory.
2. Start mongod (page 925) using the following invocation to point the dbpath (page 993) at /data/db0:
mongod --dbpath /data/db0
Once you confirm that the data files are operational you may delete or archive the data files in the /data/db
directory.
To repair your data files without preserving the original files, do not use the --repairpath option, as in the
following procedure:
1. Remove the stale lock file:
rm /data/db/mongod.lock
196
Chapter 4. Administration
Replace /data/db with your dbpath (page 993) where your MongoDB instances data files reside.
Warning: After you remove the mongod.lock file you must run the --repair process before using
your database.
2. Start mongod (page 925) using --repair to read the existing data files.
mongod --dbpath /data/db --repair
When this completes, the repaired data files will replace the original data files in the /data/db directory.
3. Start mongod (page 925) using the following invocation to point the dbpath (page 993) at /data/db:
mongod --dbpath /data/db
mongod.lock
In normal operation, you should never remove the mongod.lock file and start mongod (page 925). Instead consider
the one of the above methods to recover the database and remove the lock files. In dire situations you can remove the
lockfile, and start the database using the possibly corrupt files, and attempt to recover data from the database; however,
its impossible to predict the state of the database in these situations.
If you are not running with journaling, and your database shuts down unexpectedly for any reason, you should always
proceed as if your database is in an inconsistent and likely corrupt state. If at all possible restore from backup
(page 136) or, if running as a replica set, restore by performing an initial sync using data from an intact member of the
set, as described in Resync a Member of a Replica Set (page 450).
197
Server-side JavaScript
Changed in version 2.4: The V8 JavaScript engine, which became the default in 2.4, allows multiple JavaScript
operations to execute at the same time. Prior to 2.4, MongoDB operations that required the JavaScript interpreter had
to acquire a lock, and a single mongod (page 925) could only run a single JavaScript operation at a time.
Overview
MongoDB supports the execution of JavaScript code for the following server-side operations:
mapReduce (page 701) and the corresponding mongo (page 942) shell
db.collection.mapReduce() (page 837). See Map-Reduce (page 282) for more information.
method
eval (page 722) command, and the corresponding mongo (page 942) shell method db.eval() (page 884)
$where (page 634) operator
Running .js files via a mongo shell Instance on the Server (page 198)
JavaScript in MongoDB
Although the above operations use JavaScript, most interactions with MongoDB do not use JavaScript but use an
idiomatic driver (page 95) in the language of the interacting application.
See also:
Store a JavaScript Function on the Server (page 177)
You can disable all server-side execution of JavaScript, by passing the --noscripting option on the command
line or setting noscripting (page 996) in a configuration file.
Running .js files via a mongo shell Instance on the Server
You can run a JavaScript (.js) file using a mongo (page 942) shell instance on the server. This is a good technique
for performing batch administrative work. When you run mongo (page 942) shell on the server, connecting via the
localhost interface, the connection is fast with low latency.
The command helpers (page 210) provided in the mongo (page 942) shell are not available in JavaScript files because
they are not valid JavaScript. The following table maps the most common mongo (page 942) shell helpers to their
JavaScript equivalents.
198
Chapter 4. Administration
Shell Helpers
show dbs, show databases
use <db>
show collections
show users
show log <logname>
show logs
it
JavaScript Equivalents
db.adminCommand('listDatabases')
db = db.getSiblingDB('<db>')
db.getCollectionNames()
db.system.users.find()
db.adminCommand( { 'getLog' : '<logname>' } )
db.adminCommand( { 'getLog' : '*' } )
cursor = db.collection.find()
if ( cursor.hasNext() ){
cursor.next();
}
Concurrency
Refer to the individual method or operator documentation for any concurrency information. See also the concurrency
table (page 597).
Data Types in the mongo Shell
MongoDB BSON provide support for additional data types than JSON. Drivers (page 95) provide native support for
these data types in host languages and the mongo (page 942) shell also provides several helper classes to support
the use of these data types in the mongo (page 942) JavaScript shell. See MongoDB Extended JSON (page 108) for
additional information.
Types
Date The mongo (page 942) shell provides various options to return the date, either as a string or as an object:
Date() method which returns the current date as a string.
Date() constructor which returns an ISODate object when used with the new operator.
ISODate() constructor which returns an ISODate object when used with or without the new operator.
Consider the following examples:
To return the date as a string, use the Date() method, as in the following example:
var myDateString = Date();
To print the value of the variable, type the variable name in the shell, as in the following:
myDateString
199
To print the value of the variable, type the variable name in the shell, as in the following:
myDateObject
You can use the new operator with the ISODate() constructor as well.
To print the value of the variable, type the variable name in the shell, as in the following:
myDateObject2
See
ObjectId (page 103) for full documentation of ObjectIds in MongoDB.
200
Chapter 4. Administration
NumberLong By default, the mongo (page 942) shell treats all numbers as floating-point values. The mongo
(page 942) shell provides the NumberLong() class to handle 64-bit integers.
The NumberLong() constructor accepts the long as a string:
NumberLong("2090845886852")
The following examples use the NumberLong() class to write to the collection:
db.collection.insert( { _id: 10, calc: NumberLong("2090845886852") } )
db.collection.update( { _id: 10 },
{ $set: { calc: NumberLong("2555555000000") } } )
db.collection.update( { _id: 10 },
{ $inc: { calc: NumberLong(5) } } )
If you use the $inc (page 651) to increment the value of a field that contains a NumberLong object by a float, the
data type changes to a floating point value, as in the following example:
1. Use $inc (page 651) to increment the calc field by 5, which the mongo (page 942) shell treats as a float:
db.collection.update( { _id: 10 },
{ $inc: { calc: 5 } } )
In the updated document, the calc field contains a floating point value:
{ "_id" : 10, "calc" : 2555555000010 }
NumberInt By default, the mongo (page 942) shell treats all numbers as floating-point values. The mongo
(page 942) shell provides the NumberInt() constructor to explicitly specify 32-bit integers.
Check Types in the mongo Shell
To determine the type of fields, the mongo (page 942) shell provides the following operators:
instanceof returns a boolean to test if a value has a specific type.
typeof returns the type of a field.
Example
Consider the following operations using instanceof and typeof:
The following operation tests whether the _id field is of type ObjectId:
mydoc._id instanceof ObjectId
201
typeof mydoc._id
In this case typeof will return the more generic object type rather than ObjectId type.
From the mongo (page 942) shell or from a JavaScript file, you can instantiate database connections using the
Mongo() (page 919) constructor:
new Mongo()
new Mongo(<host>)
new Mongo(<host:port>)
Consider the following example that instantiates a new connection to the MongoDB instance running on localhost on
the default port and sets the global db variable to myDatabase using the getDB() (page 917) method:
conn = new Mongo();
db = conn.getDB("myDatabase");
Additionally, you can use the connect() method to connect to the MongoDB instance. The following example
connects to the MongoDB instance that is running on localhost with the non-default port 27020 and set the
global db variable:
db = connect("localhost:27020/myDatabase");
When writing scripts for the mongo (page 942) shell, consider the following:
To set the db global variable, use the getDB() (page 917) method or the connect() method. You can assign
the database reference to a variable other than db.
Inside the script, cal db.getLastError() (page 886) explicitly to wait for the result of write operations
(page 50).
You cannot use any shell helper (e.g. use <dbname>, show dbs, etc.) inside the JavaScript file because
they are not valid JavaScript.
The following table maps the most common mongo (page 942) shell helpers to their JavaScript equivalents.
202
Chapter 4. Administration
Shell Helpers
show dbs, show databases
use <db>
show collections
show users
show log <logname>
show logs
it
JavaScript Equivalents
db.adminCommand('listDatabases')
db = db.getSiblingDB('<db>')
db.getCollectionNames()
db.system.users.find()
db.adminCommand( { 'getLog' : '<logname>' } )
db.adminCommand( { 'getLog' : '*' } )
cursor = db.collection.find()
if ( cursor.hasNext() ){
cursor.next();
}
In interactive mode, mongo (page 942) prints the results of operations including the content of all cursors. In
scripts, either use the JavaScript print() function or the mongo (page 942) specific printjson() function
which returns formatted JSON.
Example
To print all items in a result cursor in mongo (page 942) shell scripts, use the following idiom:
cursor = db.collection.find();
while ( cursor.hasNext() ) {
printjson( cursor.next() );
}
Scripting
From the system prompt, use mongo (page 942) to evaluate JavaScript.
--eval option Use the --eval option to mongo (page 942) to pass the shell a JavaScript fragment, as in the
following:
mongo test --eval "printjson(db.getCollectionNames())"
This returns the output of db.getCollectionNames() (page 886) using the mongo (page 942) shell connected
to the mongod (page 925) or mongos (page 938) instance running on port 27017 on the localhost interface.
Execute a JavaScript file You can specify a .js file to the mongo (page 942) shell, and mongo (page 942) will
execute the JavaScript directly. Consider the following example:
mongo localhost:27017/test myjsfile.js
203
This operation executes the myjsfile.js script in a mongo (page 942) shell that connects to the test database
on the mongod (page 925) instance accessible via the localhost interface on port 27017.
Alternately, you can specify the mongodb connection parameters inside of the javascript file using the Mongo()
constructor. See Opening New Connections (page 202) for more information.
You can execute a .js file from within the mongo (page 942) shell, using the load() function, as in the following:
load("myjstest.js")
Note: There is no search path for the load() function. If the desired script is not in the current working directory
or the full specified path, mongo (page 942) will not be able to access the file.
To start the mongo (page 942) shell and connect to your MongoDB (page 925) instance running on localhost with
default port:
1. Go to your <mongodb installation dir>:
cd <mongodb installation dir>
If you have added the <mongodb installation dir>/bin to the PATH environment variable, you can
just type mongo instead of ./bin/mongo.
3. To display the database you are using, type db:
db
The operation should return test, which is the default database. To switch databases, issue the use <db>
helper, as in the following example:
use <database>
To list the available databases, use the helper show dbs. See also How can I access different databases
temporarily? (page 594) to access a different database from the current database without switching your current
database context (i.e. db..)
To start the mongo (page 942) shell with other options, see examples of starting up mongo (page 947) and mongo
reference (page 942) which provides details on the available options.
204
Chapter 4. Administration
Note: When starting, mongo (page 942) checks the users HOME (page 946) directory for a JavaScript file named
.mongorc.js (page 946). If found, mongo (page 942) interprets the content of .mongorc.js before displaying
the prompt for the first time. If you use the shell to evaluate a JavaScript file or expression, either by using the
--eval option on the command line or by specifying a .js file to mongo (page 945), mongo (page 942) will read the
.mongorc.js file after the JavaScript has finished processing.
Executing Queries
From the mongo (page 942) shell, you can use the shell methods (page 806) to run queries, as in the following
example:
db.<collection>.find()
The find() (page 816) method is the JavaScript method to retrieve documents from <collection>. The
find() (page 816) method returns a cursor to the results; however, in the mongo (page 942) shell, if the
returned cursor is not assigned to a variable using the var keyword, then the cursor is automatically iterated up
to 20 times to print up to the first 20 documents that match the query. The mongo (page 942) shell will prompt
Type it to iterate another 20 times.
You can set the DBQuery.shellBatchSize attribute to change the number of iteration from the default
value 20, as in the following example which sets it to 10:
DBQuery.shellBatchSize = 10;
For more information and examples on cursor handling in the mongo (page 942) shell, see Cursors (page 43).
See also Cursor Help (page 209) for list of cursor help in the mongo (page 942) shell.
For more documentation of basic MongoDB operations in the mongo (page 942) shell, see:
Getting Started with MongoDB (page 26)
mongo Shell Quick Reference (page 210)
Read Operations (page 39)
Write Operations (page 50)
Indexing Tutorials (page 338)
Read Operations (page 39)
Write Operations (page 50)
205
The mongo (page 942) shell automatically prints the results of the find() (page 816) method if the returned cursor is
not assigned to a variable using the var keyword. To format the result, you can add the .pretty() to the operation,
as in the following:
db.<collection>.find().pretty()
In addition, you can use the following explicit print methods in the mongo (page 942) shell:
print() to print without formatting
print(tojson(<obj>)) to print with JSON formatting and equivalent to printjson()
printjson() to print with JSON formatting and equivalent to print(tojson(<obj>))
Evaluate a JavaScript File
You can execute a .js file from within the mongo (page 942) shell, using the load() function, as in the following:
load("myjstest.js")
Note: There is no search path for the load() function. If the desired script is not in the current working directory
or the full specified path, mongo (page 942) will not be able to access the file.
You may modify the content of the prompt by creating the variable prompt in the shell. The prompt variable can
hold strings as well as any arbitrary JavaScript. If prompt holds a function that returns a string, mongo (page 942)
can display dynamic information in each prompt. Consider the following examples:
Example
Create a prompt with the number of operations issued in the current session, define the following variables:
cmdCount = 1;
prompt = function() {
return (cmdCount++) + "> ";
}
Example
206
Chapter 4. Administration
To create a mongo (page 942) shell prompt in the form of <database>@<hostname>$ define the following
variables:
host = db.serverStatus().host;
prompt = function() {
return db+"@"+host+"$ ";
}
Example
To create a mongo (page 942) shell prompt that contains the system up time and the number of documents in the
current database, define the following prompt variable:
prompt = function() {
return "Uptime:"+db.serverStatus().uptime+" Documents:"+db.stats().objects+" > ";
}
207
{ "soDoes" : "this" }
>
Note: As mongo (page 942) shell interprets code edited in an external editor, it may modify code in functions,
depending on the JavaScript compiler. For mongo (page 942) may convert 1+1 to 2 or remove comments. The actual
changes affect only the appearance of the code and will vary based on the version of JavaScript used but will not affect
the semantics of the code.
To see the list of options and help for starting the mongo (page 942) shell, use the --help option from the command
line:
mongo --help
Shell Help
To see the list of help, in the mongo (page 942) shell, type help:
help
Database Help
To see the list of databases on the server, use the show dbs command:
show dbs
New in version 2.4: show databases is now an alias for show dbs
To see the list of help for methods you can use on the db object, call the db.help() (page 889) method:
db.help()
208
Chapter 4. Administration
To see the implementation of a method in the shell, type the db.<method name> without the parenthesis (()), as in the following example which will return the implementation of the method db.addUser()
(page 875):
db.addUser
Collection Help
To see the list of collections in the current database, use the show collections command:
show collections
To see the help for methods available on the collection objects (e.g.
db.<collection>.help() method:
db.collection.help()
<collection> can be the name of a collection that exists, although you may specify a collection that doesnt
exist.
To see the collection method implementation, type the db.<collection>.<method> name without the
parenthesis (()), as in the following example which will return the implementation of the save() (page 846)
method:
db.collection.save
Cursor Help
When you perform read operations (page 39) with the find() (page 816) method in the mongo (page 942) shell,
you can use various cursor methods to modify the find() (page 816) behavior and various JavaScript methods to
handle the cursor returned from the find() (page 816) method.
To list the available modifier and cursor handling methods, use the db.collection.find().help()
command:
db.collection.find().help()
<collection> can be the name of a collection that exists, although you may specify a collection that doesnt
exist.
To see the implementation of the cursor method, type the db.<collection>.find().<method> name
without the parenthesis (()), as in the following example which will return the implementation of the
toArray() method:
db.collection.find().toArray
209
Type Help
To get a list of the wrapper classes available in the mongo (page 942) shell, such as BinData(), type help misc
in the mongo (page 942) shell:
help misc
You can retrieve previous commands issued in the mongo (page 942) shell with the up and down arrow keys. Command history is stored in ~/.dbshell file. See .dbshell (page 946) for more information.
Command Line Options
The mongo (page 942) executable can be started with numerous options. See mongo executable (page 942) page for
details on all available options.
The following table displays some common options for mongo (page 942):
OpDescription
tion
--help Show command line options
--nodb Start mongo (page 942) shell without connecting to a database.
To connect later, see Opening New Connections (page 202).
--shellUsed in conjunction with a JavaScript file (i.e. <file.js> (page 945)) to continue in the mongo
(page 942) shell after running the JavaScript file.
See JavaScript file (page 203) for an example.
Command Helpers
The mongo (page 942) shell provides various help. The following table displays some common help methods and
commands:
210
Chapter 4. Administration
The mongo (page 942) shell provides numerous mongo Shell Methods (page 806) methods for database operations.
In the mongo (page 942) shell, db is the variable that references the current database. The variable is automatically
set to the default database test or is set when you use the use <db> to switch current database.
The following table displays some common JavaScript operations:
211
Description
If running in secure mode, authenticate the user.
Set a specific collection in the current database to a variable coll, as in the following example:
coll = db.myCollection;
You can perform operations on the myCollection
using the variable, as in the following example:
coll.find();
212
Function
previous-history
next-history
beginning-of-line
Continued on next page
Chapter 4. Administration
Queries
In the mongo (page 942) shell, perform read operations using the find() (page 816) and findOne() (page 824)
methods.
The find() (page 816) method returns a cursor object which the mongo (page 942) shell iterates to print documents
on screen. By default, mongo (page 942) prints the first 20. The mongo (page 942) shell will prompt the user to
Type it to continue iterating the next 20 results.
213
The following table provides some common read operations in the mongo (page 942) shell:
214
Chapter 4. Administration
Read Operations
db.collection.find(<query>) (page 816)
db.collection.find( <query>,
<projection> ) (page 816)
db.collection.find().sort( <sort
order> ) (page 872)
db.collection.findOne( <query> )
(page 824)
Description
Find the documents matching the <query> criteria in
the collection. If the <query> criteria is not specified
or is empty (i.e {} ), the read operation selects all documents in the collection.
The following example selects the documents in the
users collection with the name field equal to "Joe":
coll = db.users;
coll.find( { name: "Joe" } );
For more information on specifying the <query> criteria, see Query Documents (page 68).
Find documents matching the <query> criteria and return just specific fields in the <projection>.
The following example selects all documents from the
collection but returns only the name field and the _id
field. The _id is always returned unless explicitly specified to not return.
coll = db.users;
coll.find( { },
{ name: true }
);
For
more
information
on
specifying
the
<projection>, see Limit Fields to Return from
a Query (page 72).
Return results in the specified <sort order>.
The following example selects all documents from the
collection and returns the results sorted by the name
field in ascending order (1). Use -1 for descending order:
coll = db.users;
coll.find().sort( { name: 1 } );
Return the documents matching the <query> criteria
in the specified <sort order>.
Limit result to <n> rows. Highly recommended if you
need only a certain number of rows for best performance.
Skip <n> results.
Returns total number of documents in the collection.
Returns the total number of documents that match the
query.
The count() (page 860) ignores limit()
(page 867) and skip() (page 871). For example, if 100 records match but the limit is 10, count()
(page 860) will return 100. This will be faster than
iterating yourself, but still take time.
Find and return a single document. Returns null if not
found.
The following example selects a single document in the users collection with the
name field matches to "Joe":
coll = db.users;
coll.findOne( { name: "Joe" } );
Internally, the findOne() (page 824) method is
the find() (page 816) method with a limit(1)
215
(page 867).
See Query Documents (page 68) and Read Operations (page 39) documentation for more information and examples.
See Operators (page 621) to specify other query operators.
Error Checking Methods
The mongo (page 942) shell provides numerous administrative database methods (page 806), including error checking
methods. These methods are:
Error Checking Methods
db.getLastError() (page 886)
db.getLastErrorObj() (page 886)
Description
Returns error message from the last operation.
Returns the error document from the last operation.
The following table lists some common methods to support database administration:
JavaScript Database
Administration Methods
db.cloneDatabase(<host>)
(page 877)
db.copyDatabase(<from>,
<to>, <host>) (page 878)
Description
Clone the current database from the <host> specified. The <host>
database instance must be in noauth mode.
Copy the <from> database from the <host> to the <to> database on the
current server.
The <host> database instance must be in noauth mode.
db.fromColl.renameCollection(<toColl>)
Rename collection from fromColl to <toColl>.
(page 846)
db.repairDatabase()
Repair and compact the current database. This operation can be very slow
(page 892)
on large databases.
db.addUser( <user>,
Add user to current database.
<pwd> ) (page 875)
db.getCollectionNames() Get the list of all collections in the current database.
(page 886)
db.dropDatabase()
Drops the current database.
(page 884)
See also administrative database methods (page 806) for a full list of methods.
Opening Additional Connections
You can create new connections within the mongo (page 942) shell.
The following table displays the methods to create the connections:
JavaScript Connection Create Methods
db = connect("<host><:port>/<dbname>")
conn = new Mongo()
db = conn.getDB("dbname")
Description
Open a new database connection.
Open a connection to a new server using new
Mongo().
Use getDB() method of the connection to select a
database.
See also Opening New Connections (page 202) for more information on the opening new connections from the mongo
(page 942) shell.
216
Chapter 4. Administration
Miscellaneous
Description
Prints the BSON size of an <document>
See the MongoDB JavaScript API Documentation40 for a full list of JavaScript methods .
Additional Resources
Consider the following reference material that addresses the mongo (page 942) shell and its interface:
mongo (page 942)
mongo Shell Methods (page 806)
Operators (page 621)
Database Commands (page 694)
Aggregation Reference (page 306)
Additionally, the MongoDB source code repository includes a jstests directory41 which contains numerous mongo
(page 942) shell scripts.
See also:
The MongoDB Manual contains administrative documentation and tutorials though out several sections. See Replica
Set Tutorials (page 419) and Sharded Cluster Tutorials (page 521) for additional tutorials and information.
217
Packages
MongoDB
Be sure you have the latest stable release. All releases are available on the Downloads45 page. This is a good place to
verify what is current, even if you then choose to install via a package manager.
Always use 64-bit builds for production. The 32-bit build MongoDB offers for test and development environments is
not suitable for production deployments as it can store no more than 2GB of data. See the 32-bit limitations (page 584)
for more information.
32-bit builds exist to support use on development machiness.
Operating Systems
MongoDB distributions are currently available for Mac OS X, Linux, Windows Server 2008 R2 64bit, Windows 7 (32
bit and 64 bit), Windows Vista, and Solaris platforms.
Note: MongoDB uses the GNU C Library46 (glibc) if available on a system. MongoDB requires version at least
glibc-2.12-1.2.el6 to avoid a known bug with earlier versions. For best results use at least version 2.13.
Concurrency
In earlier versions of MongoDB, all write operations contended for a single readers-writer lock on the MongoDB
instance. As of version 2.2, each database has a readers-writer lock that allows concurrent reads access to a database,
but gives exclusive access to a single write operation per database. See the Concurrency (page 596) page for more
information.
Write Concern
Write concern describes the guarantee that MongoDB provides when reporting on the success of a write operation.
The strength of the write concerns determine the level of guarantee. When inserts, updates and deletes have a weak
write concern, write operations return quickly. In some failure cases, write operations issued with weak write concerns
may not persist. With stronger write concerns, clients wait after sending a write operation for MongoDB to confirm
the write operations.
MongoDB provides different levels of write concern to better address the specific needs of applications. Clients
may adjust write concern to ensure that the most important operations persist successfully to an entire MongoDB
42 http://mms.mongodb.com
43 http://mms.mongodb.com/
44 http://mms.mongodb.com/help/
45 http://www.mongodb.org/downloads
46 http://www.gnu.org/software/libc/
218
Chapter 4. Administration
deployment. For other less critical operations, clients can adjust the write concern to ensure faster performance rather
than ensure persistence to the entire deployment.
See Write Concern (page 55) for more information about choosing an appropriate write concern level for your deployment.
Journaling
MongoDB uses write ahead logging to an on-disk journal to guarantee that MongoDB is able to quickly recover the
write operations (page 50) following a crash or other serious failure.
In order to ensure that mongod (page 925) will be able to recover and remain in a consistent state following a crash,
you should leave journaling enabled. See Journaling (page 232) for more information.
Connection Pool
To avoid overloading the connection resources of a single mongod (page 925) or mongos (page 938) instance, ensure
that clients maintain reasonable connection pool sizes.
The connPoolStats (page 765) database command returns information regarding the number of open connections
to the current database for mongos (page 938) instances and mongod (page 925) instances in sharded clusters.
Hardware Requirements and Limitations
MongoDB is designed specifically with commodity hardware in mind and has few hardware requirements or limitations. MongoDBs core components run on little-endian hardware, primarily x86/x86_64 processors. Client libraries
(i.e. drivers) can run on big or little endian systems.
The hardware for the most effective MongoDB deployments have the following properties:
As with all software, more RAM and a faster CPU clock speed are important for performance.
In general, databases are not CPU bound. As such, increasing the number of cores can help, but does not provide
significant marginal return.
MongoDB has good results and a good price-performance ratio with SATA SSD (Solid State Disk).
Use SSD if available and economical. Spinning disks can be performant, but SSDs capacity for random I/O
operations works well with the update model of mongod (page 925).
Commodity (SATA) spinning drives are often a good option, as the increase to random I/O for more expensive
drives is not that dramatic (only on the order of 2x). Using SSDs or increasing RAM may be more effective in
increasing I/O throughput.
Remote file storage can create performance problems in MongoDB. See Remote Filesystems (page 220) for
more information about storage and MongoDB.
MongoDB on NUMA Hardware
Important: The discussion of NUMA in this section only applies to Linux, and therefore does not affect deployments
where mongod (page 925) instances run other UNIX-like systems or on Windows.
Running MongoDB on a system with Non-Uniform Access Memory (NUMA) can cause a number of operational
problems, including slow performance for periods of time or high system process usage.
219
When running MongoDB on NUMA hardware, you should disable NUMA for MongoDB and instead set an interleave
memory policy.
Note: MongoDB version 2.0 and greater checks these settings on start up when deployed on a Linux-based system,
and prints a warning if the system is NUMA-based.
To disable NUMA for MongoDB and set an interleave memory policy, use the numactl command and start mongod
(page 925) in the following manner:
numactl --interleave=all /usr/bin/local/mongod
Then, disable zone reclaim in the proc settings using the following command:
echo 0 > /proc/sys/vm/zone_reclaim_mode
To fully disable NUMA, you must perform both operations. For more information, see the Documentation for
/proc/sys/vm/*47 .
See the The MySQL swap insanity problem and the effects of NUMA48 post, which describes the effects of NUMA
on databases. This blog post addresses the impact of NUMA for MySQL, but the issues for MongoDB are similar. The
post introduces NUMA and its goals, and illustrates how these goals are not compatible with production databases.
Disk and Storage Systems
Swap
Assign swap space for your systems. Allocating swap space can avoid issues with memory contention and can prevent
the OOM Killer on Linux systems from killing mongod (page 925).
The method mongod (page 925) uses to map memory files to memory ensures that the operating system will never
store MongoDB data in swap space.
RAID
The Network File System protocol (NFS) is not recommended for use with MongoDB as some versions perform
poorly.
Performance problems arise when both the data files and the journal files are hosted on NFS. You may experience
better performance if you place the journal on local or iscsi volumes. If you must use NFS, add the following NFS
options to your /etc/fstab file: bg, nolock, and noatime.
Many MongoDB deployments work successfully with Amazons Elastic Block Store (EBS) volumes. There are certain
intrinsic performance characteristics with EBS volumes that users should consider.
47 http://www.kernel.org/doc/Documentation/sysctl/vm.txt
48 http://jcole.us/blog/archives/2010/09/28/mysql-swap-insanity-and-the-numa-architecture/
220
Chapter 4. Administration
MongoDB on Linux
Important: The following discussion only applies to Linux, and therefore does not affect deployments where
mongod (page 925) instances run other UNIX-like systems or on Windows.
When running MongoDB in production on Linux, it is recommended that you use Linux kernel version 2.6.36 or later.
MongoDB preallocates its database files before using them and often creates large files. As such, you should use the
Ext4 and XFS file systems:
In general, if you use the Ext4 file system, use at least version 2.6.23 of the Linux Kernel.
In general, if you use the XFS file system, use at least version 2.6.25 of the Linux Kernel.
Some Linux distributions require different versions of the kernel to support using ext4 and/or xfs:
Linux Distribution
CentOS 5.5
CentOS 5.6
CentOS 5.8
CentOS 6.1
RHEL 5.6
RHEL 6.0
Ubuntu 10.04.4 LTS
Amazon Linux AMI releast 2012.03
Filesystem
ext4, xfs
ext4, xfs
ext4, xfs
ext4, xfs
ext4
xfs
ext4, xfs
ext4
Kernel Version
2.6.18-194.el5
2.6.18-238.el5
2.6.18-308.8.2.el5
2.6.32-131.0.15.el6.x86_64
2.6.18-238
2.6.32-71
2.6.32-38-server
3.2.12-3.2.4.amzn1.x86_64
Important: MongoDB requires a filesystem that supports fsync() on directories. For example, HGFS and Virtual
Boxs shared folders do not support this operation.
Recommended Configuration
Turn off atime for the storage volume containing the database files.
Set the file descriptor limit, -n, and the user process limit (ulimit), -u, above 20,000, according to the suggestions in the UNIX ulimit Settings (page 225). A low ulimit will affect MongoDB when under heavy use and can
produce errors and lead to failed connections to MongoDB processes and loss of service.
Do not use hugepages virtual memory pages as MongoDB performs better with normal virtual memory pages.
Disable NUMA in your BIOS. If that is not possible see MongoDB on NUMA Hardware (page 219).
Ensure that readahead settings for the block devices that store the database files are appropriate. For random
access use patterns, set low readahead values. A readahead of 32 (16kb) often works well.
Use the Network Time Protocol (NTP) to synchronize time among your hosts. This is especially important in
sharded clusters.
Networking
Always run MongoDB in a trusted environment, with network rules that prevent access from all unknown machines,
systems, and networks. As with any sensitive system dependent on network access, your MongoDB deployment
should only be accessible to specific systems that require access, such as application servers, monitoring services, and
other MongoDB components.
4.3. Administration Reference
221
Note: By default, auth (page 993) is not enabled and mongod (page 925) assumes a trusted environment. You can
enable security/auth (page 237) mode if you need it.
See documents in the Security (page 235) section for additional information, specifically:
Configuration Options (page 239)
Firewalls (page 240)
Configure Linux iptables Firewall for MongoDB (page 242)
Configure Windows netsh Firewall for MongoDB (page 246)
For Windows users, consider the Windows Server Technet Article on TCP Configuration49 when
deploying MongoDB on Windows.
MongoDB on Virtual Environments
The section describes considerations when running MongoDB in some of the more common virtual environments.
EC2
MongoDB is compatible with EC2 and requires no configuration changes specific to the environment.
You may alternately choose to obtain a set of Amazon Machine Images (AMI) that bundle together MongoDB and
Amazons Provisioned IOPS storage volumes. Provisioned IOPS can greatly increase MongoDBs performance and
ease of use. For more information, see this blog post50 .
VMWare
MongoDB is compatible with VMWare. As some users have run into issues with VMWares memory overcommit
feature, disabling the feature is recommended.
It is possible to clone a virtual machine running MongoDB. You might use this function to spin up a new virtual host
to add as a member of a replica set. If you clone a VM with journaling enabled, the clone snapshot will be consistent.
If not using journaling, first stop mongod (page 925), then clone the VM, and finally, restart mongod (page 925).
OpenVZ
Some users have had issues when running MongoDB on some older version of OpenVZ due to its handling of virtual
memory, as with VMWare.
This issue seems to have been resolved in the more recent versions of OpenVZ.
Performance Monitoring
iostat
On Linux, use the iostat command to check if disk I/O is a bottleneck for your database. Specify a number of
seconds when running iostat to avoid displaying stats covering the time since server boot.
49 http://technet.microsoft.com/en-us/library/dd349797.aspx
50 http://www.mongodb.com/blog/post/provisioned-iops-aws-marketplace-significantly-boosts-mongodb-performance-ease-use
222
Chapter 4. Administration
For example, the following command will display extended statistics and the time for each displayed report, with
traffic in MB/s, at one second intervals:
iostat -xmt 1
bwm-ng51 is a command-line tool for monitoring network use. If you suspect a network-based bottleneck, you may
use bwm-ng to begin your diagnostic process.
Separate Components onto Different Storage Devices
For improved performance, consider separating your databases data, journal, and logs onto different storage devices,
based on your applications access and write pattern.
Note: This will affect your ability to create snapshot-style backups of your data, since the files will be on different
devices and volumes.
Backups
To make backups of your MongoDB database, please refer to Backup Strategies for MongoDB Systems (page 136).
Data in MongoDB has a dynamic schema. Collections do not enforce document structure. This facilitates iterative
development and polymorphism. Nevertheless, collections often hold documents with highly homogeneous structures.
See Data Modeling Considerations for MongoDB Applications (page 117) for more information.
Some operational considerations include:
the exact set of collections to be used;
the indexes to be used: with the exception of the _id index, all indexes must be created explicitly;
shard key declarations: choosing a good shard key is very important as the shard key cannot be changed once
set.
Avoid importing unmodified data directly from a relational database. In general, you will want to roll up certain
data into richer documents that take advantage of MongoDBs support for sub-documents and nested arrays.
51 http://www.gropp.org/?id=projects&sub=bwm-ng
223
MongoDB strings are case sensitive. So a search for "joe" will not find "Joe".
Consider:
storing data in a normalized case format, or
using regular expressions ending with http://docs.mongodb.org/manuali, and/or
using $toLower (page 685) or $toUpper (page 685) in the aggregation framework (page 279).
Type Sensitive Fields
MongoDB data is stored in the BSON52 format, a binary encoded serialization of JSON-like documents. BSON
encodes additional type information. See bsonspec.org53 for more information.
Consider the following document which has a field x with the string value "123":
{ x : "123" }
Then the following query which looks for a number value 123 will not return that document:
db.mycollection.find( { x : 123 } )
General Considerations
By Default, Updates Affect one Document
To update multiple documents that meet your query criteria, set the update multi option to true or 1. See:
Update Multiple Documents (page 53).
Prior to MongoDB 2.2, you would specify the upsert and multi options in the update method as positional
boolean options. See: the update method reference documentation.
BSON Document Size Limit
The BSON Document Size (page 1015) limit is currently set at 16MB per document. If you require larger documents, use GridFS (page 154).
No Fully Generalized Transactions
MongoDB does not have fully generalized transactions (page 84). Creating rich documents that closely resemble and
reflect your application-level objects to (words).
Replica Set Considerations
Use an Odd Number of Replica Set Members
Replica sets (page 377) perform consensus elections. To ensure that elections will proceed successfully, either use an
odd number of members, typically three, or else use an arbiter to ensure an odd number of votes.
52 http://docs.mongodb.org/meta-driver/latest/legacy/bson/
53 http://bsonspec.org/#/specification
224
Chapter 4. Administration
MongoDB replica sets support automatic failover (page 396). It is important for your secondaries to be up-to-date.
There are various strategies for assessing consistency:
1. Use monitoring tools to alert you to lag events. See Monitoring for MongoDB (page 138) for a detailed discussion of MongoDBs monitoring options.
2. Specify appropriate write concern.
3. If your application requires manual fail over, you can configure your secondaries as priority 0 (page 386).
Priority 0 secondaries require manual action for a failover. This may be practical for a small replica set, but
large deployments should fail over automatically.
See also:
replica set rollbacks (page 401).
Sharding Considerations
Pick your shard keys carefully. You cannot choose a new shard key for a collection that is already sharded.
Shard key values are immutable.
When enabling sharding on an existing collection, MongoDB imposes a maximum size on those collections to ensure that it is possible to create chunks. For a detailed explanation of this limit, see:
<sharding-existing-collection-data-size>.
To shard large amounts of data, create a new empty sharded collection, and ingest the data from the source
collection using an application level import operation.
Unique indexes are not enforced across shards except for the shard key itself. See Enforce Unique Keys for
Sharded Collections (page 558).
Consider pre-splitting (page 521) a sharded collection before a massive bulk import.
Resource Utilization
mongod (page 925) and mongos (page 938) each use threads and file descriptors to track connections and manage
internal operations. This section outlines the general resource utilization patterns for MongoDB. Use these figures in
combination with the actual information about your deployment and its use to determine ideal ulimit settings.
Generally, all mongod (page 925) and mongos (page 938) instances:
track each incoming connection with a file descriptor and a thread.
225
1 file descriptor for each data file in use by the mongod (page 925) instance.
1 file descriptor for each journal file used by the mongod (page 925) instance when journal (page 995) is
true.
In replica sets, each mongod (page 925) maintains a connection to all other members of the set.
mongod (page 925) uses background threads for a number of internal processes, including TTL collections (page 158),
replication, and replica set health checks, which may require a small number of additional resources.
mongos
In addition to the threads and file descriptors for client connections, mongos (page 938) must maintain connects to
all config servers and all shards, which includes all members of all replica sets.
For mongos (page 938), consider the following behaviors:
mongos (page 938) instances maintain a connection pool to each shard so that the mongos (page 938) can
reuse connections and quickly fulfill requests without needing to create new connections.
You can limit the number of incoming connections using the maxConns (page 992) run-time option.
By restricting the number of incoming connections you can prevent a cascade effect where the mongos
(page 938) creates too many connections on the mongod (page 925) instances.
Note: You cannot set maxConns (page 992) to a value higher than 20000.
Note: Both the hard and the soft ulimit affect MongoDBs performance. The hard ulimit refers to the
maximum number of processes that a user can have active at any time. This is the ceiling: no non-root process can
increase the hard ulimit. In contrast, the soft ulimit is the limit that is actually enforced for a session or
process, but any process can increase it up to hard ulimit maximum.
A low soft ulimit can cause cant create new thread, closing connection errors if the number
of connections grows too high. For this reason, it is extremely important to set both ulimit values to the recommended values.
You can use the ulimit command at the system prompt to check system limits, as in the following example:
$ ulimit -a
-t: cpu time (seconds)
-f: file size (blocks)
-d: data seg size (kbytes)
-s: stack size (kbytes)
-c: core file size (blocks)
-m: resident set size (kbytes)
-u: processes
-n: file descriptors
226
unlimited
unlimited
unlimited
8192
0
unlimited
192276
21000
Chapter 4. Administration
40000
unlimited
unlimited
192276
819200
30
65
unlimited
ulimit refers to the per-user limitations for various resources. Therefore, if your mongod (page 925) instance
executes as a user that is also running multiple processes, or multiple mongod (page 925) processes, you might see
contention for these resources. Also, be aware that the processes value (i.e. -u) refers to the combined number of
distinct processes and sub-process threads.
You can change ulimit settings by issuing a command in the following form:
ulimit -n <value>
For many distributions of Linux you can change values by substituting the -n option for any possible value in the
output of ulimit -a. On OS X, use the launchctl limit command. See your operating system documentation
for the precise procedure for changing system limits on running systems.
Note: After changing the ulimit settings, you must restart the process to take advantage of the modified settings.
You can use the http://docs.mongodb.org/manualproc file system to see the current limitations on a
running process.
Depending on your systems configuration, and default settings, any change to system limits made using ulimit
may revert following system a system restart. Check your distribution and operating system documentation for more
information.
227
You can copy and paste this function into a current shell session or load it as part of a script. Call the function with
one the following invocations:
return-limits mongod
return-limits mongos
return-limits mongod mongos
Soft Limit
unlimited
unlimited
unlimited
8720000
0
unlimited
192276
1024
40960000
unlimited
unlimited
192276
819200
30
65
unlimited
Hard Limit
unlimited
unlimited
unlimited
unlimited
unlimited
unlimited
192276
4096
40960000
unlimited
unlimited
192276
819200
30
65
unlimited
Units
seconds
bytes
bytes
bytes
bytes
bytes
processes
files
bytes
bytes
locks
signals
bytes
us
Recommended Settings
Every deployment may have unique requirements and settings; however, the following thresholds and settings are
particularly important for mongod (page 925) and mongos (page 938) deployments:
-f (file size): unlimited
-t (cpu time): unlimited
-v (virtual memory): unlimited 54
-n (open files): 64000
-m (memory size): unlimited 1
-u (processes/threads): 32000
Always remember to restart your mongod (page 925) and mongos (page 938) instances after changing the ulimit
settings to make sure that the settings change takes effect.
If you limit virtual or resident memory size on a system running MongoDB the operating system will refuse to honor additional allocation
requests.
228
Chapter 4. Administration
Collections
System collections include these collections stored directly in the database:
<database>.system.namespaces
The <database>.system.namespaces (page 229) collection contains information about all of the
databases collections. Additional namespace metadata exists in the database.ns files and is opaque to
database users.
<database>.system.indexes
The <database>.system.indexes (page 229) collection lists all the indexes in the database. Add and
remove data from this collection via the ensureIndex() (page 814) and dropIndex() (page 812)
<database>.system.profile
The <database>.system.profile (page 229) collection stores database profiling information. For information on profiling, see Database Profiling (page 143).
<database>.system.users
The <database>.system.users (page 270) collection stores credentials for users who have access
to the database. For more information on this collection, see Add a User to a Database (page 257) and
<database>.system.users (page 270).
<database>.system.js
The <database>.system.js (page 229) collection holds special JavaScript code for use in server side
JavaScript (page 198). See Store a JavaScript Function on the Server (page 177) for more information.
229
}
},
"nscanned" : 8,
"moved" : true,
"nmoved" : 1,
"nupdated" : 1,
"keyUpdates" : 0,
"numYield" : 0,
"lockStats" : {
"timeLockedMicros" : {
"r" : NumberLong(0),
"w" : NumberLong(258)
},
"timeAcquiringMicros" : {
"r" : NumberLong(0),
"w" : NumberLong(7)
}
},
"millis" : 0,
"client" : "127.0.0.1",
"user" : ""
}
Output Reference
For any single operation, the documents created by the database profiler will include a subset of the following fields.
The precise selection of fields in these documents depends on the type of operation.
system.profile.ts
The timestamp of the operation.
system.profile.op
The type of operation. The possible values are:
insert
query
update
remove
getmore
command
system.profile.ns
The namespace the operation targets. Namespaces in MongoDB take the form of the database, followed by a
dot (.), followed by the name of the collection.
system.profile.query
The query document (page 68) used.
system.profile.command
The command operation.
system.profile.updateobj
The <update> document passed in during an update (page 50) operation.
system.profile.cursorid
The ID of the cursor accessed by a getmore operation.
230
Chapter 4. Administration
system.profile.ntoreturn
Changed in version 2.2: In 2.0, MongoDB includes this field for query and command operations. In 2.2, this
information MongoDB also includes this field for getmore operations.
The number of documents the operation specified to return. For example, the profile (page 770) command would return one document (a results document) so the ntoreturn (page 230) value would be 1. The
limit(5) (page 867) command would return five documents so the ntoreturn (page 230) value would be
5.
If the ntoreturn (page 230) value is 0, the command did not specify a number of documents to return, as
would be the case with a simple find() (page 816) command with no limit specified.
system.profile.ntoskip
New in version 2.2.
The number of documents the skip() (page 871) method specified to skip.
system.profile.nscanned
The number of documents that MongoDB scans in the index (page 313) in order to carry out the operation.
In general, if nscanned (page 231) is much higher than nreturned (page 232), the database is scanning
many objects to find the target objects. Consider creating an index to improve this.
system.profile.moved
If moved (page 231) has a value of true indicates that the update operation moved one or more documents
to a new location on disk. These operations take more time than in-place updates, and typically occur when
documents grow as a result of document growth.
system.profile.nmoved
New in version 2.2.
The number of documents moved on disk by the operation.
system.profile.nupdated
New in version 2.2.
The number of documents updated by the operation.
system.profile.keyUpdates
New in version 2.2.
The number of index (page 313) keys the update changed in the operation. Changing an index key carries a
small performance cost because the database must remove the old key and inserts a new key into the B-tree
index.
system.profile.numYield
New in version 2.2.
The number of times the operation yielded to allow other operations to complete. Typically, operations yield
when they need access to data that MongoDB has not yet fully read into memory. This allows other operations
that have data in memory to complete while MongoDB reads in data for the yielding operation. For more
information, see the FAQ on when operations yield (page 597).
system.profile.lockStats
New in version 2.2.
The time in microseconds the operation spent acquiring and holding locks. This field reports data for the
following lock types:
R - global read lock
W - global write lock
r - database-specific read lock
231
232
Chapter 4. Administration
Important: If you place the journal on a different filesystem from your data files you cannot use a filesystem snapshot
to capture consistent backups of a dbpath (page 993) directory.
Note: Depending on your file system, you might experience a preallocation lag the first time you start a mongod
(page 925) instance with journaling enabled.
MongoDB may preallocate journal files if the mongod (page 925) process determines that it is more efficient to
preallocate journal files than create new journal files as needed. The amount of time required to pre-allocate lag might
last several minutes, during which you will not be able to connect to the database. This is a one-time preallocation and
does not occur with future invocations.
To avoid preallocation lag, see Avoid Preallocation Lag (page 175).
Storage Views used in Journaling
Journaling adds three storage views to MongoDB.
The shared view stores modified data for upload to the MongoDB data files. The shared view is the only view
with direct access to the MongoDB data files. When running with journaling, mongod (page 925) asks the operating
system to map your existing on-disk data files to the shared view memory view. The operating system maps the
files but does not load them. MongoDB later loads data files to shared view as needed.
The private view stores data for use in read operations (page 39). MongoDB maps private view to the
shared view and is the first place MongoDB applies new write operations (page 50).
The journal is an on-disk view that stores new write operations after MongoDB applies the operation to the private
cache but before applying them to the data files. The journal provides durability. If the mongod (page 925) instance
were to crash without having applied the writes to the data files, the journal could replay the writes to the shared
view for eventual upload to the data files.
How Journaling Records Write Operations
MongoDB copies the write operations to the journal in batches called group commits.
See
journalCommitInterval (page 995) for more information on the default commit interval. These group
commits help minimize the performance impact of journaling.
Journaling stores raw operations that allow MongoDB to reconstruct the following:
document insertion/updates
index modifications
changes to the namespace files
As write operations (page 50) occur, MongoDB writes the data to the private view in RAM and then copies the
write operations in batches to the journal. The journal stores the operations on disk to ensure durability. MongoDB
adds the operations as entries on the journals forward pointer. Each entry describes which bytes the write operation
changed in the data files.
MongoDB next applies the journals write operations to the shared view. At this point, the shared view
becomes inconsistent with the data files.
At default intervals of 60 seconds, MongoDB asks the operating system to flush the shared view to disk. This
brings the data files up-to-date with the latest write operations.
When MongoDB flushes write operations to the data files, MongoDB removes the write operations from the journals
behind pointer. The behind pointer is always far back from the advanced pointer.
233
As part of journaling, MongoDB routinely asks the operating system to remap the shared view to the private
view, for consistency.
Note: The interaction between the shared view and the on-disk data files is similar to how MongoDB works
without journaling, which is that MongoDB asks the operating system to flush in-memory changes back to the data
files every 60 seconds.
234
Chapter 4. Administration
CHAPTER 5
Security
This section outlines basic security and risk management strategies and access control. The included tutorials outline
specific tasks for configuring firewalls, authentication, and system privileges.
Security Introduction (page 235) A high-level introduction to security and MongoDB deployments.
Security Concepts (page 237) The core documentation of security.
Access Control (page 237) Control access to MongoDB instances using authentication and authorization.
Network Exposure and Security (page 239) Discusses potential security risks related to the network and strategies for decreasing possible network-based attack vectors for MongoDB.
Security and MongoDB API Interfaces (page 241) Discusses potential risks related
JavaScript, HTTP and REST interfaces, including strategies to control those risks.
to
MongoDBs
Security Tutorials (page 242) Tutorials for enabling and configuring security features for MongoDB.
Create a Vulnerability Report (page 263) Report a vulnerability in MongoDB.
Network Security Tutorials (page 242) Ensure that the underlying network configuration supports a secure operating environment for MongoDB deployments, and appropriately limits access to MongoDB deployments.
Access Control Tutorials (page 254) MongoDBs access control system provides role-based access control for
limiting access to MongoDB deployments. These tutorials describe procedures relevant for the operation
and maintenance of this access control system.
Security Reference (page 264) Reference for security related functions.
235
most effective way to reduce the risk of exploitation is to run MongoDB in a trusted environment, limit access, follow
a system of least privilege, and follow best development and deployment practices.
236
Chapter 5. Security
237
Authorization
MongoDB provisions authorization, or access to databases and operations, on a per-database level. MongoDB uses
a role-based approach to authorization, storing each users roles in a privilege document (page 265) in a databases
system.users (page 270) collection. For more information on privilege documents and available user roles, see
system.users Privilege Documents (page 270) and User Privilege Roles in MongoDB (page 265).
Important: The admin database provides roles that are unavailable in other databases, including a role that effectively makes a user a MongoDB system superuser. See Database Administration Roles (page 266) and Administrative
Roles (page 267).
To assign roles to users, you must be a user with administrative role in the database. As such, you must first create an
administrative user. For details, see Create a User Administrator (page 255) and Add a User to a Database (page 257).
system.users Collection
A databases system.users (page 270) collection stores information for authentication and authorization to that
database. Specifically, the collection stores user credentials for authentication and user privilege information for
authorization. MongoDB requires authorization to access the system.users (page 270) collection in order to
prevent privilege escalation attacks. To access the collection, you must have either userAdmin (page 267) or
userAdminAnyDatabase (page 269) role.
Changed in version 2.4: The schema of system.users (page 270) changed to accommodate a more sophisticated
authorization using user privilege model, as defined in privilege documents (page 265).
238
Chapter 5. Security
Note: You may chose to set these run-time configuration options using the --keyFile (or mongos --keyFile)
options on the command line.
Setting keyFile (page 993) enables authentication and specifies a key file for the replica set members to use when
authenticating to each other. The content of the key file is arbitrary but must be the same on all members of the replica
set and on all mongos (page 938) instances that connect to the set.
The key file must be less one kilobyte in size and may only contain characters in the base64 set. The key file must
not have group or world permissions on UNIX systems. See Generate a Key File (page 258) for instructions on
generating a key file.
The nohttpinterface (page 996) setting for mongod (page 925) and mongos (page 938) instances disables
the home status page, which would run on port 28017 by default. The status interface is read-only by default. You may also specify this option on the command line as mongod --nohttpinterface or mongos
--nohttpinterface.
Authentication does not control or affect access to this interface.
Important: Disable this option for production deployments. If you do leave this interface enabled, you should only
allow trusted clients to access this port. See Firewalls (page 240).
rest
The rest (page 997) setting for mongod (page 925) enables a fully interactive administrative REST interface, which
is disabled by default. The status interface, which is enabled by default, is read-only. This configuration makes that
interface fully interactive. The REST interface does not support any authentication and you should always restrict
access to this interface to only allow trusted clients to connect to this port.
You may also enable this interface on the command line as mongod --rest.
Important: Disable this option for production deployments. If do you leave this interface enabled, you should only
allow trusted clients to access this port.
239
bind_ip
The bind_ip (page 991) setting for mongod (page 925) and mongos (page 938) instances limits the network interfaces on which MongoDB programs will listen for incoming connections. You can also specify a number of interfaces
by passing bind_ip (page 991) a comma separated list of IP addresses. You can use the mongod --bind_ip
and mongos --bind_ip option on the command line at run time to limit the network accessibility of a MongoDB
program.
Important: Make sure that your mongod (page 925) and mongos (page 938) instances are only accessible on
trusted networks. If your system has more than one network interface, bind MongoDB programs to the private or
internal network interface.
port
The port (page 991) setting for mongod (page 925) and mongos (page 938) instances changes the main port on
which the mongod (page 925) or mongos (page 938) instance listens for connections. The default port is 27017.
Changing the port does not meaningfully reduce risk or limit exposure. You may also specify this option on the
command line as mongod --port or mongos --port. Setting port (page 991) also indirectly sets the port for
the HTTP status interface, which is always available on the port numbered 1000 greater than the primary mongod
(page 925) port.
Only allow trusted clients to connect to the port for the mongod (page 925) and mongos (page 938) instances. See
Firewalls (page 240).
See also Security Considerations (page 147) and Default MongoDB Port (page 272).
Firewalls
Firewalls allow administrators to filter and control access to a system by providing granular control over what network
communications. For administrators of MongoDB, the following capabilities are important: limiting incoming traffic
on a specific port to specific systems, and limiting incoming traffic from untrusted hosts.
On Linux systems, the iptables interface provides access to the underlying netfilter firewall. On Windows
systems, netsh command line interface provides access to the underlying Windows Firewall. For additional information about firewall configuration, see Configure Linux iptables Firewall for MongoDB (page 242) and Configure
Windows netsh Firewall for MongoDB (page 246).
For best results and to minimize overall exposure, ensure that only traffic from trusted sources can reach mongod
(page 925) and mongos (page 938) instances and that the mongod (page 925) and mongos (page 938) instances can
only connect to trusted outputs.
See also:
For MongoDB deployments on Amazons web services, see the Amazon EC26 page, which addresses Amazons
Security Groups and other EC2-specific security features.
Virtual Private Networks
Virtual private networks, or VPNs, make it possible to link two networks over an encrypted and limited-access trusted
network. Typically MongoDB users who use VPNs use SSL rather than IPSEC VPNs for performance issues.
Depending on configuration and implementation, VPNs provide for certificate validation and a choice of encryption
protocols, which requires a rigorous level of authentication and identification of all clients. Furthermore, because
6 http://docs.mongodb.org/ecosystem/platforms/amazon-ec2
240
Chapter 5. Security
VPNs provide a secure tunnel, by using a VPN connection to control access to your MongoDB instance, you can
prevent tampering and man-in-the-middle attacks.
The mongo (page 942) program can evaluate JavaScript expressions using the command line --eval option.
Also, the mongo (page 942) program can evaluate a JavaScript file (.js) passed directly to it (e.g. mongo
someFile.js).
Because the mongo (page 942) program evaluates the JavaScript without validating the input, this behavior presents
a vulnerability.
.mongorc.js File
If a .mongorc.js file exists 7 , the mongo (page 942) shell will evaluate a .mongorc.js file before starting. You
can disable this behavior by passing the mongo --norc option.
HTTP Status Interface
The HTTP status interface provides a web-based interface that includes a variety of operational data, logs, and status
reports regarding the mongod (page 925) or mongos (page 938) instance. The HTTP interface is always available
on the port numbered 1000 greater than the primary mongod (page 925) port. By default, the HTTP interface port
is 28017, but is indirectly set using the port (page 991) option which allows you to configure the primary mongod
(page 925) port.
Without the rest (page 997) setting, this interface is entirely read-only, and limited in scope; nevertheless, this
interface may represent an exposure. To disable the HTTP interface, set the nohttpinterface (page 996) run
time option or the --nohttpinterface command line option. See also Configuration Options (page 239).
REST API
The REST API to MongoDB provides additional information and write access on top of the HTTP Status interface.
While the REST API does not provide any support for insert, update, or remove operations, it does provide administrative access, and its accessibility represents a vulnerability in a secure environment. The REST interface is disabled
by default, and is not recommended for production use.
If you must use the REST API, please control and limit access to the REST API. The REST API does not include any
support for authentication, even when running with auth (page 993) enabled.
See the following documents for instructions on restricting access to the REST API interface:
7 On Linux and Unix systems, mongo (page 942) reads the .mongorc.js file from $HOME/.mongorc.js (i.e. ~/.mongorc.js). On
Windows, mongo.exe reads the .mongorc.js file from %HOME%.mongorc.js or %HOMEDRIVE%%HOMEPATH%.mongorc.js.
241
242
Chapter 5. Security
This document outlines basic firewall configurations for iptables firewalls on Linux. Use these approaches as a
starting point for your larger networking organization. For a detailed overview of security practices and risk management for MongoDB, see Security Concepts (page 237).
See also:
For MongoDB deployments on Amazons web services, see the Amazon EC28 page, which addresses Amazons
Security Groups and other EC2-specific security features.
Overview
Rules in iptables configurations fall into chains, which describe the process for filtering and processing specific
streams of traffic. Chains have an order, and packets must pass through earlier rules in a chain to reach later rules.
This document addresses only the following two chains:
INPUT Controls all incoming traffic.
OUTPUT Controls all outgoing traffic.
Given the default ports (page 239) of all MongoDB processes, you must configure networking rules that permit only
required communication between your application and the appropriate mongod (page 925) and mongos (page 938)
instances.
Be aware that, by default, the default policy of iptables is to allow all connections and traffic unless explicitly
disabled. The configuration changes outlined in this document will create rules that explicitly allow traffic from
specific addresses and on specific ports, using a default policy that drops all traffic that is not explicitly allowed. When
you have properly configured your iptables rules to allow only the traffic that you want to permit, you can Change
Default Policy to DROP (page 245).
Patterns
This section contains a number of patterns and examples for configuring iptables for use with MongoDB deployments. If you have configured different ports using the port (page 991) configuration setting, you will need to modify
the rules accordingly.
Traffic to and from mongod Instances This pattern is applicable to all mongod (page 925) instances running as
standalone instances or as part of a replica set.
The goal of this pattern is to explicitly allow traffic to the mongod (page 925) instance from the application server. In
the following examples, replace <ip-address> with the IP address of the application server:
The first rule allows all incoming traffic from <ip-address> on port 27017, which allows the application server
to connect to the mongod (page 925) instance. The second rule, allows outgoing traffic from the mongod (page 925)
to reach the application server.
Optional
If you have only one application server, you can replace <ip-address> with either the IP address itself, such as:
198.51.100.55. You can also express this using CIDR notation as 198.51.100.55/32. If you want to permit
a larger block of possible IP addresses you can allow traffic from a http://docs.mongodb.org/manual24
using one of the following specifications for the <ip-address>, as follows:
8 http://docs.mongodb.org/ecosystem/platforms/amazon-ec2
243
10.10.10.10/24
10.10.10.10/255.255.255.0
Traffic to and from mongos Instances mongos (page 938) instances provide query routing for sharded clusters.
Clients connect to mongos (page 938) instances, which behave from the clients perspective as mongod (page 925)
instances. In turn, the mongos (page 938) connects to all mongod (page 925) instances that are components of the
sharded cluster.
Use the same iptables command to allow traffic to and from these instances as you would from the mongod
(page 925) instances that are members of the replica set. Take the configuration outlined in the Traffic to and from
mongod Instances (page 243) section as an example.
Traffic to and from a MongoDB Config Server Config servers, host the config database that stores metadata
for sharded clusters. Each production cluster has three config servers, initiated using the mongod --configsvr
option. 9 Config servers listen for connections on port 27019. As a result, add the following iptables rules to the
config server to allow incoming and outgoing connection on port 27019, for connection to the other config servers.
Replace <ip-address> with the address or address space of all the mongod (page 925) that provide config servers.
Additionally, config servers need to allow incoming connections from all of the mongos (page 938) instances in the
cluster and all mongod (page 925) instances in the cluster. Add rules that resemble the following:
Replace <ip-address> with the address of the mongos (page 938) instances and the shard mongod (page 925)
instances.
Traffic to and from a MongoDB Shard Server For shard servers, running as mongod --shardsvr 10 Because
the default port number when running with shardsvr (page 1001) is 27018, you must configure the following
iptables rules to allow traffic to and from each shard:
Replace the <ip-address> specification with the IP address of all mongod (page 925). This allows you to permit
incoming and outgoing traffic between all shards including constituent replica set members, to:
all mongod (page 925) instances in the shards replica sets.
all mongod (page 925) instances in other shards.
11
You can also run a config server by setting the configsvr (page 1001) option in a configuration file.
You can also specify the shard server option using the shardsvr (page 1001) setting in the configuration file. Shard members are also often
conventional replica sets using the default port.
11 All shards in a cluster need to be able to communicate with all other shards to facilitate chunk and balancing operations.
10
244
Chapter 5. Security
iptables -A OUTPUT -d <ip-address> -p tcp --source-port 27018 -m state --state ESTABLISHED -j ACCEPT
Replace <ip-address> with the address of the instance that needs access to the HTTP or REST interface.
For all deployments, you should restrict access to this port to only the monitoring instance.
Optional
For shard server mongod (page 925) instances running with shardsvr (page 1001), the rule would resemble
the following:
For config server mongod (page 925) instances running with configsvr (page 1001), the rule would resemble
the following:
The default policy for iptables chains is to allow all traffic. After completing all iptables configuration changes,
you must change the default policy to DROP so that all traffic that isnt explicitly allowed as above will not be able to
reach components of the MongoDB deployment. Issue the following commands to change this policy:
iptables -P INPUT DROP
iptables -P OUTPUT DROP
This section contains a number of basic operations for managing and using iptables. There are various front end
tools that automate some aspects of iptables configuration, but at the core all iptables front ends provide the
same basic functionality:
Make all iptables Rules Persistent By default all iptables rules are only stored in memory. When your
system restarts, your firewall rules will revert to their defaults. When you have tested a rule set and have guaranteed
that it effectively controls traffic you can use the following operations to you should make the rule set persistent.
On Red Hat Enterprise Linux, Fedora Linux, and related distributions you can issue the following command:
service iptables save
On Debian, Ubuntu, and related distributions, you can use the following command to dump the iptables rules to
the /etc/iptables.conf file:
245
Place this command in your rc.local file, or in the /etc/network/if-up.d/iptables file with other
similar operations.q
List all iptables Rules To list all of currently applied iptables rules, use the following operation at the system
shell.
iptables --L
Flush all iptables Rules If you make a configuration mistake when entering iptables rules or simply need to
revert to the default rule set, you can use the following operation at the system shell to flush all rules:
iptables --F
If youve already made your iptables rules persistent, you will need to repeat the appropriate procedure in the
Make all iptables Rules Persistent (page 245) section.
Configure Windows netsh Firewall for MongoDB
On Windows Server systems, the netsh program provides methods for managing the Windows Firewall. These
firewall rules make it possible for administrators to control what hosts can connect to the system, and limit risk
exposure by limiting the hosts that can connect to a system.
This document outlines basic Windows Firewall configurations. Use these approaches as a starting point for your
larger networking organization. For a detailed over view of security practices and risk management for MongoDB, see
Security Concepts (page 237).
See also:
Windows Firewall12 documentation from Microsoft.
Overview
Windows Firewall processes rules in an ordered determined by rule type, and parsed in the following order:
1. Windows Service Hardening
2. Connection security rules
3. Authenticated Bypass Rules
4. Block Rules
5. Allow Rules
6. Default Rules
12 http://technet.microsoft.com/en-us/network/bb545423.aspx
246
Chapter 5. Security
By default, the policy in Windows Firewall allows all outbound connections and blocks all incoming connections.
Given the default ports (page 239) of all MongoDB processes, you must configure networking rules that permit only
required communication between your application and the appropriate mongod.exe (page 948) and mongos.exe
(page 950) instances.
The configuration changes outlined in this document will create rules which explicitly allow traffic from specific
addresses and on specific ports, using a default policy that drops all traffic that is not explicitly allowed.
You can configure the Windows Firewall with using the netsh command line tool or through a windows application.
On Windows Server 2008 this application is Windows Firewall With Advanced Security in Administrative Tools. On
previous versions of Windows Server, access the Windows Firewall application in the System and Security control
panel.
The procedures in this document use the netsh command line tool.
Patterns
This section contains a number of patterns and examples for configuring Windows Firewall for use with MongoDB
deployments. If you have configured different ports using the port (page 991) configuration setting, you will need to
modify the rules accordingly.
Traffic to and from mongod.exe Instances This pattern is applicable to all mongod.exe (page 948) instances
running as standalone instances or as part of a replica set. The goal of this pattern is to explicitly allow traffic to the
mongod.exe (page 948) instance from the application server.
netsh advfirewall firewall add rule name="Open mongod port 27017" dir=in action=allow protocol=TCP lo
This rule allows all incoming traffic to port 27017, which allows the application server to connect to the
mongod.exe (page 948) instance.
Windows Firewall also allows enabling network access for an entire application rather than to a specific port, as in the
following example:
netsh advfirewall firewall add rule name="Allowing mongod" dir=in action=allow program=" C:\mongodb\b
You can allow all access for a mongos.exe (page 950) server, with the following invocation:
netsh advfirewall firewall add rule name="Allowing mongos" dir=in action=allow program=" C:\mongodb\b
Traffic to and from mongos.exe Instances mongos.exe (page 950) instances provide query routing for
sharded clusters. Clients connect to mongos.exe (page 950) instances, which behave from the clients perspective as mongod.exe (page 948) instances. In turn, the mongos.exe (page 950) connects to all mongod.exe
(page 948) instances that are components of the sharded cluster.
Use the same Windows Firewall command to allow traffic to and from these instances as you would from the
mongod.exe (page 948) instances that are members of the replica set.
netsh advfirewall firewall add rule name="Open mongod shard port 27018" dir=in action=allow protocol=
Traffic to and from a MongoDB Config Server Configuration servers, host the config database that stores metadata for sharded clusters. Each production cluster has three configuration servers, initiated using the mongod
--configsvr option. 13 Configuration servers listen for connections on port 27019. As a result, add the following Windows Firewall rules to the config server to allow incoming and outgoing connection on port 27019, for
connection to the other config servers.
13
You can also run a config server by setting the configsvr (page 1001) option in a configuration file.
247
netsh advfirewall firewall add rule name="Open mongod config svr port 27019" dir=in action=allow prot
Additionally, config servers need to allow incoming connections from all of the mongos.exe (page 950) instances
in the cluster and all mongod.exe (page 948) instances in the cluster. Add rules that resemble the following:
netsh advfirewall firewall add rule name="Open mongod config svr inbound" dir=in action=allow protoco
Replace <ip-address> with the addresses of the mongos.exe (page 950) instances and the shard mongod.exe
(page 948) instances.
Traffic to and from a MongoDB Shard Server For shard servers, running as mongod --shardsvr 14 Because
the default port number when running with shardsvr (page 1001) is 27018, you must configure the following
Windows Firewall rules to allow traffic to and from each shard:
netsh advfirewall firewall add rule name="Open mongod shardsvr inbound" dir=in action=allow protocol=
netsh advfirewall firewall add rule name="Open mongod shardsvr outbound" dir=out action=allow protoco
Replace the <ip-address> specification with the IP address of all mongod.exe (page 948) instances. This allows
you to permit incoming and outgoing traffic between all shards including constituent replica set members to:
all mongod.exe (page 948) instances in the shards replica sets.
all mongod.exe (page 948) instances in other shards.
15
netsh advfirewall firewall add rule name="Open mongod config svr outbound" dir=out action=allow proto
netsh advfirewall firewall add rule name="Open mongod HTTP monitoring inbound" dir=in action=all
Replace <ip-address> with the address of the instance that needs access to the HTTP or REST interface.
For all deployments, you should restrict access to this port to only the monitoring instance.
Optional
For shard server mongod.exe (page 948) instances running with shardsvr (page 1001), the rule would
resemble the following:
netsh advfirewall firewall add rule name="Open mongos HTTP monitoring inbound" dir=in action=all
14 You can also specify the shard server option using the shardsvr (page 1001) setting in the configuration file. Shard members are also often
conventional replica sets using the default port.
15 All shards in a cluster need to be able to communicate with all other shards to facilitate chunk and balancing operations.
248
Chapter 5. Security
For config server mongod.exe (page 948) instances running with configsvr (page 1001), the rule would
resemble the following:
netsh advfirewall firewall add rule name="Open mongod configsvr HTTP monitoring inbound" dir=in
This section contains a number of basic operations for managing and using netsh. While you can use the GUI front
ends to manage the Windows Firewall, all core functionality is accessible is accessible from netsh.
Delete all Windows Firewall Rules To delete the firewall rule allowing mongod.exe (page 948) traffic:
netsh advfirewall firewall delete rule name="Open mongod port 27017" protocol=tcp localport=27017
netsh advfirewall firewall delete rule name="Open mongod shard port 27018" protocol=tcp localport=270
List All Windows Firewall Rules To return a list of all Windows Firewall rules:
netsh advfirewall firewall show rule name=all
Backup and Restore Windows Firewall Rules To simplify administration of larger collection of systems, you can
export or import firewall systems from different servers) rules very easily on Windows:
Export all firewall rules with the following command:
netsh advfirewall export "C:\temp\MongoDBfw.wfw"
Replace "C:\temp\MongoDBfw.wfw" with a path of your choosing. You can use a command in the following
form to import a file created using this operation:
netsh advfirewall import "C:\temp\MongoDBfw.wfw"
249
Combine SSL Certificate and Key File Before you can use SSL, you must have a .pem file that contains the public
key certificate and private key. MongoDB can use any valid SSL certificate. To generate a self-signed certificate and
private key, use a command that resembles the following:
cd /etc/ssl/
openssl req -new -x509 -days 365 -nodes -out mongodb-cert.crt -keyout mongodb-cert.key
This operation generates a new, self-signed certificate with no passphrase that is valid for 365 days. Once you have
the certificate, concatenate the certificate and private key to a .pem file, as in the following example:
cat mongodb-cert.key mongodb-cert.crt > mongodb.pem
Set Up mongod and mongos with SSL Certificate and Key To use SSL in your MongoDB deployment, include
the following run-time options with mongod (page 925) and mongos (page 938):
sslOnNormalPorts (page 1003)
sslPEMKeyFile (page 1003) with the .pem file that contains the SSL certificate and key.
Consider the following syntax for mongod (page 925):
mongod --sslOnNormalPorts --sslPEMKeyFile <pem>
For example, given an SSL certificate located at /etc/ssl/mongodb.pem, configure mongod (page 925) to use
SSL encryption for all connections with the following command:
mongod --sslOnNormalPorts --sslPEMKeyFile /etc/ssl/mongodb.pem
Note:
Specify <pem> with the full path name to the certificate.
If the private key portion of the <pem> is encrypted, specify the encryption password with the
sslPEMKeyPassword (page 1003) option.
You may also specify these options in the configuration file (page 990), as in the following example:
sslOnNormalPorts = true
sslPEMKeyFile = /etc/ssl/mongodb.pem
To connect, to mongod (page 925) and mongos (page 938) instances using SSL, the mongo (page 942) shell and
MongoDB tools must include the --ssl option. See SSL Configuration for Clients (page 252) for more information
on connecting to mongod (page 925) and mongos (page 938) running with SSL.
Set Up mongod and mongos with Certificate Validation To set up mongod (page 925) or mongos (page 938)
for SSL encryption using an SSL certificate signed by a certificate authority, include the following run-time options
during startup:
sslOnNormalPorts (page 1003)
sslPEMKeyFile (page 1003) with the name of the .pem file that contains the signed SSL certificate and key.
sslCAFile (page 1004) with the name of the .pem file that contains the root certificate chain from the
Certificate Authority.
Consider the following syntax for mongod (page 925):
250
Chapter 5. Security
For example, given a signed SSL certificate located at /etc/ssl/mongodb.pem and the certificate authority file
at /etc/ssl/ca.pem, you can configure mongod (page 925) for SSL encryption as follows:
mongod --sslOnNormalPorts --sslPEMKeyFile /etc/ssl/mongodb.pem --sslCAFile /etc/ssl/ca.pem
Note:
Specify the <pem> file and the <ca> file with either the full path name or the relative path name.
If the <pem> is encrypted, specify the encryption password with the sslPEMKeyPassword (page 1003)
option.
You may also specify these options in the configuration file (page 990), as in the following example:
sslOnNormalPorts = true
sslPEMKeyFile = /etc/ssl/mongodb.pem
sslCAFile = /etc/ssl/ca.pem
To connect, to mongod (page 925) and mongos (page 938) instances using SSL, the mongo (page 942) tools must
include the both the --ssl and --sslPEMKeyFile option. See SSL Configuration for Clients (page 252) for more
information on connecting to mongod (page 925) and mongos (page 938) running with SSL.
Block Revoked Certificates for Clients To prevent clients with revoked certificates from connecting, include the
sslCRLFile (page 1004) to specify a .pem file that contains revoked certificates.
For example, the following mongod (page 925) with SSL configuration includes the sslCRLFile (page 1004)
setting:
Clients with revoked certificates in the /etc/ssl/ca-crl.pem will not be able to connect to this mongod
(page 925) instance.
Validate Only if a Client Presents a Certificate In most cases it is important to ensure that clients present valid
certificates. However, if you have clients that cannot present a client certificate, or are transitioning to using a certificate
authority you may only want to validate certificates from clients that present a certificate.
If you want to bypass validation for clients that dont present certificates, include the
sslWeakCertificateValidation (page 1004) run-time option with mongod (page 925) and mongos
(page 938). If the client does not present a certificate, no validation occurs. These connections, though not validated,
are still encrypted using SSL.
For example, consider the following mongod (page 925) with an SSL configuration that includes the
sslWeakCertificateValidation (page 1004) setting:
Then, clients can connect either with the option --ssl and no certificate or with the option --ssl and a valid
certificate. See SSL Configuration for Clients (page 252) for more information on SSL connections for clients.
Note: If the client presents a certificate, the certificate must be a valid certificate.
All connections, including those that have not presented certificates are encrypted using SSL.
251
Run in FIPS Mode If your mongod (page 925) or mongos (page 938) is running on a system with an OpenSSL
library configured with the FIPS 140-2 module, you can run mongod (page 925) or mongos (page 938) in FIPS
mode, with the sslFIPSMode (page 1004) setting.
SSL Configuration for Clients
Clients must have support for SSL to work with a mongod (page 925) or a mongos (page 938) instance that has SSL
support enabled. The current versions of the Python, Java, Ruby, Node.js, .NET, and C++ drivers have support for
SSL, with full support coming in future releases of other drivers.
mongo SSL Configuration For SSL connections, you must use the mongo (page 942) shell built with SSL support
or distributed with MongoDB Enterprise. To support SSL, mongo (page 942) has the following settings:
--ssl
--sslPEMKeyFile (page 1003) with the name of the .pem file that contains the SSL certificate and key.
--sslCAFile (page 1004) with the name of the .pem file that contains the certificate from the Certificate
Authority.
--sslPEMKeyPassword (page 1003) option if the client certificate-key file is encrypted.
Connect to MongoDB Instance with SSL Encryption To connect to a mongod (page 925) or mongos (page 938)
instance that requires only a SSL encryption mode (page 250), start mongo (page 942) shell with --ssl, as in the
following:
mongo --ssl
Connect to MongoDB Instance that Requires Client Certificates To connect to a mongod (page 925) or mongos
(page 938) that requires CA-signed client certificates (page 250), start the mongo (page 942) shell with --ssl and
the --sslPEMKeyFile (page 1003) option to specify the signed certificate-key file, as in the following:
mongo --ssl --sslPEMKeyFile /etc/ssl/client.pem
Connect to MongoDB Instance that Validates when Presented with a Certificate To connect to a mongod
(page 925) or mongos (page 938) instance that only requires valid certificates when the client presents a certificate (page 251), start mongo (page 942) shell either with the --ssl ssl and no certificate or with the --ssl ssl and
a valid signed certificate.
For example, if mongod (page 925) is running with weak certificate validation, both of the following mongo
(page 942) shell clients can connect to that mongod (page 925):
mongo --ssl
mongo --ssl --sslPEMKeyFile /etc/ssl/client.pem
MMS Monitoring Agent The Monitoring agent will also have to connect via SSL in order to gather its stats. Because the agent already utilizes SSL for its communications to the MMS servers, this is just a matter of enabling SSL
support in MMS itself on a per host basis.
Use the Edit host button (i.e. the pencil) on the Hosts page in the MMS console to enable SSL.
252
Chapter 5. Security
Please see the MMS documentation18 for more information about MMS configuration.
PyMongo Add the ssl=True parameter to a PyMongo MongoClient19 to create a MongoDB connection to
an SSL MongoDB instance:
from pymongo import MongoClient
c = MongoClient(host="mongodb.example.net", port=27017, ssl=True)
throws Exception {
Ruby The recent versions of the Ruby driver have support for connections to SSL servers. Install the latest version
of the driver with the following command:
gem install mongo
253
connection = MongoReplicaSetClient.new(['localhost:27017'],
['localhost:27018'],
:ssl => true)
.NET As of release 1.6, the .NET driver supports SSL connections with mongod (page 925) and mongos (page 938)
instances. To connect using SSL, you must add an option to the connection string, specifying ssl=true as follows:
var connectionString = "mongodb://localhost/?ssl=true";
var server = MongoServer.Create(connectionString);
The .NET driver will validate the certificate against the local trusted certificate store, in addition to providing encryption of the server. This behavior may produce issues during testing if the server uses a self-signed certificate. If
you encounter this issue, add the sslverifycertificate=false option to the connection string to prevent the
.NET driver from validating the certificate, as follows:
var connectionString = "mongodb://localhost/?ssl=true&sslverifycertificate=false";
var server = MongoServer.Create(connectionString);
254
Chapter 5. Security
Enable Authentication
Enable authentication using the auth (page 993) or keyFile (page 993) settings. Use auth (page 993) for standalone instances, and keyFile (page 993) with replica sets and sharded clusters. keyFile (page 993) implies
auth (page 993) and allows members of a MongoDB deployment to authenticate internally.
Authentication requires at least one administrator user in the admin database. You can create the user before enabling
authentication or after enabling authentication.
See also:
Deploy MongoDB with Kerberos Authentication (page 259).
Procedures
You can enable authentication using either of the following procedures, depending
Create the Administrator Credentials and then Enable Authentication
1. Start the mongod (page 925) or mongos (page 938) instance without the auth (page 993) or keyFile
(page 993) setting.
2. Create the administrator user as described in Create a User Administrator (page 255).
3. Re-start the mongod (page 925) or mongos (page 938) instance with the auth (page 993) or keyFile
(page 993) setting.
Enable Authentication and then Create Administrator
1. Start the mongod (page 925) or mongos (page 938) instance with the auth (page 993) or keyFile
(page 993) setting.
2. Connect to the instance on the same system so that you can authenticate using the localhost exception (page 256).
3. Create the administrator user as described in Create a User Administrator (page 255).
Query Authenticated Users
If you have the userAdmin (page 267) or userAdminAnyDatabase (page 269) role on a database, you can query
authenticated users in that database with the following operation:
db.system.users.find()
255
Users with the userAdmin (page 267) and userAdminAnyDatabase (page 269) privileges are not the same as
the UNIX root superuser in that this role confers no additional access beyond user administration. These users
cannot perform administrative operations or read or write data without first conferring themselves with additional
permissions.
Note: The userAdmin (page 267) is a database specific privilege, and only grants a user the ability to administer
users on a single database. However, for the admin database, userAdmin (page 267) allows a user the ability to gain
userAdminAnyDatabase (page 269), and so for the admin database only these roles are effectively the same.
3. Add the user with either the userAdmin (page 267) role or userAdminAnyDatabase (page 269) role,
and only that role, by issuing a command similar to the following, where <username> is the username and
<password> is the password:
db.addUser( { user: "<username>",
pwd: "<password>",
roles: [ "userAdminAnyDatabase" ] } )
To authenticate as this user, you must authenticate against the admin database.
Authenticate with Full Administrative Access via Localhost
If there are no users for the admin database, you can connect with full administrative access via the localhost interface.
This bypass exists to support bootstrapping new deployments. This approach is useful, for example, if you want to run
mongod (page 925) or mongos (page 938) with authentication before creating your first user.
To authenticate via localhost, connect to the mongod (page 925) or mongos (page 938) from a client running on the
same system. Your connection will have full administrative access.
To disable the localhost bypass, set the enableLocalhostAuthBypass (page 1005) parameter using
setParameter (page 999) during startup:
mongod --setParameter enableLocalhostAuthBypass=0
Note: For versions of MongoDB 2.2 prior to 2.2.4, if mongos (page 938) is running with keyFile (page 993),
then all users connecting over the localhost interface must authenticate, even if there arent any users in the admin
database. Connections on localhost are not correctly granted full access on sharded systems that run those versions.
MongoDB 2.2.4 resolves this issue.
Note: In version 2.2, you cannot add the first user to a sharded cluster using the localhost connection. If you are
256
Chapter 5. Security
running a 2.2 sharded cluster and want to enable authentication, you must deploy the cluster and add the first user to
the admin database before restarting the cluster to run with keyFile (page 993).
Example
The following creates a user named Bob in the admin database. The privilege document (page 270) uses
Bobs credentials from the products database and assigns him userAdmin privileges.
use admin
db.addUser( { user: "Bob",
userSource: "products",
roles: [ "userAdmin" ]
} )
Example
The following creates a user named Carlos in the admin database and gives him readWrite access to the
config database, which lets him change certain settings for sharded clusters, such as to disable the balancer.
db = db.getSiblingDB('admin')
db.addUser( { user: "Carlos",
pwd: "Moon1234",
roles: [ "clusterAdmin" ],
257
Only the admin database supports the otherDBRoles (page 271) field.
and
the
new
desired
password
to
the
Example
The following operation changes the reporting users password to SOhSS3TbYhxusooLiW8ypJPxmt1oOfL:
db = db.getSiblingDB('records')
db.changeUserPassword("reporting", "SOhSS3TbYhxusooLiW8ypJPxmt1oOfL")
Note: In previous versions of MongoDB, you could change an existing users password by calling db.addUser()
(page 875) again with the users username and their updated password. Anything specified in the addUser() method
would override the existing information for that user. In newer versions of MongoDB, this will result in a duplicate
key error.
For more about changing a users password prior to version 2.4, see: Add a User to a Database (page 257).
Use the following openssl command at the system shell to generate pseudo-random content for a key file:
openssl rand -base64 741
Be aware that MongoDB strips whitespace characters (e.g. x0d, x09, and x20,) for cross-platform convenience. As
a result, the following operations produce identical keys:
258
Chapter 5. Security
echo
echo
echo
echo
-e
-e
-e
-e
Process Overview
Create Users and Privilege Documents For every user that you want to be able to authenticate using Kerberos, you
must create corresponding privilege documents in the system.users (page 270) collection to provision access to
users. Consider the following document:
{
user: "application/[email protected]",
roles: ["read"],
userSource: "$external"
}
259
This grants the Kerberos user principal application/[email protected] read only access to a
database. The userSource (page 271) $external reference allows mongod (page 925) to consult an external
source (i.e. Kerberos) to authenticate this user.
In the mongo (page 942) shell you can pass the db.addUser() (page 875) a user privilege document to provision
access to users, as in the following operation:
db = db.getSiblingDB("records")
db.addUser( {
"user": "application/[email protected]",
"roles": [ "read" ],
"userSource": "$external"
} )
These operations grants the Kerberos user application/[email protected] access to the records
database.
To remove access to a user, use the remove() (page 844) method, as in the following example:
db.system.users.remove( { user: "application/[email protected]" } )
To modify a user document, use update (page 50) operations on documents in the system.users (page 270)
collection.
See also:
system.users Privilege Documents (page 270) and User Privilege Roles in MongoDB (page 265).
Start mongod with Kerberos Support Once you have provisioned privileges to users in the mongod (page 925),
and obtained a valid keytab file, you must start mongod (page 925) using a command in the following form:
env KRB5_KTNAME=<path to keytab file> <mongod invocation>
For successful operation with mongod (page 925) use the following run time options in addition to your normal
default configuration options:
--setParameter with the authenticationMechanisms=GSSAPI argument to enable support for
Kerberos.
--auth to enable authentication.
--keyFile to allow components of a single MongoDB deployment to communicate with each other, if needed
to support replica set and sharded cluster operations. keyFile (page 993) implies auth (page 993).
For example, consider the following invocation:
env KRB5_KTNAME=/opt/mongodb/mongod.keytab \
/opt/mongodb/bin/mongod --dbpath /opt/mongodb/data \
--fork --logpath /opt/mongodb/log/mongod.log \
--auth --setParameter authenticationMechanisms=GSSAPI
You can also specify these options using the configuration file. As in the following:
# /opt/mongodb/mongod.conf, Example configuration file.
fork = true
auth = true
dbpath = /opt/mongodb/data
logpath = /opt/mongodb/log/mongod.log
setParameter = authenticationMechanisms=GSSAPI
260
Chapter 5. Security
To use this configuration file, start mongod (page 925) as in the following:
env KRB5_KTNAME=/opt/mongodb/mongod.keytab \
/opt/mongodb/bin/mongod --config /opt/mongodb/mongod.conf
To start a mongos (page 938) instance using Kerberos, you must create a Kerberos service principal and deploy a
keytab file for this instance, and then start the mongos (page 938) with the following invocation:
env KRB5_KTNAME=/opt/mongodb/mongos.keytab \
/opt/mongodb/bin/mongos
--configdb shard0.example.net,shard1.example.net,shard2.example.net \
--setParameter authenticationMechanisms=GSSAPI \
--keyFile /opt/mongodb/mongos.keyfile
If you encounter problems when trying to start mongod (page 925) or mongos (page 938), please see the troubleshooting section (page 262) for more information.
Important: Before users can authenticate to MongoDB using Kerberos you must create users (page 259) and grant
them privileges within MongoDB. If you have not created users when you start MongoDB with Kerberos you can
use the localhost authentication exception (page 256) to add users. See the Create Users and Privilege Documents
(page 259) section and the User Privilege Roles in MongoDB (page 265) document for more information.
Authenticate mongo Shell with Kerberos To connect to a mongod (page 925) instance using the mongo
(page 942) shell you must begin by using the kinit program to initialize and authenticate a Kerberos session. Then,
start a mongo (page 942) instance, and use the db.auth() (page 876) method, to authenticate against the special
$external database, as in the following operation:
use $external
db.auth( { mechanism: "GSSAPI", user: "application/[email protected]" } )
Alternately, you can authenticate using command line options to mongo (page 942), as in the following equivalent
example:
mongo --authenticationMechanism=GSSAPI
--authenticationDatabase='$external' \
--username application/[email protected]
261
Kerberos and the HTTP Console MongoDB does not support kerberizing the HTTP Console25 .
Troubleshooting
Kerberos Configuration Checklist If youre having trouble getting mongod (page 925) to start with Kerberos,
there are a number of Kerberos-specific issues that can prevent successful authentication. As you begin troubleshooting
your Kerberos deployment, ensure that:
The mongod (page 925) is from MongoDB Enterprise.
You are not using the HTTP Console26 . MongoDB Enterprise does not support Kerberos authentication over the
HTTP Console interface.
You have a valid keytab file specified in the environment running the mongod (page 925). For the
mongod (page 925) instance running on the db0.example.net host, the service principal should be
mongodb/db0.example.net.
DNS allows the mongod (page 925) to resolve the components of the Kerberos infrastructure. You should
have both A and PTR records (i.e. forward and reverse DNS) for the system that runs the mongod (page 925)
instance.
The canonical system hostname of the system that runs the mongod (page 925) instance is the resolvable fully
qualified domain for this host. Test system hostname resolution with the hostname -f command at the
system prompt.
Both the Kerberos KDC and the system running mongod (page 925) instance must be able to resolve each other
using DNS 27
The time systems of the systems running the mongod (page 925) instances and the Kerberos infrastructure are
synchronized. Time differences greater than 5 minutes will prevent successful authentication.
If you still encounter problems with Kerberos, you can start both mongod (page 925) and mongo (page 942) (or
another client) with the environment variable KRB5_TRACE set to different files to produce more verbose logging of
the Kerberos process to help further troubleshooting, as in the following example:
env KRB5_KTNAME=/opt/mongodb/mongod.keytab \
KRB5_TRACE=/opt/mongodb/log/mongodb-kerberos.log \
/opt/mongodb/bin/mongod --dbpath /opt/mongodb/data \
--fork --logpath /opt/mongodb/log/mongod.log \
--auth --setParameter authenticationMechanisms=GSSAPI
Common Error Messages In some situations, MongoDB will return error messages from the GSSAPI interface if
there is a problem with the Kerberos service.
GSSAPI error in client while negotiating security context.
This error occurs on the client and reflects insufficient credentials or a malicious attempt to authenticate.
If you receive this error ensure that youre using the correct credentials and the correct fully qualified
domain name when connecting to the host.
GSSAPI error acquiring credentials.
This error only occurs when attempting to start the mongod (page 925) or mongos (page 938) and
reflects improper configuration of system hostname or a missing or incorrectly configured keytab file. If
25 http://docs.mongodb.org/ecosystem/tools/http-interface/#http-console
26 http://docs.mongodb.org/ecosystem/tools/http-interface/#http-console
27
By default, Kerberos attempts to resolve hosts using the content of the /etc/kerb5.conf before using DNS to resolve hosts.
262
Chapter 5. Security
you encounter this problem, consider all the items in the Kerberos Configuration Checklist (page 262), in
particular:
examine the keytab file, with the following command:
klist -k <keytab>
Ensure that this name matches the name in the keytab file, or use the saslHostName (page 1007)
to pass MongoDB the correct hostname.
Enable the Traditional MongoDB Authentication Mechanism For testing and development purposes you can
enable both the Kerberos (i.e. GSSAPI) authentication mechanism in combination with the traditional MongoDB
challenge/response authentication mechanism (i.e. MONGODB-CR), using the following setParameter (page 999)
run-time option:
mongod --setParameter authenticationMechanisms=GSSAPI,MONGODB-CR
Warning: All keyFile (page 993) internal authentication between members of a replica set or sharded cluster
still uses the MONGODB-CR authentication mechanism, even if MONGODB-CR is not enabled. All client authentication will still use Kerberos.
263
Description
Adds a user to a database, and allows administrators to configure the
users privileges.
Authenticates a user to a database.
Changes an existing users password.
29 https://jira.mongodb.org
30 https://jira.mongodb.org/browse/SECURITY
31 [email protected]
32 https://jira.mongodb.org/browse/SECURITY
264
Chapter 5. Security
In general, you should set this option if your deployment does not need to support legacy user documents. Typically
legacy user documents are only useful during the upgrade process and while you migrate applications to the updated
privilege document form.
See privilege documents (page 270) and Delegated Credentials for MongoDB Authentication (page 272) for more
information about permissions and authentication in MongoDB.
Database User Roles
read
Provides users with the ability to read data from any collection within a specific logical database. This includes
find() (page 816) and the following database commands:
aggregate (page 694)
checkShardingIndex (page 736)
cloneCollectionAsCapped (page 749)
collStats (page 763)
count (page 695)
dataSize (page 769)
dbHash (page 761)
dbStats (page 767)
distinct (page 696)
265
266
Chapter 5. Security
Administrative Roles
clusterAdmin
clusterAdmin (page 267) grants access to several administration operations that affect or present information
about the whole system, rather than just a single database. These privileges include but are not limited to replica
set and sharded cluster administrative functions.
clusterAdmin (page 267) is only applicable on the admin database, and does not confer any access to the
local or config databases.
Specifically, users with the clusterAdmin (page 267) role have access to the following operations:
addShard (page 736)
closeAllDatabases (page 749)
connPoolStats (page 765)
connPoolSync (page 752)
_cpuProfilerStart
_cpuProfilerStop
cursorInfo (page 769)
diagLogging (page 769)
dropDatabase (page 746)
enableSharding (page 736)
flushRouterConfig (page 735)
fsync (page 751)
267
268
Chapter 5. Security
Note: You must specify the following any database roles on the admin databases. These roles apply to all
databases in a mongod (page 925) instance and are roughly equivalent to their single-database equivalents.
If you add any of these roles to a user privilege document (page 270) outside of the admin database, the privilege
will have no effect. However, only the specification of the roles must occur in the admin database, with delegated
authentication credentials (page 272), users can gain these privileges by authenticating to another database.
readAnyDatabase
readAnyDatabase (page 269) provides users with the same read-only permissions as read (page 265),
except it applies to all logical databases in the MongoDB environment.
readWriteAnyDatabase
readWriteAnyDatabase (page 269) provides users with the same read and write permissions as
readWrite (page 266), except it applies to all logical databases in the MongoDB environment.
userAdminAnyDatabase
userAdminAnyDatabase (page 269) provides users with the same access to user administration operations
as userAdmin (page 267), except it applies to all logical databases in the MongoDB environment.
Important: Because users with userAdminAnyDatabase (page 269) and userAdmin (page 267) have
the ability to create and modify permissions in addition to their own level of access, this role is effectively the
MongoDB system superuser. However, userAdminAnyDatabase (page 269) and userAdmin (page 267)
do not explicitly authorize a user for any privileges beyond user administration.
dbAdminAnyDatabase
dbAdminAnyDatabase (page 269) provides users with the same access to database administration operations
as dbAdmin (page 266), except it applies to all logical databases in the MongoDB environment.
Combined Access
Some operations are only available to users that have multiple roles. Consider the following:
sh.status() (page 910) Requires clusterAdmin (page 267) and read (page 265) access to the config
(page 565) database.
applyOps (page 732), eval (page 722) 33 Requires
readWriteAnyDatabase
(page
269),
userAdminAnyDatabase (page 269), dbAdminAnyDatabase (page 269) and clusterAdmin
(page 267) (on the admin database.)
Some operations related to cluster administration are not available to users who only have the clusterAdmin
(page 267) role:
rs.conf() (page 896) Requires read (page 265) on the local database.
sh.addShard() (page 903) Requires readWrite (page 266) on the config database.
269
The documents in the <database>.system.users (page 270) collection store credentials and user privilege
information used by the authentication system to provision access to users in the MongoDB system. See User Privilege
Roles in MongoDB (page 265) for more information about access roles, and Security (page 235) for an overview of
security in MongoDB.
Data Model
<database>.system.users
Changed in version 2.4.
Documents in the <database>.system.users (page 270) collection stores credentials and user roles
(page 265) for users who have access to the database. Consider the following prototypes of user privilege
documents:
{
user: "<username>",
pwd: "<hash>",
roles: []
}
{
user: "<username>",
userSource: "<database>",
roles: []
}
Note: The pwd (page 270) and userSource (page 271) fields are mutually exclusive. A single document
cannot contain both.
The following privilege document with the otherDBRoles (page 271) field is only supported on the admin
database:
{
user: "<username>",
userSource: "<database>",
otherDBRoles: {
<database0> : [],
<database1> : []
},
roles: []
}
Consider the content of the following fields in the system.users (page 270) documents:
<database>.system.users.user
user (page 270) is a string that identifies each user. Users exist in the context of a single logical database;
however, users from one database may obtain access in another database by way of the otherDBRoles
(page 271) field on the admin database, the userSource (page 271) field, or the Any Database Roles
(page 269).
270
Chapter 5. Security
<database>.system.users.pwd
pwd (page 270) holds a hashed shared secret used to authenticate the user (page 270). pwd (page 270)
field is mutually exclusive with the userSource (page 271) field.
<database>.system.users.roles
roles (page 271) holds an array of user roles. The available roles are:
read (page 265)
readWrite (page 266)
dbAdmin (page 266)
userAdmin (page 267)
clusterAdmin (page 267)
readAnyDatabase (page 269)
readWriteAnyDatabase (page 269)
userAdminAnyDatabase (page 269)
dbAdminAnyDatabase (page 269)
See Roles (page 265) for full documentation of all available user roles.
<database>.system.users.userSource
A string that holds the name of the database that contains the credentials for the user. If userSource
(page 271) is $external, then MongoDB will use an external resource, such as Kerberos, for authentication credentials.
Note: In the current release, the only external authentication source is Kerberos, which is only available
in MongoDB Enterprise.
Use userSource (page 271) to ensure that a single users authentication credentials are only stored in a
single location in a mongod (page 925) instances data.
A userSource (page 271) and user (page 270) pair identifies a unique user in a MongoDB system.
admin.system.users.otherDBRoles
A document that holds one or more fields with a name that is the name of a database in the MongoDB
instance with a value that holds a list of roles this user has on other databases. Consider the following
example:
{
user: "admin",
userSource: "$external",
roles: [ "clusterAdmin"],
otherDBRoles:
{
config: [ "read" ],
records: [ "dbadmin" ]
}
}
271
Then for every database that the application0 user requires access, add documents to the system.users
(page 270) collection that resemble the following:
{
user: "application0",
roles: ['readWrite'],
userSource: "accounts"
}
To gain privileges to databases where the application0 has access, you must first authenticate to the accounts
database.
Disable Legacy Privilege Documents
By default MongoDB 2.4 includes support for both new, role-based privilege documents style as well 2.2 and earlier
privilege documents. MongoDB assumes any privilege document without a roles (page 271) field is a 2.2 or earlier
document.
To ensure that mongod (page 925) instances will only provide access to users defined with the new role-based privilege
documents, use the following setParameter (page 999) run-time option:
mongod --setParameter supportCompatibilityFormPrivilegeDocuments=0
272
Description
The default port for mongod (page 925) and mongos (page 938) instances. You can change this port
with port (page 991) or --port.
The default port when running with --shardsvr runtime operation or shardsvr (page 1001)
setting.
The default port when running with --configsvr runtime operation or configsvr (page 1001)
setting.
The default port for the web status page. The web status page is always accessible at a port number
that is 1000 greater than the port determined by port (page 991).
Chapter 5. Security
If a user has the same password for multiple databases, the hash will be the same. A malicious user could exploit this
to gain access on a second database using a different users credentials.
As a result, always use unique username and password combinations for each database.
Thanks to Will Urbanski, from Dell SecureWorks, for identifying this issue.
34
273
274
Chapter 5. Security
CHAPTER 6
Aggregation
Aggregations operations process data records and return computed results. Aggregation operations group values from
multiple documents together, and can perform a variety of operations on the grouped data to return a single result.
MongoDB provides three ways to perform aggregation: the aggregation pipeline (page 279), the map-reduce function
(page 282), and single purpose aggregation methods and commands (page 283).
Aggregation Introduction (page 275) A high-level introduction to aggregation.
Aggregation Concepts (page 279) Introduces the use and operation of the data aggregation modalities available in
MongoDB.
Aggregation Pipeline (page 279) The aggregation pipeline is a framework for performing aggregation tasks,
modeled on the concept of data processing pipelines. Using this framework, MongoDB passes the documents of a single collection through a pipeline. The pipeline transforms the documents into aggregated
results, and is accessed through the aggregate (page 694) database command.
Map-Reduce (page 282) Map-reduce is a generic multi-phase data aggregation modality for processing quantities of data. MongoDB provides map-reduce with the mapReduce (page 701) database command.
Single Purpose Aggregation Operations (page 283) MongoDB provides a collection of specific data aggregation operations to support a number of common data aggregation functions. These operations include
returning counts of documents, distinct values of a field, and simple grouping operations.
Aggregation Mechanics (page 286) Details internal optimization operations, limits, support for sharded collections, and concurrency concerns.
Aggregation Examples (page 290) Examples and tutorials for data aggregation operations in MongoDB.
Aggregation Reference (page 306) References for all aggregation operations material for all data aggregation methods in MongoDB.
275
Figure 6.1: Diagram of the annotated aggregation pipeline operation. The aggregation pipeline has two phases:
$match (page 666) and $group (page 669).
Map-Reduce
MongoDB also provides map-reduce (page 282) operations to perform aggregation. In general, map-reduce operations
have two phases: a map stage that processes each document and emits one or more objects for each input document,
276
Chapter 6. Aggregation
and reduce phase that combines the output of the map operation. Optionally, map-reduce can have a finalize stage to
make final modifications to the result. Like other aggregation operations, map-reduce can specify a query condition to
select the input documents as well as sort and limit the results.
Map-reduce uses custom JavaScript functions to perform the map and reduce operations, as well as the optional finalize
operation. While the custom JavaScript provide great flexibility compared to the aggregation pipeline, in general, mapreduce is less efficient and more complex than the aggregation pipeline.
Additionally, map-reduce operations can have output sets that exceed the 16 megabyte output limitation of the aggregation pipeline.
Tip
Starting in MongoDB 2.4, certain mongo (page 942) shell functions and properties are inaccessible in map-reduce
operations. MongoDB 2.4 also provides support for multiple JavaScript operations to run at the same time. Before
MongoDB 2.4, JavaScript code executed in a single thread, raising concurrency issues for map-reduce.
277
documents from a single collection. While these operations provide simple access to common aggregation processes,
they lack the flexibility and capabilities of the aggregation pipeline and map-reduce.
278
Chapter 6. Aggregation
279
Figure 6.4: Diagram of the annotated aggregation pipeline operation. The aggregation pipeline has two phases:
$match (page 666) and $group (page 669).
280
Chapter 6. Aggregation
Pipeline Expressions
Each pipeline operator takes a pipeline expression as its operand. Pipeline expressions specify the transformation to
apply to the input documents. Expressions have a document structure and can contain fields, values, and operators
(page 673).
Pipeline expressions can only operate on the current document in the pipeline and cannot refer to data from other
documents: expression operations provide in-memory transformation of documents.
Generally, expressions are stateless and are only evaluated when seen by the aggregation process with one exception:
accumulator expressions. The accumulator expressions, used with the $group (page 669) pipeline operator, maintain
their state (e.g. totals, maximums, minimums, and related data) as documents progress through the pipeline.
For the expression operators, see Expression Operators (page 673).
Aggregation Pipeline Behavior
In MongoDB, the aggregate (page 694) command operates on a single collection, logically passing the entire
collection into the aggregation pipeline. To optimize the operation, wherever possible, use the following strategies to
avoid scanning the entire collection.
Pipeline Operators and Indexes
The $match (page 666), $sort (page 670), $limit (page 667), and $skip (page 668) pipeline operators can
take advantage of an index when they occur at the beginning of the pipeline before any of the following aggregation
operators: $project (page 664), $unwind (page 668), and $group (page 669).
New in version 2.4: The $geoNear (page 671) pipeline operator takes advantage of a geospatial index. When using
$geoNear (page 671), the $geoNear (page 671) pipeline operation must appear as the first stage in an aggregation
pipeline.
For unsharded collections, when the aggregation pipeline only needs to access the indexed fields to fulfill its operations,
an index can cover (page 45) the pipeline.
Example
Consider the following index on the orders collection:
{ status: 1, amount: 1, cust_id: 1 }
This index can cover the following aggregation pipeline operation because MongoDB does not need to inspect the data
outside of the index to fulfill the operation:
db.orders.aggregate([
{ $match: { status: "A" } },
{ $group: { _id: "$cust_id", total: { $sum: "$amount" } } },
{ $sort: { total: -1 } }
])
Early Filtering
If your aggregation operation requires only a subset of the data in a collection, use the $match (page 666), $limit
(page 667), and $skip (page 668) stages to restrict the documents that enter at the beginning of the pipeline. When
placed at the beginning of a pipeline, $match (page 666) operations use suitable indexes to scan only the matching
documents in a collection.
6.2. Aggregation Concepts
281
Placing a $match (page 666) pipeline stage followed by a $sort (page 670) stage at the start of the pipeline is
logically equivalent to a single query with a sort and can use an index. When possible, place $match (page 666)
operators at the beginning of the pipeline.
Additional Features
The aggregation pipeline has an internal optimization phase that provides improved performance for certain sequences
of operators. For details, see Pipeline Sequence Optimization (page 286).
The aggregation pipeline supports operations on sharded collections. See Aggregation Pipeline and Sharded Collections (page 288).
6.2.2 Map-Reduce
Map-reduce is a data processing paradigm for condensing large volumes of data into useful aggregated results. For
map-reduce operations, MongoDB provides the mapReduce (page 701) database command.
Consider the following map-reduce operation:
Chapter 6. Aggregation
the results in a collection. Optionally, the output of the reduce function may pass through a finalize function to further
condense or process the results of the aggregation.
All map-reduce functions in MongoDB are JavaScript and run within the mongod (page 925) process. Map-reduce
operations take the documents of a single collection as the input and can perform any arbitrary sorting and limiting
before beginning the map stage. mapReduce (page 701) can return the results of a map-reduce operation as a
document, or may write the results to collections. The input and the output collections may be sharded.
Note: For most aggregation operations, the Aggregation Pipeline (page 279) provides better performance and more
coherent interface. However, map-reduce operations provide some flexibility that is not presently available in the
aggregation pipeline.
283
{
{
{
{
a:
a:
a:
a:
1,
1,
1,
2,
b:
b:
b:
b:
0
1
4
2
}
}
}
}
The following operation would count all documents in the collection and return the number 4:
db.records.count()
The following operation will count only the documents where the value of the field a is 1 and return 3:
db.records.count( { a: 1 } )
Distinct
The distinct operation takes a number of documents that match a query and returns all of the unique values for a field in
the matching documents. The distinct (page 696) command and db.collection.distinct() (page 812)
method provide this operation in the mongo (page 942) shell. Consider the following examples of a distinct operation:
Example
Given a collection named records with only the following documents:
{
{
{
{
{
{
a:
a:
a:
a:
a:
a:
1,
1,
1,
1,
2,
2,
b:
b:
b:
b:
b:
b:
0
1
1
4
2
2
}
}
}
}
}
}
Consider the following db.collection.distinct() (page 812) operation which returns the distinct values of
the field b:
db.records.distinct( "b" )
Group
The group operation takes a number of documents that match a query, and then collects groups of documents based
on the value of a field or fields. It returns an array of documents with computed results for each group of documents.
Access the grouping functionality via the group (page 697) command or the db.collection.group()
(page 828) method in the mongo (page 942) shell.
Warning: group (page 697) does not support data in sharded collections. In addition, the results of the group
(page 697) operation must be no larger than 16 megabytes.
Consider the following group operation:
Example
284
Chapter 6. Aggregation
285
a:
a:
a:
a:
a:
a:
a:
1,
1,
1,
2,
2,
1,
4,
count:
count:
count:
count:
count:
count:
count:
4
2
4
3
1
5
4
}
}
}
}
}
}
}
Consider the following group (page 697) operation which groups documents by the field a, where a is less than 3,
and sums the field count for each group:
db.records.group( {
key: { a: 1 },
cond: { a: { $lt: 3 } },
reduce: function(cur, result) { result.count += cur.count },
initial: { count: 0 }
} )
See also:
The $group (page 669) for related functionality in the aggregation pipeline (page 279).
286
Chapter 6. Aggregation
$sort + $skip + $limit Sequence Optimization When you have a sequence with $sort (page 670) followed
by a $skip (page 668) followed by a $limit (page 667), an optimization occurs that moves the $limit (page 667)
operator before the $skip (page 668) operator. For example, if the pipeline consists of the following stages:
{ $sort: { age : -1 } },
{ $skip: 10 },
{ $limit: 5 }
During the optimization phase, the optimizer transforms the sequence to the following:
{ $sort: { age : -1 } },
{ $limit: 15 }
{ $skip: 10 }
Note: The $limit (page 667) value has increased to the sum of the initial value and the $skip (page 668) value.
The optimized sequence now has $sort (page 670) immediately preceding the $limit (page 667). See $sort
(page 670) for information on the behavior of the $sort (page 670) operation when it immediately precedes $limit
(page 667).
$limit + $skip + $limit + $skip Sequence Optimization When you have a continuous sequence of a
$limit (page 667) pipeline stage followed by a $skip (page 668) pipeline stage, the optimization phase attempts
to arrange the pipeline stages to combine the limits and skips. For example, if the pipeline consists of the following
stages:
{
{
{
{
$limit: 100 },
$skip: 5 },
$limit: 10},
$skip: 2 }
During the intermediate step, the optimizer reverses the position of the $skip (page 668) followed by a $limit
(page 667) to $limit (page 667) followed by the $skip (page 668).
{
{
{
{
$limit: 100 },
$limit: 15},
$skip: 5 },
$skip: 2 }
The $limit (page 667) value has increased to the sum of the initial value and the $skip (page 668) value. Then,
for the final $limit (page 667) value, the optimizer selects the minimum between the adjacent $limit (page 667)
values. For the final $skip (page 668) value, the optimizer adds the adjacent $skip (page 668) values, to transform
the sequence to the following:
{ $limit: 15 },
{ $skip: 7 }
Projection Optimization
If the aggregation pipeline contains a $project (page 664) stage that specifies the fields to include, then MongoDB
applies the projection to the head of the pipeline. This reduces the amount of data passing through the pipeline from
the start. In the following example, the $project (page 664) stage specifies that the results of this stage return only
the _id and the amount fields. The optimization phase applies the projection to the head of the pipeline such that
only the _id and the amount fields return in the resulting documents from the $match (page 666) stage as well.
287
db.orders.aggregate(
{ $match: { status: "A" } },
{ $project: { amount: 1 } }
)
The aggregation pipeline (page 279) cannot operate on values of the following types: Symbol, MinKey, MaxKey,
DBRef, Code, CodeWScope.
Changed in version 2.4: Removed restriction on Binary type data. In MongoDB 2.2, the pipeline could not operate
on Binary type data.
Result Size Restrictions
Output from the pipeline cannot exceed the BSON Document Size (page 1015) limit, which is currently 16
megabytes. If the result set exceeds this limit, the aggregate (page 694) command produces an error.
Memory Restrictions
If any single aggregation operation consumes more than 10 percent or more of system RAM, the operation will produce
an error.
Cumulative operators, such as $sort (page 670) and $group (page 669), require access to the entire input set before
they can produce any output. These operators log a warning if the cumulative operator consumes 5% or more of the
physical memory on the host. Like any aggregation operation, these operators produce an error if they consume 10%
or more of the physical memory on the host. See the $sort (page 670) and $group (page 669) reference pages for
details on their specific memory requirements and use.
Aggregation Pipeline and Sharded Collections
The aggregation pipeline supports operations on sharded collections. This section describes behaviors specific to the
aggregation pipeline (page 279) and sharded collections.
Note: Changed in version 2.2: Some aggregation pipeline (page 694) operations will cause mongos
(page 938) instances to require more CPU resources than in previous versions. This modified performance profile
may dictate alternate architectural decisions if you use the aggregation pipeline (page 279) extensively in a sharded
environment.
When operating on a sharded collection, the aggregation pipeline is split into two parts. First, the aggregation pipeline
pushes all of the operators up to the first $group (page 669) or $sort (page 670) operation to each shard 1 . Then,
a second pipeline runs on the mongos (page 938). This pipeline consists of the first $group (page 669) or $sort
(page 670) and any remaining pipeline operators, and runs on the results received from the shards.
1 If an early $match (page 666) can exclude shards through the use of the shard key in the predicate, then these operators are only pushed to
the relevant shards.
288
Chapter 6. Aggregation
The $group (page 669) operator brings in any sub-totals from the shards and combines them: in some cases these
may be structures. For example, the $avg (page 676) expression maintains a total and count for each shard; mongos
(page 938) combines these values and then divides.
Map-Reduce and Sharded Collections
Map-reduce supports operations on sharded collections, both as an input and as an output. This section describes the
behaviors of mapReduce (page 701) specific to sharded collections.
Sharded Collection as Input
When using sharded collection as the input for a map-reduce operation, mongos (page 938) will automatically dispatch the map-reduce job to each shard in parallel. There is no special option required. mongos (page 938) will wait
for jobs on all shards to finish.
Sharded Collection as Output
289
290
Chapter 6. Aggregation
{
"_id": "10280",
"city": "NEW YORK",
"state": "NY",
"pop": 5574,
"loc": [
-74.016323,
40.710537
]
}
Aggregations operations using the aggregate() (page 808) helper process all documents in the zipcodes collection. aggregate() (page 808) connects a number of pipeline (page 279) operators, which define the aggregation
process.
In this example, the pipeline passes all documents in the zipcodes collection through the following steps:
the $group (page 669) operator collects all documents and creates documents for each state.
These new per-state documents have one field in addition the _id field: totalPop which is a generated field
using the $sum (page 678) operation to calculate the total value of all pop fields in the source documents.
After the $group (page 669) operation the documents in the pipeline resemble the following:
{
"_id" : "AK",
"totalPop" : 550043
}
the $match (page 666) operation filters these documents so that the only documents that remain are those
where the value of totalPop is greater than or equal to 10 million.
The $match (page 666) operation does not alter the documents, which have the same format as the documents
output by $group (page 669).
The equivalent SQL for this operation is:
291
Aggregations operations using the aggregate() (page 808) helper process all documents in the zipcodes collection. aggregate() (page 808) connects a number of pipeline (page 279) operators that define the aggregation
process.
In this example, the pipeline passes all documents in the zipcodes collection through the following steps:
the $group (page 669) operator collects all documents and creates new documents for every combination of
the city and state fields in the source document.
After this stage in the pipeline, the documents resemble the following:
{
"_id" : {
"state" : "CO",
"city" : "EDGEWATER"
},
"pop" : 13154
}
the second $group (page 669) operator collects documents by the state field and use the $avg (page 676)
expression to compute a value for the avgCityPop field.
The final output of this aggregation operation is:
{
"_id" : "MN",
"avgCityPop" : 5335
},
292
Chapter 6. Aggregation
Aggregation operations using the aggregate() (page 808) helper process all documents in the zipcodes collection. aggregate() (page 808) combines a number of pipeline (page 279) operators that define the aggregation
process.
All documents from the zipcodes collection pass into the pipeline, which consists of the following steps:
the $group (page 669) operator collects all documents and creates new documents for every combination of
the city and state fields in the source documents.
By specifying the value of _id as a sub-document that contains both fields, the operation preserves the state
field for use later in the pipeline. The documents produced by this stage of the pipeline have a second field,
pop, which uses the $sum (page 678) operator to provide the total of the pop fields in the source document.
At this stage in the pipeline, the documents resemble the following:
{
"_id" : {
"state" : "CO",
"city" : "EDGEWATER"
},
"pop" : 13154
}
$sort (page 670) operator orders the documents in the pipeline based on the vale of the pop field from largest
to smallest. This operation does not alter the documents.
the second $group (page 669) operator collects the documents in the pipeline by the state field, which is a
field inside the nested _id document.
Within each per-state document this $group (page 669) operator specifies four fields: Using the $last
(page 675) expression, the $group (page 669) operator creates the biggestcity and biggestpop fields
that store the city with the largest population and that population. Using the $first (page 675) expression, the
$group (page 669) operator creates the smallestcity and smallestpop fields that store the city with
the smallest population and that population.
The documents, at this stage in the pipeline resemble the following:
{
"_id" : "WA",
"biggestCity" : "SEATTLE",
"biggestPop" : 520096,
"smallestCity" : "BENGE",
"smallestPop" : 2
}
The final operation is $project (page 664), which renames the _id field to state and moves
the biggestCity, biggestPop, smallestCity, and smallestPop into biggestCity and
smallestCity sub-documents.
293
All documents from the users collection pass through the pipeline, which consists of the following operations:
The $project (page 664) operator:
creates a new field called name.
converts the value of the _id to upper case, with the $toUpper (page 685) operator. Then the
$project (page 664) creates a new field, named name to hold this value.
suppresses the id field. $project (page 664) will pass the _id field by default, unless explicitly
suppressed.
The $sort (page 670) operator orders the results by the name field.
294
Chapter 6. Aggregation
The pipeline passes all documents in the users collection through the following operations:
The $project (page 664) operator:
Creates two new fields: month_joined and name.
Suppresses the id from the results. The aggregate() (page 808) method includes the _id, unless
explicitly suppressed.
The $month (page 686) operator converts the values of the joined field to integer representations of the
month. Then the $project (page 664) operator assigns those values to the month_joined field.
The $sort (page 670) operator sorts the results by the month_joined field.
The operation returns results that resemble the following:
{
"month_joined" : 1,
"name" : "ruth"
},
{
"month_joined" : 1,
"name" : "harold"
},
{
"month_joined" : 1,
"name" : "kate"
}
{
"month_joined" : 2,
295
"name" : "jill"
}
The pipeline passes all documents in the users collection through the following operations:
The $project (page 664) operator creates a new field called month_joined.
The $month (page 686) operator converts the values of the joined field to integer representations of the
month. Then the $project (page 664) operator assigns the values to the month_joined field.
The $group (page 669) operator collects all documents with a given month_joined value and counts how
many documents there are for that value. Specifically, for each unique value, $group (page 669) creates a new
per-month document with two fields:
_id, which contains a nested document with the month_joined field and its value.
number, which is a generated field. The $sum (page 678) operator increments this field by 1 for every
document containing the given month_joined value.
The $sort (page 670) operator sorts the documents created by $group (page 669) according to the contents
of the month_joined field.
The result of this aggregation operation would resemble the following:
{
"_id" : {
"month_joined" : 1
},
"number" : 3
},
{
"_id" : {
"month_joined" : 2
},
"number" : 9
},
{
"_id" : {
"month_joined" : 3
},
"number" : 5
}
296
Chapter 6. Aggregation
The pipeline begins with all documents in the users collection, and passes these documents through the following
operations:
The $unwind (page 668) operator separates each value in the likes array, and creates a new version of the
source document for every element in the array.
Example
Given the following document from the users collection:
{
_id : "jane",
joined : ISODate("2011-03-02"),
likes : ["golf", "racquetball"]
}
The $unwind (page 668) operator would create the following documents:
{
_id : "jane",
joined : ISODate("2011-03-02"),
likes : "golf"
}
{
_id : "jane",
joined : ISODate("2011-03-02"),
likes : "racquetball"
}
The $group (page 669) operator collects all documents the same value for the likes field and counts each
grouping. With this information, $group (page 669) creates a new document with two fields:
_id, which contains the likes value.
number, which is a generated field. The $sum (page 678) operator increments this field by 1 for every
document containing the given likes value.
The $sort (page 670) operator sorts these documents by the number field in reverse order.
The $limit (page 667) operator only includes the first 5 result documents.
The results of aggregation would resemble the following:
{
"_id" : "golf",
"number" : 33
},
297
{
"_id" : "racquetball",
"number" : 31
},
{
"_id" : "swimming",
"number" : 24
},
{
"_id" : "handball",
"number" : 19
},
{
"_id" : "tennis",
"number" : 18
}
2. Define the corresponding reduce function with two arguments keyCustId and valuesPrices:
The valuesPrices is an array whose elements are the price values emitted by the map function and
grouped by keyCustId.
298
Chapter 6. Aggregation
The function reduces the valuesPrice array to the sum of its elements.
var reduceFunction1 = function(keyCustId, valuesPrices) {
return Array.sum(valuesPrices);
};
3. Perform the map-reduce on all documents in the orders collection using the mapFunction1 map function
and the reduceFunction1 reduce function.
db.orders.mapReduce(
mapFunction1,
reduceFunction1,
{ out: "map_reduce_example" }
)
2. Define the corresponding reduce function with two arguments keySKU and countObjVals:
countObjVals is an array whose elements are the objects mapped to the grouped keySKU values
passed by map function to the reducer function.
The function reduces the countObjVals array to a single object reducedValue that contains the
count and the qty fields.
In reducedVal, the count field contains the sum of the count fields from the individual array elements, and the qty field contains the sum of the qty fields from the individual array elements.
var reduceFunction2 = function(keySKU, countObjVals) {
reducedVal = { count: 0, qty: 0 };
for (var idx = 0; idx < countObjVals.length; idx++) {
reducedVal.count += countObjVals[idx].count;
299
reducedVal.qty += countObjVals[idx].qty;
}
return reducedVal;
};
3. Define a finalize function with two arguments key and reducedVal. The function modifies the
reducedVal object to add a computed field named avg and returns the modified object:
var finalizeFunction2 = function (key, reducedVal) {
reducedVal.avg = reducedVal.qty/reducedVal.count;
return reducedVal;
};
using
the
mapFunction2,
db.orders.mapReduce( mapFunction2,
reduceFunction2,
{
out: { merge: "map_reduce_example" },
query: { ord_date:
{ $gt: new Date('01/01/2012') }
},
finalize: finalizeFunction2
}
)
This operation uses the query field to select only those documents with ord_date greater than new
Date(01/01/2012). Then it output the results to a collection map_reduce_example. If the
map_reduce_example collection already exists, the operation will merge the existing contents with the
results of this map-reduce operation.
300
Chapter 6. Aggregation
Data Setup
The sessions collection contains documents that log users sessions each day, for example:
db.sessions.save(
db.sessions.save(
db.sessions.save(
db.sessions.save(
{
{
{
{
userid:
userid:
userid:
userid:
"a",
"b",
"c",
"d",
ts:
ts:
ts:
ts:
ISODate('2011-11-03
ISODate('2011-11-03
ISODate('2011-11-03
ISODate('2011-11-03
14:17:00'),
14:23:00'),
15:02:00'),
16:45:00'),
length:
length:
length:
length:
95 } );
110 } );
120 } );
45 } );
db.sessions.save(
db.sessions.save(
db.sessions.save(
db.sessions.save(
{
{
{
{
userid:
userid:
userid:
userid:
"a",
"b",
"c",
"d",
ts:
ts:
ts:
ts:
ISODate('2011-11-04
ISODate('2011-11-04
ISODate('2011-11-04
ISODate('2011-11-04
11:05:00'),
13:14:00'),
17:00:00'),
15:37:00'),
length:
length:
length:
length:
105 } );
120 } );
130 } );
65 } );
2. Define the corresponding reduce function with two arguments key and values to calculate the total time and
the count. The key corresponds to the userid, and the values is an array whose elements corresponds to
the individual objects mapped to the userid in the mapFunction.
var reduceFunction = function(key, values) {
var reducedObject = {
userid: key,
total_time: 0,
count:0,
avg_time:0
};
values.forEach( function(value) {
reducedObject.total_time += value.total_time;
reducedObject.count += value.count;
}
);
return reducedObject;
};
3. Define the finalize function with two arguments key and reducedValue. The function modifies the
reducedValue document to add another field average and returns the modified document.
301
if (reducedValue.count > 0)
reducedValue.avg_time = reducedValue.total_time / reducedValue.cou
return reducedValue;
};
4. Perform map-reduce on the session collection using the mapFunction, the reduceFunction, and the
finalizeFunction functions. Output the results to a collection session_stat. If the session_stat
collection already exists, the operation will replace the contents:
db.sessions.mapReduce( mapFunction,
reduceFunction,
{
out: { reduce: "session_stat" },
finalize: finalizeFunction
}
)
{
{
{
{
userid:
userid:
userid:
userid:
"a",
"b",
"c",
"d",
ts:
ts:
ts:
ts:
ISODate('2011-11-05
ISODate('2011-11-05
ISODate('2011-11-05
ISODate('2011-11-05
14:17:00'),
14:23:00'),
15:02:00'),
16:45:00'),
length:
length:
length:
length:
100 } );
115 } );
125 } );
55 } );
At the end of the day, perform incremental map-reduce on the sessions collection, but use the query field to select
only the new documents. Output the results to the collection session_stat, but reduce the contents with the
results of the incremental map-reduce:
db.sessions.mapReduce( mapFunction,
reduceFunction,
{
query: { ts: { $gt: ISODate('2011-11-05 00:00:00') } },
out: { reduce: "session_stat" },
finalize: finalizeFunction
}
);
302
Chapter 6. Aggregation
status: 'A',
price: 250,
items: [ { sku: "mmm", qty: 5, price: 2.5 },
{ sku: "nnn", qty: 5, price: 2.5 } ]
}
1. Define the map function that maps the price to the cust_id for each document and emits the cust_id and
price pair:
var map = function() {
emit(this.cust_id, this.price);
};
3. Invoke the map function with a single document from the orders collection:
var myDoc = db.orders.findOne( { _id: ObjectId("50a8240b927d5d8b5891743c") } );
map.apply(myDoc);
5. Invoke the map function with multiple documents from the orders collection:
var myCursor = db.orders.find( { cust_id: "abc123" } );
while (myCursor.hasNext()) {
var doc = myCursor.next();
print ("document _id= " + tojson(doc._id));
map.apply(doc);
print();
}
303
5. Define a reduceFunction2 function that takes the arguments keySKU and valuesCountObjects.
valuesCountObjects is an array of documents that contain two fields count and qty:
var reduceFunction2 = function(keySKU, valuesCountObjects) {
reducedValue = { count: 0, qty: 0 };
for (var idx = 0; idx < valuesCountObjects.length; idx++) {
reducedValue.count += valuesCountObjects[idx].count;
reducedValue.qty += valuesCountObjects[idx].qty;
}
return reducedValue;
};
8. Verify the reduceFunction2 returned a document with exactly the count and the qty field:
{ "count" : 6, "qty" : 30 }
304
Chapter 6. Aggregation
2. Define a reduceFunction2 function that takes the arguments keySKU and valuesCountObjects.
valuesCountObjects is an array of documents that contain two fields count and qty:
var reduceFunction2 = function(keySKU, valuesCountObjects) {
reducedValue = { count: 0, qty: 0 };
for (var idx = 0; idx < valuesCountObjects.length; idx++) {
reducedValue.count += valuesCountObjects[idx].count;
reducedValue.qty += valuesCountObjects[idx].qty;
}
return reducedValue;
};
3. Invoke the reduceFunction2 first with values1 and then with values2:
reduceFunction2('myKey', values1);
reduceFunction2('myKey', values2);
305
return reducedValue;
};
3. Define a sample valuesIdempotent array that contains an element that is a call to the reduceFunction2
function:
var valuesIdempotent = [
{ count: 1, qty: 5 },
{ count: 2, qty: 10 },
reduceFunction2(myKey, [ { count:3, qty: 15 } ] )
];
4. Define a sample values1 array that combines the values passed to reduceFunction2:
var values1 = [
{ count: 1, qty: 5 },
{ count: 2, qty: 10 },
{ count: 3, qty: 15 }
];
5. Invoke the reduceFunction2 first with myKey and valuesIdempotent and then with myKey and
values1:
reduceFunction2(myKey, valuesIdempotent);
reduceFunction2(myKey, values1);
306
Chapter 6. Aggregation
Flexibility
In addition to grouping
operations, can perform complex
aggregation tasks as well as
perform incremental aggregation
on continuously growing
datasets.
See Map-Reduce Examples
(page 298) and Perform
Incremental Map-Reduce
(page 300).
Custom map, reduce and
finalize JavaScript functions
offer flexibility to aggregation
logic.
See mapReduce (page 701) for
details and restrictions on the
functions.
Description
Reshapes a document stream. $project (page 664) can rename, add, or remove fields as well
as create computed values and sub-documents.
Filters the document stream, and only allows matching documents to pass into the next pipeline
stage. $match (page 666) uses standard MongoDB queries.
Restricts the number of documents in an aggregation pipeline.
Skips over a specified number of documents from the pipeline and returns the rest.
Takes an array of documents and returns them as a stream of documents.
Groups documents together for the purpose of calculating aggregate values based on a collection
of documents.
Takes all input documents and returns them in a stream of sorted documents.
Returns an ordered stream of documents based on proximity to a geospatial point.
Expression Operators
Expression operators calculate values within the Pipeline Operators (page 664).
$group Operators
Boolean Operators
These operators perform comparisons between two values and return a Boolean, in most cases reflecting the result of
the comparison.
All comparison operators take an array with a pair of values. You may compare numbers, strings, and dates. Except
for $cmp (page 680), all comparison operators return a Boolean value. $cmp (page 680) returns an integer.
308
Chapter 6. Aggregation
Arithmetic Operators
Date operators take a Date typed value as a single argument and return a number.
Conditional Expressions
Examples
The following table presents a quick reference of SQL aggregation statements and the corresponding MongoDB statements. The examples in the table assume the following conditions:
The SQL examples assume two tables, orders and order_lineitem that join by the
order_lineitem.order_id and the orders.id columns.
The MongoDB examples assume one collection orders that contain documents of the following prototype:
{
cust_id: "abc123",
ord_date: ISODate("2012-11-02T17:04:11.102Z"),
status: 'A',
price: 50,
309
The MongoDB statements prefix the names of the fields from the documents in the collection orders with a $
character when they appear as operands to the aggregation operations.
310
Chapter 6. Aggregation
SQL Example
MongoDB Example
Description
Count all records from orders
db.orders.aggregate( [
{ $group: { _id: null,
count: { $sum: 1 } } }
] )
For
each
unique
cust_id,
SELECT cust_id,
db.orders.aggregate( [
ord_date grouping, sum the
ord_date,
{ $group: { _id: { cust_id: "$cust_id",
price field and return only where
SUM(price) AS total
ord_date: "$ord_date" },
the sum is greater than 250.
FROM orders
total: { $sum: "$price" } } },
GROUP BY cust_id, ord_date
{ $match: { total: { $gt: 250 } } }
HAVING total > 250
] )
For each unique cust_id with staSELECT cust_id,
db.orders.aggregate( [
tus A, sum the price field.
SUM(price) as total
{ $match: { status: 'A' } },
FROM orders
{ $group: { _id: "$cust_id",
WHERE status = 'A'
total: { $sum: "$price" } } }
GROUP BY cust_id
] )
For each unique cust_id with staSELECT cust_id,
db.orders.aggregate( [
tus A, sum the price field and return
SUM(price) as total
{ $match: { status: 'A' } },
only where the sum is greater than
FROM orders
{ $group: { _id: "$cust_id",
250.
WHERE status = 'A'
total: { $sum: "$price" } } },
6.4.
Aggregation
Reference
311
GROUP
BY cust_id
{ $match: { total: { $gt: 250 } } }
HAVING total > 250
] )
For each unique cust_id, sum the
Description
Performs aggregation tasks (page 279) such as group using the aggregation framework.
Counts the number of documents in a collection.
Displays the distinct values found for a specified key in a collection.
Groups documents in a collection by the specified key and performs simple aggregation.
Performs map-reduce (page 282) aggregation for large data sets.
Aggregation Methods
Name
db.collection.aggregate()
(page 808)
db.collection.group()
(page 828)
db.collection.mapReduce()
(page 837)
312
Description
Provides access to the aggregation pipeline (page 279).
Groups documents in a collection by the specified key and performs
simple aggregation.
Performs map-reduce (page 282) aggregation for large data sets.
Chapter 6. Aggregation
CHAPTER 7
Indexes
Indexes provide high performance read operations for frequently used queries.
This section introduces indexes in MongoDB, describes the types and configuration options for indexes, and describes
special types of indexing MongoDB supports. The section also provides tutorials detailing procedures and operational
concerns, and providing information on how applications may use indexes.
Index Introduction (page 313) An introduction to indexes in MongoDB.
Index Concepts (page 318) The core documentation of indexes in MongoDB, including geospatial and text indexes.
Index Types (page 319) MongoDB provides different types of indexes for different purposes and different types
of content.
Index Properties (page 333) The properties you can specify when building indexes.
Index Creation (page 335) The options available when creating indexes.
Indexing Tutorials (page 338) Examples of operations involving indexes, including index creation and querying indexes.
Indexing Reference (page 374) Reference material for indexes in MongoDB.
313
Figure 7.1: Diagram of a query selecting documents using an index. MongoDB narrows the query by scanning the
range of documents with values of score less than 30.
Create indexes to support common and user-facing queries. Having these indexes will ensure that MongoDB only
scans the smallest possible number of documents.
Indexes can also optimize the performance of other operations in specific situations:
Sorted Results
MongoDB can use indexes to return documents sorted by the index key directly from the index without
requiring an additional sort phase.
Covered Results
When the query criteria and the projection of a query include only the indexed fields, MongoDB will return
results directly from the index without scanning any documents or bringing documents into memory.
These covered queries can be very efficient. Indexes can also cover aggregation pipeline operations
(page 279).
314
Chapter 7. Indexes
Figure 7.2: Diagram of a query that uses an index to select and return sorted results. The index stores score values
in ascending order. MongoDB can traverse the index in either ascending or descending order to return sorted results.
Figure 7.3: Diagram of a query that uses only the index to match the query criteria and return the results. MongoDB
does not need to inspect data outside of the index to fulfill the query.
315
Single Field
In addition to the MongoDB-defined _id index, MongoDB supports user-defined indexes on a single field of a document (page 320). Consider the following illustration of a single-field index:
Compound Index
MongoDB also supports user-defined indexes on multiple fields. These compound indexes (page 322) behave like
single-field indexes; however, the query can select documents based on additional fields. The order of fields listed
in a compound index has significance. For instance, if a compound index consists of { userid: 1, score:
-1 }, the index sorts first by userid and then, within each userid value, sort by score. Consider the following
illustration of this compound index:
Figure 7.5: Diagram of a compound index on the userid field (ascending) and the score field (descending). The
index sorts first by the userid field and then by the score field.
316
Chapter 7. Indexes
Multikey Index
MongoDB uses multikey indexes (page 324) to index the content stored in arrays. If you index a field that holds an
array value, MongoDB creates separate index entries for every element of the array. These multikey indexes (page 324)
allow queries to select documents that contain arrays by matching on element or elements of the arrays. MongoDB
automatically determines whether to create a multikey index if the indexed field contains an array value; you do not
need to explicitly specify the multikey type.
Consider the following illustration of a multikey index:
Figure 7.6: Diagram of a multikey index on the addr.zip field. The addr field contains an array of address
documents. The address documents contain the zip field.
Geospatial Index
To support efficient queries of geospatial coordinate data, MongoDB provides two special indexes: 2d indexes
(page 330) that uses planar geometry when returning results and 2sphere indexes (page 328) that use spherical geometry to return results.
See 2d Index Internals (page 330) for a high level introduction to geospatial indexes.
Text Indexes
MongoDB provides a beta text index type that supports searching for string content in a collection. These text
indexes do not store language-specific stop words (e.g. the, a, or) and stem the words in a collection to only
store root words.
See Text Indexes (page 332) for more information on text indexes and search.
317
Hashed Indexes
To support hash based sharding (page 506), MongoDB provides a hashed index (page 333) type, which indexes the
hash of the value of a field. These indexes have a more random distribution of values along their range, but only
support equality matches and cannot support range-based queries.
318
Chapter 7. Indexes
Sparse Indexes (page 335) A sparse index does not index documents that do not have the indexed field.
Index Creation (page 335) The options available when creating indexes.
MongoDB indexes may be ascending, (i.e. 1) or descending (i.e. -1) in their ordering. Nevertheless, MongoDB may
also traverse the index in either directions. As a result, for single-field indexes, ascending and descending indexes are
interchangeable. This is not the case for compound indexes: in compound indexes, the direction of the sort order can
have a greater impact on the results.
See Sort Order (page 323) for more information on the impact of index order on results in compound indexes.
Redundant Indexes
A single query can only use one index, except for queries that use the $or (page 625) operator that can use a different
index for each clause.
See also:
Index Limitations (page 1016).
Index Type Documentation
Single Field Indexes (page 320) A single field index only includes data from a single field of the documents in a
collection. MongoDB supports single field indexes on fields at the top level of a document and on fields in
sub-documents.
Compound Indexes (page 322) A compound index includes more than one field of the documents in a collection.
Multikey Indexes (page 324) A multikey index references an array and records a match if a query includes any value
in the array.
319
Geospatial Indexes and Queries (page 326) Geospatial indexes support location-based searches on data that is stored
as either GeoJSON objects or legacy coordinate pairs.
Text Indexes (page 332) Text indexes supports search of string content in documents.
Hashed Index (page 333) Hashed indexes maintain entries with hashes of the values of the indexed field.
Single Field Indexes
MongoDB provides complete support for indexes on any field in a collection of documents. By default, all collections
have an index on the _id field (page 321), and applications and users may add additional indexes to support important
queries and operations.
MongoDB supports indexes that contain either a single field or multiple fields depending on the operations that index
supports. This document describes indexes that contain a single field. Consider the following illustration of a single
field index.
Cases
320
Chapter 7. Indexes
_id Field Index For all collections, MongoDB creates the default _id index, which is a unique index (page 334)
on the _id field. MongoDB creates this index by default on all collections. You cannot delete the index on _id.
You can think of the _id field as the primary key for the collection. Every document must have a unique _id field.
You may store any unique value in the _id field. The default value of _id is an ObjectID on every insert()
(page 832) operation. An ObjectId is a 12-byte unique identifiers suitable for use as the value of an _id field.
Note: In sharded clusters, if you do not use the _id field as the shard key, then your application must ensure the
uniqueness of the values in the _id field to prevent errors. This is most-often done by using a standard auto-generated
ObjectId.
Before version 2.2, capped collections did not have an _id field. In version 2.2 and newer, capped collection do
have an _id field, except those in the local database. See Capped Collections Reccomendations and Restrictions
(page 156) for more information.
Indexes on Embedded Fields You can create indexes on fields embedded in sub-documents, just as you can index
top-level fields in documents. Indexes on embedded fields differ from indexes on sub-documents (page 321), which
include the full content up to the maximum Index Size (page ??) of the sub-document in the index. Instead,
indexes on embedded fields allow you to use a dot notation, to introspect into sub-documents.
Consider a collection named people that holds documents that resemble the following example document:
{"_id": ObjectId(...)
"name": "John Doe"
"address": {
"street": "Main"
"zipcode": 53511
"state": "WI"
}
}
You can create an index on the address.zipcode field, using the following specification:
db.people.ensureIndex( { "address.zipcode": 1 } )
The metro field is a subdocument, containing the embedded fields city and state. The following creates an index
on the metro field as a whole:
db.factories.ensureIndex( { metro: 1 } )
The following query can use the index on the metro field:
db.factories.find( { metro: { city: "New York", state: "NY" } } )
321
This query returns the above document. When performing equality matches on subdocuments, field order matters and
the subdocuments must match exactly. For example, the following query does not match the above document:
db.factories.find( { metro: { state: "NY", city: "New York" } } )
See Query Subdocuments (page 818) for more information regarding querying on subdocuments.
The index on the metro field can also support the following query:
db.factories.find( { metro: { $gte : { city: "New York" } } } )
This query returns the above document because { city: "New York", state: "NY" } is greater than {
city: "New York" }. The order of comparison is in ascending key order in the order that the keys occur in the
BSON document.
Compound Indexes
MongoDB supports compound indexes, where a single index structure holds references to multiple fields
collections documents. The following diagram illustrates an example of a compound index on two fields:
within a
Figure 7.8: Diagram of a compound index on the userid field (ascending) and the score field (descending). The
index sorts first by the userid field and then by the score field.
Compound indexes can support queries that match on multiple fields.
Example
Consider a collection named products that holds documents that resemble the following document:
{
"_id": ObjectId(...)
"item": "Banana"
"category": ["food", "produce", "grocery"]
"location": "4th Street Store"
"stock": 4
"type": cases
"arrival": Date(...)
}
2
MongoDB imposes a limit of 31 fields for any compound index (page 1016).
322
Chapter 7. Indexes
If applications query on the item field as well as query on both the item field and the stock field, you can specify
a single compound index to support both of these queries:
db.products.ensureIndex( { "item": 1, "stock": 1 } )
Important: You may not create compound indexes that have hashed index fields. You will receive an error if you
attempt to create a compound index that includes a hashed index (page 333).
The order of the fields in a compound index is very important. In the previous example, the index will contain
references to documents sorted first by the values of the item field and, within each value of the item field, sorted
by values of the stock field. See Sort Order (page 323) for more information.
In addition to supporting queries that match on all the index fields, compound indexes can support queries that match
on the prefix of the index fields. For details, see Prefixes (page 323).
Sort Order Indexes store references to fields in either ascending (1) or descending (-1) sort order. For single-field
indexes, the sort order of keys doesnt matter because MongoDB can traverse the index in either direction. However,
for compound indexes (page 322), sort order can matter in determining whether the index can support a sort operation.
Consider a collection events that contains documents with the fields username and date. Applications can issue
queries that return results sorted first by ascending username values and then by descending (i.e. more recent to last)
date values, such as:
db.events.find().sort( { username: 1, date: -1 } )
or queries that return results sorted first by descending username values and then by ascending date values, such
as:
db.events.find().sort( { username: -1, date: 1 } )
However, the above index cannot support sorting by ascending username values and then by ascending date values,
such as the following:
db.events.find().sort( { username: 1, date: 1 } )
Prefixes Compound indexes support queries on any prefix of the index fields. Index prefixes are the beginning
subset of indexed fields. For example, given the index { a: 1, b: 1, c: 1 }, both { a: 1 } and {
a: 1, b: 1 } are prefixes of the index.
If you have a collection that has a compound index on { a: 1, b: 1 }, as well as an index that consists of the
prefix of that index, i.e. { a: 1 }, assuming none of the index has a sparse or unique constraints, then you can
drop the { a: 1 } index. MongoDB will be able to use the compound index in all of situations that it would have
used the { a: 1 } index.
Example
Given the following index:
{ "item": 1, "location": 1, "stock": 1 }
323
Multikey Indexes
To index a field that holds an array value, MongoDB adds index items for each item in the array. These multikey indexes
allow MongoDB return documents from queries using the value of an array. MongoDB automatically determines
whether to create a multikey index if the indexed field contains an array value; you do not need to explicitly specify
the multikey type.
Consider the following illustration of a multikey index:
Figure 7.9: Diagram of a multikey index on the addr.zip field. The addr field contains an array of address
documents. The address documents contain the zip field.
Multikey indexes support all operations of a other MongoDB indexes; however, applications may use multikey indexes
to select documents based on ranges of values for the value of an array. Multikey indexes support arrays that hold both
values (e.g. strings, numbers) and nested documents.
Limitations
324
Chapter 7. Indexes
Interactions between Compound and Multikey Indexes While you can create multikey compound indexes
(page 322), at most one field in a compound index may hold an array. For example, given an index on { a: 1,
b: 1 }, the following documents are permissible:
{a: [1, 2], b: 1}
{a: 1, b: [1, 2]}
However, the following document is impermissible, and MongoDB cannot insert such a document into a collection
with the {a: 1, b: 1 } index:
{a: [1, 2], b: [1, 2]}
If you attempt to insert a such a document, MongoDB will reject the insertion, and produce an error that says cannot
index parallel arrays. MongoDB does not index parallel arrays because they require the index to include
each value in the Cartesian product of the compound keys, which could quickly result in incredibly large and difficult
to maintain indexes.
Shard Keys
Important: The index of a shard key cannot be a multi-key index.
Hashed Indexes hashed indexes are not compatible with multi-key indexes.
To compute the hash for a hashed index, MongoDB collapses sub-documents and computes the hash for the entire
value. For fields that hold arrays or sub-documents, you cannot use the index to support queries that introspect the
sub-document.
Examples
Index Basic Arrays Given the following document:
{
"_id" : ObjectId("..."),
"name" : "Warm Weather",
"author" : "Steve",
"tags" : [ "weather", "hot", "record", "april" ]
}
"weather",
"hot",
"record", and
"april".
Queries could use the multikey index to return queries for any of the above values.
Index Arrays with Nested Documents You can use multikey indexes to index fields within objects embedded in
arrays, as in the following example:
Consider a feedback collection with documents in the following form:
325
{
"_id": ObjectId(...)
"title": "Grocery Quality"
"comments": [
{ author_id: ObjectId(...)
date: Date(...)
text: "Please expand the cheddar selection." },
{ author_id: ObjectId(...)
date: Date(...)
text: "Please expand the mustard selection." },
{ author_id: ObjectId(...)
date: Date(...)
text: "Please expand the olive selection." }
]
}
An index on the comments.text field would be a multikey index and would add items to the index for all of the
sub-documents in the array.
With an index, such as { comments.text:
This would select the document, that contains the following document in the comments.text array:
{ author_id: ObjectId(...)
date: Date(...)
text: "Please expand the olive selection." }
MongoDB offers a number of indexes and query mechanisms to handle geospatial information. This section introduces
MongoDBs geospatial features.
Surfaces Before storing your location data and writing queries, you must decide the type of surface to use to perform
calculations. The type you choose affects how you store data, what type of index to build, and the syntax of your
queries.
MongoDB offers two surface types:
Spherical
To calculate geometry over an Earth-like sphere, store your location data on a spherical surface and use 2dsphere
(page 328) index.
Store your location data as GeoJSON objects with this coordinate-axis order: longitude, latitude. The coordinate reference system for GeoJSON uses the WGS84 datum.
Flat
To calculate distances on a Euclidean plane, store your location data as legacy coordinate pairs and use a 2d
(page 330) index.
Location Data If you choose spherical surface calculations, you store location data as
GeoJSON objects (preferred).
326
Chapter 7. Indexes
Queries on GeoJSON objects always calculate on a sphere. The default coordinate reference system for GeoJSON uses the WGS84 datum.
New in version 2.4: The storage and querying of GeoJSON objects is new in version 2.4. Prior to version 2.4,
all geospatial data was stored as coordinate pairs.
MongoDB supports the following GeoJSON objects:
Point
LineString
Polygon
Legacy coordinate pairs
MongoDB supports spherical surface calculations on legacy coordinate pairs by converting the data to the GeoJSON Point type.
If you choose flat surface calculations, you can store data only as legacy coordinate pairs.
Query Operations MongoDBs geospatial query operators let you query for:
Inclusion. MongoDB can query for locations contained entirely within a specified polygon. Inclusion queries
use the $geoWithin (page 635) operator.
Intersection. MongoDB can query for locations that intersect with a specified geometry. These queries apply
only to data on a spherical surface. These queries use the $geoIntersects (page 637) operator.
Proximity. MongoDB can query for the points nearest to another point. Proximity queries use the $near
(page 637) operator. The $near (page 637) operator requires a 2d or 2dsphere index.
Geospatial Indexes MongoDB provides the following geospatial index types to support the geospatial queries:
2dsphere (page 328), which supports:
Calculations on a sphere
Both GeoJSON objects and legacy coordinate pairs
A compound index with scalar index fields (i.e. ascending or descending) as a prefix or suffix of the
2dsphere index field
New in version 2.4: 2dsphere indexes are not available before version 2.4.
2d (page 330), which supports:
Calculations using flat geometry
Legacy coordinate pairs (i.e., geospatial points on a flat coordinate system)
A compound index with only one additional field, as a suffix of the 2d index field
Geospatial Indexes and Sharding You cannot use a geospatial index as the shard key index.
You can create and maintain a geospatial index on a sharded collection if using different fields as the shard key.
Queries using $near (page 637) are not supported for sharded collections. Use geoNear (page 708) instead. You
also can query for geospatial data using $geoWithin (page 635).
327
Additional Resources The following pages provide complete documentation for geospatial indexes and queries:
2dsphere Indexes (page 328) A 2dsphere index supports queries that calculate geometries on an earth-like sphere.
The index supports data stored as both GeoJSON objects and as legacy coordinate pairs.
2d Indexes (page 330) The 2d index supports data stored as legacy coordinate pairs and is intended for use in MongoDB 2.2 and earlier.
Haystack Indexes (page 330) A haystack index is a special index optimized to return results over small areas. For
queries that use spherical geometry, a 2dsphere index is a better option than a haystack index.
2d Index Internals (page 330) Provides a more in-depth explanation of the internals of geospatial indexes. This material is not necessary for normal operations but may be useful for troubleshooting and for further understanding.
2dsphere Indexes New in version 2.4.
A 2dsphere index supports queries that calculate geometries on an earth-like sphere. The index supports data stored
as both GeoJSON objects and as legacy coordinate pairs. The index supports legacy coordinate pairs by converting
the data to the GeoJSON Point type.
The 2dsphere index supports all MongoDB geospatial queries: queries for inclusion, intersection and proximity.
A compound (page 322) 2dsphere index can reference multiple location and non-location fields within a collections
documents. You can arrange the fields in any order.
The default datum for an earth-like sphere in MongoDB 2.4 is WGS84. Coordinate-axis order is longitude, latitude.
Important: MongoDB allows only one geospatial index per collection. You can create either a 2dsphere or a 2d
(page 330) per collection.
Important: You cannot use a 2dsphere index as a shard key when sharding a collection. However, you can create
and maintain a geospatial index on a sharded collection by using a different field as the shard key.
328
Chapter 7. Indexes
Polygons consist of an array of GeoJSON LinearRing coordinate arrays. These LinearRings are closed
LineStrings. Closed LineStrings have at least four coordinate pairs and specify the same position as the
first and last coordinates.
The following example stores a GeoJSON Polygon with an exterior ring and no interior rings (or holes). Note the
first and last coordinate pair with the [ 0 , 0 ] coordinate:
{ loc :
{ type : "Polygon" ,
coordinates : [ [ [ 0 , 0 ] , [ 3 , 6 ] , [ 6 , 1 ] , [ 0 , 0 ] ] ]
} }
329
2d Indexes
Important: MongoDB allows only one geospatial index per collection. You can create either a 2d or a 2dsphere
(page 328) per collection.
Use a 2d index for data stored as points on a two-dimensional plane. The 2d index is intended for legacy coordinate
pairs used in MongoDB 2.2 and earlier.
Use a 2d index if:
your database has legacy location data from MongoDB 2.2 or earlier, and
you do not intend to store any location data as GeoJSON objects.
Do not use a 2d index if your location data includes GeoJSON objects. To index on both legacy coordinate pairs and
GeoJSON objects, use a 2dsphere (page 328) index.
The 2d index supports calculations on a flat, Euclidean plane. The 2d index also supports distance-only calculations
on a sphere, but for geometric calculations (e.g. $geoWithin (page 635)) on a sphere, store data as GeoJSON
objects and use the 2dsphere index type.
A 2d index can reference two fields. The first must be the location field. A 2d compound index constructs queries
that select first on the location field, and then filters those results by the additional criteria. A compound 2d index can
cover queries.
Note: You cannot use a 2d index as a shard key when sharding a collection. However, you can create and maintain a
geospatial index on a sharded collection by using a different field as the shard key.
Store Points on a 2D Plane To store location data as legacy coordinate pairs, use either an array (preferred):
loc : [ <longitude> , <latitude> ]
Or an embedded document:
loc : { lng : <longitude> , lat : <latitude> }
Arrays are preferred as certain languages do not guarantee associative map ordering.
Whether as an array or document, if you use longitude and latitude, store coordinates in this order: longitude, latitude.
Haystack Indexes A haystack index is a special index that is optimized to return results over small areas. Haystack
indexes improve performance on queries that use flat geometry.
For queries that use spherical geometry, a 2dsphere index is a better option than a haystack index. 2dsphere indexes
(page 328) allow field reordering; haystack indexes require the first field to be the location field. Also, haystack indexes
are only usable via commands and so always return all results at once.
Haystack indexes create buckets of documents from the same geographic area in order to improve performance for
queries limited to that area. Each bucket in a haystack index contains all the documents within a specified proximity
to a given longitude and latitude.
To create a geohaystacks index, see Create a Haystack Index (page 354). For information and example on querying a
haystack index, see Query a Haystack Index (page 355).
2d Index Internals This document provides a more in-depth explanation of the internals of MongoDBs 2d geospatial indexes. This material is not necessary for normal operations or application development but may be useful for
troubleshooting and for further understanding.
330
Chapter 7. Indexes
Calculation of Geohash Values for 2d Indexes When you create a geospatial index on legacy coordinate pairs,
MongoDB computes geohash values for the coordinate pairs within the specified location range (page 352) and then
indexes the geohash values.
To calculate a geohash value, recursively divide a two-dimensional map into quadrants. Then assign each quadrant a
two-bit value. For example, a two-bit representation of four quadrants would be:
01
11
00
10
These two-bit values (00, 01, 10, and 11) represent each of the quadrants and all points within each quadrant. For
a geohash with two bits of resolution, all points in the bottom left quadrant would have a geohash of 00. The top
left quadrant would have the geohash of 01. The bottom right and top right would have a geohash of 10 and 11,
respectively.
To provide additional precision, continue dividing each quadrant into sub-quadrants. Each sub-quadrant would have
the geohash value of the containing quadrant concatenated with the value of the sub-quadrant. The geohash for the
upper-right quadrant is 11, and the geohash for the sub-quadrants would be (clockwise from the top left): 1101,
1111, 1110, and 1100, respectively.
Multi-location Documents for 2d Indexes New in version 2.0: Support for multiple locations in a document.
While 2d geospatial indexes do not support more than one set of coordinates in a document, you can use a multi-key
index (page 324) to index multiple coordinate pairs in a single document. In the simplest example you may have a
field (e.g. locs) that holds an array of coordinates, as in the following example:
{ _id : ObjectId(...),
locs : [ [ 55.5 , 42.3 ] ,
[ -74 , 44.74 ] ,
{ lng : 55.5 , lat : 42.3 } ]
}
The values of the array may be either arrays, as in [ 55.5, 42.3 ], or embedded documents, as in { lng :
55.5 , lat : 42.3 }.
You could then create a geospatial index on the locs field, as in the following:
db.places.ensureIndex( { "locs": "2d" } )
You may also model the location data as a field inside of a sub-document. In this case, the document would contain
a field (e.g. addresses) that holds an array of documents where each document has a field (e.g. loc:) that holds
location coordinates. For example:
{ _id : ObjectId(...),
name : "...",
addresses : [ {
context
loc : [
} ,
{
context
loc : [
}
]
}
: "home" ,
55.5, 42.3 ]
: "home",
-74 , 44.74 ]
You could then create the geospatial index on the addresses.loc field as in the following example:
331
For documents with multiple coordinate values, queries may return the same document multiple times if more than
one indexed coordinate pair satisfies the query constraints. Use the uniqueDocs parameter to geoNear (page 708)
or the $uniqueDocs (page 643) operator with $geoWithin (page 635).
To include the location field with the distance field in multi-location document queries, specify includeLocs:
true in the geoNear (page 708) command.
See also:
Geospatial Query Compatibility (page 643)
Text Indexes
Create Text Index To create a text index, use the db.collection.ensureIndex() (page 814) method. To
index a field that contains a string or an array of string elements, include the field and specify the string literal "text"
in the index document, as in the following example:
db.reviews.ensureIndex( { comments: "text" } )
For examples of creating text indexes on multiple fields, see Create a text Index (page 358).
text indexes drop language-specific stop words (e.g. in English, the, an, a, and, etc.) and uses simple
language-specific suffix stemming. See Text Search Languages (page 719) for the supported languages and Specify a
Language for Text Index (page 362) for details on specifying languages with text indexes.
text indexes can cover a text search. If an index covers a text search, MongoDB does not need to inspect data outside
of the index to fulfill the search. For details, see Create text Index to Cover Queries (page 366).
Storage Requirements and Performance Costs text indexes have the following storage requirements and performance costs:
text indexes change the space allocation method for all future record allocations in a collection to
usePowerOf2Sizes (page 755).
text indexes can be large. They contain one index entry for each unique post-stemmed word in each indexed
field for each document inserted.
Building a text index is very similar to building a large multi-key index and will take longer than building a
simple ordered (scalar) index on the same data.
When building a large text index on an existing collection, ensure that you have a sufficiently high limit on
open file descriptors. See the recommended settings (page 225).
332
Chapter 7. Indexes
text indexes will impact insertion throughput because MongoDB must add an index entry for each unique
post-stemmed word in each indexed field of each new source document.
Additionally, text indexes do not store phrases or information about the proximity of words in the documents.
As a result, phrase queries will run much more effectively when the entire collection fits in RAM.
Text Search Text search supports the search of string content in documents of a collection. MongoDB provides the
text (page 715) command to perform the text search. The text (page 715) command accesses the text index.
The text search process:
tokenizes and stems the search term(s) during both the index creation and the text command execution.
assigns a score to each document that contains the search term in the indexed fields. The score determines the
relevance of a document to a given search query.
By default, the text (page 715) command returns at most the top 100 matching documents as determined by the
scores. The command can search for words and phrases. The command matches on the complete stemmed words. For
example, if a document field contains the word blueberry, a search on the term blue will not match the document.
However, a search on either blueberry or blueberries will match.
For information and examples on various text search patterns, see Search String Content for Text (page 359).
Hashed Index
This operation creates a hashed index for the active collection on the a field.
333
Unique Indexes (page 334) A unique index causes MongoDB to reject all documents that contain a duplicate value
for the indexed field.
Sparse Indexes (page 335) A sparse index does not index documents that do not have the indexed field.
TTL Indexes
TTL indexes are special indexes that MongoDB can use to automatically remove documents from a collection after
a certain amount of time. This is ideal for some types of information like machine generated event data, logs, and
session information that only need to persist in a database for a limited amount of time.
These indexes have the following limitations:
Compound indexes (page 322) are not supported.
The indexed field must be a date type.
If the field holds an array, and there are multiple date-typed data in the index, the document will expire when
the lowest (i.e. earliest) matches the expiration threshold.
Warning: The TTL index does not guarantee that expired data will be deleted immediately. There may be a delay
between the time a document expires and the time that MongoDB removes the document from the database.
The background task that removes expired documents runs every 60 seconds. As a result, documents may remain
in a collection after they expire but before the background task runs or completes.
The duration of the removal operation depends on the workload of your mongod (page 925) instance. Therefore,
expired data may exist for some time beyond the 60 second period between runs of the background task.
In all other respects, TTL indexes are normal indexes, and if appropriate, MongoDB can use these indexes to fulfill
arbitrary queries.
See
Expire Data from Collections by Setting TTL (page 158)
Unique Indexes
A unique index causes MongoDB to reject all documents that contain a duplicate value for the indexed field. To
create a unique index on the user_id field of the members collection, use the following operation in the mongo
(page 942) shell:
db.addresses.ensureIndex( { "user_id": 1 }, { unique: true } )
334
Chapter 7. Indexes
Sparse Indexes
Sparse indexes only contain entries for documents that have the indexed field, even if the index field contains a null
value. The index skips over any document that is missing the indexed field. The index is sparse because it does not
include all documents of a collection. By contrast, non-sparse indexes contain all documents in a collection, storing
null values for those documents that do not contain the indexed field.
The following example in the mongo (page 942) shell creates a sparse index on the xmpp_id field of the members
collection:
db.addresses.ensureIndex( { "xmpp_id": 1 }, { sparse: true } )
Example
Sparse index on a collection can results in incomplete results.
Consider a collection score that contains the following documents:
{ "_id" : ObjectId("523b6e32fb408eea0eec2647"), "userid" : "newbie" }
{ "_id" : ObjectId("523b6e61fb408eea0eec2648"), "userid" : "abby", "score" : 82 }
{ "_id" : ObjectId("523b6e6ffb408eea0eec2649"), "userid" : "nina", "score" : 90 }
Then, the following query to return all documents in the scores collection sorted by the score field results in
incomplete results:
db.scores.find().sort( { score: -1 } )
Because the document for the userid "newbie" does not contain the score field, the query, which uses the sparse
index, will return incomplete results that omit that document:
{ "_id" : ObjectId("523b6e6ffb408eea0eec2649"), "userid" : "nina", "score" : 90 }
{ "_id" : ObjectId("523b6e61fb408eea0eec2648"), "userid" : "abby", "score" : 82 }
Note: Do not confuse sparse indexes in MongoDB with block-level3 indexes in other databases. Think of them as
dense indexes with a specific filter.
You can combine the sparse index option with the unique indexes (page 334) option so that mongod (page 925) will
reject documents that have duplicate values for a field, but that ignore documents that do not have the key.
335
Related
Some options that you can specify to ensureIndex() (page 814) options control the properties of the index
(page 333), which are not index creation options. For example, the unique (page 334) option affects the behavior
of the index after creation.
For a detailed description of MongoDBs index types, see Index Types (page 319) and Index Properties (page 333) for
related documentation.
Background Construction
By default, creating an index blocks all other operations on a database. When building an index on a collection, the
database that holds the collection is unavailable for read or write operations until the index build completes. Any
operation that requires a read or write lock on all databases (e.g. listDatabases) will wait for the foreground index
build to complete.
For potentially long running index building operations, consider the background operation so that the MongoDB
database remains available during the index building operation. For example, to create an index in the background of
the zipcode field of the people collection, issue the following:
db.people.ensureIndex( { zipcode: 1}, {background: true} )
Behavior
As of MongoDB version 2.4, a mongod (page 925) instance can build more than one index in the background concurrently.
Changed in version 2.4: Before 2.4, a mongod (page 925) instance could only build one background
index per database at a time.
Changed in version 2.2: Before 2.2, a single mongod (page 925) instance could only build one index at a
time.
Background indexing operations run in the background so that other database operations can run while creating the
index. However, the mongo (page 942) shell session or connection where you are creating the index will block until
the index build is complete. To continue issuing commands to the database, open another connection or mongo
(page 942) instance.
Queries will not use partially-built indexes: the index will only be usable once the index build is complete.
Note: If MongoDB is building an index in the background, you cannot perform other administrative operations involving that collection, including running repairDatabase (page 757), dropping the collection (i.e.
db.collection.drop() (page 812)), and running compact (page 752). These operations will return an error during background index builds.
Performance
The background index operation uses an incremental approach that is slower than the normal foreground index
builds. If the index is larger than the available RAM, then the incremental process can take much longer than the
336
Chapter 7. Indexes
foreground build.
If your application includes ensureIndex() (page 814) operations, and an index doesnt exist for other operational
concerns, building the index can have a severe impact on the performance of the database.
To avoid performance issues, make sure that your application checks for the indexes at start up using the
getIndexes() (page 826) method or the equivalent method for your driver4 and terminates if the proper indexes
do not exist. Always build indexes in production instances using separate application code, during designated maintenance windows.
Building Indexes on Secondaries
Background index operations on a replica set primary become foreground indexing operations on secondary members
of the set. All indexing operations on secondaries block replication.
To build large indexes on secondaries the best approach is to restart one secondary at a time in standalone mode and
build the index. After building the index, restart as a member of the replica set, allow it to catch up with the other
members of the set, and then build the index on the next secondary. When all the secondaries have the new index, step
down the primary, restart it as a standalone, and build the index on the former primary.
Remember, the amount of time required to build the index on a secondary must be within the window of the oplog, so
that the secondary can catch up with the primary.
Indexes on secondary members in recovering mode are always built in the foreground to allow them to catch up as
soon as possible.
See Build Indexes on Replica Sets (page 343) for a complete procedure for building indexes on secondaries.
Drop Duplicates
MongoDB cannot create a unique index (page 334) on a field that has duplicate values. To force the creation of a
unique index, you can specify the dropDups option, which will only index the first occurrence of a value for the key,
and delete all subsequent values.
Important: As in all unique indexes, if a document does not have the indexed field, MongoDB will include it in the
index with a null value.
If subsequent fields do not have the indexed field, and you have set {dropDups: true}, MongoDB will remove
these documents from the collection when creating the index. If you combine dropDups with the sparse (page 335)
option, this index will only include documents in the index that have the value, and the documents without the field
will remain in the database.
To create a unique index that drops duplicates on the username field of the accounts collection, use a command
in the following form:
db.accounts.ensureIndex( { username: 1 }, { unique: true, dropDups: true } )
true } will delete data from your database. Use with extreme cau-
337
Index Names
The default name for an index is the concatenation of the indexed keys and each keys direction in the index, 1 or -1.
Example
Issue the following command to create an index on item and quantity:
db.products.ensureIndex( { item: 1, quantity: -1 } )
338
Chapter 7. Indexes
Create a Sparse Index (page 341) Build an index that omits references to documents that do not include the indexed
field. This saves space when indexing fields that are present in only some documents.
Create a Hashed Index (page 342) Compute a hash of the value of a field in a collection and index the hashed value.
These indexes permit equality queries and may be suitable shard keys for some collections.
Build Indexes on Replica Sets (page 343) To build indexes on a replica set, you build the indexes separately on the
primary and the secondaries, as described here.
Build Indexes in the Background (page 344) Background index construction allows read and write operations to
continue while building the index, but take longer to complete and result in a larger index.
Build Old Style Indexes (page 345) A {v :
2.0 (or later) to MongoDB version 1.8.
Create an Index
Indexes allow MongoDB to process and fulfill queries quickly by creating small and efficient representations of the
documents in a collection. MongoDB creates an index on the _id field of every collection by default, but allows users
to create indexes for any collection using on any field in a document.
This tutorial describes how to create an index on a single field. MongoDB also supports compound indexes (page 322),
which are indexes on multiple fields. See Create a Compound Index (page 340) for instructions on building compound
indexes.
Create an Index on a Single Field
To create an index, use ensureIndex() (page 814) or a similar method from your driver5 . For example the
following creates an index on the phone-number field of the people collection:
db.people.ensureIndex( { "phone-number": 1 } )
ensureIndex() (page 814) only creates an index if an index of the same specification does not already exist.
All indexes support and optimize the performance for queries that select on this field. For queries that cannot use an
index, MongoDB must scan all documents in a collection for documents that match the query.
Examples
If you create an index on the user_id field in the records, this index is, the index will support the following
query:
db.records.find( { user_id: 2 } )
However, the following query, on the profile_url field is not supported by this index:
db.records.find( { profile_url: 2 } )
Additional Considerations
If your collection holds a large amount of data, and your application needs to be able to access the data while building
the index, consider building the index in the background, as described in Background Construction (page 336). To
build indexes on replica sets, see the Build Indexes on Replica Sets (page 343) section for more information.
5 http://api.mongodb.org/
339
Note: To build or rebuild indexes for a replica set see Build Indexes on Replica Sets (page 343).
Some drivers may specify indexes, using NumberLong(1) rather than 1 as the specification. This does not have any
affect on the resulting index.
See also:
Create a Compound Index (page 340), Indexing Tutorials (page 338) and Index Concepts (page 318) for more information.
Create a Compound Index
Indexes allow MongoDB to process and fulfill queries quickly by creating small and efficient representations of the
documents in a collection. MongoDB supports indexes that include content on a single field, as well as compound
indexes (page 322) that include content from multiple fields. Continue reading for instructions and examples of
building a compound index.
Build a Compound Index
To create a compound index (page 322) use an operation that resembles the following prototype:
db.collection.ensureIndex( { a: 1, b: 1, c: 1 } )
Example
The following operation will create an index on the item, category, and price fields of the products collection:
db.products.ensureIndex( { item: 1, category: 1, price: 1 } )
Additional Considerations
If your collection holds a large amount of data, and your application needs to be able to access the data while building
the index, consider building the index in the background, as described in Background Construction (page 336). To
build indexes on replica sets, see the Build Indexes on Replica Sets (page 343) section for more information.
Note: To build or rebuild indexes for a replica set see Build Indexes on Replica Sets (page 343).
Some drivers may specify indexes, using NumberLong(1) rather than 1 as the specification. This does not have any
affect on the resulting index.
See also:
Create an Index (page 339), Indexing Tutorials (page 338) and Index Concepts (page 318) for more information.
Create a Unique Index
MongoDB allows you to specify a unique constraint (page 334) on an index. These constraints prevent applications
from inserting documents that have duplicate values for the inserted fields. Additionally, if you want to create an index
on a collection that has existing data that might have duplicate values for the indexed field, you may chose combine
unique enforcement with duplicate dropping (page 337).
340
Chapter 7. Indexes
Unique Indexes
For example, you may want to create a unique index on the "tax-id": of the accounts collection to prevent
storing multiple account records for the same legal entity:
db.accounts.ensureIndex( { "tax-id": 1 }, { unique: true } )
The _id index (page 321) is a unique index. In some situations you may consider using _id field itself for this kind of
data rather than using a unique index on another field.
In many situations you will want to combine the unique constraint with the sparse option. When MongoDB
indexes a field, if a document does not have a value for a field, the index entry for that item will be null. Since
unique indexes cannot have duplicate values for a field, without the sparse option, MongoDB will reject the second
document and all subsequent documents without the indexed field. Consider the following prototype.
db.collection.ensureIndex( { a: 1 }, { unique: true, sparse: true } )
You can also enforce a unique constraint on compound indexes (page 322), as in the following prototype:
db.collection.ensureIndex( { a: 1, b: 1 }, { unique: true } )
These indexes enforce uniqueness for the combination of index keys and not for either key individually.
Drop Duplicates
To force the creation of a unique index (page 334) index on a collection with duplicate values in the field you are
indexing you can use the dropDups option. This will force MongoDB to create a unique index by deleting documents
with duplicate values when building the index. Consider the following prototype invocation of ensureIndex()
(page 814):
db.collection.ensureIndex( { a: 1 }, { unique: true, dropDups: true } )
See the full documentation of duplicate dropping (page 337) for more information.
Warning: Specifying { dropDups:
tion.
true } may delete data from your database. Use with extreme cau-
Refer to the ensureIndex() (page 814) documentation for additional index creation options.
Create a Sparse Index
Sparse indexes are like non-sparse indexes, except that they omit references to documents that do not include the
indexed field. For fields that are only present in some documents sparse indexes may provide a significant space
savings. See Sparse Indexes (page 335) for more information about sparse indexes and their use.
See also:
Index Concepts (page 318) and Indexing Tutorials (page 338) for more information.
341
Prototype
To create a sparse index (page 335) on a field, use an operation that resembles the following prototype:
db.collection.ensureIndex( { a: 1 }, { sparse: true } )
Example
The following operation, creates a sparse index on the users collection that only includes a document in the index if
the twitter_name field exists in a document.
db.users.ensureIndex( { twitter_name: 1 }, { sparse: true } )
The index excludes all documents that do not include the twitter_name field.
Considerations
Note: Sparse indexes can affect the results returned by the query, particularly with respect to sorts on fields not
included in the index. See the sparse index (page 335) section for more information.
Procedure
To create a hashed index (page 333), specify hashed as the value of the index key, as in the following example:
Example
Specify a hashed index on _id
db.collection.ensureIndex( { _id: "hashed" } )
342
Chapter 7. Indexes
Considerations
MongoDB supports hashed indexes of any single field. The hashing function collapses sub-documents and computes
the hash for the entire value, but does not support multi-key (i.e. arrays) indexes.
You may not create compound indexes that have hashed index fields.
Build Indexes on Replica Sets
Background index creation operations (page 336) become foreground indexing operations on secondary members of
replica sets. The foreground index building process blocks all replication and read operations on the secondaries while
they build the index.
Secondaries will begin building indexes after the primary finishes building the index. In sharded clusters, the mongos
(page 938) will send ensureIndex() (page 814) to the primary members of the replica set for each shard, which
then replicate to the secondaries after the primary finishes building the index.
To minimize the impact of building an index on your replica set, use the following procedure to build indexes on
secondaries:
See
Indexing Tutorials (page 338) and Index Concepts (page 318) for more information.
Considerations
Warning: Ensure that your oplog is large enough to permit the indexing or re-indexing operation to complete
without falling too far behind to catch up. See the oplog sizing (page 411) documentation for additional information.
Note: This procedure does take one member out of the replica set at a time. However, this procedure will only affect
one member of the set at a time rather than all secondaries at the same time.
Procedure
Note: If you need to build an index in a sharded cluster, repeat the following procedure for each replica set that
provides each shard.
Stop One Secondary Stop the mongod (page 925) process on one secondary. Restart the mongod (page 925)
process without the --replSet option and running on a different port. 6 This instance is now in standalone mode.
For example, if your mongod (page 925) normally runs with on the default port of 27017 with the --replSet
option you would use the following invocation:
mongod --port 47017
6 By running the mongod (page 925) on a different port, you ensure that the other members of the replica set and all clients will not contact the
member while you are building the index.
343
Build the Index Create the new index using the ensureIndex() (page 814) in the mongo (page 942) shell, or
comparable method in your driver. This operation will create or rebuild the index on this mongod (page 925) instance
For example, to create an ascending index on the username field of the records collection, use the following
mongo (page 942) shell operation:
db.records.ensureIndex( { username: 1 } )
See also:
Create an Index (page 339) and Create a Compound Index (page 340) for more information.
Restart the Program mongod When the index build completes, start the mongod (page 925) instance with the
--replSet option on its usual port:
mongod --port 27017 --replSet rs0
Modify the port number (e.g. 27017) or the replica set name (e.g. rs0) as needed.
Allow replication to catch up on this member.
Build Indexes on all Secondaries For each secondary in the set, build an index according to the following steps:
1. Stop One Secondary (page 343)
2. Build the Index (page 344)
3. Restart the Program mongod (page 344)
Build the Index on the Primary Finally, to build the index on the primary, begin by stepping down the primary.
Use the rs.stepDown() (page 899) method in the mongo (page 942) shell to cause the current primary to become
a secondary graceful and allow the set to elect another member as primary.
Then repeat the index building procedure, listed below, to build the index on the primary:
1. Stop One Secondary (page 343)
2. Build the Index (page 344)
3. Restart the Program mongod (page 344)
Build Indexes in the Background
By default, MongoDB builds indexes in the foreground and prevent all read and write operations to the database while
the index builds. Also, no operation that requires a read or write lock on all databases (e.g. listDatabases) can occur
during a foreground index build.
Background index construction (page 336) allows read and write operations to continue while building the index.
Note: Background index builds take longer to complete and result in a larger index.
After the index finishes building, MongoDB treats indexes built in the background the same as any other index.
See also:
Index Concepts (page 318) and Indexing Tutorials (page 338) for more information.
344
Chapter 7. Indexes
Procedure
To create an index in the background, add the background argument to the ensureIndex() (page 814) operation,
as in the following index:
db.collection.ensureIndex( { a: 1 }, { background: true } )
Consider the section on background index construction (page 336) for more information about these indexes and their
implications.
Build Old Style Indexes
Important: Use this procedure only if you must have indexes that are compatible with a version of MongoDB earlier
than 2.0.
MongoDB version 2.0 introduced the {v:1} index format. MongoDB versions 2.0 and later support both the {v:1}
format and the earlier {v:0} format.
MongoDB versions prior to 2.0, however, support only the {v:0} format. If you need to roll back MongoDB to a
version prior to 2.0, you must drop and re-create your indexes.
To build pre-2.0 indexes, use the dropIndexes() (page 813) and ensureIndex() (page 814) methods. You
cannot simply reindex the collection. When you reindex on versions that only support {v:0} indexes, the v fields in
the index definition still hold values of 1, even though the indexes would now use the {v:0} format. If you were to
upgrade again to version 2.0 or later, these indexes would not work.
Example
Suppose you rolled back from MongoDB 2.0 to MongoDB 1.8, and suppose you had the following index on the
items collection:
{ "v" : 1, "key" : { "name" : 1 }, "ns" : "mydb.items", "name" : "name_1" }
The v field tells you the index is a {v:1} index, which is incompatible with version 1.8.
To drop the index, issue the following command:
db.items.dropIndex( { name : 1 } )
See also:
Index Performance Enhancements (page 1061).
345
Return a List of All Indexes (page 347) Obtain a list of all indexes on a collection or of all indexes on all collections
in a database.
Measure Index Use (page 348) Study query operations and observe index use for your database.
Remove Indexes
To remove an index from a collection use the dropIndex() (page 812) method and the following procedure. If you
simply need to rebuild indexes you can use the process described in the Rebuild Indexes (page 346) document.
See also:
Indexing Tutorials (page 338) and Index Concepts (page 318) for more information about indexes and indexing operations in MongoDB.
Operations
To remove an index, use the db.collection.dropIndex() (page 812) method, as in the following example:
db.accounts.dropIndex( { "tax-id": 1 } )
This will remove the index on the "tax-id" field in the accounts collection. The shell provides the following
document after completing the operation:
{ "nIndexesWas" : 3, "ok" : 1 }
Where the value of nIndexesWas reflects the number of indexes before removing this index. You can also use the
db.collection.dropIndexes() (page 813) to remove all indexes, except for the _id index (page 321) from a
collection.
These shell helpers provide wrappers around the dropIndexes (page 750) database command. Your client library
(page 95) may have a different or additional interface for these operations.
Rebuild Indexes
If you need to rebuild indexes for a collection you can use the db.collection.reIndex() (page 844) method
to rebuild all indexes on a collection in a single operation. This operation drops all indexes, including the _id index
(page 321), and then rebuilds all indexes.
See also:
Index Concepts (page 318) and Indexing Tutorials (page 338).
Process
MongoDB will return the following document when the operation completes:
{
"nIndexesWas" : 2,
"msg" : "indexes dropped for collection",
"nIndexes" : 2,
"indexes" : [
346
Chapter 7. Indexes
{
"key" : {
"_id" : 1,
"tax-id" : 1
},
"ns" : "records.accounts",
"name" : "_id_"
}
],
"ok" : 1
}
This shell helper provides a wrapper around the reIndex (page 756) database command. Your client library
(page 95) may have a different or additional interface for this operation.
Additional Considerations
Note: To build or rebuild indexes for a replica set see Build Indexes on Replica Sets (page 343).
To return a list of all indexes on a collection, use the db.collection.getIndexes() (page 826) method or a
similar method for your driver7 .
For example, to view all indexes on the people collection:
7 http://api.mongodb.org/
347
db.people.getIndexes()
To return a list of all indexes on all collections in a database, use the following operation in the mongo (page 942)
shell:
db.system.indexes.find()
See system.indexes (page 229) for more information about these documents.
Measure Index Use
Synopsis
Query performance is a good general indicator of index use; however, for more precise insight into index use, MongoDB provides a number of tools that allow you to study query operations and observe index use for your database.
See also:
Index Concepts (page 318) and Indexing Tutorials (page 338) for more information.
Operations
Return Query Plan with explain() Append the explain() (page 861) method to any cursor (e.g. query) to
return a document with statistics about the query process, including the index used, the number of documents scanned,
and the time the query takes to process in milliseconds.
Control Index Use with hint() Append the hint() (page 866) to any cursor (e.g. query) with the index as the
argument to force MongoDB to use a specific index to fulfill the query. Consider the following example:
db.people.find( { name: "John Doe", zipcode: { $gt: 63000 } } } ).hint( { zipcode: 1 } )
You can use hint() (page 866) and explain() (page 861) in conjunction with each other to compare the effectiveness of a specific index. Specify the $natural operator to the hint() (page 866) method to prevent MongoDB
from using any index:
db.people.find( { name: "John Doe", zipcode: { $gt: 63000 } } } ).hint( { $natural: 1 } )
Instance Index Use Reporting MongoDB provides a number of metrics of index use and operation that you may
want to consider when analyzing index use for your database:
In the output of serverStatus (page 782):
indexCounters (page 788)
scanned (page 796)
scanAndOrder (page 796)
In the output of collStats (page 763):
totalIndexSize (page 765)
indexSizes (page 765)
348
Chapter 7. Indexes
The following are four example commands for creating a 2dsphere index:
db.points.ensureIndex(
db.points.ensureIndex(
db.points.ensureIndex(
db.points.ensureIndex(
{
{
{
{
loc : "2dsphere"
loc : "2dsphere"
rating : 1 , loc
loc : "2dsphere"
}
,
:
,
)
type : 1 } )
"2dsphere" } )
rating : 1 , category : -1 } )
The first example creates a simple geospatial index on the location field loc. The second example creates a compound
index where the second field contains non-location data. The third example creates an index where the location field
is not the primary field: the location field does not have to be the first field in a 2dsphere index. The fourth example
creates a compound index with three fields. You can include as many fields as you like in a 2dsphere index.
Query a 2dsphere Index
The following sections describe queries supported by the 2dsphere index. For an overview of recommended geospatial queries, see Geospatial Query Compatibility (page 643).
349
The $geoWithin (page 635) operator queries for location data found within a GeoJSON polygon. Your location
data must be stored in GeoJSON format. Use the following syntax:
db.<collection>.find( { <location field> :
{ $geoWithin :
{ $geometry :
{ type : "Polygon" ,
coordinates : [ <coordinates> ]
} } } } )
The following example selects all points and shapes that exist entirely within a GeoJSON polygon:
db.places.find( { loc :
{ $geoWithin :
{ $geometry :
{ type : "Polygon" ,
coordinates : [ [
[
[
[
[
] ]
} } } } )
0
3
6
0
,
,
,
,
0
6
1
0
] ,
] ,
] ,
]
The following example uses $geoIntersects (page 637) to select all indexed points and shapes that intersect with
the polygon defined by the coordinates array.
db.places.find( { loc :
{ $geoIntersects :
{ $geometry :
{ type : "Polygon" ,
coordinates: [ [
[
[
[
[
] ]
} } } } )
350
0
3
6
0
,
,
,
,
0
6
1
0
] ,
] ,
] ,
]
Chapter 7. Indexes
Proximity queries return the points closest to the defined point and sorts the results by distance. A proximity query on
GeoJSON data requires a 2dsphere index.
To query for proximity to a GeoJSON point, use either the $near (page 637) operator or geoNear (page 708)
command. Distance is in meters.
The $near (page 637) uses the following syntax:
db.<collection>.find( { <location field> :
{ $near :
{ $geometry :
{ type : "Point" ,
coordinates : [ <longitude> , <latitude> ] } ,
$maxDistance : <distance in meters>
} } } )
The geoNear (page 708) command offers more options and returns more information than does the $near
(page 637) operator. To run the command, see geoNear (page 708).
Points within a Circle Defined on a Sphere
To select all grid coordinates in a spherical cap on a sphere, use $geoWithin (page 635) with the
$centerSphere (page 641) operator. Specify an array that contains:
The grid coordinates of the circles center point
The circles radius measured in radians. To calculate radians, see Calculate Distance Using Spherical Geometry
(page 355).
Use the following syntax:
db.<collection>.find( { <location field> :
{ $geoWithin :
{ $centerSphere :
[ [ <x>, <y> ] , <radius> ] }
} } )
The following example queries grid coordinates and returns all documents within a 10 mile radius of longitude 88 W
and latitude 30 N. The example converts the distance, 10 miles, to radians by dividing by the approximate radius of
the earth, 3959 miles:
db.places.find( { loc :
{ $geoWithin :
{ $centerSphere :
[ [ 88 , 30 ] , 10 / 3959 ]
} } } )
Create a 2d Index
To build a geospatial 2d index, use the ensureIndex() (page 814) method and specify 2d. Use the following
syntax:
7.3. Indexing Tutorials
351
By default, a 2d index assumes longitude and latitude and has boundaries of -180 inclusive and 180 non-inclusive
(i.e. [ -180 , 180 ]). If documents contain coordinate data outside of the specified range, MongoDB returns an
error.
Important: The default boundaries allow applications to insert documents with invalid latitudes greater than 90 or
less than -90. The behavior of geospatial queries with such invalid points is not defined.
On 2d indexes you can change the location range.
You can build a 2d geospatial index with a location range other than the default. Use the min and max options when
creating the index. Use the following syntax:
db.collection.ensureIndex( { <location field> : "2d" } ,
{ min : <lower bound> , max : <upper bound> } )
By default, a 2d index on legacy coordinate pairs uses 26 bits of precision, which is roughly equivalent to 2 feet or 60
centimeters of precision using the default range of -180 to 180. Precision is measured by the size in bits of the geohash
values used to store location data. You can configure geospatial indexes with up to 32 bits of precision.
Index precision does not affect query accuracy. The actual grid coordinates are always used in the final query processing. Advantages to lower precision are a lower processing overhead for insert operations and use of less space. An
advantage to higher precision is that queries scan smaller portions of the index to return results.
To configure a location precision other than the default, use the bits option when creating the index. Use following
syntax:
db.<collection>.ensureIndex( {<location field> : "<index type>"} ,
{ bits : <bit precision> } )
For information on the internals of geohash values, see Calculation of Geohash Values for 2d Indexes (page 331).
Query a 2d Index
The following sections describe queries supported by the 2d index. For an overview of recommended geospatial
queries, see Geospatial Query Compatibility (page 643).
Points within a Shape Defined on a Flat Surface
To select all legacy coordinate pairs found within a given shape on a flat surface, use the $geoWithin (page 635)
operator along with a shape operator. Use the following syntax:
352
Chapter 7. Indexes
The following queries for documents within a rectangle defined by [ 0 , 0 ] at the bottom left corner and by [
100 , 100 ] at the top right corner.
db.places.find( { loc :
{ $geoWithin :
{ $box : [ [ 0 , 0 ] ,
[ 100 , 100 ] ]
} } } )
The following queries for documents that are within the circle centered on [ -74 , 40.74 ] and with a radius of
10:
db.places.find( { loc: { $geoWithin :
{ $center : [ [-74, 40.74 ] , 10 ]
} } } )
For syntax and examples for each shape, see the following:
$box (page 641)
$polygon (page 642)
$center (page 640) (defines a circle)
Points within a Circle Defined on a Sphere
MongoDB supports rudimentary spherical queries on flat 2d indexes for legacy reasons. In general, spherical calculations should use a 2dsphere index, as described in 2dsphere Indexes (page 328).
To query for legacy coordinate pairs in a spherical cap on a sphere, use $geoWithin (page 635) with the
$centerSphere (page 641) operator. Specify an array that contains:
The grid coordinates of the circles center point
The circles radius measured in radians. To calculate radians, see Calculate Distance Using Spherical Geometry
(page 355).
Use the following syntax:
db.<collection>.find( { <location field> :
{ $geoWithin :
{ $centerSphere : [ [ <x>, <y> ] , <radius> ] }
} } )
The following example query returns all documents within a 10-mile radius of longitude 88 W and latitude 30 N.
The example converts distance to radians by dividing distance by the approximate radius of the earth, 3959 miles:
db.<collection>.find( { loc : { $geoWithin :
{ $centerSphere :
[ [ 88 , 30 ] , 10 / 3959 ]
} } } )
353
Proximity queries return the 100 legacy coordinate pairs closest to the defined point and sort the results by distance.
Use either the $near (page 637) operator or geoNear (page 708) command. Both require a 2d index.
The $near (page 637) operator uses the following syntax:
db.<collection>.find( { <location field> :
{ $near : [ <x> , <y> ]
} } )
The geoNear (page 708) command offers more options and returns more information than does the $near
(page 637) operator. To run the command, see geoNear (page 708).
Exact Matches on a Flat Surface
You can use the db.collection.find() (page 816) method to query for an exact match on a location. These
queries use the following syntax:
db.<collection>.find( { <location field>: [ <x> , <y> ] } )
This query will return any documents with the value of [ <x> , <y> ].
Create a Haystack Index
To build a haystack index, use the bucketSize option when creating the index. A bucketSize of 5 creates an
index that groups location values that are within 5 units of the specified longitude and latitude. The bucketSize also
determines the granularity of the index. You can tune the parameter to the distribution of your data so that in general
you search only very small regions. The areas defined by buckets can overlap. A document can exist in multiple
buckets.
A haystack index can reference two fields: the location field and a second field. The second field is used for exact
matches. Haystack indexes return documents based on location and an exact match on a single additional criterion.
These indexes are not necessarily suited to returning the closest documents to a particular location.
To build a haystack index, use the following syntax:
db.coll.ensureIndex( { <location field> : "geoHaystack" ,
<additional field> : 1 } ,
{ bucketSize : <bucket value> } )
Example
If you have a collection with documents that contain fields similar to the following:
{ _id : 100, pos: { lng : 126.9, lat : 35.2 } , type : "restaurant"}
{ _id : 200, pos: { lng : 127.5, lat : 36.1 } , type : "restaurant"}
{ _id : 300, pos: { lng : 128.0, lat : 36.7 } , type : "national park"}
The following operations create a haystack index with buckets that store keys within 1 unit of longitude or latitude.
354
Chapter 7. Indexes
This index stores the document with an _id field that has the value 200 in two different buckets:
In a bucket that includes the document where the _id field has a value of 100
In a bucket that includes the document where the _id field has a value of 300
To query using a haystack index you use the geoSearch (page 709) command. See Query a Haystack Index
(page 355).
By default, queries that use a haystack index return 50 documents.
Query a Haystack Index
A haystack index is a special 2d geospatial index that is optimized to return results over small areas. To create a
haystack index see Create a Haystack Index (page 354).
To query a haystack index, use the geoSearch (page 709) command. You must specify both the coordinates and the
additional field to geoSearch (page 709). For example, to return all documents with the value restaurant in the
type field near the example point, the command would resemble:
db.runCommand( { geoSearch : "places" ,
search : { type: "restaurant" } ,
near : [-74, 40.74] ,
maxDistance : 10 } )
Note: Haystack indexes are not suited to queries for the complete list of documents closest to a particular location.
The closest documents could be more distant compared to the bucket size.
Note: Spherical query operations (page 355) are not currently supported by haystack indexes.
The find() (page 816) method and geoNear (page 708) command cannot access the haystack index.
true } option.
Important: These three queries use radians for distance. Other query types do not.
For spherical query operators to function properly, you must convert distances to radians, and convert from radians to
the distances units used by your application.
355
To convert:
distance to radians: divide the distance by the radius of the sphere (e.g. the Earth) in the same units as the
distance measurement.
radians to distance: multiply the radian measure by the radius of the sphere (e.g. the Earth) in the units system
that you want to convert the distance to.
The radius of the Earth is approximately 3,959 miles or 6,371 kilometers.
The following query would return documents from the places collection within the circle described by the center [
-74, 40.74 ] with a radius of 100 miles:
db.places.find( { loc: { $geoWithin: { $centerSphere: [ [ -74, 40.74 ] ,
100 / 3959 ] } } } )
You may also use the distanceMultiplier option to the geoNear (page 708) to convert radians in the mongod
(page 925) process, rather than in your application code. See distance multiplier (page 357).
The following spherical query, returns all documents in the collection places within 100 miles from the point [
-74, 40.74 ].
db.runCommand( { geoNear: "places",
near: [ -74, 40.74 ],
spherical: true
} )
Warning: Spherical queries that wrap around the poles or at the transition from -180 to 180 longitude raise an
error.
Note: While the default Earth-like bounds for geospatial indexes are between -180 inclusive, and 180, valid values
for latitude are between -90 and 90.
356
Chapter 7. Indexes
Distance Multiplier
The distanceMultiplier option of the geoNear (page 708) command returns distances only after multiplying
the results by an assigned value. This allows MongoDB to return converted values, and removes the requirement to
convert units in application logic.
Using distanceMultiplier in spherical queries provides results from the geoNear (page 708) command that
do not need radian-to-distance conversion. The following example uses distanceMultiplier in the geoNear
(page 708) command with a spherical (page 355) example:
db.runCommand( { geoNear: "places",
near: [ -74, 40.74 ],
spherical: true,
distanceMultiplier: 3959
} )
357
Create text Index to Cover Queries (page 366) Perform text searches that return results without the need to scan documents.
Enable Text Search
New in version 2.4.
The text search (page 333) is currently a beta feature. As a beta feature:
You need to explicitly enable the feature before creating a text index (page 332) or using the text (page 715)
command.
To enable text search on replica sets (page 381) and sharded clusters (page 498), you need to enable on each
and every mongod (page 925) for replica sets and on each and every mongos (page 938) for sharded clusters.
Warning:
Do not enable or use text search on production systems.
Text indexes have significant storage requirements and performance costs. See Storage Requirements and
Performance Costs (page 332) for more information.
You can enable the text search feature at startup with the textSearchEnabled (page 1008) parameter:
mongod --setParameter textSearchEnabled=true
You may prefer to set the textSearchEnabled (page 1008) parameter in the configuration file (page 990).
Additionally, you can enable the feature in the mongo (page 942) shell with the setParameter (page 756) command. This command does not propagate from the primary to the secondaries. You must enable on each and every
mongod (page 925) for replica sets.
Note: You must set the parameter every time you start the server. You may prefer to add the parameter to the
configuration files (page 990).
The following example creates a text index on the fields subject and content:
db.collection.ensureIndex(
{
subject: "text",
content: "text"
}
)
This text index catalogs all string data in the subject field and the content field, where the field value is either
a string or an array of string elements.
358
Chapter 7. Indexes
To allow for text search on all fields with string content, use the wildcard specifier ($**) to index all fields that contain
string content.
The following example indexes any string value in the data of every field of every document in collection and
names the index TextIndex:
db.collection.ensureIndex(
{ "$**": "text" },
{ name: "TextIndex" }
)
Because text (page 715) command is case-insensitive, the text search will match the following document in the
quotes collection:
{
"_id" : ObjectId("50ecef5f8abea0fda30ceab3"),
"quote" : "tomorrow, and tomorrow, and tomorrow, creeps in this petty pace",
"related_quotes" : [
"is this a dagger which I see before me",
"the handle toward my hand?"
],
"src" : {
"title" : "Macbeth",
"from" : "Act V, Scene V"
},
"speaker" : "macbeth"
}
If the search string is a space-delimited text, text (page 715) command performs a logical OR search on each term
and returns documents that contains any of the terms.
For example, the search string "tomorrow largo" searches for the term tomorrow OR the term largo:
7.3. Indexing Tutorials
359
The command will match the following documents in the quotes collection:
{
"_id" : ObjectId("50ecef5f8abea0fda30ceab3"),
"quote" : "tomorrow, and tomorrow, and tomorrow, creeps in this petty pace",
"related_quotes" : [
"is this a dagger which I see before me",
"the handle toward my hand?"
],
"src" : {
"title" : "Macbeth",
"from" : "Act V, Scene V"
},
"speaker" : "macbeth"
}
{
"_id" : ObjectId("50ecf0cd8abea0fda30ceab4"),
"quote" : "Es tan corto el amor y es tan largo el olvido.",
"related_quotes" : [
"Como para acercarla mi mirada la busca.",
"Mi corazn la busca, y ella no est conmigo."
],
"speaker" : "Pablo Neruda",
"src" : {
"title" : "Veinte poemas de amor y una cancin desesperada",
"from" : "Poema 20"
}
}
Match Phrases
To match the exact phrase that includes a space(s) as a single term, escape the quotes.
For example, the following command searches for the exact phrase "and tomorrow":
db.quotes.runCommand( "text", { search: "\"and tomorrow\"" } )
If the search string contains both phrases and individual terms, the text (page 715) command performs a compound
logical AND of the phrases with the compound logical OR of the single terms, including the individual terms from each
phrase.
For example, the following search string contains both individual terms corto and largo as well as the phrase
\"and tomorrow\":
db.quotes.runCommand( "text", { search: "corto largo \"and tomorrow\"" } )
The text (page 715) command performs the equivalent to the following logical operation, where the individual
terms corto, largo, as well as the term tomorrow from the phrase "and tomorrow", are part of a logical OR
expression:
(corto OR largo OR tomorrow) AND ("and tomorrow")
As such, the results for this search will include documents that only contain the phrase "and tomorrow" as well as
documents that contain the phrase "and tomorrow" and the terms corto and/or largo. Documents that contain
360
Chapter 7. Indexes
the phrase "and tomorrow" as well as the terms corto and largo will generally receive a higher score for this
search.
Match Some Words But Not Others
A negated term is a term that is prefixed by a minus sign -. If you negate a term, the text (page 715) command will
exclude the documents that contain those terms from the results.
Note: If the search text contains only negated terms, the text (page 715) command will not return any results.
The following example returns those documents that contain the term tomorrow but not the term petty.
db.quotes.runCommand( "text" , { search: "tomorrow -petty" } )
Note: The result from the text (page 715) command must fit within the maximum BSON Document Size
(page 1015).
By default, the text (page 715) command will return up to 100 matching documents, from highest to lowest scores.
To override this default limit, use the limit option in the text (page 715) command, as in the following example:
db.quotes.runCommand( "text", { search: "tomorrow", limit: 2 } )
The text (page 715) command will return at most 2 of the highest scoring results.
The limit can be any number as long as the result set fits within the maximum BSON Document Size
(page 1015).
Specify Which Fields to Return in the Result Set
In the text (page 715) command, use the project option to specify the fields to include (1) or exclude (0) in the
matching documents.
Note: The _id field is always returned unless explicitly excluded in the project document.
The following example returns only the _id field and the src field in the matching documents:
db.quotes.runCommand( "text", { search: "tomorrow",
project: { "src": 1 } } )
The text (page 715) command can also use the filter option to specify additional query conditions.
The following example will return the documents that contain the term tomorrow AND the speaker is macbeth:
db.quotes.runCommand( "text", { search: "tomorrow",
filter: { speaker : "macbeth" } } )
See also:
Limit the Number of Entries Scanned (page 365)
7.3. Indexing Tutorials
361
You can specify the language that determines the tokenization, stemming, and removal of stop words, as in the following example:
db.quotes.runCommand( "text", { search: "amor", language: "spanish" } )
See Text Search Languages (page 719) for a list of supported languages as well as Specify a Language for Text Index
(page 362) for specifying languages for the text index.
Text Search Output
The text (page 715) command returns a document that contains the result set.
See Output (page 718) for information on the output.
Specify a Language for Text Index
This tutorial describes how to specify the default language associated with the text index (page 362) and also how to
create text indexes for collections that contain documents in different languages (page 362).
Specify the Default Language for a text Index
The default language associated with the indexed data determines the list of stop words and the rules for the stemmer
and tokenizer. The default language for the indexed data is english.
To specify a different language, use the default_language option when creating the text index. See Text Search
Languages (page 719) for the languages available for default_language.
The following example creates a text index on the content field and sets the default_language to spanish:
db.collection.ensureIndex(
{ content : "text" },
{ default_language: "spanish" }
)
Specify the Index Language within the Document If a collection contains documents that are in different languages, include a field in the documents that contain the language to use:
If you include a field named language in the document, by default, the ensureIndex() (page 814) method
will use the value of this field to override the default language.
To use a field with a name other than language, you must specify the name of this field to the
ensureIndex() (page 814) method with the language_override option.
See Text Search Languages (page 719) for a list of supported languages.
Include the language Field Include a field language that specifies the language to use for the individual documents.
For example, the documents of a multi-language collection quotes contain the field language:
362
Chapter 7. Indexes
For the documents that contain the language field, the text index uses that language to determine the stop
words and the rules for the stemmer and the tokenizer.
For documents that do not contain the language field, the index uses the default language, which is English,
to determine the stop words and rules for the stemmer and the tokenizer.
For example, the Spanish word que is a stop word. So the following text (page 715) command would not match
any document:
db.quotes.runCommand( "text", { search: "que", language: "spanish" } )
Use any Field to Specify the Language for a Document Include a field that specifies the language to use for the
individual documents. To use a field with a name other than language, include the language_override option
when creating the index.
For example, the documents of a multi-language collection quotes contain the field idioma:
{ _id: 1, idioma: "portuguese", quote: "A sorte protege os audazes" }
{ _id: 2, idioma: "spanish", quote: "Nada hay ms surreal que la realidad." }
{ _id: 3, idioma: "english", quote: "is this a dagger which I see before me" }
Create a text index on the field quote with the language_override option:
db.quotes.ensureIndex( { quote : "text" },
{ language_override: "idioma" } )
For the documents that contain the idioma field, the text index uses that language to determine the stop
words and the rules for the stemmer and the tokenizer.
For documents that do not contain the idioma field, the index uses the default language, which is English, to
determine the stop words and rules for the stemmer and the tokenizer.
For example, the Spanish word que is a stop word. So the following text (page 715) command would not match
any document:
db.quotes.runCommand( "text", { search: "que", language: "spanish" } )
363
To avoid creating an index with a name that exceeds the index name length limit (page 1016), you can pass
the name option to the db.collection.ensureIndex() (page 814) method:
db.collection.ensureIndex(
{
content: "text",
"users.comments": "text",
"users.profiles": "text"
},
{
name: "MyTextIndex"
}
)
Note:
To drop the text index, use the index name.
db.collection.getIndexes() (page 826).
To create a text index with different field weights for the content field and the keywords field, include the
weights option to the ensureIndex() (page 814) method. For example, the following command creates an
index on three fields and assigns weights to two of the fields:
db.blog.ensureIndex(
{
content: "text",
keywords: "text",
364
Chapter 7. Indexes
about: "text"
},
{
weights: {
content: 10,
keywords: 5,
},
name: "TextIndex"
}
)
_id:
_id:
_id:
_id:
_id:
_id:
1,
2,
3,
4,
5,
6,
dept:
dept:
dept:
dept:
dept:
dept:
A common use case is to perform text searches by individual departments, such as:
db.inventory.runCommand( "text", {
search: "green",
filter: { dept : "kitchen" }
}
)
To limit the text search to scan only those documents within a specific dept, create a compound index that specifies
an ascending/descending index key on the field dept and a text index key on the field description:
db.inventory.ensureIndex(
{
dept: 1,
description: "text"
}
)
365
Important:
The ascending/descending index keys must be listed before, or prefix, the text index keys.
By prefixing the text index fields with ascending/descending index fields, MongoDB will only index documents that have the prefix fields.
You cannot include multi-key (page 324) index fields or geospatial (page 327) index fields.
The text (page 715) command must include the filter option that specifies an equality condition for the
prefix fields.
Then, the text search within a particular department will limit the scan of indexed documents. For example, the
following text (page 715) command scans only those documents with dept equal to kitchen:
db.inventory.runCommand( "text", {
search: "green",
filter: { dept : "kitchen" }
}
)
The returned result includes the statistics that shows that the command scanned 1 document, as indicated by the
nscanned field:
{
"queryDebugString" : "green||||||",
"language" : "english",
"results" : [
{
"score" : 0.75,
"obj" : {
"_id" : 3,
"dept" : "kitchen",
"description" : "a green placemat"
}
}
],
"stats" : {
"nscanned" : 1,
"nscannedObjects" : 0,
"n" : 1,
"nfound" : 1,
"timeMicros" : 211
},
"ok" : 1
}
For more information on the result set, see Output (page 718).
Create text Index to Cover Queries
To create a text index that can cover queries (page 368):
1. Append scalar index fields to a text index, as in the following example which specifies an ascending index
key on username:
366
Chapter 7. Indexes
Warning: You cannot include multi-key (page 324) index field or geospatial (page 327) index field.
2. Use the project option in the text (page 715) to return only the fields in the index, as in the following:
db.quotes.runCommand( "text", { search: "tomorrow",
project: { username: 1,
_id: 0
}
}
)
Note: By default, the _id field is included in the result set. Since the example index did not include the _id field,
you must explicitly exclude the field in the project document.
367
Create a Single-Key Index if All Queries Use the Same, Single Key
If you only ever query on a single key in a given collection, then you need to create just one single-key index for that
collection. For example, you might create an index on category in the product collection:
db.products.ensureIndex( { "category": 1 } )
If you sometimes query on only one key and at other times query on that key combined with a second key, then creating
a compound index is more efficient than creating a single-key index. MongoDB will use the compound index for both
queries. For example, you might create an index on both category and item.
db.products.ensureIndex( { "category": 1, "item": 1 } )
This allows you both options. You can query on just category, and you also can query on category combined
with item. A single compound index (page 322) on multiple fields can support all the queries that search a prefix
subset of those fields.
Note: With the exception of queries that use the $or (page 625) operator, a query does not use multiple indexes. A
query uses only one index.
Example
The following index on a collection:
{ x: 1, y: 1, z: 1 }
There are some situations where the prefix indexes may offer better query performance: for example if z is a large
array.
The { x:
1, y:
1, z:
1 } index can also support many of the same queries as the following index:
{ x: 1, z: 1 }
Also, { x:
1, z:
db.collection.find( { x: 5 } ).sort( { z: 1} )
The { x: 1, z: 1 } index supports both the query and the sort operation, while the { x: 1, y: 1,
z: 1 } index only supports the query. For more information on sorting, see Use Indexes to Sort Query Results
(page 370).
368
Chapter 7. Indexes
Because the index covers the query, MongoDB can both match the query conditions (page 68) and return the results
using only the index; MongoDB does not need to look at the documents, only the index, to fulfill the query. An index
can also cover an aggregation pipeline operation (page 281) on unsharded collections.
Querying only the index can be much faster than querying documents outside of the index. Index keys are typically
smaller than the documents they catalog, and indexes are typically available in RAM or located sequentially on disk.
MongoDB automatically uses an index that covers a query when possible. To ensure that an index can cover a query,
create an index that includes all the fields listed in the query document (page 68) and in the query result. You can
specify the fields to return in the query results with a projection (page 72) document. By default, MongoDB includes
the _id field in the query result. So, if the index does not include the _id field, then you must exclude the _id field
(i.e. _id: 0) from the query results.
Example
Given collection users with an index on the fields user and status, as created by the following option:
db.users.ensureIndex( { status: 1, user: 1 } )
Then, this index will cover the following query which selects on the status field and returns only the user field:
db.users.find( { status: "A" }, { user: 1, _id: 0 } )
If the projection document does not specify the exclusion of the _id field, the query returns the _id field. The
following query is not covered by the index on the status and the user fields because with the projection document
{ user: 1 }, the query returns both the user field and the _id field:
db.users.find( { status: "A" }, { user: 1 } )
The { user:
To determine whether a query is a covered query, use the explain() (page 861) method. If the explain()
(page 861) output displays true for the indexOnly (page 864) field, the query is covered by an index, and MongoDB queries only that index to match the query and return the results.
369
For more information on sorting by index prefixes, see Sort Subset Starts at the Index Beginning (page 370).
If the query includes equality match conditions on an index prefix, you can sort on a subset of the index that starts
after or overlaps with the prefix. For example, given an index { a: 1, b: 1, c: 1, d: 1 }, if the
query condition includes equality match conditions on a and b, you can specify a sort on the subsets { c: 1 } or
{ c: 1, d: 1 }:
db.collection.find( { a: 5, b: 3 } ).sort( { c: 1 } )
db.collection.find( { a: 5, b: 3 } ).sort( { c: 1, d: 1 } )
In these operations, the equality match and the sort documents together cover the index prefixes { a:
c: 1 } and { a: 1, b: 1, c: 1, d: 1 } respectively.
1, b:
1,
You can also specify a sort order that includes the prefix; however, since the query condition specifies equality matches
on these fields, they are constant in the resulting documents and do not contribute to the sort order:
db.collection.find( { a: 5, b: 3 } ).sort( { a: 1, b: 1, c: 1 } )
db.collection.find( { a: 5, b: 3 } ).sort( { a: 1, b: 1, c: 1, d: 1 } )
For more information on sorting by index subsets that are not prefixes, see Sort Subset Does Not Start at the Index
Beginning (page 371).
Note: For in-memory sorts that do not use an index, the sort() (page 872) operation is significantly slower. The
sort() (page 872) operation will abort when it uses 32 megabytes of memory.
If the sort document contains a subset of the compound index fields, the subset can determine whether MongoDB
can use the index efficiently to both retrieve and sort the query results. If MongoDB can efficiently use the index to
both retrieve and sort the query results, the output from the explain() (page 861) will display scanAndOrder
(page 864) as false or 0. If MongoDB can only use the index for retrieving documents that meet the query criteria,
MongoDB must manually sort the resulting documents without the use of the index. For in-memory sort operations,
explain() (page 861) will display scanAndOrder (page 864) as true or 1.
Sort Subset Starts at the Index Beginning If the sort document is a subset of a compound index and starts from
the beginning of the index, MongoDB can use the index to both retrieve and sort the query results.
For example, the collection collection has the following index:
370
Chapter 7. Indexes
{ a: 1, b: 1, c: 1, d: 1 }
The following operations include a sort with a subset of the index. Because the sort subset starts at beginning of the
index, the operations can use the index for both the query retrieval and sort:
db.collection.find().sort( { a:1 } )
db.collection.find().sort( { a:1, b:1 } )
db.collection.find().sort( { a:1, b:1, c:1 } )
db.collection.find( { a: 4 } ).sort( { a: 1, b: 1 } )
db.collection.find( { a: { $gt: 4 } } ).sort( { a: 1, b: 1 } )
db.collection.find( { b: 5 } ).sort( { a: 1, b: 1 } )
db.collection.find( { b: { $gt:5 }, c: { $gt: 1 } } ).sort( { a: 1, b: 1 } )
The last two operations include query conditions on the field b but does not include a query condition on the field a:
db.collection.find( { b: 5 } ).sort( { a: 1, b: 1 } )
db.collection.find( { b: { $gt:5 }, c: { $gt: 1 } } ).sort( { a: 1, b: 1 } )
Consider the case where the collection has the index { b: 1 } in addition to the { a: 1, b: 1, c: 1,
d: 1 } index. Because of the query condition on b, it is not immediately obvious which index MongoDB may
select as the best index. To explicitly specify the index to use, see hint() (page 866).
Sort Subset Does Not Start at the Index Beginning The sort document can be a subset of a compound index that
does not start from the beginning of the index. For instance, { c: 1 } is a subset of the index { a: 1, b:
1, c: 1, d: 1 } that omits the preceding index fields a and b. MongoDB can use the index efficiently if the
query document includes all the preceding fields of the index, in this case a and b, in equality conditions. In other
words, the equality conditions in the query document and the subset in the sort document contiguously cover a prefix
of the index.
For example, the collection collection has the following index:
{ a: 1, b: 1, c: 1, d: 1 }
In the first operation, the query document { a: 5 } with the sort document { b:
the prefix { a:1 , b: 1, c: 1 } of the index.
In the second operation, the query document { a:
1 } covers the full index.
5, c:
4, b:
1, c:
1 } cover
Only the index fields preceding the sort subset must have the equality conditions in the query document. The other
index fields may have other conditions. The following operations can efficiently use the index since the equality
conditions in the query document and the subset in the sort document contiguously cover a prefix of the index:
db.collection.find( { a: 5, b: 3 } ).sort( { c: 1 } )
db.collection.find( { a: 5, b: 3, c: { $lt: 4 } } ).sort( { c: 1 } )
371
1, b:
1, c:
1, d:
The above example shows an index size of almost 4.3 gigabytes. To ensure this index fits in RAM, you must not only
have more than that much RAM available but also must have RAM available for the rest of the working set. Also
remember:
If you have and use multiple collections, you must consider the size of all indexes on all collections. The indexes and
the working set must be able to fit in memory at the same time.
There are some limited cases where indexes do not need to fit in memory. See Indexes that Hold Only Recent Values
in RAM (page 372).
See also:
collStats (page 763) and db.collection.stats() (page 848)
Indexes that Hold Only Recent Values in RAM
Indexes do not have to fit entirely into RAM in all cases. If the value of the indexed field increments with every insert,
and most queries select recently added documents; then MongoDB only needs to keep the parts of the index that hold
the most recent or right-most values in RAM. This allows for efficient index use for read and write operations and
minimize the amount of RAM required to support the index.
Create Queries that Ensure Selectivity
Selectivity is the ability of a query to narrow results using the index. Effective indexes are more selective and allow
MongoDB to use the index for a larger portion of the work associated with fulfilling the query.
To ensure selectivity, write queries that limit the number of possible documents with the indexed field. Write queries
that are appropriately selective relative to your indexed data.
Example
Suppose you have a field called status where the possible values are new and processed. If you add an index
on status youve created a low-selectivity index. The index will be of little help in locating records.
A better strategy, depending on your queries, would be to create a compound index (page 322) that includes the lowselectivity field and another field. For example, you could create a compound index on status and created_at.
Another option, again depending on your use case, might be to use separate collections, one for each status.
Example
Consider an index { a : 1 } (i.e. an index on the key a sorted in ascending order) on a collection where a has
three values evenly distributed across the collection:
372
Chapter 7. Indexes
{
{
{
{
{
{
{
{
{
_id:
_id:
_id:
_id:
_id:
_id:
_id:
_id:
_id:
ObjectId(),
ObjectId(),
ObjectId(),
ObjectId(),
ObjectId(),
ObjectId(),
ObjectId(),
ObjectId(),
ObjectId(),
a:
a:
a:
a:
a:
a:
a:
a:
a:
1,
1,
1,
2,
2,
2,
3,
3,
3,
b:
b:
b:
b:
b:
b:
b:
b:
b:
"ab"
"cd"
"ef"
"jk"
"lm"
"no"
"pq"
"rs"
"tv"
}
}
}
}
}
}
}
}
}
If you query for { a: 2, b: "no" } MongoDB must scan 3 documents in the collection to return the one
matching result. Similarly, a query for { a: { $gt: 1}, b: "tv" } must scan 6 documents, also to
return one result.
Consider the same index on a collection where a has nine values evenly distributed across the collection:
{
{
{
{
{
{
{
{
{
_id:
_id:
_id:
_id:
_id:
_id:
_id:
_id:
_id:
ObjectId(),
ObjectId(),
ObjectId(),
ObjectId(),
ObjectId(),
ObjectId(),
ObjectId(),
ObjectId(),
ObjectId(),
a:
a:
a:
a:
a:
a:
a:
a:
a:
1,
2,
3,
4,
5,
6,
7,
8,
9,
b:
b:
b:
b:
b:
b:
b:
b:
b:
"ab"
"cd"
"ef"
"jk"
"lm"
"no"
"pq"
"rs"
"tv"
}
}
}
}
}
}
}
}
}
If you query for { a: 2, b: "cd" }, MongoDB must scan only one document to fulfill the query. The index
and query are more selective because the values of a are evenly distributed and the query can select a specific document
using the index.
However, although the index on a is more selective, a query such as { a:
still need to scan 4 documents.
{ $gt:
5 }, b:
"tv" } would
If overall selectivity is low, and if MongoDB must read a number of documents to return results, then some queries
may perform faster without indexes. To determine performance, see Measure Index Use (page 348).
For a conceptual introduction to indexes in MongoDB see Index Concepts (page 318).
373
374
Chapter 7. Indexes
Description
Removes indexes from a collection.
Defragments a collection and rebuilds the indexes.
Rebuilds all indexes on a collection.
Internal command that scans for a collections data and indexes for
correctness.
Experimental command that collects and aggregates statistics on all
indexes.
Performs a geospatial query that returns the documents closest to a given
point.
Performs a geospatial query that uses MongoDBs haystack index
functionality.
An internal command to support geospatial queries.
Internal command that validates index on shard key.
Description
Selects geometries within a bounding GeoJSON geometry.
Selects geometries that intersect with a GeoJSON geometry.
Returns geospatial objects in proximity to a point.
Returns geospatial objects in proximity to a point on a sphere.
Description
Forces MongoDB to report on query execution plans. See explain() (page 861).
Forces MongoDB to use a specific index. See hint() (page 866)
Specifies a minimum exclusive upper limit for the index to use in a query. See max()
(page 867).
Specifies a minimum inclusive lower limit for the index to use in a query. See min()
(page 869).
Forces the cursor to only return fields included in the index.
Forces the query to use the index on the _id field. See snapshot() (page 872).
375
376
Chapter 7. Indexes
CHAPTER 8
Replication
A replica set in MongoDB is a group of mongod (page 925) processes that maintain the same data set. Replica sets
provide redundancy and high availability, and are the basis for all production deployments. This section introduces
replication in MongoDB as well as the components and architecture of replica sets. The section also provides tutorials
for common tasks related to replica sets.
Replication Introduction (page 377) An introduction to replica sets, their behavior, operation, and use.
Replication Concepts (page 381) The core documentation of replica set operations, configurations, architectures and
behaviors.
Replica Set Members (page 382) Introduces the components of replica sets.
Replica Set Deployment Architectures (page 390) Introduces architectural considerations related to replica
sets deployment planning.
Replica Set High Availability (page 396) Presents the details of the automatic failover and recovery process
with replica sets.
Replica Set Read and Write Semantics (page 402) Presents the semantics for targeting read and write operations to the replica set, with an awareness of location and set configuration.
Replica Set Tutorials (page 419) Tutorials for common tasks related to the use and maintenance of replica sets.
Replication Reference (page 467) Reference for functions and operations related to replica sets.
377
Figure 8.1: Diagram of default routing of reads and writes to the primary.
The secondaries replicate the primarys oplog and apply the operations to their data sets. Secondaries data sets reflect
the primarys data set. If the primary is unavailable, the replica set will elect a secondary to be primary. By default,
clients read from the primary, however, clients can specify a read preferences (page 405) to send read operations to
secondaries. See secondaries (page 382) for more information
You may add an extra mongod (page 925) instance a replica set as an arbiter. Arbiters do not maintain a data set.
Arbiters only exist to vote in elections. If your replica set has an even number of members, add an arbiter to obtain a
majority of votes in an election for primary. Arbiters do not require dedicated hardware. See arbiter (page 389) for
more information.
378
Chapter 8. Replication
Figure 8.2: Diagram of a 3 member replica set that consists of a primary and two secondaries.
Figure 8.3: Diagram of a replica set that consists of a primary, a secondary, and an arbiter.
379
Note: An arbiter will always be an arbiter. A primary may step down and become a secondary. A secondary may
become the primary during an election.
Asynchronous Replication
Secondaries apply operations from the primary asynchronously. By applying operations after the primary, sets can
continue to function without some members. However, as a result secondaries may not return the most current data to
clients.
See Replica Set Oplog (page 410) and Replica Set Data Synchronization (page 412) for more information. See Read
Preference (page 405) for more on read operations and secondaries.
Automatic Failover
When a primary does not communicate with the other members of the set for more than 10 seconds, the replica set
will attempt to select another member to become the new primary. The first secondary that receives a majority of the
votes becomes primary.
Figure 8.4: Diagram of an election of a new primary. In a three member replica set with two secondaries, the primary
becomes unreachable. The loss of a primary triggers an election where one of the secondaries becomes the new
primary
380
Chapter 8. Replication
See Replica Set Elections (page 397) and Rollbacks During Replica Set Failover (page 401) for more information.
Additional Features
Replica sets provide a number of options to support application needs. For example, you may deploy a replica set
with members in multiple data centers (page 396), or control the outcome of elections by adjusting the priority
(page 481) of some members. Replica sets also support dedicated members for reporting, disaster recovery, or backup
functions.
See Priority 0 Replica Set Members (page 386), Hidden Replica Set Members (page 387) and Delayed Replica Set
Members (page 387) for more information.
381
The minimum requirements for a replica set are: A primary (page ??), a secondary (page ??), and an arbiter (page ??).
Most deployments, however, will keep three members that store data: A primary (page ??) and two secondary members
(page ??).
Replica Set Primary
The primary is the only member in the replica set that receives write operations. MongoDB applies write operations
on the primary and then records the operations on the primarys oplog (page 410). Secondary (page ??) members
replicate this log and apply the operations to their data sets.
In the following three-member replica set, the primary accepts all write operations. Then the secondaries replicate the
oplog to apply to their data sets.
All members of the replica set can accept read operations. However, by default, an application directs its read operations to the primary member. See Read Preference (page 405) for details on changing the default read behavior.
The replica set can have at most one primary. If the current primary becomes unavailable, an election determines the
new primary. See Replica Set Elections (page 397) for more details.
In the following 3-member replica set, the primary becomes unavailable. This triggers an election which selects one
of the remaining secondaries as the new primary.
Replica Set Secondary Members
A secondary maintains a copy of the primarys data set. To replicate data, a secondary applies operations from the
primarys oplog (page 410) to its own data set in an asynchronous process. A replica set can have one or more
secondaries.
The following three-member replica set has two secondary members. The secondaries replicate the primarys oplog
and apply the operations to their data sets.
Although clients cannot write data to secondaries, clients can read data from secondary members. See Read Preference
(page 405) for more information on how clients direct read operations to replica sets.
A secondary can become a primary. If the current primary becomes unavailable, the replica set holds an election to
choose with of the secondaries becomes the new primary.
In the following three-member replica set, the primary becomes unavailable. This triggers an election where one of
the remaining secondaries becomes the new primary.
See Replica Set Elections (page 397) for more details.
1 While replica sets are the recommended solution for production, a replica set can support only 12 members in total. If your deployment requires
more than 12 members, youll need to use master-slave (page 413) replication. Master-slave replication lacks the automatic failover capabilities.
382
Chapter 8. Replication
Figure 8.5: Diagram of default routing of reads and writes to the primary.
383
Figure 8.6: Diagram of an election of a new primary. In a three member replica set with two secondaries, the primary
becomes unreachable. The loss of a primary triggers an election where one of the secondaries becomes the new
primary
Figure 8.7: Diagram of a 3 member replica set that consists of a primary and two secondaries.
384
Chapter 8. Replication
Figure 8.8: Diagram of an election of a new primary. In a three member replica set with two secondaries, the primary
becomes unreachable. The loss of a primary triggers an election where one of the secondaries becomes the new
primary
385
You can configure a secondary member for a specific purpose. You can configure a secondary to:
Prevent it from becoming a primary in an election, which allows it to reside in a secondary data center or to
serve as a cold standby. See Priority 0 Replica Set Members (page 386).
Prevent applications from reading from it, which allows it to run applications that require separation from normal
traffic. See Hidden Replica Set Members (page 387).
Keep a running historical snapshot for use in recovery from certain errors, such as unintentionally deleted
databases. See Delayed Replica Set Members (page 387).
Priority 0 Replica Set Members
A priority 0 member is a secondary that cannot become primary. Priority 0 members cannot trigger elections.
Otherwise these members function as normal secondaries. A priority 0 member maintains a copy of the data set,
accepts read operations, and votes in elections. Configure a priority 0 member to prevent secondaries from becoming
primary, which is particularly useful in multi-data center deployments.
In a three-member replica set, in one data center hosts the primary and a secondary. A second data center hosts one
priority 0 member that cannot become primary.
Figure 8.9: Diagram of a 3 member replica set distributed across two data centers. Replica set includes a priority 0
member.
Priority 0 Members as Standbys A priority 0 member can function as a standby. In some replica sets, it might not
be possible to add a new member in a reasonable amount of time. A standby member keeps a current copy of the data
to be able to replace an unavailable member.
In many cases, you need not set standby to priority 0. However, in sets with varied hardware or geographic distribution
(page 396), a priority 0 standby ensures that only qualified members become primary.
A priority 0 standby may also be valuable for some members of a set with different hardware or workload profiles.
In these cases, deploy a member with priority 0 so it cant become primary. Also consider using an hidden member
(page 387) for this purpose.
If your set already has seven voting members, also configure the member as non-voting (page 400).
386
Chapter 8. Replication
Priority 0 Members and Failover When configuring a priority 0 member, consider potential failover patterns,
including all possible network partitions. Always ensure that your main data center contains both a quorum of voting
members and contains members that are eligible to be primary.
Configuration To configure a priority 0 member, see Prevent Secondary from Becoming Primary (page 439).
Hidden Replica Set Members
A hidden member maintains a copy of the primarys data set but is invisible to client applications. Hidden members
are ideal for workloads with different usage patterns from the other members in the replica set. Hidden members are
also priority 0 members (page 386) and cannot become primary. The db.isMaster() (page 890) method does
not display hidden members. Hidden members, however, do vote in elections (page 397).
In the following five-member replica set, all four secondary members have copies of the primarys data set, but one of
the secondary members is hidden.
Figure 8.10: Diagram of a 5 member replica set with a hidden priority 0 member.
Secondary reads do not reach a hidden member, so the member receives no traffic beyond what replication requires. It
can be useful to keep a hidden member dedicated to reporting or to do backups.
For dedicated backup, ensure that the hidden member has low network latency to the primary or likely primary. Ensure
that the replication lag is minimal or non-existent.
Avoid stopping the mongod (page 925) process of a hidden members. Instead, for filesystem snapshots, use
db.fsyncLock() (page 885) to flush all writes and lock the mongod (page 925) instance for the duration of
the backup.
For more information about backing up MongoDB databases, see Backup Strategies for MongoDB Systems (page 136).
To configure a hidden member, see Configure a Hidden Replica Set Member (page 440).
Delayed Replica Set Members
Delayed members contain copies of a replica sets data set. However, a delayed members data set reflects an earlier,
or delayed, state of the set. For example, if the current time is 09:52 and a member has a delay of an hour, the delayed
member has no operation more recent than 08:52.
387
Because delayed members are a rolling backup or a running historical snapshot of the data set, they may help
you recover from various kinds of human error. For example, a delayed member can make it possible to recover from
unsuccessful application upgrades and operator errors including dropped databases and collections.
Requirements Delayed members:
Must be priority 0 (page 386) members. Set the priority to 0 to prevent a delayed member from becoming
primary.
Should be hidden (page 387) members. Always prevent applications from seeing and querying delayed members.
do vote in elections for primary.
Delayed members apply operations from the oplog on a delay. When choosing the amount of delay, consider that the
amount of delay:
must be is equal to or greater than your maintenance windows.
must be smaller than the capacity of the oplog. For more information on oplog size, see Oplog Size (page 411).
Example In the following 5-member replica set, the primary and all secondaries have copies of the data set. One
member applies operations with a delay of 3600 seconds, or an hour. This delayed member is also hidden and is a
priority 0 member.
Figure 8.11: Diagram of a 5 member replica set with a hidden delayed priority 0 member.
Configuration A delayed member has its priority (page 481) equal to 0, hidden (page 481) equal to true,
and its slaveDelay (page 482) equal to the number of seconds of delay:
{
"_id" : <num>,
"host" : <hostname:port>,
"priority" : 0,
"slaveDelay" : <seconds>,
"hidden" : true
}
To configure a delayed member, see Configure a Delayed Replica Set Member (page 441).
388
Chapter 8. Replication
For example, in the following replica set, an arbiter allows the set to have an odd number of votes for elections:
Figure 8.12: Diagram of a four member replica set plus an arbiter for odd number of votes.
Security
Authentication When running with auth (page 993), arbiters exchange credentials with other members of the set
to authenticate. MongoDB encrypts the authentication process. The MongoDB authentication exchange is cryptographically secure.
Arbiters, use keyfiles to authenticate to the replica set.
Communication The only communication between arbiters and other set members are: votes during elections,
heartbeats, and configuration data. These exchanges are not encrypted.
However, if your MongoDB deployment uses SSL, MongoDB will encrypt all communication between replica set
members. See Connect to MongoDB with SSL (page 249) for more information.
As with all MongoDB components, run arbiters on in trusted network environments.
389
Strategies
Determine the Number of Members
Fault Tolerance.
1
1
2
2
Adding a member to the replica set does not always increase the fault tolerance. However, in these cases, additional
members can provide support for dedicated functions, such as backups or reporting.
Use Hidden and Delayed Members for Dedicated Functions Add hidden (page 387) or delayed (page 387) members to support dedicated functions, such as backup or reporting.
Load Balance on Read-Heavy Deployments In a deployment with very high read traffic, you can improve read
throughput by distributing reads to secondary members. As your deployment grows, add or move members to alternate
data centers to improve redundancy and availability.
Always ensure that the main facility is able to elect a primary.
Add Capacity Ahead of Demand The existing members of a replica set must have spare capacity to support adding
a new member. Always add new members before the current demand saturates the capacity of the set.
390
Chapter 8. Replication
Distribute Members Geographically To protect your data if your main data center fails, keep at least one member
in an alternate data center. Set these members priority (page 481) to 0 to prevent them from becoming primary.
Keep a Majority of Members in One Location When a replica set has members in multiple data centers, network
partitions can prevent communication between data centers. To replicate data, members must be able to communicate
to other members.
In an election, members must see each other to create a majority. To ensure that the replica set members can confirm
a majority and elect a primary, keep a majority of the sets members in one location.
Target Operations with Tags
Use replica set tags (page 451) to ensure that operations replicate to specific data centers. Tags also support targeting
read operations to specific machines.
See also:
Data Center Awareness (page 153) and Operational Segregation in MongoDB Deployments (page 153).
Use Journaling to Protect Against Power Failures
Enable journaling to protect data against service interruptions. Without journaling MongoDB cannot recover data after
unexpected shutdowns, including power failures and unexpected reboots.
All 64-bit versions of MongoDB after version 2.0 have journaling enabled by default.
Deployment Patterns
The following documents describe common replica set deployment patterns. Other patterns are possible and effective depending on the the applications requirements. If needed, combine features of each architecture in your own
deployment:
Three Member Replica Sets (page 391) Three-member replica sets provide the minimum recommended architecture
for a replica set.
Replica Sets with Four or More Members (page 392) Four or more member replica sets provide greater redundancy
and can support greater distribution of read operations and dedicated functionality.
Geographically Distributed Replica Sets (page 396) Geographically distributed sets include members in multiple locations to protect against facility-specific failures, such as power outages.
Three Member Replica Sets
The minimum architecture of a replica set has three members. A three member replica set can have either three
members that hold data, or two members that hold data and an arbiter.
Primary with Two Secondary Members A replica set with three members that store data has:
One primary (page 382).
Two secondary (page 382) members. Both secondaries can become the primary in an election (page 397).
391
Figure 8.13: Diagram of a 3 member replica set that consists of a primary and two secondaries.
These deployments provide two complete copies of the data set at all times in addition to the primary. These replica
sets provide additional fault tolerance and high availability (page 396). If the primary is unavailable, the replica set
elects a secondary to be primary and continues normal operation. The old primary rejoins the set when available.
Primary with a Secondary and an Arbiter A three member replica set with a two members that store data has:
One primary (page 382).
One secondary (page 382) member. The secondary can become primary in an election (page 397).
One arbiter (page 389). The arbiter only votes in elections.
Since the arbiter does not hold a copy of the data, these deployments provides only one complete copy of the data.
Arbiters require fewer resources, at the expense of more limited redundancy and fault tolerance.
However, a deployment with a primary, secondary, and an arbiter ensures that a replica set remains available if the
primary or the secondary is unavailable. If the primary is unavailable, the replica set will elect the secondary to be
primary.
See also:
Deploy a Replica Set (page 420).
Replica Sets with Four or More Members
Although the standard replica set configuration has three members you can deploy larger sets. Add additional members
to a set to increase redundancy or to add capacity for distributing secondary read operations.
When adding members, ensure that:
The set has an odd number of voting members. If you have an even number of voting members, deploy an
arbiter (page ??) so that the set has an odd number.
The following replica set needs an arbiter to have an odd number of voting members.
A replica set can have up to 12 members, 2 but only 7 voting members. See non-voting members (page 400) for
more information.
2
While replica sets are the recommended solution for production, a replica set can support only 12 members in total. If your deployment requires
more than 12 members, youll need to use master-slave (page 413) replication. Master-slave replication lacks the automatic failover capabilities.
392
Chapter 8. Replication
Figure 8.14: Diagram of an election of a new primary. In a three member replica set with two secondaries, the primary
becomes unreachable. The loss of a primary triggers an election where one of the secondaries becomes the new
primary
Figure 8.15: Diagram of a replica set that consists of a primary, a secondary, and an arbiter.
393
Figure 8.16: Diagram of an election of a new primary. In a three member replica set with a secondary and an arbiter, the
primary becomes unreachable. The loss of a primary triggers an election where the secondary becomes new primary.
Figure 8.17: Diagram of a four member replica set plus an arbiter for odd number of votes.
394
Chapter 8. Replication
The following 9 member replica set has 7 voting members and 2 non-voting members.
Figure 8.18: Diagram of a 9 member replica set with the maximum of 7 voting members.
Members that cannot become primary in a failover have priority 0 configuration (page 386).
For instance, some members that have limited resources or networking constraints and should never be able to
become primary. Configure members that should not become primary to have priority 0 (page 386). In following
replica set, the secondary member in the third data center has a priority of 0:
Figure 8.19: Diagram of a 5 member replica set distributed across three data centers. Replica set includes a priority 0
member.
A majority of the sets members should be in your applications main data center.
See also:
Deploy a Replica Set (page 420), Add an Arbiter to Replica Set (page 432), and Add Members to a Replica Set
(page 433).
395
Adding members to a replica set in multiple data centers adds redundancy and provides fault tolerance if one data
center is unavailable. Members in additional data centers should have a priority of 0 (page 386) to prevent them from
becoming primary.
For example: the architecture of a geographically distributed replica set may be:
One primary in the main data center.
One secondary member in the main data center. This member can become primary at any time.
One priority 0 (page 386) member in a second data center. This member cannot become primary.
In the following replica set, the primary and one secondary are in Data Center 1, while Data Center 2 has a priority 0
(page 386) secondary that cannot become a primary.
Figure 8.20: Diagram of a 3 member replica set distributed across two data centers. Replica set includes a priority 0
member.
If the primary is unavailable, the replica set will elect a new primary from Data Center 1. If the data centers cannot
connect to each other, the member in Data Center 2 will not become the primary.
If Data Center 1 becomes unavailable, you can manually recover the data set from Data Center 2 with minimal
downtime. With sufficient write concern (page 55), there will be no data loss.
To facilitate elections, the main data center should hold a majority of members. Also ensure that the set has an odd
number of members. If adding a member in another data center results in a set with an even number of members,
deploy an arbiter (page ??). For more information on elections, see Replica Set Elections (page 397).
See also:
Deploy a Geographically Distributed Replica Set (page 425).
396
Chapter 8. Replication
The deployment of a replica set affects the outcome of failover situations. To support effective failover, ensure that one
facility can elect a primary if needed. Choose the facility that hosts the core application systems to host the majority
of the replica set. Place a majority of voting members and all the members that can become primary in this facility.
Otherwise, network partitions could prevent the set from being able to form a majority.
Failover Processes
The replica set recovers from the loss of a primary by holding an election. Consider the following:
Replica Set Elections (page 397) Elections occur when the primary becomes unavailable and the replica set members
autonomously select a new primary.
Rollbacks During Replica Set Failover (page 401) A rollback reverts write operations on a former primary when the
member rejoins the replica set after a failover.
Replica Set Elections
Replica sets use elections to determine which set member will become primary. Elections occur after initiating a
replica set, and also any time the primary becomes unavailable. The primary is the only member in the set that can
accept write operations. If a primary becomes unavailable, elections allow the set to recover normal operations without
manual intervention. Elections are part of the failover process (page 396).
Important: Elections are essential for independent operation of a replica set; however, elections take time to complete. While an election is in process, the replica set has no primary and cannot accept writes. MongoDB avoids
elections unless necessary.
In the following three-member replica set, the primary is unavailable. The remaining secondaries hold an election to
choose a new primary.
Factors and Conditions that Affect Elections
Heartbeats Replica set members send heartbeats (pings) to each other every two seconds. If a heartbeat does not
return within 10 seconds, the other members mark the delinquent member as inaccessible.
Priority Comparisons The priority (page 481) setting affects elections. Members will prefer to vote for members with the highest priority value.
Members with a priority value of 0 cannot become primary and do not seek election. For details, see Priority 0 Replica
Set Members (page 386).
A replica set does not hold an election as long as the current primary has the highest priority value and is within 10
seconds of the latest oplog entry in the set. If a higher-priority member catches up to within 10 seconds of the latest
oplog entry of the current primary, the set holds an election in order to provide the higher-priority node a chance to
become primary.
Optime The optime (page 727) is the timestamp of the last operation that a member applied from the oplog. A
replica set member cannot become primary unless it has the highest (i.e. most recent) optime (page 727) of any
visible member in the set.
3
Replica sets remove rollback data when needed without intervention. Administrators must apply or discard rollback data manually.
397
Figure 8.21: Diagram of an election of a new primary. In a three member replica set with two secondaries, the primary
becomes unreachable. The loss of a primary triggers an election where one of the secondaries becomes the new
primary
398
Chapter 8. Replication
Connections A replica set member cannot become primary unless it can connect to a majority of the members in the
replica set. For the purposes of elections, a majority refers to the total number of votes, rather than the total number of
members.
If you have a three-member replica set, where every member has one vote, the set can elect a primary as long as two
members can connect to each other. If two members are unavailable, the remaining member remains a secondary
because it cannot connect to a majority of the sets members. If the remaining member is a primary and two members
become unavailable, the primary steps down and becomes and secondary.
Network Partitions Network partitions affect the formation of a majority for an election. If a primary steps down
and neither portion of the replica set has a majority the set will not elect a new primary. The replica set becomes
read-only.
To avoid this situation, place a majority of instances in one data center and a minority of instances in any other data
centers combined.
Election Mechanics
Election Triggering Events Replica sets hold an election any time there is no primary. Specifically, the following:
the initiation of a new replica set.
a secondary loses contact with a primary. Secondaries call for elections when they cannot see a primary.
a primary steps down.
Note: Priority 0 members (page 386), do not trigger elections, even when they cannot connect to the primary.
A primary will step down:
after receiving the replSetStepDown (page 730) command.
if one of the current secondaries is eligible for election and has a higher priority.
if primary cannot contact a majority of the members of the replica set.
Important: When a primary steps down, it closes all open client connections, so that clients dont attempt to write
data to a secondary. This helps clients maintain an accurate view of the replica set and helps prevent rollbacks.
Participation in Elections Every replica set member has a priority that helps determine its eligibility to become a
primary. In an election, the replica set elects an eligible member with the highest priority (page 481) value as
primary. By default, all members have a priority of 1 and have an equal chance of becoming primary. In the default,
all members also can trigger an election.
You can set the priority (page 481) value to weight the election in favor of a particular member or group of
members. For example, if you have a geographically distributed replica set (page 396), you can adjust priorities so
that only members in a specific data center can become primary.
The first member to receive the majority of votes becomes primary. By default, all members have a single vote, unless
you modify the votes (page 482) setting. Non-voting members (page 442) have votes (page 482) value of 0.
The state (page 727) of a member also affects its eligibility to vote. Only members in the following states can vote:
PRIMARY, SECONDARY, RECOVERING, ARBITER, and ROLLBACK.
Important: Do not alter the number of votes in a replica set to control the outcome of an election. Instead, modify
the priority (page 481) value.
399
Vetoes in Elections All members of a replica set can veto an election, including non-voting members (page 400). A
member will veto an election:
If the member seeking an election is not a member of the voters set.
If the member seeking an election is not up-to-date with the most recent operation accessible in the replica set.
If the member seeking an election has a lower priority than another member in the set that is also eligible for
election.
If a priority 0 member (page 386) 4 is the most current member at the time of the election. In this case, another
eligible member of the set will catch up to the state of this secondary member and then attempt to become
primary.
If the current primary has more recent operations (i.e. a higher optime (page 727)) than the member seeking
election, from the perspective of the voting member.
If the current primary has the same or more recent operations (i.e. a higher or equal optime (page 727)) than
the member seeking election.
Non-Voting Members Non-voting members hold copies of the replica sets data and can accept read operations from
client applications. Non-voting members do not vote in elections, but can veto (page 400) an election and become
primary.
Because a replica set can have up to 12 members but only up to seven voting members, non-voting members allow a
replica set to have more than seven members.
For instance, the following nine-member replica set has seven voting members and two non-voting members.
Figure 8.22: Diagram of a 9 member replica set with the maximum of 7 voting members.
A non-voting member has a votes (page 482) setting equal to 0 in its member configuration:
{
"_id" : <num>
"host" : <hostname:port>,
"votes" : 0
}
4
Remember that hidden (page 387) and delayed (page 387) imply priority 0 (page 386) configuration.
400
Chapter 8. Replication
Important: Do not alter the number of votes to control which members will become primary. Instead, modify the
priority (page 481) option. Only alter the number of votes in exceptional cases. For example, to permit more than
seven members.
When possible, all members should have only one vote. Changing the number of votes can cause ties, deadlocks, and
the wrong members to become primary.
To configure a non-voting member, see Configure Non-Voting Replica Set Member (page 442).
Rollbacks During Replica Set Failover
A rollback reverts write operations on a former primary when the member rejoins its replica set after a failover.
A rollback is necessary only if the primary had accepted write operations that the secondaries had not successfully
replicated before the primary stepped down. When the primary rejoins the set as a secondary, it reverts, or rolls back,
its write operations to maintain database consistency with the other members.
MongoDB attempts to avoid rollbacks, which should be rare. When a rollback does occur, it is often the result of a
network partition. Secondaries that can not keep up with the throughput of operations on the former primary, increase
the size an impact of the rollback.
A rollback does not occur if the write operations replicate to another member of the replica set before the primary
steps down and if that member remains available and accessible to a majority of the replica set.
Collect Rollback Data When a rollback does occur, administrators must decide whether to apply or ignore the
rollback data. MongoDB writes the rollback data to BSON files in the rollback/ folder under the databases
dbpath (page 993) directory. The names of rollback files have the following form:
<database>.<collection>.<timestamp>.bson
For example:
records.accounts.2011-05-09T18-10-04.0.bson
Administrators must apply rollback data manually after the member completes the rollback and returns to secondary
status. Use bsondump (page 960) to read the contents of the rollback files. Then use mongorestore (page 956) to
apply the changes to the new primary.
Avoid Replica Set Rollbacks To prevent rollbacks, use replica acknowledged write concern (page 57) to guarantee
that the write operations propagate to the members of a replica set.
Rollback Limitations A mongod (page 925) instance will not rollback more than 300 megabytes of data. If your
system must rollback more than 300 megabytes, you must manually intervene to recover the data. If this is the case,
the following line will appear in your mongod (page 925) log:
[replica set sync] replSet syncThread: 13410 replSet too much data to roll back
In this situation, save the data directly or force the member to perform an initial sync. To force initial sync, sync from a
current member of the set by deleting the content of the dbpath (page 993) directory for the member that requires
a larger rollback.
See also:
Replica Set High Availability (page 396) and Replica Set Elections (page 397).
401
The default write concern confirms write operations only on the primary. You can configure write concern to confirm
write operations to additional replica set members as well by issuing the getLastError (page 720) command with
the w option.
The w option confirms that write operations have replicated to the specified number of replica set members, including
the primary. You can either specify a number or specify majority, which ensures the write propagates to a majority
of set members.
5
Sharded clusters where the shards are also replica sets provide the same configuration options with regards to write and read operations.
402
Chapter 8. Replication
Figure 8.23: Write operation to a replica set with write concern level of w:2 or write to the primary and at least one
secondary.
403
If you specify a w value greater than the number of members that hold a copy of the data (i.e., greater than the number
of non-arbiter members), the operation blocks until those members become available. This can cause the operation
to block forever. To specify a timeout threshold for the getLastError (page 720) operation, use the wtimeout
argument. argument. A wtimeout value of 0 means that the operation will never time out.
See getLastError Examples (page 721) for example invocations.
Modify Default Write Concern
You can configure your own default getLastError (page 720) behavior for a replica set. Use the
getLastErrorDefaults (page 483) setting in the replica set configuration (page 479). The following sequence
of commands creates a configuration that waits for the write operation to complete on a majority of the set members
before returning:
cfg = rs.conf()
cfg.settings = {}
cfg.settings.getLastErrorDefaults = {w: "majority"}
rs.reconfig(cfg)
The getLastErrorDefaults (page 483) setting affects only those getLastError (page 720) commands that
have no other arguments.
Note: Use of insufficient write concern can lead to rollbacks (page 401) in the case of replica set failover (page 396).
Always ensure that your operations have specified the required write concern for your application.
See also:
Write Concern (page 55) and Write Concern Options (page 1012)
Custom Write Concerns
You can use replica set tags to create custom write concerns using the getLastErrorDefaults (page 483) and
getLastErrorModes (page 483) replica set settings.
Note: Custom write concern modes specify the field name and a number of distinct values for that field. By contrast,
read preferences use the value of fields in the tag document to direct read operations.
In some cases, you may be able to use the same tags for read preferences and write concerns; however, you may need
to create additional tags for write concerns depending on the requirements of your application.
Consider a five member replica set, where each member has one of the following tag sets:
{
{
{
{
{
"use":
"use":
"use":
"use":
"use":
"reporting" }
"backup" }
"application" }
"application" }
"application" }
You could create a custom write concern mode that will ensure that applicable write operations will not return until
members with two different values of the use tag have acknowledged the write operation. Create the mode with the
following sequence of operations in the mongo (page 942) shell:
404
Chapter 8. Replication
cfg = rs.conf()
cfg.settings = { getLastErrorModes: { use2: { "use": 2 } } }
rs.reconfig(cfg)
To use this mode pass the string use2 to the w option of getLastError (page 720) as follows:
db.runCommand( { getLastError: 1, w: "use2" } )
If you have a three member replica with the following tag sets:
{ "disk": "ssd" }
{ "disk": "san" }
{ "disk": "spinning" }
You cannot specify a custom getLastErrorModes (page 483) value to ensure that the write propagates to the san
before returning. However, you may implement this write concern policy by creating the following additional tags, so
that the set resembles the following:
{ "disk": "ssd" }
{ "disk": "san", "disk.san": "san" }
{ "disk": "spinning" }
To use this mode pass the string san to the w option of getLastError (page 720) as follows:
db.runCommand( { getLastError: 1, w: "san" } )
This operation will not return until a replica set member with the tag disk.san returns.
You may set a custom write concern mode as the default write concern mode using getLastErrorDefaults
(page 483) replica set as in the following setting:
cfg = rs.conf()
cfg.settings.getLastErrorDefaults = { ssd: 1 }
rs.reconfig(cfg)
See also:
Configure Replica Set Tag Sets (page 451) for further information about replica set reconfiguration and tag sets.
Read Preference
Read preference describes how MongoDB clients route read operations to members of a replica set.
By default, an application directs its read operations to the primary member in a replica set. Reading from the primary
guarantees that read operations reflect the latest version of a document. However, by distributing some or all reads to
secondary members of the replica set, you can improve read throughput or reduce latency for an application that does
not require fully up-to-date data.
Important: You must exercise care when specifying read preferences: modes other than primary (page 489) can
405
and will return stale data because the secondary queries will not include the most recent write operations to the replica
sets primary.
The following are common use cases for using non-primary (page 489) read preference modes:
Running systems operations that do not affect the front-end application.
Issuing reads to secondaries helps distribute load and prevent operations from affecting the main workload of
the primary. This can be a good choice for reporting and analytics workloads, for example.
Note: Read preferences arent relevant to direct connections to a single mongod (page 925) instance. However,
in order to perform read operations on a direct connection to a secondary member of a replica set, you must set
a read preference, such as secondary.
Providing local reads for geographically distributed applications.
If you have application servers in multiple data centers, you may consider having a geographically distributed
replica set (page 396) and using a non primary read preference or the nearest (page 490). This reduces
network latency by having the application server to read from a nearby secondary, rather than a distant primary.
Maintaining availability during a failover.
Use primaryPreferred (page 489) if you want your application to do consistent reads from the primary
under normal circumstances, but to allow stale reads from secondaries in an emergency. This provides a readonly mode for your application during a failover.
Figure 8.24: Read operations to a replica set. Default read preference routes the read to the primary. Read preference
of nearest routes the read to the nearest member.
Note: In general, do not use primary (page 489) and primaryPreferred (page 489) to provide extra capacity.
Sharding (page 493) increases read and write capacity by distributing read and write operations across a group of
machines, and is often a better strategy for adding capacity.
406
Chapter 8. Replication
See
Read Preference Processes (page 408) for more information about the internal application of read preferences.
Tag sets allow you to specify custom read preferences (page 405) and write concerns (page 55) so that your application
can target operations to specific members.
Custom read preferences and write concerns evaluate tags sets in different ways: read preferences consider the value
of a tag when selecting a member to read from. while write concerns ignore the value of a tag to when selecting a
member except to consider whether or not the value is unique.
You can specify tag sets with the following read preference modes:
6 http://api.mongodb.org/
7 http://api.mongodb.org/
407
Clients, by way of their drivers, and mongos (page 938) instances for sharded clusters, periodically update their view
of the replica sets state.
When you select non-primary (page 489) read preference, the driver will determine which member to target using
the following process:
1. Assembles a list of suitable members, taking into account member type (i.e. secondary, primary, or all members).
2. Excludes members not matching the tag sets, if specified.
3. Determines which suitable member is the closest to the client in absolute terms.
4. Builds a list of members that are within a defined ping distance (in milliseconds) of the absolute nearest
member.
Applications can configure the threshold used in this stage. The default acceptable latency is 15 milliseconds, which you can override in the drivers with their own secondaryAcceptableLatencyMS option.
For mongos (page 938) you can use the --localThreshold or localThreshold (page 1002) runtime
options to set this value.
5. Selects a member from these hosts at random. The member receives the read operation.
Drivers can then associate the thread or connection with the selected member. This request association (page 409) is
configurable by the application. See your driver (page 95) documentation about request association configuration and
default behavior.
408
Chapter 8. Replication
Request Association
Important: Request association is configurable by the application. See your driver (page 95) documentation about
request association configuration and default behavior.
Because secondary members of a replica set may lag behind the current primary by different amounts, reads for
secondary members may reflect data at different points in time. To prevent sequential reads from jumping around in
time, the driver can associate application threads to a specific member of the set after the first read, thereby preventing
reads from other members. The thread will continue to read from the same member until:
The application performs a read with a different read preference,
The thread terminates, or
The client receives a socket exception, as is the case when theres a network error or when the mongod
(page 925) closes connections during a failover. This triggers a retry (page 409), which may be transparent
to the application.
When using request association, if the client detects that the set has elected a new primary, the driver will discard all
associations between threads and members.
Auto-Retry
Connections between MongoDB drivers and mongod (page 925) instances in a replica set must balance two concerns:
1. The client should attempt to prefer current results, and any connection should read from the same member of
the replica set as much as possible.
2. The client should minimize the amount of time that the database is inaccessible as the result of a connection
issue, networking problem, or failover in a replica set.
As a result, MongoDB drivers and mongos (page 938):
Reuse a connection to specific mongod (page 925) for as long as possible after establishing a connection to that
instance. This connection is pinned to this mongod (page 925).
Attempt to reconnect to a new member, obeying existing read preference modes (page 489), if the connection to
mongod (page 925) is lost.
Reconnections are transparent to the application itself. If the connection permits reads from secondary members, after reconnecting, the application can receive two sequential reads returning from different secondaries.
Depending on the state of the individual secondary members replication, the documents can reflect the state of
your database at different moments.
Return an error only after attempting to connect to three members of the set that match the read preference mode
(page 489) and tag set (page 407). If there are fewer than three members of the set, the client will error after
connecting to all existing members of the set.
After this error, the driver selects a new member using the specified read preference mode. In the absence of a
specified read preference, the driver uses primary (page 489).
After detecting a failover situation,
possible.
the driver attempts to refresh the state of the replica set as quickly as
8 When a failover occurs, all members of the set close all client connections that produce a socket error in the driver. This behavior prevents or
minimizes rollback.
409
Changed in version 2.2: Before version 2.2, mongos (page 938) did not support the read preference mode semantics
(page 489).
In most sharded clusters, each shard consists of a replica set. As such, read preferences are also applicable. With
regard to read preference, read operations in a sharded cluster are identical to unsharded replica sets.
Unlike simple replica sets, in sharded clusters, all interactions with the shards pass from the clients to the mongos
(page 938) instances that are actually connected to the set members. mongos (page 938) is then responsible for the
application of read preferences, which is transparent to applications.
There are no configuration changes required for full support of read preference modes in sharded environments, as
long as the mongos (page 938) is at least version 2.2. All mongos (page 938) maintain their own connection pool to
the replica set members. As a result:
A request without a specified preference has primary (page 489), the default, unless, the mongos (page 938)
reuses an existing connection that has a different mode set.
To prevent confusion, always explicitly set your read preference mode.
All nearest (page 490) and latency calculations reflect the connection between the mongos (page 938) and
the mongod (page 925) instances, not the client and the mongod (page 925) instances.
This produces the desired result, because all results must pass through the mongos (page 938) before returning
to the client.
410
Chapter 8. Replication
Oplog Size
When you start a replica set member for the first time, MongoDB creates an oplog of a default size. The size depends
on the architectural details of your operating system.
In most cases, the default oplog size is sufficient. For example, if an oplog is 5% of free disk space and fills up in 24
hours of operations, then secondaries can stop copying entries from the oplog for up to 24 hours without becoming
stale. However, most replica sets have much lower operation volumes, and their oplogs can hold much higher numbers
of operations.
Before mongod (page 925) creates an oplog, you can specify its size with the oplogSize (page 1000) option.
However, after you have started a replica set member for the first time, you can only change the size of the oplog using
the Change the Size of the Oplog (page 446) procedure.
By default, the size of the oplog is as follows:
For 64-bit Linux, Solaris, FreeBSD, and Windows systems, MongoDB allocates 5% of the available free disk
space to the oplog. If this amount is smaller than a gigabyte, then MongoDB allocates 1 gigabyte of space.
For 64-bit OS X systems, MongoDB allocates 183 megabytes of space to the oplog.
For 32-bit systems, MongoDB allocates about 48 megabytes of space to the oplog.
Workloads that Might Require a Larger Oplog Size
If you can predict your replica sets workload to resemble one of the following patterns, then you might want to create
an oplog that is larger than the default. Conversely, if your application predominantly performs reads and writes only
a small amount of data, you will oplog may be sufficient.
The following workloads might require a larger oplog size.
Updates to Multiple Documents at Once The oplog must translate multi-updates into individual operations in order
to maintain idempotency. This can use a great deal of oplog space without a corresponding increase in data size or
disk use.
Deletions Equal the Same Amount of Data as Inserts If you delete roughly the same amount of data as you insert,
the database will not grow significantly in disk use, but the size of the operation log can be quite large.
Significant Number of In-Place Updates If a significant portion of the workload is in-place updates, the database
records a large number of operations but does not change the quantity of data on disk.
Oplog Status
To view oplog status, including the size and the time range of operations, issue the
db.printReplicationInfo() (page 891) method. For more information on oplog status, see Check the
Size of the Oplog (page 465).
Under various exceptional situations, updates to a secondarys oplog might lag behind the desired performance time.
Use db.getReplicationInfo() (page 887) from a secondary member and the replication status (page 887)
output to assess the current state of replication and determine if there is any unintended replication delay.
See Replication Lag (page 463) for more information.
411
Initial sync copies all the data from one member of the replica set to another member. A member uses initial sync
when the member has no data, such as when the member is new, or when the member has data but is missing a history
of the sets replication.
When you perform an initial sync, MongoDB does the following:
1. Clones all databases. To clone, the mongod (page 925) queries every collection in each source database and
inserts all data into its own copies of these collections.
2. Applies all changes to the data set. Using the oplog from the source, the mongod (page 925) updates its data
set to reflect the current state of the replica set.
3. Builds all indexes on all collections.
When the mongod (page 925) finishes building all index builds, the member can transition to a normal state,
i.e. secondary.
To perform an initial sync, see Resync a Member of a Replica Set (page 450).
Replication
Replica set members replicate data continuously after the initial sync. This process keeps the members up to date
with all changes to the replica sets data. In most cases, secondaries synchronize from the primary. Secondaries
may automatically change their sync targets if needed based on changes in the ping time and state of other members
replication.
For a member to sync from another, the buildIndexes (page 481) setting for both members must have the same
value/ buildIndexes (page 481) must be either true or false for both members.
Beginning in version 2.2, secondaries avoid syncing from delayed members (page 387) and hidden members
(page 387).
Consistency and Durability
In a replica set, only the primary can accept write operations. Writing only to the primary provides strict consistency
among members.
Journaling provides single-instance write durability. Without journaling, if a MongoDB instance terminates ungracefully, you should assume that the database is in a corrupt or inconsistent state.
Multithreaded Replication
MongoDB applies write operations in batches using multiple threads to improve concurrency. MongoDB groups
batches by namespace and applies operations using a group of threads, but always applies the write operations to a
namespace in order.
412
Chapter 8. Replication
While applying a batch, MongoDB blocks all reads. As a result, secondaries can never return data that reflects a state
that never existed on the primary.
Pre-Fetching Indexes to Improve Replication Throughput
To help improve the performance of applying oplog entries, MongoDB fetches memory pages that hold affected data
and indexes. This pre-fetch stage minimizes the amount of time MongoDB holds the write lock while applying oplog
entries. By default, secondaries will pre-fetch all Indexes (page 313).
Optionally, you can disable all pre-fetching or only pre-fetch the index on the _id field.
replIndexPrefetch (page 1000) setting for more information.
See the
To configure a master-slave deployment, start two mongod (page 925) instances: one in master (page 1000) mode,
and the other in slave (page 1001) mode.
To start a mongod (page 925) instance in master (page 1000) mode, invoke mongod (page 925) as follows:
mongod --master --dbpath /data/masterdb/
With the --master option, the mongod (page 925) will create a local.oplog.$main (page 486) collection,
which the operation log that queues operations that the slaves will apply to replicate operations from the master. The
--dbpath is optional.
To start a mongod (page 925) instance in slave (page 1001) mode, invoke mongod (page 925) as follows:
mongod --slave --source <masterhostname><:<port>> --dbpath /data/slavedb/
Specify the hostname and port of the master instance to the --source argument. The --dbpath is optional.
For slave (page 1001) instances, MongoDB stores data about the source server in the local.sources (page 486)
collection.
Configuration Options for Master-Slave Deployments
As an alternative to specifying the --source run-time option, can add a document to local.sources (page 486)
specifying the master (page 1000) instance, as in the following operation in the mongo (page 942) shell:
413
1
2
3
use local
db.sources.find()
db.sources.insert( { host: <masterhostname> <,only: databasename> } );
In line 1, you switch context to the local database. In line 2, the find() (page 816) operation should return no documents, to ensure that there are no documents in the sources collection. Finally, line 3 uses
db.collection.insert() (page 832) to insert the source document into the local.sources (page 486)
collection. The model of the local.sources (page 486) document is as follows:
host
The host field specifies the master (page 1000)mongod (page 925) instance, and holds a resolvable hostname,
i.e. IP address, or a name from a host file, or preferably a fully qualified domain name.
You can append <:port> to the host name if the mongod (page 925) is not running on the default 27017
port.
only
Optional. Specify a name of a database. When specified, MongoDB will only replicate the indicated database.
Operational Considerations for Replication with Master Slave Deployments
Master instances store operations in an oplog which is a capped collection (page 156). As a result, if a slave falls too
far behind the state of the master, it cannot catchup and must re-sync from scratch. Slave may become out of sync
with a master if:
The slave falls far behind the data updates available from that master.
The slave stops (i.e. shuts down) and restarts later after the master has overwritten the relevant operations from
the master.
When slaves, are out of sync, replication stops. Administrators must intervene manually to restart replication. Use the
resync (page 731) command. Alternatively, the --autoresync allows a slave to restart replication automatically,
after ten second pause, when the slave falls out of sync with the master. With --autoresync specified, the slave
will only attempt to re-sync once in a ten minute period.
To prevent these situations you should specify a larger oplog when you start the master (page 1000) instance, by
adding the --oplogSize option when starting mongod (page 925). If you do not specify --oplogSize, mongod
(page 925) will allocate 5% of available disk space on start up to the oplog, with a minimum of 1GB for 64bit machines
and 50MB for 32bit machines.
Run time Master-Slave Configuration
MongoDB provides a number of run time configuration options for mongod (page 925) instances in master-slave deployments. You can specify these options in configuration files (page 145) or on the command-line. See documentation
of the following:
For master nodes:
master (page 1000)
slave (page 1001)
For slave nodes:
source (page 1001)
only (page 1001)
slaveDelay (page 1001)
414
Chapter 8. Replication
Also consider the Master-Slave Replication Command Line Options (page 934) for related options.
Diagnostics
On a master instance, issue the following operation in the mongo (page 942) shell to return replication status from the
perspective of the master:
db.printReplicationInfo()
On a slave instance, use the following operation in the mongo (page 942) shell to return the replication status from
the perspective of the slave:
db.printSlaveReplicationInfo()
Use the serverStatus (page 782) as in the following operation, to return status of the replication:
db.serverStatus()
See server status repl fields (page 790) for documentation of the relevant section of output.
Security
When running with auth (page 993) enabled, in master-slave deployments configure a keyFile (page 993) so that
slave mongod (page 925) instances can authenticate and communicate with the master mongod (page 925) instance.
To enable authentication and configure the keyFile (page 993) add the following option to your configuration file:
keyFile = /srv/mongodb/keyfile
Note: You may chose to set these run-time configuration options using the --keyFile option on the command line.
Setting keyFile (page 993) enables authentication and specifies a key file for the mongod (page 925) instances to
use when authenticating to each other. The content of the key file is arbitrary but must be the same on all members of
the deployment can connect to each other.
The key file must be less one kilobyte in size and may only contain characters in the base64 set. The key file must not
have group or world permissions on UNIX systems. Use the following command to use the OpenSSL package to
generate random content for use in a key file:
openssl rand -base64 741
See also:
Security (page 235) for more information about security in MongoDB
Ongoing Administration and Operation of Master-Slave Deployments
Deploy Master-Slave Equivalent using Replica Sets
If you want a replication configuration that resembles master-slave replication, using replica sets replica sets, consider the following replica configuration document. In this deployment hosts <master> and <slave> 9 provide
replication that is roughly equivalent to a two-instance master-slave deployment:
9
In replica set configurations, the host (page 480) field must hold a resolvable hostname.
415
{
_id : 'setName',
members : [
{ _id : 0, host : "<master>", priority : 1 },
{ _id : 1, host : "<slave>", priority : 0, votes : 0 }
]
}
See Replica Set Configuration (page 479) for more information about replica set configurations.
Convert a Master-Slave Deployment to a Replica Set
To convert a master-slave deployment to a replica set, restart the current master as a one-member replica set. Then
remove the data directors from previous secondaries and add them as new secondaries to the new replica set.
1. To confirm that the current instance is master, run:
db.isMaster()
2. Shut down the mongod (page 925) processes on the master and all slave(s), using the following command while
connected to each instance:
db.adminCommand({shutdown : 1, force : true})
3. Back up your /data/db directories, in case you need to revert to the master-slave deployment.
4. Start the former master with the --replSet option, as in the following:
mongod --replSet <setname>
5. Connect to the mongod (page 925) with the mongo (page 942) shell, and initiate the replica set with the
following command:
rs.initiate()
When the command returns, you will have successfully deployed a one-member replica set. You can check the
status of your replica set at any time by running the following command:
rs.status()
You can now follow the convert a standalone to a replica set (page 432) tutorial to deploy your replica set, picking up
from the Expand the Replica Set (page 433) section.
Failing over to a Slave (Promotion)
To permanently failover from a unavailable or damaged master (A in the following example) to a slave (B):
1. Shut down A.
416
Chapter 8. Replication
Removing local.* is irrevocable and cannot be undone. Perform this step with extreme
If you have a master (A) and a slave (B) and you would like to reverse their roles, follow this procedure. The procedure
assumes A is healthy, up-to-date and available.
If A is not healthy but the hardware is okay (power outage, server crash, etc.), skip steps 1 and 2 and in step 8 replace
all of As files with Bs files in step 8.
If A is not healthy and the hardware is not okay, replace A with a new machine. Also follow the instructions in the
previous paragraph.
To invert the master and slave in a deployment:
1. Halt writes on A using the fsync command.
2. Make sure B is up to date with the state of A.
3. Shut down B.
4. Back up and move all data files that begin with local on B from the dbpath (page 993) to remove the existing
local.sources data.
Warning:
caution.
Removing local.* is irrevocable and cannot be undone. Perform this step with extreme
If you can stop write operations to the master for an indefinite period, you can copy the data files from the master to
the new slave and then start the slave with --fastsync.
Warning:
forever.
Be careful with --fastsync. If the data on both instances is identical, a discrepancy will exist
417
fastsync (page 1000) is a way to start a slave by starting with an existing master disk image/backup. This option
declares that the administrator guarantees the image is correct and completely up-to-date with that of the master. If
you have a full and complete copy of data from a master you can use this option to avoid a full synchronization upon
starting the slave.
Creating a Slave from an Existing Slaves Disk Image
You can just copy the other slaves data file snapshot without any special options. Only take data snapshots when a
mongod (page 925) process is down or locked using db.fsyncLock() (page 885).
Resyncing a Slave that is too Stale to Recover
Slaves asynchronously apply write operations from the master that the slaves poll from the masters oplog. The oplog
is finite in length, and if a slave is too far behind, a full resync will be necessary. To resync the slave, connect to a
slave using the mongo (page 942) and issue the resync (page 731) command:
use admin
db.runCommand( { resync: 1 } )
This forces a full resync of all data (which will be very slow on a large database). You can achieve the same effect
by stopping mongod (page 925) on the slave, deleting the entire content of the dbpath (page 993) on the slave, and
restarting the mongod (page 925).
Slave Chaining
Slaves cannot be chained. They must all connect to the master directly.
If a slave attempts slave from another slave you will see the following line in the mongod (page 925) long of the
shell:
assertion 13051 tailable cursor requested on non capped collection ns:local.oplog.$main
To change a slaves source, manually modify the slaves local.sources (page 486) collection.
Example
Consider the following: If you accidentally set an incorrect hostname for the slaves source (page 1001), as in the
following example:
mongod --slave --source prod.mississippi
You can correct this, by restarting the slave without the --slave and --source arguments:
mongod
Connect to this mongod (page 925) instance using the mongo (page 942) shell and update the local.sources
(page 486) collection, with the following operation sequence:
use local
db.sources.update( { host : "prod.mississippi" },
{ $set : { host : "prod.mississippi.example.net" } } )
418
Chapter 8. Replication
Restart the slave with the correct command line arguments or with no --source option. After configuring
local.sources (page 486) the first time, the --source will have no subsequent effect. Therefore, both of
the following invocations are correct:
mongod --slave --source prod.mississippi.example.net
or
mongod --slave
419
Troubleshoot Replica Sets (page 462) Describes common issues and operational challenges for replica sets. For additional diagnostic information, see FAQ: MongoDB Diagnostics (page 616).
Three member replica sets provide enough redundancy to survive most network partitions and other system failures.
These sets also have sufficient capacity for many distributed read operations. Replica sets should always have an odd
number of members. This ensures that elections (page 397) will proceed smoothly. For more about designing replica
sets, see the Replication overview (page 377).
The basic procedure is to start the mongod (page 925) instances that will become members of the replica set, configure
the replica set itself, and then add the mongod (page 925) instances to it.
Requirements
For production deployments, you should maintain as much separation between members as possible by hosting the
mongod (page 925) instances on separate machines. When using virtual machines for production deployments, you
should place each mongod (page 925) instance on a separate host server serviced by redundant power circuits and
redundant network paths.
Before you can deploy a replica set, you must install MongoDB on each system that will be part of your replica set. If
you have not already installed MongoDB, see the installation tutorials (page 3).
420
Chapter 8. Replication
Before creating your replica set, you should verify that your network configuration allows all possible connections
between each member. For a successful replica set deployment, every member must be able to connect to every other
member. For instructions on how to check your connection, see Test Connections Between all Members (page 464).
Procedure
Each member of the replica set resides on its own machine and all of the MongoDB processes bind to port
27017 (the standard MongoDB port).
Each member of the replica set must be accessible by way of resolvable DNS or hostnames, as in the following
scheme:
mongodb0.example.net
mongodb1.example.net
mongodb2.example.net
mongodbn.example.net
You will need to either configure your DNS names appropriately, or set up your systems /etc/hosts file to
reflect this configuration.
Ensure that network traffic can pass between all members in the network securely and efficiently. Consider the
following:
Establish a virtual private network. Ensure that your network topology routes all traffic between members
within a single site over the local area network.
Configure authentication using auth (page 993) and keyFile (page 993), so that only servers and
processes with authentication can connect to the replica set.
Configure networking and firewall rules so that only traffic (incoming and outgoing packets) on the default
MongoDB port (e.g. 27017) from within your deployment is permitted.
For more information on security and firewalls, see Inter-Process Authentication (page 238).
You must specify the run time configuration on each system in a configuration file (page 990) stored in
/etc/mongodb.conf or a related location. Do not specify the sets configuration in the mongo (page 942)
shell.
Use the following configuration for each of your MongoDB instances. You should set values that are appropriate
for your systems, as needed:
port = 27017
bind_ip = 10.8.0.10
dbpath = /srv/mongodb/
fork = true
replSet = rs0
The dbpath (page 993) indicates where you want mongod (page 925) to store data files. The dbpath
(page 993) must exist before you start mongod (page 925). If it does not exist, create the directory and ensure
mongod (page 925) has permission to read and write data to this path. For more information on permissions,
see the security operations documentation (page 236).
Modifying bind_ip (page 991) ensures that mongod (page 925) will only listen for connections from applications on the configured address.
421
For more information about the run time options used above and other configuration options, see Configuration
File Options (page 990).
To deploy a production replica set:
1. Start a mongod (page 925) instance on each system that will be part of your replica set. Specify the same
replica set name on each instance. For additional mongod (page 925) configuration options specific to replica
sets, see Replication Options (page 933).
Important: If your application connects to more than one replica set, each set should have a distinct name.
Some drivers group replica set connections by replica set name.
If you use a configuration file, then start each mongod (page 925) instance with a command similar to following,
where mongodb.conf is the configuration file:
mongod --config /etc/mongodb.conf
Note: You will likely want to use and configure a control script to manage this process in production deployments. Control scripts are beyond the scope of this document.
2. Open a mongo (page 942) shell connected to one of the hosts by issuing the following command:
mongo
3. Use rs.initiate() (page 897) to initiate a replica set consisting of the current member and using the default
configuration, as follows:
rs.initiate()
1. In the mongo (page 942) shell connected to the primary, add the remaining members to the replica set
using rs.add() (page 895) in the mongo (page 942) shell on the current primary (in this example,
mongodb0.example.net). The commands should resemble the following:
rs.add("mongodb1.example.net")
rs.add("mongodb2.example.net")
When complete, you should have a fully functional replica set. The new replica set will elect a primary.
Check the status of your replica set at any time with the rs.status() (page 898) operation.
See also:
The documentation of the following shell functions for more information:
422
Chapter 8. Replication
Three member replica sets provide enough redundancy to survive most network partitions and other system failures.
These sets also have sufficient capacity for many distributed read operations. Replica sets should always have an odd
number of members. This ensures that elections (page 397) will proceed smoothly. For more about designing replica
sets, see the Replication overview (page 377).
The basic procedure is to start the mongod (page 925) instances that will become members of the replica set, configure
the replica set itself, and then add the mongod (page 925) instances to it.
Requirements
For test and development systems, you can run your mongod (page 925) instances on a local system, or within a
virtual instance.
Before you can deploy a replica set, you must install MongoDB on each system that will be part of your replica set. If
you have not already installed MongoDB, see the installation tutorials (page 3).
Before creating your replica set, you should verify that your network configuration allows all possible connections
between each member. For a successful replica set deployment, every member must be able to connect to every other
member. For instructions on how to check your connection, see Test Connections Between all Members (page 464).
Procedure
Important: These instructions should only be used for test or development deployments.
The examples in this procedure create a new replica set named rs0.
Important: If your application connects to more than one replica set, each set should have a distinct
name. Some drivers group replica set connections by replica set name.
423
You will begin by starting three mongod (page 925) instances as members of a replica set named rs0.
1. Create the necessary data directories for each member by issuing a command similar to the following:
mkdir -p /srv/mongodb/rs0-0 /srv/mongodb/rs0-1 /srv/mongodb/rs0-2
This will create directories called rs0-0, rs0-1, and rs0-2, which will contain the instances database files.
2. Start your mongod (page 925) instances in their own shell windows by issuing the following commands:
First member:
mongod --port 27017 --dbpath /srv/mongodb/rs0-0 --replSet rs0 --smallfiles --oplogSize 128
Second member:
mongod --port 27018 --dbpath /srv/mongodb/rs0-1 --replSet rs0 --smallfiles --oplogSize 128
Third member:
mongod --port 27019 --dbpath /srv/mongodb/rs0-2 --replSet rs0 --smallfiles --oplogSize 128
This starts each instance as a member of a replica set named rs0, each running on a distinct port, and specifies
the path to your data directory with the --dbpath setting. If you are already using the suggested ports, select
different ports.
The --smallfiles and --oplogSize settings reduce the disk space that each mongod (page 925) instance uses. This is ideal for testing and development deployments as it prevents overloading your machine. For
more information on these and other configuration options, see Configuration File Options (page 990).
3. Connect to one of your mongod (page 925) instances through the mongo (page 942) shell. You will need to
indicate which instance by specifying its port number. For the sake of simplicity and clarity, you may want to
choose the first one, as in the following command;
mongo --port 27017
4. In the mongo (page 942) shell, use rs.initiate() (page 897) to initiate the replica set. You can create a
replica set configuration object in the mongo (page 942) shell environment, as in the following example:
rsconf = {
_id: "rs0",
members: [
{
_id: 0,
host: "<hostname>:27017"
}
]
}
replacing <hostname> with your systems hostname, and then pass the rsconf file to rs.initiate()
(page 897) as follows:
rs.initiate( rsconf )
5. Display the current replica configuration (page 479) by issuing the following command:
rs.conf()
424
Chapter 8. Replication
"version" : 4,
"members" : [
{
"_id" : 1,
"host" : "localhost:27017"
}
]
}
6. In the mongo (page 942) shell connected to the primary, add the second and third mongod (page 925) instances
to the replica set using the rs.add() (page 895) method. Replace <hostname> with your systems hostname
in the following examples:
rs.add("<hostname>:27018")
rs.add("<hostname>:27019")
When complete, you should have a fully functional replica set. The new replica set will elect a primary.
Check the status of your replica set at any time with the rs.status() (page 898) operation.
See also:
The documentation of the following shell functions for more information:
rs.initiate() (page 897)
rs.conf() (page 896)
rs.reconfig() (page 897)
rs.add() (page 895)
You may also consider the simple setup script10 as an example of a basic automatically-configured replica set.
Refer to Replica Set Read and Write Semantics (page 402) for a detailed explanation of read and write semantics in
MongoDB.
Deploy a Geographically Distributed Replica Set
This tutorial outlines the process for deploying a replica set with members in multiple locations. The tutorial addresses
three-member sets, four-member sets, and sets with more than four members.
For appropriate background, see Replication (page 377) and Replica Set Deployment Architectures (page 390). For
related tutorials, see Deploy a Replica Set (page 420) and Add Members to a Replica Set (page 433).
Overview
While replica sets provide basic protection against single-instance failure, replica sets whose members are all located
in a single facility are susceptible to errors in that facility. Power outages, network interruptions, and natural disasters
are all issues that can affect replica sets whose members are colocated. To protect against these classes of failures,
deploy a replica set with one or more members in a geographically distinct facility or data center.
Requirements
In general, the requirements for any geographically distributed replica set are as follows:
10 https://github.com/mongodb/mongo-snippets/blob/master/replication/simple-setup.py
425
Ensure that a majority of the voting members (page 400) are within a primary facility, Site A. This includes
priority 0 members (page 386) and arbiters (page 389). Other members can be distributed in secondary facilities,
Site B, Site C, etc. For more information on the need to keep the voting majority on one site, see Replica
Set Elections (page 397).
If you deploy a replica set with an even number of members, deploy an arbiter (page 389) on Site A. The arbiter
must be on site A to keep the majority there.
For instance, for a three-member replica set you need two instances in a Site A, and one member in a secondary facility,
Site B. Site A should be the same facility or very close to your primary application infrastructure (i.e. application
servers, caching layer, users, etc.)
A four-member replica set should have at least two members in Site A, with the remaining members in one or more
secondary sites, as well as a single arbiter in Site A.
For all configurations in this tutorial, deploy each replica set member on a separate system. Although you may deploy
more than one replica set member on a single system, doing so reduces the redundancy and capacity of the replica set.
Such deployments are typically for testing purposes and beyond the scope of this tutorial.
This tutorial assumes you have installed MongoDB on each system that will be part of your replica set. If you have
not already installed MongoDB, see the installation tutorials (page 3).
Procedures
General Considerations
Each member of the replica set resides on its own machine and all of the MongoDB processes bind to port
27017 (the standard MongoDB port).
Each member of the replica set must be accessible by way of resolvable DNS or hostnames, as in the following
scheme:
mongodb0.example.net
mongodb1.example.net
mongodb2.example.net
mongodbn.example.net
You will need to either configure your DNS names appropriately, or set up your systems /etc/hosts file to
reflect this configuration.
Ensure that network traffic can pass between all members in the network securely and efficiently. Consider the
following:
Establish a virtual private network. Ensure that your network topology routes all traffic between members
within a single site over the local area network.
Configure authentication using auth (page 993) and keyFile (page 993), so that only servers and
processes with authentication can connect to the replica set.
Configure networking and firewall rules so that only traffic (incoming and outgoing packets) on the default
MongoDB port (e.g. 27017) from within your deployment is permitted.
For more information on security and firewalls, see Inter-Process Authentication (page 238).
You must specify the run time configuration on each system in a configuration file (page 990) stored in
/etc/mongodb.conf or a related location. Do not specify the sets configuration in the mongo (page 942)
shell.
Use the following configuration for each of your MongoDB instances. You should set values that are appropriate
for your systems, as needed:
426
Chapter 8. Replication
port = 27017
bind_ip = 10.8.0.10
dbpath = /srv/mongodb/
fork = true
replSet = rs0
The dbpath (page 993) indicates where you want mongod (page 925) to store data files. The dbpath
(page 993) must exist before you start mongod (page 925). If it does not exist, create the directory and ensure
mongod (page 925) has permission to read and write data to this path. For more information on permissions,
see the security operations documentation (page 236).
Modifying bind_ip (page 991) ensures that mongod (page 925) will only listen for connections from applications on the configured address.
For more information about the run time options used above and other configuration options, see Configuration
File Options (page 990).
Figure 8.25: Diagram of a 3 member replica set distributed across two data centers. Replica set includes a priority 0
member.
Deploy a Distributed Three-Member Replica Set
1. Start a mongod (page 925) instance on each system that will be part of your replica set. Specify the same
replica set name on each instance. For additional mongod (page 925) configuration options specific to replica
sets, see Replication Options (page 933).
Important: If your application connects to more than one replica set, each set should have a distinct name.
Some drivers group replica set connections by replica set name.
If you use a configuration file, then start each mongod (page 925) instance with a command similar to following,
where mongodb.conf is the configuration file:
mongod --config /etc/mongodb.conf
Note: You will likely want to use and configure a control script to manage this process in production deploy-
427
3. Use rs.initiate() (page 897) to initiate a replica set consisting of the current member and using the default
configuration, as follows:
rs.initiate()
1. In the mongo (page 942) shell connected to the primary, add the remaining members to the replica set
using rs.add() (page 895) in the mongo (page 942) shell on the current primary (in this example,
mongodb0.example.net). The commands should resemble the following:
rs.add("mongodb1.example.net")
rs.add("mongodb2.example.net")
When complete, you should have a fully functional replica set. The new replica set will elect a primary.
6. Make sure that you have configured the member located
mongodb2.example.net) as a priority 0 member (page 386):
in
Site
(in
this
example,
(a) Issue the following command to determine the members (page 480) array position for the member:
rs.conf()
(b) In the members (page 480) array, save the position of the member whose priority you wish to change.
The example in the next step assumes this value is 2, for the third item in the list. You must record array
position, not _id, as these ordinals will be different if you remove a member.
(c) In the mongo (page 942) shell connected to the replica sets primary, issue a command sequence similar
to the following:
cfg = rs.conf()
cfg.members[2].priority = 0
rs.reconfig(cfg)
When the operations return, mongodb2.example.net has a priority of 0. It cannot become primary.
Note: The rs.reconfig() (page 897) shell method can force the current primary to step down, causing an election. When the primary steps down, all clients will disconnect. This is the intended behavior.
While most elections complete within a minute, always make sure any replica configuration changes occur
during scheduled maintenance periods.
428
Chapter 8. Replication
After these commands return, you have a geographically distributed three-member replica set.
Check the status of your replica set at any time with the rs.status() (page 898) operation.
See also:
The documentation of the following shell functions for more information:
rs.initiate() (page 897)
rs.conf() (page 896)
rs.reconfig() (page 897)
rs.add() (page 895)
Refer to Replica Set Read and Write Semantics (page 402) for a detailed explanation of read and write semantics in
MongoDB.
Deploy a Distributed Four-Member Replica Set A geographically distributed four-member deployment has two
additional considerations:
One host (e.g. mongodb3.example.net) must be an arbiter. This host can run on a system that is also used
for an application server or on the same machine as another MongoDB process.
You must decide how to distribute your systems. There are three possible architectures for the four-member
replica set:
Three members in Site A, one priority 0 member (page 386) in Site B, and an arbiter in Site A.
Two members in Site A, two priority 0 members (page 386) in Site B, and an arbiter in Site A.
Two members in Site A, one priority 0 member in Site B, one priority 0 member in Site C, and an arbiter
in site A.
In most cases, the first architecture is preferable because it is the least complex.
To deploy a geographically distributed four-member set:
1. Start a mongod (page 925) instance on each system that will be part of your replica set. Specify the same
replica set name on each instance. For additional mongod (page 925) configuration options specific to replica
sets, see Replication Options (page 933).
Important: If your application connects to more than one replica set, each set should have a distinct name.
Some drivers group replica set connections by replica set name.
If you use a configuration file, then start each mongod (page 925) instance with a command similar to following,
where mongodb.conf is the configuration file:
mongod --config /etc/mongodb.conf
Note: You will likely want to use and configure a control script to manage this process in production deployments. Control scripts are beyond the scope of this document.
2. Open a mongo (page 942) shell connected to one of the hosts by issuing the following command:
mongo
429
3. Use rs.initiate() (page 897) to initiate a replica set consisting of the current member and using the default
configuration, as follows:
rs.initiate()
5. Add the remaining members to the replica set using rs.add() (page 895) in a mongo (page 942) shell
connected to the current primary. The commands should resemble the following:
rs.add("mongodb1.example.net")
rs.add("mongodb2.example.net")
rs.add("mongodb3.example.net")
When complete, you should have a fully functional replica set. The new replica set will elect a primary.
6. In the same shell session, issue the following command to add the arbiter (e.g. mongodb4.example.net):
rs.addArb("mongodb4.example.net")
7. Make sure that you have configured each member located outside of Site A (e.g. mongodb3.example.net)
as a priority 0 member (page 386):
(a) Issue the following command to determine the members (page 480) array position for the member:
rs.conf()
(b) In the members (page 480) array, save the position of the member whose priority you wish to change.
The example in the next step assumes this value is 2, for the third item in the list. You must record array
position, not _id, as these ordinals will be different if you remove a member.
(c) In the mongo (page 942) shell connected to the replica sets primary, issue a command sequence similar
to the following:
cfg = rs.conf()
cfg.members[2].priority = 0
rs.reconfig(cfg)
When the operations return, mongodb2.example.net has a priority of 0. It cannot become primary.
Note: The rs.reconfig() (page 897) shell method can force the current primary to step down, causing an election. When the primary steps down, all clients will disconnect. This is the intended behavior.
While most elections complete within a minute, always make sure any replica configuration changes occur
during scheduled maintenance periods.
After these commands return, you have a geographically distributed four-member replica set.
430
Chapter 8. Replication
Check the status of your replica set at any time with the rs.status() (page 898) operation.
See also:
The documentation of the following shell functions for more information:
rs.initiate() (page 897)
rs.conf() (page 896)
rs.reconfig() (page 897)
rs.add() (page 895)
Refer to Replica Set Read and Write Semantics (page 402) for a detailed explanation of read and write semantics in
MongoDB.
Figure 8.26: Diagram of a 5 member replica set distributed across three data centers. Replica set includes a priority 0
member.
Deploy a Distributed Set with More than Four Members The above procedures detail the steps necessary for
deploying a geographically distributed replica set. Larger replica set deployments follow the same steps, but have
additional considerations:
Never deploy more than seven voting members.
If you have an even number of members, use the procedure for a four-member set (page 429)). Ensure that
the a single facility, Site A, always has a majority of the members by deploying the arbiter in that site. For
example, if a set has six members, deploy at least three voting members in addition to the arbiter in Site A, and
the remaining members in alternate sites.
If you have an odd number of members, use the procedure for a three-member set (page 427). Ensure that a
single facility, Site A always has a majority of the members of the set. For example, if a set has five members,
deploy three members within Site A and two members in other facilities.
If you have a majority of the members of the set outside of Site A and the network partitions to prevent communication between sites, the current primary in Site A will step down, even if none of the members outside of
Site A are eligible to become primary.
431
Add an Arbiter
1. Create a data directory (e.g. dbpath (page 993)) for the arbiter. The mongod (page 925) instance uses the
directory for configuration data. The directory will not hold the data set. For example, create the /data/arb
directory:
mkdir /data/arb
2. Start the arbiter. Specify the data directory and the replica set name. The following, starts an arbiter using the
/data/arb dbpath (page 993) for the rs replica set:
mongod --port 30000 --dbpath /data/arb --replSet rs
3. Connect to the primary and add the arbiter to the replica set. Use the rs.addArb() (page 896) method, as in
the following example:
rs.addArb("m1.example.net:30000")
This operation adds the arbiter running on port 30000 on the m1.example.net host.
Convert a Standalone to a Replica Set
432
Chapter 8. Replication
Important: If your application connects to more than one replica set, each set should have a distinct name.
Some drivers group replica set connections by replica set name.
For more information on configuration options, see Configuration File Options (page 990) and the mongod
(page 925) manual page.
3. Connect to the mongod (page 925) instance.
4. Use rs.initiate() (page 897) to initiate the new replica set:
rs.initiate()
Replace <hostname> and <port> with the resolvable hostname and port of the mongod (page 925) instance
to add to the set. For more information on adding a host to a replica set, see Add Members to a Replica Set
(page 433).
Sharding Considerations If the new replica set is part of a sharded cluster, change the shard host information in
the config database by doing the following:
1. Connect to one of the sharded clusters mongos (page 938) instances and issue a command in the following
form:
Replace <name> with the name of the shard. Replace <replica-set> with the name of the replica set.
Replace <member,><member,><> with the list of the members of the replica set.
2. Restart all mongos (page 938) instances. If possible, restart all components of the replica sets (i.e., all mongos
(page 938) and all shard mongod (page 925) instances).
Add Members to a Replica Set
Overview
This tutorial explains how to add an additional member to an existing replica set. For background on replication
deployment patterns, see the Replica Set Deployment Architectures (page 390) document.
Maximum Voting Members A replica set can have a maximum of seven voting members (page 397). To add
a member to a replica set that already has seven votes, you must either add the member as a non-voting member
(page 400) or remove a vote from an existing member (page 482).
433
Control Scripts In production deployments you can configure a control script to manage member processes.
Existing Members You can use these procedures to add new members to an existing set. You can also use the same
procedure to re-add a removed member. If the removed members data is still relatively recent, it can recover and
catch up easily.
Data Files If you have a backup or snapshot of an existing member, you can move the data files (e.g. the dbpath
(page 993) directory) to a new system and use them to quickly initiate a new member. The files must be:
A consistent copy of the database from a member of the same replica set. See Backup and Restore with Filesystem Snapshots (page 184) document for more information.
Important: Always use filesystem snapshots to create a copy a member of the existing replica set. Do not use
mongodump (page 951) and mongorestore (page 956) to seed a new replica set member.
More recent than the oldest operation in the primarys oplog. The new member must be able to become current
by applying operations from the primarys oplog.
Requirements
Prepare the Data Directory Before adding a new member to an existing replica set, prepare the new members data
directory using one of the following strategies:
Make sure the new members data directory does not contain data. The new member will copy the data from an
existing member.
If the new member is in a recovering state, it must exit and become a secondary before MongoDB can copy all
data as part of the replication process. This process takes time but does not require administrator intervention.
Manually copy the data directory from an existing member. The new member becomes a secondary member
and will catch up to the current state of the replica set. Copying the data over may shorten the amount of time
for the new member to become current.
Ensure that you can copy the data directory to the new member and begin replication within the window allowed
by the oplog (page 411). Otherwise, the new instance will have to perform an initial sync, which completely
resynchronizes the data, as described in Resync a Member of a Replica Set (page 450).
Use db.printReplicationInfo() (page 891) to check the current state of replica set members with
regards to the oplog.
For background on replication deployment patterns, see the Replica Set Deployment Architectures (page 390) document.
434
Chapter 8. Replication
Take note of the host name and port information for the new mongod (page 925) instance.
For more information on configuration options, see the mongod (page 925) manual page.
Optional
You can specify the data directory and replica set in the mongo.conf configuration file (page 990), and start
the mongod (page 925) with the following command:
mongod --config /etc/mongodb.conf
4. Verify that the member is now part of the replica set. Call the rs.conf() (page 896) method, which displays
the replica set configuration (page 479):
rs.conf()
To view replica set status, issue the rs.status() (page 898) method. For a description of the status fields,
see replSetGetStatus (page 726).
Configure and Add a Member You can add a member to a replica set by passing to the rs.add()
(page 895) method a members (page 480) document.
The document must be in the form of a
local.system.replset.members (page 480) document. These documents define a replica set member in
the same form as the replica set configuration document (page 480).
Important: Specify a value for the _id field of the members (page 480) document. MongoDB does not automatically populate the _id field in this case. Finally, the members (page 480) document must declare the host value.
All other fields are optional.
Example
To add a member with the following configuration:
an _id of 1.
a hostname and port number (page 480) of mongodb3.example.net:27017.
a priority (page 481) value within the replica set of 0.
a configuration as hidden (page 481),
435
1. Shut down the mongod (page 925) instance for the member you wish to remove. To shut down the instance,
connect using the mongo (page 942) shell and the db.shutdownServer() (page 894) method.
2. Connect to the replica sets current primary. To determine the current primary, use db.isMaster()
(page 890) while connected to any member of the replica set.
3. Use rs.remove() (page 898) in either of the following forms to remove the member:
rs.remove("mongod3.example.net:27017")
rs.remove("mongod3.example.net")
MongoDB disconnects the shell briefly as the replica set elects a new primary. The shell then automatically
reconnects. The shell displays a DBClientCursor::init call() failed error even though the command succeeds.
Remove a Member Using rs.reconfig()
To remove a member you can manually edit the replica set configuration document (page 479), as described here.
1. Shut down the mongod (page 925) instance for the member you wish to remove. To shut down the instance,
connect using the mongo (page 942) shell and the db.shutdownServer() (page 894) method.
2. Connect to the replica sets current primary. To determine the current primary, use db.isMaster()
(page 890) while connected to any member of the replica set.
3. Issue the rs.conf() (page 896) method to view the current configuration document and determine the position in the members array of the member to remove:
Example
mongod_C.example.net is in position 2 of the following configuration file:
{
"_id" : "rs",
"version" : 7,
"members" : [
{
"_id" : 0,
"host" : "mongod_A.example.net:27017"
},
{
"_id" : 1,
"host" : "mongod_B.example.net:27017"
},
{
"_id" : 2,
436
Chapter 8. Replication
"host" : "mongod_C.example.net:27017"
}
]
}
6. Overwrite the replica set configuration document with the new configuration by issuing the following:
rs.reconfig(cfg)
As a result of rs.reconfig() (page 897) the shell will disconnect while the replica set renegotiates which
member is primary. The shell displays a DBClientCursor::init call() failed error even though
the command succeeds, and will automatically reconnected.
7. To confirm the new configuration, issue rs.conf() (page 896).
For the example above the output would be:
{
"_id" : "rs",
"version" : 8,
"members" : [
{
"_id" : 0,
"host" : "mongod_A.example.net:27017"
},
{
"_id" : 1,
"host" : "mongod_B.example.net:27017"
}
]
}
To change the hostname for a replica set member modify the host (page 480) field. The value of _id (page 480)
field will not change when you reconfigure the set.
See Replica Set Configuration (page 479) and rs.reconfig() (page 897) for more information.
437
Note: Any replica set configuration change can trigger the current primary to step down, which forces an election
(page 397). During the election, the current shell session and clients connected to this replica set disconnect, which
produces an error even when the operation succeeds.
Example
To change the hostname to mongo2.example.net for the replica set member configured at members[0], issue
the following sequence of commands:
cfg = rs.conf()
cfg.members[0].host = "mongo2.example.net"
rs.reconfig(cfg)
The first operation uses rs.conf() (page 896) to set the local variable cfg to the contents of the current replica
set configuration, which is a document. The next three operations change the priority (page 481) value in the
cfg document for the first three members configured in the members (page 480) array. The final operation calls
rs.reconfig() (page 897) with the argument of cfg to initialize the new configuration.
When updating the replica configuration object, access the replica set members in the members (page 480) array with
the array index. The array index begins with 0. Do not confuse this index value with the value of the _id (page 480)
field in each document in the members (page 480) array.
438
Chapter 8. Replication
If a member has priority (page 481) set to 0, it is ineligible to become primary and will not seek election. Hidden
members (page 387), delayed members (page 387), and arbiters (page ??) all have priority (page 481) set to 0.
All members have a priority (page 481) equal to 1 by default.
The value of priority (page 481) can be any floating point (i.e. decimal) number between 0 and 1000. Priorities
are only used to determine the preference in election. The priority value is used only in relation to other members.
With the exception of members with a priority of 0, the absolute value of the priority (page 481) value is irrelevant.
Replica sets will preferentially elect and maintain the primary status of the member with the highest priority
(page 481) setting.
Warning: Replica set reconfiguration can force the current primary to step down, leading to an election for
primary in the replica set. Elections cause the current primary to close all open client connections.
Perform routine replica set reconfiguration during scheduled maintenance windows.
See also:
The Replica Reconfiguration Usage (page 483) example revolves around changing the priorities of the members
(page 480) of a replica set.
Prevent Secondary from Becoming Primary
To prevent a secondary member from ever becoming a primary in a failover, assign the secondary a priority of 0,
as described here. You can set this secondary-only mode for any member of the replica set, except the current
primary. For a detailed description of secondary-only members and their purposes, see Priority 0 Replica Set Members
(page 386).
To configure a member as secondary-only, set its priority (page 481) value to 0 in the members (page 480)
document in its replica set configuration. Any member with a priority (page 481) equal to 0 will never seek
election (page 397) and cannot become primary in any situation.
{
"_id" : <num>,
"host" : <hostname:port>,
"priority" : 0
}
MongoDB does not permit the current primary to have a priority of 0. To prevent the current primary from again
becoming a primary, you must first step down the current primary using rs.stepDown() (page 899), and then you
must reconfigure the replica set (page 483) with rs.conf() (page 896) and rs.reconfig() (page 897).
Example
As an example of modifying member priorities, assume a four-member replica set. Use the following sequence of
operations to modify member priorities in the mongo (page 942) shell connected to the primary. Identify each member
by its array index in the members (page 480) array:
cfg = rs.conf()
cfg.members[0].priority
cfg.members[1].priority
cfg.members[2].priority
cfg.members[3].priority
rs.reconfig(cfg)
=
=
=
=
2
1
0.5
0
The sequence of operations reconfigures the set with the following priority settings:
439
Related Documents
440
Chapter 8. Replication
Example
The following example hides the secondary member currently at the index 0 in the members (page 480) array. To
configure a hidden member, use the following sequence of operations in a mongo (page 942) shell connected to the
primary, specifying the member to configure by its array index in the members (page 480) array:
cfg = rs.conf()
cfg.members[0].priority = 0
cfg.members[0].hidden = true
rs.reconfig(cfg)
After re-configuring the set, this secondary member has a priority of 0 so that it cannot become primary and is hidden.
The other members in the set will not advertise the hidden member in the isMaster (page 732) or db.isMaster()
(page 890) output.
When updating the replica configuration object, access the replica set members in the members (page 480) array with
the array index. The array index begins with 0. Do not confuse this index value with the value of the _id (page 480)
field in each document in the members (page 480) array.
Warning:
The rs.reconfig() (page 897) shell method can force the current primary to step down, which causes
an election (page 397). When the primary steps down, the mongod (page 925) closes all client connections.
While this typically takes 10-20 seconds, try to make these changes during scheduled maintenance periods.
To successfully reconfigure a replica set, a majority of the members must be accessible. If your replica set
has an even number of members, add an arbiter (page 432) to ensure that members can quickly obtain a
majority of votes in an election for primary.
Changed in version 2.0: For sharded clusters running with replica sets before 2.0, if you reconfigured a member as
hidden, you had to restart mongos (page 938) to prevent queries from reaching the hidden member.
Related Documents
441
Example
The following example sets a 1-hour delay on a secondary member currently at the index 0 in the members (page 480)
array. To set the delay, issue the following sequence of operations in a mongo (page 942) shell connected to the
primary:
cfg = rs.conf()
cfg.members[0].priority = 0
cfg.members[0].hidden = true
cfg.members[0].slaveDelay = 3600
rs.reconfig(cfg)
After the replica set reconfigures, the delayed secondary member cannot become primary and is hidden from applications. The slaveDelay (page 482) value delays both replication and the members oplog by 3600 seconds (1
hour).
When updating the replica configuration object, access the replica set members in the members (page 480) array with
the array index. The array index begins with 0. Do not confuse this index value with the value of the _id (page 480)
field in each document in the members (page 480) array.
Warning:
The rs.reconfig() (page 897) shell method can force the current primary to step down, which causes
an election (page 397). When the primary steps down, the mongod (page 925) closes all client connections.
While this typically takes 10-20 seconds, try to make these changes during scheduled maintenance periods.
To successfully reconfigure a replica set, a majority of the members must be accessible. If your replica set
has an even number of members, add an arbiter (page 432) to ensure that members can quickly obtain a
majority of votes in an election for primary.
Related Documents
To disable the ability to vote in elections for the fourth, fifth, and sixth replica set members, use the following command
sequence in the mongo (page 942) shell connected to the primary. You identify each replica set member by its array
index in the members (page 480) array:
cfg = rs.conf()
cfg.members[3].votes = 0
cfg.members[4].votes = 0
442
Chapter 8. Replication
cfg.members[5].votes = 0
rs.reconfig(cfg)
This sequence gives 0 votes to the fourth, fifth, and sixth members of the set according to the order of the members
(page 480) array in the output of rs.conf() (page 896). This setting allows the set to elect these members as primary
but does not allow them to vote in elections. Place voting members so that your designated primary or primaries can
reach a majority of votes in the event of a network partition.
When updating the replica configuration object, access the replica set members in the members (page 480) array with
the array index. The array index begins with 0. Do not confuse this index value with the value of the _id (page 480)
field in each document in the members (page 480) array.
Warning:
The rs.reconfig() (page 897) shell method can force the current primary to step down, which causes
an election (page 397). When the primary steps down, the mongod (page 925) closes all client connections.
While this typically takes 10-20 seconds, try to make these changes during scheduled maintenance periods.
To successfully reconfigure a replica set, a majority of the members must be accessible. If your replica set
has an even number of members, add an arbiter (page 432) to ensure that members can quickly obtain a
majority of votes in an election for primary.
In general and when possible, all members should have only 1 vote. This prevents intermittent ties, deadlocks, or the
wrong members from becoming primary. Use priority (page 481) to control which members are more likely to
become primary.
Related Documents
Convert Secondary to Arbiter and Reuse the Port Number (page 444)
Convert Secondary to Arbiter Running on a New Port Number (page 444)
If you have a secondary in a replica set that no longer needs to hold data but that needs to remain in the set to ensure that
the set can elect a primary (page 397), you may convert the secondary to an arbiter (page ??) using either procedure
in this tutorial. Both procedures are operationally equivalent:
You may operate the arbiter on the same port as the former secondary. In this procedure, you must shut down
the secondary and remove its data before restarting and reconfiguring it as an arbiter.
For this procedure, see Convert Secondary to Arbiter and Reuse the Port Number (page 444).
Run the arbiter on a new port. In this procedure, you can reconfigure the server as an arbiter before shutting
down the instance running as a secondary.
For this procedure, see Convert Secondary to Arbiter Running on a New Port Number (page 444).
443
1. If your application is connecting directly to the secondary, modify the application so that MongoDB queries
dont reach the secondary.
2. Shut down the secondary.
3. Remove the secondary from the replica set by calling the rs.remove() (page 898) method. Perform this
operation while connected to the current primary in the mongo (page 942) shell:
rs.remove("<hostname><:port>")
4. Verify that the replica set no longer includes the secondary by calling the rs.conf() (page 896) method in
the mongo (page 942) shell:
rs.conf()
Optional
You may remove the data instead.
6. Create a new, empty data directory to point to when restarting the mongod (page 925) instance. You can reuse
the previous name. For example:
mkdir /data/db
7. Restart the mongod (page 925) instance for the secondary, specifying the port number, the empty data directory,
and the replica set. You can use the same port number you used before. Issue a command similar to the
following:
mongod --port 27021 --dbpath /data/db --replSet rs
8. In the mongo (page 942) shell convert the secondary to an arbiter using the rs.addArb() (page 896) method:
rs.addArb("<hostname><:port>")
9. Verify the arbiter belongs to the replica set by calling the rs.conf() (page 896) method in the mongo
(page 942) shell.
rs.conf()
1. If your application is connecting directly to the secondary or has a connection string referencing the secondary,
modify the application so that MongoDB queries dont reach the secondary.
2. Create a new, empty data directory to be used with the new port number. For example:
mkdir /data/db-temp
444
Chapter 8. Replication
3. Start a new mongod (page 925) instance on the new port number, specifying the new data directory and the
existing replica set. Issue a command similar to the following:
mongod --port 27021 --dbpath /data/db-temp --replSet rs
4. In the mongo (page 942) shell connected to the current primary, convert the new mongod (page 925) instance
to an arbiter using the rs.addArb() (page 896) method:
rs.addArb("<hostname><:port>")
5. Verify the arbiter has been added to the replica set by calling the rs.conf() (page 896) method in the mongo
(page 942) shell.
rs.conf()
8. Verify that the replica set no longer includes the old secondary by calling the rs.conf() (page 896) method
in the mongo (page 942) shell:
rs.conf()
Optional
You may remove the data instead.
445
Change Hostnames in a Replica Set (page 458) Update the replica set configuration to reflect changes in members
hostnames.
Configure a Secondarys Sync Target (page 462) Specify the member that a secondary member synchronizes from.
Change the Size of the Oplog
The oplog exists internally as a capped collection, so you cannot modify its size in the course of normal operations. In
most cases the default oplog size (page 411) is an acceptable size; however, in some situations you may need a larger
or smaller oplog. For example, you might need to change the oplog size if your applications perform large numbers of
multi-updates or deletes in short periods of time.
This tutorial describes how to resize the oplog. For a detailed explanation of oplog sizing, see Oplog Size (page 411).
For details how oplog size affects delayed members and affects replication lag, see Delayed Replica Set Members
(page 387).
Overview
To change the size of the oplog, you must perform maintenance on each member of the replica set in turn. The
procedure requires: stopping the mongod (page 925) instance and starting as a standalone instance, modifying the
oplog size, and restarting the member.
Important: Always start rolling replica set maintenance with the secondaries, and finish with the maintenance on
primary member.
Procedure
Restart this mongod (page 925) as a standalone instance running on a different port and without the --replSet
parameter. Use a command similar to the following:
mongod --port 37017 --dbpath /srv/mongodb
446
Chapter 8. Replication
Create a Backup of the Oplog (Optional) Optionally, backup the existing oplog on the standalone instance, as in
the following example:
mongodump --db local --collection 'oplog.rs' --port 37017
Recreate the Oplog with a New Size and a Seed Entry Save the last entry from the oplog. For example, connect
to the instance using the mongo (page 942) shell, and enter the following command to switch to the local database:
use local
In mongo (page 942) shell scripts you can use the following operation to set the db object:
db = db.getSiblingDB('local')
Use the db.collection.save() (page 846) method and a sort on reverse natural order to find the last entry and
save it to a temporary collection:
db.temp.save( db.oplog.rs.find( { }, { ts: 1, h: 1 } ).sort( {$natural : -1} ).limit(1).next() )
Remove the Existing Oplog Collection Drop the old oplog.rs collection in the local database. Use the following command:
db = db.getSiblingDB('local')
db.oplog.rs.drop()
Insert the Last Entry of the Old Oplog into the New Oplog Insert the previously saved last entry from the old
oplog into the new oplog. For example:
db.oplog.rs.save( db.temp.findOne() )
To confirm the entry is in the new oplog, use the following operation:
db.oplog.rs.find()
Restart the mongod (page 925) as a member of the replica set on its usual port. For example:
db.shutdownServer()
mongod --replSet rs0 --dbpath /srv/mongodb
The replica set member will recover and catch up before it is eligible for election to primary.
447
Repeat Process for all Members that may become Primary Repeat this procedure for all members you want to
change the size of the oplog. Repeat the procedure for the primary as part of the following step.
Change the Size of the Oplog on the Primary To finish the rolling maintenance operation, step down the primary
with the rs.stepDown() (page 899) method and repeat the oplog resizing procedure above.
Force a Member to Become Primary
Synopsis
You can force a replica set member to become primary by giving it a higher priority (page 481) value than any
other member in the set.
Optionally, you also can force a member never to become primary by setting its priority (page 481) value to 0,
which means the member can never seek election (page 397) as primary. For more information, see Priority 0 Replica
Set Members (page 386).
Procedures
Force a Member to be Primary by Setting its Priority High Changed in version 2.0.
For more information on priorities, see priority (page 481).
This procedure assumes your current primary is m1.example.net and that youd like to instead make
m3.example.net primary. The procedure also assumes you have a three-member replica set with the configuration below. For more information on configurations, see Replica Set Configuration Use (page 483).
This procedure assumes this configuration:
{
"_id" : "rs",
"version" : 7,
"members" : [
{
"_id" : 0,
"host" : "m1.example.net:27017"
},
{
"_id" : 1,
"host" : "m2.example.net:27017"
},
{
"_id" : 2,
"host" : "m3.example.net:27017"
}
]
}
1. In the mongo (page 942) shell, use the following sequence of operations to make m3.example.net the
primary:
cfg = rs.conf()
cfg.members[0].priority = 0.5
cfg.members[1].priority = 0.5
cfg.members[2].priority = 1
rs.reconfig(cfg)
448
Chapter 8. Replication
This prevents m1.example.net from being primary for 1,000,000 seconds, even if there is no other member
that can become primary. When m3.example.net catches up with m1.example.net it will become
primary.
If you later want to make m1.example.net primary again while it waits for m3.example.net to catch
up, issue the following command to make m1.example.net seek election again:
rs.freeze()
The rs.freeze() (page 897) provides a wrapper around the replSetFreeze (page 726) database command.
Force a Member to be Primary Using Database Commands Changed in version 1.8.
Consider a replica set with the following members:
mdb0.example.net - the current primary.
mdb1.example.net - a secondary.
mdb2.example.net - a secondary .
To force a member to become primary use the following procedure:
1. In a mongo (page 942) shell, run rs.status() (page 898) to ensure your replica set is running as expected.
2. In a mongo (page 942) shell connected to the mongod (page 925) instance running on mdb2.example.net,
freeze mdb2.example.net so that it does not attempt to become primary for 120 seconds.
rs.freeze(120)
3. In a mongo (page 942) shell connected the mongod (page 925) running on mdb0.example.net, step down
this instance that the mongod (page 925) is not eligible to become primary for 120 seconds:
rs.stepDown(120)
449
For more information, consider the rs.freeze() (page 897) and rs.stepDown() (page 899) methods that wrap
the replSetFreeze (page 726) and replSetStepDown (page 730) commands.
Resync a Member of a Replica Set
A replica set member becomes stale when its replication process falls so far behind that the primary overwrites
oplog entries the member has not yet replicated. The member cannot catch up and becomes stale. When this occurs,
you must completely resynchronize the member by removing its data and performing an initial sync (page 412).
This tutorial addressed both resyncing a stale member and to creating a new member using seed data from another
member. When syncing a member, choose a time when the system has the bandwidth to move a large amount of data.
Schedule the synchronization during a time of low usage or during a maintenance window.
MongoDB provides two options for performing an initial sync:
Restart the mongod (page 925) with an empty data directory and let MongoDBs normal initial syncing feature
restore the data. This is the more simple option but may take longer to replace the data.
See Automatically Sync a Member (page 450).
Restart the machine with a copy of a recent data directory from another member in the replica set. This procedure
can replace the data more quickly but requires more manual steps.
See Sync by Copying Data Files from Another Member (page 450).
Automatically Sync a Member
This procedure relies on MongoDBs regular process for initial sync (page 412). This will store the current data on the
member. For an overview of MongoDB initial sync process, see the Replication Processes (page 410) section.
To sync or resync a member:
1. If the member is an existing member, do the following:
(a) Stop the members mongod (page 925) instance.
To ensure a clean shutdown, use the
db.shutdownServer() (page 894) method from the mongo (page 942) shell or on Linux systems,
the mongod --shutdown option.
(b) Delete all data and sub-directories from the members data directory. By removing the data dbpath
(page 993), MongoDB will perform a complete resync. Consider making a backup first.
2. Start the mongod (page 925) instance on the member. For example:
mongod --dbpath /data/db/ --replSet rsProduction
At this point, the mongod (page 925) will perform an initial sync. The length of the initial sync may process
depends on the size of the database and network connection between members of the replica set.
Initial sync operations can impact the other members of the set and create additional traffic to the primary and
can only occur if another member of the set is accessible and up to date.
Sync by Copying Data Files from Another Member
This approach seeds a new or stale member using the data files from an existing member of the replica set. The data
files must be sufficiently recent to allow the new member to catch up with the oplog. Otherwise the member would
need to perform an initial sync.
450
Chapter 8. Replication
Copy the Data Files You can capture the data files as either a snapshot or a direct copy. However, in most cases
you cannot copy data files from a running mongod (page 925) instance to another because the data files will change
during the file copy operation.
Important: If copying data files, you must copy the content of the local database.
You cannot use a mongodump (page 951) backup to for the data files, only a snapshot backup. For approaches
to capture a consistent snapshot of a running mongod (page 925) instance, see the Backup Strategies for MongoDB
Systems (page 136) documentation.
Sync the Member After you have copied the data files from the seed source, start the mongod (page 925) instance
and allow it to apply all operations from the oplog until it reflects the current state of the replica set.
Configure Replica Set Tag Sets
Tag sets let you customize write concern and read preferences for a replica set. MongoDB stores tag sets in the replica
set configuration object, which is the document returned by rs.conf() (page 896), in the members[n].tags
(page 482) sub-document.
This section introduces the configuration of tag sets. For an overview on tag sets and their use, see Replica Set Write
Concern (page 57) and Tag Sets (page 407).
Differences Between Read Preferences and Write Concerns
Custom read preferences and write concerns evaluate tags sets in different ways:
Read preferences consider the value of a tag when selecting a member to read from.
Write concerns do not use the value of a tag to select a member except to consider whether or not the value is
unique.
For example, a tag set for a read operation may resemble the following document:
{ "disk": "ssd", "use": "reporting" }
To fulfill such a read operation, a member would need to have both of these tags. Any of the following tag sets would
satisfy this requirement:
{
{
{
{
"disk":
"disk":
"disk":
"disk":
"ssd",
"ssd",
"ssd",
"ssd",
"use":
"use":
"use":
"use":
"reporting" }
"reporting", "rack": "a" }
"reporting", "rack": "d" }
"reporting", "mem": "r"}
The following tag sets would not be able to fulfill this query:
{ "disk": "ssd" }
{ "use": "reporting" }
{ "disk": "ssd", "use": "production" }
451
You could add tag sets to the members of this replica set with the following command sequence in the mongo
(page 942) shell:
conf = rs.conf()
conf.members[0].tags = { "dc": "east", "use": "production" }
conf.members[1].tags = { "dc": "east", "use": "reporting" }
conf.members[2].tags = { "use": "production" }
rs.reconfig(conf)
After this operation the output of rs.conf() (page 896) would resemble the following:
{
"_id" : "rs0",
"version" : 2,
"members" : [
{
"_id" : 0,
"host" : "mongodb0.example.net:27017",
"tags" : {
"dc": "east",
"use": "production"
}
},
{
"_id" : 1,
"host" : "mongodb1.example.net:27017",
"tags" : {
"dc": "east",
"use": "reporting"
}
},
452
Chapter 8. Replication
{
"_id" : 2,
"host" : "mongodb2.example.net:27017",
"tags" : {
"use": "production"
}
}
]
}
Given a five member replica set with members in two data centers:
1. a facility VA tagged dc.va
2. a facility GTO tagged dc.gto
Create a custom write concern to require confirmation from two data centers using replica set tags, using the following
sequence of operations in the mongo (page 942) shell:
1. Create a replica set configuration JavaScript object conf:
conf = rs.conf()
=
=
=
=
=
{
{
{
{
{
"dc.va": "rack1"}
"dc.va": "rack2"}
"dc.gto": "rack1"}
"dc.gto": "rack2"}
"dc.va": "rack1"}
3. Create a custom getLastErrorModes (page 483) setting to ensure that the write operation will propagate
to at least one member of each facility:
conf.settings = { getLastErrorModes: { MultipleDC : { "dc.va": 1, "dc.gto": 1}}
4. Reconfigure the replica set using the modified conf configuration object:
rs.reconfig(conf)
To ensure that a write operation propagates to at least one member of the set in both data centers, use the MultipleDC
write concern mode as follows:
db.runCommand( { getLastError: 1, w: "MultipleDC" } )
Alternatively, if you want to ensure that each write operation propagates to at least 2 racks in each facility, reconfigure
the replica set as follows in the mongo (page 942) shell:
1. Create a replica set configuration object conf:
conf = rs.conf()
2. Redefine the getLastErrorModes (page 483) value to require two different values of both dc.va and
dc.gto:
453
3. Reconfigure the replica set using the modified conf configuration object:
rs.reconfig(conf)
Now, the following write concern operation will only return after the write operation propagates to at least two different
racks in the each facility:
db.runCommand( { getLastError: 1, w: "MultipleDC" } )
Configure Tag Sets for Functional Segregation of Read and Write Operations
11
To target a read operation to a member of the replica set with a disk type of ssd, you could use the following tag set:
{ disk: "ssd" }
However, to create comparable write concern modes, you would specify a different set of getLastErrorModes
(page 483) configuration. Consider the following sequence of operations in the mongo (page 942) shell:
1. Create a replica set configuration object conf:
conf = rs.conf()
2. Redefine the getLastErrorModes (page 483) value to configure two write concern modes:
conf.settings = {
"getLastErrorModes" : {
"ssd" : {
"ssd" : 1
},
"MultipleDC" : {
"dc.va" : 1,
"dc.gto" : 1
}
}
}
3. Reconfigure the replica set using the modified conf configuration object:
rs.reconfig(conf)
11
Since read preferences and write concerns use the value of fields in tag sets differently, larger deployments may have some redundancy.
454
Chapter 8. Replication
Now you can specify the MultipleDC write concern mode, as in the following operation, to ensure that a write
operation propagates to each data center.
db.runCommand( { getLastError: 1, w: "MultipleDC" } )
Additionally, you can specify the ssd write concern mode to ensure that a write operation propagates to at least one
instance with an SSD.
Reconfigure a Replica Set with Unavailable Members
To reconfigure a replica set when a minority of members are unavailable, use the rs.reconfig() (page 897)
operation on the current primary, following the example in the Replica Set Reconfiguration Procedure (page 483).
This document provides the following options for re-configuring a replica set when a majority of members are not
accessible:
Reconfigure by Forcing the Reconfiguration (page 455)
Reconfigure by Replacing the Replica Set (page 456)
You may need to use one of these procedures, for example, in a geographically distributed replica set, where no local
group of members can reach a majority. See Replica Set Elections (page 397) for more information on this situation.
Reconfigure by Forcing the Reconfiguration
3. On the same member, remove the down and unreachable members of the replica set from the members
(page 480) array by setting the array equal to the surviving members alone. Consider the following example,
which uses the cfg variable created in the previous step:
cfg.members = [cfg.members[0] , cfg.members[4] , cfg.members[7]]
4. On the same member, reconfigure the set by using the rs.reconfig() (page 897) command with the force
option set to true:
rs.reconfig(cfg, {force : true})
This operation forces the secondary to use the new configuration. The configuration is then propagated to all the
surviving members listed in the members array. The replica set then elects a new primary.
455
Note: When you use force : true, the version number in the replica set configuration increases significantly, by tens or hundreds of thousands. This is normal and designed to prevent set version collisions if you
accidentally force re-configurations on both sides of a network partition and then the network partitioning ends.
5. If the failure or partition was only temporary, shut down or decommission the removed members as soon as
possible.
Reconfigure by Replacing the Replica Set
Use the following procedure only for versions of MongoDB prior to version 2.0. If youre running MongoDB 2.0 or
later, use the above procedure, Reconfigure by Forcing the Reconfiguration (page 455).
These procedures are for situations where a majority of the replica set members are down or unreachable. If a majority
is running, then skip these procedures and instead use the rs.reconfig() (page 897) command according to the
examples in Example Reconfiguration Operations (page 483).
If you run a pre-2.0 version and a majority of your replica set is down, you have the two options described here. Both
involve replacing the replica set.
Reconfigure by Turning Off Replication This option replaces the replica set with a standalone server.
1. Stop the surviving mongod (page 925) instances. To ensure a clean shutdown, use an existing control script or
use the db.shutdownServer() (page 894) method.
For example, to use the db.shutdownServer() (page 894) method, connect to the server using the mongo
(page 942) shell and issue the following sequence of commands:
use admin
db.shutdownServer()
2. Create a backup of the data directory (i.e. dbpath (page 993)) of the surviving members of the set.
Optional
If you have a backup of the database you may instead remove this data.
3. Restart one of the mongod (page 925) instances without the --replSet parameter.
The data is now accessible and provided by a single server that is not a replica set member. Clients can use this
server for both reads and writes.
When possible, re-deploy a replica set to provide redundancy and to protect your deployment from operational interruption.
Reconfigure by Breaking the Mirror This option selects a surviving replica set member to be the new primary
and to seed a new replica set. In the following procedure, the new primary is db0.example.net. MongoDB
copies the data from db0.example.net to all the other members.
1. Stop the surviving mongod (page 925) instances. To ensure a clean shutdown, use an existing control script or
use the db.shutdownServer() (page 894) method.
For example, to use the db.shutdownServer() (page 894) method, connect to the server using the mongo
(page 942) shell and issue the following sequence of commands:
use admin
db.shutdownServer()
456
Chapter 8. Replication
2. Move the data directories (i.e. dbpath (page 993)) for all the members except db0.example.net, so that
all the members except db0.example.net have empty data directories. For example:
mv /data/db /data/db-old
3. Move the data files for local database (i.e. local.*) so that db0.example.net has no local database.
For example
mkdir /data/local-old
mv /data/db/local* /data/local-old/
MongoDB performs an initial sync on the added members by copying all data from db0.example.net to
the added members.
See also:
Resync a Member of a Replica Set (page 450)
Manage Chained Replication
Starting in version 2.0, MongoDB supports chained replication. A chained replication occurs when a secondary
member replicates from another secondary member instead of from the primary. This might be the case, for example,
if a secondary selects its replication target based on ping time and if the closest member is another secondary.
Chained replication can reduce load on the primary. But chained replication can also result in increased replication
lag, depending on the topology of the network.
New in version 2.2.2.
You can use the chainingAllowed (page 482) setting in Replica Set Configuration (page 479) to disable chained
replication for situations where chained replication is causing lag.
MongoDB enables chained replication by default. This procedure describes how to disable it and how to re-enable it.
Note: If chained replication is disabled, you still can use replSetSyncFrom (page 730) to specify that a secondary
replicates from another secondary. But that configuration will last only until the secondary recalculates which member
to sync from.
To disable chained replication, set the chainingAllowed (page 482) field in Replica Set Configuration (page 479)
to false.
You can use the following sequence of commands to set chainingAllowed (page 482) to false:
1. Copy the configuration settings into the cfg object:
cfg = rs.config()
457
2. Take note of whether the current configuration settings contain the settings sub-document. If they do, skip
this step.
Warning:
document.
To avoid data loss, skip this step if the configuration settings contain the settings sub-
If the current configuration settings do not contain the settings sub-document, create the sub-document by
issuing the following command:
cfg.settings = { }
3. Issue the following sequence of commands to set chainingAllowed (page 482) to false:
cfg.settings.chainingAllowed = false
rs.reconfig(cfg)
To re-enable chained replication, set chainingAllowed (page 482) to true. You can use the following sequence
of commands:
cfg = rs.config()
cfg.settings.chainingAllowed = true
rs.reconfig(cfg)
For most replica sets, the hostnames in the host (page 480) field never change. However, if organizational needs
change, you might need to migrate some or all host names.
Note: Always use resolvable hostnames for the value of the host (page 480) field in the replica set configuration to
avoid confusion and complexity.
Overview
This document provides two separate procedures for changing the hostnames in the host (page 480) field. Use either
of the following approaches:
Change hostnames without disrupting availability (page 459). This approach ensures your applications will
always be able to read and write data to the replica set, but the approach can take a long time and may incur
downtime at the application layer.
If you use the first procedure, you must configure your applications to connect to the replica set at both the old
and new locations, which often requires a restart and reconfiguration at the application layer and which may
affect the availability of your applications. Re-configuring applications is beyond the scope of this document.
458
Chapter 8. Replication
Stop all members running on the old hostnames at once (page 461). This approach has a shorter maintenance
window, but the replica set will be unavailable during the operation.
See also:
Replica Set Reconfiguration Process (page 483), Deploy a Replica Set (page 420), and Add Members to a Replica Set
(page 433).
Assumptions
459
(d) Use rs.reconfig() (page 897) to update the replica set configuration document (page 479) with the
new hostname.
For example, the following sequence of commands updates the hostname for the secondary at the array
index 1 of the members array (i.e. members[1]) in the replica set configuration document:
cfg = rs.conf()
cfg.members[1].host = "mongodb1.example.net:27017"
rs.reconfig(cfg)
For more information on updating the configuration document, see Example Reconfiguration Operations
(page 483).
(e) Make sure your client applications are able to access the set at the new location and that the secondary has
a chance to catch up with the other members of the set.
Repeat the above steps for each non-primary member of the set.
2. Open a mongo (page 942) shell connected to the primary and step down the primary using the
rs.stepDown() (page 899) method:
rs.stepDown()
460
Chapter 8. Replication
}
]
}
3. For each member of the replica set, perform the following sequence of operations:
(a) Open a mongo (page 942) shell connected to the mongod (page 925) running on the new, temporary port.
For example, for a member running on a temporary port of 37017, you would issue this command:
mongo --port 37017
(b) Edit the replica set configuration manually. The replica set configuration is the only document in the
system.replset collection in the local database. Edit the replica set configuration with the new
hostnames and correct ports for all the members of the replica set. Consider the following sequence of
commands to change the hostnames in a three-member set:
use local
cfg = db.system.replset.findOne( { "_id": "rs" } )
cfg.members[0].host = "mongodb0.example.net:27017"
cfg.members[1].host = "mongodb1.example.net:27017"
cfg.members[2].host = "mongodb2.example.net:27017"
db.system.replset.update( { "_id": "rs" } , cfg )
5. Connect to one of the mongod (page 925) instances using the mongo (page 942) shell. For example:
mongo --port 27017
6. To confirm the new configuration, call rs.conf() (page 896) in the mongo (page 942) shell.
Your output should resemble:
{
"_id" : "rs",
"version" : 4,
461
"members" : [
{
"_id" : 0,
"host" : "mongodb0.example.net:27017"
},
{
"_id" : 1,
"host" : "mongodb1.example.net:27017"
},
{
"_id" : 2,
"host" : "mongodb2.example.net:27017"
}
]
}
462
Chapter 8. Replication
Note: The rs.status() (page 898) method is a wrapper around the replSetGetStatus (page 726)
database command.
Monitor the rate of replication by watching the oplog time in the replica graph in the MongoDB Management
Service12 . For more information see the documentation for MMS13 .
Possible causes of replication lag include:
Network Latency
Check the network routes between the members of your set to ensure that there is no packet loss or network
routing issue.
Use tools including ping to test latency between set members and traceroute to expose the routing of
packets network endpoints.
Disk Throughput
If the file system and disk device on the secondary is unable to flush data to disk as quickly as the primary,
then the secondary will have difficulty keeping state. Disk-related issues are incredibly prevalent on multitenant systems, including vitalized instances, and can be transient if the system accesses disk devices over an IP
network (as is the case with Amazons EBS system.)
Use system-level tools to assess disk status, including iostat or vmstat.
Concurrency
In some cases, long-running operations on the primary can block replication on secondaries. For best results,
configure write concern (page 55) to require confirmation of replication to secondaries, as described in replica
set write concern (page 57). This prevents write operations from returning if replication cannot keep up with
the write load.
Use the database profiler to see if there are slow queries or long-running operations that correspond to the
incidences of lag.
Appropriate Write Concern
12 http://mms.mongodb.com/
13 http://mms.mongodb.com/help/
463
If you are performing a large data ingestion or bulk load operation that requires a large number of writes to the
primary, particularly with unacknowledged write concern (page 55), the secondaries will not be able to read the
oplog fast enough to keep up with changes.
To prevent this, require write acknowledgment or journaled write concern (page 55) after every 100, 1,000, or
an another interval to provide an opportunity for secondaries to catch up with the primary.
For more information see:
Replica Acknowledge Write Concern (page 57)
Replica Set Write Concern (page 60)
Oplog Size (page 411)
Test Connections Between all Members
All members of a replica set must be able to connect to every other member of the set to support replication. Always verify connections in both directions. Networking topologies and firewall configurations prevent normal and
required connectivity, which can block replication.
Consider the following example of a bidirectional test of networking:
Example
Given a replica set with three members running on three separate hosts:
m1.example.net
m2.example.net
m3.example.net
1. Test the connection from m1.example.net to the other hosts with the following operation set
m1.example.net:
mongo --host m2.example.net --port 27017
mongo --host m3.example.net --port 27017
2. Test the connection from m2.example.net to the other two hosts with the following operation set from
m2.example.net, as in:
mongo --host m1.example.net --port 27017
mongo --host m3.example.net --port 27017
You have now tested the connection between m2.example.net and m1.example.net in both directions.
3. Test the connection from m3.example.net to the other two hosts with the following operation set from the
m3.example.net host, as in:
mongo --host m1.example.net --port 27017
mongo --host m2.example.net --port 27017
If any connection, in any direction fails, check your networking and firewall configuration and reconfigure your environment to allow these connections.
464
Chapter 8. Replication
10.10546875MB
94400 (26.22hrs)
Mon Mar 19 2012 13:50:38 GMT-0400 (EDT)
Wed Oct 03 2012 14:59:10 GMT-0400 (EDT)
Wed Oct 03 2012 15:00:21 GMT-0400 (EDT)
The oplog should be long enough to hold all transactions for the longest downtime you expect on a secondary. At a
minimum, an oplog should be able to hold minimum 24 hours of operations; however, many users prefer to have 72
hours or even a weeks work of operations.
For more information on how oplog size affects operations, see:
Oplog Size (page 411),
Delayed Replica Set Members (page 387), and
Check the Replication Lag (page 463).
Note: You normally want the oplog to be the same size on all members. If you resize the oplog, resize it on all
members.
To change oplog size, see the Change the Size of the Oplog (page 446) tutorial.
Oplog Entry Timestamp Error
Consider the following error in mongod (page 925) output and logs:
replSet error fatal couldn't query the local local.oplog.rs collection.
<timestamp> [rsStart] bad replSet oplog entry?
465
Often, an incorrectly typed value in the ts field in the last oplog entry causes this error. The correct data type is
Timestamp.
Check the type of the ts value using the following two queries against the oplog collection:
db = db.getSiblingDB("local")
db.oplog.rs.find().sort({$natural:-1}).limit(1)
db.oplog.rs.find({ts:{$type:17}}).sort({$natural:-1}).limit(1)
The first query returns the last document in the oplog, while the second returns the last document in the oplog where
the ts value is a Timestamp. The $type (page 630) operator allows you to select BSON type 17, is the Timestamp
data type.
If the queries dont return the same document, then the last document in the oplog has the wrong data type in the ts
field.
Example
If the first query returns this as the last oplog entry:
{ "ts" : {t: 1347982456000, i: 1},
"h" : NumberLong("8191276672478122996"),
"op" : "n",
"ns" : "",
"o" : { "msg" : "Reconfig set", "version" : 4 } }
And the second query returns this as the last entry where ts has the Timestamp type:
{ "ts" : Timestamp(1347982454000, 1),
"h" : NumberLong("6188469075153256465"),
"op" : "n",
"ns" : "",
"o" : { "msg" : "Reconfig set", "version" : 3 } }
Then the value for the ts field in the last oplog entry is of the wrong data type.
To set the proper type for this value and resolve this issue, use an update operation that resembles the following:
db.oplog.rs.update( { ts: { t:1347982456000, i:1 } },
{ $set: { ts: new Timestamp(1347982456000, 1)}})
Modify the timestamp values as needed based on your oplog entry. This operation may take some period to complete
because the update must scan and pull the entire oplog into memory.
Duplicate Key Error on local.slaves
The duplicate key on local.slaves error, occurs when a secondary or slave changes its hostname and the primary or
master tries to update its local.slaves collection with the new name. The update fails because it contains the
same _id value as the document containing the previous hostname. The error itself will resemble the following.
exception 11000 E11000 duplicate key error index: local.slaves.$_id_
This is a benign error and does not affect replication operations on the secondary or slave.
To prevent the error from appearing, drop the local.slaves collection from the primary or master, with the
following sequence of operations in the mongo (page 942) shell:
use local
db.slaves.drop()
466
Chapter 8. Replication
The next time a secondary or slave polls the primary or master, the primary or master recreates the local.slaves
collection.
Description
Adds a member to a replica set.
Adds an arbiter to a replica set.
Returns the replica set configuration document.
Prevents the current member from seeking election as primary for a period of time.
Returns basic help text for replica set functions.
Initializes a new replica set.
Re-configures a replica set by applying a new replica set configuration object.
Remove a member from a replica set.
Sets the slaveOk property for the current connection. Deprecated. Use readPref()
(page 871) and Mongo.setReadPref() (page 919) to set read preference.
Returns a document with information about the state of the replica set.
Causes the current primary to become a secondary which forces an election.
Sets the member that this replica set member will sync from, overriding the default sync
target selection logic.
467
Description
Prevents the current member from seeking election as primary for a period of
time.
Returns a document that reports on the status of the replica set.
Initializes a new replica set.
Enables or disables a maintenance mode, which puts a secondary node in a
RECOVERING state.
Applies a new configuration to an existing replica set.
Forces the current primary to step down and become a secondary, forcing an
election.
Explicitly override the default logic for selecting a member to replicate from.
Forces a mongod (page 925) to re-synchronize from the master. For
master-slave replication only.
Internal command that applies oplog entries to the current data set.
Displays information about this members role in the replica set, including
whether it is the master.
Internal command to support replication, returns the optime.
The following methods apply to replica sets. For a complete list of all methods, see mongo Shell Methods (page 806).
rs.status()
Returns A document with status information.
468
Chapter 8. Replication
This output reflects the current status of the replica set, using data derived from the heartbeat packets sent by the
other members of the replica set.
This method provides a wrapper around the replSetGetStatus (page 726) database command.
db.isMaster()
Returns A document that describes the role of the mongod (page 925) instance.
If the mongod (page 925) is a member of a replica set, then the ismaster (page 733) and secondary
(page 733) fields report if the instance is the primary or if it is a secondary member of the replica set.
See
isMaster (page 732) for the complete documentation of the output of db.isMaster() (page 890).
Description
rs.initiate(configuration)
Initiates a replica set. Optionally takes a configuration argument in the form of a document that holds the
configuration of a replica set.
The rs.initiate() (page 897) method has the following parameter:
param document configuration A document that specifies configuration settings (page 479) for the
new replica set. If a configuration is not specified, MongoDB uses a default configuration.
The rs.initiate() (page 897) method provides a wrapper around the replSetInitiate (page 728)
database command.
Replica Set Configuration
See Member Configuration Tutorials (page 438) and Replica Set Configuration (page 479) for examples of replica set
configuration and invitation objects.
rs.conf()
Returns a document that contains the current replica set configuration document.
See Replica Set Configuration (page 479) for more information on the replica set configuration document.
rs.config()
rs.config() (page 896) is an alias of rs.conf() (page 896).
Definition
rs.reconfig(configuration, force)
Initializes a new replica set configuration. Disconnects the shell briefly and forces a reconnection as the replica
set renegotiates which member will be primary. As a result, the shell will display an error even if this command
succeeds.
param document configuration A document that specifies the configuration of a replica set.
param document force If set as { force: true }, this forces the replica set to accept the
new configuration even if a majority of the members are not accessible. Use with caution, as
this can lead to term:rollback situations.
469
rs.reconfig() (page 897) overwrites the existing replica set configuration. Retrieve the current configuration object with rs.conf() (page 896), modify the configuration as needed and then use rs.reconfig()
(page 897) to submit the modified configuration object.
rs.reconfig() (page 897) provides a wrapper around the replSetReconfig (page 729) database
command.
Examples
If you want to force the reconfiguration if a majority of the set is not connected to the current member, or you are
issuing the command against a secondary, use the following form:
conf = rs.conf()
// modify conf to change configuration
rs.reconfig(conf, { force: true } )
Warning: Forcing a rs.reconfig() (page 897) can lead to rollback situations and other difficult to recover
from situations. Exercise caution when using this option.
See also:
Replica Set Configuration (page 479) and Replica Set Tutorials (page 419).
Definition
rs.add(host, arbiterOnly)
Adds a member to a replica set.
param string,document host The new member to add to the replica set. If a string, specifies the
hostname and optionally the port number for the new member. If a document, specifies a replica
set members document, as found in the members (page 480) array. To view a replica sets
members array, run rs.conf() (page 896).
param boolean arbiterOnly Applies only if the <host> value is a string. If true, the added host
is an arbiter.
You may specify new hosts in one of two ways:
1.as a hostname with an optional port number to use the default configuration as in the Add a Member to
an Existing Replica Set (page 435) example.
2.as a configuration document, as in the Configure and Add a Member (page 435) example.
This function will disconnect the shell briefly and forces a reconnection as the replica set renegotiates which
member will be primary. As a result, the shell will display an error even if this command succeeds.
470
Chapter 8. Replication
rs.add() (page 895) provides a wrapper around some of the functionality of the replSetReconfig
(page 729) database command and the corresponding shell helper rs.reconfig() (page 897). See the
Replica Set Configuration (page 479) document for full documentation of all replica set configuration options.
Example
To add a mongod (page 925) accessible on the default port 27017 running on the host mongodb3.example.net,
use the following rs.add() (page 895) invocation:
rs.add('mongodb3.example.net:27017')
To add mongodb3.example.net as a secondary-only (page 386) member of set, use the following form of
rs.add() (page 895):
rs.add( { "_id": 3, "host": "mongodbd3.example.net:27017", "priority": 0 } )
Replace, 3 with the next unused _id value in the replica set. See rs.conf() (page 896) to see the existing _id
values in the replica set configuration document.
See the Replica Set Configuration (page 479) and Replica Set Tutorials (page 419) documents for more information.
Description
rs.addArb(host)
Adds a new arbiter to an existing replica set.
The rs.addArb() (page 896) method takes the following parameter:
param string host Specifies the hostname and optionally the port number of the arbiter member to
add to replica set.
This function briefly disconnects the shell and forces a reconnection as the replica set renegotiates which member
will be primary. As a result, the shell displays an error even if this command succeeds.
Description
rs.stepDown(seconds)
Forces the current replica set member to step down as primary and then attempt to avoid election as primary for
the designated number of seconds. Produces an error if the current member is not the primary.
The rs.stepDown() (page 899) method has the following parameter:
param number seconds The duration of time that the stepped-down member attempts to avoid reelection as primary. If this parameter is not specified, the method uses the default value of 60
seconds.
This function disconnects the shell briefly and forces a reconnection as the replica set renegotiates which member
will be primary. As a result, the shell will display an error even if this command succeeds.
rs.stepDown() (page 899) provides a wrapper around the database command replSetStepDown
(page 730).
471
Description
rs.freeze(seconds)
Makes the current replica set member ineligible to become primary for the period specified.
The rs.freeze() (page 897) method has the following parameter:
param number seconds The duration the member is ineligible to become primary.
rs.freeze() (page 897) provides a wrapper around the database command replSetFreeze (page 726).
Definition
rs.remove(hostname)
Removes the member described by the hostname parameter from the current replica set. This function will
disconnect the shell briefly and forces a reconnection as the replica set renegotiates which member will be
primary. As a result, the shell will display an error even if this command succeeds.
The rs.remove() (page 898) method has the following parameter:
param string hostname The hostname of a system in the replica set.
Note: Before running the rs.remove() (page 898) operation, you must shut down the replica set member
that youre removing.
Changed in version 2.2: This procedure is no longer required when using rs.remove() (page 898), but it
remains good practice.
rs.slaveOk()
Provides a shorthand for the following operation:
db.getMongo().setSlaveOk()
This allows the current connection to allow read operations to run on secondary members. See the
readPref() (page 871) method for more fine-grained control over read preference (page 405) in the mongo
(page 942) shell.
db.isMaster()
Returns A document that describes the role of the mongod (page 925) instance.
If the mongod (page 925) is a member of a replica set, then the ismaster (page 733) and secondary
(page 733) fields report if the instance is the primary or if it is a secondary member of the replica set.
See
isMaster (page 732) for the complete documentation of the output of db.isMaster() (page 890).
rs.help()
Returns a basic help text for all of the replication (page 377) related shell functions.
rs.syncFrom()
New in version 2.2.
Provides a wrapper around the replSetSyncFrom (page 730), which allows administrators to configure the
member of a replica set that the current member will pull data from. Specify the name of the member you want
to replicate from in the form of [hostname]:[port].
See replSetSyncFrom (page 730) for more details.
472
Chapter 8. Replication
Database Commands
The following commands apply to replica sets. For a complete list of all commands, see Database Commands
(page 694).
Definition
isMaster
isMaster (page 732) returns a document that describes the role of the mongod (page 925) instance.
If the instance is a member of a replica set, then isMaster (page 732) returns a subset of the replica set
configuration and status including whether or not the instance is the primary of the replica set.
When sent to a mongod (page 925) instance that is not a member of a replica set, isMaster (page 732) returns
a subset of this information.
MongoDB drivers and clients use isMaster (page 732) to determine the state of the replica set members and
to discover additional members of a replica set.
The db.isMaster() (page 890) method in the mongo (page 942) shell provides a wrapper around
isMaster (page 732).
The command takes the following form:
{ isMaster: 1 }
See also:
db.isMaster() (page 890)
Output
All Instances The following isMaster (page 732) fields are common across all roles:
isMaster.ismaster
A boolean value that reports when this node is writable. If true, then this instance is a primary in a replica
set, or a master in a master-slave configuration, or a mongos (page 938) instance, or a standalone mongod
(page 925).
This field will be false if the instance is a secondary member of a replica set or if the member is an arbiter of
a replica set.
isMaster.maxBsonObjectSize
The maximum permitted size of a BSON object in bytes for this mongod (page 925) process. If not provided,
clients should assume a max size of 4 * 1024 * 1024.
isMaster.maxMessageSizeBytes
New in version 2.4.
The maximum permitted size of a BSON wire protocol message. The default value is 48000000 bytes.
isMaster.localTime
New in version 2.2.
Returns the local server time in UTC. This value is an ISO date.
473
Sharded Instances mongos (page 938) instances add the following field to the isMaster (page 732) response
document:
isMaster.msg
Contains the value isdbgrid when isMaster (page 732) returns from a mongos (page 938) instance.
Replica Sets isMaster (page 732) contains these fields when returned by a member of a replica set:
isMaster.setName
The name of the current :replica set.
isMaster.secondary
A boolean value that, when true, indicates if the mongod (page 925) is a secondary member of a replica set.
isMaster.hosts
An array of strings in the format of "[hostname]:[port]" that lists all members of the replica set that are
neither hidden, passive, nor arbiters.
Drivers use this array and the isMaster.passives (page 733) to determine which members to read from.
isMaster.passives
An array of strings in the format of "[hostname]:[port]" listing all members of the replica set which
have a priority (page 481) of 0.
This field only appears if there is at least one member with a priority (page 481) of 0.
Drivers use this array and the isMaster.hosts (page 733) to determine which members to read from.
isMaster.arbiters
An array of strings in the format of "[hostname]:[port]" listing all members of the replica set that are
arbiters.
This field only appears if there is at least one arbiter in the replica set.
isMaster.primary
A string in the format of "[hostname]:[port]" listing the current primary member of the replica set.
isMaster.arbiterOnly
A boolean value that , when true, indicates that the current instance is an arbiter. The arbiterOnly
(page 733) field is only present, if the instance is an arbiter.
isMaster.passive
A boolean value that, when true, indicates that the current instance is hidden. The passive (page 734) field
is only present for hidden members.
isMaster.hidden
A boolean value that, when true, indicates that the current instance is hidden. The hidden (page 734) field
is only present for hidden members.
isMaster.tags
A document that lists any tags assigned to this member. This field is only present if there are tags assigned to
the member. See Configure Replica Set Tag Sets (page 451) for more information.
isMaster.me
The [hostname]:[port] of the member that returned isMaster (page 732).
resync
The resync (page 731) command forces an out-of-date slave mongod (page 925) instance to re-synchronize
itself. Note that this command is relevant to master-slave replication only. It does not apply to replica sets.
Warning: This command obtains a global write lock and will block other operations until it has completed.
474
Chapter 8. Replication
replSetFreeze
The replSetFreeze (page 726) command prevents a replica set member from seeking election for the specified number of seconds. Use this command in conjunction with the replSetStepDown (page 730) command
to make a different node in the replica set a primary.
The replSetFreeze (page 726) command uses the following syntax:
{ replSetFreeze: <seconds> }
If you want to unfreeze a replica set member before the specified number of seconds has elapsed, you can issue
the command with a seconds value of 0:
{ replSetFreeze: 0 }
Restarting the mongod (page 925) process also unfreezes a replica set member.
replSetFreeze (page 726) is an administrative command, and you must issue it against the admin database.
Definition
replSetGetStatus
The replSetGetStatus command returns the status of the replica set from the point of view of the current
server. You must run the command against the admin database. The command has the following prototype
format:
{ replSetGetStatus: 1 }
The value specified does not affect the output of the command. Data provided by this command derives from
data included in heartbeats sent to the current instance by other members of the replica set. Because of the
frequency of heartbeats, these data can be several seconds out of date.
You can also access this functionality through the rs.status() (page 898) helper in the mongo (page 942)
shell.
The mongod (page 925) must have replication enabled and be a member of a replica set for the for
replSetGetStatus (page 726) to return successfully.
Output
replSetGetStatus.set
The set value is the name of the replica set, configured in the replSet (page 1000) setting. This is the same
value as _id (page 480) in rs.conf() (page 896).
replSetGetStatus.date
The value of the date field is an ISODate of the current time, according to the current server. Compare this to
the value of the lastHeartbeat (page 728) to find the operational lag between the current host and the other
hosts in the set.
replSetGetStatus.myState
The value of myState (page 727) is an integer between 0 and 10 that represents the replica state (page 487)
of the current member.
replSetGetStatus.members
The members field holds an array that contains a document for every member in the replica set.
replSetGetStatus.members.name
The name field holds the name of the server.
475
replSetGetStatus.members.self
The self field is only included in the document for the current mongod instance in the members array.
Its value is true.
replSetGetStatus.members.errmsg
This field contains the most recent error or status message received from the member. This field may be
empty (e.g. "") in some cases.
replSetGetStatus.members.health
The health value is only present for the other members of the replica set (i.e. not the member that returns
rs.status (page 898).) This field conveys if the member is up (i.e. 1) or down (i.e. 0.)
replSetGetStatus.members.state
The value of state (page 727) is an array of documents, each containing an integer between 0 and 10
that represents the replica state (page 487) of the corresponding member.
replSetGetStatus.members.stateStr
A string that describes state (page 727).
replSetGetStatus.members.uptime
The uptime (page 727) field holds a value that reflects the number of seconds that this member has been
online.
This value does not appear for the member that returns the rs.status() (page 898) data.
replSetGetStatus.members.optime
A document that contains information regarding the last operation from the operation log that this member
has applied.
replSetGetStatus.members.optime.t
A 32-bit timestamp of the last operation applied to this member of the replica set from the oplog.
replSetGetStatus.members.optime.i
An incremented field, which reflects the number of operations in since the last time stamp. This value
only increases if there is more than one operation per second.
replSetGetStatus.members.optimeDate
An ISODate formatted date string that reflects the last entry from the oplog that this member applied. If this
differs significantly from lastHeartbeat (page 728) this member is either experiencing replication
lag or there have not been any new operations since the last update. Compare members.optimeDate
between all of the members of the set.
replSetGetStatus.members.lastHeartbeat
The lastHeartbeat value provides an ISODate formatted date of the last heartbeat received from this
member. Compare this value to the value of the date (page 727) field to track latency between these
members.
This value does not appear for the member that returns the rs.status() (page 898) data.
replSetGetStatus.members.pingMS
The pingMS represents the number of milliseconds (ms) that a round-trip packet takes to travel between
the remote member and the local instance.
This value does not appear for the member that returns the rs.status() (page 898) data.
replSetGetStatus.syncingTo
The syncingTo field is only present on the output of rs.status() (page 898) on secondary and recovering
members, and holds the hostname of the member from which this instance is syncing.
replSetInitiate
The replSetInitiate (page 728) command initializes a new replica set. Use the following syntax:
476
Chapter 8. Replication
{ replSetInitiate : <config_document> }
The <config_document> is a document that specifies the replica sets configuration. For instance, heres a
config document for creating a simple 3-member replica set:
{
_id : <setname>,
members : [
{_id : 0, host : <host0>},
{_id : 1, host : <host1>},
{_id : 2, host : <host2>},
]
}
A typical way of running this command is to assign the config document to a variable and then to pass the
document to the rs.initiate() (page 897) helper:
config = {
_id : "my_replica_set",
members : [
{_id : 0, host : "rs1.example.net:27017"},
{_id : 1, host : "rs2.example.net:27017"},
{_id : 2, host : "rs3.example.net", arbiterOnly: true},
]
}
rs.initiate(config)
Notice that omitting the port cause the host to use the default port of 27017. Notice also that you can specify
other options in the config documents such as the arbiterOnly setting in this example.
See also:
Replica Set Configuration (page 479), Replica Set Tutorials (page 419), and Replica Set Reconfiguration
(page 483).
replSetMaintenance
The replSetMaintenance (page 729) admin command enables or disables the maintenance mode for a
secondary member of a replica set.
The command has the following prototype form:
{ replSetMaintenance: <boolean> }
Consider the following behavior when running the replSetMaintenance (page 729) command:
You cannot run the command on the Primary.
You must run the command against the admin database.
When enabled replSetMaintenance:
secondary is RECOVERING:
477
See Replica Set Member States (page 487) for more information about replica set member states. Refer to the
partial script for automating step down and compaction14 for an example of this procedure.
replSetReconfig
The replSetReconfig (page 729) command modifies the configuration of an existing replica set. You can
use this command to add and remove members, and to alter the options set on existing members. Use the
following syntax:
{ replSetReconfig: <new_config_document>, force: false }
You may also run the command using the shells rs.reconfig() (page 897) method.
Be aware of the following replSetReconfig (page 729) behaviors:
You must issue this command against the admin database of the current primary member of the replica
set.
You can optionally force the replica set to accept the new configuration by specifying force: true.
Use this option if the current member is not primary or if a majority of the members of the set are not
accessible.
Warning: Forcing the replSetReconfig (page 729) command can lead to a rollback situation.
Use with caution.
Use the force option to restore a replica set to new servers with different hostnames. This works even if
the set members already have a copy of the data.
A majority of the sets members must be operational for the changes to propagate properly.
This command can cause downtime as the set renegotiates primary-status. Typically this is 10-20 seconds,
but could be as long as a minute or more. Therefore, you should attempt to reconfigure only during
scheduled maintenance periods.
In some cases, replSetReconfig (page 729) forces the current primary to step down, initiating an
election for primary among the members of the replica set. When this happens, the set will drop all current
connections.
Note: replSetReconfig (page 729) obtains a special mutually exclusive lock to prevent more than one
replSetReconfig (page 729) operation from occurring at the same time.
Description
replSetSyncFrom
New in version 2.2.
Explicitly configures which host the current mongod (page 925) pulls oplog entries from. This operation is
useful for testing different patterns and in situations where a set member is not replicating from the desired host.
The replSetSyncFrom (page 730) command has the following form:
{ replSetSyncFrom: "hostname<:port>" }
478
Chapter 8. Replication
The member to replicate from must be a valid source for data in the set. The member cannot be:
The same as the mongod (page 925) on which you run replSetSyncFrom (page 730).
In other words, a member cannot replicate from itself.
An arbiter, because arbiters do not hold data.
A member that does not build indexes.
An unreachable member.
A mongod (page 925) instance that is not a member of the same replica set.
If you attempt to replicate from a member that is more than 10 seconds behind the current
member, mongod (page 925) will log a warning but will still replicate from the lagging member.
If you run replSetSyncFrom (page 730) during initial sync, MongoDB produces no error
messages, but the sync target will not change until after the initial sync operation.
Run from the mongo Shell
To run the command in the mongo (page 942) shell, use the following invocation:
db.adminCommand( { replSetSyncFrom: "hostname<:port>" } )
You may also use the rs.syncFrom() (page 899) helper in the mongo (page 942) shell in an operation
with the following form:
rs.syncFrom("hostname<:port>")
Note: replSetSyncFrom (page 730) and rs.syncFrom() (page 899) provide a temporary override of default behavior. mongod (page 925) will revert to the default sync behavior in the following
situations:
The mongod (page 925) instance restarts.
The connection between the mongod (page 925) and the sync target closes.
Changed in version 2.4: The sync target falls more than 30 seconds behind another member of the replica
set; the mongod (page 925) will revert to the default sync target.
This reference provides an overview of replica set configuration options and settings.
Use rs.conf() (page 896) in the mongo (page 942) shell to retrieve this configuration. Note that default values are
not explicitly displayed.
479
The following document provides a representation of a replica set configuration document. Angle brackets (e.g. < and
>) enclose all optional fields.
{
_id : <setname>,
version: <int>,
members: [
{
_id : <ordinal>,
host : hostname<:port>,
<arbiterOnly : <boolean>,>
<buildIndexes : <boolean>,>
<hidden : <boolean>,>
<priority: <priority>,>
<tags: { <document> },>
<slaveDelay : <number>,>
<votes : <number>>
}
, ...
],
<settings: {
<getLastErrorDefaults : <lasterrdefaults>,>
<chainingAllowed : <boolean>,>
<getLastErrorModes : <modes>>
}>
}
Configuration Variables
local.system.replset._id
Type: string
Value: <setname>
An _id field holding the name of the replica set. This reflects the set name configured with replSet
(page 1000) or mongod --replSet.
local.system.replset.members
Type: array
Contains an array holding an embedded document for each member of the replica set. The members document
contains a number of fields that describe the configuration of each member of the replica set.
The members (page 480) field in the replica set configuration document is a zero-indexed array.
local.system.replset.members[n]._id
Type: ordinal
Provides the zero-indexed identifier of every member in the replica set.
Note: When updating the replica configuration object, access the replica set members in the members
(page 480) array with the array index. The array index begins with 0. Do not confuse this index value with the
value of the _id (page 480) field in each document in the members (page 480) array.
local.system.replset.members[n].host
Type: <hostname><:port>
480
Chapter 8. Replication
Identifies the host name of the set member with a hostname and port number. This name must be resolvable for
every host in the replica set.
Warning: host (page 480) cannot hold a value that resolves to localhost or the local interface unless
all members of the set are on hosts that resolve to localhost.
local.system.replset.members[n].arbiterOnly
Optional.
Type: boolean
Default: false
Identifies an arbiter. For arbiters, this value is true, and is automatically configured by rs.addArb()
(page 896).
local.system.replset.members[n].buildIndexes
Optional.
Type: boolean
Default: true
Determines whether the mongod (page 925) builds indexes on this member. Do not set to false if a replica
set can become a primary, or if any clients ever issue queries against this instance.
Omitting index creation, and thus this setting, may be useful, if:
You are only using this instance to perform backups using mongodump (page 951),
this instance will receive no queries, and
index creation and maintenance overburdens the host system.
If set to false, secondaries configured with this option do build indexes on the _id field, to facilitate operations required for replication.
Warning: You may only set this value when adding a member to a replica set. You may not reconfigure a
replica set to change the value of the buildIndexes (page 481) field after adding the member to the set.
Other secondaries cannot replicate from a members where buildIndexes (page 481) is false.
local.system.replset.members[n].hidden
Optional.
Type: boolean
Default: false
When this value is true, the replica set hides this instance, and does not include the member in the output of
db.isMaster() (page 890) or isMaster (page 732). This prevents read operations (i.e. queries) from ever
reaching this host by way of secondary read preference.
See also:
Hidden Replica Set Members (page 387)
local.system.replset.members[n].priority
Optional.
Type: Number, between 0 and 100.0 including decimals.
Default: 1
481
Specify higher values to make a member more eligible to become primary, and lower values to make the member
less eligible to become primary. Priorities are only used in comparison to each other. Members of the set will
veto election requests from members when another eligible member has a higher priority value. Changing the
balance of priority in a replica set will trigger an election.
A priority (page 481) of 0 makes it impossible for a member to become primary.
See also:
priority (page 481) and Replica Set Elections (page 397).
local.system.replset.members[n].tags
Optional.
Type: MongoDB Document
Default: none
Used to represent arbitrary values for describing or tagging members for the purposes of extending write concern
to allow configurable data center awareness.
Use in conjunction with getLastErrorModes (page 483) and getLastErrorDefaults (page 483) and
db.getLastError() (page 886) (i.e. getLastError (page 720).)
For procedures on configuring tag sets, see Configure Replica Set Tag Sets (page 451).
Important: In tag sets, all tag values must be strings.
local.system.replset.members[n].slaveDelay
Optional.
Type: Integer. (seconds.)
Default: 0
Describes the number of seconds behind the primary that this replica set member should lag. Use this
option to create delayed members (page 387), that maintain a copy of the data that reflects the state of the data
set at some amount of time in the past, specified in seconds. Typically such delayed members help protect
against human error, and provide some measure of insurance against the unforeseen consequences of changes
and updates.
local.system.replset.members[n].votes
Optional.
Type: Integer
Default: 1
Controls the number of votes a server will cast in a replica set election (page 397). The number of votes each
member has can be any non-negative integer, but it is highly recommended each member has 1 or 0 votes.
If you need more than 7 members in one replica set, use this setting to add additional non-voting members with
a votes (page 482) value of 0.
For most deployments and most members, use the default value, 1, for votes (page 482).
local.system.replset.settings
Optional.
Type: MongoDB Document
The settings document configures options that apply to the whole replica set.
482
Chapter 8. Replication
local.system.replset.settings.chainingAllowed
Optional.
Type: boolean
Default: true
New in version 2.2.4.
When chainingAllowed (page 482) is true, the replica set allows secondary members to replicate from
other secondary members. When chainingAllowed (page 482) is false, secondaries can replicate only
from the primary.
When you run rs.config() (page 896) to view a replica sets configuration, the chainingAllowed
(page 482) field appears only when set to false. If not set, chainingAllowed (page 482) is true.
See also:
Manage Chained Replication (page 457)
local.system.replset.settings.getLastErrorDefaults
Optional.
Type: MongoDB Document
Specify arguments to the getLastError (page 720) that members of this replica set will use when no arguments to getLastError (page 720) has no arguments. If you specify any arguments, getLastError
(page 720) , ignores these defaults.
local.system.replset.settings.getLastErrorModes
Optional.
Type: MongoDB Document
Defines the names and combination of members (page 480) for use by the application layer to guarantee write
concern to database using the getLastError (page 720) command to provide data-center awareness.
Example Reconfiguration Operations
Most modifications of replica set configuration use the mongo (page 942) shell. Consider the following reconfiguration operation:
Example
Given the following replica set configuration:
{
"_id" : "rs0",
"version" : 1,
"members" : [
{
"_id" : 0,
"host" : "mongodb0.example.net:27017"
},
{
"_id" : 1,
"host" : "mongodb1.example.net:27017"
},
{
"_id" : 2,
"host" : "mongodb2.example.net:27017"
}
483
]
}
The following reconfiguration operation updates the priority (page 481) of the replica set members:
cfg = rs.conf()
cfg.members[0].priority = 0.5
cfg.members[1].priority = 2
cfg.members[2].priority = 2
rs.reconfig(cfg)
First, this operation sets the local variable cfg to the current replica set configuration using the rs.conf()
(page 896) method. Then it adds priority values to the cfg document for the three sub-documents in the members
(page 480) array, accessing each replica set member with the array index and not the replica set members _id
(page 480) field. Finally, it calls the rs.reconfig() (page 897) method with the argument of cfg to initialize this
new configuration. The replica set configuration after this operation will resemble the following:
{
"_id" : "rs0",
"version" : 1,
"members" : [
{
"_id" : 0,
"host" : "mongodb0.example.net:27017",
"priority" : 0.5
},
{
"_id" : 1,
"host" : "mongodb1.example.net:27017",
"priority" : 2
},
{
"_id" : 2,
"host" : "mongodb2.example.net:27017",
"priority" : 1
}
]
}
Using the dot notation demonstrated in the above example, you can modify any existing setting or specify any of
optional replica set configuration variables (page 480). Until you run rs.reconfig(cfg) at the shell, no changes
will take effect. You can issue cfg = rs.conf() at any time before using rs.reconfig() (page 897) to undo
your changes and start from the current configuration. If you issue cfg as an operation at any point, the mongo
(page 942) shell at any point will output the complete document with modifications for your review.
The rs.reconfig() (page 897) operation has a force option, to make it possible to reconfigure a replica set if a
majority of the replica set is not visible, and there is no primary member of the set. use the following form:
rs.reconfig(cfg, { force: true } )
Warning: Forcing a rs.reconfig() (page 897) can lead to rollback situations and other difficult to recover
from situations. Exercise caution when using this option.
Note: The rs.reconfig() (page 897) shell method can force the current primary to step down and triggers an
election in some situations. When the primary steps down, all clients will disconnect. This is by design. Since this
typically takes 10-20 seconds, attempt to make such changes during scheduled maintenance periods.
484
Chapter 8. Replication
Every mongod (page 925) instance has its own local database, which stores data used in the replication process,
and other instance-specific data. The local database is invisible to replication: collections in the local database
are not replicated.
In replication, the local database store stores internal replication data for each member of a replica set. The local
stores the following collections:
Changed in version 2.4: When running with authentication (i.e. auth (page 993)), authenticating to the local
database is not equivalent to authenticating to the admin database. In previous versions, authenticating to the local
database provided access to all databases.
Collection on all mongod Instances
local.startup_log
On startup, each mongod (page 925) instance inserts a document into startup_log (page 485) with diagnostic information about the mongod (page 925) instance itself and host information. startup_log (page 485)
is a capped collection. This information is primarily useful for diagnostic purposes.
Example
Consider the following prototype of a document from the startup_log (page 485) collection:
{
"_id" : "<string>",
"hostname" : "<string>",
"startTime" : ISODate("<date>"),
"startTimeLocal" : "<string>",
"cmdLine" : {
"dbpath" : "<path>",
"<option>" : <value>
},
"pid" : <number>,
"buildinfo" : {
"version" : "<string>",
"gitVersion" : "<string>",
"sysInfo" : "<string>",
"loaderFlags" : "<string>",
"compilerFlags" : "<string>",
"allocator" : "<string>",
"versionArray" : [ <num>, <num>, <...> ],
"javascriptEngine" : "<string>",
"bits" : <number>,
"debug" : <boolean>,
"maxBsonObjectSize" : <number>
}
}
Documents in the startup_log (page 485) collection contain the following fields:
local.startup_log._id
Includes the system hostname and a millisecond epoch value.
local.startup_log.hostname
The systems hostname.
8.4. Replication Reference
485
local.startup_log.startTime
A UTC ISODate value that reflects when the server started.
local.startup_log.startTimeLocal
A string that reports the startTime (page 485) in the systems local time zone.
local.startup_log.cmdLine
A sub-document that reports the mongod (page 925) runtime options and their values.
local.startup_log.pid
The process identifier for this process.
local.startup_log.buildinfo
A sub-document that reports information about the build environment and settings used to compile this
mongod (page 925). This is the same output as buildInfo (page 762). See buildInfo (page 762).
local.system.replset
local.system.replset (page 486) holds the replica sets configuration object as its single document. To
view the objects configuration information, issue rs.conf() (page 896) from the mongo (page 942) shell.
You can also query this collection directly.
local.oplog.rs
local.oplog.rs (page 486) is the capped collection that holds the oplog. You set its size at creation using
the oplogSize (page 1000) setting. To resize the oplog after replica set initiation, use the Change the Size of
the Oplog (page 446) procedure. For additional information, see the Oplog Size (page 411) section.
local.replset.minvalid
This contains an object used internally by replica sets to track replication status.
local.slaves
This contains information about each member of the set and the latest point in time that this member has synced
to. If this collection becomes out of date, you can refresh it by dropping the collection and allowing MongoDB
to automatically refresh it during normal replication:
db.getSiblingDB("local").slaves.drop()
486
Chapter 8. Replication
State Description
Cannot vote. All members start up in this state. The mongod (page 925) parses the
replica set configuration document (page 438) while in STARTUP (page 487).
Can vote. The primary (page 382) is the only member to accept write operations.
Can vote. The secondary (page 382) replicates the data store.
Can vote. Members either perform startup self-checks, or transition from completing a
rollback (page 401) or resync (page 450).
Cannot vote. Has encountered an unrecoverable error.
Cannot vote. Forks replication and election threads before becoming a secondary.
Cannot vote. Has never connected to the replica set.
Can vote. Arbiters (page ??) do not replicate data and exist solely to participate in
elections.
Cannot vote. Is not accessible to the set.
Can vote. Performs a rollback (page 401).
Cannot vote. Was once in the replica set but has now been removed.
States
Core States
PRIMARY
Members in PRIMARY (page 487) state accept write operations. A replica set has only one primary at a time.
A SECONDARY (page 487) member becomes primary after an election (page 397). Members in the PRIMARY
(page 487) state are eligible to vote.
SECONDARY
Members in SECONDARY (page 487) state replicate the primarys data set and can be configured to accept read
operations. Secondaries are eligible to vote in elections, and may be elected to the PRIMARY (page 487) state if
the primary becomes unavailable.
ARBITER
Members in ARBITER (page 487) state do not replicate data or accept write operations. They are eligible to
vote, and exist solely to break a tie during elections. Replica sets should only have a member in the ARBITER
(page 487) state if the set would otherwise have an even number of members, and could suffer from tied elections. Like primaries, there should only be at most one arbiter in any replica set.
See Replica Set Members (page 382) for more information on core states.
Initialization States
STARTUP
Each member of a replica set starts up in STARTUP (page 487) state. mongod (page 925) then loads that
487
members replica set configuration, and transitions the members state to STARTUP2 (page 488). Members in
STARTUP (page 487) are not eligible to vote.
STARTUP2
Each member of a replica set enters the STARTUP2 (page 488) state as soon as mongod (page 925) finishes
loading that members configuration. While in the STARTUP2 (page 488) state, the member creates threads to
handle internal replication operations. Members are in the STARTUP2 (page 488) state for a short period of
time before entering the RECOVERING (page 488) state. Members in the STARTUP2 (page 488) state are not
eligible to vote.
RECOVERING
A member of a replica set enters RECOVERING (page 488) state when it is not ready to accept reads. The
RECOVERING (page 488) state can occur during normal operation, and doesnt necessarily reflect an error
condition. Members in the RECOVERING (page 488) state are eligible to vote in elections, but is not eligible to
enter the PRIMARY (page 487) state.
During startup, members transition through RECOVERING (page 488) after STARTUP2 (page 488) and before
becoming SECONDARY (page 487).
During normal operation, if a secondary falls behind the other members of the replica set, it may need to resync
(page 450) with the rest of the set. While resyncing, the member enters the RECOVERING (page 488) state.
Whenever the replica set replaces a primary in an election, the old primarys data collection may contain documents that did not have time to replicate to the secondary members. In this case the member rolls back those
writes. During rollback (page 401), the member will have RECOVERING (page 488) state.
On secondaries, the compact (page 752) and replSetMaintenance (page 729) commands force the secondary to enter RECOVERING (page 488) state. This prevents clients from reading during those operations.
Error States Members in any error state cant vote.
FATAL
Members that encounter an unrecoverable error enter the FATAL (page 488) state. Members in this state requires
administrator intervention.
UNKNOWN
Members that have never communicated status information to the replica set are in the UNKNOWN (page 488)
state.
DOWN
Members that lose their connection to the replica set enter the DOWN (page 488) state.
SHUNNED
Members that are removed from the replica set enter the SHUNNED (page 488) state.
ROLLBACK
When a SECONDARY (page 487) rolls back a write operation after transitioning from PRIMARY (page 487), it
enters the ROLLBACK (page 488) state. See Rollbacks During Replica Set Failover (page 401).
Read Preference Reference
Read preference describes how MongoDB clients route read operations to members of a replica set.
By default, an application directs its read operations to the primary member in a replica set. Reading from the primary
guarantees that read operations reflect the latest version of a document. However, by distributing some or all reads to
secondary members of the replica set, you can improve read throughput or reduce latency for an application that does
not require fully up-to-date data.
488
Chapter 8. Replication
Read Preference
Description
Mode
primary (page 489)
Default mode. All operations read from the current replica set primary.
primaryPreferred In most situations, operations read from the primary but if it is unavailable, operations
(page 489)
read from secondary members.
secondary
All operations read from the secondary members of the replica set.
(page 489)
secondaryPreferred In most situations, operations read from secondary members but if no secondary
(page 489)
members are available, operations read from the primary.
nearest (page 490)
Operations read from the nearest member of the replica set, irrespective of the
members type.
Read Preference Modes
primary
All read operations use only the current replica set primary. This is the default. If the primary is unavailable,
read operations produce an error or throw an exception.
The primary (page 489) read preference mode is not compatible with read preference modes that use tag sets
(page 407). If you specify a tag set with primary (page 489), the driver will produce an error.
primaryPreferred
In most situations, operations read from the primary member of the set. However, if the primary is unavailable,
as is the case during failover situations, operations read from secondary members.
When the read preference includes a tag set (page 407), the client reads first from the primary, if available, and
then from secondaries that match the specified tags. If no secondaries have matching tags, the read operation
produces an error.
Since the application may receive data from a secondary, read operations using the primaryPreferred
(page 489) mode may return stale data in some situations.
Warning: Changed in version 2.2: mongos (page 938) added full support for read preferences.
When connecting to a mongos (page 938) instance older than 2.2, using a client that supports read preference modes, primaryPreferred (page 489) will send queries to secondaries.
secondary
Operations read only from the secondary members of the set. If no secondaries are available, then this read
operation produces an error or exception.
Most sets have at least one secondary, but there are situations where there may be no available secondary. For
example, a set with a primary, a secondary, and an arbiter may not have any secondaries if a member is in
recovering state or unavailable.
When the read preference includes a tag set (page 407), the client attempts to find secondary members that
match the specified tag set and directs reads to a random secondary from among the nearest group (page 408).
If no secondaries have matching tags, the read operation produces an error. 15
Read operations using the secondary (page 489) mode may return stale data.
secondaryPreferred
In most situations, operations read from secondary members, but in situations where the set consists of a single
primary (and no other members,) the read operation will use the sets primary.
15 If your set has more than one secondary, and you use the secondary (page 489) read preference mode, consider the following effect. If
you have a three member replica set (page 391) with a primary and two secondaries, and if one secondary becomes unavailable, all secondary
(page 489) queries must target the remaining secondary. This will double the load on this secondary. Plan and provide capacity to support this as
needed.
489
When the read preference includes a tag set (page 407), the client attempts to find a secondary member that
matches the specified tag set and directs reads to a random secondary from among the nearest group (page 408).
If no secondaries have matching tags, the read operation produces an error.
Read operations using the secondaryPreferred (page 489) mode may return stale data.
Warning: In some situations using secondaryPreferred (page 489) to distribute read load to replica sets
may carry significant operational risk: if all secondaries are unavailable and your set has enough arbiters to prevent
the primary from stepping down, then the primary will receive all traffic from clients.
For this reason, use secondary (page 489) to distribute read load to replica sets, not secondaryPreferred
(page 489).
nearest
The driver reads from the nearest member of the set according to the member selection (page 408) process.
Reads in the nearest (page 490) mode do not consider the members type. Reads in nearest (page 490)
mode may read from both primaries and secondaries.
Set this mode to minimize the effect of network latency on read operations without preference for current or
stale data.
If you specify a tag set (page 407), the client attempts to find a replica set member that matches the specified
tag set and directs reads to an arbitrary member from among the nearest group (page 408).
Read operations using the nearest (page 490) mode may return stale data.
Note: All operations read from a member of the nearest group of the replica set that matches the specified
read preference mode. The nearest (page 490) mode prefers low latency reads over a members primary or
secondary status.
For nearest (page 490), the client assembles a list of acceptable hosts based on tag set and then narrows that
list to the host with the shortest ping time and all other members of the set that are within the local threshold,
or acceptable latency. See Member Selection (page 408) for more information.
Because some database commands read and return data from the database, all of the official drivers support full read
preference mode semantics (page 489) for the following commands:
group (page 697)
mapReduce (page 701) 16
aggregate (page 694)
collStats (page 763)
dbStats (page 767)
count (page 695)
distinct (page 696)
geoNear (page 708)
geoSearch (page 709)
geoWalk (page 710)
16 Only inline mapReduce (page 701) operations that do not write data support read preference, otherwise these operations must run on the
primary members.
490
Chapter 8. Replication
New in version 2.4: mongos (page 938) adds support for routing commands to shards using read preferences. Previously mongos (page 938) sent all commands to shards primaries.
491
492
Chapter 8. Replication
CHAPTER 9
Sharding
Sharding is the process of storing data records across multiple machines and is MongoDBs approach to meeting the
demands of data growth. As the size of the data increases, a single machine may not be sufficient to store the data nor
provide an acceptable read and write throughput. Sharding solves the problem with horizontal scaling. With sharding,
you add more machines to support data growth and the demands of read and write operations.
Sharding Introduction (page 493) A high-level introduction to horizontal scaling, data partitioning, and sharded
clusters in MongoDB.
Sharding Concepts (page 498) The core documentation of sharded cluster features, configuration, architecture and
behavior.
Sharded Cluster Components (page 499) A sharded cluster consists of shards, config servers, and mongos
(page 938) instances.
Sharded Cluster Architectures (page 503) Outlines the requirements for sharded clusters, and provides examples of several possible architectures for sharded clusters.
Sharded Cluster Behavior (page 505) Discusses the operations of sharded clusters with regards to the automatic balancing of data in a cluster and other related availability and security considerations.
Sharding Mechanics (page 515) Discusses the internal operation and behavior of sharded clusters, including
chunk migration, balancing, and the cluster metadata.
Sharded Cluster Tutorials (page 521) Tutorials that describe common procedures and administrative operations relevant to the use and maintenance of sharded clusters.
Sharding Reference (page 563) Reference for sharding-related functions and operations.
493
Vertical scaling adds more CPU and storage resources to increase capacity. Scaling by adding capacity has limitations: high performance systems with large numbers of CPUs and large amount of RAM are disproportionately
more expensive than smaller systems. Additionally, cloud-based providers may only allow users to provision smaller
instances. As a result there is a practical maximum capability for vertical scaling.
Sharding, or horizontal scaling, by contrast, divides the data set and distributes the data over multiple servers, or
shards. Each shard is an independent database, and collectively, the shards make up a single logical database.
Figure 9.1: Diagram of a large collection with data distributed across 4 shards.
Sharding addresses the challenge of scaling to support high throughput and large data sets:
Sharding reduces the number of operations each shard handles. Each shard processes fewer operations as the
cluster grows. As a result, shared clusters can increase capacity and throughput horizontally.
For example, to insert data, the application only needs to access the shard responsible for that records.
Sharding reduces the amount of data that each server needs to store. Each shard stores less data as the cluster
grows.
For example, if a database has a 1 terabyte data set, and there are 4 shards, then each shard might hold only
256GB of data. If there are 40 shards, then each shard might hold only 25GB of data.
494
Chapter 9. Sharding
Figure 9.2: Diagram of a sample sharded cluster for production purposes. Contains exactly 3 config servers, 2 or more
mongos (page 938) query routers, and at least 2 shards. The shards are replica sets.
Sharded cluster has the following components: shards, query routers and config servers.
Shards store the data. To provide high availability and data consistency, in a production sharded cluster, each shard is
a replica set 1 . For more information on replica sets, see Replica Sets (page 381).
Query Routers, or mongos (page 938) instances, interface with client applications and direct operations to the
appropriate shard or shards. The query router processes and targets operations to shards and then returns results to
the clients. A sharded cluster can contain more than one query router to divide the client request load. A client sends
requests to one query router. Most sharded cluster have many query routers.
Config servers store the clusters metadata. This data contains a mapping of the clusters data set to the shards. The
query router uses this metadata to target operations to specific shards. Production sharded clusters have exactly 3
config servers.
495
Shard Keys
To shard a collection, you need to select a shard key. A shard key is either an indexed field or an indexed compound
field that exists in every document in the collection. MongoDB divides the shard key values into chunks and distributes
the chunks evenly across the shards. To divide the shard key values into chunks, MongoDB uses either range based
partitioning and hash based partitioning. See Shard Keys (page 506) for more information.
Range Based Sharding
For range-based sharding, MongoDB divides the data set into ranges determined by the shard key values to provide
range based partitioning. Consider a numeric shard key: If you visualize a number line that goes from negative
infinity to positive infinity, each value of the shard key falls at some point on that line. MongoDB partitions this line
into smaller, non-overlapping ranges called chunks where a chunk is range of values from some minimum value to
some maximum value.
Given a range based partitioning system, documents with close shard key values are likely to be in the same chunk,
and therefore on the same shard.
Figure 9.3: Diagram of the shard key value space segmented into smaller ranges or chunks.
496
Chapter 9. Sharding
Figure 9.5: Diagram of a shard with a chunk that exceeds the default chunk size (page 519) of 64 MB and triggers a
split of the chunk into two chunks.
497
Balancing
The balancer (page 516) is a background process that manages chunk migrations. The balancer runs in all of the query
routers in a cluster.
When the distribution of a sharded collection in a cluster is uneven, the balancer process migrates chunks from the
shard that has the largest number of chunks to the shard with the least number of chunks until the collection balances.
For example: if collection users has 100 chunks on shard 1 and 50 chunks and shard 2, the balancer will migrate
chunks from shard 1 to shard 2 until the collections achieves balance.
The shards manage chunk migrations as a background operation. During migration, all requests for a chunks data
address the origin shard.
In a chunk migration, the destination shard receives all the documents in the chunk from the origin shard. Then, the
destination shard captures and applies all changes made to the data during migration process. Finally, the destination
shard updates the metadata regarding the location of the on config server.
If theres an error during the migration, the balancer aborts the process leaving the chunk on the origin shard. MongoDB removes the chunks data from the origin shard after the migration completes successfully.
Figure 9.6: Diagram of a collection distributed across three shards. For this collection, the difference in the number of
chunks between the shards reaches the migration thresholds (page 517) (in this case, 2) and triggers migration.
498
Chapter 9. Sharding
Config Servers (page 502) Config servers hold the metadata about the cluster, such as the shard location of the
data.
Sharded Cluster Architectures (page 503) Outlines the requirements for sharded clusters, and provides examples of
several possible architectures for sharded clusters.
Sharded Cluster Requirements (page 503) Discusses the requirements for sharded clusters in MongoDB.
Production Cluster Architecture (page 504) Sharded cluster for production has component requirements to
provide redundancy and high availability.
Sharded Cluster Behavior (page 505) Discusses the operations of sharded clusters with regards to the automatic balancing of data in a cluster and other related availability and security considerations.
Shard Keys (page 506) MongoDB uses the shard key to divide a collections data across the clusters shards.
Sharded Cluster High Availability (page 508) Sharded clusters provide ways to address some availability concerns.
Sharded Cluster Query Routing (page 510) The clusters routers, or mongos instances, send reads and writes
to the relevant shard or shards.
Sharding Mechanics (page 515) Discusses the internal operation and behavior of sharded clusters, including chunk
migration, balancing, and the cluster metadata.
Sharded Collection Balancing (page 516) Balancing distributes a sharded collections data cluster to all of the
shards.
Sharded Cluster Metadata (page 520) The cluster maintains internal metadata that reflects the location of data
within the cluster.
499
500
Chapter 9. Sharding
There is no particular order to the data set on a specific shard. MongoDB does not guarantee that any two contiguous
chunks will reside on a single shard.
Primary Shard
Every database has a primary 2 shard that holds all the un-sharded collections in that database.
Figure 9.8: Diagram of a primary shard. A primary shard contains non-sharded collections as well as chunks of
documents from sharded collections. Shard A is the primary shard.
To change the primary shard for a database, use the movePrimary (page 743) command.
Warning: The movePrimary (page 743) command can be expensive because it copies all non-sharded data to
the new shard. During this time, this data will be unavailable for other operations.
When you deploy a new sharded cluster, the first shard becomes the primary shard for all existing databases before
enabling sharding. Databases created subsequently may reside on any shard in the cluster.
Shard Status
Use the sh.status() (page 910) method in the mongo (page 942) shell to see an overview of the cluster.
This reports includes which shard is primary for the database and the chunk distribution across the shards. See
sh.status() (page 910) method for more details.
2
The term primary shard has nothing to do with the term primary in the context of replica sets.
501
Config Servers
Config servers are special mongod (page 925) instances that store the metadata (page 520) for a sharded cluster.
Config servers use a two-phase commit to ensure immediate consistency and reliability. Config servers do not run
as replica sets. All config servers must be available to deploy a sharded cluster or to make any changes to cluster
metadata.
A production sharded cluster has exactly three config servers. For testing purposes you may deploy a cluster with a
single config server. But to ensure redundancy and safety in production, you should always use three.
Warning: If your cluster has a single config server, then the config server is a single point of failure. If the config
server is inaccessible, the cluster is not accessible. If you cannot recover the data on a config server, the cluster
will be inoperable.
Always use three config servers for production deployments.
Config servers store metadata for a single sharded cluster. Each cluster must have its own config servers.
Tip
Use CNAMEs to identify your config servers to the cluster so that you can rename and renumber your config servers
without downtime.
Config Database
Config servers store the metadata in the config database (page 564). The mongos (page 938) instances cache this data
and use it to route reads and writes to shards.
Read and Write Operations on Config Servers
MongoDB only writes data to the config server in the following cases:
To create splits in existing chunks. For more information, see chunk splitting (page 519).
To migrate a chunk between shards. For more information, see chunk migration (page 518).
MongoDB reads data from the config server data in the following cases:
A new mongos (page 938) starts for the first time, or an existing mongos (page 938) restarts.
After a chunk migration, the mongos (page 938) instances update themselves with the new cluster metadata.
MongoDB also uses the config server to manage distributed locks.
Config Server Availability
If one or two config servers become unavailable, the clusters metadata becomes read only. You can still read and
write data from the shards, but no chunk migrations or splits will occur until all three servers are available.
If all three config servers are unavailable, you can still use the cluster if you do not restart the mongos (page 938)
instances until after the config servers are accessible again. If you restart the mongos (page 938) instances before the
config servers are available, the mongos (page 938) will be unable to route reads and writes.
Clusters become inoperable without the cluster metadata. Always, ensure that the config servers remain available and
intact. As such, backups of config servers are critical. The data on the config server is small compared to the data
502
Chapter 9. Sharding
stored in a cluster. This means the config server has a relatively low activity load, and the config server does not need
to be always available to support a sharded cluster. As a result, it is easy to back up the config servers.
If the name or address that a sharded cluster uses to connect to a config server changes, you must restart every mongod
(page 925) and mongos (page 938) instance in the sharded cluster. Avoid downtime by using CNAMEs to identify
config servers within the MongoDB deployment.
See Renaming Config Servers and Cluster Availability (page 509) for more information.
Your cluster should manage a large quantity of data if sharding is to have an effect. The default chunk size is 64
megabytes. And the balancer (page 516) will not begin moving data across shards until the imbalance of chunks among
the shards exceeds the migration threshold (page 517). In practical terms, unless your cluster has many hundreds of
megabytes of data, your data will remain on a single shard.
In some situations, you may need to shard a small collection of data. But most of the time, sharding a small collection
is not worth the added complexity and overhead unless you need additional write capacity. If you have a small data
503
set, a properly configured single MongoDB instance or a replica set will usually be enough for your persistence layer
needs.
Chunk size is user configurable. For most deployments, the default value is of 64 megabytes is ideal. See
Chunk Size (page 519) for more information.
Production Cluster Architecture
In a production cluster, you must ensure that data is redundant and that your systems are highly available. To that end,
a production cluster must have the following components:
Three config servers (page 502). Each config servers must be on separate machines. A single sharded cluster
must have exclusive use of its config servers (page 502). If you have multiple sharded clusters, you will need to
have a group of config servers for each cluster.
Two or more replica sets. These replica sets are the shards. For information on replica sets, see Replication
(page 377).
One or more mongos (page 938) instances. mongos (page 938) is the routers for the cluster. Typically,
deployments have one mongos (page 938) instance on each application server. You may also may deploy a
group of mongos (page 938) instances and use a proxy/load balancer between the application and the mongos
(page 938).
Figure 9.9: Diagram of a sample sharded cluster for production purposes. Contains exactly 3 config servers, 2 or more
mongos (page 938) query routers, and at least 2 shards. The shards are replica sets.
504
Chapter 9. Sharding
Figure 9.10: Diagram of a sample sharded cluster for testing/development purposes only. Contains only 1 config
server, 1 mongos (page 938) router, and at least 1 shard. The shard can be either a replica set or a standalone
mongod (page 925) instance.
See
Production Cluster Architecture (page 504)
505
Sharded Cluster Query Routing (page 510) The clusters routers, or mongos instances, send reads and writes to the
relevant shard or shards.
Shard Keys
The shard key determines the distribution of the collections documents among the clusters shards. The shard key is
either an indexed field or an indexed compound field that exists in every document in the collection.
MongoDB partitions data in the collection using ranges of shard key values. Each range, or chunk, defines a nonoverlapping range of shard key values. MongoDB distributes the chunks, and their documents, among the shards in
the cluster.
Figure 9.11: Diagram of the shard key value space segmented into smaller ranges or chunks.
When a chunk grows beyond the chunk size (page 519), MongoDB splits the chunk into smaller chunks, always based
on ranges in the shard key.
Important: Shard keys are immutable and cannot be changed after insertion. See the system limits for sharded
cluster (page 1017) for more information.
Chapter 9. Sharding
The shard key affects write and query performance by determining how the MongoDB partitions data in the cluster
and how effectively the mongos (page 938) instances can direct operations to the cluster. Consider the following
operational impacts of shard key selection:
Write Scaling Some possible shard keys will allow your application to take advantage of the increased write capacity
that the cluster can provide, while others do not. Consider the following example where you shard by the values of the
default _id field, which is ObjectID.
MongoDB generates ObjectID values upon document creation to produce a unique identifier for the object. However, the most significant bits of data in this value represent a time stamp, which means that they increment in a regular
and predictable pattern. Even though this value has high cardinality (page 527), when using this, any date, or other
monotonically increasing number as the shard key, all insert operations will be storing data into a single chunk, and
therefore, a single shard. As a result, the write capacity of this shard will define the effective write capacity of the
cluster.
A shard key that increases monotonically will not hinder performance if you have a very low insert rate, or if most
of your write operations are update() (page 849) operations distributed through your entire data set. Generally,
choose shard keys that have both high cardinality and will distribute write operations across the entire cluster.
Typically, a computed shard key that has some amount of randomness, such as ones that include a cryptographic
hash (i.e. MD5 or SHA1) of other content in the document, will allow the cluster to scale write operations. However,
random shard keys do not typically provide query isolation (page 507), which is another important characteristic of
shard keys.
New in version 2.4: MongoDB makes it possible to shard a collection on a hashed index. This can greatly improve
write scaling. See Shard a Collection Using a Hashed Shard Key (page 528).
Querying The mongos (page 938) provides an interface for applications to interact with sharded clusters that hides
the complexity of data partitioning. A mongos (page 938) receives queries from applications, and uses metadata
from the config server (page 502), to route queries to the mongod (page 925) instances with the appropriate data.
While the mongos (page 938) succeeds in making all querying operational in sharded environments, the shard key
you select can have a profound affect on query performance.
See also:
The Sharded Cluster Query Routing (page 510) and config server (page 502) sections for a more general overview of
querying in sharded environments.
Query Isolation The fastest queries in a sharded environment are those that mongos (page 938) will route to a
single shard, using the shard key and the cluster meta data from the config server (page 502). For queries that dont
include the shard key, mongos (page 938) must query all shards, wait for their response and then return the result to
the application. These scatter/gather queries can be long running operations.
If your query includes the first component of a compound shard key 3 , the mongos (page 938) can route the query
directly to a single shard, or a small number of shards, which provides better performance. Even if you query values
of the shard key reside in different chunks, the mongos (page 938) will route queries directly to specific shards.
To select a shard key for a collection:
3 In many ways, you can think of the shard key a cluster-wide unique index. However, be aware that sharded systems cannot enforce clusterwide unique indexes unless the unique field is in the shard key. Consider the Index Concepts (page 318) page for more information on indexes and
compound indexes.
507
determine the most commonly included fields in queries for a given application
find which of these operations are most performance dependent.
If this field has low cardinality (i.e not sufficiently selective) you should add a second field to the shard key making a
compound shard key. The data may become more splittable with a compound shard key.
See
Sharded Cluster Query Routing (page 510) for more information on query operations in the context of sharded clusters.
Sorting In sharded systems, the mongos (page 938) performs a merge-sort of all sorted query results from the
shards. See Sharded Cluster Query Routing (page 510) and Use Indexes to Sort Query Results (page 370) for more
information.
Sharded Cluster High Availability
A production (page 504) cluster has no single point of failure. This section introduces the availability concerns for
MongoDB deployments in general and highlights potential failure scenarios and available resolutions.
Application Servers or mongos Instances Become Unavailable
If each application server has its own mongos (page 938) instance, other application servers can continue access the
database. Furthermore, mongos (page 938) instances do not maintain persistent state, and they can restart and become
unavailable without loosing any state or data. When a mongos (page 938) instance starts, it retrieves a copy of the
config database and can begin routing queries.
A Single mongod Becomes Unavailable in a Shard
Replica sets (page 377) provide high availability for shards. If the unavailable mongod (page 925) is a primary, then
the replica set will elect (page 397) a new primary. If the unavailable mongod (page 925) is a secondary, and it
disconnects the primary and secondary will continue to hold all data. In a three member replica set, even if a single
member of the set experiences catastrophic failure, two other members have full copies of the data. 4
Always investigate availability interruptions and failures. If a system is unrecoverable, replace it and create a new
member of the replica set as soon as possible to replace the lost redundancy.
All Members of a Replica Set Become Unavailable
If all members of a replica set within a shard are unavailable, all data held in that shard is unavailable. However, the
data on all other shards will remain available, and its possible to read and write data to the other shards. However,
your application must be able to deal with partial results, and you should investigate the cause of the interruption and
attempt to recover the shard as soon as possible.
4 If an unavailable secondary becomes available while it still has current oplog entries, it can catch up to the latest state of the set using the
normal replication process, otherwise it must perform an initial sync.
508
Chapter 9. Sharding
Three distinct mongod (page 925) instances provide the config database using a special two-phase commits to maintain consistent state between these mongod (page 925) instances. Cluster operation will continue as normal but chunk
migration (page 516) and the cluster can create no new chunk splits (page 548). Replace the config server as soon as
possible. If all multiple config databases become unavailable, the cluster can become inoperable.
Note: All config servers must be running and available when you first initiate a sharded cluster.
If the name or address that a sharded cluster uses to connect to a config server changes, you must restart every mongod
(page 925) and mongos (page 938) instance in the sharded cluster. Avoid downtime by using CNAMEs to identify
config servers within the MongoDB deployment.
To avoid downtime when renaming config servers, use DNS names unrelated to physical or virtual hostnames to refer
to your config servers (page 502).
Generally, refer to each config server using a name in DNS (e.g. a CNAME record). When specifying the config server
connection string to mongos (page 938), DNS use this name. These records make it possible to renumber or rename
config servers without changing the connection string and without having to restart the entire cluster.
Shard Keys and Cluster Availability
509
See also:
Enable Authentication in a Sharded Cluster (page 528).
Access Control Privileges in Sharded Clusters
In sharded clusters, MongoDB provides separate administrative privileges for the sharded cluster and for each shard.
Sharded Cluster Authentication. When connected to a mongos (page 938), you can grant access to the sharded
clusters admin database. 5 These credentials reside on the config servers.
Users can access to the cluster according to their permissions (page 265). To receive privileges for the cluster,
you must authenticate while connected to a mongos (page 938) instance.
Shard Server Authentication. To allow administrators to connect and authenticate directly to specific shards, create
users in the admin database on the mongod (page 925) instance, or replica set, that provide each shard.
These users only have access to a single shard and are completely distinct from the cluster-wide credentials.
Important: Always connect and authenticate to sharded clusters via a mongos (page 938) instance.
Beyond these proprieties, privileges for sharded clusters are functionally the same as any other MongoDB deployment.
See Access Control (page 237) for more information.
Access a Sharded Cluster with Authentication
To access a sharded cluster as an authenticated user, from the command line, use the authentication options when
connecting to a mongos (page 938). Or, you can connect first and then authenticate with the authenticate
(page 725) command or the db.auth() (page 876) method.
To close an authenticated session, see the logout (page 724) command.
Restriction on localhost Interface
Sharded clusters have restrictions on the use of localhost interface. If the host identifier for a MongoDB instance
is either localhost or 127.0.0.1, then you must use localhost or 127.0.0.1 to identify all MongoDB
instances in a deployment.This applies to the host argument to the addShard (page 736) command as well as to
the --configdb (page 940) option for the mongos (page 938). If you mix localhost addresses with remote host
address, sharded clusters will not function correctly.
Sharded Cluster Query Routing
MongoDB mongos (page 938) instances route queries and write operations to shards in a sharded cluster. mongos
(page 938) provide the only interface to a sharded cluster from the perspective of applications. Applications never
connect or communicate directly with the shards.
The mongos (page 938) tracks what data is on which shard by caching the metadata from the config servers
(page 502). The mongos (page 938) uses the metadata to route operations from applications and clients to the
mongod (page 925) instances. A mongos (page 938) has no persistent state and consumes minimal system resources.
5 Credentials for databases other than the admin database reside in the mongod (page 925) instance (or sharded cluster) that is the primary
shard (page 501) for that database.
510
Chapter 9. Sharding
The most common practice is to run mongos (page 938) instances on the same systems as your application servers,
but you can maintain mongos (page 938) instances on the shards or on other dedicated resources.
Note: Changed in version 2.1.
Some aggregation operations using the aggregate (page 694) command (i.e. db.collection.aggregate()
(page 808)) will cause mongos (page 938) instances to require more CPU resources than in previous versions. This
modified performance profile may dictate alternate architecture decisions if you use the aggregation framework extensively in a sharded environment.
Routing Process
A mongos (page 938) instance uses the following processes to route queries and return results.
How mongos Determines which Shards Receive a Query A mongos (page 938) instance routes a query to a
cluster by:
1. Determining the list of shards that must receive the query.
2. Establishing a cursor on all targeted shards.
In some cases, when the shard key or a prefix of the shard key is a part of the query, the mongos (page 938) can route
the query to a subset of the shards. Otherwise, the mongos (page 938) must direct the query to all shards that hold
documents for that collection.
Example
Given the following shard key:
{ zipcode: 1, u_id: 1, c_date: 1 }
Depending on the distribution of chunks in the cluster, the mongos (page 938) may be able to target the query at a
subset of shards, if the query contains the following fields:
{ zipcode: 1 }
{ zipcode: 1, u_id: 1 }
{ zipcode: 1, u_id: 1, c_date: 1 }
How mongos Handles Query Modifiers If the result of the query is not sorted, the mongos (page 938) instance
opens a result cursor that round robins results from all cursors on the shards.
Changed in version 2.0.5: In versions prior to 2.0.5, the mongos (page 938) exhausted each cursor, one by one.
If the query specifies sorted results using the sort() (page 872) cursor method, the mongos (page 938) instance
passes the $orderby (page 691) option to the shards. When the mongos (page 938) receives results it performs an
incremental merge sort of the results while returning them to the client.
If the query limits the size of the result set using the limit() (page 867) cursor method, the mongos (page 938)
instance passes that limit to the shards and then re-applies the limit to the result before returning the result to the client.
If the query specifies a number of records to skip using the skip() (page 871) cursor method, the mongos (page 938)
cannot pass the skip to the shards, but rather retrieves unskipped results from the shards and skips the appropriate
number of documents when assembling the complete result. However, when used in conjunction with a limit()
(page 867), the mongos (page 938) will pass the limit plus the value of the skip() (page 871) to the shards to
improve the efficiency of these operations.
511
To detect if the MongoDB instance that your client is connected to is mongos (page 938), use the isMaster
(page 732) command. When a client connects to a mongos (page 938), isMaster (page 732) returns a document
with a msg field that holds the string isdbgrid. For example:
{
"ismaster" : true,
"msg" : "isdbgrid",
"maxBsonObjectSize" : 16777216,
"ok" : 1
}
If the application is instead connected to a mongod (page 925), the returned document does not include the
isdbgrid string.
Broadcast Operations and Targeted Operations
The mongos (page 938) program can route queries that include the full shard key or either of the following shard key
prefixes at a specific shard or set of shards:
{ a: 1 }
{ a: 1, b: 1 }
512
Chapter 9. Sharding
Figure 9.12: Read operations to a sharded cluster. Query criteria does not include the shard key. The query router
mongos must broadcast query to all shards for the collection.
513
Figure 9.13: Read operations to a sharded cluster. Query criteria includes the shard key. The query router mongos
can target the query to the appropriate shard or shards.
514
Chapter 9. Sharding
Depending on the distribution of data in the cluster and the selectivity of the query, mongos (page 938) may still have
to contact multiple shards 6 to fulfill these queries.
Sharded and Non-Sharded Data
Sharding operates on the collection level. You can shard multiple collections within a database or have multiple
databases with sharding enabled. 7 However, in production deployments, some databases and collections will use
sharding, while other databases and collections will only reside on a single shard.
Figure 9.14: Diagram of a primary shard. A primary shard contains non-sharded collections as well as chunks of
documents from sharded collections. Shard A is the primary shard.
Regardless of the data architecture of your sharded cluster, ensure that all queries and operations use the mongos
router to access the data cluster. Use the mongos (page 938) even for operations that do not impact the sharded data.
mongos (page 938) will route some queries, even some that include the shard key, to all shards, if needed.
As you configure sharding, you will use the enableSharding (page 736) command to enable sharding for a database. This simply makes
it possible to use the shardCollection (page 738) command on a collection within that database.
7
515
Figure 9.15: Diagram of applications/drivers issuing queries to mongos for unsharded collection as well as sharded
collection. Config servers not shown.
Chunk Splits in a Sharded Cluster (page 519) When a chunk grows beyond the configured size, MongoDB splits the
chunk in half.
Shard Key Indexes (page 520) Sharded collections must keep an index that starts with the shard key.
Sharded Cluster Metadata (page 520) The cluster maintains internal metadata that reflects the location of data within
the cluster.
Sharded Collection Balancing
Balancing is the process MongoDB uses to distribute data of a sharded collection evenly across a sharded cluster.
When a shard has too many of a sharded collections chunks compared to other shards, MongoDB automatically
balances the the chunks across the shards. The balancing procedure for sharded clusters is entirely transparent to the
user and application layer.
Cluster Balancer
The balancer process is responsible for redistributing the chunks of a sharded collection evenly among the shards for
every sharded collection. By default, the balancer process is always running.
Any mongos (page 938) instance in the cluster can start a balancing round. When a balancer process is active, the
responsible mongos (page 938) acquires a lock by modifying a document in the lock collection in the Config
Database (page 564).
Note: Changed in version 2.0: Before MongoDB version 2.0, large differences in timekeeping (i.e. clock skew)
between mongos (page 938) instances could lead to failed distributed locks. This carries the possibility of data loss,
particularly with skews larger than 5 minutes. Always use the network time protocol (NTP) by running ntpd on your
servers to minimize clock skew.
516
Chapter 9. Sharding
To address uneven chunk distribution for a sharded collection, the balancer migrates chunks (page 518) from shards
with more chunks to shards with a fewer number of chunks. The balancer migrates the chunks, one at a time, until
there is an even dispersion of chunks for the collection across the shards.
Chunk migrations carry some overhead in terms of bandwidth and workload, both of which can impact database
performance. The balancer attempts to minimize the impact by:
Moving only one chunk at a time.
Starting a balancing round only when the difference in the number of chunks between the shard with the greatest
number of chunks for a sharded collection and the shard with the lowest number of chunks for that collection
reaches the migration threshold (page 517).
You may disable the balancer temporarily for maintenance. See Disable the Balancer (page 552) for details.
You can also limit the window during which the balancer runs to prevent it from impacting production traffic. See
Schedule the Balancing Window (page 551) for details.
Note: The specification of the balancing window is relative to the local time zone of all individual mongos (page 938)
instances in the cluster.
See also:
Manage Sharded Cluster Balancer (page 551).
Migration Thresholds
To minimize the impact of balancing on the cluster, the balancer will not begin balancing until the distribution of
chunks for a sharded collection has reached certain thresholds. The thresholds apply to the difference in number
of chunks between the shard with the most chunks for the collection and the shard with the fewest chunks for that
collection. The balancer has the following thresholds:
Changed in version 2.2: The following thresholds appear first in 2.2. Prior to this release, a balancing round would
only start if the shard with the most chunks had 8 more chunks than the shard with the least number of chunks.
Number of Chunks
Less than 20
21-80
Greater than 80
Migration Threshold
2
4
8
Once a balancing round starts, the balancer will not stop until, for the collection, the difference between the number
of chunks on any two shards for that collection is less than two or a chunk migration fails.
Shard Size
By default, MongoDB will attempt to fill all available disk space with data on every shard as the data set grows. To
ensure that the cluster always has the capacity to handle data growth, monitor disk usage as well as other performance
metrics.
When adding a shard, you may set a maximum size for that shard. This prevents the balancer from migrating chunks
to the shard when the value of mapped (page 787) exceeds the maximum size. Use the maxSize parameter of the
addShard (page 736) command to set the maximum size for the shard.
See also:
Change the Maximum Storage Size for a Given Shard (page 549) and Monitoring for MongoDB (page 138).
517
Figure 9.16: Diagram of a collection distributed across three shards. For this collection, the difference in the number
of chunks between the shards reaches the migration thresholds (page 517) (in this case, 2) and triggers migration.
Chunk Migration
MongoDB migrates chunks in a sharded cluster to distribute the chunks of a sharded collection evenly among shards.
Migrations may be either:
Manual. Only use manual migration in limited cases, such as to distribute data during bulk inserts. See Migrating
Chunks Manually (page 546) for more details.
Automatic. The balancer (page 516) process automatically migrates chunks when there is an uneven distribution
of a sharded collections chunks across the shards. See Migration Thresholds (page 517) for more details.
All chunk migrations use the following procedure:
1. The balancer process sends the moveChunk (page 742) command to the source shard.
2. The source starts the move with an internal moveChunk (page 742) command. During the migration process,
operations to the chunk route to the source shard. The source shard is responsible for incoming write operations
for the chunk.
3. The destination shard begins requesting documents in the chunk and starts receiving copies of the data.
4. After receiving the final document in the chunk, the destination shard starts a synchronization process to ensure
that it has the changes to the migrated documents that occurred during the migration.
5. When fully synchronized, the destination shard connects to the config database and updates the cluster metadata
with the new location for the chunk.
6. After the destination shard completes the update of the metadata, and once there are no open cursors on the
chunk, the source shard deletes its copy of the documents.
The migration process ensures consistency and maximizes the availability of chunks during balancing.
Changed in version 2.4: While copying and deleting data during migrations, the balancer waits for replication to
secondaries. See Secondary Throttle in the v2.2 Manual8 for details.
8 http://docs.mongodb.org/v2.2/tutorial/configure-sharded-cluster-balancer/#sharded-cluster-config-secondary-throttle
518
Chapter 9. Sharding
Figure 9.17: Diagram of a shard with a chunk that exceeds the default chunk size (page 519) of 64 MB and triggers a
split of the chunk into two chunks.
Chunk Size
The default chunk size in MongoDB is 64 megabytes. You can increase or reduce the chunk size (page 547), mindful
of its effect on the clusters efficiency.
1. Small chunks lead to a more even distribution of data at the expense of more frequent migrations. This creates
expense at the query routing (mongos (page 938)) layer.
2. Large chunks lead to fewer migrations. This is more efficient both from the networking perspective and in terms
of internal overhead at the query routing layer. But, these efficiencies come at the expense of a potentially more
uneven distribution of data.
For many deployments, it makes sense to avoid frequent and potentially spurious migrations at the expense of a slightly
less evenly distributed data set.
Limitations
Changing the chunk size affects when chunks split but there are some limitations to its effects.
Automatic splitting only occurs during inserts or updates. If you lower the chunk size, it may take time for all
chunks to split to the new size.
Splits cannot be undone. If you increase the chunk size, existing chunks must grow through inserts or updates
until they reach the new size.
Note: Chunk ranges are inclusive of the lower boundary and exclusive of the upper boundary.
519
1, username:
1 }:
2. When MongoDB finishes building the index, you can safely drop the existing index on { zipcode:
1 }:
db.people.dropIndex( { zipcode: 1 } );
Since the index on the shard key cannot be a multikey index, the index { zipcode: 1, username:
can only replace the index { zipcode: 1 } if there are no array values for the username field.
1 }
If you drop the last valid index for the shard key, recover by recreating an index on just the shard key.
For restrictions on shard key indexes, see Shard Key Limitations (page 1017).
Sharded Cluster Metadata
Config servers (page 502) store the metadata for a sharded cluster. The metadata reflects state and organization of the
sharded data sets and system. The metadata includes the list of chunks on every shard and the ranges that define the
chunks. The mongos (page 938) instances cache this data and use it to route read and write operations to shards.
Config servers store the metadata in the Config Database (page 564).
Important: Always back up the config database before doing any maintenance on the config server.
To access the config database, issue the following command from the mongo (page 942) shell:
use config
In general, you should never edit the content of the config database directly. The config database contains the
following collections:
changelog (page 565)
chunks (page 566)
collections (page 567)
databases (page 567)
lockpings (page 568)
locks (page 568)
mongos (page 568)
520
Chapter 9. Sharding
521
Deploy a Sharded Cluster (page 522) Set up a sharded cluster by creating the needed data directories, starting the
required MongoDB instances, and configuring the cluster settings.
Considerations for Selecting Shard Keys (page 526) Choose the field that MongoDB uses to parse a collections documents for distribution over the clusters shards. Each shard holds documents with values within a certain range.
Shard a Collection Using a Hashed Shard Key (page 528) Shard a collection based on hashes of a fields values in
order to ensure even distribution over the collections shards.
Enable Authentication in a Sharded Cluster (page 528) Control access to a sharded cluster through a key file and
the keyFile setting on each of the clusters components.
Add Shards to a Cluster (page 529) Add a shard to add capacity to a sharded cluster.
Convert a Replica Set to a Replicated Sharded Cluster (page 530) Convert a replica set to a sharded cluster in which
each shard is its own replica set.
Deploy a Sharded Cluster
The config server processes are mongod (page 925) instances that store the clusters metadata. You designate a
mongod (page 925) as a config server using the --configsvr option. Each config server stores a complete copy
of the clusters metadata.
In production deployments, you must deploy exactly three config server instances, each running on different servers
to assure good uptime and data safety. In test environments, you can run all three instances on a single server.
Important: All members of a sharded cluster must be able to connect to all other members of a sharded cluster,
including all shards and all config servers. Ensure that the network and security systems including all interfaces and
firewalls, allow these connections.
1. Create data directories for each of the three config server instances. By default, a config server stores its data
files in the /data/configdb directory. You can choose a different location. To create a data directory, issue a
command similar to the following:
mkdir /data/configdb
522
Chapter 9. Sharding
2. Start the three config server instances. Start each by issuing a command using the following syntax:
mongod --configsvr --dbpath <path> --port <port>
The default port for config servers is 27019. You can specify a different port. The following example starts a
config server using the default port and default data directory:
mongod --configsvr --dbpath /data/configdb --port 27019
For additional command options, see mongod (page 925) or Configuration File Options (page 990).
Note: All config servers must be running and available when you first initiate a sharded cluster.
The mongos (page 938) instances are lightweight and do not require data directories. You can run a mongos
(page 938) instance on a system that runs other cluster components, such as on an application server or a server
running a mongod (page 925) process. By default, a mongos (page 938) instance runs on port 27017.
When you start the mongos (page 938) instance, specify the hostnames of the three config servers, either in the
configuration file or as command line parameters.
Tip
To avoid downtime, give each config server a logical DNS name (unrelated to the servers physical or virtual hostname). Without logical DNS names, moving or renaming a config server requires shutting down every mongod
(page 925) and mongos (page 938) instance in the sharded cluster.
To start a mongos (page 938) instance, issue a command using the following syntax:
mongos --configdb <config server hostnames>
For example, to start a mongos (page 938) that connects to config server instance running on the following hosts and
on the default ports:
cfg0.example.net
cfg1.example.net
cfg2.example.net
You would issue the following command:
mongos --configdb cfg0.example.net:27019,cfg1.example.net:27019,cfg2.example.net:27019
Each mongos (page 938) in a sharded cluster must use the same configdb (page 1001) string, with identical host
names listed in identical order.
If you start a mongos (page 938) instance with a string that does not exactly match the string used by the other
mongos (page 938) instances in the cluster, the mongos (page 938) fails and you receive the Config Database String
Error (page 561) error.
Add Shards to the Cluster
A shard can be a standalone mongod (page 925) or a replica set. In a production environment, each shard should be
a replica set.
523
1. From a mongo (page 942) shell, connect to the mongos (page 938) instance. Issue a command using the
following syntax:
mongo --host <hostname of machine running mongos> --port <port mongos listens on>
For example, if a mongos (page 938) is accessible at mongos0.example.net on port 27017, issue the
following command:
mongo --host mongos0.example.net --port 27017
2. Add each shard to the cluster using the sh.addShard() (page 903) method, as shown in the examples below.
Issue sh.addShard() (page 903) separately for each shard. If the shard is a replica set, specify the name of
the replica set and specify a member of the set. In production deployments, all shards should be replica sets.
Optional
You can instead use the addShard (page 736) database command, which lets you specify a name and maximum
size for the shard. If you do not specify these, MongoDB automatically assigns a name and maximum size. To
use the database command, see addShard (page 736).
The following are examples of adding a shard with sh.addShard() (page 903):
To add a shard for a replica set named rs1 with a member running on port 27017 on
mongodb0.example.net, issue the following command:
sh.addShard( "rs1/mongodb0.example.net:27017" )
sh.addShard( "rs1/mongodb0.example.net:27017,mongodb1.example.net:27017,mongodb2.example.net
To add a shard for a standalone mongod (page 925) on port 27017 of mongodb0.example.net,
issue the following command:
sh.addShard( "mongodb0.example.net:27017" )
Note: It might take some time for chunks to migrate to the new shard.
Before you can shard a collection, you must enable sharding for the collections database. Enabling sharding for a
database does not redistribute data but make it possible to shard the collections in that database.
Once you enable sharding for a database, MongoDB assigns a primary shard for that database where MongoDB stores
all data before sharding begins.
1. From a mongo (page 942) shell, connect to the mongos (page 938) instance. Issue a command using the
following syntax:
mongo --host <hostname of machine running mongos> --port <port mongos listens on>
2. Issue the sh.enableSharding() (page 905) method, specifying the name of the database for which to
enable sharding. Use the following syntax:
524
Chapter 9. Sharding
sh.enableSharding("<database>")
Optionally, you can enable sharding for a database using the enableSharding (page 736) command, which uses
the following syntax:
db.runCommand( { enableSharding: <database> } )
Replace the <database>.<collection> string with the full namespace of your database, which consists
of the name of your database, a dot (e.g. .), and the full name of the collection. The shard-key-pattern
represents your shard key, which you specify in the same form as you would an index (page 814) key pattern.
Example
The following sequence of commands shards four collections:
sh.shardCollection("records.people", { "zipcode": 1, "name": 1 } )
sh.shardCollection("people.addresses", { "state": 1, "_id": 1 } )
sh.shardCollection("assets.chairs", { "type": 1, "_id": 1 } )
db.alerts.ensureIndex( { _id : "hashed" } )
sh.shardCollection("events.alerts", { "_id": "hashed" } )
1, "name":
This shard key distributes documents by the value of the zipcode field. If a number of documents have
the same value for this field, then that chunk will be splittable (page 527) by the values of the name field.
(b) The addresses collection in the people database using the shard key { "state":
1 }.
1, "_id":
This shard key distributes documents by the value of the state field. If a number of documents have the
same value for this field, then that chunk will be splittable (page 527) by the values of the _id field.
(c) The chairs collection in the assets database using the shard key { "type":
}.
1, "_id":
This shard key distributes documents by the value of the type field. If a number of documents have the
same value for this field, then that chunk will be splittable (page 527) by the values of the _id field.
525
(d) The alerts collection in the events database using the shard key { "_id":
"hashed" }.
For many collections there may be no single, naturally occurring key that possesses all the qualities of a good shard
key. The following strategies may help construct a useful shard key from existing data:
1. Compute a more ideal shard key in your application layer, and store this in all of your documents, potentially in
the _id field.
2. Use a compound shard key that uses two or three values from all documents that provide the right mix of
cardinality with scalable write operations and query isolation.
3. Determine that the impact of using a less than ideal shard key is insignificant in your use case, given:
limited write volume,
expected data size, or
application query patterns.
4. New in version 2.4: Use a hashed shard key. Choose a field that has high cardinality and create a hashed index
(page 342) on that field. MongoDB uses these hashed index values as shard key values, which ensures an even
distribution of documents across the shards.
Tip
MongoDB automatically computes the hashes when resolving queries using hashed indexes. Applications do
not need to compute hashes.
Considerations for Selecting Shard Key Choosing the correct shard key can have a great impact on the performance, capability, and functioning of your database and cluster. Appropriate shard key choice depends on the schema
of your data and the way that your applications query and write data.
Create a Shard Key that is Easily Divisible
An easily divisible shard key makes it easy for MongoDB to distribute content among the shards. Shard keys that have
a limited number of possible values can result in chunks that are unsplittable.
See also:
Cardinality (page 527)
Create a Shard Key that has High Degree of Randomness
A shard key with high degree of randomness prevents any single shard from becoming a bottleneck and will distribute
write operations among the cluster.
526
Chapter 9. Sharding
See also:
Write Scaling (page 507)
Create a Shard Key that Targets a Single Shard
A shard key that targets a single shard makes it possible for the mongos program to return most query operations
directly from a single specific mongod instance. Your shard key should be the primary field used by your queries.
Fields with a high degree of randomness make it difficult to target operations to specific shards.
See also:
Query Isolation (page 507)
Shard Using a Compound Shard Key
The challenge when selecting a shard key is that there is not always an obvious choice. Often, an existing field in your
collection may not be the optimal key. In those situations, computing a special purpose shard key into an additional
field or using a compound shard key may help produce one that is more ideal.
Cardinality
Cardinality in the context of MongoDB, refers to the ability of the system to partition data into chunks. For example,
consider a collection of data such as an address book that stores address records:
Consider the use of a state field as a shard key:
The state keys value holds the US state for a given address document. This field has a low cardinality as all
documents that have the same value in the state field must reside on the same shard, even if a particular states
chunk exceeds the maximum chunk size.
Since there are a limited number of possible values for the state field, MongoDB may distribute data unevenly
among a small number of fixed chunks. This may have a number of effects:
If MongoDB cannot split a chunk because all of its documents have the same shard key, migrations involving these un-splittable chunks will take longer than other migrations, and it will be more difficult for your
data to stay balanced.
If you have a fixed maximum number of chunks, you will never be able to use more than that number of
shards for this collection.
Consider the use of a zipcode field as a shard key:
While this field has a large number of possible values, and thus has potentially higher cardinality, its possible
that a large number of users could have the same value for the shard key, which would make this chunk of users
un-splittable.
In these cases, cardinality depends on the data. If your address book stores records for a geographically distributed contact list (e.g. Dry cleaning businesses in America,) then a value like zipcode would be sufficient.
However, if your address book is more geographically concentrated (e.g ice cream stores in Boston Massachusetts,) then you may have a much lower cardinality.
Consider the use of a phone-number field as a shard key:
Phone number has a high cardinality, because users will generally have a unique value for this field, MongoDB
will be able to split as many chunks as needed.
527
While high cardinality, is necessary for ensuring an even distribution of data, having a high cardinality does not
guarantee sufficient query isolation (page 507) or appropriate write scaling (page 507).
Shard a Collection Using a Hashed Shard Key
New in version 2.4.
Hashed shard keys (page 506) use a hashed index (page 342) of a field as the shard key to partition data across your
sharded cluster.
For suggestions on choosing the right field as your hashed shard key, see Hashed Shard Keys (page 506). For limitations on hashed indexes, see Create a Hashed Index (page 342).
Note: If chunk migrations are in progress while creating a hashed shard key collection, the initial chunk distribution
may be uneven until the balancer automatically balances the collection.
To shard a collection using a hashed shard key, use an operation in the mongo (page 942) that resembles the following:
sh.shardCollection( "records.active", { a: "hashed" } )
This operation shards the active collection in the records database, using a hash of the a field as the shard key.
Specify the Initial Number of Chunks
If you shard an empty collection using a hashed shard key, MongoDB automatically creates and migrates empty chunks
so that each shard has two chunks. To control how many chunks MongoDB creates when sharding the collection, use
shardCollection (page 738) with the numInitialChunks parameter.
Important: MongoDB 2.4 adds support for hashed shard keys. After sharding a collection with a hashed shard key,
you must use the MongoDB 2.4 or higher mongos (page 938) and mongod (page 925) instances in your sharded
cluster.
Warning: MongoDB hashed indexes truncate floating point numbers to 64-bit integers before hashing. For
example, a hashed index would store the same value for a field that held a value of 2.3, 2.2, and 2.9. To
prevent collisions, do not use a hashed index for floating point numbers that cannot be consistently converted to
64-bit integers (and then back to floating point). MongoDB hashed indexes do not support floating point values
larger than 253 .
528
Chapter 9. Sharding
Procedure
When starting the component, set --keyFile option, which is an option for both mongos (page 938)
instances and mongod (page 925) instances. Set the --keyFile to the key files path.
Note: The keyFile (page 993) setting implies auth (page 993), which means in most cases you do not need
to set auth (page 993) explicitly.
3. Add the first administrative user and then add subsequent users. See Create a User Administrator (page 255).
Add Shards to a Cluster
You add shards to a sharded cluster after you create the cluster or anytime that you need to add capacity to the cluster.
If you have not created a sharded cluster, see Deploy a Sharded Cluster (page 522).
When adding a shard to a cluster, you should always ensure that the cluster has enough capacity to support the
migration without affecting legitimate production traffic.
In production environments, all shards should be replica sets.
Add a Shard to a Cluster
You interact with a sharded cluster by connecting to a mongos (page 938) instance.
1. From a mongo (page 942) shell, connect to the mongos (page 938) instance. For example, if a mongos
(page 938) is accessible at mongos0.example.net on port 27017, issue the following command:
mongo --host mongos0.example.net --port 27017
2. Add a shard to the cluster using the sh.addShard() (page 903) method, as shown in the examples below.
Issue sh.addShard() (page 903) separately for each shard. If the shard is a replica set, specify the name of
the replica set and specify a member of the set. In production deployments, all shards should be replica sets.
Optional
You can instead use the addShard (page 736) database command, which lets you specify a name and maximum
size for the shard. If you do not specify these, MongoDB automatically assigns a name and maximum size. To
use the database command, see addShard (page 736).
The following are examples of adding a shard with sh.addShard() (page 903):
529
To add a shard for a replica set named rs1 with a member running on port 27017 on
mongodb0.example.net, issue the following command:
sh.addShard( "rs1/mongodb0.example.net:27017" )
sh.addShard( "rs1/mongodb0.example.net:27017,mongodb1.example.net:27017,mongodb2.example.net
To add a shard for a standalone mongod (page 925) on port 27017 of mongodb0.example.net,
issue the following command:
sh.addShard( "mongodb0.example.net:27017" )
Note: It might take some time for chunks to migrate to the new shard.
Following this tutorial, you will convert a single 3-member replica set to a cluster that consists of 2 shards. Each shard
will consist of an independent 3-member replica set.
The tutorial uses a test environment running on a local system UNIX-like system. You should feel encouraged to
follow along at home. If you need to perform this process in a production environment, notes throughout the
document indicate procedural differences.
The procedure, from a high level, is as follows:
1. Create or select a 3-member replica set and insert some data into a collection.
2. Start the config databases and create a cluster with a single shard.
3. Create a second replica set with three new mongod (page 925) instances.
4. Add the second replica set as a shard in the cluster.
5. Enable sharding on the desired collection or collections.
Process
Install MongoDB according to the instructions in the MongoDB Installation Tutorial (page 3).
Deploy a Replica Set with Test Data If have an existing MongoDB replica set deployment, you can omit the this
step and continue from Deploy Sharding Infrastructure (page 532).
Use the following sequence of steps to configure and deploy a replica set and to insert test data.
1. Create the following directories for the first replica set instance, named firstset:
/data/example/firstset1
/data/example/firstset2
/data/example/firstset3
530
Chapter 9. Sharding
2. In a separate terminal window or GNU Screen window, start three mongod (page 925) instances by running
each of the following commands:
mongod --dbpath /data/example/firstset1 --port 10001 --replSet firstset --oplogSize 700 --rest
mongod --dbpath /data/example/firstset2 --port 10002 --replSet firstset --oplogSize 700 --rest
mongod --dbpath /data/example/firstset3 --port 10003 --replSet firstset --oplogSize 700 --rest
Note: The --oplogSize 700 option restricts the size of the operation log (i.e. oplog) for each mongod
(page 925) instance to 700MB. Without the --oplogSize option, each mongod (page 925) reserves approximately 5% of the free disk space on the volume. By limiting the size of the oplog, each instance starts more
quickly. Omit this setting in production environments.
3. In a mongo (page 942) shell session in a new terminal, connect to the mongodb instance on port 10001 by
running the following command. If you are in a production environment, first read the note below.
mongo localhost:10001/admin
Note: Above and hereafter, if you are running in a production environment or are testing this process with
mongod (page 925) instances on multiple systems, replace localhost with a resolvable domain, hostname, or
the IP address of your system.
4. In the mongo (page 942) shell, initialize the first replica set by issuing the following command:
db.runCommand({"replSetInitiate" :
{"_id" : "firstset", "members" : [{"_id" : 1, "host"
{"_id" : 2, "host"
{"_id" : 3, "host"
]}})
{
"info" : "Config now saved locally. Should come online in about
"ok" : 1
}
: "localhost:10001"},
: "localhost:10002"},
: "localhost:10003"}
a minute.",
5. In the mongo (page 942) shell, create and populate a new collection by issuing the following sequence of
JavaScript operations:
use test
switched to db test
people = ["Marc", "Bill", "George", "Eliot", "Matt", "Trey", "Tracy", "Greg", "Steve", "Kristina
for(var i=0; i<1000000; i++){
name = people[Math.floor(Math.random()*people.length)];
user_id = i;
boolean = [true, false][Math.floor(Math.random()*2)];
added_at = new Date();
number = Math.floor(Math.random()*10001);
db.test_collection.save({"name":name, "user_id":user_id, "boolean":
}
The above operations add one million documents to the collection test_collection. This can take several
minutes, depending on your system.
The script adds the documents in the following form:
531
Deploy Sharding Infrastructure This procedure creates the three config databases that store the clusters metadata.
Note: For development and testing environments, a single config database is sufficient. In production environments,
use three config databases. Because config instances store only the metadata for the sharded cluster, they have minimal
resource requirements.
1. Create the following data directories for three config database instances:
/data/example/config1
/data/example/config2
/data/example/config3
Issue the following command at the system prompt:
mkdir -p /data/example/config1 /data/example/config2 /data/example/config3
2. In a separate terminal window or GNU Screen window, start the config databases by running the following
commands:
mongod --configsvr --dbpath /data/example/config1 --port 20001
mongod --configsvr --dbpath /data/example/config2 --port 20002
mongod --configsvr --dbpath /data/example/config3 --port 20003
3. In a separate terminal window or GNU Screen window, start mongos (page 938) instance by running the
following command:
mongos --configdb localhost:20001,localhost:20002,localhost:20003 --port 27017 --chunkSize 1
Note: If you are using the collection created earlier or are just experimenting with sharding, you can use a small
--chunkSize (1MB works well.) The default chunkSize (page 1002) of 64MB means that your cluster
must have 64MB of data before the MongoDBs automatic sharding begins working.
In production environments, do not use a small shard size.
The configdb (page 1001) options specify the configuration databases (e.g. localhost:20001,
localhost:20002, and localhost:2003). The mongos (page 938) instance runs on the default MongoDB port (i.e. 27017), while the databases themselves are running on ports in the 30001 series. In the this
example, you may omit the --port 27017 option, as 27017 is the default port.
4. Add the first shard in mongos (page 938). In a new terminal window or GNU Screen session, add the first
shard, according to the following procedure:
(a) Connect to the mongos (page 938) with the following command:
mongo localhost:27017/admin
(b) Add the first shard to the cluster by issuing the addShard (page 736) command:
db.runCommand( { addShard : "firstset/localhost:10001,localhost:10002,localhost:10003" } )
532
Chapter 9. Sharding
Deploy a Second Replica Set This procedure deploys a second replica set. This closely mirrors the process used to
establish the first replica set above, omitting the test data.
1. Create the following data directories for the members of the second replica set, named secondset:
/data/example/secondset1
/data/example/secondset2
/data/example/secondset3
2. In three new terminal windows, start three instances of mongod (page 925) with the following commands:
mongod --dbpath /data/example/secondset1 --port 10004 --replSet secondset --oplogSize 700 --rest
mongod --dbpath /data/example/secondset2 --port 10005 --replSet secondset --oplogSize 700 --rest
mongod --dbpath /data/example/secondset3 --port 10006 --replSet secondset --oplogSize 700 --rest
Note: As above, the second replica set uses the smaller oplogSize (page 1000) configuration. Omit this
setting in production environments.
3. In the mongo (page 942) shell, connect to one mongodb instance by issuing the following command:
mongo localhost:10004/admin
4. In the mongo (page 942) shell, initialize the second replica set by issuing the following command:
db.runCommand({"replSetInitiate" :
{"_id" : "secondset",
"members" : [{"_id" : 1, "host" : "localhost:10004"},
{"_id" : 2, "host" : "localhost:10005"},
{"_id" : 3, "host" : "localhost:10006"}
]}})
{
"info" : "Config now saved locally.
"ok" : 1
5. Add the second replica set to the cluster. Connect to the mongos (page 938) instance created in the previous
procedure and issue the following sequence of commands:
use admin
db.runCommand( { addShard : "secondset/localhost:10004,localhost:10005,localhost:10006" } )
6. Verify that both shards are properly configured by running the listShards (page 737) command. View this
and example output below:
db.runCommand({listShards:1})
{
"shards" : [
{
"_id" : "firstset",
"host" : "firstset/localhost:10001,localhost:10003,localhost:10002"
},
{
"_id" : "secondset",
"host" : "secondset/localhost:10004,localhost:10006,localhost:10005"
533
}
],
"ok" : 1
}
Enable Sharding MongoDB must have sharding enabled on both the database and collection levels.
Enabling Sharding on the Database Level Issue the enableSharding (page 736) command. The following
example enables sharding on the test database:
db.runCommand( { enableSharding : "test" } )
{ "ok" : 1 }
Create an Index on the Shard Key MongoDB uses the shard key to distribute documents between shards. Once
selected, you cannot change the shard key. Good shard keys:
have values that are evenly distributed among all documents,
group documents that are often accessed at the same time into contiguous chunks, and
allow for effective distribution of activity among shards.
Typically shard keys are compound, comprising of some sort of hash and some sort of other primary key. Selecting
a shard key depends on your data set, application architecture, and usage pattern, and is beyond the scope of this
document. For the purposes of this example, we will shard the number key. This typically would not be a good
shard key for production deployments.
Create the index with the following procedure:
use test
db.test_collection.ensureIndex({number:1})
See also:
The Shard Key Overview (page 506) and Shard Key (page 506) sections.
Shard the Collection Issue the following command:
use admin
db.runCommand( { shardCollection : "test.test_collection", key : {"number":1} })
{ "collectionsharded" : "test.test_collection", "ok" : 1 }
534
Chapter 9. Sharding
{
"raw" : {
"firstset/localhost:10001,localhost:10003,localhost:10002" : {
"db" : "test",
"collections" : 3,
"objects" : 973887,
"avgObjSize" : 100.33173458522396,
"dataSize" : 97711772,
"storageSize" : 141258752,
"numExtents" : 15,
"indexes" : 2,
"indexSize" : 56978544,
"fileSize" : 1006632960,
"nsSizeMB" : 16,
"ok" : 1
},
"secondset/localhost:10004,localhost:10006,localhost:10005" : {
"db" : "test",
"collections" : 3,
"objects" : 26125,
"avgObjSize" : 100.33286124401914,
"dataSize" : 2621196,
"storageSize" : 11194368,
"numExtents" : 8,
"indexes" : 2,
"indexSize" : 2093056,
"fileSize" : 201326592,
"nsSizeMB" : 16,
"ok" : 1
}
},
"objects" : 1000012,
"avgObjSize" : 100.33176401883178,
"dataSize" : 100332968,
"storageSize" : 152453120,
"numExtents" : 23,
"indexes" : 4,
"indexSize" : 59071600,
"fileSize" : 1207959552,
"ok" : 1
}
In a few moments you can run these commands for a second time to demonstrate that chunks are migrating from
9.3. Sharded Cluster Tutorials
535
firstset to secondset.
When this procedure is complete, you will have converted a replica set into a cluster where each shard is itself a replica
set.
Associate tags with a particular shard using the sh.addShardTag() (page 904) method when connected to a
mongos (page 938) instance. A single shard may have multiple tags, and multiple shards may also have the same tag.
Example
536
Chapter 9. Sharding
The following example adds the tag NYC to two shards, and the tags SFO and NRT to a third shard:
sh.addShardTag("shard0000",
sh.addShardTag("shard0001",
sh.addShardTag("shard0002",
sh.addShardTag("shard0002",
"NYC")
"NYC")
"SFO")
"NRT")
You may remove tags from a particular shard using the sh.removeShardTag() (page 908) method when connected to a mongos (page 938) instance, as in the following example, which removes the NRT tag from a shard:
sh.removeShardTag("shard0002", "NRT")
To assign a tag to a range of shard keys use the sh.addTagRange() (page 904) method when connected to a
mongos (page 938) instance. Any given shard key range may only have one assigned tag. You cannot overlap defined
ranges, or tag the same range more than once.
Example
Given a collection named users in the records database, sharded by the zipcode field. The following operations
assign:
two ranges of zip codes in Manhattan and Brooklyn the NYC tag
one range of zip codes in San Francisco the SFO tag
sh.addTagRange("records.users", { zipcode: "10001" }, { zipcode: "10281" }, "NYC")
sh.addTagRange("records.users", { zipcode: "11201" }, { zipcode: "11240" }, "NYC")
sh.addTagRange("records.users", { zipcode: "94102" }, { zipcode: "94135" }, "SFO")
Note: Shard ranges are always inclusive of the lower value and exclusive of the upper boundary.
The mongod (page 925) does not provide a helper for removing a tag range. You may delete tag assignment from
a shard key range by removing the corresponding document from the tags (page 569) collection of the config
database.
Each document in the tags (page 569) holds the namespace of the sharded collection and a minimum shard key
value.
Example
The following example removes the NYC tag assignment for the range of zip codes within Manhattan:
use config
db.tags.remove({ _id: { ns: "records.users", min: { zipcode: "10001" }}, tag: "NYC" })
The output from sh.status() (page 910) lists tags associated with a shard, if any, for each shard. A shards tags
exist in the shards document in the shards (page 569) collection of the config database. To return all shards with
9.3. Sharded Cluster Tutorials
537
a specific tag, use a sequence of operations that resemble the following, which will return only those shards tagged
with NYC:
use config
db.shards.find({ tags: "NYC" })
You can find tag ranges for all namespaces in the tags (page 569) collection of the config database. The output of
sh.status() (page 910) displays all tag ranges. To return all shard key ranges tagged with NYC, use the following
sequence of operations:
use config
db.tags.find({ tags: "NYC" })
To list the databases that have sharding enabled, query the databases collection in the Config Database (page 564).
A database has sharding enabled if the value of the partitioned field is true. Connect to a mongos (page 938)
instance with a mongo (page 942) shell, and run the following operation to get a full list of databases with sharding
enabled:
use config
db.databases.find( { "partitioned": true } )
Example
You can use the following sequence of commands when to return a list of all databases in the cluster:
use config
db.databases.find()
List Shards
To list the current set of configured shards, use the listShards (page 737) command, as follows:
use admin
db.runCommand( { listShards : 1 } )
To view cluster details, issue db.printShardingStatus() (page 892) or sh.status() (page 910). Both
methods return the same output.
Example
In the following example output from sh.status() (page 910)
538
Chapter 9. Sharding
539
3. Start all three config servers, using the same invocation that you used for the single config server.
mongod --configsvr
4. Restart all shard mongod (page 925) and mongos (page 938) processes.
Migrate Config Servers with the Same Hostname
This procedure migrates a config server (page 502) in a sharded cluster (page 498) to a new system that uses the same
hostname.
To migrate all the config servers in a cluster, perform this procedure for each config server separately and migrate the
config servers in reverse order from how they are listed in the mongos (page 938) instances configdb (page 1001)
string. Start with the last config server listed in the configdb (page 1001) string.
1. Shut down the config server.
This renders all config data for the sharded cluster read only.
2. Change the DNS entry that points to the system that provided the old config server, so that the same hostname
points to the new system. How you do this depends on how you organize your DNS and hostname resolution
services.
3. Copy the contents of dbpath (page 993) from the old config server to the new config server.
For example, to copy the contents of dbpath (page 993) to a machine
mongodb.config2.example.net, you might issue a command similar to the following:
named
4. Start the config server instance on the new system. The default invocation is:
mongod --configsvr
When you start the third config server, your cluster will become writable and it will be able to create new splits and
migrate chunks as needed.
Migrate Config Servers with Different Hostnames
This procedure migrates a config server (page 502) in a sharded cluster (page 498) to a new server that uses a different
hostname. Use this procedure only if the config server will not be accessible via the same hostname.
Changing a config servers (page 502) hostname requires downtime and requires restarting every process in the
sharded cluster. If possible, avoid changing the hostname so that you can instead use the procedure to migrate a config
server and use the same hostname (page 540).
To migrate all the config servers in a cluster, perform this procedure for each config server separately and migrate the
config servers in reverse order from how they are listed in the mongos (page 938) instances configdb (page 1001)
string. Start with the last config server listed in the configdb (page 1001) string.
1. Disable the cluster balancer process temporarily. See Disable the Balancer (page 552) for more information.
2. Shut down the config server.
This renders all config data for the sharded cluster read only.
540
Chapter 9. Sharding
3. Copy the contents of dbpath (page 993) from the old config server to the new config server.
Example
To copy the contents of dbpath (page 993) to a machine named mongodb.config2.example.net, use
a command that resembles the following:
rsync -az /data/configdb mongodb.config2.example.net:/data/configdb
4. Start the config server instance on the new system. The default invocation is:
mongod --configsvr
4. Restart the config server process that you used in the previous step to copy the data files to the new config server
instance.
5. Start the new config server instance. The default invocation is:
mongod --configsvr
6. Re-enable the balancer to allow the cluster to resume normal balancing operations. See the Disable the Balancer
(page 552) section for more information on managing the balancer process.
541
Note: In the course of this procedure never remove a config server from the configdb (page 1001) parameter on
any of the mongos (page 938) instances. If you need to change the name of a config server, always make sure that all
mongos (page 938) instances have three config servers specified in the configdb (page 1001) setting at all times.
Disable the balancer to stop chunk migration (page 518) and do not perform any metadata write operations until the
process finishes. If a migration is in progress, the balancer will complete the in-progress migration before stopping.
To disable the balancer, connect to one of the clusters mongos (page 938) instances and issue the following method:
sh.stopBalancer()
To check the balancer state, issue the sh.getBalancerState() (page 906) method.
For more information, see Disable the Balancer (page 552).
Migrate Each Config Server Separately
Migrate each config server (page 502) by starting with the last config server listed in the configdb (page 1001)
string. Proceed in reverse order of the configdb (page 1001) string. Migrate and restart a config server before
proceeding to the next. Do not rename a config server during this process.
Note: If the name or address that a sharded cluster uses to connect to a config server changes, you must restart every
542
Chapter 9. Sharding
mongod (page 925) and mongos (page 938) instance in the sharded cluster. Avoid downtime by using CNAMEs to
identify config servers within the MongoDB deployment.
See Migrate Config Servers with Different Hostnames (page 540) for more information.
Important: Start with the last config server listed in configdb (page 1001).
1. Shut down the config server.
This renders all config data for the sharded cluster read only.
2. Change the DNS entry that points to the system that provided the old config server, so that the same hostname
points to the new system. How you do this depends on how you organize your DNS and hostname resolution
services.
3. Copy the contents of dbpath (page 993) from the old config server to the new config server.
For example, to copy the contents of dbpath (page 993) to a machine
mongodb.config2.example.net, you might issue a command similar to the following:
named
4. Start the config server instance on the new system. The default invocation is:
mongod --configsvr
If the configdb (page 1001) string will change as part of the migration, you must shut down all mongos
(page 938) instances before changing the configdb (page 1001) string. This avoids errors in the sharded cluster
over configdb (page 1001) string conflicts.
If the configdb (page 1001) string will remain the same, you can migrate the mongos (page 938) instances sequentially or all at once.
1. Shut down the mongos (page 938) instances using the shutdown (page 759) command. If the configdb
(page 1001) string is changing, shut down all mongos (page 938) instances.
2. If the hostname has changed for any of the config servers, update the configdb (page 1001) string for each
mongos (page 938) instance. The mongos (page 938) instances must all use the same configdb (page 1001)
string. The strings must list identical host names in identical order.
Tip
To avoid downtime, give each config server a logical DNS name (unrelated to the servers physical or virtual
hostname). Without logical DNS names, moving or renaming a config server requires shutting down every
mongod (page 925) and mongos (page 938) instance in the sharded cluster.
3. Restart the mongos (page 938) instances being sure to use the updated configdb (page 1001) string if hostnames have changed.
For more information, see Start the mongos Instances (page 523).
Migrate the Shards
Migrate the shards one at a time. For each shard, follow the appropriate procedure in this section.
543
Migrate a Replica Set Shard To migrate a sharded cluster, migrate each member separately. First migrate the
non-primary members, and then migrate the primary last.
If the replica set has two voting members, add an arbiter (page 389) to the replica set to ensure the set keeps a majority
of its votes available during the migration. You can remove the arbiter after completing the migration.
Migrate a Member of a Replica Set Shard
1. Shut down the mongod (page 925) process. To ensure a clean shutdown, use the shutdown (page 759)
command.
2. Move the data directory (i.e., the dbpath (page 993)) to the new machine.
3. Restart the mongod (page 925) process at the new location.
4. Connect to the replica sets current primary.
5. If the hostname of the member has changed, use rs.reconfig() (page 897) to update the replica set configuration document (page 479) with the new hostname.
For example, the following sequence of commands updates the hostname for the instance at position 2 in the
members array:
cfg = rs.conf()
cfg.members[2].host = "pocatello.example.net:27017"
rs.reconfig(cfg)
For more information on updating the configuration document, see Example Reconfiguration Operations
(page 483).
6. To confirm the new configuration, issue rs.conf() (page 896).
7. Wait for the member to recover. To check the members state, issue rs.status() (page 898).
Migrate the Primary in a Replica Set Shard While migrating the replica sets primary, the set must elect a new
primary. This failover process which renders the replica set unavailable to perform reads or accept writes for the
duration of the election, which typically completes quickly. If possible, plan the migration during a maintenance
window.
1. Step down the primary to allow the normal failover (page 396) process. To step down the primary, connect
to the primary and issue the either the replSetStepDown (page 730) command or the rs.stepDown()
(page 899) method. The following example shows the rs.stepDown() (page 899) method:
rs.stepDown()
2. Once the primary has stepped down and another member has become PRIMARY (page 487) state. To migrate
the stepped-down primary, follow the Migrate a Member of a Replica Set Shard (page 544) procedure
You can check the output of rs.status() (page 898) to confirm the change in status.
Migrate a Standalone Shard The ideal procedure for migrating a standalone shard is to convert the standalone to a
replica set (page 432) and then use the procedure for migrating a replica set shard (page 544). In production clusters,
all shards should be replica sets, which provides continued availability during maintenance windows.
Migrating a shard as standalone is a multi-step process during which part of the shard may be unavailable. If the
shard is the primary shard for a database,the process includes the movePrimary (page 743) command. While the
movePrimary (page 743) runs, you should stop modifying data in that database. To migrate the standalone shard,
use the Remove Shards from an Existing Sharded Cluster (page 553) procedure.
544
Chapter 9. Sharding
To complete the migration, re-enable the balancer to resume chunk migrations (page 518).
Connect to one of the clusters mongos (page 938) instances and pass true to the sh.setBalancerState()
(page 908) method:
sh.setBalancerState(true)
To check the balancer state, issue the sh.getBalancerState() (page 906) method.
For more information, see Enable the Balancer (page 553).
Backup Cluster Metadata
This procedure shuts down the mongod (page 925) instance of a config server (page 502) in order to create a backup
of a sharded clusters (page 493) metadata. The clusters config servers store all of the clusters metadata, most
importantly the mapping from chunks to shards.
When you perform this procedure, the cluster remains operational 9 .
1. Disable the cluster balancer process temporarily. See Disable the Balancer (page 552) for more information.
2. Shut down one of the config databases.
3. Create a full copy of the data files (i.e. the path specified by the dbpath (page 993) option for the config
instance.)
4. Restart the original configuration server.
5. Re-enable the balancer to allow the cluster to resume normal balancing operations. See the Disable the Balancer
(page 552) section for more information on managing the balancer process.
See also:
Backup Strategies for MongoDB Systems (page 136).
Create Chunks in a Sharded Cluster
Pre-splitting the chunk ranges in an empty sharded collection allows clients to insert data into an already partitioned
collection. In most situations a sharded cluster will create and distribute chunks automatically without user intervention. However, in a limited number of cases, MongoDB cannot create enough chunks or distribute data fast enough to
support required throughput. For example:
If you want to partition an existing data collection that resides on a single shard.
If you want to ingest a large volume of data into a cluster that isnt balanced, or where the ingestion of data will
lead to data imbalance. For example, monotonically increasing or decreasing shard keys insert all data into a
single chunk.
These operations are resource intensive for several reasons:
Chunk migration requires copying all the data in the chunk from one shard to another.
MongoDB can migrate only a single chunk at a time.
MongoDB creates splits only after an insert operation.
9 While one of the three config servers is unavailable, the cluster cannot split any chunks nor can it migrate chunks between shards. Your
application will be able to write data to the cluster. See Config Servers (page 502) for more information.
545
Warning: Only pre-split an empty collection. If a collection already has data, MongoDB automatically splits the
collections data when you enable sharding for the collection. Subsequent attempts to manually create splits can
lead to unpredictable chunk ranges and sizes as well as inefficient or ineffective balancing behavior.
To create chunks manually, use the following procedure:
1. Split empty chunks in your collection by manually performing the split (page 739) command on chunks.
Example
To create chunks for documents in the myapp.users collection using the email field as the shard key, use
the following operation in the mongo (page 942) shell:
for ( var x=97; x<97+26; x++ ){
for( var y=97; y<97+26; y+=6 ) {
var prefix = String.fromCharCode(x) + String.fromCharCode(y);
db.runCommand( { split : "myapp.users" , middle : { email : prefix } } );
}
}
This command moves the chunk that includes the shard key value smith to the shard named
mongodb-shard3.example.net. The command will block until the migration is complete.
Tip
To return a list of shards, use the listShards (page 737) command.
546
Chapter 9. Sharding
Example
Evenly migrate chunks
To evenly migrate chunks for the myapp.users collection, put each prefix chunk on the next shard from the other
and run the following commands in the mongo shell:
See Create Chunks in a Sharded Cluster (page 545) for an introduction to pre-splitting.
New in version 2.2: The moveChunk (page 742) command has the: _secondaryThrottle parameter. When set
to true, MongoDB ensures that changes to shards as part of chunk migrations replicate to secondaries throughout
the migration operation. For more information, see Require Replication before Chunk Migration (Secondary Throttle)
(page 550).
Changed in version 2.4: In 2.4, _secondaryThrottle is true by default.
Warning: The moveChunk (page 742) command may produce the following error message:
The collection's metadata lock is already taken.
This occurs when clients have too many open cursors that access the migrating chunk. You may either wait until
the cursors complete their operations or close the cursors manually.
3. Issue the following save() (page 846) operation to store the global chunk size configuration value:
db.settings.save( { _id:"chunksize", value: <size> } )
Note: The chunkSize (page 1002) and --chunkSize options, passed at runtime to the mongos (page 938), do
not affect the chunk size after you have initialized the cluster.
To avoid confusion, always set the chunk size using the above procedure instead of the runtime options.
Modifying the chunk size has several limitations:
547
Use splitAt() (page 909) to split a chunk in two, using the queried document as the lower bound in the new chunk:
Example
The following command splits the chunk that contains the value of 63109 for the zipcode field in the people
collection of the records database.
548
Chapter 9. Sharding
Note: splitAt() (page 909) does not nessarily split the chunk into two equally sized chunks. The split occurs at
the location of the document matching the query, regardless of where that document is in the chunk.
You can schedule a window of time during which the balancer can migrate chunks, as described in the following
procedures:
Schedule the Balancing Window (page 551)
Remove a Balancing Window Schedule (page 552).
The mongos (page 938) instances user their own local timezones to when respecting balancer window.
Configure Default Chunk Size
The default chunk size for a sharded cluster is 64 megabytes. In most situations, the default size is appropriate for
splitting and migrating chunks. For information on how chunk size affects deployments, see details, see Chunk Size
(page 519).
Changing the default chunk size affects chunks that are processes during migrations and auto-splits but does not
retroactively affect all chunks.
To configure default chunk size, see Modify Chunk Size in a Sharded Cluster (page 547).
Change the Maximum Storage Size for a Given Shard
The maxSize field in the shards (page 569) collection in the config database (page 564) sets the maximum size
for a shard, allowing you to control whether the balancer will migrate chunks to a shard. If dataSize (page 768) is
above a shards maxSize, the balancer will not move chunks to the shard. Also, the balancer will not move chunks
off an overloaded shard. This must happen manually. The maxSize value only affects the balancers selection of
destination shards.
By default, maxSize is not specified, allowing shards to consume the total amount of available space on their machines if necessary.
You can set maxSize both when adding a shard and once a shard is running.
549
To set maxSize when adding a shard, set the addShard (page 736) commands maxSize parameter to the maximum size in megabytes. For example, the following command run in the mongo (page 942) shell adds a shard with a
maximum size of 125 megabytes:
db.runCommand( { addshard : "example.net:34008", maxSize : 125 } )
To set maxSize on an existing shard, insert or update the maxSize field in the shards (page 569) collection in the
config database (page 564). Set the maxSize in megabytes.
Example
Assume you have the following shard without a maxSize field:
{ "_id" : "shard0000", "host" : "example.net:34001" }
Run the following sequence of commands in the mongo (page 942) shell to insert a maxSize of 125 megabytes:
use config
db.shards.update( { _id : "shard0000" }, { $set : { maxSize : 125 } } )
To later increase the maxSize setting to 250 megabytes, run the following:
use config
db.shards.update( { _id : "shard0000" }, { $set : { maxSize : 250 } } )
New in version 2.2.1: _secondaryThrottle became an option to the balancer and to command moveChunk
(page 742). _secondaryThrottle makes it possible to require the balancer wait for replication to secondaries
during migrations.
Changed in version 2.4: _secondaryThrottle became the default mode for all balancer and moveChunk
(page 742) operations.
Before 2.2.1, the write operations required to migrate chunks between shards do not need to replicate to secondaries in
order to succeed. However, you can configure the balancer to require migration related write operations to replicate to
secondaries. This throttles or slows the migration process and in doing so reduces the potential impact of migrations
on a sharded cluster.
You can throttle migrations by enabling the balancers _secondaryThrottle parameter. When enabled, secondary throttle requires a { w : 2 } write concern on delete and insertion operations, so that every operation
propagates to at least one secondary before the balancer issues the next operation.
Starting with version 2.4 the default secondaryThrottle value is true.
_secondaryThrottle to false.
You enable or disable _secondaryThrottle directly in the settings (page 569) collection in the config
database (page 564) by running the following commands from the mongo (page 942) shell:
use config
db.settings.update( { "_id" : "balancer" } , { $set : { "_secondaryThrottle" : true } } , { upsert :
You also can enable secondary throttle when issuing the moveChunk (page 742) command by setting
_secondaryThrottle to true. For more information, see moveChunk (page 742).
550
Chapter 9. Sharding
The following command checks if the balancer is enabled (i.e. that the balancer is allowed to run). The command does
not check if the balancer is active (i.e. if it is actively balancing chunks).
To see if the balancer is enabled in your cluster, issue the following command, which returns a boolean:
sh.getBalancerState()
When this command returns, you will see output like the following:
{
"_id"
"process"
"state"
"ts"
"when"
"who"
"why"
:
:
:
:
:
:
:
"balancer",
"mongos0.example.net:1292810611:1804289383",
2,
ObjectId("4d0f872630c42d1978be8a2e"),
"Mon Dec 20 2010 11:41:10 GMT-0500 (EST)",
"mongos0.example.net:1292810611:1804289383:Balancer:846930886",
"doing balance round" }
In some situations, particularly when your data set grows slowly and a migration can impact performance, its useful
to be able to ensure that the balancer is active only at certain times. Use the following procedure to specify a window
during which the balancer will be able to migrate chunks:
1. Connect to any mongos (page 938) in the cluster using the mongo (page 942) shell.
551
2. Issue the following command to switch to the Config Database (page 564):
use config
3. Use an operation modeled on the following example update() (page 849) operation to modify the balancers
window:
Replace <start-time> and <end-time> with time values using two digit hour and minute values (e.g
HH:MM) that describe the beginning and end boundaries of the balancing window. These times will be evaluated
relative to the time zone of each individual mongos (page 938) instance in the sharded cluster. If your mongos
(page 938) instances are physically located in different time zones, use a common time zone (e.g. GMT) to
ensure that the balancer window is interpreted correctly.
For instance, running the following will force the balancer to run between 11PM and 6AM local time only:
Note: The balancer window must be sufficient to complete the migration of all data inserted during the day.
As data insert rates can change based on activity and usage patterns, it is important to ensure that the balancing window
you select will be sufficient to support the needs of your deployment.
If you have set the balancing window (page 551) and wish to remove the schedule so that the balancer is always
running, issue the following sequence of operations:
use config
db.settings.update({ _id : "balancer" }, { $unset : { activeWindow : true } })
By default the balancer may run at any time and only moves chunks as needed. To disable the balancer for a short
period of time and prevent all migration, use the following procedure:
1. Connect to any mongos (page 938) in the cluster using the mongo (page 942) shell.
2. Issue the following operation to disable the balancer:
sh.setBalancerState(false)
If a migration is in progress, the system will complete the in-progress migration before stopping.
3. To verify that the balancer has stopped, issue the following command, which returns false if the balancer is
stopped:
sh.getBalancerState()
Optionally, to verify no migrations are in progress after disabling, issue the following operation in the mongo
(page 942) shell:
use config
while( sh.isBalancerRunning() ) {
print("waiting...");
sleep(1000);
}
552
Chapter 9. Sharding
Note: To disable the balancer from a driver that does not have the sh.startBalancer() (page 910) helper, issue
the following command from the config database:
db.settings.update( { _id: "balancer" }, { $set : { stopped: true } } , true )
Use this procedure if you have disabled the balancer and are ready to re-enable it:
1. Connect to any mongos (page 938) in the cluster using the mongo (page 942) shell.
2. Issue one of the following operations to enable the balancer:
From the mongo (page 942) shell, issue:
sh.setBalancerState(true)
From a driver that does not have the sh.startBalancer() (page 910) helper, issue the following from the
config database:
db.settings.update( { _id: "balancer" }, { $set : { stopped: false } } , true )
If MongoDB migrates a chunk during a backup (page 136), you can end with an inconsistent snapshot of your sharded
cluster. Never run a backup while the balancer is active. To ensure that the balancer is inactive during your backup
operation:
Set the balancing window (page 551) so that the balancer is inactive during the backup. Ensure that the backup
can complete while you have the balancer disabled.
manually disable the balancer (page 552) for the duration of the backup procedure.
If you turn the balancer off while it is in the middle of a balancing round, the shut down is not instantaneous. The
balancer completes the chunk move in-progress and then ceases all further balancing rounds.
Before starting a backup operation, confirm that the balancer is not active. You can use the following command to
determine if the balancer is active:
!sh.getBalancerState() && !sh.isBalancerRunning()
When the backup procedure is complete you can reactivate the balancer process.
Remove Shards from an Existing Sharded Cluster
Remove Shards:
553
To remove a shard you must ensure the shards data is migrated to the remaining shards in the cluster. This procedure
describes how to safely migrate data and how to remove a shard.
This procedure describes how to safely remove a single shard. Do not use this procedure to migrate an entire cluster
to new hardware. To migrate an entire shard to new hardware, migrate individual shards as if they were independent
replica sets.
To remove a shard, first connect to one of the clusters mongos (page 938) instances using mongo (page 942) shell.
Then use the sequence of tasks in this document to remove a shard from the cluster.
Ensure the Balancer Process is Enabled
To successfully migrate data from a shard, the balancer process must be enabled. Check the balancer state using
the sh.getBalancerState() (page 906) helper in the mongo (page 942) shell. For more information, see the
section on balancer operations (page 552).
Determine the Name of the Shard to Remove
To determine the name of the shard, connect to a mongos (page 938) instance with the mongo (page 942) shell and
either:
Use the listShards (page 737) command, as in the following:
db.adminCommand( { listShards: 1 } )
Run either the sh.status() (page 910) or the db.printShardingStatus() (page 892) method.
The shards._id field lists the name of each shard.
Remove Chunks from the Shard
Run the removeShard (page 737) command. This begins draining chunks from the shard you are removing to
other shards in the cluster. For example, for a shard named mongodb0, run:
db.runCommand( { removeShard: "mongodb0" } )
Depending on your network capacity and the amount of data, this operation can take from a few minutes to several
days to complete.
Check the Status of the Migration
To check the progress of the migration at any stage in the process, run removeShard (page 737). For example, for
a shard named mongodb0, run:
db.runCommand( { removeShard: "mongodb0" } )
{ msg: "draining ongoing" , state: "ongoing" , remaining: { chunks: NumberLong(42), dbs : NumberLong(
554
Chapter 9. Sharding
In the output, the remaining document displays the remaining number of chunks that MongoDB must migrate to
other shards and the number of MongoDB databases that have primary status on this shard.
Continue checking the status of the removeShard command until the number of chunks remaining is 0. Then proceed
to the next step.
Move Unsharded Data
If the shard is the primary shard for one or more databases in the cluster, then the shard will have unsharded data. If
the shard is not the primary shard for any databases, skip to the next task, Finalize the Migration (page 555).
In a cluster, a database with unsharded collections stores those collections only on a single shard. That shard becomes
the primary shard for that database. (Different databases in a cluster can have different primary shards.)
Warning: Do not perform this procedure until you have finished draining the shard.
1. To determine if the shard you are removing is the primary shard for any of the clusters databases, issue one of
the following methods:
sh.status() (page 910)
db.printShardingStatus() (page 892)
In the resulting document, the databases field lists each database and its primary shard. For example, the
following database field shows that the products database uses mongodb0 as the primary shard:
{
"_id" : "products",
"partitioned" : true,
"primary" : "mongodb0" }
2. To move a database to another shard, use the movePrimary (page 743) command. For example, to migrate
all remaining unsharded data from mongodb0 to mongodb1, issue the following command:
db.runCommand( { movePrimary: "products", to: "mongodb1" })
This command does not return until MongoDB completes moving all data, which may take a long time. The
response from this command will resemble the following:
{ "primary" : "mongodb1", "ok" : 1 }
To clean up all metadata information and finalize the removal, run removeShard (page 737) again. For example,
for a shard named mongodb0, run:
db.runCommand( { removeShard: "mongodb0" } )
Once the value of the stage field is completed, you may safely stop the processes comprising the mongodb0
shard.
Convert Sharded Cluster to Replica Set
555
Convert a Cluster with a Single Shard into a Replica Set (page 556)
Convert a Sharded Cluster into a Replica Set (page 556)
This tutorial describes the process for converting a sharded cluster to a non-sharded replica set. To convert a replica set
into a sharded cluster Convert a Replica Set to a Replicated Sharded Cluster (page 530). See the Sharding (page 493)
documentation for more information on sharded clusters.
Convert a Cluster with a Single Shard into a Replica Set
In the case of a sharded cluster with only one shard, that shard contains the full data set. Use the following procedure
to convert that cluster into a non-sharded replica set:
1. Reconfigure the application to connect to the primary member of the replica set hosting the single shard that
system will be the new replica set.
2. Optionally remove the --shardsrv option, if your mongod (page 925) started with this option.
Tip
Changing the --shardsrv option will change the port that mongod (page 925) listens for incoming connections on.
The single-shard cluster is now a non-sharded replica set that will accept read and write operations on the data set.
You may now decommission the remaining sharding infrastructure.
Convert a Sharded Cluster into a Replica Set
Use the following procedure to transition from a sharded cluster with more than one shard to an entirely new replica
set.
1. With the sharded cluster running, deploy a new replica set (page 420) in addition to your sharded cluster. The
replica set must have sufficient capacity to hold all of the data files from all of the current shards combined. Do
not configure the application to connect to the new replica set until the data transfer is complete.
2. Stop all writes to the sharded cluster. You may reconfigure your application or stop all mongos (page 938)
instances. If you stop all mongos (page 938) instances, the applications will not be able to read from the
database. If you stop all mongos (page 938) instances, start a temporary mongos (page 938) instance on that
applications cannot access for the data migration procedure.
3. Use mongodump and mongorestore (page 181) to migrate the data from the mongos (page 938) instance to the
new replica set.
Note: Not all collections on all databases are necessarily sharded. Do not solely migrate the sharded collections.
Ensure that all databases and all collections migrate correctly.
4. Reconfigure the application to use the non-sharded replica set instead of the mongos (page 938) instance.
The application will now use the un-sharded replica set for reads and writes. You may now decommission the remaining unused sharded cluster infrastructure.
See also:
Backup and Restore Sharded Clusters (page 188)
556
Chapter 9. Sharding
The balancer migrates chunks of documents in a sharded collections to the shards associated with a tag that has a shard
key range with an upper bound greater than the chunks lower bound.
Note: If the chunks in sharded collection are already balanced, then the balancer will not migrate any chunks. If
chunks in a sharded collection are not balanced, the balancer migrates chunks in tagged ranges to shards associated
with those tags.
After configuring tags with a shard key range, and associating it with a shard or shards, the cluster may take some time
to balance the data among the shards. This depends on the division of chunks and the current distribution of data in
the cluster.
Once configured, the balancer respects tag ranges during future balancing rounds (page 516).
See also:
Manage Shard Tags (page 536)
Chunks that Span Multiple Tag Ranges
A single chunk may contain data with a shard key values that falls into ranges associated with more than one tag. To
accommodate these situations, the balancer may migrate chunks to shards that contain shard key values that exceed
the upper bound of the selected tag range.
Example
557
The unique (page 814) constraint on indexes ensures that only one document can have a value for a field in a
collection. For sharded collections these unique indexes cannot enforce uniqueness (page 1017) because insert and
indexing operations are local to each shard. 10
If your need to ensure that a field is always unique in all collections in a sharded environment, there are two options:
1. Enforce uniqueness of the shard key (page 506).
MongoDB can enforce uniqueness for the shard key. For compound shard keys, MongoDB will enforce uniqueness on the entire key combination, and not for a specific component of the shard key.
You cannot specify a unique constraint on a hashed index (page 333).
2. Use a secondary collection to enforce uniqueness.
Create a minimal collection that only contains the unique field and a reference to a document in the main
collection. If you always insert into a secondary collection before inserting to the main collection, MongoDB
will produce an error if you attempt to use a duplicate key.
Note: If you have a small data set, you may not need to shard this collection and you can create multiple unique
indexes. Otherwise you can shard on a single unique key.
Always use the default acknowledged (page 55) write concern (page 55) in conjunction with a recent MongoDB driver
(page 1089).
Unique Constraints on the Shard Key
Process To shard a collection using the unique constraint, specify the shardCollection (page 738) command
in the following form:
db.runCommand( { shardCollection : "test.users" , key : { email : 1 } , unique : true } );
Remember that the _id field index is always unique. By default, MongoDB inserts an ObjectId into the _id field.
However, you can manually insert your own value into the _id field and use this as the shard key. To use the _id
field as the shard key, use the following operation:
10 If you specify a unique index on a sharded collection, MongoDB will be able to enforce uniqueness only among the documents located on a
single shard at the time of creation.
558
Chapter 9. Sharding
Warning: In any sharded collection where you are not sharding by the _id field, you must ensure uniqueness of
the _id field. The best way to ensure _id is always unique is to use ObjectId, or another universally unique
identifier (UUID.)
Limitations
You can only enforce uniqueness on one single field in the collection using this method.
If you use a compound shard key, you can only enforce uniqueness on the combination of component keys in
the shard key.
In most cases, the best shard keys are compound keys that include elements that permit write scaling (page 507) and
query isolation (page 507), as well as high cardinality (page 527). These ideal shard keys are not often the same keys
that require uniqueness and requires a different approach.
Unique Constraints on Arbitrary Fields
If you cannot use a unique field as the shard key or if you need to enforce uniqueness over multiple fields, you must
create another collection to act as a proxy collection. This collection must contain both a reference to the original
document (i.e. its ObjectId) and the unique key.
If you must shard this proxy collection, then shard on the unique key using the above procedure (page 558); otherwise, you can simply create multiple unique indexes on the collection.
Process Consider the following for the proxy collection:
{
"_id" : ObjectId("...")
"email" ": "..."
}
The _id field holds the ObjectId of the document it reflects, and the email field is the field on which you want to
ensure uniqueness.
To shard this collection, use the following operation using the email field as the shard key:
db.runCommand( { shardCollection : "records.proxy" , key : { email : 1 } , unique : true } );
If you do not need to shard the proxy collection, use the following command to create a unique index on the email
field:
db.proxy.ensureIndex( { "email" : 1 }, {unique : true} )
You may create multiple unique indexes on this collection if you do not plan to shard the proxy collection.
To insert documents, use the following procedure in the JavaScript shell (page 942):
use records;
var primary_id = ObjectId();
db.proxy.insert({
"_id" : primary_id
"email" : "[email protected]"
559
})
// if: the above operation returns successfully,
// then continue:
db.information.insert({
"_id" : primary_id
"email": "[email protected]"
// additional information...
})
You must insert a document into the proxy collection first. If this operation succeeds, the email field is unique, and
you may continue by inserting the actual document into the information collection.
See
The full documentation of: ensureIndex() (page 814) and shardCollection (page 738).
Considerations
Your application must catch errors when inserting documents into the proxy collection and must enforce
consistency between the two collections.
If the proxy collection requires sharding, you must shard on the single field on which you want to enforce
uniqueness.
To enforce uniqueness on more than one field using sharded proxy collections, you must have one proxy collection for every field for which to enforce uniqueness. If you create multiple unique indexes on a single proxy
collection, you will not be able to shard proxy collections.
Shard GridFS Data Store
When sharding a GridFS store, consider the following:
files Collection
Most deployments will not need to shard the files collection. The files collection is typically small, and only
contains metadata. None of the required keys for GridFS lend themselves to an even distribution in a sharded situation.
If you must shard the files collection, use the _id field possibly in combination with an application field.
Leaving files unsharded means that all the file metadata documents live on one shard. For production GridFS stores
you must store the files collection on a replica set.
chunks Collection
1 , n :
db.fs.chunks.ensureIndex( { files_id : 1 , n : 1 } )
db.runCommand( { shardCollection : "test.fs.chunks" , key : { files_id : 1 , n : 1 } } )
You may also want to shard using just the file_id field, as in the following operation:
560
Chapter 9. Sharding
Important: { files_id : 1 , n :
for the chunks collection of a GridFS store.
1 } and { files_id :
files_id : 1 } } )
And:
mongos specified a different config database string
To solve the issue, restart the mongos (page 938) with the correct string.
Cursor Fails Because of Stale Config Data
A query returns the following warning when one or more of the mongos (page 938) instances has not yet updated its
cache of the clusters metadata from the config database:
could not initialize cursor across all shards because : stale config detected
This warning should not propagate back to your application. The warning will repeat until all the mongos (page 938)
instances refresh their caches. To force an instance to refresh its cache, run the flushRouterConfig (page 735)
command.
Avoid Downtime when Moving Config Servers
Use CNAMEs to identify your config servers to the cluster so that you can rename and renumber your config servers
without downtime.
561
562
Chapter 9. Sharding
Description
Forces an update to the cluster metadata cached by a mongos (page 938).
Adds a shard to a sharded cluster.
Internal command that validates index on shard key.
Enables sharding on a specific database.
Returns a list of configured shards.
Starts the process of removing a shard from a sharded cluster.
Internal command that reports on the state of a sharded cluster.
Internal command that returns the config server version.
Internal command to sets the config server version.
Enables the sharding functionality for a collection, allowing the collection to be
sharded.
Reports whether the mongod (page 925) is a member of a sharded cluster.
Internal command that affects connections between instances in a MongoDB
deployment.
Creates a new chunk.
Internal command to split chunk. Instead use the methods sh.splitFind()
(page 909) and sh.splitAt() (page 909).
Internal command that determines split points.
Deprecated internal command. See splitVector (page 742).
Internal command that migrates chunks between shards.
Reassigns the primary shard when removing a shard from a sharded cluster.
Verifies that a process is a mongos (page 938).
Chapter 9. Sharding
The config database is not a dependable API, and users should not write data to the config database in the course
of normal operation or maintenance.
Warning: Modification of the config database on a functioning system may lead to instability or inconsistent
data sets. If you must modify the config database, use mongodump (page 951) to create a full backup of the
config database.
To access the config database, connect to a mongos (page 938) instance in a sharded cluster, and use the following
helper:
use config
You can return a list of the collections, with the following helper:
show collections
Collections
config
config.changelog
565
"max" : {
"<database>" : "<value>"
},
"lastmod" : Timestamp(1000, 1),
"lastmodEpoch" : ObjectId(<...>)
},
"right" : {
"min" : {
"<database>" : "<value>"
},
"max" : {
"<database>" : { $maxKey : 1 }
},
"lastmod" : Timestamp(1000, 2),
"lastmodEpoch" : ObjectId("<...>")
}
}
}
Each document in the changelog (page 565) collection contains the following fields:
config.changelog._id
The value of changelog._id is: <hostname>-<timestamp>-<increment>.
config.changelog.server
The hostname of the server that holds this data.
config.changelog.clientAddr
A string that holds the address of the client, a mongos (page 938) instance that initiates this change.
config.changelog.time
A ISODate timestamp that reflects when the change occurred.
config.changelog.what
Reflects the type of change recorded. Possible values are:
dropCollection
dropCollection.start
dropDatabase
dropDatabase.start
moveChunk.start
moveChunk.commit
split
multi-split
config.changelog.ns
Namespace where the change occurred.
config.changelog.details
A document that contains additional details regarding the change.
(page 566) document depends on the type of change.
config.chunks
566
Chapter 9. Sharding
These documents store the range of values for the shard key that describe the chunk in the min and max fields.
Additionally the shard field identifies the shard in the cluster that owns the chunk.
config.collections
config.databases
567
config.lockpings
config.locks
If a mongos (page 938) holds the balancer lock, the state field has a value of 2, which means that balancer
is active. The when field indicates when the balancer began the current operation.
Changed in version 2.0: The value of the state field was 1 before MongoDB 2.0.
config.mongos
568
Chapter 9. Sharding
The following document shows the status of the mongos (page 938) running on example.com:30000.
config.settings
config.shards
If the shard has tags (page 557) assigned, this document has a tags field, that holds an array of the tags, as in
the following example:
{ "_id" : "shard0001", "host" : "localhost:30001", "tags": [ "NYC" ] }
config.tags
569
config.version
Note: Like all databases in MongoDB, the config database contains a system.indexes (page 229) collection
contains metadata for all indexes in the database for information on indexes, see Indexes (page 313).
sh.addShard(host)
Adds a database instance or replica set to a sharded cluster. The optimal configuration is to deploy shards across
replica sets. This method must be run on a mongos (page 938) instance.
The sh.addShard() (page 903) method has the following parameter:
param string host The hostname of either a standalone database instance or of a replica set. Include
the port number if the instance is running on a non-standard port. Include the replica set name if
the instance is a replica set, as explained below.
The sh.addShard() (page 903) method has the following prototype form:
sh.addShard("<host>")
Warning: Do not use localhost for the hostname unless your configuration server is also running on
localhost.
The sh.addShard() (page 903) method is a helper for the addShard (page 736) command. The addShard
(page 736) command has additional options which are not available with this helper.
570
Chapter 9. Sharding
Example
To add a shard on a replica set, specify the name of the replica set and the hostname of at least one member of the
replica set, as a seed. If you specify additional hostnames, all must be members of the same replica set.
The following example adds a replica set named repl0 and specifies one member of the replica set:
sh.addShard("repl0/mongodb3.example.net:27327")
Definition
sh.enableSharding(database)
Enables sharding on the specified database. This does not automatically shard any collections but makes it
possible to begin sharding collections using sh.shardCollection() (page 908).
The sh.enableSharding() (page 905) method has the following parameter:
param string database The name of the database shard. Enclose the name in quotation marks.
See also:
sh.shardCollection() (page 908)
Definition
Warning:
MongoDB provides no method to deactivate sharding for a collection after calling
shardCollection (page 738). Additionally, after shardCollection (page 738), you cannot change
shard keys or modify the value of any field used in your shard key index.
See also:
shardCollection (page 738) for additional options, Sharding (page 493) and Sharding Introduction (page 493)
for an overview of sharding, Deploy a Sharded Cluster (page 522) for a tutorial, and Shard Keys (page 506) for
choosing a shard key.
Example
Given the people collection in the records database, the following command shards the collection by the
zipcode field:
571
sh.shardCollection("records.people", { zipcode: 1} )
Definition
sh.splitFind(namespace, query)
Splits the chunk containing the document specified by the query at its median point, creating two roughly
equal chunks. Use sh.splitAt() (page 909) to split a collection in a specific point.
In most circumstances, you should leave chunk splitting to the automated processes. However, when initially
deploying a sharded cluster it is necessary to perform some measure of pre-splitting using manual methods
including sh.splitFind() (page 909).
param string namespace The namespace (i.e. <database>.<collection>) of the sharded
collection that contains the chunk to split.
param document query A query to identify a document in a specific chunk. Typically specify the
shard key for a document as the query.
Definition
sh.splitAt(namespace, query)
Splits the chunk containing the document specified by the query as if that document were at the middle of the
collection, even if the specified document is not the actual median of the collection.
param string namespace The namespace (i.e. <database>.<collection>) of the sharded
collection that contains the chunk to split.
param document query A query to identify a document in a specific chunk. Typically specify the
shard key for a document as the query.
Use this command to manually split chunks unevenly. Use the sh.splitFind() (page 909) function to
split a chunk at the actual median.
In most circumstances, you should leave chunk splitting to the automated processes within MongoDB. However,
when initially deploying a sharded cluster it is necessary to perform some measure of pre-splitting using manual
methods including sh.splitAt() (page 909).
Definition
572
Chapter 9. Sharding
See also:
moveChunk (page 742), sh.splitAt() (page 909), sh.splitFind() (page 909), Sharding (page 493), and
chunk migration (page 518).
Example
Given the people collection in the records database, the following operation finds the chunk that contains the
documents with the zipcode field set to 53187 and then moves that chunk to the shard named shard0019:
sh.moveChunk("records.people", { zipcode: 53187 }, "shard0019")
Description
sh.setBalancerState(state)
Enables or disables the balancer. Use sh.getBalancerState() (page 906) to determine if the balancer is
currently enabled or disabled and sh.isBalancerRunning() (page 907) to check its current state.
The sh.getBalancerState() (page 906) method has the following parameter:
param Boolean state Set this to true to enable the balancer and false to disable it.
See also:
sh.enableBalancing() (page 905)
sh.disableBalancing() (page 905)
sh.getBalancerHost() (page 906)
sh.getBalancerState() (page 906)
sh.isBalancerRunning() (page 907)
sh.startBalancer() (page 910)
sh.stopBalancer() (page 912)
sh.waitForBalancer() (page 913)
sh.waitForBalancerOff() (page 913)
sh.isBalancerRunning()
Returns boolean
Returns true if the balancer process is currently running and migrating chunks and false if the balancer process is
not running. Use sh.getBalancerState() (page 906) to determine if the balancer is enabled or disabled.
See also:
sh.enableBalancing() (page 905)
sh.disableBalancing() (page 905)
sh.getBalancerHost() (page 906)
sh.getBalancerState() (page 906)
sh.setBalancerState() (page 908)
sh.startBalancer() (page 910)
sh.stopBalancer() (page 912)
573
sh.addShardTag(shard, tag)
New in version 2.2.
Associates a shard with a tag or identifier. MongoDB uses these identifiers to direct chunks that fall within a
tagged range to specific shards. sh.addTagRange() (page 904) associates chunk ranges with tag ranges.
param string shard The name of the shard to which to give a specific tag.
param string tag The name of the tag to add to the shard.
Always issue sh.addShardTag() (page 904) when connected to a mongos (page 938) instance.
Example
The following example adds three tags, NYC, LAX, and NRT, to three shards:
sh.addShardTag("shard0000", "NYC")
sh.addShardTag("shard0001", "LAX")
sh.addShardTag("shard0002", "NRT")
See also:
sh.addTagRange() (page 904) and sh.removeShardTag() (page 908).
Definition
574
Chapter 9. Sharding
param document maximum The maximum value of the shard key range to include in the tag. Specify the maximum value in the form of <fieldname>:<value>. This value must be of the
same BSON type or types as the shard key.
param string tag The name of the tag to attach the range specified by the minimum and maximum
arguments to.
Use sh.addShardTag() (page 904) to ensure that the balancer migrates documents that exist within the
specified range to a specific shard or set of shards.
Always issue sh.addTagRange() (page 904) when connected to a mongos (page 938) instance.
Note: If you add a tag range to a collection using sh.addTagRange() (page 904) and then later drop the
collection or its database, MongoDB does not remove the tag association. If you later create a new collection
with the same name, the old tag association will apply to the new collection.
Example
1, zip:
1}, the following operation creates a tag range covering zip codes in
sh.addTagRange( "exampledb.collection",
{ state: "NY", zip: MinKey },
{ state: "NY", zip: MaxKey },
"NY"
)
Definition
sh.removeShardTag(shard, tag)
New in version 2.2.
Removes the association between a tag and a shard. Always issue sh.removeShardTag() (page 908) when
connected to a mongos (page 938) instance.
param string shard The name of the shard from which to remove a tag.
param string tag The name of the tag to remove from the shard.
See also:
sh.addShardTag() (page 904), sh.addTagRange() (page 904)
sh.help()
Returns a basic help text for all sharding related shell functions.
Database Commands
addShard
Adds either a database instance or a replica set to a sharded cluster. The optimal configuration is to deploy
575
database
instance
running
on
port27027
on
the
host
db.runCommand({addShard: "mongodb0.example.net:27027"})
Warning: Do not use localhost for the hostname unless your configuration server is also running on
localhost.
The following command adds a replica set as a shard:
db.runCommand( { addShard: "repl0/mongodb3.example.net:27327"} )
You may specify all members in the replica set. All additional hostnames must be members of the same replica set.
listShards
Use the listShards (page 737) command to return a list of configured shards. The command takes the
following form:
{ listShards: 1 }
enableSharding
The enableSharding (page 736) command enables sharding on a per-database level. Use the following
command form:
{ enableSharding: "<database name>" }
Once youve enabled sharding in a database, you can use the shardCollection (page 738) command to
begin the process of distributing data among the shards.
576
Chapter 9. Sharding
Definition
shardCollection
Enables a collection for sharding and allows MongoDB to begin distributing data among shards. You must run
enableSharding (page 736) on a database before running the shardCollection (page 738) command.
shardCollection (page 738) has the following form:
{ shardCollection: "<database>.<collection>", key: <shardkey> }
of
the
collection
to
shard
in
the
form
field document key The index specification document to use as the shard key. The index must exist
prior to the shardCollection (page 738) command, unless the collection is empty. If the
collection is empty, in which case MongoDB creates the index prior to sharding the collection.
New in version 2.4: The key may be in the form { field : "hashed" }, which will use
the specified field as a hashed shard key.
field Boolean unique When true, the unique option ensures that the underlying index enforces
a unique constraint. Hashed shard keys do not support unique constraints.
field integer numInitialChunks To support hashed sharding (page 506) added in MongoDB 2.4,
numInitialChunks specifies the number of chunks to create when sharding an collection
with a hashed shard key. MongoDB will then create and balance chunks across the cluster. The
numInitialChunks must be less than 8192.
Warning: Do not run more than one shardCollection (page 738) command on the same collection at the
same time.
Shard Keys Choosing the best shard key to effectively distribute load among your shards requires some planning.
Review Shard Keys (page 506) regarding choosing a shard key.
Hashed Shard Keys New in version 2.4.
Hashed shard keys (page 506) use a hashed index of a single field as the shard key.
Warning:
MongoDB provides no method to deactivate sharding for a collection after calling
shardCollection (page 738). Additionally, after shardCollection (page 738), you cannot change shard
keys or modify the value of any field used in your shard key index.
See also:
Sharding (page 493), Sharding Concepts (page 498), and Deploy a Sharded Cluster (page 522).
Example
The following operation enables sharding for the people collection in the records database and uses the zipcode
field as the shard key (page 506):
db.runCommand( { shardCollection: "records.people", key: { zipcode: 1 } } )
577
shardingState
shardingState (page 738) is an admin command that reports if mongod (page 925) is a member of a
sharded cluster. shardingState (page 738) has the following prototype form:
{ shardingState: 1 }
For shardingState (page 738) to detect that a mongod (page 925) is a member of a sharded cluster, the
mongod (page 925) must satisfy the following conditions:
1.the mongod (page 925) is a primary member of a replica set, and
2.the mongod (page 925) instance is a member of a sharded cluster.
If shardingState (page 738) detects that a mongod (page 925) is a member of a sharded cluster,
shardingState (page 738) returns a document that resembles the following prototype:
{
"enabled" : true,
"configServer" : "<configdb-string>",
"shardName" : "<string>",
"shardHost" : "string:",
"versions" : {
"<database>.<collection>" : Timestamp(<...>),
"<database>.<collection>" : Timestamp(<...>)
},
"ok" : 1
}
The response from shardingState (page 738) when used with a config server is:
{ "enabled": false, "ok": 1 }
Note: mongos (page 938) instances do not provide the shardingState (page 738).
Warning: This command obtains a write lock on the affected database and will block other operations until
it has completed; however, the operation is typically short lived.
removeShard
Starts the process of removing a shard from a cluster. This is a multi-stage process. Begin by issuing the
following command:
{ removeShard : "[shardName]" }
The balancer will then migrate chunks from the shard specified by [shardName]. This process happens
slowly to avoid placing undue load on the overall cluster.
The command returns immediately, with the following message:
{ msg : "draining started successfully" , state: "started" , shard: "shardName" , ok : 1 }
If you run the command again, youll see the following progress output:
{ msg: "draining ongoing" , state: "ongoing" , remaining: { chunks: 23 , dbs: 1 }, ok: 1 }
The remaining document specifies how many chunks and databases remain on the shard.
db.printShardingStatus() (page 892) to list the databases that you must move from the shard.
578
Use
Chapter 9. Sharding
Each database in a sharded cluster has a primary shard. If the shard you want to remove is also the primary of
one of the clusters databases, then you must manually move the database to a new shard. This can be only after
the shard is empty. See the movePrimary (page 743) command for details.
After removing all chunks and databases from the shard, you may issue the command again, to return:
{ msg: "remove shard completed successfully", state: "completed", host: "shardName", ok : 1 }
579
580
Chapter 9. Sharding
CHAPTER 10
This document addresses basic high level questions about MongoDB and its use.
If you dont find the answer youre looking for, check the complete list of FAQs (page 581) or post your question to
the MongoDB User Mailing List1 .
581
582
Do not use MongoDB for systems that require SQL, joins, and multi-object transactions.
583
584
This document answers common questions about application development using MongoDB.
If you dont find the answer youre looking for, check the complete list of FAQs (page 581) or post your question to
the MongoDB User Mailing List4 .
Collections are containers for documents that share one or more indexes. Databases are groups of collections stored
on disk using a single set of data files. 6
For an example acme.users namespace, acme is the database name and users is the collection name. Period
characters can occur in collection names, so that acme.user.history is a valid namespace, with acme as the
database name, and user.history as the collection name.
While data models like this appear to support nested collections, the collection namespace is flat, and there is no
difference from the perspective of MongoDB between acme, acme.users, and acme.records.
10.2.2 How do you copy all objects from one collection to another?
In the mongo (page 942) shell, you can use the following operation to duplicate the entire collection:
4 https://groups.google.com/forum/?fromgroups#!forum/mongodb-user
5
6
585
db.people.find().forEach( function(x){db.user.insert(x)} );
Note: Because this process decodes BSON documents to JSON during the copy procedure, documents may incur a
loss of type-fidelity.
Consider using mongodump (page 951) and mongorestore (page 956) to maintain type fidelity.
Also consider the cloneCollection (page 748) command that may provide some of this functionality.
587
If you shorten the filed named last_name to lname and the field name best_score to score, as follows,
you could save 9 bytes per document.
{ lname : "Smith", score : 3.9 }
Shortening field names reduces expressiveness and does not provide considerable benefit on for larger documents and where document overhead is not significant concern. Shorter field names do not reduce the size of
indexes, because indexes have a predefined structure.
In general it is not necessary to use short field names.
Embed documents.
In some cases you may want to embed documents in other documents and save on the per-document overhead.
Here, my_query then will have a value such as { name : "Joe" }. If my_query contained special characters, for example ,, :, and {, the query simply wouldnt match any documents. For example, users cannot hijack a
query and convert it to a delete.
588
JavaScript
Note: You can disable all server-side execution of JavaScript, by passing the --noscripting option on the
command line or setting noscripting (page 996) in a configuration file.
All of the following MongoDB operations permit you to run arbitrary JavaScript expressions directly on the server:
$where (page 634)
db.eval() (page 884)
mapReduce (page 701)
group (page 697)
You must exercise care in these cases to prevent users from submitting malicious JavaScript.
Fortunately, you can express most queries in MongoDB without JavaScript and for queries that require JavaScript, you
can mix JavaScript and non-JavaScript in a single query. Place all the user-supplied fields directly in a BSON field and
pass JavaScript code to the $where (page 634) field.
If you need to pass user-supplied values in a $where (page 634) clause, you may escape these values with the
CodeWScope mechanism. When you set user-submitted values as variables in the scope document, you can
avoid evaluating them on the database server.
If you need to use db.eval() (page 884) with user supplied values, you can either use a CodeWScope or
you can supply extra arguments to your function. For instance:
db.eval(function(userVal){...},
user_value);
This will ensure that your application sends user_value to the database server as data rather than code.
Dollar Sign Operator Escaping
Field names in MongoDBs query language have semantic meaning. The dollar sign (i.e $) is a reserved character
used to represent operators (page 621) (i.e. $inc (page 651).) Thus, you should ensure that your applications users
cannot inject operators into their inputs.
In some cases, you may wish to build a BSON object with a user-provided key. In these situations, keys will need
to substitute the reserved $ and . characters. Any character is sufficient, but consider using the Unicode full width
equivalents: U+FF04 (i.e. $) and U+FF0E (i.e. .).
Consider the following example:
BSONObj my_object = BSON( a_key << a_name );
The user may have supplied a $ value in the a_key value. At the same time, my_object might be { $where :
"things" }. Consider the following cases:
Insert. Inserting this into the database does no harm. The insert process does not evaluate the object as a query.
Note: MongoDB client drivers, if properly implemented, check for reserved characters in keys on inserts.
Update. The update() (page 849) operation permits $ operators in the update argument but does not support
the $where (page 634) operator. Still, some users may be able to inject operators that can manipulate a single
document only. Therefore your application should escape keys, as mentioned above, if reserved characters are
possible.
589
Query Generally this is not a problem for queries that resemble { x : user_obj }: dollar signs are
not top level and have no effect. Theoretically it may be possible for the user to build a query themselves.
But checking the user-submitted content for $ characters in key names may help protect against this kind of
injection.
Driver-Specific Issues
See the PHP MongoDB Driver Security Notes8 page in the PHP driver documentation for more information
When comparing values of different BSON types, MongoDB uses the following comparison order, from lowest to
highest:
1. MinKey (internal type)
2. Null
3. Numbers (ints, longs, doubles)
4. Symbol, String
5. Object
6. Array
7. BinData
8. ObjectID
9. Boolean
10. Date, Timestamp
11. Regular Expression
12. MaxKey (internal type)
Note: MongoDB treats some types as equivalent for comparison purposes. For instance, numeric types undergo
conversion before comparison.
Consider the following mongo (page 942) example:
8 http://us.php.net/manual/en/mongo.security.php
590
db.test.insert(
db.test.insert(
db.test.insert(
db.test.insert(
{x
{x
{x
{x
:
:
:
:
3 } );
2.9 } );
new Date() } );
true } );
db.test.find().sort({x:1});
{ "_id" : ObjectId("4b03155dce8de6586fb002c7"),
{ "_id" : ObjectId("4b03154cce8de6586fb002c6"),
{ "_id" : ObjectId("4b031566ce8de6586fb002c9"),
{ "_id" : ObjectId("4b031563ce8de6586fb002c8"),
"x"
"x"
"x"
"x"
:
:
:
:
2.9 }
3 }
true }
"Tue Nov 17 2009 16:28:03 GMT-0500 (EST)" }
The $type (page 630) operator provides access to BSON type comparison in the MongoDB query syntax. See the
documentation on BSON types and the $type (page 630) operator for additional information.
Warning: Storing values of the different types in the same field in a collection is strongly discouraged.
See also:
The Tailable Cursors (page 82) page for an example of a C++ use of MinKey.
The { cancelDate : { $type: 10 } } query matches documents that contains the cancelDate
field whose value is null only; i.e. the value of the cancelDate field is of BSON Type Null (i.e. 10) :
db.test.find( { cancelDate : { $type: 10 } } )
The query returns only the document that contains the null value:
{ "_id" : 1, "cancelDate" : null }
The { cancelDate :
cancelDate field:
{ $exists:
The query returns only the document that does not contain the cancelDate field:
{ "_id" : 2 }
See also:
The reference documentation for the $type (page 630) and $exists (page 629) operators.
10.2. FAQ: MongoDB for Application Developers
591
db.getCollection("_foo").insert( { a : 1 } )
To perform a query, use the find() (page 816) method, in as the following:
db.getCollection("_foo").find()
As a cursor returns documents other operations may interleave with the query: if some of these operations are updates (page 50) that cause the
document to move (in the case of a table scan, caused by document growth,) or that change the indexed field on the index used by the query; then
the cursor will return the same document more than once.
11 MongoDB does not permit changes to the value of the _id field; it is not possible for a cursor that transverses this index to pass the same
document more than once.
592
593
See also:
Padding Factor (page 65)
How can I enter multi-line operations in the mongo shell? (page 594)
How can I access different databases temporarily? (page 594)
Does the mongo shell support tab completion and other keyboard shortcuts? (page 594)
How can I customize the mongo shell prompt? (page 595)
Can I edit long shell operations with an external text editor? (page 595)
You can exit the line continuation mode if you enter two blank lines, as in the following example:
> if (x > 0
...
...
>
10.3.3 Does the mongo shell support tab completion and other keyboard shortcuts?
The mongo (page 942) shell supports keyboard shortcuts. For example,
Use the up/down arrow keys to scroll through command history. See .dbshell (page 946) documentation for
more information on the .dbshell file.
594
Use <Tab> to autocomplete or to list the completion possibilities, as in the following example which uses
<Tab> to complete the method name starting with the letter c:
db.myCollection.c<Tab>
Because there are many collection methods starting with the letter c, the <Tab> will list the various methods
that start with c.
For a full list of the shortcuts, see Shell Keyboard Shortcuts (page 947)
The mongo (page 942) shell prompt should now reflect the new prompt:
[email protected]>
The mongo (page 942) shell prompt should now reflect the new prompt:
Uptime:1052 Documents:25024787 >
You can add the logic for the prompt in the .mongorc.js (page 946) file to set the prompt each time you start up the
mongo (page 942) shell.
10.3.5 Can I edit long shell operations with an external text editor?
You can use your own editor in the mongo (page 942) shell by setting the EDITOR (page 946) environment variable
before starting the mongo (page 942) shell. Once in the mongo (page 942) shell, you can edit with the specified editor
by typing edit <variable> or edit <function>, as in the following example:
1. Set the EDITOR (page 946) variable from the command line prompt:
EDITOR=vim
595
edit myFunction
The command should open the vim edit session. Remember to save your changes.
5. Type myFunction to see the function definition:
myFunction
596
597
Operation
Issue a query
Get more data
from a cursor
Insert data
Remove data
Update data
Map-reduce
Create an index
db.eval()
(page 884)
eval (page 722)
aggregate()
(page 808)
Lock Type
Read lock
Read lock
Write lock
Write lock
Write lock
Read lock and write lock, unless operations are specified as non-atomic. Portions of
map-reduce jobs can run concurrently.
Building an index in the foreground, which is the default, locks the database for extended
periods of time.
Write lock. db.eval() (page 884) blocks all other JavaScript processes.
Write lock. If used with the nolock lock option, the eval (page 722) option does not take
a write lock and cannot write data to the database.
Read lock
598
10.4.7 Does a MongoDB operation ever lock more than one database?
The following MongoDB operations lock multiple databases:
db.copyDatabase() (page 878) must lock the entire mongod (page 925) instance at once.
Journaling, which is an internal operation, locks all databases for short intervals. All databases share a single
journal.
User authentication (page 237) locks the admin database as well as the database the user is accessing.
All writes to a replica sets primary lock both the database receiving the writes and the local database. The
lock for the local database allows the mongod (page 925) to write to the primarys oplog.
10.4.11 What kind of concurrency does MongoDB provide for JavaScript operations?
Changed in version 2.4: The V8 JavaScript engine added in 2.4 allows multiple JavaScript operations to run at the
same time. Prior to 2.4, a single mongod (page 925) could only run a single JavaScript operation at once.
599
This document answers common questions about horizontal scaling using MongoDBs sharding.
If you dont find the answer youre looking for, check the complete list of FAQs (page 581) or post your question to
the MongoDB User Mailing List15 .
600
601
The exact method for distributing queries to shards in a cluster depends on the nature of the query and the configuration
of the sharded cluster. Consider a sharded collection, using the shard key user_id, that has last_login and
email attributes:
For a query that selects one or more values for the user_id key:
mongos (page 938) determines which shard or shards contains the relevant data, based on the cluster metadata,
and directs a query to the required shard or shards, and returns those results to the client.
For a query that selects user_id and also performs a sort:
mongos (page 938) can make a straightforward translation of this operation into a number of queries against the
relevant shards, ordered by user_id. When the sorted queries return from all shards, the mongos (page 938)
merges the sorted results and returns the complete result to the client.
For queries that select on last_login:
These queries must run on all shards: mongos (page 938) must parallelize the query over the shards and perform
a merge-sort on the email of the documents found.
10.5.10 How does MongoDB ensure unique _id field values when using a shard
key other than _id?
If you do not use _id as the shard key, then your application/client layer must be responsible for keeping the _id
field unique. It is problematic for collections to have duplicate _id values.
If youre not sharding your collection by the _id field, then you should be sure to store a globally unique identifier in
that field. The default BSON ObjectID (page 103) works well in this case.
10.5.11 Ive enabled sharding and added a second shard, but all the data is still on
one server. Why?
First, ensure that youve declared a shard key for your collection. Until you have configured the shard key, MongoDB
will not create chunks, and sharding will not occur.
Next, keep in mind that the default chunk size is 64 MB. As a result, in most situations, the collection needs to have at
least 64 MB of data before a migration will occur.
Additionally, the system which balances chunks among the servers attempts to avoid superfluous migrations. Depending on the number of shards, your shard key, and the amount of data, systems often require at least 10 chunks of data
to trigger migrations.
You can run db.printShardingStatus() (page 892) to see all the chunks present in your cluster.
602
10.5.18 What is the process for moving, renaming, or changing the number of config servers?
See Sharded Cluster Tutorials (page 521) for information on migrating and replacing config servers.
603
10.5.20 Is it possible to quickly update mongos servers after updating a replica set
configuration?
The mongos (page 938) instances will detect these changes without intervention over time. However, if you want
to force the mongos (page 938) to reload its configuration, run the flushRouterConfig (page 735) command
against to each mongos (page 938) directly.
604
on a single shard. Although the system will eventually divide this chunk and migrate its contents to distribute data
more evenly, at any moment the cluster can only direct insert operations at a single shard. This can limit the throughput
of inserts. If most of your write operations are updates, this limitation should not impact your performance. However,
if you have a high insert volume, this may be a limitation.
To address this issue, MongoDB 2.4 provides hashed shard keys (page 506).
mongod (page 925) issues this message if, during a chunk migration (page 518), the shard could not connect to the
config database to update chunk information at the end of the migration process. If the shard cannot update the config
database after moveChunk (page 742), the cluster will have an inconsistent view of all chunks. In these situations, the
primary member of the shard will terminate itself to prevent data inconsistency. If the secondary member can access
the config database, the shards data will be accessible after an election. Administrators will need to resolve the chunk
migration failure independently.
If you encounter this issue, contact the MongoDB User Group18 or MongoDB support to address this issue.
10.5.27 How does draining a shard affect the balancing of uneven chunk distribution?
The sharded cluster balancing process controls both migrating chunks from decommissioned shards (i.e. draining,) and
normal cluster balancing activities. Consider the following behaviors for different versions of MongoDB in situations
where you remove a shard in a cluster with an uneven chunk distribution:
After MongoDB 2.2, the balancer first removes the chunks from the draining shard and then balances the remaining uneven chunk distribution.
Before MongoDB 2.2, the balancer handles the uneven chunk distribution and then removes the chunks from
the draining shard.
605
606
It may take 10-30 seconds for the members of a replica set to declare a primary inaccessible. This triggers an election.
During the election, the cluster is unavailable for writes.
The election itself may take another 10-30 seconds.
Note: Eventually consistent reads, like the ones that will return from a replica set are only possible with a write
concern that permits reads from secondary members.
10.6.5 Does replication work over the Internet and WAN connections?
Yes.
For example, a deployment may maintain a primary and secondary in an East-coast data center along with a secondary
member for disaster recovery in a West-coast data center.
See also:
Deploy a Geographically Distributed Replica Set (page 425)
10.6.8 What is the preferred replication method: replica sets or replica pairs?
Deprecated since version 1.6.
Replica sets replaced replica pairs in version 1.6. Replica sets are the preferred replication mechanism in MongoDB.
607
Note: Journaling requires some resource overhead for write operations. Journaling has no effect on read performance,
however.
Journaling is enabled by default on all 64-bit builds of MongoDB v2.0 and greater.
10.6.10 Are write operations durable if write concern does not acknowledge
writes?
Yes.
However, if you want confirmation that a given write has arrived at the server, use write concern (page 55). The
getLastError (page 720) command provides the facility for write concern. However, after the default write
concern change (page 1089), the default write concern acknowledges all write operations, and unacknowledged writes
must be explicitly configured. See the MongoDB Drivers and Client Libraries (page 95) documentation for your driver
for more information.
10.6.12 What information do arbiters exchange with the rest of the replica set?
Arbiters never receive the contents of a collection but do exchange the following data with the rest of the replica set:
Credentials used to authenticate the arbiter with the replica set. All MongoDB processes within a replica set use
keyfiles. These exchanges are encrypted.
Replica set configuration data and voting data. This information is not encrypted. Only credential exchanges
are encrypted.
If your MongoDB deployment uses SSL, then all communications between arbiters and the other members of the
replica set are secure. See the documentation for Connect to MongoDB with SSL (page 249) for more information.
Run all arbiters on secure networks, as with all MongoDB components.
See
The overview of Arbiter Members of Replica Sets (page ??).
608
10.6.15 Is it normal for replica set members to use different amounts of disk space?
Yes.
Factors including: different oplog sizes, different levels of storage fragmentation, and MongoDBs data file preallocation can lead to some variation in storage utilization between nodes. Storage use disparities will be most pronounced when you add members at different times.
609
610
10.7.5 What is the difference between soft and hard page faults?
Page faults occur when MongoDB needs access to data that isnt currently in active memory. A hard page fault
refers to situations when MongoDB must access a disk to access the data. A soft page fault, by contrast, merely
moves memory pages from one list to another, such as from an operating system file cache. In production, MongoDB
will rarely encounter soft page faults.
10.7.8 Why are the files in my data directory larger than the data in my database?
The data files in your data directory, which is the /data/db directory in default configurations, might be larger than
the data set inserted into the database. Consider the following possible causes:
Preallocated data files.
In the data directory, MongoDB preallocates data files to a particular size, in part to prevent file system fragmentation. MongoDB names the first data file <databasename>.0, the next <databasename>.1, etc. The
first file mongod (page 925) allocates is 64 megabytes, the next 128 megabytes, and so on, up to 2 gigabytes, at
which point all subsequent files are 2 gigabytes. The data files include files with allocated space but that hold no
data. mongod (page 925) may allocate a 1 gigabyte data file that may be 90% empty. For most larger databases,
unused allocated space is small compared to the database.
On Unix-like systems, mongod (page 925) preallocates an additional data file and initializes the disk space to 0.
Preallocating data files in the background prevents significant delays when a new database file is next allocated.
You can disable preallocation with the noprealloc (page 996) run time option. However noprealloc
(page 996) is not intended for use in production environments: only use noprealloc (page 996) for testing
and with small data sets where you frequently drop databases.
On Linux systems you can use hdparm to get an idea of how costly allocation might be:
time hdparm --fallocate $((1024*1024)) testfile
The oplog.
611
If this mongod (page 925) is a member of a replica set, the data directory includes the oplog.rs file, which is
a preallocated capped collection in the local database. The default allocation is approximately 5% of disk
space on 64-bit installations, see Oplog Sizing (page 411) for more information. In most cases, you should not
need to resize the oplog. However, if you do, see Change the Size of the Oplog (page 446).
The journal.
The data directory contains the journal files, which store write operations on disk prior to MongoDB applying
them to databases. See Journaling Mechanics (page 232).
Empty records.
MongoDB maintains lists of empty records in data files when deleting documents and collections. MongoDB
can reuse this space, but will never return this space to the operating system.
To de-fragment allocated storage, use compact (page 752), which de-fragments allocated space which allows.
By de-fragmenting storage, MongoDB to can effectively use the allocated space. compact (page 752) requires
up to 2 gigabytes of extra disk space to run. Do not use compact (page 752) if you are critically low on disk
space.
Important: compact (page 752) only removes fragmentation from MongoDB data files and does not return
any disk space to the operating system.
To reclaim deleted space, use repairDatabase (page 757), which rebuilds the database which de-fragments
the storage and may release space to the operating systme. repairDatabase (page 757) requires up to 2
gigabytes of extra disk space to run. Do not use repairDatabase (page 757) if you are critically low on
disk space.
Warning: repairDatabase (page 757) requires enough free disk space to hold both the old and new
database files while the repair is running. Be aware that repairDatabase (page 757) will block all other
operations and may take a long time to complete.
612
Check the value of indexSizes (page 765) in the output of the db.collection.stats() (page 848)
command.
Example
Issue the following command to retrieve index namespaces:
db.system.namespaces.find()
View the size of the data allocated for the orders.$_id_ index with the following sequence of operations:
use test
db.orders.$_id_.stats().indexSizes
10.7.11 How do I know when the server runs out of disk space?
If your server runs out of disk space for data files, you will see something like this in the log:
Thu
Thu
Thu
Thu
Thu
Thu
Aug
Aug
Aug
Aug
Aug
Aug
11
11
11
11
11
11
13:06:09
13:06:09
13:06:09
13:06:19
13:06:19
13:06:19
[FileAllocator]
[FileAllocator]
[FileAllocator]
[FileAllocator]
[FileAllocator]
[FileAllocator]
The server remains in this state forever, blocking all writes including deletes. However, reads still work. To delete
some data and compact, using the compact (page 752) command, you must restart the server first.
If your server runs out of disk space for journal files, the server process will exit. By default, mongod (page 925)
creates journal files in a sub-directory of dbpath (page 993) named journal. You may elect to put the journal files
on another storage device using a filesystem mount or a symlink.
Note: If you place the journal files on a separate storage device you will not be able to use a file system snapshot tool
to capture a consistent snapshot of your data files and journal files.
613
614
10.8.11 Can I use a multi-key index to support a query for a whole array?
Not entirely. The index can partially support these queries because it can speed the selection of the first element of
the array; however, comparing all subsequent items in the array cannot use the index and must scan the documents
individually.
615
10.8.12 How can I effectively use indexes strategy for attribute lookups?
For simple attribute lookups that dont require sorted result sets or range queries, consider creating a field that contains
an array of documents where each document has a field (e.g. attrib ) that holds a specific type of attribute. You can
index this attrib field.
For example, the attrib field in the following document allows you to add an unlimited number of attributes types:
{ _id : ObjectId(...),
attrib : [
{ k: "color",
{ k: "shape":
{ k: "color":
{ k: "avail":
]
}
v:
v:
v:
v:
"red" },
"rectangle" },
"blue" },
true }
1, "attrib.v":
1 } index:
Where can I find information about a mongod process that stopped running unexpectedly? (page 616)
Does TCP keepalive time affect sharded clusters and replica sets? (page 617)
What tools are available for monitoring MongoDB? (page 617)
Memory Diagnostics (page 617)
Do I need to configure swap space? (page 617)
What is working set and how can I estimate its size? (page 618)
Must my working set size fit RAM? (page 618)
How do I calculate how much RAM I need for my application? (page 618)
How do I read memory statistics in the UNIX top command (page 619)
Sharded Cluster Diagnostics (page 619)
In a new sharded cluster, why does all data remains on one shard? (page 619)
Why would one shard receive a disproportion amount of traffic in a sharded cluster? (page 619)
What can prevent a sharded cluster from balancing? (page 620)
Why do chunk migrations affect sharded cluster performance? (page 620)
10.9.1 Where can I find information about a mongod process that stopped running
unexpectedly?
If mongod (page 925) shuts down unexpectedly on a UNIX or UNIX-based platform, and if mongod (page 925) fails
to log a shutdown or error message, then check your system logs for messages pertaining to MongoDB. For example,
23 https://groups.google.com/forum/?fromgroups#!forum/mongodb-user
616
10.9.2 Does TCP keepalive time affect sharded clusters and replica sets?
If you experience socket errors between members of a sharded cluster or replica set, that do not have other reasonable causes, check the TCP keep alive value, which Linux systems store as the tcp_keepalive_time value. A
common keep alive period is 7200 seconds (2 hours); however, different distributions and OS X may have different
settings. For MongoDB, you will have better experiences with shorter keepalive periods, on the order of 300 seconds
(five minutes).
On Linux systems you can use the following operation to check the value of tcp_keepalive_time:
cat /proc/sys/net/ipv4/tcp_keepalive_time
You can change the tcp_keepalive_time value with the following operation:
echo 300 > /proc/sys/net/ipv4/tcp_keepalive_time
The new tcp_keepalive_time value takes effect without requiring you to restart the mongod (page 925)
or mongos (page 938) servers. When you reboot or restart your system you will need to set the new
tcp_keepalive_time value, or see your operating systems documentation for setting the TCP keepalive value
persistently.
For OS X systems, issue the following command to view the keep alive setting:
sysctl net.inet.tcp.keepinit
If your replica set or sharded cluster experiences keepalive-related issues, you must alter the
tcp_keepalive_time value on all machines hosting MongoDB processes. This includes all machines
hosting mongos (page 938) or mongod (page 925) servers.
Windows users should consider the Windows Server Technet Article on KeepAliveTime configuration24 for more
information on setting keep alive for MongoDB deployments on Windows systems.
617
as something like a steam release valve that allows the system to release extra pressure without affecting the overall
functioning of the system.
Nevertheless, systems running MongoDB do not need swap for routine operation. Database files are memory-mapped
(page 610) and should constitute most of your MongoDB memory use. Therefore, it is unlikely that mongod
(page 925) will ever use any swap space in normal operation. The operating system will release memory from the
memory mapped files without needing swap and MongoDB can write data to the data files without needing the swap
system.
What is working set and how can I estimate its size?
The working set for a MongoDB database is the portion of your data that clients access most often. You can estimate
size of the working set, using the workingSet (page 795) document in the output of serverStatus (page 782).
To return serverStatus (page 782) with the workingSet (page 795) document, issue a command in the following form:
db.runCommand( { serverStatus: 1, workingSet: 1 } )
618
619
620
CHAPTER 11
Reference
Query Selectors
Comparison
621
Name
$gt (page 622)
$gte (page 622)
$in (page 623)
$lt (page 623)
$lte (page 624)
$ne (page 624)
$nin (page 624)
Description
Matches values that are greater than the value specified in the query.
Matches values that are equal to or greater than the value specified in the query.
Matches any of the values that exist in an array specified in the query.
Matches values that are less than the value specified in the query.
Matches values that are less than or equal to the value specified in the query.
Matches all values that are not equal to the value specified in the query.
Matches values that do not exist in an array specified to the query.
$gt
$gt
Syntax: {field:
{$gt:
value} }
$gt (page 622) selects those documents where the value of the field is greater than (i.e. >) the specified
value.
Consider the following example:
db.inventory.find( { qty: { $gt: 20 } } )
This query will select all documents in the inventory collection where the qty field value is greater than
20.
Consider the following example which uses the $gt (page 622) operator with a field from an embedded document:
db.inventory.update( { "carrier.fee": { $gt: 2 } }, { $set: { price: 9.99 } } )
This update() (page 849) operation will set the value of the price field in the documents that contain the
embedded document carrier whose fee field value is greater than 2.
See also:
find() (page 816), update() (page 849), $set (page 655).
$gte
$gte
Syntax: {field:
{$gte:
value} }
$gte (page 622) selects the documents where the value of the field is greater than or equal to (i.e. >=) a
specified value (e.g. value.)
Consider the following example:
db.inventory.find( { qty: { $gte: 20 } } )
This query would select all documents in inventory where the qty field value is greater than or equal to 20.
Consider the following example which uses the $gte (page 622) operator with a field from an embedded
document:
db.inventory.update( { "carrier.fee": { $gte: 2 } }, { $set: { price: 9.99 } } )
This update() (page 849) operation will set the value of the price field that contain the embedded document
carrier whose fee field value is greater than or equal to 2.
See also:
find() (page 816), update() (page 849), $set (page 655).
622
$in
$in
Syntax: { field:
{ $in:
<valueN> ] } }
$in (page 623) selects the documents where the field value equals any value in the specified array (e.g.
<value1>, <value2>, etc.)
Consider the following example:
db.inventory.find( { qty: { $in: [ 5, 15 ] } } )
This query selects all documents in the inventory collection where the qty field value is either 5 or 15.
Although you can express this query using the $or (page 625) operator, choose the $in (page 623) operator
rather than the $or (page 625) operator when performing equality checks on the same field.
If the field holds an array, then the $in (page 623) operator selects the documents whose field holds an
array that contains at least one element that matches a value in the specified array (e.g. <value1>, <value2>,
etc.)
Consider the following example:
db.inventory.update(
{ tags: { $in: ["appliances", "school"] } },
{ $set: { sale:true } }
)
This update() (page 849) operation will set the sale field value in the inventory collection where the
tags field holds an array with at least one element matching an element in the array ["appliances",
"school"].
Note: When using two or more $in (page 623) expressions, the product of the number of distinct elements
in the $in (page 623) arrays must be less than 4000000. Otherwise, MongoDB will throw an exception of
"combinatorial limit of $in partitioning of result set exceeded".
See also:
find() (page 816), update() (page 849), $or (page 625), $set (page 655).
$lt
$lt
Syntax: {field:
{$lt:
value} }
$lt (page 623) selects the documents where the value of the field is less than (i.e. <) the specified value.
Consider the following example:
db.inventory.find( { qty: { $lt: 20 } } )
This query will select all documents in the inventory collection where the qty field value is less than 20.
Consider the following example which uses the $lt (page 623) operator with a field from an embedded document:
db.inventory.update( { "carrier.fee": { $lt: 20 } }, { $set: { price: 9.99 } } )
This update() (page 849) operation will set the price field value in the documents that contain the embedded document carrier whose fee field value is less than 20.
See also:
find() (page 816), update() (page 849), $set (page 655).
11.1. MongoDB Interface
623
$lte
$lte
Syntax: { field:
{ $lte:
value} }
$lte (page 624) selects the documents where the value of the field is less than or equal to (i.e. <=) the
specified value.
Consider the following example:
db.inventory.find( { qty: { $lte: 20 } } )
This query will select all documents in the inventory collection where the qty field value is less than or
equal to 20.
Consider the following example which uses the $lt (page 623) operator with a field from an embedded document:
db.inventory.update( { "carrier.fee": { $lte: 5 } }, { $set: { price: 9.99 } } )
This update() (page 849) operation will set the price field value in the documents that contain the embedded document carrier whose fee field value is less than or equal to 5.
See also:
find() (page 816), update() (page 849), $set (page 655).
$ne
$ne
Syntax: {field:
{$ne:
value} }
$ne (page 624) selects the documents where the value of the field is not equal (i.e. !=) to the specified
value. This includes documents that do not contain the field.
Consider the following example:
db.inventory.find( { qty: { $ne: 20 } } )
This query will select all documents in the inventory collection where the qty field value does not equal
20, including those documents that do not contain the qty field.
Consider the following example which uses the $ne (page 624) operator with a field from an embedded document:
db.inventory.update( { "carrier.state": { $ne: "NY" } }, { $set: { qty: 20 } } )
This update() (page 849) operation will set the qty field value in the documents that contains the embedded
document carrier whose state field value does not equal NY, or where the state field or the carrier
embedded document does not exist.
See also:
find() (page 816), update() (page 849), $set (page 655).
$nin
$nin
Syntax: { field:
{ $nin:
<valueN> ]} }
624
This query will select all documents in the inventory collection where the qty field value does not equal 5
nor 15. The selected documents will include those documents that do not contain the qty field.
If the field holds an array, then the $nin (page 624) operator selects the documents whose field holds an
array with no element equal to a value in the specified array (e.g. <value1>, <value2>, etc.).
Consider the following query:
This update() (page 849) operation will set the sale field value in the inventory collection where
the tags field holds an array with no elements matching an element in the array ["appliances",
"school"] or where a document does not contain the tags field.
See also:
find() (page 816), update() (page 849), $set (page 655).
Logical
Name
$or (page 625)
$and
(page 626)
$not
(page 627)
$nor
(page 628)
Description
Joins query clauses with a logical OR returns all documents that match the conditions
clause.
Joins query clauses with a logical AND returns all documents that match the condition
clauses.
Inverts the effect of a query expression and returns documents that do not match the q
expression.
Joins query clauses with a logical NOR returns all documents that fail to match both cl
$or
$or
New in version 1.6.
Changed in version 2.0: You may nest $or (page 625) operations; however, these expressions are not as efficiently optimized as top-level.
Syntax:
{ $or:
<expressionN> } ] }
, {
The $or (page 625) operator performs a logical OR operation on an array of two or more <expressions>
and selects the documents that satisfy at least one of the <expressions>.
Consider the following query:
db.inventory.find( { price:1.99, $or: [ { qty: { $lt: 20 } }, { sale: true } ] } )
This query will select all documents in the inventory collection where:
the price field value equals 1.99 and
either the qty field value is less than 20 or the sale field value is true.
625
Consider the following example which uses the $or (page 625) operator to select fields from embedded documents:
This update() (page 849) operation will set the value of the sale field in the documents in the inventory
collection where:
the price field value equals 10.99 or
the carrier embedded document contains a field state whose value equals NY.
When using $or (page 625) with <expressions> that are equality checks for the value of the same field,
choose the $in (page 623) operator over the $or (page 625) operator.
Consider the query to select all documents in the inventory collection where:
either price field value equals 1.99 or the sale field value equals true, and
either qty field value equals 20 or qty field value equals 50,
The most effective query would be:
db.inventory.find ( { $or: [ { price: 1.99 }, { sale: true } ], qty: { $in: [20, 50] } } )
Consider the following behaviors when using the $or (page 625) operator:
When using indexes with $or (page 625) queries, remember that each clause of an $or (page 625) query
will execute in parallel. These clauses can each use their own index. Consider the following query:
db.inventory.find ( { $or: [ { price: 1.99 }, { sale: true } ] } )
For this query, you would create one index on price ( db.inventory.ensureIndex( { price:
1 } ) ) and another index on sale ( db.inventory.ensureIndex( { sale: 1 } ) )
rather than a compound index.
Also, when using the $or (page 625) operator with the sort() (page 872) method in a query, the query
will not use the indexes on the $or (page 625) fields. Consider the following query which adds a sort()
(page 872) method to the above query:
db.inventory.find ( { $or: [ { price: 1.99 }, { sale: true } ] } ).sort({item:1})
This modified query will not use the index on price nor the index on sale.
You cannot use the $or (page 625) with 2d geospatial queries (page 330).
See also:
find() (page 816), update() (page 849), $set (page 655), $and (page 626), sort() (page 872).
$and
$and
New in version 2.0.
Syntax:
{ $and:
<expressionN> } ] }
, {
$and (page 626) performs a logical AND operation on an array of two or more expressions (e.g.
<expression1>, <expression2>, etc.) and selects the documents that satisfy all the expressions
in the array. The $and (page 626) operator uses short-circuit evaluation. If the first expression (e.g.
<expression1>) evaluates to false, MongoDB will not evaluate the remaining expressions.
Consider the following example:
626
This query will select all documents in the inventory collection where:
price field value equals 1.99 and
qty field value is less than 20 and
sale field value is equal to true.
MongoDB provides an implicit AND operation when specifying a comma separated list of expressions. For
example, you may write the above query as:
db.inventory.find( { price: 1.99, qty: { $lt: 20 } , sale: true } )
If, however, a query requires an AND operation on the same field such as { price: { $ne: 1.99 }
} AND { price: { $exists: true } }, then either use the $and (page 626) operator for the
two separate expressions or combine the operator expressions for the field { price: { $ne: 1.99,
$exists: true } }.
Consider the following examples:
Both update() (page 849) operations will set the value of the qty field in documents where:
the price field value does not equal 1.99 and
the price field exists.
See also:
find() (page 816), update() (page 849), $ne (page 624), $exists (page 629), $set (page 655).
$not
$not
Syntax: { field:
{ $not:
{ <operator-expression> } } }
$not (page 627) performs a logical NOT operation on the specified <operator-expression> and selects
the documents that do not match the <operator-expression>. This includes documents that do not
contain the field.
Consider the following query:
db.inventory.find( { price: { $not: { $gt: 1.99 } } } )
This query will select all documents in the inventory collection where:
the price field value is less than or equal to 1.99 or
the price field does not exist
{ $not: { $gt: 1.99 } } is different from the $lte (page 624) operator. { $lte:
returns only the documents where price field exists and its value is less than or equal to 1.99.
1.99 }
Remember that the $not (page 627) operator only affects other operators and cannot check fields and documents independently. So, use the $not (page 627) operator for logical disjunctions and the $ne (page 624)
operator to test the contents of fields directly.
Consider the following behaviors when using the $not (page 627) operator:
627
The operation of the $not (page 627) operator is consistent with the behavior of other operators but may
yield unexpected results with some data types like arrays.
The $not (page 627) operator does not support operations with the $regex (page 633) operator. Instead
use http://docs.mongodb.org/manual/ or in your driver interfaces, use your languages regular
expression capability to create regular expression objects.
Consider
the
following
example
which
http://docs.mongodb.org/manual/:
uses
the
pattern
match
expression
The query will select all documents in the inventory collection where the item field value does not
start with the letter p.
If you are using Python, you can write the above query with the PyMongo driver and Pythons
python:re.compile() method to compile a regular expression, as follows:
import re
for noMatch in db.inventory.find( { "item": { "$not": re.compile("^p.*") } } ):
print noMatch
See also:
find() (page 816), update() (page 849), $set (page 655), $gt (page 622), $regex (page 633), PyMongo1 , driver.
$nor
$nor
Syntax:
{ $nor:
<expressionN> } ] }
$nor (page 628) performs a logical NOR operation on an array of two or more <expressions> and selects
the documents that fail all the <expressions> in the array.
Consider the following example:
db.inventory.find( { $nor: [ { price: 1.99 }, { qty: { $lt: 20 } }, { sale: true } ] } )
This query will select all documents in the inventory collection where:
the price field value does not equal 1.99 and
the qty field value is not less than 20 and
the sale field value is not equal to true
including those documents that do not contain these field(s).
The exception in returning documents that do not contain the field in the $nor (page 628) expression is when
the $nor (page 628) operator is used with the $exists (page 629) operator.
Consider the following query which uses only the $nor (page 628) operator:
db.inventory.find( { $nor: [ { price: 1.99 }, { sale: true } ]
} )
628
contain the price field whose value is not equal to 1.99 but do not contain the sale field or
do not contain the price field but contain the sale field whose value is not equal to true or
do not contain the price field and do not contain the sale field
Compare that with the following query which uses the $nor (page 628) operator with the $exists (page 629)
operator:
db.inventory.find( { $nor: [ { price: 1.99 }, { price: { $exists: false } },
{ sale: true }, { sale: { $exists: false } } ] } )
Description
Matches documents that have the specified field.
Selects documents if a field is of the specified type.
$exists
Definition
$exists
Syntax: { field:
{ $exists:
<boolean> } }
$exists (page 629) selects the documents that contain the field if <boolean> is true. If <boolean> is
false, the query only returns the documents that do not contain the field. $exists (page 629) does match
documents that contain the field that stores the null value.
When you specify true to the $exist operator, the query will select documents where the value of the
specified field is null. If you specify false to $exist, the query will not match fields that hold the null
value.
MongoDB $exists does not correspond to SQL operator exists. For SQL exists, refer to the $in
(page 623) operator.
See also:
$nin (page 624), $in (page 623), and How do I query for fields that have null values? (page 591).
Examples
Exists and Not Equal To Consider the following example:
db.inventory.find( { qty: { $exists: true, $nin: [ 5, 15 ] } } )
This query will select all documents in the inventory collection where the qty field exists and its value does not
equal 5 or 15.
629
Null Values Given a collection named records with the following documents:
{
{
{
{
{
{
{
{
{
{
a:
a:
a:
a:
a:
a:
a:
b:
b:
c:
5, b:
3, b:
null,
1, b:
2, c:
3, b:
4 }
2, c:
2 }
6 }
5, c:
null,
b: 3,
2, c:
5 }
2 }
null }
c: 8 }
c: 9 }
3 }
4 }
Result:
{
{
{
{
{
{
{
a:
a:
a:
a:
a:
a:
a:
5, b:
3, b:
null,
1, b:
2, c:
3, b:
4 }
5, c:
null,
b: 3,
2, c:
5 }
2 }
null }
c: 8 }
c: 9 }
3 }
Query:
db.records.find( { b: { $exists: false } } )
Result:
{ a: 2, c: 5 }
{ a: 4 }
{ c: 6 }
Query:
db.records.find( { c: { $exists: false } } )
Result:
{ a: 3, b: 2 }
{ a: 4 }
{ b: 2 }
$type
$type
Syntax: { field:
{ $type:
<BSON type> } }
$type (page 630) selects the documents where the value of the field is the specified BSON type.
Consider the following example:
db.inventory.find( { price: { $type : 1 } } )
This query will select all documents in the inventory collection where the price field value is a Double.
630
If the field holds an array, the $type (page 630) operator performs the type check against the array elements
and not the field.
Consider the following example where the tags field holds an array:
db.inventory.find( { tags: { $type : 4 } } )
This query will select all documents in the inventory collection where the tags array contains an element
that is itself an array.
If instead you want to determine whether the tags field is an array type, use the $where (page 634) operator:
db.inventory.find( { $where : "Array.isArray(this.tags)" } )
See the SERVER-14752 for more information about the array type.
Refer to the following table for the available BSON types and their corresponding numbers.
Type
Double
String
Object
Array
Binary data
Undefined (deprecated)
Object id
Boolean
Date
Null
Regular Expression
JavaScript
Symbol
JavaScript (with scope)
32-bit integer
Timestamp
64-bit integer
Min key
Max key
Number
1
2
3
4
5
6
7
8
9
10
11
13
14
15
16
17
18
255
127
MinKey and MaxKey compare less than and greater than all other possible BSON element values, respectively,
and exist primarily for internal use.
Note: To query if a field value is a MinKey, you must use the $type (page 630) with -1 as in the following
example:
db.collection.find( { field: { $type: -1 } } )
Example
Consider the following example operation sequence that demonstrates both type comparison and the special
MinKey and MaxKey values:
db.test.insert(
db.test.insert(
db.test.insert(
db.test.insert(
{x
{x
{x
{x
:
:
:
:
3});
2.9} );
new Date()} );
true } );
2 https://jira.mongodb.org/browse/SERVER-1475
631
db.test.insert( {x : MaxKey } )
db.test.insert( {x : MinKey } )
db.test.find().sort({x:1})
{ "_id" : ObjectId("4b04094b7c65b846e2090112"),
{ "_id" : ObjectId("4b03155dce8de6586fb002c7"),
{ "_id" : ObjectId("4b03154cce8de6586fb002c6"),
{ "_id" : ObjectId("4b031566ce8de6586fb002c9"),
{ "_id" : ObjectId("4b031563ce8de6586fb002c8"),
{ "_id" : ObjectId("4b0409487c65b846e2090111"),
"x"
"x"
"x"
"x"
"x"
"x"
:
:
:
:
:
:
{ $minKey : 1 } }
2.9 }
3 }
true }
"Tue Jul 25 2012 18:42:03 GMT-0500 (EST)"
{ $maxKey : 1 } }
To query for the minimum value of a shard key of a sharded cluster, use the following operation when connected
to the mongos (page 938):
use config
db.chunks.find( { "min.shardKey": { $type: -1 } } )
Warning: Storing values of the different types in the same field in a collection is strongly discouraged.
See also:
find() (page 816), insert() (page 832), $where (page 634), BSON, shard key, sharded cluster .
Evaluation
Name
$mod (page 632)
Evaluation Query Operators
$regex
(page 633)
$where
(page 634)
Description
Performs a modulo operation on the value of a field and selects documents with
result.
Selects documents where values match a specified regular expression.
Matches documents that satisfy a JavaScript expression.
$mod
$mod
Syntax: { field:
{ $mod:
[ divisor, remainder ]} }
$mod (page 632) selects the documents where the field value divided by the divisor has the specified
remainder.
Consider the following example:
db.inventory.find( { qty: { $mod: [ 4, 0 ] } } )
This query will select all documents in the inventory collection where the qty field value modulo 4 equals
0, such as documents with qty value equal to 0 or 12.
In some cases, you can query using the $mod (page 632) operator rather than the more expensive $where
(page 634) operator. Consider the following example using the $mod (page 632) operator:
db.inventory.find( { qty: { $mod: [ 4, 0 ] } } )
The above query is less expensive than the following query which uses the $where (page 634) operator:
632
See also:
find() (page 816), update() (page 849), $set (page 655).
$regex
$regex
The $regex (page 633) operator provides regular expression capabilities for pattern matching strings in
queries. MongoDB uses Perl compatible regular expressions (i.e. PCRE.)
You can specify regular expressions using regular expression objects or using the $regex (page 633) operator.
The following examples are equivalent:
db.collection.find( { field: /acme.*corp/i } );
db.collection.find( { field: { $regex: 'acme.*corp', $options: 'i' } } );
These expressions match all documents in collection where the value of field matches the caseinsensitive regular expression acme.*corp.
$regex (page 633) uses Perl Compatible Regular Expressions (PCRE) as the matching engine.
$options
$regex (page 633) provides four option flags:
i toggles case insensitivity, and allows all letters in the pattern to match upper and lower cases.
m toggles multiline regular expression. Without this option, all regular expression match within one
line.
If there are no newline characters (e.g. \n) or no start/end of line construct, the m option has no effect.
x toggles an extended capability. When set, $regex (page 633) ignores all white space characters
unless escaped or included in a character class.
Additionally, it ignores characters between an un-escaped # character and the next new line, so that
you may include comments in complicated patterns. This only applies to data characters; white space
characters may never appear within special character sequences in a pattern.
The x option does not affect the handling of the VT character (i.e. code 11.)
New in version 1.9.0.
s allows the dot (e.g. .) character to match all characters including newline characters.
$regex (page 633) only provides the i and m options for the native JavaScript regular expression objects
(e.g. http://docs.mongodb.org/manualacme.*corp/i). To use x and s you must use the
$regex (page 633) operator with the $options (page 633) syntax.
To combine a regular expression match with other operators, you need to use the $regex (page 633) operator.
For example:
db.collection.find( { field: { $regex: /acme.*corp/i, $nin: [ 'acmeblahcorp' ] } } );
This expression returns all instances of field in collection that match the case insensitive regular expression acme.*corp that dont match acmeblahcorp.
$regex (page 633) can only use an index efficiently when the regular expression has an anchor for the beginning (i.e.
^) of a string and is a case-sensitive match.
Additionally, while
http://docs.mongodb.org/manual^a/,
http://docs.mongodb.org/manual^a.*/,
and http://docs.mongodb.org/manual^a.*$/ match equivalent strings,
they have
633
Additionally, if the query consists only of the $where (page 634) operator, you can pass in just the JavaScript
expression or JavaScript functions, as in the following examples:
db.myCollection.find( "this.credits == this.debits || this.credits > this.debits" );
You can include both the standard MongoDB operators and the $where (page 634) operator in your query, as
in the following examples:
db.myCollection.find( { active: true, $where: "this.credits - this.debits < 0" } );
db.myCollection.find( { active: true, $where: function() { return obj.credits - obj.debits < 0;
Using normal non-$where (page 634) query statements provides the following performance advantages:
MongoDB will evaluate non-$where (page 634) components of query before $where (page 634) statements. If the non-$where (page 634) statements match no documents, MongoDB will not perform any
query evaluation using $where (page 634).
The non-$where (page 634) query statements may use an index.
Note: Changed in version 2.4.
In MongoDB 2.4, map-reduce operations (page 701), the group (page 697) command, and $where
(page 634) operator expressions cannot access certain global functions or properties, such as db, that are available in the mongo (page 942) shell.
634
When upgrading to MongoDB 2.4, you will need to refactor your code if your map-reduce operations
(page 701), group (page 697) commands, or $where (page 634) operator expressions include any global shell
functions or properties that are no longer available, such as db.
The following JavaScript functions and properties are available to map-reduce operations (page 701),
the group (page 697) command, and $where (page 634) operator expressions in MongoDB 2.4:
Available Properties
Available Functions
args
MaxKey
MinKey
assert()
BinData()
DBPointer()
DBRef()
doassert()
emit()
gc()
HexData()
hex_md5()
isNumber()
isObject()
ISODate()
isString()
Map()
MD5()
NumberInt()
NumberLong()
ObjectId()
print()
printjson()
printjsononeline()
sleep()
Timestamp()
tojson()
tojsononeline()
tojsonObject()
UUID()
version()
Geospatial
Geospatial Query Operators
Operators
Query Selectors
Name
$geoWithin (page 635)
$geoIntersects (page 637)
$near (page 637)
$nearSphere (page 638)
Description
Selects geometries within a bounding GeoJSON geometry.
Selects geometries that intersect with a GeoJSON geometry.
Returns geospatial objects in proximity to a point.
Returns geospatial objects in proximity to a point on a sphere.
$geoWithin
$geoWithin
New in version 2.4: $geoWithin (page 635) replaces $within (page 636) which is deprecated.
The $geoWithin (page 635) operator is a geospatial query operator that queries for a defined point, line or
shape that exists entirely within another defined shape. When determining inclusion, MongoDB considers the
border of a shape to be part of the shape, subject to the precision of floating point numbers.
The $geoWithin (page 635) operator queries for inclusion in a GeoJSON polygon or a shape defined by
legacy coordinate pairs.
635
The $geoWithin (page 635) operator does not return sorted results. As a result MongoDB can return $geoWithin (page 635) queries more quickly than geospatial $near (page 637) or $nearSphere
(page 638) queries, which sort results.
The 2dsphere and 2d indexes both support the $geoWithin (page 635) operator.
Changed in version 2.2.3: $geoWithin (page 635) does not require a geospatial index. However, a geospatial
index will improve query performance.
If querying for geometries that exist within a GeoJSON polygon on a sphere, pass the polygon to $geoWithin
(page 635) using the $geometry (page 639) operator.
For a polygon with only an exterior ring use following syntax:
The following example selects all indexed points and shapes that exist entirely within a GeoJSON polygon:
db.places.find( { loc :
{ $geoWithin :
{ $geometry :
{ type : "Polygon" ,
coordinates: [ [ [ 0 , 0 ] , [ 3 , 6 ] , [ 6 , 1 ] , [ 0 , 0 ] ] ]
} } } } )
If querying for inclusion in a shape defined by legacy coordinate pairs on a plane, use the following syntax:
db.<collection>.find( { <location field> :
{ $geoWithin :
{ <shape operator> : <coordinates>
} } } )
For the syntax of shape operators, see: $box (page 641), $polygon (page 642), $center (page 640) (defines
a circle), and $centerSphere (page 641) (defines a circle on a sphere).
Note: Any geometry specified with GeoJSON to $geoWithin (page 635) queries, must fit within a single
hemisphere. MongoDB interprets geometries larger than half of the sphere as queries for the smaller of the
complementary geometries.
$within
Deprecated since version 2.4: $geoWithin (page 635) replaces $within (page 636) in MongoDB 2.4.
636
$geoIntersects
$geoIntersects
New in version 2.4.
The $geoIntersects (page 637) operator is a geospatial query operator that selects all locations that intersect with a GeoJSON object. A location intersects a GeoJSON object if the intersection is non-empty. This
includes documents that have a shared edge. The $geoIntersects (page 637) operator uses spherical geometry.
The 2dsphere geospatial index supports $geoIntersects (page 637).
To query for intersection, pass the GeoJSON object to $geoIntersects (page 637) through the $geometry
(page 639) operator. Use the following syntax:
db.<collection>.find( { <location field> :
{ $geoIntersects :
{ $geometry :
{ type : "<GeoJSON object type>" ,
coordinates : [ <coordinates> ]
} } } } )
Note: Any geometry specified with GeoJSON to $geoIntersects (page 637) queries, must fit within a
single hemisphere. MongoDB interprets geometries larger than half of the sphere as queries for the smaller of
the complementary geometries.
$near
$near
Changed in version 2.4.
Specifies a point for which a geospatial query returns the closest documents first. The query sorts the documents
from nearest to farthest.
The $near (page 637) operator can query for a GeoJSON point or for a point defined by legacy coordinate
pairs.
The optional $maxDistance (page 640) operator limits a $near (page 637) query to return only those
documents that fall within a maximum distance of a point. If you query for a GeoJSON point, specify
$maxDistance (page 640) in meters. If you query for legacy coordinate pairs, specify $maxDistance
(page 640) in radians.
The $near (page 637) operator requires a geospatial index: a 2dsphere index for GeoJSON points; a 2d
index for legacy coordinate pairs. Queries that use a 2d index return a limit of 100 documents.
Note:
You cannot combine the $near (page 637) operator, which requires a special geospatial index
637
(page 327), with a query operator or command that uses a different type of special index. For example you
cannot combine $near (page 637) with the text (page 715) command.
For queries on GeoJSON data, use the following syntax:
db.<collection>.find( { <location field> :
{ $near :
{ $geometry :
{ type : "Point" ,
coordinates : [ <longitude> , <latitude> ] } },
$maxDistance : <distance in meters>
} } )
Note: You can further limit the number of results using cursor.limit() (page 867).
Specifying a batch size (i.e. batchSize() (page 859)) in conjunction with queries that use the $near
(page 637) is not defined. See SERVER-52363 for more information.
$nearSphere
$nearSphere
New in version 1.8.
Specifies a point for which a geospatial query returns the closest documents first. The query sorts the documents from nearest to farthest. MongoDB calculates distances for $nearSphere (page 638) using spherical
geometry.
3 https://jira.mongodb.org/browse/SERVER-5236
638
The $nearSphere (page 638) operator queries for points defined by either GeoJSON objects or legacy coordinate pairs.
The optional $maxDistance (page 640) operator limits a $nearSphere (page 638) query to return only
those documents that fall within a maximum distance of a point. If you use $maxDistance (page 640)
on GeoJSON points, the distance is measured in meters. If you use $maxDistance (page 640) on legacy
coordinate pairs, the distance is measured in radians.
The $nearSphere (page 638) operator requires a geospatial index. The 2dsphere and 2d indexes both
support $nearSphere (page 638) with both legacy coordinate pairs and GeoJSON points. Queries that use a
2d index return a at most 100 documents.
Important: If you use longitude and latitude, specify longitude first.
For queries on GeoJSON data, use the following syntax:
db.<collection>.find( { <location field> :
{ $nearSphere :
{ $geometry :
{ type : "Point" ,
coordinates : [ <longitude> , <latitude> ] } ,
$maxDistance : <distance in meters>
} } } )
The following example selects the 100 documents with legacy coordinates pairs nearest to [ 40 , 5 ], as
calculated by spherical geometry:
db.places.find( { loc :
{ $nearSphere : [ 40 , 5 ]
$maxDistance : 10
} } )
Geometry Specifiers
Name
$geometry
(page 639)
$maxDistance
(page 640)
$center
(page 640)
$centerSphere
(page 641)
$box
(page 641)
$polygon
(page 642)
$uniqueDocs
(page 643)
Description
Specifies a geometry in GeoJSON format to geospatial query operators.
Specifies a distance to limit the results of $near (page 637) and $nearSphere (page 6
queries.
Specifies a circle using legacy coordinate pairs to $geoWithin (page 635) queries when
using planar geometry.
Specifies a circle using either legacy coordinate pairs or GeoJSON format for $geoWith
(page 635) queries when using spherical geometry.
Specifies a rectangular box using legacy coordinate pairs for $geoWithin (page 635)
queries.
Specifies a polygon to using legacy coordinate pairs for $geoWithin (page 635) queries
Modifies a $geoWithin (page 635) and $near (page 637) queries to ensure that even i
document matches the query multiple times, the query returns the document once.
$geometry
11.1. MongoDB Interface
639
$geometry
New in version 2.4.
The $geometry (page 639) operator specifies a GeoJSON for a geospatial query operators. For details on
using $geometry (page 639) with an operator, see the operator:
$geoWithin (page 635)
$geoIntersects (page 637)
$near (page 637)
$maxDistance
$maxDistance
The $maxDistance (page 640) operator constrains the results of a geospatial $near (page 637) or
$nearSphere (page 638) query to the specified distance. The measuring units for the maximum distance
are determined by the coordinate system in use. For GeoJSON point object, specify the distance in meters, not
radians.
The 2d and 2dsphere geospatial indexes both support $maxDistance (page 640).
The following example query returns documents with location values that are 10 or fewer units from the point
[ 100 , 100 ].
db.places.find( { loc : { $near : [ 100 , 100 ] ,
$maxDistance: 10 }
} )
MongoDB orders the results by their distance from [ 100 , 100 ]. The operation returns the first 100
results, unless you modify the query with the cursor.limit() (page 867) method.
$center
$center
New in version 1.4.
The $center (page 640) operator specifies a circle for a geospatial $geoWithin (page 635) query. The
query returns legacy coordinate pairs that are within the bounds of the circle. The operator does not return
GeoJSON objects.
The query calculates distances using flat (planar) geometry.
The 2d geospatial index supports the $center (page 640) operator.
To use the $center (page 640) operator, specify an array that contains:
The grid coordinates of the circles center point
The circles radius, as measured in the units used by the coordinate system
Important: If you use longitude and latitude, specify longitude first.
Use the following syntax:
{ <location field> : { $geoWithin : { $center : [ [ <x>, <y> ] , <radius> ] } } }
The following example query returns all documents that have coordinates that exist within the circle centered
on [ -74 , 40.74 ] and with a radius of 10:
640
Changed in version 2.2.3: Applications can use $center (page 640) without having a geospatial index. However, geospatial indexes support much faster queries than the unindexed equivalents. Before 2.2.3, a geospatial
index must exist on a field holding coordinates before using any of the geospatial query operators.
$centerSphere
$centerSphere
New in version 1.8.
The $centerSphere (page 641) operator defines a circle for a geospatial query that uses spherical geometry.
The query returns documents that are within the bounds of the circle.
You can use the $centerSphere (page 641) operator on both GeoJSON objects and legacy coordinate pairs.
The 2d and 2dsphere geospatial indexes both support $centerSphere (page 641).
To use $centerSphere (page 641), specify an array that contains:
The grid coordinates of the circles center point
The circles radius measured in radians. To calculate radians, see Calculate Distance Using Spherical
Geometry (page 355).
Use the following syntax:
db.<collection>.find( { <location field> :
{ $geoWithin :
{ $centerSphere : [ [ <x>, <y> ] , <radius> ] }
} } )
Changed in version 2.2.3: Applications can use $centerSphere (page 641) without having a geospatial
index. However, geospatial indexes support much faster queries than the unindexed equivalents. Before 2.2.3, a
geospatial index must exist on a field holding coordinates before using any of the geospatial query operators.
$box
$box
New in version 1.4.
The $box (page 641) operator specifies a rectangle for a geospatial $geoWithin (page 635) query. The query
returns documents that are within the bounds of the rectangle, according to their point-based location data. The
$box (page 641) operator returns documents based on grid coordinates (page 330) and does not query for
GeoJSON shapes.
641
The query calculates distances using flat (planar) geometry. The 2d geospatial index supports the $box
(page 641) operator.
To use the $box (page 641) operator, you must specify the bottom left and top right corners of the rectangle in
an array object. Use the following syntax:
{ <location field> : { $geoWithin : { $box :
[ [ <bottom left coordinates> ] ,
[ <upper right coordinates> ] ] } } }
Changed in version 2.2.3: Applications can use $box (page 641) without having a geospatial index. However,
geospatial indexes support much faster queries than the unindexed equivalents. Before 2.2.3, a geospatial index
must exist on a field holding coordinates before using any of the geospatial query operators.
$polygon
$polygon
New in version 1.9.
The $polygon (page 642) operator specifies a polygon for a geospatial $geoWithin (page 635) query on
legacy coordinate pairs. The query returns pairs that are within the bounds of the polygon. The operator does
not query for GeoJSON objects.
The $polygon (page 642) operator calculates distances using flat (planar) geometry.
The 2d geospatial index supports the $polygon (page 642) operator.
To define the polygon, specify an array of coordinate points. Use the following syntax:
{ <location field> : { $geoWithin : { $polygon : [ [ <x1> , <y1> ] ,
[ <x2> , <y2> ] ,
[ <x3> , <y3> ] ] } } }
Changed in version 2.2.3: Applications can use $polygon (page 642) without having a geospatial index. However, geospatial indexes support much faster queries than the unindexed equivalents. Before 2.2.3, a geospatial
index must exist on a field holding coordinates before using any of the geospatial query operators.
642
$uniqueDocs
$uniqueDocs
New in version 2.0.
The $uniqueDocs (page 643) operator returns a document only once for a geospatial query if the document
matches the query multiple times. A document might match a query multiple times if the documents contains
multiple coordinate values.
You can use $uniqueDocs (page 643) only with the $geoWithin (page 635) and $near (page 637) operators. The 2d geospatial index supports $uniqueDocs (page 643).
Example
Given a collection of addresses with documents in the following form:
{ addrs : [ { name : "H" , loc : [ 55.5 , 42.3 ] } , { name : "W" , loc : [ 32.3 , 44.2 ] } ] }
The following query would return the same document multiple times:
db.list.find( { "addrs.loc" : { $geoWithin : { $box : [ [ 0 , 0 ] , [ 100 , 100 ] ] } } } )
The following query would return each matching document only once:
Note: If you specify a value of false for $uniqueDocs (page 643), MongoDB will return multiple instances of a single document.
Geospatial Query Compatibility While numerous combinations of query operators are possible, the following
table shows the recommended operators for different types of queries. The table uses the $geoWithin (page 635),
$geoIntersects (page 637) and $near (page 637) operators.
643
Query Document
Geometry of the
Query Condition
Supported by this
Index
polygon
{ $geoWithin : {
$geometry : <GeoJSON Polygon>
} }
sphere
meters
2dsphere
sphere
meters
2dsphere
point
{ $near : {
$geometry : <GeoJSON Point>,
$maxDistance : d
} }
sphere
meters
2dsphere
The index is required.
rectangle
{ $geoWithin : {
$box : [[x1, y1], [x2, y2]]
} }
flat
flat units
2d
polygon
{ $geoWithin : {
$polygon : [[x1, y1],
[x1, y2],
[x2, y2],
[x2, y1]]
} }
flat
flat units
2d
circular region
{ $geoWithin : {
$center : [[x1, y1], r],
} }
flat
flat units
2d
circular region
{ $geoWithin : {
$centerSphere :
[[x, y], radius]
} }
sphere
radians
2d
2dsphere
point
{ $near : [x1, y1],
$maxDistance : d
}
flat units
2d
The index is required.
Array
644
Name
$all (page 645)
$elemMatch
(page 645)
$size (page 646)
Description
Matches arrays that contain all elements specified in the query.
Selects documents if element in the array field matches all the specified $elemMa
(page 645) condition.
Selects documents if the array field is a specified size.
$all
$all
Syntax: { field:
{ $all:
$all (page 645) selects the documents where the field holds an array and contains all elements (e.g.
<value>, <value1>, etc.) in the array.
Consider the following example:
db.inventory.find( { tags: { $all: [ "appliances", "school", "book" ] } } )
This query selects all documents in the inventory collection where the tags field contains an array with the
elements, appliances, school, and book.
Therefore, the above query will match documents in the inventory collection that have a tags field that
hold either of the following arrays:
[ "school", "book", "bag", "headphone", "appliances" ]
[ "appliances", "school", "book" ]
The $all (page 645) operator exists to describe and specify arrays in MongoDB queries. However, you may
use the $all (page 645) operator to select against a non-array field, as in the following example:
db.inventory.find( { qty: { $all: [ 50 ] } } )
Both queries will select all documents in the inventory collection where the value of the qty field equals
50.
Note: In most cases, MongoDB does not treat arrays as sets. This operator provides a notable exception to this
approach.
In the current release queries that use the $all (page 645) operator must scan all the documents that match the
first element in the query array. As a result, even with an index to support the query, the operation may be long
running, particularly when the first element in the array is not very selective.
See also:
find() (page 816), update() (page 849), and $set (page 655).
$elemMatch (query)
See also:
645
returns all documents in collection where the array array satisfies all of the conditions in the
$elemMatch (page 645) expression, or where the value of value1 is 1 and the value of value2 is greater
than 1. Matching arrays must have at least one element that matches all specified criteria. Therefore, the following document would not match the above query:
{ array: [ { value1:1, value2:0 }, { value1:2, value2:2 } ] }
$size
$size
The $size (page 646) operator matches any array with the number of elements specified by the argument. For
example:
db.collection.find( { field: { $size: 2 } } );
returns all documents in collection where field is an array with 2 elements. For instance, the above
expression will return { field: [ red, green ] } and { field: [ apple, lime ] } but
not { field: fruit } or { field: [ orange, lemon, grapefruit ] }. To match fields
with only one element within an array use $size (page 646) with a value of 1, as follows:
db.collection.find( { field: { $size: 1 } } );
$size (page 646) does not accept ranges of values. To select documents based on fields with different numbers
of elements, create a counter field that you increment when you add elements to a field.
Queries cannot use indexes for the $size (page 646) portion of a query, although the other portions of a query
can use indexes if applicable.
Projection Operators
Projection Operators
Name
$ (page 646)
$elemMatch
(page 648)
$slice (page 650)
Description
Projects the first element in an array that matches the query condition.
Projects only the first element from an array that matches the specified $elemMatc
(page 648) condition.
Limits the number of elements projected from an array. Supports skip and limit slice
$ (projection)
$
The positional $ (page 646) operator limits the contents of the <array> field that is included in the query results
to contain the first matching element. To specify an array element to update, see the positional $ operator for
updates (page 656).
Used in the projection document of the find() (page 816) method or the findOne() (page 824) method:
The $ (page 646) projection operator limits the content of the <array> field to the first element that
matches the query document (page 68).
The <array> field must appear in the query document (page 68)
646
db.collection.find( {
{
db.collection.find( {
{
The <value> can be documents that contains query operator expressions (page 621).
Only one positional $ (page 646) operator can appear in the projection document.
Only one array field can appear in the query document (page 68); i.e. the following query is incorrect:
db.collection.find( { <array>: <value>, <someOtherArray>: <value2> },
{ "<array>.$": 1 } )
Example
A collection students contains the following documents:
{
{
{
{
{
{
"_id"
"_id"
"_id"
"_id"
"_id"
"_id"
:
:
:
:
:
:
1,
2,
3,
4,
5,
6,
"semester"
"semester"
"semester"
"semester"
"semester"
"semester"
:
:
:
:
:
:
1,
1,
1,
2,
2,
2,
"grades"
"grades"
"grades"
"grades"
"grades"
"grades"
:
:
:
:
:
:
[
[
[
[
[
[
70,
90,
85,
79,
88,
95,
87, 90 ] }
88, 92 ] }
100, 90 ] }
85, 80 ] }
88, 92 ] }
90, 96 ] }
Although the array field grades may contain multiple elements that are greater than or equal to 85, the $
(page 646) projection operator returns only the first matching element from the array.
Important: When the find() (page 816) method includes a sort() (page 872), the find() (page 816)
method applies the sort() (page 872) to order the matching documents before it applies the positional $
(page 646) projection operator.
If an array field contains multiple documents with the same field name and the find() (page 816) method
includes a sort() (page 872) on that repeating field, the returned documents may not reflect the sort order
because the sort was applied to the elements of the array before the $ (page 646) projection operator.
Example
A students collection contains the following documents where the grades field is an array of documents;
each document contain the three field names grade, mean, and std:
{ "_id" : 7, semester: 3, "grades" : [ { grade: 80, mean: 75, std: 8 },
{ grade: 85, mean: 90, std: 5 },
{ grade: 90, mean: 85, std: 3 } ] }
{ "_id" : 8, semester: 3, "grades" : [ { grade: 92, mean: 88, std: 8 },
647
In the following query, the projection { "grades.$": 1 } returns only the first element with the mean
greater than 70 for the grades field. The query also includes a sort() (page 872) to order by ascending
grades.grade field:
db.students.find( { "grades.mean": { $gt: 70 } },
{ "grades.$": 1 }
).sort( { "grades.grade": 1 } )
The find() (page 816) method sorts the matching documents before it applies the $ (page 646) projection
operator on the grades array. Thus, the results with the projected array elements do not reflect the ascending
grades.grade sort order:
{ "_id" : 8, "grades" : [ { "grade" : 92, "mean" : 88, "std" : 8 } ] }
{ "_id" : 7, "grades" : [ { "grade" : 80, "mean" : 75, "std" : 8 } ] }
Note: Since only one array field can appear in the query document, if the array contains documents, to specify
criteria on multiple fields of these documents, use the $elemMatch (query) (page 645) operator, e.g.:
db.students.find( { grades: { $elemMatch: {
mean: { $gt: 70 },
grade: { $gt:90 }
} } },
{ "grades.$": 1 } )
See also:
$elemMatch (projection) (page 648)
$elemMatch (projection)
See also:
648
Example
The following find() (page 816) operation queries for all documents where the value of the zipcode field
is 63109. The $elemMatch (page 648) projection returns only the first matching element of the students
array where the school field has a value of 102:
db.schools.find( { zipcode: 63109 },
{ students: { $elemMatch: { school: 102 } } } )
For the document with _id equal to 1, the students array contains multiple elements with the school
field equal to 102. However, the $elemMatch (page 648) projection returns only the first matching
element from the array.
The document with _id equal to 3 does not contain the students field in the result since no element in
its students array matched the $elemMatch (page 648) condition.
The $elemMatch (page 648) projection can specify criteria on multiple fields:
Example
649
The following find() (page 816) operation queries for all documents where the value of the zipcode field
is 63109. The projection includes the first matching element of the students array where the school field
has a value of 102 and the age field is greater than 10:
db.schools.find( { zipcode: 63109 },
{ students: { $elemMatch: { school: 102, age: { $gt: 10} } } } )
The operation returns the three documents that have zipcode equal to 63109:
{ "_id" : 1, "students" : [ { "name" : "jess", "school" : 102, "age" : 11 } ] }
{ "_id" : 3 }
{ "_id" : 4 }
Documents with _id equal to 3 and _id equal to 4 do not contain the students field since no element
matched the $elemMatch (page 648) criteria.
When the find() (page 816) method includes a sort() (page 872), the find() (page 816) method applies
the sort() (page 872) to order the matching documents before it applies the projection.
If an array field contains multiple documents with the same field name and the find() (page 816) method
includes a sort() (page 872) on that repeating field, the returned documents may not reflect the sort order
because the sort() (page 872) was applied to the elements of the array before the $elemMatch (page 648)
projection.
Example
The following query includes a sort() (page 872) to order by descending students.age field:
db.schools.find(
{ zipcode: 63109 },
{ students: { $elemMatch: { school: 102 } } }
).sort( { "students.age": -1 } )
The operation applies the sort() (page 872) to order the documents that have the field zipcode equal to
63109 and then applies the projection. The operation returns the three documents in the following order:
{ "_id" : 1, "students" : [ { "name" : "john", "school" : 102, "age" : 10 } ] }
{ "_id" : 3 }
{ "_id" : 4, "students" : [ { "name" : "barney", "school" : 102, "age" : 7 } ] }
See also:
$ (projection) (page 646) operator
$slice (projection)
$slice
The $slice (page 650) operator controls the number of items of an array that a query returns. For information
on limiting the size of an array during an update with $push (page 659), see the $slice (page 661) modifier
instead.
Consider the following prototype query:
db.collection.find( { field: value }, { array: {$slice: count } } );
This operation selects the document collection identified by a field named field that holds value and
returns the number of elements specified by the value of count from the array stored in the array field. If
count has a value greater than the number of elements in array the query returns all elements of the array.
650
$slice (page 650) accepts arguments in a number of formats, including negative values and arrays. Consider
the following examples:
db.posts.find( {}, { comments: { $slice: 5 } } )
Here, $slice (page 650) selects the first five items in an array in the comments field.
db.posts.find( {}, { comments: { $slice: -5 } } )
Here, the query will only return 10 items, after skipping the first 20 items of that array.
db.posts.find( {}, { comments: { $slice: [ -20, 10 ] } } )
This operation returns 10 items as well, beginning with the item that is 20th from the last item of the array.
Update Operators
Update Operators
Fields
Name
$inc (page 651)
$rename
(page 652)
$setOnInsert
(page 654)
$set (page 655)
$unset
(page 655)
Description
Increments the value of the field by the specified amount.
Renames a field.
Sets the value of a field upon documentation creation during an upsert. Has no effec
update operations that modify existing documents.
Sets the value of a field in an existing document.
Removes the specified field from an existing document.
$inc
$inc
The $inc (page 651) operator increments a value of a field by a specified amount. If the field does not exist, $inc (page 651) sets the field to the specified amount. $inc (page 651) accepts positive and negative
incremental amounts.
The following example increments the value of field1 by the value of amount for the first matching document in the collection where field equals value:
db.collection.update( { field: value },
{ $inc: { field1: amount } } );
To update all matching documents in the collection, specify multi:true in the update() (page 849)
method:
db.collection.update( { age: 20 }, { $inc: { age: 1 } }, { multi: true } );
db.collection.update( { name: "John" }, { $inc: { age: 2 } }, { multi: true } );
651
The first update() (page 849) operation increments the value of the age field by 1 for all documents in the
collection that have an age field equal to 20. The second operation increments the value of the age field by 2
for all documents in the collection with the name field equal to "John".
$rename
$rename
New in version 1.7.2.
Syntax:
{$rename:
... } }
{ <old name1>:
<new name2>,
The $rename (page 652) operator updates the name of a field. The new field name must differ from the existing
field name.
Consider the following example:
db.students.update( { _id: 1 }, { $rename: { 'nickname': 'alias', 'cell': 'mobile' } } )
This operation renames the field nickname to alias, and the field cell to mobile.
If the document already has a field with the new field name, the $rename (page 652) operator removes that
field and renames the field with the old field name to the new field name.
The $rename (page 652) operator will expand arrays and sub-documents to find a match for field names.
When renaming a field in a sub-document to another sub-document or to a regular field, the sub-document itself
remains.
Consider the following examples involving the sub-document of the following document:
{ "_id": 1,
"alias": [ "The American Cincinnatus", "The American Fabius" ],
"mobile": "555-555-5555",
"nmae": { "first" : "george", "last" : "washington" }
}
To rename a sub-document, call the $rename (page 652) operator with the name of the sub-document as
you would any other field:
db.students.update( { _id: 1 }, { $rename: { "nmae": "name" } } )
To rename a field within a sub-document, call the $rename (page 652) operator using the dot notation
(page 94) to refer to the field. Include the name of the sub-document in the new field name to ensure the
field remains in the sub-document:
db.students.update( { _id: 1 }, { $rename: { "name.first": "name.fname" } } )
652
To rename a field within a sub-document and move it to another sub-document, call the $rename
(page 652) operator using the dot notation (page 94) to refer to the field. Include the name of the new
sub-document in the new name:
db.students.update( { _id: 1 }, { $rename: { "name.last": "contact.lname" } } )
This operation renames the sub-document field last to lname and moves it to the sub-document
contact:
{ "_id" : 1,
"alias" : [ "The American Cincinnatus", "The American Fabius" ],
"contact" : { "lname" : "washington" },
"mobile" : "555-555-5555",
"name" : { "fname" : "george" }
}
If the new field name does not include a sub-document name, the field moves out of the subdocument and
becomes a regular document field.
Consider the following behavior when the specified old field name does not exist:
When renaming a single field and the existing field name refers to a non-existing field, the $rename
(page 652) operator does nothing, as in the following:
db.students.update( { _id: 1 }, { $rename: { 'wife': 'spouse' } } )
This operation does nothing because there are no fields named wife, vice, and office.
When renaming multiple fields and some but not all old field names refer to non-existing fields, the
$rename (page 652) operator performs the following operations:
Changed in version 2.2.
Renames the fields that exist to the specified new field names.
Ignores the non-existing fields.
Consider the following query that renames both an existing field mobile and a non-existing field wife.
The field named wife does not exist and $rename (page 652) sets the field to a name that already exists
alias.
db.students.update( { _id: 1 }, { $rename: { 'wife': 'alias',
'mobile': 'cell' } } )
This operation renames the mobile field to cell, and has no other impact action occurs.
{ "_id" : 1,
"alias" : [ "The American Cincinnatus", "The American Fabius" ],
"cell" : "555-555-5555",
653
Note: Before version 2.2, when renaming multiple fields and only some (but not all) old field names refer
to non-existing fields:
For the fields with the old names that do exist, the $rename (page 652) operator renames these fields
to the specified new field names.
For the fields with the old names that do not exist:
*if no field exists with the new field name, the $rename (page 652) operator does nothing.
*if fields already exist with the new field names, the $rename (page 652) operator drops these
fields.
Consider the following operation that renames both the field mobile, which exists, and the field wife,
which does not exist. The operation tries to set the field named wife to alias, which is the name of an
existing field:
db.students.update( { _id: 1 }, { $rename: { 'wife': 'alias', 'mobile': 'cell' } } )
Before 2.2, the operation renames the field mobile to cell and drops the alias field even though the
field wife does not exist:
{ "_id" : 1,
"cell" : "555-555-5555",
"name" : { "lname" : "washington" },
"places" : { "d" : "Mt Vernon", "b" : "Colonial Beach" }
}
$setOnInsert
$setOnInsert
New in version 2.4.
The $setOnInsert (page 654) operator assigns values to fields during an upsert (page 849) only when
using the upsert option to the update() (page 849) operation performs an insert.
db.collection.update( <query>,
{ $setOnInsert: { <field1>: <value1>, ... } },
{ upsert: true }
)
Example
A collection named products contains no documents.
Then, the following upsert (page 849) operation performs an insert and applies the $setOnInsert
(page 654) to set the field defaultQty to 100:
db.products.update(
{ _id: 1 },
{ $setOnInsert: { defaultQty: 100 } },
{ upsert: true }
)
654
Note: The $setOnInsert (page 654) operator only affects update() (page 849) operations with the
upsert flag that perform an insert (page 50).
If the update() (page 849) has the upsert flag and performs an update (page 50), $setOnInsert (page 654)
has no effect.
Example
A collection named products has the following document:
{ "_id" : 1, "defaultQty" : 100 }
The following update() (page 849) with the upsert flag operation performs an update:
db.products.update(
{ _id: 1 },
{ $setOnInsert: { defaultQty: 500, inStock: true },
$set: { item: "apple" } },
{ upsert: true }
)
Because the update() (page 849) with upsert operation only performs an update, MongoDB ignores the
$setOnInsert (page 654) operation and only applies the $set (page 655) operation.
The products collection now contains the following modified document:
{ "_id" : 1, "defaultQty" : 100, "item" : "apple" }
$set
$set
Use the $set (page 655) operator to set a particular value. The $set (page 655) operator requires the following
syntax:
db.collection.update( { field: value1 }, { $set: { field1: value2 } } );
This statement updates in the document in collection where field matches value1 by replacing the
value of the field field1 with value2. This operator will add the specified field or fields if they do not exist
in this document or replace the existing value of the specified field(s) if they already exist.
$unset
$unset
The $unset (page 655) operator deletes a particular field. Consider the following example:
db.collection.update( { field: value1 }, { $unset: { field1: "" } } );
The above example deletes field1 in collection from documents where field has a value of value1.
The value of the field in the $unset (page 655) statement (i.e. "" above) does not impact the operation.
If documents match the initial query (e.g. { field: value1 } above) but do not have the field specified
in the $unset (page 655) operation (e.g. field1), then the statement has no effect on the document.
Array
655
Update Operators
$addToSet
(page 657)
$pop (page 657)
$pullAll
(page 658)
$pull (page 658)
$pushAll
(page 659)
$push (page 659)
$ (query)
$
Syntax: { "<array>.$" :
Description
Acts as a placeholder to update the first element that matches the query condition in an
update.
Adds elements to an existing array only if they do not already exist in the set.
Removes the first or last item of an array.
Removes multiple values from an array.
Removes items from an array that match a query statement.
Deprecated. Adds several items to an array.
Adds an item to an array.
value }
The positional $ (page 656) operator identifies an element in an array field to update without explicitly specifying the position of the element in the array. To project, or return, an array element from a read operation, see
the $ (page 646) projection operator.
When used with the update() (page 849) method,
the positional $ (page 656) operator acts as a placeholder for the first element that matches the query
document (page 68), and
the array field must appear as part of the query document.
db.collection.update( { <array>: value ... }, { <update operator>: { "<array>.$" : value } } )
To update 80 to 82 in the grades array in the first document, use the positional $ (page 656) operator if you
do not know the position of the element in the array:
db.students.update( { _id: 1, grades: 80 }, { $set: { "grades.$" : 82 } } )
Remember that the positional $ (page 656) operator acts as a placeholder for the first match of the update query
document (page 68).
The positional $ (page 656) operator facilitates updates to arrays that contain embedded documents. Use the
positional $ (page 656) operator to access the fields in the embedded documents with the dot notation (page 94)
on the $ (page 656) operator.
Consider the following document in the students collection whose grades field value is an array of embedded documents:
{ "_id" : 4, "grades" : [ { grade: 80, mean: 75, std: 8 },
{ grade: 85, mean: 90, std: 5 },
{ grade: 90, mean: 85, std: 3 } ] }
656
Use the positional $ (page 656) operator to update the value of the std field in the embedded document with
the grade of 85:
db.students.update( { _id: 4, "grades.grade": 85 }, { $set: { "grades.$.std" : 6 } } )
Note:
Do not use the positional operator $ (page 656) with upsert operations because inserts will use the $ as a
field name in the inserted document.
When used with the $unset (page 655) operator, the positional $ (page 656) operator does not remove
the matching element from the array but rather sets it to null.
See also:
update() (page 849), $set (page 655) and $unset (page 655)
$addToSet
$addToSet
The $addToSet (page 657) operator adds a value to an array only if the value is not in the array already. If
the value is in the array, $addToSet (page 657) returns without modifying the array. Consider the following
example:
db.collection.update( { <field>: <value> }, { $addToSet: { <field>: <addition> } } );
Here, $addToSet (page 657) appends <addition> to the array stored in <field> that includes the element <value>, only if <addition> is not already a member of this array.
Note: $addToSet (page 657) only ensures that there are no duplicate items added to the set and does not
affect existing duplicate elements. $addToSet (page 657) does not guarantee a particular ordering of elements
in the modified set.
Use the $each (page 660) modifier with the $addToSet (page 657) operator to add multiple values
to an array <field> if the values do not exist in the <field>.
db.collection.update( <query>,
{
$addToSet: { <field>: { $each: [ <value1>, <value2> ... ] } }
}
)
See also:
$push (page 659)
$pop
$pop
The $pop (page 657) operator removes the first or last element of an array. Pass $pop (page 657) a value of 1
to remove the last element in an array and a value of -1 to remove the first element of an array. Consider the
following syntax:
db.collection.update( {field: value }, { $pop: { field: 1 } } );
This operation removes the last item of the array in field in the document that matches the query statement {
field: value }. The following example removes the first item of the same array:
657
Here, $pullAll (page 658) removes [ value1, value2, value3 ] from the array in field1, in the
document that matches the query statement { field: value } in collection.
$pull
$pull
The $pull (page 658) operator removes all instances of a value from an existing array, as in the following
prototype:
db.collection.update( { field: <query> }, { $pull: { field: <query> } } );
$pull (page 658) removes items from the array in the field named field that match the query in the $pull
(page 658) statement.
If a value (i.e. <value>) exists multiple times in an array, $pull (page 658) will remove all instances of the
value.
Example
Given the following document in the cpuinfo collection:
{ flags: ['vme', 'de', 'pse', 'tsc', 'msr', 'pae', 'mce' ] }
The following operation will remove the msr value from the flags array:
db.cpuinfo.update( { flags: 'msr' }, { $pull: { flags: 'msr' } } )
Example
Given the following document in the profiles collection:
{ votes: [ 3, 5, 6, 7, 7, 8 ] }
The following operation will remove all occurrences of 7 from the votes array.
db.profiles.update( { votes: 3 }, { $pull: { votes: 7 } } )
658
$pushAll
$pushAll
Deprecated since version 2.4: Use the $push (page 659) operator with $each (page 660) instead.
The $pushAll (page 659) operator is similar to the $push (page 659) but adds the ability to append several
values to an array at once.
db.collection.update( { field: value }, { $pushAll: { field1: [ value1, value2, value3 ] } } );
Here, $pushAll (page 659) appends the values in [ value1, value2, value3 ] to the array in
field1 in the document matched by the statement { field: value } in collection.
If you specify a single value, $pushAll (page 659) will behave as $push (page 659).
$push
$push
The $push (page 659) operator appends a specified value to an array.
db.collection.update( <query>,
{ $push: { <field>: <value> } }
)
The following example appends 89 to the scores array for the first document where the name field equals
joe:
db.students.update(
{ name: "joe" },
{ $push: { scores: 89 } }
)
Note:
If the field is absent in the document to update, $push (page 659) adds the array field with the value as its
element.
If the field is not an array, the operation will fail.
If the value is an array, $push (page 659) appends the whole array as a single element. To add each
element of the value separately, use $push (page 659) with the $each (page 660) modifier.
The following example appends each element of [ 90, 92, 85 ] to the scores array for the document where the name field equals joe:
db.students.update(
{ name: "joe" },
{ $push: { scores: { $each: [ 90, 92, 85 ] } } }
)
Changed in version 2.4: MongoDB adds support for the $each (page 660) modifier to the $push
(page 659) operator. Before 2.4, use $pushAll (page 659) for similar functionality.
659
Changed in version 2.4: You can use the $push (page 659) operator with the following modifiers:
$each (page 660) appends multiple values to the array field,
$slice (page 661), which is only available with $each (page 660), limits the number of array elements,
and
$sort (page 661), which is only available when used with both $each (page 660) and $slice
(page 661), orders elements of the array. $sort (page 661) can only order array elements that are documents.
The following example uses:
the $each (page 660) modifier to append documents to the quizzes array,
the $sort (page 661) modifier to sort all the elements of the modified quizzes array by the ascending
score field, and
the $slice (page 661) modifier to keep only the last five sorted elements of the quizzes array.
db.students.update( { name: "joe" },
{ $push: { quizzes: { $each: [ { id: 3,
{ id: 4,
{ id: 5,
$sort: { score: 1
$slice: -5
}
}
}
)
Name
$each
(page 660)
$slice
(page 661)
$sort
(page 661)
score: 8 },
score: 7 },
score: 6 } ],
},
Description
Modifies the $push (page 659) and $addToSet (page 657) operators to append
items for array updates.
Modifies the $push (page 659) operator to limit the size of updated arrays.
Modifies the $push (page 659) operator to reorder documents stored in an array.
$each
Note: The $each (page 660) modifier is only used with the $addToSet (page 657) and $push (page 659)
operators. See the documentation of $addToSet (page 657) and $push (page 659) for more information.
$each
The $each (page 660) modifier is available for use with the $addToSet (page 657) operator and the $push
(page 659) operator.
Use the $each (page 660) modifier with the $addToSet (page 657) operator to add multiple values to an
array <field> if the values do not exist in the <field>.
db.collection.update( <query>,
{
$addToSet: { <field>: { $each: [ <value1>, <value2> ... ] } }
}
)
Use the $each (page 660) modifier with the $push (page 659) operator to append multiple values to an array
<field>.
660
db.collection.update( <query>,
{
$push: { <field>: { $each: [ <value1>, <value2> ... ] } }
}
)
Changed in version 2.4: MongoDB adds support for the $each (page 660) modifier to the $push (page 659)
operator.
$slice
$slice
New in version 2.4.
The $slice (page 661) modifier limits the number of array elements during a $push (page 659) operation.
To project, or return, a specified number of array elements from a read operation, see the $slice (page 650)
projection operator instead.
To use the $slice (page 661) modifier, it must appear with the $each (page 660) modifier, and the $each
(page 660) modifier must be the first modifier for the $push (page 659) operation.
db.collection.update( <query>,
{ $push: {
<field>: {
$each: [ <value1>, <value2>, ... ],
$slice: <num>
}
}
}
)
$sort
$sort
New in version 2.4.
The $sort (page 661) modifier orders the elements of an array during a $push (page 659) operation. The
elements of the array must be documents.
$sort (page 661) modifies $push (page 659) updates that use both the $each (page 660) and $slice
(page 661) modifiers, where $each (page 660) is the first modifier for the $push (page 659) operation.
db.collection.update( <query>,
{ $push: {
<field>: {
661
$each: [ <document1>,
<document2>,
...
],
$slice: <num>,
$sort: <sort document>,
}
}
}
)
Important: The <sort document> only accesses the fields from the elements in the array and does not
refer to the array <field>.
Consider the following example where the collection students contain the following document:
{ "_id": 3,
"name": "joe",
"quizzes": [
{ "id" : 1, "score" : 6 },
{ "id" : 2, "score" : 9 }
]
}
The following update appends additional documents to the quizzes array, sorts all the elements of the array
by ascending score field, and slices the array to keep the last five elements:
db.students.update( { name: "joe" },
{ $push: { quizzes: { $each: [ { id: 3,
{ id: 4,
{ id: 5,
$sort: { score: 1
$slice: -5
}
}
}
)
score: 8 },
score: 7 },
score: 6 } ],
},
After the update, the array elements are in order of ascending score field.:
{
"_id" : 3,
"name" : "joe",
"quizzes" : [
{
{
{
{
{
]
"id"
"id"
"id"
"id"
"id"
:
:
:
:
:
1,
5,
4,
3,
2,
"score"
"score"
"score"
"score"
"score"
:
:
:
:
:
6
6
7
8
9
},
},
},
},
}
Bitwise
662
Name
$bit (page 663)
Description
Performs bitwise AND and OR updates of integer values.
$bit
$bit
The $bit (page 663) operator performs a bitwise update of a field. Only use this with integer fields, as in the
following examples:
db.collection.update( { field: NumberInt(1) }, { $bit: { field: { and: NumberInt(5) } } } );
db.collection.update( { field: NumberInt(1) }, { $bit: { field: { or: NumberInt(5) } } } );
Here, the $bit (page 663) operator updates the integer value of the field named field: in the first example
with a bitwise and: 5 operation; and in the second example with a bitwise or: 5 operation. $bit
(page 663) only works with integers.
$bit (page 663) only supports AND and OR bitwise operations.
Note: All numbers in the mongo (page 942) shell are doubles, not integers. Use the NumberInt() constructor to specify integers. See NumberInt (page 201) for more information.
Isolation
Name
$isolated (page 663)
Description
Modifies behavior of multi-updates to improve the isolation of the operation
$isolated
$isolated
$isolated (page 663) isolation operator isolates a write operation that affects multiple documents from other
write operations.
Note: The $isolated (page 663) isolation operator does not provide all-or-nothing atomicity for write
operations.
Consider the following example:
db.foo.update( { field1 : 1 , $isolated : 1 }, { $inc : { field2 : 1 } } , { multi: true } )
Without the $isolated (page 663) operator, multi-updates will allow other operations to interleave with these
updates. If these interleaved operations contain writes, the update operation may produce unexpected results.
By specifying $isolated (page 663) you can guarantee isolation for the entire multi-update.
Warning: $isolated (page 663) does not work with sharded clusters.
See also:
See
db.collection.update()
(page
849)
for
more
information
db.collection.update() (page 849) method.
$atomic
Deprecated since version 2.2: The $isolated (page 663) replaces $atomic (page 663).
about
the
663
Pipeline Operators
Warning: The pipeline cannot operate on values of the following types: Binary, Symbol, MinKey, MaxKey,
DBRef, Code, and CodeWScope.
Pipeline operators appear in an array. Documents pass through the operators in a sequence.
Name
$project
(page 664)
$match
(page 666)
$limit
(page 667)
$skip
(page 668)
$unwind
(page 668)
$group
(page 669)
$sort
(page 670)
$geoNear
(page 671)
Description
Reshapes a document stream. $project (page 664) can rename, add, or remove fields as well
as create computed values and sub-documents.
Filters the document stream, and only allows matching documents to pass into the next pipeline
stage. $match (page 666) uses standard MongoDB queries.
Restricts the number of documents in an aggregation pipeline.
Skips over a specified number of documents from the pipeline and returns the rest.
Takes an array of documents and returns them as a stream of documents.
Groups documents together for the purpose of calculating aggregate values based on a collection
of documents.
Takes all input documents and returns them in a stream of sorted documents.
Returns an ordered stream of documents based on proximity to a geospatial point.
Name
$project
(page 664)
$match
(page 666)
$limit
(page 667)
$skip
(page 668)
$unwind
(page 668)
$group
(page 669)
$sort
(page 670)
$geoNear
(page 671)
Description
Reshapes a document stream. $project (page 664) can rename, add, or remov
as create computed values and sub-documents.
Filters the document stream, and only allows matching documents to pass into th
stage. $match (page 666) uses standard MongoDB queries.
Restricts the number of documents in an aggregation pipeline.
Skips over a specified number of documents from the pipeline and returns the res
Takes an array of documents and returns them as a stream of documents.
Groups documents together for the purpose of calculating aggregate values based
of documents.
Takes all input documents and returns them in a stream of sorted documents.
$project (aggregation)
$project
Reshapes a document stream by renaming, adding, or removing fields. Also use $project (page 664) to
create computed values or sub-documents. Use $project (page 664) to:
Include fields from the original document.
Insert computed fields.
Rename fields.
664
This operation includes the title field and the author field in the document that returns from the aggregation
pipeline.
Note: The _id field is always included by default. You may explicitly exclude _id as follows:
db.article.aggregate(
{ $project : {
_id : 0 ,
title : 1 ,
author : 1
}}
);
Here, the projection excludes the _id field but includes the title and author fields.
Projections can also add computed fields to the document stream passing through the pipeline. A computed field
can use any of the expression operators (page 673). Consider the following example:
db.article.aggregate(
{ $project : {
title : 1,
doctoredPageViews : { $add:["$pageViews", 10] }
}}
);
Here, the field doctoredPageViews represents the value of the pageViews field after adding 10 to the
original field using the $add (page 681).
Note: You must enclose the expression that defines the computed field in braces, so that the expression is a
valid object.
You may also use $project (page 664) to rename fields. Consider the following example:
db.article.aggregate(
{ $project : {
title : 1 ,
page_views : "$pageViews" ,
bar : "$other.foo"
}}
);
This operation renames the pageViews field to page_views, and renames the foo field in the other subdocument as the top-level field bar. The field references used for renaming fields are direct expressions and do
not use an operator or surrounding braces. All aggregation field references can use dotted paths to refer to fields
in nested documents.
665
Finally, you can use the $project (page 664) to create and populate new sub-documents. Consider the
following example that creates a new object-valued field named stats that holds a number of values:
db.article.aggregate(
{ $project : {
title : 1 ,
stats : {
pv : "$pageViews",
foo : "$other.foo",
dpv : { $add:["$pageViews", 10] }
}
}}
);
This projection includes the title field and places $project (page 664) into inclusive mode. Then, it
creates the stats documents with the following fields:
pv which includes and renames the pageViews from the top level of the original documents.
foo which includes the value of other.foo from the original documents.
dpv which is a computed field that adds 10 to the value of the pageViews field in the original document
using the $add (page 681) aggregation expression.
$match (aggregation)
$match
$match (page 666) pipes the documents that match its conditions to the next operator in the pipeline.
The $match (page 666) query syntax is identical to the read operation query (page 68) syntax.
Example
The following operation uses $match (page 666) to perform a simple equality match:
db.articles.aggregate(
{ $match : { author : "dave" } }
);
The $match (page 666) selects the documents where the author field equals dave, and the aggregation
returns the following:
{ "result" : [
{
"_id" : ObjectId("512bc95fe835e68f199c8686"),
"author": "dave",
"score" : 80
},
{ "_id" : ObjectId("512bc962e835e68f199c8687"),
"author" : "dave",
"score" : 85
}
],
"ok" : 1 }
Example
The following example selects documents to process using the $match (page 666) pipeline operator and then
pipes the results to the $group (page 669) pipeline operator to compute a count of the documents:
666
db.articles.aggregate( [
{ $match : { score : { $gt : 70, $lte : 90 } } },
{ $group: { _id: null, count: { $sum: 1 } } }
] );
In the aggregation pipeline, $match (page 666) selects the documents where the score is greater than 70 and
less than or equal to 90. These documents are then piped to the $group (page 669) to perform a count. The
aggregation returns the following:
{
"result" : [
{
"_id" : null,
"count" : 3
}
],
"ok" : 1 }
Note:
Place the $match (page 666) as early in the aggregation pipeline as possible. Because $match
(page 666) limits the total number of documents in the aggregation pipeline, earlier $match (page 666)
operations minimize the amount of processing down the pipe.
If you place a $match (page 666) at the very beginning of a pipeline, the query can take advantage
of indexes like any other db.collection.find() (page 816) or db.collection.findOne()
(page 824).
New in version 2.4: $match (page 666) queries can support the geospatial $geoWithin (page 635) operations.
Warning: You cannot use $where (page 634) in $match (page 666) queries as part of the aggregation
pipeline.
$limit (aggregation)
$limit
Restricts the number of documents that pass through the $limit (page 667) in the pipeline.
$limit (page 667) takes a single numeric (positive whole number) value as a parameter. Once the specified
number of documents pass through the pipeline operator, no more will. Consider the following example:
db.article.aggregate(
{ $limit : 5 }
);
This operation returns only the first 5 documents passed to it from by the pipeline. $limit (page 667) has no
effect on the content of the documents it passes.
Note: Changed in version 2.4: $sort (page 670) and memory requirements:
When a $sort (page 670) immediately precedes a $limit (page 667) in the pipeline, the $sort
(page 670) operation only maintains the top n results as it progresses, where n is the specified limit, and
MongoDB only needs to store the number of items specified by $limit (page 667) in memory. Before
MongoDB 2.4, $sort (page 670) would sort all the results in memory, and then limit the results to n
results.
667
Unless the $sort (page 670) operator can use an index or immediately precedes a $limit (page 667),
the $sort (page 670) operation must fit within memory. Before MongoDB 2.4, unless the $sort
(page 670) operator can use an index, the $sort (page 670) operation must fit within memory.
$sort (page 670) produces an error if the operation consumes 10 percent or more of RAM.
$skip (aggregation)
$skip
Skips over the specified number of documents that pass through the $skip (page 668) in the pipeline before
passing all of the remaining input.
$skip (page 668) takes a single numeric (positive whole number) value as a parameter. Once the operation has
skipped the specified number of documents, it passes all the remaining documents along the pipeline without
alteration. Consider the following example:
db.article.aggregate(
{ $skip : 5 }
);
This operation skips the first 5 documents passed to it by the pipeline. $skip (page 668) has no effect on the
content of the documents it passes along the pipeline.
$unwind (aggregation)
$unwind
Peels off the elements of an array individually, and returns a stream of documents. $unwind (page 668) returns
one document for every member of the unwound array within every source document. Take the following
aggregation command:
db.article.aggregate(
{ $project : {
author : 1 ,
title : 1 ,
tags : 1
}},
{ $unwind : "$tags" }
);
Note: The dollar sign (i.e. $) must proceed the field specification handed to the $unwind (page 668) operator.
In the above aggregation $project (page 664) selects (inclusively) the author, title, and tags fields,
as well as the _id field implicitly. Then the pipeline passes the results of the projection to the $unwind
(page 668) operator, which will unwind the tags field. This operation may return a sequence of documents
that resemble the following for a collection that contains one document holding a tags field with an array of 3
items.
{
"result" : [
{
"_id" : ObjectId("4e6e4ef557b77501a49233f6"),
"title" : "this is my title",
"author" : "bob",
"tags" : "fun"
},
{
"_id" : ObjectId("4e6e4ef557b77501a49233f6"),
668
A single document becomes 3 documents: each document is identical except for the value of the tags field.
Each value of tags is one of the values in the original tags array.
Note: $unwind (page 668) has the following behaviors:
$unwind (page 668) is most useful in combination with $group (page 669).
You may undo the effects of unwind operation with the $group (page 669) pipeline operator.
If you specify a target field for $unwind (page 668) that does not exist in an input document, the pipeline
ignores the input document, and will generate no result documents.
If you specify a target field for $unwind (page 668)
db.collection.aggregate() (page 808) generates an error.
that
is
not
an
array,
If you specify a target field for $unwind (page 668) that holds an empty array ([]) in an input document,
the pipeline ignores the input document, and will generates no result documents.
$group (aggregation)
$group
Groups documents together for the purpose of calculating aggregate values based on a collection of documents.
In practice, $group (page 669) often supports tasks such as average page views for each page in a website on
a daily basis.
Important: The output of $group (page 669) is not ordered.
The output of $group (page 669) depends on how you define groups. Begin by specifying an identifier (i.e.
an _id field) for the group youre creating with this pipeline. For this _id field, you can specify various
expressions, including a single field from the documents in the pipeline, a computed value from a previous
stage, a document that consists of multiple fields, and other valid expressions, such as constant or subdocument
fields. You can use $project (page 664) operators in expressions for the _id field.
The following example of an _id field specifies a document that consists of multiple fields:
{ _id : { author: '$author', pageViews: '$pageViews', posted: '$posted' } }
Every $group (page 669) expression must specify an _id field. In addition to the _id field, $group
(page 669) expression can include computed fields. These other fields must use one of the following accumulators:
$addToSet (page 674)
$first (page 675)
669
This aggregation pipeline groups by the author field and computes two fields, the first docsPerAuthor is a
counter field that increments by one for each document with a given author field using the $sum (page 678) function.
The viewsPerAuthor field is the sum of all of the pageViews fields in the documents for each group.
$sort (aggregation)
$sort
The $sort (page 670) pipeline operator sorts all input documents and returns them to the pipeline in sorted
order. Consider the following prototype form:
db.<collection-name>.aggregate(
{ $sort : { <sort-key> } }
);
This sorts the documents in the collection named <collection-name>, according to the key and specification in the { <sort-key> } document.
Specify the sort in a document with a field or fields that you want to sort by and a value of 1 or -1 to specify an
ascending or descending sort respectively, as in the following example:
db.users.aggregate(
{ $sort : { age : -1, posts: 1 } }
);
This operation sorts the documents in the users collection, in descending order according by the age field
and then in ascending order according to the value in the posts field.
When comparing values of different BSON types, MongoDB uses the following comparison order, from lowest
to highest:
670
671
672
The matching documents in the result field contain two new fields:
dist.calculated field that contains the calculated distance, and
dist.location field that contains the location used in the calculation.
Expression Operators
Expression operators calculate values within the Pipeline Operators (page 664).
$group Operators
673
Name
$addToSet
(page 674)
$first (page 675)
$last (page 675)
$max (page 675)
$min (page 675)
$avg (page 676)
$push (page 677)
$sum (page 678)
Description
Returns an array of all the unique values for the selected field among for ea
in that group.
Returns the first value in a group.
Returns the last value in a group.
Returns the highest value in a group.
Returns the lowest value in a group.
Returns an average of all the values in a group.
Returns an array of all values for the selected field among for each docume
group.
Returns the sum of all the values in a group.
$addToSet (aggregation)
$addToSet
Returns an array of all the values found in the selected field among the documents in that group. Every unique
value only appears once in the result set. There is no ordering guarantee for the output documents.
Example In the mongo (page 942) shell, insert documents into a collection named products using the following
operation:
db.products.insert( [
{ "type" : "phone",
{ "type" : "phone",
{ "type" : "phone",
{ "type" : "chair",
] )
"price"
"price"
"price"
"price"
:
:
:
:
Use the following db.collection.aggregate() (page 808) operation in the mongo (page 942) shell with the
$addToSet (page 674) operator:
db.products.aggregate( {
$group : {
_id : "$type",
price: { $addToSet: "$price" },
stocked: { $addToSet: "$stocked" },
} }
)
This aggregation pipeline returns documents grouped on the value of the type field, with sets constructed from the
unique values of the price and stocked fields in the input documents. Consider the following aggregation results:
{
"_id" : "chair",
"price" : [
59.99
],
"stocked" : [
108
]
},
{
"_id" : "phone",
"price" : [
376.99,
389.99
],
674
"stocked" : [
97000,
270000,
]
}
$first (aggregation)
$first
Returns the first value it encounters for its group.
Note: Only use $first (page 675) when the $group (page 669) follows an $sort (page 670) operation.
Otherwise, the result of this operation is unpredictable.
$last (aggregation)
$last
Returns the last value it encounters for its group.
Note: Only use $last (page 675) when the $group (page 669) follows an $sort (page 670) operation.
Otherwise, the result of this operation is unpredictable.
$max (aggregation)
$max
Returns the highest value among all values of the field in all documents selected by this group.
$min (aggregation)
$min
The $min (page 675) operator returns the lowest non-null value of a field in the documents for a $group
(page 669) operation.
Changed in version 2.4: If some, but not all, documents for the $min (page 675) operation have either a null
value for the field or are missing the field, the $min (page 675) operator only considers the non-null and the
non-missing values for the field. If all documents for the $min (page 675) operation have null value for the
field or are missing the field, the $min (page 675) operator returns null for the minimum value.
Before 2.4, if any of the documents for the $min (page 675) operation were missing the field, the $min
(page 675) operator would not return any value. If any of the documents for the $min (page 675) had the
value null, the $min (page 675) operator would return a null.
Example
The users collection contains the following documents:
{
{
{
{
"_id"
"_id"
"_id"
"_id"
:
:
:
:
"abc001",
"abe001",
"efg001",
"xyz001",
"age"
"age"
"age"
"age"
:
:
:
:
25
35
20
15
}
}
}
}
To find the minimum value of the age field from all the documents, use the $min (page 675) operator:
db.users.aggregate( [ { $group: { _id:0, minAge: { $min: "$age"} } } ] )
The operation returns the value of the age field in the minAge field:
675
To find the minimum value of the age field for only those documents with _id starting with the letter a,
use the $min (page 675) operator after a $match (page 666) operation:
db.users.aggregate( [ { $match: { _id: /^a/ } },
{ $group: { _id: 0, minAge: { $min: "$age"} } }
] )
The operation returns the minimum value of the age field for the two documents with _id starting with
the letter a:
{ "result" : [ { "_id" : 0, "minAge" : 25 } ], "ok" : 1 }
Example
The users collection contains the following documents where some of the documents are either missing the
age field or the age field contains null:
{
{
{
{
{
{
"_id"
"_id"
"_id"
"_id"
"_id"
"_id"
:
:
:
:
:
:
"abc001", "age"
"abe001", "age"
"efg001", "age"
"xyz001", "age"
"xxx001" }
"zzz001", "age"
:
:
:
:
25
35
20
15
}
}
}
}
: null }
The following operation finds the minimum value of the age field in all the documents:
db.users.aggregate( [ { $group: { _id:0, minAge: { $min: "$age"} } } ] )
Because only some documents for the $min (page 675) operation are missing the age field or have age
field equal to null, $min (page 675) only considers the non-null and the non-missing values and the
operation returns the following document:
{ "result" : [ { "_id" : 0, "minAge" : 15 } ], "ok" : 1 }
The following operation finds the minimum value of the age field for only those documents where the
_id equals "xxx001" or "zzz001":
db.users.aggregate( [ { $match: { _id: {$in: [ "xxx001", "zzz001" ] } } },
{ $group: { _id: 0, minAge: { $min: "$age"} } }
] )
The $min (page 675) operation returns null for the minimum age since all documents for the $min
(page 675) operation have null value for the field age or are missing the field:
{ "result" : [ { "_id" : 0, "minAge" : null } ], "ok" : 1 }
$avg (aggregation)
$avg
Returns the average of all the values of the field in all documents selected by this group.
676
$push (aggregation)
$push
Returns an array of all the values found in the selected field among the documents in that group. A value may
appear more than once in the result set if more than one field in the grouped documents has that value.
Example The following examples use the following collection named users as the input for the aggregation
pipeline:
{
{
{
{
"_id":
"_id":
"_id":
"_id":
1,
2,
3,
4,
"user"
"user"
"user"
"user"
:
:
:
:
}
}
}
}
db.users.aggregate(
{
$group: {
_id: "$age",
users: { $push: "$user" }
}
}
)
For each age, the operation returns the field users that contains an array of all the user values associated with that
age:
{
"result" : [
{
"_id" : 20,
"users" : [
"Ty"
]
},
{
"_id" : 35,
"users" : [
"Mel"
]
},
{
"_id" : 25,
"users" : [
"Jan",
"Lee"
]
}
],
"ok" : 1
}
Push Documents Into the Returned Array Field The $push (page 677) operator can return an array of documents.
677
To group by age and return all the user and associated score values for each age, use the $push (page 677)
operator.
db.users.aggregate(
{
$group: {
_id: "$age",
users: { $push: { userid: "$user", score: "$score" } }
}
}
)
For each age, the operation returns the field users that contains an array of documents. These documents contain
the fields userid and score that hold respectively the user value and the score value associated with that age:
{
"result" : [
{
"_id" : 20,
"users" : [
{
"userid" : "Ty",
"score" : 102
}
]
},
{
"_id" : 35,
"users" : [
{
"userid" : "Mel",
"score" : 70
}
]
},
{
"_id" : 25,
"users" : [
{
"userid" : "Jan",
"score" : 80
},
{
"userid" : "Lee",
"score" : 45
}
]
}
],
"ok" : 1
}
$sum (aggregation)
$sum
Returns the sum of all the values for a specified field in the grouped documents, as in the second use above.
Alternately, if you specify a value as an argument, $sum (page 678) will increment this field by the specified
value for every document in the grouping. Typically, as in the first use above, specify a value of 1 in order to
678
Name
$and (page 679)
$or (page 679)
$not (page 679)
Description
Returns true only when all values in its input array are true.
Returns true when any value in its input array are true.
Returns the boolean value that is the opposite of the input value.
$and (aggregation)
$and
Takes an array one or more values and returns true if all of the values in the array are true. Otherwise $and
(page 679) returns false.
Note: $and (page 679) uses short-circuit logic: the operation stops evaluation after encountering the first
false expression.
$or (aggregation)
$or
Takes an array of one or more values and returns true if any of the values in the array are true. Otherwise
$or (page 679) returns false.
Note: $or (page 679) uses short-circuit logic: the operation stops evaluation after encountering the first true
expression.
$not (aggregation)
$not
Returns the boolean opposite value passed to it. When passed a true value, $not (page 679) returns false;
when passed a false value, $not (page 679) returns true.
Comparison Operators These operators perform comparisons between two values and return a Boolean, in most
cases reflecting the result of the comparison.
All comparison operators take an array with a pair of values. You may compare numbers, strings, and dates. Except
for $cmp (page 680), all comparison operators return a Boolean value. $cmp (page 680) returns an integer.
Name
$cmp (page 680)
$eq (page 680)
$gt (page 680)
$gte (page 680)
$lt (page 680)
$lte (page 680)
$ne (page 680)
Description
Compares two values and returns the result of the comparison as an integ
Takes two values and returns true if the values are equivalent.
Takes two values and returns true if the first is larger than the second.
Takes two values and returns true if the first is larger than or equal to the
Takes two values and returns true if the second value is larger than the fir
Takes two values and returns true if the second value is larger than or equ
Takes two values and returns true if the values are not equivalent.
679
$cmp (aggregation)
$cmp
Takes two values in an array and returns an integer. The returned value is:
A negative number if the first value is less than the second.
A positive number if the first value is greater than the second.
0 if the two values are equal.
$eq (aggregation)
$eq
Takes two values in an array and returns a boolean. The returned value is:
true when the values are equivalent.
false when the values are not equivalent.
$gt (aggregation)
$gt
Takes two values in an array and returns a boolean. The returned value is:
true when the first value is greater than the second value.
false when the first value is less than or equal to the second value.
$gte (aggregation)
$gte
Takes two values in an array and returns a boolean. The returned value is:
true when the first value is greater than or equal to the second value.
false when the first value is less than the second value.
$lt (aggregation)
$lt
Takes two values in an array and returns a boolean. The returned value is:
true when the first value is less than the second value.
false when the first value is greater than or equal to the second value.
$lte (aggregation)
$lte
Takes two values in an array and returns a boolean. The returned value is:
true when the first value is less than or equal to the second value.
false when the first value is greater than the second value.
$ne (aggregation)
$ne
Takes two values in an array returns a boolean. The returned value is:
true when the values are not equivalent.
false when the values are equivalent.
680
Name
$add (page 681)
$divide (page 681)
$mod (page 681)
$multiply (page 681)
$subtract (page 681)
Description
Computes the sum of an array of numbers.
Takes two numbers and divides the first number by the second.
Takes two numbers and calcualtes the modulo of the first number d
Computes the product of an array of numbers.
Takes two numbers and subtracts the second number from the first.
$add (aggregation)
$add
Takes an array of one or more numbers and adds them together, returning the sum.
$divide (aggregation)
$divide
Takes an array that contains a pair of numbers and returns the value of the first number divided by the second
number.
$mod (aggregation)
$mod
Takes an array that contains a pair of numbers and returns the remainder of the first number divided by the
second number.
See also:
$mod (page 632)
$multiply (aggregation)
$multiply
Takes an array of one or more numbers and multiples them, returning the resulting product.
$subtract (aggregation)
$subtract
Takes an array that contains a pair of numbers and subtracts the second from the first, returning their difference.
String Operators String operators manipulate strings within projection expressions.
Name
$concat (page 681)
$strcasecmp (page 684)
$substr (page 684)
$toLower (page 685)
$toUpper (page 685)
Description
Concatenates two strings.
Compares two strings and returns an integer that reflects the compari
Takes a string and returns portion of that string.
Converts a string to lowercase.
Converts a string to uppercase.
$concat (aggregation)
681
$concat
New in version 2.4.
Takes an array of strings, concatenates the strings, and returns the concatenated string. $concat (page 681)
can only accept an array of strings.
Use $concat (page 681) with the following syntax:
{ $concat: [ <string>, <string>, ... ] }
If array element has a value of null or refers to a field that is missing, $concat (page 681) will return null.
Example
Project new concatenated values.
A collection menu contains the documents that stores information on menu items separately in the section,
the category and the type fields, as in the following:
{
{
{
{
_id:
_id:
_id:
_id:
1,
2,
3,
4,
item:
item:
item:
item:
{
{
{
{
sec:
sec:
sec:
sec:
The following operation uses $concat (page 681) to concatenate the type field from the sub-document item,
a space, and the category field from the sub-document item to project a new food field:
db.menu.aggregate( { $project: { food:
{ $concat: [ "$item.type",
" ",
"$item.category"
]
}
}
}
)
The operation returns the following result set where the food field contains the concatenated strings:
{
"result" : [
{
{
{
{
"_id"
"_id"
"_id"
"_id"
:
:
:
:
1,
2,
3,
4,
"food"
"food"
"food"
"food"
:
:
:
:
"apple pie" },
"cherry pie" },
"shepherd's pie" },
"chicken pot pie" }
],
"ok" : 1
}
Example
Group by a concatenated string.
A collection menu contains the documents that stores information on menu items separately in the section,
the category and the type fields, as in the following:
{
{
{
{
682
_id:
_id:
_id:
_id:
1,
2,
3,
4,
item:
item:
item:
item:
{
{
{
{
sec:
sec:
sec:
sec:
The following aggregation uses $concat (page 681) to concatenate the sec field from the sub-document
item, the string ": ", and the category field from the sub-document item to group by the new concatenated string and perform a count:
db.menu.aggregate( { $group: { _id:
{ $concat: [ "$item.sec",
": ",
"$item.category"
]
},
count: { $sum: 1 }
}
}
)
Example
Concatenate null or missing values.
A collection menu contains the documents that stores information on menu items separately in the section,
the category and the type fields. Not all documents have the all three fields. For example, the document
with _id equal to 5 is missing the category field:
{
{
{
{
{
_id:
_id:
_id:
_id:
_id:
1,
2,
3,
4,
5,
item:
item:
item:
item:
item:
{
{
{
{
{
sec:
sec:
sec:
sec:
sec:
The following aggregation uses the $concat (page 681) to concatenate the type field from the sub-document
item, a space, and the category field from the sub-document item:
db.menu.aggregate( { $project: { food:
{ $concat: [ "$item.type",
" ",
"$item.category"
]
}
}
}
)
Because the document with _id equal to 5 is missing the type field in the item sub-document, $concat
(page 681) returns the value null as the concatenated value for the document:
{
"result" : [
{ "_id" : 1, "food" : "apple pie" },
{ "_id" : 2, "food" : "cherry pie" },
683
To handle possible missing fields, you can use $ifNull (page 687) with $concat (page 681), as in the following example which substitutes <unknown type> if the field type is null or missing, and <unknown
category> if the field category is null or is missing:
db.menu.aggregate( { $project: { food:
"_id"
"_id"
"_id"
"_id"
"_id"
:
:
:
:
:
1,
2,
3,
4,
5,
"food"
"food"
"food"
"food"
"food"
:
:
:
:
:
"apple pie" },
"cherry pie" },
"shepherd's pie" },
"chicken pot pie" },
"coffee <unknown category>" }
],
"ok" : 1
}
$strcasecmp (aggregation)
$strcasecmp
Takes in two strings. Returns a number. $strcasecmp (page 684) is positive if the first string is greater
than the second and negative if the first string is less than the second. $strcasecmp (page 684) returns 0
if the strings are identical.
Note: $strcasecmp (page 684) may not make sense when applied to glyphs outside the Roman alphabet.
$strcasecmp (page 684) internally capitalizes strings before comparing them to provide a case-insensitive
comparison. Use $cmp (page 680) for a case sensitive comparison.
$substr (aggregation)
$substr
$substr (page 684) takes a string and two numbers. The first number represents the number of bytes in the
string to skip, and the second number specifies the number of bytes to return from the string.
Note: $substr (page 684) is not encoding aware and if used improperly may produce a result string containing an invalid UTF-8 character sequence.
684
$toLower (aggregation)
$toLower
Takes a single string and converts that string to lowercase, returning the result. All uppercase letters become
lowercase.
Note: $toLower (page 685) may not make sense when applied to glyphs outside the Roman alphabet.
$toUpper (aggregation)
$toUpper
Takes a single string and converts that string to uppercase, returning the result. All lowercase letters become
uppercase.
Note: $toUpper (page 685) may not make sense when applied to glyphs outside the Roman alphabet.
Date Operators Date operators take a Date typed value as a single argument and return a number.
Name
$dayOfYear (page 685)
$dayOfMonth
(page 685)
$dayOfWeek (page 685)
$year (page 685)
$month (page 686)
$week (page 686)
$hour (page 686)
$minute (page 686)
$second (page 686)
$millisecond
(page 686)
Description
Converts a date to a number between 1 and 366.
Converts a date to a number between 1 and 31.
$dayOfYear (aggregation)
$dayOfYear
Takes a date and returns the day of the year as a number between 1 and 366.
$dayOfMonth (aggregation)
$dayOfMonth
Takes a date and returns the day of the month as a number between 1 and 31.
$dayOfWeek (aggregation)
$dayOfWeek
Takes a date and returns the day of the week as a number between 1 (Sunday) and 7 (Saturday.)
$year (aggregation)
$year
Takes a date and returns the full year.
685
$month (aggregation)
$month
Takes a date and returns the month as a number between 1 and 12.
$week (aggregation)
$week
Takes a date and returns the week of the year as a number between 0 and 53.
Weeks begin on Sundays, and week 1 begins with the first Sunday of the year. Days preceding the first Sunday
of the year are in week 0. This behavior is the same as the %U operator to the strftime standard library
function.
$hour (aggregation)
$hour
Takes a date and returns the hour between 0 and 23.
$minute (aggregation)
$minute
Takes a date and returns the minute between 0 and 59.
$second (aggregation)
$second
Takes a date and returns the second between 0 and 59, but can be 60 to account for leap seconds.
$millisecond (aggregation)
$millisecond
Takes a date and returns the millisecond portion of the date as an integer between 0 and 999.
Conditional Expressions
Name
$cond
(page 686)
$ifNull
(page 687)
Description
A ternary operator that evaluates one expression, and depending on the res
of one following expressions.
Evaluates an expression and returns a value.
$cond (aggregation)
$cond
$cond (page 686) is a ternary operator that takes an array of three expressions, where the first expression
evaluates to a Boolean value. If the first expression evaluates to true, then $cond (page 686) evaluates and
returns the value of the second expression. If the first expression evaluates to false, then $cond (page 686)
evaluates and returns the third expression.
Use the $cond (page 686) operator with the following syntax:
{ $cond: [ <boolean-expression>, <true-case>, <false-case> ] }
All three values in the array specified to $cond (page 686) must be valid MongoDB aggregation expressions (page 663) or document fields. Do not use JavaScript in any aggregation statements, including $cond
(page 686).
686
Example
The following aggregation on the survey collection groups by the item_id field and returns a
weightedCount for each item_id. The $sum (page 678) operator uses the $cond (page 686) expression to add either 2 if the value stored in the level field is E and 1 otherwise.
db.survey.aggregate(
[
{
$group: {
_id: "$item_id",
weightedCount: { $sum: { $cond: [ { $eq: [ "$level", "E" ] } , 2, 1 ] } }
}
}
]
)
$ifNull (aggregation)
$ifNull
Takes an array with two expressions. $ifNull (page 687) returns the first expression if it evaluates to a
non-null value. Otherwise, $ifNull (page 687) returns the second expressions value.
Use the $ifNull (page 687) operator with the following syntax:
{ $ifNull: [ <expression>, <replacement-if-null> ] }
Both values in the array specified to $ifNull (page 687) must be valid MongoDB aggregation expressions
(page 663) or document fields. Do not use JavaScript in any aggregation statements, including $ifNull
(page 687).
Example
The following aggregation on the offSite collection groups by the location field and returns a count for
each location. If the location field contains null, the $ifNull (page 687) returns "Unspecified" as
the value. MongoDB assigns the returned value to _id in the aggregated document.
db.offSite.aggregate(
[
{
$group: {
_id: { $ifNull: [ "$location", "Unspecified" ] },
count: { $sum: 1 }
}
}
]
)
Meta-Query Operators
Query Modification Operators
Introduction In addition to the MongoDB Query Operators (page 621), there are a number of meta operators that
you can modify the output or behavior of a query. On the server, MongoDB treats the query and the options as a single
687
object. The mongo (page 942) shell and driver interfaces may provide cursor methods (page 858) that wrap these
options. When possible, use these methods; otherwise, you can add these options using either of the following syntax:
db.collection.find( { <query> } )._addSpecial( <option> )
db.collection.find( { $query: { <query> }, <option> } )
Operators
Modifiers Many of these operators have corresponding methods in the shell (page 858). These methods provide a
straightforward and user-friendly interface and are the preferred way to add these options.
Name
$comment (page 688)
$explain (page 688)
$hint (page 689)
$maxScan (page 690)
$max (page 690)
$min (page 691)
$orderby (page 691)
$returnKey
(page 692)
$showDiskLoc
(page 692)
$snapshot (page 692)
$query (page 693)
Description
Adds a comment to the query to identify queries in the database profiler output.
Forces MongoDB to report on query execution plans. See explain() (page 861).
Forces MongoDB to use a specific index. See hint() (page 866)
Limits the number of documents a cursor will return for a query. See limit()
(page 867).
Specifies a minimum exclusive upper limit for the index to use in a query. See
max() (page 867).
Specifies a minimum inclusive lower limit for the index to use in a query. See min()
(page 869).
Returns a cursor with documents sorted according to a sort specification. See
sort() (page 872).
Forces the cursor to only return fields included in the index.
Modifies the documents returned to include references to the on-disk location of each
document.
Forces the query to use the index on the _id field. See snapshot() (page 872).
Wraps a query document.
$comment
$comment
The $comment (page 688) makes it possible to attach a comment to a query. Because these comments propagate to the profile (page 770) log, adding $comment (page 688) modifiers can make your profile data much
easier to interpret and trace. Use one of the following forms:
db.collection.find( { <query> } )._addSpecial( "$comment", <comment> )
db.collection.find( { $query: { <query> }, $comment: <comment> } )
$explain
$explain
The $explain (page 688) operator provides information on the query plan. It returns a document that describes the process and indexes used to return the query. This may provide useful insight when attempting to
optimize a query. For details on the output, see cursor.explain() (page 861).
You can specify the $explain (page 688) operator in either of the following forms:
db.collection.find()._addSpecial( "$explain", 1 )
db.collection.find( { $query: {}, $explain: 1 } )
You also can specify $explain (page 688) through the explain() (page 861) method in the mongo
(page 942) shell:
688
db.collection.find().explain()
$explain (page 688) runs the actual query to determine the result. Although there are some differences
between running the query with $explain (page 688) and running without, generally, the performance will
be similar between the two. So, if the query is slow, the $explain (page 688) operation is also slow.
Additionally, the $explain (page 688) operation reevaluates a set of candidate query plans, which may cause
the $explain (page 688) operation to perform differently than a normal query. As a result, these operations
generally provide an accurate account of how MongoDB would perform the query, but do not reflect the length
of these queries.
To determine the performance of a particular index, you can use hint() (page 866) and in conjunction with
explain() (page 861), as in the following example:
db.products.find().hint( { type: 1 } ).explain()
When you run explain() (page 861) with hint() (page 866), the query optimizer does not reevaluate the
query plans.
Note: In some situations, the explain() (page 861) operation may differ from the actual query plan used by
MongoDB in a normal query.
The explain() (page 861) operation evaluates the set of query plans and reports on the winning plan for the
query. In normal operations the query optimizer caches winning query plans and uses them for similar related
queries in the future. As a result MongoDB may sometimes select query plans from the cache that are different
from the plan displayed using explain() (page 861).
See also:
explain() (page 861)
Optimization Strategies for MongoDB (page 160) page for information regarding optimization strategies.
Analyze Performance of Database Operations (page 167) tutorial for information regarding the database
profile.
Current Operation Reporting (page 879)
$hint
$hint
The $hint (page 689) operator forces the query optimizer (page 45) to use a specific index to fulfill the query.
Specify the index either by the index name or by document.
Use $hint (page 689) for testing query performance and indexing strategies. The mongo (page 942) shell
provides a helper method hint() (page 866) for the $hint (page 689) operator.
Consider the following operation:
db.users.find().hint( { age: 1 } )
This operation returns all documents in the collection named users using the index on the age field.
You can also specify a hint using either of the following forms:
db.users.find()._addSpecial( "$hint", { age : 1 } )
db.users.find( { $query: {}, $hint: { age : 1 } } )
Note: To combine $explain (page 688) and $hint (page 689) operations when $hint (page 689) is part
of the document, as in the following query statement:
689
You must add the $explain (page 688) option to the document, as in the following:
db.users.find( { $query: {}, $hint: { age : 1 }, $explain: 1 } )
$maxScan
$maxScan
Constrains the query to only scan the specified number of documents when fulfilling the query. Use one of the
following forms:
db.collection.find( { <query> } )._addSpecial( "$maxScan" , <number> )
db.collection.find( { $query: { <query> }, $maxScan: <number> } )
Use this modifier to prevent potentially long running queries from disrupting performance by scanning through
too much data.
$max
$max
Specify a $max (page 690) value to specify the exclusive upper bound for a specific index in order to constrain
the results of find() (page 816). The mongo (page 942) shell provides the max() (page 867) wrapper
method:
db.collection.find( { <query> } ).max( { field1: <max value>, ... fieldN: <max valueN> } )
You can also specify the option with either of the two forms:
db.collection.find( { <query> } )._addSpecial( "$max", { field1: <max value1>, ... fieldN: <max
db.collection.find( { $query: { <query> }, $max: { field1: <max value1>, ... fieldN: <max valueN
The $max (page 690) specifies the upper bound for all keys of a specific index in order.
Consider the following operations on a collection named collection that has an index { age:
1 }:
This operation limits the query to those documents where the field age is less than 100 using the index {
age: 1 }.
You can explicitly specify the corresponding index with hint() (page 866). Otherwise, MongoDB selects the
index using the fields in the indexBounds; however, if multiple indexes exist on same fields with different
sort orders, the selection of the index may be ambiguous.
Consider a collection named collection that has the following two indexes:
{ age: 1, type: -1 }
{ age: 1, type: 1 }
Without explicitly using hint() (page 866), MongoDB may select either index for the following operation:
db.collection.find().max( { age: 50, type: 'B' } )
Use $max (page 690) alone or in conjunction with $min (page 691) to limit results to a specific range for the
same index, as in the following example:
db.collection.find().min( { age: 20 } ).max( { age: 25 } )
690
Note: Because max() (page 867) requires an index on a field, and forces the query to use this index, you may
prefer the $lt (page 623) operator for the query if possible. Consider the following example:
db.collection.find( { _id: 7 } ).max( { age: 25 } )
The query uses the index on the age field, even if the index on _id may be better.
$min
$min
Specify a $min (page 691) value to specify the inclusive lower bound for a specific index in order to constrain
the results of find() (page 816). The mongo (page 942) shell provides the min() (page 869) wrapper
method:
db.collection.find( { <query> } ).min( { field1: <min value>, ... fieldN: <min valueN>} )
You can also specify the option with either of the two forms:
db.collection.find( { <query> } )._addSpecial( "$min", { field1: <min value1>, ... fieldN: <min
db.collection.find( { $query: { <query> }, $min: { field1: <min value1>, ... fieldN: <min valueN
The $min (page 691) specifies the lower bound for all keys of a specific index in order.
Consider the following operations on a collection named collection that has an index { age:
1 }:
db.collection.find().min( { age: 20 } )
These operations limit the query to those documents where the field age is at least 20 using the index { age:
1 }.
You can explicitly specify the corresponding index with hint() (page 866). Otherwise, MongoDB selects the
index using the fields in the indexBounds; however, if multiple indexes exist on same fields with different
sort orders, the selection of the index may be ambiguous.
Consider a collection named collection that has the following two indexes:
{ age: 1, type: -1 }
{ age: 1, type: 1 }
Without explicitly using hint() (page 866), it is unclear which index the following operation will select:
db.collection.find().min( { age: 20, type: 'C' } )
You can use $min (page 691) in conjunction with $max (page 690) to limit results to a specific range for the
same index, as in the following example:
db.collection.find().min( { age: 20 } ).max( { age: 25 } )
Note: Because min() (page 869) requires an index on a field, and forces the query to use this index, you may
prefer the $gte (page 622) operator for the query if possible. Consider the following example:
db.collection.find( { _id: 7 } ).min( { age: 25 } )
The query will use the index on the age field, even if the index on _id may be better.
$orderby
691
$orderby
The $orderby (page 691) operator sorts the results of a query in ascending or descending order.
The mongo (page 942) shell provides the cursor.sort() (page 872) method:
db.collection.find().sort( { age: -1 } )
You can also specify the option in either of the following forms:
db.collection.find()._addSpecial( "$orderby", { age : -1 } )
db.collection.find( { $query: {}, $orderby: { age : -1 } } )
These examples return all documents in the collection named collection sorted by the age field in descending order. Specify a value to $orderby (page 691) of negative one (e.g. -1, as above) to sort in descending
order or a positive value (e.g. 1) to sort in ascending order.
Unless you have an index for the specified key pattern, use $orderby (page 691) in conjunction with
$maxScan (page 690) and/or cursor.limit() (page 867) to avoid requiring MongoDB to perform a large
in-memory sort. The cursor.limit() (page 867) increases the speed and reduces the amount of memory
required to return this query by way of an optimized algorithm.
$returnKey
$returnKey
Only return the index field or fields for the results of the query. If $returnKey (page 692) is set to true
and the query does not use an index to perform the read operation, the returned documents will not contain any
fields. Use one of the following forms:
db.collection.find( { <query> } )._addSpecial( "$returnKey", true )
db.collection.find( { $query: { <query> }, $returnKey: true } )
$showDiskLoc
$showDiskLoc
$showDiskLoc (page 692) option adds a field $diskLoc to the returned documents. The $diskLoc field
contains the disk location information.
The mongo (page 942) shell provides the cursor.showDiskLoc() (page 871) method:
db.collection.find().showDiskLoc()
You can also specify the option in either of the following forms:
db.collection.find( { <query> } )._addSpecial("$showDiskLoc" , true)
db.collection.find( { $query: { <query> }, $showDiskLoc: true } )
$snapshot
$snapshot
The $snapshot (page 692) operator prevents the cursor from returning a document more than once because
an intervening write operation results in a move of the document.
Even in snapshot mode, objects inserted or deleted during the lifetime of the cursor may or may not be returned.
The mongo (page 942) shell provides the cursor.snapshot() (page 872) method:
db.collection.find().snapshot()
You can also specify the option in either of the following forms:
692
The $snapshot (page 692) operator traverses the index on the _id field 4 .
Warning:
You cannot use $snapshot (page 692) with sharded collections.
Do not use $snapshot (page 692) with $hint (page 689) or $orderby (page 691) (or the corresponding cursor.hint() (page 866) and cursor.sort() (page 872) methods.)
$query
$query
The $query (page 693) operator provides an interface to describe queries. Consider the following operation:
db.collection.find( { $query: { age : 25 } } )
These operations return only those documents in the collection named collection where the age field
equals 25.
Note: Do not mix query forms. If you use the $query (page 693) format, do not append cursor methods
(page 858) to the find() (page 816). To modify the query use the meta-query operators (page 687), such as
$explain (page 688).
Therefore, the following two operations are equivalent:
db.collection.find( { $query: { age : 25 }, $explain: true } )
db.collection.find( { age : 25 } ).explain()
See also:
For more information about queries in MongoDB see Read Operations (page 39), Read Operations (page 39),
db.collection.find() (page 816), and Getting Started with MongoDB (page 26).
Sort Order
Name
$natural (page 693)
Description
A special sort order that orders documents using the order of documents on disk.
$natural
$natural
Use the $natural (page 693) operator to use natural order for the results of a sort operation. Natural order
refers to the order of documents in the file on disk.
The $natural (page 693) operator uses the following syntax to return documents in the order they exist on
disk:
db.collection.find().sort( { $natural: 1 } )
You can achieve the $snapshot (page 692) isolation behavior using any unique index on invariable fields.
693
db.collection.find().sort( { $natural: -1 } )
See also:
cursor.sort() (page 872)
Aggregation Commands
Name
aggregate (page 694)
count (page 695)
distinct (page 696)
group (page 697)
mapReduce (page 701)
Description
Performs aggregation tasks (page 279) such as group using the aggregation fr
Counts the number of documents in a collection.
Displays the distinct values found for a specified key in a collection.
Groups documents in a collection by the specified key and performs simple ag
Performs map-reduce (page 282) aggregation for large data sets.
aggregate
aggregate
New in version 2.1.0.
aggregate (page 694) implements the aggregation framework. Consider the following prototype form:
{ aggregate: "[collection]", pipeline: [pipeline] }
Where [collection] specifies the name of the collection that contains the data that you wish to aggregate.
The pipeline argument holds an array that contains the specification for the aggregation operation. Consider
the following example.
db.runCommand(
{ aggregate : "article", pipeline : [
{ $project : {
694
}
{
{
}
] }
);
author : 1,
tags : 1,
},
$unwind : "$tags" },
$group : {
_id : "$tags",
authors : { $addToSet : "$author" }
}
More typically this operation would use the aggregate() (page 808) helper in the mongo (page 942) shell,
and would resemble the following:
db.article.aggregate(
{ $project : {
author : 1,
tags : 1,
} },
{ $unwind : "$tags" },
{ $group : {
_id : "$tags",
authors : { $addToSet : "$author" }
} }
);
Changed in version 2.4: If an error occurs, the aggregate() (page 808) helper throws an exception. In
previous versions, the helper returned a document with the error message and code, and ok status field not equal
to 1, same as the aggregate (page 694) command.
For more information about the aggregation pipeline Aggregation Pipeline (page 279) and Aggregation Reference (page 306).
count
Definition
count
Counts the number of documents in a collection. Returns a document that contains this count and as well as the
command status. count (page 695) has the following form:
{ count: <collection>, query: <query>, limit: <limit>, skip: <skip> }
Examples The following sections provide examples of the count (page 695) command.
695
Count all Documents The following operation counts the number of all documents in the orders collection:
db.runCommand( { count: 'orders' } )
In the result, the n, which represents the count, is 26, and the command status ok is 1:
{ "n" : 26, "ok" : 1 }
Count Documents with Specified Field Values The following operation returns a count of the documents in the
orders collection where the value of the ord_dt field is greater than Date(01/01/2012):
db.runCommand( { count:'orders',
query: { ord_dt: { $gt: new Date('01/01/2012') } }
} )
In the result, the n, which represents the count, is 13 and the command status ok is 1:
{ "n" : 13, "ok" : 1 }
Skip Matching Documents The following operation returns a count of the documents in the orders collection
where the value of the ord_dt field is greater than Date(01/01/2012) and skip the first 10 matching documents:
db.runCommand( { count:'orders',
query: { ord_dt: { $gt: new Date('01/01/2012') } },
skip: 10 } )
In the result, the n, which represents the count, is 3 and the command status ok is 1:
{ "n" : 3, "ok" : 1 }
distinct
Definition
distinct
Finds the distinct values for a specified field across a single collection. distinct (page 696) returns a document that contains an array of the distinct values. The return document also contains a subdocument with query
statistics and the query plan.
When possible, the distinct (page 696) command uses an index to find documents and return values.
The command takes the following form:
{ distinct: "<collection>", key: "<field>", query: <query> }
696
Return an array of the distinct values of the field sku in the subdocument item from all documents in the orders
collection:
db.runCommand ( { distinct: "orders", key: "item.sku" } )
Return an array of the distinct values of the field ord_dt from the documents in the orders collection where the
price is greater than 10:
db.runCommand ( { distinct: "orders",
key: "ord_dt",
query: { price: { $gt: 10 } }
} )
Note: MongoDB also provides the shell wrapper method db.collection.distinct() (page 812) for the
distinct (page 696) command. Additionally, many MongoDB drivers also provide a wrapper method. Refer to the
specific driver documentation.
group
Definition
group
Groups documents in a collection by the specified key and performs simple aggregation functions, such as
computing counts and sums. The command is analogous to a SELECT <...> GROUP BY statement in SQL.
The command returns a document with the grouped records as well as the command meta-data.
The group (page 697) command takes the following prototype form:
{ group: { ns: <namespace>,
key: <key>,
$reduce: <reduce function>,
$keyf: <key function>,
cond: <query>,
finalize: <finalize function> } }
The group (page 697) commands group field takes as its value a subdocument with the following fields:
field string ns The collection from which to perform the group by operation.
field document key The field or fields to group. Returns a key object for use as the grouping key.
field function reduce An aggregation function that operates on the documents during the grouping
operation. These functions may return a sum or a count. The function takes two arguments: the
current document and an aggregation result document for that group
field document initial Initializes the aggregation result document.
field function keyf Alternative to the key field. Specifies a function that creates a key object for
use as the grouping key. Use keyf instead of key to group by calculated fields rather than
existing document fields.
field document cond The selection criteria to determine which documents in the collection to process. If you omit the cond field, group (page 697) processes all the documents in the collection
for the group operation.
field function finalize A function that runs each item in the result set before group (page 697)
returns the final value. This function can either modify the result document or replace the result
11.1. MongoDB Interface
697
document as a whole. Keep in mind that, unlike the $keyf and $reduce fields that also
specify a function, this field name is finalize, not $finalize.
Warning:
The group (page 697) command does not work with sharded clusters. Use the aggregation framework or map-reduce in sharded environments.
The result set must fit within the maximum BSON document size (page 1015).
Additionally, in version 2.2, the returned array can contain at most 20,000 elements; i.e. at most
20,000 unique groupings. For group by operations that results in more than 20,000 unique groupings,
use mapReduce (page 701). Previous versions had a limit of 10,000 elements.
Prior to 2.4, the group (page 697) command took the mongod (page 925) instances JavaScript lock
which blocked all other JavaScript execution.
Available Functions
args
MaxKey
MinKey
assert()
BinData()
DBPointer()
DBRef()
doassert()
emit()
gc()
HexData()
hex_md5()
isNumber()
isObject()
ISODate()
isString()
Map()
MD5()
NumberInt()
NumberLong()
ObjectId()
print()
printjson()
printjsononeline()
sleep()
Timestamp()
tojson()
tojsononeline()
tojsonObject()
UUID()
version()
For the shell, MongoDB provides a wrapper method db.collection.group() (page 828). However, the
db.collection.group() (page 828) method takes the keyf field and the reduce field whereas the
group (page 697) command takes the $keyf field and the $reduce field.
JavaScript in MongoDB
Although group (page 697) uses JavaScript, most interactions with MongoDB do not use JavaScript but use
an idiomatic driver (page 95) in the language of the interacting application.
698
Examples The following are examples of the db.collection.group() (page 828) method. The examples
assume an orders collection with documents of the following prototype:
{
_id: ObjectId("5085a95c8fada716c89d0021"),
ord_dt: ISODate("2012-07-01T04:00:00Z"),
ship_dt: ISODate("2012-07-02T04:00:00Z"),
item: { sku: "abc123",
price: 1.99,
uom: "pcs",
qty: 25 }
}
Group by Two Fields The following example groups by the ord_dt and item.sku fields those documents that
have ord_dt greater than 01/01/2012:
db.runCommand( { group:
{
ns: 'orders',
key: { ord_dt: 1, 'item.sku': 1 },
cond: { ord_dt: { $gt: new Date( '01/01/2012' ) } },
$reduce: function ( curr, result ) { },
initial: { }
}
} )
The result is a documents that contain the retval field which contains the group by records, the count field which
contains the total number of documents grouped, the keys field which contains the number of unique groupings (i.e.
number of elements in the retval), and the ok field which contains the command status:
{ "retval" :
[ { "ord_dt"
{ "ord_dt"
{ "ord_dt"
{ "ord_dt"
{ "ord_dt"
{ "ord_dt"
{ "ord_dt"
{ "ord_dt"
{ "ord_dt"
{ "ord_dt"
{ "ord_dt"
],
"count" : 13,
"keys" : 11,
"ok" : 1 }
:
:
:
:
:
:
:
:
:
:
:
ISODate("2012-07-01T04:00:00Z"),
ISODate("2012-07-01T04:00:00Z"),
ISODate("2012-07-01T04:00:00Z"),
ISODate("2012-07-01T04:00:00Z"),
ISODate("2012-06-01T04:00:00Z"),
ISODate("2012-06-01T04:00:00Z"),
ISODate("2012-06-01T04:00:00Z"),
ISODate("2012-05-01T04:00:00Z"),
ISODate("2012-05-01T04:00:00Z"),
ISODate("2012-06-08T04:00:00Z"),
ISODate("2012-06-08T04:00:00Z"),
"item.sku"
"item.sku"
"item.sku"
"item.sku"
"item.sku"
"item.sku"
"item.sku"
"item.sku"
"item.sku"
"item.sku"
"item.sku"
:
:
:
:
:
:
:
:
:
:
:
"abc123"},
"abc456"},
"bcd123"},
"efg456"},
"abc123"},
"efg456"},
"ijk123"},
"abc123"},
"abc456"},
"abc123"},
"abc456"}
699
Calculate the Sum The following example groups by the ord_dt and item.sku fields those documents that
have ord_dt greater than 01/01/2012 and calculates the sum of the qty field for each grouping:
db.runCommand( { group:
{
ns: 'orders',
key: { ord_dt: 1, 'item.sku': 1 },
cond: { ord_dt: { $gt: new Date( '01/01/2012' ) } },
$reduce: function ( curr, result ) {
result.total += curr.item.qty;
},
initial: { total : 0 }
}
} )
The retval field of the returned document is an array of documents that contain the group by fields and the calculated
aggregation field:
{ "retval" :
[ { "ord_dt"
{ "ord_dt"
{ "ord_dt"
{ "ord_dt"
{ "ord_dt"
{ "ord_dt"
{ "ord_dt"
{ "ord_dt"
{ "ord_dt"
{ "ord_dt"
{ "ord_dt"
],
"count" : 13,
"keys" : 11,
"ok" : 1 }
:
:
:
:
:
:
:
:
:
:
:
ISODate("2012-07-01T04:00:00Z"),
ISODate("2012-07-01T04:00:00Z"),
ISODate("2012-07-01T04:00:00Z"),
ISODate("2012-07-01T04:00:00Z"),
ISODate("2012-06-01T04:00:00Z"),
ISODate("2012-06-01T04:00:00Z"),
ISODate("2012-06-01T04:00:00Z"),
ISODate("2012-05-01T04:00:00Z"),
ISODate("2012-05-01T04:00:00Z"),
ISODate("2012-06-08T04:00:00Z"),
ISODate("2012-06-08T04:00:00Z"),
"item.sku"
"item.sku"
"item.sku"
"item.sku"
"item.sku"
"item.sku"
"item.sku"
"item.sku"
"item.sku"
"item.sku"
"item.sku"
:
:
:
:
:
:
:
:
:
:
:
"abc123",
"abc456",
"bcd123",
"efg456",
"abc123",
"efg456",
"ijk123",
"abc123",
"abc456",
"abc123",
"abc456",
"total"
"total"
"total"
"total"
"total"
"total"
"total"
"total"
"total"
"total"
"total"
:
:
:
:
:
:
:
:
:
:
:
Calculate Sum, Count, and Average The following example groups by the calculated day_of_week field, those
documents that have ord_dt greater than 01/01/2012 and calculates the sum, count, and average of the qty field
for each grouping:
db.runCommand( { group:
{
ns: 'orders',
$keyf: function(doc) {
return { day_of_week: doc.ord_dt.getDay() } ; },
cond: { ord_dt: { $gt: new Date( '01/01/2012' ) } },
$reduce: function ( curr, result ) {
result.total += curr.item.qty;
result.count++;
},
initial: { total : 0, count: 0 },
finalize: function(result) {
700
25
25
10
10
25
15
20
45
25
25
25
},
},
},
},
},
},
},
},
},
},
}
The retval field of the returned document is an array of documents that contain the group by fields and the calculated
aggregation field:
{ "retval" :
[ { "day_of_week" : "Sunday", "total" : 70, "count" : 4, "avg" : 18 },
{ "day_of_week" : "Friday", "total" : 110, "count" : 6, "avg" : 18 },
{ "day_of_week" : "Tuesday", "total" : 70, "count" : 3, "avg" : 23 }
],
"count" : 13,
"keys" : 3,
"ok" : 1 }
See also:
Aggregation Concepts (page 279)
mapReduce
mapReduce
The mapReduce (page 701) command allows you to run map-reduce aggregation operations over a collection.
The mapReduce (page 701) command has the following prototype form:
db.runCommand(
{
mapReduce: <collection>,
map: <function>,
reduce: <function>,
out: <output>,
query: <document>,
sort: <document>,
limit: <number>,
finalize: <function>,
scope: <document>,
jsMode: <boolean>,
verbose: <boolean>
}
)
Pass the name of the collection to the mapReduce command (i.e. <collection>) to use as the source
documents to perform the map reduce operation. The command also accepts the following parameters:
field collection mapReduce The name of the collection on which you want to perform map-reduce.
field Javascript function map A JavaScript function that associates or maps a value with a key
and emits the key and value pair. See Requirements for the map Function (page 703) for more
information.
701
field JavaScript function reduce A JavaScript function that reduces to a single object all the
values associated with a particular key. See Requirements for the reduce Function (page 704)
for more information.
field string out Specifies the location of the result of the map-reduce operation. You can output to a
collection, output to a collection with an action, or output inline. You may output to a collection
when performing map reduce operations on the primary members of the set; on secondary members you may only use the inline output. See out Options (page 704) for more information.
field document query Specifies the selection criteria using query operators (page 621) for determining the documents input to the map function.
field document sort Sorts the input documents. This option is useful for optimization. For example,
specify the sort key to be the same as the emit key so that there are fewer reduce operations. The
sort key must be in an existing index for this collection.
field number limit Specifies a maximum number of documents to return from the collection.
field Javascript function finalize Follows the reduce method and modifies the output. See Requirements for the finalize Function (page 705) for more information.
field document scope Specifies global variables that are accessible in the map, reduce and
finalize functions.
field Boolean jsMode Optional. Specifies whether to convert intermediate data into BSON format
between the execution of the map and reduce functions. Defaults to false. If false: - Internally, MongoDB converts the JavaScript objects emitted by the map function to BSON objects.
These BSON objects are then converted back to JavaScript objects when calling the reduce
function. - The map-reduce operation places the intermediate BSON objects in temporary, ondisk storage. This allows the map-reduce operation to execute over arbitrarily large data sets.
If true: - Internally, the JavaScript objects emitted during map function remain as JavaScript
objects. There is no need to convert the objects for the reduce function, which can result in
faster execution. - You can only use jsMode for result sets with fewer than 500,000 distinct
key arguments to the mappers emit() function. The jsMode defaults to false.
field Boolean verbose Specifies whether to include the timing information in the result information. The verbose defaults to true to include the timing information.
The following is a prototype usage of the mapReduce (page 701) command:
var mapFunction = function() { ... };
var reduceFunction = function(key, values) { ... };
db.runCommand(
{
mapReduce: 'orders',
map: mapFunction,
reduce: reduceFunction,
out: { merge: 'map_reduce_results', db: 'test' },
query: { ord_date: { $gt: new Date('01/01/2012') } }
}
)
JavaScript in MongoDB
Although mapReduce (page 701) uses JavaScript, most interactions with MongoDB do not use JavaScript but
use an idiomatic driver (page 95) in the language of the interacting application.
Note: Changed in version 2.4.
702
In MongoDB 2.4, map-reduce operations (page 701), the group (page 697) command, and $where
(page 634) operator expressions cannot access certain global functions or properties, such as db, that are available in
the mongo (page 942) shell.
When upgrading to MongoDB 2.4, you will need to refactor your code if your map-reduce operations
(page 701), group (page 697) commands, or $where (page 634) operator expressions include any global shell
functions or properties that are no longer available, such as db.
The following JavaScript functions and properties are available to map-reduce operations (page 701), the
group (page 697) command, and $where (page 634) operator expressions in MongoDB 2.4:
Available Properties
Available Functions
args
MaxKey
MinKey
assert()
BinData()
DBPointer()
DBRef()
doassert()
emit()
gc()
HexData()
hex_md5()
isNumber()
isObject()
ISODate()
isString()
Map()
MD5()
NumberInt()
NumberLong()
ObjectId()
print()
printjson()
printjsononeline()
sleep()
Timestamp()
tojson()
tojsononeline()
tojsonObject()
UUID()
version()
Requirements for the map Function The map function has the following prototype:
function() {
...
emit(key, value);
}
703
function() {
if (this.status == 'A')
emit(this.cust_id, 1);
}
The following map function may call emit(key,value) multiple times depending on the number of
elements in the input documents items field:
function() {
this.items.forEach(function(item){ emit(item.sku, 1); });
}
The map function can access the variables defined in the scope parameter.
Requirements for the reduce Function The reduce function has the following prototype:
function(key, values) {
...
return result;
}
the reduce function must be idempotent. Ensure that the following statement is true:
reduce( key, [ reduce(key, valuesArray) ] ) == reduce( key, valuesArray )
the order of the elements in the valuesArray should not affect the output of the reduce function, so that
the following statement is true:
reduce( key, [ A, B ] ) == reduce( key, [ B, A ] )
out Options You can specify the following options for the out parameter:
Output to a Collection
out: <collectionName>
704
Output to a Collection with an Action This option is only available when passing out a collection that already
exists. This option is not available on secondary members of replica sets.
out: { <action>: <collectionName>
[, db: <dbName>]
[, sharded: <boolean> ]
[, nonAtomic: <boolean> ] }
When you output to a collection with an action, the out has the following parameters:
<action>: Specify one of the following actions:
replace
Replace the contents of the <collectionName> if the collection with the <collectionName> exists.
merge
Merge the new result with the existing result if the output collection already exists. If an existing document
has the same key as the new result, overwrite that existing document.
reduce
Merge the new result with the existing result if the output collection already exists. If an existing document
has the same key as the new result, apply the reduce function to both the new and the existing documents
and overwrite the existing document with the result.
db:
Optional.The name of the database that you want the map-reduce operation to write its output. By default
this will be the same database as the input collection.
sharded:
Optional. If true and you have enabled sharding on output database, the map-reduce operation will
shard the output collection using the _id field as the shard key.
nonAtomic:
New in version 2.2.
Optional. Specify output operation as non-atomic and is valid only for merge and reduce output modes
which may take minutes to execute.
If nonAtomic is true, the post-processing step will prevent MongoDB from locking the database; however,
other clients will be able to read intermediate states of the output collection. Otherwise the map reduce operation
must lock the database during post-processing.
Output Inline Perform the map-reduce operation in memory and return the result. This option is the only available
option for out on secondary members of replica sets.
out: { inline: 1 }
The result must fit within the maximum size of a BSON document (page 1015).
Requirements for the finalize Function The finalize function has the following prototype:
function(key, reducedValue) {
...
return modifiedObject;
}
705
The finalize function receives as its arguments a key value and the reducedValue from the reduce function.
Be aware that:
The finalize function should not access the database for any reason.
The finalize function should be pure, or have no impact outside of the function (i.e. side effects.)
The finalize function can access the variables defined in the scope parameter.
Examples In the mongo (page 942) shell, the db.collection.mapReduce() (page 837) method is a wrapper
around the mapReduce (page 701) command. The following examples use the db.collection.mapReduce()
(page 837) method:
Consider the following map-reduce operations on a collection orders that contains documents of the following
prototype:
{
_id: ObjectId("50a8240b927d5d8b5891743c"),
cust_id: "abc123",
ord_date: new Date("Oct 04, 2012"),
status: 'A',
price: 25,
items: [ { sku: "mmm", qty: 5, price: 2.5 },
{ sku: "nnn", qty: 5, price: 2.5 } ]
}
Return the Total Price Per Customer Perform the map-reduce operation on the orders collection to group by
the cust_id, and calculate the sum of the price for each cust_id:
2. Define the corresponding reduce function with two arguments keyCustId and valuesPrices:
The valuesPrices is an array whose elements are the price values emitted by the map function and
grouped by keyCustId.
The function reduces the valuesPrice array to the sum of its elements.
var reduceFunction1 = function(keyCustId, valuesPrices) {
return Array.sum(valuesPrices);
};
3. Perform the map-reduce on all documents in the orders collection using the mapFunction1 map function
and the reduceFunction1 reduce function.
db.orders.mapReduce(
mapFunction1,
reduceFunction1,
{ out: "map_reduce_example" }
)
706
2. Define the corresponding reduce function with two arguments keySKU and countObjVals:
countObjVals is an array whose elements are the objects mapped to the grouped keySKU values
passed by map function to the reducer function.
The function reduces the countObjVals array to a single object reducedValue that contains the
count and the qty fields.
In reducedVal, the count field contains the sum of the count fields from the individual array elements, and the qty field contains the sum of the qty fields from the individual array elements.
var reduceFunction2 = function(keySKU, countObjVals) {
reducedVal = { count: 0, qty: 0 };
for (var idx = 0; idx < countObjVals.length; idx++) {
reducedVal.count += countObjVals[idx].count;
reducedVal.qty += countObjVals[idx].qty;
}
return reducedVal;
};
3. Define a finalize function with two arguments key and reducedVal. The function modifies the
reducedVal object to add a computed field named avg and returns the modified object:
var finalizeFunction2 = function (key, reducedVal) {
reducedVal.avg = reducedVal.qty/reducedVal.count;
return reducedVal;
};
707
using
the
mapFunction2,
db.orders.mapReduce( mapFunction2,
reduceFunction2,
{
out: { merge: "map_reduce_example" },
query: { ord_date:
{ $gt: new Date('01/01/2012') }
},
finalize: finalizeFunction2
}
)
This operation uses the query field to select only those documents with ord_date greater than new
Date(01/01/2012). Then it output the results to a collection map_reduce_example. If the
map_reduce_example collection already exists, the operation will merge the existing contents with the
results of this map-reduce operation.
For more information and examples, see the Map-Reduce (page 282) page and Perform Incremental Map-Reduce
(page 300).
See also:
Troubleshoot the Map Function (page 302)
Troubleshoot the Reduce Function (page 303)
db.collection.mapReduce() (page 837)
Aggregation Concepts (page 279)
For a detailed comparison of the different approaches, see Aggregation Commands Comparison (page 306).
Geospatial Commands
Geospatial Commands
Name
geoNear (page 708)
geoSearch (page 709)
geoWalk (page 710)
Description
Performs a geospatial query that returns the documents closest to a given point.
Performs a geospatial query that uses MongoDBs haystack index functionality.
An internal command to support geospatial queries.
geoNear
Definition
geoNear
Specifies a point for which a geospatial query returns the closest documents first. The query returns the documents from nearest to farthest. The geoNear (page 708) command provides an alternative to the $near
(page 637) operator. In addition to the functionality of $near (page 637), geoNear (page 708) returns additional diagnostic information.
The geoNear (page 708) command can use either a GeoJSON point or legacy coordinate pairs. Queries that
use a 2d index return a limit of 100 documents.
The geoNear (page 708) command accepts a document that contains the following fields. Specify all distances
in the same units as the document coordinate system:
field string geoNear The collection to query.
708
geoSearch
geoSearch
The geoSearch (page 709) command provides an interface to MongoDBs haystack index functionality. These
indexes are useful for returning results based on location coordinates after collecting results based on some other
query (i.e. a haystack.) Consider the following example:
{ geoSearch : "places", near : [33, 33], maxDistance : 6, search : { type : "restaurant" }, limi
The above command returns all documents with a type of restaurant having a maximum distance of 6
units from the coordinates [30,33] in the collection places up to a maximum of 30 results.
Unless specified otherwise, the geoSearch (page 709) command limits results to 50 documents.
709
geoWalk
geoWalk
geoWalk (page 710) is an internal command.
Query and Write Operation Commands
Name
findAndModify
(page 710)
text (page 715)
getLastError
(page 720)
getPrevError
(page 721)
resetError (page 722)
eval (page 722)
Description
Returns and modifies a single document.
Performs a text search.
Returns the success status of the last operation.
Returns status document containing all errors since the last
(page 722) command.
Resets the last error status.
Runs a JavaScript function on the database server.
findAndModify
findAndModify
The findAndModify (page 710) command atomically modifies and returns a single document. By default,
the returned document does not include the modifications made on the update. To return the document with the
modifications made on the update, use the new option.
The command has the following syntax:
{
findAndModify: <string>,
query: <document>,
sort: <document>,
remove: <boolean>,
update: <document>,
new: <boolean>,
fields: <document>,
upsert: <boolean>
}
710
update (document) Must specify either the remove or the update field in the
findAndModify (page 710) command. The update field employs the same update
operators (page 651) or field: value specifications to modify the selected document.
new (boolean) Optional. When true, returns the modified document rather than the
original. The findAndModify (page 710) method ignores the new option for remove
operations. The default is false.
fields (document) Optional. A subset of fields to return. The fields document specifies
an inclusion of a field with 1, as in the following:
fields: { <field1>: 1, <field2>: 1, ... }
true.
for an upsert operation that performs an insert, when new is false, and includes a sort option, the
return document has lastErrorObject, value, and ok fields, but the value field holds an empty
document {}.
for an upsert that performs an insert, when new is false without a specified sort the return document has lastErrorObject, value, and ok fields, but the value field holds a null.
11.1. MongoDB Interface
711
Changed in version 2.2: Previously, the command returned an empty document (e.g. {}) in the value
field. See the 2.2 release notes (page 1053) for more information.
Consider the following examples:
The following command updates an existing document in the people collection where the document
matches the query criteria:
db.runCommand(
{
findAndModify: "people",
query: { name: "Tom", state: "active", rating: { $gt: 10 } },
sort: { rating: 1 },
update: { $inc: { score: 1 } }
}
)
To return the modified document in the value field, add the new:true option to the command.
If no document match the query condition, the command returns a document that contains null in
the value field:
712
The mongo (page 942) shell and many drivers provide a findAndModify() (page 821) helper method.
Using the shell helper, this previous operation can take the following form:
db.people.findAndModify( {
query: { name: "Tom", state: "active", rating: { $gt: 10 } },
sort: { rating: 1 },
update: { $inc: { score: 1 } }
} );
However, the findAndModify() (page 821) shell helper method returns just the unmodified document,
or the modified document when new is true.
{
"_id" : ObjectId("50f1d54e9beb36a0f45c6452"),
"name" : "Tom",
"state" : "active",
"rating" : 100,
"score" : 5
}
db.runCommand(
{
findAndModify: "people",
query: { name: "Gus", state: "active", rating: 100 },
sort: { rating: 1 },
update: { $inc: { score: 1 } },
upsert: true
}
)
If the command does not find a matching document, the command performs an upsert and returns a document with the following fields:
The lastErrorObject field that contains the details of the command, including the field
upserted that contains the ObjectId of the newly inserted document, and
The value field that contains an empty document {} as the original document because the command
included the sort option:
{
"lastErrorObject" : {
"updatedExisting" : false,
"upserted" : ObjectId("50f2329d0092b46dae1dc98e"),
"n" : 1,
"connectionId" : 1,
"err" : null,
"ok" : 1
},
"value" : {
},
"ok" : 1
}
If the command did not include the sort option, the value field would contain null:
11.1. MongoDB Interface
713
{
"value" : null,
"lastErrorObject" : {
"updatedExisting" : false,
"n" : 1,
"upserted" : ObjectId("5102f7540cb5c8be998c2e99")
},
"ok" : 1
}
The following findAndModify (page 710) command includes both upsert: true option and the
new:true option to return the newly inserted document in the value field if a document matching the
query is not found:
db.runCommand(
{
findAndModify: "people",
query: { name: "Pascal", state: "active", rating: 25 },
sort: { rating: 1 },
update: { $inc: { score: 1 } },
upsert: true,
new: true
}
)
The command returns the newly inserted document in the value field:
{
"lastErrorObject" : {
"updatedExisting" : false,
"upserted" : ObjectId("50f47909444c11ac2448a5ce"),
"n" : 1,
"connectionId" : 1,
"err" : null,
"ok" : 1
},
"value" : {
"_id" : ObjectId("50f47909444c11ac2448a5ce"),
"name" : "Pascal",
"rating" : 25,
"score" : 1,
"state" : "active"
},
"ok" : 1
}
When the findAndModify (page 710) command includes the upsert: true option and the query
field(s) is not uniquely indexed, the method could insert a document multiple times in certain circumstances. For
instance, if multiple clients issue the findAndModify (page 710) command and these commands complete
the find phase before any one starts the modify phase, these commands could insert the same document.
Consider an example where no document with the name Andy exists and multiple clients issue the following
command:
db.runCommand(
{
findAndModify: "people",
query: { name: "Andy" },
sort: { rating: 1 },
714
If all the commands finish the query phase before any command starts the modify phase, and there is no
unique index on the name field, the commands may all perform an upsert. To prevent this condition, create a
unique index (page 334) on the name field. With the unique index in place, then the multiple findAndModify
(page 710) commands would observe one of the following behaviors:
Exactly one findAndModify (page 710) would successfully insert a new document.
Zero or more findAndModify (page 710) commands would update the newly inserted document.
Zero or more findAndModify (page 710) commands would fail when they attempted to insert a duplicate. If the command fails due to a unique index constraint violation, you can retry the command. Absent
a delete of the document, the retry should not fail.
Warning: When using findAndModify (page 710) in a sharded environment, the query must contain
the shard key for all operations against the shard cluster. findAndModify (page 710) operations issued
against mongos (page 938) instances for non-sharded collections function normally.
Note: This command obtains a write lock on the affected database and will block other operations until it has
completed; however, typically the write lock is short lived and equivalent to other similar update() (page 849)
operations.
text
Definition
text
New in version 2.4.
Searches text content stored in the text index (page 332). The text (page 715) command is case-insensitive.
The text (page 715) command returns all documents that contain any of the terms; i.e. it performs a logical
OR search. By default, the command limits the matches to the top 100 scoring documents, in descending score
order, but you can specify a different limit.
The text (page 715) command has the following syntax:
db.collection.runCommand( "text", { search: <string>,
filter: <document>,
project: <document>,
limit: <number>,
language: <string> } )
715
descending index field as a prefix, the filter is required and the filter query must be an
equality match.
field document project Limits the fields returned by the query to only those specified. By default,
the _id field returns as part of the result set, unless you explicitly exclude the field in the project
document.
field number limit The maximum number of documents to include in the response. The text
(page 715) command sorts the results before applying the limit. The default limit is 100.
field string language The language that determines the list of stop words for the search and the
rules for the stemmer and tokenizer. If not specified, the search uses the default language of the
index. For supported languages, see Text Search Languages (page 719). Specify the language in
lowercase.
Returns The text (page 715) command returns a document that contains a field results that
contains an array of the highest scoring documents, in descending order by score. See Output
(page 718) for details.
Warning: The complete results of the text (page 715) command must fit within the BSON Document
Size (page 1015). Otherwise, the command will limit the results to fit within the BSON Document Size
(page 1015). Use the limit and the project parameters with the text (page 715) command to limit the
size of the result set.
Note:
If the search string includes phrases, the search performs an AND with any other terms in the
search string; e.g. search for "\"twinkle twinkle\" little star" searches for "twinkle
twinkle" and ("little" or "star").
text (page 715) adds all negations to the query with the logical AND operator.
The text (page 715) command ignores stop words for the search language, such as the and and in
English.
The text (page 715) command matches on the complete stemmed word. So if a document field contains the word blueberry, a search on the term blue will not match. However, blueberry or
blueberries will match.
Note: You cannot combine the text (page 715) command, which requires a special text index (page 332),
with a query operator that requires a different type of special index. For example you cannot combine text
(page 715) with the $near (page 637) operator.
The search Field The search field takes a string of terms that MongoDB parses and uses to query the text
index. Enclose the string of terms in escaped double quotes to match on the phrase. Additionally, the text (page 715)
command treats most punctuation as delimiters, except when a hyphen - negates terms.
Prefixing a word with a hyphen sign (-) negates a word:
The negated word excludes documents that contain the negated word from the result set.
A search string that only contains negated words returns no match.
A hyphenated word, such as pre-market, is not a negation. The text command treats the hyphen as a delimiter.
716
Examples The following examples assume a collection articles that has a text index on the field subject:
db.articles.ensureIndex( { subject: "text" } )
This query returns documents that contain the word coffee, case-insensitive, in the indexed subject field.
Search for Multiple Words The following command searches for bake or coffee or cake:
db.articles.runCommand( "text", { search: "bake coffee cake" } )
This query returns documents that contain either bake or coffee or cake in the indexed subject field.
Search for a Phrase
db.articles.runCommand( "text", { search: "\"bake coffee cake\"" } )
This query returns documents that contain the phrase bake coffee cake.
Exclude a Term from the Result Set Use the hyphen (-) as a prefix to exclude documents that contain a term.
Search for documents that contain the words bake or coffee but do not contain cake:
db.articles.runCommand( "text", { search: "bake coffee -cake" } )
Search with Additional Query Conditions Use the filter option to include additional query conditions.
Search for a single word coffee with an additional filter on the about field, but limit the results to 2 documents
with the highest score and return only the subject field in the matching documents:
db.articles.runCommand( "text", {
search: "coffee",
filter: { about: /desserts/ },
limit: 2,
project: { subject: 1, _id: 0 }
}
)
The filter query document may use any of the available query operators (page 621).
Because the _id field is implicitly included, in order to return only the subject field, you must explicitly
exclude (0) the _id field. Within the project document, you cannot mix inclusions (i.e. <fieldA>: 1)
and exclusions (i.e. <fieldB>: 0), except for the _id field.
Search a Different Language Use the language option to specify Spanish as the language that determines the
list of stop words and the rules for the stemmer and tokenizer:
db.articles.runCommand( "text", {
search: "leche",
language: "spanish"
}
)
717
See Text Search Languages (page 719) for the supported languages.
Important: Specify the language in lowercase.
Output The following is an example document returned by the text (page 715) command:
{
"queryDebugString" : "tomorrow||||||",
"language" : "english",
"results" : [
{
"score" : 1.3125,
"obj": {
"_id" : ObjectId("50ecef5f8abea0fda30ceab3"),
"quote" : "tomorrow, and tomorrow, and tomorrow, creeps in this petty pace",
"related_quotes" : [
"is this a dagger which I see before me",
"the handle toward my hand?"
],
"src" : {
"title" : "Macbeth",
"from" : "Act V, Scene V"
},
"speaker" : "macbeth"
}
}
],
"stats" : {
"nscanned" : 1,
"nscannedObjects" : 0,
"n" : 1,
"nfound" : 1,
"timeMicros" : 163
},
"ok" : 1
}
718
text.stats
The stats (page 718) field returns a document that contains the query execution statistics. The stats
(page 718) field contains:
text.stats.nscanned
The nscanned (page 719) field returns the total number of index entries scanned.
text.stats.nscannedObjects
The nscannedObjects (page 719) field returns the total number of documents scanned.
text.stats.n
The n (page 719) field returns the number of elements in the results (page 718) array. This number may
be less than the total number of matching documents, i.e. nfound (page 719), if the full result exceeds
the BSON Document Size (page 1015).
text.stats.nfound
The nfound (page 719) field returns the total number of documents that match. This number may be
greater than the size of the results (page 718) array, i.e. n (page 719), if the result set exceeds the
BSON Document Size (page 1015).
text.stats.timeMicros
The timeMicros (page 719) field returns the time in microseconds for the search.
text.ok
The ok (page 719) returns the status of the text (page 715) command.
Text Search Languages
guages:
The text index (page 332) and the text (page 715) command support the following lan-
danish
dutch
english
finnish
french
german
hungarian
italian
norwegian
portuguese
romanian
russian
spanish
swedish
turkish
Note: If you specify a language value of "none", then the text search has no list of stop words, and the text search
does not stem or tokenize the search terms.
getLastError
11.1. MongoDB Interface
719
Definition
getLastError
Returns the error status of the preceding operation on the current connection. Clients typically use
getLastError (page 720) in combination with write operations to ensure that the write succeeds.
field Boolean j If true, wait for the next journal commit before returning, rather than waiting for
a full disk flush. If mongod (page 925) does not have journaling enabled, this option has no
effect. If this option is enabled for a write operation, mongod (page 925) will wait no more than
1/3 of the current journalCommitInterval (page 995) before writing data to the journal.
field integer,string w When running with replication, this is the number of servers to replicate to
before returning. A w value of 1 indicates the primary only. A w value of 2 includes the primary
and at least one secondary, etc. In place of a number, you may also set w to majority to
indicate that the command should wait until the latest write propagates to a majority of replica
set members. If using w, you should also use wtimeout. Specifying a value for w without also
providing a wtimeout may cause getLastError (page 720) to block indefinitely.
field Boolean fsync If true, wait for mongod (page 925) to write this data to disk before returning.
Defaults to false. In most cases, use the j option to ensure durability and consistency of the data
set.
field integer wtimeout Milliseconds. Specify a value in milliseconds to control how long to wait
for write propagation to complete. If replication does not complete in the given timeframe, the
getLastError (page 720) command will return with an error status.
The following is the prototype form:
{ getLastError: 1 }
See also:
Write Concern (page 55), Write Concern Reference (page 96), and Replica Acknowledged (page 57).
Output Each getLastError() command returns a document containing a subset of the fields listed below.
getLastError.ok
getLastError.ok (page 720) is true when the getLastError (page 720) command completes successfully.
Note: A value of true does not indicate that the preceding operation did not produce an error.
getLastError.err
getLastError.err (page 720) is null unless an error occurs. When there was an error with the proceeding
operation, err contains a textual description of the error.
getLastError.code
getLastError.code (page 720) reports the preceding operations error code.
getLastError.connectionId
The identifier of the connection.
getLastError.lastOp
When issued against a replica set member and the preceding operation was a write or update,
getLastError.lastOp (page 720) is the optime timestamp in the oplog of the change.
getLastError.n
getLastErrr.n reports the number of documents updated or removed, if the preceding operation was an
update or remove operation.
720
getLastError.updateExisting
getLastError.updateExisting (page 720) is true when an update affects at least one document and
does not result in an upsert.
getLastError.upserted
getLastError.upserted (page 721) is an ObjectId that corresponds to the upserted document if the
update resulted in an insert.
getLastError.wnote
If set, wnote indicates that the preceding operations error relates to using the w parameter to getLastError
(page 720).
See
Write Concern Reference (page 96) for more information about w values.
getLastError.wtimeout
getLastError.wtimeout (page 721) is true if the getLastError (page 720) timed out because of
the wtimeout setting to getLastError (page 720).
getLastError.waited
If the preceding operation specified a timeout using the wtimeout setting to getLastError (page 720),
then getLastError.waited (page 721) reports the number of milliseconds getLastError (page 720)
waited before timing out.
getLastError.wtime
getLastError.wtime (page 721) is the number of milliseconds spent waiting for the preceding operation to complete. If getLastError (page 720) timed out, getLastError.wtime (page 721) and
getLastError.waited are equal.
Examples
Confirm Replication to Two Replica Set Members The following example ensures the operation has replicated to
two members (the primary and one other member):
db.runCommand( { getLastError: 1, w: 2 } )
Confirm Replication to a Majority of a Replica Set The following example ensures the write operation has replicated to a majority of the configured members of the set.
db.runCommand( { getLastError: 1, w: "majority" } )
Set a Timeout for a getLastError Response Unless you specify a timeout, a getLastError (page 720)
command may block forever if MongoDB cannot satisfy the requested write concern. To specify a timeout of 5000
milliseconds, use an invocation that resembles the following:
db.runCommand( { getLastError: 1, w: 2, wtimeout:5000 } )
When wtimeout is 0, the getLastError (page 720) operation will never time out.
getPrevError
721
getPrevError
The getPrevError (page 721) command returns the errors since the last resetError (page 722) command.
See also:
db.getPrevError() (page 887)
resetError
resetError
The resetError (page 722) command resets the last error status.
See also:
db.resetError() (page 893)
eval
eval
The eval (page 722) command evaluates JavaScript functions on the database server and has the following
form:
{
eval: <function>,
args: [ <arg1>, <arg2> ... ],
nolock: <boolean>
}
722
db.myCollection.save( doc );
return doc;
},
args: [ "eliot", 5 ]
}
);
If you want to use the servers interpreter, you must run eval (page 722). Otherwise, the mongo (page 942)
shells JavaScript interpreter evaluates functions entered directly into the shell.
If an error occurs, eval (page 722) throws an exception. The following invalid function uses the variable x
without declaring it as an argument:
db.runCommand(
{
eval: function() { return x + x; },
args: [ 3 ]
}
)
723
Warning:
By default, eval (page 722) takes a global write lock before evaluating the JavaScript function. As
a result, eval (page 722) blocks all other read and write operations to the database while the eval
(page 722) operation runs. Set nolock to true on the eval (page 722) command to prevent
the eval (page 722) command from taking the global write lock before evaluating the JavaScript.
nolock does not impact whether operations within the JavaScript code itself takes a write lock.
Do not use eval (page 722) for long running operations as eval (page 722) blocks all other operations. Consider using other server side code execution options (page 198).
You can not use eval (page 722) with sharded data. In general, you should avoid using eval
(page 722) in sharded cluster; nevertheless, it is possible to use eval (page 722) with non-sharded
collections and databases stored in a sharded cluster.
With authentication (page 993) enabled, eval (page 722) will fail during the operation if you
do not have the permission to perform a specified task.
Changed in version 2.4: You must have full admin access to run.
Changed in version 2.4: The V8 JavaScript engine, which became the default in 2.4, allows multiple JavaScript
operations to execute at the same time. Prior to 2.4, eval (page 722) executed in a single thread.
See also:
Server-side JavaScript (page 198)
Database Operations
Authentication Commands
Authentication Commands
Name
logout (page 724)
authenticate
(page 725)
copydbgetnonce
(page 725)
getnonce (page 725)
Description
Terminates the current authenticated session.
Starts an authenticated session using a username and password.
logout
logout
The logout (page 724) command terminates the current authenticated session:
{ logout: 1 }
Note: If youre not logged in and using authentication, logout (page 724) has no effect.
Changed in version 2.4: Because MongoDB now allows users defined in one database to have privileges on another database, you must call logout (page 724) while using the same database context that you authenticated
to.
If you authenticated to a database such as users or $external, you must issue logout (page 724) against
this database in order to successfully log out.
Example
Use the use <database-name> helper in the interactive mongo (page 942) shell, or the following
db.getSiblingDB() (page 888) in the interactive shell or in mongo (page 942) shell scripts to change
the db object:
724
db = db.getSiblingDB('<database-name>')
When you have set the database context and db object, you can use the logout (page 724) to log out of
database as in the following operation:
db.runCommand( { logout: 1 } )
authenticate
authenticate
Clients use authenticate (page 725) to authenticate a connection.
db.auth() (page 876) helper as follows:
See
db.auth() (page 876) and Security Concepts (page 237) for more information.
copydbgetnonce
copydbgetnonce
Client libraries use copydbgetnonce (page 725) to get a one-time password for use with the copydb
(page 745) command.
Note: This command obtains a write lock on the affected database and will block other operations until it has
completed; however, the write lock for this operation is short lived.
getnonce
getnonce
Client libraries use getnonce (page 725) to generate a one-time password for authentication.
Replication Commands
725
Name
replSetFreeze (page 726)
Replication Commands
replSetGetStatus
(page 726)
replSetInitiate
(page 728)
replSetMaintenance
(page 729)
replSetReconfig
(page 729)
replSetStepDown
(page 730)
replSetSyncFrom
(page 730)
resync (page 731)
applyOps (page 732)
isMaster (page 732)
getoptime (page 734)
Description
Prevents the current member from seeking election as primary for a perio
time.
Returns a document that reports on the status of the replica set.
Initializes a new replica set.
Forces the current primary to step down and become a secondary, forcin
election.
Explicitly override the default logic for selecting a member to replicate f
replSetFreeze
replSetFreeze
The replSetFreeze (page 726) command prevents a replica set member from seeking election for the specified number of seconds. Use this command in conjunction with the replSetStepDown (page 730) command
to make a different node in the replica set a primary.
The replSetFreeze (page 726) command uses the following syntax:
{ replSetFreeze: <seconds> }
If you want to unfreeze a replica set member before the specified number of seconds has elapsed, you can issue
the command with a seconds value of 0:
{ replSetFreeze: 0 }
Restarting the mongod (page 925) process also unfreezes a replica set member.
replSetFreeze (page 726) is an administrative command, and you must issue it against the admin database.
replSetGetStatus
Definition
replSetGetStatus
The replSetGetStatus command returns the status of the replica set from the point of view of the current
server. You must run the command against the admin database. The command has the following prototype
format:
{ replSetGetStatus: 1 }
The value specified does not affect the output of the command. Data provided by this command derives from
data included in heartbeats sent to the current instance by other members of the replica set. Because of the
frequency of heartbeats, these data can be several seconds out of date.
726
You can also access this functionality through the rs.status() (page 898) helper in the mongo (page 942)
shell.
The mongod (page 925) must have replication enabled and be a member of a replica set for the for
replSetGetStatus (page 726) to return successfully.
Output
replSetGetStatus.set
The set value is the name of the replica set, configured in the replSet (page 1000) setting. This is the same
value as _id (page 480) in rs.conf() (page 896).
replSetGetStatus.date
The value of the date field is an ISODate of the current time, according to the current server. Compare this to
the value of the lastHeartbeat (page 728) to find the operational lag between the current host and the other
hosts in the set.
replSetGetStatus.myState
The value of myState (page 727) is an integer between 0 and 10 that represents the replica state (page 487)
of the current member.
replSetGetStatus.members
The members field holds an array that contains a document for every member in the replica set.
replSetGetStatus.members.name
The name field holds the name of the server.
replSetGetStatus.members.self
The self field is only included in the document for the current mongod instance in the members array.
Its value is true.
replSetGetStatus.members.errmsg
This field contains the most recent error or status message received from the member. This field may be
empty (e.g. "") in some cases.
replSetGetStatus.members.health
The health value is only present for the other members of the replica set (i.e. not the member that returns
rs.status (page 898).) This field conveys if the member is up (i.e. 1) or down (i.e. 0.)
replSetGetStatus.members.state
The value of state (page 727) is an array of documents, each containing an integer between 0 and 10
that represents the replica state (page 487) of the corresponding member.
replSetGetStatus.members.stateStr
A string that describes state (page 727).
replSetGetStatus.members.uptime
The uptime (page 727) field holds a value that reflects the number of seconds that this member has been
online.
This value does not appear for the member that returns the rs.status() (page 898) data.
replSetGetStatus.members.optime
A document that contains information regarding the last operation from the operation log that this member
has applied.
replSetGetStatus.members.optime.t
A 32-bit timestamp of the last operation applied to this member of the replica set from the oplog.
replSetGetStatus.members.optime.i
An incremented field, which reflects the number of operations in since the last time stamp. This value
only increases if there is more than one operation per second.
727
replSetGetStatus.members.optimeDate
An ISODate formatted date string that reflects the last entry from the oplog that this member applied. If this
differs significantly from lastHeartbeat (page 728) this member is either experiencing replication
lag or there have not been any new operations since the last update. Compare members.optimeDate
between all of the members of the set.
replSetGetStatus.members.lastHeartbeat
The lastHeartbeat value provides an ISODate formatted date of the last heartbeat received from this
member. Compare this value to the value of the date (page 727) field to track latency between these
members.
This value does not appear for the member that returns the rs.status() (page 898) data.
replSetGetStatus.members.pingMS
The pingMS represents the number of milliseconds (ms) that a round-trip packet takes to travel between
the remote member and the local instance.
This value does not appear for the member that returns the rs.status() (page 898) data.
replSetGetStatus.syncingTo
The syncingTo field is only present on the output of rs.status() (page 898) on secondary and recovering
members, and holds the hostname of the member from which this instance is syncing.
replSetInitiate
replSetInitiate
The replSetInitiate (page 728) command initializes a new replica set. Use the following syntax:
{ replSetInitiate : <config_document> }
The <config_document> is a document that specifies the replica sets configuration. For instance, heres a
config document for creating a simple 3-member replica set:
{
_id : <setname>,
members : [
{_id : 0, host : <host0>},
{_id : 1, host : <host1>},
{_id : 2, host : <host2>},
]
}
A typical way of running this command is to assign the config document to a variable and then to pass the
document to the rs.initiate() (page 897) helper:
config = {
_id : "my_replica_set",
members : [
{_id : 0, host : "rs1.example.net:27017"},
{_id : 1, host : "rs2.example.net:27017"},
{_id : 2, host : "rs3.example.net", arbiterOnly: true},
]
}
rs.initiate(config)
Notice that omitting the port cause the host to use the default port of 27017. Notice also that you can specify
other options in the config documents such as the arbiterOnly setting in this example.
See also:
728
Replica Set Configuration (page 479), Replica Set Tutorials (page 419), and Replica Set Reconfiguration
(page 483).
replSetMaintenance
replSetMaintenance
The replSetMaintenance (page 729) admin command enables or disables the maintenance mode for a
secondary member of a replica set.
The command has the following prototype form:
{ replSetMaintenance: <boolean> }
Consider the following behavior when running the replSetMaintenance (page 729) command:
You cannot run the command on the Primary.
You must run the command against the admin database.
When enabled replSetMaintenance:
secondary is RECOVERING:
replSetReconfig
replSetReconfig
The replSetReconfig (page 729) command modifies the configuration of an existing replica set. You can
use this command to add and remove members, and to alter the options set on existing members. Use the
following syntax:
{ replSetReconfig: <new_config_document>, force: false }
You may also run the command using the shells rs.reconfig() (page 897) method.
Be aware of the following replSetReconfig (page 729) behaviors:
You must issue this command against the admin database of the current primary member of the replica
set.
You can optionally force the replica set to accept the new configuration by specifying force: true.
Use this option if the current member is not primary or if a majority of the members of the set are not
accessible.
Warning: Forcing the replSetReconfig (page 729) command can lead to a rollback situation.
Use with caution.
Use the force option to restore a replica set to new servers with different hostnames. This works even if
the set members already have a copy of the data.
6 https://github.com/mongodb/mongo-snippets/blob/master/js/compact-example.js
729
A majority of the sets members must be operational for the changes to propagate properly.
This command can cause downtime as the set renegotiates primary-status. Typically this is 10-20 seconds,
but could be as long as a minute or more. Therefore, you should attempt to reconfigure only during
scheduled maintenance periods.
In some cases, replSetReconfig (page 729) forces the current primary to step down, initiating an
election for primary among the members of the replica set. When this happens, the set will drop all current
connections.
Note: replSetReconfig (page 729) obtains a special mutually exclusive lock to prevent more than one
replSetReconfig (page 729) operation from occurring at the same time.
replSetStepDown
Description
replSetStepDown
Forces the primary of the replica set to become a secondary. This initiates an election for primary (page 397).
field number replSetStepDown A number of seconds for the member to avoid election to primary.
If you do not specify a value for <seconds>, replSetStepDown (page 730) will attempt
to avoid reelection to primary for 60 seconds.
field Boolean force New in version 2.0: Forces the primary to step down even if there are no secondary members within 10 seconds of the primarys latest optime.
Warning: replSetStepDown (page 730) forces all clients currently connected to the database to disconnect. This helps ensure that clients maintain an accurate view of the replica set.
New in version 2.0: If there is no secondary within 10 seconds of the primary, replSetStepDown (page 730)
will not succeed to prevent long running elections.
Example The following example specifies that the former primary avoids reelection to primary for 120 seconds:
db.runCommand( { replSetStepDown: 120 } )
replSetSyncFrom
Description
replSetSyncFrom
New in version 2.2.
Explicitly configures which host the current mongod (page 925) pulls oplog entries from. This operation is
useful for testing different patterns and in situations where a set member is not replicating from the desired host.
The replSetSyncFrom (page 730) command has the following form:
{ replSetSyncFrom: "hostname<:port>" }
730
You may also use the rs.syncFrom() (page 899) helper in the mongo (page 942) shell in an operation
with the following form:
rs.syncFrom("hostname<:port>")
Note: replSetSyncFrom (page 730) and rs.syncFrom() (page 899) provide a temporary override of default behavior. mongod (page 925) will revert to the default sync behavior in the following
situations:
The mongod (page 925) instance restarts.
The connection between the mongod (page 925) and the sync target closes.
Changed in version 2.4: The sync target falls more than 30 seconds behind another member of the replica
set; the mongod (page 925) will revert to the default sync target.
resync
resync
The resync (page 731) command forces an out-of-date slave mongod (page 925) instance to re-synchronize
itself. Note that this command is relevant to master-slave replication only. It does not apply to replica sets.
Warning: This command obtains a global write lock and will block other operations until it has completed.
applyOps
731
Definition
applyOps
Applies specified oplog entries to a mongod (page 925) instance. The applyOps (page 732) command is
primarily an internal command to support sharded clusters.
The applyOps (page 732) command takes a document with the following fields:
field array applyOps The oplog entries to apply.
field array preCondition An array of documents that contain the conditions that must be true in
order to apply the oplog entry. Each document contains a set of conditions, as described in the
next table.
The preCondition array takes one or more documents with the following fields:
field string ns A namespace. If you use this field, applyOps (page 732) applies oplog entries only
for the collection described by this namespace.
param string q Specifies the query that produces the results specified in the res field.
param string res The results of the query in the q field that must match to apply the oplog entry.
Example
The applyOps (page 732) command has the following prototype form:
Warning: This command obtains a global write lock and will block other operations until it has
completed.
isMaster
Definition
isMaster
isMaster (page 732) returns a document that describes the role of the mongod (page 925) instance.
If the instance is a member of a replica set, then isMaster (page 732) returns a subset of the replica set
configuration and status including whether or not the instance is the primary of the replica set.
When sent to a mongod (page 925) instance that is not a member of a replica set, isMaster (page 732) returns
a subset of this information.
MongoDB drivers and clients use isMaster (page 732) to determine the state of the replica set members and
to discover additional members of a replica set.
The db.isMaster() (page 890) method in the mongo (page 942) shell provides a wrapper around
isMaster (page 732).
The command takes the following form:
{ isMaster: 1 }
See also:
db.isMaster() (page 890)
Output
732
All Instances The following isMaster (page 732) fields are common across all roles:
isMaster.ismaster
A boolean value that reports when this node is writable. If true, then this instance is a primary in a replica
set, or a master in a master-slave configuration, or a mongos (page 938) instance, or a standalone mongod
(page 925).
This field will be false if the instance is a secondary member of a replica set or if the member is an arbiter of
a replica set.
isMaster.maxBsonObjectSize
The maximum permitted size of a BSON object in bytes for this mongod (page 925) process. If not provided,
clients should assume a max size of 4 * 1024 * 1024.
isMaster.maxMessageSizeBytes
New in version 2.4.
The maximum permitted size of a BSON wire protocol message. The default value is 48000000 bytes.
isMaster.localTime
New in version 2.2.
Returns the local server time in UTC. This value is an ISO date.
Sharded Instances mongos (page 938) instances add the following field to the isMaster (page 732) response
document:
isMaster.msg
Contains the value isdbgrid when isMaster (page 732) returns from a mongos (page 938) instance.
Replica Sets isMaster (page 732) contains these fields when returned by a member of a replica set:
isMaster.setName
The name of the current :replica set.
isMaster.secondary
A boolean value that, when true, indicates if the mongod (page 925) is a secondary member of a replica set.
isMaster.hosts
An array of strings in the format of "[hostname]:[port]" that lists all members of the replica set that are
neither hidden, passive, nor arbiters.
Drivers use this array and the isMaster.passives (page 733) to determine which members to read from.
isMaster.passives
An array of strings in the format of "[hostname]:[port]" listing all members of the replica set which
have a priority (page 481) of 0.
This field only appears if there is at least one member with a priority (page 481) of 0.
Drivers use this array and the isMaster.hosts (page 733) to determine which members to read from.
isMaster.arbiters
An array of strings in the format of "[hostname]:[port]" listing all members of the replica set that are
arbiters.
This field only appears if there is at least one arbiter in the replica set.
isMaster.primary
A string in the format of "[hostname]:[port]" listing the current primary member of the replica set.
733
isMaster.arbiterOnly
A boolean value that , when true, indicates that the current instance is an arbiter. The arbiterOnly
(page 733) field is only present, if the instance is an arbiter.
isMaster.passive
A boolean value that, when true, indicates that the current instance is hidden. The passive (page 734) field
is only present for hidden members.
isMaster.hidden
A boolean value that, when true, indicates that the current instance is hidden. The hidden (page 734) field
is only present for hidden members.
isMaster.tags
A document that lists any tags assigned to this member. This field is only present if there are tags assigned to
the member. See Configure Replica Set Tag Sets (page 451) for more information.
isMaster.me
The [hostname]:[port] of the member that returned isMaster (page 732).
getoptime
getoptime
getoptime (page 734) is an internal command.
See also:
Replication (page 377) for more information regarding replication.
Sharding Commands
734
Sharding Commands
Name
flushRouterConfig
(page 735)
addShard (page 736)
checkShardingIndex
(page 736)
enableSharding
(page 736)
listShards
(page 737)
removeShard
(page 737)
getShardMap
(page 737)
getShardVersion
(page 737)
setShardVersion
(page 737)
shardCollection
(page 738)
shardingState
(page 738)
unsetSharding
(page 739)
split (page 739)
splitChunk
(page 741)
splitVector
(page 742)
medianKey (page 742)
moveChunk (page 742)
movePrimary
(page 743)
isdbgrid (page 743)
Description
Forces an update to the cluster metadata cached by a mongos (page 938).
Adds a shard to a sharded cluster.
Internal command that validates index on shard key.
Enables sharding on a specific database.
Returns a list of configured shards.
Starts the process of removing a shard from a sharded cluster.
Internal command that reports on the state of a sharded cluster.
Internal command that returns the config server version.
Internal command to sets the config server version.
Enables the sharding functionality for a collection, allowing the collection to be
sharded.
Reports whether the mongod (page 925) is a member of a sharded cluster.
Internal command that affects connections between instances in a MongoDB
deployment.
Creates a new chunk.
Internal command to split chunk. Instead use the methods sh.splitFind()
(page 909) and sh.splitAt() (page 909).
Internal command that determines split points.
Deprecated internal command. See splitVector (page 742).
Internal command that migrates chunks between shards.
Reassigns the primary shard when removing a shard from a sharded cluster.
Verifies that a process is a mongos (page 938).
flushRouterConfig
flushRouterConfig
flushRouterConfig (page 735) clears the current cluster information cached by a mongos (page 938)
instance and reloads all sharded cluster metadata from the config database.
This forces an update when the configuration database holds data that is newer than the data cached in the
mongos (page 938) process.
Warning: Do not modify the config data, except as explicitly documented. A config database cannot
typically tolerate manual manipulation.
flushRouterConfig (page 735) is an administrative command that is only available for mongos
(page 938) instances.
New in version 1.8.2.
addShard
735
Definition
addShard
Adds either a database instance or a replica set to a sharded cluster. The optimal configuration is to deploy
shards across replica sets.
Run addShard (page 736) when connected to a mongos (page 938) instance. The command takes the following form when adding a single database instance as a shard:
{ addShard: "<hostname><:port>", maxSize: <size>, name: "<shard_name>" }
Warning: Do not use localhost for the hostname unless your configuration server is also running on
localhost.
The following command adds a replica set as a shard:
db.runCommand( { addShard: "repl0/mongodb3.example.net:27327"} )
You may specify all members in the replica set. All additional hostnames must be members of the same replica set.
checkShardingIndex
checkShardingIndex
checkShardingIndex (page 736) is an internal command that supports the sharding functionality.
enableSharding
enableSharding
The enableSharding (page 736) command enables sharding on a per-database level. Use the following
command form:
{ enableSharding: "<database name>" }
736
Once youve enabled sharding in a database, you can use the shardCollection (page 738) command to
begin the process of distributing data among the shards.
listShards
listShards
Use the listShards (page 737) command to return a list of configured shards. The command takes the
following form:
{ listShards: 1 }
removeShard
removeShard
Starts the process of removing a shard from a cluster. This is a multi-stage process. Begin by issuing the
following command:
{ removeShard : "[shardName]" }
The balancer will then migrate chunks from the shard specified by [shardName]. This process happens
slowly to avoid placing undue load on the overall cluster.
The command returns immediately, with the following message:
{ msg : "draining started successfully" , state: "started" , shard: "shardName" , ok : 1 }
If you run the command again, youll see the following progress output:
{ msg: "draining ongoing" , state: "ongoing" , remaining: { chunks: 23 , dbs: 1 }, ok: 1 }
The remaining document specifies how many chunks and databases remain on the shard.
db.printShardingStatus() (page 892) to list the databases that you must move from the shard.
Use
Each database in a sharded cluster has a primary shard. If the shard you want to remove is also the primary of
one of the clusters databases, then you must manually move the database to a new shard. This can be only after
the shard is empty. See the movePrimary (page 743) command for details.
After removing all chunks and databases from the shard, you may issue the command again, to return:
{ msg: "remove shard completed successfully", state: "completed", host: "shardName", ok : 1 }
getShardMap
getShardMap
getShardMap (page 737) is an internal command that supports the sharding functionality.
getShardVersion
getShardVersion
getShardVersion (page 737) is an internal command that supports sharding functionality.
setShardVersion
setShardVersion
setShardVersion (page 737) is an internal command that supports sharding functionality.
shardCollection
737
Definition
shardCollection
Enables a collection for sharding and allows MongoDB to begin distributing data among shards. You must run
enableSharding (page 736) on a database before running the shardCollection (page 738) command.
shardCollection (page 738) has the following form:
{ shardCollection: "<database>.<collection>", key: <shardkey> }
of
the
collection
to
shard
in
the
form
field document key The index specification document to use as the shard key. The index must exist
prior to the shardCollection (page 738) command, unless the collection is empty. If the
collection is empty, in which case MongoDB creates the index prior to sharding the collection.
New in version 2.4: The key may be in the form { field : "hashed" }, which will use
the specified field as a hashed shard key.
field Boolean unique When true, the unique option ensures that the underlying index enforces
a unique constraint. Hashed shard keys do not support unique constraints.
field integer numInitialChunks To support hashed sharding (page 506) added in MongoDB 2.4,
numInitialChunks specifies the number of chunks to create when sharding an collection
with a hashed shard key. MongoDB will then create and balance chunks across the cluster. The
numInitialChunks must be less than 8192.
Warning: Do not run more than one shardCollection (page 738) command on the same collection at the
same time.
Shard Keys Choosing the best shard key to effectively distribute load among your shards requires some planning.
Review Shard Keys (page 506) regarding choosing a shard key.
Hashed Shard Keys New in version 2.4.
Hashed shard keys (page 506) use a hashed index of a single field as the shard key.
Warning:
MongoDB provides no method to deactivate sharding for a collection after calling
shardCollection (page 738). Additionally, after shardCollection (page 738), you cannot change shard
keys or modify the value of any field used in your shard key index.
See also:
Sharding (page 493), Sharding Concepts (page 498), and Deploy a Sharded Cluster (page 522).
Example The following operation enables sharding for the people collection in the records database and uses
the zipcode field as the shard key (page 506):
db.runCommand( { shardCollection: "records.people", key: { zipcode: 1 } } )
shardingState
shardingState
shardingState (page 738) is an admin command that reports if mongod (page 925) is a member of a
sharded cluster. shardingState (page 738) has the following prototype form:
738
{ shardingState: 1 }
For shardingState (page 738) to detect that a mongod (page 925) is a member of a sharded cluster, the
mongod (page 925) must satisfy the following conditions:
1.the mongod (page 925) is a primary member of a replica set, and
2.the mongod (page 925) instance is a member of a sharded cluster.
If shardingState (page 738) detects that a mongod (page 925) is a member of a sharded cluster,
shardingState (page 738) returns a document that resembles the following prototype:
{
"enabled" : true,
"configServer" : "<configdb-string>",
"shardName" : "<string>",
"shardHost" : "string:",
"versions" : {
"<database>.<collection>" : Timestamp(<...>),
"<database>.<collection>" : Timestamp(<...>)
},
"ok" : 1
}
The response from shardingState (page 738) when used with a config server is:
{ "enabled": false, "ok": 1 }
Note: mongos (page 938) instances do not provide the shardingState (page 738).
Warning: This command obtains a write lock on the affected database and will block other operations until
it has completed; however, the operation is typically short lived.
unsetSharding
unsetSharding
unsetSharding (page 739) is an internal command that supports sharding functionality.
split
Definition
split
Splits a chunk in a sharded cluster into two chunks. The mongos (page 938) instance splits and manages chunks
automatically, but for exceptional circumstances the split (page 739) command does allow administrators to
manually create splits. See Split Chunks in a Sharded Cluster (page 548) for information on these circumstances,
and on the MongoDB shell commands that wrap split (page 739).
The split (page 739) command takes a document with the following fields:
field string split The name of the collection where the chunk exists. Specify the collections full
namespace, including the database name.
739
field document find An query statement that specifies an equality match on the shard key. The
match selects the chunk that contains the specified document. You must specify only one of the
following: find, bounds, or middle.
field array bounds New in version 2.4: The bounds of a chunk to split. bounds applies to chunks
in collections partitioned using a hashed shard key. The parameters array must consist of two
documents specifying the lower and upper shard-key values of the chunk. The values must match
the minimum and maximum values of an existing chunk. Specify only one of the following:
find, bounds, or middle.
field document middle The document to use as the split point to create two chunks. split
(page 739) requires one of the following options: find, bounds, or middle.
Command Formats To create a chunk split, connect to a mongos (page 938) instance, and issue the following
command to the admin database:
db.adminCommand( { split: <database>.<collection>,
find: <document> } )
Or:
db.adminCommand( { split: <database>.<collection>,
middle: <document> } )
Or:
db.adminCommand( { split: <database>.<collection>,
bounds: [ <lower>, <upper> ] } )
To create a split for a collection that uses a hashed shard key, use the bounds parameter. Do not use the middle
parameter for this purpose.
Warning: Be careful when splitting data in a sharded collection to create new chunks. When you shard a
collection that has existing data, MongoDB automatically creates chunks to evenly distribute the collection. To
split data effectively in a sharded cluster you must consider the number of documents in a chunk and the average
document size to create a uniform chunk size. When chunks have irregular sizes, shards may have an equal number
of chunks but have very different data sizes. Avoid creating splits that lead to a collection with differently sized
chunks.
See also:
moveChunk (page 742), sh.moveChunk() (page 907), sh.splitAt() (page 909), and sh.splitFind()
(page 909), which wrap the functionality of split (page 739).
Examples The following sections provide examples of the split (page 739) command.
Split a Chunk in Half
db.runCommand( { split : "test.people", find : { _id : 99 } } )
The split (page 739) command identifies the chunk in the people collection of the test database, that holds
documents that match { _id : 99 }. split (page 739) does not require that a match exist, in order to identify
the appropriate chunk. Then the command splits it into two chunks of equal size.
Note: split (page 739) creates two equal chunks by range as opposed to size, and does not use the selected point
as a boundary for the new chunks
740
Define an Arbitrary Split Point To define an arbitrary split point, use the following form:
db.runCommand( { split : "test.people", middle : { _id : 99 } } )
The split (page 739) command identifies the chunk in the people collection of the test database, that would
hold documents matching the query { _id : 99 }. split (page 739) does not require that a match exist, in
order to identify the appropriate chunk. Then the command splits it into two chunks, with the matching document as
the lower bound of one of the split chunks.
This form is typically used when pre-splitting data in a collection.
Split a Chunk Using Values of a Hashed Shard Key This example uses the hashed shard key userid in a
people collection of a test database. The following command uses an array holding two single-field documents to
represent the minimum and maximum values of the hashed shard key to split the chunk:
db.runCommand( { split: "test.people",
bounds : [ { userid: NumberLong("-5838464104018346494") },
{ userid: NumberLong("-5557153028469814163") }
] } )
Note: MongoDB uses the 64-bit NumberLong (page 201) type to represent the hashed value.
Use sh.status() (page 910) to see the existing bounds of the shard keys.
Metadata Lock Error If another process in the mongos (page 938), such as a balancer process, changes metadata
while split (page 739) is running, you may see a metadata lock error.
errmsg: "The collection's metadata lock is already taken."
This message indicates that the split has failed with no side effects. Retry the split (page 739) command.
splitChunk
Definition
splitChunk
An internal administrative command. To split chunks, use the sh.splitFind() (page 909) and
sh.splitAt() (page 909) functions in the mongo (page 942) shell.
Warning: Be careful when splitting data in a sharded collection to create new chunks. When you shard a
collection that has existing data, MongoDB automatically creates chunks to evenly distribute the collection.
To split data effectively in a sharded cluster you must consider the number of documents in a chunk and the
average document size to create a uniform chunk size. When chunks have irregular sizes, shards may have
an equal number of chunks but have very different data sizes. Avoid creating splits that lead to a collection
with differently sized chunks.
See also:
moveChunk (page 742) and sh.moveChunk() (page 907).
The splitChunk (page 741) command takes a document with the following fields:
field string ns The complete namespace of the chunk to split.
field document keyPattern The shard key.
field document min The lower bound of the shard key for the chunk to split.
11.1. MongoDB Interface
741
field document max The upper bound of the shard key for the chunk to split.
field string from The shard that owns the chunk to split.
field document splitKeys The split point for the chunk.
field document shardId The shard.
splitVector
splitVector
Is an internal command that supports meta-data operations in sharded clusters.
medianKey
medianKey
medianKey (page 742) is an internal command.
moveChunk
Definition
moveChunk
Internal administrative command. Moves chunks between shards. Issue the moveChunk (page 742) command
via a mongos (page 938) instance while using the admin database. Use the following forms:
db.runCommand( { moveChunk : <namespace> ,
find : <query> ,
to : <string>,
_secondaryThrottle : <boolean> } )
Alternately:
db.runCommand( { moveChunk : <namespace> ,
bounds : <array> ,
to : <string>,
_secondaryThrottle : <boolean> } )
742
The chunk migration (page 518) section describes how chunks move between shards on MongoDB.
See also:
split (page 739), sh.moveChunk() (page 907), sh.splitAt() (page 909), and sh.splitFind()
(page 909).
Return Messages moveChunk (page 742) returns the following error message if another metadata operation is in
progress on the chunks (page 566) collection:
errmsg: "The collection's metadata lock is already taken."
If another process, such as a balancer process, changes meta data while moveChunk (page 742) is running, you may
see this error. You may retry the moveChunk (page 742) operation without side effects.
Note: Only use the moveChunk (page 742) in special circumstances such as preparing your sharded cluster for
an initial ingestion of data, or a large bulk import operation. In most cases allow the balancer to create and balance
chunks in sharded clusters. See Create Chunks in a Sharded Cluster (page 545) for more information.
movePrimary
movePrimary
In a sharded cluster, this command reassigns the databases primary shard, which holds all un-sharded collections in the database. movePrimary (page 743) is an administrative command that is only available for
mongos (page 938) instances. Only use movePrimary (page 743) when removing a shard from a sharded
cluster.
Important: Only use movePrimary (page 743) when:
the database does not contain any collections with data, or
you have drained all sharded collections using the removeShard (page 737) command.
See Remove Shards from an Existing Sharded Cluster (page 553) for a complete procedure.
movePrimary (page 743) changes the primary shard for this database in the cluster metadata, and migrates
all un-sharded collections to the specified shard. Use the command with the following form:
{ movePrimary : "test", to : "shard0001" }
When the command returns, the databases primary location will shift to the designated shard. To fully decommission a shard, use the removeShard (page 737) command.
isdbgrid
isdbgrid
This command verifies that a process is a mongos (page 938).
If you issue the isdbgrid (page 743) command when connected to a mongos (page 938), the response
document includes the isdbgrid field set to 1. The returned document is similar to the following:
{ "isdbgrid" : 1, "hostname" : "app.example.net", "ok" : 1 }
If you issue the isdbgrid (page 743) command when connected to a mongod (page 925), MongoDB returns
an error document. The isdbgrid (page 743) command is not available to mongod (page 925). The error
document, however, also includes a line that reads "isdbgrid" : 1, just as in the document returned for a
mongos (page 938). The error document is similar to the following:
743
{
"errmsg" : "no such cmd: isdbgrid",
"bad cmd" : {
"isdbgrid" : 1
},
"ok" : 0
}
You can instead use the isMaster (page 732) command to determine connection to a mongos (page 938).
When connected to a mongos (page 938), the isMaster (page 732) command returns a document that contains the string isdbgrid in the msg field.
See also:
Sharding (page 493) for more information about MongoDBs sharding functionality.
Instance Administration Commands
Name
renameCollection (page 744)
copydb (page 745)
dropDatabase (page 746)
drop (page 747)
create (page 747)
clone (page 748)
cloneCollection (page 748)
cloneCollectionAsCapped
(page 749)
closeAllDatabases (page 749)
Administration Commands
Description
Changes the name of an existing collection.
Copies a database from a remote host to the current host.
Removes the current database.
Removes the specified collection from the database.
Creates a collection and sets collection parameters.
Copies a database from a remote host to the current host.
Copies a collection from a remote host to the current host.
Copies a non-capped collection as a new capped collection.
renameCollection
Definition
renameCollection
Changes the name of an existing collection. Specify collections to renameCollection (page 744) in the
744
form of a complete namespace, which includes the database name. Issue the renameCollection (page 744)
command against the admin database. The command takes the following form:
{ renameCollection: "<source_namespace>", to: "<target_namespace>", dropTarget: <true|false> }
Exceptions
exception 10026 Raised if the source namespace does not exist.
exception 10027 Raised if the target namespace exists and dropTarget is either false or unspecified.
exception 15967 Raised if the target namespace is an invalid collection name.
Shell Helper The shell helper db.collection.renameCollection() (page 846) provides a simpler interface to using this command within a database. The following is equivalent to the previous example:
db.source-namespace.renameCollection( "target" )
Warning: You cannot use renameCollection (page 744) with sharded collections.
Warning: This command obtains a global write lock and will block other operations until it has completed.
copydb
copydb
The copydb (page 745) command copies a database from a remote host to the current host. copydb (page 745)
accepts the following options:
field string fromhost Hostname of the source mongod (page 925) instance. If omitted, copydb
(page 745) copies one database to another within a single MongoDB instance.
field string fromdb Name of the source database.
745
If you need to copy a database and authenticate, its easiest to use the shell helper:
db.copyDatabase(<remote_db_name>, <local_db_name>, <from_host_name>, <username>, <password>)
dropDatabase
dropDatabase
The dropDatabase (page 746) command drops a database, deleting the associated data files.
dropDatabase (page 746) operates on the current database.
In the shell issue the use <database> command, replacing <database> with the name of the database
you wish to delete. Then use the following command form:
746
{ dropDatabase: 1 }
The mongo (page 942) shell also provides the following equivalent helper method:
db.dropDatabase();
Warning: This command obtains a global write lock and will block other operations until it has completed.
drop
drop
The drop (page 747) command removes an entire collection from a database. The command has following
syntax:
{ drop: <collection_name> }
The mongo (page 942) shell provides the equivalent helper method:
db.collection.drop();
Note that this command also removes any indexes associated with the dropped collection.
Warning: This command obtains a write lock on the affected database and will block other operations until
it has completed.
create
Definition
create
Explicitly creates a collection. create (page 747) has the following form:
{ create: <collection_name>,
capped: <true|false>,
autoIndexId: <true|false>,
size: <max_size>,
max: <max_documents>
}
747
For more information on the autoIndexId field in versions before 2.2, see _id Fields and Indexes on Capped
Collections (page 1054).
The db.createCollection() (page 878) method wraps the create (page 747) command.
Note: The create (page 747) command obtains a write lock on the affected database and will block other operations
until it has completed. The write lock for this operation is typically short lived. However, allocations for large capped
collections may take longer.
Example To create a capped collection limited to 64 kilobytes, issue the command in the following form:
db.runCommand( { create: "collection", capped: true, size: 64 * 1024 } )
clone
clone
The clone (page 748) command clone a database from a remote MongoDB instance to the current host. clone
(page 748) copies the database on the remote instance with the same name as the current database. The command
takes the following form:
{ clone: "db1.example.net:27017" }
Replace db1.example.net:27017 above with the resolvable hostname for the MongoDB instance you
wish to copy from. Note the following behaviors:
clone (page 748) can run against a slave or a non-primary member of a replica set.
clone (page 748) does not snapshot the database. If any clients update the database youre copying at
any point during the clone operation, the resulting database may be inconsistent.
You must run clone (page 748) on the destination server.
The destination server is not locked for the duration of the clone (page 748) operation. This means that
clone (page 748) will occasionally yield to allow other operations to complete.
See copydb (page 745) for similar functionality.
Warning: This command obtains an intermittent write lock on the destination server, that can block other
operations until it completes.
cloneCollection
Definition
cloneCollection
Copies a collection from a remote mongod (page 925) instance to the current mongod (page 925) instance.
cloneCollection (page 748) creates a collection in a database with the same name as the remote collections database. cloneCollection (page 748) takes the following form:
Important: You cannot clone a collection through a mongos (page 938) but must connect directly to the
mongod (page 925) instance.
cloneCollection (page 748) has the following fields:
field string cloneCollection The name of the collection to clone.
748
field string from Specify a resolvable hostname and optional port number of the remote server
where the specified collection resides.
field document query A query that filters the documents in the remote collection that
cloneCollection (page 748) will copy to the current database.
field boolean copyIndexes If set to false the indexes on the originating server are not copied with
the documents in the collection. This is set to true by default.
Example
This operation copies the profiles collection from the users database on the server at
mongodb.example.net. The operation only copies documents that satisfy the query { active: true }
and does not copy indexes. cloneCollection (page 748) copies indexes by default, but you can disable this
behavior by setting { copyIndexes: false }. The query and copyIndexes arguments are optional.
If, in the above example, the profiles collection exists in the users database, then MongoDB appends documents
from the remote collection to the destination collection.
cloneCollectionAsCapped
cloneCollectionAsCapped
The cloneCollectionAsCapped (page 749) command creates a new capped collection from an existing, non-capped collection within the same database. The operation does not affect the original non-capped
collection.
The command has the following syntax:
The command copies an existing collection and creates a new capped collection with a maximum size specified by the capped size in bytes. The name of the new capped collection must be distinct
and cannot be the same as that of the original existing collection. To replace the original non-capped collection
with a capped collection, use the convertToCapped (page 749) command.
During the cloning, the cloneCollectionAsCapped (page 749) command exhibit the following behavior:
MongoDB will transverse the documents in the original collection in natural order as theyre loaded.
If the capped size specified for the new collection is smaller than the size of the original uncapped
collection, then MongoDB will begin overwriting earlier documents in insertion order, which is first in,
first out (e.g FIFO).
closeAllDatabases
closeAllDatabases
closeAllDatabases (page 749) is an internal command that invalidates all cursors and closes the open
database files. The next operation that uses the database will reopen the file.
Warning: This command obtains a global write lock and will block other operations until it has completed.
convertToCapped
convertToCapped
The convertToCapped (page 749) command converts an existing, non-capped collection to a capped collection within the same database.
749
convertToCapped (page 749) takes an existing collection (<collection>) and transforms it into a
capped collection with a maximum size in bytes, specified to the size argument (<capped size>).
During the conversion process, the convertToCapped (page 749) command exhibit the following behavior:
MongoDB transverses the documents in the original collection in natural order and loads the documents
into a new capped collection.
If the capped size specified for the capped collection is smaller than the size of the original uncapped
collection, then MongoDB will overwrite documents in the capped collection based on insertion order, or
first in, first out order.
Internally, to convert the collection, MongoDB uses the following procedure
cloneCollectionAsCapped (page 749) command creates the capped collection and imports the
data.
MongoDB drops the original collection.
renameCollection (page 744) renames the new capped collection to the name of the original
collection.
Note: MongoDB does not support the convertToCapped (page 749) command in a sharded cluster.
Warning: The convertToCapped (page 749) will not recreate indexes from the original collection on
the new collection, other than the index on the _id field. If you need indexes on this collection you will
need to create these indexes after the conversion is complete.
See also:
create (page 747)
Warning: This command obtains a global write lock and will block other operations until it has completed.
filemd5
filemd5
The filemd5 (page 750) command returns the md5 hashes for a single file stored using the GridFS specification. Client libraries use this command to verify that files are correctly written to MongoDB. The command
takes the files_id of the file in question and the name of the GridFS root collection as arguments. For
example:
{ filemd5: ObjectId("4f1f10e37671b50e4ecd2776"), root: "fs" }
dropIndexes
dropIndexes
The dropIndexes (page 750) command drops one or all indexes from the current collection. To drop all
indexes, issue the command like so:
{ dropIndexes: "collection", index: "*" }
To drop a single, issue the command by specifying the name of the index you want to drop. For example, to
drop the index named age_1, use the following command:
750
The shell provides a useful command helper. Heres the equivalent command:
db.collection.dropIndex("age_1");
Warning: This command obtains a write lock on the affected database and will block other operations until
it has completed.
fsync
fsync
The fsync (page 751) command forces the mongod (page 925) process to flush all pending writes to the
storage layer. mongod (page 925) is always writing data to the storage layer as applications write more data to
the database. MongoDB guarantees that it will write all data to disk within the syncdelay (page 998) interval,
which is 60 seconds by default.
{ fsync: 1 }
The fsync (page 751) operation is synchronous by default, to run fsync (page 751) asynchronously, use the
following form:
{ fsync: 1, async: true }
The connection will return immediately. You can check the output of db.currentOp() (page 879) for the
status of the fsync (page 751) operation.
The primary use of fsync (page 751) is to lock the database during backup operations. This will flush all data
to the data storage layer and block all write operations until you unlock the database. Consider the following
command form:
{ fsync: 1, lock: true }
Note: You may continue to perform read operations on a database that has a fsync (page 751) lock. However,
following the first write operation all subsequent read operations wait until you unlock the database.
To check on the current state of the fsync lock, use db.currentOp() (page 879). Use the following
JavaScript function in the shell to test if the database is currently locked:
serverIsLocked = function () {
var co = db.currentOp();
if (co && co.fsyncLock) {
return true;
}
return false;
}
After loading this function into your mongo (page 942) shell session you can call it as follows:
serverIsLocked()
This function will return true if the database is currently locked and false if the database is not locked. To
unlock the database, make a request for an unlock using the following command:
db.getSiblingDB("admin").$cmd.sys.unlock.findOne();
New in version 1.9.0: The db.fsyncLock() (page 885) and db.fsyncUnlock() (page 886) helpers in
the shell.
11.1. MongoDB Interface
751
In the mongo (page 942) shell, you may use the db.fsyncLock() (page 885) and db.fsyncUnlock()
(page 886) wrappers for the fsync (page 751) lock and unlock process:
db.fsyncLock();
db.fsyncUnlock();
Note: fsync (page 751) lock is only possible on individual shards of a sharded cluster, not on the entire sharded cluster. To backup an entire sharded cluster, please read Sharded Cluster Backup Considerations
(page 137).
If your mongod (page 925) has journaling enabled, consider using another method (page 185) to back up your
database.
Note: The database cannot be locked with db.fsyncLock() (page 885) while profiling is enabled. You
must disable profiling before locking the database with db.fsyncLock() (page 885). Disable profiling using
db.setProfilingLevel() (page 894) as follows in the mongo (page 942) shell:
db.setProfilingLevel(0)
clean
clean
clean (page 752) is an internal command.
Warning: This command obtains a write lock on the affected database and will block other operations until
it has completed.
connPoolSync
connPoolSync
connPoolSync (page 752) is an internal command.
compact
Definition
compact
New in version 2.0.
Rewrites and defragments all data in a collection, as well as all of the indexes on that collection. compact
(page 752) has the following form:
{ compact: <collection name> }
752
paddingFactor does not affect the padding of subsequent record allocations after compact
(page 752) completes. For more information, see paddingFactor (page 753).
field integer paddingBytes Sets the padding as an absolute number of bytes for all records compacted during the compact (page 752) operation. After compact (page 752) completes,
paddingBytes does not affect the padding of subsequent record allocations. For more information, see paddingBytes (page 753).
compact (page 752) is similar to repairDatabase (page 757); however, repairDatabase (page 757)
operates on an entire database.
paddingFactor New in version 2.2.
The paddingFactor field takes the following range of values:
Default: 1.0
Minimum: 1.0 (no padding)
Maximum: 4.0
If your updates increase the size of the documents, padding will increase the amount of space allocated to each
document and avoid expensive document relocation operations within the data files.
You can calculate the padding size by subtracting the document size from the record size or, in terms of the
paddingFactor, by subtracting 1 from the paddingFactor:
padding size = (paddingFactor - 1) * <document size>.
For example, a paddingFactor of 1.0 specifies a padding size of 0 whereas a paddingFactor of 1.2 specifies
a padding size of 0.2 or 20 percent (20%) of the document size.
With the following command, you can use the paddingFactor option of the compact (page 752) command to
set the record size to 1.1 of the document size, or a padding factor of 10 percent (10%):
db.runCommand ( { compact: '<collection>', paddingFactor: 1.1 } )
compact (page 752) compacts existing documents but does not reset paddingFactor statistics for the collection.
After the compact (page 752) MongoDB will use the existing paddingFactor when allocating new records for
documents in this collection.
paddingBytes New in version 2.2.
Specifying paddingBytes can be useful if your documents start small but then increase in size significantly. For
example, if your documents are initially 40 bytes long and you grow them by 1KB, using paddingBytes: 1024
might be reasonable since using paddingFactor: 4.0 would specify a record size of 160 bytes (4.0 times
the initial document size), which would only provide a padding of 120 bytes (i.e. record size of 160 bytes minus the
document size).
With the following command, you can use the paddingBytes option of the compact (page 752) command to set
the padding size to 100 bytes on the collection named by <collection>:
db.runCommand ( { compact: '<collection>', paddingBytes: 100 } )
Warning: Always have an up-to-date backup before performing server maintenance such as the compact
(page 752) operation.
Behaviors The compact (page 752) has the behaviors described here.
753
Blocking In MongoDB 2.2, compact (page 752) blocks activities only for its database. Prior to 2.2, the command
blocked all activities.
You may view the intermediate progress either by viewing the mongod (page 925) log file or by running the
db.currentOp() (page 879) in another shell instance.
Operation Termination If you terminate the operation with the db.killOp() (page 890) method or restart the
server before the compact (page 752) operation has finished:
If you have journaling enabled, the data remains consistent and usable, regardless of the state of the compact
(page 752) operation. You may have to manually rebuild the indexes.
If you do not have journaling enabled and the mongod (page 925) or compact (page 752) terminates during
the operation, it is impossible to guarantee that the data is in a consistent state.
In either case, much of the existing free space in the collection may become un-reusable. In this scenario, you
should rerun the compaction to completion to restore the use of this free space.
Disk Space compact (page 752) generally uses less disk space than repairDatabase (page 757) and is faster.
However, the compact (page 752) command is still slow and blocks other database use. Only use compact
(page 752) during scheduled maintenance periods.
compact (page 752) requires up to 2 gigabytes of additional disk space while running. Unlike repairDatabase
(page 757), compact (page 752) does not free space on the file system.
To see how the storage space changes for the collection, run the collStats (page 763) command before and after
compaction.
Size and Number of Data Files compact (page 752) may increase the total size and number of your data files,
especially when run for the first time. However, this will not increase the total collection storage space since storage
size is the amount of data allocated within the database files, and not the size/number of the files on the file system.
Replica Sets compact (page 752) commands do not replicate to secondaries in a replica set:
Compact each member separately.
Ideally run compact (page 752) on a secondary. See option force:true above for information regarding
compacting the primary.
Important: On secondaries, the compact (page 752) command forces the secondary to enter RECOVERING
(page 488) state. This prevents clients from reading during compaction. Once the operation finishes, the secondary returns to SECONDARY (page 487) state.
See Replica Set Member States (page 487) for more information about replica set member states. Refer to the
partial script for automating step down and compaction7 for an example of this procedure.
Sharded Clusters compact (page 752) is a command issued to a mongod (page 925). In a sharded environment,
run compact (page 752) on each shard separately as a maintenance operation.
Important: You cannot issue compact (page 752) against a mongos (page 938) instance.
7 https://github.com/mongodb/mongo-snippets/blob/master/js/compact-example.js
754
Capped Collections It is not possible to compact capped collections because they dont have padding, and documents cannot grow in these collections. However, the documents of a capped collection are not subject to fragmentation.
collMod
collMod
New in version 2.2.
collMod (page 755) makes it possible to add flags to a collection to modify the behavior of MongoDB. Flags
include usePowerOf2Sizes (page 755) and index (page 755). The command takes the following prototype
form:
db.runCommand( {"collMod" : <collection> , "<flag>" : <value> } )
In this command substitute <collection> with the name of a collection in the current database, and <flag>
and <value> with the flag and value you want to set.
Use the userFlags (page 764) field in the in db.collection.stats() (page 848) output to check
enabled collection flags.
usePowerOf2Sizes
The usePowerOf2Sizes (page 755) flag changes the method that MongoDB uses to allocate space
on disk for documents in this collection. By setting usePowerOf2Sizes (page 755), you ensure that
MongoDB will allocate space for documents in sizes that are powers of 2 (e.g. 4, 8, 16, 32, 64, 128, 256,
512...8388608). With usePowerOf2Sizes (page 755) MongoDB will be able to more effectively reuse
space.
Note: With usePowerOf2Sizes (page 755) MongoDB, allocates records that have power of 2 sizes,
until record sizes equal 4 megabytes. For records larger than 4 megabytes with usePowerOf2Sizes
(page 755) set, mongod (page 925) will allocate records in full megabytes by rounding up to the nearest
megabyte.
usePowerOf2Sizes (page 755) is useful for collections where you will be inserting and deleting large
numbers of documents to ensure that MongoDB will effectively use space on disk.
Example
To enable usePowerOf2Sizes (page 755) on the collection named products, use the following
operation:
db.runCommand( {collMod: "products", usePowerOf2Sizes : true })
To disable usePowerOf2Sizes (page 755) on the collection products, use the following operation:
db.runCommand( { collMod: "products", usePowerOf2Sizes: false })
Warning: Changed in version 2.2.1: usePowerOf2Sizes (page 755) now supports documents
larger than 8 megabytes. If you enable usePowerOf2Sizes (page 755) you must use at least
version 2.2.1.
usePowerOf2Sizes (page 755) only affects subsequent allocations caused by document insertion
or record relocation as a result of document growth, and does not affect existing allocations.
index
The index (page 755) flag changes the expiration time of a TTL Collection (page 158).
Specify the key and new expiration time with a document of the form:
755
where <index_spec> is an existing index in the collection and seconds is the number of seconds to
subtract from the current time.
Example
To update the expiration value for a collection named sessions indexed on a lastAccess field from
30 minutes to 60 minutes, use the following operation:
db.runCommand({collMod: "sessions",
index: {keyPattern: {lastAccess:1},
expireAfterSeconds: 3600}})
On success collMod (page 755) returns a document with fields expireAfterSeconds_old and
expireAfterSeconds_new set to their respective values.
On failure, collMod (page 755) returns a document with no expireAfterSeconds field
to update if there is no existing expireAfterSeconds field or cannot find index {
**key**: 1.0 } for ns **namespace** if the specified keyPattern does not exist.
reIndex
reIndex
The reIndex (page 756) command rebuilds all indexes for a specified collection. Use the following syntax:
{ reIndex: "collection" }
Normally, MongoDB compacts indexes during routine updates. For most users, the reIndex (page 756)
command is unnecessary. However, it may be worth running if the collection size has changed significantly or
if the indexes are consuming a disproportionate amount of disk space.
Call reIndex (page 756) using the following form:
db.collection.reIndex();
Note: For replica sets, reIndex (page 756) will not propagate from the primary to secondaries. reIndex
(page 756) will only affect a single mongod (page 925) instance.
See
Index Creation (page 335) for more information on the behavior of indexing operations in MongoDB.
setParameter
setParameter
setParameter (page 756) is an administrative command for modifying options normally set on the command
line. You must issue the setParameter (page 756) command against the admin database in the form:
{ setParameter: 1, <option>: <value> }
Replace the <option> with one of the supported setParameter (page 756) options:
756
The values specified for getParameter and <option> do not affect the output. The command works with
the following options:
quiet
notablescan
logLevel
syncdelay
See also:
setParameter (page 756) for more about these parameters.
repairDatabase
Definition
repairDatabase
Checks and repairs errors and inconsistencies in data storage. repairDatabase (page 757) is analogous to
a fsck command for file systems. Run the repairDatabase (page 757) command to ensure data integrity
after the system experiences an unexpected system restart or crash, if:
1.The mongod (page 925) instance is not running with journaling enabled.
Note: When using journaling, there is almost never any need to run repairDatabase (page 757). In
the event of an unclean shutdown, the server will be able restore the data files to a pristine state automatically.
757
2.There are no other intact replica set members with a complete data set.
Warning:
During normal operations, only use the repairDatabase (page 757) command
and wrappers including db.repairDatabase() (page 892) in the mongo (page 942) shell and
mongod --repair, to compact database files and/or reclaim disk space. Be aware that these operations remove and do not save any corrupt data during the repair process.
If you are trying to repair a replica set member, and you have access to an intact copy of your data (e.g.
a recent backup or an intact member of the replica set), you should restore from that intact copy, and
not use repairDatabase (page 757).
repairDatabase (page 757) has the following fields:
field boolean preserveClonedFilesOnFailure When true, repairDatabase will not delete
temporary files in the backup directory on error, and all new files are created with the backup
instead of _tmp directory prefix. By default repairDatabase does not delete temporary
files, and uses the _tmp naming prefix for new files.
field boolean backupOriginalFiles When true, repairDatabase moves old database files to
the backup directory instead of deleting them before moving new files into place. New files are
created with the backup instead of _tmp directory prefix. By default, repairDatabase
leaves temporary files unchanged, and uses the _tmp naming prefix for new files.
repairDatabase (page 757) takes the following form:
{ repairDatabase: 1 }
Warning: This command obtains a global write lock and will block other operations until it has completed.
Note: repairDatabase (page 757) requires free disk space equal to the size of your current data set
plus 2 gigabytes. If the volume that holds dbpath lacks sufficent space, you can mount a separate volume and
use that for the repair. When mounting a separate volume for repairDatabase (page 757) you must run
repairDatabase (page 757) from the command line and use the --repairpath switch to specify the
folder in which to store temporary repair files.
See mongod --repair and mongodump --repair for information on these related options.
Behavior The repairDatabase (page 757) command compacts all collections in the database. It is identical to
running the compact (page 752) command on each collection individually.
repairDatabase (page 757) reduces the total size of the data files on disk. It also recreates all indexes in the
database.
The time requirement for repairDatabase (page 757) depends on the size of the data set.
You may invoke repairDatabase (page 757) from multiple contexts:
Use the mongo (page 942) shell to run the command, as above.
Use the db.repairDatabase() (page 892) in the mongo (page 942) shell.
Run mongod (page 925) directly from your systems shell. Make sure that mongod (page 925) isnt already
running, and that you invoke mongod (page 925) as a user that has access to MongoDBs data files. Run as:
758
mongod --repair
Note: mongod --repair (page 925) will fail if your database is not a master or primary. In most cases,
you should recover a corrupt secondary using the data from an existing intact node. To run repair on a secondary/slave restart the instance in standalone mode without the --replSet or --slave options.
Example
{ repairDatabase: 1 }
Using repairDatabase to Reclaim Disk Space You should not use repairDatabase (page 757) for data
recovery unless you have no other option.
However, if you trust that there is no corruption and you have enough free space, then repairDatabase (page 757)
is the appropriate and the only way to reclaim disk space.
touch
touch
New in version 2.2.
The touch (page 759) command loads data from the data storage layer into memory. touch (page 759) can
load the data (i.e. documents,) indexes or both documents and indexes. Use this command to ensure that a
collection, and/or its indexes, are in memory before another operation. By loading the collection or indexes
into memory, mongod (page 925) will ideally be able to perform subsequent operations more efficiently. The
touch (page 759) command has the following prototypical form:
{ touch: [collection], data: [boolean], index: [boolean] }
By default, data and index are false, and touch (page 759) will perform no operation. For example, to load
both the data and the index for a collection named records, you would use the following command in the
mongo (page 942) shell:
db.runCommand({ touch: "records", data: true, index: true })
touch (page 759) will not block read and write operations on a mongod (page 925), and can run on secondary
members of replica sets.
Note: Using touch (page 759) to control or tweak what a mongod (page 925) stores in memory may displace
other records data in memory and hinder performance. Use with caution in production systems.
Warning: If you run touch (page 759) on a secondary, the secondary will enter a RECOVERING state
to prevent clients from sending read operations during the touch (page 759) operation. When touch
(page 759) finishes the secondary will automatically return to SECONDARY state. See state (page 727)
for more information on replica set member states.
shutdown
759
shutdown
The shutdown (page 759) command cleans up all database resources and then terminates the process. You
must issue the shutdown (page 759) command against the admin database in the form:
{ shutdown: 1 }
Note: Run the shutdown (page 759) against the admin database. When using shutdown (page 759), the
connection must originate from localhost or use an authenticated connection.
If the node youre trying to shut down is a replica set (page 377) primary, then the command will succeed only
if there exists a secondary node whose oplog data is within 10 seconds of the primary. You can override this
protection using the force option:
{ shutdown: 1, force: true }
Alternatively, the shutdown (page 759) command also supports a timeoutSecs argument which allows
you to specify a number of seconds to wait for other members of the replica set to catch up:
{ shutdown: 1, timeoutSecs: 60 }
The equivalent mongo (page 942) shell helper syntax looks like this:
db.shutdownServer({timeoutSecs: 60});
logRotate
logRotate
The logRotate (page 760) command is an administrative command that allows you to rotate the MongoDB logs to prevent a single logfile from consuming too much disk space. You must issue the logRotate
(page 760) command against the admin database in the form:
{ logRotate: 1 }
Note: Your mongod (page 925) instance needs to be running with the --logpath [file] option.
You may also rotate the logs by sending a SIGUSR1 signal to the mongod (page 925) process. If your mongod
(page 925) has a process ID of 2200, heres how to send the signal on Linux:
kill -SIGUSR1 2200
logRotate (page 760) renames the existing log file by appending the current timestamp to the filename. The
appended timestamp has the following form:
<YYYY>-<mm>-<DD>T<HH>-<MM>-<SS>
Then logRotate (page 760) creates a new log file with the same name as originally specified by the logpath
(page 992) setting to mongod (page 925) or mongos (page 938).
Note: New in version 2.0.3: The logRotate (page 760) command is available to mongod (page 925)
instances running on Windows systems with MongoDB release 2.0.3 and higher.
Diagnostic Commands
760
Diagnostic Commands
Name
Description
listDatabases
Returns a document that lists all databases and returns basic database statisti
(page 761)
dbHash (page 761)
Internal command to support sharding.
driverOIDTest
Internal command that converts an ObjectID to a string to support tests.
(page 761)
listCommands
Lists all database commands provided by the current mongod (page 925) ins
(page 762)
availableQueryOptions Internal command that reports on the capabilities of the current MongoDB
(page 762)
instance.
buildInfo (page 762)
Displays statistics about the MongoDB build.
collStats (page 763)
Reports storage utilization statics for a specified collection.
connPoolStats
Reports statistics on the outgoing connections from this MongoDB instance
(page 765)
other MongoDB instances in the deployment.
dbStats (page 767)
Reports storage utilization statistics for the specified database.
cursorInfo (page 769)
Reports statistics on active cursors.
dataSize (page 769)
Returns the data size for a range of data. For internal use.
diagLogging (page 769) Provides a diagnostic logging. For internal use.
getCmdLineOpts
Returns a document with the run-time arguments to the MongoDB instance
(page 769)
their parsed options.
netstat (page 770)
Internal command that reports on intra-deployment connectivity. Only availa
for mongos (page 938) instances.
ping (page 770)
Internal command that tests intra-deployment connectivity.
profile (page 770)
Interface for the database profiler (page 229).
validate (page 771)
Internal command that scans for a collections data and indexes for correctne
top (page 774)
Returns raw usage statistics for each database in the mongod (page 925) ins
indexStats (page 774)
Experimental command that collects and aggregates statistics on all indexes.
whatsmyuri (page 779)
Internal command that returns information on the current client.
getLog (page 779)
Returns recent log messages.
hostInfo (page 780)
Returns data that reflects the underlying host system.
serverStatus
Returns a collection metrics on instance-wide resource utilization and status
(page 782)
features (page 799)
Reports on features available in the current MongoDB instance.
isSelf
Internal command to support testing.
listDatabases
listDatabases
The listDatabases (page 761) command provides a list of existing databases along with basic statistics
about them:
{ listDatabases: 1 }
The value (e.g. 1) does not affect the output of the command. listDatabases (page 761) returns a document
for each database. Each document contains a name field with the database name, a sizeOnDisk field with
the total size of the database file on disk in bytes, and an empty field specifying whether the database has any
data.
dbHash
dbHash
dbHash (page 761) is an internal command.
driverOIDTest
11.1. MongoDB Interface
761
driverOIDTest
driverOIDTest (page 761) is an internal command.
listCommands
listCommands
The listCommands (page 762) command generates a list of all database commands implemented for the
current mongod (page 925) instance.
availableQueryOptions
availableQueryOptions
availableQueryOptions (page 762) is an internal command that is only available on mongos (page 938)
instances.
buildInfo
buildInfo
The buildInfo (page 762) command is an administrative command which returns a build summary for the
current mongod (page 925). buildInfo (page 762) has the following prototype form:
{ buildInfo: 1 }
In the mongo (page 942) shell, call buildInfo (page 762) in the following form:
db.runCommand( { buildInfo: 1 } )
Example
The output document of buildInfo (page 762) has the following form:
{
"version" : "<string>",
"gitVersion" : "<string>",
"sysInfo" : "<string>",
"loaderFlags" : "<string>",
"compilerFlags" : "<string>",
"allocator" : "<string>",
"versionArray" : [ <num>, <num>, <...> ],
"javascriptEngine" : "<string>",
"bits" : <num>,
"debug" : <boolean>,
"maxBsonObjectSize" : <num>,
"ok" : <num>
}
buildInfo.compilerFlags
The flags passed to the compiler that builds the mongod (page 925) binary.
buildInfo.allocator
Changed in version 2.2.
The memory allocator that mongod (page 925) uses. By default this is tcmalloc after version 2.2, and
system before 2.2.
buildInfo.versionArray
An array that conveys version information about the mongod (page 925) instance. See version for a
more readable version of this string.
buildInfo.javascriptEngine
Changed in version 2.4.
A string that reports the JavaScript engine used in the mongod (page 925) instance. By default, this is V8
after version 2.4, and SpiderMonkey before 2.4.
buildInfo.bits
A number that reflects the target processor architecture of the mongod (page 925) binary.
buildInfo.debug
A boolean. true when built with debugging options.
buildInfo.maxBsonObjectSize
A number that reports the Maximum BSON Document Size (page 1015).
collStats
Definition
collStats
The collStats (page 763) command returns a variety of storage statistics for a given collection. Use the
following syntax:
{ collStats: "collection" , scale : 1024 }
Specify the collection you want statistics for, and use the scale argument to scale the output. The above
example will display values in kilobytes.
Examine the following example output, which uses the db.collection.stats() (page 848) helper in the
mongo (page 942) shell.
> db.users.stats()
{
"ns" : "app.users",
"count" : 9,
"size" : 432,
"avgObjSize" : 48,
"storageSize" : 3840,
"numExtents" : 1,
"nindexes" : 2,
"lastExtentSize" : 3840,
"paddingFactor" : 1,
"flags" : 1,
"totalIndexSize" : 16384,
"indexSizes" : {
"_id_" : 8192,
"username" : 8192
//
//
//
//
//
//
//
//
//
namespace
number of documents
collection size in bytes
average object size in bytes
(pre)allocated space for the collection in bytes
number of extents (contiguously allocated chunks of d
number of indexes
size of the most recently created extent in bytes
padding can speed up updates if documents grow
763
},
"ok" : 1
}
Note: The scale factor rounds values to whole numbers. This can produce unpredictable and unexpected results
in some situations.
Output
collStats.ns
The namespace of the current collection, which follows the format [database].[collection].
collStats.count
The number of objects or documents in this collection.
collStats.size
The size of the data stored in this collection. This value does not include the size of any indexes associated with
the collection, which the totalIndexSize (page 765) field reports.
The scale argument affects this value.
collStats.avgObjSize
The average size of an object in the collection. The scale argument affects this value.
collStats.storageSize
The total amount of storage allocated to this collection for document storage. The scale argument affects this
value. The storageSize (page 764) does not decrease as you remove or shrink documents.
collStats.numExtents
The total number of contiguously allocated data file regions.
collStats.nindexes
The number of indexes on the collection. All collections have at least one index on the _id field.
Changed in version 2.2: Before 2.2, capped collections did not necessarily have an index on the _id field, and
some capped collections created with pre-2.2 versions of mongod (page 925) may not have an _id index.
collStats.lastExtentSize
The size of the last extent allocated. The scale argument affects this value.
collStats.paddingFactor
The amount of space added to the end of each document at insert time. The document padding provides a small
amount of extra space on disk to allow a document to grow slightly without needing to move the document.
mongod (page 925) automatically calculates this padding factor
collStats.flags
Changed in version 2.2: Removed in version 2.2 and replaced with the userFlags (page 764) and
systemFlags (page 764) fields.
Indicates the number of flags on the current collection. In version 2.0, the only flag notes the existence of an
index on the _id field.
collStats.systemFlags
New in version 2.2.
Reports the flags on this collection that reflect internal server options. Typically this value is 1 and reflects the
existence of an index on the _id field.
collStats.userFlags
New in version 2.2.
764
Reports the flags on this collection set by the user. In version 2.2 the only user flag is usePowerOf2Sizes
(page 755). If usePowerOf2Sizes (page 755) is enabled, userFlags (page 764) will be set to 1, otherwise
userFlags (page 764) will be 0.
See the collMod (page 755) command for more information on setting user flags and usePowerOf2Sizes
(page 755).
collStats.totalIndexSize
The total size of all indexes. The scale argument affects this value.
collStats.indexSizes
This field specifies the key and size of every existing index on the collection. The scale argument affects this
value.
Example The following is an example of db.collection.stats() (page 848) and collStats (page 763)
output:
{
"ns" : "<database>.<collection>",
"count" : <number>,
"size" : <number>,
"avgObjSize" : <number>,
"storageSize" : <number>,
"numExtents" : <number>,
"nindexes" : <number>,
"lastExtentSize" : <number>,
"paddingFactor" : <number>,
"systemFlags" : <bit>,
"userFlags" : <bit>,
"totalIndexSize" : <number>,
"indexSizes" : {
"_id_" : <number>,
"a_1" : <number>
},
"ok" : 1
}
connPoolStats
Definition
connPoolStats
Note: connPoolStats (page 765) only returns meaningful results for mongos (page 938) instances and
for mongod (page 925) instances in sharded clusters.
The command connPoolStats (page 765) returns information regarding the number of open connections to
the current database instance, including client connections and server-to-server connections for replication and
clustering. The command takes the following form:
{ connPoolStats: 1 }
The value of the argument (i.e. 1 ) does not affect the output of the command.
Note: connPoolStats (page 765) only returns meaningful results for mongos (page 938) instances and
for mongod (page 925) instances in sharded clusters.
11.1. MongoDB Interface
765
Output
connPoolStats.hosts
The sub-documents of the hosts (page 766) document report connections between the mongos (page 938) or
mongod (page 925) instance and each component mongod (page 925) of the sharded cluster.
connPoolStats.hosts.[host].available
available (page 766) reports the total number of connections that the mongos (page 938) or mongod
(page 925) could use to connect to this mongod (page 925).
connPoolStats.hosts.[host].created
created (page 766) reports the number of connections that this mongos (page 938) or mongod
(page 925) has ever created for this host.
connPoolStats.replicaSets
replicaSets (page 766) is a document that contains replica set information for the sharded cluster.
connPoolStats.replicaSets.shard
The shard (page 766) document reports on each shard within the sharded cluster
connPoolStats.replicaSets.[shard].host
The host (page 766) field holds an array of document that reports on each host within the shard in the
replica set.
These values derive from the replica set status (page 726) values.
connPoolStats.replicaSets.[shard].host[n].addr
addr (page 766) reports the address for the host in the sharded cluster in the format of
[hostname]:[port].
connPoolStats.replicaSets.[shard].host[n].ok
ok (page 766) reports false when:
the mongos (page 938) or mongod (page 925) cannot connect to instance.
the mongos (page 938) or mongod (page 925) received a connection exception or error.
This field is for internal use.
connPoolStats.replicaSets.[shard].host[n].ismaster
ismaster (page 766) reports true if this host (page 766) is the primary member of the replica
set.
connPoolStats.replicaSets.[shard].host[n].hidden
hidden (page 766) reports true if this host (page 766) is a hidden member of the replica set.
connPoolStats.replicaSets.[shard].host[n].secondary
secondary (page 766) reports true if this host (page 766) is a secondary member of the replica
set.
connPoolStats.replicaSets.[shard].host[n].pingTimeMillis
pingTimeMillis (page 766) reports the ping time in milliseconds from the mongos (page 938)
or mongod (page 925) to this host (page 766).
connPoolStats.replicaSets.[shard].host[n].tags
New in version 2.2.
tags (page 766) reports the tags (page 482), if this member of the set has tags configured.
connPoolStats.replicaSets.[shard].master
master (page 766) reports the ordinal identifier of the host in the host (page 766) array that is the
primary of the replica set.
766
connPoolStats.replicaSets.[shard].nextSlave
Deprecated since version 2.2.
nextSlave (page 766) reports the secondary member that the mongos (page 938) will use to service
the next request for this replica set.
connPoolStats.createdByType
createdByType (page 767) document reports the number of each type of connection that mongos (page 938)
or mongod (page 925) has created in all connection pools.
mongos (page 938) connect to mongod (page 925) instances using one of three types of connections. The
following sub-document reports the total number of connections by type.
connPoolStats.createdByType.master
master (page 767) reports the total number of connections to the primary member in each cluster.
connPoolStats.createdByType.set
set (page 767) reports the total number of connections to a replica set member.
connPoolStats.createdByType.sync
sync (page 767) reports the total number of config database connections.
connPoolStats.totalAvailable
totalAvailable (page 767) reports the running total of connections from the mongos (page 938) or
mongod (page 925) to all mongod (page 925) instances in the sharded cluster available for use.
connPoolStats.totalCreated
totalCreated (page 767) reports the total number of connections ever created from the mongos (page 938)
or mongod (page 925) to all mongod (page 925) instances in the sharded cluster.
connPoolStats.numDBClientConnection
numDBClientConnection (page 767) reports the total number of connections from the mongos (page 938)
or mongod (page 925) to all of the mongod (page 925) instances in the sharded cluster.
connPoolStats.numAScopedConnection
numAScopedConnection (page 767) reports the number of exception safe connections created from
mongos (page 938) or mongod (page 925) to all mongod (page 925) in the sharded cluster. The mongos
(page 938) or mongod (page 925) releases these connections after receiving a socket exception from the
mongod (page 925).
dbStats
Definition
dbStats
The dbStats (page 767) command returns storage statistics for a given database. The command takes the
following syntax:
{ dbStats: 1, scale: 1 }
The values of the options above do not affect the output of the command. The scale option allows you to
specify how to scale byte values. For example, a scale value of 1024 will display the results in kilobytes
rather than in bytes:
{ dbStats: 1, scale: 1024 }
Note: Because scaling rounds values to whole numbers, scaling may return unlikely or unexpected results.
767
The time required to run the command depends on the total size of the database. Because the command must
touch all data files, the command may take several seconds to run.
In the mongo (page 942) shell, the db.stats() (page 894) function provides a wrapper around dbStats
(page 767).
Output
dbStats.db
Contains the name of the database.
dbStats.collections
Contains a count of the number of collections in that database.
dbStats.objects
Contains a count of the number of objects (i.e. documents) in the database across all collections.
dbStats.avgObjSize
The average size of each document in bytes. This is the dataSize (page 768) divided by the number of
documents.
dbStats.dataSize
The total size in bytes of the data held in this database including the padding factor. The scale argument
affects this value. The dataSize (page 768) will not decrease when documents shrink, but will decrease when
you remove documents.
dbStats.storageSize
The total amount of space in bytes allocated to collections in this database for document storage. The scale
argument affects this value. The storageSize (page 768) does not decrease as you remove or shrink documents.
dbStats.numExtents
Contains a count of the number of extents in the database across all collections.
dbStats.indexes
Contains a count of the total number of indexes across all collections in the database.
dbStats.indexSize
The total size in bytes of all indexes created on this database. The scale arguments affects this value.
dbStats.fileSize
The total size in bytes of the data files that hold the database. This value includes preallocated space and the
padding factor. The value of fileSize (page 768) only reflects the size of the data files for the database and
not the namespace file.
The scale argument affects this value.
dbStats.nsSizeMB
The total size of the namespace files (i.e. that end with .ns) for this database. You cannot change the size of
the namespace file after creating a database, but you can change the default size for all new namespace files with
the nssize (page 996) runtime option.
See also:
The nssize (page 996) option, and Maximum Namespace File Size (page 1016)
dbStats.dataFileVersion
New in version 2.4.
Document that contains information about the on-disk format of the data files for the database.
dbStats.dataFileVersion.major
New in version 2.4.
768
The major version number for the on-disk format of the data files for the database.
dbStats.dataFileVersion.minor
New in version 2.4.
The minor version number for the on-disk format of the data files for the database.
cursorInfo
cursorInfo
The cursorInfo (page 769) command returns information about current cursor allotment and use. Use the
following form:
{ cursorInfo: 1 }
The value (e.g. 1 above,) does not affect the output of the command.
cursorInfo (page 769) returns the total number of open cursors (totalOpen,) the size of client cursors
in current use (clientCursors_size,) and the number of timed out cursors since the last server restart
(timedOut.)
dataSize
dataSize
For internal use.
The dataSize (page 769) command returns the data size for a set of data within a certain range:
This will return a document that contains the size of all matching documents.
Replace
database.collection value with database and collection from your deployment. The keyPattern,
min, and max parameters are options.
The amount of time required to return dataSize (page 769) depends on the amount of data in the collection.
diagLogging
diagLogging
diagLogging (page 769) is an internal command.
Warning: This command obtains a write lock on the affected database and will block other operations until
it has completed.
getCmdLineOpts
getCmdLineOpts
The getCmdLineOpts (page 769) command returns a document containing command line options used to
start the given mongod (page 925):
{ getCmdLineOpts: 1 }
This command returns a document with two fields, argv and parsed. The argv field contains an array
with each item from the command string used to invoke mongod (page 925). The document in the parsed
field includes all runtime options, including those parsed from the command line and those specified in the
configuration file, if specified.
Consider the following example output of getCmdLineOpts (page 769):
769
{
"argv" : [
"/usr/bin/mongod",
"--config",
"/etc/mongodb.conf",
"--fork"
],
"parsed" : {
"bind_ip" : "127.0.0.1",
"config" : "/etc/mongodb/mongodb.conf",
"dbpath" : "/srv/mongodb",
"fork" : true,
"logappend" : "true",
"logpath" : "/var/log/mongodb/mongod.log",
"quiet" : "true"
},
"ok" : 1
}
http://docs.mongodb.org/manual/administration/import-export/
netstat
netstat
netstat (page 770) is an internal command that is only available on mongos (page 938) instances.
ping
ping
The ping (page 770) command is a no-op used to test whether a server is responding to commands. This
command will return immediately even if the server is write-locked:
{ ping: 1 }
The value (e.g. 1 above,) does not impact the behavior of the command.
profile
profile
Use the profile (page 770) command to enable, disable, or change the query profiling level. This allows
administrators to capture data regarding performance. The database profiling system can impact performance
and can allow the server to write the contents of queries to the log. Your deployment should carefully consider
the security implications of this. Consider the following prototype syntax:
{ profile: <level> }
Setting
No change. Returns the current profile level.
Off. No profiling.
On. Only includes slow operations.
On. Includes all operations.
You may optionally set a threshold in milliseconds for profiling using the slowms option, as follows:
{ profile: 1, slowms: 200 }
770
mongod (page 925) writes the output of the database profiler to the system.profile collection.
mongod (page 925) records queries that take longer than the slowms (page 997) to the server log even when
the database profiler is not active.
See also:
Additional documentation regarding database profiling Database Profiling (page 143).
See also:
db.getProfilingStatus() (page 887) and db.setProfilingLevel() (page 894) provide
wrappers around this functionality in the mongo (page 942) shell.
Note: The database cannot be locked with db.fsyncLock() (page 885) while profiling is enabled. You
must disable profiling before locking the database with db.fsyncLock() (page 885). Disable profiling using
db.setProfilingLevel() (page 894) as follows in the mongo (page 942) shell:
db.setProfilingLevel(0)
Note: This command obtains a write lock on the affected database and will block other operations until it has
completed. However, the write lock is only held while enabling or disabling the profiler. This is typically a short
operation.
validate
Definition
validate
The validate (page 771) command checks the structures within a namespace for correctness by scanning
the collections data and indexes. The command returns information regarding the on-disk representation of the
collection.
The validate command can be slow, particularly on larger data sets.
The following example validates the contents of the collection named users.
{ validate: "users" }
scandata:
false skips the scan of the base collection without skipping the scan of the index.
Use one of the following forms to perform the full collection validation:
db.collection.validate(true)
db.runCommand( { validate: "collection", full: true } )
Warning: This command is resource intensive and may have an impact on the performance of your MongoDB instance.
771
Output
validate.ns
The full namespace name of the collection. Namespaces include the database name and the collection name in
the form database.collection.
validate.firstExtent
The disk location of the first extent in the collection. The value of this field also includes the namespace.
validate.lastExtent
The disk location of the last extent in the collection. The value of this field also includes the namespace.
validate.extentCount
The number of extents in the collection.
validate.extents
validate (page 771) returns one instance of this document for every extent in the collection. This subdocument is only returned when you specify the full option to the command.
validate.extents.loc
The disk location for the beginning of this extent.
validate.extents.xnext
The disk location for the extent following this one. null if this is the end of the linked list of extents.
validate.extents.xprev
The disk location for the extent preceding this one. null if this is the head of the linked list of extents.
validate.extents.nsdiag
The namespace this extent belongs to (should be the same as the namespace shown at the beginning of the
validate listing).
validate.extents.size
The number of bytes in this extent.
validate.extents.firstRecord
The disk location of the first record in this extent.
validate.extents.lastRecord
The disk location of the last record in this extent.
validate.datasize
The number of bytes in all data records. This value does not include deleted records, nor does it include extent
headers, nor record headers, nor space in a file unallocated to any extent. datasize (page 772) includes record
padding.
validate.nrecords
The number of documents in the collection.
validate.lastExtentSize
The size of the last new extent created in this collection. This value determines the size of the next extent created.
validate.padding
A floating point value between 1 and 2.
When MongoDB creates a new record it uses the padding factor to determine how much additional space to add
to the record. The padding factor is automatically adjusted by mongo when it notices that update operations are
triggering record moves.
validate.firstExtentDetails
The size of the first extent created in this collection. This data is similar to the data provided by the extents
(page 772) sub-document; however, the data reflects only the first extent in the collection and is always returned.
772
validate.firstExtentDetails.loc
The disk location for the beginning of this extent.
validate.firstExtentDetails.xnext
The disk location for the extent following this one. null if this is the end of the linked list of extents,
which should only be the case if there is only one extent.
validate.firstExtentDetails.xprev
The disk location for the extent preceding this one. This should always be null.
validate.firstExtentDetails.nsdiag
The namespace this extent belongs to (should be the same as the namespace shown at the beginning of the
validate listing).
validate.firstExtentDetails.size
The number of bytes in this extent.
validate.firstExtentDetails.firstRecord
The disk location of the first record in this extent.
validate.firstExtentDetails.lastRecord
The disk location of the last record in this extent.
validate.objectsFound
The number of records actually encountered in a scan of the collection. This field should have the same value
as the nrecords (page 772) field.
validate.invalidObjects
The number of records containing BSON documents that do not pass a validation check.
Note: This field is only included in the validation output when you specify the full option.
validate.bytesWithHeaders
This is similar to datasize, except that bytesWithHeaders (page 773) includes the record headers. In version
2.0, record headers are 16 bytes per document.
Note: This field is only included in the validation output when you specify the full option.
validate.bytesWithoutHeaders
bytesWithoutHeaders (page 773) returns data collected from a scan of all records. The value should be
the same as datasize (page 772).
Note: This field is only included in the validation output when you specify the full option.
validate.deletedCount
The number of deleted or free records in the collection.
validate.deletedSize
The size of all deleted or free records in the collection.
validate.nIndexes
The number of indexes on the data in the collection.
validate.keysPerIndex
A document containing a field for each index, named after the indexs name, that contains the number of keys,
or documents referenced, included in the index.
validate.valid
Boolean. true, unless validate (page 771) determines that an aspect of the collection is not valid. When
773
false, see the errors (page 774) field for more information.
validate.errors
Typically empty; however, if the collection is not valid (i.e valid (page 773) is false,) this field will contain a
message describing the validation error.
validate.ok
Set to 1 when the command succeeds. If the command fails the ok (page 774) field has a value of 0.
top
top
The top (page 774) command is an administrative command which returns raw usage of each database, and
provides amount of time, in microseconds, used and a count of operations for the following event types:
total
readLock
writeLock
queries
getmore
insert
update
remove
commands
You must issue the top (page 774) command against the admin database in the form:
{ top: 1 }
indexStats
Definition
indexStats
The indexStats (page 774) command aggregates statistics for the B-tree data structure that stores data for a
MongoDB index.
Warning: This command is not intended for production deployments.
The command can be run only on a mongod
--enableExperimentalIndexStatsCmd option.
(page
925)
instance
that
uses
the
774
Output The db.collection.indexStats() (page 811) method and equivalent indexStats (page 774)
command aggregate statistics for the B-tree data structure that stores data for a MongoDB index. The commands
aggregate statistics firstly for the entire B-tree and secondly for each individual level of the B-tree. The output displays
the following values.
indexStats.index
The index name (page 338).
indexStats.version
The index version.
For more information on index version numbers, see the v option in
db.collection.ensureIndex() (page 814).
indexStats.isIdIndex
If true, the index is the default _id index for the collection.
indexStats.keyPattern
The indexed keys.
indexStats.storageNs
The namespace of the indexs underlying storage.
indexStats.bucketBodyBytes
The fixed size, in bytes, of a B-tree bucket in the index, not including the record header. All indexes for a given
version have the same value for this field. MongoDB allocates fixed size buckets on disk.
indexStats.depth
The number of levels in the B-tree, not including the root level.
indexStats.overall
This section of the output displays statistics for the entire B-tree.
indexStats.overall.numBuckets
The number of buckets in the entire B-tree, including all levels.
indexStats.overall.keyCount
Statistics about the number of keys in a bucket, evaluated on a per-bucket level.
indexStats.overall.usedKeyCount
Statistics about the number of used keys in a bucket, evaluated on a per-bucket level. Used keys are keys
not marked as deleted.
indexStats.overall.bsonRatio
Statistics about the percentage of the bucket body that is occupied by the key objects themselves, excluding
associated metadata.
For example, if you have the document { name:
}, the key object is the string Bob Smith.
indexStats.overall.keyNodeRatio
Statistics about the percentage of the bucket body that is occupied by the key node objects (the metadata
and links pertaining to the keys). This does not include the key itself. In the current implementation, a key
nodes objects consist of: the pointer to the key data (in the same bucket), the pointer to the record the key
is for, and the pointer to a child bucket.
indexStats.overall.fillRatio
The sum of the bsonRatio (page 775) and the keyNodeRatio (page 775). This shows how full the
buckets are. This will be much higher for indexes with sequential inserts.
indexStats.perLevel
This section of the output displays statistics for each level of the B-tree separately, starting with the root level.
This section displays a different document for each B-tree level.
775
indexStats.perLevel.numBuckets
The number of buckets at this level of the B-tree.
indexStats.perLevel.keyCount
Statistics about the number of keys in a bucket, evaluated on a per-bucket level.
indexStats.perLevel.usedKeyCount
Statistics about the number of used keys in a bucket, evaluated on a per-bucket level. Used keys are keys
not marked as deleted.
indexStats.perLevel.bsonRatio
Statistics about the percentage of the bucket body that is occupied by the key objects themselves, excluding
associated metadata.
indexStats.perLevel.keyNodeRatio
Statistics about the percentage of the bucket body that is occupied by the key node objects (the metadata
and links pertaining to the keys).
indexStats.perLevel.fillRatio
The sum of the bsonRatio (page 776) and the keyNodeRatio (page 776). This shows how full the
buckets are. This will be much higher in the following cases:
For indexes with sequential inserts, such as the _id index when using ObjectId keys.
For indexes that were recently built in the foreground with existing data.
If you recently ran compact (page 752) or --repair.
Example The following is an example of db.collection.indexStats() (page 811) and indexStats
(page 774) output.
{
"index" : "type_1_traits_1",
"version" : 1,
"isIdIndex" : false,
"keyPattern" : {
"type" : 1,
"traits" : 1
},
"storageNs" : "test.animals.$type_1_traits_1",
"bucketBodyBytes" : 8154,
"depth" : 2,
"overall" : {
"numBuckets" : 45513,
"keyCount" : {
"count" : NumberLong(45513),
"mean" : 253.89602970579836,
"stddev" : 21.784799875240708,
"min" : 52,
"max" : 290,
"quantiles" : {
"0.01" : 201.99785091648775,
// ...
"0.99" : 289.9999655156967
}
},
"usedKeyCount" : {
"count" : NumberLong(45513),
// ...
"quantiles" : {
776
"0.01" : 201.99785091648775,
// ...
"0.99" : 289.9999655156967
}
},
"bsonRatio" : {
"count" : NumberLong(45513),
// ...
"quantiles" : {
"0.01" : 0.4267797891997124,
// ...
"0.99" : 0.5945548174629648
}
},
"keyNodeRatio" : {
"count" : NumberLong(45513),
// ...
"quantiles" : {
"0.01" : 0.3963656628236211,
// ...
"0.99" : 0.5690457993930765
}
},
"fillRatio" : {
"count" : NumberLong(45513),
// ...
"quantiles" : {
"0.01" : 0.9909134214926929,
// ...
"0.99" : 0.9960755457453732
}
}
},
"perLevel" : [
{
"numBuckets" : 1,
"keyCount" : {
"count" : NumberLong(1),
"mean" : 180,
"stddev" : 0,
"min" : 180,
"max" : 180
},
"usedKeyCount" : {
"count" : NumberLong(1),
// ...
"max" : 180
},
"bsonRatio" : {
"count" : NumberLong(1),
// ...
"max" : 0.3619082658817758
},
"keyNodeRatio" : {
"count" : NumberLong(1),
// ...
"max" : 0.35320088300220753
},
777
"fillRatio" : {
"count" : NumberLong(1),
// ...
"max" : 0.7151091488839834
}
},
{
"numBuckets" : 180,
"keyCount" : {
"count" : NumberLong(180),
"mean" : 250.84444444444443,
"stddev" : 26.30057503009355,
"min" : 52,
"max" : 290
},
"usedKeyCount" : {
"count" : NumberLong(180),
// ...
"max" : 290
},
"bsonRatio" : {
"count" : NumberLong(180),
// ...
"max" : 0.5945548197203826
},
"keyNodeRatio" : {
"count" : NumberLong(180),
// ...
"max" : 0.5690458670591121
},
"fillRatio" : {
"count" : NumberLong(180),
// ...
"max" : 0.9963208241353937
}
},
{
"numBuckets" : 45332,
"keyCount" : {
"count" : NumberLong(45332),
"mean" : 253.90977675813994,
"stddev" : 21.761620836279018,
"min" : 167,
"max" : 290,
"quantiles" : {
"0.01" : 202.0000012563603,
// ...
"0.99" : 289.99996486571894
}
},
"usedKeyCount" : {
"count" : NumberLong(45332),
// ...
"quantiles" : {
"0.01" : 202.0000012563603,
// ...
"0.99" : 289.99996486571894
}
778
},
"bsonRatio" : {
"count" : NumberLong(45332),
// ...
"quantiles" : {
"0.01" : 0.42678446958950583,
// ...
"0.99" : 0.5945548175411283
}
},
"keyNodeRatio" : {
"count" : NumberLong(45332),
// ...
"quantiles" : {
"0.01" : 0.39636988227885306,
// ...
"0.99" : 0.5690457981176729
}
},
"fillRatio" : {
"count" : NumberLong(45332),
// ...
"quantiles" : {
"0.01" : 0.9909246995605362,
// ...
"0.99" : 0.996075546919481
}
}
}
],
"ok" : 1
}
Additional Resources For more information on the commands limits and output, see the following:
The equivalent db.collection.indexStats() (page 811) method,
indexStats (page 774), and
https://github.com/mongodb-labs/storage-viz#readme.
whatsmyuri
whatsmyuri
whatsmyuri (page 779) is an internal command.
getLog
getLog
The getLog (page 779) command returns a document with a log array that contains recent messages from the
mongod (page 925) process log. The getLog (page 779) command has the following syntax:
{ getLog: <log> }
779
rs - if the mongod (page 925) is part of a replica set, getLog (page 779) will return recent notices
related to replica set activity.
startupWarnings - will return logs that may contain errors or warnings from MongoDBs log from
when the current process started. If mongod (page 925) started without warnings, this filter may return an
empty array.
You may also specify an asterisk (e.g. *) as the <log> value to return a list of available log filters. The
following interaction from the mongo (page 942) shell connected to a replica set:
db.adminCommand({getLog: "*" })
{ "names" : [ "global", "rs", "startupWarnings" ], "ok" : 1 }
getLog (page 779) returns events from a RAM cache of the mongod (page 925) events and does not read log
data from the log file.
hostInfo
hostInfo
New in version 2.2.
Returns A document with information about the underlying system that the mongod (page 925) or
mongos (page 938) runs on. Some of the returned fields are only included on some platforms.
You must run the hostInfo (page 780) command, which takes no arguments, against the admin database.
Consider the following invocations of hostInfo (page 780):
db.hostInfo()
db.adminCommand( { "hostInfo" : 1 } )
In the mongo (page 942) shell you can use db.hostInfo() (page 889) as a helper to access hostInfo
(page 780). The output of hostInfo (page 780) on a Linux system will resemble the following:
{
"system" : {
"currentTime" : ISODate("<timestamp>"),
"hostname" : "<hostname>",
"cpuAddrSize" : <number>,
"memSizeMB" : <number>,
"numCores" : <number>,
"cpuArch" : "<identifier>",
"numaEnabled" : <boolean>
},
"os" : {
"type" : "<string>",
"name" : "<string>",
"version" : "<string>"
},
"extra" : {
"versionString" : "<string>",
"libcVersion" : "<string>",
"kernelVersion" : "<string>",
"cpuFrequencyMHz" : "<string>",
"cpuFeatures" : "<string>",
"pageSize" : <number>,
"numPages" : <number>,
"maxOpenFiles" : <number>
},
"ok" : <return>
}
780
781
hostInfo.extra.nfsAsync
nfsAsync (page 781) only appears on OS X systems.
hostInfo.extra.cpuFrequencyMHz
Reports the clock speed of the systems processor in megahertz.
hostInfo.extra.cpuFeatures
Reports the processor feature flags.
On Linux systems this the same information that
http://docs.mongodb.org/manualproc/cpuinfo includes in the flags fields.
hostInfo.extra.pageSize
Reports the default system page size in bytes.
hostInfo.extra.numPages
numPages (page 782) only appears on Linux systems.
hostInfo.extra.maxOpenFiles
Reports the current system limits on open file handles. See UNIX ulimit Settings (page 225) for more
information.
maxOpenFiles (page 782) only appears on Linux systems.
hostInfo.extra.scheduler
Reports the active I/O scheduler. scheduler (page 782) only appears on OS X systems.
serverStatus
Definition
serverStatus
The serverStatus (page 782) command returns a document that provides an overview of the database
processs state. Most monitoring applications run this command at a regular interval to collection statistics
about the instance:
{ serverStatus: 1 }
The value (i.e. 1 above), does not affect the operation of the command.
Changed in version 2.4: In 2.4 you can dynamically suppress portions of the serverStatus (page 782)
output, or include suppressed sections by adding fields to the command document as in the following examples:
db.runCommand( { serverStatus: 1, repl: 0, indexCounters: 0 } )
db.runCommand( { serverStatus: 1, workingSet: 1, metrics: 0, locks: 0 } )
serverStatus (page 782) includes all fields by default, except workingSet (page 795), by default.
Note: You may only dynamically include top-level fields from the serverStatus (page 782) document that are
not included by default. You can exclude any field that serverStatus (page 782) includes by default.
See also:
db.serverStatus() (page 893) and http://docs.mongodb.org/manualreference/server-status
Output The serverStatus (page 782) command returns a collection of information that reflects the databases
status. These data are useful for diagnosing and assessing the performance of your MongoDB instance. This reference
catalogs each datum included in the output of this command and provides context for using this data to more effectively
administer your database.
See also:
782
Much of the output of serverStatus (page 782) is also displayed dynamically by mongostat (page 974). See
the mongostat (page 974) command for more information.
section
of
the
serverStatus.locks
The locks (page 783) document contains sub-documents that provides a granular report on MongoDB
database-level lock use. All values are of the NumberLong() type.
Generally, fields named:
R refer to the global read lock,
W refer to the global write lock,
r refer to the database specific read lock, and
w refer to the database specific write lock.
If a document does not have any fields, it means that no locks have existed with this context since the last time
the mongod (page 925) started.
serverStatus.locks..
A field named . holds the first document in locks (page 783) that contains information about the global lock.
783
serverStatus.locks...timeLockedMicros
The timeLockedMicros (page 783) document reports the amount of time in microseconds that a lock has
existed in all databases in this mongod (page 925) instance.
serverStatus.locks...timeLockedMicros.R
The R field reports the amount of time in microseconds that any database has held the global read lock.
serverStatus.locks...timeLockedMicros.W
The W field reports the amount of time in microseconds that any database has held the global write lock.
serverStatus.locks...timeLockedMicros.r
The r field reports the amount of time in microseconds that any database has held the local read lock.
serverStatus.locks...timeLockedMicros.w
The w field reports the amount of time in microseconds that any database has held the local write lock.
serverStatus.locks...timeAcquiringMicros
The timeAcquiringMicros (page 784) document reports the amount of time in microseconds that operations have spent waiting to acquire a lock in all databases in this mongod (page 925) instance.
serverStatus.locks...timeAcquiringMicros.R
The R field reports the amount of time in microseconds that any database has spent waiting for the global read
lock.
serverStatus.locks...timeAcquiringMicros.W
The W field reports the amount of time in microseconds that any database has spent waiting for the global write
lock.
serverStatus.locks.admin
The admin (page 784) document contains two sub-documents that report data regarding lock use in the admin
database.
serverStatus.locks.admin.timeLockedMicros
The timeLockedMicros (page 784) document reports the amount of time in microseconds that locks have
existed in the context of the admin database.
serverStatus.locks.admin.timeLockedMicros.r
The r field reports the amount of time in microseconds that the admin database has held the read lock.
serverStatus.locks.admin.timeLockedMicros.w
The w field reports the amount of time in microseconds that the admin database has held the write lock.
serverStatus.locks.admin.timeAcquiringMicros
The timeAcquiringMicros (page 784) document reports on the amount of field time in microseconds that
operations have spent waiting to acquire a lock for the admin database.
serverStatus.locks.admin.timeAcquiringMicros.r
The r field reports the amount of time in microseconds that operations have spent waiting to acquire a read lock
on the admin database.
serverStatus.locks.admin.timeAcquiringMicros.w
The w field reports the amount of time in microseconds that operations have spent waiting to acquire a write
lock on the admin database.
serverStatus.locks.local
The local (page 784) document contains two sub-documents that report data regarding lock use in the local
database. The local database contains a number of instance specific data, including the oplog for replication.
serverStatus.locks.local.timeLockedMicros
The timeLockedMicros (page 784) document reports on the amount of time in microseconds that locks
have existed in the context of the local database.
784
serverStatus.locks.local.timeLockedMicros.r
The r field reports the amount of time in microseconds that the local database has held the read lock.
serverStatus.locks.local.timeLockedMicros.w
The w field reports the amount of time in microseconds that the local database has held the write lock.
serverStatus.locks.local.timeAcquiringMicros
The timeAcquiringMicros (page 785) document reports on the amount of time in microseconds that
operations have spent waiting to acquire a lock for the local database.
serverStatus.locks.local.timeAcquiringMicros.r
The r field reports the amount of time in microseconds that operations have spent waiting to acquire a read lock
on the local database.
serverStatus.locks.local.timeAcquiringMicros.w
The w field reports the amount of time in microseconds that operations have spent waiting to acquire a write
lock on the local database.
serverStatus.locks.<database>
For each additional database locks (page 783) includes a document that reports on the lock use for this
database. The names of these documents reflect the database name itself.
serverStatus.locks.<database>.timeLockedMicros
The timeLockedMicros (page 785) document reports on the amount of time in microseconds that locks
have existed in the context of the <database> database.
serverStatus.locks.<database>.timeLockedMicros.r
The r field reports the amount of time in microseconds that the <database> database has held the read lock.
serverStatus.locks.<database>.timeLockedMicros.w
The w field reports the amount of time in microseconds that the <database> database has held the write lock.
serverStatus.locks.<database>.timeAcquiringMicros
The timeAcquiringMicros (page 785) document reports on the amount of time in microseconds that
operations have spent waiting to acquire a lock for the <database> database.
serverStatus.locks.<database>.timeAcquiringMicros.r
The r field reports the amount of time in microseconds that operations have spent waiting to acquire a read lock
on the <database> database.
serverStatus.locks.<database>.timeAcquiringMicros.w
The w field reports the amount of time in microseconds that operations have spent waiting to acquire a write
lock on the <database> database.
globalLock For an example of the globalLock output, see the globalLock
http://docs.mongodb.org/manualreference/server-status page.
section
of
the
serverStatus.globalLock
The globalLock (page 785) data structure contains information regarding the databases current lock state,
historical lock status, current operation queue, and the number of active clients.
serverStatus.globalLock.totalTime
The value of totalTime (page 785) represents the time, in microseconds, since the database last started and
creation of the globalLock (page 785). This is roughly equivalent to total server uptime.
serverStatus.globalLock.lockTime
The value of lockTime (page 785) represents the time, in microseconds, since the database last started, that
the globalLock (page 785) has been held.
Consider this value in combination with the value of totalTime (page 785). MongoDB aggregates these
values in the ratio (page 786) value. If the ratio (page 786) value is small but totalTime (page 785) is
11.1. MongoDB Interface
785
high the globalLock (page 785) has typically been held frequently for shorter periods of time, which may be
indicative of a more normal use pattern. If the lockTime (page 785) is higher and the totalTime (page 785)
is smaller (relatively,) then fewer operations are responsible for a greater portion of servers use (relatively.)
serverStatus.globalLock.ratio
Changed in version 2.2: ratio (page 786) was removed. See locks (page 783).
The value of ratio (page 786) displays the relationship between lockTime (page 785) and totalTime
(page 785).
Low values indicate that operations have held the globalLock (page 785) frequently for shorter periods of
time. High values indicate that operations have held globalLock (page 785) infrequently for longer periods
of time.
serverStatus.globalLock.currentQueue
The currentQueue (page 786) data structure value provides more granular information concerning the number of operations queued because of a lock.
serverStatus.globalLock.currentQueue.total
The value of total (page 786) provides a combined total of operations queued waiting for the lock.
A consistently small queue, particularly of shorter operations should cause no concern. Also, consider this value
in light of the size of queue waiting for the read lock (e.g. readers (page 786)) and write lock (e.g. writers
(page 786)) individually.
serverStatus.globalLock.currentQueue.readers
The value of readers (page 786) is the number of operations that are currently queued and waiting for the
read lock. A consistently small read-queue, particularly of shorter operations should cause no concern.
serverStatus.globalLock.currentQueue.writers
The value of writers (page 786) is the number of operations that are currently queued and waiting for the
write lock. A consistently small write-queue, particularly of shorter operations is no cause for concern.
globalLock.activeClients
serverStatus.globalLock.activeClients
The activeClients (page 786) data structure provides more granular information about the number of
connected clients and the operation types (e.g. read or write) performed by these clients.
Use this data to provide context for the currentQueue (page 786) data.
serverStatus.globalLock.activeClients.total
The value of total (page 786) is the total number of active client connections to the database. This combines
clients that are performing read operations (e.g. readers (page 786)) and clients that are performing write
operations (e.g. writers (page 786)).
serverStatus.globalLock.activeClients.readers
The value of readers (page 786) contains a count of the active client connections performing read operations.
serverStatus.globalLock.activeClients.writers
The value of writers (page 786) contains a count of active client connections performing write operations.
mem For
an
example
of
the
mem
output,
see
the
mem
http://docs.mongodb.org/manualreference/server-status page.
section
of
the
serverStatus.mem
The mem (page 786) data structure holds information regarding the target system architecture of mongod
(page 925) and current memory use.
786
serverStatus.mem.bits
The value of bits (page 786) is either 64 or 32, depending on which target architecture specified during the
mongod (page 925) compilation process. In most instances this is 64, and this value does not change over time.
serverStatus.mem.resident
The value of resident (page 787) is roughly equivalent to the amount of RAM, in megabytes (MB), currently
used by the database process. In normal use this value tends to grow. In dedicated database servers this number
tends to approach the total amount of system memory.
serverStatus.mem.virtual
virtual (page 787) displays the quantity, in megabytes (MB), of virtual memory used by the mongod
(page 925) process. With journaling enabled, the value of virtual (page 787) is at least twice the value
of mapped (page 787).
If virtual (page 787) value is significantly larger than mapped (page 787) (e.g. 3 or more times), this may
indicate a memory leak.
serverStatus.mem.supported
supported (page 787) is true when the underlying system supports extended memory information. If this
value is false and the system does not support extended memory information, then other mem (page 786) values
may not be accessible to the database server.
serverStatus.mem.mapped
The value of mapped (page 787) provides the amount of mapped memory, in megabytes (MB), by the database.
Because MongoDB uses memory-mapped files, this value is likely to be to be roughly equivalent to the total
size of your database or databases.
serverStatus.mem.mappedWithJournal
mappedWithJournal (page 787) provides the amount of mapped memory, in megabytes (MB), including
the memory used for journaling. This value will always be twice the value of mapped (page 787). This field is
only included if journaling is enabled.
connections For an example of the connections output, see the connections section of the
http://docs.mongodb.org/manualreference/server-status page.
serverStatus.connections
The connections (page 787) sub document data regarding the current connection status and availability of
the database server. Use these values to asses the current load and capacity requirements of the server.
serverStatus.connections.current
The value of current (page 787) corresponds to the number of connections to the database server from
clients. This number includes the current shell session. Consider the value of available (page 787) to add
more context to this datum.
This figure will include the current shell connection as well as any inter-node connections to support a replica
set or sharded cluster.
serverStatus.connections.available
available (page 787) provides a count of the number of unused available connections that the database
can provide. Consider this value in combination with the value of current (page 787) to understand the
connection load on the database, and the UNIX ulimit Settings (page 225) document for more information about
system thresholds on available connections.
serverStatus.connections.totalCreated
totalCreated (page 787) provides a count of all connections created to the server. This number includes
connections that have since closed.
787
section
of
the
serverStatus.extra_info
The extra_info (page 788) data structure holds data collected by the mongod (page 925) instance about the
underlying system. Your system may only report a subset of these fields.
serverStatus.extra_info.note
The field note (page 788) reports that the data in this structure depend on the underlying platform, and has the
text: fields vary by platform.
serverStatus.extra_info.heap_usage_bytes
The heap_usage_bytes (page 788) field is only available on Unix/Linux systems, and reports the total size
in bytes of heap space used by the database process.
serverStatus.extra_info.page_faults
The page_faults (page 788) Reports the total number of page faults that require disk operations. Page
faults refer to operations that require the database server to access data which isnt available in active memory.
The page_faults (page 788) counter may increase dramatically during moments of poor performance and
may correlate with limited memory environments and larger data sets. Limited and sporadic page faults do not
necessarily indicate an issue.
indexCounters For an example of the indexCounters output, see the indexCounters section of the
http://docs.mongodb.org/manualreference/server-status page.
serverStatus.indexCounters
Changed in version 2.2: Previously, data in the indexCounters (page 788) document reported sampled data,
and were only useful in relative comparison to each other, because they could not reflect absolute index use. In
2.2 and later, these data reflect actual index use.
Changed in version 2.4: Fields previously in the btree sub-document of indexCounters (page 788) are
now fields in the indexCounters (page 788) document.
The indexCounters (page 788) data structure reports information regarding the state and use of indexes in
MongoDB.
serverStatus.indexCounters.accesses
accesses (page 788) reports the number of times that operations have accessed indexes. This value is the
combination of the hits (page 788) and misses (page 788). Higher values indicate that your database has
indexes and that queries are taking advantage of these indexes. If this number does not grow over time, this
might indicate that your indexes do not effectively support your use.
serverStatus.indexCounters.hits
The hits (page 788) value reflects the number of times that an index has been accessed and mongod (page 925)
is able to return the index from memory.
A higher value indicates effective index use. hits (page 788) values that represent a greater proportion of the
accesses (page 788) value, tend to indicate more effective index configuration.
serverStatus.indexCounters.misses
The misses (page 788) value represents the number of times that an operation attempted to access an index
that was not in memory. These misses, do not indicate a failed query or operation, but rather an inefficient use
of the index. Lower values in this field indicate better index use and likely overall performance as well.
serverStatus.indexCounters.resets
The resets (page 788) value reflects the number of times that the index counters have been reset since the
database last restarted. Typically this value is 0, but use this value to provide context for the data specified by
other indexCounters (page 788) values.
788
serverStatus.indexCounters.missRatio
The missRatio (page 788) value is the ratio of hits (page 788) to misses (page 788). This value is
typically 0 or approaching 0.
backgroundFlushing For an example of the backgroundFlushing output, see the backgroundFlushing section
of the http://docs.mongodb.org/manualreference/server-status page.
serverStatus.backgroundFlushing
mongod (page 925) periodically flushes writes to disk. In the default configuration, this happens every 60
seconds. The backgroundFlushing (page 789) data structure contains data regarding these operations.
Consider these values if you have concerns about write performance and journaling (page 793).
serverStatus.backgroundFlushing.flushes
flushes (page 789) is a counter that collects the number of times the database has flushed all writes to disk.
This value will grow as database runs for longer periods of time.
serverStatus.backgroundFlushing.total_ms
The total_ms (page 789) value provides the total number of milliseconds (ms) that the mongod (page 925)
processes have spent writing (i.e. flushing) data to disk. Because this is an absolute value, consider the value of
flushes (page 789) and average_ms (page 789) to provide better context for this datum.
serverStatus.backgroundFlushing.average_ms
The average_ms (page 789) value describes the relationship between the number of flushes and the total
amount of time that the database has spent writing data to disk. The larger flushes (page 789) is, the more
likely this value is likely to represent a normal, time; however, abnormal data can skew this value.
Use the last_ms (page 789) to ensure that a high average is not skewed by transient historical issue or a
random write distribution.
serverStatus.backgroundFlushing.last_ms
The value of the last_ms (page 789) field is the amount of time, in milliseconds, that the last flush operation
took to complete. Use this value to verify that the current performance of the server and is in line with the
historical data provided by average_ms (page 789) and total_ms (page 789).
serverStatus.backgroundFlushing.last_finished
The last_finished (page 789) field provides a timestamp of the last completed flush operation in the
ISODate format. If this value is more than a few minutes old relative to your servers current time and accounting
for differences in time zone, restarting the database may result in some data loss.
Also consider ongoing operations that might skew this value by routinely block write operations.
cursors For an example of the cursors output,
see the cursors
http://docs.mongodb.org/manualreference/server-status page.
section
of
the
serverStatus.cursors
The cursors (page 789) data structure contains data regarding cursor state and use.
serverStatus.cursors.totalOpen
totalOpen (page 789) provides the number of cursors that MongoDB is maintaining for clients. Because
MongoDB exhausts unused cursors, typically this value small or zero. However, if there is a queue, stale
tailable cursors, or a large number of operations this value may rise.
serverStatus.cursors.clientCursors_size
Deprecated since version 1.x: See totalOpen (page 789) for this datum.
serverStatus.cursors.timedOut
timedOut (page 789) provides a counter of the total number of cursors that have timed out since the server
process started. If this number is large or growing at a regular rate, this may indicate an application error.
789
section
of
the
serverStatus.network
The network (page 790) data structure contains data regarding MongoDBs network use.
serverStatus.network.bytesIn
The value of the bytesIn (page 790) field reflects the amount of network traffic, in bytes, received by this
database. Use this value to ensure that network traffic sent to the mongod (page 925) process is consistent with
expectations and overall inter-application traffic.
serverStatus.network.bytesOut
The value of the bytesOut (page 790) field reflects the amount of network traffic, in bytes, sent from this
database. Use this value to ensure that network traffic sent by the mongod (page 925) process is consistent with
expectations and overall inter-application traffic.
serverStatus.network.numRequests
The numRequests (page 790) field is a counter of the total number of distinct requests that the server has
received. Use this value to provide context for the bytesIn (page 790) and bytesOut (page 790) values to
ensure that MongoDBs network utilization is consistent with expectations and application use.
repl For
an
example
of
the
repl
output,
see
the
repl
http://docs.mongodb.org/manualreference/server-status page.
section
of
the
serverStatus.repl
The repl (page 790) data structure contains status information for MongoDBs replication (i.e. replica set)
configuration. These values only appear when the current host has replication enabled.
See Replication (page 377) for more information on replication.
serverStatus.repl.setName
The setName (page 790) field contains a string with the name of the current replica set. This value reflects the
--replSet command line argument, or replSet (page 1000) value in the configuration file.
See Replication (page 377) for more information on replication.
serverStatus.repl.ismaster
The value of the ismaster (page 790) field is either true or false and reflects whether the current node is
the master or primary node in the replica set.
See Replication (page 377) for more information on replication.
serverStatus.repl.secondary
The value of the secondary (page 790) field is either true or false and reflects whether the current node
is a secondary node in the replica set.
See Replication (page 377) for more information on replication.
serverStatus.repl.hosts
hosts (page 790) is an array that lists the other nodes in the current replica set. Each member of the replica set
appears in the form of hostname:port.
See Replication (page 377) for more information on replication.
opcountersRepl For an example of the opcountersRepl output, see the opcountersRepl section of the
http://docs.mongodb.org/manualreference/server-status page.
serverStatus.opcountersRepl
The opcountersRepl (page 790) data structure, similar to the opcounters (page 791) data structure,
790
provides an overview of database replication operations by type and makes it possible to analyze the load on the
replica in more granular manner. These values only appear when the current host has replication enabled.
These values will differ from the opcounters (page 791) values because of how MongoDB serializes operations during replication. See Replication (page 377) for more information on replication.
These numbers will grow over time in response to database use. Analyze these values over time to track database
utilization.
serverStatus.opcountersRepl.insert
insert (page 791) provides a counter of the total number of replicated insert operations since the mongod
(page 925) instance last started.
serverStatus.opcountersRepl.query
query (page 791) provides a counter of the total number of replicated queries since the mongod (page 925)
instance last started.
serverStatus.opcountersRepl.update
update (page 791) provides a counter of the total number of replicated update operations since the mongod
(page 925) instance last started.
serverStatus.opcountersRepl.delete
delete (page 791) provides a counter of the total number of replicated delete operations since the mongod
(page 925) instance last started.
serverStatus.opcountersRepl.getmore
getmore (page 791) provides a counter of the total number of getmore operations since the mongod
(page 925) instance last started. This counter can be high even if the query count is low. Secondary nodes
send getMore operations as part of the replication process.
serverStatus.opcountersRepl.command
command (page 791) provides a counter of the total number of replicated commands issued to the database
since the mongod (page 925) instance last started.
opcounters For an example of the opcounters output, see the opcounters
http://docs.mongodb.org/manualreference/server-status page.
section
of
the
serverStatus.opcounters
The opcounters (page 791) data structure provides an overview of database operations by type and makes it
possible to analyze the load on the database in more granular manner.
These numbers will grow over time and in response to database use. Analyze these values over time to track
database utilization.
Note: The data in opcounters (page 791) treats operations that affect multiple documents, such as bulk insert
or multi-update operations, as a single operation. See document (page 795) for more granular document-level
operation tracking.
serverStatus.opcounters.insert
insert (page 791) provides a counter of the total number of insert operations since the mongod (page 925)
instance last started.
serverStatus.opcounters.query
query (page 791) provides a counter of the total number of queries since the mongod (page 925) instance last
started.
serverStatus.opcounters.update
update (page 791) provides a counter of the total number of update operations since the mongod (page 925)
instance last started.
791
serverStatus.opcounters.delete
delete (page 791) provides a counter of the total number of delete operations since the mongod (page 925)
instance last started.
serverStatus.opcounters.getmore
getmore (page 792) provides a counter of the total number of getmore operations since the mongod
(page 925) instance last started. This counter can be high even if the query count is low. Secondary nodes
send getMore operations as part of the replication process.
serverStatus.opcounters.command
command (page 792) provides a counter of the total number of commands issued to the database since the
mongod (page 925) instance last started.
asserts For an example of the asserts output,
see the asserts
http://docs.mongodb.org/manualreference/server-status page.
section
of
the
serverStatus.asserts
The asserts (page 792) document reports the number of asserts on the database. While assert errors are
typically uncommon, if there are non-zero values for the asserts (page 792), you should check the log file
for the mongod (page 925) process for more information. In many cases these errors are trivial, but are worth
investigating.
serverStatus.asserts.regular
The regular (page 792) counter tracks the number of regular assertions raised since the server process started.
Check the log file for more information about these messages.
serverStatus.asserts.warning
The warning (page 792) counter tracks the number of warnings raised since the server process started. Check
the log file for more information about these warnings.
serverStatus.asserts.msg
The msg (page 792) counter tracks the number of message assertions raised since the server process started.
Check the log file for more information about these messages.
serverStatus.asserts.user
The user (page 792) counter reports the number of user asserts that have occurred since the last time the
server process started. These are errors that user may generate, such as out of disk space or duplicate key. You
can prevent these assertions by fixing a problem with your application or deployment. Check the MongoDB log
for more information.
serverStatus.asserts.rollovers
The rollovers (page 792) counter displays the number of times that the rollover counters have rolled over
since the last time the server process started. The counters will rollover to zero after 230 assertions. Use this
value to provide context to the other values in the asserts (page 792) data structure.
writeBacksQueued For an example of the writeBacksQueued output, see the writeBacksQueued section of the
http://docs.mongodb.org/manualreference/server-status page.
serverStatus.writeBacksQueued
The value of writeBacksQueued (page 792) is true when there are operations from a mongos (page 938)
instance queued for retrying. Typically this option is false.
See also:
writeBacks
792
section
of
the
serverStatus.dur
The dur (page 793) (for durability) document contains data regarding the mongod (page 925)s journalingrelated operations and performance. mongod (page 925) must be running with journaling for these data to
appear in the output of serverStatus (page 782).
Note: The data values are not cumulative but are reset on a regular basis as determined by the journal group
commit interval (page 176). This interval is ~100 milliseconds (ms) by default (or 30ms if the journal file is on
the same file system as your data files) and is cut by 2/3 when there is a getLastError (page 720) command
pending. The interval is configurable using the --journalCommitInterval option.
See also:
Journaling Mechanics (page 232) for more information about journaling operations.
serverStatus.dur.commits
The commits (page 793) provides the number of transactions written to the journal during the last journal
group commit interval (page 176).
serverStatus.dur.journaledMB
The journaledMB (page 793) provides the amount of data in megabytes (MB) written to journal during the
last journal group commit interval (page 233).
serverStatus.dur.writeToDataFilesMB
The writeToDataFilesMB (page 793) provides the amount of data in megabytes (MB) written from journal
to the data files during the last journal group commit interval (page 233).
serverStatus.dur.compression
New in version 2.0.
The compression (page 793) represents the compression ratio of the data written to the journal:
( journaled_size_of_data / uncompressed_size_of_data )
serverStatus.dur.commitsInWriteLock
The commitsInWriteLock (page 793) provides a count of the commits that occurred while a write lock was
held. Commits in a write lock indicate a MongoDB node under a heavy write load and call for further diagnosis.
serverStatus.dur.earlyCommits
The earlyCommits (page 793) value reflects the number of times MongoDB requested a commit before the
scheduled journal group commit interval (page 233). Use this value to ensure that your journal group commit
interval (page 176) is not too long for your deployment.
serverStatus.dur.timeMS
The timeMS (page 793) document provides information about the performance of the mongod (page 925)
instance during the various phases of journaling in the last journal group commit interval (page 176).
serverStatus.dur.timeMS.dt
The dt (page 793) value provides, in milliseconds, the amount of time over which MongoDB collected the
timeMS (page 793) data. Use this field to provide context to the other timeMS (page 793) field values.
serverStatus.dur.timeMS.prepLogBuffer
The prepLogBuffer (page 793) value provides, in milliseconds, the amount of time spent preparing to write
to the journal. Smaller values indicate better journal performance.
793
serverStatus.dur.timeMS.writeToJournal
The writeToJournal (page 793) value provides, in milliseconds, the amount of time spent actually writing
to the journal. File system speeds and device interfaces can affect performance.
serverStatus.dur.timeMS.writeToDataFiles
The writeToDataFiles (page 794) value provides, in milliseconds, the amount of time spent writing to
data files after journaling. File system speeds and device interfaces can affect performance.
serverStatus.dur.timeMS.remapPrivateView
The remapPrivateView (page 794) value provides, in milliseconds, the amount of time spent remapping
copy-on-write memory mapped views. Smaller values indicate better journal performance.
recordStats For an example of the recordStats output, see the recordStats section of the
http://docs.mongodb.org/manualreference/server-status page.
serverStatus.recordStats
The recordStats (page 794) document provides fine grained reporting on page faults on a per database level.
MongoDB uses a read lock on each database to return recordStats (page 794). To minimize this overhead,
you can disable this section, as in the following operation:
db.serverStatus( { recordStats: 0 } )
serverStatus.recordStats.accessesNotInMemory
accessesNotInMemory (page 794) reflects the number of times mongod (page 925) needed to access a
memory page that was not resident in memory for all databases managed by this mongod (page 925) instance.
serverStatus.recordStats.pageFaultExceptionsThrown
pageFaultExceptionsThrown (page 794) reflects the number of page fault exceptions thrown by
mongod (page 925) when accessing data for all databases managed by this mongod (page 925) instance.
serverStatus.recordStats.local.accessesNotInMemory
accessesNotInMemory (page 794) reflects the number of times mongod (page 925) needed to access a
memory page that was not resident in memory for the local database.
serverStatus.recordStats.local.pageFaultExceptionsThrown
pageFaultExceptionsThrown (page 794) reflects the number of page fault exceptions thrown by
mongod (page 925) when accessing data for the local database.
serverStatus.recordStats.admin.accessesNotInMemory
accessesNotInMemory (page 794) reflects the number of times mongod (page 925) needed to access a
memory page that was not resident in memory for the admin database.
serverStatus.recordStats.admin.pageFaultExceptionsThrown
pageFaultExceptionsThrown (page 794) reflects the number of page fault exceptions thrown by
mongod (page 925) when accessing data for the admin database.
serverStatus.recordStats.<database>.accessesNotInMemory
accessesNotInMemory (page 794) reflects the number of times mongod (page 925) needed to access a
memory page that was not resident in memory for the <database> database.
serverStatus.recordStats.<database>.pageFaultExceptionsThrown
pageFaultExceptionsThrown (page 794) reflects the number of page fault exceptions thrown by
mongod (page 925) when accessing data for the <database> database.
workingSet New in version 2.4.
Note: The workingSet (page 795) data is only included in the output of serverStatus (page 782) if explicitly
enabled. To return the workingSet (page 795), use one of the following commands:
794
db.serverStatus( { workingSet: 1 } )
db.runCommand( { serverStatus: 1, workingSet: 1 } )
For
an
example
of
the
workingSet
output,
see
the
workingSet
http://docs.mongodb.org/manualreference/server-status page.
section
of
the
serverStatus.workingSet
workingSet (page 795) is a document that contains values useful for estimating the size of the working set,
which is the amount of data that MongoDB uses actively. workingSet (page 795) uses an internal data
structure that tracks pages accessed by mongod (page 925).
serverStatus.workingSet.note
note (page 795) is a field that holds a string warning that the workingSet (page 795) document is an
estimate.
serverStatus.workingSet.pagesInMemory
pagesInMemory (page 795) contains a count of the total number of pages accessed by mongod (page 925)
over the period displayed in overSeconds (page 795). The default page size is 4 kilobytes: to convert this
value to the amount of data in memory multiply this value by 4 kilobytes.
If your total working set is less than the size of physical memory, over time the value of pagesInMemory
(page 795) will reflect your data size.
Use pagesInMemory (page 795) in conjunction with overSeconds (page 795) to help estimate the actual
size of the working set.
serverStatus.workingSet.computationTimeMicros
computationTimeMicros (page 795) reports the amount of time the mongod (page 925) instance used to
compute the other fields in the workingSet (page 795) section.
Reporting on workingSet (page 795) may impact the performance of other operations on the mongod
(page 925) instance because MongoDB must collect some data within the context of a lock. Ensure that automated monitoring tools consider this metric when determining the frequency of collection for workingSet
(page 795).
serverStatus.workingSet.overSeconds
overSeconds (page 795) returns the amount of time elapsed between the newest and oldest pages tracked in
the pagesInMemory (page 795) data point.
If overSeconds (page 795) is decreasing, or if pagesInMemory (page 795) equals physical RAM and
overSeconds (page 795) is very small, the working set may be much larger than physical RAM.
When overSeconds (page 795) is large, MongoDBs data set is equal to or smaller than physical RAM.
metrics For
an
example
of
the
metrics
output,
see
the
metrics
http://docs.mongodb.org/manualreference/server-status page.
section
of
the
795
serverStatus.metrics.document.deleted
deleted (page 795) reports the total number of documents deleted.
serverStatus.metrics.document.inserted
inserted (page 796) reports the total number of documents inserted.
serverStatus.metrics.document.returned
returned (page 796) reports the total number of documents returned by queries.
serverStatus.metrics.document.updated
updated (page 796) reports the total number of documents updated.
serverStatus.metrics.getLastError
getLastError (page 796) is a document that reports on getLastError (page 720) use.
serverStatus.metrics.getLastError.wtime
wtime (page 796) is a sub-document that reports getLastError (page 720) operation counts with a w
argument greater than 1.
serverStatus.metrics.getLastError.wtime.num
num (page 796) reports the total number of getLastError (page 720) operations with a specified write
concern (i.e. w) that wait for one or more members of a replica set to acknowledge the write operation (i.e. a w
value greater than 1.)
serverStatus.metrics.getLastError.wtime.totalMillis
totalMillis (page 796) reports the total amount of time in milliseconds that the mongod (page 925) has
spent performing getLastError (page 720) operations with write concern (i.e. w) that wait for one or more
members of a replica set to acknowledge the write operation (i.e. a w value greater than 1.)
serverStatus.metrics.getLastError.wtimeouts
wtimeouts (page 796) reports the number of times that write concern operations have timed out as a result of
the wtimeout threshold to getLastError (page 720).
serverStatus.metrics.operation
operation (page 796) is a sub-document that holds counters for several types of update and query operations
that MongoDB handles using special operation types.
serverStatus.metrics.operation.fastmod
fastmod (page 796) reports the number of update (page 50) operations that neither cause documents to grow
nor require updates to the index. For example, this counter would record an update operation that use the $inc
(page 651) operator to increment the value of a field that is not indexed.
serverStatus.metrics.operation.idhack
idhack (page 796) reports the number of queries that contain the _id field. For these queries, MongoDB will
use default index on the _id field and skip all query plan analysis.
serverStatus.metrics.operation.scanAndOrder
scanAndOrder (page 796) reports the total number of queries that return sorted numbers that cannot perform
the sort operation using an index.
serverStatus.metrics.queryExecutor
queryExecutor (page 796) is a document that reports data from the query execution system.
serverStatus.metrics.queryExecutor.scanned
scanned (page 796) reports the total number of index items scanned during queries and query-plan evaluation.
This counter is the same as nscanned (page 864) in the output of explain() (page 861).
serverStatus.metrics.record
record (page 796) is a document that reports data related to record allocation in the on-disk memory files.
serverStatus.metrics.record.moves
moves (page 796) reports the total number of times documents move within the on-disk representation of the
796
MongoDB data set. Documents move as a result of operations that increase the size of the document beyond
their allocated record size.
serverStatus.metrics.repl
repl (page 797) holds a sub-document that reports metrics related to the replication process. repl (page 797)
document appears on all mongod (page 925) instances, even those that arent members of replica sets.
serverStatus.metrics.repl.apply
apply (page 797) holds a sub-document that reports on the application of operations from the replication oplog.
serverStatus.metrics.repl.apply.batches
batches (page 797) reports on the oplog application process on secondaries members of replica sets. See
Multithreaded Replication (page 412) for more information on the oplog application processes
serverStatus.metrics.repl.apply.batches.num
num (page 797) reports the total number of batches applied across all databases.
serverStatus.metrics.repl.apply.batches.totalMillis
totalMillis (page 797) reports the total amount of time the mongod (page 925) has spent applying operations from the oplog.
serverStatus.metrics.repl.apply.ops
ops (page 797) reports the total number of oplog operations applied.
serverStatus.metrics.repl.buffer
MongoDB buffers oplog operations from the replication sync source buffer before applying oplog entries in a
batch. buffer (page 797) provides a way to track the oplog buffer. See Multithreaded Replication (page 412)
for more information on the oplog application process.
serverStatus.metrics.repl.buffer.count
count (page 797) reports the current number of operations in the oplog buffer.
serverStatus.metrics.repl.buffer.maxSizeBytes
maxSizeBytes (page 797) reports the maximum size of the buffer. This value is a constant setting in the
mongod (page 925), and is not configurable.
serverStatus.metrics.repl.buffer.sizeBytes
sizeBytes (page 797) reports the current size of the contents of the oplog buffer.
serverStatus.metrics.repl.network
network (page 797) reports network use by the replication process.
serverStatus.metrics.repl.network.bytes
bytes (page 797) reports the total amount of data read from the replication sync source.
serverStatus.metrics.repl.network.getmores
getmores (page 797) reports on the getmore operations, which are requests for additional results from the
oplog cursor as part of the oplog replication process.
serverStatus.metrics.repl.network.getmores.num
num (page 797) reports the total number of getmore operations, which are operations that request an additional
set of operations from the replication sync source.
serverStatus.metrics.repl.network.getmores.totalMillis
totalMillis (page 797) reports the total amount of time required to collect data from getmore operations.
Note: This number can be quite large, as MongoDB will wait for more data even if the getmore operation
does not initial return data.
serverStatus.metrics.repl.network.ops
ops (page 797) reports the total number of operations read from the replication source.
797
serverStatus.metrics.repl.network.readersCreated
readersCreated (page 797) reports the total number of oplog query processes created. MongoDB will
create a new oplog query any time an error occurs in the connection, including a timeout, or a network operation.
Furthermore, readersCreated (page 797) will increment every time MongoDB selects a new source fore
replication.
serverStatus.metrics.repl.oplog
oplog (page 798) is a document that reports on the size and use of the oplog by this mongod (page 925)
instance.
serverStatus.metrics.repl.oplog.insert
insert (page 798) is a document that reports insert operations into the oplog.
serverStatus.metrics.repl.oplog.insert.num
num (page 798) reports the total number of items inserted into the oplog.
serverStatus.metrics.repl.oplog.insert.totalMillis
totalMillis (page 798) reports the total amount of time spent for the mongod (page 925) to insert data into
the oplog.
serverStatus.metrics.repl.oplog.insertBytes
insertBytes (page 798) the total size of documents inserted into the oplog.
serverStatus.metrics.repl.preload
preload (page 798) reports on the pre-fetch stage, where MongoDB loads documents and indexes into
RAM to improve replication throughput.
See Multithreaded Replication (page 412) for more information about the pre-fetch stage of the replication
process.
serverStatus.metrics.repl.preload.docs
docs (page 798) is a sub-document that reports on the documents loaded into memory during the pre-fetch
stage.
serverStatus.metrics.repl.preload.docs.num
num (page 798) reports the total number of documents loaded during the pre-fetch stage of replication.
serverStatus.metrics.repl.preload.docs.totalMillis
totalMillis (page 798) reports the total amount of time spent loading documents as part of the pre-fetch
stage of replication.
serverStatus.metrics.repl.preload.indexes
indexes (page 798) is a sub-document that reports on the index items loaded into memory during the pre-fetch
stage of replication.
See Multithreaded Replication (page 412) for more information about the pre-fetch stage of replication.
serverStatus.metrics.repl.preload.indexes.num
num (page 798) reports the total number of index entries loaded by members before updating documents as part
of the pre-fetch stage of replication.
serverStatus.metrics.repl.preload.indexes.totalMillis
totalMillis (page 798) reports the total amount of time spent loading index entries as part of the pre-fetch
stage of replication.
serverStatus.metrics.ttl
ttl (page 798) is a sub-document that reports on the operation of the resource use of the ttl index (page 158)
process.
serverStatus.metrics.ttl.deletedDocuments
deletedDocuments (page 798) reports the total number of documents deleted from collections with a ttl
index (page 158).
798
serverStatus.metrics.ttl.passes
passes (page 798) reports the number of times the background process removes documents from collections
with a ttl index (page 158).
features
features
features (page 799) is an internal command that returns the build-level feature settings.
isSelf
_isSelf
_isSelf (page 799) is an internal command.
Internal Commands
Internal Commands
Name
handshake (page 799)
_recvChunkAbort (page 799)
_recvChunkCommit
(page 800)
_recvChunkStart (page 800)
_recvChunkStatus
(page 800)
_replSetFresh
mapreduce.shardedfinish
(page 800)
_transferMods (page 800)
replSetHeartbeat
(page 800)
replSetGetRBID (page 800)
_migrateClone (page 800)
replSetElect (page 800)
writeBacksQueued
(page 801)
writebacklisten (page 801)
Description
Internal command.
Internal command that supports chunk migrations in sharded clusters. Do not
call directly.
Internal command that supports chunk migrations in sharded clusters. Do not
call directly.
Internal command that facilitates chunk migrations in sharded clusters.. Do
not call directly.
Internal command that returns data to support chunk migrations in sharded
clusters. Do not call directly.
Internal command that supports replica set election operations.
Internal command that supports map-reduce in sharded cluster
environments.
Internal command that supports chunk migrations. Do not call directly.
Internal command that supports replica set operations.
Internal command that supports replica set operations.
Internal command that supports chunk migration. Do not call directly.
Internal command that supports replica set functionality.
Internal command that supports chunk migrations in sharded clusters.
Internal command that supports chunk migrations in sharded clusters.
handshake
handshake
handshake (page 799) is an internal command.
recvChunkAbort
_recvChunkAbort
_recvChunkAbort (page 799) is an internal command. Do not call directly.
799
recvChunkCommit
_recvChunkCommit
_recvChunkCommit (page 800) is an internal command. Do not call directly.
recvChunkStart
_recvChunkStart
_recvChunkStart (page 800) is an internal command. Do not call directly.
Warning: This command obtains a write lock on the affected database and will block other operations until
it has completed.
recvChunkStatus
_recvChunkStatus
_recvChunkStatus (page 800) is an internal command. Do not call directly.
replSetFresh
replSetFresh
replSetFresh (page 800) is an internal command that supports replica set functionality.
mapreduce.shardedfinish
mapreduce.shardedfinish
Provides internal functionality to support map-reduce in sharded environments.
See also:
mapReduce (page 701)
transferMods
_transferMods
_transferMods (page 800) is an internal command. Do not call directly.
replSetHeartbeat
replSetHeartbeat
replSetHeartbeat (page 800) is an internal command that supports replica set functionality.
replSetGetRBID
replSetGetRBID
replSetGetRBID (page 800) is an internal command that supports replica set functionality.
migrateClone
_migrateClone
_migrateClone (page 800) is an internal command. Do not call directly.
replSetElect
replSetElect
replSetElect (page 800) is an internal command that support replica set functionality.
800
writeBacksQueued
writeBacksQueued
writeBacksQueued (page 801) is an internal command that returns a document reporting there are operations in the write back queue for the given mongos (page 938) and information about the queues.
writeBacksQueued.hasOpsQueued
Boolean.
hasOpsQueued (page 801) is true if there are write Back operations queued.
writeBacksQueued.totalOpsQueued
Integer.
totalOpsQueued (page 801) reflects the number of operations queued.
writeBacksQueued.queues
Document.
queues (page 801) holds a sub-document where the fields are all write back queues. These field hold a
document with two fields that reports on the state of the queue. The fields in these documents are:
writeBacksQueued.queues.n
n (page 801) reflects the size, by number of items, in the queues.
writeBacksQueued.queues.minutesSinceLastCall
The number of minutes since the last time the mongos (page 938) touched this queue.
The command document has the following prototype form:
{writeBacksQueued: 1}
To call writeBacksQueued (page 801) from the mongo (page 942) shell, use the following
db.runCommand() (page 893) form:
db.runCommand({writeBacksQueued: 1})
:
:
:
:
:
:
:
{
{
{
{
{
{
{
"n"
"n"
"n"
"n"
"n"
"n"
"n"
:
:
:
:
:
:
:
0,
0,
0,
0,
0,
0,
0,
"minutesSinceLastCall"
"minutesSinceLastCall"
"minutesSinceLastCall"
"minutesSinceLastCall"
"minutesSinceLastCall"
"minutesSinceLastCall"
"minutesSinceLastCall"
:
:
:
:
:
:
:
1
0
0
0
4
0
1
},
},
},
},
},
},
}
writebacklisten
writebacklisten
writebacklisten (page 801) is an internal command.
801
Testing Commands
Testing Commands
Name
Description
testDistLockWithSkew
Internal command. Do not call this directly.
testDistLockWithSyncCluster
Internal command. Do not call this directly.
captrunc (page 802)
Internal command. Truncates capped collections.
emptycapped (page 803)
Internal command. Removes all documents from a capped collection.
godinsert (page 803)
Internal command for testing.
_hashBSONElement
Internal command. Computes the MD5 hash of a BSON element.
(page 803)
_journalLatencyTest
Tests the time required to write and perform a file system sync for a file in the
journal directory.
sleep (page 805)
Internal command for testing. Forces MongoDB to block all operations.
replSetTest (page 805)
Internal command for testing replica set functionality.
forceerror (page 806)
Internal command for testing. Forces a user assertion exception.
skewClockCommand
Internal command. Do not call this command directly.
configureFailPoint
Internal command for testing. Configures failure points.
(page 806)
testDistLockWithSkew
_testDistLockWithSkew
_testDistLockWithSkew (page 802) is an internal command. Do not call directly.
Note:
_testDistLockWithSkew (page 802) is an internal command that is not enabled by
default.
_testDistLockWithSkew (page 802) must be enabled by using --setParameter
enableTestCommands=1 on the mongod (page 925) command line. _testDistLockWithSkew
(page 802) cannot be enabled during run-time.
testDistLockWithSyncCluster
_testDistLockWithSyncCluster
_testDistLockWithSyncCluster (page 802) is an internal command. Do not call directly.
Note:
_testDistLockWithSyncCluster (page 802) is an internal command that is not
enabled by default.
_testDistLockWithSyncCluster (page 802) must be enabled by using --setParameter enableTestCommands=1 on the mongod (page 925) command line.
_testDistLockWithSyncCluster (page 802) cannot be enabled during run-time.
captrunc
Definition
captrunc
Truncates capped collections. captrunc (page 802) is an internal command to support testing that takes the
following form:
{ captrunc: "<collection>", n: <integer>, inc: <true|false> }.
802
Examples The following command truncates 10 older documents from the collection records:
db.runCommand({captrunc: "records" , n: 10})
The following command truncates 100 documents and the 101st document:
db.runCommand({captrunc: "records", n: 100, inc: true})
emptycapped
emptycapped
The emptycapped command removes all documents from a capped collection. Use the following syntax:
{ emptycapped: "events" }
This command removes all records from the capped collection named events.
Warning: This command obtains a write lock on the affected database and will block other operations until
it has completed.
Note: emptycapped (page 803) is an internal command that is not enabled by default. emptycapped
(page 803) must be enabled by using --setParameter enableTestCommands=1 on the mongod
(page 925) command line. emptycapped (page 803) cannot be enabled during run-time.
godinsert
godinsert
godinsert (page 803) is an internal command for testing purposes only.
Note: This command obtains a write lock on the affected database and will block other operations until it has
completed.
Note: godinsert (page 803) is an internal command that is not enabled by default. godinsert (page 803)
must be enabled by using --setParameter enableTestCommands=1 on the mongod (page 925) command line. godinsert (page 803) cannot be enabled during run-time.
_hashBSONElement
Description
_hashBSONElement
New in version 2.4.
803
An internal command that computes the MD5 hash of a BSON element. The _hashBSONElement (page 803)
command returns 8 bytes from the 16 byte MD5 hash.
The _hashBSONElement (page 803) command has the following fields:
field BSONElement key The BSON element to hash.
field integer seed A seed used when computing the hash.
Note:
_hashBSONElement (page 803) is an internal command that is not enabled by
default.
_hashBSONElement (page 803) must be enabled by using --setParameter
enableTestCommands=1 on the mongod (page 925) command line. _hashBSONElement (page 803)
cannot be enabled during run-time.
Output The _hashBSONElement (page 803) command returns a document that holds the following fields:
_hashBSONElement.key
The original BSON element.
_hashBSONElement.seed
The seed used for the hash, defaults to 0.
_hashBSONElement.out
The decimal result of the hash.
_hashBSONElement.ok
Holds the 1 if the function returns successfully, and 0 if the operation encountered an error.
Example Invoke a mongod (page 925) instance with test commands enabled:
mongod --setParameter enableTestCommands=1
Run the following to hash the same ISODate string but this time to specify a seed value:
db.runCommand({_hashBSONElement: ISODate("2013-02-12T22:12:57.211Z"), seed:2013})
804
journalLatencyTest
journalLatencyTest
journalLatencyTest (page 805) is an administrative command that tests the length of time required
to write and perform a file system sync (e.g. fsync) for a file in the journal directory. You must issue the
journalLatencyTest (page 805) command against the admin database in the form:
{ journalLatencyTest: 1 }
The value (i.e. 1 above), does not affect the operation of the command.
Note:
journalLatencyTest (page 805) is an internal command that is not enabled by
default.
journalLatencyTest (page 805) must be enabled by using --setParameter
enableTestCommands=1 on the mongod (page 925) command line. journalLatencyTest (page 805)
cannot be enabled during run-time.
sleep
Definition
sleep
Forces the database to block all operations. This is an internal command for testing purposes.
The sleep (page 805) command has the following options:
field boolean w If true, obtains a global write lock. Otherwise obtains a read lock.
field integer secs The number of seconds to sleep.
Example The sleep (page 805) command takes the following prototype form:
{ sleep: 1, w: <true|false>, secs: <seconds> }
The command places the mongod (page 925) instance in a write lock state for a specified number of seconds. Without
arguments, sleep (page 805) causes a read lock for 100 seconds.
Warning: sleep (page 805) claims the lock specified in the w argument and blocks all operations on the mongod
(page 925) instance for the specified amount of time.
Note: sleep (page 805) is an internal command that is not enabled by default. sleep (page 805) must be enabled
by using --setParameter enableTestCommands=1 on the mongod (page 925) command line. sleep
(page 805) cannot be enabled during run-time.
replSetTest
replSetTest
replSetTest (page 805) is internal diagnostic command used for regression tests that supports replica set
functionality.
Note: replSetTest (page 805) is an internal command that is not enabled by default. replSetTest
(page 805) must be enabled by using --setParameter enableTestCommands=1 on the mongod
(page 925) command line. replSetTest (page 805) cannot be enabled during run-time.
805
forceerror
forceerror
The forceerror (page 806) command is for testing purposes only. Use forceerror (page 806) to force a
user assertion exception. This command always returns an ok value of 0.
skewClockCommand
_skewClockCommand
_skewClockCommand (page 806) is an internal command. Do not call directly.
Note:
_skewClockCommand (page 806) is an internal command that is not enabled by
default.
_skewClockCommand (page 806) must be enabled by using --setParameter
enableTestCommands=1 on the mongod (page 925) command line. _skewClockCommand (page 806)
cannot be enabled during run-time.
configureFailPoint
Definition
configureFailPoint
Configures a failure point that you can turn on and off while MongoDB runs. configureFailPoint
(page 806) is an internal command for testing purposes that takes the following form:
{configureFailPoint: "<failure_point>", mode: <behavior> }
You must issue configureFailPoint (page 806) against the admin database. configureFailPoint
(page 806) has the following fields:
field string configureFailPoint The name of the failure point.
field document,string mode Controls the behavior of a failure point. The possible values are
alwaysOn, off, or a document in the form of {times: n} that specifies the number of
times the failure point remains on before it deactivates. The maximum value for the number is a
32-bit signed integer.
field document data Passes in additional data for use in configuring the fail point. For example, to
imitate a slow connection pass in a document that contains a delay time.
Note:
configureFailPoint (page 806) is an internal command that is not enabled by default.
configureFailPoint (page 806) must be enabled by using --setParameter enableTestCommands=1
on the mongod (page 925) command line. configureFailPoint (page 806) cannot be enabled during run-time.
Example
db.adminCommand( { configureFailPoint: "blocking_thread", mode: {times: 21} } )
806
JavaScript in MongoDB
Although these methods use JavaScript, most interactions with MongoDB do not use JavaScript but use an idiomatic
driver (page 95) in the language of the interacting application.
Collection
Collection Methods
Name
db.collection.aggregate() (page 808)
db.collection.count() (page 809)
db.collection.createIndex() (page 810)
db.collection.getIndexStats() (page 810)
db.collection.indexStats() (page 811)
db.collection.dataSize() (page 812)
db.collection.distinct() (page 812)
db.collection.drop() (page 812)
db.collection.dropIndex() (page 812)
db.collection.dropIndexes() (page 813)
db.collection.ensureIndex() (page 814)
db.collection.find() (page 816)
db.collection.findAndModify() (page 821)
db.collection.findOne() (page 824)
db.collection.getIndexes() (page 826)
db.collection.getShardDistribution() (page 827)
db.collection.getShardVersion() (page 828)
db.collection.group() (page 828)
db.collection.insert() (page 832)
db.collection.isCapped() (page 837)
db.collection.mapReduce() (page 837)
db.collection.reIndex() (page 844)
db.collection.remove() (page 844)
db.collection.renameCollection() (page 846)
db.collection.save() (page 846)
db.collection.stats() (page 848)
db.collection.storageSize() (page 849)
db.collection.totalSize() (page 849)
db.collection.totalIndexSize() (page 849)
Description
Provides access to the aggregation pipeline (page 279).
Wraps count (page 695) to return a count of the number of docu
Builds an index on a collection. Use db.collection.ensure
Renders a human-readable view of the data collected by indexS
Renders a human-readable view of the data collected by indexS
Returns the size of the collection. Wraps the size (page 764) fie
Returns an array of documents that have distinct values for the spe
Removes the specified collection from the database.
Removes a specified index on a collection.
Removes all indexes on a collection.
Creates an index if it does not currently exist. If the index exists e
Performs a query on a collection and returns a cursor object.
Atomically modifies and returns a single document.
Performs a query and returns a single document.
Returns an array of documents that describe the existing indexes o
For collections in sharded clusters, db.collection.getShar
Internal diagnostic method for shard cluster.
Provides simple data aggregation function. Groups documents in
Creates a new document in a collection.
Reports if a collection is a capped collection.
Performs map-reduce style data aggregation.
Rebuilds all existing indexes on a collection.
Deletes documents from a collection.
Changes the name of a collection.
Provides a wrapper around an insert() (page 832) and updat
Reports on the state of a collection. Provides a wrapper around th
Reports the total size used by the collection in bytes. Provides a w
Reports the total size of a collection, including the size of all docu
Reports the total size used by the indexes on a collection. Provide
807
Name
db.collection.update() (page 849)
db.collection.validate() (page 856)
db.collection.aggregate()
Definition
db.collection.aggregate(pipeline)
New in version 2.2.
Calculates aggregate values for the data in a collection. Always call the aggregate() (page 808) method on
a collection object.
param document pipeline A sequence of data aggregation processes. See the aggregation reference (page 306) for documentation of these operators.
Returns
A document with two fields:
result which holds an array of documents returned by the pipeline
ok which holds the value 1, indicating success.
Throws exception
Changed in version 2.4: If an error occurs, the aggregate() (page 808) helper throws an
exception. In previous versions, the helper returned a document with the error message and
code, and ok status field not equal to 1, same as the aggregate (page 694) command.
Example Consider a collection named articles that contains documents of the following format:
{
title : "this is my title" ,
author : "bob" ,
posted : new Date () ,
pageViews : 5 ,
tags : [ "fun" , "good" , "sport" ] ,
comments : [
{ author :"joe" , text : "this is cool" } ,
{ author :"sam" , text : "this is bad" }
],
other : { foo : 5 }
}
The following aggregation pivots the data to group authors by individual tags:
db.articles.aggregate(
{ $project : {
author : 1,
tags : 1,
}
},
{ $unwind : "$tags" },
{ $group : {
_id : { tags : "$tags" },
authors : { $addToSet : "$author" }
808
}
}
)
The aggregation pipeline begins with the collection articles and selects the author and tags fields using the
$project (page 664) pipeline operator. The $unwind (page 668) operator produces one output document per tag.
Finally, the $group (page 669) pipeline operator groups authors by tags.
The operation returns the following document:
{
"result" : [
{
"_id" : { "tags" : "good" },
"authors" : [ "bob", "mike", ... ]
},
{
"_id" : { "tags" : "fun" },
"authors" : [ "bob", "al" ]
},
...
],
"ok" : 1
}
809
db.orders.count()
Count all Documents that Match a Query Count the number of the documents in the orders collection with the
field ord_dt greater than new Date(01/01/2012):
db.orders.count( { ord_dt: { $gt: new Date('01/01/2012') } } )
db.collection.createIndex()
Definition
db.collection.createIndex(keys, options)
Deprecated since version 1.8.
Creates indexes on collections.
param document keys For each field to index, a key-value pair with the field and the index order:
1 for ascending or -1 for descending.
param document options One or more key-value pairs that specify index options. For a list of
options, see db.collection.ensureIndex() (page 814).
See also:
Indexes (page 313), db.collection.createIndex() (page 810), db.collection.dropIndex()
(page 812), db.collection.dropIndexes() (page 813), db.collection.getIndexes()
(page 826), db.collection.reIndex() (page 844), and db.collection.totalIndexSize()
(page 849)
db.collection.getIndexStats()
Definition
db.collection.getIndexStats(index)
Displays a human-readable summary of aggregated statistics about an indexs B-tree data structure. The information summarizes the output returned by the indexStats (page 774) command and indexStats()
(page 811) method. The getIndexStats() (page 810) method displays the information on the screen and
does not return an object.
The getIndexStats() (page 810) method has the following form:
db.<collection>.getIndexStats( { index : "<index name>" } )
Warning: Do not use getIndexStats() (page 810) or indexStats (page 774) with production
deployments.
Example The following command returns information for an index named type_1_traits_1:
db.animals.getIndexStats({index:"type_1_traits_1"})
The command returns the following summary. For more information on the B-tree statistics, see indexStats
(page 774).
-- index "undefined" -version 1 | key pattern { "type" : 1,
2 deep, bucket body is 8154 bytes
bucket count
45513
180
45332
db.collection.indexStats()
Definition
db.collection.indexStats(index)
Aggregates statistics for the B-tree data structure that stores data for a MongoDB index.
The
indexStats() (page 811) method is a thin wrapper around the indexStats (page 774) command. The
indexStats() (page 811) method is available only on mongod (page 925) instances running with the
--enableExperimentalIndexStatsCmd option.
Important: The indexStats() (page 811) method is not intended for production deployments.
The indexStats() (page 811) method has the following form:
db.<collection>.indexStats( { index: "<index name>" } )
811
db.collection.dataSize()
db.collection.dataSize()
Returns The size of the collection. This method provides a wrapper around the size (page 764)
output of the collStats (page 763) (i.e. db.collection.stats() (page 848)) command.
db.collection.distinct()
Definition
db.collection.distinct(field, query)
Finds the distinct values for a specified field across a single collection and returns the results in an array.
param string field The field for which to return distinct values.
param document query A query that specifies the documents from which to retrieve the distinct
values.
The db.collection.distinct() (page 812) method provides a wrapper around the distinct
(page 696) command. Results must not be larger than the maximum BSON size (page 1015).
When possible to use covered indexes, the db.collection.distinct() (page 812) method will use an
index to find the documents in the query as well as to return the data.
Examples The following are examples of the db.collection.distinct() (page 812) method:
Return an array of the distinct values of the field ord_dt from all documents in the orders collection:
db.orders.distinct( 'ord_dt' )
Return an array of the distinct values of the field sku in the subdocument item from all documents in the
orders collection:
db.orders.distinct( 'item.sku' )
Return an array of the distinct values of the field ord_dt from the documents in the orders collection where
the price is greater than 10:
db.orders.distinct( 'ord_dt', { price: { $gt: 10 } } )
db.collection.drop()
db.collection.drop()
Call the db.collection.drop() (page 812) method on a collection to drop it from the database.
db.collection.drop() (page 812) takes no arguments and will produce an error if called with any arguments.
db.collection.dropIndex()
Definition
db.collection.dropIndex(index)
Drops or removes the specified index from a collection. The db.collection.dropIndex() (page 812)
method provides a wrapper around the dropIndexes (page 750) command.
Note: You cannot drop the default index on the _id field.
812
The index on the field cat has the user-specified name of catIdx 9 . To drop the index catIdx, you
can use either the index name:
db.pets.dropIndex( "catIdx" )
1 }:
db.pets.dropIndex( { "cat" : 1 } )
db.collection.dropIndexes()
db.collection.dropIndexes()
Drops all indexes other than the required index on the _id field. Only call dropIndexes() (page 813) as a
method on a collection object.
db.collection.ensureIndex()
8
When using a mongo (page 942) shell version earlier than 2.2.2, if you specified a name during the index creation, you must use the name to
drop the index.
9 During index creation, if the user does not specify an index name, the system generates the name by concatenating the index key field and
value with an underscore, e.g. cat_1.
813
Definition
db.collection.ensureIndex(keys, options)
Creates an index on the specified field if the index does not already exist.
The ensureIndex() (page 814) method has the following fields:
param document keys For ascending/descending indexes, a document that contains pairs with the
name of the field or fields to index and order of the index. A 1 specifies ascending and a -1
specifies descending. MongoDB supports several different index types including text (page 332),
geospatial (page 326), and hashed (page 333) indexes. See Indexing Tutorials (page 338).
param document options A document that controls the creation of the index. The document contains a set of options, as described in the next table.
Warning: Index names, including their full namespace (i.e. database.collection) cannot be longer
than 128 characters. See the getIndexes() (page 826) field name (page 826) for the names of existing
indexes.
The options document has one or more of the following fields:
param Boolean background Builds the index in the background so that building an index does not
block other database activities. Specify true to build in the background. The default value is
false.
param Boolean unique Creates a unique index so that the collection will not accept insertion of
documents where the index key or keys match an existing value in the index. Specify true
to create a unique index. The default value is false. This option applies only to ascending/descending indexes.
param string name The name of the index. If unspecified, MongoDB generates an index name by
concatenating the names of the indexed fields and the sort order.
param Boolean dropDups Creates a unique index on a field that may have duplicates. MongoDB
indexes only the first occurrence of a key and removes all documents from the collection that
contain subsequent occurrences of that key. Specify true to create unique index. The default
value is false. This option applies only to scalar indexes.
param Boolean sparse If true, the index only references documents with the specified field.
These indexes use less space but behave differently in some situations (particularly sorts). The
default value is false. This applies only to ascending/descending indexes.
param integer expireAfterSeconds Specifies a value, in seconds, as a TTL to control how long
MongoDB retains documents in this collection. See Expire Data from Collections by Setting
TTL (page 158) for more information on this functionality. This applies only to TTL indexes.
param index version v The index version number. The default index version depends on the version
of mongod (page 925) running when creating the index. Before version 2.0, the this value was
0; versions 2.0 and later use version 1, which provides a smaller and faster index format. Specify
a different index version only in unusual situations.
param document weights For text indexes, the significance of the field relative to the other indexed fields. The document contains field and weight pairs. The weight is a number ranging
from 1 to 99,999 and denotes the significance of the field relative to the other indexed fields
in terms of the score. You can specify weights for some or all the indexed fields. See Control
Search Results with Weights (page 364) to adjust the scores. The default value is 1. This applies
to text indexes only.
param string default_language For a text index, the language that determines the list of stop
words and the rules for the stemmer and tokenizer. See Text Search Languages (page 719) for
814
the available languages and Specify a Language for Text Index (page 362) for more information
and examples. The default value is english. This applies to text indexes only.
param string language_override For a text index, specify the name of the field in the document
that contains, for that document, the language to override the default language. The default value
is language.
Examples
Create an Ascending Index on a Single Field The following example creates an ascending index on the field
orderDate.
db.collection.ensureIndex( { orderDate: 1 } )
If the keys document specifies more than one field, then ensureIndex() (page 814) creates a compound index.
Create an Index on a Multiple Fields The following example creates a compound index on the orderDate field
(in ascending order) and the zipcode field (in descending order.)
db.collection.ensureIndex( { orderDate: 1, zipcode: -1 } )
815
Definition
db.collection.find(<criteria>, <projection>)
Selects documents in a collection and returns a cursor to the selected documents.
10
param document criteria Specifies selection criteria using query operators (page 621). To return
all documents in a collection, omit this parameter or pass an empty document ({}).
param document projection Specifies the fields to return using projection operators (page 646).
To return all fields in the matching document, omit this parameter.
Returns
A cursor to the documents that match the query criteria. When the find() (page 816) method
returns documents, the method is actually returning a cursor to the documents.
If the projection argument is specified, the matching documents contain only the
projection fields and the _id field. You can optionally exclude the _id field.
Executing find() (page 816) directly in the mongo (page 942) shell automatically iterates the
cursor to display up to the first 20 documents. Type it to continue iteration.
To access the returned documents with a driver, use the appropriate cursor handling mechanism
for the driver language (page 95).
The projection parameter takes a document of the following form:
{ field1: <boolean>, field2: <boolean> ... }
Find Documents that Match Query Criteria To find documents that match a set of selection criteria, call find()
with the <criteria> parameter. The following operation returns all the documents from the collection products
where qty is greater than 25:
db.products.find( { qty: { $gt: 25 } } )
10
db.collection.find() (page 816) is a wrapper for the more formal query structure that uses the $query (page 693) operator.
816
The following operation returns documents in the bios collection (page 111) where _id equals
db.bios.find( { _id: 5 } )
Query Using Operators The following operation returns documents in the bios collection (page 111) where _id
equals either 5 or ObjectId("507c35dd8fada716c89d0013"):
db.bios.find(
{
_id: { $in: [ 5,
}
)
ObjectId("507c35dd8fada716c89d0013") ] }
Query for Ranges Combine comparison operators to specify ranges. The following operation returns documents
with field between value1 and value2:
db.collection.find( { field: { $gt: value1, $lt: value2 } } );
Query a Field that Contains an Array If a field contains an array and your query has multiple conditional operators,
the field as a whole will match if either a single array element meets the conditions or a combination of array elements
meet the conditions.
Given a collection students that contains the following documents:
{ "_id" : 1, "score" : [ -1, 3 ] }
{ "_id" : 2, "score" : [ 1, 5 ] }
{ "_id" : 3, "score" : [ 5, 5 ] }
In the document with _id equal to 1, the score: [ -1, 3 ] meets the conditions because the element -1
meets the $lt: 2 condition and the element 3 meets the $gt: 0 condition.
In the document with _id equal to 2, the score: [ 1, 5 ] meets the conditions because the element 1 meets
both the $lt: 2 condition and the $gt: 0 condition.
Query Arrays
Query for an Array Element The following operation returns documents in the bios collection (page 111) where
the array field contribs contains the element "UNIX":
db.bios.find( { contribs: "UNIX" } )
817
Query an Array of Documents The following operation returns documents in the bios collection (page 111) where
awards array contains a subdocument element that contains the award field equal to "Turing Award" and the
year field greater than 1980:
db.bios.find(
{
awards: {
$elemMatch: {
award: "Turing Award",
year: { $gt: 1980 }
}
}
}
)
Query Subdocuments
Query Exact Matches on Subdocuments The following operation returns documents in the bios collection
(page 111) where the subdocument name is exactly { first: "Yukihiro", last: "Matsumoto" },
including the order:
db.bios.find(
{
name: {
first: "Yukihiro",
last: "Matsumoto"
}
}
)
The name field must match the sub-document exactly. The query does not match documents with the following name
fields:
{
first: "Yukihiro",
aka: "Matz",
last: "Matsumoto"
}
{
last: "Matsumoto",
first: "Yukihiro"
}
Query Fields of a Subdocument The following operation returns documents in the bios collection (page 111)
where the subdocument name contains a field first with the value "Yukihiro" and a field last with the value
"Matsumoto". The query uses dot notation to access fields in a subdocument:
db.bios.find(
{
"name.first": "Yukihiro",
"name.last": "Matsumoto"
}
)
818
The query matches the document where the name field contains a subdocument with the field first with the value
"Yukihiro" and a field last with the value "Matsumoto". For instance, the query would match documents
with name fields that held either of the following values:
{
first: "Yukihiro",
aka: "Matz",
last: "Matsumoto"
}
{
last: "Matsumoto",
first: "Yukihiro"
}
Projections The projection parameter specifies which fields to return. The parameter contains either include or
exclude specifications, not both, unless the exclude is for the _id field.
Specify the Fields to Return The following operation returns all the documents from the products collection
where qty is greater than 25 and returns only the _id, item and qty fields:
db.products.find( { qty: { $gt: 25 } }, { item: 1, qty: 1 } )
The following operation finds all documents in the bios collection (page 111) and returns only the name field,
contribs field and _id field:
db.bios.find( { }, { name: 1, contribs: 1 } )
Explicitly Excluded Fields The following operation queries the bios collection (page 111) and returns all fields
except the the first field in the name subdocument and the birth field:
db.bios.find(
{ contribs: 'OOP' },
{ 'name.first': 0, birth: 0 }
)
Explicitly Exclude the _id Field The following operation excludes the _id and qty fields from the result set:
db.products.find( { qty: { $gt: 25 } }, { _id: 0, qty: 0 } )
The documents in the result set contain all fields except the _id and qty fields:
{ "item" : "pencil", "type" : "no.2" }
{ "item" : "bottle", "type" : "blue" }
{ "item" : "paper" }
The following operation finds documents in the bios collection (page 111) and returns only the name field and the
contribs field:
819
db.bios.find(
{ },
{ name: 1, contribs: 1, _id: 0 }
)
On Arrays and Subdocuments The following operation queries the bios collection (page 111) and returns the
last field in the name subdocument and the first two elements in the contribs array:
db.bios.find(
{ },
{
_id: 0,
'name.last': 1,
contribs: { $slice: 2 }
}
)
Iterate the Returned Cursor The find() (page 816) method returns a cursor to the results. In the mongo
(page 942) shell, if the returned cursor is not assigned to a variable using the var keyword, the cursor is automatically iterated up to 20 times to access up to the first 20 documents that match the query. You can use the
DBQuery.shellBatchSize to change the number of iterations. See Flags (page 859) and Cursor Behaviors
(page 43). To iterate manually, assign the returned cursor to a variable using the var keyword.
With Variable Name The following example uses the variable myCursor to iterate over the cursor and print the
matching documents:
var myCursor = db.bios.find( );
myCursor
With next() Method The following example uses the cursor method next() (page 870) to access the documents:
var myCursor = db.bios.find( );
var myDocument = myCursor.hasNext() ? myCursor.next() : null;
if (myDocument) {
var myName = myDocument.name;
print (tojson(myName));
}
To print, you can also use the printjson() method instead of print(tojson()):
if (myDocument) {
var myName = myDocument.name;
printjson(myName);
}
With forEach() Method The following example uses the cursor method forEach() (page 866) to iterate the
cursor and access the documents:
820
Modify the Cursor Behavior The mongo (page 942) shell and the drivers (page 95) provide several cursor methods
that call on the cursor returned by the find() (page 816) method to modify its behavior.
Order Documents in the Result Set The sort() (page 872) method orders the documents in the result set. The
following operation returns documents in the bios collection (page 111) sorted in ascending order by the name field:
db.bios.find().sort( { name: 1 } )
Regardless of the order you chain the limit() (page 867) and the sort() (page 872), the request to the server has
the structure that treats the query and the sort() (page 872) modifier as a single object. Therefore, the limit()
(page 867) operation method is always applied after the sort() (page 872) regardless of the specified order of the
operations in the chain. See the meta query operators (page 687).
db.collection.findAndModify()
Definition
db.collection.findAndModify(<document>)
Atomically modifies and returns a single document. By default, the returned document does not include the
modifications made on the update. To return the document with the modifications made on the update, use
the new option. The findAndModify() (page 821) method is a shell helper around the findAndModify
(page 710) command.
The findAndModify() (page 821) method has the following form:
db.collection.findAndModify( {
query: <document>,
sort: <document>,
remove: <boolean>,
821
update: <document>,
new: <boolean>,
fields: <document>,
upsert: <boolean>
} );
The db.collection.findAndModify() (page 821) method takes a document parameter with the following subdocument fields:
param document query The selection criteria for the modification. The query field employs the
same query selectors (page 621) as used in the db.collection.find() (page 816) method.
Although the query may match multiple documents, findAndModify() (page 821) will select only one document to modify.
param document sort Determines which document the operation modifies if the query selects multiple documents. findAndModify() (page 821) modifies the first document in the sort order
specified by this argument.
param Boolean remove Must specify either the remove or the update field in the
findAndModify() (page 821) method. Removes the document specified in the update
field. Set this to true to remove the selected document . The default is false.
param document update Must specify either the remove or the update field in the
findAndModify() (page 821) method. Performs an update of the selected document. The
update field employs the same update operators (page 651) or field: value specifications to modify the selected document.
param Boolean new When true, returns the modified document rather than the original. The
findAndModify() (page 821) method ignores the new option for remove operations. The
default is false.
param document fields A subset of fields to return. The fields document specifies an inclusion
of a field with 1, as in: fields: { <field1>: 1, <field2>: 1, ... }. See
projection (page 72).
param Boolean upsert Used in conjunction with the update field.
When true,
findAndModify() (page 821) creates a new document if the query returns no documents. The default is false.
Return Data The findAndModify() (page 821) method returns either: the pre-modification document or, if
new: true is set, the modified document.
Note:
If the query finds no document for update or remove operations, findAndModify() (page 821) returns
null.
If the query finds no document for an upsert, operation, findAndModify() (page 821) performs an insert.
If new is false, and the sort option is NOT specified, the method returns null.
Changed in version 2.2: Previously returned an empty document {}. See the 2.2 release notes (page 1053) for
more information.
If the query finds no document for an upsert, findAndModify() (page 821) performs an insert. If new is
false, and a sort option, the method returns an empty document {}.
Behaviors
822
Upsert and Unique Index When findAndModify() (page 821) includes the upsert: true option and the
query field(s) is not uniquely indexed, the method could insert a document multiple times in certain circumstances.
For instance, if multiple clients each invoke the method with the same query condition and these methods complete
the find phase before any of methods perform the modify phase, these methods could insert the same document.
In the following example, no document with the name Andy exists, and multiple clients issue the following command:
db.people.findAndModify( {
query: { name: "Andy" },
sort: { rating: 1 },
update: { $inc: { score: 1 } },
upsert: true
} )
Then, if these clients findAndModify() (page 821) methods finish the query phase before any command starts
the modify phase, and there is no unique index on the name field, the commands may all perform an upsert. To
prevent this condition, create a unique index (page 334) on the name field. With the unique index in place, the multiple
methods would observe one of the following behaviors:
Exactly one findAndModify() (page 821) would successfully insert a new document.
Zero or more findAndModify() (page 821) methods would update the newly inserted document.
Zero or more findAndModify() (page 821) methods would fail when they attempted to insert a duplicate.
If the method fails due to a unique index constraint violation, you can retry the method. Absent a delete of the
document, the retry should not fail.
Sharded Collections When using findAndModify (page 710) in a sharded environment, the query must contain the shard key for all operations against the shard cluster for the sharded collections.
findAndModify (page 710) operations issued against mongos (page 938) instances for non-sharded collections
function normally.
Examples
Update The following method updates an existing document in the people collection where the document matches
the query criteria:
db.people.findAndModify( {
query: { name: "Tom", state: "active", rating: { $gt: 10 } },
sort: { rating: 1 },
update: { $inc: { score: 1 } }
} )
823
"state" : "active",
"rating" : 100,
"score" : 5
}
To return the modified document, add the new:true option to the method.
If no document matched the query condition, the method returns null:
null
db.people.findAndModify( {
query: { name: "Gus", state: "active", rating: 100 },
sort: { rating: 1 },
update: { $inc: { score: 1 } },
upsert: true
} )
If the method does not find a matching document, the method performs an upsert. Because the method included the
sort option, it returns an empty document { } as the original (pre-modification) document:
{ }
If the method did not include a sort option, the method returns null.
null
Update, Insert and Return New Document The following method includes both the upsert: true option
and the new:true option to return the newly inserted document if a document matching the query is not found:
db.people.findAndModify( {
query: { name: "Pascal", state: "active", rating: 25 },
sort: { rating: 1 },
update: { $inc: { score: 1 } },
upsert: true,
new: true
} )
db.collection.findOne()
Definition
824
db.collection.findOne(<criteria>, <projection>)
Returns one document that satisfies the specified query criteria. If multiple documents satisfy the query, this
method returns the first document according to the natural order which reflects the order of documents on the
disk. In capped collections, natural order is the same as insertion order.
param document criteria Specifies query selection criteria using query operators (page 621).
param document projection Specifies the fields to return using projection operators (page 646).
To return all fields in the matching document, omit this parameter.
Returns
One document that satisfies the criteria specified as the first argument to this method. If you
specify the projection argument, findOne() (page 824) returns a document that only
contains the projection fields, and the _id field if you do not explicitly exclude the _id
field.
Although similar to the find() (page 816) method, the findOne() (page 824) method returns a document rather than a cursor.
The projection parameter takes a document of the following form:
{ field1: <boolean>, field2: <boolean> ... }
With a Query Specification The following operation returns the first matching document from the bios collection
(page 111) where either the field first in the subdocument name starts with the letter G or where the field birth
is less than new Date(01/01/1945):
db.bios.findOne(
{
$or: [
{ 'name.first' : /^G/ },
{ birth: { $lt: new Date('01/01/1945') } }
]
}
)
With a Projection The projection parameter specifies which fields to return. The parameter contains either
include or exclude specifications, not both, unless the exclude is for the _id field.
825
Specify the Fields to Return The following operation finds a document in the bios collection (page 111) and returns
only the name, contribs and _id fields:
db.bios.findOne(
{ },
{ name: 1, contribs: 1 }
)
Return All but the Excluded Fields The following operation returns a document in the bios collection (page 111)
where the contribs field contains the element OOP and returns all fields except the _id field, the first field in
the name subdocument, and the birth field:
db.bios.findOne(
{ contribs: 'OOP' },
{ _id: 0, 'name.first': 0, birth: 0 }
)
Access the findOne Result You cannot apply the cursor methods to the result of the findOne() (page 824)
method. However, you can access the document directly, as in the example:
var myDocument = db.bios.findOne();
if (myDocument) {
var myName = myDocument.name;
print (tojson(myName));
}
db.collection.getIndexes()
db.collection.getIndexes()
Returns an array that holds a list of documents that identify and describe the existing indexes on the collection.
You must call the db.collection.getIndexes() (page 826) on a collection. For example:
db.collection.getIndexes()
Change collection to the name of the collection whose indexes you want to learn.
The db.collection.getIndexes() (page 826) items consist of the following fields:
system.indexes.v
Holds the version of the index.
The index version depends on the version of mongod (page 925) that created the index. Before version
2.0 of MongoDB, the this value was 0; versions 2.0 and later use version 1.
system.indexes.key
Contains a document holding the keys held in the index, and the order of the index. Indexes may be either
descending or ascending order. A value of negative one (e.g. -1) indicates an index sorted in descending
order while a positive value (e.g. 1) indicates an index sorted in an ascending order.
system.indexes.ns
The namespace context for the index.
system.indexes.name
A unique name for the index comprised of the field names and orders of all keys.
826
db.collection.getShardDistribution()
db.collection.getShardDistribution()
Returns
Prints the data distribution statistics for a sharded collection.
You must call the
getShardDistribution() (page 827) method on a sharded collection, as in the following
example:
db.myShardedCollection.getShardDistribution()
In the following example, the collection has two shards. The output displays both the individual shard distribution information as well the total shard distribution:
Shard <shard-a> at <host-a>
data : <size-a> docs : <count-a> chunks : <number of chunks-a>
estimated data per chunk : <size-a>/<number of chunks-a>
estimated docs per chunk : <count-a>/<number of chunks-a>
Shard <shard-b> at <host-b>
data : <size-b> docs : <count-b> chunks : <number of chunks-b>
estimated data per chunk : <size-b>/<number of chunks-b>
estimated docs per chunk : <count-b>/<number of chunks-b>
Totals
data : <stats.size> docs : <stats.count> chunks : <calc total chunks>
Shard <shard-a> contains <estDataPercent-a>% data, <estDocPercent-a>% docs in cluster, avg obj
Shard <shard-b> contains <estDataPercent-b>% data, <estDocPercent-b>% docs in cluster, avg obj
<estDataPercent-x> is a calculated value that reflects, for each shard, the data size as the percentage
of the collections total data size, for example:
<estDataPercent-x> = <size-x>/<stats.size>
827
<estDocPercent-x> is a calculated value that reflects, for each shard, the number of documents as
the percentage of the total number of documents for the collection, for example:
<estDocPercent-x> = <count-x>/<stats.count>
stats.shards[ <shard-x> ].avgObjSize is a number that reflects the average object size,
including the unit of measure, for the shard.
For example, the following is a sample output for the distribution of a sharded collection:
Shard shard-a at shard-a/MyMachine.local:30000,MyMachine.local:30001,MyMachine.local:30002
data : 38.14Mb docs : 1000003 chunks : 2
estimated data per chunk : 19.07Mb
estimated docs per chunk : 500001
Shard shard-b at shard-b/MyMachine.local:30100,MyMachine.local:30101,MyMachine.local:30102
data : 38.14Mb docs : 999999 chunks : 3
estimated data per chunk : 12.71Mb
estimated docs per chunk : 333333
Totals
data : 76.29Mb docs : 2000002 chunks : 5
Shard shard-a contains 50% data, 50% docs in cluster, avg obj size on shard : 40b
Shard shard-b contains 49.99% data, 49.99% docs in cluster, avg obj size on shard : 40b
See also:
Sharding (page 493)
db.collection.getShardVersion()
db.collection.getShardVersion()
This method returns information regarding the state of data in a sharded cluster that is useful when diagnosing
underlying issues with a sharded cluster.
For internal and diagnostic use only.
db.collection.group()
Definition
db.collection.group({ key, reduce, initial, [keyf,] [cond,] finalize })
Groups documents in a collection by the specified keys and performs simple aggregation functions such as
computing counts and sums. The method is analogous to a SELECT <...> GROUP BY statement in SQL.
The group() (page 828) method returns an array.
The db.collection.group() (page 828) accepts a single document that contains the following:
field document key The field or fields to group. Returns a key object for use as the grouping key.
field function reduce An aggregation function that operates on the documents during the grouping
operation. These functions may return a sum or a count. The function takes two arguments: the
current document and an aggregation result document for that group
field document initial Initializes the aggregation result document.
field function keyf Alternative to the key field. Specifies a function that creates a key object for
use as the grouping key. Use keyf instead of key to group by calculated fields rather than
existing document fields.
828
field document cond The selection criteria to determine which documents in the collection to process. If you omit the cond field, db.collection.group() (page 828) processes all the
documents in the collection for the group operation.
field function finalize A function that runs each item in the result set before
db.collection.group() (page 828) returns the final value. This function can either modify the result document or replace the result document as a whole. Keep in mind
that, unlike the $keyf and $reduce fields that also specify a function, this field name is
finalize, not $finalize.
The db.collection.group() (page 828) method is a shell wrapper for the group (page 697) command.
However, the db.collection.group() (page 828) method takes the keyf field and the reduce field
whereas the group (page 697) command takes the $keyf field and the $reduce field.
Warning:
The db.collection.group() (page 828) method does not work with sharded clusters. Use the
aggregation framework or map-reduce in sharded environments.
The result set must fit within the maximum BSON document size (page 1015).
In version 2.2, the returned array can contain at most 20,000 elements; i.e. at most 20,000
unique groupings. For group by operations that results in more than 20,000 unique groupings, use
mapReduce (page 701). Previous versions had a limit of 10,000 elements.
Prior to 2.4, the db.collection.group() (page 828) method took the mongod (page 925)
instances JavaScript lock, which blocked all other JavaScript execution.
829
Available Properties
Available Functions
args
MaxKey
MinKey
assert()
BinData()
DBPointer()
DBRef()
doassert()
emit()
gc()
HexData()
hex_md5()
isNumber()
isObject()
ISODate()
isString()
Map()
MD5()
NumberInt()
NumberLong()
ObjectId()
print()
printjson()
printjsononeline()
sleep()
Timestamp()
tojson()
tojsononeline()
tojsonObject()
UUID()
version()
Examples The following examples assume an orders collection with documents of the following prototype:
{
_id: ObjectId("5085a95c8fada716c89d0021"),
ord_dt: ISODate("2012-07-01T04:00:00Z"),
ship_dt: ISODate("2012-07-02T04:00:00Z"),
item: { sku: "abc123",
price: 1.99,
uom: "pcs",
qty: 25 }
}
Group by Two Fields The following example groups by the ord_dt and item.sku fields those documents that
have ord_dt greater than 01/01/2011:
db.orders.group( {
key: { ord_dt: 1, 'item.sku': 1 },
cond: { ord_dt: { $gt: new Date( '01/01/2012' ) } },
reduce: function ( curr, result ) { },
initial: { }
} )
830
"ord_dt"
"ord_dt"
"ord_dt"
"ord_dt"
"ord_dt"
"ord_dt"
"ord_dt"
"ord_dt"
"ord_dt"
:
:
:
:
:
:
:
:
:
ISODate("2012-07-01T04:00:00Z"),
ISODate("2012-07-01T04:00:00Z"),
ISODate("2012-07-01T04:00:00Z"),
ISODate("2012-07-01T04:00:00Z"),
ISODate("2012-06-01T04:00:00Z"),
ISODate("2012-06-01T04:00:00Z"),
ISODate("2012-06-01T04:00:00Z"),
ISODate("2012-05-01T04:00:00Z"),
ISODate("2012-05-01T04:00:00Z"),
"item.sku"
"item.sku"
"item.sku"
"item.sku"
"item.sku"
"item.sku"
"item.sku"
"item.sku"
"item.sku"
:
:
:
:
:
:
:
:
:
"abc123"},
"abc456"},
"bcd123"},
"efg456"},
"abc123"},
"efg456"},
"ijk123"},
"abc123"},
"abc456"},
Calculate the Sum The following example groups by the ord_dt and item.sku fields, those documents that
have ord_dt greater than 01/01/2011 and calculates the sum of the qty field for each grouping:
db.orders.group( {
key: { ord_dt: 1, 'item.sku': 1 },
cond: { ord_dt: { $gt: new Date( '01/01/2012' ) } },
reduce: function ( curr, result ) {
result.total += curr.item.qty;
},
initial: { total : 0 }
} )
The result is an array of documents that contain the group by fields and the calculated aggregation field:
[ {
{
{
{
{
{
{
{
{
{
{
"ord_dt"
"ord_dt"
"ord_dt"
"ord_dt"
"ord_dt"
"ord_dt"
"ord_dt"
"ord_dt"
"ord_dt"
"ord_dt"
"ord_dt"
:
:
:
:
:
:
:
:
:
:
:
ISODate("2012-07-01T04:00:00Z"),
ISODate("2012-07-01T04:00:00Z"),
ISODate("2012-07-01T04:00:00Z"),
ISODate("2012-07-01T04:00:00Z"),
ISODate("2012-06-01T04:00:00Z"),
ISODate("2012-06-01T04:00:00Z"),
ISODate("2012-06-01T04:00:00Z"),
ISODate("2012-05-01T04:00:00Z"),
ISODate("2012-05-01T04:00:00Z"),
ISODate("2012-06-08T04:00:00Z"),
ISODate("2012-06-08T04:00:00Z"),
"item.sku"
"item.sku"
"item.sku"
"item.sku"
"item.sku"
"item.sku"
"item.sku"
"item.sku"
"item.sku"
"item.sku"
"item.sku"
:
:
:
:
:
:
:
:
:
:
:
"abc123",
"abc456",
"bcd123",
"efg456",
"abc123",
"efg456",
"ijk123",
"abc123",
"abc456",
"abc123",
"abc456",
"total"
"total"
"total"
"total"
"total"
"total"
"total"
"total"
"total"
"total"
"total"
:
:
:
:
:
:
:
:
:
:
:
25
25
10
10
25
15
20
45
25
25
25
},
},
},
},
},
},
},
},
},
},
} ]
Calculate Sum, Count, and Average The following example groups by the calculated day_of_week field, those
documents that have ord_dt greater than 01/01/2011 and calculates the sum, count, and average of the qty field
for each grouping:
db.orders.group( {
keyf: function(doc) {
return { day_of_week: doc.ord_dt.getDay() } ; },
cond: { ord_dt: { $gt: new Date( '01/01/2012' ) } },
reduce: function ( curr, result ) {
result.total += curr.item.qty;
result.count++;
},
initial: { total : 0, count: 0 },
finalize: function(result) {
831
The result is an array of documents that contain the group by fields and the calculated aggregation field:
[ { "day_of_week" : "Sunday", "total" : 70, "count" : 4, "avg" : 18 },
{ "day_of_week" : "Friday", "total" : 110, "count" : 6, "avg" : 18 },
{ "day_of_week" : "Tuesday", "total" : 70, "count" : 3, "avg" : 23 } ]
See also:
Aggregation Concepts (page 279)
db.collection.insert()
Definition
db.collection.insert(document)
Inserts a document or an array of documents into a collection.
Changed in version 2.2: The insert() (page 832) method can accept an array of documents to perform a
bulk insert of the documents into the collection.
param document,array document A document or array of documents to insert into the collection.
The insert() (page 832) method has the following behaviors:
If the collection does not exist, then the insert() (page 832) method will create the collection.
If the document does not specify an _id field, then MongoDB will add the _id field and assign a unique
ObjectId (page 103) for the document before inserting. Most drivers create an ObjectId and insert the _id
field, but the mongod (page 925) will create and populate the _id if the driver or application does not.
If the document specifies a new field, then the insert() (page 832) method inserts the document with
the new field. This requires no changes to the data model for the collection or the existing documents.
Examples The following examples show how to use the insert() (page 832) method to insert a document or an
array of documents into either the products collection or the bios collection. If the collections do not exist, the
insert() (page 832) method creates the collections. 11
Insert a Document without Specifying an _id Field In the following examples, the document passed to the
insert() (page 832) method does not contain the _id field. During the insert, mongod (page 925) will create the
_id field and assign it a unique ObjectId (page 103) value.
The ObjectId values are specific to the machine and time when the operation is run. As such, your values may
differ from those in the example.
11
You can also view a list of the existing collections in the database using the show collections operation in the mongo (page 942) shell.
832
products Collection The following example inserts a document into the products collection:
db.products.insert( { item: "card", qty: 15 } )
The inserted document includes an _id field with the generated ObjectId value:
{ "_id" : ObjectId("5063114bd386d8fadbd6b004"), "item" : "card", "qty" : 15 }
bios Collection The following example inserts a document into the The bios Example Collection (page 111):
db.bios.insert(
{
name: { first: 'John', last: 'McCarthy' },
birth: new Date('Sep 04, 1927'),
death: new Date('Dec 24, 2011'),
contribs: [ 'Lisp', 'Artificial Intelligence', 'ALGOL' ],
awards: [
{
award: 'Turing Award',
year: 1971,
by: 'ACM'
},
{
award: 'Kyoto Prize',
year: 1988,
by: 'Inamori Foundation'
},
{
award: 'National Medal of Science',
year: 1990,
by: 'National Science Foundation'
}
]
}
)
The returned document includes an _id field with the generated ObjectId value:
{
"_id" : ObjectId("50a1880488d113a4ae94a94a"),
"name" : { "first" : "John", "last" : "McCarthy" },
"birth" : ISODate("1927-09-04T04:00:00Z"),
"death" : ISODate("2011-12-24T05:00:00Z"),
"contribs" : [ "Lisp", "Artificial Intelligence", "ALGOL" ],
"awards" : [
{
"award" : "Turing Award",
"year" : 1971,
"by" : "ACM"
},
{
"award" : "Kyoto Prize",
"year" :1988,s
"by" : "Inamori Foundation"
},
833
{
"award" : "National Medal of Science",
"year" : 1990,
"by" : "National Science Foundation"
}
]
}
Note: Most MongoDB driver clients will include the _id field and generate an ObjectId before sending the insert
operation to MongoDB; however, if the client sends a document without an _id field, the mongod (page 925) will
add the _id field and generate the ObjectId.
Insert a Document Specifying an _id Field In the following examples, the documents passed to the insert()
(page 832) method includes the _id field. The value of _id must be unique within the collection to avoid duplicate
key error.
products Collection This example inserts into the products collection a document that includes an _id field.
The _id value of 10 must be a unique value for the _id field in the products collection. If the value were not
unique, the insert would fail:
db.products.insert( { _id: 10, item: "box", qty: 20 } )
The insert operation creates the following document in the products collection:
{ "_id" : 10, "item" : "box", "qty" : 20 }
bios Collection This example inserts into the bios collection a document that includes an _id field. The _id
value of 1 must be a unique value for the _id field in the bios collection. Otherwise, if the value were not unique,
the insert would fail:
db.bios.insert(
{
_id: 1,
name: { first: 'John', last: 'Backus' },
birth: new Date('Dec 03, 1924'),
death: new Date('Mar 17, 2007'),
contribs: [ 'Fortran', 'ALGOL', 'Backus-Naur Form', 'FP' ],
awards: [
{
award: 'W.W. McDowell Award',
year: 1967,
by: 'IEEE Computer Society'
},
{
award: 'National Medal of Science',
year: 1975,
by: 'National Science Foundation'
},
{
award: 'Turing Award',
year: 1977,
by: 'ACM'
},
{
834
The insert operation created the following document in the bios collection:
{
"_id" : 1,
"name" : { "first" : "John", "last" : "Backus" },
"birth" : ISODate("1924-12-03T05:00:00Z"),
"death" : ISODate("2007-03-17T04:00:00Z"),
"contribs" : [ "Fortran", "ALGOL", "Backus-Naur Form", "FP" ],
"awards" : [
{
"award" : "W.W. McDowell Award",
"year" : 1967,
"by" : "IEEE Computer Society"
},
{
"award" : "National Medal of Science",
"year" : 1975,
"by" : "National Science Foundation"
},
{
"award" : "Turing Award",
"year" : 1977,
"by" : "ACM"
},
{ "award" : "Draper Prize",
"year" : 1993,
"by" : "National Academy of Engineering"
}
]
}
Insert Multiple Documents The following examples perform a bulk insert of multiple documents by passing an
array of documents to the insert() (page 832) method.
products Collection This example inserts three documents into the products collection. The documents in the
array do not need to have the same fields. For instance, the first document in the array has an _id field and a type
field. Because the second and third documents do not contain an _id field, mongod (page 925) will create the _id
field for the second and third documents during the insert:
db.products.insert( [ { _id: 11, item: "pencil", qty: 50, type: "no.2" },
{
item: "pen", qty: 20 },
{
item: "eraser", qty: 25 }
] )
835
bios Collection This example inserts three documents in the bios collection. The documents in the array do not
need to have the same fields. The document with _id: 3 contains a field named title that does not appear in the
other documents. MongoDB does not require the other documents to contain this field:
db.bios.insert(
[
{
_id: 3,
name: { first: 'Grace', last: 'Hopper' },
title: 'Rear Admiral',
birth: new Date('Dec 09, 1906'),
death: new Date('Jan 01, 1992'),
contribs: [ 'UNIVAC', 'compiler', 'FLOW-MATIC', 'COBOL' ],
awards: [
{
award: 'Computer Sciences Man of the Year',
year: 1969,
by: 'Data Processing Management Association'
},
{
award: 'Distinguished Fellow',
year: 1973,
by: ' British Computer Society'
},
{
award: 'W. W. McDowell Award',
year: 1976,
by: 'IEEE Computer Society'
},
{
award: 'National Medal of Technology',
year: 1991,
by: 'United States'
}
]
},
{
_id: 4,
name: { first: 'Kristen', last: 'Nygaard' },
birth: new Date('Aug 27, 1926'),
death: new Date('Aug 10, 2002'),
contribs: [ 'OOP', 'Simula' ],
awards: [
{
award: 'Rosing Prize',
year: 1999,
by: 'Norwegian Data Association'
},
{
award: 'Turing Award',
year: 2001,
by: 'ACM'
},
836
{
award: 'IEEE John von Neumann Medal',
year: 2001,
by: 'IEEE'
}
]
},
{
_id: 5,
name: { first: 'Ole-Johan', last: 'Dahl' },
birth: new Date('Oct 12, 1931'),
death: new Date('Jun 29, 2002'),
contribs: [ 'OOP', 'Simula' ],
awards: [
{
award: 'Rosing Prize',
year: 1999,
by: 'Norwegian Data Association'
},
{
award: 'Turing Award',
year: 2001,
by: 'ACM'
},
{
award: 'IEEE John von Neumann Medal',
year: 2001,
by: 'IEEE'
}
]
}
]
)
db.collection.isCapped()
db.collection.isCapped()
Returns Returns true if the collection is a capped collection, otherwise returns false.
See also:
Capped Collections (page 156)
db.collection.mapReduce()
db.collection.mapReduce(map, reduce, {<out>, <query>, <sort>, <limit>, <finalize>, <scope>,
<jsMode>, <verbose>})
The db.collection.mapReduce() (page 837) method provides a wrapper around the mapReduce
(page 701) command.
db.collection.mapReduce(
<map>,
<reduce>,
{
out: <collection>,
query: <document>,
sort: <document>,
limit: <number>,
837
finalize: <function>,
scope: <document>,
jsMode: <boolean>,
verbose: <boolean>
}
)
838
(page 634) operator expressions cannot access certain global functions or properties, such as db, that are available in the mongo (page 942) shell.
When upgrading to MongoDB 2.4, you will need to refactor your code if your map-reduce operations
(page 701), group (page 697) commands, or $where (page 634) operator expressions include any global shell
functions or properties that are no longer available, such as db.
The following JavaScript functions and properties are available to map-reduce operations (page 701),
the group (page 697) command, and $where (page 634) operator expressions in MongoDB 2.4:
Available Properties
Available Functions
args
MaxKey
MinKey
assert()
BinData()
DBPointer()
DBRef()
doassert()
emit()
gc()
HexData()
hex_md5()
isNumber()
isObject()
ISODate()
isString()
Map()
MD5()
NumberInt()
NumberLong()
ObjectId()
print()
printjson()
printjsononeline()
sleep()
Timestamp()
tojson()
tojsononeline()
tojsonObject()
UUID()
version()
Requirements for the map Function The map function has the following prototype:
function() {
...
emit(key, value);
}
839
emit(this.cust_id, 1);
}
The following map function may call emit(key,value) multiple times depending on the number of
elements in the input documents items field:
function() {
this.items.forEach(function(item){ emit(item.sku, 1); });
}
The map function can access the variables defined in the scope parameter.
Requirements for the reduce Function The reduce function has the following prototype:
function(key, values) {
...
return result;
}
the reduce function must be idempotent. Ensure that the following statement is true:
reduce( key, [ reduce(key, valuesArray) ] ) == reduce( key, valuesArray )
the order of the elements in the valuesArray should not affect the output of the reduce function, so that
the following statement is true:
reduce( key, [ A, B ] ) == reduce( key, [ B, A ] )
out Options You can specify the following options for the out parameter:
Output to a Collection
out: <collectionName>
840
Output to a Collection with an Action This option is only available when passing out a collection that already
exists. This option is not available on secondary members of replica sets.
out: { <action>: <collectionName>
[, db: <dbName>]
[, sharded: <boolean> ]
[, nonAtomic: <boolean> ] }
When you output to a collection with an action, the out has the following parameters:
<action>: Specify one of the following actions:
replace
Replace the contents of the <collectionName> if the collection with the <collectionName> exists.
merge
Merge the new result with the existing result if the output collection already exists. If an existing document
has the same key as the new result, overwrite that existing document.
reduce
Merge the new result with the existing result if the output collection already exists. If an existing document
has the same key as the new result, apply the reduce function to both the new and the existing documents
and overwrite the existing document with the result.
db:
Optional.The name of the database that you want the map-reduce operation to write its output. By default
this will be the same database as the input collection.
sharded:
Optional. If true and you have enabled sharding on output database, the map-reduce operation will
shard the output collection using the _id field as the shard key.
nonAtomic:
New in version 2.2.
Optional. Specify output operation as non-atomic and is valid only for merge and reduce output modes
which may take minutes to execute.
If nonAtomic is true, the post-processing step will prevent MongoDB from locking the database; however,
other clients will be able to read intermediate states of the output collection. Otherwise the map reduce operation
must lock the database during post-processing.
Output Inline Perform the map-reduce operation in memory and return the result. This option is the only available
option for out on secondary members of replica sets.
out: { inline: 1 }
The result must fit within the maximum size of a BSON document (page 1015).
Requirements for the finalize Function The finalize function has the following prototype:
function(key, reducedValue) {
...
return modifiedObject;
}
841
The finalize function receives as its arguments a key value and the reducedValue from the reduce function.
Be aware that:
The finalize function should not access the database for any reason.
The finalize function should be pure, or have no impact outside of the function (i.e. side effects.)
The finalize function can access the variables defined in the scope parameter.
Map-Reduce Examples Consider the following map-reduce operations on a collection orders that contains documents of the following prototype:
{
_id: ObjectId("50a8240b927d5d8b5891743c"),
cust_id: "abc123",
ord_date: new Date("Oct 04, 2012"),
status: 'A',
price: 25,
items: [ { sku: "mmm", qty: 5, price: 2.5 },
{ sku: "nnn", qty: 5, price: 2.5 } ]
}
Return the Total Price Per Customer Perform the map-reduce operation on the orders collection to group by
the cust_id, and calculate the sum of the price for each cust_id:
2. Define the corresponding reduce function with two arguments keyCustId and valuesPrices:
The valuesPrices is an array whose elements are the price values emitted by the map function and
grouped by keyCustId.
The function reduces the valuesPrice array to the sum of its elements.
var reduceFunction1 = function(keyCustId, valuesPrices) {
return Array.sum(valuesPrices);
};
3. Perform the map-reduce on all documents in the orders collection using the mapFunction1 map function
and the reduceFunction1 reduce function.
db.orders.mapReduce(
mapFunction1,
reduceFunction1,
{ out: "map_reduce_example" }
)
Calculate Order and Total Quantity with Average Quantity Per Item In this example, you will perform a
map-reduce operation on the orders collection for all documents that have an ord_date value greater than
01/01/2012. The operation groups by the item.sku field, and calculates the number of orders and the total
quantity ordered for each sku. The operation concludes by calculating the average quantity per order for each sku
value:
1. Define the map function to process each input document:
In the function, this refers to the document that the map-reduce operation is processing.
For each item, the function associates the sku with a new object value that contains the count of 1
and the item qty for the order and emits the sku and value pair.
var mapFunction2 = function() {
for (var idx = 0; idx < this.items.length; idx++) {
var key = this.items[idx].sku;
var value = {
count: 1,
qty: this.items[idx].qty
};
emit(key, value);
}
};
2. Define the corresponding reduce function with two arguments keySKU and countObjVals:
countObjVals is an array whose elements are the objects mapped to the grouped keySKU values
passed by map function to the reducer function.
The function reduces the countObjVals array to a single object reducedValue that contains the
count and the qty fields.
In reducedVal, the count field contains the sum of the count fields from the individual array elements, and the qty field contains the sum of the qty fields from the individual array elements.
var reduceFunction2 = function(keySKU, countObjVals) {
reducedVal = { count: 0, qty: 0 };
for (var idx = 0; idx < countObjVals.length; idx++) {
reducedVal.count += countObjVals[idx].count;
reducedVal.qty += countObjVals[idx].qty;
}
return reducedVal;
};
3. Define a finalize function with two arguments key and reducedVal. The function modifies the
reducedVal object to add a computed field named avg and returns the modified object:
var finalizeFunction2 = function (key, reducedVal) {
reducedVal.avg = reducedVal.qty/reducedVal.count;
return reducedVal;
};
using
the
mapFunction2,
843
db.orders.mapReduce( mapFunction2,
reduceFunction2,
{
out: { merge: "map_reduce_example" },
query: { ord_date:
{ $gt: new Date('01/01/2012') }
},
finalize: finalizeFunction2
}
)
This operation uses the query field to select only those documents with ord_date greater than new
Date(01/01/2012). Then it output the results to a collection map_reduce_example. If the
map_reduce_example collection already exists, the operation will merge the existing contents with the
results of this map-reduce operation.
For more information and examples, see the Map-Reduce (page 282) page and Perform Incremental Map-Reduce
(page 300).
See also:
Troubleshoot the Map Function (page 302)
Troubleshoot the Reduce Function (page 303)
mapReduce (page 701) command
Aggregation Concepts (page 279)
db.collection.reIndex()
db.collection.reIndex()
This method drops all indexes and recreates them. This operation may be expensive for collections that have a
large amount of data and/or a large number of indexes.
Call this method, which takes no arguments, on a collection object. For example:
db.collection.reIndex()
Change collection to the name of the collection that you want to rebuild the index.
See
Index Creation (page 335) for more information on the behavior of indexing operations in MongoDB.
db.collection.remove()
Definition
db.collection.remove(query, justOne)
Removes documents from a collection.
The remove() (page 844) method has the following parameters:
param document query Specifies deletion criteria using query operators (page 621). To delete all
documents in a collection, omit this parameter or pass an empty document ({}).
param boolean justOne To limit the deletion to just one document, set to true. The default value
is false.
844
Note: You cannot use the remove() (page 844) method with a capped collection.
Examples The following are examples of the remove() (page 844) method.
Remove All Documents from a Collection To remove all documents in a collection, call the remove (page 844)
method with no parameters: The following operation deletes all documents from the bios collection (page 111):
db.bios.remove()
Note: This operation is not equivalent to the drop() (page 812) method.
To remove all documents from a collection, it may be more efficient to use the drop() (page 812) method to drop
the entire collection, including the indexes, and then recreate the collection and rebuild the indexes.
Remove All Documents that Match a Condition To remove the documents that match a deletion criteria, call the
remove() (page 844) method with the <query> parameter:
The following operation deletes all documents from the bios collection (page 111) where the subdocument name
contains a field first whose value starts with G:
db.bios.remove( { 'name.first' : /^G/ } )
The following operation removes all the documents from the collection products where qty is greater than 20:
db.products.remove( { qty: { $gt: 20 } } )
Remove a Single Document that Matches a Condition To remove the first document that match a deletion criteria,
call the remove (page 844) method with the query criteria and the justOne parameter set to true or 1.
The following operation deletes a single document from the bios collection (page 111) where the turing field equals
true:
db.bios.remove( { turing: true }, 1 )
The following operation removes the first document from the collection products where qty is greater than 20:
db.products.remove( { qty: { $gt: 20 } }, true )
Capped Collection You cannot use the remove() (page 844) method with a capped collection.
Isolation If the <query> argument to the remove() (page 844) method matches multiple documents in the collection, the delete operation may interleave with other write operations to that collection. For an unsharded collection,
you have the option to override this behavior with the $isolated (page 663) isolation operator, effectively isolating
the delete operation and blocking other write operations during the delete. To isolate the query, include $isolated:
1 in the <query> parameter as in the following examples:
db.bios.remove( { turing: true, $isolated: 1 } )
db.products.remove( { qty: { $gt: 20 }, $isolated: 1 } )
db.collection.renameCollection()
11.1. MongoDB Interface
845
Definition
db.collection.renameCollection(target, string)
Renames a collection. Provides a wrapper for the renameCollection (page 744) database command.
param string target The new name of the collection. Enclose the string in quotes.
param boolean dropTarget If true,
mongod (page 925) drops
renameCollection (page 744) prior to renaming the collection.
the
target
of
Example Call the db.collection.renameCollection() (page 846) method on a collection object. For
example:
db.rrecord.renameCollection("record")
This operation will rename the rrecord collection to record. If the target name (i.e. record) is the name of an
existing collection, then the operation will fail.
Limitations The method has the following limitations:
db.collection.renameCollection() (page 846) cannot move a collection between databases. Use
renameCollection (page 744) for these rename operations.
db.collection.renameCollection() (page 846) cannot operation on sharded collections.
The db.collection.renameCollection() (page 846) method operates within a collection by changing the
metadata associated with a given collection.
Refer to the documentation renameCollection (page 744) for additional warnings and messages.
Warning: The db.collection.renameCollection() (page 846) method and renameCollection
(page 744) command will invalidate open cursors which interrupts queries that are currently returning data.
db.collection.save()
Definition
db.collection.save(document)
Updates an existing document or inserts a new document, depending on its document parameter.
The save() (page 846) method takes the following parameter:
param document document A document to save to the collection.
If the document does not contain an _id field, then the save() (page 846) method performs an insert. During
the operation, mongod (page 925) will add to the document the _id field and assign it a unique ObjectId
(page 103).
If the document contains an _id field, then the save() (page 846) method performs an upsert, querying the
collection on the _id field. If a document does not exist with the specified _id value, the save() (page 846)
method performs an insert. If a document exists with the specified _id value, the save() (page 846) method
performs an update that replaces all fields in the existing document with the fields from the document.
Examples The following are examples of the save() (page 846) method. The following examples show how to
use the save() (page 846) method to insert or update a document into either the products collection or the bios
collection.
846
Save a New Document without Specifying an _id Field In the following examples, the parameter to the save()
(page 846) method is a document without an _id field. This means the save() (page 846) method performs an
insert. During the insert, mongod (page 925) will create the _id field with a unique ObjectId (page 103) value.
The ObjectId values are specific to the machine and time when the operation is run. As such, your values may
differ from those in the example.
products Collection This example inserts into the products collection a document with the item field set to
book and qty field set to 40:
db.products.save( { item: "book", qty: 40 } )
The inserted document includes an _id field with the generated ObjectId value:
{ "_id" : ObjectId("50691737d386d8fadbd6b01d"), "item" : "book", "qty" : 40 }
bios Collection This example inserts a new document into the The bios Example Collection (page 111):
db.bios.save(
{
name: { first: 'Guido', last: 'van Rossum'},
birth: new Date('Jan 31, 1956'),
contribs: [ 'Python' ],
awards: [
{
award: 'Award for the Advancement of Free Software',
year: 2001,
by: 'Free Software Foundation'
},
{
award: 'NLUUG Award',
year: 2003,
by: 'NLUUG'
}
]
}
)
The inserted document includes an _id field with the generated ObjectId value.
Note: Most MongoDB driver clients will include the _id field and generate an ObjectId before sending the insert
operation to MongoDB; however, if the client sends a document without an _id field, the mongod (page 925) will
add the _id field and generate the ObjectId.
Save a New Document Specifying an _id Field The following examples pass the save() (page 846) method a
document with an _id field. Because the _id field holds a value that does not exist in the collection, the operations
insert new documents. The results of these operations are identical to an update operation with the upsert flag (page 50)
set to true or 1.
products Collection This example creates a new document with the _id field set to 100, the item field set to
water, and the qty field set to 30:
db.products.save( { _id: 100, item: "water", qty: 30 } )
847
The operation results in the following new document in the products collection:
{ "_id" : 100, "item" : "water", "qty" : 30 }
bios Collection This example creates a new document in the The bios Example Collection (page 111). The
save() (page 846) method inserts the document because the bios collection has no document with the _id field
equal to 10:
db.bios.save(
{
_id: 10,
name: { first: 'Yukihiro', aka: 'Matz', last: 'Matsumoto'},
birth: new Date('Apr 14, 1965'),
contribs: [ 'Ruby' ],
awards: [
{
award: 'Award for the Advancement of Free Software',
year: '2011',
by: 'Free Software Foundation'
}
]
}
)
Replace an Existing Document The following example passes the save() (page 846) method a document with
an _id field. Because the _id field holds a value that does exist in the collection, the operation performs an update
to replace the existing document.
products Collection This example replaces the fields and values of the following document that currently exists
in the products collection:
{ "_id" : 100, "item" : "water", "qty" : 30 }
To replace the documents data, pass the save() (page 846) method a document that contains the _id field set to
100 and the new fields and values:
db.products.save( { _id : 100, item : "juice" } )
db.collection.stats()
Definition
db.collection.stats(scale)
Returns statistics about the collection. The method includes the following parameter:
param number scale The scale used in the output to display the sizes of items. By default, output
displays sizes in bytes. To display kilobytes rather than bytes, specify a scale value of 1024.
Returns A document containing statistics that reflecting the state of the specified collection.
The stats() (page 848) method provides a wrapper around the database command collStats (page 763).
848
See also:
collStats (page 763) for an overview of the output of this command.
db.collection.storageSize()
db.collection.storageSize()
Returns The total amount of storage allocated to this collection for document storage. Provides
a wrapper around the storageSize (page 764) field of the collStats (page 763) (i.e.
db.collection.stats() (page 848)) output.
db.collection.totalSize()
db.collection.totalSize()
Returns The total size of the data in the collection plus the size of every indexes on the collection.
db.collection.totalIndexSize()
db.collection.totalIndexSize()
Returns The total size of all indexes for the collection. This method provides a wrapper
around the totalIndexSize (page 765) output of the collStats (page 763) (i.e.
db.collection.stats() (page 848)) operation.
db.collection.update()
Definition
db.collection.update(query, update, options)
Modifies an existing document or documents in a collection. By default, the update() (page 849) method
updates a single document. If the multi option is set to true, the method updates all documents that match
the query criteria.
Changed in version 2.2: The update() (page 849) method has the following form:
db.collection.update(
<query>,
<update>,
{
upsert: <Boolean>,
multi: <Boolean>,
}
)
Prior to version 2.2, the update() (page 849) method has the following form:
db.collection.update( <query>, <update>, <upsert>, <multi> )
param document update The modifications to apply. For details see Update Parameter (page 850)
after this table.
11.1. MongoDB Interface
849
param document,Boolean upsert If set to true, creates a new document when no document
matches the query criteria. The default value is false, which does not insert a new document when no match is found. The syntax for this parameter depends on the MongoDB version.
See Upsert Parameter (page 850).
param document,Boolean multi If set to true, updates multiple documents that meet the query
criteria. If set to false, updates one document. The default value is false. For additional
information, see Multi Parameter (page 850).
The update() (page 849) method either updates specific fields in the existing document or replaces the document. The method updates specific fields if the <update> parameter contains only update operator (page 651)
expressions, such as a $set (page 655) operator expression. Otherwise the method replaces the existing document.
To update fields in subdocuments, use dot notation.
The update() (page 849) method can modify the name of a field using the $rename (page 652) operator.
Update Parameter If the <update> document contains update operator (page 651) expressions, such those using
the $set (page 655) operator, then:
The <update> document must contain only update operator (page 651) expressions.
The update() (page 849) method updates only the corresponding fields in the document. For an example, see
Update Specific Fields (page 851).
If the <update> document contains only field:value expressions, then:
The update() (page 849) method replaces the matching document with the <update> document. The
update() (page 849) method does not replace the _id value. For an example, see Replace All Fields
(page 852).
update() (page 849) cannot update multiple documents.
Upsert Parameter In MongoDB versions 2.2 and later, the upsert parameter has the following form:
upsert : true|false
Prior to version 2.2, the upsert parameter is a positional Boolean. To enable, specify true as the third parameter
to update() (page 849).
If upsert is set to true and if no document matches the query criteria, update() (page 849) inserts a single
document. The upsert creates the new document with either:
The fields and values of the <update> parameter, or
The fields and values of the both the <query> and <update> parameters. The upsert creates a document
with data from both <query> and <update> if the <update> parameter contains only update operator
(page 651) expressions.
For examples, see Insert a New Document if No Match Exists (Upsert) (page 852).
Multi Parameter In MongoDB versions 2.2 and later, the multi parameter has the following form:
multi : true|false
Prior to version 2.2, the multi parameter is a positional Boolean. To enable the multiple updates, specify true as
the fourth parameter to update() (page 849).
850
If multi is set to true, the update() (page 849) method updates all documents that meet the <query> criteria.
The multi update operation may interleave with other write operations. For unsharded collections, you can override
this behavior with the $isolated (page 663) isolation operator, which isolates the update operation and blocks other
write operations during the update.
For an example, see Update Multiple Documents (page 854).
Examples The following examples use the MongoDB version 2.2 interface to specify options in the document
form.
Update Specific Fields To update specific fields in a document, use update operators (page 651) in the <update>
parameter. For example, given a books collection with the following document:
{ "_id" : 11, "item" : "Divine Comedy", "stock" : 2 }
The following operation uses the $set (page 655) and $inc (page 651) operators to add a price field and to
increment stock by 5.
db.books.update( { item: "Divine Comedy" },
{
$set: { price: 18 },
$inc: { stock: 5 }
} )
Update Specific Fields in Subdocuments Use dot notation to update values in subdocuments. The following example, which uses the bios collection (page 111), queries for the document with _id equal to 1 and updates the value
of the field middle in the subdocument name:
db.bios.update( { _id: 1 }, { $set: { "name.middle": "Warner" } } )
Add New Fields If the <update> parameter contains fields not currently in the document, the update()
(page 849) method adds the new fields to the document. The following operation adds two new fields: mbranch
and aka. The operation adds aka in the subdocument name:
db.bios.update(
{ _id: 3 },
{ $set: {
mbranch: "Navy",
"name.aka": "Amazing Grace"
}
}
)
Remove Fields The following operation uses the $unset (page 655) operator to remove the birth field from the
document that has _id equal to 3:
db.bios.update( { _id: 3 }, { $unset: { birth: 1 } } )
851
Replace All Fields Given the following document in the books collection:
{
"_id" : 22,
"item" : "The Banquet",
"author" : "Dante",
"price" : 20,
"stock" : 4
}
The following operation passes an <update> document that contains only field and value pairs, which means the
document replaces all the fields in the original document. The operation does not replace the _id value. The operation
contains the same value for the item field in both the <query> and <update> documents, which means the field
does not change:
db.books.update( { item: "The Banquet" },
{ item: "The Banquet", price: 19 , stock: 3 }
)
The operation creates the following new document. The operation removed the author field and changed the values
of the price and stock fields:
{
"_id" : 22,
"item" : "The Banquet",
"price" : 19,
"stock" : 3
}
In the next example, which uses the bios collection (page 111), the operation changes all values for the document
including the value used to locate the document. The operation locates a document by querying for name set to {
first: "John", last: "McCarthy" } and then issues a replacement document that includes the name
field set to { first: "Ken", last: "Iverson" }.
db.bios.update(
{ name: { first: "John", last: "McCarthy" } },
{
name: { first: "Ken", last: "Iverson" },
birth: new Date("Dec 17, 1941"),
died: new Date("Oct 19, 2004"),
contribs: [ "APL", "J" ],
awards: [
{ award: "Turing Award",
year: 1979,
by: "ACM" },
{ award: "Harry H. Goode Memorial Award",
year: 1975,
by: "IEEE Computer Society" },
{ award: "IBM Fellow",
year: 1970,
by: "IBM" }
]
}
)
852
Insert a New Document if No Match Exists (Upsert) The following command sets the upsert option to true 12
so that update() (page 849) creates a new document in the books collection if no document matches the <query>
parameter:
db.books.update( { item: "The New Life" },
{ item: "The New Life",
author: "Dante",
price: 15 },
{ upsert: true }
)
If no document matches the <query> parameter, the upsert inserts a document with the fields and values of the
<update> parameter and a new unique ObjectId for the _id field:
{
"_id" : ObjectId("51e5990c95098ed69d4a89f2"),
"author" : "Dante",
"item" : "The New Life",
"price" : 15
}
In the next example, the <update> parameter includes only update operators (page 651). If no document matches
the <query> parameter, the upsert inserts a document with the fields and values of the both the <query> and
<update> parameters:
db.bios.update(
{
_id: 7,
name: { first: "Ken", last: "Thompson" }
},
{
$set: {
birth: new Date("Feb 04, 1943"),
contribs: [ "UNIX", "C", "B", "UTF-8" ],
awards: [
{
award: "Turing Award",
year: 1983,
by: "ACM"
},
{
award: "IEEE Richard W. Hamming Medal",
year: 1990,
by: "IEEE"
},
{
award: "National Medal of Technology",
year: 1998,
by: "United States"
},
{
award: "Tsutomu Kanai Award",
year: 1999,
by: "IEEE"
},
{
award: "Japan Prize",
12 Prior to version 2.2, in the mongo (page 942) shell, you would specify the upsert and the multi options in the update() (page 849)
method as positional boolean options. See update() (page 849) for details.
853
year: 2011,
by: "The Japan Prize Foundation"
}
]
}
},
{ upsert: true }
)
Update Multiple Documents To update multiple documents, set the multi option to true 13 . The following
example queries the bios collection (page 111) for all documents where awards.award is set to Turing. The
update sets the turing field to true:
db.bios.update(
{ "awards.award": "Turing" },
{ $set: { turing: true } },
{ multi: true }
)
Combine the Upsert and Multi Parameters Given a books collection that includes the following documents:
{
"_id" : 11,
"author" : "Dante",
"item" : "Divine Comedy",
"price" : 18 }
{
"_id" : 22,
"author" : "Dante",
"item" : "The Banquet",
"price" : 19 }
{
"_id" : 33,
"author" : "Dante",
"item" : "Monarchia",
"price" : 14
}
The following command uses the multi parameter to update all documents where author is "Dante" and uses
the upsert parameter to create a new document if no such documents are found 14 :
db.books.update( { author: "Dante" },
{ $set: { born: "Florence", died: "Ravenna" } },
{ upsert: true, multi: true }
)
854
"died" : "Ravenna",
"item" : "Divine Comedy",
"price" : 18
}
{
"_id" : 22,
"author" : "Dante",
"born" : "Florence",
"died" : "Ravenna",
"item" : "The Banquet",
"price" : 19
}
{
"_id" : 33,
"author" : "Dante",
"born" : "Florence",
"died" : "Ravenna",
"item" : "Monarchia",
"price" : 14
}
Update Arrays
Update an Element by Position If the update operation requires an update of an element in an array field, the
update() (page 849) method can perform the update using the position of the element and dot notation. Arrays in
MongoDB are zero-based.
The following operation queries the bios collection (page 111) for the first document with _id field equal to 1 and
updates the second element in the contribs array:
db.bios.update(
{ _id: 1 },
{ $set: { "contribs.1": "ALGOL 58" } }
)
Update an Element if Position is Unknown If the position in the array is not known, the update() (page 849)
method can perform the update using the $ (page 656) positional operator. The array field must appear in the <query>
parameter in order to determine which array element to update.
The following operation queries the bios collection (page 111) for the first document where the _id field equals 3 and
the contribs array contains an element equal to compiler. If found, the update() (page 849) method updates
the first matching element in the array to A compiler in the document:
db.bios.update(
{ _id: 3, "contribs": "compiler" },
{ $set: { "contribs.$": "A compiler" } }
)
Update a Document Element The update() (page 849) method can perform the update of an array that contains
subdocuments by using the positional operator (i.e. $ (page 656)) and the dot notation.
The following operation queries the bios collection (page 111) for the first document where the _id field equals 6
and the awards array contains a subdocument element with the by field equal to ACM. If found, the update()
(page 849) method updates the by field in the first matching subdocument:
855
db.bios.update(
{ _id: 6, "awards.by": "ACM" } ,
{ $set: { "awards.$.by": "Association for Computing Machinery" } }
)
Add an Element The following operation queries the bios collection (page 111) for the first document that has an
_id field equal to 1 and adds a new element to the awards field:
db.bios.update(
{ _id: 1 },
{
$push: { awards: { award: "IBM Fellow", year: 1963, by: "IBM" } }
}
)
In the next example, the $set (page 655) operator uses dot notation (page 94) to access the middle field in the
name subdocument. The $push (page 659) operator adds another document as element to the field awards.
Consider the following operation:
db.bios.update(
{ _id: 1 },
{
$set: { "name.middle": "Warner" },
$push: { awards: {
award: "IBM Fellow",
year: "1963",
by: "IBM"
}
}
}
)
856
The validate() (page 856) method output provides an in-depth view of how the collection uses storage. Be
aware that this command is potentially resource intensive and may impact the performance of your MongoDB
instance.
The validate() (page 856) method is a wrapper around the validate (page 771) database command.
See also:
validate (page 771)
857
Cursor
Cursor Methods
Name
Description
cursor.addOption() Adds special wire protocol flags that modify the behavior of the query.
(page 858)
cursor.batchSize() Controls the number of documents MongoDB will return to the client in a single
(page 859)
network message.
cursor.count()
Returns a count of the documents in a cursor.
(page 860)
cursor.explain()
Reports on the query execution plan, including index use, for a cursor.
(page 861)
cursor.forEach()
Applies a JavaScript function for every document in a cursor.
(page 866)
cursor.hasNext()
Returns true if the cursor has documents and can be iterated.
(page 866)
cursor.hint()
Forces MongoDB to use a specific index for a query.
(page 866)
cursor.limit()
Constrains the size of a cursors result set.
(page 867)
cursor.map()
Applies a function to each document in a cursor and collects the return values in an
(page 867)
array.
cursor.max()
Specifies an exclusive upper index bound for a cursor. For use with
(page 867)
cursor.hint() (page 866)
cursor.min()
Specifies an inclusive lower index bound for a cursor. For use with
(page 869)
cursor.hint() (page 866)
cursor.next()
Returns the next document in a cursor.
(page 870)
cursor.objsLeftInBatch()
Returns the number of documents left in the current cursor batch.
(page 871)
cursor.readPref()
Specifies a read preference to a cursor to control how the client directs queries to a
(page 871)
replica set.
cursor.showDiskLoc()Returns a cursor with modified documents that include the on-disk location of the
(page 871)
document.
cursor.size()
Returns a count of the documents in the cursor after applying skip() (page 871)
(page 871)
and limit() (page 867) methods.
cursor.skip()
Returns a cursor that begins returning results only after passing or skipping a
(page 871)
number of documents.
cursor.snapshot()
Forces the cursor to use the index on the _id field. Ensures that the cursor returns
(page 872)
each document, with regards to the value of the _id field, only once.
cursor.sort()
Returns results ordered according to a sort specification.
(page 872)
cursor.toArray()
Returns an array that contains all documents returned by the cursor.
(page 874)
cursor.addOption()
Definition
cursor.addOption(flag)
Adds OP_QUERY wire protocol flags, such as the tailable flag, to change the behavior of queries.
858
Warning: Adding incorrect wire protocol flags can cause problems and/or extra server load.
cursor.batchSize()
Definition
cursor.batchSize(size)
Specifies the number of documents to return in each batch of the response from the MongoDB instance. In most
cases, modifying the batch size will not affect the user or the application, as the mongo (page 942) shell and
most drivers (page 95) return results as if MongoDB returned a single batch.
The batchSize() (page 859) method takes the following parameter:
param integer size The number of documents to return per batch. Do not use a batch size of 1.
Note: Specifying 1 or a negative number is analogous to using the limit() (page 867) method.
15 http://docs.mongodb.org/meta-driver/latest/legacy/mongodb-wire-protocol/?pageVersion=106#op-query
16 http://docs.mongodb.org/meta-driver/latest/legacy/mongodb-wire-protocol/?pageVersion=106#op-query
859
Example The following example sets the batch size for the results of a query (i.e. find() (page 816)) to 10. The
batchSize() (page 859) method does not change the output in the mongo (page 942) shell, which, by default,
iterates over the first 20 documents.
db.inventory.find().batchSize(10)
cursor.count()
Definition
cursor.count()
Counts the number of documents referenced by a cursor. Append the count() (page 860) method to a find()
(page 816) query to return the number of matching documents. The operation does not perform the query but
instead counts the results that would be returned by the query.
The count() (page 860) method has the following prototype form:
db.collection.find().count()
Example
Count the number of the documents in the orders collection with the field ord_dt greater than new
Date(01/01/2012):
db.orders.find( { ord_dt: { $gt: new Date('01/01/2012') } } ).count()
Example
Count the number of the documents in the orders collection with the field ord_dt greater than new
Date(01/01/2012) taking into account the effect of the limit(5):
db.orders.find( { ord_dt: { $gt: new Date('01/01/2012') } } ).limit(5).count(true)
860
cursor.explain()
Definition
cursor.explain(verbose)
Provides information on the query plan. The query plan is the plan the server uses to find the matches for a
query. This information may be useful when optimizing a query. The explain() (page 861) method returns
a document that describes the process used to return the query results.
The explain() (page 861) method has the following form:
db.collection.find().explain()
When you run explain() (page 861) with hint() (page 866), the query optimizer does not reevaluate the
query plans.
Note: In some situations, the explain() (page 861) operation may differ from the actual query plan used
by MongoDB in a normal query. The explain() (page 861) operation evaluates the set of query plans and
reports on the winning plan for the query. In normal operations the query optimizer caches winning query plans
and uses them for similar related queries in the future. As a result MongoDB may sometimes select query plans
from the cache that are different from the plan displayed using explain() (page 861).
See also:
$explain (page 688)
Optimization Strategies for MongoDB (page 160) page for information regarding optimization strategies.
Analyze Performance of Database Operations (page 167) tutorial for information regarding the database
profile.
Current Operation Reporting (page 879)
Explain Results
861
Explain on Queries on Unsharded Collections For queries on unsharded collections, explain() (page 861)
returns the following core information.
{
"cursor" : "<Cursor Type and Index>",
"isMultiKey" : <boolean>,
"n" : <num>,
"nscannedObjects" : <num>,
"nscanned" : <num>,
"nscannedObjectsAllPlans" : <num>,
"nscannedAllPlans" : <num>,
"scanAndOrder" : <boolean>,
"indexOnly" : <boolean>,
"nYields" : <num>,
"nChunkSkips" : <num>,
"millis" : <num>,
"indexBounds" : { <index bounds> },
"allPlans" : [
{ "cursor" : "<Cursor Type and Index>",
"n" : <num>,
"nscannedObjects" : <num>,
"nscanned" : <num>,
"indexBounds" : { <index bounds> }
},
...
],
"oldPlan" : {
"cursor" : "<Cursor Type and Index>",
"indexBounds" : { <index bounds> }
}
"server" : "<host:port>",
}
For details on the fields, see Core Explain Output Fields (page 863).
Explain on $or Queries Queries with $or (page 625) operator execute each clause of the $or (page 625) expression in parallel and can use separate indexes on the individual clauses. If the query uses indexes on any or all of the
querys clause, explain() (page 861) contains output (page 863) for each clause as well as the cumulative data for
the entire query:
{
"clauses" : [
{
<core explain output>
},
{
<core explain output>
},
...
],
"n" : <num>,
"nscannedObjects" : <num>,
"nscanned" : <num>,
"nscannedObjectsAllPlans" : <num>,
"nscannedAllPlans" : <num>,
"millis" : <num>,
"server" : "<host:port>"
}
862
For details on the fields, see $or Query Output Fields (page 865) and Core Explain Output Fields (page 863).
Explain on Queries on Sharded Collections For queries on sharded collections, explain() (page 861) returns
information for each shard the query accesses. For queries on unsharded collections, see Core Explain Output Fields
(page 863).
For queries on a sharded collection, the output contains the Core Explain Output Fields (page 863) for each accessed
shard and cumulative shard information (page 865):
{
"clusteredType" : "<Shard Access Type>",
"shards" : {
"<shard1>" : [
{
<core explain output>
}
],
"<shard2>" : [
{
<core explain output>
}
],
...
},
"millisShardTotal" : <num>,
"millisShardAvg" : <num>,
"numQueries" : <num>,
"numShards" : <num>,
"cursor" : "<Cursor Type and Index>",
"n" : <num>,
"nChunkSkips" : <num>,
"nYields" : <num>,
"nscanned" : <num>,
"nscannedAllPlans" : <num>,
"nscannedObjects" : <num>,
"nscannedObjectsAllPlans" : <num>,
"millis" : <num>
}
For details on these fields, see Core Explain Output Fields (page 863) for each accessed shard and Sharded Collections
Output Fields (page 865).
Explain Output Fields
Core Explain Output Fields This section explains output for queries on collections that are not sharded. For queries
on sharded collections, see Explain on Queries on Sharded Collections (page 863).
explain.cursor
cursor (page 863) is a string that reports the type of cursor used by the query operation:
BasicCursor indicates a full collection scan.
BtreeCursor indicates that the query used an index. The cursor includes name of the index. When a
query uses an index, the output of explain() (page 861) includes indexBounds (page 865) details.
GeoSearchCursor indicates that the query used a geospatial index.
863
For BtreeCursor cursors, MongoDB will append the name of the index to the cursor string. Additionally,
depending on how the query uses an index, MongoDB may append one or both of the following strings to the
cursor string:
reverse indicates that query transverses the index from the highest values to the lowest values (e.g.
right to left.)
multi indicates that the query performed multiple look-ups. Otherwise, the query uses the index to
determine a range of possible matches.
explain.isMultiKey
isMultiKey (page 864) is a boolean. When true, the query uses a multikey index (page 324), where one of
the fields in the index holds an array.
explain.n
n (page 864) is a number that reflects the number of documents that match the query selection criteria.
explain.nscannedObjects
Specifies the total number of documents scanned during the query. The nscannedObjects (page 864)
may be lower than nscanned (page 864), such as if the index covers (page 368) a query. See indexOnly
(page 864). Additionally, the nscannedObjects (page 864) may be lower than nscanned (page 864) in
the case of multikey index on an array field with duplicate documents.
explain.nscanned
Specifies the total number of documents or index entries scanned during the database operation. You want n
(page 864) and nscanned (page 864) to be close in value as possible. The nscanned (page 864) value may
be higher than the nscannedObjects (page 864) value, such as if the index covers (page 368) a query. See
indexOnly (page 864).
explain.nscannedObjectsAllPlans
New in version 2.2.
nscannedObjectsAllPlans (page 864) is a number that reflects the total number of documents scanned
for all query plans during the database operation.
explain.nscannedAllPlans
New in version 2.2.
nscannedAllPlans (page 864) is a number that reflects the total number of documents or index entries
scanned for all query plans during the database operation.
explain.scanAndOrder
scanAndOrder (page 864) is a boolean that is true when the query cannot use the order of documents in
the index for returning sorted results: MongoDB must sort the documents after it receives the documents from
a cursor.
If scanAndOrder (page 864) is false, MongoDB can use the order of the documents in an index to return
sorted results.
explain.indexOnly
indexOnly (page 864) is a boolean value that returns true when the query is covered (page 368) by the
index indicated in the cursor (page 863) field. When an index covers a query, MongoDB can both match the
query conditions (page 68) and return the results using only the index because:
all the fields in the query (page 68) are part of that index, and
all the fields returned in the results set are in the same index.
explain.nYields
nYields (page 864) is a number that reflects the number of times this query yielded the read lock to allow
waiting writes execute.
864
explain.nChunkSkips
nChunkSkips (page 864) is a number that reflects the number of documents skipped because of active chunk
migrations in a sharded system. Typically this will be zero. A number greater than zero is ok, but indicates a
little bit of inefficiency.
explain.millis
millis (page 865) is a number that reflects the time in milliseconds to complete the query.
explain.indexBounds
indexBounds (page 865) is a document that contains the lower and upper index key bounds. This field
resembles one of the following:
"indexBounds" : {
"start" : { <index key1> : <value>, ...
"end" : { <index key1> : <value>, ... }
},
},
"indexBounds" : { "<field>" : [ [ <lower bound>, <upper bound> ] ],
...
}
explain.allPlans
allPlans (page 865) is an array that holds the list of plans the query optimizer runs in order to select the
index for the query. Displays only when the <verbose> parameter to explain() (page 861) is true or 1.
explain.oldPlan
New in version 2.2.
oldPlan (page 865) is a document value that contains the previous plan selected by the query optimizer for
the query. Displays only when the <verbose> parameter to explain() (page 861) is true or 1.
explain.server
New in version 2.2.
server (page 865) is a string that reports the MongoDB server.
$or Query Output Fields
explain.clauses
clauses (page 865) is an array that holds the Core Explain Output Fields (page 863) information for each
clause of the $or (page 625) expression. clauses (page 865) is only included when the clauses in the $or
(page 625) expression use indexes.
Sharded Collections Output Fields
explain.clusteredType
clusteredType (page 865) is a string that reports the access pattern for shards. The value is:
ParallelSort, if the mongos (page 938) queries shards in parallel.
SerialServer, if the mongos (page 938) queries shards sequentially.
explain.shards
shards (page 865) contains fields for each shard in the cluster accessed during the query. Each field holds the
Core Explain Output Fields (page 863) for that shard.
explain.millisShardTotal
millisShardTotal (page 865) is a number that reports the total time in milliseconds for the query to run
on the shards.
865
explain.millisShardAvg
millisShardAvg (page 865) is a number that reports the average time in millisecond for the query to run on
each shard.
explain.numQueries
numQueries (page 866) is a number that reports the total number of queries executed.
explain.numShards
numShards (page 866) is a number that reports the total number of shards queried.
cursor.forEach()
Description
cursor.forEach(function)
Iterates the cursor to apply a JavaScript function to each document from the cursor.
The forEach() (page 866) method has the following prototype form:
db.collection.find().forEach(<function>)
See also:
cursor.map() (page 867) for similar functionality.
cursor.hasNext()
cursor.hasNext()
Returns Boolean.
cursor.hasNext() (page 866) returns true if the cursor returned by the db.collection.find()
(page 816) query can iterate further to return more documents.
cursor.hint()
Definition
cursor.hint(index)
Call this method on a query to override MongoDBs default index selection and query optimization process.
Use db.collection.getIndexes() (page 826) to return the list of current indexes on a collection.
The cursor.hint() (page 866) method has the following parameter:
param string,document index The index to hint or force MongoDB to use when performing the
query. Specify the index either by the index name or by the index specification document.
866
See
Indexing Tutorials (page 338) for information.
Example The following example returns all documents in the collection named users using the index on the age
field.
db.users.find().hint( { age: 1 } )
You can also specify the index using the index name:
db.users.find().hint( "age_1" )
See also:
$hint (page 689)
cursor.limit()
cursor.limit()
Use the limit() (page 867) method on a cursor to specify the maximum number of documents the cursor will
return. limit() (page 867) is analogous to the LIMIT statement in a SQL database.
Note: You must apply limit() (page 867) to the cursor before retrieving any documents from the database.
Use limit() (page 867) to maximize performance and prevent MongoDB from returning more results than
required for processing.
A limit() (page 867) value of 0 (e.g. .limit(0) (page 867)) is equivalent to setting no limit.
cursor.map()
cursor.map(function)
Applies function to each document visited by the cursor and collects the return values from successive
application into an array.
The cursor.map() (page 867) method has the following parameter:
param function function A function to apply to each document visited by the cursor.
Example
db.users.find().map( function(u) { return u.name; } );
See also:
cursor.forEach() (page 866) for similar functionality.
cursor.max()
Definition
cursor.max()
Specifies the exclusive upper bound for a specific index in order to constrain the results of find() (page 816).
max() (page 867) provides a way to specify an upper bound on compound key indexes.
The max() (page 867) method has the following parameter:
867
param document indexBounds The exclusive upper bound for the index keys.
The indexBounds parameter has the following prototype form:
{ field1: <max value>, field2: <max value2> ... fieldN:<max valueN>}
The fields correspond to all the keys of a particular index in order. You can explicitly specify the particular
index with the hint() (page 866) method. Otherwise, mongod (page 925) selects the index using the fields
in the indexBounds; however, if multiple indexes exist on same fields with different sort orders, the selection
of the index may be ambiguous.
See also:
min() (page 869).
Note: max() (page 867) is a shell wrapper around the query modifier $max (page 690).
Behavior
Because max() (page 867) requires an index on a field, and forces the query to use this index, you may prefer
the $lt (page 623) operator for the query if possible. Consider the following example:
db.products.find( { _id: 7 } ).max( { price: 1.39 } )
The query will use the index on the price field, even if the index on _id may be better.
max() (page 867) exists primarily to support the mongos (page 938) (sharding) process.
If you use max() (page 867) with min() (page 869) to specify a range, the index bounds specified in min()
(page 869) and max() (page 867) must both refer to the keys of the same index.
Example This example assumes a collection named products that holds the following documents:
{
{
{
{
{
{
{
{
{
{
"_id"
"_id"
"_id"
"_id"
"_id"
"_id"
"_id"
"_id"
"_id"
"_id"
:
:
:
:
:
:
:
:
:
:
"_id" : 1 }
"item" : 1, "type" : 1 }
"item" : 1, "type" : -1 }
"price" : 1 }
Using the ordering of { item: 1, type: 1 } index, max() (page 867) limits the query to the documents that are below the bound of item equal to apple and type equal to jonagold:
db.products.find().max( { item: 'apple', type: 'jonagold' } ).hint( { item: 1, type: 1 } )
868
If the query did not explicitly specify the index with the hint() (page 866) method, it is ambiguous as to
whether mongod (page 925) would select the { item: 1, type: 1 } index ordering or the { item:
1, type: -1 } index ordering.
Using the ordering of the index { price: 1 }, max() (page 867) limits the query to the documents that are
below the index key bound of price equal to 1.99 and min() (page 869) limits the query to the documents
that are at or above the index key bound of price equal to 1.39:
db.products.find().min( { price: 1.39 } ).max( { price: 1.99 } ).hint( { price: 1 } )
"_id"
"_id"
"_id"
"_id"
"_id"
:
:
:
:
:
cursor.min()
Definition
cursor.min()
Specifies the inclusive lower bound for a specific index in order to constrain the results of find() (page 816).
min() (page 869) provides a way to specify lower bounds on compound key indexes.
The min() (page 869) has the following parameter:
param document indexBounds The inclusive lower bound for the index keys.
The indexBounds parameter has the following prototype form:
{ field1: <min value>, field2: <min value2>, fieldN:<min valueN> }
The fields correspond to all the keys of a particular index in order. You can explicitly specify the particular
index with the hint() (page 866) method. Otherwise, MongoDB selects the index using the fields in the
indexBounds; however, if multiple indexes exist on same fields with different sort orders, the selection of the
index may be ambiguous.
See also:
max() (page 867).
Note: min() (page 869) is a shell wrapper around the query modifier $min (page 691).
Behaviors
Because min() (page 869) requires an index on a field, and forces the query to use this index, you may prefer
the $gte (page 622) operator for the query if possible. Consider the following example:
db.products.find( { _id: 7 } ).min( { price: 1.39 } )
The query will use the index on the price field, even if the index on _id may be better.
min() (page 869) exists primarily to support the mongos (page 938) process.
869
If you use min() (page 869) with max() (page 867) to specify a range, the index bounds specified in min()
(page 869) and max() (page 867) must both refer to the keys of the same index.
Example This example assumes a collection named products that holds the following documents:
{
{
{
{
{
{
{
{
{
{
"_id"
"_id"
"_id"
"_id"
"_id"
"_id"
"_id"
"_id"
"_id"
"_id"
:
:
:
:
:
:
:
:
:
:
"_id" : 1 }
"item" : 1, "type" : 1 }
"item" : 1, "type" : -1 }
"price" : 1 }
Using the ordering of the { item: 1, type: 1 } index, min() (page 869) limits the query to the
documents that are at or above the index key bound of item equal to apple and type equal to jonagold,
as in the following:
db.products.find().min( { item: 'apple', type: 'jonagold' } ).hint( { item: 1, type: 1 } )
"_id"
"_id"
"_id"
"_id"
"_id"
"_id"
"_id"
:
:
:
:
:
:
:
If the query did not explicitly specify the index with the hint() (page 866) method, it is ambiguous as to
whether mongod (page 925) would select the { item: 1, type: 1 } index ordering or the { item:
1, type: -1 } index ordering.
Using the ordering of the index { price: 1 }, min() (page 869) limits the query to the documents that
are at or above the index key bound of price equal to 1.39 and max() (page 867) limits the query to the
documents that are below the index key bound of price equal to 1.99:
db.products.find().min( { price: 1.39 } ).max( { price: 1.99 } ).hint( { price: 1 } )
"_id"
"_id"
"_id"
"_id"
"_id"
:
:
:
:
:
cursor.next()
870
cursor.next()
Returns The next document in the cursor returned by the db.collection.find() (page 816)
method. See cursor.hasNext() (page 866) related functionality.
cursor.objsLeftInBatch()
cursor.objsLeftInBatch()
cursor.objsLeftInBatch() (page 871) returns the number of documents remaining in the current batch.
The MongoDB instance returns response in batches. To retrieve all the documents from a cursor may require
multiple batch responses from the MongoDB instance. When there are no more documents remaining in the
current batch, the cursor will retrieve another batch to get more documents until the cursor exhausts.
cursor.readPref()
Definition
cursor.readPref(mode, tagSet)
Append readPref() (page 871) to a cursor to control how the client routes the query to members of the
replica set.
param string mode One of the following read preference modes: primary (page 489),
primaryPreferred (page 489), secondary (page 489), secondaryPreferred
(page 489), or nearest (page 490)
param array tagSet A tag set used to specify custom read preference modes. For details, see Tag
Sets (page 407).
Note: You must apply readPref() (page 871) to the cursor before retrieving any documents from the
database.
cursor.showDiskLoc()
cursor.showDiskLoc()
Returns A modified cursor object that contains documents with appended information that describes
the on-disk location of the document.
See also:
$showDiskLoc (page 692) for related functionality.
cursor.size()
cursor.size()
Returns A count of the number of documents that match the db.collection.find()
(page 816) query after applying any cursor.skip() (page 871) and cursor.limit()
(page 867) methods.
cursor.skip()
cursor.skip()
Call the cursor.skip() (page 871) method on a cursor to control where MongoDB begins returning results.
This approach may be useful in implementing paged results.
Note: You must apply cursor.skip() (page 871) to the cursor before retrieving any documents from the
database.
11.1. MongoDB Interface
871
The cursor.skip() (page 871) method is often expensive because it requires the server to walk from the
beginning of the collection or index to get the offset or skip position before beginning to return result. As
offset (e.g. pageNumber above) increases, cursor.skip() (page 871) will become slower and more CPU
intensive. With larger collections, cursor.skip() (page 871) may become IO bound.
Consider using range-based pagination for these kinds of tasks. That is, query for a range of objects, using logic
within the application to determine the pagination rather than the database itself. This approach features better
index utilization, if you do not need to easily jump to a specific page.
cursor.snapshot()
cursor.snapshot()
Append the snapshot() (page 872) method to a cursor to toggle the snapshot mode. This ensures that the
query will not return a document multiple times, even if intervening write operations result in a move of the
document due to the growth in document size.
Warning:
You must apply snapshot() (page 872) to the cursor before retrieving any documents from the
database.
You can only use snapshot() (page 872) with unsharded collections.
The snapshot() (page 872) does not guarantee isolation from insertion or deletions.
The snapshot() (page 872) traverses the index on the _id field. As such, snapshot() (page 872) cannot
be used with sort() (page 872) or hint() (page 866).
Queries with results of less than 1 megabyte are effectively implicitly snapshotted.
cursor.sort()
Definition
cursor.sort(sort)
Controls the order that the query returns matching documents. For each field in the sort document, if the
fields corresponding value is positive, then sort() (page 872) returns query results in ascending order for
that attribute. If the fields corresponding value is negative, then sort() (page 872) returns query results in
descending order.
The sort() (page 872) method has the following parameter:
param document sort A document that defines the sort order of the result set.
The sort parameter contains field and value pairs, in the following form:
{ field: value }
872
Note: You must apply limit() (page 867) to the cursor before retrieving any documents from the database.
Examples The following query returns all documents in collection sorted by the age field in descending order.
db.collection.find().sort( { age: -1 } );
The following query specifies the sort order using the fields from a sub-document name. The query sorts first by the
last field and then by the first field. The query sorts both fields in ascending order:
db.bios.find().sort( { 'name.last': 1, 'name.first': 1 } )
Limit Results Unless you have an index for the specified key pattern, use sort() (page 872) in conjunction
with limit() (page 867) to avoid requiring MongoDB to perform a large, in-memory sort. limit() (page 867)
increases the speed and reduces the amount of memory required to return this query by way of an optimized algorithm.
Warning: The sort function requires that the entire sort be able to complete within 32 megabytes. When the sort
option consumes more than 32 megabytes, MongoDB will return an error. Use limit() (page 867), or create an
index on the field that youre sorting to avoid this error.
Return Natural Order The $natural (page 693) parameter returns items according to their order on disk. Consider the following query:
db.collection.find().sort( { $natural: -1 } )
This will return documents in the reverse of the order on disk. Typically, the order of documents on disks reflects
insertion order, except when documents move internal because of document growth due to update operations.
When comparing values of different BSON types, MongoDB uses the following comparison order, from lowest to
highest:
1. MinKey (internal type)
2. Null
3. Numbers (ints, longs, doubles)
4. Symbol, String
5. Object
6. Array
7. BinData
8. ObjectID
9. Boolean
10. Date, Timestamp
11. Regular Expression
12. MaxKey (internal type)
Note: MongoDB treats some types as equivalent for comparison purposes. For instance, numeric types undergo
conversion before comparison.
873
cursor.toArray()
cursor.toArray()
The toArray() (page 874) method returns an array that contains all the documents from a cursor. The method
iterates completely the cursor, loading all the documents into RAM and exhausting the cursor.
Returns An array of documents.
Consider the following example that applies toArray() (page 874) to the cursor returned from the find()
(page 816) method:
var allProductsArray = db.products.find().toArray();
if (allProductsArray.length > 0) { printjson (allProductsArray[0]); }
The variable allProductsArray holds the array of documents returned by toArray() (page 874).
Database
Database Methods
Name
db.addUser() (page 875)
db.auth() (page 876)
db.changeUserPassword() (page 877)
db.cloneCollection() (page 877)
db.cloneDatabase() (page 877)
db.commandHelp() (page 878)
db.copyDatabase() (page 878)
db.createCollection() (page 878)
db.currentOp() (page 879)
db.dropDatabase() (page 884)
db.eval() (page 884)
db.fsyncLock() (page 885)
db.fsyncUnlock() (page 886)
db.getCollection() (page 886)
db.getCollectionNames() (page 886)
db.getLastError() (page 886)
db.getLastErrorObj() (page 886)
db.getMongo() (page 887)
db.getName() (page 887)
db.getPrevError() (page 887)
db.getProfilingLevel() (page 887)
db.getProfilingStatus() (page 887)
db.getReplicationInfo() (page 887)
db.getSiblingDB() (page 888)
db.help() (page 889)
db.hostInfo() (page 889)
db.isMaster() (page 890)
db.killOp() (page 890)
db.listCommands() (page 890)
db.loadServerScripts() (page 890)
db.logout() (page 891)
874
Description
Adds a user to a database, and allows administrators to configure the users
Authenticates a user to a database.
Changes an existing users password.
Copies data directly between MongoDB instances. Wraps cloneCollec
Copies a database from a remote host to the current host. Wraps clone (p
Returns help information for a database command.
Copies a database to another database on the current host. Wraps copydb
Creates a new collection. Commonly used to create a capped collection.
Reports the current in-progress operations.
Removes the current database.
Passes a JavaScript function to the mongod (page 925) instance for serverFlushes writes to disk and locks the database to prevent write operations an
Allows writes to continue on a database locked with db.fsyncLock() (
Returns a collection object. Used to access collections with names that are
Lists all collections in the current database.
Checks and returns the status of the last operation. Wraps getLastErro
Returns the status document for the last operation. Wraps getLastErro
Returns the Mongo() (page 919) connection object for the current connec
Returns the name of the current database.
Returns a status document containing all errors since the last error reset. W
Returns the current profiling level for database operations.
Returns a document that reflects the current profiling level and the profiling
Returns a document with replication statistics.
Provides access to the specified database.
Displays descriptions of common db object methods.
Returns a document with information about the system MongoDB runs on.
Returns a document that reports the state of the replica set.
Terminates a specified operation.
Displays a list of common database commands.
Loads all scripts in the system.js collection for the current database int
Ends an authenticated session.
Name
db.printCollectionStats() (page 891)
db.printReplicationInfo() (page 891)
db.printShardingStatus() (page 892)
db.printSlaveReplicationInfo() (page 892)
db.removeUser() (page 892)
db.repairDatabase() (page 892)
db.resetError() (page 893)
db.runCommand() (page 893)
db.serverBuildInfo() (page 893)
db.serverStatus() (page 893)
db.setProfilingLevel() (page 894)
db.shutdownServer() (page 894)
db.stats() (page 894)
db.version() (page 895)
db.addUser()
Definition
db.addUser(document)
Use db.addUser() (page 875) to add privilege documents to the system.users (page 270) collection in
a database, which creates database credentials in MongoDB.
Changed in version 2.4: The schema of system.users (page 270) changed in 2.4 to accommodate a more
sophisticated privilege model (page 265). In 2.4 db.addUser() (page 875) supports both forms of privilege
documents.
In MongoDB 2.4 you must pass db.addUser() (page 875) a document that contains a well-formed
system.users (page 270) document. In MongoDB 2.2 pass arguments to db.addUser() (page 875)
that describe user credentials (page 876). A 2.4 privilege document has a subset of the following fields:
field string user The username for a new database user.
field array roles An array of user roles.
field hash pwd A shared secret used to authenticate the user. The pwd field and the userSource
field are mutually exclusive. The document cannot contain both.
field string userSource The database that contains the credentials for the user. The userSource
field and the pwd field are mutually exclusive. The document cannot contain both.
field document otherDBRoles Roles this user has on other databases.
See system.users Privilege Documents (page 270) for documentation of the 2.4 privilege documents.
Examples The following are prototype db.addUser() (page 875) operations:
db.addUser( { user: "<user>", pwd: "<password>", roles: [<roles>] } )
This operation creates a system.users (page 270) document with a password using the pwd (page 270) field
In the following prototype, rather than specify a password directly, you can delegated the credential to another database
using the userSource (page 271) field:
db.addUser( { user: "<user>", userSource: "<database>", roles: [<roles>] } )
875
To create and add a 2.4-style privilege document to system.users (page 270) to grant readWrite (page 266)
privileges to a user named author with privileges, use the following operation:
db.addUser( { user: "author", pwd: "pass", roles: [ "readWrite" ] } )
If you want to store user credentials in a single users database, you can use delegated credentials (page 272), as in
the following example:
db.addUser( { user: "author", userSource: "users", roles: [ "readWrite" ] } )
See also:
Add a User to a Database (page 257), User Privilege Roles in MongoDB (page 265), and system.users Privilege
Documents (page 270).
Legacy Privilege Documents To create legacy (2.2. and earlier) privilege documents, db.addUser() (page 875)
accepts the following parameters:
param string user The username.
param string password The corresponding password.
param boolean readOnly Defaults to false. Grants users a restricted privilege set that only allows the
user to read the this database.
The command takes the following form:
db.addUser( "<username>", "<password>",
<read-only> )
Example
To create and add a legacy (2.2. and earlier) privilege document with a user named guest and the password pass
that has only readOnly privileges, use the following operation:
db.addUser( "guest", "pass", true )
Note: The mongo (page 942) shell excludes all db.addUser() (page 875) operations from the saved history.
Deprecated since version 2.4: The roles parameter replaces the readOnly parameter for db.addUser()
(page 875). 2.4 also adds the otherDBRoles (page 271) and userSource (page 271) fields to documents in
the system.users (page 270) collection.
db.auth()
Definition
db.auth(username, password)
Allows a user to authenticate to the database from within the shell.
param string username Specifies an existing username with access privileges for this database.
param string password Specifies the corresponding password.
Alternatively, you can use mongo --username and --password to specify authentication credentials.
Note: The mongo (page 942) shell excludes all db.auth() (page 876) operations from the saved history.
876
db.changeUserPassword()
Definition
db.changeUserPassword(username, password)
Allows an administrator to update a users password from within the shell.
param string username Specifies an existing username with access privileges for this database.
param string password Specifies the corresponding password.
Throws exception If an error occurs, the db.changeUserPassword() (page 877) helper
throws an exception with the error message and code.
Example
Example
The following operation changes the reporting users password to SOhSS3TbYhxusooLiW8ypJPxmt1oOfL:
db = db.getSiblingDB('records')
db.changeUserPassword("reporting", "SOhSS3TbYhxusooLiW8ypJPxmt1oOfL")
db.cloneCollection()
Definition
db.cloneCollection(from, collection, query)
Copies data directly between MongoDB instances. The db.cloneCollection() (page 877) wraps the
cloneCollection (page 748) database command and accepts the following arguments:
param string from Host name of the MongoDB instance that holds the collection to copy.
param string collection The collection in the MongoDB instance that you want to copy.
db.cloneCollection() (page 877) will only copy the collection with this name from
database of the same name as the current database the remote MongoDB instance. If you want
to copy a collection from a different database name you must use the cloneCollection
(page 748) directly.
param document query A standard query document that limits the documents copied as part of the
db.cloneCollection() (page 877) operation. All query selectors (page 621) available to
the find() (page 816) are available here.
db.cloneCollection() (page 877) does not allow you to clone a collection through a mongos
(page 938). You must connect directly to the mongod (page 925) instance.
db.cloneDatabase()
Definition
db.cloneDatabase(hostname)
Copies a remote database to the current database. The command assumes that the remote database has the same
name as the current database.
param string hostname The hostname of the database to copy.
This method provides a wrapper around the MongoDB database command clone (page 748). The copydb
(page 745) database command provides related functionality.
877
Example To clone a database named importdb on a host named hostname, issue the following:
use importdb
db.cloneDatabase("hostname")
New databases are implicitly created, so the current host does not need to have a database named importdb for this
command to succeed.
db.commandHelp()
Description
db.commandHelp(command)
Displays help text for the specified database command. See the Database Commands (page 694).
The db.commandHelp() (page 878) method has the following parameter:
param string command The name of a database command.
db.copyDatabase()
Definition
db.copyDatabase(origin, destination, hostname)
Copies a single logical database from a remote MongoDB instance to the local database.
db.copyDatabase() (page 878) wraps the copydb (page 745) database command, and takes the
following arguments:
param string origin The name of the database on the origin system.
param string destination The name of the database to copy the origin database into.
param string hostname The hostname of the origin database host. Omit the hostname to copy from
one name to another on the same server.
db.copyDatabase() (page 878) implicitly creates the destination databases if it does not exist. If you do
not specify the hostname argument, MongoDB assumes the origin and destination databases are on the local
instance.
The clone (page 748) database command provides related functionality.
Example To copy a database named records into a database named archive_records, use the following
invocation of db.copyDatabase() (page 878):
db.copyDatabase('records', 'archive_records')
db.createCollection()
Definition
db.createCollection(name, options)
Creates a new collection explicitly.
Because MongoDB creates a collection implicitly when the collection is first referenced in a command, this
method is used primarily for creating new capped collections. This is also used to pre-allocate space for an
ordinary collection.
The db.createCollection() (page 878) has the following prototype form:
878
This command creates a collection named log with a maximum size of 5 megabytes and a maximum of 5000 documents.
The following command simply pre-allocates a 2-gigabyte, uncapped collection named people:
db.createCollection("people", { size: 2147483648 } )
This command provides a wrapper around the database command create (page 747). See Capped Collections
(page 156) for more information about capped collections.
db.currentOp()
Definition
db.currentOp()
Returns A document that reports in-progress operations for the database instance.
The db.currentOp() (page 879) method can take no arguments or take the true argument, which returns a
more verbose output, including idle connections and system operations. The following example uses the true
argument:
879
db.currentOp(true)
db.currentOp() (page 879) is available only for users with administrative privileges.
You can use db.killOp() (page 890) in conjunction with the opid (page 881) field to terminate a currently
running operation. The following JavaScript operations for the mongo (page 942) shell filter the output of
specific types of operations:
Return all pending write operations:
db.currentOp().inprog.forEach(
function(d){
if(d.waitingForLock && d.lockType != "read")
printjson(d)
})
Warning: Terminate running operations with extreme caution. Only use db.killOp() (page 890) to
terminate operations initiated by clients and do not terminate internal database operations.
Example The following is an example of db.currentOp() (page 879) output. If you specify the true argument,
db.currentOp() (page 879) returns more verbose output.
{
"inprog": [
{
"opid" : 3434473,
"active" : <boolean>,
"secs_running" : 0,
"op" : "<operation>",
"ns" : "<database>.<collection>",
"query" : {
},
"client" : "<host>:<outgoing>",
"desc" : "conn57683",
"threadId" : "0x7f04a637b700",
"connectionId" : 57683,
"locks" : {
"^" : "w",
"^local" : "W",
"^<database>" : "W"
},
"waitingForLock" : false,
880
"msg": "<string>"
"numYields" : 0,
"progress" : {
"done" : <number>,
"total" : <number>
}
"lockStats" : {
"timeLockedMicros" : {
"R" : NumberLong(),
"W" : NumberLong(),
"r" : NumberLong(),
"w" : NumberLong()
},
"timeAcquiringMicros" : {
"R" : NumberLong(),
"W" : NumberLong(),
"r" : NumberLong(),
"w" : NumberLong()
}
}
},
]
}
881
currentOp.ns
The namespace the operation targets. MongoDB forms namespaces using the name of the database and the
name of the collection.
currentOp.query
A document containing the current operations query. The document is empty for operations that do not have
queries: getmore, insert, and command.
currentOp.client
The IP address (or hostname) and the ephemeral port of the client connection where the operation originates. If
your inprog array has operations from many different clients, use this string to relate operations to clients.
For some commands, including findAndModify (page 710) and db.eval() (page 884), the client will be
0.0.0.0:0, rather than an actual client.
currentOp.desc
A description of the client. This string includes the connectionId (page 882).
currentOp.threadId
An identifier for the thread that services the operation and its connection.
currentOp.connectionId
An identifier for the connection where the operation originated.
currentOp.locks
New in version 2.2.
The locks (page 882) document reports on the kinds of locks the operation currently holds. The following
kinds of locks are possible:
currentOp.locks.^
^ (page 882) reports on the use of the global lock for the mongod (page 925) instance. All operations
must hold the global lock for some phases of operation.
currentOp.locks.^local
^local (page 882) reports on the lock for the local database. MongoDB uses the local database
for a number of operations, but the most frequent use of the local database is for the oplog used in
replication.
currentOp.locks.^<database>
locks.^<database> (page 882) reports on the lock state for the database that this operation targets.
locks (page 882) replaces lockType in earlier versions.
currentOp.lockType
Changed in version 2.2: The locks (page 882) replaced the lockType (page 882) field in 2.2.
Identifies the type of lock the operation currently holds. The possible values are:
read
write
currentOp.waitingForLock
Returns a boolean value. waitingForLock (page 882) is true if the operation is waiting for a lock and
false if the operation has the required lock.
currentOp.msg
The msg (page 882) provides a message that describes the status and progress of the operation. In the case of
indexing or mapReduce operations, the field reports the completion percentage.
currentOp.progress
Reports on the progress of mapReduce or indexing operations. The progress (page 882) fields corresponds
882
to the completion percentage in the msg (page 882) field. The progress (page 882) specifies the following
information:
currentOp.progress.done
Reports the number completed.
currentOp.progress.total
Reports the total number.
currentOp.killed
Returns true if mongod (page 925) instance is in the process of killing the operation.
currentOp.numYields
numYields (page 883) is a counter that reports the number of times the operation has yielded to allow other
operations to complete.
Typically, operations yield when they need access to data that MongoDB has not yet fully read into memory.
This allows other operations that have data in memory to complete quickly while MongoDB reads in data for
the yielding operation.
currentOp.lockStats
New in version 2.2.
The lockStats (page 883) document reflects the amount of time the operation has spent both acquiring and
holding locks. lockStats (page 883) reports data on a per-lock type, with the following possible lock types:
R represents the global read lock,
W represents the global write lock,
r represents the database specific read lock, and
w represents the database specific write lock.
currentOp.timeLockedMicros
The timeLockedMicros (page 883) document reports the amount of time the operation has
spent holding a specific lock.
For operations that require more than one lock, like those that lock the local database to update
the oplog, then the values in this document can be longer than this value may be longer than the
total length of the operation (i.e. secs_running (page 881).)
currentOp.timeLockedMicros.R
Reports the amount of time in microseconds the operation has held the global read lock.
currentOp.timeLockedMicros.W
Reports the amount of time in microseconds the operation has held the global write lock.
currentOp.timeLockedMicros.r
Reports the amount of time in microseconds the operation has held the database specific
read lock.
currentOp.timeLockedMicros.w
Reports the amount of time in microseconds the operation has held the database specific
write lock.
currentOp.timeAcquiringMicros
The timeAcquiringMicros (page 883) document reports the amount of time the operation has spent
waiting to acquire a specific lock.
currentOp.timeAcquiringMicros.R
Reports the mount of time in microseconds the operation has waited for the global read lock.
883
currentOp.timeAcquiringMicros.W
Reports the mount of time in microseconds the operation has waited for the global write lock.
currentOp.timeAcquiringMicros.r
Reports the mount of time in microseconds the operation has waited for the database specific read
lock.
currentOp.timeAcquiringMicros.w
Reports the mount of time in microseconds the operation has waited for the database specific write
lock.
db.dropDatabase()
db.dropDatabase()
Removes the current database. Does not change the current database, so the insertion of any documents in this
database will allocate a fresh set of data files.
db.eval()
Definition
db.eval(function, arguments)
Provides the ability to run JavaScript code on the MongoDB server.
The helper db.eval() (page 884) in the mongo (page 942) shell wraps the eval (page 722) command.
Therefore, the helper method shares the characteristics and behavior of the underlying command with one exception: db.eval() (page 884) method does not support the nolock option.
The method accepts the following parameters:
param JavaScript function function A JavaScript function to execute.
param list arguments A list of arguments to pass to the JavaScript function. Omit if the function
does not take arguments.
The JavaScript function need not take any arguments, as in the first example, or may optionally take arguments
as in the second:
function () {
// ...
}
function (arg1, arg2) {
// ...
}
884
return doc;
},
"eliot", 5 );
"errmsg" : "exception: JavaScript execution failed: ReferenceError: x is not defined near '{ retur
"code" : 16722,
"ok" : 0
}
Warning:
By default, db.eval() (page 884) takes a global write lock before evaluating the JavaScript function.
As a result, db.eval() (page 884) blocks all other read and write operations to the database while the
db.eval() (page 884) operation runs. Set nolock to true on the eval (page 722) command to prevent
the eval (page 722) command from taking the global write lock before evaluating the JavaScript. nolock
does not impact whether operations within the JavaScript code itself takes a write lock.
Do not use db.eval() (page 884) for long running operations as db.eval() (page 884) blocks all other
operations. Consider using other server side code execution options (page 198).
You can not use db.eval() (page 884) with sharded data. In general, you should avoid using db.eval()
(page 884) in sharded cluster; nevertheless, it is possible to use db.eval() (page 884) with non-sharded
collections and databases stored in a sharded cluster.
With authentication (page 993) enabled, db.eval() (page 884) will fail during the operation if you
do not have the permission to perform a specified task.
Changed in version 2.4: You must have full admin access to run.
Changed in version 2.4: The V8 JavaScript engine, which became the default in 2.4, allows multiple JavaScript
operations to execute at the same time. Prior to 2.4, db.eval() (page 884) executed in a single thread.
See also:
Server-side JavaScript (page 198)
db.fsyncLock()
db.fsyncLock()
Forces the mongod (page 925) to flush pending all write operations to the disk and locks the entire mongod
(page 925) instance to prevent additional writes until the user releases the lock with the db.fsyncUnlock()
(page 886) command. db.fsyncLock() (page 885) is an administrative command.
This command provides a simple wrapper around a fsync (page 751) database command with the following
syntax:
885
This function locks the database and create a window for backup operations (page 136).
Note: The database cannot be locked with db.fsyncLock() (page 885) while profiling is enabled. You
must disable profiling before locking the database with db.fsyncLock() (page 885). Disable profiling using
db.setProfilingLevel() (page 894) as follows in the mongo (page 942) shell:
db.setProfilingLevel(0)
db.fsyncUnlock()
db.fsyncUnlock()
Unlocks a mongod (page 925) instance to allow writes and reverses the operation of a db.fsyncLock()
(page 885) operation. Typically you will use db.fsyncUnlock() (page 886) following a database backup
operation (page 136).
db.fsyncUnlock() (page 886) is an administrative command.
db.getCollection()
Description
db.getCollection(name)
Returns a collection name. This is useful for a collection whose name might interact with the shell itself, such
names that begin with _ or that mirror the database commands (page 694).
The db.getCollection() (page 886) method has the following parameter:
param string name The name of the collection.
db.getCollectionNames()
db.getCollectionNames()
Returns An array containing all collections in the existing database.
db.getLastError()
db.getLastError()
Returns The last error message string.
Sets the level of write concern for confirming the success of write operations.
See
getLastError (page 720) and Write Concern Reference (page 96) for all options, Write Concern (page 55)
for a conceptual overview, Write Operations (page 50) for information about all write operations in MongoDB,
.
db.getLastErrorObj()
db.getLastErrorObj()
Returns A full document with status information.
886
See also:
Write Concern (page 55), Write Concern Reference (page 96), and Replica Acknowledged (page 57).
db.getMongo()
db.getMongo()
Returns The current database connection.
db.getMongo() (page 887) runs when the shell initiates. Use this command to test that the mongo
(page 942) shell has a connection to the proper database instance.
db.getName()
db.getName()
Returns the current database name.
db.getPrevError()
db.getPrevError()
Returns A status document, containing the errors.
Deprecated since version 1.6.
This output reports all errors since the last time the database received a resetError (page 722) (also
db.resetError() (page 893)) command.
This method provides a wrapper around the getPrevError (page 721) command.
db.getProfilingLevel()
db.getProfilingLevel()
This method provides a wrapper around the database command profile (page 770) and returns the current
profiling level.
Deprecated since version 1.8.4: Use db.getProfilingStatus() (page 887) for related functionality.
db.getProfilingStatus()
db.getProfilingStatus()
Returns The current profile (page 770) level and slowms (page 997) setting.
db.getReplicationInfo()
Definition
db.getReplicationInfo()
Returns A document with the status of the replica status, using data polled from the oplog. Use
this output when diagnosing issues with replication.
887
Output
db.getReplicationInfo.logSizeMB
Returns the total size of the oplog in megabytes. This refers to the total amount of space allocated to the oplog
rather than the current size of operations stored in the oplog.
db.getReplicationInfo.usedMB
Returns the total amount of space used by the oplog in megabytes. This refers to the total amount of space
currently used by operations stored in the oplog rather than the total amount of space allocated.
db.getReplicationInfo.errmsg
Returns an error message if there are no entries in the oplog.
db.getReplicationInfo.oplogMainRowCount
Only present when there are no entries in the oplog. Reports a the number of items or rows in the oplog (e.g. 0).
db.getReplicationInfo.timeDiff
Returns the difference between the first and last operation in the oplog, represented in seconds.
Only present if there are entries in the oplog.
db.getReplicationInfo.timeDiffHours
Returns the difference between the first and last operation in the oplog, rounded and represented in hours.
Only present if there are entries in the oplog.
db.getReplicationInfo.tFirst
Returns a time stamp for the first (i.e. earliest) operation in the oplog. Compare this value to the last write
operation issued against the server.
Only present if there are entries in the oplog.
db.getReplicationInfo.tLast
Returns a time stamp for the last (i.e. latest) operation in the oplog. Compare this value to the last write operation
issued against the server.
Only present if there are entries in the oplog.
db.getReplicationInfo.now
Returns a time stamp that reflects reflecting the current time. The shell process generates this value, and the
datum may differ slightly from the server time if youre connecting from a remote host as a result. Equivalent
to Date() (page 916).
Only present if there are entries in the oplog.
db.getSiblingDB()
Definition
db.getSiblingDB(<database>)
param string database The name of a MongoDB database.
Returns A database object.
Used to return another database without modifying the db variable in the shell environment.
Example You can use db.getSiblingDB() (page 888) as an alternative to the use <database> helper. This
is particularly useful when writing scripts using the mongo (page 942) shell where the use helper is not available.
Consider the following sequence of operations:
888
db = db.getSiblingDB('users')
db.active.count()
This operation sets the db object to point to the database named users, and then returns a count (page 809) of the
collection named active. You can create multiple db objects, that refer to different databases, as in the following
sequence of operations:
users = db.getSiblingDB('users')
records = db.getSiblingDB('records')
users.active.count()
users.active.findOne()
records.requests.count()
records.requests.findOne()
This operation creates two db objects referring to different databases (i.e. users and records,) and then returns
a count (page 809) and an example document (page 824) from one collection in that database (i.e. active and
requests respectively.)
db.help()
db.help()
Returns Text output listing common methods on the db object.
db.hostInfo()
db.hostInfo()
New in version 2.2.
Returns A document with information about the underlying system that the mongod (page 925) or
mongos (page 938) runs on. Some of the returned fields are only included on some platforms.
db.hostInfo() (page 889) provides a helper in the mongo (page 942) shell around the hostInfo
(page 780) The output of db.hostInfo() (page 889) on a Linux system will resemble the following:
{
"system" : {
"currentTime" : ISODate("<timestamp>"),
"hostname" : "<hostname>",
"cpuAddrSize" : <number>,
"memSizeMB" : <number>,
"numCores" : <number>,
"cpuArch" : "<identifier>",
"numaEnabled" : <boolean>
},
"os" : {
"type" : "<string>",
"name" : "<string>",
"version" : "<string>"
},
"extra" : {
"versionString" : "<string>",
"libcVersion" : "<string>",
"kernelVersion" : "<string>",
"cpuFrequencyMHz" : "<string>",
"cpuFeatures" : "<string>",
"pageSize" : <number>,
889
"numPages" : <number>,
"maxOpenFiles" : <number>
},
"ok" : <return>
}
See hostInfo (page 781) for full documentation of the output of db.hostInfo() (page 889).
db.isMaster()
db.isMaster()
Returns A document that describes the role of the mongod (page 925) instance.
If the mongod (page 925) is a member of a replica set, then the ismaster (page 733) and secondary
(page 733) fields report if the instance is the primary or if it is a secondary member of the replica set.
See
isMaster (page 732) for the complete documentation of the output of db.isMaster() (page 890).
db.killOp()
Description
db.killOp(opid)
Terminates an operation as specified by the operation ID. To find operations and their corresponding IDs, see
db.currentOp() (page 879).
The db.killOp() (page 890) method has the following parameter:
param number opid An operation ID.
Warning: Terminate running operations with extreme caution. Only use db.killOp() (page 890) to
terminate operations initiated by clients and do not terminate internal database operations.
db.listCommands()
db.listCommands()
Provides a list of all database commands. See the Database Commands (page 694) document for a more extensive index of these options.
db.loadServerScripts()
db.loadServerScripts()
db.loadServerScripts() (page 890) loads all scripts in the system.js collection for the current
database into the mongo (page 942) shell session.
Documents in the system.js collection have the following prototype form:
{ _id : "<name>" , value : <function> } }
The documents in the system.js collection provide functions that your applications can use in any JavaScript
context with MongoDB in this database. These contexts include $where (page 634) clauses and mapReduce
(page 701) operations.
890
db.logout()
db.logout()
Ends the current authentication session. This function has no effect if the current session is not authenticated.
Note: If youre not logged in and using authentication, db.logout() (page 891) has no effect.
Changed in version 2.4: Because MongoDB now allows users defined in one database to have privileges on
another database, you must call db.logout() (page 891) while using the same database context that you
authenticated to.
If you authenticated to a database such as users or $external, you must issue db.logout() (page 891)
against this database in order to successfully log out.
Example
Use the use <database-name> helper in the interactive mongo (page 942) shell, or the following
db.getSiblingDB() (page 888) in the interactive shell or in mongo (page 942) shell scripts to change
the db object:
db = db.getSiblingDB('<database-name>')
When you have set the database context and db object, you can use the db.logout() (page 891) to log out
of database as in the following operation:
db.logout()
db.logout() (page 891) function provides a wrapper around the database command logout (page 724).
db.printCollectionStats()
db.printCollectionStats()
Provides a wrapper around the db.collection.stats() (page 848) method. Returns statistics from every
collection separated by three hyphen characters.
Note:
The db.printCollectionStats() (page 891) in the mongo (page 942) shell does
not return JSON. Use db.printCollectionStats() (page 891) for manual inspection, and
db.collection.stats() (page 848) in scripts.
See also:
collStats (page 763)
db.printReplicationInfo()
db.printReplicationInfo()
Provides a formatted report of the status of a replica set from the perspective of the primary set member. See
the replSetGetStatus (page 726) for more information regarding the contents of this output.
Note: The db.printReplicationInfo() (page 891) in the mongo (page 942) shell does not return JSON. Use db.printReplicationInfo() (page 891) for manual inspection, and rs.status()
(page 898) in scripts.
db.printShardingStatus()
891
Definition
db.printShardingStatus()
Prints a formatted report of the sharding configuration and the information regarding existing chunks in a
sharded cluster.
Only use db.printShardingStatus() (page 892) when connected to a mongos (page 938) instance.
The db.printShardingStatus() (page 892) method has the following parameter:
param Boolean verbose If true, the method displays details of the document distribution across
chunks when you have 20 or more chunks.
See sh.status() (page 910) for details of the output.
Note: The db.printShardingStatus() (page 892) in the mongo (page 942) shell does not return JSON. Use db.printShardingStatus() (page 892) for manual inspection, and Config Database
(page 564) in scripts.
See also:
sh.status() (page 910)
db.printSlaveReplicationInfo()
db.printSlaveReplicationInfo()
Provides a formatted report of the status of a replica set from the perspective of the secondary set member. See
the replSetGetStatus (page 726) for more information regarding the contents of this output.
Note: The db.printSlaveReplicationInfo() (page 892) in the mongo (page 942) shell does
not return JSON. Use db.printSlaveReplicationInfo() (page 892) for manual inspection, and
rs.status() (page 898) in scripts.
db.removeUser()
Definition
db.removeUser(username)
Removes the specified username from the database.
The db.removeUser() (page 892) method has the following parameter:
param string username The database username.
db.repairDatabase()
db.repairDatabase()
Warning: During normal operations, only use the repairDatabase (page 757) command and wrappers
including db.repairDatabase() (page 892) in the mongo (page 942) shell and mongod --repair,
to compact database files and/or reclaim disk space. Be aware that these operations remove and do not save
any corrupt data during the repair process.
If you are trying to repair a replica set member, and you have access to an intact copy of your data (e.g. a
recent backup or an intact member of the replica set), you should restore from that intact copy, and not use
repairDatabase (page 757).
Note: When using journaling, there is almost never any need to run repairDatabase (page 757). In the
event of an unclean shutdown, the server will be able restore the data files to a pristine state automatically.
892
db.repairDatabase() (page 892) provides a wrapper around the database command repairDatabase
(page 757), and has the same effect as the run-time option mongod --repair option, limited to only the
current database. See repairDatabase (page 757) for full documentation.
db.resetError()
db.resetError()
Deprecated since version 1.6.
Resets the error message returned by db.getPrevError (page 887) or getPrevError (page 721). Provides a wrapper around the resetError (page 722) command.
db.runCommand()
Definition
db.runCommand(command)
Provides a helper to run specified database commands (page 694). This is the preferred method to issue database
commands, as it provides a consistent interface between the shell and drivers.
param document,string command A database command, specified either in document form or as
a string. If specified as a string, db.runCommand() (page 893) transforms the string into a
document.
db.serverBuildInfo()
db.serverBuildInfo()
Provides a wrapper around the buildInfo (page 762) database command. buildInfo (page 762) returns a
document that contains an overview of parameters used to compile this mongod (page 925) instance.
db.serverStatus()
db.serverStatus()
Returns a document that provides an overview of the database processs state.
This command provides a wrapper around the database command serverStatus (page 782).
Changed in version 2.4: In 2.4 you can dynamically suppress portions of the db.serverStatus()
(page 893) output, or include suppressed sections in a document passed to the db.serverStatus()
(page 893) method, as in the following example:
db.serverStatus( { repl: 0, indexCounters: 0, locks: 0 } )
db.serverStatus( { workingSet: 1, metrics: 0, locks: 0 } )
db.serverStatus() (page 893) includes all fields by default, except workingSet (page 795), by default.
Note: You may only dynamically include top-level fields from the serverStatus (page 782) document that are
not included by default. You can exclude any field that db.serverStatus() (page 893) includes by default.
See also:
serverStatus (page 782) for complete documentation of the output of this function.
db.setProfilingLevel()
893
Definition
db.setProfilingLevel(level, slowms)
Modifies the current database profiler level used by the database profiling system to capture data about performance. The method provides a wrapper around the database command profile (page 770).
param integer level Specifies a profiling level, which is either 0 for no profiling, 1 for only slow
operations, or 2 for all operations.
param integer slowms Sets the threshold in milliseconds for the profile to consider a query or operation to be slow.
The level chosen can affect performance. It also can allow the server to write the contents of queries to the log,
which might have information security implications for your deployment.
Configure the slowms (page 997) option to set the threshold for the profiler to consider a query slow. Specify
this value in milliseconds to override the default.
mongod (page 925) writes the output of the database profiler to the system.profile collection.
mongod (page 925) prints information about queries that take longer than the slowms (page 997) to the log
even when the database profiler is not active.
Note: The database cannot be locked with db.fsyncLock() (page 885) while profiling is enabled. You
must disable profiling before locking the database with db.fsyncLock() (page 885). Disable profiling using
db.setProfilingLevel() (page 894) as follows in the mongo (page 942) shell:
db.setProfilingLevel(0)
db.shutdownServer()
db.shutdownServer()
Shuts down the current mongod (page 925) or mongos (page 938) process cleanly and safely.
This operation fails when the current database is not the admin database.
This command provides a wrapper around the shutdown (page 759).
db.stats()
Description
db.stats(scale)
Returns statistics that reflect the use state of a single database.
The db.stats() (page 894) method has the following parameter:
param number scale The scale at which to deliver results. Unless specified, this command returns
all data in bytes.
Returns A document with statistics reflecting the database systems state. For an explanation of the
output, see dbStats (page 767).
The db.stats() (page 894) method is a wrapper around the dbStats (page 767) database command.
Example The following example converts the returned values to kilobytes:
db.stats(1024)
894
Note: The scale factor rounds values to whole numbers. This can produce unpredictable and unexpected results in
some situations.
db.version()
db.version()
Returns The version of the mongod (page 925) or mongos (page 938) instance.
Replication
Replication Methods
Name
rs.add()
(page 895)
rs.addArb()
(page 896)
rs.conf()
(page 896)
rs.freeze()
(page 897)
rs.help()
(page 897)
rs.initiate()
(page 897)
rs.reconfig()
(page 897)
rs.remove()
(page 898)
rs.slaveOk()
(page 898)
rs.status()
(page 898)
rs.stepDown()
(page 899)
rs.syncFrom()
(page 899)
Description
Adds a member to a replica set.
Adds an arbiter to a replica set.
Returns the replica set configuration document.
Prevents the current member from seeking election as primary for a period of time.
Returns basic help text for replica set functions.
Initializes a new replica set.
Re-configures a replica set by applying a new replica set configuration object.
Remove a member from a replica set.
Sets the slaveOk property for the current connection. Deprecated. Use readPref()
(page 871) and Mongo.setReadPref() (page 919) to set read preference.
Returns a document with information about the state of the replica set.
Causes the current primary to become a secondary which forces an election.
Sets the member that this replica set member will sync from, overriding the default sync
target selection logic.
rs.add()
Definition
rs.add(host, arbiterOnly)
Adds a member to a replica set.
param string,document host The new member to add to the replica set. If a string, specifies the
hostname and optionally the port number for the new member. If a document, specifies a replica
set members document, as found in the members (page 480) array. To view a replica sets
members array, run rs.conf() (page 896).
895
param boolean arbiterOnly Applies only if the <host> value is a string. If true, the added host
is an arbiter.
You may specify new hosts in one of two ways:
1.as a hostname with an optional port number to use the default configuration as in the Add a Member to
an Existing Replica Set (page 435) example.
2.as a configuration document, as in the Configure and Add a Member (page 435) example.
This function will disconnect the shell briefly and forces a reconnection as the replica set renegotiates which
member will be primary. As a result, the shell will display an error even if this command succeeds.
rs.add() (page 895) provides a wrapper around some of the functionality of the replSetReconfig
(page 729) database command and the corresponding shell helper rs.reconfig() (page 897). See the
Replica Set Configuration (page 479) document for full documentation of all replica set configuration options.
Example To add a mongod (page 925) accessible on the default port 27017 running on the host
mongodb3.example.net, use the following rs.add() (page 895) invocation:
rs.add('mongodb3.example.net:27017')
To add mongodb3.example.net as a secondary-only (page 386) member of set, use the following form of
rs.add() (page 895):
rs.add( { "_id": 3, "host": "mongodbd3.example.net:27017", "priority": 0 } )
Replace, 3 with the next unused _id value in the replica set. See rs.conf() (page 896) to see the existing _id
values in the replica set configuration document.
See the Replica Set Configuration (page 479) and Replica Set Tutorials (page 419) documents for more information.
rs.addArb()
Description
rs.addArb(host)
Adds a new arbiter to an existing replica set.
The rs.addArb() (page 896) method takes the following parameter:
param string host Specifies the hostname and optionally the port number of the arbiter member to
add to replica set.
This function briefly disconnects the shell and forces a reconnection as the replica set renegotiates which member
will be primary. As a result, the shell displays an error even if this command succeeds.
rs.conf()
rs.conf()
Returns a document that contains the current replica set configuration document.
See Replica Set Configuration (page 479) for more information on the replica set configuration document.
rs.config()
rs.config() (page 896) is an alias of rs.conf() (page 896).
896
rs.freeze()
Description
rs.freeze(seconds)
Makes the current replica set member ineligible to become primary for the period specified.
The rs.freeze() (page 897) method has the following parameter:
param number seconds The duration the member is ineligible to become primary.
rs.freeze() (page 897) provides a wrapper around the database command replSetFreeze (page 726).
rs.help()
rs.help()
Returns a basic help text for all of the replication (page 377) related shell functions.
rs.initiate()
Description
rs.initiate(configuration)
Initiates a replica set. Optionally takes a configuration argument in the form of a document that holds the
configuration of a replica set.
The rs.initiate() (page 897) method has the following parameter:
param document configuration A document that specifies configuration settings (page 479) for the
new replica set. If a configuration is not specified, MongoDB uses a default configuration.
The rs.initiate() (page 897) method provides a wrapper around the replSetInitiate (page 728)
database command.
Replica Set Configuration See Member Configuration Tutorials (page 438) and Replica Set Configuration
(page 479) for examples of replica set configuration and invitation objects.
rs.reconfig()
Definition
rs.reconfig(configuration, force)
Initializes a new replica set configuration. Disconnects the shell briefly and forces a reconnection as the replica
set renegotiates which member will be primary. As a result, the shell will display an error even if this command
succeeds.
param document configuration A document that specifies the configuration of a replica set.
param document force If set as { force: true }, this forces the replica set to accept the
new configuration even if a majority of the members are not accessible. Use with caution, as
this can lead to term:rollback situations.
rs.reconfig() (page 897) overwrites the existing replica set configuration. Retrieve the current configuration object with rs.conf() (page 896), modify the configuration as needed and then use rs.reconfig()
(page 897) to submit the modified configuration object.
rs.reconfig() (page 897) provides a wrapper around the replSetReconfig (page 729) database
command.
897
If you want to force the reconfiguration if a majority of the set is not connected to the current member, or you are
issuing the command against a secondary, use the following form:
conf = rs.conf()
// modify conf to change configuration
rs.reconfig(conf, { force: true } )
Warning: Forcing a rs.reconfig() (page 897) can lead to rollback situations and other difficult to recover
from situations. Exercise caution when using this option.
See also:
Replica Set Configuration (page 479) and Replica Set Tutorials (page 419).
rs.remove()
Definition
rs.remove(hostname)
Removes the member described by the hostname parameter from the current replica set. This function will
disconnect the shell briefly and forces a reconnection as the replica set renegotiates which member will be
primary. As a result, the shell will display an error even if this command succeeds.
The rs.remove() (page 898) method has the following parameter:
param string hostname The hostname of a system in the replica set.
Note: Before running the rs.remove() (page 898) operation, you must shut down the replica set member
that youre removing.
Changed in version 2.2: This procedure is no longer required when using rs.remove() (page 898), but it
remains good practice.
rs.slaveOk()
rs.slaveOk()
Provides a shorthand for the following operation:
db.getMongo().setSlaveOk()
This allows the current connection to allow read operations to run on secondary members. See the
readPref() (page 871) method for more fine-grained control over read preference (page 405) in the mongo
(page 942) shell.
rs.status()
rs.status()
898
899
900
Sharding
Sharding Methods
Name
Description
sh._adminCommand
Runs a database command against the admin database, like db.runCommand()
(page 902)
(page 893), but can confirm that it is issued against a mongos (page 938).
sh._checkFullName()Tests a namespace to determine if its well formed.
(page 902)
sh._checkMongos() Tests to see if the mongo (page 942) shell is connected to a mongos (page 938)
(page 902)
instance.
sh._lastMigration()Reports on the last chunk migration.
(page 902)
sh.addShard()
Adds a shard to a sharded cluster.
(page 903)
sh.addShardTag()
Associates a shard with a tag, to support tag aware sharding (page 557).
(page 904)
sh.addTagRange()
Associates range of shard keys with a shard tag, to support tag aware sharding
(page 904)
(page 557).
sh.disableBalancing()
Disable balancing on a single collection in a sharded database. Does not affect
(page 905)
balancing of other collections in a sharded cluster.
sh.enableBalancing()
Activates the sharded collection balancer process if previously disabled using
(page 905)
sh.disableBalancing() (page 905).
sh.enableSharding()Enables sharding on a specific database.
(page 905)
sh.getBalancerHost()
Returns the name of a mongos (page 938) thats responsible for the balancer process.
(page 906)
sh.getBalancerState()
Returns a boolean to report if the balancer is currently enabled.
(page 906)
sh.help() (page 906) Returns help text for the sh methods.
sh.isBalancerRunning()
Returns a boolean to report if the balancer process is currently migrating chunks.
(page 907)
sh.moveChunk()
Migrates a chunk in a sharded cluster.
(page 907)
sh.removeShardTag()Removes the association between a shard and a shard tag shard tag.
(page 908)
sh.setBalancerState()
Enables or disables the balancer which migrates chunks between shards.
(page 908)
sh.shardCollection()
Enables sharding for a collection.
(page 908)
sh.splitAt()
Divides an existing chunk into two chunks using a specific value of the shard key as
(page 909)
the dividing point.
sh.splitFind()
Divides an existing chunk that contains a document matching a query into two
(page 909)
approximately equal chunks.
sh.startBalancer() Enables the balancer and waits for balancing to start.
(page 910)
sh.status()
Reports on the status of a sharded cluster, as db.printShardingStatus()
(page 910)
(page 892).
sh.stopBalancer() Disables the balancer and waits for any in progress balancing rounds to complete.
(page 912)
sh.waitForBalancer()
Internal. Waits for the balancer state to change.
(page 913)
sh.waitForBalancerOff()
Internal. Waits until the balancer stops running.
(page 913)
sh.waitForDLock() Internal. Waits for a specified distributed sharded cluster lock.
11.1. MongoDB Interface
901
(page 914)
sh.waitForPingChange()
Internal. Waits for a change in ping state from one of the mongos (page 938) in the
(page 914)
sharded cluster.
sh._adminCommand()
Definition
sh._adminCommand(command, checkMongos)
Runs a database command against the admin database of a mongos (page 938) instance.
param string command A database command to run against the admin database.
param boolean checkMongos Require verification that the shell is connected to a mongos
(page 938) instance.
See also:
db.runCommand() (page 893)
sh._checkFullName()
Definition
sh._checkFullName(namespace)
Verifies that a namespace name is well formed.
If the namespace is well formed,
sh._checkFullName() (page 902) method exits with no message.
the
Throws If the namespace is not well formed, sh._checkFullName() (page 902) throws:
name needs to be fully qualified <db>.<collection>
The sh._checkFullName() (page 902) method has the following parameter:
param string namespace The namespace of a collection. The namespace is the combination of the
database name and the collection name. Enclose the namespace in quotation marks.
sh._checkMongos()
sh._checkMongos()
Returns nothing
Throws not connected to a mongos
The sh._checkMongos() (page 902) method throws an error message if the mongo (page 942) shell is not
connected to a mongos (page 938) instance. Otherwise it exits (no return document or return code).
sh._lastMigration()
Definition
sh._lastMigration(namespace)
Returns information on the last migration performed on the specified database or collection.
The sh._lastMigration() (page 902) method has the following parameter:
param string namespace The namespace of a database or collection within the current database.
Output The sh._lastMigration() (page 902) method returns a document with details about the last migration
performed on the database or collection. The document contains the following output:
sh._lastMigration._id
The id of the migration task.
902
sh._lastMigration.server
The name of the server.
sh._lastMigration.clientAddr
The IP address and port number of the server.
sh._lastMigration.time
The time of the last migration, formatted as ISODate.
sh._lastMigration.what
The specific type of migration.
sh._lastMigration.ns
The complete namespace of the collection affected by the migration.
sh._lastMigration.details
A document containing details about the migrated chunk. The document includes min and max sub-documents
with the bounds of the migrated chunk.
sh.addShard()
Definition
sh.addShard(host)
Adds a database instance or replica set to a sharded cluster. The optimal configuration is to deploy shards across
replica sets. This method must be run on a mongos (page 938) instance.
The sh.addShard() (page 903) method has the following parameter:
param string host The hostname of either a standalone database instance or of a replica set. Include
the port number if the instance is running on a non-standard port. Include the replica set name if
the instance is a replica set, as explained below.
The sh.addShard() (page 903) method has the following prototype form:
sh.addShard("<host>")
Warning: Do not use localhost for the hostname unless your configuration server is also running on
localhost.
The sh.addShard() (page 903) method is a helper for the addShard (page 736) command. The addShard
(page 736) command has additional options which are not available with this helper.
Example To add a shard on a replica set, specify the name of the replica set and the hostname of at least one member
of the replica set, as a seed. If you specify additional hostnames, all must be members of the same replica set.
The following example adds a replica set named repl0 and specifies one member of the replica set:
sh.addShard("repl0/mongodb3.example.net:27327")
sh.addShardTag()
903
Definition
sh.addShardTag(shard, tag)
New in version 2.2.
Associates a shard with a tag or identifier. MongoDB uses these identifiers to direct chunks that fall within a
tagged range to specific shards. sh.addTagRange() (page 904) associates chunk ranges with tag ranges.
param string shard The name of the shard to which to give a specific tag.
param string tag The name of the tag to add to the shard.
Always issue sh.addShardTag() (page 904) when connected to a mongos (page 938) instance.
Example The following example adds three tags, NYC, LAX, and NRT, to three shards:
sh.addShardTag("shard0000", "NYC")
sh.addShardTag("shard0001", "LAX")
sh.addShardTag("shard0002", "NRT")
See also:
sh.addTagRange() (page 904) and sh.removeShardTag() (page 908).
sh.addTagRange()
Definition
sh.addTagRange(namespace, minimum, maximum, tag)
New in version 2.2.
Attaches a range of shard key values to a shard tag created using the sh.addShardTag() (page 904) method.
sh.addTagRange() (page 904) takes the following arguments:
param string namespace The namespace of the sharded collection to tag.
param document minimum The minimum value of the shard key range to include in the tag. Specify the minimum value in the form of <fieldname>:<value>. This value must be of the
same BSON type or types as the shard key.
param document maximum The maximum value of the shard key range to include in the tag. Specify the maximum value in the form of <fieldname>:<value>. This value must be of the
same BSON type or types as the shard key.
param string tag The name of the tag to attach the range specified by the minimum and maximum
arguments to.
Use sh.addShardTag() (page 904) to ensure that the balancer migrates documents that exist within the
specified range to a specific shard or set of shards.
Always issue sh.addTagRange() (page 904) when connected to a mongos (page 938) instance.
Note: If you add a tag range to a collection using sh.addTagRange() (page 904) and then later drop the
collection or its database, MongoDB does not remove the tag association. If you later create a new collection
with the same name, the old tag association will apply to the new collection.
904
1, zip:
sh.addTagRange( "exampledb.collection",
{ state: "NY", zip: MinKey },
{ state: "NY", zip: MaxKey },
"NY"
)
sh.disableBalancing()
Description
sh.disableBalancing(namespace)
Disables the balancer for the specified sharded collection. This does not affect the balancing of chunks for other
sharded collections in the same cluster.
The sh.disableBalancing() (page 905) method has the following parameter:
param string namespace The namespace of the collection.
For more information on the balancing process, see Manage Sharded Cluster Balancer (page 551) and Sharded
Collection Balancing (page 516).
sh.enableBalancing()
Description
sh.enableBalancing(collection)
Enables the balancer for the specified sharded collection.
The sh.enableBalancing() (page 905) method has the following parameter:
param string collection The namespace of the collection.
Important: sh.enableBalancing() (page 905) does not start balancing. Rather, it allows balancing of
this collection the next time the balancer runs.
For more information on the balancing process, see Manage Sharded Cluster Balancer (page 551) and Sharded
Collection Balancing (page 516).
sh.enableSharding()
Definition
sh.enableSharding(database)
Enables sharding on the specified database. This does not automatically shard any collections but makes it
possible to begin sharding collections using sh.shardCollection() (page 908).
The sh.enableSharding() (page 905) method has the following parameter:
param string database The name of the database shard. Enclose the name in quotation marks.
See also:
sh.shardCollection() (page 908)
905
sh.getBalancerHost()
sh.getBalancerHost()
Returns String in form hostname:port
sh.getBalancerHost() (page 906) returns the name of the server that is running the balancer.
See also:
sh.enableBalancing() (page 905)
sh.disableBalancing() (page 905)
sh.getBalancerState() (page 906)
sh.isBalancerRunning() (page 907)
sh.setBalancerState() (page 908)
sh.startBalancer() (page 910)
sh.stopBalancer() (page 912)
sh.waitForBalancer() (page 913)
sh.waitForBalancerOff() (page 913)
sh.getBalancerState()
sh.getBalancerState()
Returns boolean
sh.getBalancerState() (page 906) returns true when the balancer is enabled and false if the balancer
is disabled. This does not reflect the current state of balancing operations: use sh.isBalancerRunning()
(page 907) to check the balancers current state.
See also:
sh.enableBalancing() (page 905)
sh.disableBalancing() (page 905)
sh.getBalancerHost() (page 906)
sh.isBalancerRunning() (page 907)
sh.setBalancerState() (page 908)
sh.startBalancer() (page 910)
sh.stopBalancer() (page 912)
sh.waitForBalancer() (page 913)
sh.waitForBalancerOff() (page 913)
sh.help()
sh.help()
Returns a basic help text for all sharding related shell functions.
906
sh.isBalancerRunning()
sh.isBalancerRunning()
Returns boolean
Returns true if the balancer process is currently running and migrating chunks and false if the balancer process is
not running. Use sh.getBalancerState() (page 906) to determine if the balancer is enabled or disabled.
See also:
sh.enableBalancing() (page 905)
sh.disableBalancing() (page 905)
sh.getBalancerHost() (page 906)
sh.getBalancerState() (page 906)
sh.setBalancerState() (page 908)
sh.startBalancer() (page 910)
sh.stopBalancer() (page 912)
sh.waitForBalancer() (page 913)
sh.waitForBalancerOff() (page 913)
sh.moveChunk()
Definition
sh.moveChunk(namespace, query, destination)
Moves the chunk that contains the document specified by the query to the destination shard.
sh.moveChunk() (page 907) provides a wrapper around the moveChunk (page 742) database command
and takes the following arguments:
param string namespace The namespace of the sharded collection that contains the chunk to migrate.
param document query An equality match on the shard key that selects the chunk to move.
param string destination The name of the shard to move.
Important: In most circumstances, allow the balancer to automatically migrate chunks, and avoid calling
sh.moveChunk() (page 907) directly.
See also:
moveChunk (page 742), sh.splitAt() (page 909), sh.splitFind() (page 909), Sharding (page 493), and
chunk migration (page 518).
Example Given the people collection in the records database, the following operation finds the chunk that
contains the documents with the zipcode field set to 53187 and then moves that chunk to the shard named
shard0019:
sh.moveChunk("records.people", { zipcode: 53187 }, "shard0019")
sh.removeShardTag()
907
Definition
sh.removeShardTag(shard, tag)
New in version 2.2.
Removes the association between a tag and a shard. Always issue sh.removeShardTag() (page 908) when
connected to a mongos (page 938) instance.
param string shard The name of the shard from which to remove a tag.
param string tag The name of the tag to remove from the shard.
See also:
sh.addShardTag() (page 904), sh.addTagRange() (page 904)
sh.setBalancerState()
Description
sh.setBalancerState(state)
Enables or disables the balancer. Use sh.getBalancerState() (page 906) to determine if the balancer is
currently enabled or disabled and sh.isBalancerRunning() (page 907) to check its current state.
The sh.getBalancerState() (page 906) method has the following parameter:
param Boolean state Set this to true to enable the balancer and false to disable it.
See also:
sh.enableBalancing() (page 905)
sh.disableBalancing() (page 905)
sh.getBalancerHost() (page 906)
sh.getBalancerState() (page 906)
sh.isBalancerRunning() (page 907)
sh.startBalancer() (page 910)
sh.stopBalancer() (page 912)
sh.waitForBalancer() (page 913)
sh.waitForBalancerOff() (page 913)
sh.shardCollection()
Definition
sh.shardCollection(namespace, key, unique)
Shards a collection using the key as a the shard key. sh.shardCollection() (page 908) takes the following arguments:
param string namespace The namespace of the collection to shard.
param document key A document that specifies the shard key to use to partition and distribute
objects among the shards. A shard key may be one field or multiple fields. A shard key with
multiple fields is called a compound shard key.
param Boolean unique When true, ensures that the underlying index enforces a unique constraint.
Hashed shard keys do not support unique constraints.
908
Warning:
MongoDB provides no method to deactivate sharding for a collection after calling
shardCollection (page 738). Additionally, after shardCollection (page 738), you cannot change
shard keys or modify the value of any field used in your shard key index.
See also:
shardCollection (page 738) for additional options, Sharding (page 493) and Sharding Introduction (page 493)
for an overview of sharding, Deploy a Sharded Cluster (page 522) for a tutorial, and Shard Keys (page 506) for
choosing a shard key.
Example Given the people collection in the records database, the following command shards the collection by
the zipcode field:
sh.shardCollection("records.people", { zipcode: 1} )
sh.splitAt()
Definition
sh.splitAt(namespace, query)
Splits the chunk containing the document specified by the query as if that document were at the middle of the
collection, even if the specified document is not the actual median of the collection.
param string namespace The namespace (i.e. <database>.<collection>) of the sharded
collection that contains the chunk to split.
param document query A query to identify a document in a specific chunk. Typically specify the
shard key for a document as the query.
Use this command to manually split chunks unevenly. Use the sh.splitFind() (page 909) function to
split a chunk at the actual median.
In most circumstances, you should leave chunk splitting to the automated processes within MongoDB. However,
when initially deploying a sharded cluster it is necessary to perform some measure of pre-splitting using manual
methods including sh.splitAt() (page 909).
sh.splitFind()
Definition
sh.splitFind(namespace, query)
Splits the chunk containing the document specified by the query at its median point, creating two roughly
equal chunks. Use sh.splitAt() (page 909) to split a collection in a specific point.
In most circumstances, you should leave chunk splitting to the automated processes. However, when initially
deploying a sharded cluster it is necessary to perform some measure of pre-splitting using manual methods
including sh.splitFind() (page 909).
param string namespace The namespace (i.e. <database>.<collection>) of the sharded
collection that contains the chunk to split.
param document query A query to identify a document in a specific chunk. Typically specify the
shard key for a document as the query.
909
sh.startBalancer()
Definition
sh.startBalancer(timeout, interval)
Enables the balancer in a sharded cluster and waits for balancing to initiate.
param integer timeout Milliseconds to wait.
param integer interval Milliseconds to sleep each cycle of waiting.
See also:
sh.enableBalancing() (page 905)
sh.disableBalancing() (page 905)
sh.getBalancerHost() (page 906)
sh.getBalancerState() (page 906)
sh.isBalancerRunning() (page 907)
sh.setBalancerState() (page 908)
sh.stopBalancer() (page 912)
sh.waitForBalancer() (page 913)
sh.waitForBalancerOff() (page 913)
sh.status()
Definition
sh.status()
Prints a formatted report of the sharding configuration and the information regarding existing chunks in a
sharded cluster. The default behavior suppresses the detailed chunk information if the total number of chunks
is greater than or equal to 20.
The sh.status() (page 910) method has the following parameter:
param Boolean verbose If true, the method displays details of the document distribution across
chunks when you have 20 or more chunks.
See also:
db.printShardingStatus() (page 892)
Output Examples The Sharding Version (page 911) section displays information on the config database:
--- Sharding Status --sharding version: {
"_id" : <num>,
"version" : <num>,
"minCompatibleVersion" : <num>,
"currentVersion" : <num>,
"clusterId" : <ObjectId>
}
The Shards (page 912) section lists information on the shard(s). For each shard, the section displays the name, host,
and the associated tags, if any.
910
shards:
{ "_id" : <shard name1>,
"host" : <string>,
"tags" : [ <string> ... ]
}
{ "_id" : <shard name2>,
"host" : <string>,
"tags" : [ <string> ... ]
}
...
The Databases (page 912) section lists information on the database(s). For each database, the section displays the
name, whether the database has sharding enabled, and the primary shard for the database.
databases:
{ "_id" : <dbname1>,
"partitioned" : <boolean>,
"primary" : <string>
}
{ "_id" : <dbname2>,
"partitioned" : <boolean>,
"primary" : <string>
}
...
The Sharded Collection (page 912) section provides information on the sharding details for sharded collection(s). For
each sharded collection, the section displays the shard key, the number of chunks per shard(s), the distribution of
documents across chunks 17 , and the tag information, if any, for shard key range(s).
<dbname>.<collection>
shard key: { <shard key> : <1 or hashed> }
chunks:
<shard name1> <number of chunks>
<shard name2> <number of chunks>
...
{ <shard key>: <min range1> } -->> { <shard key> : <max range1> } on : <shard name> <last modified
{ <shard key>: <min range2> } -->> { <shard key> : <max range2> } on : <shard name> <last modified
...
tag: <tag1> { <shard key> : <min range1> } -->> { <shard key> : <max range1> }
...
Output Fields
Sharding Version
sh.status.sharding-version._id
The _id (page 911) is an identifier for the version details.
sh.status.sharding-version.version
The version (page 911) is the version of the config server for the sharded cluster.
sh.status.sharding-version.minCompatibleVersion
The minCompatibleVersion (page 911) is the minimum compatible version of the config server.
sh.status.sharding-version.currentVersion
The currentVersion (page 911) is the current version of the config server.
17 The sharded collection section, by default, displays the chunk information if the total number of chunks is less than 20. To display the
information when you have 20 or more chunks, call the sh.status() (page 910) methods with the verbose parameter set to true, i.e.
sh.status(true).
911
sh.status.sharding-version.clusterId
The clusterId (page 912) is the identification for the sharded cluster.
Shards
sh.status.shards._id
The _id (page 912) displays the name of the shard.
sh.status.shards.host
The host (page 912) displays the host location of the shard.
sh.status.shards.tags
The tags (page 912) displays all the tags for the shard. The field only displays if the shard has tags.
Databases
sh.status.databases._id
The _id (page 912) displays the name of the database.
sh.status.databases.partitioned
The partitioned (page 912) displays whether the database has sharding enabled. If true, the database has
sharding enabled.
sh.status.databases.primary
The primary (page 912) displays the primary shard for the database.
Sharded Collection
sh.status.databases.shard-key
The shard-key (page 912) displays the shard key specification document.
sh.status.databases.chunks
The chunks (page 912) lists all the shards and the number of chunks that reside on each shard.
sh.status.databases.chunk-details
The chunk-details (page 912) lists the details of the chunks 1 :
The range of shard key values that define the chunk,
The shard where the chunk resides, and
The last modified timestamp for the chunk.
sh.status.databases.tag
The tag (page 912) lists the details of the tags associated with a range of shard key values.
sh.stopBalancer()
Definition
sh.stopBalancer(timeout, interval)
Disables the balancer in a sharded cluster and waits for balancing to complete.
param integer timeout Milliseconds to wait.
param integer interval Milliseconds to sleep each cycle of waiting.
See also:
sh.enableBalancing() (page 905)
sh.disableBalancing() (page 905)
sh.getBalancerHost() (page 906)
912
913
Definition
sh.waitForDLock(lockname, wait, timeout, interval)
Waits until the specified distributed lock changes state. sh.waitForDLock() (page 914) is an internal
method that takes the following arguments:
param string lockname The name of the distributed lock.
param Boolean wait Set to true to ensure the balancer is now active. Set to false to wait until
balancing stops and becomes inactive.
param integer timeout Milliseconds to wait.
param integer interval Milliseconds to sleep in each waiting cycle.
sh.waitForPingChange()
Definition
sh.waitForPingChange(activePings, timeout, interval)
sh.waitForPingChange() (page 914) waits for a change in ping state of one of the activepings, and
only returns when the specified ping changes state.
param array activePings An array of active pings from the mongos (page 568) collection.
param integer timeout Number of milliseconds to wait for a change in ping state.
param integer interval Number of milliseconds to sleep in each waiting cycle.
Subprocess
Subprocess Methods
Name
clearRawMongoProgramOutput() (page 914)
rawMongoProgramOutput() (page 914)
run()
runMongoProgram() (page 915)
runProgram() (page 915)
startMongoProgram()
stopMongoProgram() (page 915)
stopMongoProgramByPid() (page 915)
stopMongod() (page 915)
waitMongoProgramOnPort() (page 915)
waitProgram() (page 915)
Description
For internal use.
For internal use.
For internal use.
For internal use.
For internal use.
For internal use.
For internal use.
For internal use.
For internal use.
For internal use.
For internal use.
clearRawMongoProgramOutput()
clearRawMongoProgramOutput()
For internal use.
rawMongoProgramOutput()
rawMongoProgramOutput()
For internal use.
914
run()
run()
For internal use.
runMongoProgram()
runMongoProgram()
For internal use.
runProgram()
runProgram()
For internal use.
startMongoProgram()
_startMongoProgram()
For internal use.
stopMongoProgram()
stopMongoProgram()
For internal use.
stopMongoProgramByPid()
stopMongoProgramByPid()
For internal use.
stopMongod()
stopMongod()
For internal use.
waitMongoProgramOnPort()
waitMongoProgramOnPort()
For internal use.
waitProgram()
waitProgram()
For internal use.
Constructors
Object Constructors and Methods
Name
Date() (page 916)
UUID() (page 916)
ObjectId.getTimestamp()
(page 916)
ObjectId.toString() (page 916)
ObjectId.valueOf() (page 917)
11.1. MongoDB Interface
Description
Creates a date object. By default creates a date object including the
current date.
Converts a 32-byte hexadecimal string to the UUID BSON subtype.
Returns the timestamp portion of an ObjectId.
Displays the string representation of an ObjectId.
Displays the str attribute of an ObjectId as a hexadecimal string.
915
Date()
Date()
Returns Current date, as a string.
UUID()
Definition
UUID(<string>)
Generates a BSON UUID object.
param string hex Specify a 32-byte hexadecimal string to convert to the UUID BSON subtype.
Returns A BSON UUID object.
Example Create a 32 byte hexadecimal string:
var myuuid = '0123456789abcdeffedcba9876543210'
ObjectId.getTimestamp()
ObjectId.getTimestamp()
Returns The timestamp portion of the ObjectId() (page 104) object as a Date.
In the following example, call the getTimestamp() (page 916) method on an ObjectId (e.g.
ObjectId("507c7f79bcf86cd7994f6c0e")):
ObjectId("507c7f79bcf86cd7994f6c0e").getTimestamp()
ObjectId.toString()
ObjectId.toString()
Returns The string representation of the ObjectId() (page 104) object. This value has the format of
ObjectId(...).
Changed in version 2.2: In previous versions ObjectId.toString() (page 916) returns the value of the
ObjectId as a hexadecimal string.
In the following example, call the toString() (page 916) method on an ObjectId (e.g.
ObjectId("507c7f79bcf86cd7994f6c0e")):
ObjectId("507c7f79bcf86cd7994f6c0e").toString()
You can confirm the type of this object using the following operation:
916
typeof ObjectId("507c7f79bcf86cd7994f6c0e").toString()
ObjectId.valueOf()
ObjectId.valueOf()
Returns The value of the ObjectId() (page 104) object as a lowercase hexadecimal string. This value
is the str attribute of the ObjectId() object.
Changed in version 2.2: In previous versions ObjectId.valueOf() (page 917) returns the ObjectId()
object.
In the following example, call the valueOf() (page 917) method on an ObjectId (e.g.
ObjectId("507c7f79bcf86cd7994f6c0e")):
ObjectId("507c7f79bcf86cd7994f6c0e").valueOf()
You can confirm the type of this object using the following operation:
typeof ObjectId("507c7f79bcf86cd7994f6c0e").valueOf()
Connection
Connection Methods
Name
Mongo.getDB() (page 917)
Mongo.getReadPrefMode()
(page 918)
Mongo.getReadPrefTagSet()
(page 918)
Mongo.setReadPref() (page 919)
Mongo.setSlaveOk() (page 919)
Mongo() (page 919)
connect()
Description
Returns a database object.
Returns the current read preference mode for the MongoDB
connection.
Returns the read preference tag set for the MongoDB connection.
Sets the read preference for the MongoDB connection.
Allows operations on the current connection to read from secondary
members.
Creates a new connection object.
Connects to a MongoDB instance and to a specified database on that
instance.
Mongo.getDB()
Description
Mongo.getDB(<database>)
Provides access to database objects from the mongo (page 942) shell or from a JavaScript file.
The Mongo.getDB() (page 917) method has the following parameter:
param string database The name of the database to access.
917
Example The following example instantiates a new connection to the MongoDB instance running on the localhost
interface and returns a reference to "myDatabase":
db = new Mongo().getDB("myDatabase");
See also:
Mongo() (page 919) and connect() (page 920)
Mongo.getReadPrefMode()
Mongo.getReadPrefMode()
Returns The current read preference mode for the Mongo() (page 887) connection object.
See Read Preference (page 405) for an introduction to read preferences in MongoDB. Use
getReadPrefMode() (page 918) to return the current read preference mode, as in the following example:
db.getMongo().getReadPrefMode()
Use the following operation to return and print the current read preference mode:
print(db.getMongo().getReadPrefMode());
This operation will return one of the following read preference modes:
primary (page 489)
primaryPreferred (page 489)
secondary (page 489)
secondaryPreferred (page 489)
nearest (page 490)
See also:
Read Preference (page 405), readPref() (page 871),
getReadPrefTagSet() (page 918).
and
Mongo.getReadPrefTagSet()
Mongo.getReadPrefTagSet()
Returns The current read preference tag set for the Mongo() (page 887) connection object.
See Read Preference (page 405) for an introduction to read preferences and tag sets in MongoDB. Use
getReadPrefTagSet() (page 918) to return the current read preference tag set, as in the following example:
db.getMongo().getReadPrefTagSet()
Use the following operation to return and print the current read preference tag set:
printjson(db.getMongo().getReadPrefTagSet());
See also:
Read Preference (page 405), readPref() (page 871),
getReadPrefTagSet() (page 918).
and
Mongo.setReadPref()
918
Definition
Mongo.setReadPref(mode, tagSet)
Call the setReadPref() (page 919) method on a Mongo (page 887) connection object to control how the
client will route all queries to members of the replica set.
param string mode One of the following read preference modes: primary (page 489),
primaryPreferred (page 489), secondary (page 489), secondaryPreferred
(page 489), or nearest (page 490).
param array tagSet A tag set used to specify custom read preference modes. For details, see Tag
Sets (page 407).
Examples To set a read preference mode in the mongo (page 942) shell, use the following operation:
db.getMongo().setReadPref('primaryPreferred')
To set a read preference that uses a tag set, specify an array of tag sets as the second argument to
Mongo.setReadPref() (page 919), as in the following:
db.getMongo().setReadPref('primaryPreferred', [ { "dc": "east" } ] )
You can specify multiple tag sets, in order of preference, as in the following:
db.getMongo().setReadPref('secondaryPreferred',
[ { "dc": "east", "use": "production" },
{ "dc": "east", "use": "reporting" },
{ "dc": "east" },
{}
] )
If the replica set cannot satisfy the first tag set, the client will attempt to use the second read preference. Each tag set
can contain zero or more field/value tag pairs, with an empty document acting as a wildcard which matches a replica
set member with any tag set or no tag set.
Note: You must call Mongo.setReadPref() (page 919) on the connection object before retrieving documents
using that connection to use that read preference.
mongo.setSlaveOk()
Mongo.setSlaveOk()
For the current session, this command permits read operations from non-master (i.e. slave or secondary) instances. Practically, use this method in the following form:
db.getMongo().setSlaveOk()
Indicates that eventually consistent read operations are acceptable for the current application. This function
provides the same functionality as rs.slaveOk() (page 898).
See the readPref() (page 871) method for more fine-grained control over read preference (page 405) in the
mongo (page 942) shell.
Mongo()
Description
919
Mongo(host)
JavaScript constructor to instantiate a database connection from the mongo (page 942) shell or from a JavaScript
file.
The Mongo() (page 919) method has the following parameter:
param string host The host, either in the form of <host> or <host><:port>.
Instantiation Options
on the default port.
Use the constructor without a parameter to instantiate a connection to the localhost interface
Pass the <host> parameter to the constructor to instantiate a connection to the <host> and the default port.
Pass the <host><:port> parameter to the constructor to instantiate a connection to the <host> and the <port>.
See also:
Mongo.getDB() (page 917) and db.getMongo() (page 887).
connect()
connect(<hostname><:port>/<database>)
The connect() method creates a connection to a MongoDB instance. However, use the Mongo() (page 919)
object and its getDB() (page 917) method in most cases.
connect() accepts a string <hostname><:port>/<database> parameter to connect to the MongoDB
instance on the <hostname><:port> and return the reference to the database <database>.
The following example instantiates a new connection to the MongoDB instance running on the localhost interface and returns a reference to myDatabase:
db = connect("localhost:27017/myDatabase")
See also:
Mongo.getDB() (page 917)
920
Native
Native Methods
Name
cat()
version()
cd()
copyDbpath()
(page 922)
resetDbpath()
(page 922)
fuzzFile() (page 922)
getHostName()
(page 922)
getMemInfo()
(page 922)
hostname()
_isWindows()
(page 922)
listFiles() (page 922)
load()
ls()
md5sumFile()
(page 923)
mkdir()
pwd()
quit()
_rand() (page 924)
removeFile()
(page 924)
_srand() (page 924)
Description
Returns the contents of the specified file.
Returns the current version of the mongo (page 942) shell instance.
Changes the current working directory to the specified path.
Copies a local dbpath (page 993). For internal use.
Removes a local dbpath (page 993). For internal use.
For internal use to support testing.
Returns the hostname of the system running the mongo (page 942) shell.
Returns a document that reports the amount of memory used by the shell.
Returns the hostname of the system running the shell.
Returns true if the shell runs on a Windows system; false if a Unix or Linux
system.
Returns an array of documents that give the name and size of each object in the
directory.
Loads and runs a JavaScript file in the shell.
Returns a list of the files in the current directory.
The md5 hash of the specified file.
Creates a directory at the specified path.
Returns the current directory.
Exits the current shell session.
Returns a random number between 0 and 1.
Removes the specified file from the local file system.
For internal use.
cat()
Definition
cat(filename)
Returns the contents of the specified file. The method returns with output relative to the current shell session
and does not impact the server.
param string filename Specify a path and file name on the local file system.
version()
version()
Returns The version of the mongo (page 942) shell as a string.
Changed in version 2.4: In previous versions of the shell, version() would print the version instead of
returning a string.
cd()
11.1. MongoDB Interface
921
Definition
cd(path)
param string path A path on the file system local to the mongo (page 942) shell context.
cd() changes the directory context of the mongo (page 942) shell and has no effect on the MongoDB server.
copyDbpath()
copyDbpath()
For internal use.
resetDbpath()
resetDbpath()
For internal use.
fuzzFile()
Description
fuzzFile(filename)
For internal use.
param string filename A filename or path to a local file.
getHostName()
getHostName()
Returns The hostname of the system running the mongo (page 942) shell process.
getMemInfo()
getMemInfo()
Returns a document with two fields that report the amount of memory used by the JavaScript shell process. The
fields returned are resident and virtual.
hostname()
hostname()
Returns The hostname of the system running the mongo (page 942) shell process.
_isWindows()
_isWindows()
Returns boolean.
Returns true if the mongo (page 942) shell is running on a system that is Windows, or false if the shell is
running on a Unix or Linux systems.
listFiles()
listFiles()
Returns an array, containing one document per object in the directory. This function operates in the context of
the mongo (page 942) process. The included fields are:
922
name
Returns a string which contains the name of the object.
isDirectory
Returns true or false if the object is a directory.
size
Returns the size of the object in bytes. This field is only present for files.
load()
Definition
load(file)
Loads and runs a JavaScript file into the current shell environment.
The load() method has the following parameter:
param string filename Specifies the path of a JavaScript file to execute.
Specify filenames with relative or absolute paths. When using relative path names, confirm the current directory
using the pwd() method.
After executing a file with load(), you may reference any functions or variables defined the file from the
mongo (page 942) shell environment.
Example Consider the following examples of the load() method:
load("scripts/myjstest.js")
load("/data/db/scripts/myjstest.js")
ls()
ls()
Returns a list of the files in the current directory.
This function returns with output relative to the current shell session, and does not impact the server.
md5sumFile()
Description
md5sumFile(filename)
Returns a md5 hash of the specified file.
The md5sumFile() (page 923) method has the following parameter:
param string filename A file name.
Note: The specified filename must refer to a file located on the system running the mongo (page 942) shell.
mkdir()
923
Description
mkdir(path)
Creates a directory at the specified path. This method creates the entire path specified if the enclosing directory
or directories do not already exit.
This method is equivalent to mkdir -p with BSD or GNU utilities.
The mkdir() method has the following parameter:
param string path A path on the local filesystem.
pwd()
pwd()
Returns the current directory.
This function returns with output relative to the current shell session, and does not impact the server.
quit()
quit()
Exits the current shell session.
rand()
_rand()
Returns A random number between 0 and 1.
This function provides functionality similar to the Math.rand() function from the standard library.
removeFile()
Description
removeFile(filename)
Removes the specified file from the local file system.
The removeFile() (page 924) method has the following parameter:
param string filename A filename or path to a local file.
_srand()
_srand()
For internal use.
924
mongod
Synopsis mongod (page 925) is the primary daemon process for the MongoDB system. It handles data requests,
manages data format, and performs background management operations.
This document provides a complete overview of all command line options for mongod (page 925). These options are
primarily useful for testing purposes. In common operation, use the configuration file options (page 990) to control
the behavior of your database, which is fully capable of all operations described below.
Options
mongod
Core Options
mongod
command line option!help, -h
--help, -h
Returns a basic help and usage text.
command line option!version
--version
Returns the version of the mongod (page 925) daemon.
command line option!config <filename>, -f <filename>
--config <filename>, -f <filename>
Specifies a configuration file, that you can use to specify runtime-configurations. While the options are equivalent and accessible via the other command line arguments, the configuration file is the preferred method for
runtime configuration of mongod. See the Configuration File Options (page 990) document for more information about these options.
Note: Ensure the configuration file uses ASCII encoding. mongod (page 925) does not support configuration
files with non-ASCII encoding, including UTF-8.
command line option!verbose, -v
--verbose, -v
Increases the amount of internal reporting returned on standard output or in the log file specified by --logpath
(page 939). Use the -v form to control the level of verbosity by including the option multiple times, (e.g.
-vvvvv.)
command line option!quiet
--quiet
Runs the mongod (page 925) instance in a quiet mode that attempts to limit the amount of output. This option
suppresses:
output from database commands, including drop (page 747), dropIndexes (page 750),
diagLogging (page 769), validate (page 771), and clean (page 752).
replication activity.
connection accepted events.
connection closed events.
command line option!port <port>
925
--port <port>
Specifies a TCP port for the mongod (page 925) to listen for client connections. By default mongod (page 925)
listens for connections on port 27017.
UNIX-like systems require root privileges to use ports with numbers lower than 1024.
command line option!bind_ip <ip address>
--bind_ip <ip address>
The IP address that the mongod (page 925) process will bind to and listen for connections. By default mongod
(page 925) listens for connections all interfaces. You may attach mongod (page 925) to any interface; however,
when attaching mongod (page 925) to a publicly accessible interface ensure that you have implemented proper
authentication and/or firewall restrictions to protect the integrity of your database.
command line option!maxConns <number>
--maxConns <number>
Specifies the maximum number of simultaneous connections that mongod (page 925) will accept. This setting will have no effect if it is higher than your operating systems configured maximum connection tracking
threshold.
Note: You cannot set maxConns (page 992) to a value higher than 20000.
command line option!objcheck
--objcheck
Forces the mongod (page 925) to validate all requests from clients upon receipt to ensure that clients never insert
invalid documents into the database. For objects with a high degree of sub-document nesting, --objcheck
(page 983) can have a small impact on performance. You can set --noobjcheck (page 959) to disable object
checking at run-time.
Changed in version 2.4: MongoDB enables --objcheck (page 983) by default, to prevent any client from
inserting malformed or invalid BSON into a MongoDB database.
command line option!noobjcheck
--noobjcheck
New in version 2.4.
Disables the default document validation that MongoDB performs on all incoming BSON documents.
command line option!logpath <path>
--logpath <path>
Specify a path for the log file that will hold all diagnostic logging information.
Unless specified, mongod (page 925) will output all log information to the standard output. Additionally, unless
you also specify --logappend (page 939), the logfile will be overwritten when the process restarts.
Note: The behavior of the logging system may change in the near future in response to the SERVER-449918
case.
command line option!logappend
--logappend
When specified, this option ensures that mongod (page 925) appends new entries to the end of the logfile rather
than overwriting the content of the log when the process restarts.
command line option!syslog
18 https://jira.mongodb.org/browse/SERVER-4499
926
--syslog
New in version 2.1.0.
Sends all logging output to the hosts syslog system rather than to standard output or a log file as with
--logpath (page 939).
Important: You cannot use --syslog (page 939) with --logpath (page 939).
command line option!pidfilepath <path>
--pidfilepath <path>
Specify a file location to hold the PID or process ID of the mongod (page 925) process. Useful for tracking
the mongod (page 925) process in combination with the mongod --fork option.
Without a specified --pidfilepath (page 940) option, mongos (page 938) creates no PID file.
command line option!keyFile <file>
--keyFile <file>
Specify the path to a key file to store authentication information. This option is only useful for the connection
between replica set members.
See also:
Replica Set Security (page 238) and Replica Set Tutorials (page 419).
command line option!nounixsocket
--nounixsocket
Disables listening on the UNIX socket. mongod (page 925) always listens on the UNIX socket, unless
--nounixsocket (page 940) is set, --bind_ip (page 938) is not set, or --bind_ip (page 938) does
not specify 127.0.0.1.
command line option!unixSocketPrefix <path>
--unixSocketPrefix <path>
Specifies a path for the UNIX socket. Unless this option has a value mongod (page 925) creates a socket with
http://docs.mongodb.org/manualtmp as a prefix.
MongoDB will always create and listen on a UNIX socket, unless --nounixsocket (page 940) is set,
--bind_ip (page 938) is not set, or --bind_ip (page 938) does not specify 127.0.0.1.
command line option!fork
--fork
Enables a daemon mode for mongod (page 925) that runs the process to the background. This is the normal
mode of operation, in production and production-like environments, but may not be desirable for testing.
command line option!auth
--auth
Enables database authentication for users connecting from remote hosts. Configure users via the mongo shell
(page 942). If no users exist, the localhost interface will continue to have access to the database until the you
create the first user.
See the Security and Authentication (page 237) page for more information regarding this functionality.
command line option!cpu
--cpu
Forces mongod (page 925) to report the percentage of CPU time in write lock. mongod (page 925) generates
output every four seconds. MongoDB writes this data to standard output or the logfile if using the logpath
(page 992) option.
927
Setting
off. No logging.
Log write operations.
Log read operations.
Log both read and write operations.
Log write and some read operations.
You can use the mongosniff (page 982) tool to replay this output for investigation. Given a typical diaglog
file, located at /data/db/diaglog.4f76a58c, you might use a command in the following form to read
these files:
mongosniff --source DIAGLOG /data/db/diaglog.4f76a58c
--diaglog (page 928) is for internal use and not intended for most users.
Warning: Setting the diagnostic level to 0 will cause mongod (page 925) to stop writing data to the
diagnostic log file. However, the mongod (page 925) instance will continue to keep the file open, even if
it is no longer writing data to the file. If you want to rename, move, or delete the diagnostic log you must
cleanly shut down the mongod (page 925) instance before doing so.
command line option!directoryperdb
--directoryperdb
Alters the storage pattern of the data directory to store each databases files in a distinct folder. This option will
create directories within the --dbpath (page 958) named for each directory.
Use this option in conjunction with your file system and device configuration so that MongoDB will store data
on a number of distinct disk devices to increase write throughput or disk capacity.
928
Warning: If you have an existing mongod (page 925) instance and dbpath (page 993), and you want to
enable --directoryperdb (page 958), you must migrate your existing databases to directories before
setting --directoryperdb (page 958) to access those databases.
Example
Given a dbpath (page 993) directory with the following items:
journal
mongod.lock
local.0
local.1
local.ns
test.0
test.1
test.ns
To enable --directoryperdb (page 958) you would need to modify the dbpath (page 993) to resemble
the following:
journal
mongod.lock
local/local.0
local/local.1
local/local.ns
test/test.0
test/test.1
test/test.ns
929
networks. mongod (page 925) disables IPv6 support by default in mongod (page 925) and all utilities.
command line option!jsonp
--jsonp
Permits JSONP access via an HTTP interface. Consider the security implications of allowing this activity before
enabling this option.
command line option!noauth
--noauth
Disable authentication. Currently the default. Exists for future compatibility and clarity.
command line option!nohttpinterface
--nohttpinterface
Disables the HTTP interface.
Note: In MongoDB Enterprise, the HTTP Console does not support Kerberos Authentication.
command line option!nojournal
--nojournal
Disables the durability journaling. By default, mongod (page 925) enables journaling in 64-bit versions after
v2.0.
command line option!noprealloc
--noprealloc
Disables the preallocation of data files. This will shorten the start up time in some cases, but can cause significant
performance penalties during normal operations.
command line option!noscripting
--noscripting
Disables the scripting engine.
command line option!notablescan
--notablescan
Forbids operations that require a table scan.
command line option!nssize <value>
--nssize <value>
Specifies the default size for namespace files (i.e .ns). This option has no impact on the size of existing
namespace files. The maximum size is 2047 megabytes.
The default value is 16 megabytes; this provides for approximately 24,000 namespaces. Each collection, as well
as each index, counts as a namespace.
command line option!profile <level>
--profile <level>
Changes the level of database profiling, which inserts information about operation performance into output of
mongod (page 925) or the log file. The following levels are available:
Level
0
1
2
Setting
Off. No profiling.
On. Only includes slow operations.
On. Includes all operations.
Profiling is off by default. Database profiling can impact database performance. Enable this option only after
careful consideration.
930
Note: When using journaling, there is almost never any need to run repairDatabase (page 757). In the
event of an unclean shutdown, the server will be able restore the data files to a pristine state automatically.
Changed in version 2.1.2.
If you run the repair option and have data in a journal file, mongod (page 925) will refuse to start. In these cases
you should start mongod (page 925) without the --repair (page 954) option to allow mongod (page 925) to
recover data from the journal. This will complete more quickly and will result in a more consistent and complete
data set.
To continue the repair operation despite the journal files, shut down mongod (page 925) cleanly and restart with
the --repair (page 954) option.
Note: --repair (page 954) copies data from the source data files into new data files in the repairpath
(page 997), and then replaces the original data files with the repaired data files. If repairpath (page 997)
is on the same device as dbpath (page 993), you may interrupt a mongod (page 925) running --repair
(page 954) without affecting the integrity of the data set.
command line option!repairpath <path>
--repairpath <path>
Specifies the root directory containing MongoDB data files, to use for the --repair (page 954) operation.
Defaults to a _tmp directory within the dbpath (page 993).
command line option!setParameter <options>
931
--setParameter <options>
New in version 2.4.
Specifies an option to configure on startup. Specify multiple options with multiple --setParameter
(page 939) options. See mongod Parameters (page 1005) for full documentation of these parameters. The setParameter (page 756) database command provides access to many of these parameters.
--setParameter (page 939) supports the following options:
enableLocalhostAuthBypass (page 1005)
enableTestCommands (page 1005)
journalCommitInterval (page 1006)
logLevel (page 1006)
logUserIds (page 1006)
notablescan (page 1006)
quiet (page 1007)
replApplyBatchSize (page 1006)
replIndexPrefetch (page 1006)
supportCompatibilityFormPrivilegeDocuments (page 1007)
syncdelay (page 1007)
textSearchEnabled (page 1008)
traceExceptions (page 1007)
command line option!slowms <value>
--slowms <value>
Defines the value of slow, for the --profile (page 930) option. The database logs all slow queries to the
log, even when the profiler is not turned on. When the database profiler is on, mongod (page 925) the profiler
writes to the system.profile collection. See the profile (page 770) command for more information on
the database profiler.
command line option!smallfiles
--smallfiles
Enables a mode where MongoDB uses a smaller default file size. Specifically, --smallfiles (page 932)
reduces the initial size for data files and limits them to 512 megabytes. --smallfiles (page 932) also
reduces the size of each journal files from 1 gigabyte to 128 megabytes.
Use --smallfiles (page 932) if you have a large number of databases that each holds a small quantity of
data. --smallfiles (page 932) can lead your mongod (page 925) to create a large number of files, which
may affect performance for larger databases.
command line option!shutdown
--shutdown
Used in control scripts, the --shutdown (page 932) will cleanly and safely terminate the mongod (page 925)
process. When invoking mongod (page 925) with this option you must set the --dbpath (page 958) option
either directly or by way of the configuration file (page 990) and the --config (page 938) option.
The --shutdown (page 932) option is available only on Linux systems.
command line option!syncdelay <value>
932
--syncdelay <value>
mongod (page 925) writes data very quickly to the journal, and lazily to the data files. --syncdelay
(page 932) controls how much time can pass before MongoDB flushes data to the database files via an fsync
operation. The default setting is 60 seconds. In almost every situation you should not set this value and use the
default setting.
The serverStatus (page 782) command reports the background flush threads status via the
backgroundFlushing (page 789) field.
syncdelay (page 998) has no effect on the journal (page 995) files or journaling (page 232).
Warning: If you set --syncdelay (page 932) to 0, MongoDB will not sync the memory mapped files
to disk. Do not set this value on production systems.
command line option!sysinfo
--sysinfo
Returns diagnostic system information and then exits. The information provides the page size, the number of
physical pages, and the number of available physical pages.
command line option!upgrade
--upgrade
Upgrades the on-disk data format of the files specified by the --dbpath (page 958) to the latest version, if
needed.
This option only affects the operation of mongod (page 925) if the data files are in an old format.
Note: In most cases you should not set this value, so you can exercise the most control over your upgrade
process. See the MongoDB release notes19 (on the download page) for more information about the upgrade
process.
command line option!traceExceptions
--traceExceptions
For internal diagnostic use only.
Replication Options command line option!replSet <setname>
--replSet <setname>
Use this option to configure replication with replica sets. Specify a replica set name as an argument to this set.
All hosts in the replica set must have the same set name.
Important: If your application connects to more than one replica set, each set should have a distinct name.
Some drivers group replica set connections by replica set name.
See also:
Replication (page 377), Replica Set Tutorials (page 419), and Replica Set Configuration (page 479)
command line option!oplogSize <value>
--oplogSize <value>
Specifies a maximum size in megabytes for the replication operation log (e.g. oplog.) By mongod (page 925)
creates an oplog based on the maximum amount of space available. For 64-bit systems, the op log is typically
5% of available disk space.
19 http://www.mongodb.org/downloads
933
Once the mongod (page 925) has created the oplog for the first time, changing --oplogSize (page 933) will
not affect the size of the oplog.
command line option!fastsync
--fastsync
In the context of replica set replication, set this option if you have seeded this member with a snapshot of the
dbpath of another member of the set. Otherwise the mongod (page 925) will attempt to perform an initial sync,
as though the member were a new member.
Warning: If the data is not perfectly synchronized and mongod (page 925) starts with fastsync
(page 1000), then the secondary or slave will be permanently out of sync with the primary, which may
cause significant consistency problems.
command line option!replIndexPrefetch
--replIndexPrefetch
New in version 2.2.
You must use --replIndexPrefetch (page 934) in conjunction with replSet (page 1000). The default
value is all and available options are:
none
all
_id_only
By default secondary members of a replica set will load all indexes related to an operation into memory before
applying operations from the oplog. You can modify this behavior so that the secondaries will only load the _id
index. Specify _id_only or none to prevent the mongod (page 925) from loading any index into memory.
Master-Slave Replication These options provide access to conventional master-slave database replication. While
this functionality remains accessible in MongoDB, replica sets are the preferred configuration for database replication.
command line option!master
--master
Configures mongod (page 925) to run as a replication master.
command line option!slave
--slave
Configures mongod (page 925) to run as a replication slave.
command line option!source <host><:port>
--source <host><:port>
For use with the --slave (page 934) option, the --source option designates the server that this instance
will replicate.
command line option!only <arg>
--only <arg>
For use with the --slave (page 934) option, the --only option specifies only a single database to replicate.
command line option!slavedelay <value>
--slavedelay <value>
For use with the --slave (page 934) option, the --slavedelay option configures a delay in seconds, for
this slave to wait to apply operations from the master node.
command line option!autoresync
934
--autoresync
For use with the --slave (page 934) option. When set, --autoresync (page 934) option allows this slave
to automatically resync if it is more than 10 seconds behind the master. This setting may be problematic if the
--oplogSize (page 933) specifies a too small oplog. If the oplog is not large enough to store the difference
in changes between the masters current state and the state of the slave, this instance will forcibly resync itself
unnecessarily. When you set the autoresync (page 1001) option to false, the slave will not attempt an
automatic resync more than once in a ten minute period.
Sharding Cluster Options command line option!configsvr
--configsvr
Declares that this mongod (page 925) instance serves as the config database of a sharded cluster. When running
with this option, clients will not be able to write data to any database other than config and admin. The
default port for a mongod (page 925) with this option is 27019 and the default --dbpath (page 958) directory
is /data/configdb, unless specified.
Changed in version 2.2: --configsvr (page 935) also sets --smallfiles (page 932).
Changed in version 2.4: --configsvr (page 935) creates a local oplog.
Do not use --configsvr (page 935) with --replSet (page 933) or --shardsvr (page 935). Config
servers cannot be a shard server or part of a replica set.
command line option!shardsvr
--shardsvr
Configures this mongod (page 925) instance as a shard in a partitioned cluster. The default port for these
instances is 27018. The only effect of --shardsvr (page 935) is to change the port number.
command line option!moveParanoia
--moveParanoia
New in version 2.4.
During chunk migrations, --moveParanoia (page 935) forces the mongod (page 925) instances to save all
documents migrated from this shard in the moveChunk directory of the dbpath (page 993). MongoDB does
not delete data from this directory.
Prior to 2.4, --moveParanoia (page 935) was the default behavior of MongoDB.
SSL Options
See
Connect to MongoDB with SSL (page 249) for full documentation of MongoDBs support.
command line option!sslOnNormalPorts
--sslOnNormalPorts
New in version 2.2.
Note: The default distribution of MongoDB20 does not contain support for SSL. To use SSL you can either compile MongoDB with SSL support or use MongoDB Enterprise. See Connect to MongoDB with SSL
(page 249) for more information about SSL and MongoDB.
Enables SSL for mongod (page 925). With --sslOnNormalPorts (page 935), a mongod (page 925)
requires SSL encryption for all connections on the default MongoDB port, or the port specified by --port
(page 980). By default, --sslOnNormalPorts (page 935) is disabled.
20 http://www.mongodb.org/downloads
935
936
See the Configuration File Options (page 990) for more information on how to configure mongod (page 925) using
the configuration file.
mongos
Synopsis mongos (page 938) for MongoDB Shard, is a routing service for MongoDB shard configurations that
processes queries from the application layer, and determines the location of this data in the sharded cluster, in order to
complete these operations. From the perspective of the application, a mongos (page 938) instance behaves identically
to any other MongoDB instance.
Note: Changed in version 2.1.
Some aggregation operations using the aggregate (page 694) will cause mongos (page 938) instances to require
more CPU resources than in previous versions. This modified performance profile may dictate alternate architecture
decisions if you use the aggregation framework extensively in a sharded environment.
25 http://www.mongodb.org/downloads
26 http://www.mongodb.org/downloads
937
See also:
Sharding (page 493) and Sharding Introduction (page 493).
Options
mongos
mongos
command line option!help, -h
--help, -h
Returns a basic help and usage text.
command line option!version
--version
Returns the version of the mongod (page 925) daemon.
command line option!config <filename>, -f <filename>
--config <filename>, -f <filename>
Specifies a configuration file, that you can use to specify runtime-configurations. While the options are equivalent and accessible via the other command line arguments, the configuration file is the preferred method for
runtime configuration of mongod. See the Configuration File Options (page 990) document for more information about these options.
Not all configuration options for mongod (page 925) make sense in the context of mongos (page 938).
Note: Ensure the configuration file uses ASCII encoding. mongod (page 925) does not support configuration
files with non-ASCII encoding, including UTF-8.
command line option!verbose, -v
--verbose, -v
Increases the amount of internal reporting returned on standard output or in the log file specified by --logpath
(page 939). Use the -v form to control the level of verbosity by including the option multiple times, (e.g.
-vvvvv.)
command line option!quiet
--quiet
Runs the mongos (page 938) instance in a quiet mode that attempts to limit the amount of output.
command line option!port <port>
--port <port>
Specifies a TCP port for the mongos (page 938) to listen for client connections. By default mongos (page 938)
listens for connections on port 27017.
UNIX-like systems require root access to access ports with numbers lower than 1024.
command line option!bind_ip <ip address>
--bind_ip <ip address>
The IP address that the mongos (page 938) process will bind to and listen for connections. By default mongos
(page 938) listens for connections all interfaces. You may attach mongos (page 938) to any interface; however,
when attaching mongos (page 938) to a publicly accessible interface ensure that you have implemented proper
authentication and/or firewall restrictions to protect the integrity of your database.
command line option!maxConns <number>
938
--maxConns <number>
Specifies the maximum number of simultaneous connections that mongos (page 938) will accept. This setting
will have no effect if the value of this setting is higher than your operating systems configured maximum
connection tracking threshold.
This is particularly useful for mongos (page 938) if you have a client that creates a number of collections but
allows them to timeout rather than close the collections. When you set maxConns (page 992), ensure the value
is slightly higher than the size of the connection pool or the total number of connections to prevent erroneous
connection spikes from propagating to the members of a sharded cluster.
Note: You cannot set maxConns (page 992) to a value higher than 20000.
command line option!objcheck
--objcheck
Forces the mongos (page 938) to validate all requests from clients upon receipt to ensure that invalid objects
are never inserted into the database. This option has a performance impact, and is not enabled by default.
command line option!logpath <path>
--logpath <path>
Specify a path for the log file that will hold all diagnostic logging information.
Unless specified, mongos (page 938) will output all log information to the standard output. Additionally, unless
you also specify --logappend (page 939), the logfile will be overwritten when the process restarts.
command line option!logappend
--logappend
Specify to ensure that mongos (page 938) appends additional logging data to the end of the logfile rather than
overwriting the content of the log when the process restarts.
command line option!setParameter <options>
--setParameter <options>
New in version 2.4.
Specifies an option to configure on startup. Specify multiple options with multiple --setParameter
(page 939) options. See mongod Parameters (page 1005) for full documentation of these parameters. The setParameter (page 756) database command provides access to many of these parameters.
--setParameter (page 939) supports the following options:
enableLocalhostAuthBypass (page 1005)
enableTestCommands (page 1005)
logLevel (page 1006)
logUserIds (page 1006)
notablescan (page 1006)
quiet (page 1007)
supportCompatibilityFormPrivilegeDocuments (page 1007)
syncdelay (page 1007)
textSearchEnabled (page 1008)
command line option!syslog
--syslog
New in version 2.1.0.
939
Sends all logging output to the hosts syslog system rather than to standard output or a log file as with
--logpath (page 939).
Important: You cannot use --syslog (page 939) with --logpath (page 939).
command line option!pidfilepath <path>
--pidfilepath <path>
Specify a file location to hold the PID or process ID of the mongos (page 938) process. Useful for tracking the
mongos (page 938) process in combination with the mongos --fork option.
Without a specified --pidfilepath (page 940) option, mongos (page 938) creates no PID file.
command line option!keyFile <file>
--keyFile <file>
Specify the path to a key file to store authentication information. This option is only useful for the connection
between mongos (page 938) instances and components of the sharded cluster.
See also:
Sharded Cluster Security (page 509)
command line option!nounixsocket
--nounixsocket
Disables listening on the UNIX socket. mongos (page 938) always listens on the UNIX socket, unless
--nounixsocket (page 940) is set, --bind_ip (page 938) is not set, or --bind_ip (page 938) does
not specify 127.0.0.1.
command line option!unixSocketPrefix <path>
--unixSocketPrefix <path>
Specifies a path for the UNIX socket. Unless this option has a value mongos (page 938) creates a socket with
http://docs.mongodb.org/manualtmp as a prefix.
MongoDB will always create and listen on a UNIX socket, unless --nounixsocket (page 940) is set,
--bind_ip (page 938) is not set, or --bind_ip (page 938) specifies 127.0.0.1.
command line option!fork
--fork
Enables a daemon mode for mongos (page 938) which forces the process to the background. This is the normal
mode of operation, in production and production-like environments, but may not be desirable for testing.
command line option!configdb <config1>,<config2><:port>,<config3>
--configdb <config1>,<config2><:port>,<config3>
Set this option to specify a configuration database (i.e. config database) for the sharded cluster. You must
specify either 1 configuration server or 3 configuration servers, in a comma separated list.
Note: mongos (page 938) instances read from the first config server in the list provided. All mongos
(page 938) instances must specify the hosts to the --configdb (page 940) setting in the same order.
If your configuration databases reside in more that one data center, order the hosts in the --configdb
(page 940) argument so that the config database that is closest to the majority of your mongos (page 938)
instances is first servers in the list.
Warning: Never remove a config server from the --configdb (page 940) parameter, even if the config
server or servers are not available, or offline.
940
941
If you specify a value for --localThreshold (page 941), mongos (page 938) will construct the list
of replica members that are within the latency allowed by this value.
The mongos (page 938) will select a member to read from at random from this list.
The ping time used for a set member compared by the --localThreshold (page 941) setting is a moving
average of recent ping times, calculated, at most, every 10 seconds. As a result, some queries may reach
members above the threshold until the mongos (page 938) recalculates the average.
See the Member Selection (page 408) section of the read preference (page 405) documentation for more information.
command line option!noAutoSplit
--noAutoSplit
New in version 2.0.7.
--noAutoSplit (page 942) prevents mongos (page 938) from automatically inserting metadata splits in a
sharded collection. If set on all mongos (page 938), this will prevent MongoDB from creating new chunks as
the data in a collection grows.
Because any mongos (page 938) in a cluster can create a split, to totally disable splitting in a cluster you must
set --noAutoSplit (page 942) on all mongos (page 938).
Warning: With --noAutoSplit (page 942) enabled, the data in your sharded cluster may become
imbalanced over time. Enable with caution.
SSL Options
See
Connect to MongoDB with SSL (page 249) for full documentation of MongoDBs support.
command line option!authenticationDatabase <dbname>
--authenticationDatabase <dbname>
New in version 2.4.
Specifies the database that holds the users (e.g --username) credentials.
By default, mongos (page 938) assumes that the database specified to the --db (page 958) argument holds the
users credentials, unless you specify --authenticationDatabase (page 980).
See userSource (page 271), system.users Privilege Documents (page 270) and User Privilege Roles in MongoDB (page 265) for more information about delegated authentication in MongoDB.
command line option!authenticationMechanism <name>
--authenticationMechanism <name>
New in version 2.4.
Specifies the authentication mechanism. By default, the authentication mechanism is MONGODB-CR, which is
the MongoDB challenge/response authentication mechanism. In MongoDB Enterprise, mongos (page 938)
also includes support for GSSAPI to handle Kerberos authentication.
See Deploy MongoDB with Kerberos Authentication (page 259) for more information about Kerberos authentication.
mongo
Description
942
mongo
mongo (page 942) is an interactive JavaScript shell interface to MongoDB, which provides a powerful interface for
systems administrators as well as a way for developers to test queries and operations directly with the database. mongo
(page 942) also provides a fully functional JavaScript environment for use with a MongoDB. This document addresses
the basic invocation of the mongo (page 942) shell and an overview of its usage.
Interface
Options
mongo
command line option!shell
--shell
Enables the shell interface after evaluating a JavaScript file. If you invoke the mongo (page 942) command and
specify a JavaScript file as an argument, or use --eval (page 943) to specify JavaScript on the command line,
the --shell (page 943) option provides the user with a shell prompt after the file finishes executing.
command line option!nodb
--nodb
Prevents the shell from connecting to any database instances. Later, to connect to a database within the shell,
see Opening New Connections (page 202).
command line option!norc
--norc
Prevents the shell from sourcing and evaluating ~/.mongorc.js on start up.
command line option!quiet
--quiet
Silences output from the shell during the connection process.
command line option!port <port>
--port <port>
Specifies the port where the mongod (page 925) or mongos (page 938) instance is listening. Unless specified
mongo (page 942) connects to mongod (page 925) instances on port 27017, which is the default mongod
(page 925) port.
command line option!host <hostname>
--host <hostname>
specifies the host where the mongod (page 925) or mongos (page 938) is running to connect to as
<hostname>. By default mongo (page 942) will attempt to connect to a MongoDB process running on
the localhost.
command line option!eval <javascript>
--eval <javascript>
Evaluates a JavaScript expression specified as an argument to this option. mongo (page 942) does not load its
own environment when evaluating code: as a result many options of the shell environment are not available.
command line option!username <username>, -u <username>
--username <username>, -u <username>
Specifies a username to authenticate to the MongoDB instance. Use in conjunction with the --password
(page 980) option to supply a password. If you specify a username and password but the default database or the
specified database do not require authentication, mongo (page 942) will exit with an exception.
command line option!password <password>, -p <password>
11.2. Architecture and Components
943
The default distribution of MongoDB28 does not contain support for SSL. To use SSL you can ei-
27 http://www.mongodb.org/downloads
944
ther compile MongoDB with SSL support or use MongoDB Enterprise. See Connect to MongoDB with SSL
(page 249) for more information about SSL and MongoDB.
Specifies the password to decrypt the root certificate chain specified by --sslPEMKeyFile (page 936).
Only required if the certificate-key file is encrypted.
command line option!sslCAFile <filename>
--sslCAFile <filename>
New in version 2.4.
Note: The default distribution of MongoDB29 does not contain support for SSL. To use SSL you can either compile MongoDB with SSL support or use MongoDB Enterprise. See Connect to MongoDB with SSL
(page 249) for more information about SSL and MongoDB.
Specifies the .pem file that contains the certificate from the Certificate Authority. Specify the file name of the
.pem file using relative or absolute paths
command line option!help, -h
--help, -h
Returns a basic help and usage text.
command line option!version
--version
Returns the version of the shell.
command line option!verbose
--verbose
Increases the verbosity of the output of the shell during the connection process.
command line option!ipv6
--ipv6
Enables IPv6 support that allows mongo (page 942) to connect to the MongoDB instance using an IPv6 network.
All MongoDB programs and processes, including mongo (page 942), disable IPv6 support by default.
<db address>
Specifies the database address of the database to connect to. For example:
mongo admin
The above command will connect the mongo (page 942) shell to the admin database on the local machine. You
may specify a remote database instance, with the resolvable hostname or IP address. Separate the database name
from the hostname using a http://docs.mongodb.org/manual character. See the following examples:
mongo mongodb1.example.net
mongo mongodb1/admin
mongo 10.8.8.10/test
<file.js>
Specifies a JavaScript file to run and then exit. Generally this should be the last option specified.
Optional
28 http://www.mongodb.org/downloads
29 http://www.mongodb.org/downloads
945
To specify a JavaScript file to execute and allow mongo (page 942) to prompt you for a password using --password (page 980), pass the filename as the first parameter with --username (page 980) and
--password (page 980) s the last options as in the following:
mongo file.js --username username --password
Use the --shell (page 943) option to return to a shell after the file finishes running.
Files ~/.dbshell
mongo (page 942) maintains a history of commands in the .dbshell file.
Note: mongo (page 942) does not recorded interaction related to authentication in the history file,
including authenticate (page 725) and db.addUser() (page 875).
Warning: Versions of Windows mongo.exe earlier than 2.2.0 will save the .dbshell file in the
mongo.exe working directory.
~/.mongorc.js
mongo (page 942) will read the .mongorc.js file from the home directory of the user invoking mongo
(page 942). In the file, users can define variables, customize the mongo (page 942) shell prompt, or update
information that they would like updated every time they launch a shell. If you use the shell to evaluate a
JavaScript file or expression either on the command line with --eval (page 943) or by specifying a .js
file to mongo (page 945), mongo (page 942) will read the .mongorc.js file after the JavaScript has
finished processing.
Specify the --norc (page 943) option to disable reading .mongorc.js.
http://docs.mongodb.org/manualtmp/mongo_edit<time_t>.js
Created by mongo (page 942) when editing a file. If the file exists mongo (page 942) will append an
integer from 1 to 10 to the time value to attempt to create a unique file.
%TEMP%mongo_edit<time_t>.js
Created by mongo.exe on Windows when editing a file. If the file exists mongo (page 942) will append
an integer from 1 to 10 to the time value to attempt to create a unique file.
Environment
EDITOR
Specifies the path to an editor to use with the edit shell command. A JavaScript variable EDITOR will override
the value of EDITOR (page 946).
HOME
Specifies the path to the home directory where mongo (page 942) will read the .mongorc.js file and write
the .dbshell file.
HOMEDRIVE
On Windows systems, HOMEDRIVE (page 946) specifies the path the directory where mongo (page 942) will
read the .mongorc.js file and write the .dbshell file.
HOMEPATH
Specifies the Windows path to the home directory where mongo (page 942) will read the .mongorc.js file
and write the .dbshell file.
946
Keyboard Shortcuts The mongo (page 942) shell supports the following keyboard shortcuts:
Keybinding
Up arrow
Down-arrow
Home
End
Tab
Left-arrow
Right-arrow
Ctrl-left-arrow
Ctrl-right-arrow
Meta-left-arrow
Meta-right-arrow
Ctrl-A
Ctrl-B
Ctrl-C
Ctrl-D
Ctrl-E
Ctrl-F
Ctrl-G
Ctrl-J
Ctrl-K
Ctrl-L or type cls
Ctrl-M
Ctrl-N
Ctrl-P
Ctrl-R
Ctrl-S
Ctrl-T
Ctrl-U
Ctrl-W
Ctrl-Y
Ctrl-Z
Ctrl-H
Ctrl-I
Meta-B
Meta-C
Meta-D
Meta-F
Meta-L
Meta-U
Meta-Y
Meta-Backspace
Meta-<
Meta->
30
Function
Retrieve previous command from history
Retrieve next command from history
Go to beginning of the line
Go to end of the line
Autocomplete method/command
Go backward one character
Go forward one character
Go backward one word
Go forward one word
Go backward one word
Go forward one word
Go to the beginning of the line
Go backward one character
Exit the mongo (page 942) shell
Delete a char (or exit the mongo (page 942) shell)
Go to the end of the line
Go forward one character
Abort
Accept/evaluate the line
Kill/erase the line
Clear the screen
Accept/evaluate the line
Retrieve next command from history
Retrieve previous command from history
Reverse-search command history
Forward-search command history
Transpose characters
Perform Unix line-discard
Perform Unix word-rubout
Yank
Suspend (job control works in linux)
Backward-delete a character
Complete, same as Tab
Go backward one word
Capitalize word
Kill word
Go forward one word
Change word to lowercase
Change word to uppercase
Yank-pop
Backward-kill word
Retrieve the first command in command history
Retrieve the last command in command history
Use Typically users invoke the shell with the mongo (page 942) command at the system prompt. Consider the
following examples for other scenarios.
30
MongoDB accommodates multiple keybinding. Since 2.0, mongo (page 942) includes support for basic emacs keybindings.
947
To connect to a database on a remote host using authentication and a non-standard port, use the following form:
mongo --username <user> --password <pass> --host <host> --port 28015
Replace <user>, <pass>, and <host> with the appropriate values for your situation and substitute or omit the
--port (page 980) as needed.
To execute a JavaScript file without evaluating the ~/.mongorc.js file before starting a shell session, use the
following form:
mongo --shell --norc alternate-environment.js
To execute a JavaScript file with authentication, with password prompted rather than provided on the command-line,
use the following form:
mongo script-file.js -u <user> -p
To print return a query as JSON, from the system prompt using the --eval option, use the following form:
mongo --eval 'db.collection.find().forEach(printjson)'
Use single quotes (e.g. ) to enclose the JavaScript, as well as the additional JavaScript required to generate this
output.
Windows Services
The mongod.exe (page 948) and mongos.exe (page 950) describe the options available for configuring MongoDB
when running as a Windows Service. The mongod.exe (page 948) and mongos.exe (page 950) binaries provide
a superset of the mongod (page 925) and mongos (page 938) options.
mongod.exe
Synopsis mongod.exe (page 948) is the build of the MongoDB daemon (i.e. mongod (page 925)) for the Windows platform. mongod.exe (page 948) has all of the features of mongod (page 925) on Unix-like platforms and is
completely compatible with the other builds of mongod (page 925). In addition, mongod.exe (page 948) provides
several options for interacting with the Windows platform itself.
This document only references options that are unique to mongod.exe (page 948). All mongod (page 925) options are available. See the mongod (page 925) and the Configuration File Options (page 990) documents for more
information regarding mongod.exe (page 948).
To install and use mongod.exe (page 948), read the Install MongoDB on Windows (page 16) document.
Options
mongod.exe
mongod.exe
command line option!install
--install
Installs mongod.exe (page 948) as a Windows Service and exits.
command line option!remove
948
--remove
Removes the mongod.exe (page 948) Windows Service. If mongod.exe (page 948) is running, this operation will stop and then remove the service.
Note: --remove (page 950) requires the --serviceName (page 950) if you configured a non-default
--serviceName (page 950) during the --install (page 950) operation.
command line option!reinstall
--reinstall
Removes mongod.exe (page 948) and reinstalls mongod.exe (page 948) as a Windows Service.
command line option!serviceName <name>
--serviceName <name>
Default: MongoDB
Set the service name of mongod.exe (page 948) when running as a Windows Service. Use this name with the
net start <name> and net stop <name> operations.
You must use --serviceName (page 950) in conjunction with either the --install (page 950) or
--remove (page 950) install option.
command line option!serviceDisplayName <name>
--serviceDisplayName <name>
Default: Mongo DB
Sets the name listed for MongoDB on the Services administrative application.
command line option!serviceDescription <description>
--serviceDescription <description>
Default: MongoDB Server
Sets the mongod.exe (page 948) service description.
You must use --serviceDescription (page 950) in conjunction with the --install (page 950) option.
Note: For descriptions that contain spaces, you must enclose the description in quotes.
command line option!serviceUser <user>
--serviceUser <user>
Runs the mongod.exe (page 948) service in the context of a certain user. This user must have Log on as a
service privileges.
You must use --serviceUser (page 950) in conjunction with the --install (page 950) option.
command line option!servicePassword <password>
--servicePassword <password>
Sets the password for <user> for mongod.exe (page 948) when running with the --serviceUser
(page 950) option.
You must use --servicePassword (page 951) in conjunction with the --install (page 950) option.
mongos.exe
Synopsis mongos.exe (page 950) is the build of the MongoDB Shard (i.e. mongos (page 938)) for the Windows
platform. mongos.exe (page 950) has all of the features of mongos (page 938) on Unix-like platforms and is
949
completely compatible with the other builds of mongos (page 938). In addition, mongos.exe (page 950) provides
several options for interacting with the Windows platform itself.
This document only references options that are unique to mongos.exe (page 950). All mongos (page 938) options are available. See the mongos (page 937) and the Configuration File Options (page 990) documents for more
information regarding mongos.exe (page 950).
To install and use mongos.exe (page 950), read the Install MongoDB on Windows (page 16) document.
Options
mongos.exe
mongos.exe
command line option!install
--install
Installs mongos.exe (page 950) as a Windows Service and exits.
command line option!remove
--remove
Removes the mongos.exe (page 950) Windows Service. If mongos.exe (page 950) is running, this operation will stop and then remove the service.
Note: --remove (page 950) requires the --serviceName (page 950) if you configured a non-default
--serviceName (page 950) during the --install (page 950) operation.
command line option!reinstall
--reinstall
Removes mongos.exe (page 950) and reinstalls mongos.exe (page 950) as a Windows Service.
command line option!serviceName <name>
--serviceName <name>
Default: MongoS
Set the service name of mongos.exe (page 950) when running as a Windows Service. Use this name with the
net start <name> and net stop <name> operations.
You must use --serviceName (page 950) in conjunction with either the --install (page 950) or
--remove (page 950) install option.
command line option!serviceDisplayName <name>
--serviceDisplayName <name>
Default: Mongo DB Router
Sets the name listed for MongoDB on the Services administrative application.
command line option!serviceDescription <description>
--serviceDescription <description>
Default: Mongo DB Sharding Router
Sets the mongos.exe (page 950) service description.
You must use --serviceDescription (page 950) in conjunction with the --install (page 950) option.
Note: For descriptions that contain spaces, you must enclose the description in quotes.
command line option!serviceUser <user>
950
--serviceUser <user>
Runs the mongos.exe (page 950) service in the context of a certain user. This user must have Log on as a
service privileges.
You must use --serviceUser (page 950) in conjunction with the --install (page 950) option.
command line option!servicePassword <password>
--servicePassword <password>
Sets the password for <user> for mongos.exe (page 950) when running with the --serviceUser
(page 950) option.
You must use --servicePassword (page 951) in conjunction with the --install (page 950) option.
Binary Import and Export Tools
mongodump (page 951) provides a method for creating BSON dump files from the mongod (page 925) instances,
while mongorestore (page 956) makes it possible to restore these dumps. bsondump (page 961) converts BSON
dump files into JSON. The mongooplog (page 962) utility provides the ability to stream oplog entries outside of
normal replication.
mongodump
Synopsis mongodump (page 951) is a utility for creating a binary export of the contents of a database. Consider
using this utility as part an effective backup strategy (page 136). Use mongodump (page 951) in conjunction with
mongorestore (page 956) to restore databases.
mongodump (page 951) can read data from either mongod (page 925) or mongos (page 938) instances, in addition
to reading directly from MongoDB data files without an active mongod (page 925).
Important: mongodump (page 951) does not create output for the local database.
Note: The format of data created by mongodump (page 951) tool from the 2.2 distribution or later is different and
incompatible with earlier versions of mongod (page 925).
See also:
mongorestore (page 956), Backup a Sharded Cluster with Database Dumps (page 190) and Backup Strategies for
MongoDB Systems (page 136).
Options
mongodump
mongodump
command line option!help
951
--help
Returns a basic help and usage text.
command line option!verbose, -v
--verbose, -v
Increases the amount of internal reporting returned on the command line. Increase the verbosity with the -v
form by including the option multiple times, (e.g. -vvvvv.)
command line option!version
--version
Returns the version of the mongodump (page 951) utility and exits.
command line option!host <hostname><:port>
--host <hostname><:port>
Specifies a resolvable hostname for the mongod (page 925) that you wish to use to create the database dump.
By default mongodump (page 951) will attempt to connect to a MongoDB process ruining on the localhost port
number 27017.
Optionally, specify a port number to connect a MongoDB instance running on a port other than 27017.
To connect to a replica set, use the --host argument with a setname, followed by a slash and a commaseparated list of host names and port numbers. The mongodump (page 951) utility will, given the seed of at
least one connected set member, connect to the primary member of that set. This option would resemble:
You can always connect directly to a single MongoDB instance by specifying the host and port number directly.
command line option!port <port>
--port <port>
Specifies the port number, if the MongoDB instance is not running on the standard port. (i.e. 27017) You may
also specify a port number using the --host option.
command line option!ipv6
--ipv6
Enables IPv6 support that allows mongodump (page 951) to connect to the MongoDB instance using an IPv6
network. All MongoDB programs and processes, including mongodump (page 951), disable IPv6 support by
default.
command line option!ssl
--ssl
New in version 2.4: MongoDB added support for SSL connections to mongod (page 925) instances in mongodump.
Note: SSL support in mongodump is not compiled into the default distribution of MongoDB. See Connect to
MongoDB with SSL (page 249) for more information on SSL and MongoDB.
Additionally, mongodump does not support connections to mongod (page 925) instances that require client
certificate validation.
Allows mongodump (page 951) to connect to mongod (page 925) instance over an SSL connection.
command line option!username <username>, -u <username>
--username <username>, -u <username>
Specifies a username to authenticate to the MongoDB instance, if your database requires authentication. Use in
conjunction with the --password option to supply a password.
952
953
954
When you run with --forceTableScan (page 973), mongodump (page 951) does not use $snapshot
(page 692). As a result, the dump produced by mongodump (page 951) can reflect the state of the database at
many different points in time.
Tip
Use --forceTableScan (page 973) with extreme caution and consideration.
Behavior When running mongodump (page 951) against a mongos (page 938) instance where the sharded cluster
consists of replica sets, the read preference of the operation will prefer reads from secondary members of the set.
Warning: Changed in version 2.2: When used in combination with fsync (page 751) or db.fsyncLock()
(page 885), mongod (page 925) may block some reads, including those from mongodump (page 951), when
queued write operation waits behind the fsync (page 751) lock.
Required Privileges
read (page 265). 31
read (page 265) 1 and userAdmin (page 267).
readAnyDatabase (page 269),
userAdminAnyDatabase (page 269), and
clusterAdmin (page 267). 33
See User Privilege Roles in MongoDB (page 265) and system.users Privilege Documents (page 270) for more information on user roles.
Usage See the Backup and Restore with MongoDB Tools (page 181) for a larger overview of mongodump (page 951)
usage. Also see the mongorestore (page 956) document for an overview of the mongorestore (page 956), which
provides the related inverse functionality.
The following command creates a dump file that contains only the collection named collection in the database
named test. In this case the database is running on the local interface on port 27017:
mongodump --collection collection --db test
In the next example, mongodump (page 951) creates a backup of the database instance stored in the /srv/mongodb
directory on the local machine. This requires that no mongod (page 925) instance is using the /srv/mongodb
directory.
mongodump --dbpath /srv/mongodb
31 You
955
mongodump --host mongodb1.example.net --port 37017 --username user --password pass --out /opt/backup/
mongorestore
Synopsis The mongorestore (page 956) program writes data from a binary database dump created by
mongodump (page 951) to a MongoDB instance. mongorestore (page 956) can create a new database or add
data to an existing database.
mongorestore (page 956) can write data to either mongod or mongos (page 938) instances, in addition to writing
directly to MongoDB data files without an active mongod (page 925).
If you restore to an existing database, mongorestore (page 956) will only insert into the existing database, and
does not perform updates of any kind. If existing documents have the same value _id field in the target database and
collection, mongorestore (page 956) will not overwrite those documents.
Remember the following properties of mongorestore (page 956) behavior:
mongorestore (page 956) recreates indexes recorded by mongodump (page 951).
all operations are inserts, not updates.
mongorestore (page 956) does not wait for a response from a mongod (page 925) to ensure that the MongoDB process has received or recorded the operation.
The mongod (page 925) will record any errors to its log that occur during a restore operation, but
mongorestore (page 956) will not receive errors.
Note: The format of data created by mongodump (page 951) tool from the 2.2 distribution or later is different and
incompatible with earlier versions of mongod (page 925).
Options
mongorestore
mongorestore
command line option!help
--help
Returns a basic help and usage text.
command line option!verbose, -v
--verbose, -v
Increases the amount of internal reporting returned on the command line. Increase the verbosity with the -v
form by including the option multiple times (e.g. -vvvvv).
command line option!version
--version
Returns the version of the mongorestore (page 956) tool.
command line option!host <hostname><:port>
--host <hostname><:port>
Specifies a resolvable hostname for the mongod (page 925) to which you want to restore the database. By
956
default mongorestore (page 956) will attempt to connect to a MongoDB process running on the localhost
port number 27017. For an example of --host (page 979), see Restore a Database with mongorestore
(page 183).
Optionally, specify a port number to connect a MongoDB instance running on a port other than 27017.
To connect to a replica set, you can specify the replica set seed name, and a seed list of set members, in the
following format:
<replica_set_name>/<hostname1><:port>,<hostname2:<port>,...
957
Specifies the database that holds the users (e.g --username) credentials.
By default, mongorestore (page 956) assumes that the database specified to the --db (page 958) argument
holds the users credentials, unless you specify --authenticationDatabase (page 980).
See userSource (page 271), system.users Privilege Documents (page 270) and User Privilege Roles in MongoDB (page 265) for more information about delegated authentication in MongoDB.
command line option!authenticationMechanism <name>
--authenticationMechanism <name>
New in version 2.4.
Specifies the authentication mechanism. By default, the authentication mechanism is MONGODB-CR, which
is the MongoDB challenge/response authentication mechanism. In MongoDB Enterprise, mongorestore
(page 956) also includes support for GSSAPI to handle Kerberos authentication.
See Deploy MongoDB with Kerberos Authentication (page 259) for more information about Kerberos authentication.
command line option!dbpath <path>
--dbpath <path>
Specifies the directory of the MongoDB data files. If used, the --dbpath (page 958) option enables
mongorestore (page 956) to attach directly to local data files and insert the data without the mongod
(page 925). To run with --dbpath (page 958), mongorestore (page 956) needs to lock access to the
data directory: as a result, no mongod (page 925) can access the same path while the process runs. For an
example of --dbpath (page 958), see Restore Without a Running mongod (page 183).
command line option!directoryperdb
--directoryperdb
Use the --directoryperdb (page 958) in conjunction with the corresponding option to mongod
(page 925), which allows mongorestore (page 956) to import data into MongoDB instances that have every databases files saved in discrete directories on the disk. This option is only relevant when specifying the
--dbpath (page 958) option.
command line option!journal
--journal
Allows mongorestore (page 956) write to the durability journal to ensure that the data files will remain in a
consistent state during the write process. This option is only relevant when specifying the --dbpath (page 958)
option. For an example of --journal (page 958), see Restore Without a Running mongod (page 183).
command line option!db <db>, -d <db>
--db <db>, -d <db>
Use the --db (page 958) option to specify a database for mongorestore (page 956) to restore data into.
If the database doesnt exist, mongorestore (page 956) will create the specified database. If you do not
specify a <db>, mongorestore (page 956) creates new databases that correspond to the databases where
data originated and data may be overwritten. Use this option to restore data into a MongoDB instance that
already has data.
--db (page 958) does not control which BSON files mongorestore (page 956) restores. You must use the
mongorestore (page 956) path option (page 960) to limit that restored data.
command line option!collection <collection>, -c <collection>
--collection <collection>, -c <collection>
Use the --collection (page 958) option to specify a collection for mongorestore (page 956) to restore.
If you do not specify a <collection>, mongorestore (page 956) imports all collections created. Existing
958
data may be overwritten. Use this option to restore data into a MongoDB instance that already has data, or to
restore only some data in the specified imported data set.
command line option!objcheck
--objcheck
Forces the mongorestore (page 956) to validate all requests from clients upon receipt to ensure that clients
never insert invalid documents into the database. For objects with a high degree of sub-document nesting,
--objcheck (page 983) can have a small impact on performance. You can set --noobjcheck (page 959)
to disable object checking at run-time.
Changed in version 2.4: MongoDB enables --objcheck (page 983) by default, to prevent any client from
inserting malformed or invalid BSON into a MongoDB database.
command line option!noobjcheck
--noobjcheck
New in version 2.4.
Disables the default document validation that MongoDB performs on all incoming BSON documents.
command line option!filter <JSON>
--filter <JSON>
Limits the documents that mongorestore (page 956) imports to only those documents that match the JSON
document specified as <JSON>. Be sure to include the document in single quotes to avoid interaction with
your systems shell environment. For an example of --filter (page 959), see Restore a Subset of data from
a Binary Database Dump (page 183).
command line option!drop
--drop
Modifies the restoration procedure to drop every collection from the target database before restoring the collection from the dumped backup.
command line option!oplogReplay
--oplogReplay
Replays the oplog after restoring the dump to ensure that the current state of the database reflects the point-intime backup captured with the mongodump --oplog command. For an example of --oplogReplay
(page 959), see Restore Point in Time Oplog Backup (page 183).
command line option!keepIndexVersion
--keepIndexVersion
Prevents mongorestore (page 956) from upgrading the index to the latest version during the restoration
process.
command line option!w <number of replicas per write>
--w <number of replicas per write>
New in version 2.2.
Specifies the write concern for each write operation that mongorestore (page 956) writes to the target
database. By default, mongorestore (page 956) does not wait for a response for write acknowledgment
(page 55).
command line option!noOptionsRestore
--noOptionsRestore
New in version 2.2.
Prevents mongorestore (page 956) from setting the collection options, such as those specified by the
collMod (page 755) database command, on restored collections.
959
Here, mongorestore (page 956) reads the database dump in the dump/ sub-directory of the current directory, and
restores only the documents in the collection named people from the database named accounts. mongorestore
(page 956) restores data to the instance running on the localhost interface on port 27017.
In the next example, mongorestore (page 956) restores a backup of the database instance located in dump to a
database instance stored in the /srv/mongodb on the local machine. This requires that there are no active mongod
(page 925) instances attached to /srv/mongodb data directory.
mongorestore --dbpath /srv/mongodb
mongorestore --host mongodb1.example.net --port 37017 --username user --password pass /opt/backup/mon
bsondump
Synopsis The bsondump (page 961) converts BSON files into human-readable formats, including JSON. For example, bsondump (page 961) is useful for reading the output files generated by mongodump (page 951).
Important: bsondump (page 961) is a diagnostic tool for inspecting BSON files, not a tool for data ingestion or
other application use.
960
Options
bsondump
bsondump
command line option!help
--help
Returns a basic help and usage text.
command line option!verbose, -v
--verbose, -v
Increases the amount of internal reporting returned on the command line. Increase the verbosity with the -v
form by including the option multiple times, (e.g. -vvvvv.)
command line option!version
--version
Returns the version of the bsondump (page 961) utility.
command line option!objcheck
--objcheck
Validates each BSON object before outputting it in JSON format. By default, bsondump (page 961) enables
--objcheck (page 983). For objects with a high degree of sub-document nesting, --objcheck (page 983)
can have a small impact on performance. You can set --noobjcheck (page 959) to disable object checking.
Changed in version 2.4: MongoDB enables --objcheck (page 983) by default, to prevent any client from
inserting malformed or invalid BSON into a MongoDB database.
command line option!noobjcheck
--noobjcheck
New in version 2.4.
Disables the default document validation that bsondump (page 961) performs on all BSON documents.
command line option!filter <JSON>
--filter <JSON>
Limits the documents that bsondump (page 961) exports to only those documents that match the JSON document specified as <JSON>. Be sure to include the document in single quotes to avoid interaction with your
systems shell environment.
command line option!type <=json|=debug>
--type <=json|=debug>
Changes the operation of bsondump (page 961) from outputting JSON (the default) to a debugging format.
<bsonfilename>
The final argument to bsondump (page 961) is a document containing BSON. This data is typically generated
by mongodump (page 951) or by MongoDB in a rollback operation.
Usage By default, bsondump (page 961) outputs data to standard output. To create corresponding JSON files, you
will need to use the shell redirect. See the following command:
bsondump collection.bson > collection.json
Use the following command (at the system shell) to produce debugging output for a BSON file:
bsondump --type=debug collection.bson
961
mongooplog
This command copies oplog entries from the mongod (page 925) instance running on the host
mongodb0.example.net and duplicates operations to the host mongodb1.example.net. If you do
not need to keep the --from host running during the migration, consider using mongodump (page 951) and
mongorestore (page 956) or another backup (page 136) operation, which may be better suited to your operation.
Note: If the mongod (page 925) instance specified by the --from argument is running with authentication
(page 993), then mongooplog (page 962) will not be able to copy oplog entries.
See also:
mongodump (page 951), mongorestore (page 956), Backup Strategies for MongoDB Systems (page 136), Replica
Set Oplog (page 410).
Options
mongooplog
mongooplog
command line option!help
--help
Returns a basic help and usage text.
command line option!verbose, -v
--verbose, -v
Increases the amount of internal reporting returned on the command line. Increase the verbosity with the -v
form by including the option multiple times, (e.g. -vvvvv.)
command line option!version
--version
Returns the version of the mongooplog (page 962) utility.
command line option!host <hostname><:port>, -h
--host <hostname><:port>, -h
Specifies a resolvable hostname for the mongod (page 925) instance to which mongooplog (page 962) will
apply oplog operations retrieved from the serve specified by the --from option.
mongooplog (page 962) assumes that all target mongod (page 925) instances are accessible by way of port
27017. You may, optionally, declare an alternate port number as part of the hostname argument.
You can always connect directly to a single mongod (page 925) instance by specifying the host and port number
directly.
To connect to a replica set, you can specify the replica set seed name, and a seed list of set members, in the
following format:
962
<replica_set_name>/<hostname1><:port>,<hostname2:<port>,...
963
Specifies the authentication mechanism. By default, the authentication mechanism is MONGODB-CR, which
is the MongoDB challenge/response authentication mechanism. In MongoDB Enterprise, mongooplog
(page 962) also includes support for GSSAPI to handle Kerberos authentication.
See Deploy MongoDB with Kerberos Authentication (page 259) for more information about Kerberos authentication.
command line option!dbpath <path>
--dbpath <path>
Specifies a directory, containing MongoDB data files, to which mongooplog (page 962) will apply operations
from the oplog of the database specified with the --from option. When used, the --dbpath (page 958)
option enables mongo (page 942) to attach directly to local data files and write data without a running mongod
(page 925) instance. To run with --dbpath (page 958), mongooplog (page 962) needs to restrict access to
the data directory: as a result, no mongod (page 925) can be access the same path while the process runs.
command line option!directoryperdb
--directoryperdb
Use the --directoryperdb (page 958) in conjunction with the corresponding option to mongod
(page 925). This option allows mongooplog (page 962) to write to data files organized with each database
located in a distinct directory. This option is only relevant when specifying the --dbpath (page 958) option.
command line option!journal
--journal
Allows mongooplog (page 962) operations to use the durability journal to ensure that the data files will remain
in a consistent state during the writing process. This option is only relevant when specifying the --dbpath
(page 958) option.
command line option!seconds <number>, -s <number>
--seconds <number>, -s <number>
Specify a number of seconds of operations for mongooplog (page 962) to pull from the remote host.
Unless specified the default value is 86400 seconds, or 24 hours.
command line option!from <host[:port]>
--from <host[:port]>
Specify the host for mongooplog (page 962) to retrieve oplog operations from. mongooplog (page 962)
requires this option.
Unless you specify the --host option, mongooplog (page 962) will apply the operations collected with this
option to the oplog of the mongod (page 925) instance running on the localhost interface connected to port
27017.
command line option!oplogns <namespace>
--oplogns <namespace>
Specify a namespace in the --from host where the oplog resides. The default value is local.oplog.rs,
which is the where replica set members store their operation log. However, if youve copied oplog entries into
another database or collection, use this option to copy oplog entries stored in another location.
Namespaces take the form of [database].[collection].
Usage Consider the following prototype mongooplog (page 962) command:
mongooplog
Here, entries from the oplog of the mongod (page 925) running on port 27017. This only pull entries from the last
24 hours.
964
Use the --seconds argument to capture a greater or smaller amount of time. Consider the following example:
mongooplog
In this operation, mongooplog (page 962) captures 2 full days of operations. To migrate 12 hours of oplog entries,
use the following form:
mongooplog
For the previous two examples, mongooplog (page 962) migrates entries to the mongod (page 925) process running
on the localhost interface connected to the 27017 port. mongooplog (page 962) can also operate directly on
MongoDBs data files if no mongod (page 925) is running on the target host. Consider the following example:
mongooplog
Here, mongooplog (page 962) imports oplog operations from the mongod (page 925) host connected to port
27017. This migrates operations to the MongoDB data files stored in the /srv/mongodb directory. Additionally mongooplog (page 962) will use the durability journal to ensure that the data files remain in a consistent state.
Data Import and Export Tools
mongoimport (page 965) provides a method for taking data in JSON, CSV, or TSV and importing it into a mongod
(page 925) instance. mongoexport (page 969) provides a method to export data from a mongod (page 925) instance
into JSON, CSV, or TSV.
Note: The conversion between BSON and other formats lacks full type fidelity. Therefore you cannot use
mongoimport (page 965) and mongoexport (page 969) for round-trip import and export operations.
mongoimport
Synopsis The mongoimport (page 965) tool provides a route to import content from a JSON, CSV, or TSV export
created by mongoexport (page 969), or potentially, another third-party export tool. See the Import and Export
MongoDB Data (page 149) document for a more in depth usage overview, and the mongoexport (page 969) document
for more information regarding mongoexport (page 969), which provides the inverse importing capability.
Note: Do not use mongoimport (page 965) and mongoexport (page 969) for full instance, production backups
because they will not reliably capture data type information. Use mongodump (page 951) and mongorestore
(page 956) as described in Backup Strategies for MongoDB Systems (page 136) for this kind of functionality.
Options
mongoimport
mongoimport
command line option!help
--help
Returns a basic help and usage text.
command line option!verbose, -v
--verbose, -v
Increases the amount of internal reporting returned on the command line. Increase the verbosity with the -v
form by including the option multiple times, (e.g. -vvvvv.)
command line option!version
11.2. Architecture and Components
965
--version
Returns the version of the mongoimport (page 965) program.
command line option!host <hostname><:port>, -h
--host <hostname><:port>, -h
Specifies a resolvable hostname for the mongod (page 925) to which you want to restore the database. By
default mongoimport (page 965) will attempt to connect to a MongoDB process ruining on the localhost port
numbered 27017.
Optionally, specify a port number to connect a MongoDB instance running on a port other than 27017.
To connect to a replica set, use the --host (page 979) argument with a setname, followed by a slash and a
comma-separated list of host and port names. mongoimport (page 965) will, given the seed of at least one
connected set member, connect to the primary of that set. This option would resemble:
--host repl0/mongo0.example.net,mongo0.example.net:27018,mongo1.example.net,mongo2.example.net
You can always connect directly to a single MongoDB instance by specifying the host and port number directly.
command line option!port <port>
--port <port>
Specifies the port number, if the MongoDB instance is not running on the standard port. (i.e. 27017) You may
also specify a port number using the mongoimport --host command.
command line option!ipv6
--ipv6
Enables IPv6 support that allows mongoimport (page 965) to connect to the MongoDB instance using an IPv6
network. All MongoDB programs and processes, including mongoimport (page 965), disable IPv6 support
by default.
command line option!ssl
--ssl
New in version 2.4: MongoDB added support for SSL connections to mongod (page 925) instances in mongoimport.
Note: SSL support in mongoimport is not compiled into the default distribution of MongoDB. See Connect to
MongoDB with SSL (page 249) for more information on SSL and MongoDB.
Additionally, mongoimport does not support connections to mongod (page 925) instances that require client
certificate validation.
Allows mongoimport (page 965) to connect to mongod (page 925) instance over an SSL connection.
command line option!username <username>, -u <username>
--username <username>, -u <username>
Specifies a username to authenticate to the MongoDB instance, if your database requires authentication. Use in
conjunction with the mongoimport --password option to supply a password.
command line option!password <password>, -p <password>
--password <password>, -p <password>
Specifies a password to authenticate to the MongoDB instance. Use in conjunction with the mongoimport
--username option to supply a username.
If you specify a --username and do not pass an argument to --password (page 980), mongoimport
(page 965) will prompt for a password interactively. If you do not specify a password on the command line,
--password (page 980) must be the last option.
966
967
968
--jsonArray
Changed in version 2.2: The limit on document size increased from 4MB to 16MB.
Accept import of data expressed with multiple MongoDB documents within a single JSON array.
Use in conjunction with mongoexport --jsonArray to import data written as a single JSON array. Limited to imports of 16 MB or smaller.
Usage In this example, mongoimport (page 965) imports the csv formatted data in the
http://docs.mongodb.org/manualopt/backups/contacts.csv into the collection contacts
in the users database on the MongoDB instance running on the localhost port numbered 27017.
mongoimport --db users --collection contacts --type csv --file /opt/backups/contacts.csv
In the following example, mongoimport (page 965) imports the data in the JSON formatted file contacts.json
into the collection contacts on the MongoDB instance running on the localhost port number 27017. Journaling is
explicitly enabled.
mongoimport --collection contacts --file contacts.json
In the next example, mongoimport (page 965) takes data passed to it on standard input (i.e. with a | pipe.) and imports it into the collection contacts in the sales database is the MongoDB datafiles located at /srv/mongodb/.
if the import process encounters an error, the mongoimport (page 965) will halt because of the --stopOnError
option.
mongoimport --db sales --collection contacts --stopOnError --dbpath /srv/mongodb/
In
the
final
example,
mongoimport
(page
965)
imports
data
from
the
file
http://docs.mongodb.org/manualopt/backups/mdb1-examplenet.json into the collection
contacts within the database marketing on a remote MongoDB database. This mongoimport (page 965)
accesses the mongod (page 925) instance running on the host mongodb1.example.net over port 37017, which
requires the username user and the password pass.
mongoimport --host mongodb1.example.net --port 37017 --username user --password pass --collection con
mongoexport
Synopsis mongoexport (page 969) is a utility that produces a JSON or CSV export of data stored in a MongoDB
instance. See the Import and Export MongoDB Data (page 149) document for a more in depth usage overview, and
the mongoimport (page 965) document for more information regarding the mongoimport (page 965) utility, which
provides the inverse importing capability.
Note: Do not use mongoimport (page 965) and mongoexport (page 969) for full-scale backups because they
may not reliably capture data type information. Use mongodump (page 951) and mongorestore (page 956) as
described in Backup Strategies for MongoDB Systems (page 136) for this kind of functionality.
Options
mongoexport
mongoexport
command line option!help
--help
Returns a basic help and usage text.
command line option!verbose, -v
11.2. Architecture and Components
969
--verbose, -v
Increases the amount of internal reporting returned on the command line. Increase the verbosity with the -v
form by including the option multiple times, (e.g. -vvvvv.)
command line option!version
--version
Returns the version of the mongoexport (page 969) utility.
command line option!host <hostname><:port>
--host <hostname><:port>
Specifies a resolvable hostname for the mongod (page 925) from which you want to export data. By default
mongoexport (page 969) attempts to connect to a MongoDB process ruining on the localhost port number
27017.
Optionally, specify a port number to connect a MongoDB instance running on a port other than 27017.
To connect to a replica set, you can specify the replica set seed name, and a seed list of set members, in the
following format:
<replica_set_name>/<hostname1><:port>,<hostname2:<port>,...
970
If you specify a --username and do not pass an argument to --password (page 980), mongoexport
(page 969) will prompt for a password interactively. If you do not specify a password on the command line,
--password (page 980) must be the last argument specified.
command line option!authenticationDatabase <dbname>
--authenticationDatabase <dbname>
New in version 2.4.
Specifies the database that holds the users (e.g --username) credentials.
By default, mongoexport (page 969) assumes that the database specified to the --db (page 958) argument
holds the users credentials, unless you specify --authenticationDatabase (page 980).
See userSource (page 271), system.users Privilege Documents (page 270) and User Privilege Roles in MongoDB (page 265) for more information about delegated authentication in MongoDB.
command line option!authenticationMechanism <name>
--authenticationMechanism <name>
New in version 2.4.
Specifies the authentication mechanism. By default, the authentication mechanism is MONGODB-CR, which
is the MongoDB challenge/response authentication mechanism. In MongoDB Enterprise, mongoexport
(page 969) also includes support for GSSAPI to handle Kerberos authentication.
See Deploy MongoDB with Kerberos Authentication (page 259) for more information about Kerberos authentication.
command line option!dbpath <path>
--dbpath <path>
Specifies the directory of the MongoDB data files. If used, the --dbpath option enables mongoexport
(page 969) to attach directly to local data files and insert the data without the mongod (page 925). To run with
--dbpath, mongoexport (page 969) needs to lock access to the data directory: as a result, no mongod
(page 925) can access the same path while the process runs.
command line option!directoryperdb
--directoryperdb
Use the --directoryperdb (page 958) in conjunction with the corresponding option to mongod
(page 925), which allows mongoexport (page 969) to export data from MongoDB instances that have every databases files saved in discrete directories on the disk. This option is only relevant when specifying the
--dbpath (page 958) option.
command line option!journal
--journal
Allows mongoexport (page 969) operations to access the durability journal to ensure that the export is in a
consistent state. This option is only relevant when specifying the --dbpath (page 958) option.
command line option!db <db>, -d <db>
--db <db>, -d <db>
Use the --db (page 958) option to specify the name of the database that contains the collection you want to
export.
command line option!collection <collection>, -c <collection>
--collection <collection>, -c <collection>
Use the --collection (page 958) option to specify the collection that you want mongoexport (page 969)
to export.
command line option!fields <field1[,field2]>, -f <field1[,field2]>
971
"_id"
"_id"
"_id"
"_id"
"_id"
:
:
:
:
:
ObjectId("51f0188846a64a1ed98fde7c"),
ObjectId("520e61b0c6646578e3661b59"),
ObjectId("520e642bb7fa4ea22d6b1871"),
ObjectId("520e6431b7fa4ea22d6b1872"),
ObjectId("520e6445b7fa4ea22d6b1873"),
"a"
"a"
"a"
"a"
"a"
:
:
:
:
:
1 }
1, "b"
2, "b"
3, "b"
5, "b"
:
:
:
:
2 }
3, "c" : 5 }
3, "c" : 6 }
6, "c" : 8 }
The following mongoexport (page 969) uses the -q option to export only the documents with the field a
greater than or equal to ($gte (page 622)) to 3:
mongoexport -d test -c records -q "{ a: { \$gte: 3 } } }" --out exportdir/myRecords.json
972
--slaveOk, -k
Allows mongoexport (page 969) to read data from secondary or slave nodes when using mongoexport
(page 969) with a replica set. This option is only available if connected to a mongod (page 925) or mongos
(page 938) and is not available when used with the mongoexport --dbpath option.
This is the default behavior.
command line option!out <file>, -o <file>
--out <file>, -o <file>
Specify a file to write the export to. If you do not specify a file name, the mongoexport (page 969) writes
data to standard output (e.g. stdout).
command line option!forceTableScan
--forceTableScan
New in version 2.2.
Forces mongoexport (page 969) to scan the data store directly: typically, mongoexport (page 969) saves
entries as they appear in the index of the _id field. Use --forceTableScan (page 973) to skip the index
and scan the data directly. Typically there are two cases where this behavior is preferable to the default:
1.If you have key sizes over 800 bytes that would not be present in the _id index.
2.Your database uses a custom _id field.
When you run with --forceTableScan (page 973), mongoexport (page 969) does not use $snapshot
(page 692). As a result, the export produced by mongoexport (page 969) can reflect the state of the database
at many different points in time.
Warning: Use --forceTableScan (page 973) with extreme caution and consideration.
Usage In the following example, mongoexport (page 969) exports the collection contacts
from the users database from the mongod (page 925) instance running on the localhost port
number 27017.
This command writes the export data in CSV format into a file located at
http://docs.mongodb.org/manualopt/backups/contacts.csv.
The fields.txt file contains a line-separated list of fields to export.
mongoexport --db users --collection contacts --csv --fieldFile fields.txt --out /opt/backups/contacts
The next example creates an export of the collection contacts from the MongoDB instance running on the localhost
port number 27017, with journaling explicitly enabled. This writes the export to the contacts.json file in JSON
format.
mongoexport --db sales --collection contacts --out contacts.json --journal
The following example exports the collection contacts from the sales database located in the MongoDB data
files located at /srv/mongodb/. This operation writes the export to standard output in JSON format.
mongoexport --db sales --collection contacts --dbpath /srv/mongodb/
Warning: The above example will only succeed if there is no mongod (page 925) connected to the data files
located in the /srv/mongodb/ directory.
The final example exports the collection contacts from the database marketing . This data resides on the
MongoDB instance located on the host mongodb1.example.net running on port 37017, which requires the
username user and the password pass.
973
mongoexport --host mongodb1.example.net --port 37017 --username user --password pass --collection con
Diagnostic Tools
mongostat (page 974), mongotop (page 979), and mongosniff (page 982) provide diagnostic information
related to the current operation of a mongod (page 925) instance.
Note: Because mongosniff (page 982) depends on libpcap, most distributions of MongoDB do not include
mongosniff (page 982).
mongostat
Synopsis The mongostat (page 974) utility provides a quick overview of the status of a currently running mongod
(page 925) or mongos (page 938) instance. mongostat (page 974) is functionally similar to the UNIX/Linux file
system utility vmstat, but provides data regarding mongod (page 925) and mongos (page 938) instances.
See also:
For more information about monitoring MongoDB, see Monitoring for MongoDB (page 138).
For more background on various other MongoDB status outputs see:
serverStatus (page 782)
replSetGetStatus (page 726)
dbStats (page 767)
collStats (page 763)
For an additional utility that provides MongoDB metrics see mongotop (page 979).
mongostat (page 974) connects to the mongod (page 925) instance running on the local host interface on TCP port
27017; however, mongostat (page 974) can connect to any accessible remote mongod (page 925) instance.
Options
mongostat
mongostat
command line option!help
--help
Returns a basic help and usage text.
command line option!verbose, -v
--verbose, -v
Increases the amount of internal reporting returned on the command line. Increase the verbosity with the -v
form by including the option multiple times, (e.g. -vvvvv.)
command line option!version
--version
Returns the version of the mongostat (page 974) utility.
command line option!host <hostname><:port>
974
--host <hostname><:port>
Specifies a resolvable hostname for the mongod (page 925) from which you want to export data. By default
mongostat (page 974) attempts to connect to a MongoDB instance running on the localhost port number
27017.
Optionally, specify a port number to connect a MongoDB instance running on a port other than 27017.
To connect to a replica set, you can specify the replica set seed name, and a seed list of set members, in the
following format:
<replica_set_name>/<hostname1><:port>,<hostname2:<port>,...
975
--authenticationDatabase <dbname>
New in version 2.4.
Specifies the database that holds the users (e.g --username) credentials.
By default, mongostat (page 974) assumes that the database specified to the --db (page 958) argument holds
the users credentials, unless you specify --authenticationDatabase (page 980).
See userSource (page 271), system.users Privilege Documents (page 270) and User Privilege Roles in MongoDB (page 265) for more information about delegated authentication in MongoDB.
command line option!authenticationMechanism <name>
--authenticationMechanism <name>
New in version 2.4.
Specifies the authentication mechanism. By default, the authentication mechanism is MONGODB-CR, which
is the MongoDB challenge/response authentication mechanism. In MongoDB Enterprise, mongostat
(page 974) also includes support for GSSAPI to handle Kerberos authentication.
See Deploy MongoDB with Kerberos Authentication (page 259) for more information about Kerberos authentication.
command line option!noheaders
--noheaders
Disables the output of column or field names.
command line option!rowcount <number>, -n <number>
--rowcount <number>, -n <number>
Controls the number of rows to output. Use in conjunction with the sleeptime argument to control the
duration of a mongostat (page 974) operation.
Unless --rowcount (page 976) is specified, mongostat (page 974) will return an infinite number of rows
(e.g. value of 0.)
command line option!http
--http
Configures mongostat (page 974) to collect data using the HTTP interface rather than a raw database connection.
command line option!discover
--discover
With this option mongostat (page 974) discovers and reports on statistics from all members of a replica set
or sharded cluster. When connected to any member of a replica set, --discover (page 976) all non-hidden
members of the replica set. When connected to a mongos (page 938), mongostat (page 974) will return data
from all shards in the cluster. If a replica set provides a shard in the sharded cluster, mongostat (page 974)
will report on non-hidden members of that replica set.
The mongostat --host option is not required but potentially useful in this case.
command line option!all
--all
Configures mongostat (page 974) to return all optional fields (page 977).
<sleeptime>
The final argument is the length of time, in seconds, that mongostat (page 974) waits in between calls. By
default mongostat (page 974) returns one call every second.
976
mongostat (page 974) returns values that reflect the operations over a 1 second period. For values of
<sleeptime> greater than 1, mongostat (page 974) averages data to reflect average operations per second.
Fields mongostat (page 974) returns values that reflect the operations over a 1 second period. When mongostat
<sleeptime> has a value greater than 1, mongostat (page 974) averages the statistics to reflect average operations
per second.
mongostat (page 974) outputs the following fields:
inserts
The number of objects inserted into the database per second. If followed by an asterisk (e.g. *), the datum refers
to a replicated operation.
query
The number of query operations per second.
update
The number of update operations per second.
delete
The number of delete operations per second.
getmore
The number of get more (i.e. cursor batch) operations per second.
command
The number of commands per second. On slave and secondary systems, mongostat (page 974) presents two
values separated by a pipe character (e.g. |), in the form of local|replicated commands.
flushes
The number of fsync operations per second.
mapped
The total amount of data mapped in megabytes. This is the total data size at the time of the last mongostat
(page 974) call.
size
The amount of (virtual) memory in megabytes used by the process at the time of the last mongostat (page 974)
call.
res
The amount of (resident) memory in megabytes used by the process at the time of the last mongostat
(page 974) call.
faults
Changed in version 2.1.
The number of page faults per second.
Before version 2.1 this value was only provided for MongoDB instances running on Linux hosts.
locked
The percent of time in a global write lock.
Changed in version 2.2: The locked db field replaces the locked % field to more appropriate data regarding
the database specific locks in version 2.2.
locked db
New in version 2.2.
977
The percent of time in the per-database context-specific lock. mongostat (page 974) will report the database
that has spent the most time since the last mongostat (page 974) call with a write lock.
This value represents the amount of time that the listed database spent in a locked state combined with the time
that the mongod (page 925) spent in the global lock. Because of this, and the sampling method, you may see
some values greater than 100%.
idx miss
The percent of index access attempts that required a page fault to load a btree node. This is a sampled value.
qr
The length of the queue of clients waiting to read data from the MongoDB instance.
qw
The length of the queue of clients waiting to write data from the MongoDB instance.
ar
The number of active clients performing read operations.
aw
The number of active clients performing write operations.
netIn
The amount of network traffic, in bytes, received by the MongoDB instance.
This includes traffic from mongostat (page 974) itself.
netOut
The amount of network traffic, in bytes, sent by the MongoDB instance.
This includes traffic from mongostat (page 974) itself.
conn
The total number of open connections.
set
The name, if applicable, of the replica set.
repl
The replication status of the member.
Value
M
SEC
REC
UNK
SLV
Replication Type
master
secondary
recovering
unknown
slave
Usage In the first example, mongostat (page 974) will return data every second for 20 seconds. mongostat
(page 974) collects data from the mongod (page 925) instance running on the localhost interface on port 27017. All
of the following invocations produce identical behavior:
mongostat
mongostat
mongostat
mongostat
--rowcount 20 1
--rowcount 20
-n 20 1
-n 20
In the next example, mongostat (page 974) returns data every 5 minutes (or 300 seconds) for as long as the program
runs. mongostat (page 974) collects data from the mongod (page 925) instance running on the localhost interface
on port 27017. Both of the following invocations produce identical behavior.
978
In the following example, mongostat (page 974) returns data every 5 minutes for an hour (12 times.) mongostat
(page 974) collects data from the mongod (page 925) instance running on the localhost interface on port 27017. Both
of the following invocations produce identical behavior.
mongostat --rowcount 12 300
mongostat -n 12 300
In many cases, using the --discover will help provide a more complete snapshot of the state of an entire group
of machines. If a mongos (page 938) process connected to a sharded cluster is running on port 27017 of the local
machine, you can use the following form to return statistics from all members of the cluster:
mongostat --discover
mongotop
Synopsis mongotop (page 979) provides a method to track the amount of time a MongoDB instance spends reading and writing data. mongotop (page 979) provides statistics on a per-collection level. By default, mongotop
(page 979) returns values every second.
See also:
For more information about monitoring MongoDB, see Monitoring for MongoDB (page 138).
For additional background on various other MongoDB status outputs see:
serverStatus (page 782)
replSetGetStatus (page 726)
dbStats (page 767)
collStats (page 763)
For an additional utility that provides MongoDB metrics see mongostat (page 974).
Options
mongotop
mongotop
command line option!help
--help
Returns a basic help and usage text.
command line option!verbose, -v
--verbose, -v
Increases the amount of internal reporting returned on the command line. Increase the verbosity with the -v
form by including the option multiple times, (e.g. -vvvvv.)
command line option!version
--version
Print the version of the mongotop (page 979) utility and exit.
command line option!host <hostname><:port>
979
--host <hostname><:port>
Specifies a resolvable hostname for the mongod from which you want to export data. By default mongotop
(page 979) attempts to connect to a MongoDB process running on the localhost port number 27017.
Optionally, specify a port number to connect a MongoDB instance running on a port other than 27017.
To connect to a replica set, you can specify the replica set seed name, and a seed list of set members, in the
following format:
<replica_set_name>/<hostname1><:port>,<hostname2:<port>,...
980
total
0ms
0ms
read
0ms
0ms
write
0ms
0ms
2012-08-13T15:45:40
981
local.system.indexes
admin.system.indexes
admin.
0ms
0ms
0ms
0ms
0ms
0ms
0ms
0ms
0ms
ns
test.system.namespaces
local.system.replset
local.system.indexes
admin.system.indexes
admin.
total
0ms
0ms
0ms
0ms
0ms
read
0ms
0ms
0ms
0ms
0ms
write
0ms
0ms
0ms
0ms
0ms
2012-08-13T15:45:55
To return a mongotop (page 979) report every 5 minutes, use the following command:
mongotop 300
To report the use of per-database locks, use mongotop --locks, which produces the following output:
$ mongotop --locks
connected to: 127.0.0.1
db
local
admin
.
total
0ms
0ms
0ms
read
0ms
0ms
0ms
write
0ms
0ms
0ms
2012-08-13T16:33:34
mongosniff
Synopsis mongosniff (page 982) provides a low-level operation tracing/sniffing view into database activity in
real time. Think of mongosniff (page 982) as a MongoDB-specific analogue of tcpdump for TCP/IP network
traffic. Typically, mongosniff (page 982) is most frequently used in driver development.
Note: mongosniff (page 982) requires libpcap and is only available for Unix-like systems. Furthermore, the
version distributed with the MongoDB binaries is dynamically linked against aversion 0.9 of libpcap. If your
system has a different version of libpcap, you will need to compile mongosniff (page 982) yourself or create a
symbolic link pointing to libpcap.so.0.9 to your local version of libpcap. Use an operation that resembles
the following:
ln -s /usr/lib/libpcap.so.1.1.1 /usr/lib/libpcap.so.0.9
982
--forward <host><:port>
Declares a host to forward all parsed requests that the mongosniff (page 982) intercepts to another mongod
(page 925) instance and issue those operations on that database instance.
Specify the target host name and port in the <host><:port> format.
To connect to a replica set, you can specify the replica set seed name, and a seed list of set members, in the
following format:
<replica_set_name>/<hostname1><:port>,<hostname2:<port>,...
Use the following command to only log invalid BSON objects for the mongod (page 925) or mongos (page 938)
running on the localhost interface and port 27018, for driver development and troubleshooting:
mongosniff --objcheck --source NET lo 27018
Build mongosniff To build mongosniff yourself, Linux users can use the following procedure:
1. Obtain prerequisites using your operating systems package management software. Dependencies include:
libpcap - to capture network packets.
git - to download the MongoDB source code.
scons and a C++ compiler - to build mongosniff (page 982).
2. Download a copy of the MongoDB source code using git:
git clone git://github.com/mongodb/mongo.git
3. Issue the following sequence of commands to change to the mongo/ directory and build mongosniff
(page 982):
cd mongo
scons mongosniff
983
Note: If you run scons mongosniff before installing libpcap you must run scons clean before you can
build mongosniff (page 982).
mongoperf
Synopsis mongoperf (page 984) is a utility to check disk I/O performance independently of MongoDB.
It times tests of random disk I/O and presents the results. You can use mongoperf (page 984) for any case apart
from MongoDB. The mmf (page 985) true mode is completely generic. In that mode it is somewhat analogous to
tools such as bonnie++34 (albeit mongoperf is simpler).
Specify options to mongoperf (page 984) using a JavaScript document.
See also:
bonnie35
bonnie++36
Output from an example run37
Checking Disk Performance with the mongoperf Utility38
Options
mongoperf
mongoperf
command line option!help
--help
Displays the options to mongoperf (page 984). Specify options to mongoperf (page 984) with a JSON
document described in the Configuration Fields (page 985) section.
<jsonconfig>
mongoperf (page 984) accepts configuration options in the form of a file that holds a JSON document. You
must stream the content of this file into mongoperf (page 984), as in the following operation:
mongoperf < config
In this example config is the name of a file that holds a JSON document that resembles the following example:
{
nThreads:<n>,
fileSizeMB:<n>,
sleepMicros:<n>,
mmf:<bool>,
r:<bool>,
w:<bool>,
recSizeKB:<n>,
syncDelay:<n>
}
See the Configuration Fields (page 985) section for documentation of each of these fields.
34 http://sourceforge.net/projects/bonnie/
35 http://www.textuality.com/bonnie/
36 http://sourceforge.net/projects/bonnie/
37 https://gist.github.com/1694664
38 http://blog.mongodb.org/post/40769806981/checking-disk-performance-with-the-mongoperf-utility
984
Configuration Fields
mongoperf.nThreads
Type: Integer.
Default: 1
Defines the number of threads mongoperf (page 984) will use in the test. To saturate your systems storage
system you will need multiple threads. Consider setting nThreads (page 985) to 16.
mongoperf.fileSizeMB
Type: Integer.
Default: 1 megabyte (i.e. 10242 bytes)
Test file size.
mongoperf.sleepMicros
Type: Integer.
Default: 0
mongoperf (page 984) will pause for the number of specified sleepMicros (page 985) divided by the
nThreads (page 985) between each operation.
mongoperf.mmf
Type: Boolean.
Default: false
Set mmf (page 985) to true to use memory mapped files for the tests.
Generally:
when mmf (page 985) is false, mongoperf (page 984) tests direct, physical, I/O, without caching. Use
a large file size to test heavy random I/O load and to avoid I/O coalescing.
when mmf (page 985) is true, mongoperf (page 984) runs tests of the caching system, and can use
normal file system cache. Use mmf (page 985) in this mode to test file system cache behavior with memory
mapped files.
mongoperf.r
Type: Boolean.
Default: false
Set r (page 985) to true to perform reads as part of the tests.
Either r (page 985) or w (page 985) must be true.
mongoperf.w
Type: Boolean.
Default: false
Set w (page 985) to true to perform writes as part of the tests.
Either r (page 985) or w (page 985) must be true.
mongoperf.recSizeKB
New in version 2.4.
Type: Integer.
Default: 4 kb
The size of each write operation.
985
mongoperf.syncDelay
Type: Integer.
Default: 0
Seconds between disk flushes. mongoperf.syncDelay (page 985) is similar to --syncdelay for
mongod (page 925).
The syncDelay (page 985) controls how frequently mongoperf (page 984) performs an asynchronous disk
flush of the memory mapped file used for testing. By default, mongod (page 925) performs this operation every
60 seconds. Use syncDelay (page 985) to test basic system performance of this type of operation.
Only use syncDelay (page 985) in conjunction with mmf (page 985) set to true.
The default value of 0 disables this.
Use
mongoperf < jsonconfigfile
Replace jsonconfigfile with the path to the mongoperf (page 984) configuration. You may also invoke
mongoperf (page 984) in the following form:
echo "{nThreads:16,fileSizeMB:1000,r:true}" | ./mongoperf
In this operation:
mongoperf (page 984) tests direct physical random read ios, using 16 concurrent reader threads.
mongoperf (page 984) uses a 1 gigabyte test file.
Consider using iostat, as invoked in the following example to monitor I/O performance during the test.
iostat -xm 2
GridFS
mongofiles (page 986) provides a command-line interact to a MongoDB GridFS storage system.
mongofiles
mongofiles
Synopsis The mongofiles (page 986) utility makes it possible to manipulate files stored in your MongoDB instance in GridFS objects from the command line. It is particularly useful as it provides an interface between objects
stored in your file system and GridFS.
All mongofiles (page 986) commands have the following form:
mongofiles <options> <commands> <filename>
986
mongofiles (page 986), like mongodump (page 951), mongoexport (page 969), mongoimport (page 965),
and mongorestore (page 956), can access data stored in a MongoDB data directory without requiring a running
mongod (page 925) instance, if no other mongod (page 925) is running.
Important: For replica sets, mongofiles (page 986) can only read from the sets primary.
Options
mongofiles
command line option!help
--help
Returns a basic help and usage text.
command line option!verbose, -v
--verbose, -v
Increases the amount of internal reporting returned on the command line. Increase the verbosity with the -v
form by including the option multiple times, (e.g. -vvvvv.)
command line option!version
--version
Returns the version of the mongofiles (page 986) utility.
command line option!host <hostname><:port>
--host <hostname><:port>
Specifies a resolvable hostname for the mongod (page 925) that holds your GridFS system. By default
mongofiles (page 986) attempts to connect to a MongoDB process ruining on the localhost port number
27017.
Optionally, specify a port number to connect a MongoDB instance running on a port other than 27017.
command line option!port <port>
--port <port>
Specifies the port number, if the MongoDB instance is not running on the standard port. (i.e. 27017) You may
also specify a port number using the mongofiles --host command.
command line option!ipv6
--ipv6
Enables IPv6 support that allows mongofiles (page 986) to connect to the MongoDB instance using an IPv6
network. All MongoDB programs and processes, including mongofiles (page 986), disable IPv6 support by
default.
command line option!ssl
--ssl
New in version 2.4: MongoDB added support for SSL connections to mongod (page 925) instances in mongofiles.
Note: SSL support in mongofiles is not compiled into the default distribution of MongoDB. See Connect to
MongoDB with SSL (page 249) for more information on SSL and MongoDB.
Additionally, mongofiles does not support connections to mongod (page 925) instances that require client certificate validation.
Allows mongofiles (page 986) to connect to mongod (page 925) instance over an SSL connection.
987
988
989
Examples To return a list of all files in a GridFS collection in the records database, use the following invocation
at the system shell:
mongofiles -d records list
This mongofiles (page 986) instance will connect to the mongod (page 925) instance running on the 27017
localhost interface to specify the same operation on a different port or hostname, and issue a command that resembles
one of the following:
mongofiles --port 37017 -d records list
mongofiles --hostname db1.example.net -d records list
mongofiles --hostname db1.example.net --port 37017 -d records list
Modify any of the following commands as needed if youre connecting the mongod (page 925) instances on different
ports or hosts.
To upload a file named 32-corinth.lp to the GridFS collection in the records database, you can use the
following command:
mongofiles -d records put 32-corinth.lp
To delete the 32-corinth.lp file from this GridFS collection in the records database, you can use the following
command:
mongofiles -d records delete 32-corinth.lp
To search for files in the GridFS collection in the records database that have the string corinth in their names,
you can use following command:
mongofiles -d records search corinth
To list all files in the GridFS collection in the records database that begin with the string 32, you can use the
following command:
mongofiles -d records list 32
To fetch the file from the GridFS collection in the records database named 32-corinth.lp, you can use the
following command:
mongofiles -d records get 32-corinth.lp
Synopsis Administrators and users can control mongod (page 925) or mongos (page 938) instances at runtime
either directly from mongods command line arguments (page 925) or using a configuration file.
While both methods are functionally equivalent and all settings are similar, the configuration file method is preferable.
If you installed from a package and have started MongoDB using your systems control script, youre already using a
configuration file.
To start mongod (page 925) or mongos (page 938) using a config file, use one of the following forms:
mongod
mongod
mongos
mongos
990
--config /etc/mongodb.conf
-f /etc/mongodb.conf
--config /srv/mongodb/mongos.conf
-f /srv/mongodb/mongos.conf
New in version 2.0: Before version 2.0, Boolean (i.e. true|false) or flag parameters, register as true, if they
appear in the configuration file, regardless of their value.
Note: Ensure the configuration file uses ASCII encoding. mongod (page 925) does not support configuration files
with non-ASCII encoding, including UTF-8.
Settings
verbose
Default: false
Increases the amount of internal reporting returned on standard output or in the log file generated by logpath
(page 992). To enable verbose (page 991) or to enable increased verbosity with vvvv (page 991), set these
options as in the following example:
verbose = true
vvvv = true
991
maxConns
Default: depends on system (i.e. ulimit and file descriptor) limits. Unless set, MongoDB will not limit its own
connections.
Specifies a value to set the maximum number of simultaneous connections that mongod (page 925) or mongos
(page 938) will accept. This setting has no effect if it is higher than your operating systems configured maximum
connection tracking threshold.
This is particularly useful for mongos (page 938) if you have a client that creates a number of collections but
allows them to timeout rather than close the collections. When you set maxConns (page 992), ensure the value
is slightly higher than the size of the connection pool or the total number of connections to prevent erroneous
connection spikes from propagating to the members of a shard cluster.
Note: You cannot set maxConns (page 992) to a value higher than 20000.
objcheck
Default: true
Changed in version 2.4: The default setting for objcheck (page 992) became true in 2.4. In earlier versions
objcheck (page 992) was false by default.
Forces the mongod (page 925) to validate all requests from clients upon receipt to ensure that clients never
insert invalid documents into the database. For objects with a high degree of sub-document nesting, objcheck
(page 992) can have a small impact on performance. You can set noobjcheck (page 992) to disable object
checking at run-time.
noobjcheck
New in version 2.4.
Default: false
Disables the default object validation that MongoDB performs on all incoming BSON documents.
logpath
Default: None. (i.e. http://docs.mongodb.org/manualdev/stdout)
Specify the path to a file name for the log file that will hold all diagnostic logging information.
Unless specified, mongod (page 925) will output all log information to the standard output.
logappend (page 992) is true, the logfile will be overwritten when the process restarts.
Unless
Note: Currently, MongoDB will overwrite the contents of the log file if the logappend (page 992) is not
used. This behavior may change in the future depending on the outcome of SERVER-449940 .
logappend
Default: false
Set to true to add new entries to the end of the logfile rather than overwriting the content of the log when the
process restarts.
If this setting is not specified, then MongoDB will overwrite the existing logfile upon start up.
Note: The behavior of the logging system may change in the near future in response to the SERVER-449941
case.
syslog
New in version 2.2.
40 https://jira.mongodb.org/browse/SERVER-4499
41 https://jira.mongodb.org/browse/SERVER-4499
992
Sends all logging output to the hosts syslog system rather than to standard output or a log file as with logpath
(page 992).
Important: You cannot use syslog (page 992) with logpath (page 992).
pidfilepath
Default: None.
Specify a file location to hold the PID or process ID of the mongod (page 925) process. Useful for tracking the
mongod (page 925) process in combination with the fork (page 993) setting.
Without a specified pidfilepath (page 993), mongos (page 938) creates no PID file.
Without this option, mongod (page 925) creates no PID file.
keyFile
Default: None.
Specify the path to a key file to store authentication information. This option is only useful for the connection
between replica set members.
See also:
Replica Set Security (page 238)
nounixsocket
Default: false
Set to true to disable listening on the UNIX socket.
MongoDB always creates and listens on the UNIX socket, unless nounixsocket (page 993) is set, or
bind_ip (page 991) is not set, or bind_ip (page 991) does not specify 127.0.0.1.
unixSocketPrefix
Default: http://docs.mongodb.org/manualtmp
Specifies a path for the UNIX socket. Unless this option has a value mongod (page 925) creates a socket with
http://docs.mongodb.org/manualtmp as a prefix.
MongoDB will always create and listen on a UNIX socket, unless nounixsocket (page 993) is set, bind_ip
(page 991) is not set. or bind_ip (page 991) does not specify 127.0.0.1.
fork
Default: false
Set to true to enable a daemon mode for mongod (page 925) that runs the process in the background.
auth
Default: false
Set to true to enable database authentication for users connecting from remote hosts. Configure users via the
mongo shell (page 942). If no users exist, the localhost interface will continue to have access to the database
until you create the first user.
cpu
Default: false
Set to true to force mongod (page 925) to report every four seconds CPU utilization and the amount of time
that the processor waits for I/O operations to complete (i.e. I/O wait.) MongoDB writes this data to standard
output, or the logfile if using the logpath (page 992) option.
dbpath
Default: /data/db/
993
Set this value to designate a directory for the mongod (page 925) instance to store
its data.
Typical locations include:
/srv/mongodb,
/var/lib/mongodb or
http://docs.mongodb.org/manualopt/mongodb
Unless specified, mongod (page 925) will look for data files in the default /data/db directory. (Windows
systems use the \data\db directory.) If you installed using a package management system. Check the
/etc/mongodb.conf file provided by your packages to see the configuration of the dbpath (page 993).
diaglog
Default: 0
Creates a very verbose diagnostic log for troubleshooting and recording various errors. MongoDB writes
these log files in the dbpath (page 993) directory and names them diaglog.<time in hex>, where
<time-in-hex> is the initiation time of logging as a hexadecimal string.
The value of this setting configures the level of verbosity. Possible values, and their impact are as follows.
Value
0
1
2
3
7
Setting
Off. No logging.
Log write operations.
Log read operations.
Log both read and write operations.
Log write and some read operations.
You can use the mongosniff (page 982) tool to replay this output for investigation. Given a typical diaglog
file, located at /data/db/diaglog.4f76a58c, you might use a command in the following form to read
these files:
mongosniff --source DIAGLOG /data/db/diaglog.4f76a58c
diaglog (page 994) is for internal use and not intended for most users.
Warning: Setting the diagnostic level to 0 will cause mongod (page 925) to stop writing data to the
diagnostic log file. However, the mongod (page 925) instance will continue to keep the file open, even if
it is no longer writing data to the file. If you want to rename, move, or delete the diagnostic log you must
cleanly shut down the mongod (page 925) instance before doing so.
directoryperdb
Default: false
Set to true to modify the storage pattern of the data directory to store each databases files in a distinct folder.
This option will create directories within the dbpath (page 993) named for each directory.
Use this option in conjunction with your file system and device configuration so that MongoDB will store data
on a number of distinct disk devices to increase write throughput or disk capacity.
994
Warning: If you have an existing mongod (page 925) instance and dbpath (page 993), and you want
to enable directoryperdb (page 994), you must migrate your existing databases to directories before
setting directoryperdb (page 994) to access those databases.
Example
Given a dbpath (page 993) directory with the following items:
journal
mongod.lock
local.0
local.1
local.ns
test.0
test.1
test.ns
To enable directoryperdb (page 994) you would need to modify the dbpath (page 993) to resemble
the following:
journal
mongod.lock
local/local.0
local/local.1
local/local.ns
test/test.0
test/test.1
test/test.ns
journal
Default: (on 64-bit systems) true
Default: (on 32-bit systems) false
Set to true to enable operation journaling to ensure write durability and data consistency.
Set to false to prevent the overhead of journaling in situations where durability is not required. To reduce the
impact of the journaling on disk usage, you can leave journal (page 995) enabled, and set smallfiles
(page 998) to true to reduce the size of the data and journal files.
Note: You must use nojournal (page 996) to disable journaling on 64-bit systems.
journalCommitInterval
Default: 100 or 30
Set this value to specify the maximum amount of time for mongod (page 925) to allow between journal operations. Lower values increase the durability of the journal, at the possible expense of disk performance.
The default journal commit interval is 100 milliseconds if a single block device (e.g. physical volume, RAID
device, or LVM volume) contains both the journal and the data files.
If different block devices provide the journal and data files the default journal commit interval is 30 milliseconds.
This option accepts values between 2 and 300 milliseconds.
To force mongod (page 925) to commit to the journal more frequently, you can specify j:true. When a write
operation with j:true is pending, mongod (page 925) will reduce journalCommitInterval (page 995)
to a third of the set value.
995
ipv6
Default: false
Set to true to IPv6 support to allow clients to connect to mongod (page 925) using IPv6 networks. mongod
(page 925) disables IPv6 support by default in mongod (page 925) and all utilities.
jsonp
Default: false
Set to true to permit JSONP access via an HTTP interface. Consider the security implications of allowing this
activity before setting this option.
noauth
Default: true
Disable authentication. Currently the default. Exists for future compatibility and clarity.
For consistency use the auth (page 993) option.
nohttpinterface
Default: false
Set to true to disable the HTTP interface. This command will override the rest (page 997) and disable the
HTTP interface if you specify both.
Note: In MongoDB Enterprise, the HTTP Console does not support Kerberos Authentication.
Changed in version 2.1.2: The nohttpinterface (page 996) option is not available for mongos (page 938)
instances before 2.1.2
nojournal
Default: (on 64-bit systems) false
Default: (on 32-bit systems) true
Set nojournal = true to disable durability journaling. By default, mongod (page 925) enables journaling
in 64-bit versions after v2.0.
Note: You must use journal (page 995) to enable journaling on 32-bit systems.
noprealloc
Default: false
Set noprealloc = true to disable the preallocation of data files. This will shorten the start up time in
some cases, but can cause significant performance penalties during normal operations.
noscripting
Default: false
Set noscripting = true to disable the scripting engine.
notablescan
Default: false
Set notablescan = true to forbid operations that require a table scan.
nssize
Default: 16
Specify this value in megabytes. The maximum size is 2047 megabytes.
Use this setting to control the default size for all newly created namespace files (i.e .ns). This option has no
impact on the size of existing namespace files.
996
Setting
Off. No profiling.
On. Only includes slow operations.
On. Includes all operations.
By default, mongod (page 925) disables profiling. Database profiling can impact database performance because
the profiler must record and process all database operations. Enable this option only after careful consideration.
quota
Default: false
Set to true to enable a maximum limit for the number data files each database can have. The default quota is
8 data files, when quota is true. Adjust the quota size with the with the quotaFiles (page 997) setting.
quotaFiles
Default: 8
Modify limit on the number of data files per database. This option requires the quota (page 997) setting.
rest
Default: false
Set to true to enable a simple REST interface.
repair
Default: false
Set to true to run a repair routine on all databases following start up. In general you should set this option on
the command line and not in the configuration file (page 145) or in a control script.
Use the mongod --repair option to access this functionality.
Note: Because mongod (page 925) rewrites all of the database files during the repair routine, if you do not
run repair (page 997) under the same user account as mongod (page 925) usually runs, you will need to run
chown on your database files to correct the permissions before starting mongod (page 925) again.
repairpath
Default: A _tmp directory in the dbpath (page 993).
Specify the path to the directory containing MongoDB data files, to use in conjunction with the repair
(page 997) setting or mongod --repair operation. Defaults to a _tmp directory within the dbpath
(page 993).
slowms
Default: 100
Specify values in milliseconds.
Sets the threshold for mongod (page 925) to consider a query slow for the database profiler. The database logs
all slow queries to the log, even when the profiler is not turned on. When the database profiler is on, mongod
(page 925) the profiler writes to the system.profile collection.
See also:
997
998
quiet
Default: false
Runs the mongod (page 925) or mongos (page 938) instance in a quiet mode that attempts to limit the amount
of output. This option suppresses:
output from database commands, including drop (page 747), dropIndexes (page 750),
diagLogging (page 769), validate (page 771), and clean (page 752).
replication activity.
connection accepted events.
connection closed events.
Note: For production systems this option is not recommended as it may make tracking problems during
particular connections much more difficult.
setParameter
New in version 2.4.
Specifies an option to configure on startup. Specify multiple options with multiple setParameter (page 999)
options. See mongod Parameters (page 1005) for full documentation of these parameters. The setParameter
(page 756) database command provides access to many of these parameters.
Declare all setParameter (page 756) settings in this file using the following form:
setParameter = <parameter>=<value>
For mongod (page 925) the following options are available using setParameter (page 999):
enableLocalhostAuthBypass (page 1005)
enableTestCommands (page 1005)
journalCommitInterval (page 1006)
logLevel (page 1006)
logUserIds (page 1006)
notablescan (page 1006)
quiet (page 1007)
replApplyBatchSize (page 1006)
replIndexPrefetch (page 1006)
supportCompatibilityFormPrivilegeDocuments (page 1007)
syncdelay (page 1007)
textSearchEnabled (page 1008)
traceExceptions (page 1007)
For mongos (page 938) the following options are available using setParameter (page 999):
enableLocalhostAuthBypass (page 1005)
enableTestCommands (page 1005)
logLevel (page 1006)
logUserIds (page 1006)
999
1000
slave
Default: false
Set to true to configure the current instance to act as slave instance in a replication configuration.
source
Default: <>
Form: <host><:port>
Used with the slave (page 1001) setting to specify the master instance from which this slave instance will
replicate
only
Default: <>
Used with the slave (page 1001) option, only (page 1001) specifies only a single database to replicate.
slaveDelay
Default: 0
Used with the slave (page 1001) setting, slaveDelay (page 1001) configures a delay in seconds, for this
slave to wait to apply operations from the master instance.
autoresync
Default: false
Used with the slave (page 1001) setting, set autoresync (page 1001) to true to force the slave to automatically resync if it is more than 10 seconds behind the master. This setting may be problematic if the
oplogSize (page 1000) of the oplog is too small. If the oplog is not large enough to store the difference in
changes between the masters current state and the state of the slave, this instance will forcibly resync itself
unnecessarily. When you set the autoresync (page 1001) option to false, the slave will not attempt an
automatic resync more than once in a ten minute period.
Sharded Cluster Options
configsvr
Default: false
Set this value to true to configure this mongod (page 925) instance to operate as the config database of a
shard cluster. When running with this option, clients will not be able to write data to any database other than
config and admin. The default port for a mongod (page 925) with this option is 27019 and the default
dbpath (page 993) directory is /data/configdb, unless specified.
Changed in version 2.2: configsvr (page 1001) also sets smallfiles (page 998).
Changed in version 2.4: configsvr (page 1001) creates a local oplog.
Do not use configsvr (page 1001) with replSet (page 1000) or shardsvr (page 1001). Config servers
cannot be a shard server or part of a replica set.
default port for mongod (page 925) with this option is 27019 and mongod (page 925) writes all data files to
the http://docs.mongodb.org/manualconfigdb sub-directory of the dbpath (page 993) directory.
shardsvr
Default: false
Set this value to true to configure this mongod (page 925) instance as a shard in a partitioned cluster. The
default port for these instances is 27018. The only effect of shardsvr (page 1001) is to change the port
number.
configdb
Default: None.
1001
Format: <config1>,<config2><:port>,<config3>
Set this option to specify a configuration database (i.e. config database) for the sharded cluster. You must
specify either 1 configuration server or 3 configuration servers, in a comma separated list.
This setting only affects mongos (page 938) processes.
Note: mongos (page 938) instances read from the first config server in the list provided. All mongos
(page 938) instances must specify the hosts to the configdb (page 1001) setting in the same order.
If your configuration databases reside in more that one data center, order the hosts in the configdb (page 1001)
setting so that the config database that is closest to the majority of your mongos (page 938) instances is first
servers in the list.
Warning: Never remove a config server from the configdb (page 1001) parameter, even if the config
server or servers are not available, or offline.
test
Default: false
Only runs unit tests and does not start a mongos (page 938) instance.
This setting only affects mongos (page 938) processes and is for internal testing use only.
chunkSize
Default: 64
The value of this option determines the size of each chunk of data distributed around the sharded cluster. The
default value is 64 megabytes. Larger chunks may lead to an uneven distribution of data, while smaller chunks
may lead to frequent and unnecessary migrations. However, in some circumstances it may be necessary to set a
different chunk size.
This setting only affects mongos (page 938) processes. Furthermore, chunkSize (page 1002) only sets the
chunk size when initializing the cluster for the first time. If you modify the run-time option later, the new value
will have no effect. See the Modify Chunk Size in a Sharded Cluster (page 547) procedure if you need to change
the chunk size on an existing sharded cluster.
localThreshold
New in version 2.2.
localThreshold (page 1002) affects the logic that mongos (page 938) uses when selecting replica set
members to pass reads operations to from clients. Specify a value to localThreshold (page 1002) in
milliseconds. The default value is 15, which corresponds to the default value in all of the client drivers (page 95).
This setting only affects mongos (page 938) processes.
When mongos (page 938) receives a request that permits reads to secondary members, the mongos (page 938)
will:
find the member of the set with the lowest ping time.
construct a list of replica set members that is within a ping time of 15 milliseconds of the nearest suitable
member of the set.
If you specify a value for localThreshold (page 1002), mongos (page 938) will construct the list of
replica members that are within the latency allowed by this value.
The mongos (page 938) will select a member to read from at random from this list.
The ping time used for a set member compared by the localThreshold (page 1002) setting is a moving
average of recent ping times, calculated, at most, every 10 seconds. As a result, some queries may reach
members above the threshold until the mongos (page 938) recalculates the average.
1002
See the Member Selection (page 408) section of the read preference (page 405) documentation for more information.
noAutoSplit
noAutoSplit (page 1003) is for internal use and is only available on mongos (page 938) instances.
New in version 2.0.7.
noAutoSplit (page 1003) prevents mongos (page 938) from automatically inserting metadata splits in a
sharded collection. If set on all mongos (page 938), this will prevent MongoDB from creating new chunks as
the data in a collection grows.
Because any mongos (page 938) in a cluster can create a split, to totally disable splitting in a cluster you must
set noAutoSplit (page 1003) on all mongos (page 938).
Warning: With noAutoSplit (page 1003) enabled, the data in your sharded cluster may become imbalanced over time. Enable with caution.
moveParanoia
New in version 2.4.
During chunk migrations, moveParanoia (page 1003) forces the mongod (page 925) instances to save all
documents migrated from this shard in the moveChunk directory of the dbpath (page 993). MongoDB does
not delete data from this directory.
Prior to 2.4, moveParanoia (page 1003) was the default behavior of MongoDB.
SSL Options
sslOnNormalPorts
New in version 2.2.
Note: The default distribution of MongoDB43 does not contain support for SSL. To use SSL you can either compile MongoDB with SSL support or use MongoDB Enterprise. See Connect to MongoDB with SSL
(page 249) for more information about SSL and MongoDB.
Enables SSL for mongod (page 925) or mongos (page 938). With sslOnNormalPorts (page 1003), a
mongod (page 925) or mongos (page 938) requires SSL encryption for all connections on the default MongoDB port, or the port specified by port (page 991). By default, sslOnNormalPorts (page 1003) is
disabled.
sslPEMKeyFile
New in version 2.2.
Note: The default distribution of MongoDB44 does not contain support for SSL. To use SSL you can either compile MongoDB with SSL support or use MongoDB Enterprise. See Connect to MongoDB with SSL
(page 249) for more information about SSL and MongoDB.
Specifies the .pem file that contains both the SSL certificate and key. Specify the file name of the .pem file
using relative or absolute paths
When using sslOnNormalPorts (page 1003), you must specify sslPEMKeyFile (page 1003).
sslPEMKeyPassword
New in version 2.2.
Note:
The default distribution of MongoDB45 does not contain support for SSL. To use SSL you can ei-
43 http://www.mongodb.org/downloads
44 http://www.mongodb.org/downloads
1003
ther compile MongoDB with SSL support or use MongoDB Enterprise. See Connect to MongoDB with SSL
(page 249) for more information about SSL and MongoDB.
Specifies the password to de-crypt the certificate-key file (i.e. sslPEMKeyFile (page 1003)). Only use
sslPEMKeyPassword (page 1003) if the certificate-key file is encrypted. In all cases, mongod (page 925)
or mongos (page 938) will redact the password from all logging and reporting output.
Changed in version 2.4: sslPEMKeyPassword (page 1003) is only needed when the private key is encrypted. In earlier versions mongod (page 925) or mongos (page 938) would require sslPEMKeyPassword
(page 1003) whenever using sslOnNormalPorts (page 1003), even when the private key was not encrypted.
sslCAFile
New in version 2.4.
Note: The default distribution of MongoDB46 does not contain support for SSL. To use SSL you can either compile MongoDB with SSL support or use MongoDB Enterprise. See Connect to MongoDB with SSL
(page 249) for more information about SSL and MongoDB.
Specifies the .pem file that contains the root certificate chain from the Certificate Authority. Specify the file
name of the .pem file using relative or absolute paths
sslCRLFile
New in version 2.4.
Note: The default distribution of MongoDB47 does not contain support for SSL. To use SSL you can either compile MongoDB with SSL support or use MongoDB Enterprise. See Connect to MongoDB with SSL
(page 249) for more information about SSL and MongoDB.
Specifies the .pem file that contains the Certificate Revocation List. Specify the file name of the .pem file
using relative or absolute paths
sslWeakCertificateValidation
New in version 2.4.
Note: The default distribution of MongoDB48 does not contain support for SSL. To use SSL you can either compile MongoDB with SSL support or use MongoDB Enterprise. See Connect to MongoDB with SSL
(page 249) for more information about SSL and MongoDB.
Disables the requirement for SSL certificate validation, that sslCAFile (page 1004) enables. With
sslWeakCertificateValidation (page 1004), mongod (page 925) or mongos (page 938) will accept connections if the client does not present a certificate when establishing the connection.
If the client presents a certificate and mongod (page 925) or mongos (page 938) has
sslWeakCertificateValidation (page 1004) enabled, mongod (page 925) or mongos (page 938)
will validate the certificate using the root certificate chain specified by sslCAFile (page 1004), and reject
clients with invalid certificates.
Use sslWeakCertificateValidation (page 1004) if you have a mixed deployment that includes clients
that do not or cannot present certificates to mongod (page 925) or mongos (page 938).
sslFIPSMode
New in version 2.4.
45 http://www.mongodb.org/downloads
46 http://www.mongodb.org/downloads
47 http://www.mongodb.org/downloads
48 http://www.mongodb.org/downloads
1004
Note: The default distribution of MongoDB49 does not contain support for SSL. To use SSL you can either compile MongoDB with SSL support or use MongoDB Enterprise. See Connect to MongoDB with SSL
(page 249) for more information about SSL and MongoDB.
When specified, mongod (page 925) or mongos (page 938) will use the FIPS mode of the installed OpenSSL
library. Your system must have a FIPS compliant OpenSSL library to use sslFIPSMode (page 1004).
mongod Parameters
1005
See also:
journalCommitInterval (page 995).
logUserIds
New in version 2.4.
Specify 1 to enable logging of userids.
Disabled by default.
logLevel
Specify an integer between 0 and 5 signifying the verbosity of the logging, where 5 is the most verbose.
Consider the following example which sets the logLevel (page 1006) to 2:
use admin
db.runCommand( { setParameter: 1, logLevel: 2 } )
See also:
verbose (page 991).
notablescan
Specify whether queries must use indexes. If 1, queries that perform a table scan instead of using an index will
fail.
Consider the following example which sets notablescan (page 1006) to true:
use admin
db.runCommand( { setParameter: 1, notablescan: 1 } )
See also:
notablescan (page 996)
replIndexPrefetch
New in version 2.2.
Use replIndexPrefetch (page 1006) in conjunction with replSet (page 1000). The default value is
all and available options are:
none
all
_id_only
By default secondary members of a replica set will load all indexes related to an operation into memory before
applying operations from the oplog. You can modify this behavior so that the secondaries will only load the _id
index. Specify _id_only or none to prevent the mongod (page 925) from loading any index into memory.
replApplyBatchSize
New in version 2.4.
1006
Specify the number of oplog entries to apply as a single batch. replApplyBatchSize (page 1006) must be
an integer between 1 and 1024. This option only applies to replica set members when they are in the secondary
state.
Batch sizes must be 1 for members with slaveDelay (page 1001) configured.
saslHostName
New in version 2.4.
saslHostName (page 1007) overrides MongoDBs default hostname detection for the purpose of configuring
SASL and Kerberos authentication.
saslHostName (page 1007) does not affect the hostname of the mongod (page 925) or mongos (page 938)
instance for any purpose beyond the configuration of SASL and Kerberos.
You can only set saslHostName (page 1007) during start-up, and cannot change this setting using the
setParameter (page 756) database command.
Note: saslHostName (page 1007) supports Kerberos authentication and is only included in MongoDB
Enterprise. See Deploy MongoDB with Kerberos Authentication (page 259) for more information.
supportCompatibilityFormPrivilegeDocuments
New in version 2.4.
supportCompatibilityFormPrivilegeDocuments (page 1007) is not available using
setParameter (page 756) database command. Use the setParameter (page 999) option in the
configuration file or the --setParameter option on the command line.
syncdelay
Specify the interval in seconds between fsync operations where mongod (page 925) flushes its working memory
to disk. By default, mongod (page 925) flushes memory to disk every 60 seconds. In almost every situation
you should not set this value and use the default setting.
Consider the following example which sets the syncdelay to 60 seconds:
db = db.getSiblingDB("admin")
db.runCommand( { setParameter: 1, syncdelay: 60 } )
See also:
syncdelay (page 998) and journalCommitInterval (page 1006).
traceExceptions
New in version 2.2.
Configures mongod (page 925) log full stack traces on assertions or errors. If 1, mongod (page 925) will log
full stack traces on assertions or errors.
Consider the following example which sets the traceExceptions to true:
use admin
db.runCommand( { setParameter: 1, traceExceptions: true } )
See also:
traceExceptions (page 998)
quiet
Sets quiet logging mode. If 1, mongod (page 925) will go into a quiet logging mode which will not log the
following events/activities:
connection events;
1007
the drop (page 747) command, the dropIndexes (page 750) command, the diagLogging
(page 769) command, the validate (page 771) command, and the clean (page 752) command; and
replication synchronization activities.
Consider the following example which sets the quiet to 1:
db = db.getSiblingDB("admin")
db.runCommand( { setParameter: 1, quiet: 1 } )
See also:
quiet (page 998)
textSearchEnabled
New in version 2.4.
Warning:
Do not enable or use text search on production systems.
Text indexes have significant storage requirements and performance costs. See Storage Requirements
and Performance Costs (page 332) for more information.
Enables the text search (page 333) feature. You must enable the feature before creating or accessing a text index.
mongod --setParameter textSearchEnabled=true
If the flag is not enabled, you cannot create new text indexes, and you cannot perform text searches. However,
MongoDB will continue to maintain existing text indexes.
releaseConnectionsAfterResponse
New in version 2.2.4: and 2.4.2
Changes the behavior of the connection pool that mongos (page 938) uses to connect to the shards. As a result,
each mongos (page 938) should need to maintain fewer connections to each shard. When enabled, the mongos
(page 938) will release a connection into the thread pool after each read operation or command.
Warning: For applications that do not use the default (page 55), journaled (page 57), or replica acknowledged (page 57) write concern modes of the driver, releaseConnectionsAfterResponse
(page 1008) will affect the meaning of getLastError (page 720).
If an application allows read operations in between write operations and getLastError (page 720) calls,
the resulting getLastError (page 720) will not report on the success of the proceeding write operation.
Use with caution.
To enable, use the following command while connected to a mongos (page 938):
use admin
db.runCommand( { setParameter: 1, releaseConnectionsAfterResponse: true } )
Alternately, you may start the mongos (page 938) instance with the following run-time option:
mongos --setParameter releaseConnectionsAfterResponse=true
To change this policy for the entire cluster, you must set releaseConnectionsAfterResponse
(page 1008) on each mongos (page 938) instance in the cluster.
1008
49
Returned by mongod.exe (page 948) or mongos.exe (page 950) on Windows when either receives a shutdown message from the Windows Service Control Manager.
1009
100
Returned by mongod (page 925) when the process throws an uncaught exception.
mongodb://[username:password@]host1[:port1][,host2[:port2],...[,hostN[:portN]]][/[database][?options]
1010
Example
To describe a connection to a replica set named test, with the following mongod (page 925) hosts:
db1.example.net on port 27017 and
db2.example.net on port 2500.
You would use a connection string that resembles the following:
mongodb://db1.example.net,db2.example.net:2500/?replicaSet=test
replicaSet
Specifies the name of the replica set, if the mongod (page 925) is a member of a replica set.
When connecting to a replica set it is important to give a seed list of at least two mongod (page 925) instances.
If you only provide the connection point of a single mongod (page 925) instance, and omit the replicaSet
(page 1011), the client will create a standalone connection.
Connection Options
ssl
true: Initiate the connection with SSL.
false: Initiate the connection without SSL.
The default value is false.
Note: The ssl (page 1011) option is not supported by all drivers. See your driver (page 95) documentation
and the Connect to MongoDB with SSL (page 249) document.
connectTimeoutMS
The time in milliseconds to attempt a connection before timing out. The default is never to timeout, though
different drivers might vary. See the driver (page 95) documentation.
socketTimeoutMS
The time in milliseconds to attempt a send or receive on a socket before the attempt times out. The default is
never to timeout, though different drivers might vary. See the driver (page 95) documentation.
1011
Most drivers implement some kind of connection pooling handle this for you behind the scenes. Some drivers do not
support connection pools. See your driver (page 95) documentation for more information on the connection pooling
implementation. These options allow applications to configure the connection pool when connecting to the MongoDB
deployment.
maxPoolSize
The maximum number of connections in the connection pool. The default value is 100.
minPoolSize
The minimum number of connections in the connection pool. The default value is 0.
Note: The minPoolSize (page 1012) option is not supported by all drivers. For information on your driver,
see the drivers (page 95) documentation.
maxIdleTimeMS
The maximum number of milliseconds that a connection can remain idle in the pool before being removed and
closed.
This option is not supported by all drivers.
waitQueueMultiple
A number that the driver multiples the maxPoolSize (page 1012) value to, to provide the maximum number
of threads allowed to wait for a connection to become available from the pool. For default values, see the
MongoDB Drivers and Client Libraries (page 95) documentation.
waitQueueTimeoutMS
The maximum time in milliseconds that a thread can wait for a connection to become available. For default
values, see the MongoDB Drivers and Client Libraries (page 95) documentation.
Write Concern Options
Write concern (page 55) describes the kind of assurances that the program:mongod and the driver provide to the
application regarding the success and durability of the write operation. For a full explanation of write concern and
write operations in general see the: Write Operations (page 50):
w
Defines the level and kind of write concern, that the driver uses when calling getLastError (page 720). This
option can take either a number or a string as a value.
option number -1 The driver will not acknowledge write operations and will suppress all network
or socket errors.
option number 0 The driver will not acknowledge write operations but will pass or handle any
network and socket errors that it receives to the client. If you disable write concern but enable
the getLastError (page 720) commands w option, w overrides the w option.
option number 1 Provides basic acknowledgment of write operations. By specifying 1, you require
that a standalone mongod (page 925) instance, or the primary for replica sets, acknowledge all
write operations. For drivers released after the default write concern change (page 1089), this is
the default write concern setting.
option string majority For replica sets, if you specify the special majority value to w
(page 1012) option, write operations will only return successfully after a majority of the configured replica set members have acknowledged the write operation.
1012
option number n For replica sets, if you specify a number n greater than 1, operations with this
write concern return only after n members of the set have acknowledged the write. If you set n
to a number that is greater than the number of available set members or members that hold data,
MongoDB will wait, potentially indefinitely, for these members to become available.
option string tags For replica sets, you can specify a tag set (page 451) to require that all members
of the set that have these tags configured return confirmation of the write operation. See Replica
Set Tag Set Configuration (page 451) for more information.
wtimeoutMS
The time in milliseconds to wait for replication to succeed, as specified in the w (page 1012) option, before
timing out. When wtimeoutMS is 0, write operations will never time out.
journal
Controls whether write operations will wait until the mongod (page 925) acknowledges the write operations
and commits the data to the on disk journal.
option Boolean true Enables journal commit acknowledgment write concern. Equivalent to specifying the getLastError (page 720) command with the j option enabled.
option Boolean false Does not require that mongod (page 925) commit write operations to the
journal before acknowledging the write operation. This is the default option for the journal
(page 1013) parameter.
If you set journal (page 1013) to true, and specify a w (page 1012) value less than 1, journal (page 1013)
prevails.
If you set journal (page 1013) to true, and the mongod (page 925) does not have journaling enabled, as with
nojournal (page 996), then getLastError (page 720) will provide basic receipt acknowledgment (i.e.
w:1), and will include a jnote field in its return document.
Read Preference Options
Read preferences (page 405) describe the behavior of read operations with regards to replica sets. These parameters
allow you to specify read preferences on a per-connection basis in the connection string:
readPreference
Specifies the replica set read preference for this connection. This setting overrides any slaveOk value. The
read preference values are the following:
primary (page 489)
primaryPreferred (page 489)
secondary (page 489)
secondaryPreferred (page 489)
nearest (page 490)
For descriptions of each value, see Read Preference Modes (page 489).
The default value is primary (page 489), which sends all read operations to the replica sets primary.
readPreferenceTags
Specifies a tag set as a comma-separated list of colon-separated key-value pairs. For example:
dc:ny,rack:1
To specify a list of tag sets, use multiple readPreferenceTags. The following specifies two tag sets and
an empty tag set:
1013
readPreferenceTags=dc:ny,rack:1&readPreferenceTags=dc:ny&readPreferenceTags=
uuidRepresentation
option standard The standard binary representation.
option csharpLegacy The default representation for the C# driver.
option javaLegacy The default representation for the Java driver.
option pythonLegacy The default representation for the Python driver.
For the default, see the drivers (page 95) documentation for your driver.
Note: Not all drivers support the uuidRepresentation (page 1014) option. For information on your
driver, see the drivers (page 95) documentation.
Examples
The following provide example URI strings for common connection targets.
Database Server Running Locally
The following connects to a database server running locally on the default port:
mongodb://localhost
admin Database
The following connects and logs in to the admin database as user sysop with the password moon:
mongodb://sysop:moon@localhost
records Database
The following connects and logs in to the records database as user sysop with the password moon:
mongodb://sysop:moon@localhost/records
Note: Not all drivers support UNIX domain sockets. For information on your driver, see the drivers (page 95)
documentation.
1014
The following connects to a replica set with two members, one on db1.example.net and the other on
db2.example.net:
mongodb://db1.example.net,db2.example.com
The following connects to a replica set with three members running on localhost on ports 27017, 27018, and
27019:
mongodb://localhost,localhost:27018,localhost:27019
The following connects to a replica set with three members and distributes reads to the secondaries:
mongodb://example1.com,example2.com,example3.com/?readPreference=secondary
The following connects to a replica set with write concern configured to wait for replication to succeed on at least two
members, with a two-second timeout.
mongodb://example1.com,example2.com,example3.com/?w=2&wtimeoutMS=2000
1015
Number of Namespaces
The limitation on the number of namespaces is the size of the namespace file divided by 628.
A 16 megabyte namespace file can support approximately 24,000 namespaces. Each index also counts as a
namespace.
Size of Namespace File
Namespace files can be no larger than 2047 megabytes.
By default namespace files are 16 megabytes. You can configure the size using the nssize (page 996) option.
Indexes
Index Key
The total size of an indexed value must be less than 1024 bytes. MongoDB will not add that value to an index
if it is longer than 1024 bytes.
Number of Indexes per Collection
A single collection can have no more than 64 indexes.
Index Name Length
The names of indexes, including their namespace (i.e database and collection name) cannot be longer than 125
characters. The default index name is the concatenation of the field names and index directions.
You can explicitly specify an index name to the ensureIndex() (page 814) helper if the default index name
is too long.
Number of Indexed Fields in a Compound Index
There can be no more than 31 fields in a compound index.
Queries cannot use both text and Geospatial Indexes
that requires a different type of special index. For example you cannot combine text (page 715)
with the $near (page 637) operator.
See also:
The unique indexes limit in Sharding Operational Restrictions (page 1017).
Capped Collections
Maximum Number of Documents in a Capped Collection
Changed in version 2.4.
If you specify a maximum number of documents for a capped collection using the max parameter to create
(page 747), the limit must be less than 232 documents. If you do not specify a maximum number of documents
when creating a capped collection, there is no limit on the number of documents.
Replica Sets
Number of Members of a Replica Set
Replica sets can have no more than 12 members.
Number of Voting Members of a Replica Set
Only 7 members of a replica set can have votes at any given time. See can vote Non-Voting Members (page 400)
for more information
1016
Sharded Clusters
Sharded clusters have the restrictions and thresholds described here.
Sharding Operational Restrictions
1017
See
$or (page 625) and 2d Index Internals (page 330).
Spherical Polygons must fit within a hemisphere.
Any geometry specified with GeoJSON to $geoIntersects (page 637) or $geoWithin (page 635)
queries, must fit within a single hemisphere. MongoDB interprets geometries larger than half of the sphere
as queries for the smaller of the complementary geometries.
Combination Limit with Multiple $in Expressions
When using two or more $in (page 623) expressions, the product of the number of distinct elements in
the $in (page 623) arrays must be less than 4000000. Otherwise, MongoDB will throw an exception of
"combinatorial limit of $in partitioning of result set exceeded".
Naming Restrictions
Database Name Case Sensitivity
Database names are case sensitive even if the underlying file system is case insensitive. MongoDB does not
1018
permit database names that differ only by the case of the characters.
Restrictions on Database Names for Windows
Changed in version 2.2: Restrictions on Database Names for Windows (page 1054).
For MongoDB deployments running on Windows, MongoDB will not permit database names that include any
of the following characters:
/\. "*<>:|?
11.3.4 Glossary
$cmd A special virtual collection that exposes MongoDBs database commands. To use database commands, see
Issue Commands (page 164).
_id A field required in every MongoDB document. The _id field must have a unique value. You can think of
the _id field as the documents primary key. If you create a new document without an _id field, MongoDB
automatically creates the field and assigns a unique BSON ObjectId.
accumulator An expression in the aggregation framework that maintains state between documents in the aggregation
pipeline. For a list of accumulator operations, see $group (page 669).
admin database A privileged database. Users must have access to the admin database to run certain administrative
commands. For a list of administrative commands, see Instance Administration Commands (page 744).
aggregation Any of a variety of operations that reduces and summarizes large sets of data. MongoDBs
aggregate() (page 808) and mapReduce() (page 837) methods are two examples of aggregation operations. For more information, see Aggregation Concepts (page 279).
11.3. General Reference
1019
aggregation framework The set of MongoDB operators that let you calculate aggregate values without having to
use map-reduce. For a list of operators, see Aggregation Reference (page 306).
arbiter A member of a replica set that exists solely to vote in elections. Arbiters do not replicate data. See Replica
Set Arbiter (page 389).
B-tree A data structure commonly used by database management systems to store indexes. MongoDB uses B-trees
for its indexes.
balancer An internal MongoDB process that runs in the context of a sharded cluster and manages the migration
of chunks. Administrators must disable the balancer for all maintenance operations on a sharded cluster. See
Sharded Collection Balancing (page 516).
BSON A serialization format used to store documents and make remote procedure calls in MongoDB. BSON is a
portmanteau of the words binary and JSON. Think of BSON as a binary representation of JSON (JavaScript
Object Notation) documents.
See also:
Documents (page 92), BSON Types (page 105) and Data Type Fidelity (page 150)
BSON types The set of types supported by the BSON serialization format. For a list of BSON types, see BSON Types
(page 105).
CAP Theorem Given three properties of computing systems, consistency, availability, and partition tolerance, a
distributed computing system can provide any two of these features, but never all three.
capped collection A fixed-sized collection that automatically overwrites its oldest entries when it reaches its maximum size. The MongoDB oplog that is used in replication is a capped collection. See Capped Collections
(page 156).
checksum A calculated value used to ensure data integrity. The md5 algorithm is sometimes used as a checksum.
chunk A contiguous range of shard key values within a particular shard. Chunk ranges are inclusive of the lower
boundary and exclusive of the upper boundary. MongoDB splits chunks when they grow beyond the configured
chunk size, which by default is 64 megabytes. MongoDB migrates chunks when a shard contains too many
chunks of a collection relative to other shards. See Data Partitioning (page 495) and Sharding Mechanics
(page 515).
client The application layer that uses a database for data persistence and storage. Drivers provide the interface level
between the application layer and the database server.
cluster See sharded cluster.
collection A grouping of MongoDB documents. A collection is the equivalent of an RDBMS table. A collection exists
within a single database. Collections do not enforce a schema. Documents within a collection can have different
fields. Typically, all documents in a collection have a similar or related purpose. See What is a namespace in
MongoDB? (page 585).
compound index An index consisting of two or more keys. See Compound Indexes (page 322).
config database An internal database that holds the metadata associated with a sharded cluster. Applications and
administrators should not modify the config database in the course of normal operation. See Config Database
(page 564).
config server A mongod (page 925) instance that stores all the metadata associated with a sharded cluster. A production sharded cluster requires three config servers, each on a separate machine. See Config Servers (page 502).
control script A simple shell script, typically located in the /etc/rc.d or /etc/init.d directory, and used by
the systems initialization process to start, restart or stop a daemon process.
CRUD An acronym for the fundamental operations of a database: Create, Read, Update, and Delete. See MongoDB
CRUD Operations (page 35).
1020
CSV A text-based data format consisting of comma-separated values. This format is commonly used to exchange
data between relational databases since the format is well-suited to tabular data. You can import CSV files using
mongoimport (page 965).
cursor A pointer to the result set of a query. Clients can iterate through a cursor to retrieve results. By default,
cursors timeout after 10 minutes of inactivity. See Cursors (page 43).
daemon The conventional name for a background, non-interactive process.
data-center awareness A property that allows clients to address members in a system based on their locations.
Replica sets implement data-center awareness using tagging. See Data Center Awareness (page 153).
database A physical container for collections. Each database gets its own set of files on the file system. A single
MongoDB server typically has multiple databases.
database command A MongoDB operation, other than an insert, update, remove, or query. For a list of database
commands, see Database Commands (page 694). To use database commands, see Issue Commands (page 164).
database profiler A tool that, when enabled, keeps a record on all long-running operations in a databases
system.profile collection. The profiler is most often used to diagnose slow queries. See Database Profiling (page 143).
datum A set of values used to define measurements on the earth. MongoDB uses the WGS84 datum in certain
geospatial calculations. See Geospatial Indexes and Queries (page 326).
dbpath The location of MongoDBs data file storage. See dbpath (page 993).
delayed member A replica set member that cannot become primary and applies operations at a specified delay. The
delay is useful for protecting data from human error (i.e. unintentionally deleted databases) or updates that have
unforeseen effects on the production database. See Delayed Replica Set Members (page 387).
diagnostic log A verbose log of operations stored in the dbpath. See diaglog (page 994).
document A record in a MongoDB collection and the basic unit of data in MongoDB. Documents are analogous to
JSON objects but exist in the database in a more type-rich format known as BSON. See Documents (page 92).
dot notation MongoDB uses the dot notation to access the elements of an array and to access the fields of a subdocument. See Dot Notation (page 94).
draining The process of removing or shedding chunks from one shard to another. Administrators must drain shards
before removing them from the cluster. See Remove Shards from an Existing Sharded Cluster (page 553).
driver A client library for interacting with MongoDB in a particular language. See MongoDB Drivers and Client
Libraries (page 95).
election The process by which members of a replica set select a primary on startup and in the event of a failure. See
Replica Set Elections (page 397).
eventual consistency A property of a distributed system that allows changes to the system to propagate gradually. In
a database system, this means that readable members are not required to reflect the latest writes at all times. In
MongoDB, reads to a primary have strict consistency; reads to secondaries have eventual consistency.
expression In the context of aggregation framework, expressions are the stateless transformations that operate on the
data that passes through a pipeline. See Aggregation Concepts (page 279).
failover The process that allows a secondary member of a replica set to become primary in the event of a failure.
See Replica Set High Availability (page 396).
field A name-value pair in a document. A document has zero or more fields. Fields are analogous to columns in
relational databases.
1021
firewall A system level networking filter that restricts access based on, among other things, IP address. Firewalls
form a part of an effective network security strategy. See Firewalls (page 240).
fsync A system call that flushes all dirty, in-memory pages to disk. MongoDB calls fsync() on its database files
at least every 60 seconds. See fsync (page 751).
geohash A geohash value is a binary representation of the location on a coordinate grid. See Calculation of Geohash
Values for 2d Indexes (page 331).
GeoJSON A geospatial data interchange format based on JavaScript Object Notation (JSON). GeoJSON is used
in geospatial queries (page 326). For supported GeoJSON objects, see Location Data (page 326). For the
GeoJSON format specification, see http://geojson.org/geojson-spec.html.
geospatial Data that relates to geographical location. In MongoDB, you may store, index, and query data according
to geographical parameters. See Geospatial Indexes and Queries (page 326).
GridFS A convention for storing large files in a MongoDB database. All of the official MongoDB drivers support
this convention, as does the mongofiles (page 986) program. See GridFS (page 154).
hashed shard key A special type of shard key that uses a hash of the value in the shard key field to distribute
documents among members of the sharded cluster. See Hashed Index (page 333).
haystack index A geospatial index that enhances searches by creating buckets of objects grouped by a second
criterion. See Haystack Indexes (page 330).
hidden member A replica set member that cannot become primary and are invisible to client applications. See
Hidden Replica Set Members (page 387).
idempotent The quality of an operation to produce the same result given the same input, whether run once or run
multiple times.
index A data structure that optimizes queries. See Index Concepts (page 318).
initial sync The replica set operation that replicates data from an existing replica set member to a new or restored
replica set member. See Initial Sync (page 412).
IPv6 A revision to the IP (Internet Protocol) standard that provides a significantly larger address space to more
effectively support the number of hosts on the contemporary Internet.
ISODate The international date format used by mongo (page 942) to display dates. The format is: YYYY-MM-DD
HH:MM.SS.millis.
JavaScript A popular scripting language originally designed for web browsers. The MongoDB shell and certain
server-side functions use a JavaScript interpreter. See Server-side JavaScript (page 198) for more information.
journal A sequential, binary transaction log used to bring the database into a consistent state in the event of a hard
shutdown. Journaling writes data first to the journal and then to the core data files. MongoDB enables journaling
by default for 64-bit builds of MongoDB version 2.0 and newer. Journal files are pre-allocated and exist as files
in the data directory. See Journaling Mechanics (page 232).
JSON JavaScript Object Notation. A human-readable, plain text format for expressing structured data with support in
many programming languages. For more information, see http://www.json.org. Certain MongoDB tools render
an approximation of MongoDB BSON documents in JSON format. See MongoDB Extended JSON (page 108).
JSON document A JSON document is a collection of fields and values in a structured format. For sample JSON
documents, see http://json.org/example.html.
JSONP JSON with Padding. Refers to a method of injecting JSON into applications. Presents potential security
concerns.
1022
legacy coordinate pairs The format used for geospatial data prior to MongoDB version 2.4. This format stores
geospatial data as points on a planar coordinate system (e.g. [ x, y ]). See Geospatial Indexes and Queries
(page 326).
LineString A LineString is defined by an array of two or more positions. A closed LineString with four or more positions is called a LinearRing, as described in the GeoJSON LineString specification: http://geojson.org/geojsonspec.html#linestring. To use a LineString in MongoDB, see Store GeoJSON Objects (page 328).
LVM Logical volume manager. LVM is a program that abstracts disk images from physical devices and provides a
number of raw disk manipulation and snapshot capabilities useful for system management. For information on
LVM and MongoDB, see Backup and Restore Using LVM on a Linux System (page 185).
map-reduce A data processing and aggregation paradigm consisting of a map phase that selects data and a reduce
phase that transforms the data. In MongoDB, you can run arbitrary aggregations over data using map-reduce. For
map-reduce implementation, see Map-Reduce (page 282). For all approaches to aggregation, see Aggregation
Concepts (page 279).
mapping type A Structure in programming languages that associate keys with values, where keys may nest other
pairs of keys and values (e.g. dictionaries, hashes, maps, and associative arrays). The properties of these
structures depend on the language specification and implementation. Generally the order of keys in mapping
types is arbitrary and not guaranteed.
master The database that receives all writes in a conventional master-slave replication. In MongoDB, replica sets
replace master-slave replication for most use cases. For more information on master-slave replication, see
Master Slave Replication (page 413).
md5 A hashing algorithm used to efficiently provide reproducible unique strings to identify and checksum data.
MongoDB uses md5 to identify chunks of data for GridFS. See filemd5 (page 750).
MIME Multipurpose Internet Mail Extensions. A standard set of type and encoding definitions used to declare
the encoding and type of data in multiple data storage, transmission, and email contexts. The mongofiles
(page 986) tool provides an option to specify a MIME type to describe a file inserted into GridFS storage.
mongo The MongoDB shell. The mongo (page 942) process starts the MongoDB shell as a daemon connected to
either a mongod (page 925) or mongos (page 938) instance. The shell has a JavaScript interface. See mongo
(page 942) and mongo Shell Methods (page 806).
mongod The MongoDB database server. The mongod (page 925) process starts the MongoDB server as a daemon.
The MongoDB server manages data requests and formats and manages background operations. See mongod
(page 925).
MongoDB An open-source document-based database system. MongoDB derives from the word humongous
because of the databases ability to scale up with ease and hold very large amounts of data. MongoDB stores
documents in collections within databases.
mongos The routing and load balancing process that acts an interface between an application and a MongoDB
sharded cluster. See mongos (page 937).
namespace
The canonical name for a collection or index in MongoDB. The namespace is
a combination of the database name and the name of the collection or index, like so:
[database-name].[collection-or-index-name].
All documents belong to a namespace.
See What is a namespace in MongoDB? (page 585).
natural order The order that a database stores documents on disk. Typically, the order of documents on disks
reflects insertion order, except when a document moves internally because an update operation increases its
size. In capped collections, documents do not move internally, and therefore insertion order and natural order are
identical in capped collections. MongoDB returns documents in forward natural order for a find() (page 816)
query with no parameters. MongoDB returns documents in reverse natural order for a find() (page 816) query
sorted (page 872) with a parameter of $natural:-1. See $natural (page 693).
1023
ObjectId A special 12-byte BSON type that guarantees uniqueness within the collection. The ObjectID is generated
based on timestamp, machine ID, process ID, and a process-local incremental counter. MongoDB uses ObjectId
values as the default values for _id fields.
operator A keyword beginning with a $ used to express an update, complex query, or data transformation. For
example, $gt is the query languages greater than operator. For available operators, see Operators (page 621).
oplog A capped collection that stores an ordered history of logical writes to a MongoDB database. The oplog is the
basic mechanism enabling replication in MongoDB. See Replica Set Oplog (page 410).
ordered query plan A query plan that returns results in the order consistent with the sort() (page 872) order. See
Query Plans (page 45).
padding The extra space allocated to document on the disk to prevent moving a document when it grows as the result
of update() (page 849) operations. See Padding Factor (page 65).
padding factor An automatically-calibrated constant used to determine how much extra space MongoDB should
allocate per document container on disk. A padding factor of 1 means that MongoDB will allocate only the
amount of space needed for the document. A padding factor of 2 means that MongoDB will allocate twice the
amount of space required by the document. See Padding Factor (page 65).
page fault The event that occurs when a process requests stored data (i.e. a page) from memory that the operating
system has moved to disk. See What are page faults? (page 610).
partition A distributed system architecture that splits data into ranges. Sharding uses partitioning. See Data Partitioning (page 495).
passive member A member of a replica set that cannot become primary because its priority (page 481) is 0. See
Priority 0 Replica Set Members (page 386).
pcap A packet-capture format used by mongosniff (page 982) to record packets captured from network interfaces
and display them as human-readable MongoDB operations. See Options (page 982).
PID A process identifier. UNIX-like systems assign a unique-integer PID to each running process. You can use a
PID to inspect a running process and send signals to it. See /proc File System (page 227).
pipe A communication channel in UNIX-like systems allowing independent processes to send and receive data. In
the UNIX shell, piped operations allow users to direct the output of one command into the input of another.
pipeline A series of operations in an aggregation process. See Aggregation Concepts (page 279).
Point
A single coordinate pair as described in the GeoJSON Point specification: http://geojson.org/geojsonspec.html#point. To use a Point in MongoDB, see Store GeoJSON Objects (page 328).
Polygon An array of LinearRing coordinate arrays, as described in the GeoJSON Polygon specification:
http://geojson.org/geojson-spec.html#polygon. For Polygons with multiple rings, the first must be the exterior
ring and any others must be interior rings or holes.
MongoDB does not permit the exterior ring to self-intersect. Interior rings must be fully contained within the
outer loop and cannot intersect or overlap with each other. See Store GeoJSON Objects (page 328).
powerOf2Sizes A per-collection setting that changes and normalizes the way MongoDB allocates space for each
document, in an effort to maximize storage reuse and to reduce fragmentation. This is the default for TTL
Collections (page 158). See collMod (page 755) and usePowerOf2Sizes (page 755).
pre-splitting An operation performed before inserting data that divides the range of possible shard key values into
chunks to facilitate easy insertion and high write throughput. In some cases pre-splitting expedites the initial
distribution of documents in sharded cluster by manually dividing the collection rather than waiting for the
MongoDB balancer to do so. See Create Chunks in a Sharded Cluster (page 545).
1024
primary In a replica set, the primary member is the current master instance, which receives all write operations. See
Primary (page ??).
primary key A records unique immutable identifier. In an RDBMS, the primary key is typically an integer stored
in each rows id field. In MongoDB, the _id field holds a documents primary key which is usually a BSON
ObjectId.
primary shard The shard that holds all the un-sharded collections. See Primary Shard (page 501).
priority A configurable value that helps determine which members in a replica set are most likely to become primary.
See priority (page 481).
projection A document given to a query that specifies which fields MongoDB returns in the result set. See Limit
Fields to Return from a Query (page 72). For a list of projection operators, see Projection Operators (page 646).
query A read request. MongoDB uses a JSON-like query language that includes a variety of query operators with
names that begin with a $ character. In the mongo (page 942) shell, you can issue queries using the find()
(page 816) and findOne() (page 824) methods. See Read Operations (page 39).
query optimizer A process that generates query plans. For each query, the optimizer generates a plan that matches
the query to the index that will return results as efficiently as possible. The optimizer reuses the query plan each
time the query runs. If a collection changes significantly, the optimizer creates a new query plan. See Query
Plans (page 45).
RDBMS Relational Database Management System. A database management system based on the relational model,
typically using SQL as the query language.
read lock In the context of a reader-writer lock, a lock that while held allows concurrent readers but no writers. See
What type of locking does MongoDB use? (page 596).
read preference A setting that determines how clients direct read operations. Read preference affects all replica sets,
including shards. By default, MongoDB directs reads to primaries for strict consistency. However, you may
also direct reads to secondaries for eventually consistent reads. See Read Preference (page 405).
record size The space allocated for a document including the padding. For more information on padding, see Padding
Factor (page 65) and compact (page 752).
recovering A replica set member status indicating that a member is not ready to begin normal activities of a secondary
or primary. Recovering members are unavailable for reads.
replica pairs The precursor to the MongoDB replica sets.
Deprecated since version 1.6.
replica set A cluster of MongoDB servers that implements master-slave replication and automated failover. MongoDBs recommended replication strategy. See Replication (page 377).
replication A feature allowing multiple database servers to share the same data, thereby ensuring redundancy and
facilitating load balancing. See Replication (page 377).
replication lag The length of time between the last operation in the primarys oplog and the last operation applied to
a particular secondary. In general, you want to keep replication lag as small as possible. See Replication Lag
(page 463).
resident memory The subset of an applications memory currently stored in physical RAM. Resident memory is a
subset of virtual memory, which includes memory mapped to physical RAM and to disk.
REST An API design pattern centered around the idea of resources and the CRUD operations that apply to them.
Typically REST is implemented over HTTP. MongoDB provides a simple HTTP REST interface that allows
HTTP clients to run commands against the server. See REST Interface (page 139) and REST API (page 241).
rollback A process that reverts writes operations to ensure the consistency of all replica set members. See Rollbacks
During Replica Set Failover (page 401).
1025
secondary A replica set member that replicates the contents of the master database. Secondary members may handle
read requests, but only the primary members can handle write operations. See Secondaries (page ??).
secondary index A database index that improves query performance by minimizing the amount of work that the
query engine must perform to fulfill a query. See Indexes (page 313).
set name The arbitrary name given to a replica set. All members of a replica set must have the same name specified
with the replSet (page 1000) setting or the --replSet option.
shard A single mongod (page 925) instance or replica set that stores some portion of a sharded clusters total data
set. In production, all shards should be replica sets. See Shards (page 499).
shard key The field MongoDB uses to distribute documents among members of a sharded cluster. See Shard Keys
(page 506).
sharded cluster The set of nodes comprising a sharded MongoDB deployment. A sharded cluster consists of three
config processes, one or more replica sets, and one or more mongos (page 938) routing processes. See Sharded
Cluster Components (page 499).
sharding A database architecture that partitions data by key ranges and distributes the data among two or more
database instances. Sharding enables horizontal scaling. See Sharding (page 493).
shell helper A method in the mongo shell that provides a more concise syntax for a database command (page 694).
Shell helpers improve the general interactive experience. See mongo Shell Methods (page 806).
single-master replication A replication topology where only a single database instance accepts writes. Singlemaster replication ensures consistency and is the replication topology employed by MongoDB. See Replica Set
Primary (page 382).
slave A read-only database that replicates operations from a master database in conventional master/slave replication.
In MongoDB, replica sets replace master/slave replication for most use cases. However, for information on
master/slave replication, see Master Slave Replication (page 413).
split The division between chunks in a sharded cluster. See Chunk Splits in a Sharded Cluster (page 519).
SQL Structured Query Language (SQL) is a common special-purpose programming language used for interaction
with a relational database, including access control, insertions, updates, queries, and deletions. There are some
similar elements in the basic SQL syntax supported by different database vendors, but most implementations
have their own dialects, data types, and interpretations of proposed SQL standards. Complex SQL is generally
not directly portable between major RDBMS products. SQL is often used as metonym for relational databases.
SSD Solid State Disk. A high-performance disk drive that uses solid state electronics for persistence, as opposed to
the rotating platters and movable read/write heads used by traditional mechanical hard drives.
standalone An instance of mongod (page 925) that is running as a single server and not as part of a replica set. To
convert a standalone into a replica set, see Convert a Standalone to a Replica Set (page 432).
strict consistency A property of a distributed system requiring that all members always reflect the latest changes to
the system. In a database system, this means that any system that can provide data must reflect the latest writes
at all times. In MongoDB, reads from a primary have strict consistency; reads from secondary members have
eventual consistency.
sync The replica set operation where members replicate data from the primary. Sync first occurs when MongoDB
creates or restores a member, which is called initial sync. Sync then occurs continually to keep the member
updated with changes to the replica sets data. See Replica Set Data Synchronization (page 412).
syslog On UNIX-like systems, a logging process that provides a uniform standard for servers and processes to submit
logging information. MongoDB provides an option to send output to the hosts syslog system. See syslog
(page 992).
1026
tag A label applied to a replica set member or shard and used by clients to issue data-center-aware operations. For
more information on using tags with replica sets and with shards, see the following sections of this manual: Tag
Sets (page 407) and Behavior and Operations (page 557).
TSV A text-based data format consisting of tab-separated values. This format is commonly used to exchange data
between relational databases, since the format is well-suited to tabular data. You can import TSV files using
mongoimport (page 965).
TTL Stands for time to live and represents an expiration time or period for a given piece of information to remain
in a cache or other temporary storage before the system deletes it or ages it out. MongoDB has a TTL collection
feature. See Expire Data from Collections by Setting TTL (page 158).
unique index An index that enforces uniqueness for a particular field across a single collection. See Unique Indexes
(page 334).
unordered query plan A query plan that returns results in an order inconsistent with the sort() (page 872) order.
See Query Plans (page 45).
upsert An operation that will either update the first document matched by a query or insert a new document if
none matches. The new document will have the fields implied by the operation. You perform upserts with the
update() (page 849) operation. See Upsert Parameter (page 850).
virtual memory An applications working memory, typically residing on both disk an in physical RAM.
WGS84 The default datum MongoDB uses to calculate geometry over an Earth-like sphere. MongoDB uses the
WGS84 datum for geospatial queries on GeoJSON objects. See the EPSG:4326: WGS 84 specification:
http://spatialreference.org/ref/epsg/4326/.
working set The data that MongoDB uses most often. This data is preferably held in RAM, solid-state drive (SSD),
or other fast media. See What is the working set? (page 611).
write concern Specifies whether a write operation has succeeded. Write concern allows your application to detect
insertion errors or unavailable mongod (page 925) instances. For replica sets, you can configure write concern
to confirm replication to a specified number of members. See Write Concern (page 55).
write lock A lock on the database for a given writer. When a process writes to the database, it takes an exclusive write
lock to prevent other processes from writing or reading. For more information on locks, see FAQ: Concurrency
(page 596).
writeBacks The process within the sharding system that ensures that writes issued to a shard that is not responsible
for the relevant chunk get applied to the proper shard. For related information, see What does writebacklisten in
the log mean? (page 603) and writeBacksQueued (page 792).
See also:
The genindex may provide useful insight into the reference material in this manual.
1027
1028
CHAPTER 12
Release Notes
Always install the latest, stable version of MongoDB. See MongoDB Version Numbers (page 1090) for more information.
See the following release notes for an account of the changes in major versions. Release notes also include instructions
for upgrade.
1029
Developer Productivity
Aggregation Framework refinements include an overhaul of the underlying engine introduced in MongoDB
2.2 making it easier to leverage real-time, in-place analytics. MongoDB 2.4 includes significant performance
improvements, additional support for binary data, support for $geoWithin (page 635) and $near (page 637)
geospatial queries, improved string concatenation with the new $concat (page 681) operator, and improved
date calculation semantics.
Geospatial enhancements support new use cases with support for polygon intersection queries (with
$geoIntersects (page 637)), support for GeoJSON, and an improved spherical model. Learn more about
geospatial improvements in 2 (page 1039).
Text Search provides a simplified, integrated approach to incorporating search functionality into apps with
support for language specific stemming and stop words in 15 languages and real time indexes. Text search is
beta for 2.4 and is not recommended for production use. Learn more about text search (page 1039).
New update semantics for arrays with the $push (page 659) update operator. Applications can now use
$slice (page 661) to maintain fixed size arrays, and use $sort (page 661) to maintain sorted arrays. Learn
more about capped arrays (page 1043).
New $setOnInsert (page 654) update operator supports specifying fields to add only on insert and upsert
operations.
Ease of Operations
Hashed indexes and shard keys provide simple, even distribution for reads and writes. Learn more about hashed
indexes and shard keys (page 1039).
New serverStatus (page 782) metrics including a working set analysis tool makes capacity planning easier
for operations teams. Learn more about the new serverStatus metrics (page 1041) including the working set
analyzer (page 794).
More control for operators with the ability to terminate indexing operations with automatic resource cleanup.
Improved Performance
V8 JavaScript engine offers better performance and concurrency with JavaScript based actions including those
using the $where (page 634) query operator as well as mapReduce (page 701) and eval (page 722). Learn
more about MongoDB on V8 (page 1043), and JavaScript Changes in MongoDB 2.4 (page 1043).
Improvements to count (page 695) provide dramatically faster count operations. Counting is now up to 20
times faster for low cardinality index based counts.
Significant optimizations to $elemMatch (page 645) when using a multi-key index.
More Robust Security
Role-Based privileges allow organizations to assign more granular security policies for server, database and
cluster administration. Learn more about role based access control in MongoDB (page 265).
Kerberos authentication mechanism in MongoDB Enterprise.
1030
MongoDB Enterprise
MongoDB Enterprise is a commercial edition of MongoDB that includes enterprise-grade capabilities, such as advanced security features, management tools, software integrations and certifications. Available as part of the MongoDB Enterprise Subscription, this edition includes the most comprehensive SLA and a commercial license. Continue
reading for more information on MongoDB Enterprise2
Learning More
These features represent only a small portion of the improvements made in MongoDB 2.4. For more details see the
MongoDB 2.4 release notes (page 1029) and Jira for a complete list of all cases closed for MongoDB 2.4 sorted by
user votes3
Platform Support
For OS X, MongoDB 2.4 only supports OS X versions 10.6 (Snow Leopard) and later. There are no other platform
support changes in MongoDB 2.4. See the downloads page4 for more information on platform support.
Upgrade Process
Upgrade MongoDB to 2.4
In the general case, the upgrade from MongoDB 2.2 to 2.4 is a binary-compatible drop-in upgrade: shut down the
mongod (page 925) instances and replace them with mongod (page 925) instances running 2.4. However, before
you attempt any upgrade please familiarize yourself with the content of this document, particularly the procedure for
upgrading sharded clusters (page 1032) and the considerations for reverting to 2.2 after running 2.4 (page 1036).
Content
3 https://jira.mongodb.org/secure/IssueNavigator.jspa?reset=true&jqlQuery=project+%3D+SERVER+AND+fixVersion+in+%28%222.3.2%22,+%222.3.1%22,+%222
rc0%22,+%222.4.0-rc1%22,+%222.4.0-rc2%22,+%222.4.0-rc3%22%29+ORDER+BY+votes+DESC,+status+DESC,+priority+DESC
4 http://www.mongodb.org/downloads/
1031
If you have system.users (page 270) documents (i.e. for auth (page 993)) that you created before 2.4 you
must ensure that there are no duplicate values for the user field in the system.users (page 270) collection
in any database. If you do have documents with duplicate user fields, you must remove them before upgrading.
See Compatibility Change: User Uniqueness Enforced (page 1040) for more information.
Upgrade Standalone mongod Instance to MongoDB 2.4
1. Download binaries of the latest release in the 2.4 series from the MongoDB Download Page5 . See Install
MongoDB (page 3) for more information.
2. Shutdown your mongod (page 925) instance. Replace the existing binary with the 2.4 mongod (page 925)
binary and restart mongod (page 925).
Upgrade a Replica Set from MongoDB 2.2 to MongoDB 2.4 You can upgrade to 2.4 by performing a rolling upgrade of the set by upgrading the members individually while the other members are available to minimize downtime.
Use the following procedure:
1. Upgrade the secondary members of the set one at a time by shutting down the mongod (page 925) and replacing the 2.2 binary with the 2.4 binary. After upgrading a mongod (page 925) instance, wait for the member to recover to SECONDARY state before upgrading the next instance. To check the members state, issue
rs.status() (page 898) in the mongo (page 942) shell.
2. Use the mongo (page 942) shell method rs.stepDown() (page 899) to step down the primary to allow the
normal failover (page 396) procedure. rs.stepDown() (page 899) expedites the failover procedure and is
preferable to shutting down the primary directly.
Once the primary has stepped down and another member has assumed PRIMARY state, as observed in the output
of rs.status() (page 898), shut down the previous primary and replace mongod (page 925) binary with
the 2.4 binary and start the new process.
Note: Replica set failover is not instant but will render the set unavailable to read or accept writes until the
failover process completes. Typically this takes 10 seconds or more. You may wish to plan the upgrade during
a predefined maintenance window.
1032
Warning:
Before you start the upgrade, ensure that the amount of free space on the filesystem for the config database
(page 564) is 4 to 5 times the amount of space currently used by the config database (page 564) data files.
Additionally, ensure that all indexes in the config database (page 564) are {v:1} indexes. If a critical index
is a {v:0} index, chunk splits can fail due to known issues with the {v:0} format. {v:0} indexes are
present on clusters created with MongoDB 2.0 or earlier.
The duration of the metadata upgrade depends on the network latency between the node performing the
upgrade and the three config servers. Ensure low latency between the upgrade process and the config servers.
While the upgrade is in progress, you cannot make changes to the collection meta-data. For example, during
the upgrade, do not perform:
sh.enableSharding() (page 905),
sh.shardCollection() (page 908),
sh.addShard() (page 903),
db.createCollection() (page 878),
db.collection.drop() (page 812),
db.dropDatabase() (page 884),
any operation that creates a database, or
any other operation that modifies the cluster meta-data in any way. See Sharding Reference (page 563)
for a complete list of sharding commands. Note, however, that not all commands on the Sharding
Reference (page 563) page modifies the cluster meta-data.
Once you upgrade to 2.4 and complete the upgrade procedure do not use 2.0 mongod (page 925) and
mongos (page 938) processes in your cluster. 2.0 process may re-introduce old meta-data formats into
cluster meta-data.
Note: The upgraded config database will require more storage space than before, to make backup and working copies
of the config.chunks (page 566) and config.collections (page 567) collections. As always, if storage
requirements increase, the mongod (page 925) might need to pre-allocate additional data files. See What tools can I
use to investigate storage use in MongoDB? (page 611) for more information.
Meta-Data Upgrade Procedure Changes to the meta-data format for sharded clusters, stored in the config database
(page 564), require a special meta-data upgrade procedure when moving to 2.4.
Do not perform operations that modify meta-data while performing this procedure. See Upgrade a Sharded Cluster
from MongoDB 2.2 to MongoDB 2.4 (page 1032) for examples of prohibited operations.
1. Before you start the upgrade, ensure that the amount of free space on the filesystem for the config database
(page 564) is 4 to 5 times the amount of space currently used by the config database (page 564) data files.
Additionally, ensure that all indexes in the config database (page 564) are {v:1} indexes. If a critical index is
a {v:0} index, chunk splits can fail due to known issues with the {v:0} format. {v:0} indexes are present
on clusters created with MongoDB 2.0 or earlier.
The duration of the metadata upgrade depends on the network latency between the node performing the upgrade
and the three config servers. Ensure low latency between the upgrade process and the config servers.
To check the version of your indexes, use db.collection.getIndexes() (page 826).
If any index on the config database is {v:0}, you should rebuild those indexes by connecting to the
mongos (page 938) and either: rebuild all indexes using the db.collection.reIndex() (page 844)
method, or drop and rebuild specific indexes using db.collection.dropIndex() (page 812) and then
db.collection.ensureIndex() (page 814). If you need to upgrade the _id index to {v:1} use
db.collection.reIndex() (page 844).
You may have {v:0} indexes on other databases in the cluster.
2. Turn off the balancer (page 516) in the sharded cluster, as described in Disable the Balancer (page 552).
12.1. Current Stable Release
1033
Optional
For additional security during the upgrade, you can make a backup of the config database using mongodump
(page 951) or other backup tools.
3. Ensure there are no version 2.0 mongod (page 925) or mongos (page 938) processes still active in the sharded
cluster. The automated upgrade process checks for 2.0 processes, but network availability can prevent a definitive check. Wait 5 minutes after stopping or upgrading version 2.0 mongos (page 938) processes to confirm
that none are still active.
4. Start a single 2.4 mongos (page 938) process with configdb (page 1001) pointing to the sharded clusters
config servers (page 502) and with the --upgrade option. The upgrade process happens before the process
becomes a daemon (i.e. before --fork.)
You can upgrade an existing mongos (page 938) instance to 2.4 or you can start a new mongos instance that
can reach all config servers if you need to avoid reconfiguring a production mongos (page 938).
Start the mongos (page 938) with a command that resembles the following:
mongos --configdb <config servers> --upgrade
Without the --upgrade option 2.4 mongos (page 938) processes will fail to start until the upgrade process is
complete.
The upgrade will prevent any chunk moves or splits from occurring during the upgrade process. If there are
very many sharded collections or there are stale locks held by other failed processes, acquiring the locks for all
collections can take seconds or minutes. See the log for progress updates.
5. When the mongos (page 938) process starts successfully, the upgrade is complete. If the mongos (page 938)
process fails to start, check the log for more information.
If the mongos (page 938) terminates or loses its connection to the config servers during the upgrade, you may
always safely retry the upgrade.
However, if the upgrade failed during the short critical section, the mongos (page 938) will exit and report that
the upgrade will require manual intervention. To continue the upgrade process, you must follow the Resync
after an Interruption of the Critical Section (page 1035) procedure.
Optional
If the mongos (page 938) logs show the upgrade waiting for the upgrade lock, a previous upgrade process may
still be active or may have ended abnormally. After 15 minutes of no remote activity mongos (page 938) will
force the upgrade lock. If you can verify that there are no running upgrade processes, you may connect to a 2.2
mongos (page 938) process and force the lock manually:
mongo <mongos.example.net>
db.getMongo().getCollection("config.locks").findOne({ _id : "configUpgrade" })
If the process specified in the process field of this document is verifiably offline, run the following operation
to force the lock.
It is always more safe to wait for the mongos (page 938) to verify that the lock is inactive, if you have any
doubts about the activity of another upgrade operation. In addition to the configUpgrade, the mongos
(page 938) may need to wait for specific collection locks. Do not force the specific collection locks.
1034
6. Upgrade and restart other mongos (page 938) processes in the sharded cluster, without the --upgrade option.
See Complete Sharded Cluster Upgrade (page 1036) for more information.
7. Re-enable the balancer (page 552). You can now perform operations that modify cluster meta-data.
Once you have upgraded, do not introduce version 2.0 MongoDB processes into the sharded cluster. This can reintroduce old meta-data formats into the config servers. The meta-data change made by this upgrade process will help
prevent errors caused by cross-version incompatibilities in future versions of MongoDB.
Resync after an Interruption of the Critical Section During the short critical section of the upgrade that applies
changes to the meta-data, it is unlikely but possible that a network interruption can prevent all three config servers
from verifying or modifying data. If this occurs, the config servers (page 502) must be re-synced, and there may be
problems starting new mongos (page 938) processes. The sharded cluster will remain accessible, but avoid all cluster
meta-data changes until you resync the config servers. Operations that change meta-data include: adding shards,
dropping databases, and dropping collections.
Note: Only perform the following procedure if something (e.g. network, power, etc.) interrupts the upgrade process
during the short critical section of the upgrade. Remember, you may always safely attempt the meta data upgrade
procedure (page 1033).
To resync the config servers:
1. Turn off the balancer (page 516) in the sharded cluster and stop all meta-data operations. If you are in the
middle of an upgrade process (Upgrade a Sharded Cluster from MongoDB 2.2 to MongoDB 2.4 (page 1032)),
you have already disabled the balancer.
2. Shut down two of the three config servers, preferably the last two listed in the
configdb (page 1001) string.
For example, if your configdb (page 1001) string is
configA:27019,configB:27019,configC:27019, shut down configB and configC. Shutting down the last two config servers ensures that most mongos (page 938) instances will have uninterrupted
access to cluster meta-data.
3. mongodump (page 951) the data files of the active config server (configA).
4. Move the data files of the deactivated config servers (configB and configC) to a backup location.
5. Create new, empty data directories.
6. Restart the disabled config servers with --dbpath pointing to the now-empty data directory and --port
pointing to an alternate port (e.g. 27020).
7. Use mongorestore (page 956) to repopulate the data files on the disabled documents from the active config
server (configA) to the restarted config servers on the new port (configB:27020,configC:27020).
These config servers are now re-synced.
8. Restart the restored config servers on the old port, resetting the port back to the old settings (configB:27019
and configC:27019).
9. In some cases connection pooling may cause spurious failures, as the mongos (page 938) disables old connections only after attempted use. 2.4 fixes this problem, but to avoid this issue in version 2.2, you can restart
all mongos (page 938) instances (one-by-one, to avoid downtime) and use the rs.stepDown() (page 899)
method before restarting each of the shard replica set primaries.
10. The sharded cluster is now fully resynced; however before you attempt the upgrade process again, you must
manually reset the upgrade state using a version 2.2 mongos (page 938). Begin by connecting to the 2.2
mongos (page 938) with the mongo (page 942) shell:
mongo <mongos.example.net>
1035
11. Finally retry the upgrade process, as in Upgrade a Sharded Cluster from MongoDB 2.2 to MongoDB 2.4
(page 1032).
Complete Sharded Cluster Upgrade After you have successfully completed the meta-data upgrade process described in Meta-Data Upgrade Procedure (page 1033), and the 2.4 mongos (page 938) instance starts, you can
upgrade the other processes in your MongoDB deployment.
While the balancer is still disabled, upgrade the components of your sharded cluster in the following order:
Upgrade all mongos (page 938) instances in the cluster, in any order.
Upgrade all 3 mongod (page 925) config server instances, upgrading the first system in the mongos
--configdb argument last.
Upgrade each shard, one at a time, upgrading the mongod (page 925) secondaries before running
replSetStepDown (page 730) and upgrading the primary of each shard.
When this process is complete, you can now re-enable the balancer (page 552).
Rolling Upgrade Limitation for 2.2.0 Deployments Running with auth Enabled MongoDB cannot support
deployments that mix 2.2.0 and 2.4.0, or greater, components. MongoDB version 2.2.1 and later processes can exist in
mixed deployments with 2.4-series processes. Therefore you cannot perform a rolling upgrade from MongoDB 2.2.0
to MongoDB 2.4.0. To upgrade a cluster with 2.2.0 components, use one of the following procedures.
1. Perform a rolling upgrade of all 2.2.0 processes to the latest 2.2-series release (e.g. 2.2.3) so that there are no
processes in the deployment that predate 2.2.1. When there are no 2.2.0 processes in the deployment, perform a
rolling upgrade to 2.4.0.
2. Stop all processes in the cluster. Upgrade all processes to a 2.4-series release of MongoDB, and start all processes at the same time.
Upgrade from 2.3 to 2.4 If you used a mongod (page 925) from the 2.3 or 2.4-rc (release candidate) series, you
can safely transition these databases to 2.4.0 or later; however, if you created 2dsphere or text indexes using a
mongod (page 925) before v2.4-rc2, you will need to rebuild these indexes. For example:
db.records.dropIndex( { loc: "2dsphere" } )
db.records.dropIndex( "records_text" )
db.records.ensureIndex( { loc: "2dsphere" } )
db.records.ensureIndex( { records: "text" } )
Downgrade MongoDB from 2.4 to Previous Versions For some cases the on-disk format of data files used by 2.4
and 2.2 mongod (page 925) is compatible, and you can upgrade and downgrade if needed. However, several new
features in 2.4 are incompatible with previous versions:
2dsphere indexes are incompatible with 2.2 and earlier mongod (page 925) instances.
text indexes are incompatible with 2.2 and earlier mongod (page 925) instances.
using a hashed index as a shard key are incompatible with 2.2 and earlier mongos (page 938) instances.
hashed indexes are incompatible with 2.0 and earlier mongod (page 925) instances.
1036
Important: Collections sharded using hashed shard keys, should not use 2.2 mongod (page 925) instances, which
cannot correctly support cluster operations for these collections.
If you completed the meta-data upgrade for a sharded cluster (page 1032), you can safely downgrade to 2.2 MongoDB
processes. Do not use 2.0 processes after completing the upgrade procedure.
Note: In sharded clusters, once you have completed the meta-data upgrade procedure (page 1032), you cannot use
2.0 mongod (page 925) or mongos (page 938) instances in the same cluster.
If you complete the meta-data upgrade, you can have a mixed cluster that has both 2.2 and 2.4 mongod (page 925)
and mongos (page 938) instances, if needed. However, do not create 2dsphere or text indexes in a cluster that
has 2.2 components.
Considerations and Compatibility If you upgrade to MongoDB 2.4, and then need to run MongoDB 2.2 with the
same data files, consider the following limitations.
If you use a hashed index as the shard key index, which is only possible under 2.4 you will not be able to
query data in this sharded collection. Furthermore, a 2.2 mongos (page 938) cannot properly route an insert
operation for a collections sharded using a hashed index for the shard key index: any data that you insert using
a 2.2 mongos (page 938), will not arrive on the correct shard and will not be reachable by future queries.
If you never create an 2dsphere or text index, you can move between a 2.4 and 2.2 mongod (page 925) for
a given data set; however, after you create the first 2dsphere or text index with a 2.4 mongod (page 925)
you will need to run a 2.2 mongod (page 925) with the --upgrade option and drop any 2dsphere or text
index.
Upgrade and Downgrade Procedures
Basic Downgrade and Upgrade
Except as described below, moving between 2.2 and 2.4 is a drop-in replacement:
stop the existing mongod (page 925), using the --shutdown option as follows:
mongod --dbpath /var/mongod/data --shutdown
Then, you will need to drop any existing 2dsphere or text indexes using db.collection.dropIndex()
(page 812), for example:
db.records.dropIndex( { loc: "2dsphere" } )
db.records.dropIndex( "records_text" )
1037
Warning: --upgrade will run repairDatabase (page 757) on any database where you have created a
2dsphere or text index, which will rebuild all indexes.
Troubleshooting Upgrade/Downgrade Operations If you do not use --upgrade, when you attempt to start a 2.2
mongod (page 925) and you have created a 2dsphere or text index, mongod (page 925) will return the following
message:
'need to upgrade database index_plugin_upgrade with pdfile version 4.6, new version: 4.5 Not upgradin
While running 2.4, to check the data file version of a MongoDB database, use the following operation in the shell:
db.getSiblingDB('<databasename>').stats().dataFileVersion
The major data file 6 version for both 2.2 and 2.4 is 4, the minor data file version for 2.2 is 5 and the minor data file
version for 2.4 is 6 after you create a 2dsphere or text index.
Compatibility and Index Type Changes in MongoDB 2.4
In 2.4 MongoDB includes two new features related to indexes that users upgrading to version 2.4 must consider,
particularly with regard to possible downgrade paths. For more information on downgrades, see Downgrade MongoDB
from 2.4 to Previous Versions (page 1036).
New Index Types In 2.4 MongoDB adds two new index types: 2dsphere and text. These index types do not
exist in 2.2, and for each database, creating a 2dsphere or text index, will upgrade the data-file version and make
that database incompatible with 2.2.
If you intend to downgrade, you should always drop all 2dsphere and text indexes before moving to 2.2.
You can use the downgrade procedure (page 1036) to downgrade these databases and run 2.2 if needed, however this
will run a full database repair (as with repairDatabase (page 757),) for all affected databases.
Index Type Validation In MongoDB 2.2 and earlier you could specify invalid index types that did not exist. In
these situations, MongoDB would create an ascending (e.g. 1) index. Invalid indexes include index types specified by
strings that do not refer to an existing index type, and all numbers other than 1 and -1. 7
In 2.4, creating any invalid index will result in an error. Furthermore, you cannot create a 2dsphere or text index
on a collection if its containing database has any invalid index types. 1
Example
If you attempt to add an invalid index in MongoDB 2.4, as in the following:
db.coll.ensureIndex( { field: "1" } )
The data file version (i.e. pdfile version) is independent and unrelated to the release version of MongoDB.
In 2.4, indexes that specify a type of "1" or "-1" (the strings "1" and "-1") will continue to exist, despite a warning on start-up. However,
a secondary in a replica set cannot complete an initial sync from a primary that has a "1" or "-1" index. Avoid all indexes with invalid types.
7
1038
"ok": 1
}
See Upgrade MongoDB to 2.4 (page 1031) for full upgrade instructions.
Changes
Major Features
Text Search MongoDB 2.4 adds text search of content in MongoDB databases as a beta feature. With the new text
index (page 332), and supporting, text (page 715) command you can search for text using boolean queries in data
stored in MongoDB, using an index that updates in real-time and is always consistent with the data set. See Text
Indexes (page 332) for more information about text index and text search in MongoDB.
Note: You cannot combine the text (page 715) command, which requires a special text index (page 332), with a
query operator that requires a different type of special index. For example you cannot combine text (page 715) with
the $near (page 637) operator.
New Geospatial Indexes with GeoJSON and Improved Spherical Geometry MongoDB adds the new
2dsphere geospatial index in addition to the existing 2d index. The 2dsphere index supports improved spherical
queries and supports the following GeoJSON8 objects:
Point
LineString
Polygon
The 2dsphere index supports all current geospatial query operators (page 635) and introduces the following new
query operator for queries on GeoJSON data:
$geoWithin (page 635) operator
$geoIntersects (page 637) operator
The operators use the new $geometry (page 639) parameter.
The $within (page 636) operator no longer requires a geospatial index. Additionally, 2.4 deprecates the $within
(page 636) operator. Use $geoWithin (page 635) operator instead.
For more information on geospatial indexes in 2.4, see:
Geospatial Indexes and Queries (page 326)
2d Index Internals (page 330)
New Hashed Index and Sharding with a Hashed Shard Key To support an easy to configure and evenly distributed
shard key, version 2.4 adds a new hashed index type that indexes documents using hashes of field values.
See Hashed Index (page 333) for documentation of hashed indexes, and Hashed Shard Keys (page 506) for documentation of hash-based sharding.
8 http://geojson.org/geojson-spec.html
1039
Security Improvements
SASL Library Change In 2.4.4, MongoDB Enterprise uses Cyrus SASL. Earlier 2.4 Enterprise versions use GNU
SASL (libgsasl). To upgrade to 2.4.4 MongoDB Enterprise or greater, you must install all package dependencies
related to this change, including the appropriate Cyrus SASL GSSAPI library. See Install MongoDB Enterprise
(page 20) for details of the dependencies.
Role Based Access Control and New Privilege Documents MongoDB 2.4 introduces a role based access control
system that provides more granular privileges to MongoDB users. See User Privilege Roles in MongoDB (page 265)
for more information.
To support the new access control system, 2.4 also introduces a new format for documents in a databases
system.users (page 270) collection. See system.users Privilege Documents (page 270) for more information.
Use supportCompatibilityFormPrivilegeDocuments (page 1007) to disable the legacy privilege documents, which MongoDB continues to support in 2.4.
Enhanced SSL Support In 2.4, MongoDB instances can optionally require clients to provide SSL certificates signed
by a Certificate Authority. You must use the MongoDB distribution that supports SSL, and your client driver must
support SSL. See Connect to MongoDB with SSL (page 249) for more information.
Compatibility Change: User Uniqueness Enforced 2.4 now enforces uniqueness of the user field in user privilege documents (i.e. in the system.users (page 270) collection.) Previous versions of MongoDB did not enforce
this requirement, and existing databases may have duplicates.
Administration Changes
--setParameter Option Available on the mongos and mongod Command Line You can now use
--setParameter on the command line and setParameter (page 999) in the configuration file. For mongod
(page 925) the following options are available using setParameter (page 999):
enableLocalhostAuthBypass (page 1005)
enableTestCommands (page 1005)
9 http://www.mongodb.com/products/mongodb-enterprise
10 http://docs.mongodb.org/ecosystem/tutorial/authenticate-with-java-driver/
11 http://docs.mongodb.org/ecosystem/tutorial/authenticate-with-csharp-driver/
1040
1041
Increased Chunk Migration Write Concern By default, all insert and delete operations that occur as part of a
chunk migration in a sharded cluster will have an increased write concern, to ensure that at least one secondary
acknowledges each insert and deletion operation. This change slows the potential speed of a chunk migration, but
increases reliability and ensures that a large number of chunk migrations cannot affect the availability of a sharded
cluster.
BSON Document Validation Enabled by Default for mongod and mongorestore Starting in 2.4, MongoDB
enables basic BSON object validation for mongod (page 925) and mongorestore (page 956) when writing to MongoDB data files. This prevents any client from inserting invalid or malformed BSON into a MongoDB database. For
objects with a high degree of sub-document nesting this validation may have a small performance impact. objcheck
(page 992), which was previously disabled by default, provides this validation.
Indexing Changes
Support for Multiple Concurrent Index Builds A single mongod (page 925) instance can build multiple indexes
in the background at the same time. See building indexes in the background (page 336) for more information on
background index builds. Foreground index builds hold a database lock and must proceed one at a time.
db.killOp() Can Now Kill Foreground Index Builds The db.killOp() (page 890) method will now terminate a foreground index build, in addition to the other operations supported in previous versions.
Improved Validation of Index Types Before 2.4, mongod (page 925) would create an ascending scalar index (e.g.
{ a : 1 }) when users attempted to create an index of a type that did not exist. Creating an index of an invalid
index type will generate an error in 2.4.
See Compatibility and Index Type Changes in MongoDB 2.4 (page 1038) for more information.
Interface Changes
$setOnInsert New Update Operator To set fields only when an upsert (page 849) performs an insert, use
the $setOnInsert (page 654) operator with the upsert (page 849) .
Example
A collection named coll has no documents with _id equal to 1.
The following upsert (page 849) operation inserts a document and applies the $setOnInsert (page 654) operator
to set the fields x and y:
db.coll.update( { _id: 1 },
{ $setOnInsert: { x: 25, y: 30 } },
{ upsert: true } )
The newly-inserted document has the field x set to 25 and the field y set to 30:
{ "_id" : 1, "x" : 25, "y" : 30 }
Note: The $setOnInsert (page 654) operator performs no operation for upserts (page 849) that only perform
an update and for updates (page 849) when the upsert option is false.
1042
Limit Number of Elements in an Array In 2.4, by using the $push (page 659) operator with the $each
(page 660), the $sort (page 661), and the $slice (page 661) modifiers, you can add multiple elements to an array,
sort and limit the number of elements in the modified array to maintain an array with a fixed number of elements.
See Limit Number of Elements in an Array after an Update (page 89) for an example where an update maintains the
top three scores for a student.
See also:
The following pages provide additional information and examples:
$push (page 659) operator
$each (page 660) modifier
$sort (page 661) modifier
$slice (page 661) modifier
JavaScript Engine Changed to V8
JavaScript Changes in MongoDB 2.4
(page 1043) in MongoDB 2.4:
Improved Concurrency Previously, MongoDB operations that required the JavaScript interpreter had to acquire
a lock, and a single mongod (page 925) could only run a single JavaScript operation at a time. The switch to V8
improves concurrency by permitting multiple JavaScript operations to run at the same time.
Modernized JavaScript Implementation (ES5) The 5th edition of ECMAscript12 , abbreviated as ES5, adds many
new language features, including:
standardized JSON13 ,
strict mode14 ,
function.bind()15 ,
array extensions16 , and
getters and setters.
With V8, MongoDB supports the ES5 implementation of Javascript with the following exceptions.
Note: The following features do not work as expected on documents returned from MongoDB queries:
Object.seal() throws an exception on documents returned from MongoDB queries.
Object.freeze() throws an exception on documents returned from MongoDB queries.
Object.preventExtensions() incorrectly allows the addition of new properties on documents returned
from MongoDB queries.
enumerable properties, when added to documents returned from MongoDB queries, are not saved during
write operations.
12 http://www.ecma-international.org/publications/standards/Ecma-262.htm
13 http://www.ecma-international.org/ecma-262/5.1/#sec-15.12.1
14 http://www.ecma-international.org/ecma-262/5.1/#sec-4.2.2
15 http://www.ecma-international.org/ecma-262/5.1/#sec-15.3.4.5
16 http://www.ecma-international.org/ecma-262/5.1/#sec-15.4.4.16
1043
Removed Non-Standard SpiderMonkey Features V8 does not support the following non-standard SpiderMonkey21 JavaScript extensions, previously supported by MongoDBs use of SpiderMonkey as its JavaScript engine.
E4X Extensions V8 does not support the non-standard E4X22 extensions. E4X provides a native XML23 object to
the JavaScript language and adds the syntax for embedding literal XML documents in JavaScript code.
You need to use alternative XML processing if you used any of the following constructors/methods:
XML()
Namespace()
QName()
XMLList()
isXMLName()
Destructuring Assignment V8 does not support the non-standard destructuring assignments. Destructuring assignment extract[s] data from arrays or objects using a syntax that mirrors the construction of array and object literals. Mozilla docs24
Example
The following destructuring assignment is invalid with V8 and throws a SyntaxError:
original = [4, 8, 15];
var [b, ,c] = a; // <== destructuring assignment
print(b) // 4
print(c) // 15
Iterator(), StopIteration(), and Generators V8 does not support Iterator(), StopIteration(), and generators25 .
InternalError() V8 does not support InternalError(). Use Error() instead.
for each...in Construct V8 does not support the use of for each...in26 construct. Use for (var x in y)
construct instead.
Example
17 https://jira.mongodb.org/browse/SERVER-8216
18 https://jira.mongodb.org/browse/SERVER-8223
19 https://jira.mongodb.org/browse/SERVER-8215
20 https://jira.mongodb.org/browse/SERVER-8214
21 https://developer.mozilla.org/en-US/docs/SpiderMonkey
22 https://developer.mozilla.org/en-US/docs/E4X
23 https://developer.mozilla.org/en-US/docs/E4X/Processing_XML_with_E4X
24 https://developer.mozilla.org/en-US/docs/JavaScript/New_in_JavaScript/1.7#Destructuring_assignment_(Merge_into_own_page.2Fsection)
25 https://developer.mozilla.org/en-US/docs/JavaScript/Guide/Iterators_and_Generators
26 https://developer.mozilla.org/en-US/docs/JavaScript/Reference/Statements/for_each...in
1044
Instead, in version 2.4, you can use the for (var x in y) construct:
var o = { name: 'MongoDB', version: 2.4 };
for (var prop in o) {
var value = o[prop];
print(value);
}
You can also use the array instance method forEach() with the ES5 method Object.keys():
Object.keys(o).forEach(function (key) {
var value = o[key];
print(value);
});
Instead, you can implement using the Array instance method forEach() and the ES5 method Object.keys()
:
var a = { w: 1, x: 2, y: 3, z: 4 }
var arr = [];
Object.keys(a).forEach(function (key) {
var val = a[key];
if (val > 2) arr.push(val * val);
})
printjson(arr)
Note:
The new logic uses the Array instance method forEach() and not the generic method
Array.forEach(); V8 does not support Array generic methods. See Array Generic Methods (page 1047) for
more information.
27 https://developer.mozilla.org/en-US/docs/JavaScript/Guide/Predefined_Core_Objects#Array_comprehensions
1045
Multiple Catch Blocks V8 does not support multiple catch blocks and will throw a SyntaxError.
Example
The following multiple catch blocks is invalid with V8 and will throw "SyntaxError:
if":
Unexpected token
try {
something()
} catch (err if err instanceof SomeError) {
print('some error')
} catch (err) {
print('standard error')
}
Conditional Function Definition V8 will produce different outcomes than SpiderMonkey with conditional function
definitions28 .
Example
The following conditional function definition produces different outcomes in SpiderMonkey versus V8:
function test () {
if (false) {
function go () {};
}
print(typeof go)
}
With SpiderMonkey, the conditional function outputs undefined, whereas with V8, the conditional function outputs
function.
If your code defines functions this way, it is highly recommended that you refactor the code. The following example
refactors the conditional function definition to work in both SpiderMonkey and V8.
function test () {
var go;
if (false) {
go = function () {}
}
print(typeof go)
}
1046
SyntaxError: In strict mode code, functions can only be declared at top level or immediately within a
String Generic Methods V8 does not support String generics30 . String generics are a set of methods on the String
class that mirror instance methods.
Example
The following use of the generic method String.toLowerCase() is invalid with V8:
var name = 'MongoDB';
var lower = String.toLowerCase(name);
With V8, use the String instance method toLowerCase() available through an instance of the String class
instead:
var name = 'MongoDB';
var lower = name.toLowerCase();
print(name + ' becomes ' + lower);
With V8, use the String instance methods instead of following generic methods:
String.charAt()
String.charCodeAt()
String.concat()
String.endsWith()
String.indexOf()
String.lastIndexOf()
String.localeCompare()
String.match()
String.quote()
String.replace()
String.search()
String.slice()
String.split()
String.startsWith()
String.substr()
String.substring()
String.toLocaleLowerCase()
String.toLocaleUpperCase()
String.toLowerCase()
String.toUpperCase()
String.trim()
String.trimLeft()
String.trimRight()
Array Generic Methods V8 does not support Array generic methods31 . Array generics are a set of methods on the
Array class that mirror instance methods.
Example
The following use of the generic method Array.every() is invalid with V8:
var arr = [4, 8, 15, 16, 23, 42];
function isEven (val) {
return 0 === val % 2;
}
var allEven = Array.every(arr, isEven);
print(allEven);
With V8, use the Array instance method every() available through an instance of the Array class instead:
30 https://developer.mozilla.org/en-US/docs/JavaScript/Reference/Global_Objects/String#String_generic_methods
31 https://developer.mozilla.org/en-US/docs/JavaScript/Reference/Global_Objects/Array#Array_generic_methods
1047
With V8, use the Array instance methods instead of the following generic methods:
Array.concat()
Array.every()
Array.filter()
Array.forEach()
Array.indexOf()
Array.join()
Array.lastIndexOf()
Array.map()
Array.pop()
Array.push()
Array.reverse()
Array.shift()
Array.slice()
Array.some()
Array.sort()
Array.splice()
Array.unshift()
Array Instance Method toSource() V8 does not support the Array instance method toSource()32 . Use the
Array instance method toString() instead.
uneval() V8 does not support the non-standard method uneval(). Use the standardized JSON.stringify()33
method instead.
In 2.4 the default JavaScript engine in the mongo (page 942) shell mongod (page 925) is now V8. This change affects
all JavaScript behavior including the mapReduce (page 701), group (page 697), and eval (page 722) commands,
as well as the $where (page 634) query operator.
Use the new interpreterVersion() method in the mongo (page 942) shell and the javascriptEngine
(page 763) field in the output of db.serverBuildInfo() (page 893) to determine which JavaScript engine a
MongoDB binary uses.
The primary impacts of the change from the previous JavaScript engine, SpiderMonkey, to V8 are:
improved concurrency for JavaScript operations,
modernized JavaScript implementation, and
removed non-standard SpiderMonkey features.
See JavaScript Changes in MongoDB 2.4 (page 1043) for more information about all changes .
Additional Limitations for Map-Reduce and $where Operations In MongoDB 2.4, map-reduce
operations (page 701), the group (page 697) command, and $where (page 634) operator expressions cannot
access certain global functions or properties, such as db, that are available in the mongo (page 942) shell.
When upgrading to MongoDB 2.4, you will need to refactor your code if your map-reduce operations
(page 701), group (page 697) commands, or $where (page 634) operator expressions include any global shell
functions or properties that are no longer available, such as db.
The following JavaScript functions and properties are available to map-reduce operations (page 701), the
group (page 697) command, and $where (page 634) operator expressions in MongoDB 2.4:
32 https://developer.mozilla.org/en-US/docs/JavaScript/Reference/Global_Objects/Array/toSource
33 https://developer.mozilla.org/en-US/docs/JavaScript/Reference/Global_Objects/JSON/stringify
1048
Available Properties
Available Functions
args
MaxKey
MinKey
assert()
BinData()
DBPointer()
DBRef()
doassert()
emit()
gc()
HexData()
hex_md5()
isNumber()
isObject()
ISODate()
isString()
Map()
MD5()
NumberInt()
NumberLong()
ObjectId()
print()
printjson()
printjsononeline()
sleep()
Timestamp()
tojson()
tojsononeline()
tojsonObject()
UUID()
version()
Improvements to the Aggregation Framework MongoDB 2.4 introduces a number of additional functionality and
improved performance for the Aggregation Framework (page 279). Consider the following additions in 2.4:
$match (page 666) queries now support the $geoWithin (page 635) operator for bounded geospatial queries.
The new $geoNear (page 671) pipeline stage to support geospatial queries.
$min (page 675) operator only considers non-null and existing field values. If all the values for a field are null
or are missing, the operator returns null for the minimum value.
For sort operations where the $sort (page 670) stage immediately precedes a $limit (page 667) in the
pipeline, the MongoDB can perform a more efficient sort that does not require keeping the entire result set in
memory.
The new $millisecond (page 686) operator returns the millisecond portion of a date.
The new $concat (page 681) operator concatenates array of strings.
Additional Resources
MongoDB Downloads34 .
Whats New in MongoDB 2.4 (page 1029).
All JIRA issues resolved in 2.435 .
All Backwards incompatible changes36 .
All Third Party License Notices37 .
34 http://mongodb.org/downloads
35 https://jira.mongodb.org/secure/IssueNavigator.jspa?reset=true&jqlQuery=project+%3D+SERVER+AND+fixVersion+in+%28%222.3.2%22,+%222.3.1%22,+%222
rc0%22,+%222.4.0-rc1%22,+%222.4.0-rc2%22,+%222.4.0-rc3%22%29
36 https://jira.mongodb.org/secure/IssueNavigator.jspa?reset=true&jqlQuery=project+%3D+SERVER+AND+fixVersion+in+%28%222.3.2%22%2C+%222.3.1%22%2
rc0%22%2C+%222.4.0-rc1%22%2C+%222.4.0-rc2%22%2C+%222.4.0-rc3%22%29+AND+%22Backward+Breaking%22+in+%28+Rarely+%2C+sometimes%2C+yes
37 https://github.com/mongodb/mongo/blob/v2.4/distsrc/THIRD-PARTY-NOTICES
1049
Upgrading
MongoDB 2.2 is a production release series and succeeds the 2.0 production release series.
MongoDB 2.0 data files are compatible with 2.2-series binaries without any special migration process. However,
always perform the upgrade process for replica sets and sharded clusters using the procedures that follow.
Always upgrade to the latest point release in the 2.2 point release. Currently the latest release of MongoDB is 2.4.6.
Synopsis
mongod (page 925), 2.2 is a drop-in replacement for 2.0 and 1.8.
Check your driver (page 95) documentation for information regarding required compatibility upgrades, and
always run the recent release of your driver.
Typically, only users running with authentication, will need to upgrade drivers before continuing with the upgrade to 2.2.
For all deployments using authentication, upgrade the drivers (i.e. client libraries), before upgrading the
mongod (page 925) instance or instances.
For all upgrades of sharded clusters:
turn off the balancer during the upgrade process. See the Disable the Balancer (page 552) section for more
information.
upgrade all mongos (page 938) instances before upgrading any mongod (page 925) instances.
Other than the above restrictions, 2.2 processes can interoperate with 2.0 and 1.8 tools and processes. You can safely
upgrade the mongod (page 925) and mongos (page 938) components of a deployment one by one while the deployment is otherwise operational. Be sure to read the detailed upgrade procedures below before upgrading production
systems.
Upgrading a Standalone mongod
1. Download binaries of the latest release in the 2.2 series from the MongoDB Download Page38 .
38 http://downloads.mongodb.org/
1050
2. Shutdown your mongod (page 925) instance. Replace the existing binary with the 2.2 mongod (page 925)
binary and restart MongoDB.
Upgrading a Replica Set
You can upgrade to 2.2 by performing a rolling upgrade of the set by upgrading the members individually while the
other members are available to minimize downtime. Use the following procedure:
1. Upgrade the secondary members of the set one at a time by shutting down the mongod (page 925) and replacing the 2.0 binary with the 2.2 binary. After upgrading a mongod (page 925) instance, wait for the member to recover to SECONDARY state before upgrading the next instance. To check the members state, issue
rs.status() (page 898) in the mongo (page 942) shell.
2. Use the mongo (page 942) shell method rs.stepDown() (page 899) to step down the primary to allow the
normal failover (page 396) procedure. rs.stepDown() (page 899) expedites the failover procedure and is
preferable to shutting down the primary directly.
Once the primary has stepped down and another member has assumed PRIMARY state, as observed in the output
of rs.status() (page 898), shut down the previous primary and replace mongod (page 925) binary with
the 2.2 binary and start the new process.
Note: Replica set failover is not instant but will render the set unavailable to read or accept writes until the
failover process completes. Typically this takes 10 seconds or more. You may wish to plan the upgrade during
a predefined maintenance window.
Changes
Major Features
Aggregation Framework The aggregation framework makes it possible to do aggregation operations without
needing to use map-reduce. The aggregate (page 694) command exposes the aggregation framework, and the
aggregate() (page 808) helper in the mongo (page 942) shell provides an interface to these operations. Consider
the following resources for background on the aggregation framework and its use:
39 https://jira.mongodb.org/browse/SERVER-6902
1051
1052
Compatibility Changes
Authentication Changes MongoDB 2.2 provides more reliable and robust support for authentication clients, including drivers and mongos (page 938) instances.
If your cluster runs with authentication:
For all drivers, use the latest release of your driver and check its release notes.
In sharded environments, to ensure that your cluster remains available during the upgrade process you must use
the upgrade procedure for sharded clusters (page 1051).
findAndModify Returns Null Value for Upserts that Perform Inserts In version 2.2, for upsert that perform
inserts with the new option set to false, findAndModify (page 710) commands will now return the following
output:
{ 'ok': 1.0, 'value': null }
In the mongo (page 942) shell, upsert findAndModify (page 710) operations that perform inserts (with new set to
false.)only output a null value.
In version 2.0 these operations would return an empty document, e.g. { }.
See: SERVER-622643 for more information.
mongodump 2.2 Output Incompatible with Pre-2.2 mongorestore If you use the mongodump (page 951)
tool from the 2.2 distribution to create a dump of a database, you must use a 2.2 (or later) version of mongorestore
(page 956) to restore that dump.
See: SERVER-696144 for more information.
ObjectId().toString() Returns String Literal ObjectId("...") In version 2.2, the toString()
(page 916) method returns the string representation of the ObjectId() (page 104) object and has the format
ObjectId("...").
Consider the following example that calls the toString()
ObjectId("507c7f79bcf86cd7994f6c0e") object:
(page
916)
method
on
the
ObjectId("507c7f79bcf86cd7994f6c0e").toString()
valueOf()
(page
917)
method
on
the
ObjectId("507c7f79bcf86cd7994f6c0e").valueOf()
43 https://jira.mongodb.org/browse/SERVER-6226
44 https://jira.mongodb.org/browse/SERVER-6961
1053
The maximum size of a collection name is 128 characters, including the name of the database. However, for
maximum flexibility, collections should have names less than 80 characters.
Collections names may have any other valid UTF-8 string.
See the SERVER-444245 and the Are there any restrictions on the names of Collections? (page 592) FAQ item.
Restrictions on Database Names for Windows Database names running on Windows can no longer contain the
following characters:
/\. "*<>:|?
The names of the data files include the database name. If you attempt to upgrade a database instance with one or more
of these characters, mongod (page 925) will refuse to start.
Change the name of these databases before upgrading. See SERVER-458446 and SERVER-672947 for more information.
_id Fields and Indexes on Capped Collections All capped collections now have an _id field by default, if they
exist outside of the local database, and now have indexes on the _id field. This change only affects capped
collections created with 2.2 instances and does not affect existing capped collections.
See: SERVER-551648 for more information.
45 https://jira.mongodb.org/browse/SERVER-4442
46 https://jira.mongodb.org/browse/SERVER-4584
47 https://jira.mongodb.org/browse/SERVER-6729
48 https://jira.mongodb.org/browse/SERVER-5516
1054
New $elemMatch Projection Operator The $elemMatch (page 648) operator allows applications to narrow
the data returned from queries so that the query operation will only return the first matching element in an array. See
the $elemMatch (projection) (page 648) documentation and the SERVER-223849 and SERVER-82850 issues for more
information.
Windows Specific Changes
Windows XP is Not Supported As of 2.2, MongoDB does not support Windows XP. Please upgrade to a more
recent version of Windows to use the latest releases of MongoDB. See SERVER-564851 for more information.
Service Support for mongos.exe You may now run mongos.exe (page 950) instances as a Windows Service.
See the mongos.exe (page 949) reference and MongoDB as a Windows Service (page 19) and SERVER-158952 for
more information.
Log Rotate Command Support MongoDB for Windows now supports log rotation by way of the logRotate
(page 760) database command. See SERVER-261253 for more information.
New Build Using SlimReadWrite Locks for Windows Concurrency Labeled 2008+ on the Downloads Page54 ,
this build for 64-bit versions of Windows Server 2008 R2 and for Windows 7 or newer, offers increased performance
over the standard 64-bit Windows build of MongoDB. See SERVER-384455 for more information.
Tool Improvements
Index Definitions Handled by mongodump and mongorestore When you specify the --collection option
to mongodump (page 951), mongodump (page 951) will now backup the definitions for all indexes that exist on the
source database. When you attempt to restore this backup with mongorestore (page 956), the target mongod
(page 925) will rebuild all indexes. See SERVER-80856 for more information.
mongorestore (page 956) now includes the --noIndexRestore option to provide the preceding behavior. Use
--noIndexRestore to prevent mongorestore (page 956) from building previous indexes.
mongooplog for Replaying Oplogs The mongooplog (page 962) tool makes it possible to pull oplog entries from mongod (page 925) instance and apply them to another mongod (page 925) instance. You can use
mongooplog (page 962) to achieve point-in-time backup of a MongoDB data set. See the SERVER-387357 case
and the mongooplog (page 962) documentation.
Authentication Support for mongotop and mongostat mongotop (page 979) and mongostat (page 974)
now contain support for username/password authentication. See SERVER-387558 and SERVER-387159 for more information regarding this change. Also consider the documentation of the following options for additional information:
49 https://jira.mongodb.org/browse/SERVER-2238
50 https://jira.mongodb.org/browse/SERVER-828
51 https://jira.mongodb.org/browse/SERVER-5648
52 https://jira.mongodb.org/browse/SERVER-1589
53 https://jira.mongodb.org/browse/SERVER-2612
54 http://www.mongodb.org/downloads
55 https://jira.mongodb.org/browse/SERVER-3844
56 https://jira.mongodb.org/browse/SERVER-808
57 https://jira.mongodb.org/browse/SERVER-3873
58 https://jira.mongodb.org/browse/SERVER-3875
59 https://jira.mongodb.org/browse/SERVER-3871
1055
mongotop --username
mongotop --password
mongostat --username
mongostat --password
Write Concern Support for mongoimport and mongorestore mongoimport (page 965) now provides an
option to halt the import if the operation encounters an error, such as a network interruption, a duplicate key exception,
or a write error. The --stopOnError option will produce an error rather than silently continue importing data. See
SERVER-393760 for more information.
In mongorestore (page 956), the --w option provides support for configurable write concern.
mongodump Support for Reading from Secondaries You can now run mongodump (page 951) when connected
to a secondary member of a replica set. See SERVER-385461 for more information.
mongoimport Support for full 16MB Documents Previously, mongoimport (page 965) would only import
documents that were less than 4 megabytes in size. This issue is now corrected, and you may use mongoimport
(page 965) to import documents that are at least 16 megabytes ins size. See SERVER-459362 for more information.
Timestamp() Extended JSON format MongoDB extended JSON now includes a new Timestamp() type to
represent the Timestamp type that MongoDB uses for timestamps in the oplog among other contexts.
This permits tools like mongooplog (page 962) and mongodump (page 951) to query for specific timestamps.
Consider the following mongodump (page 951) operation:
Improved Shell User Interface 2.2 includes a number of changes that improve the overall quality and consistency
of the user interface for the mongo (page 942) shell:
Full Unicode support.
Bash-like line editing features. See SERVER-431264 for more information.
Multi-line command support in shell history. See SERVER-347065 for more information.
Windows support for the edit command. See SERVER-399866 for more information.
Helper to load Server-Side Functions The db.loadServerScripts() (page 890) loads the contents of the
current databases system.js collection into the current mongo (page 942) shell session. See SERVER-165167 for
more information.
60 https://jira.mongodb.org/browse/SERVER-3937
61 https://jira.mongodb.org/browse/SERVER-3854
62 https://jira.mongodb.org/browse/SERVER-4593
63 https://jira.mongodb.org/browse/SERVER-3483
64 https://jira.mongodb.org/browse/SERVER-4312
65 https://jira.mongodb.org/browse/SERVER-3470
66 https://jira.mongodb.org/browse/SERVER-3998
67 https://jira.mongodb.org/browse/SERVER-1651
1056
Support for Bulk Inserts If you pass an array of documents to the insert() (page 832) method, the mongo
(page 942) shell will now perform a bulk insert operation. See SERVER-381968 and SERVER-239569 for more
information.
Note: For bulk inserts on sharded clusters, the getLastError (page 720) command alone is insufficient to verify
success. Applications should must verify the success of bulk inserts in application logic.
Operations
Support for Logging to Syslog See the SERVER-295770 case and the documentation of the syslog (page 992)
run-time option or the mongod --syslog and mongos --syslog command line-options.
touch Command Added the touch (page 759) command to read the data and/or indexes from a collection into
memory. See: SERVER-202371 and touch (page 759) for more information.
indexCounters No Longer Report Sampled Data indexCounters now report actual counters that reflect
index use and state. In previous versions, these data were sampled. See SERVER-578472 and indexCounters for
more information.
Padding Specifiable on compact Command See the documentation of the compact (page 752) and the
SERVER-401873 issue for more information.
Added Build Flag to Use System Libraries The Boost library, version 1.49, is now embedded in the MongoDB
code base.
If you want to build MongoDB binaries using system Boost libraries, you can pass scons using the
--use-system-boost flag, as follows:
scons --use-system-boost
When building MongoDB, you can also pass scons a flag to compile MongoDB using only system libraries rather
than the included versions of the libraries. For example:
scons --use-system-all
1057
Replication
Improved Logging for Replica Set Lag When secondary members of a replica set fall behind in replication,
mongod (page 925) now provides better reporting in the log. This makes it possible to track replication in general and
identify what process may produce errors or halt replication. See SERVER-357579 for more information.
Replica Set Members can Sync from Specific Members The new replSetSyncFrom (page 730) command
and new rs.syncFrom() (page 899) helper in the mongo (page 942) shell make it possible for you to manually
configure from which member of the set a replica will poll oplog entries. Use these commands to override the default
selection logic if needed. Always exercise caution with replSetSyncFrom (page 730) when overriding the default
behavior.
Replica Set Members will not Sync from Members Without Indexes Unless buildIndexes: false To
prevent inconsistency between members of replica sets, if the member of a replica set has buildIndexes (page 481)
set to true, other members of the replica set will not sync from this member, unless they also have buildIndexes
(page 481) set to true. See SERVER-416080 for more information.
New Option To Configure Index Pre-Fetching during Replication By default, when replicating options, secondaries will pre-fetch Indexes (page 313) associated with a query to improve replication throughput in most cases. The
replIndexPrefetch (page 1000) setting and --replIndexPrefetch option allow administrators to disable
this feature or allow the mongod (page 925) to pre-fetch only the index on the _id field. See SERVER-671881 for
more information.
Map Reduce Improvements
Index on Shard Keys Can Now Be a Compound Index If your shard key uses the prefix of an existing index,
then you do not need to maintain a separate index for your shard key in addition to your existing index. This index,
however, cannot be a multi-key index. See the Shard Key Indexes (page 520) documentation and SERVER-150684 for
more information.
Migration Thresholds Modified The migration thresholds (page 517) have changed in 2.2 to permit more even
distribution of chunks in collections that have smaller quantities of data. See the Migration Thresholds (page 517)
documentation for more information.
79 https://jira.mongodb.org/browse/SERVER-3575
80 https://jira.mongodb.org/browse/SERVER-4160
81 https://jira.mongodb.org/browse/SERVER-6718
82 https://jira.mongodb.org/browse/SERVER-4521
83 https://jira.mongodb.org/browse/SERVER-4158
84 https://jira.mongodb.org/browse/SERVER-1506
1058
Licensing Changes
Added License notice for Google Perftools (TCMalloc Utility). See the License Notice85 and the SERVER-468386 for
more information.
Resources
MongoDB Downloads87 .
All JIRA issues resolved in 2.288 .
All backwards incompatible changes89 .
All third party license notices90 .
Whats New in MongoDB 2.2 Online Conference91 .
Upgrading
Although the major version number has changed, MongoDB 2.0 is a standard, incremental production release and
works as a drop-in replacement for MongoDB 1.8.
Preparation
Read through all release notes before upgrading, and ensure that no changes will affect your deployment.
If you create new indexes in 2.0, then downgrading to 1.8 is possible but you must reindex the new collections.
mongoimport (page 965) and mongoexport (page 969) now correctly adhere to the CSV spec for handling
CSV input/output. This may break existing import/export workflows that relied on the previous behavior. For more
information see SERVER-109792 .
Journaling93 is enabled by default in 2.0 for 64-bit builds. If you still prefer to run without journaling, start mongod
(page 925) with the --nojournal run-time option. Otherwise, MongoDB creates journal files during startup. The
first time you start mongod (page 925) with journaling, you will see a delay as mongod (page 925) creates new files.
In addition, you may see reduced write throughput.
85 https://github.com/mongodb/mongo/blob/v2.2/distsrc/THIRD-PARTY-NOTICES#L231
86 https://jira.mongodb.org/browse/SERVER-4683
87 http://mongodb.org/downloads
88 https://jira.mongodb.org/secure/IssueNavigator.jspa?reset=true&jqlQuery=project+%3D+SERVER+AND+fixVersion+in+%28%222.1.0%22%2C+%222.1.1%22%2
rc0%22%2C+%222.2.0-rc1%22%2C+%222.2.0-rc2%22%29+ORDER+BY+component+ASC%2C+key+DESC
89 https://jira.mongodb.org/secure/IssueNavigator.jspa?requestId=11225
90 https://github.com/mongodb/mongo/blob/v2.2/distsrc/THIRD-PARTY-NOTICES
91 http://www.mongodb.com/events/webinar/mongodb-online-conference-sept
92 https://jira.mongodb.org/browse/SERVER-1097
93 http://www.mongodb.org/display/DOCS/Journaling
1059
2.0 mongod (page 925) instances are interoperable with 1.8 mongod (page 925) instances; however, for best results,
upgrade your deployments using the following procedures:
Upgrading a Standalone mongod
1. Upgrade the secondary members of the set one at a time by shutting down the mongod (page 925) and replacing
the 1.8 binary with the 2.0.x binary from the MongoDB Download Page95 .
2. To avoid losing the last few updates on failover you can temporarily halt your application (failover should take
less than 10 seconds), or you can set write concern (page 55) in your application code to confirm that each
update reaches multiple servers.
3. Use the rs.stepDown() (page 899) to step down the primary to allow the normal failover (page 396) procedure.
rs.stepDown() (page 899) and replSetStepDown (page 730) provide for shorter and more consistent
failover procedures than simply shutting down the primary directly.
When the primary has stepped down, shut down its instance and upgrade by replacing the mongod (page 925)
binary with the 2.0.x binary.
Upgrading a Sharded Cluster
1. Upgrade all config server instances first, in any order. Since config servers use two-phase commit, shard configuration metadata updates will halt until all are up and running.
2. Upgrade mongos (page 938) routers in any order.
Changes
Compact Command
A compact (page 752) command is now available for compacting a single collection and its indexes. Previously, the
only way to compact was to repair the entire database.
Concurrency Improvements
When going to disk, the server will yield the write lock when writing data that is not likely to be in memory. The
initial implementation of this feature now exists:
See SERVER-256396 for more information.
The specific operations yield in 2.0 are:
94 http://downloads.mongodb.org/
95 http://downloads.mongodb.org/
96 https://jira.mongodb.org/browse/SERVER-2563
1060
Updates by _id
Removes
Long cursor iterations
Default Stack Size
MongoDB 2.0 reduces the default stack size. This change can reduce total memory usage when there are many (e.g.,
1000+) client connections, as there is a thread per connection. While portions of a threads stack can be swapped out
if unused, some operating systems do this slowly enough that it might be an issue. The default stack size is lesser of
the system setting or 1MB.
Index Performance Enhancements
v2.0 includes significant improvements to the index (page 345). Indexes are often 25% smaller and 25% faster (depends
on the use case). When upgrading from previous versions, the benefits of the new index type are realized only if you
create a new index or re-index an old one.
Dates are now signed, and the max index key size has increased slightly from 819 to 1024 bytes.
All operations that create a new index will result in a 2.0 index by default. For example:
Reindexing results on an older-version index results in a 2.0 index. However, reindexing on a secondary does
not work in versions prior to 2.0. Do not reindex on a secondary. For a workaround, see SERVER-386697 .
The repairDatabase command converts indexes to a 2.0 indexes.
To convert all indexes for a given collection to the 2.0 type (page 1061), invoke the compact (page 752) command.
Once you create new indexes, downgrading to 1.8.x will require a re-index of any indexes created using 2.0. See Build
Old Style Indexes (page 345).
Sharding Authentication
Hidden Nodes in Sharded Clusters In 2.0, mongos (page 938) instances can now determine when a member of a
replica set becomes hidden without requiring a restart. In 1.8, mongos (page 938) if you reconfigured a member as
hidden, you had to restart mongos (page 938) to prevent queries from reaching the hidden member.
Priorities Each replica set member can now have a priority value consisting of a floating-point from 0 to 1000,
inclusive. Priorities let you control which member of the set you prefer to have as primary the member with the
highest priority that can see a majority of the set will be elected primary.
For example, suppose you have a replica set with three members, A, B, and C, and suppose that their priorities are set
as follows:
As priority is 2.
Bs priority is 3.
97 https://jira.mongodb.org/browse/SERVER-3866
1061
Cs priority is 1.
During normal operation, the set will always chose B as primary. If B becomes unavailable, the set will elect A as
primary.
For more information, see the priority (page 481) documentation.
Data-Center Awareness You can now tag replica set members to indicate their location. You can use these tags
to design custom write rules (page 55) across data centers, racks, specific servers, or any other architecture choice.
For example, an administrator can define rules such as very important write or customerData or audit-trail to
replicate to certain servers, racks, data centers, etc. Then in the application code, the developer would say:
db.foo.insert(doc, {w : "very important write"})
which would succeed if it fulfilled the conditions the DBA defined for very important write.
For more information, see Tagging98 .
Drivers may also support tag-aware reads. Instead of specifying slaveOk, you specify slaveOk with tags indicating
which data-centers to read from. For details, see the MongoDB Drivers and Client Libraries (page 95) documentation.
w : majority You can also set w to majority to ensure that the write propagates to a majority of nodes, effectively committing it. The value for majority will automatically adjust as you add or remove nodes from the
set.
For more information, see Write Concern (page 55).
Reconfiguration with a Minority Up If the majority of servers in a set has been permanently lost, you can now
force a reconfiguration of the set to bring it back online.
For more information see Reconfigure a Replica Set with Unavailable Members (page 455).
Primary Checks for a Caught up Secondary before Stepping Down To minimize time without a primary, the
rs.stepDown() (page 899) method will now fail if the primary does not see a secondary within 10 seconds of its
latest optime. You can force the primary to step down anyway, but by default it will return an error message.
See also Force a Member to Become Primary (page 448).
Extended Shutdown on the Primary to Minimize Interruption When you call the shutdown (page 759) command, the primary will refuse to shut down unless there is a secondary whose optime is within 10 seconds of the
primary. If such a secondary isnt available, the primary will step down and wait up to a minute for the secondary to
be fully caught up before shutting down.
Note that to get this behavior, you must issue the shutdown (page 759) command explicitly; sending a signal to the
process will not trigger this behavior.
You can also force the primary to shut down, even without an up-to-date secondary available.
Maintenance Mode When repair or compact (page 752) runs on a secondary, the secondary will automatically
drop into recovering mode until the operation finishes. This prevents clients from trying to read from it while its
busy.
98 http://www.mongodb.org/display/DOCS/Data+Center+Awareness#DataCenterAwareness-Tagging%28version2.0%29
1062
Geospatial Features
Multi-Location Documents Indexing is now supported on documents which have multiple location objects, embedded either inline or in nested sub-documents. Additional command options are also supported, allowing results to
return with not only distance but the location used to generate the distance.
For more information, see Multi-location Documents99 .
Polygon searches Polygonal $within (page 636) queries are also now supported for simple polygon shapes. For
details, see the $within (page 636) operator documentation.
Journaling Enhancements
Journaling is now enabled by default for 64-bit platforms. Use the --nojournal command line option to
disable it.
The journal is now compressed for faster commits to disk.
A new --journalCommitInterval run-time option exists for specifying your own group commit interval.
The default settings do not change.
A new { getLastError: { j: true } } (page 720) option is available to wait for the group commit. The group commit will happen sooner when a client is waiting on {j: true}. If journaling is disabled,
{j: true} is a no-op.
New ContinueOnError Option for Bulk Insert
Set the continueOnError option for bulk inserts, in the driver (page 95), so that bulk insert will continue to insert
any remaining documents even if an insert fails, as is the case with duplicate key exceptions or network interruptions.
The getLastError (page 720) command will report whether any inserts have failed, not just the last one. If multiple
errors occur, the client will only receive the most recent getLastError (page 720) results.
See OP_INSERT100 .
Note: For bulk inserts on sharded clusters, the getLastError (page 720) command alone is insufficient to verify
success. Applications should must verify the success of bulk inserts in application logic.
Map Reduce
Output to a Sharded Collection Using the new sharded flag, it is possible to send the result of a map/reduce to
a sharded collection. Combined with the reduce or merge flags, it is possible to keep adding data to very large
collections from map/reduce jobs.
For more information, see MapReduce Output Options101 and mapReduce (page 701).
Performance Improvements Map/reduce performance will benefit from the following:
Larger in-memory buffer sizes, reducing the amount of disk I/O needed during a job
Larger javascript heap size, allowing for larger objects and less GC
99 http://www.mongodb.org/display/DOCS/Geospatial+Indexing#GeospatialIndexing-MultilocationDocuments
100 http://www.mongodb.org/display/DOCS/Mongo+Wire+Protocol#MongoWireProtocol-OPINSERT
101 http://www.mongodb.org/display/DOCS/MapReduce#MapReduce-Outputoptions
1063
Supports pure JavaScript execution with the jsMode flag. See mapReduce (page 701).
New Querying Features
Additional regex options: s Allows the dot (.) to match all characters including new lines. This is in addition to
the currently supported i, m and x. See Regular Expressions102 and $regex (page 633).
$and A special boolean $and (page 626) query operator is now available.
Command Output Changes
The output of the validate (page 771) command and the documents in the system.profile collection have
both been enhanced to return information as BSON objects with keys for each value rather than as free-form strings.
Shell Features
Custom Prompt You can define a custom prompt for the mongo (page 942) shell. You can change the prompt at
any time by setting the prompt variable to a string or a custom JavaScript function returning a string. For examples,
see Custom Prompt103 .
Default Shell Init Script On startup, the shell will check for a .mongorc.js file in the users home directory.
The shell will execute this file after connecting to the database and before displaying the prompt.
If you would like the shell not to run the .mongorc.js file automatically, start the shell with --norc.
For more information, see mongo (page 942).
Most Commands Require Authentication
In 2.0, when running with authentication (e.g. auth (page 993)) all database commands require authentication, except
the following commands.
isMaster (page 732)
authenticate (page 725)
getnonce (page 725)
buildInfo (page 762)
ping (page 770)
isdbgrid (page 743)
102 http://www.mongodb.org/display/DOCS/Advanced+Queries#AdvancedQueries-RegularExpressions
103 http://www.mongodb.org/display/DOCS/Overview+-+The+MongoDB+Interactive+Shell#Overview-TheMongoDBInteractiveShellCustomPrompt
1064
Resources
MongoDB Downloads104
All JIRA Issues resolved in 2.0105
All Backward Incompatible Changes106
Upgrading
MongoDB 1.8 is a standard, incremental production release and works as a drop-in replacement for MongoDB 1.6,
except:
Replica set members should be upgraded in a particular order, as described in Upgrading a Replica Set
(page 1066).
The mapReduce (page 701) command has changed in 1.8, causing incompatibility with previous releases.
mapReduce (page 701) no longer generates temporary collections (thus, keepTemp has been removed). Now,
you must always supply a value for out. See the out field options in the mapReduce (page 701) document.
If you use MapReduce, this also likely means you need a recent version of your client driver.
Preparation
Read through all release notes before upgrading and ensure that no changes will affect your deployment.
Upgrading a Standalone mongod
1065
1066
"ismaster" : false,
"secondary" : true,
"hosts" : [
"ubuntu:27017",
"ubuntu:27018"
],
"arbiters" : [
"ubuntu:27019"
],
"primary" : "ubuntu:27018",
"ok" : 1
}
// for each secondary
config.members[0].priority = 0
config.members[3].priority = 0
config.members[4].priority = 0
rs.reconfig(config)
5. Shut down the primary (the final 1.6 server), and then restart it with the 1.8.x binary from the MongoDB
Download Page110 .
Upgrading a Sharded Cluster
1067
Returning to 1.6
If for any reason you must move back to 1.6, follow the steps above in reverse. Please be careful that you have not
inserted any documents larger than 4MB while running on 1.8 (where the max size has increased to 16MB). If you
have you will get errors when the server tries to read those documents.
Journaling Returning to 1.6 after using 1.8 Journaling (page 232) works fine, as journaling does not change anything
about the data file format. Suppose you are running 1.8.x with journaling enabled and you decide to switch back to
1.6. There are two scenarios:
If you shut down cleanly with 1.8.x, just restart with the 1.6 mongod binary.
If 1.8.x shut down uncleanly, start 1.8.x up again and let the journal files run to fix any damage (incomplete
writes) that may have existed at the crash. Then shut down 1.8.x cleanly and restart with the 1.6 mongod binary.
Changes
Journaling
MongoDB now supports write-ahead Journaling Mechanics (page 232) to facilitate fast crash recovery and durability
in the storage engine. With journaling enabled, a mongod (page 925) can be quickly restarted following a crash
without needing to repair the collections. The aggregation framework makes it possible to do aggregation
Sparse and Covered Indexes
Sparse Indexes (page 335) are indexes that only include documents that contain the fields specified in the index.
Documents missing the field will not appear in the index at all. This can significantly reduce index size for indexes of
fields that contain only a subset of documents within a collection.
Covered Indexes (page 368) enable MongoDB to answer queries entirely from the index when the query only selects
fields that the index contains.
Incremental MapReduce Support
The mapReduce (page 701) command supports new options that enable incrementally updating existing collections.
Previously, a MapReduce job could output either to a temporary collection or to a named permanent collection, which
it would overwrite with new data.
You now have several options for the output of your MapReduce jobs:
113 http://downloads.mongodb.org/
1068
You can merge MapReduce output into an existing collection. Output from the Reduce phase will replace
existing keys in the output collection if it already exists. Other keys will remain in the collection.
You can now re-reduce your output with the contents of an existing collection. Each key output by the reduce
phase will be reduced with the existing document in the output collection.
You can replace the existing output collection with the new results of the MapReduce job (equivalent to setting
a permanent output collection in previous releases)
You can compute MapReduce inline and return results to the caller without persisting the results of the job. This
is similar to the temporary collections generated in previous releases, except results are limited to 8MB.
For more information, see the out field options in the mapReduce (page 701) document.
Additional Changes and Enhancements
1.8.1
Sharding migrate fix when moving larger chunks.
Durability fix with background indexing.
Fixed mongos concurrency issue with many incoming connections.
1.8.0
All changes from 1.7.x series.
1.7.6
Bug fixes.
1.7.5
Journaling (page 232).
Extent allocation improvements.
Improved replica set connectivity for mongos (page 938).
getLastError (page 720) improvements for sharding.
1.7.4
mongos (page 938) routes slaveOk queries to secondaries in replica sets.
New mapReduce (page 701) output options.
Sparse Indexes (page 335).
1.7.3
Initial covered index (page 368) support.
Distinct can use data from indexes when possible.
mapReduce (page 701) can merge or reduce results into an existing collection.
mongod (page 925) tracks and mongostat (page 974) displays network usage. See mongostat (page 974).
1069
1.8.1114 , 1.8.0115
1.7.6116 , 1.7.5117 , 1.7.4118 , 1.7.3119 , 1.7.2120 , 1.7.1121 , 1.7.0122
114 https://groups.google.com/forum/?fromgroups=#!topic/mongodb-user/v09MbhEm62Y
115 https://groups.google.com/forum/?fromgroups=#!topic/mongodb-user/JeHQOnam6Qk
116 https://groups.google.com/forum/?fromgroups=#!topic/mongodb-user/3t6GNZ1qGYc
117 https://groups.google.com/forum/?fromgroups=#!topic/mongodb-user/S5R0Tx9wkEg
118 https://groups.google.com/forum/?fromgroups=#!topic/mongodb-user/9Om3Vuw-y9c
119 https://groups.google.com/forum/?fromgroups=#!topic/mongodb-user/DfNUrdbmflI
120 https://groups.google.com/forum/?fromgroups=#!topic/mongodb-user/df7mwK6Xixo
121 https://groups.google.com/forum/?fromgroups=#!topic/mongodb-user/HUR9zYtTpA8
122 https://groups.google.com/forum/?fromgroups=#!topic/mongodb-user/TUnJCg9161A
1070
Resources
MongoDB Downloads123
All JIRA Issues resolved in 1.8124
Upgrading
MongoDB 1.6 is a drop-in replacement for 1.4. To upgrade, simply shutdown mongod (page 925) then restart with
the new binaries.
Please note that you should upgrade to the latest version of whichever driver youre using. Certain drivers, including
the Ruby driver, will require the upgrade, and all the drivers will provide extra features for connecting to replica sets.
Sharding
Sharding (page 493) is now production-ready, making MongoDB horizontally scalable, with no single point of failure.
A single instance of mongod (page 925) can now be upgraded to a distributed cluster with zero downtime when the
need arises.
Sharding (page 493)
Deploy a Sharded Cluster (page 522)
Convert a Replica Set to a Replicated Sharded Cluster (page 530)
Replica Sets
Replica sets (page 377), which provide automated failover among a cluster of n nodes, are also now available.
Please note that replica pairs are now deprecated; we strongly recommend that replica pair users upgrade to replica
sets.
Replication (page 377)
Deploy a Replica Set (page 420)
Convert a Standalone to a Replica Set (page 432)
123 http://mongodb.org/downloads
124 https://jira.mongodb.org/secure/IssueNavigator.jspa?mode=hide&requestId=10172
1071
Other Improvements
The w option (and wtimeout) forces writes to be propagated to n servers before returning success (this works
especially well with replica sets)
$or queries (page 625)
Improved concurrency
$slice (page 650) operator for returning subsets of arrays
64 indexes per collection (formerly 40 indexes per collection)
64-bit integers can now be represented in the shell using NumberLong
The findAndModify (page 710) command now supports upserts. It also allows you to specify fields to return
$showDiskLoc option to see disk location of a document
Support for IPv6 and UNIX domain sockets
Installation
Windows service improvements
The C++ client is a separate tarball from the binaries
1.6.x Release Notes
1.6.5125
1.5.x Release Notes
1.5.8126
1.5.7127
1.5.6128
1.5.5129
1.5.4130
1.5.3131
1.5.2132
1.5.1133
1.5.0134
125 https://groups.google.com/forum/?fromgroups=#!topic/mongodb-user/06_QCC05Fpk
126 https://groups.google.com/forum/?fromgroups=#!topic/mongodb-user/uJfF1QN6Thk
127 https://groups.google.com/forum/?fromgroups=#!topic/mongodb-user/OYvz40RWs90
128 https://groups.google.com/forum/?fromgroups=#!topic/mongodb-user/4l0N2U_H0cQ
129 https://groups.google.com/forum/?fromgroups=#!topic/mongodb-user/oO749nvTARY
130 https://groups.google.com/forum/?fromgroups=#!topic/mongodb-user/380V_Ec_q1c
131 https://groups.google.com/forum/?hl=en&fromgroups=#!topic/mongodb-user/hsUQL9CxTQw
132 https://groups.google.com/forum/?fromgroups=#!topic/mongodb-user/94EE3HVidAA
133 https://groups.google.com/forum/?fromgroups=#!topic/mongodb-user/7SBPQ2RSfdM
134 https://groups.google.com/forum/?fromgroups=#!topic/mongodb-user/VAhJcjDGTy0
1072
Upgrading
Were pleased to announce the 1.4 release of MongoDB. 1.4 is a drop-in replacement for 1.2. To upgrade you just
need to shutdown mongod (page 925), then restart with the new binaries. (Users upgrading from release 1.0 should
review the 1.2 release notes (page 1074), in particular the instructions for upgrading the DB format.)
Release 1.4 includes the following improvements over release 1.2:
Core Server Enhancements
concurrency (page 596) improvements
indexing memory improvements
background index creation (page 336)
better detection of regular expressions so the index can be used in more cases
Replication and Sharding
better handling for restarting slaves offline for a while
fast new slaves from snapshots (--fastsync)
configurable slave delay (--slavedelay)
replication handles clock skew on master
$inc (page 651) replication fixes
sharding alpha 3 - notably 2-phase commit on config servers
Deployment and Production
configure slow threshold for profiling (page 167)
ability to do fsync + lock (page 751) for backing up raw files
135 https://jira.mongodb.org/secure/IssueNavigator.jspa?mode=hide&requestId=10107
1073
New Features
More indexes per collection
Faster index creation
Map/Reduce
Stored JavaScript functions
Configurable fsync time
1074
1075
See the full index of this page for a complete list of changes included in 2.6 (Development Series
2.5.x).
Downloading (page 1076)
Changes (page 1076)
SNMP Enterprise Identifier Changed (page 1076)
Default bind_ip for RPM and DEB Packages (page 1076)
Aggregation Pipeline Changes (page 1077)
Update Improvements (page 1081)
Sharding Improvements (page 1082)
Support for Auditing (page 1083)
isMaster Comand includes Wire Protocol Versions (page 1083)
SASL Library Change (page 1084)
LDAP Support for Authentication (page 1084)
x.509 Authentication (page 1085)
Limit for maxConns Removed (page 1088)
Background Index Builds Replicate to Secondaries (page 1088)
mongod Automatically Continues in Progress Index Builds Following Restart (page 1088)
Global mongorc.js File (page 1088)
Geospatial Enhancements (page 1089)
Downloading
You can download the 2.6 release candidate on the downloads page136 in the Development Release (Unstable) section.
There are no distribution packages for development releases, but you can use the binaries provided for testing purposes.
See Install MongoDB on Linux (page 11), Install MongoDB on Windows (page 16), or Install MongoDB on OS X
(page 13) for the basic installation process.
Changes
Important: The MongoDB 2.5-series, which will become MongoDB 2.6, is for testing and development only. All
identifiers, names, interfaces are subject to change. Do not use a MongoDB 2.5 release in production situations.
In 2.5.1, the IANA enterprise identifier for MongoDB changed from 37601 to 34601. Users of SNMP monitoring
must modify their SNMP configuration (i.e. MIB) accordingly.
Default bind_ip for RPM and DEB Packages
In the packages provided by 10gen for MongoDB in RPM (Red Hat, CentOS, Fedora Linux, and derivatives) and DEB
(Debian, Ubuntu, and derivatives,) the default bind_ip (page 991) value attaches MongoDB components to the localhost interface only. These packages set this default in the default configuration file (i.e. /etc/mongodb.conf.)
If you use one of these packages and have not modified the default /etc/mongodb.conf file, you will need to set
bind_ip (page 991) before or during the upgrade.
There is no default bind_ip setting in any other 10gen distributions of MongoDB.
136 http://www.mongodb.org/downloads
1076
Aggregation Operation May Return a Cursor The aggregate (page 694) may now return a cursor rather
than a result document. Specify the cursor operation and an initial batch size as options to the aggregate
(page 694) command to return a cursor object. In 2.5.2, the temporary mongo (page 942) shell helper
db.collection.aggregateCursor() provides access to this capability.
By returning a cursor, aggregation pipelines can return result sets of any size. In previous versions, the result of an
aggregation operation could be no larger than 16 megabytes.
The following command will return an aggregation cursor:
db.runCommand(
{ aggregate: "records",
pipeline: [
{ $project: { name: { $toUpper: "$_id" }, _id: 0 } },
{ $sort: { name: 1 } }
],
cursor: {}
}
)
In the mongo (page 942) shell, you may use the aggregateCursor() method:
db.records.aggregateCursor(
[
1077
Cursors returned from the aggregation command are equivalent to cursors returned from find() (page 816) queries
in every way.
You may specify an initial batch size using the following form:
db.runCommand(
{ aggregate: "records",
pipeline: [
{ $project : { name: { $toUpper: "$_id" }, _id: 0 } },
{ $sort : { name : 1 } }
],
cursor: { batchSize: 10 }
}
)
The {batchSize: 10 } document specifies the size of the initial batch size only. Specify subsequent batch sizes
to OP_GET_MORE137 operations as with other MongoDB cursors.
Improved Sorting New in version 2.5.2.
The $sort (page 670) and $group (page 669) stages now use a more efficient sorting system within the mongod
(page 925) process. This improves performance for sorting operations that cannot rely on an index or that exceed the
maximum memory use limit, see Aggregation Sort Operation (page 1018) for more information.
For large sort operations these external sort operations can write data to the _tmp directory in the dbpath
(page 993) directory. To enable external sort use the new allowDiskUsage argument to the aggregate
(page 694), as in the following prototype:
{
aggregate: "<collection>",
pipeline: [<pipeline>],
allowDiskUsage: <boolean>
}
You can run this pipeline operation, using the following command:
db.runCommand(
{ aggregate: "records",
pipeline: [
{ $project : { name: { $toUpper: "$_id" }, _id: 0 } },
{ $sort : { name : 1 } }
],
cursor: { batchSize: 10 },
allowDiskUsage: true
}
)
1078
This operation evaluates every field at every level for all documents in the test collection, and uses the $cond
(page 686) expression and the variables $$CONTINUE and $$PRUNE to specify the redaction of document
parts in the aggregation pipeline.
Important: $$CONTINUE will become $$DESCEND in 2.5.3.
Specifically, if the field level is less than 3, (i.e. { $lt: [ $level, 3 ] }) $redact
(page 1078) continues (i.e. $$CONTINUE) evaluating the fields and sub-documents at this level of the input document. If the value of level is greater than 3, then $redact (page 1078) removes all data at this level
of the document.
The result of this aggregation operation is as follows:
{ a: {
level: 1,
d: "World."
}
You may also specify $$KEEP as a variable to $cond (page 686), which returns the entire sub-document
without transversing, as $$CONTINUE.
See
$cond (page 686).
Set Expression Operations in $project In 2.5.2, the $project (page 664) aggregation pipeline stage now
supports the following set expressions:
1079
Important: Set operators take arrays as their arguments and treat these arrays as sets. Except for $setDifference
(page 1080), the set operators ignore duplicate entries in an input array and produce arrays containing unique entries.
$setIsSubset
Takes two arrays and returns true when the first array is a subset of the second and false otherwise.
$setEquals
Takes two arrays and returns true when they contain the same elements, and false otherwise.
$setDifference
Takes two arrays and returns an array containing the elements that only exist in the first array.
$setDifference (page 1080) may produce arrays containing duplicate items in 2.5.2 if the input array
contains duplicate items.
$setIntersection
Takes any number of arrays and returns an array that contains the elements that appear in every input array.
$setUnion
Takes any number of arrays and returns an array that containing the elements that appear in any input array.
$all
Takes a single array and returns true if all its values are true and false otherwise.
$all (page 1080) will become $allElementsTrue in 2.5.3.
$any
Takes a single expression that returns an array and returns true if any of its values are true and false
otherwise.
$any (page 1080) will become $anyElementTrue in 2.5.3.
$map and $let Expressions in Aggregation Pipeline Stages
Tip
For $let (page 1080) and $map (page 1080), the aggregation framework introduces variables. To specify a variable,
use the name of the variable prefixed by 2 dollar signs (i.e. $$) as in: $$<name>.
$let
$let (page 1080) binds variables for use in sub-expressions. For example:
{ $project: { $let: { vars: { tally: 75, count: 50 } },
in: { remaining: { $subtract: [ "$$tally", "$$count" ] } } } }
$let (page 1080) is available the in $project (page 664), $group (page 669), and $redact (page 1078)
pipeline stages.
$map
$map (page 1080) applies a sub-expression to each item in an array and returns an array with the result of the
sub-expression
$map (page 1080) is available the in $project (page 664), $group (page 669), and $redact (page 1078)
pipeline stages.
Given an input document that resembles the following:
{ skews: [ 1, 1, 2, 3, 5, 8 ] }
1080
The $map (page 1080) would transform the input document into the following output document:
{ adjustments: [ 13, 13, 14, 15, 17, 20 ] }
$literal Expression for Aggregation Pipeline Stages The new $literal (page 1081) operator allows users
to explicitly specify documents in aggregation operations that the pipeline stage would otherwise interpret directly.
For example, use $literal (page 1081) to project fields with dollar signs (e.g. $) in their values.
$literal
Wraps an expression to prevent the aggregation pipeline from interpreting an object directly.
Consider the following example:
db.runCommand( { aggregate: "records",
pipeline: [ { $project: { costsOneDollar:
{ $eq: [ "$price", { $literal: "$1.00" } ] } }
} ] } )
This projects documents with a field named costsOneDollar that holds a boolean value if the value of the field is
the string $1.00.
Update Improvements
MongoDB 2.5.2 includes a new subsystem which provides more reliable and extensible update operations. 2.5.2
introduces two new changes to the update language:
$mul Update Operator
$mul
The new $mul (page 1081) allows you to multiply the value of a field by the specified amount. If the field does
not exist in a document, $mul (page 1081) sets field to the specified amount. Consider the following prototype:
Example
Given a collection named records with the following documents:
{ _id: 1, a: 1, b: 4 }
{ _id: 2, a: 1, b: 3 }
{ _id: 3, a: 1 }
1081
xor operation for $bit Operator The $bit (page 663) now supports bitwise updates using a logical xor operation. See the documentation of $bit (page 663) for more information on bitwise updates. Consider the following
operation:
db.collection.update( { field: NumberInt(1) }, { $bit: { field: { xor: NumberInt(5) } } } );
Sharding Improvements
Support for Removing Orphan Data From Shards New in version 2.5.2.
The new cleanupOrphaned (page 1082) is available to remove orphaned data from a sharded cluster. Orphaned
data are those documents in a collection that exist on shards that belong to another shard in the cluster. Orphaned data
are the result of failed migrations or incomplete migration cleanup due to abnormal shutdown.
cleanupOrphaned
cleanupOrphaned (page 1082) removes documents from one orphaned chunk range on shard and runs
directly on the shards mongod (page 925) instance, and requires the clusterAdmin (page 267) for systems running with auth (page 993). You do not need to disable the balancer to run cleanupOrphaned
(page 1082).
In the common case, cleanupOrphaned (page 1082) takes the following form:
{ cleanupOrphaned: <namespace>, startingFromKey: { } }
The startingFromKey field specifies the lowest value of the shard key to begin searching for orphaned data.
The empty document (i.e. { }) is equivalent to the minimum value for the shard key (i.e. $minValue).
The cleanupOrphaned (page 1082) command returns a document that contains a field named
stoppedAtKey that you can use to construct a loop, as in the following example in the mongo (page 942)
shell:
use admin
var nextKey = {};
while (
nextKey = db.runCommand( { cleanupOrphaned : "test.user", startingFromKey : nextKey } ).stopp
) { printjson(nextKey); }
The bounds option specified to mergeChunks (page 1082) must be the boundaries of existing chunks in the
collections. See the output of sh.status() (page 910) to see the current shard boundaries.
Important: Both chunks must reside on the same shard.
1082
Tip
In 2.5.2, mergeChunks (page 1082) has the following restrictions, which are subject to change in future
releases:
mergeChunks (page 1082) will only merge two chunks.
one chunk must not hold any documents (i.e. an empty chunk).
1083
MongoDB Enterprise uses Cyrus SASL instead of GNU SASL (libgsasl). This change has the following SASL2
and Cyrus SASL library and GSSAPI plugin dependencies:
For Debian or Ubuntu, install the following:
For CentOS, Red Hat Enterprise Linux, and Amazon AMI, install the following:
sudo yum install cyrus-sasl cyrus-sasl-lib cyrus-sasl-devel cyrus-sasl-gssapi
MongoDB Enterprise provides support for proxy authentication of users. This change allows administrators to configure a MongoDB cluster to authenticate users via Linux PAM or by proxying authentication requests to a specified
LDAP service.
Warning: Because this change uses SASL PLAIN mechanism to transmit the user password to the MongoDB
server, you should, in general, use only on a trusted channel (VPN, SSL, trusted wired network).
Configuration LDAP support for user authentication requires proper configuration of the saslauthd daemon
process as well as introduces a new server parameter, saslauthdPath. saslauthdPath is the path to the Unix
Domain Socket of the saslauthd instance to use for proxy authentication.
saslauthd Configuration On systems that configure saslauthd with a /etc/sysconfig/saslauthd
file, such as Red Hat Enterprise Linux, Fedora, CentOS, Amazon Linux AMI, set the mechanism MECH to ldap:
MECH=ldap
On systems that configure saslauthd with a /etc/default/saslauthd file, set the mechanisms option to
ldap:
MECHANISMS="ldap"
To use with ActiveDirectory, start saslauthd with the following configuration options:
ldap_servers: <ldap uri, e.g. ldaps://ad.example.net>
ldap_use_sasl: yes
ldap_mech: DIGEST-MD5
ldap_auth_method: fastbind
To connect to an OpenLDAP server, use a test saslauthd.conf with the following content:
ldap_servers: <ldap uri, e.g. ldaps://ad.example.net>
ldap_search_base: ou=Users,dc=example,dc=com
ldap_filter: (uid=%u)
To use this sample OpenLDAP configuration, create users with a uid attribute (login name) and place under the
Users organizational unit (ou).
1084
To test the saslauthd configuration, use testsaslauthd utility, as in the following example:
testsaslauthd -u testuser -p testpassword -s mongod -f /var/run/saslauthd/mux
Or to set the configuration in the configuration file (page 990), add the parameters:
setParameter=saslauthdPath=/<some>/<path>/saslauthd/mux
setParameter=authenticationMechanisms=PLAIN
Otherwise, set the saslauthdPath to the empty string "" to use the librarys default value and the
authenticationMechanisms parameter to PLAIN, as in the following command line example:
mongod --setParameter saslauthdPath="" --setParameter authenticationMechanisms=PLAIN
Or to set the configuration in the configuration file (page 990), add the parameters:
setParameter=saslauthdPath=""
setParameter=authenticationMechanisms=PLAIN
Authenticate in the mongo Shell To use this authentication mechanism in the mongo (page 942) shell, you must
pass digestPassword: false to db.auth() (page 876) when authenticating on the $external database,
since the server must receive an undigested password to forward on to saslauthd, as in the following example:
use $external
db.auth(
{
mechanism: "PLAIN",
user: "application/[email protected]",
pwd: "some1nterestingPwd",
digestPassword: false
}
)
x.509 Authentication
MongoDB introduces x.509 certificate authentication for use with a secure SSL connection (page 249).
Important: To use SSL, you must either use MongoDB Enterprise or build MongoDB locally using scons with the
--ssl option.
The x.509 authentication allows clients to authenticate to servers with certificates instead of with username and password.
1085
The x.509 authentication also allows sharded cluster members and replica set members to use x.509 certificates to
verify their membership to the cluster or the replica set instead of using key files. The membership authentication is
an internal process.
x.509 Certificate The x.509 certificate for client authentication and the x.509 certificate for internal authentication
have different properties.
The client certificate must have the following properties:
A single Certificate Authority (CA) must issue the certificates for both the client and the server.
Client certificates must contain the following fields:
keyUsage = digitalSignature
extendedKeyUsage = clientAuth
The member certificate, used for internal authentication to verify membership to the sharded cluster or a replica set,
must have the following properties:
A single Certificate Authority (CA) must issue all the x.509 certificates for the members of a sharded cluster or
a replica set.
The member certificates subject, which contains the Distinguished Name (DN), must match the subject
of the certificate on the server, starting from and including the Organizational Unit (OU) of the certificate on the
server.
New Protocol and Parameters The change for x.509 authentication introduces a new MONGODB-X509 protocol. For internal authentication for membership, the change also introduces the --clusterAuthMode,
--sslClusterFile and the --sslClusterPassword options.
Use the --clusterAuthMode option to enable internal x.509 authentication for membership.
--clusterAuthMode option can have one of the following values:
The
Value
Description
keyfile
Default value. Use keyfile for authentication.
sendKeyfileFor rolling upgrade purposes. Send the keyfile for authentication but can accept either keyfile or
x.509 certificate.
sendX509
For rolling upgrade purposes. Send the x.509 certificate for authentication but can accept either
keyfile or x.509 certificate.
x509
Recommended. Send the x.509 certificate for authentication and accept only x.509 certificate.
For the --sslClusterFile option, specify the full path to the x.509 certificate and key PEM file for the cluster or
set member. If the key is encrypted, specify the password with the --sslClusterPassword option.
Configure MongoDB Server to Use x.509 Configure the MongoDB server from the command line, as in the following 138 :
mongod --sslOnNormalPorts --sslPEMKeyFile <path to sslCertificate and key PEM file> --sslCAFile <path
You may also specify these options in the configuration file (page 990):
sslOnNormalPorts = true
sslPEMKeyFile = <path to sslCertificate and key PEM file>
sslCAFile = <path to the root CA PEM file>
138
Include any additional options, SSL or otherwise, that are required for your specific configuration.
1086
To specify the x.509 certificate for internal cluster member authentication, append the additional SSL options
--clusterAuthMode and --sslClusterFile, as in the following example for a member of a replica set
2
:
mongod --replSet <name> --sslOnNormalPorts --sslPEMKeyFile <path to sslCertificate and key PEM file>
Authenticate with a x.509 Certificate To authenticate with a client certificate, you must first add a MongoDB user
that corresponds to the client certificate. See Add x.509 Certificate subject as a User (page 1087).
To authenticate, use the db.auth() (page 876) method in the $external database. For the mechanism field,
specify "MONGODB-X509", and for the user field, specify the user, or the subject, that corresponds to the client
certificate.
For example, if using the mongo (page 942) shell,
1. Connect mongo (page 942) shell to the mongod (page 925) set up for SSL:
mongo --ssl --sslPEMKeyFile <path to CA signed client PEM file>
2. To perform the authentication, use the db.auth() (page 876) method in the $external database.
db.getSiblingDB("$external").auth(
{
mechanism: "MONGODB-X509",
user: "CN=myName,OU=myOrgUnit,O=myOrg,L=myLocality,ST=mySta
}
)
Add x.509 Certificate subject as a User To authenticate with a client certificate, you must first add the value of
the subject from the client certificate as a MongoDB user.
1. You can retrieve the subject from the client certificate with the following command:
openssl x509 -in <pathToClient PEM> -inform PEM -subject -nameopt RFC2253
2. Add the value of the subject, omitting the spaces, from the certificate as a user. For example, in the mongo
(page 942) shell, to add the user to the test database:
use test
db.addUser({
user: "CN=myName,OU=myOrgUnit,O=myOrg,L=myLocality,ST=myState,C=myCountry",
userSource: '$external',
roles: ['readAnyDatabase', 'readWriteAnyDatabase']
})
See Add a User to a Database (page 257) for details on adding a user with roles using privilege documents (page 270).
Upgrade Clusters to x.509 Authentication To upgrade clusters that are currently using keyfile authentication to
x.509 authentication, use a rolling upgrade process:
1087
1. For each node of a cluster, set --clusterAuthMode to sendKeyFile. With this setting, each node continues to use its keyfile to authenticate itself as a member. However, each node can now accept either a keyfile
or the x.509 certificate from other members to authenticate those members. Upgrade all nodes of the cluster to
this setting.
2. Then, for each node of a cluster, set --clusterAuthMode to sendX509 and set --sslClusterFile
to the appropriate path of the nodes certificate. 139 With this setting, each node uses its x.509 certificate to
authenticate itself as a member. However, each node continues to accept either a keyfile or the x.509 certificate
from other members to authenticate those members. Upgrade all nodes of the cluster to this setting.
3. Optional but recommended. Finally, for each node of the cluster, set --clusterAuthMode to x509 to only
use the x.509 certificate for authentication.
Limit for maxConns Removed
Starting in MongoDB 2.5.0, there is no longer any upward limit for the maxConns (page 992), or mongod
--maxConns and mongos --maxConns options. Previous versions capped the maximum possible maxConns
(page 992) setting at 20,000 connections.
Background Index Builds Replicate to Secondaries
Starting in MongoDB 2.5.0, if you initiate a background index build (page 336) on a primary, the secondaries will
replicate the index build in the background. In previous versions of MongoDB, secondaries built all indexes in the
foreground, even if the primary built an index in the background.
For all index builds, secondaries will not begin building indexes until the primary has successfully completed the index
build.
mongod Automatically Continues in Progress Index Builds Following Restart
If your mongod (page 925) instance was building an index when it shutdown or terminated, mongod (page 925)
will now continue building the index when the mongod (page 925) restarts. Previously, the index build had to finish
building before mongod (page 925) shutdown.
To disable this behavior the 2.5 series adds a new run time option, noIndexBuildRetry (page 1088) (or via,
--noIndexBuildRetry on the command line,) for mongod (page 925). noIndexBuildRetry (page 1088)
prevents mongod (page 925) from continuing rebuilding indexes that did not finished building when the mongod
(page 925) last shut down.
noIndexBuildRetry
By default, mongod (page 925) will attempt to rebuild indexes upon start-up if mongod (page 925) shuts down
or stops in the middle of an index build. When enabled, this option prevents this behavior.
Global mongorc.js File
If the file mongorc.js exists in the /etc directory (or %ProgramData%\MongoDB directory on Windows), the
mongo (page 942) shell evaluates the contents of this file on start-up. Then, the mongo (page 942) shell evaluates the
users .mongorc.js file if the file exists in the users HOME directory.
The --norc (page 943) option for mongo (page 942) suppresses only the users .mongorc.js file.
Important: The mongorc.js in /etc directory must have read permission for the user running the shell.
139
If the key is encrypted, set the --sslClusterPassword to the password to decrypt the key.
1088
Geospatial Enhancements
145
See the documentation of Write Concern (page 55) for more information about write concern in MongoDB.
Please migrate to the new MongoClient class expeditiously.
140 http://geojson.org/geojson-spec.html
141 http://geojson.org/geojson-spec.html#id5
142 http://geojson.org/geojson-spec.html#id6
143 http://geojson.org/geojson-spec.html#id7
144 http://geojson.org/geojson-spec.html#geometrycollection
145 The drivers will call getLastError (page 720) without arguments, which is logically equivalent to the w:
1 option; however, this
operation allows replica set users to override the default write concern with the getLastErrorDefaults (page 483) setting in the Replica Set
Configuration (page 479).
1089
Releases
The following driver releases will include the changes outlined in Changes (page 1089). See each drivers release
notes for a full account of each release as well as other related driver-specific changes.
C#, version 1.7
Java, version 2.10.0
Node.js, version 1.2
Perl, version 0.501.1
PHP, version 1.4
Python, version 2.4
Ruby, version 1.8
1090
CHAPTER 13
The MongoDB Manual1 contains comprehensive documentation on the MongoDB document-oriented database management system. This page describes the manuals licensing, editions, and versions, and describes how to make a
change request and how to contribute to the manual.
For more information on MongoDB, see MongoDB: A Document Oriented Database2 . To download MongoDB, see
the downloads page3 .
13.1 License
This manual is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Unported4 (i.e.
CC-BY-NC-SA) license.
The MongoDB Manual is copyright 2011-2013 MongoDB, Inc.
13.2 Editions
In addition to the MongoDB Manual5 , you can also access this content in the following editions:
ePub Format6
Single HTML Page7
PDF Format8
HTML tar.gz9
You also can access PDF files that contain subsets of the MongoDB Manual:
MongoDB Reference Manual10
MongoDB CRUD Operations11
1 http://docs.mongodb.org/manual/#
2 http://www.mongodb.org/about/
3 http://www.mongodb.org/downloads
4 http://creativecommons.org/licenses/by-nc-sa/3.0/
5 http://docs.mongodb.org/manual/#
6 http://docs.mongodb.org/master/MongoDB-manual.epub
7 http://docs.mongodb.org/master/single/
8 http://docs.mongodb.org/master/MongoDB-manual.pdf
9 http://docs.mongodb.org/master/MongoDB-manual.tar.gz
10 http://docs.mongodb.org/master/MongoDB-reference-manual.pdf
11 http://docs.mongodb.org/master/MongoDB-crud-guide.pdf
1091
and
stable
version
of
the
manual
is
always
available
at
1092
In this direction, the MongoDB Documentation project uses the service provided by Smartling23 to translate the MongoDB documentation into additional non-English languages. This translation project is largely supported by the work
of volunteer translators from the MongoDB community who contribute to the translation effort.
If you would like to volunteer to help translate the MongoDB documentation, please:
complete the MongoDB Contributor Agreement24 , and
create an account on Smartling at translate.docs.mongodb.org25 .
Please use the same email address you use to sign the contributor as you use to create your Smartling account.
The mongodb-translators26 user group exists to facilitate collaboration between translators and the documentation
team at large. You can join the Google Group without signing the contributors agreement.
We currently have the following languages configured:
Arabic27
Chinese28
Czech29
French30
German31
Hungarian32
Indonesian33
Italian34
Japanese35
Korean36
Lithuanian37
Polish38
Portuguese39
Romanian40
Russian41
Spanish42
23 http://smartling.com/
24 http://www.mongodb.com/legal/contributor-agreement
25 http://translate.docs.mongodb.org/
26 http://groups.google.com/group/mongodb-translators
27 http://ar.docs.mongodb.org
28 http://cn.docs.mongodb.org
29 http://cs.docs.mongodb.org
30 http://fr.docs.mongodb.org
31 http://de.docs.mongodb.org
32 http://hu.docs.mongodb.org
33 http://id.docs.mongodb.org
34 http://it.docs.mongodb.org
35 http://jp.docs.mongodb.org
36 http://ko.docs.mongodb.org
37 http://lt.docs.mongodb.org
38 http://pl.docs.mongodb.org
39 http://pt.docs.mongodb.org
40 http://ro.docs.mongodb.org
41 http://ru.docs.mongodb.org
42 http://es.docs.mongodb.org
1093
Turkish43
Ukrainian44
If you would like to initiate a translation project to an additional language, please report this issue using the Report a
Problem link above or by posting to the mongodb-translators45 list.
Currently the translation project only publishes rendered translation. While the translation effort is currently focused
on the web site we are evaluating how to retrieve the translated phrases for use in other media.
See also:
Contribute to the Documentation (page 1092)
Style Guide and Documentation Conventions (page 1094)
MongoDB Manual Organization (page 1102)
MongoDB Documentation Practices and Processes (page 1098)
MongoDB Documentation Build System (page 1103)
The entire documentation source for this manual is available in the mongodb/docs repository46 , which is one of the
MongoDB project repositories on GitHub47 .
To contribute to the documentation, you can open a GitHub account48 , fork the mongodb/docs repository49 , make a
change, and issue a pull request.
In order for the documentation team to accept your change, you must complete the MongoDB Contributor Agreement50 .
You can clone the repository by issuing the following command at your system shell:
git clone git://github.com/mongodb/docs.git
1094
Document History
2011-09-27: Document created with a (very) rough list of style guidelines, conventions, and questions.
2012-01-12: Document revised based on slight shifts in practice, and as part of an effort of making it easier for people
outside of the documentation team to contribute to documentation.
2012-03-21: Merged in content from the Jargon, and cleaned up style in light of recent experiences.
2012-08-10: Addition to the Referencing section.
2013-02-07: Migrated this document to the manual. Added map-reduce terminology convention. Other edits.
Naming Conventions
This section contains guidelines on naming files, sections, documents and other document elements.
File naming Convention:
For Sphinx, all files should have a .txt extension.
Separate words in file names with hyphens (i.e. -.)
For most documents, file names should have a terse one or two word name that
scribes the material covered in the document.
Allow the path of the file within the
ument tree to add some of the required context/categorization.
For example its
ceptable
to
have
http://docs.mongodb.org/manualcore/sharding.rst
http://docs.mongodb.org/manualadministration/sharding.rst.
dedocacand
For tutorials, the full title of the document should be in the file name.
For example,
http://docs.mongodb.org/manualtutorial/replace-one-configuration-server-in-a-shard Phrase headlines and titles so that they the content contained within the section so that users can determine what
questions the text will answer, and material that it will address without needing them to read the content. This
shortens the amount of time that people spend looking for answers, and improvise search/scanning, and possibly
SEO.
Prefer titles and headers in the form of Using foo over How to Foo.
When using target references (i.e. :ref: references in documents,) use names that include enough context to
be intelligible thought all documentations. For example, use replica-set-secondary-only-node
as opposed to secondary-only-node. This is to make the source more usable and easier to maintain.
Style Guide
This includes the local typesetting, English, grammatical, conventions and preferences that all documents in the manual
should use. The goal here is to choose good standards, that are clear, and have a stylistic minimalism that does not
interfere with or distract from the content. A uniform style will improve user experience, and minimize the effect of a
multi-authored document.
Punctuation
Use the oxford comma.
Oxford commas are the commas in a list of things (e.g. something, something else, and another thing) before
the conjunction (e.g. and or or.).
Do not add two spaces after terminal punctuation, such as periods.
1095
1096
Use unavailable, offline, or unreachable to refer to a mongod instance that cannot be accessed. Do not
use the colloquialism down.
Always write out units (e.g. megabytes) rather than using abbreviations (e.g. MB.)
Structural Formulations
There should be at least two headings at every nesting level. Within an h2 block, there should be either: no
h3 blocks, 2 h3 blocks, or more than 2 h3 blocks.
Section headers are in title case (capitalize first, last, and all important words) and should effectively describe
the contents of the section. In a single document you should strive to have section titles that are not redundant
and grammatically consistent with each other.
Use paragraphs and paragraph breaks to increase clarity and flow. Avoid burying critical information in the
middle of long paragraphs. Err on the side of shorter paragraphs.
Prefer shorter sentences to longer sentences. Use complex formations only as a last resort, if at all (e.g. compound complex structures that require semi-colons).
Avoid paragraphs that consist of single sentences as they often represent a sentence that has unintentionally
become too complex or incomplete. However, sometimes such paragraphs are useful for emphasis, summary,
or introductions.
As a corollary, most sections should have multiple paragraphs.
For longer lists and more complex lists, use bulleted items rather than integrating them inline into a sentence.
Do not expect that the content of any example (inline or blocked,) will be self explanatory. Even when it feels
redundant, make sure that the function and use of every example is clearly described.
ReStructured Text and Typesetting
Place spaces between nested parentheticals and elements in JavaScript examples. For example, prefer { [ a,
a, a ] } over {[a,a,a]}.
For underlines associated with headers in RST, use:
= for heading level 1 or h1s. Use underlines and overlines for document titles.
- for heading level 2 or h2s.
~ for heading level 3 or h3s.
for heading level 4 or h4s.
Use hyphens (-) to indicate items of an ordered list.
Place footnotes and other references, if you use them, at the end of a section rather than the end of a file.
Use the footnote format that includes automatic numbering and a target name for ease of use. For instance a
footnote tag may look like: [#note]_ with the corresponding directive holding the body of the footnote that
resembles the following: .. [#note].
Do not include ..
code-block::
[language] in footnotes.
As it makes sense, use the .. code-block:: [language] form to insert literal blocks into the text.
While the double colon, ::, is functional, the .. code-block:: [language] form makes the source
easier to read and understand.
For all mentions of referenced types (i.e. commands, operators, expressions, functions, statuses, etc.) use the
reference types to ensure uniform formatting and cross-referencing.
1097
Geo-Location
1. While MongoDB is capable of storing coordinates in sub-documents, in practice, users should only store
coordinates in arrays. (See: DOCS-4154 .)
MongoDB Documentation Practices and Processes
This document provides an overview of the practices and processes.
54 https://jira.mongodb.org/browse/DOCS-41
1098
Practices
Commits
When relevant, include a Jira case identifier in a commit message. Reference documentation cases when applicable,
but feel free to reference other cases from jira.mongodb.org55 .
Err on the side of creating a larger number of discrete commits rather than bundling large set of changes into one
commit.
For the sake of consistency, remove trailing whitespaces in the source file.
Hard wrap files to between 72 and 80 characters per-line.
Standards and Practices
At least two people should vet all non-trivial changes to the documentation before publication. One of the
reviewers should have significant technical experience with the material covered in the documentation.
All development and editorial work should transpire on GitHub branches or forks that editors can then merge
into the publication branches.
Collaboration
1099
Builds
Building the documentation is useful because Sphinx59 and docutils can catch numerous errors in the format and
syntax of the documentation. Additionally, having access to an example documentation as it will appear to the users
is useful for providing more effective basis for the review process. Besides Sphinx, Pygments, and Python-Docutils,
the documentation repository contains all requirements for building the documentation resource.
Talk to someone on the documentation team if you are having problems running builds yourself.
Publication
The makefile for this repository contains targets that automate the publication process. Use make html to publish
a test build of the documentation in the build/ directory of your repository. Use make publish to build the full
contents of the manual from the current branch in the ../public-docs/ directory relative the docs repository.
Other targets include:
man - builds UNIX Manual pages for all Mongodb utilities.
push - builds and deploys the contents of the ../public-docs/.
pdfs - builds a PDF version of the manual (requires LaTeX dependencies.)
Branches
This section provides an overview of the git branches in the MongoDB documentation repository and their use.
At the present time, future work transpires in the master, with the main publication being current. As the
documentation stabilizes, the documentation team will begin to maintain branches of the documentation for specific
MongoDB releases.
Migration from Legacy Documentation
The MongoDB.org Wiki contains a wealth of information. As the transition to the Manual (i.e. this project and
resource) continues, its critical that no information disappears or goes missing. The following process outlines how
to migrate a wiki page to the manual:
1. Read the relevant sections of the Manual, and see what the new documentation has to offer on a specific topic.
In this process you should follow cross references and gain an understanding of both the underlying information
and how the parts of the new content relates its constituent parts.
2. Read the wiki page you wish to redirect, and take note of all of the factual assertions, examples presented by the
wiki page.
3. Test the factual assertions of the wiki page to the greatest extent possible. Ensure that example output is accurate.
In the case of commands and reference material, make sure that documented options are accurate.
4. Make corrections to the manual page or pages to reflect any missing pieces of information.
The target of the redirect need not contain every piece of information on the wiki page, if the manual as a
whole does, and relevant section(s) with the information from the wiki page are accessible from the target of the
redirection.
59 http://sphinx.pocoo.org/
1100
5. As necessary, get these changes reviewed by another writer and/or someone familiar with the area of the information in question.
At this point, update the relevant Jira case with the target that youve chosen for the redirect, and make the ticket
unassigned.
6. When someone has reviewed the changes and published those changes to Manual, you, or preferably someone
else on the team, should make a final pass at both pages with fresh eyes and then make the redirect.
Steps 1-5 should ensure that no information is lost in the migration, and that the final review in step 6 should be
trivial to complete.
Review Process
Types of Review The content in the Manual undergoes many types of review, including the following:
Initial Technical Review Review by an engineer familiar with MongoDB and the topic area of the documentation.
This review focuses on technical content, and correctness of the procedures and facts presented, but can improve any
aspect of the documentation that may still be lacking. When both the initial technical review and the content review
are complete, the piece may be published.
Content Review Textual review by another writer to ensure stylistic consistency with the rest of the manual. Depending on the content, this may precede or follow the initial technical review. When both the initial technical review
and the content review are complete, the piece may be published.
Consistency Review This occurs post-publication and is content focused. The goals of consistency reviews are to
increase the internal consistency of the documentation as a whole. Insert relevant cross-references, update the style as
needed, and provide background fact-checking.
When possible, consistency reviews should be as systematic as possible and we should avoid encouraging stylistic and
information drift by editing only small sections at a time.
Subsequent Technical Review If the documentation needs to be updated following a change in functionality of the
server or following the resolution of a user issue, changes may be significant enough to warrant additional technical
review. These reviews follow the same form as the initial technical review, but is often less involved and covers a
smaller area.
Review Methods If youre not a usual contributor to the documentation and would like to review something, you
can submit reviews in any of the following methods:
If youre reviewing an open pull request in GitHub, the best way to comment is on the overview diff, which
you can find by clicking on the diff button in the upper left portion of the screen. You can also use the
following URL to reach this interface:
https://github.com/mongodb/docs/pull/[pull-request-id]/files
Replace [pull-request-id] with the identifier of the pull request. Make all comments inline, using
GitHubs comment system.
You may also provide comments directly on commits, or on the pull request itself but these commit-comments
are archived in less coherent ways and generate less useful emails, while comments on the pull request lead to
less specific changes to the document.
1101
Leave feedback on Jira cases in the DOCS60 project. These are better for more general changes that arent
necessarily tied to a specific line, or affect multiple files.
Create a fork of the repository in your GitHub account, make any required changes and then create a pull request
with your changes.
If you insert lines that begin with any of the following annotations:
.. TODO:
TODO:
.. TODO
TODO
followed by your comments, it will be easier for the original writer to locate your comments. The two dots ..
format is a comment in reStructured Text, which will hide your comments from Sphinx and publication if youre
worried about that.
This format is often easier for reviewers with larger portions of content to review.
MongoDB Manual Organization
This document provides an overview of the global organization of the documentation resource. Refer to the notes
below if you are having trouble understanding the reasoning behind a files current location, or if you want to add new
documentation but arent sure how to integrate it into the existing resource.
If you have questions, dont hesitate to open a ticket in the Documentation Jira Project61 or contact the documentation
team62 .
Global Organization
Indexes
and
Experience The
documentation
project
has
two
index
files:
http://docs.mongodb.org/manualcontents.txt and http://docs.mongodb.org/manualindex.txt.
The contents file provides the documentations tree structure, which Sphinx uses to create the left-pane navigational
structure, to power the Next and Previous page functionality, and to provide all overarching outlines of the
resource. The index file is not included in the contents file (and thus builds will produce a warning here) and is
the page that users first land on when visiting the resource.
Having separate contents and index files provides a bit more flexibility with the organization of the resource while
also making it possible to customize the primary user experience.
Additionally, in the top level of the source/ directory, there are a number of topical index or outline files. These
(like the index and contents files) use the .. toctree:: directive to provide organization within the documentation. The subject-specific landing pages indexes combine to create the index in the contents file.
Topical Indexes and Meta Organization Because the documentation on any given subject exists in a number of
different locations across the resource the topical indexes provide the real structure and organization to the resource.
This organization makes it possible to provide great flexibility while still maintaining a reasonable organization of files
and URLs for the documentation. Consider the following example:
Given that topic such as replication, has material regarding the administration of replica sets, as well
as reference material, an overview of the functionality, and operational tutorials, it makes more sense to
include a few locations for documents, and use the meta documents to provide the topic-level organization.
Current landing pages include:
60 http://jira.mongodb.org/browse/DOCS
61 https://jira.mongodb.org/browse/DOCS
62 [email protected]
1102
administration
applications
crud
faq
mongo
reference
replication
security
sharding
Additional topical indexes are forthcoming.
The Top Level Folders
The documentation has a number of top-level folders, that hold all of the content of the resource. Consider the
following list and explanations below:
administration - contains all of the operational and architectural information that systems and database administrators need to know in order to run MongoDB. Topics include: monitoring, replica sets, shard clusters,
deployment architectures, and configuration.
applications - contains information about application development and use. While most documentation regarding application development is within the purview of the driver documentation, there are some larger topics
regarding the use of these features that deserve some coverage in this context. Topics include: drivers, schema
design, optimization, replication, and sharding.
core - contains overviews and introduction to the core features, functionality, and concepts of MongoDB.
Topics include: replication, sharding, capped collections, journaling/durability, aggregation.
reference - contains references and indexes of shell functions, database commands, status outputs, as well as
manual pages for all of the programs come with MongoDB (e.g. mongostat and mongodump.)
tutorial - contains operational guides and tutorials that lead users through common tasks (administrative and
conceptual) with MongoDB. This includes programming patterns and operational guides.
faq - contains all the frequently asked questions related to MongoDB, in a collection of topical files.
MongoDB Documentation Build System
This document contains more direct instructions for building the MongoDB documentation.
Getting Started
Install Dependencies The MongoDB Documentation project depends on the following tools:
GNU Make
GNU Tar
Python
Git
1103
Feel free to use pip rather than easy_install to install python packages.
To generate the images used in the documentation, download and install Inkscape63 .
Optional
To generate PDFs for the full production build, install a TeX distribution (for building the PDF.) If you do not have a
LaTeX installation, use MacTeX64 . This is only required to build PDFs.
Arch Linux Install packages from the system repositories with the following command:
pacman -S python2-sphinx python2-yaml inkscape python2-pip
Optional
To generate PDFs for the full production build, install the following packages from the system repository:
pacman -S texlive-bin texlive-core texlive-latexextra
Debian/Ubuntu Install the required system packages with the following command:
apt-get install python-sphinx python-yaml python-argparse inkscape python-pip
Optional
To generate PDFs for the full production build, install the following packages from the system repository:
63 http://inkscape.org/download/
64 http://www.tug.org/mactex/2011/
1104
Then run the bootstrap.py script in the docs/ repository, to configure the build dependencies:
python bootstrap.py
This downloads and configures the mongodb/docs-tools65 repository, which contains the authoritative build system
shared between branches of the MongoDB Manual and other MongoDB documentation projects.
You can run bootstrap.py regularly to update build system.
Building the Documentation
The MongoDB documentation build system is entirely accessible via make targets. For example, to build an HTML
version of the documentation issue the following command:
make html
You can find the build output in build/<branch>/html, where <branch> is the name of the current branch.
In addition to the html target, the build system provides the following targets:
publish Builds and integrates all output for the production build.
Build output is in
build/public/<branch>/. When you run publish in the master, the build will generate
some output in build/public/.
push; stage Uploads the production build to the production or staging web servers. Depends on publish. Requires access production or staging environment.
push-all; stage-all Uploads the entire content of build/public/ to the web servers.
publish. Not used in common practice.
Depends on
push-with-delete; stage-with-delete Modifies the action of push and stage to remove remote file
that dont exist in the local build. Use with caution.
html; latex; dirhtml; epub; texinfo; man; json These are standard targets derived from the default
Sphinx Makefile, with adjusted dependencies. Additionally, for all of these targets you can append -nitpick
to increase Sphinxs verbosity, or -clean to remove all Sphinx build artifacts.
latex performs several additional post-processing steps on .tex output generated by Sphinx. This target will
also compile PDFs using pdflatex.
html and man also generates a .tar.gz file of the build outputs for inclusion in the final releases.
Build Mechanics and Tools
Internally the build system has a number of components and processes. See the docs-tools README66 for more
information on the internals. This section documents a few of these components from a very high level and lists useful
operations for contributors to the documentation.
65 http://github.com/mongodb/docs-tools/
66 https://github.com/mongodb/docs-tools/blob/master/README.rst
1105
Fabric Fabric is an orchestration and scripting package for Python. The documentation uses Fabric to handle the
deployment of the build products to the web servers and also unifies a number of independent build operations. Fabric
commands have the following form:
fab <module>.<task>[:<argument>]
The <argument> is optional in most cases. Additionally some tasks are available at the root level, without a module.
To see a full list of fabric tasks, use the following command:
fab -l
You can chain fabric tasks on a single command line, although this doesnt always make sense.
Important fabric tasks include:
tools.bootstrap Runs the bootstrap.py script. Useful for re-initializing the repository without needing to
be in root of the repository.
tools.dev; tools.reset tools.dev switches the origin remote of the docs-tools checkout in build
directory, to ../docs-tools to facilitate build system testing and development. tools.reset resets the
origin remote for normal operation.
tools.conf tools.conf returns the content of the configuration object for the current project. These data are
useful during development.
stats.report:<filename> Returns, a collection of readability statistics.
source/ tree.
make Provides a thin wrapper around Make calls. Allows you to start make builds from different locations in the
project repository.
process.refresh_dependencies Updates the time stamp of .txt source files with changed include files, to
facilitate Sphinxs incremental rebuild process. This task runs internally as part of the build process.
Buildcloth Buildcloth67 is a meta-build tool, used to generate Makefiles programatically. This makes the build
system easier to maintain, and makes it easier to use the same fundamental code to generate various branches of the
Manual as well as related documentation projects. See makecloth/ in the docs-tools repository68 for the relevant code.
Running make with no arguments will regenerate these parts of the build system automatically.
Rstcloth Rstcloth69 is a library for generating reStructuredText programatically. This makes it possible to generate
content for the documentation, such as tables, tables of contents, and API reference material programatically and
transparently. See rstcloth/ in the docs-tools repository70 for the relevant code.
If you have any questions, please feel free to open a Jira Case71 .
67 https://pypi.python.org/pypi/buildcloth/
68 https://github.com/mongodb/docs-tools/tree/master/makecloth
69 https://pypi.python.org/pypi/rstcloth
70 https://github.com/mongodb/docs-tools/tree/master/rstcloth
71 https://jira.mongodb.org/browse/DOCS
1106
Index
Symbols
-{-}all
command line option 976
-{-}auth
command line option 927
-{-}authenticationDatabase <dbname>
command line option 942, 944, 953, 957, 963, 967,
971, 975, 980, 988
-{-}authenticationMechanism <name>
command line option 942, 944, 953, 958, 963, 967,
971, 976, 980, 988
-{-}autoresync
command line option 934
-{-}bind_ip <ip address>
command line option 926, 938
-{-}chunkSize <value>
command line option 941
-{-}collection <collection>, -c <collection>
command line option 953, 958, 967, 971, 989
-{-}config <filename>, -f <filename>
command line option 925, 938
-{-}configdb <config1>,<config2><:port>,<config3>
command line option 940
-{-}configsvr
command line option 935
-{-}cpu
command line option 927
-{-}csv
command line option 972
-{-}db <db>, -d <db>
command line option 953, 958, 967, 971, 989
-{-}dbpath <path>
command line option 928, 953, 958, 964, 967, 971,
988
-{-}diaglog <value>
command line option 928
-{-}directoryperdb
command line option 928, 953, 958, 964, 967, 971,
988
-{-}discover
1108
Index
Index
-{-}source <host><:port>
command line option 934
-{-}ssl
command line option 944, 952, 957, 963, 966, 970,
975, 987
-{-}sslCAFile <filename>
command line option 936, 945
-{-}sslCRLFile <filename>
command line option 936
-{-}sslFIPSMode
command line option 937
-{-}sslOnNormalPorts
command line option 935
-{-}sslPEMKeyFile <filename>
command line option 936, 944
-{-}sslPEMKeyPassword <value>
command line option 936, 944
-{-}sslWeakCertificateValidation
command line option 937
-{-}stopOnError
command line option 968
-{-}syncdelay <value>
command line option 932
-{-}sysinfo
command line option 933
-{-}syslog
command line option 926, 939
-{-}test
command line option 941
-{-}traceExceptions
command line option 933
-{-}type <=json|=debug>
command line option 961
-{-}type <MIME>, t <MIME>
command line option 989
-{-}type <json|csv|tsv>
command line option 968
-{-}unixSocketPrefix <path>
command line option 927, 940
-{-}upgrade
command line option 933, 941
-{-}upsert
command line option 968
-{-}upsertFields <field1[,field2]>
command line option 968
-{-}username <username>, -u <username>
command line option 943, 952, 957, 963, 966, 970,
975, 980, 988
-{-}verbose
command line option 945
-{-}verbose, -v
command line option 925, 938, 952, 956, 961, 962,
965, 969, 974, 979, 987
-{-}version
1109
Index
Index
1111
Index
Index
1113
1114
D
daemon, 1021
data-center awareness, 1021
data_binary (BSON type), 109
data_date (BSON type), 109
data_maxkey (BSON type), 109
data_minkey (BSON type), 109
data_oid (BSON type), 109
data_ref (BSON type), 109
data_regex (BSON type), 109
data_timestamp (BSON type), 109
data_undefined (BSON type), 109
database, 502, 1021
local, 485
database command, 1021
Database Name Case Sensitivity (MongoDB system
limit), 1018
database profiler, 1021
database references, 121
dataSize (database command), 769
Date (shell method), 916
datum, 1021
db.addUser (shell method), 875
db.auth (shell method), 876
db.changeUserPassword (shell method), 877
db.cloneCollection (shell method), 877
db.cloneDatabase (shell method), 877
db.collection.aggregate (shell method), 808
db.collection.count (shell method), 809
db.collection.createIndex (shell method), 810
db.collection.dataSize (shell method), 812
db.collection.distinct (shell method), 812
db.collection.drop (shell method), 812
db.collection.dropIndex (shell method), 812
db.collection.dropIndexes (shell method), 813
db.collection.ensureIndex (shell method), 814
db.collection.find (shell method), 816
db.collection.findAndModify (shell method), 821
Index
Index
1115
E
EDITOR, 595, 946
election, 1021
emptycapped (database command), 803
enableLocalhostAuthBypass (setParameter option), 1005
enableSharding (database command), 576, 736
enableTestCommands (setParameter option), 1005
environment variable
EDITOR, 595, 946
HOME, 205, 946
HOME directory, 1088
HOMEDRIVE, 946
HOMEPATH, 946
eval (database command), 722
eventual consistency, 1021
expireAfterSeconds, 755
explain.allPlans (MongoDB reporting output), 865
explain.clauses (MongoDB reporting output), 865
explain.clusteredType (MongoDB reporting output), 865
explain.cursor (MongoDB reporting output), 863
explain.indexBounds (MongoDB reporting output), 865
explain.indexOnly (MongoDB reporting output), 864
explain.isMultiKey (MongoDB reporting output), 864
explain.millis (MongoDB reporting output), 865
explain.millisShardAvg (MongoDB reporting output),
865
explain.millisShardTotal (MongoDB reporting output),
865
1116
F
failover, 1021
replica set, 396
fastsync (setting), 1000
FATAL (replica set state), 488
features (database command), 799
field, 1021
filemd5 (database command), 750
files._id (MongoDB reporting output), 111
files.aliases (MongoDB reporting output), 111
files.chunkSize (MongoDB reporting output), 111
files.contentType (MongoDB reporting output), 111
files.filename (MongoDB reporting output), 111
files.length (MongoDB reporting output), 111
files.md5 (MongoDB reporting output), 111
files.metadata (MongoDB reporting output), 111
files.uploadDate (MongoDB reporting output), 111
findAndModify (database command), 710
firewall, 1022
flushRouterConfig (database command), 735
forceerror (database command), 806
fork (setting), 993
fsync, 1022
fsync (database command), 751
fundamentals
sharding, 503
fuzzFile (shell method), 922
G
geohash, 1022
GeoJSON, 1022
geoNear (database command), 708
geoSearch (database command), 709
geospatial, 1022
geospatial queries, 354
exact, 354
Index
idempotent, 1022
handshake (database command), 799
index, 1022
hashed shard key, 1022
_id, 320
haystack index, 1022
background creation, 336
hidden member, 1022
compound, 322, 340
HOME, 205
create, 339, 340
HOME directory, 1088
create in background, 344
HOMEDRIVE, 946
drop duplicates, 337, 341
hostInfo (database command), 780
duplicates, 337, 341
hostInfo (MongoDB reporting output), 781
embedded fields, 321
hostInfo.extra (MongoDB reporting output), 781
hashed, 333, 342
hostInfo.extra.alwaysFullSync (MongoDB reporting outlist indexes, 347, 348
put), 781
measure use, 348
hostInfo.extra.cpuFeatures (MongoDB reporting output),
monitor index building, 347
782
multikey, 324
hostInfo.extra.cpuFrequencyMHz (MongoDB reporting
name, 337
output), 782
options, 335
hostInfo.extra.kernelVersion (MongoDB reporting outoverview, 313
put), 781
Index
1117
rebuild, 346
remove, 346
replica set, 343
sort order, 323
sparse, 334, 341
subdocuments, 321
TTL index, 334
unique, 334, 340
index (collection flag), 755
Index Key (MongoDB system limit), 1016
Index Name Length (MongoDB system limit), 1016
index types, 319
primary key, 320
indexStats (database command), 774
indexStats.bucketBodyBytes (MongoDB reporting output), 775
indexStats.depth (MongoDB reporting output), 775
indexStats.index (MongoDB reporting output), 775
indexStats.isIdIndex (MongoDB reporting output), 775
indexStats.keyPattern (MongoDB reporting output), 775
indexStats.overall (MongoDB reporting output), 775
indexStats.overall.bsonRatio (MongoDB reporting output), 775
indexStats.overall.fillRatio (MongoDB reporting output),
775
indexStats.overall.keyCount (MongoDB reporting output), 775
indexStats.overall.keyNodeRatio (MongoDB reporting
output), 775
indexStats.overall.numBuckets (MongoDB reporting output), 775
indexStats.overall.usedKeyCount (MongoDB reporting
output), 775
indexStats.perLevel (MongoDB reporting output), 775
indexStats.perLevel.bsonRatio (MongoDB reporting output), 776
indexStats.perLevel.fillRatio (MongoDB reporting output), 776
indexStats.perLevel.keyCount (MongoDB reporting output), 776
indexStats.perLevel.keyNodeRatio (MongoDB reporting
output), 776
indexStats.perLevel.numBuckets (MongoDB reporting
output), 775
indexStats.perLevel.usedKeyCount (MongoDB reporting
output), 776
indexStats.storageNs (MongoDB reporting output), 775
indexStats.version (MongoDB reporting output), 775
initial sync, 1022
installation, 3
installation guides, 3
installation tutorials, 3
internals
config database, 564
1118
IPv6, 1022
ipv6 (setting), 995
isdbgrid (database command), 743
isMaster (database command), 473, 732
isMaster.arbiterOnly (MongoDB reporting output), 474,
733
isMaster.arbiters (MongoDB reporting output), 474, 733
isMaster.hidden (MongoDB reporting output), 474, 734
isMaster.hosts (MongoDB reporting output), 474, 733
isMaster.ismaster (MongoDB reporting output), 473, 733
isMaster.localTime (MongoDB reporting output), 473,
733
isMaster.maxBsonObjectSize (MongoDB reporting output), 473, 733
isMaster.maxMessageSizeBytes (MongoDB reporting
output), 473, 733
isMaster.maxWireVersion (MongoDB reporting output),
1083
isMaster.me (MongoDB reporting output), 474, 734
isMaster.minWireVersion (MongoDB reporting output),
1083
isMaster.msg (MongoDB reporting output), 474, 733
isMaster.passive (MongoDB reporting output), 474, 734
isMaster.passives (MongoDB reporting output), 474, 733
isMaster.primary (MongoDB reporting output), 474, 733
isMaster.secondary (MongoDB reporting output), 474,
733
isMaster.setName (MongoDB reporting output), 474, 733
isMaster.tags (MongoDB reporting output), 474, 734
ISODate, 1022
J
JavaScript, 1022
journal, 1022
journal (setting), 995
journalCommitInterval (setParameter option), 1006
journalCommitInterval (setting), 995
journalLatencyTest (database command), 805
JSON, 1022
JSON document, 1022
JSONP, 1022
jsonp (setting), 996
K
keyFile (setting), 993
L
legacy coordinate pairs, 1023
Length of Database Names (MongoDB system limit),
1019
LineString, 1023
listCommands (database command), 762
listDatabases (database command), 761
listFiles (shell method), 922
Index
Index
M
map-reduce, 1023
mapping type, 1023
mapReduce (database command), 701
mapreduce.shardedfinish (database command), 800
master, 1023
master (setting), 1000
maxConns (setting), 992
Maximum Number of Documents in a Capped Collection
(MongoDB system limit), 1016
md5, 1023
md5sumFile (shell method), 923
medianKey (database command), 742
mergeChunks (database command), 1082
MIME, 1023
mkdir (shell method), 924
mongo, 1023
mongo (program), 942, 943
Mongo (shell method), 919
Mongo.getDB (shell method), 917
Mongo.getReadPrefMode (shell method), 918
Mongo.getReadPrefTagSet (shell method), 918
Mongo.setReadPref (shell method), 919
Mongo.setSlaveOk (shell method), 919
mongod, 1023
mongod (program), 925
mongod.exe (program), 948
MongoDB, 1023
mongodump (program), 951
mongoexport (program), 969
mongofiles (program), 986, 987
mongoimport (program), 965
mongooplog (program), 962
mongoperf (program), 984
mongoperf.fileSizeMB (setting), 985
mongoperf.mmf (setting), 985
mongoperf.nThreads (setting), 985
mongoperf.r (setting), 985
mongoperf.recSizeKB (setting), 985
mongoperf.sleepMicros (setting), 985
mongoperf.syncDelay (setting), 985
mongoperf.w (setting), 985
mongorestore (program), 956
mongos, 510, 1023
mongos (program), 938
mongos.exe (program), 950
1119
padding, 1024
padding factor, 1024
page fault, 1024
partition, 1024
passive member, 1024
pcap, 1024
PID, 1024
N
pidfilepath (setting), 993
ping (database command), 770
namespace, 1023
pipe, 1024
local, 485
pipeline, 1024
system, 228
Point, 1024
Namespace Length (MongoDB system limit), 1015
Polygon, 1024
natural order, 1023
port (setting), 991
nearest (read preference mode), 490
Nested Depth for BSON Documents (MongoDB system powerOf2Sizes, 1024
pre-splitting, 1024
limit), 1015
primary, 1025
netstat (database command), 770
primary (read preference mode), 489
noauth (setting), 996
PRIMARY (replica set state), 487
noAutoSplit (setting), 1003
primary key, 1025
nohttpinterface (setting), 996
primary shard, 1025
noIndexBuildRetry (setting), 1088
primaryPreferred (read preference mode), 489
nojournal (setting), 996
priority, 1025
noobjcheck (setting), 992
profile (database command), 770
noprealloc (setting), 996
profile (setting), 997
noscripting (setting), 996
projection, 1025
notablescan (setParameter option), 1006
pwd (shell method), 924
notablescan (setting), 996
nounixsocket (setting), 993
Q
nssize (setting), 996
Number of Indexed Fields in a Compound Index (Mon- Queries cannot use both text and Geospatial Indexes
goDB system limit), 1016
(MongoDB system limit), 1016
Number of Indexes per Collection (MongoDB system query, 1025
limit), 1016
query optimizer, 45, 1025
Number of Members of a Replica Set (MongoDB system quiet (setParameter option), 1007
limit), 1016
quiet (setting), 998
Number of Namespaces (MongoDB system limit), 1015
quit (shell method), 924
Number of Voting Members of a Replica Set (MongoDB quota (setting), 997
system limit), 1016
quotaFiles (setting), 997
1120
Index
architecture, 46
replSet (setting), 1000
connection pooling, 46
replSetElect (database command), 800
read operations
replSetFreeze (database command), 474, 726
query, 39
replSetFresh (database command), 800
read preference, 405, 1025
replSetGetRBID (database command), 800
background, 405
replSetGetStatus (database command), 475, 726
behavior, 408
replSetGetStatus.date (MongoDB reporting output), 475,
member selection, 408
727
modes, 489
replSetGetStatus.members (MongoDB reporting output),
mongos, 409
475, 727
nearest, 408
replSetGetStatus.members.errmsg (MongoDB reporting
ping time, 408
output), 476, 727
semantics, 489
replSetGetStatus.members.health (MongoDB reporting
sharding, 409
output), 476, 727
tag sets, 407, 451
replSetGetStatus.members.lastHeartbeat (MongoDB rereadAnyDatabase (user role), 269
porting output), 476, 728
readWrite (user role), 266
replSetGetStatus.members.name (MongoDB reporting
readWriteAnyDatabase (user role), 269
output), 475, 727
record size, 1025
replSetGetStatus.members.optime (MongoDB reporting
recovering, 1025
output), 476, 727
RECOVERING (replica set state), 488
replSetGetStatus.members.optime.i (MongoDB reporting
references, 121
output), 476, 727
reIndex (database command), 756
replSetGetStatus.members.optime.t (MongoDB reporting
releaseConnectionsAfterResponse (setParameter option),
output), 476, 727
1008
replSetGetStatus.members.optimeDate (MongoDB reremoveFile (shell method), 924
porting output), 476, 727
removeShard (database command), 578, 737
replSetGetStatus.members.pingMS (MongoDB reporting
renameCollection (database command), 744
output), 476, 728
repair (setting), 997
replSetGetStatus.members.self (MongoDB reporting outrepairDatabase (database command), 757
put), 475, 727
repairpath (setting), 997
replSetGetStatus.members.state (MongoDB reporting
replApplyBatchSize (setParameter option), 1006
output), 476, 727
replica pairs, 1025
replSetGetStatus.members.stateStr (MongoDB reporting
replica set, 1025
output), 476, 727
elections, 397
replSetGetStatus.members.uptime (MongoDB reporting
failover, 396, 397
output), 476, 727
index, 343
replSetGetStatus.myState (MongoDB reporting output),
local database, 485
475, 727
network partitions, 397
replSetGetStatus.set (MongoDB reporting output), 475,
reconfiguration, 455
727
resync, 450
replSetGetStatus.syncingTo (MongoDB reporting outrollbacks, 401
put), 476, 728
security, 238
replSetHeartbeat (database command), 800
sync, 410, 450
replSetInitiate (database command), 476, 728
tag sets, 451
replSetMaintenance (database command), 477, 729
replica set members
replSetReconfig (database command), 478, 729
arbiters, 388
replSetStepDown (database command), 730
delayed, 387
replSetSyncFrom (database command), 478, 730
hidden, 387
replSetTest (database command), 805
non-voting, 400
resetDbpath (shell method), 922
replication, 1025
resetError (database command), 722
replication lag, 1025
resident memory, 1025
replIndexPrefetch (setParameter option), 1006
REST, 1025
replIndexPrefetch (setting), 1000
rest (setting), 997
Index
1121
S
saslHostName (setParameter option), 1007
secondary, 1026
secondary (read preference mode), 489
SECONDARY (replica set state), 487
secondary index, 1026
secondary throttle, 550
secondaryPreferred (read preference mode), 489
security
replica set, 238
serverStatus (database command), 782
serverStatus.asserts (MongoDB reporting output), 792
serverStatus.asserts.msg (MongoDB reporting output),
792
serverStatus.asserts.regular (MongoDB reporting output),
792
serverStatus.asserts.rollovers (MongoDB reporting output), 792
serverStatus.asserts.user (MongoDB reporting output),
792
serverStatus.asserts.warning (MongoDB reporting output), 792
serverStatus.backgroundFlushing (MongoDB reporting
output), 789
1122
serverStatus.backgroundFlushing.average_ms
(MongoDB reporting output), 789
serverStatus.backgroundFlushing.flushes (MongoDB reporting output), 789
serverStatus.backgroundFlushing.last_finished
(MongoDB reporting output), 789
serverStatus.backgroundFlushing.last_ms (MongoDB reporting output), 789
serverStatus.backgroundFlushing.total_ms (MongoDB
reporting output), 789
serverStatus.connections (MongoDB reporting output),
787
serverStatus.connections.available (MongoDB reporting
output), 787
serverStatus.connections.current (MongoDB reporting
output), 787
serverStatus.connections.totalCreated (MongoDB reporting output), 787
serverStatus.cursors (MongoDB reporting output), 789
serverStatus.cursors.clientCursors_size (MongoDB reporting output), 789
serverStatus.cursors.timedOut (MongoDB reporting output), 789
serverStatus.cursors.totalOpen (MongoDB reporting output), 789
serverStatus.dur (MongoDB reporting output), 793
serverStatus.dur.commits (MongoDB reporting output),
793
serverStatus.dur.commitsInWriteLock (MongoDB reporting output), 793
serverStatus.dur.compression (MongoDB reporting output), 793
serverStatus.dur.earlyCommits (MongoDB reporting output), 793
serverStatus.dur.journaledMB (MongoDB reporting output), 793
serverStatus.dur.timeMS (MongoDB reporting output),
793
serverStatus.dur.timeMS.dt (MongoDB reporting output),
793
serverStatus.dur.timeMS.prepLogBuffer (MongoDB reporting output), 793
serverStatus.dur.timeMS.remapPrivateView (MongoDB
reporting output), 794
serverStatus.dur.timeMS.writeToDataFiles (MongoDB
reporting output), 794
serverStatus.dur.timeMS.writeToJournal (MongoDB reporting output), 793
serverStatus.dur.writeToDataFilesMB (MongoDB reporting output), 793
serverStatus.extra_info (MongoDB reporting output), 788
serverStatus.extra_info.heap_usage_bytes (MongoDB reporting output), 788
serverStatus.extra_info.note (MongoDB reporting out-
Index
put), 788
serverStatus.extra_info.page_faults (MongoDB reporting
output), 788
serverStatus.globalLock (MongoDB reporting output),
785
serverStatus.globalLock.activeClients (MongoDB reporting output), 786
serverStatus.globalLock.activeClients.readers
(MongoDB reporting output), 786
serverStatus.globalLock.activeClients.total (MongoDB
reporting output), 786
serverStatus.globalLock.activeClients.writers (MongoDB
reporting output), 786
serverStatus.globalLock.currentQueue
(MongoDB
reporting output), 786
serverStatus.globalLock.currentQueue.readers
(MongoDB reporting output), 786
serverStatus.globalLock.currentQueue.total (MongoDB
reporting output), 786
serverStatus.globalLock.currentQueue.writers
(MongoDB reporting output), 786
serverStatus.globalLock.lockTime (MongoDB reporting
output), 785
serverStatus.globalLock.ratio (MongoDB reporting output), 786
serverStatus.globalLock.totalTime (MongoDB reporting
output), 785
serverStatus.host (MongoDB reporting output), 783
serverStatus.indexCounters (MongoDB reporting output),
788
serverStatus.indexCounters.accesses (MongoDB reporting output), 788
serverStatus.indexCounters.hits (MongoDB reporting
output), 788
serverStatus.indexCounters.misses (MongoDB reporting
output), 788
serverStatus.indexCounters.missRatio (MongoDB reporting output), 788
serverStatus.indexCounters.resets (MongoDB reporting
output), 788
serverStatus.localTime (MongoDB reporting output), 783
serverStatus.locks (MongoDB reporting output), 783
serverStatus.locks.. (MongoDB reporting output), 783
serverStatus.locks...timeAcquiringMicros (MongoDB reporting output), 784
serverStatus.locks...timeAcquiringMicros.R (MongoDB
reporting output), 784
serverStatus.locks...timeAcquiringMicros.W (MongoDB
reporting output), 784
serverStatus.locks...timeLockedMicros (MongoDB reporting output), 783
serverStatus.locks...timeLockedMicros.R (MongoDB reporting output), 784
serverStatus.locks...timeLockedMicros.r (MongoDB re-
Index
1123
787
serverStatus.mem.supported (MongoDB reporting output), 787
serverStatus.mem.virtual (MongoDB reporting output),
787
serverStatus.metrics (MongoDB reporting output), 795
serverStatus.metrics.document (MongoDB reporting output), 795
serverStatus.metrics.document.deleted
(MongoDB
reporting output), 795
serverStatus.metrics.document.inserted (MongoDB reporting output), 796
serverStatus.metrics.document.returned (MongoDB reporting output), 796
serverStatus.metrics.document.updated (MongoDB reporting output), 796
serverStatus.metrics.getLastError (MongoDB reporting
output), 796
serverStatus.metrics.getLastError.wtime (MongoDB reporting output), 796
serverStatus.metrics.getLastError.wtime.num (MongoDB
reporting output), 796
serverStatus.metrics.getLastError.wtime.totalMillis
(MongoDB reporting output), 796
serverStatus.metrics.getLastError.wtimeouts (MongoDB
reporting output), 796
serverStatus.metrics.operation (MongoDB reporting output), 796
serverStatus.metrics.operation.fastmod (MongoDB reporting output), 796
serverStatus.metrics.operation.idhack (MongoDB reporting output), 796
serverStatus.metrics.operation.scanAndOrder (MongoDB
reporting output), 796
serverStatus.metrics.queryExecutor (MongoDB reporting
output), 796
serverStatus.metrics.queryExecutor.scanned (MongoDB
reporting output), 796
serverStatus.metrics.record (MongoDB reporting output),
796
serverStatus.metrics.record.moves (MongoDB reporting
output), 796
serverStatus.metrics.repl (MongoDB reporting output),
797
serverStatus.metrics.repl.apply (MongoDB reporting output), 797
serverStatus.metrics.repl.apply.batches (MongoDB reporting output), 797
serverStatus.metrics.repl.apply.batches.num (MongoDB
reporting output), 797
serverStatus.metrics.repl.apply.batches.totalMillis (MongoDB reporting output), 797
serverStatus.metrics.repl.apply.ops (MongoDB reporting
output), 797
1124
Index
Index
1125
1126
Index
1127
V
v (setting), 991
validate (database command), 771
validate.bytesWithHeaders (MongoDB reporting output),
773
validate.bytesWithoutHeaders (MongoDB reporting output), 773
validate.datasize (MongoDB reporting output), 772
validate.deletedCount (MongoDB reporting output), 773
validate.deletedSize (MongoDB reporting output), 773
validate.errors (MongoDB reporting output), 774
validate.extentCount (MongoDB reporting output), 772
validate.extents (MongoDB reporting output), 772
validate.extents.firstRecord (MongoDB reporting output),
772
validate.extents.lastRecord (MongoDB reporting output),
772
validate.extents.loc (MongoDB reporting output), 772
validate.extents.nsdiag (MongoDB reporting output), 772
validate.extents.size (MongoDB reporting output), 772
validate.extents.xnext (MongoDB reporting output), 772
validate.extents.xprev (MongoDB reporting output), 772
validate.firstExtent (MongoDB reporting output), 772
validate.firstExtentDetails (MongoDB reporting output),
772
validate.firstExtentDetails.firstRecord (MongoDB reporting output), 773
validate.firstExtentDetails.lastRecord (MongoDB reporting output), 773
validate.firstExtentDetails.loc (MongoDB reporting output), 772
validate.firstExtentDetails.nsdiag (MongoDB reporting
output), 773
validate.firstExtentDetails.size (MongoDB reporting output), 773
validate.firstExtentDetails.xnext (MongoDB reporting
output), 773
validate.firstExtentDetails.xprev (MongoDB reporting
output), 773
validate.invalidObjects (MongoDB reporting output), 773
validate.keysPerIndex (MongoDB reporting output), 773
validate.lastExtent (MongoDB reporting output), 772
validate.lastExtentSize (MongoDB reporting output), 772
validate.nIndexes (MongoDB reporting output), 773
validate.nrecords (MongoDB reporting output), 772
validate.ns (MongoDB reporting output), 772
validate.objectsFound (MongoDB reporting output), 773
validate.ok (MongoDB reporting output), 774
validate.padding (MongoDB reporting output), 772
validate.valid (MongoDB reporting output), 773
verbose (setting), 991
version (shell method), 921
1128
W
waitMongoProgramOnPort (shell method), 915
waitProgram (shell method), 915
WGS84, 1027
whatsmyuri (database command), 779
working set, 1027
write concern, 54, 1027
write lock, 1027
write operations, 50
writebacklisten (database command), 801
writeBacks, 1027
writeBacksQueued (database command), 801
writeBacksQueued.hasOpsQueued (MongoDB reporting
output), 801
writeBacksQueued.queues (MongoDB reporting output),
801
writeBacksQueued.queues.minutesSinceLastCall (MongoDB reporting output), 801
writeBacksQueued.queues.n (MongoDB reporting output), 801
writeBacksQueued.totalOpsQueued (MongoDB reporting output), 801
Index