How to Configure a MongoDB Replica Set on Ubuntu 20.04

An early version of this tutorial was written by Justin Ellingwood.

Introduction

MongoDB, also known as _Mongo_, is a document-oriented database used in many modern web applications. It is classified as a NoSQL database because it does not rely on a traditional table-based relational database structure. Instead, it uses JSON-like documents with dynamic schemas. This means that, unlike relational databases, MongoDB does not require a predefined schema before you add data to a database.

When you work with databases, it is often useful to have multiple copies of your data. This provides redundancy if one of the database servers fails, and it can improve a database's availability and scalability as well as reduce read latency. The practice of synchronizing data across multiple separate databases is called replication. In MongoDB, a group of servers that maintain the same data set through replication is called a replica set.

This tutorial gives a brief overview of how replication works in MongoDB and outlines how to configure and initiate a replica set with three members. In this example configuration, each member of the replica set will be a distinct MongoDB instance running on its own Ubuntu 20.04 server.

Note: The procedures outlined in this guide are intended to demonstrate how to get a replica set up and running quickly. After completing this tutorial you will have a working replica set, but it will not have any security features enabled. This setup is not recommended for production environments.

MongoDB Community Edition comes with two authentication methods that can help keep your database secure: _keyfile authentication_ and _x.509 authentication_. For production deployments that use replication, the MongoDB documentation recommends x.509 authentication and describes keyfiles as a bare-minimum form of security that is best suited to testing or development environments. However, obtaining and configuring x.509 certificates involves many caveats and decisions that must be made on a case-by-case basis, which is beyond the scope of a DigitalOcean tutorial.

If you plan to use your replica set for testing or development, we strongly encourage you to follow our tutorial How to Configure Keyfile Authentication for MongoDB Replica Sets on Ubuntu 20.04.

Prerequisites

To complete this guide, you will need:

  • Three servers, each running Ubuntu 20.04. All three servers should have a non-root administrative user and a firewall configured with UFW. To set these up, follow our initial server setup guide for Ubuntu 20.04.
  • MongoDB installed on each of your Ubuntu servers. To do this, follow our tutorial How to Install MongoDB on Ubuntu 20.04, making sure to complete each step on all three servers.

Please note that, for clarity, this guide refers to the three servers as mongo0, mongo1, and mongo2. The text introducing each command or file change names the server it applies to. Commands that must be run, or file changes that must be made, on every server are described as such.

Understanding MongoDB replica sets

As mentioned in the introduction, MongoDB handles replication through an implementation called replica sets. Each running MongoDB instance that belongs to a given replica set is called one of its members. Every replica set must have one _primary_ member and at least one _secondary_ member.

The primary member is the main point of access for transactions with the replica set, and it is the only member that can accept write operations. A replica set can have only one primary at a time because replication works by copying the primary's _oplog_ (short for "operations log") and replaying the logged changes on each secondary's own data set. Multiple primaries accepting writes would lead to data conflicts.

By default, applications direct both read and write operations to the primary member only. You can configure your setup to read from one or more secondary members instead, but because data is replicated asynchronously, reads from secondary nodes may return stale data. Such a configuration is therefore not ideal for every use case.
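Read behavior is normally chosen by the client rather than by the servers. As a minimal sketch, assuming the hostnames and the replica set name rs0 used in this guide, the following plain JavaScript assembles a standard MongoDB connection string that sets a read preference through the replicaSet and readPreference options. The helper name buildReplicaSetUri is made up for illustration and is not part of any MongoDB tool.

```javascript
// Read preference modes defined by MongoDB; "primary" is the default.
const READ_PREFERENCES = [
  "primary",
  "primaryPreferred",
  "secondary",
  "secondaryPreferred",
  "nearest",
];

// Hypothetical helper: builds a connection string for a replica set.
function buildReplicaSetUri(hosts, replSetName, readPreference = "primary") {
  if (!READ_PREFERENCES.includes(readPreference)) {
    throw new Error(`Unknown read preference: ${readPreference}`);
  }
  // Hosts are comma-separated; options follow "/?" as key=value pairs.
  return `mongodb://${hosts.join(",")}/?replicaSet=${replSetName}&readPreference=${readPreference}`;
}

const uri = buildReplicaSetUri(
  [
    "mongo0.replset.member:27017",
    "mongo1.replset.member:27017",
    "mongo2.replset.member:27017",
  ],
  "rs0",
  "secondaryPreferred"
);
console.log(uri);
```

With secondaryPreferred, a driver reads from a secondary when one is available and falls back to the primary otherwise, which is one way to accept possibly stale reads in exchange for spreading read load.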

One feature that distinguishes MongoDB's replica sets from other replication implementations is their automatic failover mechanism. If the primary member becomes unavailable, an automated election takes place among the secondary nodes to choose a new primary. A replica set can have up to 50 members, but at most 7 of them can vote in an election.

However, if the pool of secondary members contains an even number of nodes, a voting deadlock may make it impossible to elect a new primary. This calls for a third type of replica set member: the _arbiter_. An arbiter is an optional member whose role is to vote in elections so that the replica set can reach a decision. Note that arbiters do not hold a copy of the data set and can never become primary. If a replica set has only one secondary member, an arbiter is needed.
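The arithmetic behind ties can be sketched in a few lines. This example relies on the fact that an election is won with the votes of a strict majority of the set's voting members; the function names are invented for illustration.

```javascript
// Votes needed to elect a primary: a strict majority of the voting members.
function votesNeeded(votingMembers) {
  return Math.floor(votingMembers / 2) + 1;
}

// Number of voting members that can be unavailable while an election can still succeed.
function faultTolerance(votingMembers) {
  return votingMembers - votesNeeded(votingMembers);
}

for (const n of [2, 3, 4, 5]) {
  console.log(
    `${n} voting members: majority ${votesNeeded(n)}, tolerates ${faultTolerance(n)} failure(s)`
  );
}
```

Going from three voting members to four raises the required majority without raising the number of failures the set can survive, which is why an odd number of voting members (reached by adding an arbiter if necessary) is the usual goal.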

Sometimes, you may not want all of your secondary members to follow the standard rules for replica set secondaries. MongoDB lets you configure secondary members to take on the following non-standard roles:

  • Priority 0 replica set members. In some cases, electing a particular member as primary could hurt your application's performance. For example, if you are replicating data to a remote data center, or a secondary member's hardware is not adequate for it to serve as the set's main point of access, setting its priority to 0 ensures that it can never become primary while it continues to replicate data.
  • Hidden replica set members. In some situations, you need to keep one group of members accessible and visible to clients while hiding background members that serve a separate purpose and should not be used for reads. For example, you may need a secondary member to serve as the basis for analytics work that benefits from an up-to-date data set but would put a strain on a working member. Setting this member to hidden keeps it from interfering with the general operation of the replica set. Hidden members must have a priority of 0 so that they never become primary, but they can vote in elections.
  • Delayed replica set members. By setting a delay option on a secondary member, you can control how long it waits before applying each operation it copies from the primary's oplog. This is useful if you want to guard against accidental deletions or recover from other destructive operations. For example, if you delay a secondary by half a day, it will not immediately apply an accidental operation to its own data set, so it can be used to revert the change. Delayed members cannot become primary, but they can vote in elections. In most cases they should also be hidden to keep applications from reading stale data.
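The three roles above map onto fields of a member's configuration document. The following is a sketch under stated assumptions: the hostnames mongo3 through mongo5 are placeholders for servers that are not set up in this guide, and the name of the delay field depends on your MongoDB version (secondaryDelaySecs on MongoDB 5.0 and later, slaveDelay on earlier releases), so check which one your installation expects.

```javascript
// Hypothetical member documents; mongo3..mongo5 are placeholder hostnames.

// Priority 0: replicates data and votes, but can never be elected primary.
const priorityZeroMember = { _id: 3, host: "mongo3.replset.member", priority: 0 };

// Hidden: must also have priority 0; client applications do not see it.
const hiddenMember = { _id: 4, host: "mongo4.replset.member", priority: 0, hidden: true };

// Delayed: applies oplog entries only after the given number of seconds.
// ASSUMPTION: the field is "secondaryDelaySecs" on MongoDB 5.0+ and
// "slaveDelay" on earlier releases; pass the one your version uses.
function delayedMember(id, host, delaySeconds, delayField = "secondaryDelaySecs") {
  return { _id: id, host: host, priority: 0, hidden: true, [delayField]: delaySeconds };
}

const halfDay = 12 * 60 * 60; // half a day expressed in seconds

console.log(JSON.stringify(priorityZeroMember));
console.log(JSON.stringify(hiddenMember));
console.log(JSON.stringify(delayedMember(5, "mongo5.replset.member", halfDay)));
```

Documents shaped like these go in the members array of the replica set configuration, alongside the plain members configured in Step 4.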

Step 1 - Configure DNS resolution

When you initiate the replica set in Step 4, you will need to provide an address at which each member can be reached by the other two members of the set. The MongoDB documentation recommends against using IP addresses when configuring a replica set, because IP addresses can change unexpectedly. Instead, MongoDB recommends using logical DNS hostnames.

One way to do this is to configure a subdomain for each replica set member. Although subdomains are ideal for production environments or other long-term solutions, this tutorial outlines how to configure DNS resolution by editing each server's hosts file.

hosts is a special file that lets you assign human-readable hostnames to numeric IP addresses. This means that if the IP address of any of your servers changes, you only need to update the hosts file on the three servers rather than reconfigure the replica set.

On Linux and other Unix-like systems, hosts is stored in the /etc/ directory. On each of your three servers, open the file with your preferred text editor. Here, we will use nano:

    sudo nano /etc/hosts

Below the first few lines, which configure the localhost, add an entry for each member of the replica set. Each entry takes the form of an IP address followed by a human-readable name of your choice, as in this example:

/etc/hosts

    IP_address any_hostname

You can assign your servers any hostnames you like, but it helps to make each hostname descriptive. In the examples in this guide, the three servers use these hostnames:

  • mongo0.replset.member
  • mongo1.replset.member
  • mongo2.replset.member

Using these hostnames, your /etc/hosts file will look similar to the following, with one new line per replica set member:

/etc/hosts

    ...
    127.0.0.1 localhost

    203.0.113.0 mongo0.replset.member
    203.0.113.1 mongo1.replset.member
    203.0.113.2 mongo2.replset.member
    ...

If you don't know your servers' IP addresses, you can run the following curl command on each server to retrieve them. icanhazip.com is a website that displays the IP address of any computer used to access it. When you pass its URL as the argument to curl, the command prints the IP address of the server you run it on to standard output:

    curl -4 icanhazip.com

If you use DigitalOcean Droplets, you can also find your servers' IP addresses in the control panel.

The new lines you add here should be identical on all three servers. Save and close the file on each server. If you used nano to edit the files, do so by pressing CTRL + X, then Y, then ENTER.

Once you have edited, saved, and closed the hosts file on each server, DNS resolution for the replica set is configured. You can now move on to updating each server's firewall rules so that the servers can communicate with one another.

Step 2 - Update the firewall configuration of each server with UFW

Assuming you followed the prerequisite initial server setup guide, you will have set up a firewall on each server where MongoDB is installed and enabled access for the OpenSSH UFW profile. This is an important security measure, because these firewalls currently block connections to every port on your servers except ssh connections that present keys matching those in each server's respective authorized_keys file.

However, these firewalls also prevent the MongoDB instances on the three servers from communicating with one another, which would stop you from initiating the replica set. To fix this, you need to add firewall rules that give each server access to the port on the other two servers on which MongoDB listens for connections.

On mongo0, run the following ufw command to allow mongo1 to access port 27017 on mongo0:

    sudo ufw allow from mongo1_server_ip to any port 27017

Be sure to change mongo1_server_ip to reflect your mongo1 server's actual IP address. Note that the ufw command does not work with the hostnames configured in the hosts file, so use your servers' actual IP addresses in this command and the ones that follow. Also, if you have updated the MongoDB instance on this server to use a non-default port, change 27017 to reflect the port your MongoDB instance actually uses.

Then add another firewall rule to give mongo2 access to the same port:

    sudo ufw allow from mongo2_server_ip to any port 27017

Next, update the firewall rules on the other two servers. Run the following commands on mongo1, changing the IP addresses to reflect those of mongo0 and mongo2, respectively:

    sudo ufw allow from mongo0_server_ip to any port 27017
    sudo ufw allow from mongo2_server_ip to any port 27017

Finally, run these two commands on mongo2. Again, make sure you enter the correct IP address for each server:

    sudo ufw allow from mongo0_server_ip to any port 27017
    sudo ufw allow from mongo1_server_ip to any port 27017
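The pattern in the commands above is that every server admits the other two and needs no rule for itself. As an illustrative sketch, the following JavaScript prints the rules for each server. The IP addresses are the example addresses from the hosts file in Step 1, and the helper name ufwRulesFor is hypothetical; the script only prints commands and does not change any firewall.

```javascript
// Example addresses from the /etc/hosts sample in Step 1; replace with your own.
const servers = {
  mongo0: "203.0.113.0",
  mongo1: "203.0.113.1",
  mongo2: "203.0.113.2",
};

// Hypothetical helper: lists the UFW rules that let the other members reach one server.
function ufwRulesFor(serverName, allServers, port = 27017) {
  return Object.entries(allServers)
    .filter(([name]) => name !== serverName) // a server needs no rule for itself
    .map(([, ip]) => `sudo ufw allow from ${ip} to any port ${port}`);
}

for (const name of Object.keys(servers)) {
  console.log(`# run on ${name}`);
  ufwRulesFor(name, servers).forEach((rule) => console.log(rule));
}
```

Printing the commands rather than executing them leaves you free to review each rule before running it on the server named in the comment line.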

After adding these UFW rules, each of your three MongoDB servers will be allowed to access the ports used by MongoDB on the other two servers. However, you cannot test this yet, because the Mongo instance on each server currently blocks any external connections. After enabling replication by updating the configuration file of each MongoDB instance in the next step, you can perform this test.

Step 3 - Enable replication in the MongoDB configuration file of each server

At this point, you have edited each server's /etc/hosts file to configure hostnames that resolve to each server's IP address. You have also opened each server's firewall so that the other two servers can access the default MongoDB port, 27017. You are now ready to configure the MongoDB installation on each server to enable replication.

This step outlines how to do that by editing the MongoDB configuration file, /etc/mongod.conf. You must complete every procedure in this step on each server, but for demonstration purposes the examples use mongo0.

On mongo0, open the MongoDB configuration file with your preferred text editor:

    sudo nano /etc/mongod.conf

Although you have opened each server's firewall so that the other servers can access port 27017, MongoDB is currently bound to 127.0.0.1, the local loopback network interface. This means that MongoDB can only accept connections that originate on the server where it is installed.

To allow remote connections, you must bind MongoDB to your server's publicly routable IP address in addition to 127.0.0.1. This way, your MongoDB installation will be able to listen for connections made to it from remote machines.

Find the network interfaces section. By default, it looks like this:

/etc/mongod.conf

    ...
    # network interfaces
    net:
      port: 27017
      bindIp: 127.0.0.1
    ...

Append a comma to the bindIp: line, followed by mongo0's hostname or public IP address. This example uses the hostname configured in Step 1:

/etc/mongod.conf

    ...
    # network interfaces
    net:
      port: 27017
      bindIp: 127.0.0.1,mongo0.replset.member
    ...

Next, find the line that reads #replication: near the bottom of the file. It will look like this:

/etc/mongod.conf

    ...
    #replication:
    ...

Uncomment this line by removing the pound sign (#). Then add a replSetName directive below it, followed by the name MongoDB will use to identify the replica set:

/etc/mongod.conf

    ...
    replication:
      replSetName: "rs0"
    ...

In this example, the value of the replSetName directive is "rs0". You can provide any name you like here, but a descriptive name can be helpful. Keep in mind, though, that each server's mongod.conf file must have the same name after the replSetName directive so that their MongoDB instances become members of the same replica set.

Note that there are two spaces before the replSetName directive and that the name is wrapped in quotation marks ("). Both are necessary for the configuration to be read correctly.

After you update these two sections of the file, net and replication, they will look like this:

/etc/mongod.conf

    ...
    # network interfaces
    net:
      port: 27017
      bindIp: 127.0.0.1,mongo0.replset.member
    ...
    replication:
      replSetName: "rs0"
    ...

Save and close the file. Then make the same changes to the /etc/mongod.conf file on mongo1 and mongo2. Afterwards, the updated sections of mongo1's configuration file will look like this:

/etc/mongod.conf

    ...
    # network interfaces
    net:
      port: 27017
      bindIp: 127.0.0.1,mongo1.replset.member
    ...
    replication:
      replSetName: "rs0"
    ...

And the following is how these sections will look in mongo2's configuration file:

/etc/mongod.conf

    ...
    # network interfaces
    net:
      port: 27017
      bindIp: 127.0.0.1,mongo2.replset.member
    ...
    replication:
      replSetName: "rs0"
    ...

To reiterate, the IP address or hostname you add to the bindIp directive must be that of the server whose mongod.conf file you are editing.
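To make the per-server difference explicit, here is a small sketch that renders the two edited sections for each member. It is only an illustration of which values change (the server's own hostname in bindIp) and which stay fixed (the replica set name); the function name mongodConfSections is invented, and the output is meant to be compared against your files, not written over them.

```javascript
// Hypothetical helper: renders the net and replication sections for one server.
function mongodConfSections(ownHostname, replSetName, port = 27017) {
  return [
    "# network interfaces",
    "net:",
    `  port: ${port}`,
    `  bindIp: 127.0.0.1,${ownHostname}`, // loopback plus this server's own hostname
    "",
    "replication:",
    `  replSetName: "${replSetName}"`, // two-space indent, same value on every member
  ].join("\n");
}

const memberHostnames = [
  "mongo0.replset.member",
  "mongo1.replset.member",
  "mongo2.replset.member",
];

for (const hostname of memberHostnames) {
  console.log(`--- ${hostname} ---`);
  console.log(mongodConfSections(hostname, "rs0"));
}
```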

After making these changes to each server's mongod.conf file, save and close each file. Then restart the mongod service on each server by issuing the following command:

    sudo systemctl restart mongod

With that, you have enabled replication for each server's MongoDB instance.

Note: At this point, you can use the nc command to test whether the firewall rules you added in Step 2 are correct. nc, short for _netcat_, is a utility for establishing TCP or UDP network connections. It is useful for testing in cases like this because it lets you specify both an IP address and a port number when making a connection.

The following example nc command includes the -z option, which limits nc to scanning for a listening daemon on the target server without sending it any data. Recall from the prerequisite installation tutorial that MongoDB runs as a service daemon, which makes this option useful for testing connectivity. The command also includes the -v option, which increases its verbosity so that it returns more information than it otherwise would.

This example nc command attempts to reach mongo1 from mongo0:

    nc -zv mongo1.replset.member 27017

The output below shows that mongo0 can reach mongo1 on the port used by MongoDB:

Output
    Connection to mongo1.replset.member 27017 port [tcp/*] succeeded!

You can test the connection between each pair of servers by repeating this command on each server and specifying the appropriate hostname or IP address.

Once you have edited each server's mongod.conf file to enable replication and restarted the mongod service, you can initiate the replica set and add each MongoDB instance as a member.

Step 4 - Start the replica set and add members

Now that you have configured the three MongoDB installations, you can open the MongoDB shell to initiate replication and add each instance as a member.

For demonstration purposes, the examples in this step use the MongoDB instance on mongo0 to initiate the replica set. However, you can initiate replication from any server whose mongod.conf file has been properly configured.

On mongo0, open the MongoDB shell:

    mongo

From the prompt, you can initiate a replica set by running the rs.initiate() method in the mongo shell. Run on its own, however, this method initiates replication only on the machine where you run it, and you then need to add your other MongoDB instances by issuing an rs.add() method for each member.

To recap, MongoDB stores its data in JSON-like structures called _documents_. Because you have already edited the mongod.conf file on each server to configure the three MongoDB instances for replication, you can pass rs.initiate() a document that holds the configuration details of every member. This lets you initiate the replica set and add all of its members at once, instead of running several separate methods.

To do this, begin an rs.initiate() method by typing the following and pressing ENTER:

    rs.initiate(

The mongo shell will not consider the rs.initiate() method complete until you enter a closing parenthesis. Until you do, the prompt will change from a greater-than sign (>) to an ellipsis (...).

Like objects in JSON, documents in MongoDB begin and end with curly braces ({ and }). To begin the configuration document for the replica set, enter an opening curly brace:

    {

MongoDB documents are made up of any number of field-and-value pairs that take the form field: value. The first pair in this particular document must be an _id: field, which provides a name that identifies the replica set. Its value must match the replSetName directive you set in your mongod.conf files, which in our example is "rs0".

Enter this field-and-value pair, follow it with a comma, and press ENTER to start a new line:

    _id: "rs0",

Next, add a members: field. Rather than a single value, however, the members: field is followed by an array containing multiple documents. Each document in the array represents a replica set member to be added. In MongoDB documents, arrays are always enclosed in square brackets ([ and ]).

Add the members: field followed by an opening square bracket to begin the array, then press ENTER to move to the next line:

    members: [

Now add a document with two field-and-value pairs, separated by a comma, to represent the first member of the replica set. The first field in this document is another _id: field, which takes an integer used to identify the member internally. The second is a host: field, which must hold a string containing a hostname that resolves to an address where the member's MongoDB instance can be reached:

    { _id: 0, host: "mongo0.replset.member" },

Note: If any of your MongoDB instances runs on a port other than MongoDB's default, 27017, you must append a colon (:) and the port number to the hostname, as in this example:

    { _id: 0, host: "mongo0.replset.member:27018" },
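The port rule can be captured in a tiny helper. This is a sketch only: memberHost is a made-up name, and it encodes the convention that a host string without a port refers to MongoDB's default port, 27017, so the suffix is needed only for other ports.

```javascript
const DEFAULT_MONGODB_PORT = 27017;

// Hypothetical helper: formats the value of a member's host: field.
function memberHost(hostname, port = DEFAULT_MONGODB_PORT) {
  // Omit the suffix for the default port; otherwise append ":<port>".
  return port === DEFAULT_MONGODB_PORT ? hostname : `${hostname}:${port}`;
}

console.log(memberHost("mongo0.replset.member"));
console.log(memberHost("mongo0.replset.member", 27018));
```

Writing the port out explicitly is also valid for the default port, so always using the hostname:port form is a reasonable alternative if you prefer every entry to look the same.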

After entering the first member's document, enter a document for each of the other members of the replica set. Be sure to separate the documents with commas:

    { _id: 1, host: "mongo1.replset.member" },
    { _id: 2, host: "mongo2.replset.member" }

Next, end the array by entering a closing square bracket:

    ]

Finally, end the configuration document with a closing curly brace, and then end the method with a closing parenthesis:

    })

Taken all together, the rs.initiate() method will look like this:

    > rs.initiate(
    ... {
    ... _id: "rs0",
    ... members: [
    ... { _id: 0, host: "mongo0.replset.member" },
    ... { _id: 1, host: "mongo1.replset.member" },
    ... { _id: 2, host: "mongo2.replset.member" }
    ... ]
    ... })
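If you would rather not type the document line by line, the same structure can be assembled from a list of hostnames. The following is a minimal sketch in plain JavaScript; buildReplSetConfig is a hypothetical helper, and the assumption is that member _id values are assigned in list order starting at 0, as in the example above.

```javascript
// Hypothetical helper: builds the configuration document passed to rs.initiate().
function buildReplSetConfig(replSetName, hosts) {
  return {
    _id: replSetName, // must match replSetName in every mongod.conf
    members: hosts.map((host, index) => ({ _id: index, host: host })),
  };
}

const config = buildReplSetConfig("rs0", [
  "mongo0.replset.member",
  "mongo1.replset.member",
  "mongo2.replset.member",
]);

console.log(JSON.stringify(config, null, 2));

// In the mongo shell, the resulting object could then be passed directly:
//   rs.initiate(config)
```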

Assuming you entered all the details correctly, the method will run and initiate the replica set as soon as you enter the closing parenthesis and press ENTER. If the method returns "ok": 1 in its output, the replica set was started correctly:

Output
    {
        "ok": 1,
        "$clusterTime": {
            "clusterTime": Timestamp(1612389071, 1),
            "signature": {
                "hash": BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="),
                "keyId": NumberLong(0)
            }
        },
        "operationTime": Timestamp(1612389071, 1)
    }

If the replica set was initiated as expected, you will notice that the MongoDB client's prompt changes from a lone greater-than sign (>) to one that includes the replica set name and the member's current state, similar to the following:

    rs0:SECONDARY>

MongoDB comes with a number of built-in methods that you can use to manage a replica set and retrieve information about it. Among these, the rs.help() method is particularly helpful because it returns a list of the replica set methods along with a description of what each one does:

    rs.help()

Output
    rs.status()                                {replSetGetStatus: 1} checks repl set status
    rs.initiate()                              {replSetInitiate: null} initiates set with default settings
    rs.initiate(cfg)                           {replSetInitiate: cfg} initiates set with configuration cfg
    rs.conf()                                  get the current configuration object from local.system.replset
    rs.reconfig(cfg)                           updates the configuration of a running replica set with cfg (disconnects)
    rs.add(hostportstr)                        add a new member to the set with default attributes (disconnects)
    rs.add(membercfgobj)                       add a new member to the set with extra attributes (disconnects)
    rs.addArb(hostportstr)                     add a new member which is arbiterOnly:true (disconnects)
    rs.stepDown([stepdownSecs, catchUpSecs])   step down as primary (disconnects)
    rs.syncFrom(hostportstr)                   make a secondary sync from the given member
    rs.freeze(secs)                            make a node ineligible to become primary for the time specified
    rs.remove(hostportstr)                     remove a host from the replica set (disconnects)
    rs.secondaryOk()                           allow queries on secondary nodes

    rs.printReplicationInfo()                  check oplog size and time range
    rs.printSecondaryReplicationInfo()         check replica set members and replication lag
    db.isMaster()                              check who is primary
    db.hello()                                 check who is primary

    reconfiguration helpers disconnect from the database so the shell will display
    an error, even if the command succeeds.

After running rs.help() or another of these methods, you may see your client's prompt change to the following:

    rs0:PRIMARY>

This means that the MongoDB instance you are connected to has been elected as the primary member of the replica set.
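The prompt only tells you about the member you are connected to. To see every member's state, you can inspect the document returned by rs.status(), whose members array carries a name and a stateStr for each member. The sketch below runs on a trimmed, made-up sample that stands in for real rs.status() output; the helper names are hypothetical.

```javascript
// Returns the name of the member reporting PRIMARY, or null if none does
// (for example, in the first moments after rs.initiate()).
function findPrimary(status) {
  const primary = status.members.find((m) => m.stateStr === "PRIMARY");
  return primary ? primary.name : null;
}

// One "name state" line per member.
function summarizeMembers(status) {
  return status.members.map((m) => `${m.name} ${m.stateStr}`);
}

// Made-up sample with only the fields used here; real output has many more.
const sampleStatus = {
  set: "rs0",
  members: [
    { _id: 0, name: "mongo0.replset.member:27017", health: 1, stateStr: "PRIMARY" },
    { _id: 1, name: "mongo1.replset.member:27017", health: 1, stateStr: "SECONDARY" },
    { _id: 2, name: "mongo2.replset.member:27017", health: 1, stateStr: "SECONDARY" },
  ],
};

summarizeMembers(sampleStatus).forEach((line) => console.log(line));
console.log(`primary: ${findPrimary(sampleStatus)}`);

// In the mongo shell, the live document could be used instead:
//   findPrimary(rs.status())
```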

Note that if you want to add more nodes to the replica set in the future, you can do so with the rs.add() method once you have configured them the same way you configured the current members in the previous steps:

    rs.add("mongo3.replset.member")

You can now close the MongoDB client by pressing CTRL + C or by running the exit command:

    exit

Your replica set is now up and running, and you can begin integrating it with your application.

Warning: When you opened the MongoDB prompt to initiate the replica set, you may have noticed a warning message similar to this one:

    ...
    2021-02-03T21:45:48.379+00:00: Access control is not enabled for the database. Read and write access to data and configuration is unrestricted
    ...

This message indicates that you have not enabled access control for your database. According to MongoDB's documentation:

MongoDB uses role-based access control (RBAC) to manage access to the MongoDB system. A user is granted one or more roles, which determine the user's access to database resources and operations.

Because access control is not enabled on any of your MongoDB instances, anyone who gains access to one of the three servers in the replica set can also access the MongoDB instance on that server. This is a significant security risk, because it means they could also gain access to your application data.

One way to eliminate this warning and add a layer of security to your replica set is to configure _keyfile authentication_. As mentioned in the introduction, the MongoDB documentation describes keyfiles as a bare-minimum form of security that is best suited to testing or development environments.

Note that for production deployments, the MongoDB documentation instead recommends using x.509 certificates for internal member authentication. Obtaining and configuring x.509 certificates involves many considerations and decisions that must be made on a case-by-case basis, which is beyond the scope of this tutorial.

If you plan to use your replica set for testing or development, we strongly encourage you to follow our tutorial How to Configure Keyfile Authentication for MongoDB Replica Sets on Ubuntu 20.04.

Conclusion

Database replication is widely used as a strategy for improving performance, availability, and data security, so much so that it is generally recommended that any database used in a production environment have some form of replication enabled. Replication is also versatile and can serve many different roles in a data architecture, such as reporting or disaster recovery. The automatic failover feature of MongoDB's replica sets makes them particularly valuable, helping to ensure that your data remains highly available in the event of a failure.

If you want to learn more about MongoDB, we encourage you to check out our entire MongoDB tutorial collection.