Chapter 2. Getting started with Pacemaker
To familiarize yourself with the tools and processes you use to create a Pacemaker cluster, you can run the following procedures. They are intended for users who are interested in seeing what the cluster software looks like and how it is administered, without needing to configure a working cluster.
These procedures do not create a supported Red Hat cluster, which requires at least two nodes and the configuration of a fencing device. For full information on Red Hat’s support policies, requirements, and limitations for RHEL High Availability clusters, see Support Policies for RHEL High Availability Clusters.
2.1. Learning to use Pacemaker
By working through this procedure, you will learn how to use Pacemaker to set up a cluster, how to display cluster status, and how to configure a cluster service. This example creates an Apache HTTP server as a cluster resource and shows how the cluster responds when the resource fails.
Prerequisites
- A single node running RHEL 8
- A floating IP address that resides on the same network as one of the node’s statically assigned IP addresses
- The name of the node on which you are running, resolvable through your /etc/hosts file (an example entry is shown after the installation step below)
- Install the Red Hat High Availability Add-On software packages from the High Availability channel, and start and enable the pcsd service.
# yum install pcs pacemaker fence-agents-all
# systemctl start pcsd.service
# systemctl enable pcsd.service
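The following shows what the /etc/hosts entry for the node might look like. The IP address and host name here are illustrative placeholders; substitute the statically assigned address and name of your own node.

192.168.122.101    z1.example.com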
If you are running the firewalld daemon, enable the ports that are required by the Red Hat High Availability Add-On.
# firewall-cmd --permanent --add-service=high-availability
# firewall-cmd --reload
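If you want to confirm that the firewall change took effect, you can list the services that firewalld currently allows; the high-availability service should appear in the output.

# firewall-cmd --list-services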
Set a password for user hacluster and authenticate user hacluster for the node.

# passwd hacluster
# pcs host auth z1.example.com
Create a cluster named my_cluster with the node as its only member and check the status of the cluster. This command creates and starts the cluster in one step.

# pcs cluster setup my_cluster --start z1.example.com

# pcs cluster status
Cluster Status:
 Stack: corosync
 Current DC: z1.example.com (version 2.0.0-10.el8-b67d8d0de9) - partition with quorum
 Last updated: Thu Oct 11 16:11:18 2018
 Last change: Thu Oct 11 16:11:00 2018 by hacluster via crmd on z1.example.com
 1 node configured
 0 resources configured

PCSD Status:
  z1.example.com: Online
Because this example runs on a single node with no fencing device configured, disable fencing by setting the stonith-enabled cluster option to false. Note that the use of stonith-enabled=false is completely inappropriate for a production cluster; it tells the cluster to simply pretend that failed nodes are safely fenced.
# pcs property set stonith-enabled=false
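If you want to verify that the property was set, you can list the configured cluster properties; the output should include stonith-enabled: false. For example:

# pcs property list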
Do not use systemctl enable to configure any services that will be managed by the cluster to start at system boot.
Configure a web server on the node. Install the httpd and wget packages, allow the HTTP service through the firewall, and create a web page for the server to serve.

# yum install -y httpd wget

# firewall-cmd --permanent --add-service=http
# firewall-cmd --reload

# cat <<-END >/var/www/html/index.html
<html>
<body>My Test Site - $(hostname)</body>
</html>
END
In order for the Apache resource agent to get the status of Apache, create the following addition to the existing configuration to enable the status server URL.
# cat <<-END > /etc/httpd/conf.d/status.conf
<Location /server-status>
SetHandler server-status
Order deny,allow
Deny from all
Allow from 127.0.0.1
Allow from ::1
</Location>
END
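Before handing the web server over to the cluster, you may want to confirm that the Apache configuration, including this addition, is still syntactically valid. One way to do this is with the httpd syntax check:

# httpd -t
Syntax OK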
You can use the pcs resource describe resourcetype command to display the parameters you can set for a resource of the specified type. For example, the following command displays the parameters you can set for a resource of type apache.

# pcs resource describe apache
In this example, the IP address resource and the apache resource are both configured as part of a group named apachegroup , which ensures that the resources are kept together to run on the same node when you are configuring a working multi-node cluster.
# pcs resource create ClusterIP ocf:heartbeat:IPaddr2 ip=192.168.122.120 --group apachegroup

# pcs resource create WebSite ocf:heartbeat:apache configfile=/etc/httpd/conf/httpd.conf statusurl="http://localhost/server-status" --group apachegroup

# pcs status
Cluster name: my_cluster
Stack: corosync
Current DC: z1.example.com (version 2.0.0-10.el8-b67d8d0de9) - partition with quorum
Last updated: Fri Oct 12 09:54:33 2018
Last change: Fri Oct 12 09:54:30 2018 by root via cibadmin on z1.example.com

1 node configured
2 resources configured

Online: [ z1.example.com ]

Full list of resources:

 Resource Group: apachegroup
     ClusterIP  (ocf::heartbeat:IPaddr2):   Started z1.example.com
     WebSite    (ocf::heartbeat:apache):    Started z1.example.com

PCSD Status:
  z1.example.com: Online
After you have configured a cluster resource, you can use the pcs resource config command to display the options that are configured for that resource.
# pcs resource config WebSite
Resource: WebSite (class=ocf provider=heartbeat type=apache)
 Attributes: configfile=/etc/httpd/conf/httpd.conf statusurl=http://localhost/server-status
 Operations: start interval=0s timeout=40s (WebSite-start-interval-0s)
             stop interval=0s timeout=60s (WebSite-stop-interval-0s)
             monitor interval=1min (WebSite-monitor-interval-1min)
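This single-node walkthrough does not include a browser step, so if you want to confirm that the website is being served on the floating IP address, you can fetch it from the command line with the wget package installed earlier. The address below is the example floating IP used in this procedure; substitute the address you configured. The output should be the test page you created, containing the node name.

# wget -qO- http://192.168.122.120/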
Stop the Apache web service. Using killall -9 simulates an application-level crash.

# killall -9 httpd
Check the cluster status. You should see that stopping the web service caused a failed action, but that the cluster software restarted the service and you should still be able to access the website.
# pcs status
Cluster name: my_cluster
...
Current DC: z1.example.com (version 1.1.13-10.el7-44eb2dd) - partition with quorum
1 node and 2 resources configured

Online: [ z1.example.com ]

Full list of resources:

 Resource Group: apachegroup
     ClusterIP  (ocf::heartbeat:IPaddr2):   Started z1.example.com
     WebSite    (ocf::heartbeat:apache):    Started z1.example.com

Failed Resource Actions:
* WebSite_monitor_60000 on z1.example.com 'not running' (7): call=13, status=complete, exitreason='none',
    last-rc-change='Thu Oct 11 23:45:50 2016', queued=0ms, exec=0ms

PCSD Status:
  z1.example.com: Online
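As an additional check, you can query the failure count that Pacemaker has recorded for the resource, for example:

# pcs resource failcount show WebSite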
Once the service is up and running again, you can clear the failed status of the resource that failed. The failed action notice will then no longer appear when you view the cluster status.
# pcs resource cleanup WebSite
When you are finished looking at the cluster and the cluster status, stop the cluster services on the node.

# pcs cluster stop --all
2.2. Learning to configure failover
The following procedure provides an introduction to creating a Pacemaker cluster running a service that will fail over from one node to another when the node on which the service is running becomes unavailable. By working through this procedure, you can learn how to create a service in a two-node cluster and observe what happens to that service when it fails on the node on which it is running.
This example procedure configures a two-node Pacemaker cluster running an Apache HTTP server. You can then stop the Apache service on one node to see how the service remains available.
Prerequisites
- Two nodes running RHEL 8 that can communicate with each other
- A floating IP address that resides on the same network as the nodes' statically assigned IP addresses
- The names of both nodes in the /etc/hosts file on each node
- On both nodes, install the Red Hat High Availability Add-On software packages from the High Availability channel, and start and enable the pcsd service.
# yum install pcs pacemaker fence-agents-all
# systemctl start pcsd.service
# systemctl enable pcsd.service
If you are running the firewalld daemon, on both nodes enable the ports that are required by the Red Hat High Availability Add-On.
# firewall-cmd --permanent --add-service=high-availability
# firewall-cmd --reload
On both nodes, set a password for user hacluster.

# passwd hacluster
From one node in the cluster, authenticate user hacluster for each node in the cluster.

# pcs host auth z1.example.com z2.example.com
Create the two-node cluster my_cluster with both nodes as cluster members. This command creates and starts the cluster in one step. You need to run this from one node only, because pcs configuration commands take effect for the entire cluster.

# pcs cluster setup my_cluster --start z1.example.com z2.example.com
Because this example cluster does not configure a fencing device, disable fencing by setting the stonith-enabled cluster option to false. Note that the use of stonith-enabled=false is completely inappropriate for a production cluster; it tells the cluster to simply pretend that failed nodes are safely fenced.
# pcs property set stonith-enabled=false
When you run the pcs cluster status command, it may show output that temporarily differs slightly from the examples as the system components start up.
# pcs cluster status
Cluster Status:
 Stack: corosync
 Current DC: z1.example.com (version 2.0.0-10.el8-b67d8d0de9) - partition with quorum
 Last updated: Thu Oct 11 16:11:18 2018
 Last change: Thu Oct 11 16:11:00 2018 by hacluster via crmd on z1.example.com
 2 nodes configured
 0 resources configured

PCSD Status:
  z1.example.com: Online
  z2.example.com: Online
Do not use systemctl enable to configure any services that will be managed by the cluster to start at system boot.
On each node, configure a web server: install the httpd and wget packages, allow the HTTP service through the firewall, and create a web page for the server to serve.

# yum install -y httpd wget

# firewall-cmd --permanent --add-service=http
# firewall-cmd --reload

# cat <<-END >/var/www/html/index.html
<html>
<body>My Test Site - $(hostname)</body>
</html>
END
In order for the Apache resource agent to get the status of Apache, on each node in the cluster create the following addition to the existing configuration to enable the status server URL.
# cat <<-END > /etc/httpd/conf.d/status.conf
<Location /server-status>
SetHandler server-status
Order deny,allow
Deny from all
Allow from 127.0.0.1
Allow from ::1
</Location>
END
You can use the pcs resource describe apache command to display the parameters you can set for a resource of type apache.

# pcs resource describe apache
In this example, the IP address resource and the apache resource are both configured as part of a group named apachegroup , which ensures that the resources are kept together to run on the same node. Run the following commands from one node in the cluster:
# pcs resource create ClusterIP ocf:heartbeat:IPaddr2 ip=192.168.122.120 --group apachegroup

# pcs resource create WebSite ocf:heartbeat:apache configfile=/etc/httpd/conf/httpd.conf statusurl="http://localhost/server-status" --group apachegroup

# pcs status
Cluster name: my_cluster
Stack: corosync
Current DC: z1.example.com (version 2.0.0-10.el8-b67d8d0de9) - partition with quorum
Last updated: Fri Oct 12 09:54:33 2018
Last change: Fri Oct 12 09:54:30 2018 by root via cibadmin on z1.example.com

2 nodes configured
2 resources configured

Online: [ z1.example.com z2.example.com ]

Full list of resources:

 Resource Group: apachegroup
     ClusterIP  (ocf::heartbeat:IPaddr2):   Started z1.example.com
     WebSite    (ocf::heartbeat:apache):    Started z1.example.com

PCSD Status:
  z1.example.com: Online
  z2.example.com: Online
- Point a browser to the website you created, using the floating IP address you configured. This should display the text message you defined, showing the name of the node on which the website is running.
- Stop the Apache web service. Using killall -9 simulates an application-level crash.
# killall -9 httpd
Check the cluster status. You should see that stopping the web service caused a failed action, but that the cluster software restarted the service on the node on which it had been running, and you should still be able to access the website.
# pcs status
Cluster name: my_cluster
Stack: corosync
Current DC: z1.example.com (version 2.0.0-10.el8-b67d8d0de9) - partition with quorum
Last updated: Fri Oct 12 09:54:33 2018
Last change: Fri Oct 12 09:54:30 2018 by root via cibadmin on z1.example.com

2 nodes configured
2 resources configured

Online: [ z1.example.com z2.example.com ]

Full list of resources:

 Resource Group: apachegroup
     ClusterIP  (ocf::heartbeat:IPaddr2):   Started z1.example.com
     WebSite    (ocf::heartbeat:apache):    Started z1.example.com

Failed Resource Actions:
* WebSite_monitor_60000 on z1.example.com 'not running' (7): call=31, status=complete, exitreason='none',
    last-rc-change='Fri Feb 5 21:01:41 2016', queued=0ms, exec=0ms
Once the service is up and running again, clear the failed status so that the failed action notice no longer appears when you view the cluster status.

# pcs resource cleanup WebSite
To test failover, put the node on which the service is running into standby mode. A node in standby mode is no longer able to host resources, so the service moves to the other node.

# pcs node standby z1.example.com
# pcs status
Cluster name: my_cluster
Stack: corosync
Current DC: z1.example.com (version 2.0.0-10.el8-b67d8d0de9) - partition with quorum
Last updated: Fri Oct 12 09:54:33 2018
Last change: Fri Oct 12 09:54:30 2018 by root via cibadmin on z1.example.com

2 nodes configured
2 resources configured

Node z1.example.com: standby
Online: [ z2.example.com ]

Full list of resources:

 Resource Group: apachegroup
     ClusterIP  (ocf::heartbeat:IPaddr2):   Started z2.example.com
     WebSite    (ocf::heartbeat:apache):    Started z2.example.com
After verifying that the service is now running on z2.example.com, remove z1.example.com from standby mode so that it can once again host cluster resources.

# pcs node unstandby z1.example.com
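If you want to confirm that z1.example.com is available to the cluster again, you can check the node status, for example:

# pcs status nodes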
When you are finished, stop the cluster services on both nodes.

# pcs cluster stop --all