DRBD interview questions
Top DRBD frequently asked interview questions
From the output of /proc/drbd I am trying to extract just the 'UpToDate/UpToDate' part per device (0 and 1). I tried:
cat /proc/drbd | grep ' 0:' | grep -Eo 'ds:(.*)'
But that gives me:
ds:UpToDate/UpToDate C r-----
That is not what I'm looking for; I only want the field where 'UpToDate/UpToDate' appears, i.e. a return of just 'UpToDate/UpToDate'. Anyway, here is the output of /proc/drbd:
0: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r-----
1: cs:Connected ro:Secondary/Primary ds:UpToDate/UpToDate C r-----
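One way to get just that field (a minimal sketch; adjust the device number as needed) is to stop the match at the first space and then drop the ds: prefix:
grep ' 0:' /proc/drbd | grep -Eo 'ds:[^ ]+' | cut -d: -f2
# prints: UpToDate/UpToDate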
Source: (StackOverflow)
How to run Docker in production with an active/active or active/standby HA system?
Are there any guides or best practices?
I am thinking of 3 scenarios:
1) NFS - two servers, prepped with docker-machine, mounting a shared NFS export to /var/lib/docker/
- so both Docker nodes should see the same files (using some sort of filer, like VNX, EFS, and so on).
2) using DRBD to replicate a disk and mount it to /var/lib/docker/
- so data is on both nodes; the active node mounts it and runs the containers, and in case of failover the other node mounts it and starts the containers
3) using DRBD as above, but exporting it via an NFS server and mounting the NFS share on both nodes to /var/lib/docker/
- so, as above, both nodes can mount and run containers, using Heartbeat/Pacemaker to move the virtual IP and handle the DRBD switching
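For scenarios 2 and 3, the Pacemaker side might look roughly like the following (a minimal crm-shell sketch, not a tested configuration; it assumes the DRBD resource is named r0, an ext4 filesystem on /dev/drbd0, and a placeholder virtual IP):
primitive p_drbd_docker ocf:linbit:drbd params drbd_resource=r0 op monitor interval=15s
ms ms_drbd_docker p_drbd_docker meta master-max=1 clone-max=2 notify=true
primitive p_fs_docker ocf:heartbeat:Filesystem params device=/dev/drbd0 directory=/var/lib/docker fstype=ext4
primitive p_ip_docker ocf:heartbeat:IPaddr2 params ip=192.0.2.10 cidr_netmask=24
group g_docker p_fs_docker p_ip_docker
colocation col_docker_with_drbd inf: g_docker ms_drbd_docker:Master
order ord_drbd_before_docker inf: ms_drbd_docker:promote g_docker:start
The Docker daemon itself would also have to be started on whichever node holds the mount, for example via an additional primitive or a systemd unit tied to the group.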
What is the best practice for running Docker containers in production to make them highly available?
regards
Source: (StackOverflow)
When a virtual/diskless node is used on a DRBL cluster with Open MPI version 1.8.4, the following error occurs:
Error: unknown option "--hnp-topo-sig"
I guess it has something to do with the topology signature, and the option looks new. Any suggestions?
Typical command:
mpirun --machinefile machines -np 4 mpi_hello
machinefile: node1 slots = 4
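One thing worth checking (a hedged sketch, assuming passwordless ssh to the hosts listed in machines): whether the head node and the diskless clients run the same Open MPI build, since a version mismatch between mpirun and the remote daemons commonly produces unknown-option errors like this:
mpirun --version | head -1
for h in $(awk '{print $1}' machines); do ssh "$h" 'mpirun --version | head -1'; done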
Thank you in advance
Source: (StackOverflow)
I searched the web for a definitive answer to the following question but couldn't find a clear YES or NO, or a clear procedure for enabling this approach!
In a two-node setup, with DRBD as the block device replication technology and OCFS2 as the clustered file system (which requires active/active DRBD mode), is it possible to use LUKS to encrypt the underlying block device such that it is usable from any node in the cluster? Does the kernel require the passphrase on each node at boot time? If not, how does it work?
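For what it's worth, a layering that is sometimes described is LUKS below DRBD: each node opens its own encrypted backing device with the same passphrase or keyfile (supplied at boot or by an admin), and DRBD then replicates between the opened mapper devices. A hedged sketch with hypothetical device, mapping and keyfile names:
cryptsetup luksFormat /dev/sdb1 /etc/drbd-luks.key               # once per node; destroys existing data
cryptsetup open /dev/sdb1 crypt_r0 --key-file /etc/drbd-luks.key
# point the resource's "disk" directive at /dev/mapper/crypt_r0, then:
drbdadm up r0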
Thanks in advance for your responses.
D.
Source: (StackOverflow)
I need to set up a two-node web cluster for an Apache web site. I have Hyper-V infrastructure and only two nodes.
The goals are load balancing and high availability.
I installed and configured two VMs with CentOS 7, a Pacemaker cluster, and MariaDB 10. I configured a Master/Slave ocf::percona:mysql resource in Pacemaker.
Next, I need shared storage for the web site content.
I created a DRBD disk in dual-primary mode with GFS2 on top of it. I tested it without adding it to Pacemaker and everything worked fine, but to make promotion automatic I need to manage these resources via Pacemaker.
The problem is that Pacemaker needs fencing to create the DRBD resource, but there are no stonith agents for Hyper-V.
I read that in a previous version, for CentOS 6, it was possible to create an SSH stonith agent. I tried to do this, but pcs does not work with it.
Is it possible to use Pacemaker on top of Hyper-V at the moment? Or is there another way to use DRBD in dual-primary mode?
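As a side note, for lab testing only, fencing can be switched off so the DRBD resource can at least be created; running dual-primary DRBD/GFS2 without real fencing is unsafe, so this is not a production answer. A hedged one-liner:
pcs property set stonith-enabled=false    # testing only: dual-primary DRBD without fencing risks split brain and data loss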
Source: (StackOverflow)
I installed DRBD to replicate data on two hosts. After the installation succeeded, I checked the DRBD status:
root@host3:~# cat /proc/drbd
version: 8.3.13 (api:88/proto:86-96)
GIT-hash: 83ca112086600faacab2f157bc5a9324f7bd7f77 build by root@sighted, 2012-10-09 12:47:51
0: cs:Connected ro:Primary/Primary ds:UpToDate/UpToDate C r-----
ns:105400 nr:0 dw:0 dr:106396 al:0 bm:20 lo:0 pe:0 ua:0 ap:0 ep:1 wo:d oos:0
But when I try to mount /dev/sdb1 /mnt (/dev/sdb1 is the DRBD backing device), it does not work. This is the error:
root@host3:~# mount /dev/sdb1 /mnt/
mount: unknown filesystem type 'drbd'
So, what can I do to mount the DRBD device?
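The error usually means the backing disk (/dev/sdb1) is being mounted directly instead of the DRBD device on top of it. A hedged sketch, assuming the device is /dev/drbd0 and ext4 is wanted:
mkfs.ext4 /dev/drbd0      # only once, on the node that is currently Primary
mount /dev/drbd0 /mnt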
Source: (StackOverflow)
I want to ask: I am installing the DRBD binary packages for CentOS on RHEL v5, and there are 2 files.
1. drbd83-8.3.13-2.el5.centos.x86_64.rpm
2. kmod-drbd83-8.3.13-1.el5.centos.x86_64.rpm
First I installed drbd83-8.3.13-2.el5.centos.x86_64.rpm with rpm -i <filename>, and then I installed kmod-drbd83-8.3.13-1.el5.centos.x86_64.rpm with the same command, but the second operation gives the output below:
error: Failed dependencies:
kernel(rhel5_lib_u6) = aab649531cab69cbeff5665f2aef9e0dba844b20 is needed by kmod-drbd83-8.3.13-1.el5.centos.x86_64
So what must I do?
I know it requires a dependency named aab649531cab69cbeff5665f2aef9e0dba844b20, but I don't know what aab649531cab69cbeff5665f2aef9e0dba844b20 is.
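The long hash is a kernel ABI requirement: the kmod package was built against a specific CentOS 5 kernel. A hedged sketch of how one might check and resolve it (yum can resolve the kernel dependency where plain rpm cannot):
uname -r                                                                      # the kernel actually running
rpm -qp --requires kmod-drbd83-8.3.13-1.el5.centos.x86_64.rpm | grep kernel   # the kernel build the module expects
yum localinstall kmod-drbd83-8.3.13-1.el5.centos.x86_64.rpm                   # lets yum pull a matching kernel, if one is available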
Source: (StackOverflow)
Server: Ubuntu Server 14 LTS + PostgreSQL 9.2
I want to create a clustered database using DRBD, but I can't set PGDATA without initializing a cluster. I just need to tell PostgreSQL to use the data on the DRBD disk. How can I do it?
Example 1:
mkdir /cluster/var/lib/pgsql -p
chown postgres:postgres /cluster/var/lib/pgsql -R
cp -R /var/lib/pgsql /cluster/var/lib/pgsql
edit /etc/init.d/postgresql :
PGDATA=/cluster/var/lib/pgsql/data
...
PGLOG=/cluster/var/lib/pgsql/pgstartup.log
/etc/init.d/postgresql start
In PostgreSQL 8.3 this works, but in 9.2 I can't change PGDATA in /etc/init.d/postgresql; I need to find another file to set PGDATA in, but, surprise, it does nothing.
Example 2:
PGDATA - Specifies the directory where the database cluster is to be stored; can be overridden using the -D option.
Ok, let's start:
--pgdata=directory
Yeah, it works! But now we have postgresql-xc and an error like "postgresql don't know this user - postgresql".
DRBD starts replicating the cluster data, but PostgreSQL starts too.
UPD 1:
root: initdb --pgdata=/home/username/dir
~initdb not install~bla-bla-bla~use apt-get install postgres-xc
UPD2:
$: /usr/lib/postgresql/9.3/bin/initdb --pgdata=/whateveryouwant
#now you can run postgresql only one way:
$: /usr/lib/postgresql/9.3/bin/postgres -D /see_up
#then:
LOG: database system was shut down at 2014-09-26 15:56:33 YEKT
LOG: database system is ready to accept connections
LOG: autovacuum launcher started
#aaand... nothing. Just an empty console; ^C stops postgres
#another SSH connect:
$: ps -ela
S 1000 5995 5217 0 80 0 - 62202 poll_s pts/0 00:00:00 postgres
1 S 1000 5997 5995 0 80 0 - 62202 poll_s ? 00:00:00 postgres
1 S 1000 5998 5995 0 80 0 - 62202 poll_s ? 00:00:00 postgres
1 S 1000 5999 5995 0 80 0 - 62202 poll_s ? 00:00:00 postgres
1 S 1000 6000 5995 0 80 0 - 62415 poll_s ? 00:00:00 postgres
1 S 1000 6001 5995 0 80 0 - 26121 poll_s ? 00:00:00 postgres
#is it ok? because...
$: /etc/init.d/postgresql status
9.3/main (port 5432): down
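On Debian/Ubuntu the packaged PostgreSQL is managed through the postgresql-common cluster tools rather than a PGDATA variable in the init script, so a hedged way to point it at a data directory on the DRBD mount (assuming the mount is /cluster and version 9.3, to match the binaries used above) is:
pg_dropcluster --stop 9.3 main                                  # remove the empty default cluster
pg_createcluster -d /cluster/var/lib/pgsql/9.3/main 9.3 main    # new cluster with its data directory on the DRBD mount
pg_ctlcluster 9.3 main start
pg_lsclusters                                                   # should now show 9.3/main online with the new data directory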
Source: (StackOverflow)
The situation: I have two identical Supermicro servers with a lot of RAM and storage capacity. The servers have an Adaptec RAID controller, which was used to create a RAID 1 for the OS and a RAID 50 for "data". The RAID sets are identical on both servers. The servers also have built-in IPMI, which can be used as a hardware watchdog on Proxmox (for fencing purposes).
I want both servers to be Proxmox VE nodes, and both servers should hold the very same data. That's why data replication is a must. In case one server is down, the second one should be able to serve the VMs and containers. As per the Proxmox wiki, three nodes are required for HA, but I only have two.
What I did so far is install Proxmox 4.1 on both servers and create an XFS partition on both machines, which is mirrored synchronously via DRBD. That way, the data is replicated in real time. However, it turns out that this is pretty slow in the VMs when writing many small files (the two servers are connected via 10 Gbit for DRBD purposes; when writing large files, the throughput on that interface is about 1.04 Gbit/s).
Is there any way to improve the I/O throughput, or are there any other recommendations for building a better setup for this?
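For reference, these are commonly tuned DRBD knobs for small-write workloads (a hedged sketch in DRBD 8.4 configuration syntax; verify the option names against the DRBD version Proxmox 4.1 actually ships, and only disable barriers/flushes with a battery- or flash-backed write cache):
resource r0 {                      # r0 is an assumed resource name
    net {
        max-buffers     8000;
        max-epoch-size  8000;
        sndbuf-size     0;         # let the kernel autotune the send buffer
    }
    disk {
        al-extents     3389;       # larger activity log helps scattered small writes
        disk-barrier   no;         # only safe with a battery-backed write cache
        disk-flushes   no;
    }
}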
Source: (StackOverflow)
Here is the normal way to initialize the drbd partition:
ON BOTH SERVERS
drbdadm create-md r0
drbdadm up r0
Both servers should now be connected; check with cat /proc/drbd.
ONLY ON PRIMARY
drbdadm -- --overwrite-data-of-peer primary r0
cat /proc/drbd
AFTER BOTH SERVERS UP-TO-DATE - ON PRIMARY
mkfs -t ext4 -b 4096 /dev/drbd0
I now tried to prepare a primary without a secondary available (e.g. the customer wants a single-server system and will probably later add a hot-standby server):
drbdadm create-md r0
drbdadm up r0
drbdadm primary r0
I got the error:
0: State change failed: (-2) Need access to UpToDate data
Is there a solution?
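A hedged sketch of the usual way to force a lone node to Primary/UpToDate (DRBD 8.4 syntax shown; on 8.3 the -- --overwrite-data-of-peer form used above serves the same purpose):
drbdadm create-md r0
drbdadm up r0
drbdadm primary --force r0        # declares the local, never-synchronised data UpToDate despite the missing peer
mkfs -t ext4 -b 4096 /dev/drbd0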
Source: (StackOverflow)
We use MySQL 5.6 with the InnoDB engine for our database.
Currently we have a single instance of it, but as our application is growing, we want to have our database in cluster mode. Here is what we have done so far:
We tried MySQL NDB Cluster 7.3.1 (lab version), but we still face problems with foreign keys and it doesn't seem to be reliable, so we tried the next option.
We tried MySQL active/passive clustering with DRBD, Pacemaker and Corosync: everything works fine on two nodes.
Now we want to do MySQL active-active clustering. Even after a lot of googling I am unable to find any information on it. Can we do it with DRBD, or is there some other way to do it?
Please help!
Source: (StackOverflow)
I run Debian linux.
I want to make a virtual machine in VirtualBox which boots from a server via DRBL.
The server will be my laptop.
The virtual machine starts booting, but falls into an infinite boot loop.
It loads one part of the OS from the server, then stops loading and starts again...
Has anyone seen this error?
Source: (StackOverflow)
I have configured DRBD, Corosync and Pacemaker on my system. It works well when I have only one resource group, but I face the problem below when I add one more resource group.
crm_mon -f // shows the error below
Migration summary:
* Node cent64asf1:
* Node cent64asf2:
r1_fs: migration-threshold=1000000 fail-count=1000000 last-failure='Sat Aug 23 18:00:45 2014'
Failed actions:
r1_fs_start_0 on cent64asf2 'unknown error' (1): call=98, status=complete, last-rc-change='Sat Aug 23 18:00:45 2014', queued=107ms, exec=0ms
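For reference, once the underlying cause of the r1_fs start failure is fixed, the fail count has to be cleared before Pacemaker will try the resource again (a hedged sketch using the crm shell):
crm resource cleanup r1_fs     # clears the fail count and rechecks the resource
crm_mon -1 -f                  # one-shot status including fail counts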
Source: (StackOverflow)
On two systems with DRBD on top of LVM logical volumes (Primary/Secondary), after trying to change (promote) the Secondary to Primary, I get an error on BOTH nodes:
Kernel panic on the CentOS system.
Well... I would like to fsck one node or the other...
But I get an error message:
mount: unknown filesystem type 'drbd'
fsck /dev/sata/vm-100-disk-1
fsck from util-linux-ng 2.17.2
fsck: fsck.drbd: not found
fsck: Error 2 while executing fsck.drbd for /dev/mapper/sata-vm--100--disk--1
Desperate... I can't find any info on Google, Ask, or Stack...
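The 'unknown filesystem type drbd' message typically appears because the backing LV carries DRBD metadata, so blkid reports its type as drbd; the filesystem has to be checked through the DRBD device instead, on a node that can hold the Primary role. A hedged sketch, assuming the filesystem is ext4 and the device is /dev/drbd0:
drbdadm primary r0           # r0 is an assumed resource name; must succeed on this node first
fsck.ext4 -f /dev/drbd0      # check the filesystem via the DRBD device, not the backing LV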
Source: (StackOverflow)
On all of the systems where I am working with DRBD, there are many messages in the log after verification:
kernel: block drbd0: Out of sync: start=403446112, size=328 (sectors)
On some systems one might think it is caused by the workload, but some of these machines are barely doing any work.
The computers are connected over a good-quality 1 Gb network.
These messages do not give me much confidence in the system, and in the end they force me to use cron to run the verification and resynchronize the faulty blocks, which effectively turns a supposedly synchronous system into an asynchronous one.
Is this normal?
Any solution?
Anything wrong?
common {
    protocol C;
    handlers {
        pri-on-incon-degr "/usr/lib/drbd/notify-pri-on-incon-degr.sh; /usr/lib/drbd/notify-emergency-reboot.sh; echo b > /proc/sysrq-trigger ; reboot -f";
        pri-lost-after-sb "/usr/lib/drbd/notify-pri-lost-after-sb.sh; /usr/lib/drbd/notify-emergency-reboot.sh; echo b > /proc/sysrq-trigger ; reboot -f";
        local-io-error "/usr/lib/drbd/notify-io-error.sh; /usr/lib/drbd/notify-emergency-shutdown.sh; echo o > /proc/sysrq-trigger ; halt -f"
    }
    syncer {
        # rate after al-extents use-rle cpu-mask verify-alg csums-alg
        verify-alg sha1;
        rate 40M;
    }
}
resource r0 {
    protocol C;
    startup {
        wfc-timeout 15; # non-zero wfc-timeout can be dangerous (http://forum.proxmox.com/threads/3465-Is-it-safe-to-use-wfc-timeout-in-DRBD-configuration)
        degr-wfc-timeout 60;
    }
    net {
        cram-hmac-alg sha1;
        shared-secret "XXXXXXXXXX";
        after-sb-0pri discard-zero-changes;
        after-sb-1pri discard-secondary;
        after-sb-2pri disconnect;
    }
    on pro01 {
        device /dev/drbd0;
        disk /dev/pve/vm-100-disk-1;
        address YYY.YYY.YYY.YYY:7788;
        meta-disk internal;
    }
    on pro02 {
        device /dev/drbd0;
        disk /dev/pve/vm-100-disk-1;
        address YYY.YYY.YYY.YYY:7788;
        meta-disk internal;
    }
}
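For reference, after a verify pass marks blocks out of sync, DRBD does not resynchronize them automatically; the usual approach is a disconnect/connect cycle on the resource (a hedged sketch, assuming the resource r0 configured above):
drbdadm disconnect r0
drbdadm connect r0       # reconnecting triggers resync of the blocks flagged by verify
cat /proc/drbd           # watch the resync progress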
Source: (StackOverflow)