EzDevInfo.com

linux-networking interview questions

Top linux-networking frequently asked interview questions

arp-requests cannot be seen by specific nodes

I created an open ad-hoc WLAN using iwconfig (I have the same issue with wpa_supplicant as well). There are 4 nodes on the network, as seen in the figure below. The nodes run Ubuntu 12.04 and Debian Squeeze with 3.7.1, 3.5 and 3.2 kernels. I use two different USB dongle brands (TP-Link and ZCN) that all have the AR9271 chipset and use the ath9k_htc driver (here is the lsusb output and ethtool output).

The problem I am experiencing is that the two nodes (10.0.0.2 and 10.0.0.5) which have TP-Link USB wifi dongles can ping any node on the network, and vice versa. However, the other nodes (10.0.0.6 and 10.0.0.7), which have ZCN wifi dongles, cannot ping each other, although they have no problem communicating with the TP-Link wifi modules. tcpdump shows that 10.0.0.6 and 10.0.0.7 cannot see each other's ARP requests, e.g.

20:37:52.470305 ARP, Request who-has 10.0.0.7 tell 10.0.0.6, length 28
20:37:53.463713 ARP, Request who-has 10.0.0.7 tell 10.0.0.6, length 28
20:37:54.463622 ARP, Request who-has 10.0.0.7 tell 10.0.0.6, length 28
20:37:55.472868 ARP, Request who-has 10.0.0.7 tell 10.0.0.6, length 28
20:37:56.463439 ARP, Request who-has 10.0.0.7 tell 10.0.0.6, length 28
20:37:57.463469 ARP, Request who-has 10.0.0.7 tell 10.0.0.6, length 28

but they are able to see and get replies from the TP-Link modules:

20:39:23.634459 ARP, Request who-has 10.0.0.2 tell 10.0.0.6, length 28
20:39:23.634551 ARP, Reply 10.0.0.2 is-at 64:70:02:18:d4:6a (oui Unknown), length 28
20:39:23.636687 IP 10.0.0.6 > 10.0.0.2: ICMP echo request, id 572, seq 1, length 64
20:39:23.636809 IP 10.0.0.2 > 10.0.0.6: ICMP echo reply, id 572, seq 1, length 64
20:39:24.635497 IP 10.0.0.6 > 10.0.0.2: ICMP echo request, id 572, seq 2, length 64
20:39:24.635558 IP 10.0.0.2 > 10.0.0.6: ICMP echo reply, id 572, seq 2, length 64
20:39:28.651946 ARP, Request who-has 10.0.0.6 tell 10.0.0.2, length 28
20:39:28.654021 ARP, Reply 10.0.0.6 is-at 00:19:70:94:7c:8b (oui Unknown), length 28

My question is: what could be the reason that 10.0.0.6 and 10.0.0.7 cannot see the ARP requests they send each other? How can I track down the problem?

If I add a couple more nodes with ZCN wifi dongles to the network, these nodes are also not able to talk to each other, but they are fine with the TP-Link ones. And if I swap the wifi modules around, the nodes with ZCN modules always have the problem while the TP-Link modules are fine.

Here are the /etc/network/interfaces, ifconfig, iwconfig, ip a, ip r and route outputs.

EDIT: I suspected the problem might be arp_filter related, but /proc/sys/net/ipv4/conf/*/arp_filter is 0 for all interfaces (*). If I add the ARP entries of 10.0.0.6 and 10.0.0.7 manually on these nodes, tcpdump and Wireshark do not show them sending pings to each other. If I ping the broadcast address (10.0.0.255 in my case), 10.0.0.6 and 10.0.0.7 are able to hear it.

EDIT2: Here are the pcap files http://filebin.net/6cle9a5iae from 10.0.0.6 (ZCN module), 10.0.0.7 (ZCN module) and 10.0.0.5 (TP-Link module that does not have the problem), and here are the ping outputs from 10.0.0.6 http://pastebin.com/swFP2CJ9 . I captured the packets simultaneously. The link also includes the ifconfig, iwconfig and uname -a outputs for each node.
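Since arp_filter was checked by hand above, it may help to dump all of the ARP-related sysctls for every interface in one pass (arp_ignore and arp_announce can also suppress replies). A small sketch; the base-directory parameter exists only so the function is easy to test, and in real use the default /proc path is what you want:

```shell
#!/bin/sh
# List arp_filter/arp_ignore/arp_announce for every interface.
list_arp_sysctls() {
    base="${1:-/proc/sys/net/ipv4/conf}"
    for dir in "$base"/*; do
        for key in arp_filter arp_ignore arp_announce; do
            [ -r "$dir/$key" ] &&
                printf '%s/%s = %s\n' "$(basename "$dir")" "$key" "$(cat "$dir/$key")"
        done
    done
}

list_arp_sysctls
```

If all of these are 0 on every node, the sysctls are ruled out and the broadcast-delivery asymmetry between the two dongle models points more toward the ath9k_htc firmware or ad-hoc beacon/merge behaviour than toward the kernel's ARP handling.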


Source: (StackOverflow)

Linux e1000e (Intel networking driver) problems galore, where do I start?

I'm currently having a major problem with e1000e (not working at all) in Ubuntu Maverick (1.0.2-k4); after resume, I'm getting a lot of this in dmesg:

[ 9085.820197] e1000e 0000:02:00.0: PCI INT A disabled
[ 9089.907756] e1000e: Intel(R) PRO/1000 Network Driver - 1.0.2-k4
[ 9089.907762] e1000e: Copyright (c) 1999 - 2009 Intel Corporation.
[ 9089.907797] e1000e 0000:02:00.0: Disabling ASPM  L1
[ 9089.907827] e1000e 0000:02:00.0: PCI INT A -> GSI 16 (level, low) -> IRQ 16
[ 9089.907857] e1000e 0000:02:00.0: setting latency timer to 64
[ 9089.908529] e1000e 0000:02:00.0: irq 44 for MSI/MSI-X
[ 9089.908922] e1000e 0000:02:00.0: Disabling ASPM L0s 
[ 9089.908954] e1000e 0000:02:00.0: (unregistered net_device): PHY reset is blocked due to SOL/IDER session.
[ 9090.024625] e1000e 0000:02:00.0: eth0: (PCI Express:2.5GB/s:Width x1) 00:0a:e4:3e:ce:74
[ 9090.024630] e1000e 0000:02:00.0: eth0: Intel(R) PRO/1000 Network Connection
[ 9090.024712] e1000e 0000:02:00.0: eth0: MAC: 2, PHY: 2, PBA No: 005302-003
[ 9090.109492] e1000e 0000:02:00.0: irq 44 for MSI/MSI-X
[ 9090.164219] e1000e 0000:02:00.0: irq 44 for MSI/MSI-X

and a bunch of:

[ 2128.005447] e1000e 0000:02:00.0: eth0: Detected Hardware Unit Hang:
[ 2128.005452]   TDH                  <89>
[ 2128.005454]   TDT                  <27>
[ 2128.005456]   next_to_use          <27>
[ 2128.005458]   next_to_clean        <88>
[ 2128.005460] buffer_info[next_to_clean]:
[ 2128.005463]   time_stamp           <6e608>
[ 2128.005465]   next_to_watch        <8a>
[ 2128.005467]   jiffies              <6f929>
[ 2128.005469]   next_to_watch.status <0>
[ 2128.005471] MAC Status             <80080703>
[ 2128.005473] PHY Status             <796d>
[ 2128.005475] PHY 1000BASE-T Status  <4000>
[ 2128.005477] PHY Extended Status    <3000>
[ 2128.005480] PCI Status             <10>

I decided to compile the latest stable e1000e, 1.2.17, and now I'm getting:

[ 9895.678050] e1000e: Intel(R) PRO/1000 Network Driver - 1.2.17-NAPI
[ 9895.678055] e1000e: Copyright(c) 1999 - 2010 Intel Corporation.
[ 9895.678098] e1000e 0000:02:00.0: Disabling ASPM  L1
[ 9895.678129] e1000e 0000:02:00.0: PCI INT A -> GSI 16 (level, low) -> IRQ 16
[ 9895.678162] e1000e 0000:02:00.0: setting latency timer to 64
[ 9895.679136] e1000e 0000:02:00.0: irq 44 for MSI/MSI-X
[ 9895.679160] e1000e 0000:02:00.0: Disabling ASPM L0s 
[ 9895.679192] e1000e 0000:02:00.0: (unregistered net_device): PHY reset is blocked due to SOL/IDER session.
[ 9895.791758] e1000e 0000:02:00.0: eth0: (PCI Express:2.5GB/s:Width x1) 00:0a:e4:3e:ce:74
[ 9895.791766] e1000e 0000:02:00.0: eth0: Intel(R) PRO/1000 Network Connection
[ 9895.791850] e1000e 0000:02:00.0: eth0: MAC: 3, PHY: 2, PBA No: 005302-003
[ 9895.892464] e1000e 0000:02:00.0: irq 44 for MSI/MSI-X
[ 9895.948175] e1000e 0000:02:00.0: irq 44 for MSI/MSI-X
[ 9895.949111] ADDRCONF(NETDEV_UP): eth0: link is not ready
[ 9895.954694] e1000e: eth0 NIC Link is Up 10 Mbps Full Duplex, Flow Control: RX/TX
[ 9895.954703] e1000e 0000:02:00.0: eth0: 10/100 speed: disabling TSO
[ 9895.955157] ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
[ 9906.832056] eth0: no IPv6 routers present

With 1.2.20 I get:

[ 9711.525465] e1000e: Intel(R) PRO/1000 Network Driver - 1.2.20-NAPI
[ 9711.525472] e1000e: Copyright(c) 1999 - 2010 Intel Corporation.
[ 9711.525521] e1000e 0000:02:00.0: Disabling ASPM  L1
[ 9711.525554] e1000e 0000:02:00.0: PCI INT A -> GSI 16 (level, low) -> IRQ 16
[ 9711.525586] e1000e 0000:02:00.0: setting latency timer to 64
[ 9711.526460] e1000e 0000:02:00.0: irq 45 for MSI/MSI-X
[ 9711.526487] e1000e 0000:02:00.0: Disabling ASPM L0s 
[ 9711.526523] e1000e 0000:02:00.0: (unregistered net_device): PHY reset is blocked due to SOL/IDER session.
[ 9711.639763] e1000e 0000:02:00.0: eth0: (PCI Express:2.5GB/s:Width x1) 00:0a:e4:3e:ce:74
[ 9711.639771] e1000e 0000:02:00.0: eth0: Intel(R) PRO/1000 Network Connection
[ 9711.639854] e1000e 0000:02:00.0: eth0: MAC: 3, PHY: 2, PBA No: 005302-003
[ 9712.060770] e1000e 0000:02:00.0: irq 45 for MSI/MSI-X
[ 9712.116195] e1000e 0000:02:00.0: irq 45 for MSI/MSI-X
[ 9712.117098] ADDRCONF(NETDEV_UP): eth0: link is not ready
[ 9712.122684] e1000e: eth0 NIC Link is Up 100 Mbps Full Duplex, Flow Control: RX/TX
[ 9712.122693] e1000e 0000:02:00.0: eth0: 10/100 speed: disabling TSO
[ 9712.123142] ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
[ 9722.920014] eth0: no IPv6 routers present

But I'm still getting these:

[ 9982.992851] PCI Status             <10>
[ 9984.993602] e1000e 0000:02:00.0: eth0: Detected Hardware Unit Hang:
[ 9984.993606]   TDH                  <5d>
[ 9984.993608]   TDT                  <6b>
[ 9984.993611]   next_to_use          <6b>
[ 9984.993613]   next_to_clean        <5b>
[ 9984.993615] buffer_info[next_to_clean]:
[ 9984.993617]   time_stamp           <24da80>
[ 9984.993619]   next_to_watch        <5d>
[ 9984.993621]   jiffies              <24f200>
[ 9984.993624]   next_to_watch.status <0>
[ 9984.993626] MAC Status             <80080703>
[ 9984.993628] PHY Status             <796d>
[ 9984.993630] PHY 1000BASE-T Status  <4000>
[ 9984.993632] PHY Extended Status    <3000>
[ 9984.993635] PCI Status             <10>
[ 9986.001047] e1000e 0000:02:00.0: eth0: Reset adapter
[ 9986.176202] e1000e: eth0 NIC Link is Up 10 Mbps Full Duplex, Flow Control: RX/TX
[ 9986.176211] e1000e 0000:02:00.0: eth0: 10/100 speed: disabling TSO

I'm not sure where to start troubleshooting this. Any ideas?

Here is the result of ethtool -d eth0

MAC Registers
-------------
0x00000: CTRL (Device control register)  0x18100248
      Endian mode (buffers):             little
      Link reset:                        reset
      Set link up:                       1
      Invert Loss-Of-Signal:             no
      Receive flow control:              enabled
      Transmit flow control:             enabled
      VLAN mode:                         disabled
      Auto speed detect:                 disabled
      Speed select:                      1000Mb/s
      Force speed:                       no
      Force duplex:                      no
0x00008: STATUS (Device status register) 0x80080703
      Duplex:                            full
      Link up:                           link config
      TBI mode:                          disabled
      Link speed:                        10Mb/s
      Bus type:                          PCI Express
      Port number:                       0
0x00100: RCTL (Receive control register) 0x04048002
      Receiver:                          enabled
      Store bad packets:                 disabled
      Unicast promiscuous:               disabled
      Multicast promiscuous:             disabled
      Long packet:                       disabled
      Descriptor minimum threshold size: 1/2
      Broadcast accept mode:             accept
      VLAN filter:                       enabled
      Canonical form indicator:          disabled
      Discard pause frames:              filtered
      Pass MAC control frames:           don't pass
      Receive buffer size:               2048
0x02808: RDLEN (Receive desc length)     0x00001000
0x02810: RDH   (Receive desc head)       0x00000001
0x02818: RDT   (Receive desc tail)       0x000000F0
0x02820: RDTR  (Receive delay timer)     0x00000000
0x00400: TCTL (Transmit ctrl register)   0x3103F0FA
      Transmitter:                       enabled
      Pad short packets:                 enabled
      Software XOFF Transmission:        disabled
      Re-transmit on late collision:     enabled
0x03808: TDLEN (Transmit desc length)    0x00001000
0x03810: TDH   (Transmit desc head)      0x00000000
0x03818: TDT   (Transmit desc tail)      0x00000000
0x03820: TIDV  (Transmit delay timer)    0x00000008
PHY type:                                IGP2

and ethtool -c eth0

Coalesce parameters for eth0:
Adaptive RX: off  TX: off
stats-block-usecs: 0
sample-interval: 0
pkt-rate-low: 0
pkt-rate-high: 0

rx-usecs: 3
rx-frames: 0
rx-usecs-irq: 0
rx-frames-irq: 0

tx-usecs: 0
tx-frames: 0
tx-usecs-irq: 0
tx-frames-irq: 0

rx-usecs-low: 0
rx-frame-low: 0
tx-usecs-low: 0
tx-frame-low: 0

rx-usecs-high: 0
rx-frame-high: 0
tx-usecs-high: 0
tx-frame-high: 0

Here is also the lspci -vvv output for this controller:

02:00.0 Ethernet controller: Intel Corporation 82573L Gigabit Ethernet Controller
    Subsystem: Lenovo ThinkPad X60s
    Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B- DisINTx+
    Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
    Latency: 0, Cache Line Size: 64 bytes
    Interrupt: pin A routed to IRQ 45
    Region 0: Memory at ee000000 (32-bit, non-prefetchable) [size=128K]
    Region 2: I/O ports at 2000 [size=32]
    Capabilities: [c8] Power Management version 2
        Flags: PMEClk- DSI+ D1- D2- AuxCurrent=0mA PME(D0+,D1-,D2-,D3hot+,D3cold+)
        Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=1 PME-
    Capabilities: [d0] MSI: Enable+ Count=1/1 Maskable- 64bit+
        Address: 00000000fee0300c  Data: 415a
    Capabilities: [e0] Express (v1) Endpoint, MSI 00
        DevCap: MaxPayload 256 bytes, PhantFunc 0, Latency L0s <512ns, L1 <64us
            ExtTag- AttnBtn- AttnInd- PwrInd- RBE- FLReset-
        DevCtl: Report errors: Correctable+ Non-Fatal+ Fatal+ Unsupported+
            RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop+
            MaxPayload 128 bytes, MaxReadReq 512 bytes
        DevSta: CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr+ TransPend-
        LnkCap: Port #0, Speed 2.5GT/s, Width x1, ASPM L0s L1, Latency L0 <128ns, L1 <64us
            ClockPM+ Surprise- LLActRep- BwNot-
        LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- Retrain- CommClk+
            ExtSynch- ClockPM+ AutWidDis- BWInt- AutBWInt-
        LnkSta: Speed 2.5GT/s, Width x1, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
    Capabilities: [100 v1] Advanced Error Reporting
        UESta:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq+ ACSViol-
        UEMsk:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
        UESvrt: DLP+ SDES- TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
        CESta:  RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr-
        CEMsk:  RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr-
        AERCap: First Error Pointer: 14, GenCap- CGenEn- ChkCap- ChkEn-
    Capabilities: [140 v1] Device Serial Number 00-0a-e4-ff-ff-3e-ce-74
    Kernel driver in use: e1000e
    Kernel modules: e1000e

I filed a bug on this upstream, but I still have no idea how to get more useful information.

Here is the result of running that script:

EEPROM FIX UPDATE

$ sudo bash fixeep-82573-dspd.sh eth0
eth0: is a "82573L Gigabit Ethernet Controller"
This fixup is applicable to your hardware
Your eeprom is up to date, no changes were made

Do I still need to do anything? Also, here is my EEPROM dump:

$ sudo ethtool -e eth0
Offset      Values
------      ------
0x0000      00 0a e4 3e ce 74 30 0b b2 ff 51 00 ff ff ff ff 
0x0010      53 00 03 02 6b 02 7e 20 aa 17 9a 10 86 80 df 80 
0x0020      00 00 00 20 54 7e 00 00 14 00 da 00 04 00 00 27 
0x0030      c9 6c 50 31 3e 07 0b 04 8b 29 00 00 00 f0 02 0f 
0x0040      08 10 00 00 04 0f ff 7f 01 4d ff ff ff ff ff ff 
0x0050      14 00 1d 00 14 00 1d 00 af aa 1e 00 00 00 1d 00 
0x0060      00 01 00 40 1f 12 07 40 ff ff ff ff ff ff ff ff 
0x0070      ff ff ff ff ff ff ff ff ff ff ff ff ff ff 4a e0

I'd also like to note that I used eth0 every day for years and until recently never had an issue.
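A reasonable first step for "Detected Hardware Unit Hang" reports is to rule out the offload features one group at a time and re-test, since TSO and flow-control interactions recur in e1000e hang threads. This is only a sketch, not a confirmed fix: the interface name eth0 is assumed, and the commands are merely printed unless DRY_RUN is cleared (run as root to apply):

```shell
#!/bin/sh
# Dry-run wrapper: prints each command; clear DRY_RUN to actually execute.
DRY_RUN=${DRY_RUN-1}
run() { echo "+ $*"; [ -n "$DRY_RUN" ] || "$@"; }

IFACE=eth0
run ethtool -K "$IFACE" tso off gso off    # segmentation offloads
run ethtool -K "$IFACE" sg off             # scatter-gather
run ethtool -A "$IFACE" rx off tx off      # pause-frame flow control
```

If the hangs stop with offloads disabled, re-enable them one at a time to find the culprit. Separately, the "PHY reset is blocked due to SOL/IDER session" line usually indicates Intel AMT/SOL holding the PHY; on a ThinkPad it may be worth disabling AMT in the BIOS as well.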


Source: (StackOverflow)


multiple MAC addresses on one physical network interface (linux)

Simple question: How can I setup multiple MAC addresses on one physical network interface (linux)?

Why? My ISP checks the IP<->MAC pairing on the gateway, and I'd like to route traffic through my "linuxbox" and then forward it with different source IPs.

Without the IP<->MAC check I would just use eth0, eth0:0, and so on, but in this situation I need a unique MAC address for every IP.
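One way to get a distinct MAC per IP is a macvlan interface on top of eth0 instead of eth0:0 aliases: each macvlan answers ARP with its own MAC, so the gateway sees a separate IP/MAC pair per interface. A sketch assuming iproute2; the IPs are placeholders, the 02: prefix marks a locally administered MAC, and the commands are printed for review (run them as root):

```shell
#!/bin/sh
# Derive a locally administered MAC from an index (assumption: any unique
# 02:-prefixed address satisfies the ISP's check).
gen_mac() { printf '02:00:00:00:00:%02x' "$1"; }

i=0
for ip in 192.0.2.10 192.0.2.11; do          # placeholder IPs
    i=$((i + 1))
    echo "ip link add mv$i link eth0 type macvlan mode bridge"
    echo "ip link set mv$i address $(gen_mac $i)"
    echo "ip addr add $ip/24 dev mv$i"
    echo "ip link set mv$i up"
done
```

One known macvlan quirk: traffic between the host's parent interface and its own macvlan interfaces does not flow directly, which only matters when testing locally rather than through the gateway.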


Source: (StackOverflow)

How can you tell what a server actually does? [closed]

I've been handed 3 Linux boxes: one front-facing with Apache on it, and another two which, as far as I can tell, don't do an awful lot. All are running Red Hat.

The question is simple: How can I tell what the server is actually doing? Zero documentation is available from the creator.
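With no documentation, a quick inventory of listening sockets, busy processes, and scheduled jobs usually reveals a box's role. A sketch; the helper just pulls the port column out of `ss -tlnp`-style output so the result is easy to scan, and the commented commands assume a reasonably standard Red Hat userland:

```shell
#!/bin/sh
# Extract the local port from `ss -tlnp`-style output (column 4 is
# Local Address:Port; the part after the last colon is the port).
listening_ports() {
    awk 'NR > 1 { n = split($4, a, ":"); print a[n] }'
}

# On each box, as root:
#   ss -tlnp | listening_ports | sort -un    # which TCP ports are listened on
#   ps aux --sort=-%cpu | head -20           # busiest processes
#   chkconfig --list 2>/dev/null | grep :on  # services enabled at boot
#   ls /etc/cron.d /etc/cron.daily           # scheduled jobs
```

Matching each listening port back to its owning process (ss's `-p` flag shows the PID/name) is usually enough to say what the server actually does; checking recent logs under /var/log fills in the rest.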


Source: (StackOverflow)

Transfer 15TB of tiny files

I'm archiving data from one server to another. Initially I started an rsync job. It took 2 weeks to build the file list for just 5 TB of data, and another week to transfer 1 TB of data.

Then I had to kill the job as we need some down time on the new server.

It's been agreed that we will tar it up, since we probably won't need to access it again. I was thinking of breaking it into 500 GB chunks; after tarring each one, I was going to copy it across over SSH. I was using tar and pigz, but it is still too slow.

Is there a better way to do it? I think both servers are on Red Hat. The old server is ext4 and the new one is XFS.

File sizes range from a few KB to a few MB, and there are 24 million JPEGs in 5 TB. So I'm guessing around 60-80 million files for 15 TB.

EDIT: After playing with rsync, nc, tar, mbuffer and pigz for a couple of days, the bottleneck is going to be the disk I/O, as the data is striped across 500 SAS disks and amounts to around 250 million JPEGs. However, I now know about all these nice tools that I can use in the future.
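For millions of small files, streaming one tar per chunk through a compressor and a raw TCP socket avoids both rsync's file-list phase and ssh's encryption overhead. A sketch with placeholder hosts, ports and paths (pigz and mbuffer are assumed to be installed); the last lines are a local, self-contained demonstration of the same pipeline with gzip standing in for pigz and a pipe standing in for nc:

```shell
#!/bin/sh
# Receiver (new server): listen, decompress, unpack.
#   nc -l 7000 | pigz -d | tar -C /archive -xf -
# Sender (old server): one ~500 GB directory chunk per run.
#   tar -C /data -cf - chunk042 | pigz | mbuffer -m 1G | nc newserver 7000

# Local demonstration of the same pipeline:
src=$(mktemp -d); dst=$(mktemp -d)
echo "photo data" > "$src/img0001.jpg"
tar -C "$src" -cf - . | gzip | gzip -d | tar -C "$dst" -xf -
ls "$dst"
```

Since JPEGs barely compress, pigz -1 (or no compressor at all) may move more bytes per second; and given that the edit identifies disk I/O as the bottleneck, running a few chunks in parallel against different disks is the other lever worth trying.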


Source: (StackOverflow)

"Stale NFS file handle" after reboot

On the server node, it is possible to access an exported folder. However, after reboots (both server and client), the folder is no longer accessible from the clients.

On server

# ls /data
Folder1
Folder2

and the /etc/exports file contains

/data 192.168.1.0/24(rw,no_subtree_check,async,no_root_squash)

On client

# ls /data
ls: cannot access /data: Stale NFS file handle

I should say that there was no problem with the shared folder from the client side; however, after the reboots (server and client), I see this message.

Any way to fix that?
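The usual recovery is to force a lazy unmount on the client and mount again, and to re-export on the server. A sketch that just prints the commands (the paths and the server name are placeholders):

```shell
#!/bin/sh
recovery_cmds() {
    cat <<'EOF'
# on the client (as root)
umount -f -l /data
mount -t nfs server:/data /data
# on the server, if the export did not come back after reboot
exportfs -ra
EOF
}
recovery_cmds
```

As for why it recurs after every reboot: "Stale NFS file handle" means the server no longer recognizes the handle the client cached, which can happen when the export's filesystem identity changes across server reboots. Pinning it with an fsid= option in /etc/exports (e.g. adding fsid=1 to the option list) is a commonly suggested remedy; treat that as something to test rather than a certainty.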


Source: (StackOverflow)

Can I have dots in a hostname?

I'm using names like a.alpha for the hostnames of my Linux boxes, but it seems that these names are not completely usable. The response of the hostname shell command is correct (a.alpha), but the name printed after my user account is "user@a" instead of "user@a.alpha". When I use Avahi, I can reach a.alpha (by hostname) but not b.alpha. Is that normal?


Source: (StackOverflow)

How to "restart" particular network interface on RHEL?

In RHEL, instead of using the service network restart command, how can I restart a particular network interface, let's say "eth1", with only one command?

"Only one command" because that interface is also the one my SSH session is running over. If I were to use ifdown and then ifup, I would never be able to issue the ifup command, as my SSH session would be terminated right after the ifdown eth1 command.

So there should be a single command that brings down and then brings up the interface serving my current SSH connection, so that I don't have to worry about completely losing the connection to my server.

Any ideas?
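The standard trick is to chain both steps into one command line and detach it from the terminal, so the second half still runs after the SSH session drops. A sketch that prints the command rather than running it (eth1 and the 2-second pause are assumptions):

```shell
#!/bin/sh
IFACE=eth1
restart_cmd() {
    printf 'nohup sh -c "ifdown %s; sleep 2; ifup %s" >/dev/null 2>&1 &\n' \
        "$IFACE" "$IFACE"
}
restart_cmd   # print it; paste the output into a root shell to run it
```

The nohup plus backgrounding matters: a plain `ifdown eth1 && ifup eth1` can still die mid-sequence when the shell receives SIGHUP as the connection drops, whereas the detached subshell keeps going and brings the interface back up.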


Source: (StackOverflow)

Intermittently high ping times to router via Wi-Fi

On my local network there are (among others) 5 machines (running Debian Jessie or Arch) wirelessly connected to a Netgear WNDR4000 router. Below is a graph of the ping times to the router from each of the machines, collected over a period of around half an hour.

Ping times

Observations:

  • When things are going well, the ping times are all below 3ms (under 1ms for two of the machines, including the problem machine purple)

  • At irregular intervals (of the order of 100s), three of these machines (red, green, purple) suffer degradation of ping times, while the other two appear unaffected.

  • The degradation periods coincide for all 3 machines.

  • The degradation for purple is two orders of magnitude more severe than for green and red, with ping times typically reaching over 20000ms for purple and 200ms for red and green.

  • If purple is physically moved nearer the router, the degradation completely disappears for purple while continuing as before for both red and green.

  • Red is 3m away and in direct line of sight from the base station; purple's usual location is about 10m away without direct line of sight.

This makes network access on purple intolerably slow (when it is in its normal location).

Can you suggest how to go about diagnosing and fixing the problem?
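A first diagnostic step would be to log purple's received signal level over time and line the dips up against the ping graph, which separates RF problems (distance, interference) from driver or router queueing problems. A sketch; wlan0 is a placeholder, and the parser expects `iw dev wlan0 link` output containing a line like `signal: -61 dBm`:

```shell
#!/bin/sh
# Pull the dBm figure out of `iw dev <if> link` output.
signal_of() {
    awk '/signal:/ { print $2; exit }'
}

# Sampling loop (run on purple; correlate timestamps with the ping log):
#   while sleep 1; do
#       printf '%s %s\n' "$(date +%s)" "$(iw dev wlan0 link | signal_of)"
#   done >> signal.log
```

If the signal drops during the bad periods, it is an RF-level issue and a channel change on the WNDR4000 (or moving purple's link to 5 GHz) may help; if the signal stays flat while pings climb, look instead at client power-save behaviour (`iw dev wlan0 get power_save`) and at queueing on the router.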


Source: (StackOverflow)

Should /etc/hosts contain an entry like '127.0.0.1 localhost myhost.example.org myhost'?

When looking at a variety of Linux and FreeBSD systems, I've noticed that on some systems /etc/hosts contains an entry for the public hostname of the host, but not on other systems.

What is the best practice here? Should my /etc/hosts file contain an entry for the host's FQDN (e.g. myhost.example.org) and for the short hostname (e.g. myhost)? Should the record for the FQDN point to localhost, or should it point to the external IP of the box?

For example, the default configuration on many RHEL/EL boxes doesn't put the public hostname into /etc/hosts:

myhost # cat /etc/hosts
127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4
::1         localhost localhost.localdomain localhost6 localhost6.localdomain6
myhost #

The other variant is that the host's short hostname and FQDN also point to 127.0.0.1. I've been told that this is an older practice which is frowned upon these days, but plenty of admins still do this.

myhost # cat /etc/hosts
127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4 myhost myhost.example.org
::1         localhost localhost.localdomain localhost6 localhost6.localdomain6
myhost #    

The third variant is that the host's FQDN and short hostname are given the external IP address of the host. This third variant seems optimal to me because it reduces lookups against the DNS servers.

myhost # cat /etc/hosts
127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4
::1         localhost localhost.localdomain localhost6 localhost6.localdomain6
74.125.239.xxx myhost myhost.example.org
myhost #  

What is the best practice here?
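Whichever variant is chosen, one ordering detail matters: the first name after the address is what the resolver treats as the canonical name, so the FQDN should come before the short name on its line (note the second variant above lists myhost first). A small helper to check this, written against an explicit hosts-file path only so it is easy to test; the IP in the comment is a placeholder:

```shell
#!/bin/sh
# Print the canonical (first) name listed for a given address in a hosts
# file; pass /etc/hosts and the box's external IP in real use.
canonical_name() {
    awk -v ip="$1" '$1 == ip { print $2; exit }' "$2"
}

# e.g.: canonical_name 192.0.2.7 /etc/hosts
```

Comparing this against `hostname -f` is a quick way to confirm the file and the configured hostname agree.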


Source: (StackOverflow)

Delete a iptables chain with its all rules

I have a chain, and many rules that jump to it:

> :i_XXXXX_i - [0:0]
> -A INPUT -s 282.202.203.83/32 -j i_XXXXX_i 
> -A INPUT -s 222.202.62.253/32 -j i_XXXXX_i 
> -A INPUT -s 222.202.60.62/32 -j i_XXXXX_i 
> -A INPUT -s 224.93.27.235/32 -j i_XXXXX_i 
> -A OUTPUT -d 282.202.203.83/32 -j i_XXXXX_i 
> -A OUTPUT -d 222.202.62.253/32 -j i_XXXXX_i 
> -A OUTPUT -d 222.202.60.62/32 -j i_XXXXX_i 
> -A OUTPUT -d 224.93.27.235/32 -j i_XXXXX_i

When I try to delete this chain with:

iptables -X i_XXXXX_i

I get an error like this (I tried iptables -F i_XXXXX_i first):

iptables: Too many links.

Is there an easy way to delete the chain with one command?
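"Too many links" means other rules still jump to the chain; -F only empties the chain itself, and -X cannot succeed until the INPUT/OUTPUT references are gone too. There is no single built-in command, but the cleanup can be generated from `iptables -S`. A sketch, using the obfuscated chain name from above:

```shell
#!/bin/sh
# Emit the commands that delete every rule jumping to chain $1, then
# flush and delete the chain itself.
chain_delete_cmds() {
    chain="$1"
    grep -- "-j $chain\$" | sed 's/^-A/iptables -D/'
    echo "iptables -F $chain"
    echo "iptables -X $chain"
}

# usage (as root): iptables -S | chain_delete_cmds i_XXXXX_i | sh
```

This works because `iptables -S` prints rules in `-A ...` form, so turning each referencing rule's `-A` into `iptables -D` deletes exactly those rules before the flush-and-delete pair runs.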


Source: (StackOverflow)

PXE Booting Virtualbox guests in internal networking mode?

I am not sure if this is specific to my distro's packages or is a VirtualBox limitation, so any help would be appreciated.

OK, so I have a network of VMs, with one VM acting as a NAT gateway for the other VMs on the "internal" network. One of those VMs is running a DHCP and TFTP server, and I just need to boot the other VMs off this server, but all I get when starting the VMs and booting from the network is "Fatal: Could not read from the boot medium! System halted".

This is kind of a killjoy, since VirtualBox is pretty easy to use and experiment with, but I can't get around this error. Help?
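Two things worth verifying: that the client VM's NIC sits on the same named internal network as the DHCP/TFTP VM, and that the NIC type actually has a PXE ROM available. If I recall the VirtualBox manual correctly, the PCnet adapters can PXE-boot out of the box while the Intel E1000 types need the Extension Pack's PXE ROM. A sketch that prints the relevant VBoxManage commands (the VM and network names are placeholders):

```shell
#!/bin/sh
vm=pxe-client
pxe_setup_cmds() {
    cat <<EOF
VBoxManage modifyvm "$vm" --nic1 intnet --intnet1 labnet
VBoxManage modifyvm "$vm" --nictype1 Am79C973
VBoxManage modifyvm "$vm" --boot1 net --boot2 disk
EOF
}
pxe_setup_cmds
```

The "Could not read from the boot medium" message is what appears when every boot device fails, so it is consistent with a missing PXE ROM or a NIC that never saw a DHCP offer; capturing traffic on the DHCP VM tells you which of the two it is.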


Source: (StackOverflow)

What network loads require NIC polling vs interrupts?

Does anyone have some data or basic calculations that can answer when frame coalescing (NAPI) is required and when a single interrupt per frame is sufficient?

My hardware: IBM BladeServer HS22, Broadcom 5709 Gigabit NIC hardware (MSI-X), with dual Xeon E5530 quad-core processors. Main purpose is Squid proxy server. Switch is a nice Cisco 6500 series.

Our basic problem is that during peak times (100 Mbps traffic, only 10,000 pps), latency and packet loss increase. I have done a lot of tuning and a kernel upgrade to 2.6.38, and it has improved the packet loss, but latency is still poor. Pings are sporadic, jumping even to 200 ms on the local Gbps LAN. Squid's average response time jumps from 30 ms to 500+ ms even though CPU/memory load is fine.

The interrupts climb to about 15,000/second during the peak. ksoftirqd isn't using much CPU; I have installed irqbalance to balance the IRQs (8 each for eth0 and eth1) across all the cores, but that hasn't helped much.

Intel NICs never seem to have these kinds of problems, but due to the blade system and its fixed hardware configuration, we are kind of stuck with the Broadcoms.

Everything is pointing at the NIC as the main culprit. The best idea I have right now is to try to decrease the interrupt rate while keeping both latency low and throughput high.

The bnx2 unfortunately doesn't support adaptive-rx or tx.

The NAPI vs Adaptive Interrupts thread answer provides a great overview of interrupt moderation, but no concrete information on how to calculate optimal ethtool coalesce settings for a given workload. Is there a better approach than just trial and error?

Does the above-mentioned workload and hardware configuration even need NAPI? Or should it be able to live with a single interrupt per packet?
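As a sanity check on the numbers: 10,000 pps is roughly one packet every 100 µs, so even one interrupt per packet is only on the order of the 15k interrupts/s already observed, which a Nehalem-class core handles easily. That suggests the spikes come from moderation and queueing behaviour rather than raw interrupt load. Since bnx2 lacks adaptive coalescing, a measured sweep of static values is the practical approach; a sketch, where the interface name and candidate values are arbitrary assumptions:

```shell
#!/bin/sh
# Print one coalescing setting per candidate value; apply each as root,
# then measure ping latency and Squid response time under peak load.
sweep_cmds() {
    for us in 20 50 100 200 300; do
        echo "ethtool -C eth0 rx-usecs $us"
    done
}
sweep_cmds
```

rx-usecs bounds how long the NIC may sit on received frames before raising an interrupt, so small values favour latency and large values favour interrupt rate. At 10k pps, values much above 100 µs stop reducing interrupts and only add delay, so the interesting region is small.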


Source: (StackOverflow)

how to find out mac addresses of all machines on network

Is there some easy way to find out the MAC addresses of all machines on my network, rather than SSHing into each one and running ifconfig | grep HWaddr? With 300 machines on the network, I really need an easy solution.
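No SSH needed: ping-sweep the subnet so every live host lands in the kernel's neighbour table, then read the table. A sketch; the subnet is a placeholder, and the parser expects `ip neigh show` lines like `10.0.0.2 dev eth0 lladdr 64:70:02:18:d4:6a REACHABLE`:

```shell
#!/bin/sh
# Print "IP MAC" pairs from `ip neigh show` output.
mac_table() {
    awk '$4 == "lladdr" { print $1, $5 }'
}

# usage (one broadcast domain at a time):
#   nmap -sn 10.0.0.0/24 >/dev/null && ip neigh show | mac_table
```

arp-scan (`arp-scan --localnet`) is the purpose-built alternative and also catches hosts that drop ICMP, since it ARPs every address directly. Either way this only covers the local L2 segment; for 300 machines spread across routed subnets, you would query the routers' ARP caches instead.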


Source: (StackOverflow)

How to reduce number of sockets in TIME_WAIT?

Ubuntu Server 10.04.1 x86

I've got a machine with a FCGI HTTP service behind nginx, that serves a lot of small HTTP requests to a lot of different clients. (About 230 requests per second in the peak hours, average response size with headers is 650 bytes, several millions of different clients per day.)

As a result, I have a lot of sockets, hanging in TIME_WAIT (graph is captured with TCP settings below):

TIME_WAIT

I'd like to reduce the number of sockets.

What can I do, besides the settings below?

$ cat /proc/sys/net/ipv4/tcp_fin_timeout
1
$ cat /proc/sys/net/ipv4/tcp_tw_recycle
1
$ cat /proc/sys/net/ipv4/tcp_tw_reuse
1

Update: some details on the actual service layout on the machine:

client -----TCP-socket--> nginx (load balancer reverse proxy) 
       -----TCP-socket--> nginx (worker) 
       --domain-socket--> fcgi-software
                          --single-persistent-TCP-socket--> Redis
                          --single-persistent-TCP-socket--> MySQL (other machine)

I probably should switch the load-balancer --> worker connection to domain sockets as well, but the issue with TIME_WAIT sockets would remain, since I plan to add a second worker on a separate machine soon and won't be able to use domain sockets in that case.
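Two cautions on the sysctls shown above: tcp_fin_timeout does not shorten TIME_WAIT (it governs FIN-WAIT-2), and tcp_tw_recycle is known to break clients behind NAT, which matters with millions of different clients; tcp_tw_reuse is the safer of the pair. Since TIME_WAIT entries are cheap, the first question is which hop actually owns them. This sketch tallies them by local port from `ss -tan` output (the column layout is the assumption here):

```shell
#!/bin/sh
# Count TIME-WAIT sockets per local port; high counts on the nginx->worker
# port are fixable with keepalive or domain sockets, while counts on :80
# belong to real clients and are mostly harmless.
tw_by_port() {
    awk '$1 == "TIME-WAIT" { n = split($4, a, ":"); c[a[n]]++ }
         END { for (p in c) print c[p], p }' | sort -rn
}
# usage: ss -tan | tw_by_port
```

For the load-balancer to worker hop specifically, HTTP keepalive (an nginx `upstream` block with a `keepalive N` directive) removes most of the connection churn, and keeps working when the second worker moves to another machine.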


Source: (StackOverflow)