Fixing memif's poor performance in a VPP tutorial
Problem statement
I’m going through the Progressive VPP Tutorial and I’ve noticed that in the Connecting Two FD.io VPP Instances section the ping latency is rather atrocious:
vpp# ping 10.10.2.2
116 bytes from 10.10.2.2: icmp_seq=1 ttl=64 time=21.2216 ms
116 bytes from 10.10.2.2: icmp_seq=2 ttl=64 time=20.0083 ms
116 bytes from 10.10.2.2: icmp_seq=3 ttl=64 time=20.0124 ms
116 bytes from 10.10.2.2: icmp_seq=4 ttl=64 time=12.8354 ms
116 bytes from 10.10.2.2: icmp_seq=5 ttl=64 time=11.0053 ms
Statistics: 5 sent, 5 received, 0% packet loss
Because 10-20 ms on what is essentially a “loopback” interface is tragic.
What is also weird is that one core of the VM is pegged at 100%.
I’ll look into that.
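For the record, the per-core load is easy to watch from the host with plain Linux tooling (mpstat comes from the sysstat package; any per-CPU view will do):
$ mpstat -P ALL 1
$ top    # then press 1 to toggle the per-CPU view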
Answers
I dug around and found the VPP memif ping taking ~20ms thread, which matches my experience.
I experimented a bit:
$ cat startup1.conf
unix {cli-listen /run/vpp/cli-vpp1.sock}
api-segment { prefix vpp1 }
plugins { plugin dpdk_plugin.so { disable } }
# here be the fix:
cpu { main-core 0 }
$ cat startup2.conf
unix {cli-listen /run/vpp/cli-vpp2.sock}
api-segment { prefix vpp2 }
plugins { plugin dpdk_plugin.so { disable } }
# here be the fix:
cpu { main-core 1 }
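For completeness, with these files the two instances get started and reached roughly like this (adjust paths to your install; vpp reads its startup configuration via -c, and vppctl attaches via the CLI socket):
$ sudo /usr/bin/vpp -c startup1.conf
$ sudo /usr/bin/vpp -c startup2.conf
$ sudo vppctl -s /run/vpp/cli-vpp1.sock
$ sudo vppctl -s /run/vpp/cli-vpp2.sock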
And with this setup (pegging one instance to core 0 and the other to core 1, so the two polling main loops presumably no longer have to time-slice on a single core) the following config:
# vpp1
create interface memif id 0 master
set int state memif0/0 up
set int ip address memif0/0 10.10.2.1/24
# vpp2
create interface memif id 0 slave
set int state memif0/0 up
set int ip address memif0/0 10.10.2.2/24
takes the ping to a much more reasonable result:
vpp# sh int addr
local0 (dn):
memif0/0 (up):
L3 10.10.2.1/24
vpp# ping 10.10.2.2
116 bytes from 10.10.2.2: icmp_seq=1 ttl=64 time=.0289 ms
116 bytes from 10.10.2.2: icmp_seq=2 ttl=64 time=.0262 ms
116 bytes from 10.10.2.2: icmp_seq=3 ttl=64 time=.0293 ms
116 bytes from 10.10.2.2: icmp_seq=4 ttl=64 time=.0262 ms
116 bytes from 10.10.2.2: icmp_seq=5 ttl=64 time=.0314 ms
Statistics: 5 sent, 5 received, 0% packet loss
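To double-check that the pinning took effect, show threads on each instance should report the lcore its main thread landed on; with the CLI sockets from above:
$ sudo vppctl -s /run/vpp/cli-vpp1.sock show threads
$ sudo vppctl -s /run/vpp/cli-vpp2.sock show threads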
The 100% peg on core 1, however, remains.
Given that the rx-placement is polling:
vpp# sh int rx-placement
Thread 0 (vpp_main):
node memif-input:
memif0/0 queue 0 (polling)
and given the TNSR CPU utilization on SG-5100 thread, I’m led to believe the 100% CPU peg is normal. Which is strange, but probably not unreasonable.
Edited to add: the VPP CPU Load section of the Troubleshooting documentation page says it clearly:
With at least one interface in polling mode, the VPP CPU utilization is always 100%.
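Which also points at the obvious knob if the constant 100% bothers you: take the memif queue out of polling. I haven’t measured what this does to the ping times, but the sketch would be to run, on each instance:
vpp# set interface rx-mode memif0/0 interrupt
vpp# sh int rx-placement
after which the rx-placement output should show the queue in interrupt rather than polling mode.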