OSPF point-to-multipoint, single area, multiple interfaces packet loss



I’m having a problem with an OSPF setup that I’m testing. My network is composed of two LANs; let’s call them LAN A and LAN B.

Here is a diagram to help understand how this is set up.

My network core consists of four routers, R1 through R4. Each of these routers has two interfaces: eth1 connected to LAN A and eth2 connected to LAN B.

LAN A subnet is 172.50.0.0/16

LAN B subnet is 172.30.0.0/16

The goal is to route the loopback addresses between the LANs using the core routers.

I have one OSPF area 0.0.0.1, configured the following way:

R1

[code]area 0.0.0.1 {
    network 10.0.0.11/32
    network 172.30.0.0/16
    network 172.50.0.0/16
}
log-adjacency-changes {
}
parameters {
    abr-type cisco
    router-id 10.0.0.11
}[/code]

R2

[code]area 0.0.0.1 {
    network 10.0.0.12/32
    network 172.30.0.0/16
    network 172.50.0.0/16
}
log-adjacency-changes {
}
parameters {
    abr-type cisco
    router-id 10.0.0.12
}[/code]

R3

[code]area 0.0.0.1 {
    network 10.0.0.13/32
    network 172.30.0.0/16
    network 172.50.0.0/16
}
log-adjacency-changes {
}
parameters {
    abr-type cisco
    router-id 10.0.0.13
}[/code]

R4

[code]area 0.0.0.1 {
    network 10.0.0.14/32
    network 172.30.0.0/16
    network 172.50.0.0/16
}
log-adjacency-changes {
}
parameters {
    abr-type cisco
    router-id 10.0.0.14
}[/code]

Note: the OSPF network type is set to point-to-multipoint on all interfaces. This is necessary because LAN A and LAN B use private VLANs to isolate the broadcast domain, so we cannot assume full-mesh connectivity between all routers.
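
For reference, the per-interface network type setting looks roughly like the fragment below. This is a sketch rather than a paste from the routers: it assumes the usual interfaces > ethernet > ip > ospf hierarchy, uses the eth1/eth2 names from above, and leaves out addresses and all other interface settings.

[code]interfaces {
    ethernet eth1 {
        ip {
            ospf {
                network point-to-multipoint
            }
        }
    }
    ethernet eth2 {
        ip {
            ospf {
                network point-to-multipoint
            }
        }
    }
}[/code]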

I am seeing strange behaviour where one or two of the OSPF adjacencies sometimes get stuck in the Exstart state. It may take 10 minutes to get from Exstart -> Exchange -> Full. When an adjacency eventually reaches the Full state, I see 50% packet loss between one pair of routers. Sometimes the loss is between R1 and R3, sometimes between R4 and R2. It seems random. It almost looks like I’m ending up with an L3 loop when using the point-to-multipoint network type.

If I disable eth2 on all routers and let OSPF re-converge, I no longer see the packet loss.

Should I not be using the point-to-multipoint network type? Has anyone experienced a similar issue?

I replicated this exact setup with Cisco gear and had no problems.