FreeSWITCH Sending RTP to Private IP (most of the time)

I have a weird issue I’ve been trying to pin down for well over a year really.

About 80% of the time, FreeSWITCH starts by sending RTP to the private IP address of endpoints behind NAT.

FreeSWITCH has a public IP and endpoints are behind NAT. The INVITE from endpoints sends the correct port in the SDP. FreeSWITCH is configured as follows:

ext-rtp-ip = external IP (x.x.x.x)
apply-nat-acl = rfc1918.auto
local-network-acl = rfc1918.auto

What I really want is for FreeSWITCH to start by sending RTP to the port in the initial INVITE's SDP and to the IP address that the INVITE actually comes from. Instead, it sends to the local IP in the SDP, which then gets corrected by autonat later. That seems unnecessary, since we know (or at least can infer) both the correct IP and port from the initial INVITE…

This is what a SIP trace looks like with the above config:

Anybody have any hints as to how to try and get FreeSWITCH to stop sending RTP to the local IP and relying on autonat?

local-network-acl won’t even apply unless you prefix it with autonat:x.x.x.x. That turns on the flag to pick the IP based on the remote network address using that ACL; we had to do that to preserve backwards compatibility of always telling the lie.

Thanks,
/b
EDIT: Also, it sounds like your clients need to do STUN better. If you’re public, then always tell the lie and local-network-acl won’t even be needed.

Thanks for the tip @BrianWest-SW

So I tried setting local-network-acl to:

autonat:rfc1918.auto
autonat:rfc1918 (my own acl list)
autonat:192.168.0.196 (the actual LAN IP of the device I’m testing)

Unfortunately, no change, FreeSWITCH always insists on trying to send RTP to the LAN IP provided in the SDP.

Of course you’re completely correct that when possible it’s nice to use STUN, but I’m trying to optimize cases where that’s not possible (which are unfortunately plentiful in our case).

Is it possible that FreeSWITCH presently just can’t send to the source WAN IP / port immediately, without waiting to receive RTP from the far side first?

There is a wiki comment that might help from Kristian Kielhofner:

https://freeswitch.org/confluence/display/FREESWITCH/NAT+Traversal#NATTraversal-DistilledWisdom

You zigged where you should have zagged; that’s an easy mistake to make. You should always set ext-rtp-ip to autonat:publicip, not the internal IP and not the ACL name… I suspect you’re on AWS or another cloud, so setting local-network-acl to rfc1918.auto and ext-rtp-ip to the public IP prefixed with autonat: will probably solve it. If not, post a SIP trace and let’s see where we are.

Thanks,
Brian

Thanks for the link @BoteMan :slight_smile:

@BrianWest-SW Thanks so much for taking a look at this one. It’s one of those issues I’ve tried to fix on and off for a very, very long time without any success.

So as you suggested, I tried with:

ext-rtp-ip = autonat:publicip (previously publicip)
local-network-acl = rfc1918.auto

However, the behaviour is unchanged. For additional context, this is on a dedicated server and the SIP profile is bound to a public IP. It’s not behind NAT or any load balancer; it has raw access to the internet.

I went and grabbed a SIP trace of an example call as well as the fs_cli debug level output and a list of all the parameters on the SIP profile I’m testing with. This is tested on 1.10.8.

freeswitch-rtp-local-ip-traces_logs.zip

Is the public IP directly bound to the system?

Yes, indeed it is. It’s bound directly to the NIC.

what is the network topo?

Fairly straightforward scenario with the client behind NAT. Basically this:

(Endpoint => NAT Router ) <=> Internet <=> ( FreeSWITCH Server )

FreeSWITCH gets the call directly and there’s no SBC in-between. e.g. endpoint registers directly.

The root cause is that you’re not making the endpoint traverse NAT when it’s perfectly capable of doing it. There is no reason to ever set ext-rtp-ip or local-network-acl in this case; set up your Yealink to do STUN properly and enable rport. These settings are for when you want to talk, on the same profile, to devices behind NAT with you and devices OUTSIDE the NAT. If you’re not doing that, then these settings are moot.

/b

OK, thanks for explaining these settings, it’s more clear now!

As I said earlier, I realize Yealinks can do NAT traversal better. I used a Yealink without STUN because it was on my desk and it illustrates the scenario I’m grappling with. The reason is that we have many devices that don’t do STUN (analog gateways, softphones, and some deskphones from troublesome vendors).

Even as-is, they do work most of the time thanks to FreeSWITCH doing its autonat thing where it begins sending media to the WAN IP once it receives RTP. However, we do get sporadic reports of delays up to many seconds before FreeSWITCH clues in. When I check those traces, FreeSWITCH is busy sending RTP to the LAN IP for some seconds before switching.

Hence my thought of “hey, maybe it’s possible to pare out the behaviour of sending to the LAN IP completely”, since we know in advance it won’t go anywhere. We should also have all the information we need to try the WAN IP first. That is, FreeSWITCH:

  • Detects phone is behind NAT
  • Knows the negotiated RTP port
  • Knows the phone’s WAN IP
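
For anyone reading along, here is a minimal Python sketch of the selection logic being asked for above. It is not FreeSWITCH code and it is not something any of the NAT modes discussed in this thread does; `initial_rtp_target` and the `RFC1918` list are hypothetical names used only for illustration. It also assumes the NAT router keeps the RTP port from the SDP unchanged, which many routers do not guarantee.

```python
import ipaddress
from typing import Tuple

# RFC 1918 private ranges (what an ACL like rfc1918.auto is meant to cover).
RFC1918 = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("172.16.0.0/12"),
    ipaddress.ip_network("192.168.0.0/16"),
]


def initial_rtp_target(signaling_src_ip: str, sdp_ip: str, sdp_port: int) -> Tuple[str, int]:
    """Pick where to send the first RTP packets (hypothetical helper).

    If the SDP advertises a private address that differs from the address
    the INVITE actually arrived from, assume the phone is behind NAT and
    aim at the signaling source address with the SDP port.
    """
    sdp_addr = ipaddress.ip_address(sdp_ip)
    src_addr = ipaddress.ip_address(signaling_src_ip)
    sdp_is_private = any(sdp_addr in net for net in RFC1918)
    if sdp_is_private and sdp_addr != src_addr:
        # Assumption: the NAT did not remap the RTP port.
        return (signaling_src_ip, sdp_port)
    return (sdp_ip, sdp_port)


# Phone at 192.168.0.196 behind a router whose WAN address is 203.0.113.7.
print(initial_rtp_target("203.0.113.7", "192.168.0.196", 11780))
```

If the router does remap the port, this guess fails just as sending to the LAN IP does, and media only flows once RTP from the phone reveals the real source port.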

I also realize this could be solved with an SBC up front. But I’m hopelessly attached to FreeSWITCH at the moment :heart:

You may need to come to the next community hours. If you want to support devices like that, you’re just going to burn time on supporting end users.

/b

Also, I go over a lot of this in this video; it’s all still valid.

ClueCon Weekly - Jan 13, 2016 - Brian West - NAT Traversal with FreeSWITCH - YouTube

Thanks for sharing the link @BrianWest-SW, really good explanation. I watched the video and summarized everything you said below in case it’s useful for anyone that comes across this. It all makes perfect sense how this works now, so thanks!

Unfortunately, none of the 3 NAT modes described work to send RTP to the WAN IP directly from the outset. I’ll try to join office hours next time they come up. Thanks so much for all the help!


NAT Mode 1

<param name="ext-rtp-ip" value="190.102.98.2"/>
<param name="ext-sip-ip" value="190.102.98.2"/>

When we first started, FreeSWITCH had NAT mode 1 which just had ext-rtp-ip and ext-sip-ip. These two settings would allow you to always tell a lie in the SIP packets and the RTP. We would bind to the local IPs on the box and we would use two particular elements in the SDP and in the SIP packets. You would always be able to speak with your clients that reside outside of the NAT that the FreeSWITCH server is sitting behind.

NAT Mode 2 (Uses NAT-PMP/UPnP to discover your public IP)

<param name="ext-rtp-ip" value="auto-nat"/>
<param name="ext-sip-ip" value="auto-nat"/>
<param name="local-network-acl" value="localnet.auto"/>

With NAT mode 1, this posed a problem: if you wanted to use one profile to talk to entities that live behind the NAT with you at the same time, you couldn’t use a single profile with that particular configuration. You would have to create a separate Sofia profile to be able to speak with those elements behind the NAT with FreeSWITCH and elements outside. This posed a little bit of a problem if you had a client that would reside on the network and roam off the network onto say LTE or otherwise the outside of your network.

The second way you can configure your NAT mode is by setting ext-rtp-ip and ext-sip-ip to auto-nat. What this does is activate the discovery process at startup. A lot of you guys probably run on static IPs and start FreeSWITCH with -nonat so you don’t waste 3-5s on that discovery at startup.

If you’re on a consumer-grade connection or consumer-grade router, you probably have NAT-PMP or UPnP. One setting that is critical for this to function properly is the local-network-acl. This is set up by default with allow lines that dictate which networks are behind NAT with you.

What this will do is allow the NAT subsystem in Sofia to look at where the request came from. NOT what is in the SDP and NOT what is in the packet, but where this request ACTUALLY came from - the real IP address. It will compare where the request came from against this ACL. If there’s a match, it WILL NOT use the ext-rtp-ip and ext-sip-ip.

If it does not match the ACL, it will use those external settings. So you can determine if something’s behind NAT with you or outside of NAT with you.

NAT Mode 3

<param name="ext-rtp-ip" value="autonat:190.102.98.2"/>
<param name="ext-sip-ip" value="autonat:190.102.98.2"/>
<param name="local-network-acl" value="localnet.auto"/>

More of a static configuration – I know my public IP, I know I’m going to talk to things behind NAT and things on the public internet. This mode lets you prefix your external SIP and RTP IPs with autonat: (no dash). What this does is activate the local-network-acl and hard-set your IPs to those values, for when you always know those values. If they change, this will not function anymore; you will have inconsistent calls that end after 30 seconds if your IP address changed.

This will use the local-network-acl to determine what is local to the system vs. what is not, so it can dynamically figure out “should I use the internal RTP IP and SIP IP, or should I use the external RTP IP and external SIP IP?”

This whole thing is a little bit confusing because a lot of people will assume that setting the local-network-acl and just setting an IP in ext-rtp-ip and ext-sip-ip will make this behavior happen. Unfortunately, it won’t. You have to prefix the IP with autonat: and that’s very critical.

The reason we need to do that is to maintain the old behavior where you hard-set the ext-rtp-ip and ext-sip-ip as in Mode 1. We didn’t want to break anybody in the field by changing that behavior.
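
As a rough illustration of the difference between Mode 1 and Mode 3 as described in the video summary above, here is a minimal Python sketch. It is only a model of the decision, not Sofia's actual code; `advertised_ip` and `LOCALNET` are hypothetical names, and the list of networks merely stands in for whatever local-network-acl resolves to on a given box. Mode 2 (auto-nat discovery) is deliberately not modelled.

```python
import ipaddress

# Stand-in for the networks a local-network-acl would allow (assumption).
LOCALNET = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("172.16.0.0/12"),
    ipaddress.ip_network("192.168.0.0/16"),
]


def advertised_ip(ext_ip_setting, internal_ip, request_src_ip, local_networks=LOCALNET):
    """Return the IP to advertise to a peer (hypothetical helper)."""
    prefix = "autonat:"
    if not ext_ip_setting.startswith(prefix):
        # Mode 1: plain IP, always "tell the lie"; the ACL is never consulted.
        return ext_ip_setting
    # Mode 3: the autonat: prefix activates the ACL check.
    external_ip = ext_ip_setting[len(prefix):]
    src = ipaddress.ip_address(request_src_ip)
    if any(src in net for net in local_networks):
        # The peer is behind NAT with us: use the internal address.
        return internal_ip
    # The peer is outside our NAT: use the external address.
    return external_ip


for setting in ("190.102.98.2", "autonat:190.102.98.2"):
    for peer in ("192.168.1.50", "203.0.113.7"):
        print(setting, peer, "->", advertised_ip(setting, "192.168.1.10", peer))
```

Note that the check uses the address the request actually came from, not anything inside the SDP, which matches the description above.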

aggressive-nat-detection

Values: true, false

You probably should not use this – it’s there for legacy reasons. If you have to use it, you’re at your wits’ end and this is the only thing that will work. I don’t think it’s even recommended anymore.

apply-nat-acl

Values: acl_name

Should I do NAT processing on this packet? Maybe set to “nat.auto”. Only benefit of not setting this is for CPU performance gain, which nowadays is negligible.

enable-compact-headers

Values: true, false

If you guys have been paying attention to WebRTC and how big those SDPs are – there’s a reason they use TCP. They are too big to fit in a UDP frame because of the MTU, so they use TCP. This option will try to make the packet a bit smaller to try to fit it within the MTU whenever possible.
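
To show what compact headers mean in practice, here is a small Python sketch that rewrites a few SIP header names to the single-letter compact forms defined in RFC 3261. It is an illustration only, not FreeSWITCH's implementation; `compact_headers` and the sample message are made up for this example, and header line folding is not handled.

```python
# Compact header forms from RFC 3261 (subset).
COMPACT = {
    "via": "v",
    "from": "f",
    "to": "t",
    "call-id": "i",
    "contact": "m",
    "content-length": "l",
    "content-type": "c",
    "supported": "k",
    "subject": "s",
}


def compact_headers(message: str) -> str:
    """Rewrite known header names to compact form; the body is left untouched."""
    head, sep, body = message.partition("\r\n\r\n")
    lines = head.split("\r\n")
    out = [lines[0]]  # request or status line stays as it is
    for line in lines[1:]:
        name, colon, value = line.partition(":")
        short = COMPACT.get(name.strip().lower())
        out.append(f"{short}:{value}" if short and colon else line)
    return "\r\n".join(out) + sep + body


msg = (
    "INVITE sip:bob@example.com SIP/2.0\r\n"
    "Via: SIP/2.0/UDP host.example.com\r\n"
    "From: <sip:alice@example.com>\r\n"
    "CSeq: 1 INVITE\r\n"
    "Content-Length: 0\r\n"
    "\r\n"
)
small = compact_headers(msg)
print(len(msg), "->", len(small), "bytes")
```

The saving is only a handful of bytes per header, so this helps with messages that are slightly over the MTU, not with very large SDPs.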

-nonat (startup flag)

To turn NAT detection off, start FreeSWITCH with “-nonat”. If you still prefer to have NAT detection but want to avoid the port mapping using UPnP/NAT-PMP, you can use “-nonatmap”.


This is a mindset issue, which is very common on this particular topic. I sent an email to the address on your account; please reply with your direct email and let’s see if I can set up a time. I’ll set aside time to work with you one on one for a little bit if you document everything you learn! :slight_smile:

EDIT: Thanks for hopping on the call with us. I think we’re all on the same page now. Good luck, and see you in the next Community Call / Office Hours.

Thanks,
Brian

Further up this thread the OP said:

I’ve noticed this too. Pretty much all of the time, FreeSWITCH auto-adjusts the address to which it sends RTP. After receiving 10 RTP packets, it changes from sending to the internal address advertised by the user agent in its SDP to the address that it is actually receiving RTP from, and logs a debug line: “Auto Changing %s port from %s:%u to %s:%u”.
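
The behaviour described above can be modelled roughly like this in Python. This is a sketch of the observed behaviour only, not FreeSWITCH's source: the class name `RtpAutoAdjust` is hypothetical, the threshold of 10 is taken from the observation in this post, and the real implementation likely has additional conditions that are not reproduced here.

```python
class RtpAutoAdjust:
    """Toy model: switch the RTP send target after enough packets
    arrive from an address other than the current target."""

    def __init__(self, target, threshold=10):
        self.target = target        # (ip, port) we currently send to
        self.threshold = threshold  # packets needed before switching
        self._candidate = None
        self._count = 0

    def on_packet(self, source):
        """Feed the (ip, port) of a received packet; return a log line on switch."""
        if source == self.target:
            # Media already comes from where we send; nothing to adjust.
            self._candidate, self._count = None, 0
            return None
        if source != self._candidate:
            # New candidate source: restart the count.
            self._candidate, self._count = source, 0
        self._count += 1
        if self._count >= self.threshold:
            old, self.target = self.target, source
            self._candidate, self._count = None, 0
            return "Auto Changing %s port from %s:%u to %s:%u" % (("audio",) + old + source)
        return None


adj = RtpAutoAdjust(("192.168.0.196", 11780))
for _ in range(10):
    line = adj.on_packet(("203.0.113.7", 40012))
print(line)
```

In this model, until the tenth packet arrives from the phone's WAN side, everything FreeSWITCH sends still goes to the private address, which is one way the delays reported earlier in the thread could arise.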

I’ve never needed to use STUN on connected handsets because this behaviour sorts it out. Except when it doesn’t.

I have occasional reports from Android Zoiper softphone users where they can’t hear audio. Traces show that FreeSWITCH doesn’t change the IP addresses that it sends RTP to, despite receiving plenty of RTP for the auto adjust mechanism to kick in. I can’t for the life of me see in the trace or the debug what is different between a working case and a non-working case.

And, yes, the proper solution is to use STUN here - that’s what I’ve advised my customers to set in the app whenever they report this. However my worry is that if there is some mechanism where the auto-adjust doesn’t apply, then I have a whole estate of Snoms, Polycoms and Yealinks currently not STUN-configured that could potentially stop working.

I would really like to understand why in a very small number of cases this mechanism doesn’t work. Has anyone come across this, and have any ideas?

Is NDLB-connectile-dysfunction in use?