Chapter 13 PPP

13.1. I cannot make ppp(8) work. What am I doing wrong?
13.2. Why does ppp(8) hang when I run it?
13.3. Why will ppp(8) not dial in -auto mode?
13.4. What does “No route to host” mean?
13.5. Why does my connection drop after about 3 minutes?
13.6. Why does my connection drop under heavy load?
13.7. Why does my connection drop after a random amount of time?
13.8. Why does my connection hang after a random amount of time?
13.9. The remote end is not responding. What can I do?
13.10. ppp(8) has hung. What can I do?
13.11. I keep seeing errors about magic being the same. What does it mean?
13.12. LCP negotiations continue until the connection is closed. What is wrong?
13.13. Why does ppp(8) lock up when I shell out to test it?
13.14. Why does ppp(8) over a null-modem cable never exit?
13.15. Why does ppp(8) dial for no reason in -auto mode?
13.16. What do these CCP errors mean?
13.17. Why does ppp(8) not log my connection speed?
13.18. Why does ppp(8) ignore the \ character in my chat script?
13.19. Why does ppp(8) get a “Segmentation fault”, but I see no ppp.core file?
13.20. Why does the process that forces a dial in -auto mode never connect?
13.21. Why do most games not work with the -nat switch?
13.22. What are FCS errors?
13.23. None of this helps — I am desperate! What can I do?

13.1. I cannot make ppp(8) work. What am I doing wrong?

You should first read the ppp(8) manual page and the PPP section of the handbook. Enable logging with the following command:

set log Phase Chat Connect Carrier lcp ipcp ccp command

This command may be typed at the ppp(8) command prompt or it may be entered in the /etc/ppp/ppp.conf configuration file (the start of the default section is the best place to put it). Make sure that /etc/syslog.conf (see syslog.conf(5)) contains the lines below and the file /var/log/ppp.log exists:

!ppp
*.*        /var/log/ppp.log

You can now find out a lot about what is going on from the log file. Do not worry if it does not all make sense. If you need to get help from someone, it may make sense to them.

13.2. Why does ppp(8) hang when I run it?

This is usually because your hostname will not resolve. The best way to fix this is to make sure that /etc/hosts is consulted by your resolver first by editing /etc/host.conf and putting the hosts line first. Then, simply put an entry in /etc/hosts for your local machine. If you have no local network, change your localhost line:

127.0.0.1        foo.example.com foo localhost

Otherwise, simply add another entry for your host. Consult the relevant manual pages for more details.

You should be able to successfully ping -c1 `hostname` when you are done.

13.3. Why will ppp(8) not dial in -auto mode?

First, check that you have a default route. Running netstat -rn (see netstat(1)) should show two entries like this:

Destination        Gateway            Flags     Refs     Use     Netif Expire
default            10.0.0.2           UGSc        0        0      tun0
10.0.0.2           10.0.0.1           UH          0        0      tun0

This is assuming that you have used the addresses from the handbook, the manual page, or from ppp.conf.sample. If you do not have a default route, it may be because you forgot to add the HISADDR line to ppp.conf.
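
The check above can also be scripted. The following is a minimal sketch, not part of ppp(8): the function name default_route_via_tun is hypothetical, and it assumes the netstat -rn output keeps the destination in the first column and names the interface (tun0, tun1, ...) somewhere after the gateway, as in the listing above.

```python
def default_route_via_tun(netstat_output: str) -> bool:
    """Return True if a 'default' route pointing at a tun interface is listed."""
    for line in netstat_output.splitlines():
        fields = line.split()
        # Only lines whose destination column is exactly "default" are of interest.
        if len(fields) < 3 or fields[0] != "default":
            continue
        # The interface name appears after the gateway; column order varies by release.
        if any(field.startswith("tun") for field in fields[2:]):
            return True
    return False


sample = """\
Destination        Gateway            Flags     Refs     Use     Netif Expire
default            10.0.0.2           UGSc        0        0      tun0
10.0.0.2           10.0.0.1           UH          0        0      tun0
"""
print(default_route_via_tun(sample))
```

In practice the text would come from running netstat -rn (for example through subprocess.run); the sample string here only mirrors the listing above.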

Another reason for the default route line being missing is that you have mistakenly set up a default router in your /etc/rc.conf (see rc.conf(5)) file and you have omitted the line below from ppp.conf:

delete ALL

If this is the case, go back to the Final System Configuration section of the handbook.

13.4. What does “No route to host” mean?

This error usually means that the following section is missing from your /etc/ppp/ppp.linkup:

MYADDR:
  delete ALL
  add 0 0 HISADDR

This is only necessary if you have a dynamic IP address or do not know the address of your gateway. If you are using interactive mode, you can type the following after entering packet mode (packet mode is indicated by the capitalized PPP in the prompt):

delete ALL
add 0 0 HISADDR

Refer to the PPP and Dynamic IP addresses section of the handbook for further details.

13.5. Why does my connection drop after about 3 minutes?

The default PPP timeout is 3 minutes. This can be adjusted with the following line:

set timeout NNN

where NNN is the number of seconds of inactivity before the connection is closed. If NNN is zero, the connection is never closed due to a timeout. It is possible to put this command in ppp.conf, or to type it at the prompt in interactive mode. It is also possible to adjust it on the fly while the line is active by connecting to ppp's server socket using telnet(1) or pppctl(8). Refer to the ppp(8) man page for further details.

13.6. Why does my connection drop under heavy load?

If you have Link Quality Reporting (LQR) configured, it is possible that too many LQR packets are lost between your machine and the peer. ppp(8) deduces that the line must therefore be bad, and disconnects. LQR is disabled by default; if you have enabled it and the drops coincide with heavy load, turn it off again with the following line:

disable lqr

13.7. Why does my connection drop after a random amount of time?

Sometimes, on a noisy phone line or even on a line with call waiting enabled, your modem may hang up because it thinks (incorrectly) that it lost carrier.

Most modems have a setting that determines how tolerant they are of temporary losses of carrier. Refer to the modem manual for details.

13.8. Why does my connection hang after a random amount of time?

Many people experience hung connections with no apparent explanation. The first thing to establish is which side of the link is hung.

If you are using an external modem, you can simply try using ping(8) to see if the TD light is flashing when you transmit data. If it flashes (and the RD light does not), the problem is with the remote end. If TD does not flash, the problem is local.

With an internal modem, you will need to use the set server command in ppp.conf. When the hang occurs, connect to ppp(8) using pppctl(8). If your network connection suddenly revives (PPP was revived due to the activity on the diagnostic socket) or if you cannot connect (assuming the set server command succeeded at startup time), the problem is local. If you can connect and things are still hung, enable local async logging with set log local async and use ping(8) from another window or terminal to make use of the link. The async logging will show you the data being transmitted and received on the link. If data is going out and not coming back, the problem is remote.

Having established whether the problem is local or remote, you now have two possibilities:

  • If the problem is remote, read on to entry 13.9.

  • If the problem is local, read on to entry 13.10.

13.9. The remote end is not responding. What can I do?

There is very little you can do about this. Most ISPs will refuse to help if you are not running a Microsoft® OS. You can enable lqr in your ppp.conf, allowing ppp(8) to detect the remote failure and hang up, but this detection is relatively slow and therefore not that useful. You may want to avoid telling your ISP that you are running user-PPP.

First, try disabling all local compression by adding the following to your configuration:

disable pred1 deflate deflate24 protocomp acfcomp shortseq vj
deny pred1 deflate deflate24 protocomp acfcomp shortseq vj

Then reconnect to ensure that this makes no difference. If things improve or if the problem is solved completely, determine which setting makes the difference through trial and error. This will provide good ammunition when you contact your ISP (although it may make it apparent that you are not running a Microsoft product).

Before contacting your ISP, enable async logging locally and wait until the connection hangs again. This may use up quite a bit of disk space. The last data read from the port may be of interest. It is usually ASCII data, and may even describe the problem (“Memory fault”, “Core dumped”).

If your ISP is helpful, they should be able to enable logging on their end. When the next link drop occurs, they may then be able to tell you why their side is having a problem.

13.10. ppp(8) has hung. What can I do?

Your best bet here is to rebuild ppp(8) with debugging information, and then use gdb(1) to grab a stack trace from the ppp process that is stuck. To rebuild the ppp utility with debugging information, you can type:

# cd /usr/src/usr.sbin/ppp
# env DEBUG_FLAGS='-g' make clean
# env DEBUG_FLAGS='-g' make install

Then you should restart ppp and wait until it hangs again. When the debug build of ppp hangs, start gdb on the stuck process by typing:

# gdb ppp `pgrep ppp`

At the gdb prompt, you can use the bt or where commands to get a stack trace. Save the output of your gdb session, and “detach” from the running process by typing quit.

13.11. I keep seeing errors about magic being the same. What does it mean?

Occasionally, just after connecting, you may see messages in the log that say “Magic is same”. Sometimes, these messages are harmless, and sometimes one side or the other exits. Most PPP implementations cannot survive this problem, and even if the link seems to come up, you will see repeated configure requests and configure acknowledgments in the log file until ppp(8) eventually gives up and closes the connection.

This normally happens on server machines with slow disks that are spawning a getty(8) on the port, and executing ppp(8) from a login script or program after login. There were reports of it happening consistently when using slirp. The reason is that in the time taken between getty(8) exiting and ppp(8) starting, the client-side ppp(8) starts sending Link Control Protocol (LCP) packets. Because ECHO is still switched on for the port on the server, the client ppp(8) sees these packets “reflect” back.

One part of the LCP negotiation is to establish a magic number for each side of the link so that “reflections” can be detected. The protocol says that when the peer tries to negotiate the same magic number, a NAK should be sent and a new magic number should be chosen. During the period that the server port has ECHO turned on, the client ppp(8) sends LCP packets, sees the same magic in the reflected packet and NAKs it. It also sees the NAK reflect (which also means ppp(8) must change its magic). This produces a potentially enormous number of magic number changes, all of which are happily piling into the server's tty buffer. As soon as ppp(8) starts on the server, it is flooded with magic number changes and almost immediately decides it has tried enough to negotiate LCP and gives up. Meanwhile, the client, who no longer sees the reflections, becomes happy just in time to see a hangup from the server.

This can be avoided by allowing the peer to start negotiating with the following line in ppp.conf:

set openmode passive

This tells ppp(8) to wait for the server to initiate LCP negotiations. Some servers however may never initiate negotiations. If this is the case, you can do something like:

set openmode active 3

This tells ppp(8) to be passive for 3 seconds, and then to start sending LCP requests. If the peer starts sending requests during this period, ppp(8) will immediately respond rather than waiting for the full 3 second period.

13.12. LCP negotiations continue until the connection is closed. What is wrong?

There is currently an implementation mis-feature in ppp(8) where it does not associate LCP, CCP & IPCP responses with their original requests. As a result, if one PPP implementation is more than 6 seconds slower than the other side, the other side will send two additional LCP configuration requests. This is fatal.

Consider two implementations, A and B. A starts sending LCP requests immediately after connecting and B takes 7 seconds to start. When B starts, A has sent 3 LCP REQs. We are assuming the line has ECHO switched off, otherwise we would see magic number problems as described in the previous section. B sends a REQ, then an ACK to the first of A's REQs. This results in A entering the OPENED state and sending an ACK (the first) back to B. In the meantime, B sends back two more ACKs in response to the two additional REQs sent by A before B started up. B then receives the first ACK from A and enters the OPENED state. A receives the second ACK from B and goes back to the REQ-SENT state, sending another (fourth) REQ as per the RFC. It then receives the third ACK and enters the OPENED state. In the meantime, B receives the fourth REQ from A, resulting in it reverting to the ACK-SENT state and sending another (second) REQ and (fourth) ACK as per the RFC. A gets the REQ, goes into REQ-SENT and sends another REQ. It immediately receives the following ACK and enters OPENED.

This goes on until one side figures out that they are getting nowhere and gives up.

The best way to avoid this is to configure one side to be passive — that is, make one side wait for the other to start negotiating. This can be done with the following command:

set openmode passive

Care should be taken with this option. You should also use this command to limit the amount of time that ppp(8) waits for the peer to begin negotiations:

set stopped N

Alternatively, the following command (where N is the number of seconds to wait before starting negotiations) can be used:

set openmode active N

Check the manual page for details.

13.13. Why does ppp(8) lock up when I shell out to test it?

When you execute the shell or ! command, ppp(8) executes a shell (or if you have passed any arguments, ppp(8) will execute those arguments). The ppp program will wait for the command to complete before continuing. If you attempt to use the PPP link while running the command, the link will appear to have frozen. This is because ppp(8) is waiting for the command to complete.

To execute commands like this, use !bg instead. This will execute the given command in the background, and ppp(8) can continue to service the link.

13.14. Why does ppp(8) over a null-modem cable never exit?

There is no way for ppp(8) to automatically determine that a direct connection has been dropped. This is due to the lines that are used in a null-modem serial cable. When using this sort of connection, LQR should always be enabled with the following line:

enable lqr

LQR is accepted by default if negotiated by the peer.

13.15. Why does ppp(8) dial for no reason in -auto mode?

If ppp(8) is dialing unexpectedly, you must determine the cause, and set up Dial filters (dfilters) to prevent such dialing.

To determine the cause, use the following line:

set log +tcp/ip

This will log all traffic through the connection. The next time the line comes up unexpectedly, you will see the reason logged with a convenient timestamp next to it.

You can now disable dialing under these circumstances. Usually, this sort of problem arises due to DNS lookups. To prevent DNS lookups from establishing a connection (this will not prevent ppp(8) from passing the packets through an established connection), use the following:

set dfilter 1 deny udp src eq 53
set dfilter 2 deny udp dst eq 53
set dfilter 3 permit 0/0 0/0

This is not always suitable, as it will effectively break your demand-dial capabilities — most programs will need a DNS lookup before doing any other network related things.

In the DNS case, you should try to determine what is actually trying to resolve a host name. A lot of the time, sendmail(8) is the culprit. You should make sure that you tell sendmail not to do any DNS lookups in its configuration file. See the section on using email with a dialup connection in the FreeBSD Handbook for details on how to create your own configuration file and what should go into it. You may also want to add the following line to your .mc file:

define(`confDELIVERY_MODE', `d')dnl

This will make sendmail queue everything until the queue is run (usually, sendmail is run with -bd -q30m, telling it to run the queue every 30 minutes) or until a sendmail -q is done (perhaps from your ppp.linkup).

13.16. What do these CCP errors mean?

I keep seeing the following errors in my log file:

CCP: CcpSendConfigReq
CCP: Received Terminate Ack (1) state = Req-Sent (6)

This is because ppp(8) is trying to negotiate Predictor1 compression, and the peer does not want to negotiate any compression at all. The messages are harmless, but if you wish to remove them, you can disable Predictor1 compression locally too:

disable pred1

13.17. Why does ppp(8) not log my connection speed?

To log all lines of your modem “conversation”, you must enable the following:

set log +connect

This will make ppp(8) log everything up until the last requested “expect” string.

If you wish to see your connect speed and are using PAP or CHAP (and therefore do not have anything to “chat” after the CONNECT in the dial script — no set login script), you must make sure that you instruct ppp(8) to “expect” the whole CONNECT line, something like this:

set dial "ABORT BUSY ABORT NO\\sCARRIER TIMEOUT 4 \
  \"\" ATZ OK-ATZ-OK ATDT\\T TIMEOUT 60 CONNECT \\c \\n"

Here, we get our CONNECT, send nothing, then expect a line-feed, forcing ppp(8) to read the whole CONNECT response.

13.18. Why does ppp(8) ignore the \ character in my chat script?

The ppp utility parses each line in your config files so that it can interpret strings such as set phone "123 456 789" correctly and realize that the number is actually only one argument. To specify a " character, you must escape it using a backslash (\).

When the chat interpreter parses each argument, it re-interprets the argument to find any special escape sequences such as \P or \T (see the manual page). As a result of this double-parsing, you must remember to use the correct number of escapes.

If you wish to actually send a \ character to (say) your modem, you would need something like:

set dial "\"\" ATZ OK-ATZ-OK AT\\\\X OK"

It will result in the following sequence:

ATZ
OK
AT\X
OK

Or:

set phone 1234567
set dial "\"\" ATZ OK ATDT\\T"

It will result in the following sequence:

ATZ
OK
ATDT1234567
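
The double parsing described above can be modelled in a few lines of Python. This is only a simplified sketch of the two passes, not ppp(8)'s actual parser: the function names are hypothetical, the first pass handles just the contents of one quoted argument, and the second pass knows only the \T and \\ escapes.

```python
def config_pass(quoted: str) -> str:
    # First pass (config file parser): a backslash escapes the character after it.
    out = []
    i = 0
    while i < len(quoted):
        if quoted[i] == "\\" and i + 1 < len(quoted):
            out.append(quoted[i + 1])
            i += 2
        else:
            out.append(quoted[i])
            i += 1
    return "".join(out)


def chat_pass(arg: str, phone: str) -> str:
    # Second pass (chat interpreter): expand the escapes this sketch knows about.
    out = []
    i = 0
    while i < len(arg):
        if arg[i] == "\\" and i + 1 < len(arg):
            nxt = arg[i + 1]
            if nxt == "T":
                out.append(phone)        # \T becomes the phone number
            elif nxt == "\\":
                out.append("\\")         # \\ becomes one literal backslash
            else:
                out.append("\\" + nxt)   # anything else is left untouched here
            i += 2
        else:
            out.append(arg[i])
            i += 1
    return "".join(out)


def what_the_modem_sees(config_text: str, phone: str = "") -> str:
    return chat_pass(config_pass(config_text), phone)


# Raw strings hold exactly the characters typed in ppp.conf.
print(what_the_modem_sees(r"AT\\\\X"))                   # prints AT\X
print(what_the_modem_sees(r"ATDT\\T", phone="1234567"))  # prints ATDT1234567
```

Each pass removes one level of escaping, which is why a single backslash on the wire needs four in the configuration file.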

13.19. Why does ppp(8) get a “Segmentation fault”, but I see no ppp.core

The ppp utility (or any other program for that matter) should never dump core. Because ppp(8) runs setuid (with an effective user ID of 0), the operating system will not write a core image of ppp(8) to disk before terminating it. If, however, ppp(8) is actually terminating due to a segmentation violation or some other signal that normally causes core to be dumped, and you are sure you are using the latest version (see the start of this section), then you should install the system sources and do the following:

# cd /usr/src/usr.sbin/ppp
# echo STRIP= >> /etc/make.conf
# echo CFLAGS+=-g >> /etc/make.conf
# make install clean

You will now have a debuggable version of ppp(8) installed. You will have to be root to run ppp(8) as all of its privileges have been revoked. When you start ppp(8), take careful note of your current directory at the time.

Now, if and when ppp(8) receives the segmentation violation, it will dump a core file called ppp.core. You should then do the following:

% su
# gdb /usr/sbin/ppp ppp.core
(gdb) bt
.....
(gdb) f 0
....
(gdb) i args
....
(gdb) l
.....

All of this information should be given alongside your question, making it possible to diagnose the problem.

If you are familiar with gdb(1), you may wish to find out some other bits and pieces such as what actually caused the dump or the addresses and values of the relevant variables.

13.20. Why does the process that forces a dial in -auto mode never connect?

This was a known problem with ppp(8) set up to negotiate a dynamic local IP number with the peer in -auto mode. It was fixed a long time ago; search the manual page for iface.

The problem was that when that initial program calls connect(2), the IP number of the tun(4) interface is assigned to the socket endpoint. The kernel creates the first outgoing packet and writes it to the tun(4) device. ppp(8) then reads the packet and establishes a connection. If, as a result of ppp(8)'s dynamic IP assignment, the interface address is changed, the original socket endpoint will be invalid. Any subsequent packets sent to the peer will usually be dropped. Even if they are not, any responses will not route back to the originating machine as the IP number is no longer owned by that machine.

There are several theoretical ways to approach this problem. It would be nicest if the peer would re-assign the same IP number if possible. The current version of ppp(8) does this, but most other implementations do not.

The easiest method from our side would be to never change the tun(4) interface IP number, but instead to change all outgoing packets so that the source IP number is changed from the interface IP to the negotiated IP on the fly. This is essentially what the iface-alias option in the latest version of ppp(8) is doing (with the help of libalias(3) and ppp(8)'s -nat switch) — it is maintaining all previous interface addresses and NATing them to the last negotiated address.

Another alternative (and probably the most reliable) would be to implement a system call that changes all bound sockets from one IP to another. ppp(8) would use this call to modify the sockets of all existing programs when a new IP number is negotiated. The same system call could be used by DHCP clients when they are forced to call the bind() function for their sockets.

Yet another possibility is to allow an interface to be brought up without an IP number. Outgoing packets would be given an IP number of 255.255.255.255 up until the first SIOCAIFADDR ioctl(2) is done. This would result in fully binding the socket. It would be up to ppp(8) to change the source IP number, but only if it is set to 255.255.255.255, and only the IP number and IP checksum would need to change. This, however, is a bit of a hack, as the kernel would be sending bad packets to an improperly configured interface, on the assumption that some other mechanism is capable of fixing things retrospectively.

13.21. Why do most games not work with the -nat switch?

The reason games and the like do not work when libalias(3) is in use is that the machine on the outside will try to open a connection or send (unsolicited) UDP packets to the machine on the inside. The NAT software does not know that it should send these packets to the interior machine.

To make things work, make sure that the only thing running is the software that you are having problems with, then either run tcpdump(1) on the tun(4) interface of the gateway or enable ppp(8) TCP/IP logging (set log +tcp/ip) on the gateway.

When you start the offending software, you should see packets passing through the gateway machine. When something comes back from the outside, it will be dropped (that is the problem). Note the port number of these packets then shut down the offending software. Do this a few times to see if the port numbers are consistent. If they are, then the following line in the relevant section of /etc/ppp/ppp.conf will make the software functional:

nat port proto internalmachine:port port

where proto is either tcp or udp, internalmachine is the machine that you want the packets to be sent to and port is the destination port number of the packets.

You will not be able to use the software on other machines without changing the above command, and running the software on two internal machines at the same time is out of the question — after all, the outside world is seeing your entire internal network as being just a single machine.

If the port numbers are not consistent, there are three more options:

  1. Submit support in libalias(3). Examples of “special cases” can be found in /usr/src/sys/netinet/libalias/alias_*.c (alias_ftp.c is a good prototype). This usually involves reading certain recognized outgoing packets, identifying the instruction that tells the outside machine to initiate a connection back to the internal machine on a specific (random) port and setting up a “route” in the alias table so that the subsequent packets know where to go.

    This is the most difficult solution, but it is the best and will make the software work with multiple machines.

  2. Use a proxy. The application may support socks5 for example, or may have a “passive” option that avoids ever requesting that the peer open connections back to the local machine.

  3. Redirect everything to the internal machine using nat addr. This is the sledge-hammer approach.

13.22. What are FCS errors?

FCS stands for Frame Check Sequence. Each PPP packet has a checksum attached to ensure that the data being received is the data being sent. If the FCS of an incoming packet is incorrect, the packet is dropped and the HDLC FCS count is increased. The HDLC error values can be displayed using the show hdlc command.
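
To make the check concrete, here is a minimal sketch of the 16-bit FCS calculation. It assumes the default FCS-16 from RFC 1662 (PPP in HDLC-like framing) is in use; the text above does not say which FCS size was negotiated, and the function names are illustrative rather than taken from the ppp(8) sources.

```python
GOOD_FCS = 0xF0B8  # value left over when a frame and its FCS are both intact


def fcs16(data: bytes, fcs: int = 0xFFFF) -> int:
    # Bit-by-bit form of the RFC 1662 FCS-16 (reflected polynomial 0x8408).
    for byte in data:
        fcs ^= byte
        for _ in range(8):
            if fcs & 1:
                fcs = (fcs >> 1) ^ 0x8408
            else:
                fcs >>= 1
    return fcs


def append_fcs(frame: bytes) -> bytes:
    # The sender transmits the complemented FCS, least significant octet first.
    value = fcs16(frame) ^ 0xFFFF
    return frame + bytes([value & 0xFF, value >> 8])


def fcs_ok(frame_with_fcs: bytes) -> bool:
    # The receiver runs the same calculation over the data plus the FCS octets.
    return fcs16(frame_with_fcs) == GOOD_FCS


frame = append_fcs(b"\xff\x03\xc0\x21example payload")
damaged = bytes([frame[0] ^ 0x01]) + frame[1:]  # flip one bit, as line noise might
print(fcs_ok(frame), fcs_ok(damaged))           # prints True False
```

A frame that fails this check is the kind of packet that is dropped and counted as an FCS error.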

If your link is bad (or if your serial driver is dropping packets), you will see the occasional FCS error. This is not usually worth worrying about although it does slow down the compression protocols substantially. If you have an external modem, make sure your cable is properly shielded from interference — this may eradicate the problem.

If your link freezes as soon as you have connected and you see a large number of FCS errors, this may be because your link is not 8-bit clean. Make sure your modem is not using software flow control (XON/XOFF). If your datalink must use software flow control, use the command set accmap 0x000a0000 to tell ppp(8) to escape the ^Q and ^S characters.
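
The value 0x000a0000 is not arbitrary. In the async control character map, bit N of the 32-bit value stands for control character N, so the map for XON (^Q, 0x11) and XOFF (^S, 0x13) has bits 17 and 19 set. The sketch below works this out; the helper name accmap is made up for the example.

```python
def accmap(*control_chars: int) -> int:
    # Set bit N of the 32-bit map for each control character N (0x00 to 0x1f).
    mask = 0
    for ch in control_chars:
        if not 0 <= ch < 32:
            raise ValueError("only control characters 0x00-0x1f have a map bit")
        mask |= 1 << ch
    return mask


XON = 0x11   # ^Q
XOFF = 0x13  # ^S
print(f"set accmap 0x{accmap(XON, XOFF):08x}")  # prints set accmap 0x000a0000
```

The same arithmetic gives the map for any other control characters that a link cannot pass transparently.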

Another reason for seeing too many FCS errors may be that the remote end has stopped talking PPP. You may want to enable async logging at this point to determine if the incoming data is actually a login or shell prompt. If you have a shell prompt at the remote end, it is possible to terminate ppp(8) without dropping the line by using close lcp; a following term command will then reconnect you to the shell on the remote machine.

If nothing in your log file indicates why the link might have been terminated, you should ask the remote administrator (your ISP?) why the session was terminated.

13.23. None of this helps — I am desperate! What can I do?

If all else fails, send as much information as you can, including your config files, how you are starting ppp(8), the relevant parts of your log file and the output of netstat -rn (before and after connecting) to the FreeBSD general questions mailing list and someone should point you in the right direction.