I wrote a program that communicates with a serial port using termios. The program reads the serial port in non-blocking mode and writes a response to the serial port once it has read data. If no data is read from the serial port, the program does other things, and on the next loop it reads the serial port again.
Now the problem is that after some time, maybe several minutes or maybe several hours, the serial port stops responding to my program. Even if I execute echo 'HB\n' > /dev/ttyUSB0 (to which the serial port should respond 'HACK'), it does not respond any more.
I don't even know when the serial port goes 'dead'. I don't have any clue; it dies at unpredictable times.
Here is my configuration:
/// set local mode options
//tminfo.c_lflag |= ICANON;
tminfo.c_lflag &= ~(ICANON | ECHO | ECHOE | ISIG);
/// set control mode options
tminfo.c_cflag |= (CLOCAL | CREAD);
tminfo.c_cflag |= HUPCL;
// set hardware flow control
tminfo.c_cflag &= ~CRTSCTS;
// set how many bits in a character
tminfo.c_cflag &= ~CSIZE;
tminfo.c_cflag |= CS8;
// set parity mode (default to odd validation), this option (PARENB) will both enable input and output parity checking
tminfo.c_cflag &= ~PARENB; // we don't need parity checking now
/// set input mode options
// set input parity checking
tminfo.c_iflag &= ~INPCK;
tminfo.c_cflag &= ~CSTOPB;
/// set output mode options
tminfo.c_oflag &= ~OPOST;
tminfo.c_cc[VMIN] = 1;
tminfo.c_cc[VTIME] = 1;
/// set line speed, defaults to 38400bps, both for input and output
// this call will set both input and output speed
cfsetspeed(&tminfo, B38400);
It's hard to debug the serial port in this situation. I really can't figure out what causes the serial port to go 'dead'.
What are the possible reasons? Any help will be appreciated!
When the serial port is "dead", its configuration is:
speed 38400 baud; rows 0; columns 0; line = 0;
intr = ^C; quit = ^\; erase = ^?; kill = ^U; eof = ^D; eol = <undef>; eol2 = <undef>; swtch = <undef>;
start = ^Q; stop = ^S; susp = ^Z; rprnt = ^R; werase = ^W; lnext = ^V; flush = ^O; min = 1; time = 1;
-parenb -parodd cs8 hupcl -cstopb cread clocal -crtscts
-ignbrk -brkint -ignpar -parmrk -inpck -istrip -inlcr -igncr -icrnl -ixon -ixoff -iuclc -ixany -imaxbel
-iutf8
-opost -olcuc -ocrnl onlcr -onocr -onlret -ofill -ofdel nl0 cr0 tab0 bs0 vt0 ff0
-isig -icanon -iexten -echo -echoe echok -echonl -noflsh -xcase -tostop -echoprt echoctl echoke
/proc/tty/driver/ar933x-uart
I have noticed that the tx and rx field values do not change while my program is running, even if I write to the serial port manually.
serinfo:1.0 driver revision:
0: uart:AR933X UART mmio:0x18020000 irq:11 tx:169 rx:0 RTS|DTR|CD
/proc/tty/driver/serial
serinfo:1.0 driver revision:
0: uart:unknown port:00000000 irq:0
1: uart:unknown port:00000000 irq:0
2: uart:unknown port:00000000 irq:0
3: uart:unknown port:00000000 irq:0
4: uart:unknown port:00000000 irq:0
5: uart:unknown port:00000000 irq:0
6: uart:unknown port:00000000 irq:0
7: uart:unknown port:00000000 irq:0
8: uart:unknown port:00000000 irq:0
9: uart:unknown port:00000000 irq:0
10: uart:unknown port:00000000 irq:0
11: uart:unknown port:00000000 irq:0
12: uart:unknown port:00000000 irq:0
13: uart:unknown port:00000000 irq:0
14: uart:unknown port:00000000 irq:0
15: uart:unknown port:00000000 irq:0
/proc/tty/driver/usbserial
usbserinfo:1.0 driver:2.0
0: module:pl2303 name:"pl2303" vendor:067b product:2303 num_ports:1 port:1 path:usb-ehci-platform-1
And below is the more detailed code:
int Serial::openup(const char *devfile) {
if(-1 == (devfds = open(devfile, O_RDWR | O_NOCTTY ))) {
perror(strerror(errno));
return -1;
}
// set device file io mode to nonblock
//int oldflags = fcntl(devfds, F_GETFL);
//fcntl(devfds, F_SETFL, oldflags | O_NONBLOCK);
// get terminal's attributes
tcgetattr(devfds, &tminfo);
memset(&tminfo, 0, sizeof(struct termios));
/// set local mode options ///
//tminfo.c_lflag |= ICANON;
tminfo.c_lflag &= ~(ICANON | ECHO | ECHOE | ISIG | IEXTEN);
/// set control mode options ///
tminfo.c_cflag |= (CLOCAL | CREAD);
// disable hardware flow control
tminfo.c_cflag &= ~CRTSCTS;
// set how many bits in a character
tminfo.c_cflag &= ~CSIZE;
tminfo.c_cflag |= CS8;
// we don't need parity checking
tminfo.c_cflag &= ~PARENB;
tminfo.c_cflag &= ~CSTOPB;
/// set input mode options ///
// disable input parity checking, this
tminfo.c_iflag &= ~(INPCK | PARMRK | IGNBRK | BRKINT | ISTRIP
| INLCR | IGNCR | ICRNL | IXON);
/// set output mode options ///
//tminfo.c_oflag |= (OPOST | ONLCR);
tminfo.c_oflag &= ~OPOST; // ***
tminfo.c_cc[VMIN] = 0; // ***
tminfo.c_cc[VTIME] = 1; // ***
/// set line speed, defaults to 38400bps, both for input and output ///
// this call will set both input and output speed
cfsetspeed(&tminfo, B38400);
if(-1 == tcsetattr(devfds, TCSANOW, &tminfo)) {
perror(strerror(errno));
return -1;
}
return 0;
}
int Serial::serve() {
char buffer[256] = {0};
/*
struct timeval timeo;
timeo.tv_sec = 0;
timeo.tv_usec = 2 * 1000;
select(0, NULL, NULL, NULL, &timeo);
*/
//print_trace("ready to read data from serial port.\n");
int read_count = 0;
if((read_count = read_line(devfds, buffer, 256))) {
print_trace("read line: %d bytes, %s\n", read_count, buffer);
if(0 == strncmp(buffer, "S", 1)) {
// do some operation
} else if(0 == strncmp(buffer, "N", 1)) {
// do some operation
}
} else {
//print_trace("read no data.\n");
}
// TODO: test only, for find out the reason of serial port 'dead' problem
tcflush(devfds, TCIFLUSH);
}
There is another function that other modules use to write to the serial port:
int Serial::write_to_zigbee_co(const char *msg) {
int write_count = 0;
int len = strlen(msg);
struct timeval timeo;
timeo.tv_sec = 0;
timeo.tv_usec = 20 * 1000;
select(0, NULL, NULL, NULL, &timeo);
tcflush(devfds, TCOFLUSH);
if(len == (write_count = write(devfds, msg, len))) {
} else {
tcflush(devfds, TCOFLUSH);
}
return write_count;
}
Serial ports do not just suddenly "die".
The typical reason for a suddenly "dead" or nonresponsive serial link is unwanted flow control. You seem to have HW flow-control disabled, but software flow-control has not been disabled and raw mode has not been properly configured.
Your initialization needs
tminfo.c_iflag &= ~(IGNBRK | BRKINT | PARMRK | ISTRIP | INLCR | IGNCR | ICRNL | IXON | IXOFF);
Also clear the IEXTEN flag in c_lflag, and clear the PARENB flag in c_cflag.
Consider using the cfmakeraw() function to simplify the initialization for raw mode.
Raw mode
cfmakeraw() sets the terminal to something like the "raw" mode of the old Version 7
terminal driver: input is available character by character, echoing is disabled, and
all special processing of terminal input and output characters is disabled. The
terminal attributes are set as follows:
termios_p->c_iflag &= ~(IGNBRK | BRKINT | PARMRK | ISTRIP
| INLCR | IGNCR | ICRNL | IXON);
termios_p->c_oflag &= ~OPOST;
termios_p->c_lflag &= ~(ECHO | ECHONL | ICANON | ISIG | IEXTEN);
termios_p->c_cflag &= ~(CSIZE | PARENB);
termios_p->c_cflag |= CS8;
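As a rough illustration of the advice above, here is a minimal sketch of an initialization built around cfmakeraw(). The helper name configure_raw_38400() is made up for this example, and the VMIN/VTIME values are an assumption (a read that waits at most 0.1 s); adjust them to your protocol. Note that cfmakeraw() does not clear IXOFF, so the sketch clears the software flow-control flags explicitly.

```c
#define _DEFAULT_SOURCE   /* exposes cfmakeraw() and CRTSCTS on glibc */
#include <stdio.h>
#include <termios.h>
#include <unistd.h>

/* Hypothetical helper: put an already-open tty into raw 8N1 mode at
 * 38400 baud with no flow control.
 * Returns 0 on success, -1 if any termios call fails. */
int configure_raw_38400(int fd)
{
    struct termios tio;

    /* Start from the current settings; do NOT zero the struct afterwards. */
    if (tcgetattr(fd, &tio) == -1) {
        perror("tcgetattr");
        return -1;
    }

    cfmakeraw(&tio);                          /* raw mode, CS8, no parity */
    tio.c_iflag &= ~(IXON | IXOFF | IXANY);   /* no software flow control */
    tio.c_cflag &= ~(CRTSCTS | CSTOPB);       /* no HW flow control, 1 stop bit */
    tio.c_cflag |= CLOCAL | CREAD;            /* ignore modem lines, enable RX */
    tio.c_cc[VMIN]  = 0;                      /* assumption: timed read ... */
    tio.c_cc[VTIME] = 1;                      /* ... that waits up to 0.1 s */

    if (cfsetispeed(&tio, B38400) == -1 || cfsetospeed(&tio, B38400) == -1) {
        perror("cfset[io]speed");
        return -1;
    }
    if (tcsetattr(fd, TCSANOW, &tio) == -1) {
        perror("tcsetattr");
        return -1;
    }
    return 0;
}
```

You would call this once, right after a successful open() of the device file, and treat a -1 return as fatal for that port.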
ADDENDUM
Your revised termios settings look okay.
The next step I would try would be to determine which side of the serial link is at fault. Is one end not receiving or is one end not transmitting?
You could try using the metrics maintained by the serial port drivers. If both sides are running Linux, then you should inspect the files in /proc/tty/driver. The serial port drivers will report the receive and transmit byte counts on each port. Compare the Rx and Tx counts before the test and then after the failure.
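If you want to automate that before/after comparison, something like the following sketch could pull the counters out of one line of such a file. The function name parse_uart_counts() is hypothetical, and it assumes the " tx:N rx:N" layout shown in the ar933x-uart output in the question; other drivers may format their lines differently or not report counts at all.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: extract the "tx:" and "rx:" counters from one line
 * of a /proc/tty/driver/<name> file.
 * Returns 0 on success, -1 if either field is missing or malformed. */
int parse_uart_counts(const char *line, unsigned long *tx, unsigned long *rx)
{
    const char *t = strstr(line, " tx:");
    const char *r = strstr(line, " rx:");

    if (t == NULL || r == NULL)
        return -1;                 /* e.g. an unused "uart:unknown" entry */
    if (sscanf(t, " tx:%lu", tx) != 1 || sscanf(r, " rx:%lu", rx) != 1)
        return -1;
    return 0;
}
```

Read the file once before the test and once after the failure, and compare the two pairs of counters: a counter that stopped advancing points at the direction that failed.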
If you cannot get any stats from the CC2530 side, then a serial link monitor may be necessary. Besides a dedicated test instrument, you could set one up using a PC with two serial ports. Connect Port A to the host and Port B to the CC2530, so that this PC is a "man in the middle". You would then have to write a program to retransmit the data received on Port A over to Port B, and the data received on Port B over to Port A.
This data that is retransmitted (both channels) also has to be displayed or logged. The purpose is to determine which side of the serial link is failing. Once that has been established, then you have to figure out if it is a receive or transmit issue.
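The core of such a monitor program could look roughly like this. It is only a sketch: relay_once() is a made-up name, both descriptors are assumed to be already opened and configured for raw mode, and the hex log format is arbitrary.

```c
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper: forward whatever one read() returns on `from`
 * over to `to`, logging the bytes in hex with a direction tag.
 * Returns the number of bytes forwarded, 0 if read() returned no data,
 * or -1 on a read or write error. */
ssize_t relay_once(int from, int to, const char *tag, FILE *log)
{
    unsigned char buf[256];
    ssize_t n = read(from, buf, sizeof buf);

    if (n <= 0)
        return n;

    /* Log first, so the data is recorded even if the forward fails. */
    fprintf(log, "%s", tag);
    for (ssize_t i = 0; i < n; i++)
        fprintf(log, " %02x", buf[i]);
    fputc('\n', log);
    fflush(log);

    /* Forward everything, continuing after short writes. */
    for (ssize_t off = 0; off < n; ) {
        ssize_t w = write(to, buf + off, (size_t)(n - off));
        if (w == -1)
            return -1;
        off += w;
    }
    return n;
}
```

A complete monitor would call this for both directions (Port A to Port B and Port B to Port A) from a poll() or select() loop, so that neither direction blocks the other.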
OR you could post more of your code (the complete open(), initialization routines, and read & write logic) for everyone to desk check.
The code you have posted has some issues.
Initialization code
// get terminal's attributes
tcgetattr(devfds, &tminfo);
memset(&tminfo, 0, sizeof(struct termios));
This code obtains the termios data, and then zeroes it out!
You need to remove the memset() statement.
Read code
if((read_count = read_line(devfds, buffer, 256))) {
The serial port has been initialized for non-canonical (aka raw) mode, but here is a read_line(), which is a canonical input operation.
I don't know exactly what happens when you set up raw mode and try to read lines, but if the read operation ever hangs, I would not be surprised.
You need to evaluate the type of data that will be exchanged between these two devices over this serial link.
Is every message composed of ASCII text with each line terminated by a newline character?
If "yes", then you can use canonical mode and read_line().
Otherwise you should use non-canonical mode and the read() syscall, and write code to parse the received data.
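For the non-canonical route, the parsing code can be quite small. The sketch below assumes newline-terminated messages like the 'HB\n' in the question; struct line_buf and line_buf_feed() are names invented for this example, and the 256-byte limit simply mirrors the buffer size in the posted code.

```c
#include <stddef.h>

/* Hypothetical accumulator for bytes returned by read() in
 * non-canonical mode. Initialize with len = 0. */
struct line_buf {
    char   data[256];
    size_t len;
};

/* Feed one received byte.
 * Returns 1 when a complete line is ready in lb->data as a NUL-terminated
 * string (without the trailing '\n'), or 0 if more bytes are needed.
 * Bytes beyond the buffer capacity are dropped, truncating the line. */
int line_buf_feed(struct line_buf *lb, char c)
{
    if (c == '\n') {
        lb->data[lb->len] = '\0';
        lb->len = 0;               /* the next byte starts a new line */
        return 1;
    }
    if (lb->len < sizeof lb->data - 1)
        lb->data[lb->len++] = c;
    return 0;
}
```

Each byte that read() returns is passed to line_buf_feed(); whenever it returns 1, lb->data holds one complete message to act on. A partial line simply stays in the accumulator until the rest arrives on a later read().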
if((read_count = read_line(devfds, buffer, 256))) {
...
} else {
//print_trace("read no data.\n");
}
When read_line() returns an error (-1), this code will treat it as a good return, and try to process stale or garbage data in the receive buffer. If there have been any read errors they have been undetected and never reported.
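One way to keep the three outcomes of a read apart (data, no data, error) is a thin wrapper like the sketch below. read_checked() is a hypothetical name, and treating EAGAIN and EINTR as "no data this time" is an assumption that suits a polling loop like the one described in the question.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical wrapper around read().
 * Returns the byte count (> 0) when data was read, 0 when nothing was read
 * (timeout, EAGAIN/EWOULDBLOCK, or EINTR), or -1 on a real error, which is
 * reported on stderr. */
ssize_t read_checked(int fd, void *buf, size_t cap)
{
    ssize_t n = read(fd, buf, cap);

    if (n == -1) {
        if (errno == EAGAIN || errno == EWOULDBLOCK || errno == EINTR)
            return 0;              /* nothing to process on this pass */
        fprintf(stderr, "read: %s\n", strerror(errno));
        return -1;
    }
    return n;
}
```

The caller then processes the buffer only when the return value is greater than zero, and handles -1 separately instead of falling into the "good data" branch.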
tcflush(devfds, TCIFLUSH);
IMO you are misusing tcflush(). There may be some rare cases, but normally you should not throw away any data until you have actually parsed it and know that it is garbage data. You should delete this tcflush() statement.
Write code
select(0, NULL, NULL, NULL, &timeo);
Performing a time delay prior to the write is a questionable operation in userspace. In a multitasking environment with scheduling and preemption to disrupt actual execution time, userspace programs rarely need to add such a fixed delay to every write() syscall.
tcflush(devfds, TCOFLUSH);
Another questionable (mis)use of tcflush(). This should be removed.
if(len == (write_count = write(devfds, msg, len))) {
} else {
tcflush(devfds, TCOFLUSH);
}
Another questionable/improper use of tcflush().
This should be replaced with better recovery code. It's unlikely that a short write will occur; the most likely return values are the full write count or an error (-1). You need to check for an error return (-1) and inspect the errno variable. (You need to do this for other syscalls too, such as tcgetattr(). You need to read the man page of each syscall you use to learn what can be returned.)
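As one possible shape for that recovery code, here is a sketch of a write helper that checks every return value. write_all() is a name made up for this example; retrying on EINTR and continuing after a short write are common conventions, not something specific to this driver.

```c
#include <errno.h>
#include <unistd.h>

/* Hypothetical helper: write all `len` bytes of `buf` to `fd`,
 * continuing after short writes and retrying when interrupted.
 * Returns 0 on success, or -1 on error with errno left as set by write(). */
int write_all(int fd, const char *buf, size_t len)
{
    size_t off = 0;

    while (off < len) {
        ssize_t n = write(fd, buf + off, len - off);
        if (n == -1) {
            if (errno == EINTR)
                continue;          /* interrupted by a signal: try again */
            return -1;             /* real error: caller inspects errno */
        }
        off += (size_t)n;
    }
    return 0;
}
```

In write_to_zigbee_co() this would replace the write()/tcflush() pair: call write_all(devfds, msg, strlen(msg)) and report strerror(errno) when it returns -1.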