threads and file descriptors
I'm seeing a fairly reproducible problem in which the same file descriptor is being used concurrently in two threads, but _unintentionally_. I understand that open file descriptors are shared by all threads, but somehow it looks like calling socket() in one thread is able to return a file descriptor that's already in use by another thread.

Here's an example (a little complicated, but bear with me): this is just stderr debugging output from my app, which is running 3 threads at this time. The output is in chronological order. The thread# is in [] to the left; my comments are prefixed with >>> :

[01] using fd 16 for /tmp/ibwd.1
[01] state-transition: 2->3
[01] writing to left ( 0 bytes left): 250 Ok: queued as 29C9E4637
[01] state 3; left event LEFT_RSET: [6 bytes] RSET
[01] all done <message-id>: 2995 bytes
[01] writing to right ( 0 bytes left): RSET
[01] state-transition: 3->1
[01] state 1; right event RIGHT_354: [37 bytes] 354
[01] using fd 14 for /tmp/ibwd.1
>>> above, thread #1 has opened a file and gets fd 14 for it.
[01] state-transition: 1->2
[02] crm114[25958, 9 categories] stage done, 0.509 seconds.
[02] connected to juggernautd(14) at localhost:787
>>> above, thread #2 opens a tcp connection to another app, and ALSO uses fd 14.
[01] state 2; left event LEFT_EOM: [3 bytes] .
write_string(): unable to write to 14; error 9: Bad file descriptor
>>> above, thread #1 chokes when trying to write to fd 14.
Each thread is independent of the others, except during allocation of work. Are there any special precautions I need to take when doing IO in a multi-threaded environment? /Per Jessen, Zürich
On Sun, 07 May 2006 11:06:10 +0200
Per Jessen
I'm seeing a fairly reproducible problem in which the same file descriptor is being used concurrently in two threads, but _unintentionally_.
I understand that open file descriptors are shared by all threads, but somehow it looks like calling socket() in one thread is able to return a file descriptor that's already in use by another thread.
Here's an example (a little complicated, but bear with me):
this is just stderr debugging output from my app which is running 3 threads at this time. The output is in chronological order. The thread# is in [] to the left, my comments are prefixed with >>> :
[01] using fd 16 for /tmp/ibwd.1
[01] state-transition: 2->3
[01] writing to left ( 0 bytes left): 250 Ok: queued as 29C9E4637
[01] state 3; left event LEFT_RSET: [6 bytes] RSET
[01] all done <message-id>: 2995 bytes
[01] writing to right ( 0 bytes left): RSET
[01] state-transition: 3->1
[01] state 1; right event RIGHT_354: [37 bytes] 354
[01] using fd 14 for /tmp/ibwd.1
>>> above, thread #1 has opened a file and gets fd 14 for it.
[01] state-transition: 1->2
[02] crm114[25958, 9 categories] stage done, 0.509 seconds.
[02] connected to juggernautd(14) at localhost:787
>>> above, thread #2 opens a tcp connection to another app, and ALSO uses fd 14.
[01] state 2; left event LEFT_EOM: [3 bytes] .
write_string(): unable to write to 14; error 9: Bad file descriptor
>>> above, thread #1 chokes when trying to write to fd 14.
Each thread is independent of the others, except during allocation of work. Are there any special precautions I need to take when doing IO in a multi-threaded environment?

I have done quite a bit of this type of programming in the past. There should be NO duplication of file descriptors, but the system will reuse file descriptors. One of the types of programs I used to write a few years ago was where I would spawn the maximum number of threads, and each thread would open and write to the same file using locking. In other words, open(2) (and socket(2), accept(2)) should NEVER return a file descriptor of an already open file.
--
Jerry Feldman
Jerry Feldman wrote:
I have done quite a bit of this type of programming in the past. There should be NO duplication of file descriptors, but the system will reuse file descriptors.
Yep, that I expected.
One of the types of programs I used to write a few years ago was where I would spawn the maximum number of threads, and each thread would open and write to the same file using locking.
In my case the threads are running completely independent of one another, except for the sharing of various system resources.
In other words, open(2) (and socket(2), accept(2)) should NEVER return a file descriptor of an already open file.
Yeah, I was expecting that too, but that is nonetheless what seems to be happening. Not just from the trace I published, but also input files ending up truncated because the file descriptor has been reopened for output.

/Per Jessen, Zürich
On Sun, 07 May 2006 15:19:21 +0200
Per Jessen
In my case the threads are running completely independent of one another, except for the sharing of various system resources.

Threads run in the context of the process. They all share the same global memory. Even when a thread runs in a detached state, it is still part of the parent process. As I said, file descriptors can be reused. So, if fd 14 is closed, the system can return it to you.
Also, in the 2.4 kernel, the old Linux Threads, each thread had a
separate PID where under the 2.6 kernel, you get 1 pid for the process.
(You can force the old Linux Threads behavior though).
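Jerry's point about reuse is easy to demonstrate with a few lines of C (a sketch, not code from this thread; /dev/null is just a convenient always-present file). POSIX requires open(2) to return the lowest-numbered free descriptor, so the number comes straight back after a close() - to whichever thread asks first:

```c
#include <fcntl.h>
#include <unistd.h>

/* Returns 1 if a second open() hands back the same descriptor
 * number after a close(), 0 otherwise.  POSIX guarantees that
 * open(2) returns the lowest-numbered free descriptor, so once
 * a number is closed it is immediately eligible for reuse -
 * by any thread in the process. */
int fd_is_reused(const char *path)
{
    int a = open(path, O_RDONLY);
    if (a < 0)
        return -1;
    close(a);
    int b = open(path, O_RDONLY);
    int reused = (b == a);
    close(b);
    return reused;
}
```

A close() in one thread makes that number the very next candidate for socket(), pipe() or open() in any other thread - which is why a stray close() shows up looking like "another thread stole my fd".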
--
Jerry Feldman
Jerry Feldman wrote:
Threads run in the context of the process. They all share the same global memory. Even when a thread runs in a detached state, it is still part of the parent process. As I said, file descriptors can be reused. So, if fd 14 is closed, the system can return it to you.
Absolutely - the system just seems to be doing it also when it isn't closed ...
Also, in the 2.4 kernel, the old Linux Threads, each thread had a separate PID where under the 2.6 kernel, you get 1 pid for the process. (You can force the old Linux Threads behavior though).
That's an interesting subject - I'm on 2.6.something, and generally I see just the one process id. However, when debugging, it is quite useful being able to attach to a process id, and "ps -eFl" will give you some id you can use. Also when you're stracing with '-ff -f', each thread has its own file with an id.

/Per Jessen, Zürich
On Sun, 07 May 2006 16:40:37 +0200
Per Jessen
Jerry Feldman wrote:
Threads run in the context of the process. They all share the same global memory. Even when a thread runs in a detached state, it is still part of the parent process. As I said, file descriptors can be reused. So, if fd 14 is closed, the system can return it to you.
Absolutely - the system just seems to be doing it also when it isn't closed ...
Also, in the 2.4 kernel, the old Linux Threads, each thread had a separate PID where under the 2.6 kernel, you get 1 pid for the process. (You can force the old Linux Threads behavior though).
That's an interesting subject - I'm on 2.6.something, and generally I see just the one process id. However, when debugging, it is quite useful being able to attach to a process id, and "ps -eFl" will give you some id you can use. Also when you're stracing with '-ff -f', each thread has its own file with an id.

GDB does not handle threads well. Actually, the Intel debugger is better in this case. BTW: each thread has its own thread id. I would think that it is unlikely that open(2) will give you a duplicate file descriptor to one that is already open. I think it is more likely that you have a bug in your program.
--
Jerry Feldman
Jerry Feldman wrote:
I would think that it is unlikely that open(2) will give you a duplicate file descriptor to one that is already open. I think it is more likely that you have a bug in your program.
Completely agree :-) /Per Jessen, Zürich
Jerry Feldman wrote:
I would think that it is unlikely that open(2) will give you a duplicate file descriptor to one that is already open. I think it is more likely that you have a bug in your program.
Completely agree, but this strace seems to indicate something's wrong:
28333 and 28315 are both pthreads:
----
28315 write(7, "250 Ok\r\n", 8
On Sun, 07 May 2006 17:48:45 +0200
Per Jessen
Jerry Feldman wrote:
I would think that it is unlikely that open(2) will give you a duplicate file descriptor to one that is already open. I think it is more likely that you have a bug in your program.
Completely agree, but this strace seems to indicate something's wrong:
28333 and 28315 are both pthreads: ----
28315 write(7, "250 Ok\r\n", 8
28333 waitpid(28360,
here 28333 is just cleaning up after a fork()ed process.
28315 <... write resumed> ) = 8
28333 <... waitpid resumed> [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], 0) = 28360
28315 write(2, "[01] state 1; right event RIGHT_"..., 85) = 85
28315 open("/tmp/ibwd.1", O_WRONLY|O_CREAT|O_TRUNC, 0666) = 13
so, 28315 gets fd13 for /tmp/ibwd.1
28313 <... time resumed> NULL) = 1147013625
28362 <... execve resumed> ) = 0
28318 <... write resumed> ) = 2
28333 stat64("28360.stderr",
28315 write(2, "[01] using fd 13 for /tmp/ibwd.1"..., 33
28318 write(14, "Adobe Acrobat 7 Professional Ret"..., 61
28333 <... stat64 resumed> {st_mode=S_IFREG|0600, st_size=0, ...}) = 0
28315 <... write resumed> ) = 33
28318 <... write resumed> ) = 61
28333 unlink("28360.stderr"
removing any stderr output from the fork()ed process.
28315 write(7, "354 End data with <CR><LF>.<CR><"..., 37
28318 write(14, "\r\n", 2
28333 <... unlink resumed> ) = 0
28315 <... write resumed> ) = 37
28318 <... write resumed> ) = 2
28333 close(13
and here 28333 is closing fd 13 (which was one end of a pipe() for the fork()ed process) although fd 13 was opened by 28315 ...
Everything is compiled using -fstack-protector-all, which I think should give me a good chance of catching anyone walking right over my stack space.

The issue here is with open(2) (and related calls that return a new file descriptor). The trace does not show that 2 different open(2) calls return 13.
I suggest that you add some diagnostic code whenever an open(2) or
related call returns a file descriptor.
fprintf(stderr, "Thread %ld receives fd %d\n", (long)pthread_self(), fd);
As I mentioned, GDB is not the best tool to trace threads.
If you want to log the data:
Create a log file in the parent thread.
Create a mutex
static pthread_mutex_t lock_mutex = PTHREAD_MUTEX_INITIALIZER;
static FILE *logstream;          /* opened in the parent thread */

void logger(int fd)
{
    pthread_mutex_lock(&lock_mutex);
    fprintf(logstream, "Thread %ld receives fd %d\n",
            (long)pthread_self(), fd);
    fflush(logstream);
    pthread_mutex_unlock(&lock_mutex);
}
--
Jerry Feldman
Jerry Feldman wrote:
The issue here is with open(2) (and related calls that return a new file descriptor). The trace does not show that 2 different open(2) calls return 13.
I had to go quite far back to find the pipe() call that gives me a set of 13 and 14.
I suggest that you add some diagnostic code whenever an open(2) or related call returns a file descriptor.
I was hoping strace would do it for me, but it might be better just doing it myself.
fprintf(stderr, "Thread %ld receives fd %d\n", (long)pthread_self(), fd);

As I mentioned, GDB is not the best tool to trace threads. If you want to log the data: create a log file in the parent thread, create a mutex, then:

void logger(int fd)
{
    pthread_mutex_lock(&lock_mutex);
    fprintf(logstream, "Thread %ld receives fd %d\n",
            (long)pthread_self(), fd);
    fflush(logstream);
    pthread_mutex_unlock(&lock_mutex);
}
I'll try it. Thanks. /Per Jessen, Zürich
On Monday 08 May 2006 8:39 am, Per Jessen wrote:
I had to go quite far back to find the pipe() call that gives me a set of 13 and 14.
I was hoping strace would do it for me, but it might be better just doing it myself.

It is tedious to do this, but you really need to track every open(2) and close(2). You also have a complicated system where you are using BOTH threads and forks. Be very careful of using fork(2) from threads. To quote from Dave Butenhof, "Avoid using fork in a threaded program (if you can) unless you intend to exec a new program immediately". Page 197, "Programming with POSIX Threads". I can personally attest that David is a good authority on Pthreads.

--
Jerry Feldman
Boston Linux and Unix user group http://www.blu.org PGP key id:C5061EA9 PGP Key fingerprint:053C 73EC 3AC1 5C44 3E14 9245 FB00 3ED5 C506 1EA9
Jerry Feldman wrote:
It is tedious to do this, but you really need to track every open(2) and close(2). You also have a complicated system where you are using BOTH threads and forks.
Yeah, I'm beginning to wonder if I should be reworking that setup. It would involve running another daemon to repeatedly run the otherwise fork()ed program. But I'm not too keen on running yet another daemon just for that.
Be very careful of using fork(2) from threads. To quote from Dave Butenhof, "Avoid using fork in a threaded program (if you can) unless you intend to exec a new program immediately".
That is pretty much what I do:
1. fiddle with the pipe for input/output, closing what I don't need, dup2() onto stdout.
2. Redirect stdin and stderr to files.
3. chdir() to working dir.
4. execv.

/Per Jessen, Zürich
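Per's four steps can be sketched roughly like this (a sketch under assumptions, not Per's actual code: spawn_child and reap_child are hypothetical names, and stdin/stderr go to /dev/null here rather than to files, to keep it self-contained):

```c
#include <fcntl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Start argv[0] with its stdout connected to a pipe we read from.
 * Returns the read end of the pipe, or -1 on error; *pid_out gets
 * the child's pid for the caller's waitpid(). */
int spawn_child(char *const argv[], const char *workdir, pid_t *pid_out)
{
    int fds[2];                          /* [0] read end, [1] write end */
    if (pipe(fds) < 0)
        return -1;
    pid_t pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (pid == 0) {                      /* child */
        close(fds[0]);                   /* 1. close what we don't need */
        dup2(fds[1], STDOUT_FILENO);     /*    dup2() onto stdout       */
        close(fds[1]);
        int nul = open("/dev/null", O_RDWR);
        dup2(nul, STDIN_FILENO);         /* 2. redirect stdin + stderr  */
        dup2(nul, STDERR_FILENO);
        close(nul);
        if (chdir(workdir) == 0)         /* 3. chdir() to working dir   */
            execv(argv[0], argv);        /* 4. execv                    */
        _exit(127);                      /* exec (or chdir) failed      */
    }
    close(fds[1]);                       /* parent keeps the read end   */
    *pid_out = pid;
    return fds[0];
}

/* Reap the child; returns its exit status, or -1. */
int reap_child(pid_t pid)
{
    int status = 0;
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Because the child execs immediately, the fork-from-a-thread caveats Jerry quotes mostly don't bite; the delicate part is exactly the close()/dup2() bookkeeping on the pipe ends.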
On May 08, 06 10:38:26 -0400, Jerry Feldman wrote:
It is tedious to do this, but you really need to track every open(2) and close(2). You also have a complicated system where you are using BOTH threads and forks. Be very careful of using fork(2) from threads. To quote from Dave Butenhof, "Avoid using fork in a threaded program (if you can) unless you intend to exec a new program immediately". Page 197, "Programming with POSIX Threads".
Why that? Yes, you have to be careful about open file descriptors and
memory maps (and other IPC stuff), but it should work. Apache is
successfully using a mixture of threads and processes.
Matthias
--
Matthias Hopf
On Tuesday 09 May 2006 8:13 am, Matthias Hopf wrote:
Be very careful of using fork(2) from threads. To quote from Dave Butenhof, "Avoid using fork in a threaded program (if you can) unless you intend to exec a new program immediately". Page 197, "Programming with POSIX Threads".
Why that? Yes, you have to be careful about open file descriptors and memory maps (and other IPC stuff), but it should work. Apache is successfully using a mixture of threads and processes.

Agreed that it should work, and does. But, you must be aware of what you get in a child process and what you do not get. For instance, when you fork(2) from a thread, the new process inherits the one thread and its state, but not any of the other threads. You also must be very careful of signals.

--
Jerry Feldman
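The standard precaution for the "only the forking thread survives" problem Jerry mentions is pthread_atfork(3). A minimal illustration (generic, not code from Per's application): if another thread held a mutex at the instant of the fork, that mutex stays locked forever in the child; the atfork handlers prevent that by taking the lock around the fork.

```c
#include <pthread.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* A fork() from a multi-threaded process copies only the calling
 * thread; a mutex held by any other thread at that instant stays
 * locked forever in the child.  pthread_atfork() is the standard
 * guard: lock in the prepare handler (so nobody else holds it at
 * fork time), unlock in both the parent and the child afterwards. */
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;

static void prepare(void) { pthread_mutex_lock(&lock); }
static void release(void) { pthread_mutex_unlock(&lock); }

/* Returns 1 if the child could take the mutex after the fork. */
int forked_child_can_lock(void)
{
    pthread_atfork(prepare, release, release);
    pid_t pid = fork();
    if (pid == 0) {
        int ok = (pthread_mutex_lock(&lock) == 0);
        pthread_mutex_unlock(&lock);
        _exit(ok ? 0 : 1);
    }
    int status = 0;
    if (waitpid(pid, &status, 0) != pid)
        return 0;
    return WIFEXITED(status) && WEXITSTATUS(status) == 0;
}
```

If the child execs immediately (as Per's does), none of this matters; the handlers earn their keep only when the child goes on running library code that takes locks.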
Jerry Feldman wrote:
Why that? Yes, you have to be careful about open file descriptors and memory maps (and other IPC stuff), but it should work. Apache is successfully using a mixture of threads and processes.

Agreed that it should work, and does. But, you must be aware of what you get in a child process and what you do not get. For instance, when you fork(2) from a thread, the new process inherits the one thread and its state, but not any of the other threads. You also must be very careful of signals.
I came across this http://www.gnu.org/software/libc/manual/html_node/POSIX-Threads.html (see "Threads and Fork" and "Streams and Fork"), which does a pretty good job of explaining things. I guess it's difficult to say, but I would venture a guess and say if you need to fork() from a Posix thread, it is almost certainly because you expect to execv() very quickly? In my case, the fork() never caused any problems. /Per Jessen, Zürich
On Tuesday 09 May 2006 10:23 am, Per Jessen wrote:
I came across this http://www.gnu.org/software/libc/manual/html_node/POSIX-Threads.html (see "Threads and Fork" and "Streams and Fork"), which does a pretty good job of explaining things.
I guess it's difficult to say, but I would venture a guess and say if you need to fork() from a Posix thread, it is almost certainly because you expect to execv() very quickly?
In my case, the fork() never caused any problems.

As long as the programmers understand the interaction, it should not be a problem. You need to understand what is inherited, and what the appropriate states are.
--
Jerry Feldman
Just one more assembler vs. high level language war story.
A number of years ago while working at a large bank, we had a Personal Trust
system that had been written in IBM 360 assembler. Our bank was primarily a
Burroughs (Unisys) shop. The vendor, a major accounting firm, had ported
the code to Burroughs COBOL. The Burroughs programmers really hated the
code as there were a lot of cases such as:
pa.
alter r5return to goto pb.
goto code.
pb.
...
code.
...
goto r5return.
r5return.
goto.
I was the only person on the team who knew IBM assembler. The first time I
saw that sequence, my comment was "this is nothing but a "balr" instruction
using register 5 as the subroutine return register. (Note that the IBM
mainframes were not stack machines).
But, my point about C++ and assembler is missed.
While a well written and commented assembler program can be highly
maintainable, a program PROPERLY written in a higher level language should
be more maintainable and more portable. However, the assembler code should
perform better. In terms of maintainability, assembler is generally not
taught in most schools where C++, C and Java are. In the case of the
programming staff at the bank, I was probably the only programmer analyst
who knew IBM assembler (a couple of the systems programmers did).
Burroughs COBOL had a nice feature (similar to the C asm() function):
Procedure Division.
par.
...
enter symbolic
assembler code
...
enter cobol.
The Burroughs system did not have a linkage editor, so your code had to be
monolithic. They did have an efficient way to spawn a child process,
though.
--
Jerry Feldman
Jerry Feldman wrote:
But, my point about C++ and assembler is missed. While a well written and commented assembler program can be highly maintainable, a program PROPERLY written in a higher level language should be more maintainable and more portable.
Completely agree. The one thing with C++ - in my experience that is - is that many programmers seem to lose track of the objective, and start writing code just for the sake of the code. You can do a thousand really elegant things with C++, but they're not always necessary.
However, the assembler code should perform better. In terms of maintainability, assembler is generally not taught in most schools where C++, C and Java are.
Whether a topic is taught in the schools I see more as a question of productivity - ie. can you hire a new graduate and have him do productive work right away. In the few places where assembler would still be warranted, a new graduate wouldn't be much good e.g. due to lack of OS knowledge.
In the case of the programming staff at the bank, I was probably the only programmer analyst who knew IBM assembler (a couple of the systems programmers did).
I would never advocate writing applications in assembler, only system-level software. Not even in TPF (where everything is about speed) do they write apps in assembler (well, not anymore). /Per Jessen, Zürich
On Wednesday 10 May 2006 9:34 am, Per Jessen wrote:
Completely agree. The one thing with C++ - in my experience that is - is that many programmers seem to lose track of the objective, and start writing code just for the sake of the code. You can do a thousand really elegant things with C++, but they're not always necessary.

True. While I tend to work with some very highly competent engineers, one thing I've seen in the industry is that many of the C++ programmers not only don't know C++, but have no real training in writing maintainable code or in software engineering techniques. Object oriented code requires more front-end design than procedural code.
An example, might be writing a date class. The programmer wrote a date class
that could parse a couple of different inputs, and display output in
different formats, but lacked a viable comparison function. Because of the
way it was written and the data stored, it was useless as a base class for
comparison. The solution in this case, was to write a new date base class
that included comparison functions, then restructure the old date class to
inherit from the new base class.
--
Jerry Feldman
Per Jessen wrote:
I was hoping strace would do it for me, but it might be better just doing it myself.
OK, here's some output from my own tracing:

Format: [my id][pid][self()] function fd

[02][ 2761][3066452896] pipe(): fd=14
[02][ 2761][3066452896] pipe(): fd=18
pipe() call to establish communication with the process about to be forked().
[02][ 3024][3066452896] close(): fd=14
[02][ 3024][3066452896] close(): fd=18
forked() process closes both,
[02][ 2761][3066452896] close(): fd=18
[02][ 2761][3066452896] fdopen(): fd=14
We prepare to read output from forked() process.
[01][ 2761][3074980768] fopen(): fd=18
[01][ 2761][3074980768] fclose(): fd=18
[01][ 2761][3074980768] pipe(): fd=14
[01][ 2761][3074980768] pipe(): fd=18
Another thread does a pipe() call and gets 14 and 18. This should not be happening.

/Per Jessen, Zürich
On Monday 08 May 2006 11:05 am, Per Jessen wrote:
Per Jessen wrote:
I was hoping strace would do it for me, but it might be better just doing it myself.
OK, here's some output from my own tracing:
Format: [my id][pid][self()] function fd
[02][ 2761][3066452896] pipe(): fd=14
[02][ 2761][3066452896] pipe(): fd=18

pipe() call to establish communication with the process about to be forked().

[02][ 3024][3066452896] close(): fd=14
[02][ 3024][3066452896] close(): fd=18

forked() process closes both,

[02][ 2761][3066452896] close(): fd=18
****> [02][ 2761][3066452896] fdopen(): fd=14

We prepare to read output from forked() process.

[01][ 2761][3074980768] fopen(): fd=18
[01][ 2761][3074980768] fclose(): fd=18
[01][ 2761][3074980768] pipe(): fd=14
[01][ 2761][3074980768] pipe(): fd=18

Another thread does a pipe() call and gets 14 and 18. This should not be happening.

Why not? At the time you call pipe(), fd 14 and fd 18 are closed. You appear to be passing an already closed file descriptor to fdopen().
--
Jerry Feldman
Jerry Feldman wrote:
Format: [my id][pid][self()] function fd
[02][ 2761][3066452896] pipe(): fd=14
[02][ 2761][3066452896] pipe(): fd=18
pipe() call to establish communication with the process about to be forked().
[02][ 3024][3066452896] close(): fd=14
[02][ 3024][3066452896] close(): fd=18
forked() process closes both,
[02][ 2761][3066452896] close(): fd=18
****> [02][ 2761][3066452896] fdopen(): fd=14
We prepare to read output from forked() process.
[01][ 2761][3074980768] fopen(): fd=18
[01][ 2761][3074980768] fclose(): fd=18
[01][ 2761][3074980768] pipe(): fd=14
[01][ 2761][3074980768] pipe(): fd=18
Another thread does a pipe() call and gets 14 and 18. This should not be happening.
Why not? At the time you call pipe(), fd 14 and fd 18 are closed. You appear to be passing an already closed file descriptor to fdopen().

No, fd 14 is closed by the forked() process, not the local process. Process 3024 closes fd 14, but it's process 2761 that does the fdopen().

/Per Jessen, Zürich
Jerry Feldman wrote:
Format: [my id][pid][self()] function fd
[02][ 2761][3066452896] pipe(): fd=14
[02][ 2761][3066452896] pipe(): fd=18
pipe() call to establish communication with the process about to be forked().
[02][ 3024][3066452896] close(): fd=14
[02][ 3024][3066452896] close(): fd=18
forked() process closes both,
[02][ 2761][3066452896] close(): fd=18
****> [02][ 2761][3066452896] fdopen(): fd=14
We prepare to read output from forked() process.
[01][ 2761][3074980768] fopen(): fd=18
[01][ 2761][3074980768] fclose(): fd=18
[01][ 2761][3074980768] pipe(): fd=14
[01][ 2761][3074980768] pipe(): fd=18
Another thread does a pipe() call and gets 14 and 18. This should not be happening.
Why not? At the time you call pipe(), fd 14 and fd 18 are closed. You appear to be passing an already closed file descriptor to fdopen().
On Monday 08 May 2006 1:29 pm, Per Jessen wrote:
No, fd 14 is closed by the forked() process, not the local process. Process 3024 closes fd 14, but it's process 2761 that does the fdopen().

OK, just to make it quick because I have a meeting ASAP: check the return values for pipe(). pipe() returns the fds in a 2-element int array. Try setting those to -1 before you call pipe() and see what happens.
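Jerry's suggestion, expressed as a small wrapper (make_pipe is a hypothetical name, not from Per's program): poisoning the array with -1 first means a failed pipe() can never leave stale-but-plausible descriptor numbers behind for later code to close() by mistake.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Fill the array with -1 ("no descriptor") before calling pipe(),
 * so on failure the caller cannot accidentally use or close a
 * leftover number from a previous call. */
int make_pipe(int fds[2])
{
    fds[0] = fds[1] = -1;
    if (pipe(fds) < 0) {
        fprintf(stderr, "pipe(): %s\n", strerror(errno));
        return -1;
    }
    return 0;
}
```

The same idiom pays off after close(): setting the variable back to -1 immediately turns a later double close() into a harmless close(-1) = EBADF instead of killing someone else's descriptor.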
--
Jerry Feldman
Jerry Feldman wrote:
OK, just to make it quick because I have a meeting ASAP: check the return values for pipe(). pipe() returns the fds in a 2-element int array. Try setting those to -1 before you call pipe() and see what happens.
I was already checking the return value of pipe(), but I think I might have found the problem elsewhere. I think (just testing it now) it was a double close() of a file descriptor - where the file descriptor had been handed out again in between the two close() calls. Obviously not the best idea :-) Thanks for your help and patience.

/Per Jessen, Zürich
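The failure mode Per found can be reproduced deterministically in a few lines (a sketch; single-threaded here, but with threads the recycling simply happens via another thread's open(), socket() or pipe()):

```c
#include <fcntl.h>
#include <unistd.h>

/* Returns 1 if a stale second close() of 'a' killed the unrelated
 * descriptor 'b' that had been given the recycled number. */
int double_close_clobbers(void)
{
    int a = open("/dev/null", O_RDONLY);
    if (a < 0)
        return -1;
    close(a);                            /* legitimate first close      */
    int b = open("/dev/null", O_RDONLY); /* lowest free fd: same number */
    close(a);                            /* stale second close hits b!  */
    char c;
    long r = (long)read(b, &c, 1);       /* EBADF: b is silently dead   */
    return (b == a) && (r == -1);
}
```

The second close() succeeds from the kernel's point of view - the number is a perfectly valid descriptor, it just belongs to someone else now. That matches both the "Bad file descriptor" errors and the truncated input files from earlier in the thread.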
On Monday 08 May 2006 2:22 pm, Per Jessen wrote:
Jerry Feldman wrote:
OK, just to make it quick because I have a meeting ASAP: check the return values for pipe(). pipe() returns the fds in a 2-element int array. Try setting those to -1 before you call pipe() and see what happens.
I was already checking the return value of pipe(), but I think I might have found the problem elsewhere. I think (just testing it now) it was a double close() of a file descriptor - where the file descriptor had been handed out again in between the two close() calls. Obviously not the best idea :-)

Now solve my problem :-) I have a C++ program that opens a library using dlopen(3):

[xxx@yyyyy ~/loaderror]$ ./sbtestmain
Loading ./libsbtest.so
DLOPEN: ./libsbtest.so: symbol _ZNSdC1EPSt15basic_streambufIcSt11char_traitsIcEE, version GLIBCXX_3.4 not defined in file libstdc++.so.6 with link time reference
This problem occurs on one RHEL 4 U3 IA64 system using the Intel compiler.
Does not occur on any of the testdrive systems running RHEL 4 or SLES9.
Actually, I think the IT people (at our ISV) did an upgrade install rather
than a clean install because both imake and g++ were broken. I have a
Montecito box sitting next to it that also works.
The bottom line is that system calls, like open(2) are generally well
tested. Also note that Rational's Purify Plus would have found your problem
because that is one of the things it tracks.
--
Jerry Feldman
Jerry Feldman wrote:
Now solve my problem :-) I have a C++ program that opens a library using dlopen(3):

[xxx@yyyyy ~/loaderror]$ ./sbtestmain
Loading ./libsbtest.so
DLOPEN: ./libsbtest.so: symbol _ZNSdC1EPSt15basic_streambufIcSt11char_traitsIcEE, version GLIBCXX_3.4 not defined in file libstdc++.so.6 with link time reference
I used to do some C++, but in my most recent job as a software engineer, I was hired to help migrate a product away from C++ - to assembler. This was on S390.
The bottom line is that system calls, like open(2) are generally well tested.
Completely agree - I guess just going by "when all other options have been exhausted and only one possibility remains, it must be the answer however improbable" (to paraphrase Sherlock Holmes).
Also note that Rational's Purify Plus would have found your problem because that is one of the things it tracks.
Interesting. Also across threads? /Per Jessen, Zürich
On Tuesday 09 May 2006 2:45 am, Per Jessen wrote:
I used to do some C++, but in my most recent job as a software engineer, I was hired to help migrate a product away from C++ - to assembler. This was on S390.

This was a very bad decision. It makes the code totally non-portable, and difficult to maintain. I am the first to admit that C++ code can be a performance hog, especially if you are using a lot of templates and RTTI.
In general, a C++ program is going to be much more readable and
maintainable than C or assembler. (That is assuming that the programmers
did a decent job). Not only is assembler non-portable, it is 100% cryptic
(Note that I have written assembler for DEC PDPs, Alphas, Intel IA64) but
also it can be locked onto a specific model. If you use instructions
specific to the S390, that code may not run on earlier IBM platforms.
--
Jerry Feldman
Jerry Feldman wrote:
On Tuesday 09 May 2006 2:45 am, Per Jessen wrote:
I used to do some C++, but in my most recent job as a software engineer, I was hired to help migrate a product away from C++ - to assembler. This was on S390.
This was a very bad decision.
As it happens it was the very best decision. V2 (or its successors) of this product is now running in perhaps 10'000+ large mainframe shops around the world. The C++ version was essentially unmaintainable, as no or only very little staff existed with the required blend of in-depth MVS/VM and C++ skills. It also performed about 10-12 times worse than the assembler version, which was a major headache (essentially it made it unsaleable).
It makes the code totally non-portable,
The product would only ever run on S390 or compatible, so portability was not on the list of requirements.
and difficult to maintain.
Nothing worse than the C++ product, I can guarantee you.
Not only is assembler non-portable, it is 100% cryptic (Note that I have written assembler for DEC PDPs, Alphas, Intel IA64)
Funny, that's exactly what we said about the C++ code - "it is 100% cryptic" :-)
but also it can be locked onto a specific model. If you use instructions specific to the S390, that code may not run on earlier IBM platforms.
Of course - you're preaching to the choir. I've spent the last 14-15 years writing system software for MVS, a little VM and some TPF. The project I've mentioned was during my 4 years with StorageTek, now Sun. As for some instructions being specific to S390, IIRC, we kept to S370/XA as the product was also meant to run on a S370-like architecture used in Japan (Fujitsu-built I think it was). /Per Jessen, Zürich
On Tuesday 09 May 2006 2:45 am, Per Jessen wrote:
Jerry Feldman wrote:
Also note that Rational's Purify Plus would have found your problem because that is one of the things it tracks.
Interesting. Also across threads?

Yes. That's why it will cost $10,000US.
--
Jerry Feldman
participants (4)
- Jerry Feldman
- Jerry Feldman
- Matthias Hopf
- Per Jessen