Post by Sean Conner
I interpret this to mean, "if the given object files do not include an
object or function with external linkage, then said objects or functions
can be pulled from a 'library'."
I cannot see how you reach this conclusion and this is exactly what the
rest of your argument hinges on. I'd say this is also exactly where the
fundamental disagreement is.
5.1.1.2 #8 and 6.2.2 #2 that you quoted do not say that libraries are only
consulted if something was not resolved in some "primary" translation
units, and, as I said earlier, no such primacy is assigned to any
translation units by the standard.
Again, the key sentence from 5.1.1.2 #8:
Library components
That is, translation units pre-compiled and stored in a library. The C
standard (as far as I can see) does not define "library", but from the
language I've read, it can store translation units and these translation
units can be referenced at a later time.
are linked
Again, "linked" (and "linkage" and "linking") are not defined by the C
standard, but the implication is that different translation units are
somehow combined so that all external references are satisfied.
to satisfy external references to functions and objects
As in right here.
not defined in the current translation.
And I think this is where we have our differences. If the "current
translation" does not contain function foo(), *then* such a function is
looked for in any given libraries [1]. I ask, what does "not defined in the
current translation" mean to you?
Post by Sean Conner
Nor do they say anything like "parts of libraries".
The first two words of 5.1.1.2 #8---"Library components". I looked up
"component" in the Oxford English Dictionary and I found:
A constituent element or part
Post by Sean Conner
An "entire program" includes full libraries.
Citation please. The phrase "entire program" appears in 6.2.2#2:
In the set of translation units and libraries that constitutes an
entire program ...
I could not find the phrase "full library" (or variations) in the C99
Standard.
Post by Sean Conner
And here, you dismissed an important distinction
between static libraries and shared libraries. Static libraries (archives)
are typically just loose collections of object files not linked together.
So (we kiss goodbye to portability and conformance at this point) it kind
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Citation please, from the C standard (any one of C89, C99, or C11). It's
odd to think of GNU C as not being conformant. Or any of the commercial
compilers (don't worry, I'll address one of those in a bit).
Post by Sean Conner
of makes sense to say that one of those object files can be pulled back
from the archive "on demand" and linked to the program without pulling in
other object files, so those other object files are kind of not really part
of the program, so whatever external symbols they have is irrelevant. This
behaviour is exactly what you and the other advocates of patchless patching
exploit - where it works, and I have not seen any attempt to demonstrate
that it works except with the GNU linker.
I did some experiments last night on this very subject. At first I used
Linux but, being curious, I ran the same experiments on Solaris, so I'll
use the results from that. On the Solaris system in question, I used the
Sun Compiler Suite, so no GNU tools. And just to make sure:
% cc -V ; ld -V
cc: Sun C 5.12 SunOS_sparc 2011/11/16
ld: Software Generation Utilities - Solaris Link Editors: 5.10-1.1512
(and again, it would be odd to think of the Sun compiler as not being
conformant)
The experiment is a very small program---the function main() calls
func1(), which calls func2(). Each function prints that it has been called, and
sample output would look like:
Hello from main
Hello from func1
Hello from func2
Experiment 1---main() is in one translation unit; func1() and func2() are in
another translation unit, which is stored in a library. Compile and run:
% cc -c -o main.o main.c
main() is compiled.
% cc -c -o func.o func.c
% ar rv libfuncall.a func.o
a - func.o
ar: creating libfuncall.a
ar: writing libfuncall.a
func1() and func2(), in the same translation unit, are compiled and stored in a library.
% cc -o main1 main.o libfuncall.a
The two are linked.
% ./main1
Hello from main
Hello from func1
Hello from func2
And the output. This is the status quo. Next up, the "patchless patching
exploit" (to use your terms)---we have our own version of func2() which,
when called, will print "Hello from myfunc2":
% cc -c -o main.o main.c
% cc -c -o myfunc2.o myfunc2.c
% cc -c -o func.o func.c
% ar rv libfuncall.a func.o
a - func.o
ar: creating libfuncall.a
ar: writing libfuncall.a
% cc -o main2 main.o myfunc2.o libfuncall.a
ld: fatal: symbol 'func2' is multiply-defined:
(file myfunc2.o type=FUNC; file libfuncall.a(func.o) type=FUNC);
ld: fatal: file processing errors. No output written to main2
And this result does back your position---that we have two definitions of
the external function func2() (and for the record, I got this result on
Linux as well). Before I interpret this result, let's go on to experiments
three and four. In these two cases, func1() and func2() are in separate
translation units and both are stored in a library. Experiment 3---the base
line:
% cc -c -o main.o main.c
% cc -c -o func1.o func1.c
% cc -c -o func2.o func2.c
% ar rv libfunc.a func1.o func2.o
a - func1.o
a - func2.o
ar: creating libfunc.a
ar: writing libfunc.a
% cc -o main3 main.o libfunc.a
% ./main3
Hello from main
Hello from func1
Hello from func2
Nothing unusual here. Now, onto the "patchless patching exploit":
% cc -c -o main.o main.c
% cc -c -o myfunc2.o myfunc2.c
% cc -c -o func1.o func1.c
% cc -c -o func2.o func2.c
% ar rv libfunc.a func1.o func2.o
a - func1.o
a - func2.o
ar: creating libfunc.a
ar: writing libfunc.a
% cc -o main4 main.o myfunc2.o libfunc.a
That's odd---it compiled!
% ./main4
Hello from main
Hello from func1
Hello from myfunc2
And ran!
So, how do I interpret these results? In experiment 2, the translation
unit main had an unsatisfied external function, func1(). func1() is not
defined in the translation unit myfunc2, so we then examine the external
functions stored in the library libfuncall.a. There is a component of
libfuncall.a that has the external function func1(), but said component
*also* has external function func2(). When it comes time to link the
translation unit myfunc2, it also has an external function func2(), and thus
we get an error, because 6.9#5 states:
... somewhere in the entire program there shall be exactly one
external definition for the identifier; otherwise, there shall be no
more than one.
But now we get to experiment 4. This time, the translation unit main had
an unsatisfied external function func1(). func1() is not defined in the
translation unit myfunc2, so we then examine the external functions stored
in the library libfunc.a. There is a component of libfunc.a that has the
external function func1(), so that component is pulled in. That component
has an unsatisfied external function func2(). Said external function is not
defined in translation unit main, but it *is* defined in translation unit
myfunc2. There are no more unresolved external functions (or objects for
that matter) so we get the final program that works as expected (and again,
for the record, I got the same result on Linux).
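The reasoning applied to experiments 2 and 4 can be written down as a small model. This is only a sketch under my own assumptions, not a description of how the Solaris or GNU link editors are implemented: the names (struct unit, link_program, experiment2, experiment4) are made up for illustration, references to the C library are ignored, and the library is rescanned until nothing more is pulled in rather than being searched in command-line order.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_SYMS  4   /* slots per symbol list, including the NULL terminator */
#define MAX_UNITS 8   /* most units a single link may involve */

/* A translation unit, reduced to its external symbols (libc omitted). */
struct unit {
    const char *defs[MAX_SYMS];   /* external definitions, NULL-terminated */
    const char *refs[MAX_SYMS];   /* external references, NULL-terminated */
};

enum link_result { LINK_OK, LINK_MULTIPLY_DEFINED, LINK_UNRESOLVED };
enum where { IN_DEFS, IN_REFS };

/* True if any of the n linked units lists sym among its defs (or refs). */
static int any_unit_has(const struct unit *const linked[], size_t n,
                        const char *sym, enum where w)
{
    for (size_t u = 0; u < n; u++) {
        const char *const *list = (w == IN_REFS) ? linked[u]->refs : linked[u]->defs;
        for (size_t i = 0; i < MAX_SYMS && list[i] != NULL; i++)
            if (strcmp(list[i], sym) == 0)
                return 1;
    }
    return 0;
}

/* True if u defines a symbol that the program already defines. */
static int clashes(const struct unit *const linked[], size_t n, const struct unit *u)
{
    for (size_t d = 0; d < MAX_SYMS && u->defs[d] != NULL; d++)
        if (any_unit_has(linked, n, u->defs[d], IN_DEFS))
            return 1;
    return 0;
}

/* True if u defines a symbol that is referenced but not yet defined. */
static int wanted(const struct unit *const linked[], size_t n, const struct unit *u)
{
    for (size_t d = 0; d < MAX_SYMS && u->defs[d] != NULL; d++)
        if (any_unit_has(linked, n, u->defs[d], IN_REFS) &&
            !any_unit_has(linked, n, u->defs[d], IN_DEFS))
            return 1;
    return 0;
}

enum link_result link_program(const struct unit *const objs[], size_t nobjs,
                              const struct unit *const lib[], size_t nlib)
{
    const struct unit *linked[MAX_UNITS];
    size_t n = 0;

    assert(nobjs + nlib <= MAX_UNITS);
    /* Objects named explicitly are always part of the program. */
    for (size_t i = 0; i < nobjs; i++) {
        if (clashes(linked, n, objs[i]))
            return LINK_MULTIPLY_DEFINED;
        linked[n++] = objs[i];
    }
    /* Pull a library component only while it satisfies an open reference.
       A component that has been pulled is no longer "wanted", because its
       own definitions now count as defined. */
    for (int progress = 1; progress; ) {
        progress = 0;
        for (size_t m = 0; m < nlib; m++) {
            if (!wanted(linked, n, lib[m]))
                continue;
            if (clashes(linked, n, lib[m]))
                return LINK_MULTIPLY_DEFINED;
            linked[n++] = lib[m];
            progress = 1;
        }
    }
    /* Every reference made by a linked unit must now have a definition. */
    for (size_t u = 0; u < n; u++)
        for (size_t r = 0; r < MAX_SYMS && linked[u]->refs[r] != NULL; r++)
            if (!any_unit_has(linked, n, linked[u]->refs[r], IN_DEFS))
                return LINK_UNRESOLVED;
    return LINK_OK;
}

/* The translation units from the experiments, as assumed symbol lists. */
static const struct unit main_o    = { { "main" },           { "func1" } };
static const struct unit myfunc2_o = { { "func2" },          { NULL } };
static const struct unit func_o    = { { "func1", "func2" }, { NULL } };
static const struct unit func1_o   = { { "func1" },          { "func2" } };
static const struct unit func2_o   = { { "func2" },          { NULL } };

enum link_result experiment2(void)   /* main.o myfunc2.o libfuncall.a */
{
    const struct unit *const objs[] = { &main_o, &myfunc2_o };
    const struct unit *const lib[]  = { &func_o };
    return link_program(objs, 2, lib, 1);
}

enum link_result experiment4(void)   /* main.o myfunc2.o libfunc.a */
{
    const struct unit *const objs[] = { &main_o, &myfunc2_o };
    const struct unit *const lib[]  = { &func1_o, &func2_o };
    return link_program(objs, 2, lib, 2);
}
```

Under these assumptions experiment2() reports a multiple definition, because func.o is pulled in for func1() and brings a second func2() with it, while experiment4() links cleanly, because func2.o is never wanted.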
In fact, these results seem (in my opinion) to be consistent with the
language in the C99 standard. I would also conjecture that you will find
the same results for static compilation across all C compilers.
I will concede that there might exist a C compiler out there that does not
conform to these behaviors, but it would be as rare as coming across a C
compiler for a sign-magnitude or 1's-complement system [2].
Post by Sean Conner
However, the whole point of patchless patching is about shared libraries,
because it is not that difficult to modify a static library and link your
application against this modified library. Shared libraries are not loose
collections of object files. They are pre-linked executables
They are not pre-linked executables. Well, mostly. I know you can
execute libc under Linux:
[spc]lucy:/lib>./libc.so.6
GNU C Library stable release version 2.3.4, by Roland McGrath et al.
Copyright (C) 2005 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.
There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A
PARTICULAR PURPOSE.
Compiled by GNU CC version 3.4.6 20060404 (Red Hat 3.4.6-11).
Compiled on a Linux 2.4.20 system on 2010-04-18.
Available extensions:
GNU libio by Per Bothner
crypt add-on version 2.1 by Michael Glad and others
linuxthreads-0.10 by Xavier Leroy
The C stubs add-on version 2.1.2.
BIND-8.2.3-T5B
NIS(YP)/NIS+ NSS modules 0.19 by Thorsten Kukuk
Glibc-2.0 compatibility add-on by Cristian Gafton
GNU Libidn by Simon Josefsson
libthread_db work sponsored by Alpha Processor Inc
Thread-local storage support included.
For bug reporting instructions, please see:
<http://www.gnu.org/software/libc/bugs.html>.
but that's actually rare. If you try some other shared object (Linux):
[spc]lucy:/usr/lib>./libcrypt.so
Segmentation fault
or Solaris:
[spc]sol:/usr/lib>./libcrypt.so
Illegal Instruction
So I politely disagree with them being "pre-linked executables."
Post by Sean Conner
and they most
certainly bring with them all of their external symbols into the program. I
explained how this could result in complications previously and won't
repeat that. As far as I can see, no one in this entire thread has come up
with a show case of patchless patching for a shared library, while claims
like "the difference between [shared and static libraries] is not important
for this disussion [sic]" have been made.
Let me rectify that then. The same four experiments as above, in that
order, but this time with shared libraries. Again, on Solaris:
% cc -V ; ld -V
cc: Sun C 5.12 SunOS_sparc 2011/11/16
ld: Software Generation Utilities - Solaris Link Editors: 5.10-1.1512
Experiment 1:
% cc -shared -xcode=pic32 -c -o func.ss func.c
% cc -shared -o libfuncall.so func.ss
% cc -Wl,-R/lusr/home/spc/foo -o smain1 main.o -L/lusr/home/spc/foo -lfuncall
% ldd ./smain1
libfuncall.so => ./libfuncall.so
libc.so.1 => /usr/lib/libc.so.1
libm.so.2 => /usr/lib/libm.so.2
/platform/SUNW,Sun-Fire-T1000/lib/libc_psr.so.1
This is to check that we have an executable that loads our library at run time.
It does, so let's run it:
% ./smain1
Hello from main
Hello from func1
Hello from func2
Now experiment two, the "patchless patching exploit."
% cc -c -o myfunc2.o myfunc2.c
% cc -Wl,-R/lusr/home/spc/foo -o smain2 main.o myfunc2.o -L/lusr/home/spc/foo -lfuncall
% ldd ./smain2
libfuncall.so => ./libfuncall.so
libc.so.1 => /usr/lib/libc.so.1
libm.so.2 => /usr/lib/libm.so.2
/platform/SUNW,Sun-Fire-T1000/lib/libc_psr.so.1
It compiled, unlike the second experiment with static libraries. But
let's see how it runs:
% ./smain2
Hello from main
Hello from func1
Hello from myfunc2
Wow! It worked! Even with func1() and func2() in the same translation
unit, func1() is calling func2() from translation unit myfunc2. And it's
not Linux! (For the record, it also worked under Linux.) And just to be
complete, experiments three and four. I won't comment much on these, as they,
too, work like the static version (even on Linux):
Experiment 3:
% cc -shared -xcode=pic32 -c -o func1.ss func1.c
% cc -shared -xcode=pic32 -c -o func2.ss func2.c
% cc -shared -o libfunc.so func1.ss func2.ss
% cc -Wl,-R/lusr/home/spc/foo -o smain3 main.o -L/lusr/home/spc/foo -lfunc
% ldd ./smain3
libfunc.so => ./libfunc.so
libc.so.1 => /usr/lib/libc.so.1
libm.so.2 => /usr/lib/libm.so.2
/platform/SUNW,Sun-Fire-T1000/lib/libc_psr.so.1
% ./smain3
Hello from main
Hello from func1
Hello from func2
Experiment 4:
% cc -Wl,-R/lusr/home/spc/foo -o smain4 main.o myfunc2.o -L/lusr/home/spc/foo -lfunc
% ldd ./smain4
libfunc.so => ./libfunc.so
libc.so.1 => /usr/lib/libc.so.1
libm.so.2 => /usr/lib/libm.so.2
/platform/SUNW,Sun-Fire-T1000/lib/libc_psr.so.1
% ./smain4
Hello from main
Hello from func1
Hello from myfunc2
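The shared-library result in experiment two can be modelled in a similar way. My assumption here, which the experiments support but do not prove, is that the runtime linker binds each reference to the first definition it finds, searching the executable before the shared objects it depends on, and that the call from func1() to func2() inside libfuncall.so is bound that way rather than fixed when the library was built. The names struct module, resolve and provider are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_EXPORTS 4   /* slots per export list, including the NULL terminator */

/* A loaded module (executable or shared object) and the symbols it exports. */
struct module {
    const char *name;
    const char *exports[MAX_EXPORTS];   /* NULL-terminated */
};

/* Name of the first module, in search order, that exports sym, or NULL. */
const char *resolve(const struct module *const order[], size_t n, const char *sym)
{
    assert(sym != NULL);
    for (size_t i = 0; i < n; i++)
        for (size_t e = 0; e < MAX_EXPORTS && order[i]->exports[e] != NULL; e++)
            if (strcmp(order[i]->exports[e], sym) == 0)
                return order[i]->name;
    return NULL;
}

/* Assumed search order: the executable first, then the library. */
const char *provider(const struct module *exe, const struct module *lib, const char *sym)
{
    const struct module *const order[] = { exe, lib };
    return resolve(order, 2, sym);
}

/* Export lists as assumed for the shared-library experiments one and two. */
static const struct module smain1     = { "smain1",        { "main" } };
static const struct module smain2     = { "smain2",        { "main", "func2" } };
static const struct module libfuncall = { "libfuncall.so", { "func1", "func2" } };
```

In this model provider(&smain1, &libfuncall, "func2") names libfuncall.so, while provider(&smain2, &libfuncall, "func2") names smain2, which matches the "Hello from myfunc2" output above.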
Post by Sean Conner
Funny you mention that---I checked the C99 standard and found *nothing*
about this. It's not unspecified, it's not undefined, it's not
implementation defined, it's not locale specific, *nothing*.
The very first message of mine in this thread explained how having multiple
definitions of external linkage identifiers in an "entire program" is
undefined behaviour, quoting the standard.
That was 6.9#5, which I quoted a portion of, but here's the full quote:
5 An external definition is an external declaration that is also a
definition of a function (other than an inline definition) or an
object. If an identifier declared with external linkage is used in
an expression (other than as part of the operand of a sizeof
operator whose result is an integer constant), somewhere in the
entire program there shall be exactly one external definition for
the identifier; otherwise, there shall be no more than one.
I'm not reading "undefined behavior" there, I see "error" there. Annex J
of the C99 standard lists all the unspecified, undefined,
implementation-defined and locale-specific behaviors. Nowhere is this
addressed.
Look, I recognise you don't like this and think it's a violation of the C
Standard. I don't see it as a violation of the C Standard, but I'll grant
that it may be an unusual interpretation of the C Standard.
Post by Sean Conner
Given your interpretation as to
what an "entire program" is, you might indeed have difficulty seeing that.
Cheers,
V.
-spc (So did I use two non-conformant compilers for this experiment then?)
[1] In every C compiler I've used over the past 30 years, there is no
need to specify the Standard C library. There have been options to
tell the compiler *not* to reference the Standard C library, but
again, the C Standard is silent on that point.
[2] They exist, but are so rare that I would be surprised if anyone on
this list has used such a system.