Please explain why/how this works
Other urls found in this thread:
insecure.org
godbolt.org
twitter.com
It doesn't. The output comes from a different application.
What do you mean, it's super simple. You write to the variable a, then write to the array at indexes -1, 0, and 1, then print all of those. As expected, it prints a, then the values at indexes -1, 0, and 1. The only real issue is that -1 isn't supposed to be a valid index. In C, there's nothing really stopping you from putting negative numbers or completely wrong numbers as the index, there's just a chance that it could point to memory that belongs to another program, or it could modify some part of your code and break it. In this case it doesn't seem to do anything really bad, so you're lucky.
you are modifying the argc/argv variables stored on the stack with the negative index?
This guy knows what's up. int A[2] declares an array of 2 ints on the stack, and A[-1] refers to the int-sized slot just before it. You can write and read that memory just like any other, but it's technically undefined behaviour because it's possible to hit a segfault or whatever since that memory isn't being managed directly by you.
#include <stdio.h>
int main(int argc, char **argv)
{
argc = 0;
int a = 5;
int A[2];
A[-1] = 1;
A[0] = 2;
A[1] = 3;
printf("%d %d %d %d\n", a, A[-1], A[0], A[1]);
printf("%d\n", argc);
return 0;
}
output
0
5 1 2 3
1
there's a printf after argc=0
i don't know how i managed to copy this version
It doesn't if you have sane compiler flags. Add one of the stack protection flags and you'll see you're unable to modify argc through the array.
This is what I get:
$ gcc -fstack-protector {fname}.c && ./a.out
1 1 2 3
0
exp-sgcheck, from the valgrind suite, can help detect these sorts of errors. Unfortunately, it only works if the first access to the array was valid (so not in this case).
I should also add that you probably want to add const to your parameters. You can set most IDEs to do so automatically.
>there's just a chance that it could point to memory that belongs to another program
No. You do not know what you're talking about.
this
at least not in 2017 with virtual memory
Read insecure.org
Also, stop using C. It's unsafe.
>Please explain why/how this works
A is allocated on the stack.
A is given only 2 ints for its allocation: A[0] and A[1].
There is physically an int at the A[-1] memory location on the stack, but something else is actually allocated at that A[-1] memory address.
It turns out what's allocated at the A[-1] memory address is not crucial for the correct operation of the program. You're overwriting it with 1, but that doesn't prevent the program from working.
A lot of times, the stack grows downward. If this is the case, then A[-1] might even sit "beyond the top of the stack", and therefore nothing is actually allocated to that location. In that case, there must be something about the code-generation of the cout line that avoids clobbering A[-1].
Compile with -S and look at the .s file to see exactly what's going on.
do stack arrays keep some kind of information in A[-1]?
>implying C++ is safe
you can still fuck up because boundaries are checked with only a few STL methods
That's where both dynamic and static arrays keep their length. Whenever you declare a stack array or malloc to a heap array, the size gets written to A[-1]. Feel free to read it anytime you need to know how many items an array can hold.
>implying implications
For static arrays, that is incorrect. In C++, if you declare int A[2] on the stack and print A[-1], you will see that it does not contain the length of A.
You are correct for dynamic arrays created with malloc.
technically you can use at() method instead of operator[] but then you have to deal with exceptions
>and print A[-1], you will see
maybe you're just reading it wrong?? what if it's a 64 bit int of something weirder
also WHY THE FUCK do compilers allow negative array indexes?
>A[-1]
>That's where both dynamic and static arrays keep their length
Is this in the C standard?
>WHY THE FUCK do compilers allow negative array indexes?
There's nothing wrong with negative indices per se. The problem is that shit languages like C and C++ misinterpret negative indices. A negative index is supposed to mean an offset from the end of the array, not some risky random memory access that most of the time results in an access violation.
cout
>sizeof(A)
Maybe you should learn other languages other than python, pajeet
>misinterpret
wtf ? how is that misinterpreting ? the index is the offset relative to the base address.
>A[-1]
>A[-1]
wait, what?
you can output sizeof(A)/sizeof(A[0]) that C fags use to get the size
the point is that it's stored internally somewhere
variable alignment
>proof it is stored somewhere
No. Using "cout
>maybe you're just reading it wrong?? what if it's a 64 bit int of something weirder
That's good to be skeptical. I would suggest examining all of the 16 bytes prior to the beginning of array A. In other words:
for (int i = 1; i <= 16; i++) printf("%02x ", ((const unsigned char *)A)[-i]);
When you look at all the bytes that are printed, you won't see the size of A located anywhere in there. (If you do, then change the size of A to some large and weird size like 2753, and try again.) It turns out that the compiler doesn't need to store the size anywhere -- it's a constant, and every time you use it (like for example sizeof(A)) the compiler just generates the constant directly into the code.
> also WHY THE FUCK do compilers allow negative array indexes?
In C and C++, array indexing is semantically identical to pointer arithmetic. So, for example, if you have a pointer p that's pointing to the last character of a string, you can use p[-1] to get the second-to-last character of the string. For simplicity, the pointer and array models were merged, so this also works with arrays. Back in the early 1970s, the designers of C really did assume that people writing code knew exactly what the fuck they were doing. For example, for many years, C compilers never bothered to verify that function arguments were of the correct type -- they just assumed that the programmers were competent about those things -- and it wasn't until 1989 that they added argument type checking. They could never have dreamed that 50 years later massive numbers of complete idiots would be programmers.
Also, there are other languages that explicitly allow negative indexes. For example, in Pascal, you can declare an array as: var a : array[-10..10] of integer; -- meaning that you can index everything from a[-10] to a[10]. During code generation, the compiler simply adds the constant 10 to the address of the array to compensate for this.
>Try to obtain the address of sizeof(A)
sizeof() could be a function returning sizeof(A)*sizeof(A[0]) which are stored, or something like that
>Did you actually test printing A[-1] for a static array?
you won't prove anything because you don't know how to decode that space
consider that it could be a 64 bit int in A[-2] and A[-1] while you're printing only A[-1] and a 32 bit int
>the point is that it's stored internally somewhere
Correct. It's stored as an immediate argument in the machine code. However, it's not stored in the data segment, stack, or heap. In other words, it's stored in a way that does not allow you to take the address of it. Hence, it will never be located at A[-1] or other data location.
Now, this applies only to stack variables and global variables. The rules are different for dynamically-allocated variables. For example, if you dynamically allocate int*p=malloc(24) then you might very well find that p[-1] or p[-2] contains the number 24 -- but it's not recommended that you rely on that fact if you want your code to be portable.
No he's still pointing to his program, it would usually be the frame pointer or the return address, but I'm not so sure with main.
It's worth noting that there's nothing in the C or C++ standards about HOW memory is allocated - a -1 index is undefined behaviour
the fact that a -1 index references the last item allocated on the stack is an implementation detail - if you compile this on other platforms it may or may not show 5
>there's a chance
Yes - there is a chance - depends on the system and almost all modern systems will not allow this
I don't get it. It's clear as day
What's confusing you?
>sizeof() could be a function returning sizeof(A)*sizeof(A[0]) which are stored, or something like that
It turns out that sizeof() is not a function.
The C and C++ language standards specifically say that sizeof() must yield a compile-time constant.
That allows you to do this:
static int a[10];
static int b = sizeof(a);
The compiler must ensure that b is filled with the correct size prior to the execution of ANY user code in the program (i.e. prior to main() being called.)
This means that no compiler can implement sizeof() as a function.
For proof, simply compile the code using the -S flag and look at the assembly code in the .s file. In no case will you ever see a call to a "sizeof" function. Instead, you'll always see the size appear as a hard-coded constant in the assembly code.
> you won't prove anything because you don't know how to decode that space
Strictly speaking, you are correct. That's why I think looking at the generated assembly code is the quickest way to prove it.
If someone is extremely stubbornly skeptical about this (not that there's anything wrong with that), then I would recommend doing the following:
1: Print out all the bytes in all the data spaces (data segment, stack, heap, and registers).
2: Define the array to be int A[2], and run the program.
3: Define the array to be int A[3], and run the program.
4: Define the array to be int A[4], and run the program.
5: Look at all three outputs and compare them.
I give you a 100% guarantee that you will not see any byte that contains a 2, then a 3, then a 4 (or else 8, then 12, then 16, if it's storing the size in bytes). Due to the requirements of the C/C++ language, it simply makes no sense for the compiler to store the sizeof anywhere in data.
Please note that this is applicable ONLY to variables that reside on the stack, or are declared as global or static. Nothing I said here is applicable to variables stored in dynamically-allocated memory.
>almost all modern systems will not allow this
It's still exploitable. See
There are three ints (12 bytes in total) on the stack between two consecutively allocated int arrays on my machine.
What are they?
>there's just a chance that it could point to memory that belongs to another program
I'll assume that you mean "process" instead of "program".
Your theory is possible only if the kernel provides no protection against processes accessing the memory of other processes. (Or if there is no kernel, such as in an embedded system.)
However, if there is a reasonably full-featured kernel (Linux, Windows, etc.) that makes use of the MMU, then the kernel is supposed to prevent any process from accessing the memory of another process -- unless they cooperate with each other and set up shared memory access.
This is really important, because -- for example -- there might be a password that's stored in plaintext in some RAM location. So you don't want people to be able to write user-space programs that can go out and survey all the physical memory trying to harvest the passwords of other users. (Well, they might be able to if they're running as root, so you need to make sure that the root password stays a secret.)
>What are they?
They might be padding. If an array has a size that's not a multiple of 4 (or 8), then you'll often see padding between it and the variable that follows it.
Could the bytes be other local variable?
I would definitely recommend looking at the assembly code to find out. (Compile using -S and then look at the .s file.)
Nice.
The key is the "sub rsp, 24" instruction -- it reserves 24 bytes of space on the stack for use as local data in the function.
But notice that only 16 of those bytes are used for storing the two local variables (a and A). So the other 8 bytes are extra space, maybe for use as temporaries later in the code?
The local variables were arranged so that A[-1] and A[-2] happen to reside in the extra space. Hence, you got A[-1] and A[-2] for free.
Try putting the declaration for "a" after the declaration for "A" and see what happens. In that case, I wouldn't be surprised if A[-1] ends up stored at exactly the same location as "a", clobbering it.
Everything becomes crystal clear once you see the assembly code.
What makes you say this?
This. It's undefined behavior, but it just happens to work with certain compilers and options.
I created an equivalent C version which explicitly uses int main(void), which shouldn't even allocate storage for argv/argc, and it still does the same thing. Also the assembly generated by gcc (with no optimization) for my C version shows that it allocates 64 (!) bytes of function-local stack for main(), even though the code only uses 12 bytes (assuming sizeof(int) = 4). a is stored at rbp-4, A[-1] at rbp-20, A[0] at rbp-16, and A[1] at rbp-12. Basically it looks like the array is stored "backwards" (with negative elements furthest from the stack base) meaning A[2] through A[-10] would be safe, A[3] would overwrite a, and A[-11] would cause a segfault.
That could be optimized into a constant by the compiler, there's no reason why "size" must be stored ANYWHERE at runtime, except when dealing with things like std::vector.
>except when dealing with things like std::vector.
post c++11 vectors are heap things
run this pls
int A1[1]={1234567};
int A2[1];
int i=-1;
while(A2[i] != 1234567){
i-=1;
}
cout << i;
Isn't there a flag for g++ that creates an assembly file? Just look at that.
/*
Program to run.
Copyright (C) 2000+17 user
This program is free software: you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation, either version 3 of the License, or
(at your option) any later version.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License
along with this program. If not, see .
*/
#include <iostream>
int main(){
int A1[1]={1234567};
int A2[1];
int i=-1;
while(A2[i] != 1234567){
i-=1;
}
std::cout << i;
}