Please explain why/how this works

Question

Please explain why/how this works

Hudson Diaz

March 14, 2017 - 20:47

Other urls found in this thread:

insecure.org/stf/smashstack.html
godbolt.org/g/PkND7H

Wyatt Hernandez

It doesn't. The output comes from a different application.

March 14, 2017 - 20:49

Jeremiah Howard

What do you mean, it's super simple. You write to the variable a, then write to the array at indexes -1, 0, and 1, then print all of those. As expected, it prints a, then the values at indexes -1, 0, and 1. The only real issue is that -1 isn't supposed to be a valid index. In C, there's nothing really stopping you from putting negative numbers or completely wrong numbers as the index, there's just a chance that it could point to memory that belongs to another program, or it could modify some part of your code and break it. In this case it doesn't seem to do anything really bad, so you're lucky.
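
Since C will not check the index for you, one option is to route array accesses through a small helper that does. This is only a minimal sketch; checked_get is a made-up name for illustration, not a standard library function:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper (not part of the C library): copies arr[index]
 * into *out and returns true only when index lies inside [0, len). */
bool checked_get(const int *arr, size_t len, long index, int *out)
{
    assert(arr != NULL && out != NULL);
    if (index < 0 || (size_t)index >= len) {
        return false;           /* A[-1] and A[len] both end up here */
    }
    *out = arr[index];
    return true;
}
```

With int A[2], checked_get(A, 2, -1, &v) returns false instead of touching whatever happens to sit in front of the array.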

March 14, 2017 - 20:52

Ryan Martinez

you are modifying the argc/argv variables stored on the stack with the negative index?

March 14, 2017 - 20:53

Jayden Anderson

This guy knows what's up. A[2] declares a pointer to 2 ints worth of statically allocated memory, and A[-1] is just a pointer to 1 int before that. You can write and read that memory just like any other, but it's technically undefined behaviour because it's possible to hit a segfault or whatever since that memory isn't being managed directly by you.

March 14, 2017 - 20:59

Samuel Russell

#include <stdio.h>

int main(int argc, char **argv)
{
    argc = 0;
    int a = 5;
    int A[2];
    A[-1] = 1; /* out of bounds: undefined behaviour */
    A[0] = 2;
    A[1] = 3;

    printf("%d %d %d %d\n", a, A[-1], A[0], A[1]);
    printf("%d\n", argc);
    return 0;
}

output
0
5 1 2 3
1

March 14, 2017 - 20:59

Kevin Carter

there's a printf after argc=0
i don't know how i managed to copy this version

March 14, 2017 - 21:00

Connor Cox

It doesn't if you have sane compiler flags. Add one of the stack protection flags and you'll see you're unable to modify argc through the array.

This is what I get:
$ gcc -fstack-protector {fname}.c && ./a.out
1 1 2 3
0

exp-sgcheck, from the valgrind suite, can help detect these sorts of errors. Unfortunately, it only works if the first access to the array was valid (so not in this case).

March 14, 2017 - 21:21

Noah Nelson

I should also add that you probably want to add const to your parameters. You can set most IDEs to do so automatically.

March 14, 2017 - 21:22

Carter Thompson

>there's just a chance that it could point to memory that belongs to another program
No. You do not know what you're talking about.

March 14, 2017 - 21:26

Caleb Green

this
at least not in 2017 with virtual memory

March 14, 2017 - 22:34

Jeremiah Reyes

Read insecure.org/stf/smashstack.html

Also, stop using C. It's unsafe.

March 14, 2017 - 22:42

Xavier Long

>Please explain why/how this works

A is allocated on the stack.

A is given only 2 ints for its allocation: A[0] and A[1].

There is physically an int at the A[-1] memory location on the stack, but something else is actually allocated at that A[-1] memory address.

It turns out what's allocated at the A[-1] memory address is not crucial for the correct operation of the program. You're overwriting it with 1, but that doesn't prevent the program from working.

A lot of times, the stack grows downward. If this is the case, then A[-1] might even sit "beyond the top of the stack", and therefore nothing is actually allocated to that location. In that case, there must be something about the code-generation of the cout line that avoids clobbering A[-1].

Compile with -S and look at the .s file to see exactly what's going on.

March 15, 2017 - 00:03

Brandon Campbell

do stack arrays keep some kind of information in A[-1]?

March 15, 2017 - 00:10

Camden Moore

>implying C++ is safe

you can still fuck up because boundaries are checked with only a few STL methods

March 15, 2017 - 00:13

Christian Howard

That's where both dynamic and static arrays keep their length. Whenever you declare a stack array or malloc to a heap array, the size gets written to A[-1]. Feel free to read it anytime you need to know how many items an array can hold.

March 15, 2017 - 00:14

Gavin Brown

>implying implications

March 15, 2017 - 00:17

Jason Nelson

For static arrays, that is incorrect. In C++, if you declare int A[2] on the stack and print A[-1], you will see that it does not contain the length of A.

You are correct for dynamic arrays created with malloc.

March 15, 2017 - 00:21

Evan Anderson

technically you can use at() method instead of operator[] but then you have to deal with exceptions

March 15, 2017 - 00:21

Dylan Ortiz

>and print A[-1], you will see
maybe you're just reading it wrong?? what if it's a 64 bit int or something weirder

also WHY THE FUCK do compilers allow negative array indexes?

March 15, 2017 - 01:23

Hudson Diaz

>A[-1]

March 15, 2017 - 01:25

Isaiah Wright

>That's where both dynamic and static arrays keep their length
Is this in the C standard?

March 15, 2017 - 01:27

Levi Price

>WHY THE FUCK do compilers allow negative array indexes?
There's nothing wrong with negative indices per se. The problem is that shit languages like C and C++ misinterpret negative indices. A negative index is supposed to mean an offset from the end of the array, not some risky random memory access that most of the time results in an access violation.

March 15, 2017 - 01:28

Ethan Ross

cout

March 15, 2017 - 01:28

Jeremiah Morris

>sizeof(A)

March 15, 2017 - 01:33

Adrian Adams

Maybe you should learn other languages other than python, pajeet

March 15, 2017 - 01:39

Luis Torres

>misinterpret
wtf ? how is that misinterpreting ? the index is the offset relative to the base address.

March 15, 2017 - 01:42

Justin Gomez

>A[-1]

March 15, 2017 - 01:43

Thomas Mitchell

>A[-1]
wait, what?

March 15, 2017 - 01:47

Brandon Torres

you can output sizeof(A)/sizeof(A[0]) that C fags use to get the size

the point is that it's stored internally somewhere
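
For reference, a sketch of that idiom in plain C (the macro and function names are made up for this example). The count is worked out from the declared type of the array; once the array is handed around as a pointer, sizeof only sees the pointer:

```c
#include <stddef.h>

/* Element count of a true array; evaluated by the compiler, not at run time. */
#define ARRAY_LEN(arr) (sizeof(arr) / sizeof((arr)[0]))

/* Returns 2: A is still an array here, so sizeof sees all of it. */
size_t length_of_local_array(void)
{
    int A[2] = {0, 0};
    return ARRAY_LEN(A);
}

/* Once the array has decayed to a pointer, the length is gone:
 * sizeof(p) is the size of the pointer itself, whatever it points at. */
size_t size_seen_through_pointer(const int *p)
{
    return sizeof(p);
}
```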

March 15, 2017 - 01:48

Jason Taylor

variable alignment

March 15, 2017 - 01:49

Charles Harris

>proof it is stored somewhere

No. Using "cout

March 15, 2017 - 02:40

Easton Sanchez

>maybe you're just reading it wrong?? what if it's a 64 bit int of something weirder

It's good to be skeptical. I would suggest examining all of the 16 bytes prior to the beginning of array A. In other words:

for (int i = 1; i <= 16; i++) printf("%02x ", ((const unsigned char *)A)[-i]);

When you look at all the bytes that are printed, you won't see the size of A located anywhere in there. (If you do, then change the size of A to some large and weird size like 2753, and try again.) It turns out that the compiler doesn't need to store the size anywhere -- it's a constant, and every time you use it (like for example sizeof(A)) the compiler just generates the constant directly into the code.

> also WHY THE FUCK do compilers allow negative array indexes?

In C and C++, array indexing is semantically identical to pointer arithmetic. So, for example, if you have a pointer p that's pointing to the last character of a string, you can use p[-1] to get the second-to-last character of the string. For simplicity, the pointer and array models were merged, so this also works with arrays. Back in the early 1970s, the designers of C really did assume that people writing code knew exactly what the fuck they were doing. For example, for many years, C compilers never bothered to verify that function arguments were of the correct type -- they just assumed that the programmers were competent about those things -- and it wasn't until 1989 that they added argument type checking. They could never have dreamed that 50 years later massive numbers of complete idiots would be programmers.
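
A small sketch of the string case, where the negative index is perfectly legal because the pointer sits in the middle of a valid object. second_to_last is a name invented for this example:

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: returns the second-to-last character of s.
 * Requires strlen(s) >= 2 so that p[-1] stays inside the string. */
char second_to_last(const char *s)
{
    size_t len = strlen(s);
    assert(len >= 2);
    const char *p = s + len - 1;   /* p points at the last character */
    assert(p[-1] == *(p - 1));     /* indexing is just pointer arithmetic */
    return p[-1];
}
```

second_to_last("hello") gives 'l'. The same p[-1] on a pointer to the first element would be out of bounds, which is the situation in the original question.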

Also, there are other languages that explicitly allow negative indexes. For example, in Pascal, you can declare an array as: var a : array[-10..10] of integer; -- meaning that you can index everything from a[-10] to a[10]. During code generation, the compiler simply adds the constant 10 to the address of the array to compensate for this.
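
In C you can get roughly the same effect by hand. The sketch below is an assumption about how one might emulate the Pascal declaration, not what any Pascal compiler literally emits; LOW, HIGH, storage and cell are illustrative names:

```c
#include <assert.h>

#define LOW  (-10)
#define HIGH 10

/* 21 ints of real storage for the logical range a[-10] .. a[10]. */
static int storage[HIGH - LOW + 1];

/* Hypothetical accessor: maps a logical index in [LOW, HIGH] onto storage
 * by subtracting LOW, i.e. adding the constant 10 to the index. */
int *cell(int index)
{
    assert(index >= LOW && index <= HIGH);
    return &storage[index - LOW];
}
```

Here *cell(-10) is storage[0] and *cell(10) is storage[20], so every "negative" access stays inside the array.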

March 15, 2017 - 03:01

Christopher Peterson

>Try to obtain the address of sizeof(A)

sizeof() could be a function returning sizeof(A)*sizeof(A[0]) which are stored, or something like that

>Did you actually test printing A[-1] for a static array?

you won't prove anything because you don't know how to decode that space

consider that it could be a 64 bit int in A[-2] and A[-1] while you're printing only A[-1] and a 32 bit int
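
One way around the decoding problem is to look at raw bytes instead of guessing a type. A minimal sketch, assuming a hypothetical helper called hex_dump that formats a byte range into a caller-supplied buffer:

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: writes n bytes starting at p into out as
 * two-digit hex values, each followed by a space.
 * out must have room for 3 * n + 1 chars.
 * Reading any object through unsigned char * is allowed in C. */
void hex_dump(const void *p, size_t n, char *out)
{
    assert(p != NULL && out != NULL);
    const unsigned char *bytes = p;
    out[0] = '\0';
    for (size_t i = 0; i < n; i++) {
        sprintf(out + 3 * i, "%02x ", (unsigned)bytes[i]);
    }
}
```

Pointing it at an object you own (for example a struct that holds both a marker value and the array) is well defined. Pointing it at the bytes in front of a standalone array is the same out-of-bounds read discussed in this thread, so whatever it prints there is specific to the compiler and flags.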

March 15, 2017 - 03:03

Julian Adams

>the point is that it's stored internally somewhere

Correct. It's stored as an immediate argument in the machine code. However, it's not stored in the data segment, stack, or heap. In other words, it's stored in a way that does not allow you to take the address of it. Hence, it will never be located at A[-1] or other data location.

Now, this applies only to stack variables and global variables. The rules are different for dynamically-allocated variables. For example, if you dynamically allocate int*p=malloc(24) then you might very well find that p[-1] or p[-2] contains the number 24 -- but it's not recommended that you rely on that fact if you want your code to be portable.

March 15, 2017 - 03:11

Blake Taylor

No he's still pointing to his program, it would usually be the frame pointer or the return address, but I'm not so sure with main.

March 15, 2017 - 03:14

Camden Carter

It's worth noting that there's nothing in the C or C++ standards about HOW memory is allocated - a -1 index is undefined behaviour

the fact that a -1 index references the last item allocated on the stack is an implementation detail - if you compile this on other platforms it may or may not show 5

March 15, 2017 - 03:15

Lincoln Parker

>there's a chance

Yes - there is a chance - depends on the system and almost all modern systems will not allow this

March 15, 2017 - 03:18

Angel Wood

I don't get it. It's clear as day
What's confusing you?

March 15, 2017 - 03:19

Justin Baker

>sizeof() could be a function returning sizeof(A)*sizeof(A[0]) which are stored, or something like that

It turns out that sizeof() is not a function.

The C and C++ language standards specifically say that sizeof() must yield a compile-time constant (the one exception is a C99 variable-length array, which does not apply here).

That allows you to do this:

static int a[10];
static int b = sizeof(a);

The compiler must ensure that b is filled with the correct size prior to the execution of ANY user code in the program (i.e. prior to main() being called.)

This means that no compiler can implement sizeof() as a function.

For proof, simply compile the code using the -S flag and look at the assembly code in the .s file. In no case will you ever see a call to a "sizeof" function. Instead, you'll always see the size appear as a hard-coded constant in the assembly code.
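
If you don't feel like reading assembly, the compiler can be made to prove the same point. A sketch assuming a C11 compiler (for _Static_assert); the names other than a and b are made up:

```c
#include <stddef.h>

static int a[10];

/* A file-scope array dimension must be a constant expression,
 * so this line compiles only because sizeof(a) is known at compile time. */
static char same_size_as_a[sizeof(a)];

/* C11 compile-time check: no code runs to evaluate this. */
_Static_assert(sizeof(a) == 10 * sizeof(int), "sizeof(a) is 10 ints");

/* Initialized before main() starts. */
static size_t b = sizeof(a);

size_t size_of_a(void)
{
    return b;
}

size_t size_of_copy(void)
{
    return sizeof(same_size_as_a);
}
```

If sizeof were a runtime function call, neither the array dimension nor the _Static_assert would be accepted.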

> you won't prove anything because you don't know how to decode that space

Strictly speaking, you are correct. That's why I think looking at the generated assembly code is the quickest way to prove it.

If someone is extremely stubbornly skeptical about this (not that there's anything wrong with that), then I would recommend doing the following:

1: Print out all the bytes in all the data spaces (data segment, stack, heap, and registers).
2: Define the array to be int A[2], and run the program.
3: Define the array to be int A[3], and run the program.
4: Define the array to be int A[4], and run the program.
5: Look at all three outputs and compare them.

I give you a 100% guarantee that you will not see any byte that contains a 2, then a 3, then a 4 (or else 8, then 12, then 16, if it's storing the size in bytes). Due to the requirements of the C/C++ language, it simply makes no sense for the compiler to store the sizeof anywhere in data.

Please note that this is applicable ONLY to variables that reside on the stack, or are declared as global or static. Nothing I said here is applicable to variables stored in dynamically-allocated memory.

March 15, 2017 - 03:30

Christian Morgan

>almost all modern systems will not allow this
It's still exploitable. See

March 15, 2017 - 03:34

Noah Phillips

There are three ints (12 bytes in total) on the stack between two consecutively allocated int arrays on my machine.

What are they?

March 15, 2017 - 03:34

Ian Gray

godbolt.org/g/PkND7H

March 15, 2017 - 03:37

Aaron Sanders

>there's just a chance that it could point to memory that belongs to another program

I'll assume that you mean "process" instead of "program".

Your theory is possible only if the kernel provides no protection against processes accessing the memory of other processes. (Or if there is no kernel, such as in an embedded system.)

However, if there is a reasonably full-featured kernel (Linux, Windows, etc.) that makes use of the MMU, then the kernel is supposed to prevent any process from accessing the memory of another process -- unless they cooperate with each other and set up shared memory access.

This is really important, because -- for example -- there might be a password that's stored in plaintext in some RAM location. So you don't want people to be able to write user-space programs that can go out and survey all the physical memory trying to harvest the passwords of other users. (Well, they might be able to if they're running as root, so you need to make sure that the root password stays a secret.)
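
A rough way to see separate address spaces in action, assuming a POSIX system with fork(). It only shows that a child's write to its copy of a variable never reaches the parent; it is not a test of the kernel's protection mechanisms, and child_write_is_invisible is a name invented for this sketch:

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Returns 1 if a write made by the child process is invisible to the
 * parent, 0 if the parent saw it, and -1 if fork() or waitpid() failed. */
int child_write_is_invisible(void)
{
    volatile int secret = 42;

    pid_t pid = fork();
    if (pid < 0) {
        return -1;
    }
    if (pid == 0) {
        secret = 1337;   /* same virtual address, but the child's own copy */
        _exit(0);
    }

    int status = 0;
    if (waitpid(pid, &status, 0) < 0) {
        return -1;
    }
    return secret == 42 ? 1 : 0;
}
```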

March 15, 2017 - 03:39

Isaac Brown

>What are they?

They might be padding. If an array has a size that's not a multiple of 4 (or 8), then you'll often see padding between it and the variable that follows it.

Could the bytes be other local variables?

I would definitely recommend looking at the assembly code to find out. (Compile using -S and then look at the .s file.)

March 15, 2017 - 03:45

Christopher Foster

Nice.

The key is the "sub rsp, 24" instruction -- it reserves 24 bytes of space on the stack for use as local data in the function.

But notice that only 16 of those bytes are used for storing the two local variables (a and A). So the other 8 bytes are extra space, maybe for use as temporaries later in the code?

The local variables were arranged so that A[-1] and A[-2] happen to reside in the extra space. Hence, you got A[-1] and A[-2] for free.

Try putting the declaration for "a" after the declaration for "A" and see what happens. In that case, I wouldn't be surprised if A[-1] ends up stored at exactly the same location as "a", clobbering it.

Everything becomes crystal clear once you see the assembly code.
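
If you would rather check the layout from inside the program, something like the following sketch prints where a and A ended up. byte_offset and print_layout are hypothetical names, and the numbers will differ between compilers, flags and platforms:

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: distance in bytes from one address to another.
 * Going through intptr_t makes this implementation-defined rather than
 * undefined when the two pointers belong to different objects. */
ptrdiff_t byte_offset(const void *from, const void *to)
{
    return (ptrdiff_t)((intptr_t)to - (intptr_t)from);
}

/* Prints where the compiler placed a and A in this particular build. */
void print_layout(void)
{
    int a = 5;
    int A[2] = {2, 3};
    printf("&a    = %p\n", (void *)&a);
    printf("&A[0] = %p\n", (void *)&A[0]);
    printf("&A[1] = %p\n", (void *)&A[1]);
    printf("A[0] -> a: %td bytes\n", byte_offset(&A[0], &a));
}
```

If the last line reports minus sizeof(int), then A[-1] lands exactly on a in that build; any other value means something else (or nothing in particular) sits in front of the array.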

March 15, 2017 - 03:54

Brandon Sanchez

What makes you say this?

This. It's undefined behavior, but it just happens to work with certain compilers and options.

I created an equivalent C version which explicitly uses int main(void), which shouldn't even allocate storage for argv/argc, and it still does the same thing. Also the assembly generated by gcc (with no optimization) for my C version shows that it allocates 64 (!) bytes of function-local stack for main(), even though the code only uses 12 bytes (assuming sizeof(int) = 4). a is stored at rbp-4, A[-1] at rbp-20, A[0] at rbp-16, and A[1] at rbp-12. Basically it looks like the array is stored "backwards" (with negative elements furthest from the stack base) meaning A[2] through A[-10] would be safe, A[3] would overwrite a, and A[-11] would cause a segfault.

That could be optimized into a constant by the compiler, there's no reason why "size" must be stored ANYWHERE at runtime, except when dealing with things like std::vector.

March 15, 2017 - 03:58

Adam Robinson

>except when dealing with things like std::vector.
post c++11 vectors are heap things

March 15, 2017 - 04:41

Dominic Brooks

run this pls
int A1[1]={1234567};
int A2[1];

int i=-1;
while(A2[i] != 1234567){
i-=1;
}
cout

March 15, 2017 - 04:44

Isaac James

Isn't there a flag for g++ that creates an assembly file? Just look at that.

March 15, 2017 - 04:56

Carter Rodriguez

This program is free software: you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation, either version 3 of the License, or
(at your option) any later version.

This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.

You should have received a copy of the GNU General Public License
along with this program. If not, see .
*/

#include <iostream>

int main(){
int A1[1]={1234567};
int A2[1];

int i=-1;
while(A2[i] != 1234567){
i-=1;
}
std::cout

March 15, 2017 - 05:06
