Reminder that there is literally NO reason to use linked lists anymore. It's 2016.


what to use instead?

fuck off faggot, linked lists are useful

not him but for what? is there something that lists do better than any other data structure?

A nice object that allows dynamic resizing and which the garbage collector destroys after it is no longer needed.

Array lists/vectors. Any performance lost by the resizing/shifting will be made up for by not having a billion cache misses.

Linked lists have a yuuuge advantage over arrays when it comes to inserting and deleting elements.

BUT it's 2016. On-CPU cache is positively yuuuuuge. Turns out, in 2016, optimizing for cache hits is more important than a lot of traditional optimizations.

I mean, if we're talking about proper O(n), where we assume yuuuuuge values of n, then the classic analysis makes sense. But if we're talking about normal data sets, then damn, it's probably going to make more sense to use a simple block of memory. If you need something more specialized than an array, then you're *probably* looking for a specific capability from your data structure, like a tree or a trie.

Does that make sense? Unless a 'fancy' data structure particularly lends itself to the specific problem you're solving (e.g. a trie used to hold a dictionary for word lookup), you're probably going to end up sticking to arrays.

>Linked lists have a yuuuge advantage over arrays when it comes to inserting and deleting elements.

wouldn't reallocating and doubling in size when the array becomes full still be amortized O(1) and maintain the spatial locality advantage over lists?

>yuuuuge

ELI5: How do array lists/vectors avoid cache misses?

I can understand how a linked list can cause these issues, but not how they can be avoided by using array lists/vectors.

Is it because the CPU always loads the whole array list/vector into its cache? (Maybe not always, depending on the size of it)

What about inserting/deleting elements in the middle? Which is usually the advantage people are describing when talking about linked lists since an array will require moving other elements around.

std::vector

This. Nothing even comes close to linked lists in terms of removing and adding shit at runtime.

For?

Anywhere a linked list would be "useful" a stack would be equivalent in functionality while being way more efficient.

>yuuuge
Fuck off.

About deleting, I would just leave an empty entry with some kind of flag value that signals the value is to be skipped.
After a certain threshold, reallocate; that keeps things O(1).
On inserting in the middle it depends on the problem, i.e. if you want your new element in that exact position but don't care about the rest you can keep it cool and O(1); if you want to maintain order I think a list is actually better.
[spoiler]
It would be cool to have a problem where the chance of having to insert in the middle is 1/n, so reallocating just to insert a new element would still be amortized O(1) :D.
[/spoiler]
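
Something like this, roughly (an untested sketch; TombstoneVec and the 50% threshold are made up):

#include <algorithm>
#include <cstddef>
#include <optional>
#include <vector>

// Tombstone deletion: flag slots dead instead of shifting, then
// compact once too many slots are wasted.
template <typename T>
struct TombstoneVec {
    std::vector<std::optional<T>> slots;  // nullopt = "skip me"
    std::size_t dead = 0;

    void push(T v) { slots.emplace_back(std::move(v)); }

    void erase(std::size_t i) {  // O(1): just flag the slot
        if (slots[i]) { slots[i].reset(); ++dead; }
        if (dead * 2 > slots.size()) compact();  // arbitrary threshold
    }

    void compact() {  // O(n), but amortized away by the threshold
        slots.erase(std::remove_if(slots.begin(), slots.end(),
                                   [](auto &s) { return !s; }),
                    slots.end());
        dead = 0;
    }
};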

yes, but O(1) with a cache miss on every element is often slower in practice than O(n) over contiguous memory when the dataset isn't enormous.

Memory latency is insane compared to computation time in most applications.

If you're iterating over a linked list, you could end up with a cache miss for each element because they are not stored contiguously. For an array list, because the data is contiguous, each cache miss also causes the next few items to be loaded into the cache, meaning the next few iterations will be cache hits = very fast.

That's the general idea.
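
To see why, picture the two loops (a sketch; the actual speedup depends on your hardware):

#include <list>
#include <numeric>
#include <vector>

long sum_vector(const std::vector<int> &v) {
    // Contiguous: one 64-byte cache line holds ~16 ints, and the
    // prefetcher streams the following lines ahead of the loop.
    return std::accumulate(v.begin(), v.end(), 0L);
}

long sum_list(const std::list<int> &l) {
    // Every node is its own allocation; each ->next hop can be a miss.
    return std::accumulate(l.begin(), l.end(), 0L);
}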

The only time I've ever used a linked list is for a time-bounded, variable-length list that gets significantly more insertions and deletions than traversals

N.B. there are some very niche cases where linked lists may be more effective, e.g. heavy inserts/deletions in the middle of a list. But for practically everything else they are worse.

Linked lists are used all over the place in OS kernels where over-allocation for dynamic arrays is simply a terrible idea. If you're deciding between linked lists and arrays in Java or C#, though, of course it would be almost always better to use arrays.

Thanks user

you can implement linked lists with arrays.

What comes after 0,1,2,3,4? 5,6,7,8
What comes after chain->next->next->next->next?

A CPU always loads an entire cache line from RAM. Usually a cache line is 64 bytes long. The CPU can also prefetch multiple cache lines at once, because it predicts that when you access cache line x you will probably also access cache line x+1.
With a linked list the CPU has to dereference a pointer for every element sequentially, and access RAM if the linked list cells are not allocated contiguously. Not only does dereferencing a pointer usually suffer the latency of a cache miss, it also wastes bandwidth and RAM capacity.

Then there is also the fact that you can't slice a linked list into segments of size m and distribute the workload onto multiple cores without first iterating over (list length - m) elements. You also cannot take advantage of SIMD unless you process multiple linked lists at once.

The only benefit of linked lists is O(1) insertion and deletion, if you already have a reference to the element.

There are alternatives to a simple vector if you want that. For example, say you want to cheaply push to the front of a vector and pop from the end. Just use a double-ended queue with one reversed array for the front and one normal array for the back. When popping and the back array is empty, simply move the front elements into the back array. This operation is O(n), but if you access all n elements it amortizes to O(1) per element.
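
That deque, as a rough sketch (TwoStackQueue is a made-up name; untested):

#include <iterator>
#include <utility>
#include <vector>

// Logical order is reverse(front_rev) followed by back.
template <typename T>
struct TwoStackQueue {
    std::vector<T> front_rev;  // front of the queue, stored reversed
    std::vector<T> back;       // back of the queue, stored in order

    void push_front(T v) { front_rev.push_back(std::move(v)); }  // O(1)

    T pop_back() {  // assumes the queue is not empty
        if (back.empty()) {
            // O(n) refill, but each element is moved at most once,
            // so it amortizes to O(1) per pop.
            back.assign(std::make_move_iterator(front_rev.rbegin()),
                        std::make_move_iterator(front_rev.rend()));
            front_rev.clear();
        }
        T v = std::move(back.back());
        back.pop_back();
        return v;
    }
};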

You can also cheaply delete an element in the middle of a vector by swapping it with the last element and then deleting the last element. This obviously doesn't preserve the order but if you care about order you should just do several bulk delete/insert operations and sort the vector after the fact.
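
The swap-and-pop delete is only a couple of lines (a sketch, assuming you genuinely don't care about order):

#include <cstddef>
#include <utility>
#include <vector>

// O(1) unordered erase: overwrite the victim with the last element.
template <typename T>
void swap_remove(std::vector<T> &v, std::size_t i) {
    if (i + 1 != v.size())           // guard against self-move
        v[i] = std::move(v.back());  // order is not preserved
    v.pop_back();
}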

Don't be so grouchy. One must adopt the parlance of one's time.

>About deleting, I would just leave an empty entry with some kind of flag value that signals the value is to be skipped.

Eh... That's going to be a pain.

If I want a value out of an array, I just say MyArray[n]. In your case, I have to say, "while (n < length && MyArray[n].flag != valid) n += 1; if (n == length) return ERROR; x = MyArray[n].value;"

That would be really slow and cumbersome. A simple lookup, instead of being O(1), would now be O(n) in the worst case. Zero performance gain, with an O(n) hit on every indexed access.

You are talking about small things though.

If you check what linked lists are used for in Linux, you will see that task_struct sits on multiple linked lists.
Since the struct is so huge and there are no SIMD instructions that can handle it, most of your arguments just don't apply.

Linked lists are the worst data structure

Nice arguments.

Because they just werk. If I wanted to use an array, I'd use one.

They work just like bogosort works.

yeah go fuck yourself. i guess O(n) insert and delete are okay with you then?

It's a tradeoff between flexibility with inserts/removals and access time.

Use the correct one for the job.

Both linked lists and arrays can easily support O(1) insert. Linked lists can even support O(1) delete if you maintain pointers to objects rather than indices. The *only* thing strictly O(n) in a linked list is search, which is also O(n) in an array.

Glad to see that at least one other person appreciates that the Linux kernel (and presumably any closed-source OS) doesn't allocate double the necessary memory just for supposed cache locality on a dynamically-sized list.

Anyone who doesn't know when/why linked lists are good, just hasn't done enough programming to be qualified to speak.

how can you insert in constant time in an array while preserving order?

>arrays can easily support O(1) insert.

WTF are you talking about?

> if you maintain pointers to objects rather than indices

Uh. Pointers and indices are the same thing.

>The *only* thing strictly O(n) in a linked list is search

And you have to search in order to do an insert or delete.

I was assuming that order wasn't preserved on insert, so yeah, if order-preserving insert is important then linked lists are probably a good choice, since an array's would be O(n).

>>arrays can easily support O(1) insert.
>WTF are you talking about?
Append an element to the end of the array, given sufficient space.

>Uh. Pointers and indices are the same thing.
They absolutely are not.

>you have to search in order to do an insert or delete.
I'm assuming you actually use the objects you store in an array at some point. If you want to delete an object you just finished using, you won't need to look it up a second time. Would you seriously expect your OS to repeat its search of a process list every time a process exits? Would you expect it to repeat its search of a list of allocations every time you free memory? In both cases it stores a pointer (not an index) in a different location, and then uses that pointer to directly access an entry in a linked list.

Containers/collections

Just use Python lists, they are waaaaaay easier to work with :/

Why hasn't this been pointed out by anyone else? Behind the scenes, many Lisp interpreters will make optimizations like this.

No. If you can't expand the region of memory to the proper size you will have to do a copy, so the worst-case scenario is O(n).
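
Amortized it's still O(1) per push, though: with doubling, each element gets copied at most a constant number of times on average. The usual scheme, as a bare-bones sketch (IntVec is made up; error handling omitted):

#include <cstddef>
#include <cstdlib>

// A single push may trigger an O(n) realloc+copy, but n pushes cost
// O(n) total, i.e. amortized O(1) each.
struct IntVec {
    int *data = nullptr;
    std::size_t len = 0, cap = 0;

    void push(int v) {
        if (len == cap) {
            cap = cap ? cap * 2 : 8;  // double (or start with 8 slots)
            data = static_cast<int *>(std::realloc(data, cap * sizeof(int)));
        }
        data[len++] = v;
    }
};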

Linked lists rock:

(* Insert x into sorted list l, keeping it sorted. *)
let insert l x =
  let rec loop accu = function
    | [] -> List.rev_append accu [x]
    | y :: ys as l ->
      if x < y then
        List.rev_append accu (x :: l)
      else
        loop (y :: accu) ys
  in
  loop [] l
;;

(* Insertion sort as a left fold. *)
let sort = List.fold_left insert [];;

let print_list l =
  begin
    match l with
    | [] -> ()
    | x :: xs ->
      Printf.printf
        "%d%a" x (fun ppf -> List.iter (Printf.fprintf ppf " %d")) xs
  end;
  print_newline ()
;;

let main () =
  let l = [3; 1; 4; 1; 5] in
  print_list l;
  let l = sort l in
  print_list l
;;

let () = main ();;

what about using them as reified iterators?

You can store linked lists contiguously.

What's the speed impact for a linked list? Or does it depend on how contiguous the data is?

Is this OCaml?

Well maybe if you're a pajeet coding StartUp.io's webpage.

>NO reason
Actually, linked lists are pretty common in memory allocation algorithms, where it is impractical to have a contiguous data structure to keep track of free blocks.

codeofhonor.com/blog/avoiding-game-crashes-related-to-linked-lists
describes an intrusive list that allows O(1) delete.
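
The gist of the intrusive approach, roughly (a sketch, not the article's actual code; Player and the field names are made up):

// The links live inside the object itself, so given a pointer to the
// object you can unlink it in O(1), with no search and no allocation.
struct Link {
    Link *prev = nullptr, *next = nullptr;

    void unlink() {
        if (prev) prev->next = next;
        if (next) next->prev = prev;
        prev = next = nullptr;
    }
};

struct Player {
    Link team_link;   // one object can sit on several lists at once,
    Link alive_link;  // one embedded Link per list
    // ... payload ...
};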

They're good for inserting and deleting elements without having to shift every other element... but that advantage is often eclipsed by the cost of linear traversal/access.

If you're dealing with a stack or queue-like structure where you want to grow and shrink only at one end of a structure, then maybe linked lists are useful... but that's about all I can think of, honestly.

>what is a bitmap, asked no pajeet ever

>If you're dealing with a stack or queue-like structure where you want to grow and shrink only at one end of a structure, then maybe linked lists are useful... but that's about all I can think of, honestly.
An array list will outperform a linked list in that scenario.

What happens when an "array list" exceeds its current capacity?

You allocate a new one and move old entries lazily.

Then why wouldn't a linked-list queue out-perform an array list queue, since it would avoid copying?

You splice it up and add another array to the chain (array-list chain links tend to be fixed-size arrays).

Array lists are used for reasons beyond just cache-line behavior, too - for example, indexing is O(N/cellsize) instead of O(N).

The main disadvantage is that generic operations are more complex (however, tailored sequential operations are still much faster).

tl;dr: There is no holy grail, all structures have different tradeoffs - be it b-trees, array lists, skip lists... one isn't inherently better than another.

Because the address of the node is not at a predictable location.

I don't know why we dont just make arrays that are around 50mb or so and call it a day

What computer doesn't have a minimum of 2gb these days? You could hold 40 fucking arrays with that. Who the fuck needs more than that

Memory size isn't the problem, memory bandwidth is. If you want to delete an element in the middle of a 2 GB array, you need to move 1 GB of memory.

There's no indexed access in this case, so why is that a problem?

user, if every programmer thought that way, our gigabytes of memory would go away very quickly. Stop pretending like your application is going to be the only one running on a system, or even that it's going to be one of only a few.

If you want to delete an element in the middle of a 2 GB linked list, you would have had to traverse up to 1 GB of non-cache-optimized memory to find the element you wanted to delete.

Linked lists are bad if you allocate each node separately.

Linked lists are good if you have something like the inside of malloc where the free list is one big array.
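
That array-backed flavor looks roughly like this (a sketch; PoolList and the field names are made up):

#include <cstddef>
#include <utility>
#include <vector>

// Linked list whose nodes all live in one vector: links are indices,
// so nodes stay contiguous and freed slots are recycled via a free list.
template <typename T>
struct PoolList {
    static constexpr std::size_t npos = static_cast<std::size_t>(-1);
    struct Node { T value; std::size_t next; };

    std::vector<Node> pool;
    std::size_t head = npos;
    std::size_t free_head = npos;

    std::size_t push_front(T v) {
        std::size_t i;
        if (free_head != npos) {     // reuse a freed slot
            i = free_head;
            free_head = pool[i].next;
            pool[i] = {std::move(v), head};
        } else {                     // or grow the pool
            i = pool.size();
            pool.push_back({std::move(v), head});
        }
        head = i;
        return i;
    }

    void pop_front() {               // assumes the list is not empty
        std::size_t i = head;
        head = pool[i].next;
        pool[i].next = free_head;    // slot goes onto the free list
        free_head = i;
    }
};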

oooor you would already have a reference to the element

And where did you magically get that reference?

the grace of God

Nice memeing.

Linked lists are fantastic for iterating through the list starting from the root.

They are not meant for random access.
I use them mainly for function callbacks.

Say I am making a game and I want to have teams. I would have a list for each team. Now I can call a function for every player of a given team.

If you want to move someone to the blue team, it is as easy as calling two functions:
list_del(&red, ply);
list_add(&blue, ply);


This makes for easy programming in C

fuck vectors and fuck white people

They already do. Years-old software runs smoother than its current-day counterparts.

>I don't understand build/fold fusion.

Linked lists are the only sane way to manage some resources in the kernel and other highly performance-critical concurrent systems, where O(1) vs O(n) traversal isn't as important as minimizing or eliminating locking times.

>learn linked lists in college
>no need to ever use them again

really useful

Yeah, because performance isn't a thing in 2016. Plz never get a job in software engineering.

They're a good subject for warm-up interview questions.

> Find cycles in linked list (with and without O(1) memory)
> Find the intersecting node of two linked lists (same memory restrictions)
> Reverse a linked list in place (recursive (easy) and iterative (still pretty easy))
> Implement a stack and a queue with a linked list
> Make an LRU cache (good test of hash tables + linked lists; see the sketch below)

All questions that should be pretty easy to figure out if you have a formal education in CS. Good way to weed out self-taughts and EEs if they aren't up to snuff.
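
For reference, the LRU one usually comes out something like this (a sketch, untested; it leans on std::list::splice being O(1) and not invalidating iterators):

#include <cstddef>
#include <list>
#include <optional>
#include <unordered_map>
#include <utility>

// LRU cache: the list keeps recency order (front = most recent),
// the hash map gives O(1) lookup of a node's position in the list.
struct LRUCache {
    std::size_t cap;  // assumed > 0
    std::list<std::pair<int, int>> order;  // (key, value) pairs
    std::unordered_map<int, std::list<std::pair<int, int>>::iterator> index;

    explicit LRUCache(std::size_t c) : cap(c) {}

    std::optional<int> get(int key) {
        auto it = index.find(key);
        if (it == index.end()) return std::nullopt;
        order.splice(order.begin(), order, it->second);  // move to front, O(1)
        return it->second->second;
    }

    void put(int key, int value) {
        if (auto it = index.find(key); it != index.end()) {
            it->second->second = value;
            order.splice(order.begin(), order, it->second);
            return;
        }
        if (order.size() == cap) {  // evict the least recently used
            index.erase(order.back().first);
            order.pop_back();
        }
        order.emplace_front(key, value);
        index[key] = order.begin();
    }
};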

>Find cycles in linked list --- O(1) memory
>Implement --- queue with a linked list
I don't see how these are possible unless your definition of linked list covers double links or skip links.

>learned programming in college
>went on to become a pajeet in a dead-end cube farm job

FTFY

Finding cycles in O(1) memory is possible if you're willing to be hugely time inefficient:
>Keep two pointers, A and B, both initialized to the start of the list
>Advance B one node per round; on round k, walk A forward from the start of the list for at most k steps
>If A lands on pointer B before taking all k steps, the same node appears at two positions: the list has a cycle
>If B reaches the end of the list without that ever happening, there is no cycle
That's O(n^2) time with constant memory.

A queue with a linked list is much easier. Keep pointers to the head and tail elements of the list. On insert, just set the tail's next pointer to the new element, and then advance the tail. On remove/dequeue, set the head pointer to the head's next entry, and then return the element that was at the head.

Cycles can be found with O(1) memory by using two pointers, called fast and slow. Slow moves one node at a time (node = node.next) and fast moves two at a time (node = node.next.next). If fast reaches the end, there are obviously no cycles; if fast is ever == to slow, then there is a cycle. These are the only two possibilities.

Implementing a queue can be done with a singly linked list. It's useful (but not necessary) to have a pointer to the tail of the linked list. Pop off the head (temp = node.val, node = node.next, return temp), push new values onto the tail.
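
In code the fast/slow version is tiny (a sketch):

// Floyd's cycle detection: O(n) time, O(1) memory.
struct Node { int val; Node *next; };

bool has_cycle(const Node *head) {
    const Node *slow = head, *fast = head;
    while (fast && fast->next) {
        slow = slow->next;              // one step
        fast = fast->next->next;        // two steps
        if (slow == fast) return true;  // fast lapped slow: cycle
    }
    return false;  // fast fell off the end: no cycle
}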

You son of a bitch, you beat me by a second

Guess that means I rank higher at the next google interview where we're asked to code this inane bullshit on a whiteboard!

I thought there were supposed to be solutions that are not horribly inefficient. I guess I was wrong.

>I don't know why we dont just make arrays that are around 50mb or so and call it a day
Well, if you find a CPU with an L2 cache of 50 MB, call me.

CPUs cache memory in a way that favors memory locations close to each other. If you access an array it will cache most if not all of it in its L2/L3 cache (depending on architecture), making repeated access fast.

>have big heap
>get root element
>congratulations, you have to copy the whole damn heap instead of changing a few pointers

I just learned this shit, don't tell me that

OP... are you a complete moron or something?

bst

Ignorance: the post. It's only true if you're dealing with small arrays where you won't see any real difference.

It still matters for large arrays or high performance applications. Make a 2 GB array. Now insert an element in the middle of it. Now you have to shift 1 GB of stuff over by one. Do the same with a linked list and you have to write 2 pointers and 1 element.

Choose the data structure that fits your use case. Use an array if you need to quickly access the nth element. Use a linked list if you need to insert at arbitrary positions a lot. Use a linked list with an array of partway-through pointers if you need to speed up nth-element lookups in a linked list. A linked list of small arrays can be a good choice as well in many situations.

which is basically a linked list/array/tree with more syntactic sugar

Adjacent elements in a vector are contiguous in memory; each element of a linked list can be in an arbitrary location in RAM, so in the worst case you get a cache miss every time you follow the link to the next element.

Unrolled linked lists (each link stores a small array of elements instead of a single element per link) can reduce this issue
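
An unrolled node is basically this (a sketch; the chunk size of 16 is arbitrary):

#include <array>
#include <cstddef>

// Each link carries a small array, so one allocation / cache line
// serves many elements instead of one.
template <typename T, std::size_t N = 16>
struct UnrolledNode {
    std::array<T, N> items;
    std::size_t count = 0;  // slots in use
    UnrolledNode *next = nullptr;
};

template <typename T, std::size_t N>
T sum(const UnrolledNode<T, N> *head) {
    T total{};
    for (auto *n = head; n; n = n->next)  // one pointer chase per N items
        for (std::size_t i = 0; i < n->count; ++i)
            total += n->items[i];
    return total;
}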

Please kill yourself.

Hash tables

How do you get that reference? Yeah, that's right.

Get raped and kill yourself, you retarded fucking faggot sack of shit with down syndrome.

I used to be a C programmer like you, and I used to use naive linked list implementations like you, because I couldn't be bothered to implement an efficient vector data structure.
Then I moved to C++.

Enjoy your shitty, inefficient, non-cache-optimized code.

This. Until a perfect hashing function exists.

2 kinds of people itt
>cs babbies who hated that lecture where they had to write linked lists and not just copy and paste
>people who know their shit and know when to use linked lists

I realized that linked lists use more memory than dynamic arrays.

Well, it's a tradeoff when you're restricted to using less memory. If you had more, it might be possible to do it in less time, but the list cycle algorithm is O(N), and the fast pointer will at most go around the cycle twice. Imagine the two pointers racing each other around a track. You're just measuring when the faster one laps the slower one.

Plus, if it's a whiteboard interview, having a working solution first and then improving it is best.

>fast reaches the end
The what?

Unary trees are worse.