What is a good way to determine if a URL link is a PDF in Python without looking at the file extension?

Question

What is a good way to determine if a URL link is a PDF in Python without looking at the file extension?

Asher Bell

January 18, 2018 - 15:03

Asher Rogers

Load the first few bytes of the URL and look for the PDF file format magic number.
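For anyone unsure what that looks like in Python, here is a minimal sketch using only the standard library. It assumes the file header sits at byte 0 and starts with `%PDF-`, which is what the PDF reference describes. Some readers tolerate a little junk before the header, so a strict check like this can reject files that a viewer would still open. The names `has_pdf_magic` and `url_is_pdf` are made up for illustration.

```python
import urllib.request

PDF_MAGIC = b"%PDF-"  # the PDF file header begins with these bytes


def has_pdf_magic(data: bytes) -> bool:
    """Return True if the bytes begin with the PDF header marker."""
    return data.startswith(PDF_MAGIC)


def url_is_pdf(url: str, timeout: float = 10.0) -> bool:
    """Fetch only the first few bytes of the response body and test them."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        # read(n) returns at most n bytes; the rest of the body is never read
        return has_pdf_magic(response.read(len(PDF_MAGIC)))
```

Calling `url_is_pdf("https://example.com/some/link")` (a placeholder URL) would return True only if the body really starts with the PDF marker, whatever the link is named.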

January 18, 2018 - 15:05

John Carter

look at the header, you nerd.

adobe.com/devnet/pdf/pdf_reference.html

January 18, 2018 - 15:05

Luke Butler

Bytes of the literal string size? I don't know what you mean by load

January 18, 2018 - 15:09

Jackson Butler

>I don't know what you mean by load
You might lack some basic understanding of how computers work to finish your homework assignment.

January 18, 2018 - 16:02

Christopher Nguyen

In the context of Python.

January 18, 2018 - 16:09

Asher Sanders

Use the mimetypes library
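One caveat, assuming this means the standard library's `mimetypes` module: `guess_type()` works from the name alone and never opens the URL, so it is effectively an extension check. A quick sketch with placeholder example.com URLs:

```python
import mimetypes

# guess_type() looks only at the name; it does not fetch anything
named = mimetypes.guess_type("https://example.com/report.pdf")[0]
unnamed = mimetypes.guess_type("https://example.com/download")[0]

print(named)    # application/pdf
print(unnamed)  # None
```

So a PDF served from a link without a `.pdf` suffix would go undetected this way.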

January 18, 2018 - 16:11

Lincoln Barnes

yeah.. his point still stands..

January 18, 2018 - 16:11

Benjamin Howard

If the term "load" doesn't ring a bell you should step away from the computer.

January 18, 2018 - 16:12

Grayson James

Would load be the right verb here? Wouldn't it be get or parse?

January 18, 2018 - 16:16

Charles Rodriguez

just look at the file extension

January 18, 2018 - 16:20

Henry Campbell

>"You say tomato, I say tomato"

January 18, 2018 - 16:22

Carson Ramirez

>Not knowing primitive instruction in context of unfamiliar scripting language
>Step away from computer
Got it, friendo

January 18, 2018 - 16:28

Wyatt Sanders

"Load" isn't even in the context of your meme script. "To load" is a common verb in the English language. Get a dictionary.

January 18, 2018 - 16:32

Chase Kelly

>Load the first few bytes of the URL
That's what I'm trying to say. There is no specific method that does what that user is describing.

January 18, 2018 - 16:36

Jayden Nguyen

Holy fuck, I'm out.

January 18, 2018 - 16:42

Zachary Clark

We're not doing your homework.

January 18, 2018 - 16:43

Ryder Hill

If someone told you to load that pile of sand into a truck, would you also be oblivious?

January 18, 2018 - 16:49

Ayden Jackson

To get this straight:
File extensions are mostly a Windows convention: you could give a PDF a name ending in .exe and a PDF reader would still open it. To check what the file really is, you need to look for characteristic data at the beginning of the file, where the header is. For a PDF, that is the string %PDF-.

January 18, 2018 - 17:04

Xavier Cook

Don't forget to translate "look for" to the context of Python. OP has a little problem with this.

January 18, 2018 - 17:24

Christian Kelly

Peek

January 18, 2018 - 17:28

Cameron Thompson

>I don't know what you mean by peek.
>Do you mean like peek-a-boo?
>I need this in the context of a python script
t. OP

January 18, 2018 - 19:05

Cameron Morgan

Use a HEAD request and check the MIME type in the Content-Type header. If HEAD is not supported, use GET with byte ranges and check the file header. If byte ranges are not supported, abort the connection after downloading the file header. If none of this makes sense, Google the above terms with keywords such as "http requests" and "PDF file format".
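A rough sketch of those three steps with the standard library's `urllib`, not a definitive implementation. The function names are hypothetical. It also makes one choice the post does not spell out: if HEAD succeeds but reports something other than `application/pdf`, it still falls through to the byte check instead of giving up.

```python
import urllib.error
import urllib.request

PDF_MAGIC = b"%PDF-"  # the PDF file header begins with these bytes


def content_type_is_pdf(value):
    """True if a Content-Type header value names application/pdf."""
    if not value:
        return False
    # Drop parameters such as "; charset=..." before comparing
    return value.split(";")[0].strip().lower() == "application/pdf"


def probe_pdf(url, timeout=10.0):
    # Step 1: HEAD request; accept the answer only if it says PDF.
    try:
        head = urllib.request.Request(url, method="HEAD")
        with urllib.request.urlopen(head, timeout=timeout) as resp:
            if content_type_is_pdf(resp.headers.get("Content-Type")):
                return True
    except urllib.error.URLError:
        pass  # HEAD rejected or failed; fall through to GET

    # Step 2: GET asking only for the bytes that hold the file header.
    get = urllib.request.Request(url, headers={"Range": "bytes=0-4"})
    with urllib.request.urlopen(get, timeout=timeout) as resp:
        # Step 3: a server that ignores Range sends the whole body;
        # reading 5 bytes and closing the connection abandons the rest.
        return resp.read(len(PDF_MAGIC)) == PDF_MAGIC
```

Usage would be something like `probe_pdf("https://example.com/some/link")` with a real URL in place of the placeholder.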

January 18, 2018 - 20:03

Adrian Gomez

this, although if OP doesn't trust the file extension (or maybe that's not what he meant?) this method still puts trust in the server to report back the correct type.

January 18, 2018 - 20:10
