What is a good way to determine if a URL link is a PDF in Python without looking at the file extension?

Load the first few bytes of the URL and look for the PDF file format magic number.
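A minimal sketch of that check, assuming the URL can be fetched with the standard library's urllib and that the file begins exactly with the bytes %PDF-. The names PDF_MAGIC, starts_with_pdf_magic and url_starts_with_pdf_magic are made up for illustration, not from any library.

```python
import urllib.request

# Every PDF header starts with these five bytes, followed by a version
# number such as 1.7.
PDF_MAGIC = b"%PDF-"


def starts_with_pdf_magic(data: bytes) -> bool:
    """Return True if the given bytes begin with the PDF magic number."""
    return data.startswith(PDF_MAGIC)


def url_starts_with_pdf_magic(url: str, timeout: float = 10.0) -> bool:
    """Open the URL, read only the first few bytes, and check them.

    Assumption: a strict check at offset 0. Files with junk before the
    header are reported as "not a PDF" by this sketch.
    """
    with urllib.request.urlopen(url, timeout=timeout) as response:
        # Only len(PDF_MAGIC) bytes are read; leaving the with-block
        # closes the connection without fetching the rest of the body.
        return starts_with_pdf_magic(response.read(len(PDF_MAGIC)))
```

Calling url_starts_with_pdf_magic on a link ignores both the file name and whatever the server claims the type is; only the content is inspected.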

look at the header, you nerd.

adobe.com/devnet/pdf/pdf_reference.html

Bytes of the literal string size? I don't know what you mean by load

>I don't know what you mean by load
You might lack some basic understanding of how computers work to finish your homework assignment.

In the context of Python.

Use the mimetypes library
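For reference, a small sketch of what the standard mimetypes module does with a URL. It guesses from the name alone and never looks at the content, so it does not help when the link has no .pdf extension. The helper name guess_from_name and the example.com URLs are placeholders.

```python
import mimetypes


def guess_from_name(url: str):
    """Return the MIME type that mimetypes guesses from the URL's name.

    Nothing is downloaded here; the guess is based purely on the
    extension found in the URL path.
    """
    mime_type, _encoding = mimetypes.guess_type(url)
    return mime_type


# An explicit .pdf extension is recognized.
print(guess_from_name("http://example.com/report.pdf"))

# No extension in the path, so there is nothing to guess from.
print(guess_from_name("http://example.com/download?id=42"))
```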

yeah.. his point still stands..

If the term "load" doesn't ring a bell you should step away from the computer.

Would "load" be the right verb here? Wouldn't it be "get" or "parse"?

just look at the file extension

>"You say tomato, I say tomato"

>Not knowing primitive instruction in context of unfamiliar scripting language
>Step away from computer
Got it, friendo

"Load" isn't even in the context of your meme script. "To load" is a common verb in the English language. Get a dictionary.

>Load the first few bytes of the URL
That's what I'm trying to say. There is no specific method that does what that user is describing.

Holy fuck, I'm out.

We're not doing your homework.

If someone told you to load that pile of sand into a truck, would you also be oblivious?

To get this straight:
The file extension is only a Windows meme. You could give a PDF a name ending in .exe and it would still be a PDF that a PDF reader can open. To check what the file really is, you need to look for the characteristic data at the beginning of the file, where the header is.
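To illustrate the point about names versus content, here is a small sketch that checks a local file's header and ignores its name. The function name file_is_pdf and the file name report.exe are invented for the example.

```python
import tempfile
from pathlib import Path

# The header of a PDF file starts with these bytes.
PDF_MAGIC = b"%PDF-"


def file_is_pdf(path) -> bool:
    """Return True if the file's first bytes are the PDF magic number."""
    with open(path, "rb") as handle:
        return handle.read(len(PDF_MAGIC)) == PDF_MAGIC


# Demo: a PDF header saved under a misleading .exe name.
with tempfile.TemporaryDirectory() as folder:
    disguised = Path(folder) / "report.exe"
    disguised.write_bytes(b"%PDF-1.4\n")
    print(disguised.suffix, file_is_pdf(disguised))
```

The extension says .exe, but the header check still reports a PDF.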

Don't forget to translate "look for" to the context of python. OP has a little problem with this.

Peek

>I don't know what you mean by peek.
>Do you mean like peek-a-boo?
>I need this in the context of a python script
t. OP

Use a HEAD request and check the MIME type (the Content-Type response header). If HEAD is not supported, use GET with byte ranges and check the file header. If byte ranges are not supported, abort the connection after downloading the file header. If none of this makes sense, Google the above terms with keywords such as "http requests" and "PDF file format".
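The steps above could be sketched roughly like this with only the standard library. All function names (media_type, media_type_via_head, first_bytes_via_range, is_pdf_url) are hypothetical, and treating any failed HEAD request as "HEAD not supported" is a simplifying assumption.

```python
import urllib.request

PDF_MAGIC = b"%PDF-"


def media_type(content_type_header):
    """Strip parameters such as '; charset=...' from a Content-Type value."""
    if not content_type_header:
        return None
    return content_type_header.split(";", 1)[0].strip().lower()


def media_type_via_head(url, timeout=10.0):
    """Return the server-reported media type, or None if HEAD fails."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return media_type(response.headers.get("Content-Type"))
    except OSError:
        # URLError, HTTPError (e.g. 405) and timeouts are all OSError
        # subclasses; this sketch treats them as "HEAD not usable".
        return None


def first_bytes_via_range(url, count=len(PDF_MAGIC), timeout=10.0):
    """GET the URL while asking for only the first `count` bytes."""
    headers = {"Range": f"bytes=0-{count - 1}"}
    request = urllib.request.Request(url, headers=headers)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        # A server that ignores Range replies with the whole body.
        # Reading `count` bytes and closing abandons the rest.
        return response.read(count)


def is_pdf_url(url, timeout=10.0):
    """HEAD first; fall back to checking the file header bytes."""
    reported = media_type_via_head(url, timeout)
    if reported is not None:
        return reported == "application/pdf"
    return first_bytes_via_range(url, timeout=timeout).startswith(PDF_MAGIC)
```

As written, a successful HEAD response is taken at face value. To avoid trusting the server, skip the HEAD step and call first_bytes_via_range directly.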

This, although if OP doesn't trust the file extension (or maybe that's not what he meant?), this method still trusts the server to report the correct type.