download_from_url (Process Media)

Directional

  1. Limits on file size
  2. General protections / limits - currently using secure_filename

Extensions

If using a cloud storage provider, one of the following must be true:

  1. No extension is set, and the content-type response header is set correctly. For example, the filename is example (note: no extension) and the content-type is image/jpeg.
  2. A valid extension is set. For example, for images, the URL path ends with one of [.jpg, .jpeg, .png, .bmp, .tif, .tiff].

Extension Detection

The content-type metadata might not be set correctly.

When possible, we use the filename taken from the URL as the first choice.

In theory, this returns None if it can't detect the media type, and we then fall back to checking the content-type.

https://diffgram.readme.io/reference#section-media-type-detection
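The detection order above (URL filename first, content-type second) can be sketched roughly as follows. This is only an illustration, not the actual implementation: the function names and both lookup tables are hypothetical, and only the image extensions are taken from the list in the Extensions section.

```python
import os
from urllib.parse import urlparse

# Hypothetical lookup tables for illustration only.
# The image extensions mirror the list in the Extensions section.
EXTENSION_TO_MEDIA_TYPE = {
    ".jpg": "image", ".jpeg": "image", ".png": "image",
    ".bmp": "image", ".tif": "image", ".tiff": "image",
}
CONTENT_TYPE_TO_MEDIA_TYPE = {
    "image/jpeg": "image", "image/png": "image",
    "image/bmp": "image", "image/tiff": "image",
}


def media_type_from_url(url):
    """Return the media type implied by the URL path's extension, or None."""
    path = urlparse(url).path  # drops the query string and fragment
    extension = os.path.splitext(path)[1].lower()
    return EXTENSION_TO_MEDIA_TYPE.get(extension)


def media_type_from_content_type(content_type):
    """Return the media type implied by a content-type header, or None."""
    if content_type is None:
        return None
    # Ignore parameters such as "; charset=utf-8".
    mime = content_type.split(";")[0].strip().lower()
    return CONTENT_TYPE_TO_MEDIA_TYPE.get(mime)


def detect_media_type(url, content_type):
    """Try the URL first, then fall back to the content-type header."""
    return media_type_from_url(url) or media_type_from_content_type(content_type)
```

With these assumed tables, a URL ending in photo.JPG resolves from the extension alone, while a URL ending in example (no extension) resolves only if the content-type is a known one such as image/jpeg.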

We want to detect the content type for three reasons:

  1. To know what type of pre-processing to do.
  2. To avoid downloading completely arbitrary files.
  3. To instruct the frontend on how to handle the file.

Handling of 'application/octet-stream'

'application/octet-stream' is basically a "not found" content type. It is typically the default a storage provider assigns when no content type was specified:

https://cloud.google.com/storage/docs/metadata#content-type

Since we now try the URL first, if we still see this content type we return it as part of the failed extension type error, not as a separate error.

(We now error directly when both the media type and the content type are None.)
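
A minimal sketch of the error reporting described above, for the case where neither the URL nor the content-type produced a usable media type. The function name and message wording are hypothetical. The point is only that 'application/octet-stream' gets no special branch and is reported like any other unsupported type.

```python
def detection_error(content_type):
    """Build an error message once media type detection has failed.

    content_type is the raw content-type header value, or None if absent.
    The message wording here is an assumption for illustration.
    """
    if content_type is None:
        # Both the media type (from the URL) and the content type are None.
        return "No extension found in the URL and no content-type was returned."
    # 'application/octet-stream' lands here like any other unsupported type.
    return f"Unsupported file type: {content_type}"
```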