Against new naming scheme

+1

I don't really see much benefit in keeping the MD5 as a filename, considering it doesn't hold any information about the image itself. The only case where I found it useful was when an image was corrupted, which has happened only once.

MDGeist

2008-06-19 15:09:31 UTC

i dont see much benefit in filenames like :

moe%2027551%20tagme.jpg

or

moe%2025466%20chen%20cirno%20ellen%20elly%20gengetu%20genjii%20koakuma%20kurumi%20luize%20mai%20meira%20mima%20mugetu%20orange%20rika%20rumia%20ruukoto%20sara%20shinki%20tokiko%20touhou%20yuki%20yumeko.jpg

either.

You cant sort them

(moe 1234 x
moe 2234 x and hundreds of moe 1235 - moe 2233 in between, great)

You get dupes way more easily

(moe 12345 x tagme.jpg gets moe 12345 x y.jpg then moe 12345 y x.jpg and so on)

Only the id/md5 helps you to find the pic fast

Loss of integrity check (well, it's harder, as you only get the md5 by laboriously digging it out of the link)

and other reasons im too lazy to list right now.

its still easier to download pics into folders

there it doesnt matter how the pic is called...

(long filenames can cause stupid shit on win systems btw)

Eruru

2008-06-19 16:33:46 UTC

I too am against it. I use the MD5 hash name as a way to check for dupes locally. Often times I'll go to save a picture and I'll get the "this file already exists" message. Which is great. Because I could have saved it off any one of the boorus.

I sort by date modified anyway so it doesn't bother me that way.

And as MDGeist mentioned, it's easier to search by hash. I'm sure you can still do this, but I wont know what the hash is as easily. I find it useful when trying to tag pictures on my personal 'booru, as well as when trying to refind a picture to link to a friend.

That said, I do remember and still have files from when moe.imouto named them moe_imouto_org-p-118529650583164.png (Which feels like so long ago, but I guess it's not?)

Radioactive

2008-06-19 16:50:20 UTC

+1

Isn't this going to screw up the MD5 searches on the site?

petopeto

2008-06-19 17:00:43 UTC

You're contriving a bunch of make-up reasons. You aren't making any kind of a "case" by making the filenames look worse than they are (pasting URL-encoded text instead of the actual filenames to make it look ugly; using obviously contrived examples like a tagme and a rare tag-heavy post). A typical filename is "moe 27550 blanc_neige elwing shining_tears tony_taka.jpg", which is nice and clean.

Download filenames are for humans, and MD5s as filenames are opaque and useless to humans. I can't find anything in a directory full of them. It's interesting that you mention sorting, since these filenames usefully sort by post number, where MD5s sort statistically randomly.

Long filenames are cut down to 150 characters or less.

If you want to have MD5 filenames, you can write a GM script to rewrite "http://moe.imouto.org/image/MD5/text.jpg" URLs to "http://moe.imouto.org/image/MD5.jpg".

MDGeist

2008-06-19 17:36:13 UTC

It doesnt help anything if they are sorted by post number if you look for all pics of a certain series/artist/genre/whatever...

Still easier to go to the section and download them there, in another folder.

Well, the major reason for md5s is the nondupe stuff.

Thank god I dont rely on moe-online-browsing nor on browsing via filenames on hdd ...

as for the GM script
its not that easy
(hello .png -.-)

md5sum

2008-06-19 17:43:27 UTC

Another +1

Personally, I do my downloads with DTA and having the hash as the filename means less dupes for me when somebody decides to change tags, and less of moe's bandwidth used as a direct result.

Is there any way to make this some sort of user preference, rather than telling users to go code up something on their own?

EDIT: Erm, yes, I've had this username for a while, it's just a coincidence...

petopeto

2008-06-19 17:52:06 UTC

url = url.replace(/(.*\/image\/)([0-9a-f]{32})\/.*(\.[^. ]+)/, "$1$2$3")

(A lot easier than the script I've been using to get sensible filenames for the last few months, which involved rewriting URLs to a PHP script on another machine, which set Content-Disposition headers and proxied the file...)
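For anyone who wants to apply the same rewrite outside the browser (for example to a list of links before handing them to a download manager), here is a minimal Python sketch of the one-liner above. The function name `to_md5_url` and the sample URL are made up for illustration; only the pattern comes from the post.

```python
import re

# Same pattern as the JavaScript one-liner, written for Python's re module:
# group 1 = everything up to /image/, group 2 = the 32-char MD5,
# group 3 = the file extension (any extension, so .png works as well as .jpg).
IMAGE_URL = re.compile(r"(.*/image/)([0-9a-f]{32})/.*(\.[^. ]+)")


def to_md5_url(url: str) -> str:
    """Drop the descriptive filename, keeping only the MD5 and the extension.

    URLs that do not match the pattern are returned unchanged.
    """
    return IMAGE_URL.sub(r"\1\2\3", url, count=1)


if __name__ == "__main__":
    # Hypothetical URL with a placeholder hash, for illustration only.
    print(to_md5_url(
        "http://moe.imouto.org/image/0123456789abcdef0123456789abcdef/moe%201%20tagme.png"
    ))
```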

Radioactive

2008-06-19 17:55:20 UTC

So we are going to have to use 'Image Search' instead of checking the MD5 hash now?

petopeto

2008-06-19 18:14:17 UTC

I don't know what you're asking. You can still do md5: searches.

Radioactive

2008-06-19 18:26:35 UTC

You can still do md5: searches.

But to get the MD5 hash for the images is more complicated...

Why the sudden change? Have there been complaints about the naming schema?

MugiMugi

2008-06-19 18:37:23 UTC

Why not keep the original filename ?

petopeto

2008-06-19 18:52:39 UTC

You just run an MD5 program on the file, which is what you have to do for most images anyway.
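"Running an MD5 program on the file" does not need a dedicated tool. A minimal sketch using Python's standard `hashlib` (the helper name `file_md5` is hypothetical) could look like this; the resulting hex string is the kind of value an md5: search takes.

```python
import hashlib


def file_md5(path, chunk_size=1 << 20):
    """Return the MD5 of a file as a 32-character hex string.

    The file is read in chunks so large images do not have to fit in memory.
    """
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


if __name__ == "__main__":
    import sys

    # Usage: python file_md5.py picture.jpg [more files...]
    for name in sys.argv[1:]:
        print(file_md5(name), name)
```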

If the filename is an MD5, then 99/100, you got the file from a danbooru site anyway, and it's faster to just use image search--especially with a GM script to help out. I just control-alt-click a thumbnail on any Danbooru site and it opens a tab with an image search (which includes a bunch of other things to make reposting easy). I think I posted that script at some point.

Radioactive said:
Why the sudden change? Have there been complaints about the naming schema?

Yes, by me. They're terrible. Another image board I saw (Gelbooru, maybe) used tags for the filename and it's always driven me nuts that our board didn't do that.

I might add an option at some point to allow tweaking filenames, but it's not really a trivial change (to pass preferences to the URL generation). If I implement a setting to set your preferred mirror, it'll probably happen after that (since that requires the same underlying changes).

MugiMugi said:
Why not keep the original filename ?

We don't know the original filename, and all too often it's something like "001.jpg", which isn't very useful. (Useful for saving whole pools, but in that case using the pool ordering as filenames would be about as good anyway...)

Eruru

2008-06-19 19:31:01 UTC

I save my files like this...

x:\Image Boards\Series\Utawarerumono\41e31834212ff4cdd605bb01e906b931-Eruru Aruru.png

This gives me enough information to avoid dupes, as well as find the pictures using Picasa2 or windows search. It also allows me to easily find the exact file on the 'boorus if it's there. Especially since moe.imouto doesn't tag very well.

http://moe.imouto.org/post/show/9164/

If I don't know the series, or I don't care about the series, they're organized into 4chan like groups (since that's where I first started how many years ago). Most of the time no information is added to the file name, but sometimes I add a series or character name, or artist if known.

x:\Image Boards\a\ddb815bdfa2fdb7d2626ec1bb5ec0869.jpg

http://moe.imouto.org/post/show/27145/

(Note: I'm aware picasa2 has a "Show Duplicate Files" Option. But it doesn't work so great.)

Mostly I just enjoyed that there would always be a common name for a single image between the 'boorus. It made it so easy to avoid saving the same picture multiple times.

aoie_emesai

2008-06-19 21:09:09 UTC

I'm with the old/new naming system. It is a lot more useful than the useless bunches of varying letters and numbers that are in no way readable or even identifiable to me.

-1

Shuugo

2008-06-20 00:06:47 UTC

I'll go say yes to the new filenames.
It worked great on the old konachan.com, and when I changed to md5 due to the move to danbooru, it was terribly welcomed.

You have the ids, so finding dupes should be as easy as ordering by filename

moe 8882 suzumiya_haruhi_no_yuutsu
moe 8882 the_melancholy_of_suzumiya_haruhi

They will be next to each other so it's as I said very easy.
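A plain alphabetical sort compares the number as text, so a small script that groups on the numeric post id is a bit more robust. This is only a sketch: it assumes filenames of the form "moe <post id> <tags>.<ext>" as in the examples in this thread, and the function name `group_by_post_id` is made up.

```python
import re
from collections import defaultdict

# Assumed filename shape: "moe <post id> <tags>.<ext>"
POST_NAME = re.compile(r"^moe (\d+) ")


def group_by_post_id(filenames):
    """Return {post_id: [filenames]} for post ids that occur more than once."""
    groups = defaultdict(list)
    for name in filenames:
        match = POST_NAME.match(name)
        if match:
            # Convert to int so "8882" and "08882" count as the same post.
            groups[int(match.group(1))].append(name)
    return {pid: sorted(names) for pid, names in groups.items() if len(names) > 1}


if __name__ == "__main__":
    import os

    # List same-post copies in the current directory.
    for pid, names in sorted(group_by_post_id(os.listdir(".")).items()):
        print(pid, names)
```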

Also it's relatively easy to calculate md5 locally on a UNIX machine, and to write a bash / sed / awk script to do it on all filenames. Actually any language can do that easily enough: java, ruby etc.

Saying MD5 are easy to find dupes is lol, if it would be really an effective way to finding dupes we wouldn't have the iqdb image similarity check on upload.

Meh

Merun

2008-06-20 06:46:08 UTC

Shuugo said:
Saying MD5 are easy to find dupes is lol, if it would be really an effective way to finding dupes we wouldn't have the iqdb image similarity check on upload.

Meh

Considering how many versions can be on the internet (crappy Minitokyo, high quality, sample, and any file format or resolution), well yeah, MD5 for dupes is not really effective IMO. Usually I rely on my memory, but there is some duplicate detector software that should be able to do the job.

Eruru

2008-06-20 11:36:57 UTC

I don't think you're understanding my side of the argument. I like the common filename between the boorus. What this means is if I find the same exact file on danbooru as moe, I'll get the error that the file already exists.

With this change moe's filename becomes unique and thus totally useless to me. Especially if it's based off tags, and the tags change.

Also, who cares how relatively easy it is to do anything on a UNIX machine when the majority (I assume) who surf moe.imouto don't use *nix as their OS.

Dupe detector software works great... when there's not thousands of pictures. I used to use Image Dupeless until my collection grew too large for it to cope.

And I mentioned (I think?) that Picasa2 has a duplicate search in it as well, but it doesn't work so great.

At any rate, I hate the new naming scheme but I guess I'm a minority.

MugiMugi

2008-06-20 12:22:07 UTC

There is no way to get a good fully automatic dupe check, as it will always fail in that regard; on the other hand, dupe detection can be used to find duplicates with a human's help.

It saves a ton of time; it's not perfect and clearly does not 'do the job', but it helps.

MDGeist

2008-06-20 12:28:18 UTC

the dupe check already fails hard, as we have lotsa parents/childposts where parent is detexted child, but child has better/same res and is bigger but with text...

van

2008-06-20 13:42:38 UTC

I think I like the new naming scheme - I can search on my HD via tag. For finding dupes by MD5 it's easy to write a script to do that (yes, even with windows as your OS using e.g. python, java).
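As a rough illustration of such a script, the sketch below walks a folder and groups files by the MD5 of their contents, so it works whatever the files are named and on any OS with Python installed. The function name `find_md5_dupes` is hypothetical, and like any MD5 comparison it only catches byte-identical copies, not resized or re-encoded versions.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def find_md5_dupes(root):
    """Return {md5_hex: [paths]} for files under root with identical contents."""
    by_hash = defaultdict(list)
    for path in Path(root).rglob("*"):
        if path.is_file():
            # Whole-file read keeps the sketch short; fine for typical image sizes.
            by_hash[hashlib.md5(path.read_bytes()).hexdigest()].append(path)
    return {md5: sorted(paths) for md5, paths in by_hash.items() if len(paths) > 1}


if __name__ == "__main__":
    import sys

    # Usage: python find_dupes.py <folder>
    for md5, paths in find_md5_dupes(sys.argv[1]).items():
        print(md5)
        for path in paths:
            print("   ", path)
```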

As tags are updated on the board, the tags on the files on your HD will be out of date. However a file like the rename.txt that came with the last site torrent should allow anyone to update the file names on their HD with the new tags. Maybe a new rename.txt once every month or two would be good.

Worst case we could have a compromise naming scheme of something like:
moe <post#> <tags> <md5>.png
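With a compromise name like that, both the post number and the MD5 stay machine-readable. A minimal parsing sketch, assuming exactly the layout proposed above (space-separated, MD5 as the last field before the extension); the function name and the sample filename are invented for illustration:

```python
import re

# Assumed layout: "moe <post#> <tags> <md5>.<ext>", fields separated by spaces.
COMPROMISE = re.compile(r"^moe (\d+) (.+) ([0-9a-f]{32})(\.[^.]+)$")


def parse_compromise_name(name):
    """Split a compromise-style filename into its parts, or return None."""
    match = COMPROMISE.match(name)
    if match is None:
        return None
    post_id, tags, md5, ext = match.groups()
    return {"post_id": int(post_id), "tags": tags.split(" "), "md5": md5, "ext": ext}


if __name__ == "__main__":
    # Hypothetical filename with a placeholder hash.
    print(parse_compromise_name("moe 1 tagme 0123456789abcdef0123456789abcdef.png"))
```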

MDGeist

2008-06-20 14:45:19 UTC

However a file like the rename.txt

guess who did the rename.txt ...

van

2008-06-20 14:56:36 UTC

MDGeist said:
guess who did the rename.txt ...

Yes, many thanks for that MDGeist!

MDGeist

2008-06-20 15:19:04 UTC

making a new rename.txt is a pain in the ass, as i have to crawl the whole of moe imouto (1230 pages) to get the new tags (if some were updated)

A db export would only take ms...

its a pity you have to start to program stuff to actually be able to browse your collection once it starts to grow beyond 10 images....

Too bad the dup detection thing has absolutely no use for me, since it doesnt work that good on b/w images AND my imagecollection is already over 800k ...

edit
and before doing such a db export
moe should update rsync more often with saki, and let saki handle more traffic ...
(right now i can only see saki links on page 54+ )

admin2

2008-06-20 15:27:42 UTC

saki is in the middle of a move, being switched to a better server

MDGeist

2008-06-20 15:54:30 UTC

got a rename.txt (with [md5 \/ id] )of danboo btw.
240593 images , a bit old i know

Radioactive

2008-06-20 16:29:28 UTC

MDGeist said:
the dupe check already fails hard, as we have lotsa parents/childposts where parent is detexted child, but child has better/same res and is bigger but with text...

As long as there is sufficient difference between the images we should keep them.

petopeto

2008-06-20 16:57:24 UTC

MDGeist said:
the dupe check already fails hard, as we have lotsa parents/childposts where parent is detexted child, but child has better/same res and is bigger but with text...

Yes, the dupe check does take a little intelligence to use. Are you trolling?

syaoran-kun

2008-06-20 17:31:54 UTC

wth..just use an md5 hasher if you want hash filenames instead of bitching about the new change (that imho is better, cuz md5 filenames don't tell you a shit about the image itself)..what's the problem..too HARD for you?

Skeptic

2008-06-20 17:57:48 UTC

I like the new naming scheme, as it pretty much now automatically does what I'd always had to do manually. If I might make a suggestion: when pruning a long taglist for a filename, it'd be most useful if, rather than just truncating it, we were still guaranteed to have the artist and copyright tags preserved. (Maybe the script already does this and I haven't noticed yet because I haven't saved any really long names yet).
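To make the suggestion concrete, here is one way such pruning could work. This is a sketch of the idea, not the site's actual code: the function name `build_filename` is made up, the caller is assumed to know which tags are artist/copyright tags, and the 150-character limit is the one mentioned earlier in the thread.

```python
def build_filename(post_id, tags, keep, ext, max_len=150):
    """Build "moe <id> <tags><ext>", dropping tags as needed to fit max_len.

    keep: set of tags (e.g. artist and copyright) that get first claim on
    the available space; the remaining tags fill whatever is left over.
    """
    prefix = f"moe {post_id}"
    budget = max_len - len(prefix) - len(ext)
    chosen = set()
    # Pass 1 takes the must-keep tags, pass 2 the rest; each tag costs its
    # length plus one separating space.
    for must_keep in (True, False):
        for tag in tags:
            if (tag in keep) == must_keep and tag not in chosen and len(tag) + 1 <= budget:
                chosen.add(tag)
                budget -= len(tag) + 1
    body = " ".join(tag for tag in tags if tag in chosen)  # original tag order
    return f"{prefix} {body}{ext}" if body else f"{prefix}{ext}"


if __name__ == "__main__":
    tags = ["blanc_neige", "elwing", "shining_tears", "tony_taka"]
    # Assumption for the example: the last two are the copyright and artist tags.
    print(build_filename(27550, tags, {"shining_tears", "tony_taka"}, ".jpg", max_len=40))
```

Filling the must-keep tags first means a tag-heavy post loses character tags before it loses the series or artist.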

Originally I thought that the tagme and fixme tags were useless in filenames... but it occurred to me that I can locally search for those every month or so and then use the file number to check here for parent/child/fixes.
