Website Suggestion
Today I was stringing together multiple tags because I was looking for a sort of specific type of picture when I had an idea. Perhaps this idea is old and maybe it’s not very good, but I’d like to put it forth anyways. I think that it would be really cool to have suggestions of pictures that you might like that could then be ordered in the usual ways (random, score, etc).
My goal is to go beyond using tags to find new stuff.

One way that I think this could be done: your favorites could be compared to those of other users, and you would be directed to common favorites among users with similar preferences. I would be very willing to help with this project. My strongest language is Java and I know some C, HTML, and CSS, but I’d be willing to learn whatever it takes to do the project (SQL? and ???). To those who know more than I do: is this project doable?
I'm not the most qualified person to say this, but I do believe that would really hurt the site's speed and add to its workload. Tags get deleted and lowercase is used to avoid that. It's probably doable, but it's a pretty big job. Try asking admin2.
It's possible, and has been done both on other sites (in various forms) and in many disciplines such as economics.

If you're serious about this, the first step would be to look into Data Mining (theory). That is the process of extracting patterns from sets of data. What you're looking for specifically is Association Rule Learning, a subset of data mining that deals with this kind of task.
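To make the idea concrete, here is a minimal sketch of pairwise association rules in Python. It assumes each user's favorites are available as a list of post IDs; the names `favorites`, `pair_rules`, `min_support`, and `min_confidence` are made up for illustration and are not part of the site's code. A rule "a -> b" is kept when enough users favorited both posts and a large enough share of the users who favorited a also favorited b.

```python
from collections import Counter
from itertools import combinations


def pair_rules(favorites, min_support=2, min_confidence=0.5):
    """Return (a, b, confidence) rules: users who favorited a also favorited b."""
    item_counts = Counter()   # how many users favorited each post
    pair_counts = Counter()   # how many users favorited both posts of a pair
    for items in favorites.values():
        unique = sorted(set(items))
        item_counts.update(unique)
        pair_counts.update(combinations(unique, 2))

    rules = []
    for (a, b), together in pair_counts.items():
        if together < min_support:
            continue  # too few users share this pair to trust it
        for x, y in ((a, b), (b, a)):
            confidence = together / item_counts[x]
            if confidence >= min_confidence:
                rules.append((x, y, confidence))
    # Strongest rules first; ties broken by post ID for a stable order.
    return sorted(rules, key=lambda r: (-r[2], r[0], r[1]))


# Hypothetical data: user name -> list of favorited post IDs.
favorites = {
    "alice": [101, 102, 103],
    "bob": [101, 102],
    "carol": [102, 103],
    "dave": [101, 104],
}
rules = pair_rules(favorites)
for a, b, confidence in rules:
    print(f"{a} -> {b} (confidence {confidence:.2f})")
```

Full Association Rule Learning algorithms (Apriori, for example) also handle rules with more than one item on the left-hand side; pairs are just the simplest case to start from.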

In general, this topic has been very thoroughly worked out (at the level this problem is at), so it's very unlikely that any real innovation will be asked of you, meaning coding it shouldn't be too much of a challenge.

Possible Concerns: 1) Drain on the server, 2) Reliability of the data being mined.

1) Never worked with servers, have no idea. Ask someone else.
2) Depends on users' rating habits. Some rate everything and have thousands of favorites, others none. If those who tend to rate everything as a favorite are given less weight during evaluation, they could be balanced out, but it could still be a problem. Because of this, I recommend replacing the 1-3 star ratings with a binary scale: either unrated (possibly including 1 star) or favorite (combining 2- and 3-star ratings).
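A rough Python sketch of both suggestions in point 2, under some assumptions: ratings are stored as 0-3 stars per user and post, 1 star is treated as unrated, and the down-weighting formula 1/log2(2 + n) is only one plausible choice that I picked for illustration. All names here are hypothetical.

```python
import math


def binarize(stars):
    """Collapse a 0-3 star rating: 2 or 3 stars -> 1 (favorite), else 0 (unrated)."""
    return 1 if stars >= 2 else 0


def user_weight(num_favorites):
    """Give less weight to users with many favorites (assumed formula, not a standard)."""
    return 1.0 / math.log2(2 + num_favorites)


def weighted_votes(binary):
    """Sum the weights of the users who favorited each post."""
    votes = {}
    for user, posts in binary.items():
        favs = [post for post, flag in posts.items() if flag]
        weight = user_weight(len(favs))
        for post in favs:
            votes[post] = votes.get(post, 0.0) + weight
    return votes


# Hypothetical ratings: user -> {post ID: stars}.
ratings = {
    "alice": {101: 3, 102: 1, 103: 2},
    "bob": {101: 2},
}
binary = {user: {post: binarize(stars) for post, stars in posts.items()}
          for user, posts in ratings.items()}
votes = weighted_votes(binary)
print(binary)
print(votes)
```

With this weighting, a user with a handful of favorites counts for more per favorite than someone who has favorited thousands of posts, but nobody is ignored entirely.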

Edit: Also see this paper. It's very short and easy to read, and will give you a better understanding of the task at hand. However, since the data you'll be working with is fairly simple, I recommend against using most of the techniques outlined there, as they would be both overkill and suboptimal. Instead I recommend Association Rule Learning (works best with binary values, meaning either present or absent), as I said before, or Linear Models, as introduced in the paper.

Summary: Yes, you can do it, though there are some potential issues.
peto briefly talked about this feature a few months ago, using a suggestion system sort of like Amazon's...

Last I heard we don't really have the resources to do such a thing...
Perhaps it would be possible to reduce the resources required. If, for instance, you were to run the data mining and association rule learning algorithms once a day, and then only output recommendations when a user asked for them, you could significantly reduce the resource drain.

E.g.: Say at 12:00 am each day you run the mining/association algorithm. If you use criteria of score:>=5 for posts you currently get 54,040, and for users with posts:>=3 (*) you get 8,000. From this you get a user-by-post matrix of 8,000 x 54,040, or 432,320,000 cells. Now use the data to formulate association rules. For the rest of the day, if a user requests a recommendation on the site, bring up the relevant association rule(s) and generate recommendations. This shouldn't be a large drain since all of the processing was done beforehand.

(*) I don't have a way to sort users based on how many images they rated, so for this example I used posts:>=3 to weed out accounts that have no activity at all.
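The split between a once-a-day batch job and cheap lookups could be sketched like this in Python. Everything here is an assumption for illustration: the function names, the thresholds, and the idea of storing favorites as one list per user. Storing only the favorites that actually exist also means the full 8000 x 54040 grid never has to be held in memory, since most of its cells would be empty.

```python
from collections import Counter, defaultdict
from itertools import combinations


def build_rule_table(favorites, min_support=2, top_n=5):
    """Batch step (run once a day): map each post to the posts most often favorited with it."""
    pair_counts = Counter()
    for items in favorites.values():
        pair_counts.update(combinations(sorted(set(items)), 2))

    table = defaultdict(list)
    for (a, b), count in pair_counts.items():
        if count >= min_support:
            table[a].append((count, b))
            table[b].append((count, a))
    # Keep only the top_n strongest partners per post, as plain lists of post IDs.
    return {
        post: [other for _, other in sorted(pairs, key=lambda p: (-p[0], p[1]))[:top_n]]
        for post, pairs in table.items()
    }


def recommend(user_favorites, rule_table, limit=10):
    """Request step: dictionary lookups only, no mining."""
    seen = set(user_favorites)
    scores = Counter()
    for post in user_favorites:
        for other in rule_table.get(post, []):
            if other not in seen:
                scores[other] += 1
    ranked = sorted(scores.items(), key=lambda s: (-s[1], s[0]))
    return [post for post, _ in ranked[:limit]]


# Hypothetical data: user name -> list of favorited post IDs.
favorites = {
    "alice": [101, 102, 103],
    "bob": [101, 102],
    "carol": [102, 103],
    "dave": [101, 104],
}
rule_table = build_rule_table(favorites)   # would run from a nightly job
print(recommend([101], rule_table))        # would run when a user asks
```

In a real deployment the table would be written to the database or a cache by the nightly job rather than kept in a Python variable, but the cost split is the same: the expensive counting happens once, and each request only reads a few rows.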
I took a quick look and found this..

http://www.railsonwave.com/2008/1/15/programming-collective-intelligence/

actually implementing it..heh

note, I'm not a programmer....or much of a server admin >_>
That's the typical recommend-script approach of finding users with similar tastes. It's fairly good, depending on the data it has to work off.

Affinity Based: Finds people with similar taste and uses them to make recs. It works best if there are a lot of users with well-documented taste (lots of data per person).

Association Rule Based: Finds items that are associated (duh), and groups items independent of one particular user's taste. Best choice if there are a lot of users, but sparse amounts of data per person.

For moe, I think there are lots of users with only a few ratings, and so Association Rule Based (99% sure Amazon uses this one) is probably the way to go.
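For comparison, here is a minimal sketch of the affinity-based approach in Python. It measures taste overlap with Jaccard similarity (shared favorites divided by all favorites of the two users), which is one common choice that I am assuming here; other similarity measures would work too. The names `jaccard`, `affinity_recommend`, and `k` are illustrative only.

```python
def jaccard(a, b):
    """Overlap between two sets of favorites: |a & b| / |a | b|, or 0.0 if both are empty."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def affinity_recommend(user, favorites, k=2):
    """Recommend posts favorited by the k users most similar to `user`."""
    mine = set(favorites[user])
    neighbors = sorted(
        ((jaccard(mine, items), other)
         for other, items in favorites.items() if other != user),
        key=lambda t: (-t[0], t[1]),
    )[:k]

    scores = {}
    for similarity, other in neighbors:
        if similarity == 0:
            continue  # no shared taste, so their favorites tell us nothing
        for post in set(favorites[other]) - mine:
            scores[post] = scores.get(post, 0.0) + similarity
    return sorted(scores, key=lambda post: (-scores[post], post))


# Hypothetical data: user name -> list of favorited post IDs.
favorites = {
    "alice": [101, 102, 103],
    "bob": [101, 102],
    "carol": [102, 103],
    "dave": [101, 104],
}
print(affinity_recommend("bob", favorites))
```

Note that this version compares the requesting user against every other user, which is why it needs plenty of data per person to find good neighbors and is harder to precompute than an item-to-item rule table.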
Then the site will be even slower...anyway your idea is interesting.
Site is plenty fast... but for people in China, probably not.
admin2 said:
Site is plenty fast... but for people in China, probably not.
Um... why is there a difference in speed between countries? Don't other countries have servers, or what am I not getting here?
The moe server is based in the United States. The closer the user is to the server, the faster the connection.

With almost a 6,000-mile span, go figure :) Same as if you lived near your ISP.
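A back-of-the-envelope calculation in Python shows why distance matters. It assumes signals travel through optical fiber at roughly 200,000 km/s (about two thirds of the speed of light) and that the cable runs in a straight line, so the result is only a lower bound; real routes are longer and add routing and server delays on top.

```python
MILES_TO_KM = 1.609344
FIBER_SPEED_KM_PER_S = 200_000  # assumed approximate signal speed in optical fiber


def min_round_trip_ms(distance_miles):
    """Lower bound on round-trip time over a straight fiber run of the given length."""
    one_way_seconds = distance_miles * MILES_TO_KM / FIBER_SPEED_KM_PER_S
    return 2 * one_way_seconds * 1000


print(f"{min_round_trip_ms(6000):.1f} ms")
```

That delay is paid on every round trip, and loading a page full of images takes many round trips, so the gap between a nearby and a faraway user adds up quickly.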
aoie_emesai said:
The moe server is based in the United States. The closer the user is to the server, the faster the connection.

With almost a 6,000-mile span, go figure :) Same as if you lived near your ISP.
Sounds like a wireless router. The closer you get [are], the stronger and faster the signal will be to the internet.

Simple enough. Thanks.
This explains why some Japanese sites used to be so slow some time ago. It looks like they have been speeding up recently, though. Thank you very much, that was interesting.
I wonder what speeds I can reach since I'm on the west coast (for now)... I need to get faster service. I only get 150k at the moment.