kompil
I'm exhuming this topic because I have a very similar question (although my constraints are different). Hope it's the right place to ask.
I'm writing a shell & C++ script to build a local, partial representation of the yande.re database. I don't want to run this script too often (manually, say once every 6 months or so), but I absolutely want to keep server load as low as possible. (The motivation is to bother the server as little as possible, to be environmentally friendly, and also as an exercise.)
I basically want to query:
For all posts: (id, width, height, file_size, file_url, tags, rating, parent_id)
For all pools: (id, name, description, posts)
(All of these are to be understood as the raw strings/numbers returned by the API.)
I already have a basic JSON parser working on:
https://yande.re/pool/show.json?id=2100
https://yande.re/post.json?limit=100&tags=id:123400..123499
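For reference, what I keep from a post batch is roughly equivalent to the following (jq is only used here to illustrate; my actual parser is hand-written, and the field names are just the ones the API returns):

```
# Keep only the fields I care about from a post.json response.
curl -s 'https://yande.re/post.json?limit=100&tags=id:123400..123499' \
    | jq 'map({id, width, height, file_size, file_url, tags, rating, parent_id})'
```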
Now my plan is as follows: I haven't tried it yet, but I believe that `pool/show.json` without parameters will return all the pools at once. I will take advantage of the fact that posts are described in detail in `pool/show` responses to parse and store them right away. (If that doesn't work, I will just query all pools one by one :/. At the moment, I don't know of a way to find the maximum pool id :(. But that doesn't prevent me from reusing the post info those responses contain.)
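For the one-by-one fallback, this is roughly what I have in mind (untested; it assumes a nonexistent pool id gives an HTTP error or an empty body, and the stop-after-50-misses threshold is an arbitrary guess since I don't know the maximum pool id):

```
#!/bin/sh
# Fallback: fetch pools one by one, stopping after 50 consecutive misses.
id=1
misses=0
while [ "$misses" -lt 50 ]; do
    if body=$(curl -sf "https://yande.re/pool/show.json?id=$id") && [ -n "$body" ]; then
        printf '%s\n' "$body" > "pool_$id.json"
        misses=0
    else
        misses=$((misses + 1))
    fi
    id=$((id + 1))
    sleep 1   # be gentle with the server
done
```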
Then I will query all *remaining* posts. I identified two types of query that would work:
https://yande.re/post.json?limit=1000&tags=id:123000..123999
https://yande.re/post.json?limit=750&tags=id:123000,123002,123010 ...
I tried to combine ranges and singletons, or to use multiple ranges, without success.
The former is (I assume) the most server-friendly, but if I only have a batch of, say, fewer than 50 not-yet-known posts, is it still worthwhile?
The latter consists of specifying explicitly all the post ids I'm interested in (there is a limit around 750 posts, probably because the URL becomes too long beyond that). I expect this request to be harder to process (or is it?).
I guess if I have, say, 500 consecutive posts to query except for the 200th (which I already have), I'd better ask for the whole range, even if that means re-fetching something I already have. But what is your opinion? What kind of rule could I follow to choose one type of query or the other? Is there a better way to achieve the same result?
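To make the question more concrete, here is the kind of rule of thumb I'm considering (just a sketch: the 750-id cap is my own observation and the density threshold is completely arbitrary):

```
# missing.txt: sorted list of post ids I still need, one per line.
first=$(head -n 1 missing.txt)
last=$(tail -n 1 missing.txt)
count=$(wc -l < missing.txt | tr -d ' ')
span=$((last - first + 1))

if [ "$count" -le 750 ] && [ $((count * 3)) -lt $((span * 2)) ]; then
    # Sparse batch: list the ids explicitly to avoid re-downloading known posts.
    url="https://yande.re/post.json?limit=$count&tags=id:$(paste -sd, missing.txt)"
else
    # Dense batch: one range query (assuming the span fits in a single page).
    url="https://yande.re/post.json?limit=$span&tags=id:$first..$last"
fi
curl -s "$url" > batch.json
```

But I have no idea whether that kind of threshold makes sense from the server's point of view, hence the question.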
EDIT 1:
I think these last questions could be reworded as: is it better to optimize for the number of requests, the expected volume (in bytes) of the responses, the complexity of the requests, or a combination of these?
Also, I noticed that in the examples above I get a significant number of *deleted* posts. I already handle that, but ideally, if I could instruct the server not to list them, it would be better for everyone. I tried adding `deleted:false` to the list of tags, but no luck. Is there a proper way of doing that?
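In the meantime I filter them out on my side; assuming deleted posts are the ones whose `status` field is "deleted" (that's what I see in the responses, but please correct me if that's wrong), something like:

```
# Drop deleted posts client-side before storing the batch.
curl -s 'https://yande.re/post.json?limit=1000&tags=id:123000..123999' \
    | jq 'map(select(.status != "deleted"))' > posts.json
```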
Thanks :D