
Armory Scraping

51 replies to this topic

#21 malthrin


    stalemate associate

  • Moderators
  • 9,107 posts

Posted 25 July 2010 - 01:02 PM

Has anyone been getting logged out periodically since this week's maintenance?
Lampkin in EJBSG 28 | Anders in EJBSG 24 | Cavil in EJBSG 20
Boomer in EJBSG 19 | Roslin in EJBSG 17 | Roslin in EJBSG 13 | Roslin in EJBSG 8
MTG Online draft viewer

#22 Slackie


    Bald Bull

  •  Patrons
  • 2,003 posts

Posted 25 July 2010 - 01:59 PM

Yes. I had to modify my library to raise an error when I get the "You must log in." message so that scripts know they need to authenticate again.
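A minimal sketch of that kind of check in Python, assuming the "You must log in." text appears verbatim somewhere in the response body. The names `LoginRequired` and `check_logged_in` are hypothetical and not taken from any existing library.

```python
class LoginRequired(Exception):
    """Raised when a response indicates the session is no longer logged in."""


# Marker text taken from the message quoted above; where exactly it appears
# in the response body is an assumption.
LOGIN_MARKER = "You must log in."


def check_logged_in(body: str) -> str:
    """Return the body unchanged, or raise LoginRequired if it holds the marker."""
    if LOGIN_MARKER in body:
        raise LoginRequired("session expired; authenticate again")
    return body
```

A calling script can catch `LoginRequired`, log in again, and retry the request.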

#23 malthrin


    stalemate associate

  • Moderators
  • 9,107 posts

Posted 26 July 2010 - 06:42 PM

Yeah, I did the same. Kind of a pain for anyone trying to do this on an authenticator account, though.

#24 dozens


    Glass Joe

  • Members
  • 20 posts

Posted 15 October 2010 - 03:10 PM

Looks like I am a little late to the party :)

The auction browsing is based off your primary toon. Does anyone know the URL to select a different primary toon?

#25 Rinji


    Glass Joe

  • Members
  • 1 posts

Posted 22 November 2010 - 09:37 PM

To Login (w/o Authenticator):

Post to URL: https://us.battle.net/login/en/login.xml
  • accountName: <account name>
  • password: <password>

Afterwards, I do a GET request to http://www.wowarmory.com/auctionhouse/index.xml, which should set the correct cookies. To validate a successful login, check the cookies for (perl regular expression):

Note: I do a match in the cookies for the 'auction_sk' code, as it appears to be passing this param 'sk' around during POSTs, but currently it does not seem to be needed for a successful response.
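The login steps above could be sketched in Python roughly as follows. The two URLs and the form field names come from this post. The helper names are hypothetical, and the approach (a cookie-aware opener from the standard library) is only one way to keep the cookies between the POST and the follow-up GET.

```python
import http.cookiejar
import urllib.parse
import urllib.request

LOGIN_URL = "https://us.battle.net/login/en/login.xml"
AUCTION_INDEX_URL = "http://www.wowarmory.com/auctionhouse/index.xml"


def build_login_request(account_name: str, password: str) -> urllib.request.Request:
    """Build (but do not send) the form POST with accountName and password."""
    form = urllib.parse.urlencode(
        {"accountName": account_name, "password": password}
    ).encode("ascii")
    return urllib.request.Request(LOGIN_URL, data=form, method="POST")


def build_cookie_opener():
    """Return (opener, jar); the jar collects cookies across requests."""
    jar = http.cookiejar.CookieJar()
    opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))
    return opener, jar
```

Usage would be `opener.open(build_login_request(name, pw))` followed by `opener.open(AUCTION_INDEX_URL)`, then looking through `jar` for the expected cookies.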

To List all Characters:

Get URL request: http://www.wowarmory.com/vault/character-select.xml

XML response should be easily queried by xpath (ex.):

Note: The first name listed (I believe) is the default character.

To Switch the Default Character:

dozens asked: "does anyone know the url to select a different primary toon?"

Post to URL: http://www.wowarmory.com/vault/character-select-submit.json
  • cn: <character name>
  • r: <server>

Successful response:

#26 Slackie


    Bald Bull

  •  Patrons
  • 2,003 posts

Posted 06 December 2010 - 04:05 AM

Lost my last post in the ether, but it looks like Blizzard has moved over to the new Armory in prep for Cataclysm.

They have switched to REST calls for the Remote AH. Here are the new base search pages:


Browsing has also changed:

https://us.battle.net/wow/en/vault/character/auction/horde/browse?key=val&key2=val2 (same options at the end as before)

The old method of setting rhtml=n to get JSON/XML does not seem to produce JSON/XML anymore. This should probably be what people work on figuring out first.

I have not looked into the other methods for posting/canceling auctions yet, but they should be trivial to discover.

#27 immense


    Glass Joe

  • Members
  • 4 posts

Posted 06 December 2010 - 05:49 AM

While trying to rework my scripts I found that the Id key no longer works to find items by their ID, for example:

https://us.battle.ne.../browse?id=4306 will NOT give you results for silk cloth

I'll keep digging and post updates to this thread.

#28 immense


    Glass Joe

  • Members
  • 4 posts

Posted 06 December 2010 - 09:01 AM

You can search the new interface by item ID with the following string:


This returns only the results for Silk Cloth, whereas the text search also returns items such as Bolt of Silk Cloth.

If anyone has figured out what file has replaced the bid.json script please feel free to share =)

#29 Slackie


    Bald Bull

  •  Patrons
  • 2,003 posts

Posted 06 December 2010 - 09:07 AM

This should get you started:

browse: GET https://us.battle.net/wow/en/vault/character/auction/horde/
bid_auction: POST https://us.battle.net/wow/en/vault/character/auction/horde/bid
cancel_auction: POST https://us.battle.net/wow/en/vault/character/auction/horde/cancel
create_auction: POST https://us.battle.net/wow/en/vault/character/auction/horde/createAuction
deposit (ticket request): POST https://us.battle.net/wow/en/vault/character/auction/horde/deposit
money: POST https://us.battle.net/wow/en/vault/character/auction/horde/money
my_auctions: GET https://us.battle.net/wow/en/vault/character/auction/horde/auctions
my_bids: GET https://us.battle.net/wow/en/vault/character/auction/horde/bids

Keep in mind that the "horde" portion of the URL is dynamic and can be "horde", "alliance", or "neutral".
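For anyone scripting against these, the list above can be captured as a small lookup table. This is only a sketch: the paths and methods are copied from the list, while the `endpoint` helper and its names are made up for illustration.

```python
BASE = "https://us.battle.net/wow/en/vault/character/auction"

# Operation name -> (HTTP method, path suffix), copied from the list above.
ENDPOINTS = {
    "browse": ("GET", ""),
    "bid_auction": ("POST", "bid"),
    "cancel_auction": ("POST", "cancel"),
    "create_auction": ("POST", "createAuction"),
    "deposit": ("POST", "deposit"),
    "money": ("POST", "money"),
    "my_auctions": ("GET", "auctions"),
    "my_bids": ("GET", "bids"),
}

FACTIONS = ("horde", "alliance", "neutral")


def endpoint(operation: str, faction: str = "horde") -> tuple[str, str]:
    """Return (method, url) for an operation on the given faction's AH."""
    if faction not in FACTIONS:
        raise ValueError(f"unknown faction: {faction!r}")
    method, suffix = ENDPOINTS[operation]
    return method, f"{BASE}/{faction}/{suffix}"
```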

So far I've found that the bid, cancel, create, deposit (ticket request) and money requests all return native JSON.

I still haven't found a way to get the browse, "my auctions", or "my bids" requests to return anything but HTML.

Since all of the POST requests return JSON, I wondered whether any POST request would, but sending POST instead of GET to the browse URL still results in HTML.

#30 Slackie


    Bald Bull

  •  Patrons
  • 2,003 posts

Posted 06 December 2010 - 09:54 AM

From a PM:

In regards to https://us.battle.ne...on/alliance/bid, what variables are you posting to bid / buyout an auction?

Is it still the Auction id and Money (in copper)?


There is an additional key required now (other than the auction id (auc) and amount to bid (money)) that is "xtoken". The value of this key is taken from the cookie "xstoken" that is set during login.

These new "xtoken" and "xstoken" keys are used for several of the operations. They are easy to mix up: for example, in the POST request for bidding on an auction, the "xtoken" value is taken from the cookie "xstoken" (note the "s").

In the cancel operation, the "xtoken" value comes from the "xtoken" cookie, same with the create operation.

In the deposit operation when creating a new auction, there is also a new key "sk", which is derived from the "xstoken" cookie.
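To keep the cookie-to-field mapping straight, it can be written down as a table. The sketch below covers only bid, cancel and create as described above. The deposit "sk" key is left out because how it is derived from the cookie is not spelled out here. Function names are hypothetical, and `cookies` is assumed to be a plain dict of cookie name to value.

```python
import urllib.parse

# Which cookie supplies the "xtoken" form field for each operation.
XTOKEN_SOURCE_COOKIE = {
    "bid": "xstoken",  # note the "s"
    "cancel": "xtoken",
    "create": "xtoken",
}


def xtoken_for(operation: str, cookies: dict[str, str]) -> str:
    """Look up the cookie value that should be sent as "xtoken"."""
    return cookies[XTOKEN_SOURCE_COOKIE[operation]]


def build_bid_form(auction_id: int, copper: int, cookies: dict[str, str]) -> bytes:
    """Encode the bid POST body: auction id (auc), amount (money), xtoken."""
    fields = {
        "auc": auction_id,
        "money": copper,
        "xtoken": xtoken_for("bid", cookies),
    }
    return urllib.parse.urlencode(fields).encode("ascii")
```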

#31 Slackie


    Bald Bull

  •  Patrons
  • 2,003 posts

Posted 06 December 2010 - 01:21 PM

I've put a lot of the changed stuff up online for my personal use at: https://www.ef.net/remote_auctionhouse.

It covers some of the changes in the new Armory and keeps everything in one place rather than spread across multiple places. I'll try to keep it updated with new stuff we discover as time permits.

#32 immense


    Glass Joe

  • Members
  • 4 posts

Posted 29 December 2010 - 07:08 PM

Has anyone figured out how to change the default character in the "new" armory?

It looks like the browser sends a POST request with the following parameters, but I cannot get my app to work with them; I just get a (404) Page not found error:

Post url: http://us.battle.net.../pref/character
index=3 (Character index)
xstoken=478e4589-5337-49db-b5ff-039743710268 (Set in cookie)

Thanks in advance for the help!

#33 Slackie


    Bald Bull

  •  Patrons
  • 2,003 posts

Posted 30 December 2010 - 10:25 PM

Of the people I collaborate with on the Armory stuff, all of us are still using the "old" mechanisms to interact with the Remote Auction House. Shortly after Blizzard rolled out the new Armory they disabled the "old" URLs, but the new Armory didn't support any easy way to gather data (it returned actual HTML, not XML/JSON).

Due to popular demand from people who actually parsed Armory data, they re-enabled the old URLs while they work on finalizing an API for accessing the new Armory.

This was a thread on the subject: WTB XML Feeds! Offering lunch at Javier's! - Forums - World of Warcraft

When interacting with World of Warcraft, you can specify the character with the 'cn' key, like this:
GET /auctionhouse/search.json?pageSize=50&rhtml=false&cn=Charname&f=1&r=Mal%27Ganis

It also works for creating an auction, bidding, etc. All of the actions will accept 'cn' (character name), 'f' (faction) and 'r' (realm).
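A small Python sketch of building that query string, so the realm name gets escaped the same way as in the example (the apostrophe in Mal'Ganis becomes %27). The host is assumed to be www.wowarmory.com, as in the older URLs in this thread, and `build_search_url` is a made-up helper name.

```python
import urllib.parse

# Assumed host for the old Armory; the path is from the example request above.
OLD_ARMORY = "http://www.wowarmory.com"


def build_search_url(character: str, faction: int, realm: str, page_size: int = 50) -> str:
    """Build the old-style search.json URL with cn, f and r set explicitly."""
    query = urllib.parse.urlencode(
        {
            "pageSize": page_size,
            "rhtml": "false",
            "cn": character,  # character name
            "f": faction,     # faction
            "r": realm,       # realm
        }
    )
    return f"{OLD_ARMORY}/auctionhouse/search.json?{query}"
```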

In short, don't use the new Armory URLs (yet) unless you just like parsing HTML. If you want XML/JSON just use the old URLs until they publish the new API.

#34 Oldertarl


    Glass Joe

  • Members
  • 3 posts

Posted 04 January 2011 - 02:54 PM

Got a nice browser error yesterday showing JSON data rather than the new Armory HTML page.
Perhaps (keeping hope up) they are in the test phase now and will release new data feeds soon.

#35 Slackie


    Bald Bull

  •  Patrons
  • 2,003 posts

Posted 15 January 2011 - 04:42 AM

So I have a question that isn't directly related to Armory scraping, but is related to scraping... specifically, I am looking for a programmatic way to get the current patch level of Warcraft. Anyone have ideas? I poked around a bit with Wireshark looking at how the launcher works (pasted below), but I don't see anything that looks like a patch version in the plaintext portion of what it does.

The important part for me is a programmatic, reliable way to get the data. I can't look inside WTF files. I need to be able to get it off the wire, preferably from Blizzard directly. If there are other options, I'm open to those as well.

Here is the Wireshark of the launcher starting up if anyone is curious:

PORT: 80

GET /wow-pod-retail/NA/config_recommended_na_2.xml HTTP/1.1
Host: ak.worldofwarcraft.com.edgesuite.net
User-Agent: Launcher/4.0.0 CFNetwork/454.11.5 Darwin/10.6.0 (i386) (MacBookPro6%2C2)
Accept: */*
Accept-Language: en-us
Accept-Encoding: gzip, deflate
Connection: keep-alive

HTTP/1.1 200 OK
Server: Apache
ETag: "3173b46d78ca246332f8be5915acbbe3:1291056533"
Last-Modified: Mon, 29 Nov 2010 18:48:53 GMT
Accept-Ranges: bytes
Content-Length: 525
Content-Type: application/xml
Date: Sat, 15 Jan 2011 04:21:14 GMT
Connection: keep-alive

  <versioninfo type="pod">
    <version product="WoW">
        <server id="akamai" url="http://ak.worldofwarcraft.com.edgesuite.net/wow-pod-retail/NA/12911.streaming.2/"/>
        <threshold speed="1000000" red="5" yellow="5" />
        <threshold speed="1000001" red="1" yellow="5" />
        <setting name="patchapplicationstage" value="Recommended"/>

#36 Slackie


    Bald Bull

  •  Patrons
  • 2,003 posts

Posted 18 January 2011 - 12:59 AM

Anyone have experience interacting with the remote AH (specifically search.json) using a character with UTF-8 symbols in the name?

I've tried to escape them, but I just keep getting the "This account does not have any characters who are eligible to use the Auction House" error message.

If I change cn to another character on the same account with no UTF-8 symbols it works fine.

#37 Blup


    Von Kaiser

  • Members
  • 69 posts

Posted 29 January 2011 - 10:55 PM

Has anyone found a way to get the guild tabard from the guild summary page? It seems to be built using JavaScript and I can't seem to find a way to actually get it from the HTML I'm trying to parse.

#38 immense


    Glass Joe

  • Members
  • 4 posts

Posted 07 February 2011 - 04:22 PM

The Undermine Journal came back up today. Does that mean that XML data feeds are back up too? I have always parsed HTML but I am interested in using XML for obvious reasons!

#39 Jakobud



  • Banned
  • 1 posts

Posted 09 February 2011 - 11:37 PM

I know this is an old thread, but I had a question about the search.json URL. When I perform a basic search like this:


It returns results like this:

{"auctionSearch":{"start":0,"end":200,"total":200,"auctions":[{"auc":11788343... etc....

See the "total" of 200? That means it only returns the first 200 results. How do I change this to return more results? How are websites like the Undermine Journal able to pull EVERY auction and not just the first 200?

#40 Arrielle


    Glass Joe

  • Members
  • 1 posts

Posted 10 February 2011 - 06:01 AM

Like Slackie said before, there is a limit of 200 results per query.
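A rough way to notice when a query has run into that limit, assuming the response has the shape quoted in the previous post ("auctionSearch" with "start", "end", "total" and "auctions"). The function name and the "capped" heuristic are illustrative guesses, not documented behavior.

```python
import json

RESULT_CAP = 200  # per-query limit reported in this thread


def summarize_search(raw: str) -> dict:
    """Parse a search.json response and flag results that reached the cap."""
    search = json.loads(raw)["auctionSearch"]
    return {
        "start": search["start"],
        "end": search["end"],
        "total": search["total"],
        "returned": len(search["auctions"]),
        # Heuristic: a total at the cap may mean more auctions matched
        # than were returned.
        "capped": search["total"] >= RESULT_CAP,
    }
```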
