Anthropologie is an American clothing retailer. The company currently operates more than 200 stores around the world and offers a carefully selected assortment of clothing, jewelry, underwear, home furnishings and decor, beauty products and gifts. In August 1992, Richard Hayne came up with the idea of opening a clothing store for creative, educated women aged 30 to 45, which is how the Anthropologie store came about. Scraping anthropologie.com with Diggernaut is an easy process: you can use the web scraper provided here to collect product and price data from the online store.
Approx number of goods: 50000
Approx number of page requests: 50000
Recommended subscription plan: Small
PLEASE NOTE! The number of requests can exceed the number of products, because data about variations, images, etc. may be scraped from other resources, which requires additional requests. Also, part of the product data may be delivered via XHR requests, which further increases the total number of page requests.
How to use the web scraper to extract data about goods and prices from anthropologie.com
To use the web scraper for the Anthropologie website, you need an account with our Diggernaut service. Then follow these steps:
- Use this registration link to open a free Diggernaut account.
- After registering and confirming your email address, log in to your account.
- Create a project with any name and description. If you do not know how, please refer to our documentation.
- Switch to the created project and create a digger with any name. If you do not know how, please refer to our documentation.
- Copy the digger configuration below to the clipboard and paste it into the digger you created. If you do not know how, please refer to our documentation.
- Switch the digger's mode from Debug to Active. If you do not know how, please refer to our documentation.
- Run your digger and wait until it completes. If you do not know how, please refer to our documentation.
- Download the scraped dataset in the format you need. If you do not know how, please refer to our documentation.
You can also set up a schedule to run your scraper and collect data regularly.
Scraping configuration for the digger
---
config:
    debug: 2
    agent: Firefox
do:
- walk:
    to: https://www.anthropologie.com
    do:
    - find:
        path: .c-main-navigation__li--level-1
        do:
        - find:
            path: span
            slice: 0
            do:
            - parse
            - space_dedupe
            - trim
            - normalize:
                routine: lower
            - variable_set: cat1
        - find:
            path: .c-main-navigation__li--level-2
            do:
            - variable_clear: subcat
            - find:
                path: .c-main-navigation__a--level-2
                do:
                - parse
                - space_dedupe
                - trim
                - normalize:
                    routine: lower
                - variable_set: cat2
            - find:
                path: .c-main-navigation__li--level-3 a
                do:
                - parse
                - space_dedupe
                - trim
                - normalize:
                    routine: lower
                - variable_set: cat3
                - variable_set:
                    field: subcat
                    value: 1
                - parse:
                    attr: href
                - pool_clear: main
                - link_add:
                    pool: main
                - walk:
                    to: links
                    pool: main
                    do:
                    - find:
                        path: .js-pagination__arrow--next
                        slice: 0
                        do:
                        - parse:
                            attr: href
                        - link_add:
                            pool: main
                    - find:
                        path: .c-product-tile__image-link
                        do:
                        - parse:
                            attr: href
                            filter:
                            - (.+)\?
                            - (.+)
                        - normalize:
                            routine: url
                        - walk:
                            to: value
                            do:
                            - find:
                                path: body
                                do:
                                - object_new: product
                                - eval:
                                    routine: js
                                    body: '(function (){var d = new Date(); return d.toISOString()})();'
                                - object_field_set:
                                    object: product
                                    field: date
                                - register_set: Anthropologie
                                - object_field_set:
                                    object: product
                                    field: brand
                                - static_get: url
                                - object_field_set:
                                    object: product
                                    field: url
                                - find:
                                    path: meta[> img.c-product-image
                                    do:
                                    - parse:
                                        attr: src
                                        filter:
                                        - (.+)\?
                                        - (.+)
                                    - normalize:
                                        routine: url
                                    - object_field_set:
                                        object: product
                                        field: images
                                        joinby: "|"
                                - find:
                                    path: script:matches(window\.productData)
                                    do:
                                    - parse:
                                        filter:
                                        - window.productData\s*=\s*\'\s*(.+)\s*\'\s*;
                                    - normalize:
                                        routine: Base64ZLIBDecode
                                    - normalize:
                                        routine: json2xml
                                    - to_block
                                    - find:
                                        path: body_safe
                                        do:
                                        - find:
                                            path: primaryslice:hasChild(displaylabel:matches(Color))
                                            do:
                                            - find:
                                                path: sliceitems > displayname
                                                do:
                                                - parse
                                                - space_dedupe
                                                - trim
                                                - object_field_set:
                                                    object: product
                                                    field: variations
                                                    joinby: "|"
                                            - find:
                                                path: sliceitems
                                                do:
                                                - variable_clear: iid
                                                - find:
                                                    path: id
                                                    slice: 0
                                                    do:
                                                    - parse
                                                    - variable_set: iid
                                                - find:
                                                    path: images
                                                    do:
                                                    - parse
                                                    - register_set: http://images.anthropologie.com/is/image/Anthropologie/_
                                                    - object_field_set:
                                                        object: product
                                                        field: images
                                                        joinby: "|"
                                        - find:
                                            path: product > stylenumber
                                            slice: 0
                                            do:
                                            - parse
                                            - space_dedupe
                                            - trim
                                            - object_field_set:
                                                object: product
                                                field: sku
                                        - find:
                                            path: product > product > brand
                                            do:
                                            - parse
                                            - space_dedupe
                                            - trim
                                            - object_field_set:
                                                object: product
                                                field: brand
                                        - find:
                                            path: product > product > displayname
                                            do:
                                            - parse
                                            - space_dedupe
                                            - trim
                                            - object_field_set:
                                                object: product
                                                field: name
                                        - find:
                                            path: product > product > longdescription
                                            do:
                                            - parse
                                            - space_dedupe
                                            - trim
                                            - object_field_set:
                                                object: product
                                                field: description
                                - variable_get: cat1
                                - if:
                                    match: (\S)
                                    do:
                                    - object_field_set:
                                        object: product
                                        field: category
                                        joinby: "|"
                                - variable_get: cat2
                                - if:
                                    match: (\S)
                                    do:
                                    - object_field_set:
                                        object: product
                                        field: category
                                        joinby: "|"
                                - variable_get: cat3
                                - if:
                                    match: (\S)
                                    do:
                                    - object_field_set:
                                        object: product
                                        field: category
                                        joinby: "|"
                                - object_save:
                                    name: product
            - variable_get: subcat
            - if:
                match: (1)
                else:
                - find:
                    path: .c-main-navigation__a--level-2
                    do:
                    - parse:
                        attr: href
                    - pool_clear: main
                    - link_add:
                        pool: main
                    - walk:
                        to: links
                        pool: main
                        do:
                        - find:
                            path: .js-pagination__arrow--next
                            slice: 0
                            do:
                            - parse:
                                attr: href
                            - link_add:
                                pool: main
                        - find:
                            path: .c-product-tile__image-link
                            do:
                            - parse:
                                attr: href
                                filter:
                                - (.+)\?
                                - (.+)
                            - normalize:
                                routine: url
                            - walk:
                                to: value
                                do:
                                - find:
                                    path: body
                                    do:
                                    - object_new: product
                                    - eval:
                                        routine: js
                                        body: '(function (){var d = new Date(); return d.toISOString()})();'
                                    - object_field_set:
                                        object: product
                                        field: date
                                    - register_set: Anthropologie
                                    - object_field_set:
                                        object: product
                                        field: brand
                                    - static_get: url
                                    - object_field_set:
                                        object: product
                                        field: url
                                    - find:
                                        path: meta[> img.c-product-image
                                        do:
                                        - parse:
                                            attr: src
                                            filter:
                                            - (.+)\?
                                            - (.+)
                                        - normalize:
                                            routine: url
                                        - object_field_set:
                                            object: product
                                            field: images
                                            joinby: "|"
                                    - find:
                                        path: script:matches(window\.productData)
                                        do:
                                        - parse:
                                            filter:
                                            - window.productData\s*=\s*\'\s*(.+)\s*\'\s*;
                                        - normalize:
                                            routine: Base64ZLIBDecode
                                        - normalize:
                                            routine: json2xml
                                        - to_block
                                        - find:
                                            path: body_safe
                                            do:
                                            - find:
                                                path: primaryslice:hasChild(displaylabel:matches(Color))
                                                do:
                                                - find:
                                                    path: sliceitems > displayname
                                                    do:
                                                    - parse
                                                    - space_dedupe
                                                    - trim
                                                    - object_field_set:
                                                        object: product
                                                        field: variations
                                                        joinby: "|"
                                                - find:
                                                    path: sliceitems
                                                    do:
                                                    - variable_clear: iid
                                                    - find:
                                                        path: id
                                                        slice: 0
                                                        do:
                                                        - parse
                                                        - variable_set: iid
                                                    - find:
                                                        path: images
                                                        do:
                                                        - parse
                                                        - register_set: http://images.anthropologie.com/is/image/Anthropologie/_
                                                        - object_field_set:
                                                            object: product
                                                            field: images
                                                            joinby: "|"
                                            - find:
                                                path: product > stylenumber
                                                slice: 0
                                                do:
                                                - parse
                                                - space_dedupe
                                                - trim
                                                - object_field_set:
                                                    object: product
                                                    field: sku
                                            - find:
                                                path: product > product > brand
                                                do:
                                                - parse
                                                - space_dedupe
                                                - trim
                                                - object_field_set:
                                                    object: product
                                                    field: brand
                                            - find:
                                                path: product > product > displayname
                                                do:
                                                - parse
                                                - space_dedupe
                                                - trim
                                                - object_field_set:
                                                    object: product
                                                    field: name
                                            - find:
                                                path: product > product > longdescription
                                                do:
                                                - parse
                                                - space_dedupe
                                                - trim
                                                - object_field_set:
                                                    object: product
                                                    field: description
                                    - variable_get: cat1
                                    - if:
                                        match: (\S)
                                        do:
                                        - object_field_set:
                                            object: product
                                            field: category
                                            joinby: "|"
                                    - variable_get: cat2
                                    - if:
                                        match: (\S)
                                        do:
                                        - object_field_set:
                                            object: product
                                            field: category
                                            joinby: "|"
                                    - variable_get: cat3
                                    - if:
                                        match: (\S)
                                        do:
                                        - object_field_set:
                                            object: product
                                            field: category
                                            joinby: "|"
                                    - object_save:
                                        name: product
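To see what the configuration does with the window.productData payload, the extraction filter and the Base64ZLIBDecode step can be mirrored outside Diggernaut. The Python sketch below assumes the payload is a Base64-encoded standard zlib stream containing JSON; that is inferred from the routine name, not confirmed against the site. The names decode_product_data and encode_product_data are hypothetical helpers introduced for illustration only.

```python
import base64
import json
import re
import zlib

# Mirrors the `filter` regex in the configuration: capture the quoted payload.
PRODUCT_DATA_RE = re.compile(r"window\.productData\s*=\s*'\s*(.+?)\s*'\s*;")


def decode_product_data(script_text: str) -> dict:
    """Extract and decode the productData payload from a script body."""
    match = PRODUCT_DATA_RE.search(script_text)
    if match is None:
        raise ValueError("window.productData assignment not found")
    compressed = base64.b64decode(match.group(1))
    # Assumption: a standard zlib stream (header + checksum), not raw deflate.
    return json.loads(zlib.decompress(compressed).decode("utf-8"))


def encode_product_data(data: dict) -> str:
    """Inverse helper, used here only to build a fixture for a round trip."""
    packed = zlib.compress(json.dumps(data).encode("utf-8"))
    return "window.productData = '%s';" % base64.b64encode(packed).decode("ascii")


if __name__ == "__main__":
    fixture = encode_product_data({"product": {"styleNumber": "44448363"}})
    print(decode_product_data(fixture))
```

In the digger itself the decoded JSON is then converted to XML (json2xml) so that the usual find commands can walk it; the sketch stops at the decoded dictionary.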
Sample of scraped data
Below is a sample of the dataset with several products in JSON format, so you can easily review it and see the data structure. The dataset can also be downloaded as CSV, XLSX, XML, or any other text format using templates.
[{
"product": {
"brand": "Illume",
"category": "gifts|features|the gift guide",
"date": "2017-12-05T21:15:58.241Z",
"description": "New from the fragrance masters at Illume, Anatomy of a Fragrance bath and beauty products are sophisticated, lighthearted luxuries. Each is crafted in Minnesota, where Illume combines their signature scents with beautiful packaging designed in-house. From lavish hand creams to triple-milled soaps to nature-inspired perfumes, their line is ready-made for gifting and indulging. **Honey Rose**: a warm, romantic scent with notes of lily of the valley, sandalwood and bergamot **Orchid Vanille**: a bright, fresh combination of orange blossom, jasmine, black currant and praline **Wildflower Bergamot**: A zesty blend of bergamot, lemon and mango layered with cedar and sandalwood",
"images": "https://images.anthropologie.com/is/image/Anthropologie/44448363_040_b|http://images.anthropologie.com/is/image/Anthropologie/44448363_040_b|http://images.anthropologie.com/is/image/Anthropologie/44448363_070_b|http://images.anthropologie.com/is/image/Anthropologie/44448363_065_b",
"name": "Anatomy of a Fragrance Gift Set",
"sku": "44448363",
"url": "https://www.anthropologie.com/shop/anatomy-of-a-fragrance-gift-set",
"variations": "Wildflower Bergamot|Orchid Vanille|Honey Rose"
}
}
,{
"product": {
"brand": "Capri Blue",
"category": "gifts|features|the gift guide",
"date": "2017-12-05T21:15:59.713Z",
"description": "Capri Blue's iconic vessels and fragrances - proudly designed and poured in Mississippi - are a long-standing favorite at Anthropologie. The line pairs striking visuals with intoxicating scents to create beautifully aromatic products like soy-blended candles and vegan-formulated beauty care. **Volcano**: tropical fruits, sugared oranges, lemons and limes, redolent with lightly exotic mountain greens **Coastal**: notes of pineapple, verbena and coconut, accented by sparkling lemon, bergamot and grapefruit **Fir & Firewood**: a fruity, green aroma of apple, clove, fir, pine needle, white birch, cedar, vetiver and musk **Japanese Quince & Cedar**: aromatic cedar wood is embellished with sun-ripened cassis, sugared quince, accents of red currant and a splash of sparkling pomelo **Gardenia & Fig**: bright greens and fresh peach mingle with gardenia, rose, ylang ylang and coconut over a base of light musk **Cinnamon Toddy**: a mouthwatering medley of ripe apple, warm cinnamon, golden clove and grated nutmeg topped with notes of honey and maple **Spiced Cider**: nutmeg, clove and cinnamon are layered over fresh apple and juicy orange notes **Lagoon**: top notes of freesia, incense and tamarind blend over a musky base of cashmere, wood and vetiver **Grapefruit Neroli**: sun-kissed grapefruit, quince and tangerine over neroli, vanilla, orchid and currant",
"images": "https://images.anthropologie.com/is/image/Anthropologie/19851559_033_b|https://images.anthropologie.com/is/image/Anthropologie/19851559_033_b10|http://images.anthropologie.com/is/image/Anthropologie/19851559_033_b|http://images.anthropologie.com/is/image/Anthropologie/19851559_033_b10|http://images.anthropologie.com/is/image/Anthropologie/19851559_090_b|http://images.anthropologie.com/is/image/Anthropologie/19851559_090_b10|http://images.anthropologie.com/is/image/Anthropologie/19851559_090_b15|http://images.anthropologie.com/is/image/Anthropologie/19851559_090_b16|http://images.anthropologie.com/is/image/Anthropologie/19851559_049_b|http://images.anthropologie.com/is/image/Anthropologie/19851559_026_b|http://images.anthropologie.com/is/image/Anthropologie/19851559_098_b|http://images.anthropologie.com/is/image/Anthropologie/19851559_040_b|http://images.anthropologie.com/is/image/Anthropologie/19851559_007_b|http://images.anthropologie.com/is/image/Anthropologie/19851559_007_b2",
"name": "Capri Blue Iridescent Jar Candle",
"sku": "19851559",
"url": "https://www.anthropologie.com/shop/capri-blue-iridescent-jar-candle8",
"variations": "Fir and Firewood|Spiced Cider|Volcano|Spiced Cider|Fir and Firewood|Volcano|Volcano"
}
}
,{
"product": {
"brand": "Anthropologie",
"category": "gifts|features|the gift guide",
"date": "2017-12-05T21:16:00.340Z",
"images": "https://images.anthropologie.com/is/image/Anthropologie/39336862_001_b3|https://images.anthropologie.com/is/image/Anthropologie/39336862_001_b|https://images.anthropologie.com/is/image/Anthropologie/39336862_001_b2|https://images.anthropologie.com/is/image/Anthropologie/39336862_001_b14|http://images.anthropologie.com/is/image/Anthropologie/39336862_001_b3|http://images.anthropologie.com/is/image/Anthropologie/39336862_001_b|http://images.anthropologie.com/is/image/Anthropologie/39336862_001_b2|http://images.anthropologie.com/is/image/Anthropologie/39336862_001_b14|http://images.anthropologie.com/is/image/Anthropologie/39336862_074_b|http://images.anthropologie.com/is/image/Anthropologie/39336862_074_b2|http://images.anthropologie.com/is/image/Anthropologie/39336862_074_b3|http://images.anthropologie.com/is/image/Anthropologie/39336862_074_b14|http://images.anthropologie.com/is/image/Anthropologie/39336862_010_b|http://images.anthropologie.com/is/image/Anthropologie/39336862_010_b2|http://images.anthropologie.com/is/image/Anthropologie/39336862_010_b15|http://images.anthropologie.com/is/image/Anthropologie/39336862_030_b|http://images.anthropologie.com/is/image/Anthropologie/39336862_030_b2|http://images.anthropologie.com/is/image/Anthropologie/39336862_030_b15|http://images.anthropologie.com/is/image/Anthropologie/39336862_040_b|http://images.anthropologie.com/is/image/Anthropologie/39336862_040_b2|http://images.anthropologie.com/is/image/Anthropologie/39336862_040_b3|http://images.anthropologie.com/is/image/Anthropologie/39336862_040_b14|http://images.anthropologie.com/is/image/Anthropologie/39336862_065_b|http://images.anthropologie.com/is/image/Anthropologie/39336862_065_b2|http://images.anthropologie.com/is/image/Anthropologie/39336862_065_b3|http://images.anthropologie.com/is/image/Anthropologie/39336862_051_b|http://images.anthropologie.com/is/image/Anthropologie/39336862_051_b2|http://images.anthropologie.com/is/image/Anthropologie/39336862_051_b10|http://images.anthropologie.com/is/image/Anthropologie/39336862_066_b|http://images.anthropologie.com/is/image/Anthropologie/39336862_066_b2|http://images.anthropologie.com/is/image/Anthropologie/39336862_066_b10",
"name": "Slivered Geode Coaster",
"sku": "39336862",
"url": "https://www.anthropologie.com/shop/geode-coaster",
"variations": "Black Quartz|Dyed Citron|White Quartz|Adventurian|Dyed Blue|Dyed Magenta|Amethyst|Rose quartz"
}
}
,{
"product": {
"brand": "Floreat",
"category": "gifts|features|the gift guide",
"date": "2017-12-05T21:16:01.211Z",
"images": "https://images.anthropologie.com/is/image/Anthropologie/43663541_000_b|https://images.anthropologie.com/is/image/Anthropologie/43663541_000_b2|https://images.anthropologie.com/is/image/Anthropologie/43663541_000_b3|https://images.anthropologie.com/is/image/Anthropologie/43663541_000_b4|http://images.anthropologie.com/is/image/Anthropologie/43663541_000_b|http://images.anthropologie.com/is/image/Anthropologie/43663541_000_b2|http://images.anthropologie.com/is/image/Anthropologie/43663541_000_b3|http://images.anthropologie.com/is/image/Anthropologie/43663541_000_b4|http://images.anthropologie.com/is/image/Anthropologie/43663541_049_b|http://images.anthropologie.com/is/image/Anthropologie/43663541_049_b2|http://images.anthropologie.com/is/image/Anthropologie/43663541_049_b3|http://images.anthropologie.com/is/image/Anthropologie/43663541_049_b4",
"name": "Floreat Printed Sleep Pants",
"sku": "43663541",
"url": "https://www.anthropologie.com/shop/floreat-printed-sleep-pants",
"variations": "ASSORTED|BLUE MOTIF"
}
}]
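The pipe-joined fields in the sample can contain repeats: variations lists the same value more than once, and images contains the same picture under both http and https. A minimal Python sketch for tidying a downloaded record follows. The names clean_product and force_https are hypothetical helpers, and rewriting every image URL to https assumes the image host serves identical files over both schemes.

```python
from urllib.parse import urlsplit


def force_https(url: str) -> str:
    """Rewrite the scheme to https (assumes both schemes serve the same file)."""
    return urlsplit(url)._replace(scheme="https").geturl()


def clean_product(record: dict) -> dict:
    """Turn pipe-joined fields into de-duplicated lists, keeping first-seen order."""
    product = dict(record["product"])
    for field in ("category", "variations", "images"):
        parts = [p.strip() for p in product.get(field, "").split("|") if p.strip()]
        if field == "images":
            parts = [force_https(p) for p in parts]
        # dict.fromkeys drops repeats while preserving insertion order.
        product[field] = list(dict.fromkeys(parts))
    return product


if __name__ == "__main__":
    record = {
        "product": {
            "name": "Capri Blue Iridescent Jar Candle",
            "category": "gifts|features|the gift guide",
            "variations": "Fir and Firewood|Spiced Cider|Volcano|Spiced Cider",
            "images": "https://images.anthropologie.com/is/image/Anthropologie/19851559_033_b"
                      "|http://images.anthropologie.com/is/image/Anthropologie/19851559_033_b",
        }
    }
    print(clean_product(record))
```

Fields that are absent from a record, such as description in some of the sample products, are left alone; only the three joined fields are converted to lists.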