Use case: Real estate data scraping and hotel availability calendar



Challenge: A client asked us to scrape hotel and host descriptions, along with availability dates, pricing, conditions, and accommodation details, plus the accompanying photos from the adverts, for monitoring and business intelligence purposes.

Solution: Our team created 6 separate CSV feed files covering the data the client requested:

1) Hotel listings data fields: place/hotel/room descriptions, locations, titles, IDs, URLs, coordinates;
For coordinates, we additionally enabled our GeoLocation feature to help parse this data properly.
2) Host feed data: data related to place/hotel owners;
3) Pricing feed: data related to apartment booking rates and periods;
Additionally, our team cleaned up a large amount of numerical data to handle the different variations of rates and time periods (per night, per week, per month, etc.);
4) Photos: links to images;
5) Review feed: data on reviews of apartments/places posted by users. This part was challenging, but we managed to extract the raw data and then added every single review to the feed as a replicated record;
6) Calendar data feed.
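
The rate clean-up mentioned for the pricing feed could look roughly like the sketch below, which converts rates quoted per night, per week, or per month to a common per-night figure. The function name `nightly_rate`, the input string format, and the conversion factors (7 nights per week, 30 per month) are assumptions made for illustration; the post does not say how the actual clean-up was done.

```python
import re
from decimal import Decimal, ROUND_HALF_UP

# Assumed nights per billing period; the post does not state the
# conversion factors that were actually used.
NIGHTS_PER_PERIOD = {"night": 1, "week": 7, "month": 30}

# Matches strings such as "$700 per week", "1,500/month", "89.5 / night".
RATE_PATTERN = re.compile(
    r"(?P<amount>\d[\d,]*(?:\.\d+)?)\s*(?:per|/)\s*(?P<period>night|week|month)",
    re.IGNORECASE,
)


def nightly_rate(raw: str) -> Decimal:
    """Convert a scraped rate string to a per-night amount (hypothetical helper)."""
    match = RATE_PATTERN.search(raw)
    if match is None:
        raise ValueError(f"unrecognised rate: {raw!r}")
    # Drop thousands separators before parsing the number.
    amount = Decimal(match.group("amount").replace(",", ""))
    nights = NIGHTS_PER_PERIOD[match.group("period").lower()]
    # Round to two decimal places so every row in the feed has the same shape.
    return (amount / nights).quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)
```

Under these assumptions, `nightly_rate("$700 per week")` returns `Decimal("100.00")`. Using `Decimal` rather than `float` avoids rounding surprises when the values are later written to a CSV feed.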

Results: The scrape produced a large volume of raw data, plus additional data from other sources (separate URLs or frames). We then wrote a dedicated script that parsed, formatted, and combined all of it into a human-readable form.
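
As a minimal sketch of what such a combining step might involve, the example below joins listing rows with calendar rows on a shared ID and writes the result as CSV. The field names (`listing_id`, `title`, `date`, `available`) and both function names are hypothetical; the post does not describe the real script or its schema.

```python
import csv
import io


def combine_feeds(listings, calendar):
    """Join calendar rows to listing rows on a shared listing ID (assumed key)."""
    by_id = {row["listing_id"]: row for row in listings}
    combined = []
    for entry in calendar:
        listing = by_id.get(entry["listing_id"])
        if listing is None:
            continue  # skip calendar entries with no matching listing
        combined.append({
            "listing_id": entry["listing_id"],
            "title": listing["title"],
            "date": entry["date"],
            # Render the boolean as a word so the CSV is easy to read.
            "available": "yes" if entry["available"] else "no",
        })
    return combined


def to_csv(rows):
    """Serialise combined rows to CSV text with a fixed column order."""
    fieldnames = ["listing_id", "title", "date", "available"]
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()
```

Keying the join on a listing ID is a natural choice here because the listings feed already carries IDs, but any stable identifier shared across the sources would serve the same purpose.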

For example, we had calendar

