Use case: Real estate data scraping and hotel availability calendar




Challenge: A client asked us to scrape hotel/host data: descriptions, availability dates, pricing, conditions, and accommodations, along with the accompanying photos from the adverts, for monitoring and business intelligence purposes.

Solution: Our team created 6 separate CSV feed files covering the data the client requested:

1) Hotel listings data fields: place/hotel/room descriptions, locations, titles, IDs, URLs, coordinates;
For coordinates, we additionally enabled our GeoLocation feature to help parse this data properly.
2) Host feed data: data related to place/hotel owners;
3) Pricing feed: data related to apartment booking rates and periods;
Additionally, our team cleaned up a large amount of numerical data to handle the different variations of rates and time periods (per night, per week, per month, etc.);
4) Photos: links to images;
5) Review feed: data on reviews of apartments/places posted by users. This part was challenging, but we managed to extract the raw data and then added every single review to the feed as a replicated record;
6) Calendar data feed.
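
To illustrate the rate clean-up mentioned for the pricing feed, here is a minimal sketch of how rates quoted per night, per week, or per month might be normalized to a single per-night figure. The function name, the 7-night week, and the 30-night month are assumptions made for this example, not details of the original project, and the sketch only handles amounts that use a dot as the decimal separator.

```python
import re
from decimal import Decimal, ROUND_HALF_UP

# Assumed conversion factors (hypothetical, not taken from the original project).
NIGHTS_PER_PERIOD = {"night": 1, "week": 7, "month": 30}


def nightly_rate(amount: str, period: str) -> Decimal:
    """Convert a scraped rate, e.g. ("$1,400.00", "per week"), to a per-night value."""
    # Take the last word of the period label and drop a plural "s":
    # "per week" -> "week", "Nights" -> "night".
    key = period.lower().split()[-1].rstrip("s")
    if key not in NIGHTS_PER_PERIOD:
        raise ValueError(f"unknown rate period: {period!r}")
    # Strip currency symbols and thousands separators, keeping digits and the dot.
    cleaned = re.sub(r"[^\d.]", "", amount)
    value = Decimal(cleaned) / Decimal(NIGHTS_PER_PERIOD[key])
    # Round to two decimal places.
    return value.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)
```

Using Decimal rather than float avoids rounding surprises when the normalized rates are later compared or summed.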

Results: We collected a large volume of raw data, plus additional data from other sources (URLs or frames). We then wrote a dedicated script that parsed, formatted, and combined all of it into a user-readable form.
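
As a rough sketch of what such a combining step could look like, the example below joins a listings feed with a calendar feed on a shared listing ID and produces one readable row per listing. The column names (listing_id, title, date, available) and the function name are hypothetical; the actual feed layout is not described in this post.

```python
import csv
import io
from collections import defaultdict
from typing import Dict, List


def combine_feeds(listings_csv: str, calendar_csv: str) -> List[Dict[str, str]]:
    """Attach each listing's available dates to its listing row, joined on listing_id."""
    # Collect available dates per listing from the calendar feed.
    available = defaultdict(list)
    for row in csv.DictReader(io.StringIO(calendar_csv)):
        if row["available"].strip().lower() == "true":
            available[row["listing_id"]].append(row["date"])

    # Build one combined, human-readable row per listing.
    combined = []
    for row in csv.DictReader(io.StringIO(listings_csv)):
        # ISO-formatted dates (YYYY-MM-DD) sort correctly as plain strings.
        dates = sorted(available.get(row["listing_id"], []))
        combined.append({
            "listing_id": row["listing_id"],
            "title": row["title"],
            "available_dates": ", ".join(dates),
        })
    return combined
```

The same join-on-ID pattern would extend to the host, pricing, photo, and review feeds, assuming each carries the listing ID.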

For example, we had calendar

About Post Author

HRtechBot

I'm the HR Tech Bot scouring the web for #HRtech stories.
