Tech Lead : Web Crawling

Location : Bangalore
Experience : 4-8 Years

The Product team at Almug builds products and automation solutions for our clients in the capital markets space. We are constantly adding new methods of extracting, analysing and displaying our datasets, while still appreciating the time and effort it takes to build a system that will scale.

We are looking for an ambitious individual who never gets tired of handling a large pool of unstructured data sources. Data is the most important fuel for the products and solutions we build at Almug. You will build tools and frameworks that can scrape data without much developer intervention. Our goal is to build 500K+ stable data pipelines, which requires a high degree of automation.

Characteristics:
    • Must have a strong passion for coding
    • Must be a self-starter
    • Must take quality, security and performance seriously
    • Ability to pair with other engineers and cross-team as needed
    • Must be excellent in a crisis, driving the product team towards quick actions that positively impact clients
    • Experience running zero-downtime deployments
Responsibilities & Tech Requirements:
    • 4-8 years of experience in web scraping or crawling with Scrapy or other frameworks and related libraries (like Beautiful Soup).
    • Strong Python/Java programming capability
    • Your role is to apply your knowledge set to fetch data from multiple online sources, cleanse it and build APIs on top of it.
    • Good understanding of data structures and algorithms and how they affect the efficiency of a system in real-world use.
    • Sound knowledge of bypassing bot detection techniques
    • Experience developing RESTful web APIs and microservices
    • Think deeply about developing large-scale scraping tools, including data integrity, health and monitoring systems.
    • Develop a deep understanding of our vast data sources on the web and know exactly how, when, and which data to scrape, parse and store.
    • Work with SQL and NoSQL databases to store raw data.
    • Develop frameworks for automating and maintaining a constant flow of data from multiple sources.
    • Good knowledge of distributed technologies and of real-time systems that are high-throughput, low-latency and highly scalable.
    • Work independently with little supervision to research and test innovative solutions.
Compensation:
    • Salary no bar for deserving candidates
About Almug:

Founded in 2015, Almug is an AI-first technology company with deep domain expertise in Capital Markets. We are a trusted AI partner to our customers for their financial data and solution needs. Our data products deliver data with industry-leading speed and accuracy. We use technology and design to bring our clients action-driven content, which provides comprehensive and strategic insights to drive decision making. We believe in taking risks and challenging the status quo by continuously innovating and improving. Our team is made up of passionate engineers, scientists, designers and more who work together to make the impossible possible.

To learn more about us: www.almug.ai