The Web Crawl Engineer works with our web crawl engineering team and is responsible for capturing and managing the highest quality content from the web. The ideal candidate demonstrates independence and initiative, is a problem solver, works well autonomously, and is technologically savvy. Additionally, the ideal candidate is open to being trained in, and helping advance, best practices and standards around large-scale web harvests, web data processing and engineering, and contributing to the development of new harvesting, access, and analysis tools.
The position works in the Web Archiving Group in support of web harvesting services and programs, working with partners ranging from national libraries and archives to collaborative international initiatives supporting the collection, preservation, and accessibility of web content. The role will help design the strategy and implementation of web archiving services using open source technologies and platforms, and develop harvest techniques and tools to enable archival capture and re-rendering of rich media, streaming content, and social media, as well as traditional web page content. The position will also create tools, services, and workflows to improve crawl analysis, reports, data management, and derivation, and will identify technical and operational requirements. This role contributes to defining deployment architectures and workflows, managing data at scale, and monitoring production systems.
Responsibilities and Duties
Skills & Requirements
Reporting Structure: The Web Crawl Engineer reports to the Web Archiving Engineering Manager and works closely with other departments. The position works alongside other web archiving engineers as well as program staff in the Web Archiving and Data Services Group and with the broader Internet Archive infrastructure and engineering teams.
To Apply: Please send your resume and cover letter to firstname.lastname@example.org with the subject line "Web Crawl Engineer."
Internet Archive reserves the right to revise job descriptions or work hours as required.
Benefits & Perks
The Internet Archive provides a comprehensive benefits package including PTO, paid holidays, medical, dental, vision, FSA, commuter, STD, LTD, 403B/Roth accounts, and Friday lunches at IA HQ.
Internet Archive is an Equal Opportunity Employer M/F/D/V/L/G/B/T and will consider for employment qualified applicants with criminal histories in a manner consistent with the requirements of the Fair Chance Ordinance.