Internet Archive

Senior Software Engineer

Archive-It - San Francisco or REMOTE, CA - Full Time

The Internet Archive is seeking a Senior Software Engineer for its Archive-It Group. The Archive-It team is responsible for maintaining a web application that automates high-quality captures of content from the web. An ideal candidate demonstrates independence and initiative, is a problem solver, works well autonomously, and has deep experience with the Unix/Linux command line and broad experience in systems architecture. Additionally, the ideal candidate is open to helping advance the state of preserving web-published content, working on the platform that drives a large portion of global web capture.

The successful candidate will work in the Archive-It Group in support of building and maintaining high-quality software for the collection, preservation, and accessibility of web content. The role will help design and implement the future of a toolset and APIs that automate web capture using open source technologies and platforms. An ideal candidate is interested in developing harvest techniques and tools to enable archival capture and re-rendering of rich media, streaming content, and social media, as well as traditional web page content. This role contributes to defining deployment architectures and workflows, managing data at scale, and monitoring production systems.

Essential Job Functions:

  • Contending with the complexity of a suite of tools that capture web content with equal accuracy at the micro and global scale
  • Configuring, maintaining, and improving web crawling tools
  • Contributing to the development of a distributed Python-based database used for crawl material deduplication, analysis, and reporting
  • Delivering on commitments with deadlines and project timelines while working in a collaborative team of engineers and project/product managers

Minimum Qualifications:

  • Strong experience in Unix shell scripting and Python coding required
  • Strong experience with Python, Bash, Java, and C-based debugging tools strongly preferred
  • Solid experience with Internet protocols (HTTP is a must); strong knowledge of HTML, JavaScript, and web technologies in general
  • Knowledge of building and deploying web applications, databases, web-host services, and Linux system administration
  • Ability to work in, and enjoy, a loosely structured work environment

Preferred Qualifications:

  • Cluster computing experience is preferred, especially familiarity with Hadoop and related technologies and tools
  • Experience working with JavaScript and HTML in a large-scale application preferred
  • Experience or familiarity with Java preferred
  • Experience with applications designed to display archived web content
  • Experience with development environments and system monitoring/administration tools
  • Experience with open source practices, version control, and code review
  • Experience with Atlassian tool sets
  • Flexibility and a sense of humor are a plus
  • Bachelor's Degree in Computer Science or a related field, five years of progressively responsible experience in software development.

Reporting Structure: The Senior Software Engineer reports to the Engineering Manager for Archive-It and works closely with other departments. The position works alongside other web archiving engineers as well as program staff in the Web Archiving & Data Services Group, and with the broader Internet Archive infrastructure and engineering teams.

Benefits & Perks:

The Internet Archive provides a comprehensive benefits package including PTO, paid holidays, medical, dental, vision, FSA, commuter, STD, LTD, and 403(b)/Roth accounts.

The Internet Archive is an Equal Opportunity Employer M/F/D/V/L/G/B/T and will consider for employment qualified applicants with criminal histories in a manner consistent with the requirements of the Fair Chance Ordinance.

 
